Business Data Structures in C/C++ Kirs and Pflughoeft
Bits and Bytes
Chapter 1
Bits
Bit = Binary Digit (BInary digiT) = {0, 1}
Assume you wish to send a message using a light switch. This is a binary condition, since the light switch can be either:
Off OR On
Any binary condition can be represented with a single light switch:
Good OR Bad        Yes OR No
Male OR Female     Dead OR Alive
But what if there are more than two states? What if I want to represent the conditions GOOD, SO-SO, and BAD?
Simple: add more light switches.
If there are 2 light switches, the total combinations are:
Combination   Switch 1   Switch 2   Interpreted as
#1            Off (0)    Off (0)    Bad
#2            Off (0)    On (1)     So-So
#3            On (1)     Off (0)    Good
#4            On (1)     On (1)     Not used
If I can transmit 4 messages with two bits, how many could I transmit if I had 3 bits? Or 4 bits?
With 3 bits, there are 8 possible combinations:
000   100
001   101
010   110
011   111
And with 4 bits, there are 16 possible combinations:
0000   0100   1000   1100
0001   0101   1001   1101
0010   0110   1010   1110
0011   0111   1011   1111
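The listings above can be generated rather than typed out. The following is a minimal sketch, not part of the original slides: the function names `to_binary` and `print_combinations` are made up for illustration, and it assumes a small number of bits (well under 32).

```c
#include <stdio.h>

/* Write the n-bit binary pattern for value into buf (buf needs n + 1 chars). */
void to_binary(unsigned value, int n, char *buf)
{
    for (int i = 0; i < n; i++) {
        /* Most significant bit first: test bit number (n - 1 - i). */
        buf[i] = ((value >> (n - 1 - i)) & 1u) ? '1' : '0';
    }
    buf[n] = '\0';
}

/* Print every combination of n bits, one per line.
   Returns how many combinations were printed (2^n).
   Assumption: 0 <= n < 32. */
unsigned print_combinations(int n)
{
    char buf[33];
    unsigned count = 1u << n;           /* 2^n patterns */
    for (unsigned v = 0; v < count; v++) {
        to_binary(v, n, buf);
        printf("%s\n", buf);
    }
    return count;
}
```

Calling `print_combinations(3)` from your own `main` prints 000 through 111 and returns 8; with 4 it returns 16.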
Is there any way to know how many messages we could transmit for a given number of bits without having to test all possible combinations?
Yes. In decimal (base 10), we can determine how many messages can be transmitted for any number of decimal places. In binary (base 2), the same calculation is made, but using bits (instead of decimal digits).
Decimal places   Messages                    Bits   Messages
0                10^0  = 1                   0      2^0  = 1
1                10^1  = 10                  1      2^1  = 2
2                10^2  = 100                 2      2^2  = 4
3                10^3  = 1,000               3      2^3  = 8
4                10^4  = 10,000              4      2^4  = 16
5                10^5  = 100,000             5      2^5  = 32
6                10^6  = 1,000,000           6      2^6  = 64
7                10^7  = 10,000,000          7      2^7  = 128
8                10^8  = 100,000,000         8      2^8  = 256
9                10^9  = 1,000,000,000       9      2^9  = 512
10               10^10 = 10,000,000,000      10     2^10 = 1,024
The general formula is:
I = B^n
where:
I = the amount of information (messages) available
B = the base we are working in (decimal or binary)
n = the number of digits (decimal places or bits) we have
Applying the formula to both decimal and binary values:
10^0  = 1                   2^0  = 1
10^1  = 10                  2^1  = 2
10^2  = 100                 2^2  = 4
10^3  = 1,000               2^3  = 8
10^4  = 10,000              2^4  = 16
10^5  = 100,000             2^5  = 32
10^6  = 1,000,000           2^6  = 64
10^7  = 10,000,000          2^7  = 128
10^8  = 100,000,000         2^8  = 256
10^9  = 1,000,000,000       2^9  = 512
10^10 = 10,000,000,000      2^10 = 1,024
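The formula translates almost word for word into C. This is only a sketch: the function name `messages` is an assumption for illustration, and the result must fit in an `unsigned long long`.

```c
/* I = B^n: the number of distinct messages available with
   n digits in base b. Assumption: the result fits in an
   unsigned long long (no overflow checking is done). */
unsigned long long messages(unsigned b, unsigned n)
{
    unsigned long long result = 1;      /* B^0 = 1 */
    for (unsigned i = 0; i < n; i++) {
        result *= b;                    /* multiply by the base once per digit */
    }
    return result;
}
```

For example, `messages(2, 10)` gives 1,024 and `messages(10, 5)` gives 100,000, matching the table rows.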
What if I know how much information (I = number of messages) I want to transmit? How do I determine the number of bits I need?
Just reverse the process (all logs here are base 10).
Decimal: if I = 10^n, then log(I) = n log(10), and n = log(I)/log(10) = log(I)/1.000 = log(I)
Binary:  if I = 2^n, then log(I) = n log(2), and n = log(I)/log(2) = log(I)/0.30103 (since 10^0.30103 = 2)
Information   Decimals needed        Bits needed
10            log(10)    = 1.000     log(10)/log(2)    = 1.000/0.30103 = 3.32
50            log(50)    = 1.699     log(50)/log(2)    = 1.699/0.30103 = 5.64
100           log(100)   = 2.000     log(100)/log(2)   = 2.000/0.30103 = 6.64
500           log(500)   = 2.699     log(500)/log(2)   = 2.699/0.30103 = 8.97
1,000         log(1000)  = 3.000     log(1000)/log(2)  = 3.000/0.30103 = 9.97
10,000        log(10000) = 4.000     log(10000)/log(2) = 4.000/0.30103 = 13.29
How can we have partial bits (or decimals)? For example, how can we have 5.64 bits to represent 50 messages?
We can't.
The formula given should have been:
n = ceiling(log(I)/log(2)) = ceiling(log(I)/0.30103)
where ceiling means the result is rounded up to the next whole number.
And the number of bits needed would be:
Messages   log(I)/log(2)                                Bits needed
10         log(10)/log(2)    = 1.000/0.30103 = 3.32     4
50         log(50)/log(2)    = 1.699/0.30103 = 5.64     6
100        log(100)/log(2)   = 2.000/0.30103 = 6.64     7
500        log(500)/log(2)   = 2.699/0.30103 = 8.97     9
1,000      log(1000)/log(2)  = 3.000/0.30103 = 9.97     10
10,000     log(10000)/log(2) = 4.000/0.30103 = 13.29    14
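The rounded-up result can also be found without logarithms by doubling until there is enough room. The sketch below swaps in that integer technique (it is not the slides' log formula, though it gives the same answers); the name `bits_needed` is hypothetical, and it assumes the message count is far below the largest `unsigned long`.

```c
/* Smallest n such that 2^n >= messages, i.e. the ceiling of
   log(messages)/log(2), found by repeated doubling instead of
   floating-point logs. Assumption: messages is small enough
   that capacity never overflows. */
int bits_needed(unsigned long messages)
{
    int n = 0;
    unsigned long capacity = 1;         /* 2^0 = 1 message with zero bits */
    while (capacity < messages) {
        capacity *= 2;                  /* one more bit doubles the capacity */
        n++;
    }
    return n;
}
```

Avoiding floating point sidesteps rounding surprises when the message count is an exact power of two.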
Notice that we could have predicted that, for example, it would take 6 bits to represent 50 pieces of information since:
2^5 = 32 and 2^6 = 64
If we need 6 bits to represent 50 pieces of information, and we could represent 64 pieces of information, what happens to the remaining 14 (64 - 50) pieces of information?
They either remain unused, or are available for future use.
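The number of spare combinations is simply the capacity minus what is actually used. A tiny sketch, with the hypothetical name `unused_codes`, assuming the bit count is small and the messages really do fit:

```c
/* Combinations left over when 'messages' items are stored in 'bits' bits.
   Assumptions: bits is less than the width of unsigned long,
   and messages <= 2^bits. */
unsigned long unused_codes(unsigned long messages, int bits)
{
    return (1UL << bits) - messages;    /* 2^bits - messages */
}
```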
What does this have to do with computers?
If we were to look inside a computer (especially earlier ones) we might see a series of ‘doughnuts’:
These were merely small metal rings with wires running through them (magnetic-core memory).
Depending on the state of each ring (loosely, high voltage or low voltage; strictly, which way the ring was magnetized), the series represented a sequence of messages.
A BINARY SITUATION!
Notice that since there are 5 ‘doughnuts’, there are 2^5 or 32 combinations,
where the two ring symbols in the figure represent the two different voltage states.
How many bits (or ‘doughnuts’) do we really need?
Good question! What symbols/information do we wish to convey?
Symbols                                                Pieces of information
The digits (0, …, 9)                                   10
The lowercase alphabet (a, …, z)                       26
The uppercase alphabet (A, …, Z)                       26
Special characters (! + ( ) . ? / * - % & # =, etc.)   32 (?)
Total                                                  94
Since n = log(I)/log(2) = log(94)/log(2) = 1.973/0.301 = 6.55, we need 7 bits, which we could have predicted since:
2^6 = 64 and 2^7 = 128
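The count of 94 can be checked on a machine that uses ASCII. The sketch below is an illustration only: `count_visible` is a made-up name, and it assumes the default "C" locale and an ASCII character set, where the standard `isgraph` function is true for every printing character except the space.

```c
#include <ctype.h>

/* Count the 7-bit codes (0..127) that print a visible symbol.
   isgraph() is true for printing characters other than space.
   Assumption: ASCII character set and the default "C" locale. */
int count_visible(void)
{
    int count = 0;
    for (int c = 0; c < 128; c++) {
        if (isgraph(c)) {
            count++;
        }
    }
    return count;
}
```

Under those assumptions the function returns 94: ten digits, two alphabets of 26, and 32 special characters.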
What about the remaining 34 (128 - 94) combinations?
There are a number of additional special characters and a number of ‘hidden’ (control) characters which we didn’t account for:
• Carriage Return (CR)
• Back Space (BS)
• End of File (EOF)
• etc.
So the additional combinations will be used.
Are 7 bits normally used to represent a character set?
Yes. The standard coding scheme consists of 128 characters.
LIAR!! LIAR!! PANTS ON FIRE!!!
Doesn’t a byte represent a character? And isn’t a byte equal to 8 bits, not 7?
• 1 byte = 8 bits
• A byte is used to represent a character
• A byte is the basic addressable unit in RAM
Yes, sort of.
BUT the standard character set still contains only 128 characters, which requires 7 bits.
Then why does a byte contain 8 bits?
There are a few reasons. Primarily, however, it is because earlier machines suffered some reliability problems (remember what the term debugging really means) (1):
• There were problems with storage and data transmission
• One additional bit was added to help detect errors: the Parity Bit
(1) As the story goes, in the days of vacuum tubes, bugs were attracted to the heat given off by the tubes. Programmers frequently spent much of their time scraping dead bugs off the circuitry, or ‘de-bugging’.
How does adding one additional bit help detect errors?
Assume that we wished to send the series of bits:
1001100
But, because of transmission errors, the message actually received was:
1001101
How can we tell that an error was made? How do we know that the sequence 1001101 was not the true message?
As it stands now, we can’t.
If we were to send a transmission using an extra bit:
1001100 1   (the final 1 is the parity bit)
We could determine if the message was correctly transmitted by counting the total number of on bits.
E.g., if the total number of on bits is an EVEN number, the message was correctly transmitted.
Since the message sent contains 4 on bits (an even number), the message sent was correct.
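Choosing the parity bit is just a matter of counting. Here is a minimal sketch for even parity on a 7-bit message; the names `count_on_bits` and `even_parity_bit` are assumptions for illustration, and the message is passed as an ordinary integer (1001100 in binary is 0x4C in hexadecimal).

```c
/* Count the bits that are on (1) in the low 'width' bits of value. */
int count_on_bits(unsigned value, int width)
{
    int count = 0;
    for (int i = 0; i < width; i++) {
        count += (int)((value >> i) & 1u);
    }
    return count;
}

/* Even parity: pick the extra bit so that the total number of
   on bits (7 data bits plus the parity bit) comes out even. */
unsigned even_parity_bit(unsigned data7)
{
    return (unsigned)(count_on_bits(data7, 7) % 2);
}
```

For the message 1001100 (0x4C) there are three on bits, so `even_parity_bit(0x4C)` is 1, giving four on bits in total.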
If, however, we received the message:
1001101 1   (the final 1 is the parity bit)
We know it is incorrect because the message contains 5 on bits (an odd number).
Other examples using EVEN parity:
Message sent   Message received   No. of on bits   Result
1101101 1      1101101 1          6 (Even)         Correct
0001100 0      0101100 0          3 (Odd)          Incorrect
1101011 1      1001011 1          5 (Odd)          Incorrect
0101110 0      1010110 0          4 (Even)         Correct
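The receiver's side of the check can be sketched the same way. The function name `passes_even_parity` is hypothetical, and the 8 received bits (7 data bits followed by the parity bit) are passed as one integer, so 1101101 1 becomes 0xDB.

```c
/* Return 1 if the 8 received bits (7 data bits followed by
   1 parity bit) contain an even number of on bits, else 0. */
int passes_even_parity(unsigned received8)
{
    int on_bits = 0;
    for (int i = 0; i < 8; i++) {
        on_bits += (int)((received8 >> i) & 1u);
    }
    return on_bits % 2 == 0;
}
```

The four received messages in the table are 0xDB, 0x58, 0x97, and 0xAC; the check accepts the first and last and rejects the middle two.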
What gives? The last message:
Message sent   Message received   No. of on bits   Result
0101110 0      1010110 0          4 (Even)         Correct
was NOT correct, even though the total number of on bits received was even???
Yes. The system is NOT perfect: when an even number of bits are flipped, the count of on bits stays even and the error slips through. But if there are thousands or millions of messages sent, it is highly unlikely that every mistake will go uncaught.
All it takes is one message caught as incorrect to know there is a problem.
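Why some errors escape can be shown by flipping bits on purpose. In this sketch (names are made up for illustration), XOR with a mask flips exactly the bits that are on in the mask; the sent message 0101110 0 is 0x5C and the received message 1010110 0 is 0xAC.

```c
/* Flip every bit of word whose position is on (1) in mask. */
unsigned flip_bits(unsigned word, unsigned mask)
{
    return word ^ mask;
}

/* Return 1 if the low 8 bits contain an even number of on bits. */
int passes_even_parity(unsigned received8)
{
    int on_bits = 0;
    for (int i = 0; i < 8; i++) {
        on_bits += (int)((received8 >> i) & 1u);
    }
    return on_bits % 2 == 0;
}
```

Flipping four bits of 0x5C with the mask 0xF0 produces 0xAC, which still passes the check; flipping a single bit (mask 0x01) produces 0x5D, which fails it. An even number of flips goes undetected, an odd number is caught.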
Must parity always be even?
No, it can be ODD (or there can be NO parity). That decision is made by the software designer.
If we look at our previous examples using ODD parity:
Message sent   Message received   No. of on bits   Result
1101101 0      1101101 0          5 (Odd)          Correct
0001100 1      0101100 1          4 (Even)         Incorrect
1101011 0      1001011 0          4 (Even)         Incorrect
0101110 1      1010110 1          5 (Odd)          Correct
Notice that errors can still go undetected.
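Odd parity only changes which total counts as good. A minimal sketch, with the hypothetical name `odd_parity_bit`, for a 7-bit message passed as an integer:

```c
/* Odd parity: pick the extra bit so that the total number of
   on bits (7 data bits plus the parity bit) comes out odd. */
unsigned odd_parity_bit(unsigned data7)
{
    int on_bits = 0;
    for (int i = 0; i < 7; i++) {
        on_bits += (int)((data7 >> i) & 1u);
    }
    /* An even count needs one more on bit; an odd count needs none. */
    return (on_bits % 2 == 0) ? 1u : 0u;
}
```

The four messages sent in the table are 0x6D, 0x0C, 0x6B, and 0x2E, and the function yields the parity bits 0, 1, 0, and 1 shown there.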
There was one other problem with bytes:
• Compatibility
Given the binary sequences:
0000000   0000001   0000010   0000011
1111100   1111101   1111110   1111111
manufacturers interpreted them differently:
Binary sequence   Manufact. #1   Manufact. #2   Manufact. #3
0000000           A              0              +
0000001           B              1              -
0000010           C              2              *
0000011           D              3              ?
1111100           6              v              TAB
1111101           7              x              CR
1111110           8              y              LF
1111111           9              z              FF
Which is the correct interpretation???
Each is equally correct:
• 0000010 COULD be either a ‘C’ OR a ‘2’
• The letter ‘C’ COULD be pronounced either ‘cee’ OR ‘ess’
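The same bits mean different things depending on whose table you look them up in. The sketch below models that with three small lookup tables built from the first four codes on the slide; the table names and the `decode` function are hypothetical, and only codes 0 through 3 are covered.

```c
/* Three hypothetical manufacturers' code tables, covering only
   the codes 0000000 .. 0000011 (0 .. 3) from the slide. */
static const char maker1[] = { 'A', 'B', 'C', 'D' };
static const char maker2[] = { '0', '1', '2', '3' };
static const char maker3[] = { '+', '-', '*', '?' };

/* Look a code up in one manufacturer's table.
   Assumption: code is in the range 0 .. 3. */
char decode(const char *table, unsigned code)
{
    return table[code];
}
```

The code 0000010 (2) decodes as ‘C’ with the first table, ‘2’ with the second, and ‘*’ with the third. None of the three is wrong; they simply do not agree.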
What’s the solution???
ASCII
The American Standard Code for Information Interchange
Sample ASCII codes:
Binary sequence   Value   Character   Description
0000000           0       NULL        NULL/Tape feed
0000111           7       BEL         Rings bell
0001000           8       BS          Back space
0001101           13      CR          Carriage return
0011011           27      ESC         Escape
0100000           32      SP          Space
0110000           48      0           Zero
0110001           49      1           One
1000001           65      A           Capital ‘A’
1000010           66      B           Capital ‘B’
1100001           97      a           Lower case ‘a’
1100010           98      b           Lower case ‘b’
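In C a character is stored as a small integer, so the table can be checked directly. This is only a sketch: `code_of` and `to_bits7` are made-up names, and it assumes the compiler and machine use ASCII (nearly all do, but the C language does not require it).

```c
/* The numeric code stored for a character.
   Assumption: the execution character set is ASCII. */
int code_of(char c)
{
    return (int)(unsigned char)c;
}

/* Write the 7-bit binary sequence for code into out (8 chars). */
void to_bits7(int code, char out[8])
{
    for (int i = 0; i < 7; i++) {
        /* Most significant of the 7 bits first. */
        out[i] = ((code >> (6 - i)) & 1) ? '1' : '0';
    }
    out[7] = '\0';
}
```

On an ASCII system `code_of('A')` is 65 and `to_bits7(65, buf)` fills `buf` with "1000001", the row shown for capital ‘A’.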
A preview of things to come:
For the first exam, MEMORIZE the numeric values for:
• NULL                      Value: 0
• BEL (Ring the Bell)       Value: 7
• BS (Backspace)            Value: 8
• CR (Carriage Return)      Value: 13
• ESC (Escape)              Value: 27
• SP (Space)                Value: 32
• The digits (0, 1, …, 9)   NOTE: The digit 0 (zero) has the value: 48
• The uppercase alphabet    NOTE: The character ‘A’ has the value: 65
• The lowercase alphabet    NOTE: The character ‘a’ has the value: 97
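Knowing where the digits and the two alphabets start pays off in code, because each group is numbered consecutively. A short sketch, assuming ASCII; the function names `digit_value` and `to_upper_ascii` are hypothetical.

```c
/* Numeric value of a digit character: '0' is 48, '1' is 49, ...
   so subtracting '0' leaves 0, 1, ... 9.
   Assumption: c is one of '0' .. '9'. */
int digit_value(char c)
{
    return c - '0';
}

/* Convert a lowercase letter to uppercase by subtracting the
   gap between 'a' (97) and 'A' (65). Other characters are
   returned unchanged. Assumption: ASCII character set. */
char to_upper_ascii(char c)
{
    if (c >= 'a' && c <= 'z') {
        return (char)(c - ('a' - 'A'));
    }
    return c;
}
```

The gap `'a' - 'A'` is 32 in ASCII, which is why the two alphabets differ in a single bit.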
Are we limited to only 128 (= 2^7) characters??
Yes and no:
• The STANDARD ASCII character set consists of 128 characters (as given in Addendum 1.1)
• There is an EXTENDED ASCII character set which uses ALL 8 bits (1 byte) available (parity is NOT an issue)
• The extended ASCII character set consists of 256 (= 2^8) characters (see Addendum 1.2)
• The majority of the characters included in the extended ASCII character set are extensions of the Latin (Roman) alphabet (e.g., ß, Ü, å) or ‘graphics’ characters (e.g., line- and box-drawing symbols)
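C reports the size of a byte and the range of an unsigned byte through the standard header `<limits.h>`. The sketch below just wraps those two constants; the function names are made up for illustration, and the values in the note hold on ordinary machines with 8-bit bytes.

```c
#include <limits.h>

/* Number of bits in one byte (one char) on this machine. */
int bits_in_byte(void)
{
    return CHAR_BIT;
}

/* Number of different values one unsigned byte can hold:
   0 .. UCHAR_MAX, which is UCHAR_MAX + 1 values in all. */
int distinct_byte_values(void)
{
    return UCHAR_MAX + 1;
}
```

With 8-bit bytes these return 8 and 256, matching the 256 characters of an 8-bit character set.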
What does the term ‘ASCII file’ mean??
An ASCII file assumes that every 8 bits (1 byte) in the file are to be read as one character according to the ASCII tables.
Aren’t ALL files ASCII files??
NO. As we will see later, not all data is stored according to ASCII formats.
That helps (sort of) to explain why, when we display non-ASCII files, we sometimes get strange-looking symbols instead of readable text.
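One rough way to guess whether data is standard ASCII is to check that no byte uses the eighth bit. This is a sketch under that assumption only (the name `is_ascii_data` is hypothetical); passing the test does not prove the data was meant as text.

```c
#include <stddef.h>

/* Return 1 if every byte is in the standard ASCII range 0..127,
   0 if any byte has its eighth (high) bit on. */
int is_ascii_data(const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i] > 127) {
            return 0;
        }
    }
    return 1;
}
```

The bytes of "Hello" pass the check, while a buffer containing the byte 0xC8 does not.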
Do ALL computers use ASCII to represent symbols???
NO, although most do.
IBM has its own coding scheme, with roots going back to the punched-card codes of the 1880s:
EBCDIC
Extended Binary Coded Decimal Interchange Code
EBCDIC is still used (?) in IBM mainframes and to store data on large reel-to-reel tape drives.
And so that’s it?? There is only ASCII and EBCDIC??
Well, … no.
It became obvious that even the extended ASCII and EBCDIC character sets were insufficient.
How so, Kemo Sabe??
Suppose you wanted to represent ALL the characters used by ALL the languages in the world.
How many are there????
I don’t know, how many??
I don’t know, either.
But it’s a lot!!!
Enter Unicode (1991):
If we were to use 16 bits, instead of 8, to represent characters we could represent:
2^16 = 65,536 characters
AHA!! So everyone is using Unicode now, right??
Well, … no.
Well, why not??
Life is not so simple …
There are a lot of problems still to be worked out:
• There is a lot of disagreement about what should be included
(Even though there are 65,536 combinations, you would be surprised at how quickly those combinations can be used up)
• The large number of characters in this set poses a severe problem for a font vendor
(No fonts, no characters)
• By doubling the number of bits (or bytes) per character, we are doubling the storage and processing requirements
• Result: It will take years to get this straightened out
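The storage cost is easy to put in numbers. This sketch compares the bytes needed for the same count of characters at 8 and at 16 bits each; the function names are made up, and the fixed-width types `uint8_t` and `uint16_t` from `<stdint.h>` are assumed to be available, as they are on ordinary machines.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to store n characters at 8 bits per character. */
size_t bytes_at_8_bits(size_t n_chars)
{
    return n_chars * sizeof(uint8_t);
}

/* Bytes needed to store n characters at 16 bits per character. */
size_t bytes_at_16_bits(size_t n_chars)
{
    return n_chars * sizeof(uint16_t);
}
```

A 1,000-character document takes 1,000 bytes at 8 bits per character and 2,000 bytes at 16.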
SO, what have we learned????
• What a bit is
• How a bit corresponds to computer architecture
• How combinations of bits can be used to store information
• How to calculate how much information a given number of bits yields
• How to calculate how many bits we need to store information
• What a byte is and why it is 8 bits
• What parity is and why it is/was necessary
• What ASCII is and why it was developed
• What EBCDIC is
• What Unicode is and why it was developed
• … And many other things in between …
Do I have to know this stuff??
Of course not!! I just like to waste my time and yours!!
So what do we need to do??
• Make sure you THOROUGHLY understand ALL of the concepts covered in these slides
• Answer ALL of the relevant questions on the Review Page
• Memorize the assigned ASCII codes
• Submit your References
• Submit your Question(s)
• Look at the Bits/Bytes/ASCII C/C++ Programming Assignment (it’s not due yet, but it can’t hurt to look at it)
??? Any Questions ??? (Please!!)