Dale & Lewis Chapter 3: Data Representation
Analog and digital information
• The real world is continuous and infinite, while data on computers are finite, so we need to approximate real-world data for our computational needs
• Analog data: information represented in a continuous form
• Digital data: information represented in a discrete form, broken up into separate elements
Noise in signals
Digitizing a signal
• Sample the signal at discrete points in time and round each sample to one of a set of discrete levels
• The resulting pieces are numbered
• The binary number system is used to represent the numbers
• n bits can represent 2^n numbers
• Q: how many bits are needed to represent m numbers?
• The actual number of bits that can be easily addressed in a computer sets some constraints
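The bit-count question above can be explored with a short sketch. It assumes the m numbers are the codes 0 to m-1 and that samples are rounded down into equal-width levels; the function names `bits_needed` and `quantize` are hypothetical and chosen only for illustration.

```python
def bits_needed(m: int) -> int:
    """Smallest n such that 2**n >= m, i.e. enough bits for m distinct numbers."""
    if m < 1:
        raise ValueError("m must be at least 1")
    # (m - 1).bit_length() equals ceil(log2(m)) for m >= 1, without float rounding.
    return (m - 1).bit_length()


def quantize(sample: float, lo: float, hi: float, n_bits: int) -> int:
    """Map a sample in [lo, hi] to one of 2**n_bits numbered levels (assumed equal width)."""
    levels = 2 ** n_bits
    index = int((sample - lo) / (hi - lo) * levels)
    # The top edge of the range would land on index == levels, so clamp it.
    return min(max(index, 0), levels - 1)


print(bits_needed(256))            # 8
print(bits_needed(257))            # 9
print(quantize(0.0, -1.0, 1.0, 3))  # 4
```

In practice the answer is usually rounded up again to a size the machine addresses easily, such as 8, 16, or 32 bits.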
Representing text
• English language character set: 26 letters (both upper and lower case), punctuation, numeric digits, etc.
• How many bits can we use?
• What about other languages?
ASCII character set
• American Standard Code for Information Interchange
• Each character is coded as a byte (8 bits)
• 7-bit code (1 check bit)
  − Later all 8 bits were used in the “extended character set”
  − 128 characters encoded (2^7): 95 visible characters and 33 invisible (control) characters
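The counts above can be checked with a small sketch. It relies on Python's built-in `str.isprintable`, which treats the space character as printable, so "visible" here includes the space.

```python
# All 7-bit codes: 2**7 = 128 values, 0 through 127.
codes = range(2 ** 7)

# Split them into printable ("visible") and control ("invisible") characters.
visible = [c for c in codes if chr(c).isprintable()]
control = [c for c in codes if not chr(c).isprintable()]

print(len(codes), len(visible), len(control))  # 128 95 33
```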
7-bit ASCII character set
ASCII Table
• The table above is sorted by decimal value
• These decimal values really represent binary sequences
• So the character J is in position 74
  − This is 01001010 in binary or 4A in hexadecimal
  − j, in position 106, is 01101010 in binary or 6A in hexadecimal
  − Notice anything? There is a purpose for that!
• The Unicode character set
  − Originally a 16-bit standard with 65,536 possible codes (the code space has since been extended beyond 16 bits)
  − Enough to cover the principal languages of the world
  − Superset of ASCII: the first 128 codes of Unicode are the same as ASCII, and the first 256 match the Latin-1 (ISO 8859-1) extended set
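The J and j values above can be reproduced with a few lines of Python. One thing worth noticing in the output is that the two codes differ in a single bit (value 32, hexadecimal 20), which is what makes case conversion a one-bit operation. The variable names are illustrative only.

```python
for ch in ("J", "j"):
    code = ord(ch)  # position in the ASCII table (and the Unicode code point)
    print(ch, code, format(code, "08b"), format(code, "02X"))

# XOR shows which bits differ between the upper- and lower-case codes.
diff = ord("J") ^ ord("j")
print(diff, format(diff, "08b"))  # 32 00100000

# Clearing that bit turns the lower-case letter into the upper-case one.
upper = chr(ord("j") & ~0x20)
print(upper)  # J
```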
Text compression
• Keyword encoding
  − Substitute frequently used words with single characters
  − e.g.: “as” → ^, “the” → ~, “and” → +, “that” → $, etc.
  − Problems:
    − These characters can’t be part of the text
    − Frequently used words tend to be short, so not much gain
    − Word variations are not handled, e.g. “The” vs. “the”
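A minimal sketch of keyword encoding, using the four substitutions listed above. The names `KEYWORDS`, `keyword_encode`, and `keyword_decode` are hypothetical, and matching whole words with a regular expression is an assumption about how the substitution would be applied.

```python
import re

# Substitution table taken from the examples above.
KEYWORDS = {"as": "^", "the": "~", "and": "+", "that": "$"}

# Match only whole words, so "theory" is not treated as "the" + "ory".
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b")


def keyword_encode(text: str) -> str:
    # First problem from the list: the symbols must not already occur in the text.
    if any(symbol in text for symbol in KEYWORDS.values()):
        raise ValueError("text already contains a substitution symbol")
    return _PATTERN.sub(lambda m: KEYWORDS[m.group(1)], text)


def keyword_decode(encoded: str) -> str:
    inverse = {symbol: word for word, symbol in KEYWORDS.items()}
    return "".join(inverse.get(ch, ch) for ch in encoded)


print(keyword_encode("the cat and the hat"))  # ~ cat + ~ hat
print(keyword_encode("The theory that"))      # The theory $
```

The second call illustrates the word-variation problem: the capitalized “The” is left untouched because only the lower-case form is in the table.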
Run-length encoding
• Replace a long series of a repeated character with a special short code
  − e.g.: replace “AAAAAAA” with *A7
  − In bits, this replaces 01000001 01000001 01000001 01000001 01000001 01000001 01000001 (7 bytes) with 00101010 01000001 00000111 (3 bytes)
  − Note that repetitions shorter than 4 characters are not worth encoding, since the code itself takes 3 bytes
  − Also note that the repetition number is encoded in binary, not ASCII, so that repetitions longer than 9 can be captured
• Used in limited-palette image compression and fax machines
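The scheme above can be sketched as follows, using `*` as the flag character and a single binary byte for the repetition count. The function names are hypothetical, and the sketch assumes the input never contains the flag byte and caps each run at 255 so the count fits in one byte.

```python
FLAG = ord("*")  # 00101010, the marker that starts an encoded run
MIN_RUN = 4      # shorter runs are copied as-is, since the code takes 3 bytes


def rle_encode(data: bytes) -> bytes:
    if FLAG in data:
        raise ValueError("input must not contain the flag byte")
    out = bytearray()
    i = 0
    while i < len(data):
        # Find the end of the run starting at i (at most 255 bytes long).
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        run = j - i
        if run >= MIN_RUN:
            out += bytes([FLAG, data[i], run])  # flag, character, binary count
        else:
            out += data[i:j]
        i = j
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == FLAG:
            out += bytes([data[i + 1]]) * data[i + 2]
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


print(rle_encode(b"AAAAAAA"))  # b'*A\x07'
print(rle_encode(b"AAAB"))     # b'AAAB'
```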
Huffman encoding
• Generalization of Morse code
• Morse code (dots & dashes) is based on the distribution of letters in general English usage
• Huffman encoding is based on the distribution of letters in a given message
• Algorithm:
  − Encoding:
    − Build a frequency table of letter usage
    − Build the code and encode the message
  − Decoding:
    − Huffman code has the prefix property
    − Prefix property: no code is the front part of another code
    − Decoding processes the bit stream until a match is found
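The encoding and decoding steps above can be sketched with Python's standard `heapq` and `collections.Counter`. This is one common way to build a Huffman code, not necessarily the construction the slides have in mind; the function names are hypothetical, and because ties between equal frequencies are broken arbitrarily, the exact bit patterns may differ from any published table.

```python
import heapq
from collections import Counter
from itertools import count


def build_codes(message: str) -> dict:
    """Build a prefix code from the letter frequencies of this message."""
    freq = Counter(message)  # frequency table of letter usage
    tie = count()            # tie-breaker so heap entries never compare nodes
    heap = [(w, next(tie), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:       # a one-symbol message still needs a 1-bit code
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Repeatedly merge the two least frequent nodes into a new subtree.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                        # leaf: a single character
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


def encode(message: str, codes: dict) -> str:
    return "".join(codes[ch] for ch in message)


def decode(bits: str, codes: dict) -> str:
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        # The prefix property guarantees the first match is the right one.
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


codes = build_codes("MISSISSIPPI")
bits = encode("MISSISSIPPI", codes)
print(codes, bits, decode(bits, codes))
```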
Example of Huffman encoding/decoding
• Message: DOORBELL
• Encoding: 1011110110111101001100100
• Compression ratio (vs ASCII): 25/64 = 0.39
• Decode:
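The decoding step can be sketched as follows. The transcript does not include the code table, so the `TABLE` below is an assumption: it is a prefix code chosen to be consistent with the 25-bit string above, and it lists only the letters needed for this message. The name `decode_bits` is likewise hypothetical.

```python
# ASSUMED code table (not given in the transcript); consistent with the encoding above.
TABLE = {"E": "01", "L": "100", "O": "110", "R": "111", "B": "1010", "D": "1011"}

ENCODED = "1011110110111101001100100"


def decode_bits(bits: str, table: dict) -> str:
    inverse = {code: letter for letter, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix property: the first match is a full code
            out.append(inverse[current])
            current = ""
    return "".join(out)


message = decode_bits(ENCODED, TABLE)
ratio = len(ENCODED) / (8 * len(message))  # encoded bits vs. 8 bits per character
print(message, round(ratio, 2))  # DOORBELL 0.39
```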