Chapter 3 Data Formats

Upload: svea

Post on 16-Mar-2016


Data Conversion

Types of Data

• Alphanumeric
• Numeric
• Image
• Sound
• Video
• Graphics and Fonts

All MUST be converted to binary form before they can be used by the computer.

Proprietary Formats

Used by individual programs
Unique: generally cannot be read by other programs
Example - a Microsoft Word document is different from a WordPerfect document even though they are both word processing documents

As more information is shared over the Web, new proprietary formats are less desirable.

Standards

Formats which are recognized by a large variety of hardware and software programs
Possible to use on different platforms
Possible for different systems to share data

How Are Standards Developed?

People use it because it is the most popular - a so-called de facto standard.
• Adobe’s PostScript

A committee is formed because a need for a standard is recognized.
• MPEG-2
• JPEG
• MP3

Characteristics of Standards

A well designed data standard should:
• simplify interconnections
• usefully reflect the ways the data is used
• be recognized by a wide variety of applications

Data Formats

Alphanumeric Character Data
• EBCDIC
• ASCII

Image data
• Bit map images
• Object images

Audio data
• .MOD
• .MIDI
• .WAV

Alphanumeric Characters: ASCII

American Standard Code for Information Interchange (pg. 67)

7-bit code => 128 characters
Includes Latin alphabet, Arabic numerals, standard English punctuation characters, and a few non-printable characters
Collating sequence - the numeric order of the codes determines how characters sort
Most personal computers use ASCII
Latin-1 ASCII - 8-bit code for Western European cultures
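A minimal sketch of the ideas above, using Python's built-in `ord` to look up character codes. The sample characters are arbitrary choices for illustration.

```python
# Look up the code of a few characters and show each one as 7 bits.
for ch in ("A", "a", "0", " "):
    code = ord(ch)                      # numeric character code
    print(repr(ch), code, format(code, "07b"))

# Every ASCII code fits in 7 bits, so there are 2**7 = 128 of them.
ascii_bits = {ch: format(ord(ch), "07b") for ch in "Aa0"}

# The collating sequence is the numeric order of the codes:
# digits sort before uppercase letters, which sort before lowercase.
ascii_order = sorted(["b", "B", "2"])
print(ascii_order)
```

Sorting plain strings in this way compares the character codes, which is why "B" sorts ahead of "b".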

Alphanumeric Characters: EBCDIC

Extended Binary Coded Decimal Interchange Code (pg. 68)

8-bit code => 256 characters
Developed by IBM
Use is generally restricted to IBM and IBM-compatible mainframe computers
Limited: no ~ [ ] ^ representations
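To see how the two codes differ, the sketch below encodes the same text both ways. It assumes the `cp037` codec that ships with Python as a stand-in for EBCDIC; the slides do not say which EBCDIC variant is meant.

```python
# "cp037" is one EBCDIC code page included with Python (an assumption here).
text = "aA1"
ebcdic_hex = text.encode("cp037").hex()
ascii_hex = text.encode("ascii").hex()
print("EBCDIC:", ebcdic_hex)
print("ASCII: ", ascii_hex)

# Sort the characters by their code in each scheme to compare
# the two collating sequences.
ebcdic_order = sorted(text, key=lambda ch: ch.encode("cp037"))
ascii_order = sorted(text, key=lambda ch: ch.encode("ascii"))
print(ebcdic_order, ascii_order)
```

In this code page lowercase letters come first and digits last, the reverse of ASCII, so sorted data can come out in a different order on the two kinds of system.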

Alphanumeric Characters: Unicode

Supports many additional character alphabets including Cyrillic, Greek, Hebrew, Arabic, Thai, Chinese, Japanese, Korean, etc.

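Originally a 16-bit standard => 65,536 characters (about 49,000 of which were defined at the time); later versions extend the code space beyond 16 bits

Code values 0-255 correspond to the ASCII Latin-1 codes => conversion from ASCII to Unicode is simple (just put eight 0 bits in front of the 8-bit code)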

Unicode is standard in Windows OS
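The Latin-1 to Unicode conversion can be checked with a short sketch. It assumes the 16-bit form is written big-endian (`utf-16-be`), so the eight 0 bits sit in front of the 8-bit code; the sample character is an arbitrary choice.

```python
ch = "\u00e9"                            # e with acute accent, Latin-1 code 233
latin1 = ch.encode("latin-1")            # one byte
utf16 = ch.encode("utf-16-be")           # two bytes, big-endian, no byte-order mark

latin1_bits = format(latin1[0], "08b")
utf16_bits = format(int.from_bytes(utf16, "big"), "016b")

print(latin1_bits)
print(utf16_bits)
```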

Keyboard Input

Keyboard Conversion

A binary scan code is generated by the keyboard circuitry for each keystroke

Computer software converts scan code to Unicode or ASCII or EBCDIC

NOTE: This permits different keyboards to be used for different natural languages

Characters are echoed to monitor
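The conversion step can be sketched as a table lookup. The layout tables and the `translate` helper below are hypothetical and cover only three key positions; real tables handle every key plus modifiers such as Shift.

```python
# Hypothetical layout tables: scan code -> character.
# 0x10, 0x1E and 0x2C are the common PC "set 1" scan codes for the
# key positions labelled Q, A and Z on a US keyboard.
US_QWERTY = {0x10: "q", 0x1E: "a", 0x2C: "z"}
FR_AZERTY = {0x10: "a", 0x1E: "q", 0x2C: "w"}

def translate(scan_codes, layout):
    """Convert a sequence of scan codes to text using one layout table."""
    return "".join(layout[code] for code in scan_codes)

keys_pressed = [0x10, 0x1E, 0x2C]
us_text = translate(keys_pressed, US_QWERTY)
fr_text = translate(keys_pressed, FR_AZERTY)
us_codes = [ord(ch) for ch in us_text]   # character codes passed on to programs

print(us_text, fr_text, us_codes)
```

The same physical keys produce different characters once the table is swapped, which is how one keyboard design can serve several languages.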

Image Data

Bit Map (Raster) Image - image is represented by a set of picture elements (pixels - points)

Object (Vector) Image - image is represented by a set of graphical shapes such as lines and curves

Bit Map Images

A pixel, a single point of the image, is stored as binary data

Each pixel contains intensity level and color - both of which may have large ranges of values (requiring perhaps up to 3 bytes/pixel)

A single image can require a large amount of data - 1.5 MB or more

Image processing requires large arrays of data - representing pixels and their locations
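A rough sketch of why bit map images grow so quickly. The helper `bitmap_bytes` and the 800x600 dimensions are illustrative assumptions, not figures from the slides.

```python
def bitmap_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of a bit map image in bytes."""
    return width * height * bytes_per_pixel

# A tiny 2x2 true-color image: rows of (red, green, blue) pixels, 1 byte each.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
tiny_size = bitmap_bytes(2, 2)

# An 800x600 image at 3 bytes/pixel.
photo_size = bitmap_bytes(800, 600)
photo_mb = photo_size / 1_000_000

print(tiny_size, photo_size, photo_mb)
```

The position of each pixel is implied by its row and column in the array, so only the color values need to be stored.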

Graphics Interchange Format (GIF)

GIF - one of the most common methods of storing bit map images
CompuServe proprietary format - 1987
GIF89a - latest standard, also supporting animated GIF images
GIF used extensively on the Web
8 bits/pixel - 256 colors (not good for details of paintings and photographs)
GIF is better suited for line drawings and simple images such as clip art, logos, and areas with solid colors
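The 8 bits/pixel limit comes from storing each pixel as an index into a color table. The sketch below shows only that indexing idea with a made-up three-color palette; a real GIF file also compresses the index stream, which is not shown here.

```python
# Color table: index -> (red, green, blue). GIF allows up to 256 entries.
palette = [(255, 255, 255), (0, 0, 0), (255, 0, 0)]

# A 4x2 image stored as 1-byte indexes into the palette.
indexed_pixels = bytes([0, 0, 1, 2,
                        0, 1, 1, 2])

# Look up each index to recover the full color of each pixel.
rgb_pixels = [palette[i] for i in indexed_pixels]

indexed_size = len(indexed_pixels)        # 1 byte per pixel
truecolor_size = 3 * len(rgb_pixels)      # 3 bytes per pixel

print(indexed_size, truecolor_size)
```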

GIF Screen Layout

GIF File Format Layout

Joint Photographic Experts Group (JPEG)

JPEG suited to photographs and paintings
24 bits per pixel - 16.7 million colors
Lossy compression - assumes some data can be lost without noticing
• subtle color changes will not show
• some clarity is lost in order to have a smaller file
Very small file sizes

Object Images

Image is composed of geometric shapes represented mathematically
Efficient
Flexible
Stored in compact form
Images can be easily moved, scaled, rotated
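A minimal sketch of an object image element. The `Circle` class is hypothetical; it only illustrates that moving or scaling a shape means changing a few numbers rather than rewriting pixels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Circle:
    x: float
    y: float
    radius: float

    def moved(self, dx, dy):
        """Return a copy shifted by (dx, dy)."""
        return Circle(self.x + dx, self.y + dy, self.radius)

    def scaled(self, factor):
        """Return a copy scaled about the origin."""
        return Circle(self.x * factor, self.y * factor, self.radius * factor)

c = Circle(10, 20, 5)
moved = c.moved(3, -2)
bigger = c.scaled(2)
print(moved, bigger)
```

Three numbers describe the circle at any size, which is why object images are compact and scale without losing detail.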

An Object Image from Textbook

PostScript

Page Description Language for storing, transmitting, displaying and printing object images.

Contains procedures and statements that describe each of the objects on a page.

Program is stored in ASCII or Unicode and thus is a text file.

Program in printer or computer interprets PostScript file and creates pages that can be printed or displayed.

Large set of PostScript functions for manipulating objects.
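Because a PostScript program is plain text, it can be produced by any program that writes text. The sketch below builds a small PostScript program for concentric circles; it is an illustration, not the textbook's program, and the helper name and coordinates are assumptions.

```python
def concentric_circles_ps(cx, cy, radii):
    """Return a small PostScript program as a text string."""
    lines = ["%!PS"]
    for r in radii:
        # PostScript operand order: x y radius start_angle end_angle arc
        lines.append(f"newpath {cx} {cy} {r} 0 360 arc stroke")
    lines.append("showpage")
    return "\n".join(lines) + "\n"

program = concentric_circles_ps(300, 400, [50, 100, 150])
ps_bytes = program.encode("ascii")      # the file is ordinary ASCII text
print(program)
```

Saving `ps_bytes` to a file and sending it to a PostScript interpreter should draw three circles centered at the same point.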

PostScript Program for Drawing Concentric Circles

PostScript Program for Drawing Pie Segments

Video Images

Video images require massive amounts of data
Video camera: 640x480 resolution, true color, 30 frames/second => 28 MB/sec of data => 1.6 GB for 1 minute of film
Solution to massive files:
• limit number of colors
• limit image size
• reduce frame rate
• compress data
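The data rate quoted above can be reproduced with a few multiplications, assuming true color means 3 bytes per pixel and counting 1 MB as 1,000,000 bytes.

```python
width, height = 640, 480
bytes_per_pixel = 3          # true color: 24 bits per pixel (assumed)
frames_per_second = 30

bytes_per_frame = width * height * bytes_per_pixel
bytes_per_second = bytes_per_frame * frames_per_second
bytes_per_minute = bytes_per_second * 60

mb_per_second = bytes_per_second / 1_000_000
gb_per_minute = bytes_per_minute / 1_000_000_000

print(round(mb_per_second, 1), "MB/sec")
print(round(gb_per_minute, 2), "GB/min")
```

This gives roughly 27.6 MB/sec and 1.66 GB per minute, in line with the rounded figures on the slide. Halving the frame rate or the bytes per pixel halves both numbers.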

Video Compression

MPEG - family of digital video compression standards

high compression achieved by storing only what changes from frame to frame

file sizes still large and take a long time to download

additional hardware support provided for displaying real-time video
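Storing only what changes between frames can be sketched with a simple pixel-by-pixel difference. This is a simplified stand-in, not the actual MPEG algorithm; the function names and the tiny one-row frames are assumptions.

```python
def frame_delta(previous, current):
    """List (index, new_value) for every pixel that changed."""
    return [(i, new)
            for i, (old, new) in enumerate(zip(previous, current))
            if old != new]

def apply_delta(previous, delta):
    """Rebuild the next frame from the previous frame plus the changes."""
    frame = list(previous)
    for i, value in delta:
        frame[i] = value
    return frame

frame1 = [0, 0, 0, 9, 9, 0, 0, 0]   # a bright object on a dark background
frame2 = [0, 0, 0, 0, 9, 9, 0, 0]   # the object has moved one pixel right
delta = frame_delta(frame1, frame2)
rebuilt = apply_delta(frame1, delta)
print(delta)
```

Only two of the eight pixels need to be stored for the second frame, because the rest are unchanged.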

Audio Data

Analog sound wave must be converted to digital form
Analog waveform is sampled at regular time intervals
Amplitude is measured and converted to the binary equivalent (A-to-D converter)
Sampling rate - how often the sound is digitized (e.g., 50,000 times/sec)
Higher rates - better quality and more storage space
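The sampling steps can be imitated in software. The sketch below samples a sine wave at regular intervals and converts each amplitude to an 8-bit value; the tone frequency, sampling rate and bit depth are illustrative choices, and a real A-to-D converter does this in hardware.

```python
import math

def sample_sine(freq_hz, sample_rate, n_samples, bits=8):
    """Sample a sine wave and quantize each amplitude to an unsigned integer."""
    max_level = 2 ** bits - 1
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                                # time of this sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)    # -1.0 .. 1.0
        level = round((amplitude + 1) / 2 * max_level)     # 0 .. max_level
        samples.append(level)
    return samples

samples = sample_sine(freq_hz=1000, sample_rate=8000, n_samples=8)
as_binary = [format(s, "08b") for s in samples]
print(samples)
print(as_binary)
```

Doubling `sample_rate` doubles the number of values stored per second, which is the quality versus storage trade-off noted above.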

Digitizing Audio Waveform

Audio Formats

MOD - store samples of sounds that are subsequently used to produce a new sound

MIDI - used to coordinate sounds and signals between computer and musical devices (such as keyboards)

MP3 - used for transmission and storage of high-quality audio signals. Popular for Web use. Low-cost, portable devices available for handling MP3 data.

WAV - simple Microsoft format supporting various sampling rates in mono or stereo
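As an illustration of the WAV format, the sketch below writes a short mono tone using Python's standard `wave` module. The sampling rate, duration and frequency are arbitrary choices, and the file is kept in memory rather than written to disk.

```python
import io
import math
import struct
import wave

sample_rate = 8000           # samples per second (illustrative choice)
duration_s = 0.5
freq_hz = 440.0

# 16-bit signed samples of a sine tone, little-endian
frames = b"".join(
    struct.pack("<h", int(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate)))
    for n in range(int(sample_rate * duration_s))
)

buffer = io.BytesIO()        # in-memory file; a real file path works the same way
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 2 bytes = 16 bits per sample
    wav.setframerate(sample_rate)
    wav.writeframes(frames)

wav_bytes = buffer.getvalue()
print(len(wav_bytes), wav_bytes[:4], wav_bytes[8:12])
```

Changing `setnchannels` to 2 or picking a different `sample_rate` gives the stereo and alternate-rate variants mentioned above.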

MP3

Popular file format for compressing and playing CD-quality music
About 12:1 compression with little perceptible loss in quality (the compression is lossy)
Has caused much controversy in the music world
Procedure: encode a song, distribute over the Internet, download to a PC, transfer to a $200 portable MP3 player or listen on computer
Over 80 MP3 sites and 20,000 songs and growing
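A rough calculation of what a 12:1 ratio means in practice. The CD audio parameters are the standard values (44,100 samples/sec, 16 bits, stereo) rather than figures from the slides, and the 3-minute song is hypothetical.

```python
# CD audio parameters (standard values, not taken from the slides)
sample_rate = 44_100         # samples per second per channel
bytes_per_sample = 2         # 16 bits
channels = 2                 # stereo

cd_bytes_per_second = sample_rate * bytes_per_sample * channels
song_seconds = 3 * 60        # a hypothetical 3-minute song

cd_song_mb = cd_bytes_per_second * song_seconds / 1_000_000
mp3_song_mb = cd_song_mb / 12                         # the 12:1 ratio quoted above
mp3_kbit_per_second = cd_bytes_per_second * 8 / 12 / 1000

print(round(cd_song_mb, 1), "MB uncompressed")
print(round(mp3_song_mb, 1), "MB as MP3")
print(round(mp3_kbit_per_second, 1), "kbit/sec")
```

Under these assumptions a song shrinks from about 32 MB to under 3 MB, which is what made distribution over the Internet practical.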

WAV Files

Some are found on your hard drive
Also can be downloaded from the Web
Create your own WAV file from an audio CD

Uses for Sound Files

• to signify an event on the system (e.g., a file opening)
• add sound to your web page

WAV Sound Format

Streaming Audio and Video

Streaming - data is downloaded continuously from web server or network server

Solution to large file size download time - start the download and play while more is being downloaded in the background

RealAudio and RealVideo are the most common players - over 125 million registered users

They deliver content for 85% of all streaming-enabled web sites

Also used for live broadcasts
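The idea of playing while the download continues can be imitated with a generator. This is a toy simulation with no real network or player involved; the function names and the `start_after` buffer size are assumptions.

```python
def stream_chunks(data, chunk_size):
    """Yield the data one chunk at a time, as a server would send it."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]

def play_while_downloading(chunks, start_after=2):
    """Buffer a few chunks, then 'play' each chunk as more arrive."""
    buffered, played = [], []
    for chunk in chunks:
        buffered.append(chunk)
        if len(buffered) >= start_after:
            played.append(buffered.pop(0))   # playback starts before download ends
    played.extend(buffered)                  # play what is left once download ends
    return played

media = bytes(range(10))
played = play_while_downloading(stream_chunks(media, chunk_size=3))
print(played)
```

Playback begins once two chunks have arrived, so the listener waits for a small buffer rather than for the whole file.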

Data Compression

Lossless Compression (GIF Files)
• no data is lost when decompressed
• compression algorithms attempt to eliminate redundancies in data, such as a string of 0-bits

Lossy Compression (JPEG)
• assume user can live with a certain amount of data degradation
• used in multimedia applications
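The two kinds of compression can be contrasted with a short sketch. Run-length encoding and value rounding are used here as simple stand-ins; they are not the algorithms GIF or JPEG actually use, and the function names are assumptions.

```python
def rle_encode(bits):
    """Lossless run-length encoding: [(symbol, run_length), ...]."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((b, 1))               # start a new run
    return runs

def rle_decode(runs):
    """Expand the runs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)

def quantize(values, step):
    """Lossy: round each value down to a multiple of step."""
    return [v - v % step for v in values]

bits = "0000000011100000"
runs = rle_encode(bits)
restored = rle_decode(runs)          # identical to the input

shades = [200, 201, 203, 202, 90]
coarse = quantize(shades, step=8)    # subtle differences are discarded

print(runs)
print(coarse)
```

The run-length version restores every bit exactly, while the quantized shades cannot be turned back into the originals; the repeated values that remain would then compress well with a lossless step.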