MIS 304 Winter 2006: Bits, Bytes, File Systems, Data Modeling and Databases

Upload: brandon-lucas

Post on 29-Jan-2016



Class Objectives:

• What a database is, what it does, and why database design is important

• How modern databases evolved from files and file systems

• About flaws in file system data management

• What a DBMS is, what it does, and how it fits into the database system

• Describe types of database systems and database models:

– Files

– Hierarchical

– Network

– Relational

– Object Oriented

– Tagged


DATA

• The basic element of data is the BIT.

– Represented by an ON/OFF or 0/1 state.

• Other ways of thinking about it:

– Represents a binary choice between events. (Shannon)

– A way to draw a distinction between two things. (G. Spencer-Brown)

• Bits can be combined to encode more complex choices.

• Signaling methods with bits preceded computer technology by several centuries.


BITS

• You can also think of the state of one or more bits as defining a probability that an event will occur.

• This led to what became known as Information Theory.

– Defined by Claude Shannon of Bell Labs in 1948, who used it to define how to code signals on a noisy phone line.

• The amount of “Information” can be expressed in “BITS” according to the formula:

H = n log s

n = the number of symbols selected

s = the number of symbols in the set

(The log is taken base 2 so that H comes out in bits; this form assumes every symbol in the set is equally likely.)
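The formula above can be tried out with a few lines of Python. This is only a sketch: the function name `information_bits` is made up for illustration, and it assumes a base-2 logarithm and equally likely symbols.

```python
import math


def information_bits(n: int, s: int) -> float:
    """Return H = n * log2(s): the bits needed to pick n symbols
    from a set of s equally likely symbols."""
    return n * math.log2(s)


# One binary choice (a coin flip) carries one bit.
print(information_bits(1, 2))

# Eight binary digits carry eight bits.
print(information_bits(8, 2))

# Three letters chosen from a 26-letter alphabet.
print(information_bits(3, 26))
```

Note that a larger symbol set (s) or a longer message (n) both increase H, which matches the intuition that more possible choices means more information per message.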


OT: Entropy

• This formula blew people’s minds because it reminded them so much of Boltzmann’s entropy formula from statistical physics.

S = K log W

S = Entropy, or the measure of “disorder in the system”

K = a constant (Boltzmann’s constant)

W = the number of microscopic states consistent with the observed state (often loosely described as the probability of that state)


Information and Entropy

• When we think of Information in the modern sense we think of it as a measure of how much “Order” we can see in a system.

• Entropy is the flip side, or how much “Disorder” there is in a system.

• Databases create order out of random data and so increase the amount of Information and reduce Entropy.


BYTES

• 7 bits can support up to 128 combinations.

– 0000000 through 1111111

– These 128 combinations can code the 26 upper case letters, 26 lower case letters, 10 digits, 32 symbols (+=!@#$%^&…), the space character, and 33 control codes (bell, cr, lf…)

• You can tack on 1 bit to create a “test” bit or “parity” bit. The resulting 8-bit byte is what most PCs use.

• The letters and numbers stored by the computer are made up of these bytes.

• East Asian character sets (such as Chinese) need far more combinations, so they require two or more bytes per character.
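The points above can be checked in Python. This is a sketch under stated assumptions: `even_parity_byte` is a hypothetical helper, it uses even parity (other schemes exist), and it puts the parity bit in the eighth (top) position.

```python
def even_parity_byte(ch: str) -> int:
    """Return the 7-bit ASCII code of ch with an even-parity bit
    added as the eighth (most significant) bit."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    # parity is 1 when the 7-bit code has an odd number of 1-bits,
    # so the full byte always has an even number of 1-bits.
    parity = bin(code).count("1") % 2
    return code | (parity << 7)


print(2 ** 7)                                 # combinations available with 7 bits
print(format(ord("A"), "07b"))                # 7-bit code for 'A'
print(format(even_parity_byte("A"), "08b"))   # 'A' already has an even number of 1-bits
print(format(even_parity_byte("C"), "08b"))   # 'C' needs the parity bit set

# One Chinese character does not fit in a single byte.
print(len("中".encode("utf-16-be")))           # bytes in UTF-16
print(len("中".encode("utf-8")))               # bytes in UTF-8
```

The last two lines show why "two bytes per character" is a lower bound rather than a fixed rule: the byte count depends on the encoding chosen.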


Early Data Management

• Almost immediately computer scientists began seeking ways to organize the data they were accumulating.

• How many computer programs require no “data”?


Introducing the Database

• Data versus Information

– Data constitute building blocks of information

– Information produced by processing data

– Information reveals meaning of data

– Good, timely, relevant information key to decision making

– Good decision making key to organizational survival


Database Management

• Database is a shared, integrated computer structure housing:

– End user data

– Metadata

• Database Management System (DBMS)

– Manages Database structure

– Controls access to data

– Can support a query language
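The difference between end user data and metadata can be seen in a small sketch using SQLite through Python's standard `sqlite3` module. The `employee` table and its columns are invented for illustration; other DBMS products expose their metadata through different catalogs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, hired TEXT)"
)
conn.execute("INSERT INTO employee (name, hired) VALUES ('Ada', '2006-01-09')")

# End user data: the facts the organization actually cares about.
end_user_rows = conn.execute("SELECT emp_id, name, hired FROM employee").fetchall()

# Metadata: "data about data" that the DBMS keeps on its own,
# here the table names and the column names and types.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
column_info = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")
]

print(end_user_rows)
print(table_names)
print(column_info)
```

The program never wrote the column list anywhere except in the `CREATE TABLE` statement; the DBMS stored that structure itself and can report it back on request.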


Importance of DBMS

• Makes data management more efficient and effective

• Query language allows quick answers to ad hoc queries

• Provides better access to more and better-managed data

• Promotes integrated view of organization’s operations

• Reduces the probability of inconsistent data
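As a rough illustration of what "quick answers to ad hoc queries" means, the sketch below uses SQLite through Python's `sqlite3` module. The `customer` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT, city TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO customer (name, city, balance) VALUES (?, ?, ?)",
    [("Ada", "Seattle", 120.0), ("Bob", "Tacoma", 0.0), ("Cy", "Seattle", 45.5)],
)


def customers_in(city: str) -> list:
    """Answer one ad hoc question: which customers are in this city?"""
    rows = conn.execute(
        "SELECT name FROM customer WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]


# A different question, asked without writing any new file-handling code:
# what is the total balance per city?
total_by_city = dict(
    conn.execute("SELECT city, SUM(balance) FROM customer GROUP BY city")
)

print(customers_in("Seattle"))
print(total_by_city)
```

Each new question only needs a new query string; the storage layout and the access code stay the same.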


DBMS Manages Interaction

Figure 1.2


Database Design

• Importance of Good Design

– Poor design results in unwanted data redundancy

– Poor design generates errors leading to bad decisions

• Practical Approach

– Focus on principles and concepts of database design

– Importance of logical design


Historical Roots of Database

• First applications focused on clerical tasks

• Requests for information quickly followed

• File systems developed to address needs

– Data organized according to expected use

– Data Processing (DP) specialists computerized manual file systems


File Terminology

• Data – Raw Facts

• Field

– Group of characters with specific meaning

• Record

– Logically connected fields that describe a person, place, or thing

• File

– Collection of related records
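The terms above map directly onto a small comma-separated file. The following sketch uses Python's standard `csv` module; the file contents and the field names are invented for illustration.

```python
import csv
import io

# A "file": a collection of related records, one per line.
FILE_TEXT = "C001,Ada Lovelace,Seattle\nC002,Bob Stone,Tacoma\n"

# The meaning of each "field" lives in the program, not in the file itself.
FIELD_NAMES = ["customer_id", "name", "city"]

# Each "record" is a set of logically connected fields describing one customer.
records = list(csv.DictReader(io.StringIO(FILE_TEXT), fieldnames=FIELD_NAMES))

for record in records:
    print(record["customer_id"], record["name"], record["city"])
```

Notice that the raw file holds only characters; nothing in it says which group of characters is the name or the city. That knowledge sits in `FIELD_NAMES`, inside the program.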


Simple File System

Figure 1.5


File System Critique

• File System Data Management

– Requires extensive programming in third-generation language (3GL)

– Time consuming

– Makes ad hoc queries impossible

– Leads to islands of information


File System Critique (cont’d.)

• Data Dependence

– Change in file’s data characteristics requires modification of data access programs

– Must tell program what to do and how

– Makes file systems cumbersome from programming and data management views

• Structural Dependence

– Change in file structure requires modification of related programs
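Structural dependence is easy to reproduce. In the sketch below, a hypothetical program reads the city from a customer record by position. When someone adds a phone field to the file, the old program silently returns the wrong value until it is rewritten.

```python
def read_city_v1(line: str) -> str:
    """Original program: assumes the layout id, name, city."""
    fields = line.split(",")
    return fields[2]


def read_city_v2(line: str) -> str:
    """Modified program: assumes the new layout id, name, phone, city."""
    fields = line.split(",")
    return fields[3]


old_record = "C001,Ada Lovelace,Seattle"
new_record = "C001,Ada Lovelace,555-0100,Seattle"   # a phone field was inserted

print(read_city_v1(old_record))   # correct for the old file structure
print(read_city_v1(new_record))   # wrong: returns the phone number
print(read_city_v2(new_record))   # correct only after the program is changed
```

Every program that touches the file has the layout baked in like this, so one structural change means hunting down and editing all of them.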


File System Critique (cont’d.)

• Field Definitions and Naming Conventions

– Flexible record definition anticipates reporting requirements

– Selection of proper field names important

– Attention to length of field names

– Use of unique record identifiers


File System Critique (cont’d.)

• Data Redundancy

– Different and conflicting versions of same data

– Results of uncontrolled data redundancy

• Data anomalies

– Modification

– Insertion

– Deletion

• Data inconsistency

– Lack of data integrity
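A modification anomaly can be sketched with two in-memory "files" that each keep their own copy of a customer's phone number. The file names, the records, and the helper functions are all hypothetical.

```python
# Two departments each keep the customer's phone number in their own file.
sales_file = [
    {"customer": "Ada", "phone": "555-0100", "amount": 120.0},
    {"customer": "Ada", "phone": "555-0100", "amount": 45.5},
]
billing_file = [{"customer": "Ada", "phone": "555-0100"}]


def update_phone(records, customer, new_phone):
    """Change the phone number in ONE file only."""
    for record in records:
        if record["customer"] == customer:
            record["phone"] = new_phone


def phones_on_file(customer):
    """Collect every phone number stored anywhere for this customer."""
    return {
        record["phone"]
        for records in (sales_file, billing_file)
        for record in records
        if record["customer"] == customer
    }


# Billing learns the new number; nobody tells Sales.
update_phone(billing_file, "Ada", "555-0199")

conflicting = phones_on_file("Ada")
print(conflicting)   # two different "facts" about the same customer
```

The same fact stored in three places had to be changed in three places. Missing any one of them leaves the data inconsistent, and nothing in the file system flags the conflict.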


Database Systems

• Database consists of logically related data stored in a single repository

• Provides advantages over file system management approach

– Eliminates inconsistency, data anomalies, data dependency, and structural dependency problems

– Stores data structures, relationships, and access paths


Database vs. File Systems

Figure 1.6


Database System Environment

Figure 1.7


Database System Types

• Single-user vs. Multiuser Database

– Desktop

– Workgroup

– Enterprise

• Centralized vs. Distributed

• Use

– Production or transactional

– Decision support or data warehouse


DBMS Functions

• Data dictionary management

• Data storage management

• Data transformation and presentation

• Security management

• Multiuser access control

• Backup and recovery management

• Data integrity management

• Database language and application programming interfaces

• Database communication interfaces


Database Models

• Collection of logical constructs used to represent data structure and relationships within the database

– Conceptual models: logical nature of data representation

– Implementation models: emphasis on how the data are represented in the database


Database Models (cont’d.)

• Relationships in Conceptual Models

– One-to-one (1:1)

– One-to-many (1:M)

– Many-to-many (M:N)

• Implementation Database Models

– Hierarchical

– Network

– Relational

– Object Oriented

– Tagged
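The three relationship types can be made concrete in one of the implementation models, the relational one. The sketch below uses SQLite through Python's `sqlite3` module; the professor, office, course, and student tables are invented for illustration, and this is only one common way to express each relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE professor (id INTEGER PRIMARY KEY, name TEXT);

-- 1:1  a professor has at most one office (UNIQUE foreign key)
CREATE TABLE office (
    id INTEGER PRIMARY KEY,
    professor_id INTEGER UNIQUE REFERENCES professor(id)
);

-- 1:M  one professor teaches many courses
CREATE TABLE course (
    id INTEGER PRIMARY KEY,
    title TEXT,
    professor_id INTEGER REFERENCES professor(id)
);

CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);

-- M:N  students take many courses, courses have many students
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(id),
    course_id INTEGER REFERENCES course(id),
    PRIMARY KEY (student_id, course_id)
);

INSERT INTO professor VALUES (1, 'Kim');
INSERT INTO office VALUES (10, 1);
INSERT INTO course VALUES (100, 'Databases', 1), (101, 'Networks', 1);
INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO enrollment VALUES (1, 100), (1, 101), (2, 100);
""")


def titles_for_student(name: str) -> list:
    """Follow the M:N relationship from a student to their courses."""
    rows = conn.execute(
        "SELECT c.title FROM course c "
        "JOIN enrollment e ON e.course_id = c.id "
        "JOIN student s ON s.id = e.student_id "
        "WHERE s.name = ? ORDER BY c.title",
        (name,),
    )
    return [title for (title,) in rows]


print(titles_for_student("Ana"))
print(titles_for_student("Ben"))
```

The M:N relationship needs its own `enrollment` table, while 1:1 and 1:M are expressed with a foreign key column (made `UNIQUE` in the 1:1 case).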


Hierarchical Database Model

• Logically represented by an upside down tree

– Each parent can have many children

– Each child has only one parent
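The parent and child rules above can be sketched as a small tree class. `Segment` and its methods are hypothetical names; real hierarchical DBMS products store and navigate their trees in their own ways.

```python
class Segment:
    """One node in the tree: exactly one parent (or none for the root),
    and any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Walk up through the single chain of parents to the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


root = Segment("Company")
sales = root.add_child("Sales")
support = root.add_child("Support")
employee = sales.add_child("Employee 17")

print(len(root.children))   # a parent can have many children
print(employee.path())      # each child is reached through exactly one parent
```

Because every child has a single parent, there is only one path from the root to any record, which is what makes the model simple and also what makes it rigid.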


Hierarchical Database Model

• Advantages

– Conceptual simplicity

– Database security and integrity

– Data independence

– Efficiency

• Disadvantages

– Complex implementation

– Difficult to manage and lack of standards

– Lacks structural independence

– Applications programming and use complexity

– Implementation limitations


Network Database Model

• Each record can have multiple parents

– Composed of sets

– Each set has owner record and member record

– Member may have several owners

Figure 1.10
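The owner and member idea can be sketched in a few lines. The class names `Record` and `SetOccurrence`, and the invoice example, are assumptions made for illustration and are not taken from any particular network DBMS.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.owners = {}   # set name -> the owner record in that set


class SetOccurrence:
    """A named set: one owner record and any number of member records."""

    def __init__(self, set_name, owner):
        self.set_name = set_name
        self.owner = owner
        self.members = []

    def connect(self, member):
        self.members.append(member)
        member.owners[self.set_name] = self.owner


invoice = Record("Invoice 1001")
product = Record("Product: widget")
line = Record("Invoice line 1")

# The same member record is connected into two different sets,
# so it ends up with two owners ("parents").
invoice_lines = SetOccurrence("INVOICE_LINES", owner=invoice)
product_lines = SetOccurrence("PRODUCT_LINES", owner=product)
invoice_lines.connect(line)
product_lines.connect(line)

owner_names = sorted(owner.name for owner in line.owners.values())
print(owner_names)
```

This is the key difference from the hierarchical model: the invoice line can be reached either from its invoice or from its product.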


Network Database Model

• Advantages

– Conceptual simplicity

– Handles more relationship types

– Data access flexibility

– Promotes database integrity

– Data independence

– Conformance to standards

• Disadvantages

– System complexity

– Lack of structural independence


Other Models

• Object Oriented

• Relational

• Tagged (XML, HTML)

• Associative

• We will talk about all these models in detail later in the class.


Conclusion

• Organizing and managing data is essential to running a modern organization.

• History has taught us a number of lessons about how to apply certain techniques and “models” to particular kinds of problems.

• Identifying the appropriate model for your particular problem and objective is key to successful implementation.