introduction to databases

50
1 Introduction to Databases Lecture One Relational and Object-Oriented Database Concepts Introduction to Database Design and Techniques

Upload: keith-lawson

Post on 30-Dec-2015

23 views

Category:

Documents


0 download

DESCRIPTION

Introduction to Databases. Lecture One Relational and Object-Oriented Database Concepts Introduction to Database Design and Techniques. Examples of Database Applications. Purchases from the supermarket Purchases using your credit card Booking a holiday at the travel agents - PowerPoint PPT Presentation

TRANSCRIPT

1

Introduction to Databases

Lecture One

Relational and Object-Oriented Database Concepts

Introduction to Database Design and Techniques

2

Examples of Database Applications

Purchases from the supermarket Purchases using your credit card Booking a holiday at the travel agents Using the local library Taking out insurance Using the Internet Studying at university

3

File-Based Systems

Collection of application programs that perform services for the end users (e.g. reports).

Each program defines and manages its own data.

4

Limitations of File-Based Approach

Separation and isolation of data» Each program maintains its own set of data.» Users of one program may be unaware of

potentially useful data held by other programs.

Duplication of data» Same data is held by different programs.» Wasted space and potentially different values

and/or different formats for the same item.

5

Database Approach

Arose because:» Definition of data was embedded in application

programs, rather than being stored separately and independently.

» No control over access and manipulation of data beyond that imposed by application programs.

Result: » the database and Database Management System

(DBMS).

6

Database

Shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.

System catalog (metadata) provides description of data to enable program–data independence.

Logically related data comprises entities, attributes, and relationships of an organization’s information.

7

Database Approach

Controlled access to database may include:» A security system.» An integrity system.» A concurrency control system.» A recovery control system.» A user-accessible catalog.

A view mechanism.» Provides users with only the data they want or need to

use.

8

Views

Allows each user to have his or her own view of the database.

A view is essentially some subset of the database.

9

Views

Benefits include:» Reduce complexity;» Provide a level of security;» Provide a mechanism to customize the

appearance of the database; » Present a consistent, unchanging picture of the

structure of the database, even if the underlying database is changed.

10

History of Database Systems

First-generation » Hierarchical and Network

Second generation» Relational

Third generation» Object-Oriented

11

The DBMS Marketplace

Relational DBMS companies – Oracle, Sybase – are among the largest software companies in the world.

IBM offers its relational DB2 system. With IMS, a non-relational system, IBM is by some accounts the largest DBMS vendor in the world.

Microsoft offers SQL-Server, plus Microsoft Access for the cheap DBMS on the desktop

12

Terminology

Database: persistent collection of data Database Management System (DBMS): software

that controls access to the database Database Administrator (DBA): person who controls

database Data Model: general structure of the data in the

database Data Language: commands used to define the data

model and give users access to the database

13

Utility of Databases

Data has value independent of use Organized approach to data management Eliminate redundancy in data Share data Archive data Security of data Integrity of data

14

DB Terms and Techniques

Database access is a key feature of current enterprise computing

Relational DB: tables To link/merge tables and extract/write information:

Structured Query Language (SQL) – language of all modern databases (but many dialects)

SQL is transparent; operates with statements like SELECT, INSERT, DELETE, etc.

SQL provides its result sets in table format

15

DB and the Internet

One vs. multiple user access

Internet browsers make it easy to access database programs (compared with traditional client/server programs)

16

Relational Database Model

Database» Database is a collection of tables (relations)» Data are stored in tables

Tables» Each table has a name » Each table has a set of columns (fields) and rows of data

(records)» Each table has a fixed number of columns » Each table has an arbitrary number of rows

Based on set theory SQL (Structured Query Language)

» DBMS independent language

17

Database Columns (Fields)

Columns » Each column has a name» Columns are accessed by name» No standard column ordering » Data in a column belongs to a particular domain

– Columns are the “attributes” of the dataset– Each value in a column is from the same domain – Each value in a column is of the same data type

18

Database Rows (Records)

Rows » Each row entry is either a simple value or empty ("null") » Rows are sets of values for the columns (attribute values) » Primary key: a set of columns that uniquely identifies each row» Each row must be unique given the primary key (no duplicates) » Rows are referenced by the primary key» Row order cannot be determined by the user» Does not make sense to say “the fourth row” like it does in a

“paper” table or spreadsheet

19

Data Types

Each row value is an instance of a primitive data type » Integer » Real (e.g., number, currency» Character (e.g., text, hyperlink, yes/no)» Date/Time

No complex types in standard DBMS (matrix, drawing) » MS Access will allow drawings and some objects» Object oriented databases may allow objects and structures

Non existent value is “null”

20

Database Design

Database design deals with how to design a database Importance of Good Design

» Poor design results in unwanted data redundancy» Poor design generates errors leading to bad decisions

Practical Approach» Focus on principles and concepts of database design» Importance of logical design

21

Database Design Goals

Create a balanced design which is good for all users Based on a set of assumptions about the world being

modeled Determine the data to be stored Determine the relations among the data Determine the operations to be performed Specify the structure of the tables

22

Database Design Process

1. Identify all the objects, entities, and attributes

2. Identify all the dependencies, draw a dependency diagram

3. Design tables to represent the data items and dependencies

4. Verify the design

5. Implement the database

6. Design the queries

7. Test and revise

23

Identify All Objects and Entities

Determine the objects of your Database For each object, describe each entity to be stored

» example: better to store first name and last name separately

Determine the data type for each item» text, currency, date, etc.

Determine the range of allowable values for each item » non-negative?» greater than zero?» decimal points?» any of the 50 state abbreviations» zip code between 00000 and 99999» phone number

24

Turn Data Items into Attributes

Each attribute should have:» a meaningful name» a description of what the attribute means or what

kind of data make up the attribute» a domain

– the data type of the attribute– the range or a list of allowable values of the attribute

25

Identify All the Dependencies

Assume a set of relationships between data items» a model of the world» may have to make assumptions» these assumptions should be listed clearly

Turn these relationships into dependencies» single-valued : there is one and only one value of ‘x’ for every value of ‘y’

– a person Y receives a grade X for a course in a semester– a person Y has a birth date X

» multi-valued : there are zero (or one) or more values of ‘x’ for every value of ‘y’– a student Y enrolls in one or more classes (X) each semester– a person Y has zero or more sisters

Draw a dependency diagram

26

Single-Valued (One-to-One) Dependencies

Draw a single-headed arrow for single-valued dependencies

PERSONBIRTHDATE

STUDENTFINAL

COURSE GRADE

a person has one and only one birth date

a student has one and only one final grade for a course

27

Multi-Valued (One-to-Many) Dependencies

Draw a double-headed arrow between multi-valued dependencies

STUDENTCLASSES

PERSON SISTERS

a student can enroll in one or more classes

a person has zero or more sisters

28

Independent vs. Dependent Attributes

Some attributes are independent» E.g., in a business – client relationship, your client’s phone

number does not depend on when you are scheduled to meet him

» your client still exists whether or not you have an appointment with him

Some attributes are dependent» the length of a side rails on a bridge is dependent on the

structure of the bridge» the side rails of a bridge would not exist if the bridge itself

was not there

29

Dependent vs. Independent Attribute Representation

Start a new bubble around an independent attribute» properties of that attribute are attached to the new bubble» properties that are dependent on other attributes are attached to the

old bubble» Each appointment is with one or more clients. Each appointment

with one or more clients has a time. Each client has a single phone number.

CLIENT PHONE NUMBERAPPOINTMENT

TIME

30

Design the Tables

Draw a dependency diagram Each dependency statement is a part of the

diagram Each statement is a single path through the

diagram Tables are formed by traversing the dependency

diagram

31

Traversing the Dependency Diagram

Choose an attribute at the end of a path Follow the chain of arrows upwards

» each multi-valued dependency on the path becomes a primary key for the table

» combine all single-valued attributes at first level up into a single table

» all attributes on the path should be included in the table» stop when you reach a bubble that has no arrows coming

into it» each path becomes a separate table

Mark off your traversed path Repeat until all paths have been traversed

32

Verify the design

Inspect your tables» are all of the data included?

Do you have too many tables? too few? If your design does not appear correct

» go back to step 1» you must repeat all steps of process in order» do not try to “rearrange” dependency diagram to

give you the tables you think you should have

33

Common Database Design Mistakes

Assuming the order of rows and columns is known» this is not a spreadsheet!» do not assume sorted order unless you explicitly sort

Guessing the design, not following the process Storing what you can compute (when the value will change) – transitive

dependency» e.g., do not store age if you are already storing birth date

Represent multi-valued dependencies in fixed size sets» if you know that there are exactly X number of something, create X single-

valued dependencies, otherwise use multi-valued dependency Adding a key when a unique value exists

» adding an ID number for each person when you are already storing their social security number

34

Results

If you follow the process correctly» you will not have redundant data» you will not lose unrelated data when you delete

values Databases with these characteristics are called

3NF (Third Normal Form) databases Normalization to be discussed later in the course

35

Goal: Build a Business Application

Tools:Database DesignSQL (queries)Programming

Des

ign

SQ

L

Pro

gram

Des

ign

SQ

L

Pro

gram

Best:Spend your timeon design and SQL.

Worst:Compensate for poor designand limited SQL with programming.

36

Application Development

Feasibility Identify scope, costs, and schedule

Implementation Transfer data, install, train, review

Development Create forms, reports, and help; test

Design Define tables, relationships, forms, reports

Analysis Gather information from users

tasks

time

37

DBMS Features/Components

Database engine» Storage» Retrieval» Update

Query Processor Data dictionary Utilities Security

Report writer Forms generator

(input screens) Application generator Communications Programming

Interface

38

DBMS Engine, Security, Utilities

DataTables

DatabaseEngine

ProductItemID Description887 Dog food946 Cat food

OrderOrderID ODate9874 3-3-979888 3-9-97

CustomerCustomerID Name1195 Jones2355 Rojas

ProductItemID Integer, UniqueDescription Text, 100 char

CustomerCustomerID Integer, UniqueName Text, 50 char

SecurityUser IdentificationAccess Rights

Utilities

Concurrency andLock Manager

Backup andRecovery

Administration

DataDictionary

39

Database Tables (MS Access)

40

Database Tables (Oracle)

41

DBMS Report Writer

All Data

Database EngineData Dictionary

Query Processor

Report Writer

ReportFormat

and Query

42

Report Writer (Oracle)

43

DBMS Input FormsAll Data

Database EngineData Dictionary

Query Processor

Form Builder

InputForm

Design

44

DBMS Components

All Data

Database EngineData Dictionary

Security

Query Processor

FormBuilder

ReportWriter

CommunicationNetwork

3GLConnector

Program

ApplicationGenerator

45

Relational Database

Customer(CustomerID, Name, …

Order(OrderID, CustomerID, OrderDate, …

ItemsOrdered(OrderID, ItemID, Quantity, …

Items(ItemID, Description, Price, …

46

Object-Oriented DBMS

CustomerCustomerIDName… Add CustomerDrop CustomerChange Address

OrderOrderIDCustomerID… NewOrderDeleteOrder…

OrderItemOrderIDItemID… OrderItemDropOrderItem…

ItemItemIDDescription… New ItemSell ItemBuy Item …

GovernmentCustomer

ContactNameContactPhoneDiscount, …

NewContact

CommercialCustomer

ContactNameContactPhone…

NewContact

47

Objects

Object Definition» Object Name» Properties» Methods

Inheritance» Combine into one

table.» Use multiple tables

and link by primary key.– More efficient.– Need to add rows

to many tables.

Customer

CustomerIDAddressPhone

AddCustomerDropCustomer

Class name

Properties

Methods

Commercial

ContactVolumeDiscount

ComputeDiscount

Government

ContactBalanceDue

BillLateFeesAddCustomer

Inheritance

48

Objects in a Relational Database

CustomerIDAddressPhone

Customer

CustomerIDContactVolumeDiscount

CommercialCustomer

CustomerIDContactBalanceDue

GovernmentCustomer

Separate inherited classes. Link by primary key. Adding a new customer

requires new rows in each table.

49

OO Difficulties: Methods

Database Object

CustomerMethod:

Add New CustomerApplication

CustomerNameAddressPhone

Personal Computer

Unix Server

IBM Server

Program code

Database Object

How can a method run on different computers?

Different processors use different code.

Possibility: Java

50

End of Lecture