©silberschatz, korth and sudarshan1.1 chapter 1: introduction purpose of database systems view of...

24
©Silberschatz, Korth and Sudarsha 1.1 Chapter 1: Introduction Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation Language Transaction Management Storage Management Database Administrator Database Users Database Management System Structure

Upload: mervin-burke

Post on 27-Dec-2015

222 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.1

Chapter 1: IntroductionChapter 1: Introduction

Purpose of Database Systems

View of Data

Data Models

Data Definition Language

Data Manipulation Language

Transaction Management

Storage Management

Database Administrator

Database Users

Database Management System Structure

Page 2: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.2

What is a Database?What is a Database?

According to the book: Collection of interrelated data

Set of programs to access the data

A DBMS contains information about a particular enterprise

DBMS provides an environment that is both convenient and efficient to use.

Another definition: A database is a collection of organized, interrelated data, typically

relating to a particular enterprise

A Database Management System (DBMS) is a set of programs for managing and accessing databases

Page 3: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.3

Some PopularSome PopularDatabase Management SystemsDatabase Management Systems

Commercial “off-the-shelf” (COTS): Oracle IBM DB2 (IBM) SQL Server (Microsoft) Sybase Informix (IBM) Access (Microsoft)

Open Source: MySQL PostgreSQL

Note: This is not a course on any particular DBMS!

Page 4: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.4

Some Database ApplicationsSome Database Applications

Databases touch all aspects of our lives: Banking – accounts, loans, customers Airlines - reservations, schedules Universities - registration, grades Sales - customers, products, purchases Manufacturing - production, inventory, orders, supply chain Human resources - employee records, salaries, tax deductions

Anywhere there is data, there could be a database.

Course context is an “enterprise” that has requirements for: Storage and management of 100’s of gigabytes or terabytes of data Support for 100’s or more of concurrent users and transactions Traditional supporting platform, e.g, Sun Enterprise server, 2GB RAM, 10TB

of disk space

Page 5: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.5

Purpose of Database SystemPurpose of Database System

In the early days, database applications were built on top of file systems – coded from the ground up. Sometimes this approach is still advocated.

Drawbacks of this approach: Data redundancy and inconsistency Multiple files and formats Difficulty accessing data

• A new program is required to carry out each new task Data integrity problems

• Integrity constraints (e.g. account balance > 0) become embedded throughout program code

• Hard to add new constraints or change existing ones Plus others…

Page 6: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.6

Purpose of Database Systems (Cont.)Purpose of Database Systems (Cont.)

Database systems offer solutions for the above problems.

Database systems also support: Atomicity of updates

Failures may leave database in an inconsistent state with partial updates

• E.g., transfer of funds from one account to another should either complete or not happen at all

Concurrent access by multiple users

Concurrent accessed is needed for performance

Uncontrolled concurrent accesses can lead to inconsistencies

• E.g. two people reading a balance and updating it at the same time

Data security

Recoding this functionality from scratch is not easy!

Page 7: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.7

Levels of AbstractionLevels of Abstraction

Physical level: defines low-level details about how a data item (e.g., customer) is stored on disk.

Logical level: describes data stored in a database, and the relationships among the data.

type customer = recordname : string;street : string;city : integer;

end;

View level: defines how information is presented to users. Views can also hide details of data types, and information (e.g., salary) for security purposes.

Page 8: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.8

View of DataView of Data

An “architecture” for a database system:

Page 9: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.9

View of Data, Cont.View of Data, Cont.

In general, the interfaces between the three levels should be defined so that changes in some parts do not seriously influence others.

Carefully defining the interfaces helps enhance Physical Data Independence, which is the ability to modify the physical schema without changing the logical schema.

Page 10: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.10

Instances vs. SchemasInstances vs. Schemas

Similar to types and variables in programming languages

Schema – defines the structure or design of a database Analogous to type information of a variable in a program

E.g., a database might consists of information about a set of customers and accounts and the relationship between them

More precisely:

Logical schema: database design at the logical level

Physical schema: database design at the physical level

Instance – a database and its’ contents at one point in time Analogous to a variable and its’ value

Page 11: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.11

What is a Data Model?What is a Data Model?

The phrase “data model” is used in a couple of different ways.

The phrase (use #1) is frequently used to refer to an overall approach or philosophy for database design and development.

For those individuals, groups and corporations that subscribe to a specific data model, that model permeates all aspects of database design, development, implementation, etc.

Current data models: Entity-Relationship model Relational model Object-oriented model Object-relational model Semi, and non-structured data models

Legacy models: Network Hierarchical

Page 12: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.12

What is a Data Model, Cont?What is a Data Model, Cont?

During the early phases of database design and development, a “data model” is frequently developed (use #2).

The purpose of developing the data model is to define: Data Relationships between data items Semantics of data items Constraints on data items

In other words, a data model defines the logical schema, i.e., the logical level of design of a database.

A data model is typically conveyed as one or more diagrams.

The type of diagrams used depends on the overall approach or philosophy (i.e., the data model, as defined in the first sense).

This early phase is referred to as data modeling.

Page 13: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.13

Entity-Relationship ModelEntity-Relationship Model

Example of an entity-relationship diagram:

Widely used for database modelling. An ER model is converted to tables in a relational

database.

Page 14: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.14

Relational ModelRelational Model

Example of tabular data in the relational model

From a data modeling perspective, which approach is preferable? The ER model, or the relational model?

Attributes

customer-namecustomer-id customer-street customer-city account-number

Johnson

Smith

Johnson

Jones

Smith

192-83-7465

019-28-3746

192-83-7465

321-12-3123

019-28-3746

Alma

North

Alma

Main

North

Palo Alto

Rye

Palo Alto

Harrison

Rye

A-101

A-215

A-201

A-217

A-201

Page 15: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.15

A Sample Relational DatabaseA Sample Relational Database

Page 16: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.16

Data Definition Language (DDL)Data Definition Language (DDL)

Notation and program for defining a (physical) database schema (see page 129 for a more complete example):

create table account ( account-number char(10), balance integer)

Given a DDL file, the DDL compiler generates a set of tables

A description of those tables is stored in a data dictionary: Contains information from the database schema Frequently referred to as metadata (i.e., data about data)

Data storage and definition language: Language in which the storage structure and access methods used by the

database system are specified Usually an extension of the data definition language

Page 17: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.17

Data Manipulation Language (DML)Data Manipulation Language (DML)

Language for accessing and manipulating the data DML is also known as query language

Two classes of DML languages: Procedural – user specifies what data is required and how to get

that data

Non-procedural – user specifies what data is required without specifying how to get that data

SQL is the most widely used query language Usually referred to as a non-procedural query language

Page 18: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.18

SQL ExamplesSQL Examples

Find the name of the customer with customer-id 192-83-7465:

select customer.customer-namefrom customerwhere customer.customer-id = ‘192-83-7465’

Find the balances of all accounts held by the customer with customer-id 192-83-7465:

select account.balancefrom depositor, accountwhere depositor.customer-id = ‘192-83-7465’ and

depositor.account-number = account.account-number

Databases are typically accessed by: Users through a command line interface Application programs that (generally) access them through:

• Language extensions to allow embedded SQL

• An application program interface (e.g. ODBC/JDBC) which allow SQL queries to be sent to a database

Page 19: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.19

Database UsersDatabase Users

Users are differentiated by the way they interact with the system

Naïve users – invoke application programs that have been written previously E.g. people accessing a database over the web, bank tellers, clerical staff,

ATM users

Application programmers – interact with the system by making DML calls through an API, e.g., ODBC or JDBC from within a computer program

Sophisticated users – form requests in a database query language, typically submitted at the command-line

Specialized users – write specialized database applications that do not fit into the traditional data processing framework

Page 20: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.20

Database Administrator (DBA)Database Administrator (DBA)

Coordinates all the activities of the database system; the DBA has a good understanding of the enterprise’s information resources and needs.

DBA duties include: Granting user authority to access the database Acting as liaison with users Installing and maintaining DBMS software Monitoring performance and performance tuning Backup and recovery

According to the book, the DBA is also responsible for: Schema definition Storage structure and access method definition Schema and physical organization modification Specifying integrity constraints Responding to changes in requirements

These latter tasks are typically performed by a DB designer.

Page 21: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.21

Transaction ManagementTransaction Management

A transaction is a collection of operations that performs a single logical function in a database application

The backup and recovery components of a DBMS ensure that the database remains in a consistent (correct) state despite failures: system, power, network failures

operating system crashes

transaction failures.

The concurrency-control manager in a DBMS controls the interaction among the concurrent transactions, to ensure the consistency of the database.

Page 22: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.22

Storage ManagementStorage Management

The storage manager in a DBMS provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

The storage manager is responsible to the following tasks: interaction with the file manager

efficient storing, retrieving and updating of data

Note that the DBMS may or may not make use of the facilities of the operating systems’ file management facilities.

Page 23: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.23

Overall System Structure Overall System Structure

Query Optimizer

Page 24: ©Silberschatz, Korth and Sudarshan1.1 Chapter 1: Introduction Purpose of Database Systems View of Data Data Models Data Definition Language Data Manipulation

©Silberschatz, Korth and Sudarshan1.24

Application ArchitecturesApplication Architectures

Architectures: Mainframe – client programs and DBMS reside on one platform.

Two Tier – client programs and DBMS reside on different platforms; clients connect to DBMS via an API such as ODBC/JDBC.

Three Tier – client programs, application server (or other “middleware”), and DBMS; clients connect to DBMS indirectly through the application server (also via an API). Typically used in web-based applications.

N Tier – recent generalization of 2 and 3 tier architectures.