Chapter 1: Introduction and Basic Concepts ([S] chp. 1)


©Silberschatz, Korth and Sudarshan, Database System Concepts


• Purpose of Database Systems

• View of Data

• Data Models

• Data Definition Language

• Data Manipulation Language

• Transaction Management

• Storage Management

• Database Administrator

• Database Users

• Overall System Structure


• A database represents some aspect of the real world, sometimes called the mini-world or the Universe of Discourse (UoD).

• A database is a logically coherent collection of data with some inherent meaning.

A random assortment of data cannot correctly be referred to as a database.

• A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.

What Is a Database?

• A very large, integrated collection of data.

• Models a real-world enterprise.

Entities (e.g., students, courses)

Relationships (e.g., Madonna is taking CS564)

• A Database Management System (DBMS) is a software package designed to store and manage databases.

Database Management System (DBMS)

Collection of interrelated data

Set of programs to access the data

DBMS provides an environment that is both convenient and efficient to use.

Database Applications:

• Banking: all transactions

• Airlines: reservations, schedules

• Universities: registration, grades

• Sales: customers, products, purchases

• Manufacturing: production, inventory, orders, supply chain

• Human resources: employee records, salaries, tax deductions

Databases touch all aspects of our lives

Purpose of Database Systems

• In the early days, database applications were built on top of file systems

• Drawbacks of using file systems to store data:

Data redundancy and inconsistency

Multiple file formats, duplication of information in different files

Difficulty in accessing data

Need to write a new program to carry out each new task

Data isolation — multiple files and formats

Integrity problems

Integrity constraints (e.g. account balance > 0) become part of program code

Hard to add new constraints or change existing ones
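As a contrast to constraints buried in program code, the following is a minimal sketch of the same rule (account balance > 0) declared once in the schema, so the DBMS enforces it for every program. It assumes Python's built-in sqlite3 module; the table layout, the sample balance and the helper try_update are illustrative and not from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint lives in the schema, not in each application program.
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.commit()


def try_update(new_balance):
    """Hypothetical helper: True if the DBMS accepts the update, else False."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = ? WHERE account_number = 'A-101'",
                (new_balance,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_update(300)   # satisfies balance > 0
rejected = try_update(-50)   # violates the CHECK constraint
final_balance = conn.execute("SELECT balance FROM account").fetchone()[0]
print(accepted, rejected, final_balance)
```

Changing the rule then means changing one schema declaration rather than every program that touches the account data.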

Purpose of Database Systems (Cont.)

• Drawbacks of using file systems (cont.)

Atomicity of updates

Failures may leave database in an inconsistent state with partial updates carried out

E.g. transfer of funds from one account to another should either complete or not happen at all

Concurrent access by multiple users

Concurrent access is needed for performance

Uncontrolled concurrent accesses can lead to inconsistencies

E.g. two people reading a balance and updating it at the same time

Security problems

• Database systems offer solutions to all the above problems

Why Use a DBMS?

• Separation of the Data definition and the Program

• Abstraction into a simple model

• Data independence and efficient access.

• Reduced application development time – ad-hoc queries

• Data integrity and security.

• Uniform data administration.

• Concurrent access, recovery from crashes.

• Support for multiple different views

Why Study Databases?

• Shift from computation to information

at the “low end”: scramble to webspace (a mess!)

at the “high end”: scientific applications

• Datasets increasing in diversity and volume.

Digital libraries, interactive video, Human Genome project, EOS project

... need for DBMS exploding

• DBMS encompasses most of CS

OS, languages, theory, “AI”, multimedia, logic

Levels of Abstraction

• Many views, single conceptual (logical) schema and physical schema.

Views describe how users see the data.

Conceptual schema defines logical structure. Sometimes we distinguish between the conceptual level and the logical level.

Physical schema describes the files and indexes used.

Schemas are defined using DDL (Data Definition Language); data is modified/queried using DML (Data Manipulation Language).

[Figure: View 1, View 2 and View 3 sit above the Conceptual Schema, which sits above the Physical Schema.]

Levels of Abstraction

• Physical level describes how a record (e.g., customer) is stored.

• Logical level: describes data stored in database, and the relationships among the data.

type customer = record
    name : string;
    street : string;
    city : string;
end;

• View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
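The view-level idea of hiding a column such as salary can be sketched as follows, assuming Python's built-in sqlite3 module. The employee table, its rows and the view name employee_public are invented for illustration. SQLite has no per-user authorization, so this only shows what the view exposes; in a DBMS with access control, users would be granted access to the view rather than to the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('Smith', 'Sales', 52000);
    INSERT INTO employee VALUES ('Jones', 'Audit', 61000);
    -- The view exposes only the non-sensitive columns.
    CREATE VIEW employee_public AS
        SELECT name, department FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_public ORDER BY name")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)  # salary is not among the columns
print(view_rows)
```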

Instances and Schemas

• Similar to types and variables in programming languages

• Schema – the logical structure of the database

(e.g., the database consists of information about a set of customers and accounts and the relationship between them)

Analogous to type information of a variable in a program

Physical schema: database design at the physical level

Logical schema: database design at the logical level

• Instance – the actual content of the database at a particular point in time

Analogous to the value of a variable
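A small sketch of the schema/instance distinction, assuming Python's built-in sqlite3 module and an invented two-column account table: inserting a row changes the instance but leaves the schema as it was. The helper names schema and instance are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")


def schema():
    """Column names of the table: the 'type' of the database."""
    return [row[1] for row in conn.execute("PRAGMA table_info(account)")]


def instance():
    """Current rows of the table: the 'value' at this point in time."""
    return conn.execute(
        "SELECT * FROM account ORDER BY account_number"
    ).fetchall()


schema_before, instance_before = schema(), instance()
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
schema_after, instance_after = schema(), instance()

print(schema_before == schema_after)    # the schema is unchanged
print(instance_before, instance_after)  # the instance has changed
```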


student

name     student number    class    major
smith    17                1        cosc
brown    8                 2        cosc

course

course name                  course number    credit hours    department
intro to computer science    cosc1310         4               cosc
data structures              cosc3320         4               cosc
dis                          math2410         3               math
database                     cosc3380         3               cosc

prerequisite

course number    prerequisite number
cosc3380         cosc3320
cosc3330         math2410
cosc3320         cosc1310

section

section identifier    course number    semester    year    instructor
85                    math2410         fall        86      king
92                    cosc1310         fall        86      anderson
102                   cosc3320         spring      87      knuth
112                   math2410         fall        87      chang
119                   cosc1310         fall        87      anderson
135                   cosc3380         fall        87      stone

grade_report

student number    section identifier    grade
17                112                   B
17                119                   C
8                 85                    A
8                 92                    A
8                 102                   B
8                 135                   A

Data Models

• A collection of modeling tools for describing

data

data relationships

data semantics

data constraints

• Entity-Relationship model

• Relational model

• Other models:

object-oriented model

semi-structured data models (XML)

Older models: network model and hierarchical model

Entity-Relationship Model

Example of schema in the entity-relationship model

Entity-Relationship Model (Cont.)

• E-R model of real world

Entities (objects)

E.g. customers, accounts, bank branch

Relationships between entities

E.g. Account A-101 is held by customer Johnson

Relationship set depositor associates customers with accounts

• Widely used for database design

Database design in E-R model usually converted to design in the relational model (coming up later) which is used for storage and processing

Relational Model

• Example of tabular data in the relational model

customer-name    customer-id    customer-street    customer-city    account-number
Johnson          192-83-7465    Alma               Palo Alto        A-101
Smith            019-28-3746    North              Rye              A-215
Johnson          192-83-7465    Alma               Palo Alto        A-201
Jones            321-12-3123    Main               Harrison         A-217
Smith            019-28-3746    North              Rye              A-201

(The column headers are the attributes.)

A Sample Relational Database

Physical (Storage) schema decisions

• Mapping of entities to files (OS files)

• Data representation and encoding (compression)

• Access methods (Direct, Hashing, Indexed)

• Which indexes to maintain

• Clustering of records

• OS/DBMS issues (buffer management)

External (View) schema decisions

• Which entities to present/filter

• Data representation and encoding (compression)

• Programming language dependent issues

• Changes to names, order of attributes

• Derived (computed) fields and joined tables




Data Independence

• Physical Data Independence – the ability to modify the physical schema without changing the application programs

Applications depend on the logical schema

DBA may change physical level (tuning) without affecting applications

The DBMS automatically makes the required adjustments, and application programs are not changed (queries may need to be recompiled and optimized…)

• Logical Data Independence – the ability to modify the logical schema without changing the application programs

Applications depend on the logical schema via the views

Can be supported on a limited basis only (if view is not affected)
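Physical data independence can be sketched as follows, assuming Python's built-in sqlite3 module. The index name customer_id_idx and the helpers run and plan are invented; the sample rows reuse names from the relational model slide. The "application" query text never changes; only the physical schema does, and the DBMS picks a different access path on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id TEXT, customer_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("192-83-7465", "Johnson"), ("019-28-3746", "Smith"),
     ("321-12-3123", "Jones")],
)

# The application's query: written once, never edited below.
QUERY = "SELECT customer_name FROM customer WHERE customer_id = ?"
PARAMS = ("192-83-7465",)


def run():
    return conn.execute(QUERY, PARAMS).fetchall()


def plan():
    """Text of SQLite's chosen access path (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    return " ".join(row[-1] for row in rows)


result_before, plan_before = run(), plan()
# Physical tuning step by the DBA: add an index.
conn.execute("CREATE INDEX customer_id_idx ON customer (customer_id)")
result_after, plan_after = run(), plan()

print(result_before == result_after)  # same answer for the application
print(plan_before)
print(plan_after)                     # now mentions customer_id_idx
```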

Data Definition Language (DDL)

• Specification notation for defining the database schema

E.g.

create table account ( account-number char(10), balance integer)

• DDL compiler generates a set of tables stored in a data dictionary

• Data dictionary contains metadata (i.e., data about data)

database schema

Data storage and definition language

language in which the storage structure and access methods used by the database system are specified

Usually an extension of the data definition language
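As a rough illustration of DDL feeding a data dictionary, the sketch below assumes Python's built-in sqlite3 module, where the catalog table sqlite_master plays the role of the data dictionary. The column is named account_number with an underscore, because a hyphenated name as written on the slide would need quoting in real SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL statement, adapted from the slide's create table example.
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# The metadata is itself stored in a table that can be queried.
catalog = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()
# Per-column metadata: (name, declared type).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")
]

print(catalog)
print(columns)
```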

Data Manipulation Language (DML)

• Language for accessing and manipulating the data organized by the appropriate data model

A declarative DML is also known as a query language

• Two classes of languages

Procedural – user specifies what data is required and how to get those data (DML)

Nonprocedural – user specifies what data is required without specifying how to get those data (Query language)

• SQL is the most widely used query language

SQL

• SQL: widely used non-procedural language

E.g. find the name of the customer with customer-id 192-83-7465

    select customer.customer-name
    from customer
    where customer.customer-id = '192-83-7465'

E.g. find the balances of all accounts held by the customer with customer-id 192-83-7465

    select account.balance
    from depositor, account
    where depositor.customer-id = '192-83-7465' and
          depositor.account-number = account.account-number

• Application programs generally access databases through one of

Language extensions to allow embedded SQL

Application program interface (e.g. ODBC/JDBC) which allows SQL queries to be sent to a database
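The two queries above can be tried through an application program interface. The sketch below assumes Python's built-in sqlite3 module standing in for ODBC/JDBC. Hyphens in the column names are replaced by underscores so the statements are valid SQL. The customer and depositor rows follow the relational model slide, while the balances are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id TEXT, customer_name TEXT);
    CREATE TABLE account (account_number TEXT, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
    INSERT INTO customer VALUES ('192-83-7465', 'Johnson');
    INSERT INTO customer VALUES ('019-28-3746', 'Smith');
    -- Balances below are made-up sample values.
    INSERT INTO account VALUES ('A-101', 500);
    INSERT INTO account VALUES ('A-201', 900);
    INSERT INTO account VALUES ('A-215', 700);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101');
    INSERT INTO depositor VALUES ('192-83-7465', 'A-201');
    INSERT INTO depositor VALUES ('019-28-3746', 'A-215');
""")

customer_id = ("192-83-7465",)

# Query 1: name of the customer with the given id.
names = [row[0] for row in conn.execute(
    "SELECT customer.customer_name FROM customer "
    "WHERE customer.customer_id = ?", customer_id)]

# Query 2: balances of all accounts held by that customer.
balances = sorted(row[0] for row in conn.execute(
    "SELECT account.balance FROM depositor, account "
    "WHERE depositor.customer_id = ? "
    "AND depositor.account_number = account.account_number", customer_id))

print(names, balances)
```

Neither query says how to find the rows (scan, index, join order); that choice is left to the DBMS, which is what makes SQL non-procedural.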

Database Users

• Users are differentiated by the way they expect to interact with the system

• Application programmers – interact with system through DML calls

• Sophisticated users – form requests in a database query language

• Specialized users – write specialized database applications that do not fit into the traditional data processing framework

• Naïve users – invoke one of the permanent application programs that have been written previously

E.g. people accessing database over the web, bank tellers, clerical staff

Database Administrator

• Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs.

• Database administrator's duties include:

Schema definition

Storage structure and access method definition

Schema and physical organization modification

Granting user authority to access the database

Specifying integrity constraints

Acting as liaison with users

Monitoring performance and responding to changes in requirements

Structure of a DBMS

• A typical DBMS has a layered architecture.

• The figure does not show the concurrency control and recovery components.

• This is one of several possible architectures; each system has its own variations.

[Figure: DBMS layers, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Figure note: these layers must consider concurrency control and recovery.]

Transfer money from account A to account B:

Begin Transaction

SUBTRACT 100 FROM A

ADD 100 TO B

End Transaction

Abort, Commit, Rollback

CRASH!
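The transfer can be sketched as an atomic transaction, assuming Python's built-in sqlite3 module. The starting balances and the crash_midway flag are invented; the "crash" is simulated with an exception raised between the two updates, which is only a stand-in for a real system failure and the recovery that follows it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()


def transfer(amount, crash_midway=False):
    """Move amount from A to B; either both updates happen or neither does."""
    try:
        with conn:  # begin transaction; commit on success, rollback on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'",
                (amount,),
            )
            if crash_midway:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'",
                (amount,),
            )
    except RuntimeError:
        pass  # the partial update has already been rolled back


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(100, crash_midway=True)
after_crash = balances()    # unchanged: the subtraction was undone
transfer(100)
after_commit = balances()   # both updates applied
print(after_crash, after_commit)
```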

Two transactions reserve a seat at the same time:

Transaction 1:

READ #SEATS

#SEATS = #SEATS - 1

WRITE #SEATS

Transaction 2:

READ #SEATS

#SEATS = #SEATS - 1

WRITE #SEATS

Solution: Two-phase locking
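The seat-reservation problem can be replayed in a few lines of Python. This is a simplified sketch, not a DBMS implementation: the first function hard-codes the bad interleaving (both transactions read before either writes), and the second uses a single mutex around the whole read-modify-write, which is the most trivial case of acquiring a lock before access and releasing it afterwards. The names SeatCounter, interleaved_without_locks and run_with_locks are invented.

```python
import threading


def interleaved_without_locks(seats):
    """Replay the interleaving in which one update is lost."""
    t1_local = seats      # Transaction 1: READ #SEATS
    t2_local = seats      # Transaction 2: READ #SEATS (same old value)
    t1_local -= 1         # Transaction 1: #SEATS = #SEATS - 1
    seats = t1_local      # Transaction 1: WRITE #SEATS
    t2_local -= 1         # Transaction 2: #SEATS = #SEATS - 1
    seats = t2_local      # Transaction 2: WRITE overwrites T1's update
    return seats


class SeatCounter:
    def __init__(self, seats):
        self.seats = seats
        self._lock = threading.Lock()

    def reserve(self):
        with self._lock:              # acquire before reading
            current = self.seats      # READ #SEATS
            self.seats = current - 1  # WRITE #SEATS, then release


def run_with_locks(seats, n_transactions):
    counter = SeatCounter(seats)
    threads = [threading.Thread(target=counter.reserve)
               for _ in range(n_transactions)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.seats


lost = interleaved_without_locks(10)  # two reservations, one decrement
safe = run_with_locks(10, 2)          # two reservations, two decrements
print(lost, safe)
```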

Overall System Structure

Storage Management

• Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

• The storage manager is responsible for the following tasks:

interaction with the file manager

efficient storing, retrieving and updating of data

Concurrency Control

• Concurrent execution of user programs is essential for good DBMS performance.

Because disk accesses are frequent, and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently.

• Interleaving actions of different user programs can lead to inconsistency: e.g., check is cleared while account balance is being computed.

• DBMS ensures such problems don’t arise: users can pretend they are using a single-user system.

Transaction Management

• A transaction is a collection of operations that performs a single logical function in a database application

• Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.

• Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.

Transaction: An Execution of a DB Program

• Key concept is transaction, which is an atomic sequence of database actions (reads/writes).

• Each transaction, executed completely, must leave the DB in a consistent state if DB is consistent when the transaction begins.

Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.

Beyond this, the DBMS does not really understand the semantics of the data. (e.g., it does not understand how the interest on a bank account is computed).

Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user’s responsibility!

Scheduling Concurrent Transactions

• DBMS ensures that execution of {T1, ... , Tn} is equivalent to some serial execution T1’ ... Tn’.

Before reading/writing an object, a transaction requests a lock on the object, and waits till the DBMS gives it the lock. All locks are released at the end of the transaction. (Strict 2PL locking protocol.)

Idea: If an action of Ti (say, writing X) affects Tj (which perhaps reads X), one of them, say Ti, will obtain the lock on X first and Tj is forced to wait until Ti completes; this effectively orders the transactions.

What if Tj already has a lock on Y and Ti later requests a lock on Y? (Deadlock!) Ti or Tj is aborted and restarted!
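The locking scenario above (Ti holds X, Tj holds Y, each then asks for the other's object) can be sketched with a toy lock table. This is an illustrative simplification, not how a real DBMS is built: it handles exclusive locks only, never actually blocks a thread, and detects deadlock by following a "waits for" chain. The class LockManager and its method names are invented.

```python
class LockManager:
    """Toy strict-2PL lock table: exclusive locks, no real blocking."""

    def __init__(self):
        self.owner = {}      # object -> transaction holding its lock
        self.waits_for = {}  # blocked transaction -> transaction it waits for

    def request(self, txn, obj):
        """Return 'granted', 'wait', or 'deadlock'."""
        holder = self.owner.get(obj)
        if holder is None or holder == txn:
            self.owner[obj] = txn
            return "granted"
        # Follow the waits-for chain from the holder; reaching txn is a cycle.
        current = holder
        while current in self.waits_for:
            current = self.waits_for[current]
            if current == txn:
                return "deadlock"
        self.waits_for[txn] = holder
        return "wait"

    def release_all(self, txn):
        """Strict 2PL: locks are released only when the transaction ends."""
        for obj in [o for o, t in self.owner.items() if t == txn]:
            del self.owner[obj]
        self.waits_for.pop(txn, None)
        for waiter in [w for w, t in self.waits_for.items() if t == txn]:
            del self.waits_for[waiter]


manager = LockManager()
log = [
    manager.request("Ti", "X"),  # Ti locks X
    manager.request("Tj", "Y"),  # Tj locks Y
    manager.request("Tj", "X"),  # Tj must wait for Ti
    manager.request("Ti", "Y"),  # Ti would wait for Tj: cycle
]
manager.release_all("Ti")             # abort Ti, releasing its locks
retry = manager.request("Tj", "X")    # Tj can now proceed
print(log, retry)
```

Aborting one of the two transactions breaks the cycle, which matches the slide's remedy of aborting and restarting Ti or Tj.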

The importance of the Data Dictionary

• Contains all definitions: DDL (logical schema), Views definition, Physical schema definitions including Indexing and clustering information, Integrity constraints, security rules, stored procedures (SQL)

• Essential for query parsing and optimization

• Contains other important documentation and programs (regulations, standards, codes, etc.)

• There are companies who sell Data Dictionary tools as a separate product!

• Logical Design and Data-Dictionary Tools

• Loading

• Physical Design and File reorganization

• Backup / Restore / Recovery

• Performance Monitoring and Tuning

Application Architectures

Two-tier architecture: E.g. client programs using ODBC/JDBC to communicate with a database

Three-tier architecture: E.g. web-based applications, and applications built using “middleware”

• Hierarchical – pre-historic – IMS

• Network – historic – IDMS, ADABAS; led to object-oriented

• Relational – current – 95% of the market – Oracle, Informix, SQL Server, Progress, IBM DB2, etc.

• Object-oriented – current – a lot of hoo-ha but a very narrow market, mainly CAD and engineering – Objectivity, Versant, Jasmine

• Object-relational – current / future – SQL3, Informix UDO, Oracle 9, IBM DB2.

• XML – not much commercial success as a database, in spite of much research

• Cloud and NoSQL databases

PRE-1960s

1945 - Magnetic tapes developed (the first medium to allow searching).

1957 - First commercial computer installed.

1959 - McGee proposed the notion of generalized access to electronically stored data.

THE 60s

1961 - The first generalized DBMS, GE's Integrated Data Store (IDS), designed by Bachman.

THE 70s - Database technology experienced rapid growth.

1970 - The relational model is developed by Ted Codd, an IBM research fellow.

1971 - CODASYL Database Task Group Report.

1975 - ACM Special Interest Group on Management of Data organized the first SIGMOD international conference.

1976 - Entity-relationship (ER) model introduced by Chen.

THE 80s - DBMSs developed for personal computers (DBASE, PARADOX, etc.).

1983 - ANSI/SPARC survey revealed that more than 100 relational systems had been implemented by the beginning of the 80s.

1985 - Preliminary SQL standard published. Business world influenced by “Fourth Generation Languages”.

• Trends in the ’80s: extendable database systems, object-oriented DBMSs, client-server architecture for distributed databases.

The ’90s

• Demand for extending DBMS capabilities to meet new applications.

• Emergence of commercial object-oriented DBMSs.

• Demand for exploiting massively parallel processors (MPPs).

• Total victory by the relational model

• SQL 3

• Object-relational systems.

The ’00s

• The emergence of XML and the integration of XML and relational databases

• Web databases, search engines, Semantic Web

• Cloud and NoSQL databases

Databases make these folks happy ...

• End users and DBMS vendors

• DB application programmers

E.g. smart webmasters

• Database administrator (DBA)

Designs logical/physical schemas

Handles security and authorization

Data availability, crash recovery

Database tuning as needs evolve

Must understand how a DBMS works!

Summary

• A DBMS is used to maintain and query large datasets.

• Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.

• Levels of abstraction give data independence.

• A DBMS typically has a layered architecture.

• DBAs hold responsible jobs and are well-paid!

• DBMS R&D is one of the broadest, most exciting areas in CS.

• Advanced databases course at the graduate level