1 database technologies yelena yesha. 2 contents zdatabase overview zdbms zrelational data model...
TRANSCRIPT
1
Database Technologies
Yelena Yesha
2
Contents
Database OverviewDBMSRelational Data ModelSQLWeb-Database Connectivity
3
What is a Database
A system that stores data “persistent” – Exists beyond the
immediate useCentralized storage Single or multiple users
4
Advantages
Reduces redundancy Reduces inconsistency SharedData representation standards can be
enforcedEnables security restrictions Integrity maintained
Valid cross references between records
Allows data-independent applications Applications ignorant of how data is stored
5
DBMS
Database Management System Examples
Oracle DB2 Microsoft SQL Server Sybase
DBMS Users End users Programmers Database administrations
6
DBMS Features
Optimize Queries Manage memory Control concurrent data access
7
Advanced DBMS Features
Integrity maintenance Entity Integrity ( foreign key references) Referential Integrity (foreign key
references)Cascading deletes Set Null or default No action (command is rejected)
Access ControlGrant & Revoke
• Access to tables • Capability to perform operations (Insert, update,
delete, etc.)
8
Data Independence and Access Efficiency
DBMS allows to avoid rewriting all access routines every time the data format changes or data is added/modified/deleted insulate applications from data storage details. Logical independence: protection from changes
in logical structure of data Physical independence: protection from
changes in physical structure of data DBMS maintains data structures and
implements algorithms allowing to avoid linear search indexing: search in O(log n) fast access even on complex data queries
9
Concurrency Control and Data Integrity
DBMS insures semantically correct access to the same data by concurrent applications two programs accessing the same data at
the same time can result in an inconsistent update
implement sharing in a controlled manner Data semantics may require certain
constraints to be satisfied DBMS guarantees that application
programs comply with the constraints when adding/modifying the data
10
Reliability and Security
DBMS provides techniques for recovery from software and hardware failures guarantee survival of the data across
catastrophes.DBMS prevents unauthorized users
from accessing/modifying data or denying service to other users.
11
Data Distribution and Heterogeneity
Centralization is the enemy of scalability a vast number of modern applications are
distributedData sharing in a distributed environment
is a challengeHeterogeneity applies to networks,
hardware, operating systems, programming languages, data formats, etc.
Distributed applications must mask the differences
Need distributed data management
12
Categories of Data Models
High-level or conceptual entities, attributes, relationships
Representational or implementation or logical relational, network hierarchical, object-
oriented, object-relationalPhysical or low-level
data storage
13
3-schema Architecture
Physical level description of a database: how things are stored on disk:
files, record structures, indices, data structures for disk blocks, methodology for dealing with too long records, etc.
Conceptual level description of a database The description of application data (its
schema) using one of the traditional data models.
14
3-Schema Architecture (cont'd)
View-level description of a database What users of a particular application see
their own customized schema, e.g., for payroll, for the ticket agent, for a simulation program.
Multiple levels helps with data independence; helps with maintenance.
Many views, single logical and physical schema.
Levels of abstraction give data independence.
15
The Entity-Relational Model
Entity: a distinguishable object.Entity set: a set of entities all of the
same type.Attribute: a single property of an entity;
simple vs composite; single-valued vs multi-valued; stored vs derived; null values.
Domain: set of values permitted for that attribute.
16
The E-R Model (cont’d)
Relationship: an association between two or more entities.
Relationship set: a set of relationships all of the same type
There is no correct schema for a batch of data. Which schema is best depends on the application.
Many basic data modelling choices depend on an understanding of the application.
17
Data Model
Data model: notation for describing data, plus a set of operations used to manipulate that data. a set of primitives for defining the structure of a
DB; a set of operations for specifying the retrievals
and updates on a DB; relational, hierarchical, network, object-oriented.
18
The Relational Model (Codd 1970)
The relational data model is the most important data model currently existing.
Value-oriented, i.e., allows operations on relations whose results are relations, thus enables to combine operations. As opposed to object-oriented models, in which
Operations cannot be applied to the result of other operations;
The result of an operation may be a new data type, and operations may not be available for this type.
19
Domain and Relation
A domain is a set of atomic values.A relation is a finite subset of the Cartesian
product of a finite list of domains; relation is a set of tuples; order of tuples is irrelevant and no relation has 2 identical tuples; each tuple value is atomic
no composite attributes;no multi-valued attributes.
20
Relational Model (cont’d)
Everything is represented by relations– Formally: Given sets D1, D2, ....Dn (not necessarily
distinct), a relation R D1 X D2 X ...X Dn
– Di 's are the domains and n is the arity (degree) of R– elements of R are called tuples– number of tuples in R is the cardinality of R
relational data model helps to view a relation as a table
Observe the following properties:
21
Relational Model (cont’d)
Everything is represented by relations; Given sets D1, D2, ....Dn (not necessarily distinct), a relation R
D1 X D2 X ...X Dn; Di 's are the domains and n is the arity (degree) of R; elements of R are called tuples; number of tuples in R is the cardinality of R.
Relational data model helps to view a relation as a table: each row represents a tuple (record); each column represents an attribute (field).
Properties: no two rows are identical; the ordering of tuples is unimportant; the ordering of columns is important.
22
E-R to Relations (i.e., defining relations)
Done using DDL (Data Definition Language)
Name whole database schemaDeclare domains for attributesDefine relations:
name attribute names and domains primary and other keys foreign keys
23
Translating from E-R
Represent entity set E by a relation whose attributes are all the E-R attributes of E. Then each tuple represents one entity of E.
To represent relation R between entity sets E1, …, Ek, create relation R with key attributes of E1, …, key attributes of Ek, as attributes (rename duplicates). Each tuple of the relation represents one combination of entities that are related to one another.
You might have some redundant relations, which you can delete.
24
Schema Normalization
Formal theory of database designbased on grouping attributes in a
particular way using attribute dependencies to achieve ‘good’ schemas
1NF, 2NF, 3NF, BCNF, 4NF, …Goal:
don’t store redundant information can represent everything (otherwise, the
schema is useless!)
25
Query and Update Languages
DDL : data definition language used by DBA to define schemas, create views, create
indicesDML : data manipulation language
used by sophisticated casual user to query data or update data
26
Relational Query Languages
Query languages allow manipulation and retrieval of data from a database.
Relational model supports simple, powerful query languages; strong formal foundation based on logic; allows for optimization.
Two mathematical languages form the basis for relational languages (e.g., SQL) and for implementation: Relational Algebra: More operational, useful for representing
execution plans; Relational Calculus: Lets users describe what they want, rather
than how to compute it (non-operational, declarative). Basic operations:
selection, projection, cross-product, set-difference, union, intersection, join, division
27
SQL
Originally “ Structured Query Language” Common language for interacting with relational
databases Syntax
SELECT column_nameFROM table_nameWHERE where_clause
SQL is now the query language for IBM's DB2 and the de-facto standard on most commercial RDBMS.
SQL is a comprehensive language providing statements for data definition, query and update.
28
Using SQL
Interactively Via a command line program
Embedded SQL Embedded complete SQL statement in code Precompiler modifies the source code
Dynamic SQL Embedded in scripts or other programs Dynamically interpreted Variables bound at run time
29
SQL Schema
EMP(Name,SSN,DNO,BirthPlace) DEPT(DName,DNO,MGRSSN) PROJECT(PName,PNO,PLocation,DNum) WORKSON(ESSN,PNO,Hours)
CREATE SCHEMA 'COMPANY';
CREATE TABLE EMP( EName name_dom NOT NULL, SSN CHAR(9) NOT NULL, DNO INTEGER NOT NULL, BirthPlace city_dom, PRIMARY KEY(SSN), FOREIGN KEY (DNO) REFERENCES DEPT (DNO));
30
Data Definition Language
DDL is used to define the (schema of) database to create a database schema; to create a domain; to create, drop. alter a table; to create, remove an index [defunct in SQL2]; to create or drop a view; to define integrity constraints; to define access privileges to users (Oracle: CONNECT,
RESOURCE, DBA); to GRANT or REVOKE privileges ON/TO object/user
SQL2 supports multiple schemas CREATE SCHEMA name AUTHORIZATION user;– CREATE SCHEMA EMPLOYEE AUTHORIZATION yesha;
31
Web Server to Database Connectivity
The CGI-Based Approach:
Web Browser
WebServer
CGIHTTP DB Protocol
DatabaseServer
- Easy to implement- HTTP server dependent
- Lack of user access control
- Lack of presentation graphics
32
Web Server to Database Connectivity (cont’d)
The Java-Based Approach:
Applet
Gateway Server
WebServer
Web BrowserHTTP
TCP/IP DB protocolDatabase
Server
- Higher development complexity- True session based database access- User access control through database server
- Presentation graphics
33
Web-Database Connectivity : Middleware
Overview of the Web database connectivity, and focus on the middleware of the Web database Trinity – Three tires of Web Database :
use to build a Web databaseWeb ServerWeb application server (middleware)Database(see picture)
34
How a user interacts with a Web Database
In a Web browser, a user submits a request to the Web server.
The Web server passes it onto the middleware
The middleware writes the request in SQL queries and sends it to a back-end database.
The data retrieved are handed back to the middleware
The middleware generates a Web page for the data
The Web server sends the Web page to the browser
The browser displays the Web page in front of the user