Database: Introduction

Database Systems
  Important topic in the study of information systems
    Store/track items: scalar data (names, dates, …), pictures, audio, video
    Early applications: internal uses
    Internet: publishing database content to users
  Database design and development
    Understand, gather and organize user requirements
    Transform the designs into physical databases (applications)

What is a Database System?
  An electronic filing cabinet
  A collection of data which is:
    Integrated: collect/unify data from a number of different sources
    Shared: access by more than one person/application
  Advantages: data is shared, redundancy can be reduced, and integrity maintained

[Diagram: database design stages and the model each one produces]
  Description of system requirements
  Data analysis -> ER data model
    Look for entities, relationships and attributes
  Logical design -> Logical data model
    Convert ER model (with cardinalities) into table definitions
  Physical design -> Physical database
    Choose appropriate column types and constraints

Elements of Entity-Relationship Model
  Entities, attributes, identifiers, and relationships
  Entity:
    Thing or object, whether real or imagined, about which information needs to be known/tracked
    Example: a customer called Peter Lam is an entity
  Entity can be
    Weak entity: existence-dependent on other entities (parent and child)
    Strong entity: non-dependent on other entities

Elements of Entity-Relationship Model: Entity
  Entity class/type:
    A collection of entities that share common properties or characteristics
    Written in UPPER CASE (noun)
  Entity instance:
    Representation of a particular entity
    Many instances in an entity class
  CUSTOMER entity class contains: CustID, FirstName, LastName, phoneNo
  Two instances:
    1  Peter  Lam   2134
    2  Jenny  Chan  1234

Elements of Entity-Relationship Model: Attributes
  Attributes:
    Properties that describe the entity's characteristics
  CustID  FirstName  LastName  phoneNo
  1       Peter      Lam       2134
  2       Jenny      Chan      1234
  Each row is an entity instance; each cell in the phoneNo column is a value of the attribute "phoneNo"

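The mapping from entity class, instance and attribute to table, row and column can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for a database server, and the column types are assumptions that the slides do not specify.

```python
import sqlite3

# In-memory SQLite database (a stand-in for MySQL, chosen so the sketch runs anywhere).
conn = sqlite3.connect(":memory:")

# The CUSTOMER entity class becomes a table; its attributes become columns.
# Column types are assumed for illustration.
conn.execute(
    "CREATE TABLE CUSTOMER ("
    " CustID INTEGER PRIMARY KEY,"
    " FirstName TEXT, LastName TEXT, phoneNo TEXT)"
)

# Each entity instance becomes one row.
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?, ?)",
    [(1, "Peter", "Lam", "2134"), (2, "Jenny", "Chan", "1234")],
)

# One entity instance, and one attribute value of that instance.
instance = conn.execute("SELECT * FROM CUSTOMER WHERE CustID = 1").fetchone()
phone = conn.execute("SELECT phoneNo FROM CUSTOMER WHERE CustID = 1").fetchone()[0]
print(instance, phone)
```
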
Elements of Entity-Relationship Model: Relationships
  Describes how one or more entities are related to each other
  Entity: noun, relationship: verb
  ER Diagram
    Binary relationships (degree 2):
      CUSTOMER places ORDER
      STUDENT attends SUBJECT
    Degree 3 relationships:
      GRADE, EXAM and CONTASS linked by "determines"

Types of Binary Relationship
  One-to-one (1:1)
  One-to-many (1:N)
  Many-to-many (N:M)
  Examples:
    1:1  PROJECT is_assigned_to STUDENT
    1:N  GROUP PROJECT is_assigned_to STUDENT
    N:M  SUBJECT enrolled STUDENT

Relationship
  Maximum cardinality:
    The maximum number of entities that can occur on one side of the relationship
    1:N; if a group project involves at most 3 students, the maximum cardinality is 1:3
  Minimum cardinality:
    Indicates whether participation in the relationship is mandatory or optional
    0: optional, 1: mandatory

Relationship
  PROJECT is_assigned_to STUDENT (1:1; minimum cardinalities 1, 0)
  Entity       Optional  Relationship  Number involved  Entity
  One project  may be    assigned to   one              student
  One student  must be   assigned      one              project

Relationship
  GROUP PROJECT is_assigned_to STUDENT (1:N; minimum cardinalities 1, 0)
  Entity         Optional  Relationship  Number involved  Entity
  Group project  ____      ____          ____             Student
  Student        ____      ____          ____             project

Relationship
  SUBJECT enrolled STUDENT (N:M; minimum cardinalities 1, 1)
  Entity   Optional  Relationship  Number involved  Entity
  Subject  ____      ____          ____             Student
  Student  ____      ____          ____             subject
  Attributes?

Example 1
  Project:
    Project title (ProjTitle), Project ID (ProjID) and Budget (ProjBudget)
  Student:
    Year of study (YearStudy), Student Number (StudNo), First Name (FirstName) and Last Name (LastName)
  Relationship:
    A project might be assigned to one student
    A student must be assigned one project

Example 2
  An order must contain at least one product.
  A product might not be associated with any order.
  An order consists of an order number as well as an order date.
  The product name and its description will be listed.
  The quantity of each product in an order will be listed as well.

Example 3
  A subject must have at least one student enrolled
  A student must enroll in at least one subject
  A subject has a title as well as a subject code
  A student has a student number, first name and last name
  The enrollment date should be recorded

Binary Relationships
  Entities -> tables
    Subject table
      Two columns: subcode, title
  One-to-one binary relationship
    Primary key on one side (mandatory side) becomes a foreign key on the other side (optional side)
      Both mandatory: place on one side only
      One side mandatory, one side optional: place on optional side

Rule 1: Design 1
  PROJECT is_assigned_to STUDENT (1:1)
  PROJECT  ProjID  title   name
  STUDENT  studNo  ProjID  firstName  lastName
  Go from Student to Project
    use studNo to obtain a student's row in STUDENT
    obtain ProjID assigned to studNo
    use this ProjID to look up the data in PROJECT

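The three lookup steps of Design 1 can be sketched as below. This is a minimal illustration using Python's sqlite3 module rather than MySQL; the sample rows, the column types and the omission of PROJECT's `name` column are assumptions made to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROJECT (ProjID INTEGER PRIMARY KEY, title TEXT);
-- Design 1: the key of PROJECT is placed in STUDENT as a foreign key.
CREATE TABLE STUDENT (
    studNo INTEGER PRIMARY KEY,
    ProjID INTEGER REFERENCES PROJECT(ProjID),
    firstName TEXT, lastName TEXT);
INSERT INTO PROJECT VALUES (10, 'Inventory app');   -- made-up sample data
INSERT INTO STUDENT VALUES (1, 10, 'Peter', 'Lam');
""")

# Steps 1 and 2: find the student's row and read the ProjID assigned to it.
proj_id = conn.execute(
    "SELECT ProjID FROM STUDENT WHERE studNo = ?", (1,)
).fetchone()[0]

# Step 3: use that ProjID to look up the data in PROJECT.
project = conn.execute(
    "SELECT ProjID, title FROM PROJECT WHERE ProjID = ?", (proj_id,)
).fetchone()

# The same navigation written as a single join.
joined = conn.execute(
    "SELECT p.ProjID, p.title FROM STUDENT s "
    "JOIN PROJECT p ON p.ProjID = s.ProjID WHERE s.studNo = ?", (1,)
).fetchone()
print(project, joined)
```
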
Rule 1: Design 2
  PROJECT is_assigned_to STUDENT (1:1)
  PROJECT  ProjID  title  name  studNo
  STUDENT  studNo  firstName  lastName
  Go from Project to Student
    use projID to obtain a project's row in PROJECT
    obtain studNo assigned to projID
    use this studNo to look up the student info in STUDENT

Rule 1
  Both sides mandatory:
    One project must be assigned to one student
    One student must be assigned one project
    Design 1 and Design 2 are conceptually the same
    Might be different in performance
      Query in one direction is more common than query in the other direction
  One side mandatory, one side optional?
    A student must be assigned one project
    A particular project may not be assigned to any student
    Design 1: looks ok
    Design 2: null value
    Design 1 is preferable

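The null-value problem of Design 2 can be seen in a small sketch. It uses Python's sqlite3 module instead of MySQL, and the project titles and column types are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (studNo INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
-- Design 2: the key of STUDENT is placed in PROJECT.
CREATE TABLE PROJECT (
    ProjID INTEGER PRIMARY KEY,
    title TEXT,
    studNo INTEGER REFERENCES STUDENT(studNo));
INSERT INTO STUDENT VALUES (1, 'Peter', 'Lam');
INSERT INTO PROJECT VALUES (10, 'Assigned project', 1);
-- A project that is not assigned to any student needs a NULL foreign key.
INSERT INTO PROJECT VALUES (11, 'Unassigned project', NULL);
""")

unassigned = conn.execute(
    "SELECT ProjID FROM PROJECT WHERE studNo IS NULL"
).fetchall()
print(unassigned)
```

In Design 1 the foreign key sits in STUDENT, where participation is mandatory, so every row has a value and no NULL is needed.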
Rule 2
  One-to-many binary relationship
    Key of the parent is placed in the child
    Cf: one-to-one relationship
  GROUP PROJECT is_assigned_to STUDENT (1:N; minimum cardinalities 1, 0)
  GROUP PROJECT  ProjID  title   name
  STUDENT        studNo  ProjID  firstName  lastName

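A sketch of Rule 2, again with sqlite3 standing in for MySQL. The table name `GROUP_PROJECT` (underscore added because of the space), the column types and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GROUP_PROJECT (ProjID INTEGER PRIMARY KEY, title TEXT);
-- The parent's key (ProjID) is placed in the child table STUDENT.
CREATE TABLE STUDENT (
    studNo INTEGER PRIMARY KEY,
    ProjID INTEGER NOT NULL REFERENCES GROUP_PROJECT(ProjID),
    firstName TEXT, lastName TEXT);
INSERT INTO GROUP_PROJECT VALUES (10, 'Library system');
INSERT INTO STUDENT VALUES (1, 10, 'Peter', 'Lam');
INSERT INTO STUDENT VALUES (2, 10, 'Jenny', 'Chan');
""")

# One parent row, many child rows that carry the same ProjID.
members = [
    row[0]
    for row in conn.execute(
        "SELECT studNo FROM STUDENT WHERE ProjID = 10 ORDER BY studNo"
    )
]
print(members)
```
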
Rule 3: many-to-many binary relationship
  Cannot be directly represented by relations as in the one-to-one or one-to-many cases
  Reason
    SUBJECT entity: {subCode, name}
    STUDENT entity: {studNo, firstName, lastName}
    Multiple values for a many-to-many relationship!
  SUBJECT enrolled STUDENT (N:M; minimum cardinalities 1, 1)

Rule 3: many-to-many binary relationship
  Create a new relation with the primary keys of the two entities as its primary key
  Intersection relations
  SUBJECT          Subcode  name
  STUDENT          studNo   firstName  lastName
  SUBJECT-STUDENT  studNo   Subcode

Many-to-Many Relationship
  Associative entities:
  SUBJECT enrolled STUDENT (N:M; minimum cardinalities 1, 1)
    SUBJECT attributes: code, title
    STUDENT attributes: studNo, firstName, lastName
    enrolled attribute: dateEnrolled
  SUBJECT          code    title
  STUDENT          studNo  firstName  lastName
  SUBJECT-STUDENT  studNo  code  dateEnrolled
  Natural?

Many-to-Many Relationship
  Associative entities:
  SUBJECT          code    title
  STUDENT          studNo  firstName  lastName
  SUBJECT-STUDENT  refNo   studNo  code  dateEnrolled

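The intersection relation with its own attribute (dateEnrolled) can be sketched as below, using the composite key {studNo, code}. It runs on sqlite3 rather than MySQL; the table name `SUBJECT_STUDENT` (underscore instead of hyphen), the column types and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUBJECT (code TEXT PRIMARY KEY, title TEXT);
CREATE TABLE STUDENT (studNo INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
-- Intersection relation: the two primary keys together form its primary key.
CREATE TABLE SUBJECT_STUDENT (
    studNo INTEGER REFERENCES STUDENT(studNo),
    code TEXT REFERENCES SUBJECT(code),
    dateEnrolled TEXT,
    PRIMARY KEY (studNo, code));
INSERT INTO SUBJECT VALUES ('DB101', 'Databases');
INSERT INTO SUBJECT VALUES ('NET201', 'Networks');
INSERT INTO STUDENT VALUES (1, 'Peter', 'Lam');
INSERT INTO STUDENT VALUES (2, 'Jenny', 'Chan');
INSERT INTO SUBJECT_STUDENT VALUES (1, 'DB101', '2024-09-01');
INSERT INTO SUBJECT_STUDENT VALUES (1, 'NET201', '2024-09-02');
INSERT INTO SUBJECT_STUDENT VALUES (2, 'DB101', '2024-09-03');
""")

# Subjects taken by student 1 (one student, many subjects).
titles = [
    row[0]
    for row in conn.execute(
        "SELECT s.title FROM SUBJECT_STUDENT e "
        "JOIN SUBJECT s ON s.code = e.code "
        "WHERE e.studNo = 1 ORDER BY s.code"
    )
]

# Students enrolled in DB101 (one subject, many students).
db_count = conn.execute(
    "SELECT COUNT(*) FROM SUBJECT_STUDENT WHERE code = 'DB101'"
).fetchone()[0]
print(titles, db_count)
```
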
Many-to-Many Relationship
  Order: unique orderID and date
  Products: prodID, name, description
  Each order can include more than one product
  ORDER contains PRODUCT (M:N; minimum cardinalities 0, 1)
    ORDER attributes: orderID, orderDate
    PRODUCT attributes: prodID, prodName, prodDes
    contains attribute: qty

Ternary Relationships
  CUSTOMER, SALESPERSON, ORDER
    Each order involves one and only one customer
    A customer can have many orders
    One order has one and only one salesperson
    A salesperson has many orders

Ternary Relationship
  Further constraint?
    Each customer can place orders only with a particular salesperson
    Business rule
      Needs to be documented

"Relational" Terminologies
  Primary Key
    An attribute (or combination of attributes) that uniquely identifies a row in a relation
  Candidate Key
    Attribute that can be chosen to be the primary key
  Composite Key
    A primary key that consists of more than one attribute
  Foreign Key
    An attribute in a relation that serves as the primary key of another relation

Normalization
  Process for evaluating and converting a relation to reduce modification anomalies (undesirable consequences of a data modification)
  Detects and eliminates data redundancy
  Well-structured relations should avoid
    Insertion anomaly
    Deletion anomaly
    Modification anomaly

Example for Normal Forms
  Project Assignment:
    One project: assigned to a number of depts
    Unnormalized form: all information together
  ProjID  Title  DeptNo  DeptName     Location  NoStaff  HourChg  ExChg  Hours
  123     A      12      Engineering  China     5        100      20     10
  123     A      40      HRO          HK        6        50       0      4
  123     A      70      Finance      HK        2        80       0      4
  234     B      12      Engineering  China     3        100      20     60
  234     B      70      Finance      HK        1        80       0      7
  345     C      14      Purchase     Taiwan    3        50       40     6
  456     D      70      Finance      HK        1        80       0      5
  567     E      12      Engineering  China     10       100      20     30
  567     E      70      Finance      HK        1        80       0      18
  577     F      58      Sales        China     20       60       20     20

First Normal Form
  1NF:
    A relation must have only single-valued attributes
    No repeating groups
  PROJECT entity:
    Key: ProjID, DeptNo
    Title, DeptName, Location, NoStaff, HourChg, ExChg, Hours
  May have modification anomalies
    A new dept added? Delete ProjID 577?

1NF
  ProjID  DeptNo  Title  DeptName     Location  NoStaff  HourChg  ExChg  Hours
  123     12      A      Engineering  China     5        100      20     10
  123     40      A      HRO          HK        6        50       0      4
  123     70      A      Finance      HK        2        80       0      4
  234     12      B      Engineering  China     3        100      20     60
  234     70      B      Finance      HK        1        80       0      7
  345     14      C      Purchase     Taiwan    3        50       40     6
  456     70      D      Finance      HK        1        80       0      5
  567     12      E      Engineering  China     10       100      20     30
  567     70      E      Finance      HK        1        80       0      18
  577     58      F      Sales        China     20       60       20     20
  Key: {ProjID, DeptNo}

Second Normal Form
  2NF:
    A relation must be in 1NF and each non-key attribute must be dependent on the whole key
    To reach 2NF:
      Remove data which is functionally dependent upon only part of a key
      Place this data in a new relation, with that part of the key as the primary key

Second Normal Form
  PROJECT entity:
    Key: {ProjID, DeptNo}
    Title, DeptName, Location, NoStaffInv, HourChg, ExChg, Hours
  Dependency:
    ProjID -> Title
    DeptNo -> DeptName, Location, HourChg, ExChg
    {ProjID, DeptNo} -> NoStaffInv, Hours
  Split up relations

Second Normal Form: Answer
  PROJECT entity:
    ProjID, Title
  DEPT entity:
    DeptNo, DeptName, Location, HourChg, ExChg
  ASSIGNMENT entity:
    ProjID, DeptNo, NoStaffInv, Hours

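The 2NF split can be checked mechanically: projecting the 1NF rows onto the three relations and joining them back should reproduce the original rows without loss. The sketch below does this with sqlite3 standing in for MySQL. The helper table name `FLAT` is hypothetical, the data is the project-assignment table from the slides, and the staff column is named `NoStaff` as in that table.

```python
import sqlite3

# The 1NF rows: (ProjID, DeptNo, Title, DeptName, Location, NoStaff, HourChg, ExChg, Hours)
rows_1nf = [
    (123, 12, "A", "Engineering", "China", 5, 100, 20, 10),
    (123, 40, "A", "HRO", "HK", 6, 50, 0, 4),
    (123, 70, "A", "Finance", "HK", 2, 80, 0, 4),
    (234, 12, "B", "Engineering", "China", 3, 100, 20, 60),
    (234, 70, "B", "Finance", "HK", 1, 80, 0, 7),
    (345, 14, "C", "Purchase", "Taiwan", 3, 50, 40, 6),
    (456, 70, "D", "Finance", "HK", 1, 80, 0, 5),
    (567, 12, "E", "Engineering", "China", 10, 100, 20, 30),
    (567, 70, "E", "Finance", "HK", 1, 80, 0, 18),
    (577, 58, "F", "Sales", "China", 20, 60, 20, 20),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FLAT (ProjID, DeptNo, Title, DeptName, Location,"
    " NoStaff, HourChg, ExChg, Hours)"
)
conn.executemany("INSERT INTO FLAT VALUES (?,?,?,?,?,?,?,?,?)", rows_1nf)

# Split along the functional dependencies listed for 2NF.
conn.executescript("""
CREATE TABLE PROJECT AS SELECT DISTINCT ProjID, Title FROM FLAT;
CREATE TABLE DEPT AS
    SELECT DISTINCT DeptNo, DeptName, Location, HourChg, ExChg FROM FLAT;
CREATE TABLE ASSIGNMENT AS SELECT ProjID, DeptNo, NoStaff, Hours FROM FLAT;
""")

counts = {
    t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
    for t in ("PROJECT", "DEPT", "ASSIGNMENT")
}

# Joining the three relations gives back the original 1NF rows.
rejoined = conn.execute("""
SELECT a.ProjID, a.DeptNo, p.Title, d.DeptName, d.Location,
       a.NoStaff, d.HourChg, d.ExChg, a.Hours
FROM ASSIGNMENT a
JOIN PROJECT p ON p.ProjID = a.ProjID
JOIN DEPT d ON d.DeptNo = a.DeptNo
""").fetchall()
print(counts)
```
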
2NF
  PROJECT entity
    ProjID  Title
    123     A
    234     B
    345     C
    456     D
    567     E
    577     F
  ASSIGNMENT entity
    ProjID  DeptNo  NoStaff  Hours
    123     12      5        10
    123     40      6        4
    123     70      2        4
    234     12      3        60
    234     70      1        7
    345     14      3        6
    456     70      1        5
    567     12      10       30
    567     70      1        18
    577     58      20       20
  DEPT entity
    DeptNo  DeptName     Location  HourChg  ExChg
    12      Engineering  China     100      20
    40      HRO          HK        50       0
    70      Finance      HK        80       0
    14      Purchase     Taiwan    50       40
    58      Sales        China     60       20

Second Normal Form
  Problem:
    ExChg depends on Location
      If Location = HK, no ExChg
      If Location = China, ExChg = 20
      If Location = Taiwan, ExChg = 40
  Example:
    Delete deptNo = 14: deletion anomaly

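The deletion anomaly can be reproduced on the 2NF DEPT relation: once department 14 is deleted, the fact that Taiwan carries an extra charge of 40 is gone as well. This is a sketch with sqlite3 in place of MySQL; the helper function `exchg_for` and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DEPT (DeptNo INTEGER PRIMARY KEY, DeptName TEXT,"
    " Location TEXT, HourChg INTEGER, ExChg INTEGER)"
)
conn.executemany(
    "INSERT INTO DEPT VALUES (?, ?, ?, ?, ?)",
    [
        (12, "Engineering", "China", 100, 20),
        (40, "HRO", "HK", 50, 0),
        (70, "Finance", "HK", 80, 0),
        (14, "Purchase", "Taiwan", 50, 40),
        (58, "Sales", "China", 60, 20),
    ],
)

def exchg_for(location):
    """Return the extra charge recorded for a location, or None if unknown."""
    row = conn.execute(
        "SELECT DISTINCT ExChg FROM DEPT WHERE Location = ?", (location,)
    ).fetchone()
    return None if row is None else row[0]

before = exchg_for("Taiwan")
conn.execute("DELETE FROM DEPT WHERE DeptNo = 14")
after = exchg_for("Taiwan")  # the only row mentioning Taiwan has been deleted
print(before, after)
```
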
Third Normal Form
  3NF:
    A relation must be in 2NF and no transitive dependencies may exist
  Transitive dependency: a functional dependency between two or more non-key attributes
    DeptNo -> Location, ExChg
    Location -> ExChg
  ExChg depends on the non-key attribute "Location"
  Solution: split relation

Third Normal Form: Answer
  PROJECT entity:
    ProjID, Title
  DEPT entity:
    DeptNo, DeptName, Location, HourChg
  EXTRACHARGE entity:
    Location, ExChg
  ASSIGNMENT entity:
    ProjID, DeptNo, NoStaffInv, Hours

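With the 3NF split, deleting department 14 no longer loses the extra charge for Taiwan, and a department's extra charge is obtained by joining on Location. The sketch below uses sqlite3 as a stand-in for MySQL; the column types are assumptions, and the rows come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EXTRACHARGE (Location TEXT PRIMARY KEY, ExChg INTEGER);
CREATE TABLE DEPT (
    DeptNo INTEGER PRIMARY KEY,
    DeptName TEXT,
    Location TEXT REFERENCES EXTRACHARGE(Location),
    HourChg INTEGER);
INSERT INTO EXTRACHARGE VALUES ('China', 20), ('HK', 0), ('Taiwan', 40);
INSERT INTO DEPT VALUES
    (12, 'Engineering', 'China', 100),
    (40, 'HRO', 'HK', 50),
    (70, 'Finance', 'HK', 80),
    (14, 'Purchase', 'Taiwan', 50),
    (58, 'Sales', 'China', 60);
-- The deletion that caused the anomaly in 2NF.
DELETE FROM DEPT WHERE DeptNo = 14;
""")

# The Location -> ExChg fact survives in its own relation.
taiwan = conn.execute(
    "SELECT ExChg FROM EXTRACHARGE WHERE Location = 'Taiwan'"
).fetchone()[0]

# A department's extra charge is now found through its location.
eng = conn.execute(
    "SELECT d.DeptName, e.ExChg FROM DEPT d "
    "JOIN EXTRACHARGE e ON e.Location = d.Location WHERE d.DeptNo = 12"
).fetchone()
print(taiwan, eng)
```
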
3NF
  PROJECT entity
    ProjID  Title
    123     A
    234     B
    345     C
    456     D
    567     E
    577     F
  ASSIGNMENT entity
    ProjID  DeptNo  NoStaff  Hours
    123     12      5        10
    123     40      6        4
    123     70      2        4
    234     12      3        60
    234     70      1        7
    345     14      3        6
    456     70      1        5
    567     12      10       30
    567     70      1        18
    577     58      20       20
  EXTRACHARGE entity
    Location  ExChg
    China     20
    HK        0
    Taiwan    40
  DEPT entity
    DeptNo  DeptName     Location  HourChg
    12      Engineering  China     100
    40      HRO          HK        50
    70      Finance      HK        80
    14      Purchase     Taiwan    50
    58      Sales        China     60

Normalization vs Denormalization
  Normalization:
    Avoid modification anomalies
    Split relations
      Extra processing
      Might not be desirable
  Denormalization: control data redundancy to improve performance

Example
  Assume one HoD, one to three Deputy HoDs
  Un/De-normalized relation 1:
    DEPT (DeptName, HOD, DeputyHoD)
    Key: {DeptName, DeputyHoD}
    Not in ________:
      DeptName -> HoD
  Normalized
    DEPT (DeptName, HOD)
    DepHOD (DeptName, DeputyHOD)

Example
  In 2NF:
    Not efficient: to obtain data about a dept, need to read at least 2 rows and as many as 4 rows
  Alternative
    DEPT (DeptName, HOD, DeHOD1, DeHOD2, DeHOD3)
    Key: DeptName
    In __________
      All attributes are functionally dependent on DeptName
    Problem: searching for the name of a DeHoD

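The trade-off can be sketched by holding the same department in both layouts. This runs on sqlite3 rather than MySQL; the table name `DEPT_WIDE` and the people's names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized layout: one HoD row plus one row per deputy.
CREATE TABLE DEPT (DeptName TEXT PRIMARY KEY, HOD TEXT);
CREATE TABLE DepHOD (
    DeptName TEXT REFERENCES DEPT(DeptName),
    DeputyHOD TEXT,
    PRIMARY KEY (DeptName, DeputyHOD));
-- Denormalized alternative: everything about a department in one row.
CREATE TABLE DEPT_WIDE (
    DeptName TEXT PRIMARY KEY, HOD TEXT,
    DeHOD1 TEXT, DeHOD2 TEXT, DeHOD3 TEXT);
INSERT INTO DEPT VALUES ('Finance', 'Ann');
INSERT INTO DepHOD VALUES ('Finance', 'Bob'), ('Finance', 'Cat');
INSERT INTO DEPT_WIDE VALUES ('Finance', 'Ann', 'Bob', 'Cat', NULL);
""")

# Reading one department: 1 DEPT row + 1 to 3 DepHOD rows, versus a single row.
rows_read_normalized = 1 + conn.execute(
    "SELECT COUNT(*) FROM DepHOD WHERE DeptName = 'Finance'"
).fetchone()[0]
wide = conn.execute(
    "SELECT * FROM DEPT_WIDE WHERE DeptName = 'Finance'"
).fetchall()

# Searching by deputy name: one column to test, versus three columns.
norm_hit = conn.execute(
    "SELECT DeptName FROM DepHOD WHERE DeputyHOD = ?", ("Cat",)
).fetchone()[0]
wide_hit = conn.execute(
    "SELECT DeptName FROM DEPT_WIDE WHERE ? IN (DeHOD1, DeHOD2, DeHOD3)",
    ("Cat",),
).fetchone()[0]
print(rows_read_normalized, len(wide), norm_hit, wide_hit)
```
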
SQL

DBMS
  Access:
    Microsoft product
    Windows platform
    Easy to use: user interface
    Small companies, personal uses
  MySQL server:
    www.mysql.com
    Popular: available for Windows, Linux, Unix
    Multiuser
    Multithreaded:
      Enables the database to perform multiple tasks concurrently
      Increases server efficiency in handling client requests
    Can handle large databases (tens of thousands of tables)
    (nearly) open source

Graphical mode
  [Screenshots not recoverable]

MySQL
  Output format: table form
  Case insensitive
  Semi-colon: ends a statement

Data Definition Language
  Create database:
    mysql> create database ABCpharmacy
        ->
        -> ;
    Query OK, 1 row affected (0.00 sec)
  Create tables:
    mysql> use ABCpharmacy
    Database changed
    mysql> show tables;
    Empty set (0.00 sec)
    mysql>

  DROP DATABASE db_name;

Data Definition Language
  Create tables
    CREATE TABLE tableName (DEFINITION)
    DEFINITION:
      col_name type [NOT NULL | NULL] [PRIMARY KEY]
      INT: integer
      DOUBLE: floating-point number
      DECIMAL(M,D): exact decimal number; M: maximum number of digits, D: number of digits after the decimal point
      CHAR(M): fixed-length string, always occupying M bytes
      VARCHAR(M): variable-length string, occupying at most M+1 bytes
      ENUM('val1', 'val2', …): a string taken from a fixed set of values
        Color ENUM('red', 'orange', 'yellow', 'green', 'blue', 'purple')
  DESCRIBE tableName
  DROP TABLE tableName

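A rough, runnable approximation of these DDL statements is sketched below with Python's sqlite3 module. It is not MySQL: SQLite accepts the INT, DOUBLE, DECIMAL, CHAR and VARCHAR type names but has no ENUM, so a CHECK constraint is swapped in for it, and `PRAGMA table_info` plays the role of DESCRIBE. The table `product` and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE TABLE tableName (DEFINITION), one line per column definition.
conn.execute("""
CREATE TABLE product (
    prodID   INT NOT NULL PRIMARY KEY,
    prodName VARCHAR(40) NOT NULL,
    price    DECIMAL(8,2),
    weight   DOUBLE,
    code     CHAR(6),
    -- SQLite has no ENUM; a CHECK constraint stands in for it here.
    color    VARCHAR(10)
             CHECK (color IN ('red', 'orange', 'yellow', 'green', 'blue', 'purple'))
)
""")

# SQLite's counterpart of DESCRIBE tableName: row[1] is the column name.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(product)")]

# A value outside the fixed set is rejected.
try:
    conn.execute(
        "INSERT INTO product (prodID, prodName, color) VALUES (1, 'Pill box', 'pink')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# DROP TABLE tableName removes the table again.
conn.execute("DROP TABLE product")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(column_names, rejected, remaining)
```
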
DDL Text File
  [Screenshot not recoverable]

Data Definition Language
  The text file must be tab-delimited

Foreign key?
  ALTER TABLE table1Name ADD FOREIGN KEY (columnName) REFERENCES table2Name(columnName)
  E.g.,
    ALTER TABLE orderInfo ADD FOREIGN KEY (custName) REFERENCES customer (name);

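What the foreign key buys can be sketched as follows. Note the swap: SQLite cannot add a foreign key with ALTER TABLE, so the constraint is declared inside CREATE TABLE, and enforcement must be switched on with `PRAGMA foreign_keys = ON`. The `orderID` column and the sample rows are assumptions; the table and column names orderInfo, custName, customer and name follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE customer (name VARCHAR(40) PRIMARY KEY);
-- Declared at creation time here; in MySQL the ALTER TABLE form adds it afterwards.
CREATE TABLE orderInfo (
    orderID INT PRIMARY KEY,
    custName VARCHAR(40),
    FOREIGN KEY (custName) REFERENCES customer(name));
INSERT INTO customer VALUES ('Peter Lam');
INSERT INTO orderInfo VALUES (1, 'Peter Lam');
""")

# An order for a customer that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO orderInfo VALUES (2, 'Nobody')")
    accepted = True
except sqlite3.IntegrityError:
    accepted = False

order_count = conn.execute("SELECT COUNT(*) FROM orderInfo").fetchone()[0]
print(accepted, order_count)
```
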
Data Manipulation Language
  DML commands:
    Querying DB: SELECT
  SQL: querying tables
    Statements written in multiple lines
    Reserved words: in capital letters
    Conventions: provide clarity
      Not required by compilers

SELECT statement: retrieval
  SELECT <list of column expressions>
  FROM <list of tables and join operations>
  WHERE <list of logical expressions for rows>
  GROUP BY <list of grouping columns>
  HAVING <list of logical expressions for groups>
  ORDER BY <list of sorting specifications>

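A query that uses every clause of this template is sketched below on the ASSIGNMENT relation from the normalization example. It runs on sqlite3 rather than MySQL, and the particular filter (skip department 58) and threshold (more than 10 hours) are arbitrary choices for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ASSIGNMENT (ProjID INT, DeptNo INT, NoStaff INT, Hours INT)"
)
conn.executemany(
    "INSERT INTO ASSIGNMENT VALUES (?, ?, ?, ?)",
    [
        (123, 12, 5, 10), (123, 40, 6, 4), (123, 70, 2, 4),
        (234, 12, 3, 60), (234, 70, 1, 7), (345, 14, 3, 6),
        (456, 70, 1, 5), (567, 12, 10, 30), (567, 70, 1, 18),
        (577, 58, 20, 20),
    ],
)

# Total hours per project, ignoring department 58,
# keeping projects with more than 10 hours, largest total first.
result = conn.execute("""
SELECT ProjID, SUM(Hours) AS totalHours
FROM ASSIGNMENT
WHERE DeptNo <> 58
GROUP BY ProjID
HAVING SUM(Hours) > 10
ORDER BY totalHours DESC
""").fetchall()
print(result)
```

WHERE filters individual rows before grouping, while HAVING filters the groups after SUM has been computed.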
Summary
  Revision
    ER diagram
    Table design
    Normalization
    Denormalization
    SQL