introduction to databases cse 414 · introduction to databases cse 414 lecture 2: data models cse...
TRANSCRIPT
![Page 1: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/1.jpg)
Introduction to DatabasesCSE 414
Lecture 2: Data Models
1CSE 414 - Autumn 2018
![Page 2: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/2.jpg)
Class Overview
• Unit 1: Intro• Unit 2: Relational Data Models and Query Languages
– Data models, SQL, Relational Algebra, Datalog
• Unit 3: Non-relational data• Unit 4: RDMBS internals and query optimization• Unit 5: Parallel query processing• Unit 6: DBMS usability, conceptual design• Unit 7: Transactions
CSE 414 - Autumn 2018 2
![Page 3: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/3.jpg)
Review
• What is a database?– A collection of files storing related data
• What is a DBMS?– An application program that allows us to
manage efficiently the collection of data files
CSE 414 - Autumn 2018 3
![Page 4: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/4.jpg)
Data Models
• Recall our example: want to design a database of books:– author, title, publisher, pub date, price, etc– How should we describe this data?
• Data model = mathematical formalism (or conceptual way) for describing the data
CSE 414 - Autumn 2018 4
![Page 5: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/5.jpg)
Data Models• Relational
– Data represented as relations• Semi-structured (JSon)
– Data represented as trees• Key-value pairs
– Used by NoSQL systems• Graph• Object-oriented
CSE 414 - Autumn 2018 5
Unit 2
Unit 3
![Page 6: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/6.jpg)
Example: storing FB friends
CSE 414 - Autumn 2018 6
Peter
Mary John
Phil
As a graph
Or
Person1 Person2 is_friendPeter John 1John Mary 0Mary Phil 1Phil Peter 1… … …
As a relation
We will learn the tradeoffs of different data models later this quarter
![Page 7: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/7.jpg)
3 Elements of Data Models
• Instance– The actual data
• Schema– Describe what data is being stored
• Query language– How to retrieve and manipulate data
CSE 414 - Autumn 2018 7
![Page 8: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/8.jpg)
Turing Awards in Data Management
CSE 414 - Autumn 2018 8
Charles Bachman, 1973IDS and CODASYL
Ted Codd, 1981Relational model
Michael Stonebraker, 2014INGRES and Postgres
Jim Gray, 1998Transaction processing
![Page 9: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/9.jpg)
Relational Model
• Data is a collection of relations / tables:
• mathematically, relation is a set of tuples
– each tuple appears 0 or 1 times in the table
– order of the rows is unspecified
CSE 414 - Autumn 2018 9
cname country no_employees for_profit
GizmoWorks USA 20000 True
Canon Japan 50000 True
Hitachi Japan 30000 True
HappyCam Canada 500 False
columns /
attributes /
fields
rows /
tuples /
records
![Page 10: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/10.jpg)
The Relational Data Model
• Degree (arity) of a relation = #attributes• Each attribute has a type.
– Examples types:• Strings: CHAR(20), VARCHAR(50), TEXT• Numbers: INT, SMALLINT, FLOAT• MONEY, DATETIME, …• Few more that are vendor specific
– Statically and strictly enforced
CSE 414 - Autumn 2018 10
![Page 11: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/11.jpg)
Keys• Key = one (or multiple) attributes that
uniquely identify a record
CSE 414 - Autumn 2018 11
![Page 12: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/12.jpg)
Keys• Key = one (or multiple) attributes that
uniquely identify a record
CSE 414 - Autumn 2018 12
cname country no_employees for_profitGizmoWorks USA 20000 TrueCanon Japan 50000 TrueHitachi Japan 30000 TrueHappyCam Canada 500 False
Key
![Page 13: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/13.jpg)
Keys• Key = one (or multiple) attributes that
uniquely identify a record
CSE 414 - Autumn 2018 13
cname country no_employees for_profitGizmoWorks USA 20000 TrueCanon Japan 50000 TrueHitachi Japan 30000 TrueHappyCam Canada 500 False
Key Not a key
![Page 14: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/14.jpg)
Keys• Key = one (or multiple) attributes that
uniquely identify a record
CSE 414 - Autumn 2018 14
cname country no_employees for_profitGizmoWorks USA 20000 TrueCanon Japan 50000 TrueHitachi Japan 30000 TrueHappyCam Canada 500 False
Key Not a key Is this a key?
![Page 15: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/15.jpg)
Keys• Key = one (or multiple) attributes that
uniquely identify a record
CSE 414 - Autumn 2018 15
cname country no_employees for_profitGizmoWorks USA 20000 TrueCanon Japan 50000 TrueHitachi Japan 30000 TrueHappyCam Canada 500 False
Key Not a key Is this a key?No: future updates to thedatabase may create duplicateno_employees
![Page 16: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/16.jpg)
Multi-attribute Key
CSE 414 - Autumn 2018 16
fName lName Income DepartmentAlice Smith 20000 TestingAlice Thompson 50000 TestingBob Thompson 30000 SWCarol Smith 50000 Testing
Key = fName,lName(what does this mean?)
![Page 17: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/17.jpg)
Multiple Keys
CSE 414 - Autumn 2018 17
SSN fName lName Income Department111-22-3333 Alice Smith 20000 Testing222-33-4444 Alice Thompson 50000 Testing333-44-5555 Bob Thompson 30000 SW444-55-6666 Carol Smith 50000 Testing
Key Another key
We can choose one key and designate it as primary keyE.g.: primary key = SSN
![Page 18: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/18.jpg)
Foreign Key
CSE 414 - Autumn 2018 18
cname country no_employees for_profitCanon Japan 50000 YHitachi Japan 30000 Y
name populationUSA 320MJapan 127M
Company(cname, country, no_employees, for_profit)Country(name, population)
Foreign key toCountry.nameCompany
Country
![Page 19: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/19.jpg)
Keys: Summary• Key = columns that uniquely identify tuple
– Usually we underline– A relation can have many keys, but only one
can be chosen as primary key• Foreign key:
– Attribute(s) whose value is a key of a record in some other relation
– Foreign keys are sometimes called semantic pointer
CSE 414 - Autumn 2018 19
![Page 20: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/20.jpg)
Query Language• SQL
– Structured Query Language– Developed by IBM in the 70s– Most widely used language to query
relational data• Other relational query languages
– Datalog, relational algebra
CSE 414 - Autumn 2018 20
![Page 21: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/21.jpg)
Our First DBMS
• SQL Lite• Will switch to SQL Server later in the
quarter
CSE 414 - Autumn 2018 21
![Page 22: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/22.jpg)
Demo 1
CSE 414 - Autumn 2018 22
![Page 23: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/23.jpg)
Discussion• Tables are NOT ordered
– they are sets or multisets (bags)• Tables are FLAT
– No nested attributes• Tables DO NOT prescribe how they are
implemented / stored on disk– This is called physical data independence
CSE 414 - Autumn 2018 23
![Page 24: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/24.jpg)
Table Implementation• How would you implement this?
CSE 414 - Autumn 2018 24
cname country no_employees for_profitGizmoWorks USA 20000 TrueCanon Japan 50000 TrueHitachi Japan 30000 TrueHappyCam Canada 500 False
![Page 25: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/25.jpg)
Table Implementation• How would you implement this?
CSE 414 - Autumn 2018 25
cname country no_employees for_profitGizmoWorks USA 20000 TrueCanon Japan 50000 TrueHitachi Japan 30000 TrueHappyCam Canada 500 False
Row major: as an array of objects
GizmoWorksUSA20000True
CanonJapan50000True
HitachiJapan30000True
HappyCamCanada500False
![Page 26: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/26.jpg)
Table Implementation
• How would you implement this?
cname country no_employees for_profit
GizmoWorks USA 20000 True
Canon Japan 50000 True
Hitachi Japan 30000 True
HappyCam Canada 500 False
Column major: as one array per attribute
GizmoWorks Canon Hitachi HappyCam
USA Japan Japan Canada
True True True False
20000 50000 30000 500
26CSE 414 - Autumn 2018
![Page 27: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/27.jpg)
Table Implementation
• How would you implement this?
CSE 414 - Autumn 2018
Physical data independenceThe logical definition of the data remains
unchanged, even when we make changes to
the actual implementation
cname country no_employees for_profit
GizmoWorks USA 20000 True
Canon Japan 50000 True
Hitachi Japan 30000 True
HappyCam Canada 500 False
27
![Page 28: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/28.jpg)
First Normal Form
• All relations must be flat: we say that the relation is in first normal form
cname country no_employees for_profitCanon Japan 50000 YHitachi Japan 30000 Y
CSE 414 - Autumn 2018 28
![Page 29: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/29.jpg)
First Normal Form
• All relations must be flat: we say that the
relation is in first normal form• E.g. we want to add products manufactured
by each company:
cname country no_employees for_profit
Canon Japan 50000 Y
Hitachi Japan 30000 Y
CSE 414 - Autumn 2018 29
![Page 30: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/30.jpg)
First Normal Form
• All relations must be flat: we say that the relation is in first normal form
• E.g., we want to add products manufactured by each company:
cname country no_employees for_profitCanon Japan 50000 YHitachi Japan 30000 Y
cname country no_employees for_profit products
Canon Japan 50000 Y
Hitachi Japan 30000 Y pname price category
AC 300 Appliance
pname price category
SingleTouch 149.99 Photography
Gadget 200 Toy
CSE 414 - Autumn 2018 30
![Page 31: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/31.jpg)
First Normal Form
• All relations must be flat: we say that the relation is in first normal form
• E.g., we want to add products manufactured by each company:
cname country no_employees for_profitCanon Japan 50000 YHitachi Japan 30000 Y
cname country no_employees for_profit products
Canon Japan 50000 Y
Hitachi Japan 30000 Y pname price category
AC 300 Appliance
pname price category
SingleTouch 149.99 Photography
Gadget 200 Toy
Non-1NF!
CSE 414 - Autumn 2018 31
![Page 32: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/32.jpg)
First Normal Form
CSE 414 - Autumn 2018 32
cname country no_employees for_profitCanon Japan 50000 YHitachi Japan 30000 Y
pname price category manufacturerSingleTouch 149.99 Photography CanonAC 300 Appliance HitachiGadget 200 Toy Canon
Company
Products
Now it’s in 1NF
![Page 33: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/33.jpg)
Demo 1 (cont’d)
CSE 414 - Autumn 2018 33
![Page 34: Introduction to Databases CSE 414 · Introduction to Databases CSE 414 Lecture 2: Data Models CSE 414 -Autumn 2018 1. Class Overview •Unit 1: Intro ... Relational Model •Data](https://reader036.vdocuments.net/reader036/viewer/2022062507/5fbe590d2e94b342850b4955/html5/thumbnails/34.jpg)
Data Models: Summary
• Schema + Instance + Query language• Relational model:
– Database = collection of tables– Each table is flat: “first normal form”– Key: may consists of multiple attributes– Foreign key: “semantic pointer”– Physical data independence
CSE 414 - Autumn 2018 34