relational algebra 2. relational algebra formalism for creating new relations from existing ones its...

25
Relational Algebra 2

Upload: leslie-fleming

Post on 14-Dec-2015

221 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Relational Algebra 2

Page 2: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Relational Algebra

• Formalism for creating new relations from existing ones

• Its place in the big picture:

Declartivequery

language

Declartivequery

languageAlgebraAlgebra ImplementationImplementation

SQL,relational calculus

Relational algebraRelational bag algebra

Page 3: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Relational Algebra• Five operators:

– Union: – Difference: -– Selection:– Projection: – Cartesian Product:

• Derived or auxiliary operators:– Intersection, complement– Joins (natural,equi-join, theta join, semi-join)– Renaming:

Page 4: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

1. Union and 2. Difference

• R1 R2

• Example: – ActiveEmployees RetiredEmployees

• R1 – R2

• Example:– AllEmployees -- RetiredEmployees

Page 5: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

What about Intersection ?

• It is a derived operator

• R1 R2 = R1 – (R1 – R2)

• Also expressed as a join (will see later)

• Example– UnionizedEmployees RetiredEmployees

Page 6: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

3. Selection

• Returns all tuples which satisfy a condition

• Notation: c(R)

• Examples– Salary > 40000 (Employee)

– name = “Smithh” (Employee)

• The condition c can be =, <, , >, , <>

Page 7: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Selection Example

EmployeeSSN Name DepartmentID Salary999999999 John 1 30,000777777777 Tony 1 32,000888888888 Alice 2 45,000

SSN Name DepartmentID Salary888888888 Alice 2 45,000

Find all employees with salary more than $40,000.Salary > 40000 (Employee)

Page 8: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

4. Projection

• Eliminates columns, then removes duplicates

• Notation: A1,…,An (R)

• Example: project social-security number and names:– SSN, Name (Employee)

– Output schema: Answer(SSN, Name)

Page 9: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Projection Example

EmployeeSSN Name DepartmentID Salary999999999 John 1 30,000777777777 Tony 1 32,000888888888 Alice 2 45,000

SSN Name999999999 John777777777 Tony888888888 Alice

SSN, Name (Employee)

Page 10: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

5. Cartesian Product

• Each tuple in R1 with each tuple in R2

• Notation: R1 R2

• Example: – Employee Dependents

• Very rare in practice; mainly used to express joins

Page 11: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Cartesian Product Example Employee Name SSN John 999999999 Tony 777777777 Dependents EmployeeSSN Dname 999999999 Emily 777777777 Joe Employee x Dependents Name SSN EmployeeSSN Dname John 999999999 999999999 Emily John 999999999 777777777 Joe Tony 777777777 999999999 Emily Tony 777777777 777777777 Joe

Page 12: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Relational Algebra• Five operators:

– Union: – Difference: -– Selection:– Projection: – Cartesian Product:

• Derived or auxiliary operators:– Intersection, complement– Joins (natural,equi-join, theta join, semi-join)– Renaming:

Page 13: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Renaming

• Changes the schema, not the instance

• Notation: B1,…,Bn (R)

• Example:– LastName, SocSocNo (Employee)

– Output schema: Answer(LastName, SocSocNo)

Page 14: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Renaming Example

EmployeeName SSNJohn 999999999Tony 777777777

LastName SocSocNoJohn 999999999Tony 777777777

LastName, SocSocNo (Employee)

Page 15: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Natural Join• Notation: R1 R2⋈• Meaning: R1 R2 = ⋈ A(C(R1 R2))

• Where:– The selection C checks equality of all common

attributes– The projection eliminates the duplicate common

attributes

Page 16: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Natural Join Example

EmployeeName SSNJohn 999999999Tony 777777777

DependentsSSN Dname999999999 Emily777777777 Joe

Name SSN DnameJohn 999999999 EmilyTony 777777777 Joe

Employee Dependents = Name, SSN, Dname( SSN=SSN2(Employee x SSN2, Dname(Dependents))

Page 17: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Natural Join

• R= S=

• R ⋈ S=

A B

X Y

X Z

Y Z

Z V

B C

Z U

V W

Z V

A B C

X Z U

X Z V

Y Z U

Y Z V

Z V W

Page 18: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Natural Join

• Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ?

• Given R(A, B, C), S(D, E), what is R ⋈ S ?

• Given R(A, B), S(A, B), what is R ⋈ S ?

Page 19: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Theta Join

• A join that involves a predicate

• R1 ⋈ R2 = (R1 R2)

• Here can be any condition

Page 20: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Eq-join

• A theta join where is an equality

• R1 ⋈A=B R2 = A=B (R1 R2)

• Example:– Employee ⋈SSN=SSN Dependents

• Most useful join in practice

Page 21: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Semijoin

• R ⋉ S = A1,…,An (R ⋈ S)

• Where A1, …, An are the attributes in R

• Example:– Employee ⋉ Dependents

Page 22: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Semijoins in Distributed Databases

• Semijoins are used in distributed databases

SSN Name

. . . . . .

SSN Dname Age

. . . . . .

EmployeeDependents

network

Employee ⋈ssn=ssn (age>71 (Dependents))Employee ⋈ssn=ssn (age>71 (Dependents))

T = SSN age>71 (Dependents)R = Employee T⋉

Answer = R ⋈ Dependents

Page 23: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Complex RA Expressions

Person Purchase Person Product

name=fred name=gizmo

pid ssn

seller-ssn=ssn

pid=pid

buyer-ssn=ssn

name

Page 24: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Operations on Bags

A bag = a set with repeated elements

All operations need to be defined carefully on bags• {a,b,b,c}{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f}• {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b,d}

• C(R): preserve the number of occurrences

• A(R): no duplicate elimination

• Cartesian product, join: no duplicate elimination

Important ! Relational Engines work on bags, not sets !

Reading assignment: 5.3 – 5.4

Page 25: Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language

Finally: RA has Limitations !

• Cannot compute “transitive closure”

• Find all direct and indirect relatives of Fred• Cannot express in RA !!! Need to write C program

Name1 Name2 Relationship

Fred Mary Father

Mary Joe Cousin

Mary Bill Spouse

Nancy Lou Sister