inside database

Inside Database System, Takashi HOSHINO, Cybozu Labs

Upload: takashi-hoshino

Post on 22-Jun-2015

Category: Technology



DESCRIPTION

An introduction to database management systems. Used in an internal study meeting at Cybozu named "DATABASE NO KIMOCHI WO SHIRU KAI".

TRANSCRIPT

Page 1: Inside database

Inside Database System

Takashi HOSHINO
Cybozu Labs

Page 2: Inside database

Overview

• Control/data flow
• DBMS
  – Query Processor
  – Storage Engine
    • Transaction Management
    • Buffer Cache Management
    • Data Structures
• Storage

Page 3: Inside database

Control/Data Flow

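[Figure: layered stack, top to bottom: Application, DBMS, OS, Storage. The Application and the DBMS exchange SQL and records; the DBMS, OS, and Storage exchange reads/writes (RW) of blocks.]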

Page 4: Inside database

DBMS

• Query Processor
• Storage Engine

Page 5: Inside database

Query Processor

Hector 2010

SQL query
→ parse → parse tree
→ convert → logical query plan (l.q.p.)
→ apply laws → “improved” l.q.p.
→ estimate result sizes (using statistics) → l.q.p. + sizes
→ consider physical plans → {P1,P2,…..}
→ estimate costs → {(P1,C1),(P2,C2)...}
→ pick best → Pi
→ execute → answer

Page 6: Inside database

Query Plan Example

Hector 2010

π_{B,D}( σ_{R.A = “c”}(R) ⋈ σ_{S.E = 2}(S) )

(⋈: natural join)
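The plan above can be traced on toy data. The following is a minimal sketch, not the lecture's implementation: the schemas R(A, B, C) and S(C, D, E), the sample rows, and the helper names `select`, `natural_join`, and `project` are all assumptions made for illustration.

```python
# Hypothetical relations: R(A, B, C) and S(C, D, E) share attribute C.
R = [
    {"A": "a", "B": 1, "C": 10},
    {"A": "c", "B": 2, "C": 10},
    {"A": "c", "B": 5, "C": 20},
]
S = [
    {"C": 10, "D": "x", "E": 2},
    {"C": 20, "D": "y", "E": 2},
    {"C": 30, "D": "z", "E": 3},
]

def select(rows, pred):
    """Selection: keep only the rows that satisfy the predicate."""
    return [r for r in rows if pred(r)]

def natural_join(left, right):
    """Naive nested-loop natural join on all shared attribute names."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[k] == r[k] for k in common):
                out.append({**l, **r})
    return out

def project(rows, attrs):
    """Projection: keep only the listed attributes, as tuples."""
    return [tuple(r[a] for a in attrs) for r in rows]

# Selections are applied before the join, as in the plan above.
result = project(
    natural_join(
        select(R, lambda r: r["A"] == "c"),
        select(S, lambda s: s["E"] == 2),
    ),
    ["B", "D"],
)
print(result)
```

Applying the selections first shrinks both join inputs, which is the usual motivation for this plan shape.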

Page 7: Inside database

Which Plan is Good?

Hector 2010

Three candidate join orders for R, S, and T:

• (R ⋈ S) ⋈ T
• (T ⋈ R) ⋈ S
• (S ⋈ T) ⋈ R
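One way to compare such join orders is to estimate the size of the intermediate result that each one produces first. The sketch below assumes made-up cardinalities and join selectivities, and a deliberately simple cost model (intermediate result size only); none of these numbers come from the slides.

```python
# Hypothetical table cardinalities and pairwise join selectivities.
card = {"R": 1000, "S": 100, "T": 10}
selectivity = {
    frozenset({"R", "S"}): 0.01,
    frozenset({"R", "T"}): 0.2,
    frozenset({"S", "T"}): 0.05,
}

# Each plan joins the first two relations, then joins the result with the third.
plans = [("R", "S", "T"), ("T", "R", "S"), ("S", "T", "R")]

def intermediate_size(plan):
    """Estimated row count of the first join in the plan."""
    first, second, _ = plan
    return card[first] * card[second] * selectivity[frozenset({first, second})]

# Pair every plan with its estimated cost, then pick the cheapest.
costs = [(plan, intermediate_size(plan)) for plan in plans]
best_plan, best_cost = min(costs, key=lambda pc: pc[1])
print(best_plan, best_cost)
```

With these assumed statistics, joining S and T first yields the smallest intermediate result, so that plan is picked. Different statistics can change the winner, which is why the optimizer depends on them.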

Page 8: Inside database

Storage Engine

• Transaction Management
• Buffer Cache Management
• Data Structures

Page 9: Inside database

Transaction Management

• Maintain the ACID properties of data
  – Atomicity
  – Consistency
  – Isolation
  – Durability
• Concurrency Control
• Logging & Recovery

Page 10: Inside database

Concurrency Control by Locking

• Target resources
  – Database
  – Table
  – Block
  – Record
• Locking algorithm
  – Shared/exclusive lock
  – Intention lock for fine granularity

Page 11: Inside database

Shared/Exclusive Lock

• S: shared lock for read
• X: exclusive lock for write

[Figure: an S lock on a resource can be held by Trn 1, Trn 2, and Trn 3 at the same time; an X lock can be held by only one of the three transactions at a time.]
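The shared/exclusive rule can be sketched as a toy lock table. This is an illustrative, non-blocking version only: the class name `LockTable` and its methods are hypothetical, a denied request returns False instead of waiting, and lock upgrades and re-entrant requests are not handled.

```python
class LockTable:
    """Toy lock table: S locks may be shared, X locks are exclusive."""

    def __init__(self):
        # resource -> (mode, set of transaction ids holding the lock)
        self.holders = {}

    def try_lock(self, trn, resource, mode):
        """Grant the lock and return True, or return False on conflict."""
        held = self.holders.get(resource)
        if held is None:
            self.holders[resource] = (mode, {trn})
            return True
        held_mode, trns = held
        if mode == "S" and held_mode == "S":
            trns.add(trn)  # readers can share the resource
            return True
        return False  # any combination involving X conflicts

    def unlock(self, trn, resource):
        """Release the lock held by trn; drop the entry when nobody holds it."""
        _, trns = self.holders[resource]
        trns.discard(trn)
        if not trns:
            del self.holders[resource]


locks = LockTable()
print(locks.try_lock("Trn 1", "record-1", "S"))  # first reader
print(locks.try_lock("Trn 2", "record-1", "S"))  # second reader shares
print(locks.try_lock("Trn 3", "record-1", "X"))  # writer is refused
```

A real lock manager would queue the refused writer and wake it when the readers release their locks.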

Page 12: Inside database

Intention Lock

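Lock compatibility matrix (O: compatible, _: conflict):

      IS   IX   S    X
IS    O    O    O    _
IX    O    O    _    _
S     O    _    O    _
X     _    _    _    _

http://dev.mysql.com/doc/refman/5.5/en/innodb-lock-modes.html

[Figure: lock hierarchy example. Intention locks (IX, IS) are held on the coarser-grained resource, and X or S locks on the finer-grained resources beneath it.]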
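The compatibility rules for IS, IX, S, and X can be written down as a lookup table. The sketch below follows the compatibility matrix in the MySQL page referenced above; the names `COMPATIBLE` and `can_grant` are hypothetical.

```python
# For each held mode, the set of requested modes that can coexist with it.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}

def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every held lock."""
    return all(requested in COMPATIBLE[held] for held in held_modes)

# Two writers on different records can both hold IX on the same table.
print(can_grant("IX", ["IX", "IS"]))
# A table-level S lock must wait while any transaction holds IX on the table.
print(can_grant("S", ["IX"]))
```

The intention modes let a transaction check for conflicts at the table level without inspecting every record lock beneath it.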

Page 13: Inside database

Logging with Redo Log

Hector 2010

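T1: Read(A,t); t ← t×2; Write(A,t);
    Read(B,t); t ← t×2; Write(B,t);
    Output(A); Output(B)

Initial state: A: 8, B: 8 in memory; A: 8, B: 8 in the DB.

LOG:
<T1, start>
<T1, A, 16>
<T1, B, 16>
<T1, commit>
<T1, end>

After the writes, memory holds A: 16, B: 16. After output, the DB holds A: 16, B: 16.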
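The redo-log idea can be simulated in a few lines: new values are logged before the database is touched, and after a crash the log is replayed for committed transactions only. This is a minimal sketch under assumptions: the log is an in-memory list standing in for durable storage, and the function names `run_t1` and `recover` are hypothetical.

```python
db = {"A": 8, "B": 8}  # state on disk
log = []               # redo log, assumed durable once a record is appended

def run_t1(memory):
    """T1 doubles A and B in memory and logs the new values, then commits."""
    log.append(("T1", "start"))
    for key in ("A", "B"):
        t = memory[key] * 2
        memory[key] = t
        log.append(("T1", key, t))  # redo record carries the NEW value
    log.append(("T1", "commit"))

def recover(disk_state, redo_log):
    """Replay the new values of committed transactions onto the disk state."""
    committed = {rec[0] for rec in redo_log if rec[1:] == ("commit",)}
    for rec in redo_log:
        if len(rec) == 3 and rec[0] in committed:
            _, key, new_value = rec
            disk_state[key] = new_value
    return disk_state

memory = dict(db)
run_t1(memory)
# Simulated crash before Output(A) and Output(B): db still holds the old values.
recovered = recover(dict(db), log)
print(db, recovered)
```

If the crash had happened before the commit record reached the log, `recover` would leave the disk state untouched, because uncommitted changes never reach the database under redo logging.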

Page 14: Inside database

Buffer Cache Management

• Allowance of dirty cache
  – No: write through
  – Yes: write back
• Eviction strategy
  – LRU: least recently used
  – …
• Prefetch
  – Sequential
  – …
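A write-back cache with LRU eviction can be sketched with an ordered dictionary. This is an illustration only: the class name `BufferCache`, the dict standing in for the disk, and the block contents are assumptions, and every requested block is assumed to exist on disk.

```python
from collections import OrderedDict

class BufferCache:
    """Toy write-back buffer cache with LRU eviction."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk            # dict: block id -> data (stands in for storage)
        self.pages = OrderedDict()  # block id -> [data, dirty]; least recent first

    def _load(self, block_id):
        if block_id in self.pages:
            self.pages.move_to_end(block_id)  # mark as most recently used
        else:
            if len(self.pages) >= self.capacity:
                victim, (data, dirty) = self.pages.popitem(last=False)
                if dirty:
                    self.disk[victim] = data  # write back only on eviction
            self.pages[block_id] = [self.disk[block_id], False]
        return self.pages[block_id]

    def read(self, block_id):
        return self._load(block_id)[0]

    def write(self, block_id, data):
        page = self._load(block_id)
        page[0], page[1] = data, True  # dirty: disk is updated later


disk = {1: "a", 2: "b", 3: "c"}
cache = BufferCache(capacity=2, disk=disk)
cache.write(1, "A")  # block 1 is dirty in the cache; disk still holds "a"
cache.read(2)
cache.read(3)        # cache is full: block 1 is evicted and written back
print(disk)
```

A write-through cache would instead update `disk` inside `write`, so no page would ever be dirty.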

Page 15: Inside database

Data Structures

[Figure: on-disk data structures: Dictionary, Tables, Indexes, …, Statistics, and a sequence of Logs.]

Page 16: Inside database

Inside Data Block

Hector 2010

[Figure: a data block consisting of a Header, records R1, R2, R3, R4, and Free space.]

Page 17: Inside database

Structures for Index

[Figure: two index structures: a Tree, and a Hash table addressed through a Hash Function.]

Page 18: Inside database

B+tree Example

Hector 2010

Root: [100]
├─ [30]
│   ├─ leaf: 3, 5, 11
│   └─ leaf: 30, 35
└─ [120, 150, 180]
    ├─ leaf: 100, 101, 110
    ├─ leaf: 120, 130
    ├─ leaf: 150, 156, 179
    └─ leaf: 180, 200
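A lookup in this example tree can be traced with a short sketch. The tuple-based node representation and the function name `search` are assumptions for illustration; a real B+tree also links its leaves and supports inserts and deletes, which are omitted here.

```python
from bisect import bisect_right

# Internal node: ("node", keys, children); leaf: ("leaf", keys).
# Keys are taken from the example above.
tree = ("node", [100], [
    ("node", [30], [
        ("leaf", [3, 5, 11]),
        ("leaf", [30, 35]),
    ]),
    ("node", [120, 150, 180], [
        ("leaf", [100, 101, 110]),
        ("leaf", [120, 130]),
        ("leaf", [150, 156, 179]),
        ("leaf", [180, 200]),
    ]),
])

def search(node, key):
    """Descend from the root to one leaf, then check whether it holds the key."""
    while node[0] == "node":
        _, keys, children = node
        # Keys smaller than keys[i] go to child i; equal or larger go right.
        node = children[bisect_right(keys, key)]
    return key in node[1]

print(search(tree, 156), search(tree, 99))
```

Each lookup visits one node per level, which is where the logarithmic cost of a tree index comes from.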

Page 19: Inside database

Hash Example

Hector 2010

INSERT: h(a) = 1, h(b) = 2, h(c) = 1, h(d) = 0

Bucket 0: d
Bucket 1: a, c
Bucket 2: b
Bucket 3: (empty)

Then insert e with h(e) = 1: e is added to bucket 1.
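The insert sequence above can be replayed with a four-bucket table. In this sketch the hash values are hard-coded from the example instead of being computed, and the names `H`, `insert`, and `lookup` are hypothetical. Bucket capacity and overflow blocks are not modeled.

```python
# Hash values as given in the example; a real index would compute h(key).
H = {"a": 1, "b": 2, "c": 1, "d": 0, "e": 1}

buckets = [[] for _ in range(4)]  # buckets 0..3

def insert(key):
    """Append the key to the bucket selected by its hash value."""
    buckets[H[key]].append(key)

def lookup(key):
    """Only one bucket is inspected, regardless of the total number of keys."""
    return key in buckets[H[key]]

for key in ["a", "b", "c", "d", "e"]:
    insert(key)

print(buckets)
```

Keys a, c, and e collide in bucket 1, so a lookup for any of them scans that one bucket.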

Page 20: Inside database

Tree vs Hash for Indexing

• Tree
  – O(log N) for single record retrieval
  – Efficient range scan is available
• Hash
  – O(1) for single record retrieval
  – Range scan is not supported
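The difference in range-scan support can be illustrated with a sorted key list standing in for a tree index and a dict standing in for a hash index. This is a sketch under assumptions: the sample keys, record names, and function names are made up, and a sorted list only approximates the ordering property of a tree.

```python
from bisect import bisect_left, bisect_right

# Sample keys, kept sorted as the leaf level of a tree index would keep them.
sorted_keys = [3, 5, 11, 30, 35, 100, 101, 110, 120, 130]

def tree_range_scan(keys, lo, hi):
    """Binary-search both ends, then return the contiguous slice between them."""
    return keys[bisect_left(keys, lo):bisect_right(keys, hi)]

# A hash index gives direct lookup by key but keeps no key order.
hash_index = {k: f"record-{k}" for k in sorted_keys}

def hash_range_scan(index, lo, hi):
    """Without key order, every entry has to be examined."""
    return sorted(k for k in index if lo <= k <= hi)

print(tree_range_scan(sorted_keys, 30, 110))
print(hash_index[35])
print(hash_range_scan(hash_index, 30, 110))
```

Both scans return the same keys, but the ordered version touches only the matching range while the hash version visits all N entries.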

Page 21: Inside database

Storage

[Figure: storage stack.
OS Functionalities for Storage IO: File System, Buffer Cache Manager, Logical Unit/Software RAID Manager, SCSI Protocol Stack/HBA Drivers.
Devices: Hard Disk Drive, Solid State Drive, RAID Storage/Storage Unit, each with its own Controller and Cache.]

Page 22: Inside database

Hard Disk Drive

[Figure: a Disk Platter with a Track and a Sector marked.]

$T_{access} = T_{headseek} + T_{rotation} + T_{transfer}$

[Figure: two plots of IO Response versus lseek size, one for small lseek and one for large lseek (smoothed).]
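The access-time decomposition can be turned into a small calculation. The sketch below is not from the slides: the function name, the parameter values, and the use of half a revolution as the average rotational delay are all assumptions chosen for illustration.

```python
def access_time_ms(seek_ms, rpm, size_bytes, transfer_mb_per_s):
    """Access time = head seek + rotational delay + transfer, in milliseconds."""
    # Assumption: on average the platter turns half a revolution before
    # the target sector passes under the head.
    rotation_ms = 0.5 * 60_000 / rpm
    transfer_ms = size_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotation_ms + transfer_ms

# Hypothetical drive: 4 ms seek, 7200 rpm, 100 MB/s, reading one 4 KiB block.
random_read = access_time_ms(4.0, 7200, 4096, 100)
# The same block with no seek, as in a sequential access pattern.
sequential_read = access_time_ms(0.0, 7200, 4096, 100)
print(round(random_read, 3), round(sequential_read, 3))
```

For small blocks the seek and rotation terms dominate the transfer term, which is why access patterns with long seeks respond more slowly.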

Page 23: Inside database

Summary

• DBMS
  – Query Processor
  – Storage Engine
• Storage

Page 24: Inside database

References

• Database System Implementation, lecture notes at Stanford University
  – http://infolab.stanford.edu/~ullman/dbsi.html
• MySQL InnoDB Internal
  – http://www.innodb.com/wp/wp-content/uploads/2009/05/innodb-file-formats-and-source-code-structure.pdf
• MySQL Reference Manual
  – http://dev.mysql.com/doc/

Page 25: Inside database

For Further Study

• Fundamentals of Database Systems
  – http://www.amazon.com/Fundamentals-Database-Systems-Ramez-Elmasri/dp/0136086209
• Books recommended by Leo’s Chronicle
  – http://leoclock.blogspot.com/2009/01/blog-post_07.html

Page 26: Inside database

Fundamentals of Database Systems
