cps216: advanced database systems
Post on 31-Dec-2015
1
CPS216: Advanced Database Systems
Notes 05: Operators for Data Access (contd.)
Shivnath Babu
2
Insertion in a B-Tree
[Figure: B-tree with n = 2 holding keys 15, 36, 49. Insert: 62]
3
Insertion in a B-Tree
[Figure: the same tree with 62 added. Insert: 62]
4
Insertion in a B-Tree
[Figure: B-tree with n = 2 holding keys 15, 36, 49, 62. Insert: 50]
5
Insertion in a B-Tree
[Figure: the tree with 50 added; keys 15, 36, 49, 50, 62. Insert: 50]
6
Insertion in a B-Tree
[Figure: B-tree with n = 2 holding keys 15, 36, 49, 50, 62. Insert: 75]
7
Insertion in a B-Tree
[Figure: the tree with 75 added; keys 15, 36, 49, 50, 62, 75. Insert: 75]
8-18
Insertion
[Figures not preserved in this transcript]
19
Insertion: Primitives
  Inserting into a leaf node
  Splitting a leaf node
  Splitting an internal node
  Splitting root node
20
Inserting into a Leaf Node
[Figure: leaf holding 54, 57, 60, 62; key to insert: 58]
21
Inserting into a Leaf Node
[Figure: same leaf; key to insert: 58]
22
Inserting into a Leaf Node
[Figure: leaf 54, 57, 60, 62 with 58 placed in sorted position]
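The leaf insert shown on these slides can be sketched in a few lines. This is a minimal sketch, assuming a leaf is a sorted Python list with room for the new key and that duplicate keys are ignored; the function name `insert_into_leaf` is hypothetical and not from the slides.

```python
import bisect

def insert_into_leaf(keys, key):
    """Insert key into a sorted leaf (a plain list), keeping it sorted.

    Assumption: the leaf has room, and duplicates are silently ignored.
    """
    pos = bisect.bisect_left(keys, key)   # first position with keys[pos] >= key
    if pos < len(keys) and keys[pos] == key:
        return keys                       # key already present
    keys.insert(pos, key)
    return keys

leaf = [54, 57, 60, 62]
insert_into_leaf(leaf, 58)
print(leaf)  # [54, 57, 58, 60, 62]
```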
23
Splitting a Leaf Node
[Figure: key 61 arrives at the full leaf 54, 57, 58, 60, 62; parent keys 54, 66]
24
Splitting a Leaf Node
[Figure: same as above]
25
Splitting a Leaf Node
[Figure: the leaf is split and 61 is placed; keys shown: 54, 57, 58 and 60, 61, 62; parent keys 54, 66]
26
Splitting a Leaf Node
[Figure: as above, with key 59 shown for the parent]
27
Splitting a Leaf Node
[Figure: as above]
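A leaf split can be sketched as follows. This is a sketch under stated assumptions, not the slides' exact procedure: the leaf is a sorted list, the left half keeps the extra key when the count is odd, and the separator handed to the parent is the first key of the right leaf. The figure appears to pass 59 up instead; any value greater than every key on the left and no greater than the smallest key on the right would separate the two halves equally well. The name `split_leaf` is hypothetical.

```python
import bisect

def split_leaf(keys, new_key):
    """Insert new_key into a full leaf and split the result in two.

    Returns (left, right, separator). The separator is the first key of the
    right leaf, which is one common convention (an assumption here).
    """
    merged = list(keys)
    bisect.insort(merged, new_key)        # sorted insert into a copy
    mid = (len(merged) + 1) // 2          # left half gets the extra key if odd
    left, right = merged[:mid], merged[mid:]
    return left, right, right[0]

left, right, separator = split_leaf([54, 57, 58, 60, 62], 61)
print(left, right, separator)  # [54, 57, 58] [60, 61, 62] 60
```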
28
Splitting an Internal Node
[Figure: key 59 arrives at the internal node 40, 54, 66, 74, 84, which covers 21 … 99; child ranges shown: [54, 59), [59, 66), [66, 74)]
29
Splitting an Internal Node
[Figure: the node now holds 40, 54, 59, 66, 74, 84; child ranges shown: [54, 59), [59, 66), [66, 74)]
30
Splitting an Internal Node
[Figure: the node is split into 40, 54, 59 covering [21, 66) and 74, 84 covering [66, 99); key 66 is shown above them]
31
Splitting the Root
[Figure: key 59 arrives at the root 40, 54, 66, 74, 84; child ranges shown: [54, 59), [59, 66), [66, 74)]
32
Splitting the Root
[Figure: same as above]
33
Splitting the Root
[Figure: the root is split into 40, 54, 59 and 74, 84, with 66 in a new root above them]
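Splitting an internal node differs from splitting a leaf in that the middle key moves up rather than being copied. The sketch below assumes a node is a pair `(keys, children)` with one more child than keys; the names `split_internal` and `split_root` and the child labels `c0` to `c6` are hypothetical placeholders.

```python
def split_internal(keys, children):
    """Split an overfull internal node.

    keys: sorted separator keys; children: len(keys) + 1 child references.
    The middle key moves up to the parent and is kept in neither half.
    """
    mid = len(keys) // 2
    up_key = keys[mid]
    left = (keys[:mid], children[:mid + 1])
    right = (keys[mid + 1:], children[mid + 1:])
    return left, up_key, right

def split_root(keys, children):
    """Split the root: the key that moves up becomes the only key of a new root."""
    left, up_key, right = split_internal(keys, children)
    return ([up_key], [left, right])

children = ["c0", "c1", "c2", "c3", "c4", "c5", "c6"]
left, up_key, right = split_internal([40, 54, 59, 66, 74, 84], children)
print(left[0], up_key, right[0])  # [40, 54, 59] 66 [74, 84]
```

With the keys from the figure, 66 moves up and the two halves hold 40, 54, 59 and 74, 84. When the node being split is the root, the tree grows by one level.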
34
Deletion
35
Deletion
[Figure annotation: redistribute]
36
Deletion
37
Deletion - II
38
Deletion - II
[Figure annotation: merge]
39
Deletion - II
40
Deletion - II
41
Deletion - II
42
Deletion - II
[Figure annotations: merge; Not needed]
43
Deletion - II
44
Deletion: Primitives
  Delete key from a leaf
  Redistribute keys between sibling leaves
  Merge a leaf into its sibling
  Redistribute keys between two sibling internal nodes
  Merge an internal node into its sibling
45
Merge Leaf into Sibling
[Figure: leaves holding 54, 58, 64, 68, 72, 75; parent keys …, 67, 72, 85]
46
Merge Leaf into Sibling
[Figure: leaves holding 54, 58, 64, 68, 75; parent keys …, 67, 72, 85]
47
Merge Leaf into Sibling
[Figure: same as above]
48
Merge Leaf into Sibling
[Figure: leaves holding 54, 58, 64, 68, 75; parent keys …, 72, 85, with 67 removed]
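Merging a leaf into its sibling removes one separator from the parent. The sketch below assumes `parent_keys[j]` separates `leaves[j]` from `leaves[j + 1]`; the exact leaf layout in the figure is only partly recoverable, so the three-leaf layout used in the example is a hypothetical one built from the figure's keys. The function name is also hypothetical.

```python
def merge_leaf_into_left_sibling(parent_keys, leaves, idx):
    """Merge leaves[idx] into leaves[idx - 1] and drop the separator between them.

    Assumption: parent_keys[j] separates leaves[j] and leaves[j + 1],
    so len(parent_keys) == len(leaves) - 1 and idx >= 1.
    """
    leaves[idx - 1].extend(leaves[idx])   # left sibling absorbs the keys
    del leaves[idx]                       # the underfull leaf disappears
    del parent_keys[idx - 1]              # its separator is no longer needed

# Hypothetical layout: the middle leaf has become underfull.
parent_keys = [67, 72]
leaves = [[54, 58, 64], [68], [75]]
merge_leaf_into_left_sibling(parent_keys, leaves, 1)
print(leaves, parent_keys)  # [[54, 58, 64, 68], [75]] [72]
```

If the parent itself becomes underfull after losing a separator, the same kind of repair has to continue one level up.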
49
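Merge Internal Node into Sibling
[Figure: sibling internal nodes holding 41, 48, 52 and 63, 74, separated by parent key 59; child ranges shown: [52, 59), [59, 63)]
50
Merge Internal Node into Sibling
[Figure: the merged node holds 41, 48, 52, 59, 63, with 59 pulled down from the parent; child ranges shown: [52, 59), [59, 63)]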
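For internal nodes, the separator key from the parent is pulled down between the two merged key lists. A minimal sketch, assuming each node is a `(keys, children)` pair; the function name and the child labels `c0` to `c5` are hypothetical.

```python
def merge_internal_nodes(left, right, parent_key):
    """Merge two sibling internal nodes given as (keys, children) pairs.

    The separator from the parent is pulled down between the two key lists,
    so the merged node again has one more child than keys.
    """
    left_keys, left_children = left
    right_keys, right_children = right
    return (left_keys + [parent_key] + right_keys,
            left_children + right_children)

left = ([41, 48, 52], ["c0", "c1", "c2", "c3"])
right = ([63], ["c4", "c5"])
merged_keys, merged_children = merge_internal_nodes(left, right, 59)
print(merged_keys)  # [41, 48, 52, 59, 63]
```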
51
B-Tree Roadmap
B-Tree
  Recap
  Insertion (recap)
  Deletion
  Construction
  Efficiency
B-Tree variants
Hash-based Indexes
52
Question
How does insertion-based construction perform?
53
B-Tree Construction
[Figure: Sort. Keys in sorted order: 11, 13, 15, 21, 34, 41, 48, 57, 62, 75, 81, 97]
54
B-Tree Construction
[Figure: Scan. Unsorted input 75, 97, 21, 41, 57, 15, 11, 13, 48, 34, 62, 81; leaves 11 13 15 | 21 34 41 | 48 57 62 | 75 81 97]
55
B-Tree Construction
[Figure: Scan. Keys 21, 48, 75 form the level above the leaves 11 13 15 | 21 34 41 | 48 57 62 | 75 81 97]
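The sort-then-scan construction can be sketched as below. This is a minimal sketch, assuming every leaf is packed full and that the key for the next level is the first key of each leaf except the leftmost; `bulk_load` and `leaf_capacity` are hypothetical names.

```python
def bulk_load(keys, leaf_capacity):
    """Sort the keys, pack them into leaves in one scan, and derive the
    separator keys for the level above.

    Assumption: leaves are filled completely, left to right.
    """
    ordered = sorted(keys)
    leaves = [ordered[i:i + leaf_capacity]
              for i in range(0, len(ordered), leaf_capacity)]
    separators = [leaf[0] for leaf in leaves[1:]]
    return leaves, separators

keys = [75, 97, 21, 41, 57, 15, 11, 13, 48, 34, 62, 81]
leaves, separators = bulk_load(keys, 3)
print(leaves)      # [[11, 13, 15], [21, 34, 41], [48, 57, 62], [75, 81, 97]]
print(separators)  # [21, 48, 75]
```

Each leaf is written once and in order, whereas inserting the keys one at a time touches a root-to-leaf path per key and leaves split nodes partly empty.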
56
B-Tree Construction
Why is sort-based construction better than insertion-based construction?
57
Cost of B-Tree Operations
Height of B-Tree: H
Assume no duplicates
Question: what is the random I/O cost of:
  Insertion:
  Deletion:
  Equality search:
  Range search:
58
Height of B-Tree
Number of keys: N
B-Tree parameter: n
$\text{Height} \approx \log_n N = \frac{\log N}{\log n}$
In practice: 2-3 levels
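To see why 2-3 levels is typical, the height estimate can be evaluated for a few sizes. The sketch below treats n as the fanout of every node, which is an approximation (real nodes may be partly full), and uses integer arithmetic to avoid floating-point rounding in the logarithm. The function name `estimated_height` is hypothetical.

```python
def estimated_height(num_keys, fanout):
    """Smallest number of levels h such that fanout ** h >= num_keys.

    Assumption: every node has exactly `fanout` children (fanout >= 2).
    """
    levels = 1
    reach = fanout            # keys reachable with the current number of levels
    while reach < num_keys:
        reach *= fanout
        levels += 1
    return levels

print(estimated_height(10**6, 100))   # 3
print(estimated_height(10**9, 1000))  # 3
```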
59
Question: How do you pick parameter n?
1. Ignore inserts and deletes
2. Optimize for equality searches
3. Assume no duplicates
60
Roadmap
B-Tree
B-Tree variants
  Sparse Index
  Duplicate Keys
Hash-based Indexes
61
Roadmap
B-Tree
B-Tree variants
Hash-based Indexes
  Static Hash Table
  Extensible Hash Table
  Linear Hash Table
62
Hash-Based Indexes
Adaptations of main memory hash tables
Support equality searches
No range searches
63
Indexing Problem (recap)
[Figure: a query A = val is answered through an index whose keys a1, a2, ..., ai, ..., an carry record pointers]
64
Main Memory Hash Table
h(key) = key % 8
[Figure: a key is mapped by h(key) to one of buckets 0 to 7; keys shown: 32, 10, 48, 27, 75, 21, 55; empty buckets and chain ends are marked (null)]
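The main memory version can be sketched with the hash function and keys from the figure. This is a minimal sketch in which Python lists stand in for the linked lists of a chained hash table; `build_table` and `contains` are hypothetical names.

```python
NUM_BUCKETS = 8

def h(key):
    """Hash function from the slide: h(key) = key % 8."""
    return key % NUM_BUCKETS

def build_table(keys):
    """Place each key in the chain of its bucket (a list stands in for the chain)."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for key in keys:
        buckets[h(key)].append(key)
    return buckets

def contains(buckets, key):
    """Equality search: only the one bucket chosen by h(key) is inspected."""
    return key in buckets[h(key)]

table = build_table([32, 10, 48, 27, 75, 21, 55])
print(table[0], table[3])  # [32, 48] [27, 75]
```

Keys 32 and 48 share bucket 0 and keys 27 and 75 share bucket 3, which is where the chains come from.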
65
Adapting to disk
1 Hash Bucket = 1 Block
  All keys that hash to bucket stored in the block
  Intuition: keys in a bucket usually accessed together
  No need for linked lists of keys …
66
Adapting to Disk
How do we handle this?
67
Adapting to disk
1 Hash Bucket = 1 Block
  All keys that hash to bucket stored in the block
  Intuition: keys in a bucket usually accessed together
  No need for linked lists of keys …
  … but need linked list of blocks (overflow blocks)
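Overflow blocks can be sketched as a chain hanging off each bucket's primary block. The sketch below is an in-memory stand-in, assuming a block holds at most two keys (`BLOCK_CAPACITY` is a hypothetical value) and counting each block visited as one read; the class and function names are hypothetical.

```python
BLOCK_CAPACITY = 2  # assumption: keys per block, kept small for the example

class Block:
    def __init__(self):
        self.keys = []
        self.overflow = None   # next block in this bucket's chain, if any

def insert(buckets, key):
    """Append key to the first block in the bucket's chain that has room."""
    block = buckets[key % len(buckets)]
    while len(block.keys) >= BLOCK_CAPACITY:
        if block.overflow is None:
            block.overflow = Block()
        block = block.overflow
    block.keys.append(key)

def lookup(buckets, key):
    """Return (found, blocks_read); every block visited counts as one read."""
    block = buckets[key % len(buckets)]
    reads = 0
    while block is not None:
        reads += 1
        if key in block.keys:
            return True, reads
        block = block.overflow
    return False, reads

buckets = [Block() for _ in range(4)]
for k in [0, 4, 8, 12, 1]:
    insert(buckets, k)
print(lookup(buckets, 8))  # (True, 2)
```

Keys 0, 4, 8, 12 all hash to bucket 0, so finding 8 takes two block reads instead of one.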
68
Adapting to Disk
69
Adapting to Disk
[Figure: buckets 0, 1, 2]
Is there any other issue?
Map 'bucket id' to disk location
70
Adapting to disk
1 Hash Bucket = 1 Block
Bucket Id → Disk Address mapping
  Contiguous blocks
  Store mapping in main memory
    Too large?
71
Beware of claims that assume 1 I/O for hash tables and 3 I/Os for B-Tree!!
72
Adapting to disk
1 Hash Bucket = 1 Block (or more than one contiguous block)
Bucket Id → Disk Address mapping
Number of buckets
  ≈ Number of keys (main memory version)
  ≈ Number of blocks (disk version)
Textbook: Static Hash Table
73
Assigned Reading
Insertion and Deletion on Static Hash Table
Section 13.4
74
Roadmap
B-Tree
B-Tree variants
Hash-based Indexes
  Static Hash Table
  Extensible Hash Table
  Linear Hash Table
75
Dynamic Hash Indexes
Static Hash Table:
  Fixed number of buckets
  Waste space / inefficient
Dynamic Hash Tables:
  Number of buckets can increase / decrease dynamically
76
Extensible Hash Table: Main Ideas (Abstract)
Hash Function: {Keys} → {Large space of hash values}
Buckets dynamically partition space of hash values
Insertions: partitioning grows finer, i.e., more buckets
Deletions: partitioning grows coarser, i.e., fewer buckets
77
Extensible Hash Table: Main Ideas (concrete)
Hash Function: {Keys} → bit string of length b
Example: 0 1 1 1 0 1 0 0
Bucket: prefix of bit string
All (keys with) hash values having that prefix fall into that bucket
78
[Figure: buckets with prefixes 0, 10, 11; hash values 01011010, 01100110, 10110001, 10011010, 11011110; which bucket does each hash value fall into?]
79
[Figure: the same buckets and hash values, with directory entries 00, 01, 10, 11; i = 2]
i = max length of prefix
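The "which bucket?" question from the figure can be answered mechanically. The sketch below assumes hash values are strings of 0s and 1s and that no bucket prefix is a prefix of another. The directory dictionary is one plausible reading of the i = 2 figure, not something the slides spell out, and the function name is hypothetical.

```python
def bucket_for(hash_bits, prefixes):
    """Return the bucket prefix that hash_bits starts with.

    Assumption: the prefixes are prefix-free, so at most one can match.
    """
    for prefix in prefixes:
        if hash_bits.startswith(prefix):
            return prefix
    raise ValueError("no bucket covers " + hash_bits)

prefixes = ["0", "10", "11"]
for value in ["01011010", "01100110", "10110001", "10011010", "11011110"]:
    print(value, "->", bucket_for(value, prefixes))

# With i = 2 the directory is indexed by the first two bits, and both
# 00 and 01 lead to the bucket with prefix 0 (assumed layout).
directory = {"00": "0", "01": "0", "10": "10", "11": "11"}
print(directory["01011010"[:2]])  # 0
```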
80-94
Insertion
[Figures: a step-by-step insertion example.
 Slides 80-84: i = 0 and a single bucket; hash values 10110001, 00110101, then 11010010 are shown.
 Slides 85-88: buckets for prefixes 0 and 1 appear, i becomes 1, and 11010010 is placed.
 Slides 89-94: 11001101 is shown, buckets for prefixes 10 and 11 appear, i becomes 2 with directory entries 00, 01, 10, 11, and 11001101 is placed.]
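The insertion walk-through can be replayed in code. The sketch below is one possible implementation, assuming two hash values per bucket (`BUCKET_CAPACITY` is inferred from the example, not stated), 8-bit hash values given as strings, and a directory stored as a list of 2**i references; all class and method names are hypothetical.

```python
BUCKET_CAPACITY = 2   # assumption: hash values per bucket
HASH_BITS = 8         # hash values are 8-bit strings, as in the example

class Bucket:
    def __init__(self, depth):
        self.depth = depth    # number of prefix bits this bucket uses
        self.values = []

class ExtensibleHashTable:
    def __init__(self):
        self.i = 0                       # prefix bits used by the directory
        self.directory = [Bucket(0)]     # always 2 ** i entries

    def _slot(self, bits):
        return int(bits[:self.i], 2) if self.i else 0

    def insert(self, bits):
        while True:
            bucket = self.directory[self._slot(bits)]
            if len(bucket.values) < BUCKET_CAPACITY:
                bucket.values.append(bits)
                return
            self._split(bucket)          # make room, then retry

    def _split(self, bucket):
        if bucket.depth >= HASH_BITS:
            raise OverflowError("too many identical hash values")
        if bucket.depth == self.i:
            # The directory doubles: old entry j becomes entries 2j and 2j + 1.
            self.directory = [b for b in self.directory for _ in (0, 1)]
            self.i += 1
        bucket.depth += 1
        d = bucket.depth
        sibling = Bucket(d)
        old = bucket.values
        bucket.values = [v for v in old if v[d - 1] == "0"]
        sibling.values = [v for v in old if v[d - 1] == "1"]
        # Re-point the directory entries whose d-th bit is 1.
        for slot, b in enumerate(self.directory):
            if b is bucket and f"{slot:0{self.i}b}"[d - 1] == "1":
                self.directory[slot] = sibling

table = ExtensibleHashTable()
for value in ["10110001", "00110101", "11010010", "11001101"]:
    table.insert(value)
print(table.i)  # 2
```

After the four inserts the directory has four entries, of which 00 and 01 share one bucket. The doubling step in `_split` is the cost that motivates the linear hash table.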
95
Deletion
Inverse of insertion: work out details
96
Textbook Notation
[Figure: directory with i = 2 and entries 00, 01, 10, 11; bucket labels shown: 1, 0; annotation: Number of bits in prefix]
97
Extensible Hash Table
One Issue:
  Directory doubles in size during some inserts
98
Roadmap
B-Tree
B-Tree variants
Hash-based Indexes
  Static Hash Table
  Extensible Hash Table
  Linear Hash Table
99
Linear Hash Table
Differences from Extensible Hash Table:
  Bucket: suffix of the hash value
  Grows linearly (avoids doubling of directory)
100
Linear Hash Table
[Figure: buckets with suffixes 00, 10, 1; hash values 01011000, 01100100, 10110001, 10011001, 11011110]
101
Linear Growth
[Figure: buckets 0 and 1]
102
Linear Growth
[Figure: buckets 00, 1, 10; annotation: redistribute]
103
Linear Growth
[Figure: buckets 00, 01, 10, 11; annotation: redistribute]
104
What does linear growth buy?
[Figure: buckets 000, 01, 10, 11, 100 next to a directory with i = 3 and entries 000 to 111; entries 101, 110, 111 are marked: Redundant if we know # buckets = 5]
105
What does linear growth buy?
[Figure: buckets 000, 01, 10, 11, 100 with entries 000, 001, 010, 011, 100; labels shown: i = 3, n = 3]
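Knowing the number of buckets is enough to replace the directory with a computation. The sketch below uses a common textbook addressing rule, which is an assumption here since the slides do not spell it out: take the last i bits of the hash value, and if that bucket does not exist yet, clear the leading bit. The function name and parameter names are hypothetical.

```python
def bucket_number(hash_value, i, num_buckets):
    """Map a hash value to a bucket in a linear hash table.

    Assumption: 2 ** (i - 1) < num_buckets <= 2 ** i, and buckets are
    numbered 0 .. num_buckets - 1.
    """
    m = hash_value & ((1 << i) - 1)   # the last i bits (the suffix)
    if m >= num_buckets:              # that bucket has not been created yet
        m -= 1 << (i - 1)             # clear the leading bit of the suffix
    return m

# i = 3 and 5 buckets: suffixes 101, 110, 111 fall back to 01, 10, 11.
for suffix in range(8):
    print(format(suffix, "03b"), "->", bucket_number(suffix, 3, 5))
```

With 5 buckets, suffixes 000 to 100 map to their own bucket and 101, 110, 111 map to buckets 1, 2, 3, so only i and the bucket count need to be stored.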