![Page 1: From Theory to Practice: Efficient Join Query Processing in a Parallel Database System Shumo Chu, Magdalena Balazinska and Dan Suciu Database Group, CSE,](https://reader036.vdocuments.net/reader036/viewer/2022062301/5697c0041a28abf838cc4145/html5/thumbnails/1.jpg)
From Theory to Practice: Efficient Join Query Processing in a
Parallel Database System
Shumo Chu, Magdalena Balazinska and Dan Suciu
Database Group, CSE, University of Washington
Motivation
In industry and science, users need to analyze large datasets
Myria: parallel DBMS developed at UW
New class of queries:
• Knowledge base exploration
• Social network analysis: find all triangles
Two key differences:
• Multiple tables need to be joined
• Query structure may be cyclic
Traditional Parallel Join Evaluation
[Figure: two-step plan for A ⋈ B ⋈ C on Workers 1 to 3. Step 1: shuffle A and B on y, then each worker joins its partitions A′ ⋈ B′. Step 2: shuffle A ⋈ B and C on (x, z), then each worker joins (A′ ⋈ B′) ⋈ C′.]
Solution 1: shuffle on the join attributes
• Large intermediate result
• Skew on shuffle
Solution 2: keep the largest table in place, broadcast the others
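A minimal single-process sketch of Solution 1 for the triangle query, included to show where the large intermediate result comes from. The helper names (`shuffle`, `hash_join`, `triangles_regular_shuffle`) and the toy edge list are illustrative assumptions, not Myria code; workers are simulated as Python lists.

```python
from collections import defaultdict

def shuffle(tuples, key, workers):
    """Hash-partition tuples across simulated workers by the given key."""
    parts = [[] for _ in range(workers)]
    for t in tuples:
        parts[hash(key(t)) % workers].append(t)
    return parts

def hash_join(left, right, left_key, right_key, combine):
    """Local equi-join: build a hash index on left, probe it with right."""
    index = defaultdict(list)
    for l in left:
        index[left_key(l)].append(l)
    return [combine(l, r) for r in right for l in index.get(right_key(r), ())]

def triangles_regular_shuffle(A, B, C, workers=3):
    # Step 1: shuffle A(x, y) and B(y, z) on y, then join locally.
    a_parts = shuffle(A, lambda t: t[1], workers)
    b_parts = shuffle(B, lambda t: t[0], workers)
    ab = []
    for w in range(workers):
        ab += hash_join(a_parts[w], b_parts[w],
                        lambda a: a[1], lambda b: b[0],
                        lambda a, b: (a[0], a[1], b[1]))
    # Step 2: shuffle the intermediate result and C(z, x) on (x, z), join locally.
    ab_parts = shuffle(ab, lambda t: (t[0], t[2]), workers)
    c_parts = shuffle(C, lambda t: (t[1], t[0]), workers)
    out = []
    for w in range(workers):
        out += hash_join(ab_parts[w], c_parts[w],
                         lambda t: (t[0], t[2]), lambda c: (c[1], c[0]),
                         lambda t, c: t)
    # Every input tuple and every intermediate tuple is shuffled once.
    shuffled = len(A) + len(B) + len(ab) + len(C)
    return sorted(out), shuffled

# Toy directed graph (hypothetical data); A, B and C are all the edge table.
edges = [(1, 2), (2, 3), (3, 1), (1, 3)]
triangles, shuffled = triangles_regular_shuffle(edges, edges, edges)
print(triangles, shuffled)
```

The count of shuffled tuples includes the intermediate result A ⋈ B, which is the term that grows quickly on real graphs.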
Background: HyperCube (Shares) Shuffle
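T(x, y, z) :- A(x, y), B(y, z), C(z, x)
[Figure: the P workers are arranged as a P^(1/3) × P^(1/3) × P^(1/3) cube with one dimension per variable x, y, z.]
Each tuple is sent to every worker whose coordinates match its hashed attributes (* matches any coordinate):
A(x1, y1) → (h1(x1), h2(y1), *), P^(1/3) replication
B(y1, z1) → (*, h2(y1), h3(z1)), P^(1/3) replication
C(z1, x1) → (h1(x1), *, h3(z1)), P^(1/3) replication
Afrati and Ullman, EDBT 2010
Beame et al., PODS 2013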
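The routing rule above can be sketched in a few lines. This is an illustrative single-process simulation, not the Myria implementation: `hypercube_shuffle` and `local_triangles` are hypothetical names, the modulo operator stands in for the hash functions h1, h2, h3, and the edge list is made up.

```python
from collections import defaultdict

def hypercube_shuffle(A, B, C, dims):
    """Route each tuple to all cells (i, j, k) that agree with its hashed values."""
    px, py, pz = dims
    cells = defaultdict(lambda: {"A": [], "B": [], "C": []})
    sent = 0
    for x, y in A:                      # A(x, y) -> (h1(x), h2(y), *)
        for k in range(pz):
            cells[(x % px, y % py, k)]["A"].append((x, y))
            sent += 1
    for y, z in B:                      # B(y, z) -> (*, h2(y), h3(z))
        for i in range(px):
            cells[(i, y % py, z % pz)]["B"].append((y, z))
            sent += 1
    for z, x in C:                      # C(z, x) -> (h1(x), *, h3(z))
        for j in range(py):
            cells[(x % px, j, z % pz)]["C"].append((z, x))
            sent += 1
    return cells, sent

def local_triangles(cell):
    """Join the three fragments held by one cell; no further shuffle is needed."""
    b_by_y = defaultdict(list)
    for y, z in cell["B"]:
        b_by_y[y].append(z)
    c_set = set(cell["C"])
    return [(x, y, z)
            for x, y in cell["A"]
            for z in b_by_y.get(y, ())
            if (z, x) in c_set]

# Toy directed graph (hypothetical data) on a 2 x 2 x 2 cube of 8 workers.
edges = [(1, 2), (2, 3), (3, 1), (1, 3), (3, 4)]
cells, sent = hypercube_shuffle(edges, edges, edges, dims=(2, 2, 2))
triangles = sorted(t for cell in cells.values() for t in local_triangles(cell))
print(triangles, sent)
```

Each output triangle (x, y, z) is produced only at cell (h1(x), h2(y), h3(z)), so the union of the local results needs no de-duplication, and every input tuple is replicated along exactly one dimension.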
Single Node Multiway Join
• Join algorithms with optimality guarantees
• Leapfrog TrieJoin by Veldhuizen, 2014
• Minesweeper by Ngo et al., 2014
• Pipeline of joins → single multiway join
• Tributary Join: Leapfrog TrieJoin in Myria
• A multiway sort-merge join on steroids
• Avoids constructing tries, unlike Leapfrog TrieJoin
Example input tables:
  A        B        C
 x y      y z      x z
 2 0      0 1      0 2
 2 1      2 0      1 0
 2 3      2 3      2 4
 3 4      3 4      3 2
 4 2      4 2      4 3
 5 6      5 6      6 5
T(x, y, z) :- A(x, y), B(y, z), C(z, x)
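A rough sketch of a multiway sort-merge join in the spirit of Tributary Join: relations are kept as sorted arrays (no tries) and variables are bound one at a time in the order x, y, z, using binary search to leapfrog between candidates. This is an assumption-laden illustration, not the Myria operator; the function names are hypothetical, values are assumed to be integers, and the columns of C are read as (x, z), as in the table header.

```python
from bisect import bisect_left

def leapfrog_intersect(sorted_lists):
    """Intersect sorted, duplicate-free lists by seeking with binary search."""
    if any(not lst for lst in sorted_lists):
        return []
    out, pos = [], [0] * len(sorted_lists)
    candidate = max(lst[0] for lst in sorted_lists)
    while True:
        agreed = True
        for i, lst in enumerate(sorted_lists):
            pos[i] = bisect_left(lst, candidate, pos[i])
            if pos[i] == len(lst):
                return out
            if lst[pos[i]] != candidate:      # overshoot: raise the candidate
                candidate = lst[pos[i]]
                agreed = False
        if agreed:
            out.append(candidate)
            pos[0] += 1
            if pos[0] == len(sorted_lists[0]):
                return out
            candidate = sorted_lists[0][pos[0]]

def second_column(rel, first):
    """Sorted second-column values of the rows whose first column equals `first`."""
    lo = bisect_left(rel, (first,))
    hi = bisect_left(rel, (first + 1,))       # assumes integer values
    return [b for _, b in rel[lo:hi]]

def multiway_triangle_join(A, B, C):
    """T(x, y, z) :- A(x, y), B(y, z), C(z, x) with attribute order x, y, z."""
    a = sorted(set(A))
    b = sorted(set(B))
    c = sorted({(x, z) for z, x in C})        # re-sort C by (x, z)
    xs_a = sorted({x for x, _ in a})
    xs_c = sorted({x for x, _ in c})
    ys_b = sorted({y for y, _ in b})
    out = []
    for x in leapfrog_intersect([xs_a, xs_c]):
        for y in leapfrog_intersect([second_column(a, x), ys_b]):
            for z in leapfrog_intersect([second_column(b, y), second_column(c, x)]):
                out.append((x, y, z))
    return out

# The example tables, with C's rows taken as (x, z) pairs.
A = [(2, 0), (2, 1), (2, 3), (3, 4), (4, 2), (5, 6)]
B = [(0, 1), (2, 0), (2, 3), (3, 4), (4, 2), (5, 6)]
C_xz = [(0, 2), (1, 0), (2, 4), (3, 2), (4, 3), (6, 5)]
C = [(z, x) for x, z in C_xz]                 # the query atom is C(z, x)
print(multiway_triangle_join(A, B, C))
```

No binary intermediate result such as A ⋈ B is ever materialized; the only per-relation preprocessing is sorting.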
Questions
Empirical study of HyperCube shuffle and Tributary join
HyperCube configuration optimization
Tributary join cost model and attribute order optimization
Empirical Study
Myria deployment with 64 workers
Shuffle paradigms:
• Regular shuffles
• HyperCube shuffle
• Broadcast
Local join algorithms:
• Symmetric hash join
• Tributary join
Parallel semi-join
Evaluate 8 queries on the Twitter social graph and Freebase
Triangle Query on Twitter
Query: T(x, y, z) :- A(x, y), B(y, z), C(z, x)
Dataset: sampled Twitter social network graph with 1 million edges (follower:int, followee:int)
Triangle Query: Data Shuffling
T(x, y, z) :- A(x, y), B(y, z), C(z, x)
Regular Shuffle (54M tuples total), plan (A ⋈ B) ⋈ C:
• Input shuffles: 1M tuples each, with skew 1.35, 1.72 and 1.01
• Intermediate result A ⋈ B: 51M tuples, skew 20.8
HyperCube Shuffle (12M tuples total), single multiway join A ⋈ B ⋈ C:
• A, B and C: 4M tuples each, skew 1.06
Broadcast (142M tuples, no skew)
Triangle Query: Runtime
T(x, y, z) :- A(x, y), B(y, z), C(z, x)
[Figure: query runtime (sec) for the HyperCube, Broadcast and Regular shuffles.]
Shuffle paradigm: HyperCube < Broadcast < Regular
Sequential join: Tributary Join < Hash Join
Query 2: Knowledge Base Exploration Query
Query: show the full cast of all films starring both Joe Pesci and Robert de Niro
Dataset: Freebase RDF; the data is partitioned into separate tables by predicate
CastMember(cast) :- ActorName(a1, "Joe Pesci"), ActorPerform(a1, p1), PerformFilm(p1, film),
                    ActorName(a2, "Robert de Niro"), ActorPerform(a2, p2), PerformFilm(p2, film),
                    PerformFilm(p, film), ActorPerform(p, cast)
Freebase Query: Data Shuffling
Regular shuffles: 7M tuples
HyperCube shuffle: 105M tuples (16x replication)
Broadcast: 351M tuples (50x replication)
[Figure: regular-shuffle plan for the 8-way join on Freebase over relations R1 to R8, annotated with the number of tuples shuffled at each step, ranging from 2 tuples to 1.10M tuples.]
Knowledge Exploration in Freebase
Comparing shuffle paradigms:
Regular < HyperCube < Broadcast
Comparing sequential join algorithms:
Hash join < Tributary join
[Figure: query runtime (sec) for the 8-way join on Freebase.]
Empirical Study Summary
The best query plan depends on the query, the data and the cluster:
• Size of intermediate results
• Replication factor of HyperCube
Large intermediate results favor HyperCube and Tributary Join:
• Small communication
• Small input, reducing sorting time
Optimizing HyperCube Shuffle
Optimization goal: minimize the maximum load of any single worker
Example: for Q1 with 64 workers, 4x4x4 is better than 2x4x8
What if we have 63 workers or a 7-way join?
State of the art: linear programming (Beame, Koutris and Suciu, PODS 2013)
• If |A| = |B| = |C| = N with 63 servers, the optimum is 3.98 x 3.98 x 3.98
The penalty of rounding down is non-negligible
• 3x3x3 uses only 27 of the 63 servers
A Simple Yet Effective Algorithm for HyperCube Configuration
Algorithm:
1. Enumerate all HyperCube configurations whose number of servers is ≤ P
2. Find the configuration with minimal shuffle cost
Tie-breaking heuristic: 1x16 vs 4x4
Best configuration of previous example: 3x4x5
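The two steps can be sketched as follows. The cost function is an assumption: it takes the expected number of tuples received per worker under uniform hashing, i.e. the sum over relations of size / (product of the dimensions the relation is hashed on). The tie-break on the smallest largest dimension is only a guess at the heuristic named on the slide, and all function names are hypothetical.

```python
from fractions import Fraction

def configurations(num_dims, budget):
    """Yield every integer tuple of length num_dims whose product is <= budget."""
    if num_dims == 0:
        yield ()
        return
    for d in range(1, budget + 1):
        for rest in configurations(num_dims - 1, budget // d):
            yield (d,) + rest

def expected_load(dims, relations):
    """relations: list of (size, indices of the dimensions the relation is hashed on)."""
    total = Fraction(0)
    for size, hashed in relations:
        cells = 1
        for i in hashed:
            cells *= dims[i]
        total += Fraction(size, cells)
    return total

def best_configuration(num_dims, relations, servers):
    best_key, best_dims = None, None
    for dims in configurations(num_dims, servers):
        # Assumed tie-break: among equal loads, prefer the more even shape.
        key = (expected_load(dims, relations), max(dims))
        if best_key is None or key < best_key:
            best_key, best_dims = key, dims
    return best_dims, best_key[0]

# Triangle query with |A| = |B| = |C| = N (N normalized to 1).
# Dimensions 0, 1, 2 stand for x, y, z.
triangle = [(1, (0, 1)), (1, (1, 2)), (1, (2, 0))]
print(best_configuration(3, triangle, 63))   # dims and load for 63 servers
print(best_configuration(3, triangle, 64))   # dims and load for 64 servers
```

Under these assumptions the search space is small, since only integer shapes with product at most P are visited, so exhaustive enumeration is cheap for typical cluster sizes.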
Evaluation of HyperCube Optimization
Compare different configuration algorithms:
• Our algorithm
• Rounding down
• Random (many virtual servers mapped to real servers)
Optimality ratio: max load / optimal load (from the LP solution)
Our algorithm outperforms rounding down and random, with an optimality ratio of at most 1.06
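As a worked illustration of the optimality ratio, the sketch below evaluates the earlier 63-server triangle example. It assumes equal relation sizes, uses expected per-worker load as a stand-in for max load, and takes the fractional optimum to be P^(1/3) shares per dimension; the function names are hypothetical and the numbers are computed here, not taken from the evaluation.

```python
def triangle_load(dims, n=1.0):
    """Expected tuples per worker for the triangle query with |A| = |B| = |C| = n."""
    a, b, c = dims
    return n / (a * b) + n / (b * c) + n / (c * a)

def optimality_ratio(dims, servers):
    """Load of an integer configuration divided by the fractional optimum."""
    share = servers ** (1.0 / 3.0)            # about 3.98 for 63 servers
    return triangle_load(dims) / triangle_load((share, share, share))

print(round(optimality_ratio((3, 4, 5), 63), 3))   # enumerated configuration
print(round(optimality_ratio((3, 3, 3), 63), 3))   # rounding down
```

Under these assumptions the enumerated 3x4x5 configuration is within a few percent of the fractional optimum, while rounding down to 3x3x3 carries a load well over 1.5 times the optimum.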
More in the paper
Tributary join cost model and attribute order optimization
Evaluation of more queries
Comparison with parallel semi-join plans
Open source implementation in Myria: https://github.com/uwescience/myria
Conclusions
Efficient parallel join query evaluation, bridging the gap between theory and practice:
Select the best parallel query plan
• Shuffle paradigm
• Sequential join algorithm
Optimal HyperCube configuration
Optimizing Tributary join attribute order
Thanks! Myria Team
Query execution profiling
PerfOpticon: the visual query profiling tool used in Myria
Cost Model Explained
Query:
Number of binary searches in first attribute:
Number of binary searches in a joined attribute:
The total cost
Why is random HyperCube cell allocation bad?
Query: A(x, y, z, p) :- S(x, y), R(y, z), T(z, p)
64 cells in an 8 x 8 hypercube, with cells randomly allocated to 4 servers
Server 1 will receive:
• 7/8 of S (1/2 if optimal)
• 1/4 of R
• 7/8 of T (1/2 if optimal)
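A back-of-the-envelope check of these fractions, under stated assumptions: the 8 x 8 grid is indexed by (h(y), h(z)), so an S tuple is replicated to one whole row of cells; random allocation gives each server 16 cells chosen uniformly; and the optimal allocation gives each server a contiguous 4 x 4 block. The function names are hypothetical.

```python
from math import comb

def expected_fraction_random(side, servers):
    """Expected fraction of rows in which a server owns at least one cell."""
    cells = side * side
    owned = cells // servers
    # Probability that none of the server's cells falls in a given row.
    p_miss = comb(cells - side, owned) / comb(cells, owned)
    return 1.0 - p_miss

def fraction_block(side, servers_per_dim):
    """Fraction of rows touched when each server owns a contiguous block."""
    return (side // servers_per_dim) / side

print(round(expected_fraction_random(8, 4), 3))  # share of S (or T) under random allocation
print(fraction_block(8, 2))                      # share of S (or T) under 2 x 2 block allocation
```

Under these assumptions a randomly allocated server touches a little over 7 of the 8 rows on average, in line with the 7/8 quoted above, while the block allocation touches exactly half of them. The share of R is 16/64 = 1/4 either way, because each R tuple goes to a single cell.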
Myria: new generation parallel DBMS
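[Figure: MyriaX architecture on a shared-nothing cluster. A coordinator with a REST server and a catalog sends JSON query plans and other instructions to the workers. Each worker has a worker catalog and an RDBMS, with HDFS shown beneath the workers. The figure also labels the primary data store and the sources from which data can be ingested.]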