achieving 100k queries per hour on hive on tez
TRANSCRIPT
May 1, 2023
Achieving 100k Queries per Hour with Hive on Tez
About Yahoo! JAPAN
The Largest Portal Site in Japan
65 billion pageviews / month
2.1 billion pageviews / day
YDN Report
What is YDN Report?
• Report for Yahoo! Display Ad Network (YDN)
Batch Reporting over Massive Dataset
• 13 months, 800B+ rows of data
• Adding 3.3B+ rows of data per day
Highly Parallel Workload
• 100K reports per hour
YDN Report Query
Typical Query
• Query is relatively simple
• Answers “How many clicks did I get last week?”
SELECT account, yyyymmdd, sum(total_imps), sum(total_click), ...
FROM table_x
WHERE yyyymmdd >= xxx AND yyyymmdd < xxx AND account = xxx ...
GROUP BY account, yyyymmdd, ...;
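The typical query above can be sketched as a small helper that fills in the date range for "last week". This is only an illustration: the helper names, the `%s` placeholder style, the reduced column list, and the reading of "last week" as the seven days before today are all assumptions. Only the table and column names come from the slide.

```python
from datetime import date, timedelta


def last_week_range(today: date) -> tuple[str, str]:
    """Return a half-open [start, end) range as yyyymmdd strings.

    Assumption: "last week" means the 7 days before `today`.
    """
    start = today - timedelta(days=7)
    return start.strftime("%Y%m%d"), today.strftime("%Y%m%d")


def last_week_report_query(account: str, today: date) -> tuple[str, tuple]:
    """Build the report query text and its parameters (hypothetical helper)."""
    start, end = last_week_range(today)
    sql = (
        "SELECT account, yyyymmdd, sum(total_imps), sum(total_click) "
        "FROM table_x "
        "WHERE yyyymmdd >= %s AND yyyymmdd < %s AND account = %s "
        "GROUP BY account, yyyymmdd"
    )
    return sql, (start, end, account)


if __name__ == "__main__":
    query, params = last_week_report_query("example_account", date.today())
    print(query)
    print(params)
```

Keeping the date bounds as a half-open range mirrors the `>=` and `<` comparisons in the slide's query.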
Test Setup
Hive Performance Recap
Hive is fast: interactive response
• ORC columnar file format
• Cost-based optimizer (CBO)
• Vectorized SQL engine
• Tez execution engine (replacing MapReduce)
Hive 0.10 (batch processing) -> Hive 1.2 (human-interactive, 5 seconds): 100-150x query speedup
Hive on Tez Query Execution
A query execution is essentially put together from
• Client execution [0s if done correctly]
• Optimization [HiveServer2] [~0.1s]
• Metadata lookups [HCatalog, Metastore] [very fast in Hive 0.14]
• Application Master creation [4-5s]
• Container allocation [3-5s]
• Tez task execution on YARN
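Adding up the per-stage figures from this slide gives a rough idea of the fixed cost a query pays before any Tez task runs. The sketch below is plain arithmetic on the slide's numbers. Stages without a figure on the slide (metadata lookups, task execution) are left out, and the dictionary and function names are made up for illustration.

```python
# (low, high) estimates in seconds, taken from the slide.
# Metadata lookups and task execution have no figure there and are omitted.
STARTUP_STAGES = {
    "client execution": (0.0, 0.0),
    "optimization (HiveServer2)": (0.1, 0.1),
    "application master creation": (4.0, 5.0),
    "container allocation": (3.0, 5.0),
}


def startup_overhead(stages: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Sum the low and high estimates across all stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


if __name__ == "__main__":
    low, high = startup_overhead(STARTUP_STAGES)
    print(f"fixed overhead before any task runs: {low:.1f}s to {high:.1f}s")
```

Under these figures, Application Master creation and container allocation account for nearly all of the fixed overhead.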
[Diagram: a client running the testing tool opens N connections to each of HiveServer2 server #1 and server #2; the HiveServer2 instances use the Metastore and Metastore DB, and Tez AMs and Tez containers run on YARN and HDFS.]
Mini Test
Mini Setup Tested
• 50 nodes
• 450B rows dataset
• Achieved 15K queries per hour
So, can we get 100K qph on 700 nodes?
We thought it should be easy, but…
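The expectation that 100K qph "should be easy" follows from a simple linear extrapolation of the mini test. The sketch below shows that arithmetic. It assumes throughput scales in proportion to node count and ignores the larger dataset on the full cluster, which is exactly the assumption that turned out not to hold. The function name is hypothetical.

```python
def linear_qph(measured_qph: float, measured_nodes: int, target_nodes: int) -> float:
    """Project queries per hour assuming throughput is proportional to node count."""
    return measured_qph * target_nodes / measured_nodes


if __name__ == "__main__":
    projected = linear_qph(measured_qph=15_000, measured_nodes=50, target_nodes=700)
    print(f"projected throughput on 700 nodes: {projected:,.0f} qph")
```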
The Bottlenecks at Scale
Challenges at Scale
• Hive Metastore Server
• YARN Resource Manager
• Datanode Hotspot
• YARN ATS
Hive Metastore Server
Use Local Metastore
• Before: HS2 -> Metastore Server -> Metastore DB
• After: HS2 (local metastore) -> Metastore DB
• Throughput: 7.6K -> 22K qph
Pending Apps
YARN ResourceManager Scalability
• Too many pending apps
• Resolution: increase yarn.resourcemanager.amlauncher.thread-count
• Throughput: 22K -> 26K qph
Pending Containers
YARN ResourceManager Scalability
• Too many pending containers
• Resolution: increase tez.am-rm.heartbeat.interval-ms.max
• Throughput: 26K -> 72.5K qph
Datanode Hotspot
Last Hour Problem
• Connection timeouts and disk access errors
• Many queries access recently added data
• Resolution: increase the HDFS replication factor
• Throughput: 72.5K -> 95K qph
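The throughput figures quoted on the bottleneck slides can be lined up to see how much each fix contributed. The sketch below only divides the numbers given in the deck. The step labels are paraphrases and the names in the code are made up for illustration.

```python
# (label, queries per hour after the step), figures as quoted on the slides.
STEPS = [
    ("baseline", 7_600),
    ("local metastore", 22_000),
    ("more AM launcher threads", 26_000),
    ("larger max AM-RM heartbeat interval", 72_500),
    ("higher HDFS replication factor", 95_000),
]


def step_gains(steps: list[tuple[str, int]]) -> list[tuple[str, float]]:
    """Return the throughput ratio each step achieved over the previous one."""
    return [
        (name, after / before)
        for (_, before), (name, after) in zip(steps, steps[1:])
    ]


if __name__ == "__main__":
    for name, gain in step_gains(STEPS):
        print(f"{name}: {gain:.2f}x")
    print(f"overall: {STEPS[-1][1] / STEPS[0][1]:.1f}x")
```

On these numbers the local metastore and the heartbeat interval change are the two largest single gains, each close to 3x.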
Other Tunings
Other Tunings We Did
• Container reuse timeout
• YARN capacity scheduler node locality delay
• Tez shuffle keep alive
• TCP fin_wait
Notes on YARN ATS
• Disabling YARN ATS gives higher throughput
• Trade-off: losing YARN log aggregation
End of first half
Yohei Abe@Yahoo! JAPAN
Real-life Hive LLAP at Yahoo! JAPAN
Aug 2016
Agenda
• Hive LLAP at Yahoo! JAPAN
• Tuning
• Performance Result
• Future Work
Hive LLAP at Yahoo! JAPAN
Hive on Tez
Hive on Tez is able to produce 100K reports/hour
Hive on Tez+LLAP
How does Hive on Tez+LLAP handle 100K reports?
• How many servers?
• What tuning?
What is LLAP
What is LLAP?
LLAP is for sub-second query processing
• Persistent daemons
• Caching data
What is LLAP?
[Diagram: with Tez, a Tez AppMaster works with Tez containers that are created dynamically; with Tez+LLAP, a Tez AppMaster works with LLAP daemons that are persistent.]
Basic Tuning
LLAP test cluster
Server node: Xeon E5-2660v2 2.2GHz / 2 CPUs / 128GB memory / 10GBase-T 2 ports
Slave nodes: 45
HiveServer2 nodes: 10
Hadoop 2.7.1
Hive 2.1.0-snapshot
Tez 0.8.3
Parameters
Some basic parameters need to be changed
Performance is very slow with the default values
Threading model
[Diagram: executor threads (hive.llap.daemon.num.executors) and I/O threads (hive.llap.io.threadpool.size), with data passed between them.]
Executor thread pool
hive.llap.daemon.num.executors (default 4)
• The number of JVM threads for query execution
• Set this to the number of vCPUs
• 40 in our case
Performance: executor thread
I/O thread pool
hive.llap.io.threadpool.size (default 10)
• The number of I/O threads
• Set this to the number of vCPUs
• 40 in our case
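The "set it to the number of vCPUs" rule for both thread pools can be written down as a small calculation. The sketch below assumes the test cluster's two sockets each have 10 cores with two hardware threads per core, which matches the 40 quoted on the slides. The function name is hypothetical, and the output is just a dictionary of property names and values, not something Hive consumes directly.

```python
def llap_thread_settings(sockets: int, cores_per_socket: int, threads_per_core: int) -> dict[str, int]:
    """Size both LLAP thread pools to the node's vCPU count, per the slides' rule."""
    vcpus = sockets * cores_per_socket * threads_per_core
    return {
        "hive.llap.daemon.num.executors": vcpus,  # default 4
        "hive.llap.io.threadpool.size": vcpus,    # default 10
    }


if __name__ == "__main__":
    # Assumed topology: 2 sockets x 10 cores x 2 hardware threads = 40 vCPUs.
    for name, value in llap_thread_settings(2, 10, 2).items():
        print(f"{name}={value}")
```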
Performance: I/O thread
Memory
• hive.llap.daemon.memory.per.instance.mb: executor memory, JVM on-heap (java -Xmx …)
• hive.llap.io.memory.size: I/O memory, JVM off-heap
Performance (compared to Tez)
Performance: QPS
100K / hour?
LLAP, 45 nodes (test cluster)
max: 24 qps ≈ 87K queries/hour
70 nodes for 100K (if it scales linearly)
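The conversion from queries per second to queries per hour, and the node count a linear projection implies, can be checked with a few lines of arithmetic. The sketch below uses only the peak figure from the slide and hypothetical function names. Strictly linear scaling from the 24 qps peak gives a lower bound of about 53 nodes; the slide's 70-node estimate is more conservative, and the deck does not say what margin it assumes.

```python
import math


def qps_to_qph(qps: float) -> float:
    """Convert queries per second to queries per hour."""
    return qps * 3600


def nodes_for_target(target_qph: float, measured_qps: float, measured_nodes: int) -> int:
    """Nodes needed for target_qph if per-node throughput stays constant."""
    per_node_qph = qps_to_qph(measured_qps) / measured_nodes
    return math.ceil(target_qph / per_node_qph)


if __name__ == "__main__":
    print(f"peak on 45 nodes: {qps_to_qph(24):,.0f} queries/hour")
    print(f"linear lower bound for 100K/hour: {nodes_for_target(100_000, 24, 45)} nodes")
```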
Advanced Tuning
hive.llap.client.consistent.splits
false (default) => the LLAP daemon is selected by file locality
true => the LLAP daemon is selected evenly (by hash distribution)
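The difference between the two settings can be illustrated with a toy model of split placement. This is not Hive's implementation: the function names, the split key format, and the use of plain modulo hashing over an MD5 digest are all assumptions chosen to make the contrast visible. With locality, splits of a hot file land on the few hosts that store its blocks; with hashing, they spread across all daemons.

```python
import hashlib


def pick_by_locality(split_hosts: list[str], daemons: list[str]) -> str:
    """Prefer a daemon on a host that stores the split; fall back to the first daemon."""
    for host in split_hosts:
        if host in daemons:
            return host
    return daemons[0]


def pick_by_hash(split_key: str, daemons: list[str]) -> str:
    """Pick a daemon from a stable hash of the split key (toy modulo hashing)."""
    digest = hashlib.md5(split_key.encode("utf-8")).digest()
    return daemons[int.from_bytes(digest[:8], "big") % len(daemons)]


if __name__ == "__main__":
    daemons = [f"node{i:02d}" for i in range(45)]
    replicas = ["node00", "node01", "node02"]  # hosts holding a hot file's blocks
    keys = [f"/hot/file.orc:{offset}" for offset in range(1000)]

    by_locality = {pick_by_locality(replicas, daemons) for _ in keys}
    by_hash = {pick_by_hash(key, daemons) for key in keys}
    print(f"daemons used with locality: {len(by_locality)}")
    print(f"daemons used with hashing:  {len(by_hash)}")
```

Because the hash depends only on the split key, the same split goes to the same daemon on every run, which is what lets that daemon's cache stay useful.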
Recap: LLAP
Each node runs both an LLAP daemon and a DataNode
hive.llap.client.consistent.splits
[Chart: Locality vs. No Locality]
Future Work
Web UI
Web UI (HIVE-11526)
LLAP daemon exposes basic metrics on port 15002 (default)
Included in Hive 2.1
Contributed by Yahoo! JAPAN
Web UI (HIVE-14030)
HIVE-11526 covers each daemon individually
HIVE-14030 provides an aggregated view of an LLAP cluster (not yet in master)
Contributed by Yahoo! JAPAN
ACL
Hive Column-level ACL
[Diagram: SQL, HS2, LLAP, YARN, HDFS]
GOAL: Column-level ACL
ANSWER(?): HiveServer2 can do it
Direct Access to HDFS Breaks Everything
[Diagram: HS2 and LLAP on YARN and HDFS, protected by SQL Standard Based ACLs; M/R, Pig, and Spark access HDFS directly under Storage Based Authorization, which breaks the SQL Standard Based ACLs.]
But direct access to HDFS (not through Hive) breaks the security model.
Other solutions (not only in Hive) are necessary.
Future Directions
[Diagram: M/R, Pig, and Spark read through LlapInputFormat, which checks SQL Based ACLs with HS2 and LLAP, on top of YARN and HDFS.]
LlapInputFormat checks ACLs against HS2 on behalf of other applications (HIVE-13441, HIVE-12991).
See LlapDump.java.
Summary
• Throughput is greatly improved by LLAP
• Some tunings are necessary
• LLAP is also effective for batch processing
Q & A