© Cloudera, Inc. All rights reserved.
HDFS Short Circuit Local Read Performance in Accumulo and HBase
Michael Ridley | Solutions Architect
Agenda
• Explanation of HDFS short circuit local reads
• Cluster configuration
• Testing methodology
• Accumulo results
• HBase results
• Thoughts on possible future research
• Q&A
What are HDFS short circuit local reads?
Why are we here?
HDFS short circuit local reads
• Typical communication between an HDFS client and the HDFS datanode is over a TCP socket.
• In cases where the read happens to be occurring on the same host as the datanode serving the data, it is more efficient to avoid the socket overhead.
• In these cases, HDFS provides a facility for the client to read the block files directly, coordinating with the datanode over a local UNIX domain socket instead of a TCP connection.
How did we test?
Cluster stats, testing methodology, etc.
Cluster configuration
• Benchmarking was performed on a 40-node cluster using 36 tablet servers/region servers.
• All testing was performed on CDH 5.3.3 using the latest Cloudera build of Accumulo 1.6.
• Tablet servers and region servers were configured with 4 GB heap.
• Cluster installation and configuration was performed via Cloudera Manager using parcels.
HDFS configuration
• Primary hdfs-site.xml configuration property is dfs.client.read.shortcircuit
• Must be set both on the datanodes and the clients (tablet servers and region servers).
• The property dfs.domain.socket.path specifies the path to the local socket file.
• Additional performance gains may be possible by setting dfs.client.read.shortcircuit.skip.checksum to true (not tested in this experiment).
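Put together, a minimal hdfs-site.xml fragment enabling short circuit local reads might look like the following sketch. The socket path shown is a conventional choice, not necessarily the one used in these tests:

```xml
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hdfs-sockets/dn</value>
</property>
<!-- Optional, untested in this experiment: skip checksums on short circuit reads.
<property>
  <name>dfs.client.read.shortcircuit.skip.checksum</name>
  <value>true</value>
</property>
-->
```

Remember that this fragment must be present both on the datanodes and on the client side (tablet servers and region servers).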
Testing methodology - YCSB
• Performance testing of Accumulo and HBase was performed using the YCSB benchmark suite available from https://github.com/brianfrankcooper/YCSB.
• Stock YCSB does not work with modern versions of HBase, so the https://github.com/apurtell/ycsb/tree/new_hbase_client fork was used for HBase testing.
• Two YCSB workloads were used, a small workload with 10 byte rows and a larger workload with 100 KB rows.
• Each workload was run with HDFS short circuit local reads disabled and then with short circuit local reads enabled.
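As a sketch, a single load/run cycle could be invoked roughly as follows. The record counts, field sizes, and thread count are illustrative, not the exact parameters used in these benchmarks:

```shell
# Load phase: populate the table with the chosen record size (here, the
# small workload: one 10-byte field per row).
bin/ycsb load accumulo -P workloads/workloadc \
    -p recordcount=1000000 -p fieldcount=1 -p fieldlength=10

# Run phase: workloadc is YCSB's 100% read workload.
bin/ycsb run accumulo -P workloads/workloadc \
    -p operationcount=1000000 -threads 32
```

For the 100 KB rows, the field size properties would be raised accordingly; each configuration is then repeated with short circuit local reads disabled and enabled.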
Testing methodology - YCSB (continued)
• Each YCSB workload was run ten times.
• In the results, the first iteration is dropped because in some cases the services were restarted and the first run included JVM JIT warm-up overhead.
• YCSB benchmarks include two phases, load and run.
• Tables were flushed after the load phase to empty the memtable, and OS disk caches were dropped.
• The YCSB workloads were 100% read operations.
Testing methodology - caching
• Each test was performed with caching enabled and disabled.
• For Accumulo the table properties table.cache.block.enable and table.cache.index.enable were set to true or false.
• The tablet server properties tserver.cache.data.size and tserver.cache.index.size were set to 0 or 2G.
• For HBase the property BLOCKCACHE was set to true or false.
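As a sketch, these settings correspond to commands like the following in the Accumulo and HBase shells. The table and column family names are illustrative:

```
# Accumulo shell: per-table cache flags and tserver-wide cache sizes.
# (Changing the tserver.cache.* properties generally requires a tablet
# server restart to take effect.)
config -t usertable -s table.cache.block.enable=true
config -t usertable -s table.cache.index.enable=true
config -s tserver.cache.data.size=2G
config -s tserver.cache.index.size=2G

# HBase shell: toggle the block cache on a column family.
alter 'usertable', {NAME => 'family', BLOCKCACHE => 'false'}
```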
Testing methodology – pre-splitting
• For both HBase and Accumulo the table was pre-split for better distribution.
• A splits file was used with 100 splits.
• The same splits were used for each workload.
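One way to produce such a splits file is sketched below. The exact split points used in the tests are not given in these slides; the two-digit prefixes here are a hypothetical, evenly spaced layout over YCSB's default "user..." keyspace:

```python
# Write 100 evenly spaced split points for YCSB's "user<digits>" row keys.
# These prefixes are a hypothetical layout, not the splits used in the
# benchmarks; adjust them to your actual key distribution.
splits = ["user%02d" % i for i in range(100)]  # user00 .. user99

with open("splits.txt", "w") as f:
    f.write("\n".join(splits) + "\n")
```

The resulting file can then be applied before loading, e.g. with `addsplits -t usertable -sf splits.txt` in the Accumulo shell, or via the `SPLITS_FILE` option to `create` in the HBase shell.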
So what did we learn?
On to the results!
Accumulo results
[Charts: YCSB read average latency (us), Without Caching vs. With Caching, with SCR disabled and enabled. Left: 10 Byte Workload; right: 100 KB Workload.]
Accumulo results
10 Byte Workloads

Without Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                25669.35859                   12
  Yes               22518.62248                   12
  % Change          -12.27%                       0.00%
  Absolute Change   3150.73611                    0

With Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                23226.63196                   6
  Yes               22124.00868                   6
  % Change          -4.75%                        0.00%
  Absolute Change   1102.62328                    0

100 KB Workloads

Without Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                1723347.444                   456
  Yes               1589593.573                   459
  % Change          -7.76%                        0.66%
  Absolute Change   133753.871                    -3

With Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                1739103.513                   479
  Yes               1143180.924                   470
  % Change          -34.27%                       -1.88%
  Absolute Change   595922.589                    9
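The derived rows follow directly from the median latencies; a quick sanity check of two of the percent-change figures:

```python
def pct_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0

# 10 Byte workload, without caching: SCR off vs. SCR on (median avg latency, us)
print(round(pct_change(25669.35859, 22518.62248), 2))   # -12.27

# 100 KB workload, with caching
print(round(pct_change(1739103.513, 1143180.924), 2))   # -34.27
```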
Accumulo single node results
[Charts: YCSB read average latency (us) on a single node, Without Caching vs. With Caching, with SCR disabled and enabled. Left: 10 Byte Workload; right: 100 KB Workload.]
Accumulo single node results
10 Byte Workloads

Without Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                5335.53057                    6
  Yes               4228.43033                    5
  % Change          -20.75%                       -16.67%
  Absolute Change   1107.10024                    1

With Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                5264.11565                    6
  Yes               1607.29118                    2
  % Change          -69.47%                       -66.67%
  Absolute Change   3656.82447                    4

100 KB Workloads

Without Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                223443.433                    402
  Yes               210199.774                    316
  % Change          -5.93%                        -21.39%
  Absolute Change   13243.659                     86

With Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                227722.941                    416
  Yes               189733.018                    313
  % Change          -16.68%                       -24.76%
  Absolute Change   37989.923                     103
HBase results
[Charts: YCSB read average latency (us) for HBase, Without Caching vs. With Caching, with SCR disabled and enabled. Left: 10 Byte Workload; right: 100 KB Workload.]
HBase results
10 Byte Workloads

Without Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                2047.91269                    1
  Yes               1549.74452                    0
  % Change          -24.33%                       -100.00%
  Absolute Change   498.16817                     1

With Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                1264.74923                    595423
  Yes               1255.20502                    0
  % Change          -0.75%                        -100.00%
  Absolute Change   9.54421                       595423

100 KB Workloads

Without Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                192503.1168                   118
  Yes               182921.3385                   71
  % Change          -4.98%                        -39.83%
  Absolute Change   9581.7783                     47

With Caching:
  SCR Enabled       Median Average Latency (us)   Median 95th Percentile (ms)
  No                183588.0945                   44
  Yes               181761.4176                   47
  % Change          -0.99%                        6.82%
  Absolute Change   1826.6769                     -3
Where to from here?
Possible future research directions.
Future research possibilities
• Testing with a more diverse set of workloads to better understand which workloads benefit most from short circuit local reads.
• Memory profiling during benchmarking to understand HDFS client memory overhead.
Q&A
Any questions?