Transcript
Page 1: Accumulo Summit 2015: HDFS Short Circuit Local Read Performance Benchmarking with Apache Accumulo and Apache HBase [Performance]

1© Cloudera, Inc. All rights reserved.

HDFS Short Circuit Local Read Performance in Accumulo and HBase

Michael Ridley | Solutions Architect

Agenda

• Explanation of HDFS short circuit local reads

• Cluster configuration

• Testing methodology

• Accumulo results

• HBase results

• Thoughts on possible future research

• Q&A

What are HDFS short circuit local reads?

Why are we here?

HDFS short circuit local reads

• Typical communication between an HDFS client and the HDFS datanode is over a TCP socket.

• When the client is reading on the same host as the datanode serving the data, it is more efficient to avoid the socket overhead.

• HDFS provides a facility to communicate over a local Unix domain socket in these cases, so the client can read the block data directly from local disk.

How did we test?

Cluster stats, testing methodology, etc.

Cluster configuration

• Benchmarking was performed on a 40-node cluster using 36 tablet servers/region servers.

• All testing was performed on CDH 5.3.3 using the latest Cloudera build of Accumulo 1.6.

• Tablet servers and region servers were each configured with a 4 GB heap.

• Cluster installation and configuration were performed via Cloudera Manager using parcels.

HDFS configuration

• The primary hdfs-site.xml configuration property is dfs.client.read.shortcircuit.

• It must be set on both the datanodes and the clients (tablet servers and region servers).

• The property dfs.domain.socket.path specifies the path to the local socket file.

• Additional performance could be possible by setting dfs.client.read.shortcircuit.skip.checksum to true (not tested in this experiment).
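The three properties named on this slide could be rendered into an hdfs-site.xml fragment along the following lines. This is a minimal sketch: the property names come from the slide, but the socket path value and the helper name render_hdfs_site are assumptions for illustration, and the deck itself applied its settings through Cloudera Manager.

```python
import xml.etree.ElementTree as ET

# Property names are taken from the slide; the values are illustrative.
SHORT_CIRCUIT_PROPS = {
    "dfs.client.read.shortcircuit": "true",
    # Assumed path: substitute the socket path your datanodes actually use.
    "dfs.domain.socket.path": "/var/run/hdfs-sockets/dn",
    # Not tested in the benchmark, so it is left disabled here.
    "dfs.client.read.shortcircuit.skip.checksum": "false",
}


def render_hdfs_site(props):
    """Build an hdfs-site.xml style <configuration> document from a dict."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = render_hdfs_site(SHORT_CIRCUIT_PROPS)
print(xml_text)
```

The same fragment would need to be visible to both the datanodes and the tablet servers or region servers, since the slide notes that the setting applies on both sides.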

Testing methodology - YCSB

• Performance testing of Accumulo and HBase was performed using the YCSB benchmark suite available from https://github.com/brianfrankcooper/YCSB.

• Stock YCSB does not work with modern versions of HBase, so the https://github.com/apurtell/ycsb/tree/new_hbase_client fork was used for HBase testing.

• Two YCSB workloads were used: a small workload with 10-byte rows and a larger workload with 100 KB rows.

• Each workload was run with HDFS short circuit local reads disabled and then with short circuit local reads enabled.
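The two workloads might be expressed as YCSB CoreWorkload property files roughly as sketched below. The property names (recordcount, fieldcount, fieldlength, readproportion, and so on) are standard CoreWorkload settings, and the read-only mix follows the deck's statement that the workloads were 100% reads. Everything else is an assumption: the deck does not say how the 10-byte and 100 KB rows were split into fields, nor what record and operation counts were used, so the values here are hypothetical.

```python
def workload_properties(field_count, field_length,
                        record_count=1_000_000, operation_count=1_000_000):
    """Return YCSB CoreWorkload properties for a read-only workload.

    record_count and operation_count are placeholder values, not the
    counts used in the benchmark.
    """
    return {
        # Class name as used by YCSB releases of that era.
        "workload": "com.yahoo.ycsb.workloads.CoreWorkload",
        "recordcount": record_count,
        "operationcount": operation_count,
        "fieldcount": field_count,
        "fieldlength": field_length,
        "readproportion": 1.0,
        "updateproportion": 0,
        "scanproportion": 0,
        "insertproportion": 0,
    }


def to_properties_text(props):
    """Serialize a dict as key=value lines, one property per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


def row_bytes(props):
    """Approximate row payload size: number of fields times field length."""
    return props["fieldcount"] * props["fieldlength"]


# Assumed field layouts: 1 x 10 bytes, and 10 x 10 KiB (100 KiB total).
small = workload_properties(field_count=1, field_length=10)
large = workload_properties(field_count=10, field_length=10 * 1024)

print(to_properties_text(small))
```

Each file would then be run once with short circuit local reads disabled and once with them enabled, as the last bullet describes.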

Testing methodology - YCSB (continued)

• Each YCSB workload was run ten times.

• The first iteration is dropped from the results because, in some cases, the services had just been restarted and the first run included JVM JIT warm-up overhead.

• YCSB benchmarks include two phases, load and run.

• After the load phase, tables were flushed to empty the memtable, and the OS disk caches were flushed as well.

• The YCSB workloads were 100% read operations.
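The way ten runs collapse into a single reported figure can be sketched as follows: discard the first (warm-up) run, then take the median of the remaining nine. The function name and the latency values are made up for illustration; they are not measurements from the benchmark.

```python
from statistics import median


def summarize_runs(values, warmup_runs=1):
    """Drop the leading warm-up runs and return the median of the rest."""
    measured = values[warmup_runs:]
    if not measured:
        raise ValueError("no runs left after dropping warm-up iterations")
    return median(measured)


# Hypothetical average read latencies (us) reported by ten YCSB runs.
# The first value is inflated to mimic JIT warm-up after a restart.
runs = [31000.0, 25700.0, 25650.0, 25690.0, 25660.0,
        25640.0, 25680.0, 25670.0, 25655.0, 25675.0]

print(summarize_runs(runs))  # median of the last nine runs
```

Using the median rather than the mean keeps a single slow run from dominating the summary, which is presumably why the results tables report a "Median Average Latency".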

Testing methodology - caching

• Each test was performed with caching enabled and disabled.

• For Accumulo the table properties table.cache.block.enable and table.cache.index.enable were set to true or false.

• The tablet server properties tserver.cache.data.size and tserver.cache.index.size were set to 0 or 2G.

• For HBase, the column family property BLOCKCACHE was set to true or false.
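One way to toggle these settings is through the Accumulo and HBase shells; the sketch below only generates the command strings. The deck does not say whether the shells or Cloudera Manager were used, so treat this as an assumption. The table name usertable is the YCSB default, the column family name family is hypothetical, and tablet server cache sizes may need a restart to take effect.

```python
def accumulo_cache_commands(table, enabled):
    """Accumulo shell commands that switch block and index caching on or off."""
    flag = "true" if enabled else "false"
    size = "2G" if enabled else "0"
    return [
        f"config -t {table} -s table.cache.block.enable={flag}",
        f"config -t {table} -s table.cache.index.enable={flag}",
        f"config -s tserver.cache.data.size={size}",
        f"config -s tserver.cache.index.size={size}",
    ]


def hbase_cache_command(table, family, enabled):
    """HBase shell command that sets BLOCKCACHE on one column family."""
    flag = "true" if enabled else "false"
    return f"alter '{table}', {{NAME => '{family}', BLOCKCACHE => {flag}}}"


# 'usertable' is the YCSB default table; 'family' is a hypothetical name.
for command in accumulo_cache_commands("usertable", enabled=False):
    print(command)
print(hbase_cache_command("usertable", "family", enabled=False))
```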

Testing methodology - pre-splitting

• For both HBase and Accumulo, the table was pre-split so that data and load were distributed more evenly across the servers.

• A splits file with 100 splits was used.

• The same splits were used for each workload.
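A splits file with 100 evenly spaced split points could be generated along these lines. This is only a sketch: the deck does not show its splits file, and the code assumes zero-padded numeric keys behind a "user" prefix so that lexicographic order matches numeric order. Stock YCSB keys may not be laid out that way, so real split points would need to match the actual key format.

```python
def split_points(num_splits, key_space, prefix="user"):
    """Return num_splits evenly spaced, zero-padded split keys.

    Assumes keys look like prefix + zero-padded integer in [0, key_space).
    """
    width = len(str(key_space - 1))
    step = key_space / (num_splits + 1)
    return [f"{prefix}{int(step * i):0{width}d}"
            for i in range(1, num_splits + 1)]


# Hypothetical key space of one million records.
splits = split_points(100, 1_000_000)

# One split point per line, the layout a splits file typically uses.
print("\n".join(splits[:3]))
```

Because 100 split points yield 101 tablets or regions, each of the 36 servers in the cluster would host roughly three of them.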

So what did we learn?

On to the results!

Accumulo results

[Figure: YCSB Read Average Latency (us), without caching vs. with caching, for SCR: No and SCR: Yes. Panels: 10 Byte Workload, 100 KB Workload.]

Accumulo results

10 Byte Workloads

Without Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           25669.35859                   12
Yes                          22518.62248                   12
% Change                     -12.27%                       0.00%
Absolute Change (No - Yes)   3150.73611                    0

With Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           23226.63196                   6
Yes                          22124.00868                   6
% Change                     -4.75%                        0.00%
Absolute Change (No - Yes)   1102.62328                    0

100 KB Workloads

Without Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           1723347.444                   456
Yes                          1589593.573                   459
% Change                     -7.76%                        0.66%
Absolute Change (No - Yes)   133753.871                    -3

With Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           1739103.513                   479
Yes                          1143180.924                   470
% Change                     -34.27%                       -1.88%
Absolute Change (No - Yes)   595922.589                    9
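The "% Change" and "Absolute Change" rows can be reproduced from the two latency rows. The sketch below uses the sign convention the tables appear to follow: the absolute change is the SCR-disabled value minus the SCR-enabled value, and the percent change is relative to the SCR-disabled value, so a latency reduction shows as a positive absolute change and a negative percentage. The function name is an illustrative choice.

```python
def scr_change(without_scr, with_scr):
    """Return (absolute change, percent change) between two measurements.

    Absolute change is without_scr - with_scr; percent change is
    (with_scr - without_scr) / without_scr, expressed in percent.
    """
    absolute = without_scr - with_scr
    percent = (with_scr - without_scr) / without_scr * 100.0
    return round(absolute, 5), round(percent, 2)


# Accumulo, 10 byte workload, without caching (values from the table).
abs_small, pct_small = scr_change(25669.35859, 22518.62248)

# Accumulo, 100 KB workload, with caching (values from the table).
abs_large, pct_large = scr_change(1739103.513, 1143180.924)

print(abs_small, pct_small)
print(abs_large, pct_large)
```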

Accumulo single node results

[Figure: YCSB Read Average Latency (us), without caching vs. with caching, for SCR: No and SCR: Yes. Panels: 10 Byte Workload, 100 KB Workload.]

Accumulo single node results

10 Byte Workloads

Without Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           5335.53057                    6
Yes                          4228.43033                    5
% Change                     -20.75%                       -16.67%
Absolute Change (No - Yes)   1107.10024                    1

With Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           5264.11565                    6
Yes                          1607.29118                    2
% Change                     -69.47%                       -66.67%
Absolute Change (No - Yes)   3656.82447                    4

100 KB Workloads

Without Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           223443.433                    402
Yes                          210199.774                    316
% Change                     -5.93%                        -21.39%
Absolute Change (No - Yes)   13243.659                     86

With Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           227722.941                    416
Yes                          189733.018                    313
% Change                     -16.68%                       -24.76%
Absolute Change (No - Yes)   37989.923                     103

HBase results

[Figure: YCSB Read Average Latency (us), without caching vs. with caching, for SCR: No and SCR: Yes. Panels: 10 Byte Workload, 100 KB Workload.]

HBase results

10 Byte Workloads

Without Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           2047.91269                    1
Yes                          1549.74452                    0
% Change                     -24.33%                       -100.00%
Absolute Change (No - Yes)   498.16817                     1

With Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           1264.74923                    595423
Yes                          1255.20502                    0
% Change                     -0.75%                        -100.00%
Absolute Change (No - Yes)   9.54421                       595423

100 KB Workloads

Without Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           192503.1168                   118
Yes                          182921.3385                   71
% Change                     -4.98%                        -39.83%
Absolute Change (No - Yes)   9581.7783                     47

With Caching
SCR Enabled                  Median Average Latency (us)   Median 95th Percentile (ms)
No                           183588.0945                   44
Yes                          181761.4176                   47
% Change                     -0.99%                        6.82%
Absolute Change (No - Yes)   1826.6769                     -3

Where to from here?

Possible future research directions.

Future research possibilities

• Testing with a more diverse set of workloads to better understand which workloads benefit most from short circuit local reads.

• Memory profiling during benchmarking to understand HDFS client memory overhead.

Q&A

Any questions?

Thank you

Michael Ridley

[email protected]

