A GUIDE TO STRESS TESTING KAFKA, SPARK AND CASSANDRA ON BARE METAL

INTRODUCTION

MityLytics software helps customers plan, build and operate high-performance data pipelines. It keeps data pipelines running at peak performance so customers can deploy their data-driven revenue generating applications faster and maintain uptime at higher levels than before.

At MityLytics, we continually run performance benchmarks on a variety of hardware and software combinations so that our customers can establish upper bounds on the performance of their Big Data applications on their current or planned infrastructure, in the public cloud or on-premise. So far our efforts have concentrated on cloud-based deployments, but having recently been introduced to Packet, a cloud provider that offers bare metal servers in the cloud, we decided to use our software to build a data pipeline testbench in an increasingly popular configuration - Kafka as the message broker, Spark for streaming and Cassandra for data storage - and to share our observations and experiences.

Our goal for this project is to establish infrastructure bounds for real-time data pipeline performance when all of these systems run together on bare-metal servers. As you will see, this helps to characterize the system and then scale it as load increases. We decided to build our testbench with five nodes, using more than one server type. Contrary to the recommendations of the software stack vendors, we are not going to break the bank and over-provision, since we have been able to characterize the performance of each of these stacks in the past. We explain shortly how we settled on the components in the pipeline.

THE SETUP

Packet offers four types of servers and, compared to other vendors, the capabilities of each server type have been well thought out, especially in the context of building Big Data platforms. For the purposes of this test, we decided to use their Type 1 and Type 2 servers.

METHODOLOGY

1. Run stress tests for each individual component stack until the point of saturation.
2. Measure performance at different levels of stress to arrive at peak operations/sec with acceptable latency.
3. Characterize the performance in terms of resource usage and establish upper bounds on throughput and latency.
4. Identify the infrastructure components that bound the performance of each component stack.

Execute steps 1-4 above for the software stacks deployed and running in two scenarios:

1. In isolation, on clusters by themselves.
2. Multi-tenant, meaning co-located clusters, where the clusters in the data pipeline share components.


DESIGN CONSIDERATIONS

The diagram below shows how Kafka, Spark and Cassandra were installed on each of the five nodes under consideration.

Figure 1: Nodes and Software Frameworks

Here is a summary of the server configuration (for further details, please check the Packet.net website):


BENCHMARK

Our approach to benchmarking is not just pure performance measurement with a specific configuration; it is to characterize the systems under test. To benchmark the system, we first deployed our agent software to collect detailed system information from each node, and we used the MityLytics analytics software to track how much of the infrastructure each node and software stack was using. In outline, the approach is to:

1. Run benchmarks in isolation on each cluster and capture KPIs to establish upper bounds on the performance of each software stack on the chosen hardware configuration.
2. Run the benchmarks together to get an upper bound on performance when the software stacks run alongside each other on the same hardware.
3. Compare performance in the two cases above to decide whether the clusters for the software stacks should be isolated or remain co-located, and to understand the cost/performance trade-offs.
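The agent itself is proprietary, but as a rough illustration of the kind of per-node telemetry this involves, here is a minimal Python sketch (not the MityLytics agent) that samples the resource dimensions discussed throughout this guide. It assumes the psutil package is installed; the field names and the one-second sampling interval are our own illustrative choices:

```python
# Minimal per-node metrics sampler (illustrative only, not the MityLytics agent).
# Assumes `pip install psutil`; prints one JSON line per sample.
import json
import time

import psutil


def sample_node_metrics() -> dict:
    """Collect the resource dimensions discussed in this guide:
    CPU, memory, swap, disk I/O and network I/O (cumulative counters)."""
    disk = psutil.disk_io_counters()
    net = psutil.net_io_counters()
    mem = psutil.virtual_memory()
    swap = psutil.swap_memory()
    return {
        "ts": time.time(),
        "cpu_percent": psutil.cpu_percent(interval=1),  # 1-second sample
        "mem_used_bytes": mem.used,
        "mem_total_bytes": mem.total,
        "swap_used_bytes": swap.used,
        "disk_write_ops": disk.write_count,   # cumulative since boot
        "disk_read_ops": disk.read_count,
        "net_bytes_sent": net.bytes_sent,
        "net_bytes_recv": net.bytes_recv,
    }


if __name__ == "__main__":
    while True:
        print(json.dumps(sample_node_metrics()), flush=True)
```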

In the sections below, we explain the details of each software stack.

Kafka Cluster

This cluster had three Type 1 nodes. We chose Type 1 because we wanted at least 32GB of RAM and realized that we do not need a lot of computing power: Kafka is generally network bound, with incoming messages arriving from the WAN (IoT devices, sensors, location software, etc.). The nodes are named Kafka01, Kafka02 and Spark-Cassandra-Master. The third node's name reflects the fact that it is also used by the Spark and Cassandra clusters, as we explain in the section below.

Zookeeper, the Kafka management framework, runs on Kafka01. We started the Kafka server on all three nodes. As the benchmarks below show, we were able to exceed established Kafka benchmarks (LinkedIn's) with Type 1, the second-most economical Packet node.

Spark and Cassandra Co-located Cluster

The Spark cluster has been set up with one Type 1 node as the Spark Master and two Type 2 Packet nodes as the Spark Workers. The nodes are named Spark-Cassandra-Master, Spark-Cassandra-Worker01 and Spark-Cassandra-Worker02. The Cassandra cluster has been set up on all three of these nodes as well and runs in the Cassandra ring topology with 256 tokens.

Now you may ask why we did this; we chose to follow the recommendations of the Spark and Cassandra vendors. However, given the nature of the Packet instances and the high network throughput (20 Gbps) between Type 2 and Type 3 nodes, we could have split the cluster so that Spark and Cassandra do not share any nodes. Spark could live on Type 2 nodes, because it does a lot of in-memory processing and 256 GB of RAM is ideal for that workload. Cassandra could live on Type 3 nodes, since it needs a lot of persistent high-speed storage (enabled by NVMe on Type 3 nodes).

Coming back to this experiment, we established a three-node cluster with Type 2 servers as Spark workers and a Type 1 node as the Spark master. We chose Type 2 primarily because of its higher RAM and SSD capacity, given that both the Spark and Cassandra stacks would be running on it. This cluster is enough to characterize the performance of a co-located Spark and Cassandra cluster and to understand how to scale the deployment.


TEST 1: KAFKA STRESS TEST

Test Specifics

For testing, we used the performance test utilities bundled with Apache Kafka installations to saturate the cluster via the following tests (illustrative invocations are sketched after this list):

1. Producer performance test on Kafka01. The test produces 50 million messages of 100 bytes each to a topic with a replication factor of 3 and 6 partitions. We measure the messages produced per second, the throughput, and the average and maximum latency. The producer perf test is run with throughput targets of 100K, 1M and 10M messages per second; these parameters are chosen to find the upper bound at which the platform can produce messages to the queues across the 6 partitions.

2. Consumer performance test on Kafka02. This test consumes the 50 million messages produced earlier to the topic. The consumption rate, messages per second and throughput are measured while this test runs.

3. Simultaneous producer perf on Kafka01 and consumer perf on Kafka02. The producer perf test generates 50 million messages to a topic while the consumer perf test simultaneously consumes the messages as they are generated.

Figure 2: The nodes in the Kafka cluster showing their respective roles
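The original write-up does not reproduce the exact command lines, but a hedged sketch of how the bundled perf tools could be driven with the parameters above might look like the following. The broker host:port list, ZooKeeper address and topic name are assumptions, and newer Kafka releases replace the consumer flag --broker-list with --bootstrap-server:

```python
# Illustrative driver for the perf tools bundled with Apache Kafka
# (the exact commands are not reproduced in the original write-up; host
# names, ports and the topic name below are assumptions, and the tools
# are expected to be on the PATH).
import subprocess

BOOTSTRAP = "Kafka01:9092,Kafka02:9092,Spark-Cassandra-Master:9092"
ZOOKEEPER = "Kafka01:2181"
TOPIC = "perf-test"

# Topic with 6 partitions and replication factor 3, as described above.
subprocess.run([
    "kafka-topics.sh", "--create", "--zookeeper", ZOOKEEPER,
    "--topic", TOPIC, "--partitions", "6", "--replication-factor", "3",
], check=True)

# Producer perf runs on Kafka01: 50M messages of 100 bytes, throttled at
# 100K, 1M and 10M records/sec respectively.
for throughput in ("100000", "1000000", "10000000"):
    subprocess.run([
        "kafka-producer-perf-test.sh",
        "--topic", TOPIC,
        "--num-records", "50000000",
        "--record-size", "100",
        "--throughput", throughput,
        "--producer-props", f"bootstrap.servers={BOOTSTRAP}",
    ], check=True)

# Consumer perf runs on Kafka02 and drains the 50M messages.
# (Newer Kafka releases use --bootstrap-server instead of --broker-list.)
subprocess.run([
    "kafka-consumer-perf-test.sh",
    "--broker-list", BOOTSTRAP,
    "--topic", TOPIC,
    "--messages", "50000000",
], check=True)
```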

Test Results

Kafka Statistics

The table below shows throughput and latency numbers for each producer test.

| Produce Rate (records/sec) | Records Sent | Throughput | Latency (Median) | Latency (95th %tile) |
|---|---|---|---|---|
| 100K | 50 Million | 99.99K records/sec | 7.79 ms | 1 ms |
| 1 Million | 50 Million | 977K records/sec | 154.95 ms | 568 ms |
| 10 Million | 50 Million | 1.3M records/sec (123.31 MB/sec peak network throughput) | 160.42 ms | 514 ms |


Test Summary

To summarize the results from the tests:

- The performance of the Kafka consumer peaked at about 1.68 million messages/sec with a producer rate of 10 million records/sec. There was minimal effect on latency between 1 million and 10 million records/sec even though the system was saturated. The system scales well, since on most systems latency lags bandwidth.
- The producer is essentially the limiting factor for message throughput: we get a maximum rate of 1.3 million messages/sec when running a producer and a consumer together.
- Our Kafka stress test numbers beat the numbers published by LinkedIn, on Type 1 hardware that is very similar to what they used (less computing power, but better-performing disks):
  - LinkedIn: single producer thread, 3x synchronous replication - 421,823 records/sec (40.2 MB/sec)
  - Packet: single producer thread, 3x synchronous replication - 1,292,958 records/sec (123.31 MB/sec)
- Kafka is memory and network bound. Type 1 nodes on Packet are a good fit, especially since we do not need high computing power. Given that Kafka does its own replication, the disks could easily be changed to a RAID 0 configuration on Packet - we really like this flexibility and we are sure others will too.

Infrastructure Observations

While running the test, we observed the following from our analytics:

- CPU usage was minimal, as expected (less than 23% utilization), so the CPU can be used for other tasks such as number crunching; it helps to have the right job mix.
- Memory usage on all three nodes (Kafka01, Kafka02 and Spark-Cassandra-Master) was close to the total available 32GB for the duration of the test, and swap usage on the Kafka nodes was minimal (< 5MB). This leads us to believe that Kafka pre-allocates memory.
- As expected, network performance bounds the performance of Kafka: as the table above shows, we hit the 1 Gbps ceiling when we tried to saturate the cluster with a record rate of 10 million/sec. We reached a peak of 90K TCP segments sent/second and 45K TCP segments received/second.
- From a storage perspective, it definitely helps to have SSDs; we saw disk performance go up to 300K write operations/sec across the two SSDs in a RAID 1 configuration.

Figure 3: Memory usage on the Kafka nodes


TEST 2: SPARK STRESS TEST

Test Specifics

The spark-perf benchmark was run with Spark batch tests, and the tests were scaled up gradually in order to stress the cluster under test, ultimately saturating it to establish the peak operation rate achieved and the infrastructure resource that bounds that rate. The Spark jobs were sent to the executors running on Spark-Cassandra-Worker01 and Spark-Cassandra-Worker02. Note: when we ran the Spark tests, the Cassandra servers were up and running, but we did not run any other tests at the same time.
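The spark-perf suite has its own launcher and configuration, which are not reproduced here; the following is only a minimal PySpark sketch of a shuffle-heavy batch job whose parallelism grows with a scale factor, using the observation from the table below that one unit of scale factor corresponded to roughly 400 tasks per stage in these runs. The master URL and data sizes are illustrative assumptions:

```python
# Not the spark-perf suite itself -- just a minimal PySpark sketch of a
# shuffle-heavy batch job whose parallelism grows with a scale factor.
# In the runs below, one unit of scale factor corresponded to roughly
# 400 tasks per stage; the master URL and data sizes are assumptions.
import random

from pyspark.sql import SparkSession

SCALE_FACTOR = 2
PARTITIONS = int(400 * SCALE_FACTOR)          # tasks per shuffle stage
RECORDS = int(10_000_000 * SCALE_FACTOR)      # input size grows with scale

spark = (
    SparkSession.builder
    .appName("scale-factor-sketch")
    .master("spark://Spark-Cassandra-Master:7077")
    .getOrCreate()
)
sc = spark.sparkContext

# Generate random key/value pairs, then force a shuffle (sortByKey) so
# that shuffle read/write sizes and stage times can be observed per scale factor.
rdd = (
    sc.parallelize(range(RECORDS), PARTITIONS)
      .map(lambda i: (random.randint(0, 1_000_000), i))
)
result = rdd.sortByKey(numPartitions=PARTITIONS).count()
print(f"records after shuffle: {result}")
spark.stop()
```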

Figure 4: Spark cluster depicting Master and Worker nodes

Test Results

Spark Statistics

| Scale Factor | Tasks/Stage | Input for Shuffle Write | Shuffle Read (MB/stage) | Max Shuffle Read Time (secs) | Shuffle Write (MB/stage) | Max Shuffle Write Time (secs) | Total Time (secs) |
|---|---|---|---|---|---|---|---|
| 0.1 | 40 | 2.9 GB | 0.8 | 0.097 | 0.8 | 2 | 66 |
| 1 | 400 | 29.1 GB | 89.8 | 3 | 89.8 | 16 | 366 |
| 2 | 800 | 37.7 GB | 367.4 | 57 | 367.3 | 438 | 5040 |
| 3 | 1200 | 37.8 GB | 842.2 | 132 | 842.2 | 1020 | 7920+ |


Observations

1. Scale factor defines partitioning, i.e. how much processing you want to do in parallel.
2. Between scale factors 1, 2 and 3:
   a. The input data for writes is the same, but the max write time is non-linear - the contention could be disk or memory; we look into that in the section below.
   b. The data being read and the shuffle read time also increase non-linearly.

Figure 5: Spark performance test: scale factor vs. various metrics


3. The scale factor 3 job was stuck at a stage where tasks were not progressing. After looking at the infrastructure data, the following stands out:
   a. Memory on the Master node is saturated (31.09GB of 31.2GB used).
   b. CPU stats show close to 92% utilization on one Worker and 79% on the Master.
   c. The Master can handle 800 tasks per stage, corresponding to scale factor 2. When 1200 tasks per stage are presented to it, corresponding to scale factor 3, it is not able to schedule them onto the workers. From our observations, if the Master were not limited by the number of tasks per stage, the Workers would become CPU bound before they become memory bound.
   d. So in order to scale gracefully, the Master's memory needs to be scaled up and the number of available workers needs to be increased, which makes more CPU available.

So in conclusion, we have identified which resource dimensions we need to scale as the problem size increases.
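As a hedged illustration of those dimensions, the sketch below shows the standard spark-submit knobs that map onto them - driver (Master-side scheduling) memory, executor memory and shuffle parallelism. The values and the job script name are illustrative assumptions, not the configuration used in these tests:

```python
# A hedged sketch of the knobs implied by the observations above: driver
# (Master-node scheduling) memory, executor memory and shuffle parallelism.
# The flags are standard spark-submit options; the values and the job path
# are illustrative, not the configuration used in these tests.
import subprocess

subprocess.run([
    "spark-submit",
    "--master", "spark://Spark-Cassandra-Master:7077",
    "--driver-memory", "24g",                   # scale this as tasks/stage grows
    "--executor-memory", "64g",                 # each Type 2 worker has 256 GB RAM
    "--conf", "spark.default.parallelism=800",  # roughly tasks per shuffle stage
    "batch_job.py",                             # hypothetical job script
], check=True)
```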

Figure 6: Scale factor 3 CPU stats show worker 1 at ~90%

Figure 7: Memory used on Master is saturated


Figure 8: Memory on the Workers is at ~46GB out of the 256 GB available

Figure 9: Master node CPU utilization

Figure 10: Worker node CPU utilization

Infrastructure Observations

The observations below are for scale factors below 3.

CPU usage was as follows:

- On average 2% for the Spark master node, with a peak utilization of 41%
- On average 23-25% for the Spark worker nodes, with a peak utilization of 69%


Test Summary

We reached a maximum of 45GB (of 256GB total) memory utilization on the worker nodes and 25GB (of 32GB total) on the master node up to scale factor 2. At scale factor 3 we hit memory saturation on the Master node, while the Worker nodes were well below their memory limit at ~45GB.

The network was not at all saturated, the maximum achieved network throughput we observed was 134 Mbps while we have seen that the network can easily scale up to

From a storage perspective, we saw the following numbers:

- A peak of 633K writes/sec on the Spark-Cassandra master node
- 333K writes/sec on the Spark-Cassandra worker nodes and the Kafka nodes

The workers, however, are Type 2 with RAID 0 and therefore should have delivered better storage performance than the Type 1 master with RAID 1; our explanation is that most of the work on the worker nodes was done in memory, given the high CPU utilization observed there.

Spark can be both CPU and memory bound; however, given the configuration of the Type 2 servers, we did not run into any limits up to scale factor 2. When we started tests with scale factor 3, our application was stuck after running for more than 2.5 hours: of the 21 stages, only 12 had completed, while the 13th stage was showing an estimated time of 602.5 hours.

Our recommendations from these runs:

- Type 1 servers are a good fit for Spark master nodes when the number of tasks per stage is below ~800. At ~1200 tasks per stage, a Type 1 master runs out of headroom for scheduling tasks on the executors.
- Type 2 servers are adequate for Spark workers, especially when Spark runs in isolation. For a higher rate of tasks (~1200 per stage), Type 2 should also be used as the master.
- Type 3 servers should be considered for Spark workers when the Spark nodes are co-located with Cassandra.
- Another configuration would be Type 2 servers for Spark and Type 3 nodes for Cassandra data nodes. Given Packet's 20 Gbps bonded network between nodes and the low-latency communication, this would be very effective.

TEST 3: APACHE CASSANDRA BENCHMARK

Test Specifics

The Cassandra test setup comprises three nodes, namely Spark-Cassandra-Master, Spark-Cassandra-Worker01 and Spark-Cassandra-Worker02. All nodes run in the Cassandra ring topology with 256 tokens. We ran cassandra-stress on this cluster with a mixed write:read ratio of 1:3 for a duration of 45 minutes, which corresponds to 70+ million operations with a uniform key distribution over the range 0-1M. The number of client threads attempting this is limited to between 16 and 256.
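The exact command line is not reproduced in the original document; the following is a hedged sketch using the standard cassandra-stress mixed-workload syntax with the parameters described above (the contact node is an assumption):

```python
# Illustrative cassandra-stress invocation matching the parameters above
# (the write-up does not reproduce the exact command; the contact node
# name is an assumption and the tool is expected to be on the PATH).
import subprocess

subprocess.run([
    "cassandra-stress",
    "mixed", "ratio(write=1,read=3)",        # 1:3 write:read mix
    "duration=45m",                          # ~70M+ operations in 45 minutes
    "-pop", "dist=UNIFORM(1..1000000)",      # uniform keys in 1..1M
    "-rate", "threads>=16", "threads<=256",  # 16-256 client threads
    "-node", "Spark-Cassandra-Master",
], check=True)
```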


Figure 11: The Cassandra cluster in the ring topology in our benchmark

Test Results

Cassandra Statistics

The table below shows the operation rate and latency for the duration of the test. These are great numbers, especially the low latencies, since latency usually lags bandwidth.

| OP Type | OP Rate (ops/sec) | Latency Median | Latency Mean | Latency 95th %tile | Latency 99th %tile |
|---|---|---|---|---|---|
| Mix | 26,195 op/s | 0.4 ms | 0.6 ms | 1.0 ms | 1.3 ms |
| Read | 19,641 op/s | 0.6 ms | 0.7 ms | 1.0 ms | 1.3 ms |
| Write | 6,553 op/s | 0.2 ms | 0.3 ms | 0.3 ms | 0.4 ms |

Infrastructure Observations

CPU usage was as follows:

- On average 15% for the master node, with a peak utilization of ~45%
- On average 30% for the worker nodes, with a peak utilization of ~40%

We reached a maximum of 22GB memory utilization on the worker nodes and 27GB on the master node.

The network was not at all saturated; the maximum bandwidth we observed was close to 1 Gbps on the Type 2 worker nodes.

This time, given that we were running on Type 2 nodes with SSDs and no RAID, we got almost double the write performance, at 113K write operations/second.


Figure 12: Number of writes on the worker

Figure 13: Number of writes on the master

Test Summary

Cassandra is mostly CPU bound; however, given the configuration of the Type 2 servers, we did not run into any limits. As shown in the figures below, cassandra-stress run on a Type 2 node has a maximum CPU utilization of ~40%, while the same test run on a Type 1 node averages ~80% CPU utilization.


Figure 14: CPU utilization on type 2 worker node

Figure 15: CPU utilization on type 1 master node

Cassandra should run on Type 2 or Type 3 nodes, depending on the storage required. The advantage of Type 3 is that Cassandra can run in isolation, given the 20 Gbps network between Type 2 and Type 3 nodes, with Type 2 hosting the Spark workers.


Figure 16: Running all the performance tests together to stress test Spark-Cassandra-Worker01

TEST 4: COMBINED TESTING

Test Specifics

We finally test a combination of all the tests executed on the different clusters in the previous sections. If you look at the figure (Fig. 4), Spark-Cassandra-Master is the node common to the Kafka, Spark and Cassandra clusters and, as detailed above, the Spark and Cassandra clusters span exactly the same nodes.

In the setup of this test, we run Kafka on Kafka01, Kafka02 and Spark-Cassandra-Master. Kafka01 has a producer running which writes 300M messages to all the partitions in the cluster.

Spark-Cassandra-Worker01 is running a consumer simultaneously to consume all the 300M messages produced on Kafka01.

Spark-Cassandra-Master is launching Spark batch tests as the Spark master node via the spark-submit utility. The executors for this job are running on worker nodes - Spark-Cassandra-Worker01 and Spark-Cassandra-Worker02.

Finally we have a Cassandra cluster on Spark-Cassandra-Master, Spark-Cassandra-Worker01 and Spark-Cassandra-Worker02. We run Cassandra stress test on Spark-Cassandra-Worker01.

The setup details are also shown in Figure 16: Spark-Cassandra-Worker01 is the node that is simultaneously running the Spark performance test as a Spark worker, the Kafka consumer and cassandra-stress.
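To make the orchestration concrete, here is a hedged sketch of launching the four workloads concurrently from a driver host. It assumes passwordless SSH to each node and that the Kafka, Spark and Cassandra tools are on the PATH there; the topic name and the Spark job script are illustrative:

```python
# A hedged orchestration sketch for the combined test: each workload is
# started on the node described above via ssh (assumes passwordless ssh
# and the tools on PATH on every node; topic name and job script are
# illustrative, not taken from the original write-up).
import subprocess

BOOTSTRAP = "Kafka01:9092,Kafka02:9092,Spark-Cassandra-Master:9092"

jobs = [
    # 300M-message producer on Kafka01 (unthrottled)
    ("Kafka01",
     "kafka-producer-perf-test.sh --topic perf-test --num-records 300000000 "
     "--record-size 100 --throughput -1 "
     f"--producer-props bootstrap.servers={BOOTSTRAP}"),
    # Consumer draining the same 300M messages on Spark-Cassandra-Worker01
    ("Spark-Cassandra-Worker01",
     f"kafka-consumer-perf-test.sh --broker-list {BOOTSTRAP} "
     "--topic perf-test --messages 300000000"),
    # Spark batch test submitted from the master node
    ("Spark-Cassandra-Master",
     "spark-submit --master spark://Spark-Cassandra-Master:7077 batch_job.py"),
    # cassandra-stress driven from Spark-Cassandra-Worker01
    ("Spark-Cassandra-Worker01",
     "cassandra-stress mixed 'ratio(write=1,read=3)' duration=45m "
     "-node Spark-Cassandra-Master"),
]

procs = [subprocess.Popen(["ssh", host, cmd]) for host, cmd in jobs]
for p in procs:
    p.wait()   # block until every workload finishes
```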


Test Results

Cassandra Statistics

Cassandra-stress runs well, with numbers similar to when it runs by itself; compared to the standalone test there is minimal difference in the results. This is key to a well-performing pipeline, since we always want the database to behave as if it had a cluster to itself. We are again able to achieve low latency, which is great for building a real-time data pipeline, so Type 2 nodes for the Cassandra cluster are highly recommended.

| OP Type | OP Rate (ops/sec) | Latency Median | Latency Mean | Latency 95th %tile | Latency 99th %tile |
|---|---|---|---|---|---|
| Mix | 26,416 op/s | 0.4 ms | 0.6 ms | 1.0 ms | 1.4 ms |
| Read | 19,810 op/s | 0.6 ms | 0.7 ms | 1.1 ms | 1.5 ms |
| Write | 6,610 op/s | 0.2 ms | 0.2 ms | 0.3 ms | 0.3 ms |

Spark Results

The table below shows the performance of the Spark cluster when run in isolation (Individual) and when run together with the Kafka and Cassandra tests (Combined), using the spark-perf utility. Running comparable spark-perf tests shows that the shuffle read time nearly doubled in the combined test, because of the high network activity on the cluster from the Kafka and Cassandra tests.

| Test | Input for Shuffle Write | Shuffle Read (MB/stage) | Max Shuffle Read Time (secs) | Shuffle Write (MB/stage) | Max Shuffle Write Time (secs) | Total Time (secs) |
|---|---|---|---|---|---|---|
| Individual | 37.7 GB | 367.4 | 57 | 367.3 | 438 | 5040 |
| Combined | 37.7 GB | 367.4 | 90 | 367.3 | 468 | 5362 |

Kafka Results

The table below shows the performance of the Kafka consumer for the individual and combined tests. The rate of data consumption decreased by ~27% when the consumer ran alongside all the other components, compared with when it ran individually.

| Test | Time Taken (secs) | Data Consumed (MB) | MB/sec | Messages Consumed | Messages/sec |
|---|---|---|---|---|---|
| Individual | 291 | 26,644 | 91.33 | 279,387,227 | 957,760 |
| Combined | 372 | 28,610 | 66.31 | 300,000,000 | 695,350 |


Figure 17: CPU stats across all the nodes; Spark-Cassandra-Worker01 is the node under stress

Figure 18: Memory used by all the nodes

Figure 19: TCP segments sent and received

Infrastructure Observations

CPU usage was as follows:

- On average 26% utilization for the master node, with a peak utilization of 75%
- On average 41-58% utilization for the worker nodes, with a peak utilization of 88%

We reached a maximum of 48GB (of 256GB total) memory utilization on the worker nodes and close to 32GB (of 32GB total) on the master node.

The maximum network bandwidth we observed was 1.2 Gbps.


Figure 20: SSD writes across all the clusters

CONCLUSIONS

What we have done in this exercise so far:

1. Selected the infrastructure components for a real-time data pipeline.
2. Ran stress tests on the components of the data pipeline in isolation, to get peak performance rates when there is no contention for resources.
3. Ran stress tests on the components of the data pipeline when they work in tandem, to identify the contention for resources.
4. Derived insights into scaling the individual clusters in the right dimensions and into co-location while maintaining performance, and provided guidance on the price/performance trade-off.

Specifically, let's examine the clusters in sequence:

Kafka Cluster: There is a change in benchmark performance between running in isolation and running in tandem with the other components. The Kafka consumers residing on the Spark cluster contend with Spark stream processing on the worker nodes. Kafka standalone performance is bound by network throughput, as seen by stressing the system to a peak of 1.7 million messages produced.

Kafka-Spark consumer: There is a 27% degradation in throughput when the Kafka consumer is placed on the Spark worker node. This is attributed to the following:

- The Spark workers and the Kafka consumers are contending for resources (SSD and CPU).
- More specifically, the Cassandra data node and the Spark master are shared with a Kafka broker node, so SSD writes for incoming data on the Kafka cluster compete with Cassandra writes.

The Kafka-Spark consumer nevertheless works well, with a 99th-percentile latency of 14 milliseconds achieved by the producer, so it is not back-pressuring the Kafka producer.

We saw storage performance numbers similar to those seen before:

- 633K writes/sec on the Spark-Cassandra master node
- 333K writes/sec on the Spark-Cassandra worker nodes and the Kafka nodes

Spark-Cassandra co-located cluster: This is the recommended way to deploy according to the software vendors; however, while running the stress test for the pipeline we observed that Spark performance is off by a peak of 50% and on average by about 15%, while Cassandra performs with a negligible performance hit, which is good. A few observations:

- The co-located performance is not bound by the network.
- CPU utilization hit a maximum of ~30%, so additional computing power is not needed. Bare-metal servers make a difference, since the resources are not shared.
- Multiple runs yielded the same results, so we were able to get reliable performance numbers.

Separating the Spark and Cassandra clusters is well worth exploring on Packet given the dedicated 20 Gbps network: Spark worker nodes can live on Type 2 servers and Cassandra on Type 3 or Type 4. Also, since RAID configurations are flexible and can be changed at runtime, one can easily switch to RAID 0 on all instance types, given that replication is provided by the software stacks.

EXPERIENCES WITH PACKET

The machines on Packet are very easy to provision and are provisioned within minutes, unlike with other bare-metal server providers.

Packet gives us four easy-to-use options, as opposed to the multitude of options with other providers, when really you want to be able to select a node type based on your application. This is how we recommend organizing your clusters:

- Type 0 machines for development environments
- Type 1 for non-intensive apps, for master nodes of frameworks such as Hadoop and Spark, for Kafka workers, or for staging environments
- Type 2 for Spark worker nodes
- Type 3 as data nodes; these will work well as Cassandra data nodes

At $1.75/hour, a Packet Type 3 node is half the price of the AWS instances (i2.8xlarge) recommended by Cassandra vendors, and in addition comes with NVMe.

Bare-metal servers in the cloud are a huge advantage. This is something that not many other providers offer, and it is a great, economical alternative to expensive co-location.


SUMMARY

We have been able to run tests and get good performance and good performance indicators by intelligently choosing the right hardware and software stack at different scale points. Our aim was to find the peak performance rates and the saturation points for a given problem size on a given infrastructure configuration. Once we identified the saturation points, we were able to identify the right resources to scale up in order to continue meeting performance goals. For us, low latency is key to good pipeline performance, and the Packet infrastructure has given us just that.

Comparing with our tests on virtualized infrastructure from AWS, in the context of spark-perf performance, a few salient points stand out:

1. The spark-perf test suite at scale factor 1 is expected to be run on a cluster of 20 m4.xlarge AWS instances; we were able to do this with just 3 Type 2 nodes in 35 minutes. It is great to see this performance at this price point.
2. The Spark-Cassandra cluster is recommended to be run on i2.8xlarge nodes on AWS, which cost ~$3/hour after discounting; on Packet the cost would be $1.50, with dedicated bandwidth, the low-latency 20 Gbps network, and the flexibility to organize the cluster so that the full power of the high-performance network can be used.

And by the way: MityLytics has proprietary algorithms that predict and simulate Big Data frameworks via the MityLytics portal - contact us if you want to know more.