adobe social collaboration: a deep dive into performance and scalability

© Sapient Corporation, 2013

Adobe Social Collaboration: A Deep Dive Into Performance and Scalability

Sruthisagar Kasturirangan, Infrastructure Architect, Infrastructure Practice, SapientNitro, Bangalore

POINT OF view

INTRODUCTION

Adobe’s Social Collaboration unifies all social networking and collaboration applications within AEM (Adobe Experience Manager). It has gained a lot of attention, in part because today’s consumers are increasingly active on mobile devices and place a high value on feedback from fellow buyers. Smart content and commerce platforms are capitalizing on Social Collaboration to boost sales and give the end user the best experience possible.

To understand Adobe’s Social Collaboration better, we carried out a complete analysis of its performance and scalability. We ran the benchmark tests described below using the JMeter scripting framework provided by Adobe. The tests include scripts that perform pure write operations, which makes it possible to measure the overall throughput that can be supported and eventually arrive at a physical architecture sizing and capacity plan.

Through these tests, we are now able to provide general guidance on the methodology needed to size the infrastructure and identify key bottlenecks when integrating Social Collaboration into the overall design of a content and collaboration platform.

This paper has been written not to contest the results provided by Adobe Systems Incorporated in their documentation but to extend those results to virtualized environments, given the growth of cloud hosting. The following results were analyzed and discussed in detail before arriving at the conclusions you’re about to read.

EXPERIMENTAL SETUP

First, let’s briefly go through the experimental setup we used to conduct the benchmark tests, including the AEM version used, the system configuration, the benchmark architecture, and the test scenario.



AEM Version

AEM 5.6.0

System Configuration

Author & Publish Environments:
CPUs Currently (Logical CPUs): 8
CPUs Configured: 8
Number of Processors: 2 (Allocated)
Processor: PowerPC_POWER7
Hardware: 64 bit
AIX Kernel Version: 7.1.2.1 TL02
Memory Size: 8192MB
Total Paging Space: 2048MB
JVM Settings: Maximum Heap Size: 4GB; PermGen: 512MB; IBM J9VM 1.6, GENCON Algorithm

Benchmark Architecture

Test Scenario

The tests below were all performed using Adobe’s out-of-the-box application, Geometrixx. Adobe’s benchmark scripts include procedures to create multiple users in the author and publish environments so that a realistic test scenario can be created. In this case, a test forum topic was created with a short description. Each user was then pre-authenticated during the warm-up and, once authenticated, held the session and performed continuous write operations.

ITERATIONS

The various iterations of testing are tabulated and the details of the load model and results are described in the following sections. In particular, the result sections are focused on analyzing the transactions per second as a function of the total number of transactions and average response times (i.e., time taken for last byte).

Load Model

#Generic properties: threads/users.
#All timings are in seconds.
#startThreadCount is the total number of concurrent threads/users. (For 5 requests per second, set it to 150.)
#startupDelay is the ramp-up time for starting threads. (For 150 threads, set it to 60 seconds.)
#holdLoadFor is the time the test is run. (For 10 minutes, set it to 600.)
#shutdownTime is the time it takes the threads to shut down. (Set it to the same value as startupDelay.)
#requestsPerSec is the number of requests per number of seconds.
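As a rough illustration of how these properties relate to one another, the sketch below derives the interval between thread starts and the total run time from a load model. The class and method names are hypothetical and are not part of Adobe’s benchmark scripts; only the property names come from the load model above, and a linear ramp-up is assumed.

```python
from dataclasses import dataclass


@dataclass
class LoadModel:
    """Load model properties; all timings are in seconds (hypothetical helper)."""
    start_thread_count: int   # startThreadCount: concurrent threads/users
    startup_delay: int        # startupDelay: ramp-up time
    hold_load_for: int        # holdLoadFor: time the load is held
    shutdown_time: int = 0    # shutdownTime: time for threads to shut down

    def ramp_interval(self) -> float:
        """Seconds between successive thread starts, assuming a linear ramp."""
        return self.startup_delay / self.start_thread_count

    def total_duration(self) -> int:
        """Ramp-up plus hold plus shutdown, in seconds."""
        return self.startup_delay + self.hold_load_for + self.shutdown_time


# Values taken from Iteration 2 and Iteration 3 of this paper.
iteration_2 = LoadModel(start_thread_count=150, startup_delay=1200, hold_load_for=1200)
iteration_3 = LoadModel(start_thread_count=10, startup_delay=100, hold_load_for=600)

print(iteration_2.ramp_interval())   # 8.0 seconds per user
print(iteration_2.total_duration())  # 2400 seconds
print(iteration_3.ramp_interval())   # 10.0 seconds per user
print(iteration_3.total_duration())  # 700 seconds
```

The computed intervals match the ramp-up rates quoted in the results for Iterations 2 and 3 (1 user every 8 seconds and 1 user every 10 seconds).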

[Figure: Single publish configuration. User requests reach the publish node; the publish node is connected to the author node by reverse replication.]


Iteration 1

startThreadCount (the total number of concurrent users/threads)=150
startupDelay=60
holdLoadFor=1200
shutdownTime=0
requestsPerSec=2
RPSduration=30

Load Ramp Up Model

Throughput Throttling

Note: This test was run with Ultimate Thread Group by throttling requests per second to 2.

Results

[Figure: Load ramp-up model. Expected parallel users count; x-axis: Elapsed Time; y-axis: Number of active threads.]

[Figure: TPS vs. Transactions.]

[Figure: Throughput throttling. Expected RPS; x-axis: Elapsed Time; y-axis: Number of requests/sec.]

[Figure: Response Times vs. Elapsed Time.]

From the graphs above, it is clear that response times stay within an acceptable range only when the load is throttled so as to limit the TPS (transactions per second) to around 2. Throttling is performed using a JMeter plugin (Ultimate Thread Group), but the throttled rate does not indicate the number of concurrent user sessions that can be supported.

Therefore, additional testing is required to understand the behaviors associated with these changing user patterns.

Iteration 2

startThreadCount (the total number of concurrent users/threads)=150
startupDelay=1200
holdLoadFor=1200
shutdownTime=0

Load Ramp Up Model

Note: This test was run without Ultimate Thread Group and no throttling was applied.

[Figure: AVG_RESPONSE_TIME vs. Transactions.]

[Figure: Response times in ms vs. Elapsed Time (granularity: 100 ms). Series: add Topic to Publish Node, get Topic Page, setTotalTime.]

[Figure: Load ramp-up model. x-axis: Elapsed Time; y-axis: Number of active threads.]


Results


From the graphs above, we can see that the load was not throttled and users were ramped up at the rate of 1 user every 8 seconds. Once all 150 users were ramped up, the response times grew to a level that was not within acceptable limits for page performance.

[Figure: TPS vs. Transactions.]

[Figure: AVG_RESPONSE_TIME vs. Transactions.]

[Figure: Response times in ms vs. elapsed time. Series: get Topic Page, setTotalTime.]

Iteration 3

startThreadCount (the total number of concurrent users/threads)=10
startupDelay=100
holdLoadFor=600
shutdownTime=0

Load Ramp Up Model

Note: This test was run without Ultimate Thread Group and no throttling was applied.

Results

[Figure: Load ramp-up model. x-axis: Elapsed Time; y-axis: Number of active threads.]

[Figure: TPS vs. Transactions.]

[Figure: AVG_RESPONSE_TIME vs. Transactions.]


From the graphs above, we can see that the load was not throttled and users were ramped up at the rate of 1 user every 10 seconds. Once all 10 users were ramped up, the response times grew to a level that was not within acceptable limits for page performance.

In this scenario, it did not make sense to go below 10 concurrent users. Since the average response times were on the order of 3.5 seconds, it was concluded that a single publish server would be able to support fewer than 10 concurrent users.
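One way to sanity-check figures like these is Little’s Law, which relates concurrency, throughput, and response time in a closed-loop test. The sketch below is only an approximation under stated assumptions: it assumes each user issues one request at a time with no think time, and the function names are hypothetical. It is not part of the benchmark scripts and does not replace the measured results.

```python
def closed_loop_tps(concurrent_users: int, avg_response_s: float,
                    think_time_s: float = 0.0) -> float:
    """Little's Law for a closed loop: throughput = users / (response + think time)."""
    return concurrent_users / (avg_response_s + think_time_s)


def users_at_target(tps: float, target_response_s: float,
                    think_time_s: float = 0.0) -> float:
    """Concurrent users implied by a throughput at a target response time."""
    return tps * (target_response_s + think_time_s)


# 10 users with ~3.5 s average response time (Iteration 3), zero think time assumed.
print(round(closed_loop_tps(10, 3.5), 2))   # 2.86

# Users implied by 1.6 TPS at a 2-second response-time target, zero think time assumed.
print(round(users_at_target(1.6, 2.0), 1))  # 3.2
```

Under these assumptions, a throughput of 1.6 TPS at a 2-second target corresponds to only a handful of continuously writing users, which is in line with the conclusion that a single publish server supports fewer than 10.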

[Figure: Response times in ms vs. elapsed time. Series: get Topic Page, setTotalTime.]

OVERALL SYSTEM UTILIZATION

[Figures: CPU utilization panels labeled Publish and Author.]

[Figure: CPU Total hdadhdcom03 19-7-2013. Series: User%, Sys%, Wait%.]

[Figure: CPU Total hdadhdcom01 19-7-2013. Series: User%, Sys%, Wait%.]


ABOUT THE AUTHOR

Sruthisagar Kasturirangan is an Infrastructure Architect in the Infrastructure Practice at SapientNitro, Bangalore. A graduate of Iowa State University, he gained extensive experience within leading IT organizations before moving back to his home country to join Sapient Corporation. He has over 11 years of experience in systems administration of Unix platforms and application servers such as WebSphere and WebLogic, and extensive exposure to capacity planning and performance tuning of Java applications.

CONCLUSION

After conducting this series of tests, and then discussing and analyzing them, we’ve arrived at a few key takeaways that we think are worthwhile to consider:

1. For total achievable throughput, a single publish instance and a single author instance are able to achieve 1.6 TPS within an acceptable response time (below 2 seconds).

2. For total achievable concurrent user/thread count, a single publish instance is able to handle fewer than 10 concurrent threads/users performing continuous read operations and updates while maintaining response times within SLAs (service-level agreements).

3. Scaling publish servers horizontally in order to handle higher volumes of updates is of no value, since the bottleneck would shift to reverse replication to the author instance. (The throughput indicated above is for the entire publish layer and not for a single publish instance.)
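The takeaways above can be turned into simple sizing arithmetic. The following is a minimal sketch, assuming the 1.6 TPS figure from this paper is the write ceiling for the whole publish layer; the constant and function names are hypothetical, and the result applies only to an environment comparable to the one tested here.

```python
# Write ceiling measured in this paper for a single publish plus a single author.
MEASURED_WRITE_TPS = 1.6


def writes_per_hour(tps: float = MEASURED_WRITE_TPS) -> int:
    """Sustained write operations per hour at a given TPS."""
    return round(tps * 3600)


def fits_write_ceiling(peak_write_tps: float, publish_instances: int = 1,
                       ceiling_tps: float = MEASURED_WRITE_TPS) -> bool:
    """Check a required peak write rate against the measured ceiling.

    publish_instances does not raise the ceiling: per takeaway 3, reverse
    replication to the author instance remains the bottleneck.
    """
    if publish_instances < 1:
        raise ValueError("at least one publish instance is required")
    return peak_write_tps <= ceiling_tps


print(writes_per_hour())                             # 5760
print(fits_write_ceiling(1.2))                       # True
print(fits_write_ceiling(3.0, publish_instances=4))  # False
```

The last call illustrates the third takeaway: adding publish instances does not change the outcome, because the ceiling is set by the author side.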

Adobe’s Social Collaboration can help to achieve social media goals and improve strategy, performance, and scalability. It is our hope that this paper has answered some of your questions and helped you better understand this particular social solution.

References

1. CQ Planning and Capacity Guide http://dev.day.com/docs/en/cq/current/managing/capacity-guide.html

2. CQ Hardware Sizing Guidelines http://wem.help.adobe.com/enterprise/en_US/10-0/wem/managing/hardware_sizing_guidelines.html

3. Introduction to Adobe’s Social Communities http://dev.day.com/docs/en/cq/current/administering/social_communities.html
