measuring the capacity of a web server

Measuring the Capacity of a Web Server

By Xiongjun Tang

Topics

• Introduction
• Dynamics of an HTTP Server
• Problems in Generating Synthetic HTTP Requests
• A Scalable Method for Generating HTTP Requests
• Quantitative Evaluation
• Conclusion

Introduction

• Recent improvements in Web server performance
  – Better Web caching
    • Application-level caching (L7 cluster)
  – HTTP protocol enhancements
    • data compression
  – Better HTTP servers and proxies
  – Server OS implementation
    • tuning OS parameters to get better performance

Introduction (continued)

• Measuring a Web server (workload characteristics)
  – requested file types
  – transfer data sizes
  – locality of reference in URLs
  – using real workloads directly


• Simple scheme of a Web client (request generator)
  – the client establishes a connection
  – sends an HTTP request
  – receives the response
  – waits for a certain time (think time)
  – repeats the cycle

• Adding client processes increases the total request rate
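The cycle above can be sketched as a small Python function. This is only an illustration of the simple scheme, not the generator used in the paper; the name `naive_client` and its parameters are assumptions made for this sketch.

```python
import socket
import time


def naive_client(host, port, path="/", cycles=3, think_time_s=0.0):
    """Run the connect/request/response/think cycle; return bytes received per cycle."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    sizes = []
    for _ in range(cycles):
        # Establish a connection (blocks until the TCP handshake completes).
        with socket.create_connection((host, port)) as conn:
            conn.sendall(request)          # send the HTTP request
            received = 0
            while True:                    # receive until the server closes
                chunk = conn.recv(4096)
                if not chunk:
                    break
                received += len(chunk)
        sizes.append(received)
        time.sleep(think_time_s)           # think time before the next cycle
    return sizes
```

Because each cycle blocks until the previous response has arrived, one such client can never have more than one request outstanding.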


• Problems with the naive method
  – Hard to exceed the server's capacity
  – Little resemblance in temporal characteristics to real-world Web traffic
  – Delay and loss characteristics differ between WAN and LAN
  – Limited resources on the client machines

HTTP Connection Timeline

1. The Web server listens for connections

2. The server receives a client's connection request (SYN packet)

3. The server TCP responds with a SYN-ACK packet, creates a socket for the new, incomplete connection, and places it in the SYN-RCVD queue

4. The client responds with an ACK; the server moves the socket created above from the SYN-RCVD queue to the accept queue

5. The server removes the first socket from the accept queue, sends back the appropriate response, and then closes the connection
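Seen from the server application, the timeline above maps onto a few socket calls. The sketch below is a hedged illustration in Python; `make_listener` and `serve_one` are names invented here, and steps 2 to 4 happen inside the kernel's TCP implementation, not in this code.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=128):
    """Step 1: create a listening socket; `backlog` bounds the kernel queues."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(backlog)
    return s


def serve_one(listener, body=b"hello\n"):
    """Step 5: take the first completed connection, respond, close."""
    # accept() removes the first socket from the accept queue. The SYN,
    # SYN-ACK and ACK exchange (steps 2 to 4) was already done by the kernel.
    conn, _peer = listener.accept()
    with conn:
        conn.recv(4096)  # read the request (assumed to fit in one read)
        header = b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
        conn.sendall(header + body)
    # leaving the `with` block closes the connection
```

A client can connect and send its request before `serve_one` is ever called, because the completed connection simply waits in the accept queue.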

Limitation of TCP Implementation

• Most systems bound the listen backlog, which limits the sum of the lengths of the SYN-RCVD queue and the accept queue
  – if the sum exceeds 1.5 * backlog, the server drops incoming SYN packets
  – when the client TCP does not receive a SYN-ACK, it goes into exponential back-off: on BSD systems it retransmits the SYN 6 seconds and 30 seconds after the first SYN was sent, before finally giving up at 75 seconds
  – without a SYN-ACK, the client sends only 3 connection requests during those 75 seconds!

• The average length of the SYN-RCVD queue depends on the request rate and the round-trip delay between client and server
  – a long round-trip delay and a high request rate increase the SYN-RCVD queue length
  – the accept queue length depends on how fast the server CPU handles requests and on the request rate
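The limits on this slide can be written down as a few lines of arithmetic. The sketch below is illustrative only: the function names are invented here, and the queue-length estimate uses Little's law (rate times delay), which is an assumption added for illustration; the slide only states that the length grows with both quantities.

```python
# Values taken from the slide (BSD behaviour).
SYN_SEND_TIMES_S = (0, 6, 30)   # first SYN, then retransmissions
GIVE_UP_S = 75                  # connection attempt abandoned


def syn_is_dropped(syn_rcvd_len, accept_len, backlog):
    """True if a new SYN would be dropped under the 1.5 * backlog rule."""
    return syn_rcvd_len + accept_len > 1.5 * backlog


def backoff_request_rate():
    """Connection requests per second from one client stuck in back-off."""
    return len(SYN_SEND_TIMES_S) / GIVE_UP_S


def estimated_syn_rcvd_len(request_rate_per_s, round_trip_s):
    """Assumed model (Little's law): average number of half-open connections."""
    return request_rate_per_s * round_trip_s
```

With a backlog of 128, for example, a SYN is dropped once the two queues together hold more than 192 entries, and a client in back-off contributes only 3/75 = 0.04 requests per second.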

Problems in Generating Synthetic HTTP Requests

• 1. Inability to generate excess load
  – In the real world, HTTP requests are generated by a huge number of clients
    • large mean and variance
    • requests are bursty (such as France 98)
    • peak request rates can easily exceed the capacity of the server
  – In the simple model
    • small mean and variance
    • little burstiness

Why can't the simple method generate excess load?

• A new request can only be generated after the previous one has completed
  – as the number of clients increases, the queue grows, so the server takes longer to finish each connection; completion time increases, and the rate at which each client generates requests decreases
  – the net connection request rate of all clients remains equal to the throughput of the server
  – when the number of clients exceeds the server's maximum backlog, the server begins to drop SYN packets
    • TCP exponential back-off sets in, and further requests are generated at a very low rate
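The self-limiting behaviour described above can be captured by a deliberately simplified closed-loop model. Everything in this sketch is an assumption made for illustration: the function name, the parameters, and the model itself, which ignores the small extra rate contributed by clients in back-off.

```python
def closed_loop_request_rate(n_clients, server_capacity,
                             think_time_s=0.0, min_response_s=0.01):
    """Offered request rate (requests/sec) of n closed-loop clients.

    Each client issues a new request only after the previous one completes,
    so the aggregate rate can never exceed what the server completes.
    """
    # Rate the clients could reach if the server were never the bottleneck.
    unconstrained = n_clients / (think_time_s + min_response_s)
    # Once the server saturates, completions (and hence new requests) are
    # capped at the server's capacity.
    return min(unconstrained, server_capacity)
```

In this model, adding clients beyond the saturation point leaves the request rate pinned at the server's capacity, which is why excess load cannot be generated this way.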

Request Rate vs. no. of clients

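• An example: how many clients are required to generate 1100 requests/sec?
  (x = number of clients, y = request rate in requests/sec; 0.04 = 3 requests per 75 seconds for a client in back-off)

  y - 100 = (x - 1024) * 0.04

  --> x = 1024 + (y - 100) / 0.04

  if y = 1100 then

  x = 1024 + 1000 / 0.04

  --> x = 1024 + 25000 = 26024

  (the paper gives a figure on the order of 15000)

• If max connection = 32767: 1.5 * 32767 + 15000 ≈ 64151 clients are needed to generate 1100 requests/sec.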
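The arithmetic on this slide can be checked with a few lines of Python. The function names and default values below simply restate the numbers from the slide (100 requests/sec at 1024 clients, 3 requests per 75 seconds for each additional client); they are not taken from the paper.

```python
import math


def clients_needed(target_rate, base_rate=100, base_clients=1024,
                   syns_per_attempt=3, give_up_s=75):
    """Solve y - base_rate = (x - base_clients) * (3 / 75) for x."""
    extra = (target_rate - base_rate) * give_up_s / syns_per_attempt
    return base_clients + math.ceil(extra)


def clients_with_full_queues(max_backlog, extra_clients):
    """Clients that fill the 1.5 * backlog queue limit, plus the extra ones."""
    return math.ceil(1.5 * max_backlog + extra_clients)
```

Calling `clients_needed(1100)` reproduces the 1024 + 25000 figure, and `clients_with_full_queues(32767, 15000)` reproduces the 64151 figure.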

Additional Problems

• The simple method does not model high and variable WAN delays

• Resource constraints on the client machine
  – with too many processes, contention for CPU and memory increases
  – potential bottleneck: the server is fine, but the clients wait for resources

A Scalable Method for Generating HTTP Requests

• P client machines in total
• Each machine runs several S-Clients
• A router can be used to simulate WAN delay

S-Client

• A pair of processes connected by a UNIX domain socketpair
  – one is the connection establishment process, the other is the connection handling process

• Connection establishment process
  – purpose: to generate HTTP requests at a certain rate and with a certain distribution
  – opens D connections using D sockets; the requests are spaced out over T ms
  – after each socket is created, a timer is associated with it
  – if the server responds within time T, the connection is handed off to the connection handling process; the socket is closed and another connection to the server is initiated
  – if the server does not respond within time T, the socket is closed and another connection to the server is initiated

• In both cases, TCP exponential back-off is avoided
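One round of the connection establishment process might look like the sketch below. It is a simplified illustration, not the S-Client implementation: `establish_round` is a name invented here, the connects are started back to back instead of being spaced over T, and a real S-Client would hand off each completed connection and immediately open a replacement socket.

```python
import select
import socket
import time


def establish_round(addr, d, t_s):
    """Start d non-blocking connects; give each at most t_s seconds.

    Returns (connected_sockets, number_timed_out). Sockets whose connect
    fails outright (e.g. connection refused) are closed and not counted.
    """
    started = {}
    for _ in range(d):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(False)
        s.connect_ex(addr)                 # returns at once, handshake pending
        started[s] = time.monotonic()      # per-socket timer
    connected, timed_out = [], 0
    while started:
        # A socket becomes writable when its connect attempt has finished.
        _, writable, _ = select.select([], list(started), [], 0.01)
        now = time.monotonic()
        for s in list(started):
            if s in writable:
                err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
                del started[s]
                if err == 0:
                    connected.append(s)    # would be handed off here
                else:
                    s.close()
            elif now - started[s] >= t_s:
                del started[s]
                s.close()                  # give up after T, never back off
                timed_out += 1
    return connected, timed_out
```

Closing the socket after T, instead of leaving it to TCP, is what keeps the client from falling into the 6 s / 30 s / 75 s retransmission schedule.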

S-Client (continued)

• Connection handling process
  – waits for data to arrive on any of the active connections
  – when new data arrives, reads it; when the response is complete, closes the socket
  – waits for new connections to arrive on the UNIX domain socket connecting it to the other process
    » these are simply added to the pool of active connections
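The read side of the connection handling process can be sketched with `select`. This is a reduced illustration under stated assumptions: `drain_connections` is a hypothetical name, a response is treated as complete when the server closes the connection, and the arrival of new connections over the UNIX domain socket is left out.

```python
import select


def drain_connections(active, bufsize=4096):
    """Read every socket in `active` until EOF; return {socket: bytes read}."""
    received = {s: b"" for s in active}
    pool = set(active)                      # pool of active connections
    while pool:
        # Block until data (or EOF) is available on any active connection.
        readable, _, _ = select.select(list(pool), [], [])
        for s in readable:
            chunk = s.recv(bufsize)
            if chunk:
                received[s] += chunk        # new data: read it
            else:
                pool.discard(s)             # response complete
                s.close()                   # close the socket
    return received
```

In a full handler, the UNIX domain socket would be included in the `select` set, and each connection arriving on it would be added to `pool`.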

An S-Client

• Two key ideas:
  1. Shorten the TCP connection timeout
     – this allows generation of request rates beyond the capacity of the server
     – each socket generates requests at a rate of at least 1/T
  2. Maintain a constant number of unconnected sockets
     – this ensures that the generated request rate is independent of the rate at which the server handles requests
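Reading the 1/T bound as a per-socket rate, an S-Client with D sockets offers at least D/T requests per unit time. The helper below just performs that unit conversion; the function name is invented here and the D/T reading is an interpretation of the slide, not a quoted result.

```python
def min_request_rate(d_sockets, t_ms):
    """Lower bound on requests/sec for D sockets, each retried every T ms."""
    return d_sockets * 1000.0 / t_ms
```

For instance, with T = 500 ms each socket contributes at least 2 requests/sec, regardless of how quickly the server responds.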

Request generating capacity of a client machine

• Purpose
  – to generate as many requests as possible with as few S-Clients as possible

• Choose the largest allowable number of descriptors (N)
  – How to find it?
  – Choose the largest value N for which the throughput vs. request rate curve measured with 1 client machine is unchanged from the same curve measured with 2 client machines
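The selection rule for N can be expressed as a comparison of measured curves. The sketch below assumes the measurements already exist as lists of throughput values sampled at the same request rates; the function names and the 5% tolerance used to decide that two curves are "unchanged" are assumptions made for illustration.

```python
def curves_match(curve_a, curve_b, rel_tol=0.05):
    """True if two throughput curves agree point by point within rel_tol."""
    if len(curve_a) != len(curve_b):
        return False
    return all(abs(a - b) <= rel_tol * max(a, b, 1e-9)
               for a, b in zip(curve_a, curve_b))


def largest_safe_descriptor_count(measurements, rel_tol=0.05):
    """Pick the largest N whose 1-machine and 2-machine curves match.

    `measurements` maps N to a pair
    (throughputs_with_one_machine, throughputs_with_two_machines).
    Returns None if no N qualifies.
    """
    safe = [n for n, (one, two) in measurements.items()
            if curves_match(one, two, rel_tol)]
    return max(safe) if safe else None
```

If the two curves diverge for some N, the single client machine has itself become the bottleneck at that setting, so a smaller N should be used.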

Quantitative Evaluation

• Each HTTP request is for a single file of size 1294 bytes

• No more than 130 requests/sec with the simple method

• With S-Clients, up to 2065 requests/sec (limited by client machine resources)

• The T value for the S-Clients is 500 ms

Overload behavior

• Why does throughput drop?
  – because CPU resources are spent on protocol processing for incoming requests (SYN packets)

Bursty Condition

• First parameter: the ratio between the maximum request rate and the average request rate

• Second parameter: the fraction of time during which the request rate exceeds the average rate

• In general, high burstiness in either of the two parameters degrades the throughput of the server substantially
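Both burstiness parameters can be computed directly from a series of request-rate samples. The sketch below assumes equally spaced samples, so that the fraction of samples above the average stands in for the fraction of time; the function name is invented here.

```python
def burstiness(rates):
    """Return (peak-to-average ratio, fraction of samples above average)."""
    avg = sum(rates) / len(rates)
    peak_ratio = max(rates) / avg
    frac_above = sum(1 for r in rates if r > avg) / len(rates)
    return peak_ratio, frac_above
```

A perfectly steady workload gives a ratio of 1 and a fraction of 0; larger values of either parameter indicate a burstier workload.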

Conclusion

• This paper examines pitfalls in generating synthetic Web server workloads with a small number of client machines

• It exposes the limitations of the simple method

• A new method (S-Client) is introduced, which can easily generate workloads exceeding the capacity of the server as well as bursty workloads

• This helps in studying server behavior under overload and, in turn, in improving server performance
