
Post on 19-Dec-2015


Performance Evaluation of Web Proxy Cache Replacement Policies

Orit Brimer
Ravit Krayif
Sigal Ishay


Introduction

The World-Wide Web has grown tremendously in the past few years to become the most significant source of traffic on the Internet today.

This growth has led to overloaded Web servers, network congestion, and an increase in the response time observed by the client.

Caching of web documents is widely used to reduce both latency and network traffic in accessing data.

Web proxy servers

A proxy server is a server that acts as an intermediary between a workstation user and the Internet so that the enterprise can ensure security, administrative control, and caching service.

Web proxy servers that cache documents can potentially improve performance in three ways:

• Reduce the latency that an end user experiences in retrieving a web page.
• Lower the volume of network traffic resulting from web page requests.
• Reduce the number of requests that reach popular servers.

How does a web proxy work?

A proxy server receives a request for an Internet service (such as a Web page request) from a user.

The proxy server looks in its local cache of previously downloaded Web pages.

If it finds the page, it returns it to the user without needing to forward the request to the Internet: cache hit!

If the page is not in the cache, the proxy server, acting as a client on behalf of the user, requests the page from the server: cache miss!

How does a web proxy work? (cont.)

[Figure: clients send requests to the proxy cache; hits are answered by the proxy cache, misses are forwarded over the Internet to the servers.]

When the page is returned, the proxy server relates it to the original request and forwards it on to the user.


Cache Replacement Policy

Since the proxy has finite storage, some strategy must be devised to periodically replace documents in favor of more popular ones.

The replacement policy decides which web pages are cached and which web pages are replaced, and therefore determines which future requests will be cache hits.

For this reason the replacement policy plays an important role in the cache’s performance.


Project Goal

Design and implement a web proxy cache simulator in order to test several different replacement policies and other parameters.

Evaluate the performance of the web proxy cache for each parameter by the following metrics:

• Hit rate
• Byte hit rate
• Response time

How does the simulation work?

Client requests are simulated by the ProwGen simulator.

The proxy simulator attempts to fulfill each request from among the web pages stored in its cache.

• If the requested web page is found (a cache hit), the proxy can immediately respond to the client’s request, and the hit rate and byte hit rate are updated.

• If the requested web page is not found (a cache miss), the proxy must retrieve the web page from the origin server; the miss is written to the input file for NS.

The NS simulator is then started; it simulates the transfer of the pages from the servers to the proxy.
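The hit/miss bookkeeping described on this slide can be sketched roughly as follows. This is a minimal Python illustration, not the project's simulator: the function name `replay`, the request tuple layout, and the unbounded cache (no replacement policy) are all assumptions made to keep the example short.

```python
def replay(requests):
    """Replay (page_id, size) requests against an unbounded cache.

    Returns (hit_rate, byte_hit_rate, misses). Replacement is left out
    on purpose; the sketch only shows how the counters are updated.
    """
    cache = set()      # ids of pages currently cached (hypothetical structure)
    hits = 0
    hit_bytes = 0
    total_bytes = 0
    misses = []        # stands in for the misses written to the NS input file
    for page_id, size in requests:
        total_bytes += size
        if page_id in cache:
            hits += 1
            hit_bytes += size
        else:
            misses.append((page_id, size))
            cache.add(page_id)
    n = len(requests)
    hit_rate = hits / n if n else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_rate, byte_hit_rate, misses
```

In this sketch, the hit rate counts requests while the byte hit rate weights each request by its page size, which is why the two metrics can rank policies differently.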

[Figure: simulation flow. The clients (simulated by the ProwGen simulator) send requests for web pages to the web proxy. On a cache hit the proxy answers directly. On a cache miss the proxy requests the web page from the web servers (simulated by the NS simulator), and the requested web page is saved in the cache.]


ProwGen simulator

ProwGen is a synthetic web proxy workload generator.

The workload generator incorporates five selected workload characteristics which are relevant to caching performance:

• one-time referencing
• file popularity
• file size distribution
• correlation between file size and popularity
• temporal locality

ProwGen simulator (cont.)

ProwGen is used in our project to create a trace file with about 300,000 requests.

Each request simulated by ProwGen has an id and a page size.

The proxy simulator maps the id to the server that holds this page and adds a time of arrival for each request.

The time of arrival has an exponential distribution.

An example of the requests file:

Server number   Page size   Time of arrival
5               5247        0.013242
8               5410        0.045169
15              6178        0.049680
1               2596        0.057398
8               1990        0.119137
16              9441        0.314091
10              32982       0.391293
5               2498        0.535970
16              10029       0.550793
18              2366        0.707047
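A reader for a requests file in this layout might look like the sketch below. It assumes whitespace-separated fields in the order server number, page size, time of arrival; the names `Request` and `parse_trace` are hypothetical and do not come from the project.

```python
from typing import NamedTuple


class Request(NamedTuple):
    server: int      # server number
    size: int        # page size
    arrival: float   # time of arrival


def parse_trace(text):
    """Parse whitespace-separated 'server size arrival' lines."""
    requests = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        server, size, arrival = fields
        requests.append(Request(int(server), int(size), float(arrival)))
    return requests


# First three rows of the example requests file.
sample = """5 5247 0.013242
8 5410 0.045169
15 6178 0.049680"""
trace = parse_trace(sample)
```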

NS simulator

What is NS?

NS is a discrete event simulator targeted at networking research.

NS is a multiprotocol simulator that implements unicast and multicast routing algorithms, transport and session protocols.

What is it good for?

• Evaluating the performance of existing network protocols, so that protocols can be compared.
• Prototyping and evaluation of new protocols.
• Large-scale simulations not possible in real experiments.

Using NS in our project

Simulation script flow:

• Create the event scheduler
• Create network configuration
• Create transport connection: TCP connections
• Create traffic on top of TCP: FTP application
• Transmit application-level data

Input files:

• nsRand: this file contains the parameters for creating the network topology.
• nsout: the file with the requests that were not in the cache. This file contains:
  1. server number
  2. page size
  3. the arrival time of the request.

Using NS in our project (cont.)

Creating the network configuration:

The network topology in our project is a star topology: the web proxy is connected to each server with a duplex link.

The topology is given as an input file for the ns script. It is defined by the following parameters:

1. Number of servers

For each duplex link we define:

2. Delay: random parameter between 10 and 40 ms.

3. Bandwidth: random parameter between 1 and 10 Mb.
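Drawing the link parameters for such a star topology could be sketched as below. The slides only say the values are random within the stated ranges, so the uniform distribution, the function name `make_star_topology`, and the dictionary keys are assumptions for illustration.

```python
import random


def make_star_topology(num_servers, seed=None):
    """Draw a delay and a bandwidth for each proxy-to-server duplex link.

    Assumes uniform draws: delay in [10, 40] ms, bandwidth in [1, 10] Mb.
    """
    rng = random.Random(seed)   # seeded so that a run can be reproduced
    links = []
    for server in range(num_servers):
        links.append({
            "server": server,
            "delay_ms": rng.uniform(10, 40),
            "bandwidth_mb": rng.uniform(1, 10),
        })
    return links
```

Fixing the seed keeps the topology identical across runs, which matters when the same network is reused to compare several replacement policies.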

[Figure: NAM visualization of the network configuration.]

Using NS in our project (cont.)

Creating connections and traffic:

NS parses the input file and, for each miss, opens a TCP session with the origin server and retrieves the file from it.

The pages are transferred from the servers to the proxy using an FTP application.

NS creates an output file that contains the retrieval time of each request.

This is done by defining a special procedure which is called automatically at the end of each session.

The retrieval time of a request depends on the link attributes and has an effect on the web proxy performance.

We compare this time for each replacement algorithm.

The procedure done:

Agent/TCP instproc done {} {
    global tcpsrc NodeNb ns ftp Out tcp_snk totalSum PR
    # print in $Out: node, session, start time, end time, duration,
    # trans-pkts, transm-bytes, retrans-bytes, throughput (how many bytes
    # transferred per second).
    set duration [expr [$ns now] - [$self set starts]]
    # k is the source node
    set k [$self set node]
    # l is the number of the session
    set l [$self set sess]
    set totalSum [expr $totalSum + $duration]
    # ndatapack_ is the number of packets transmitted by the connection.
    # ndatabytes_ is the number of data bytes transmitted by the connection.
    # nrexmitbytes_ is the number of bytes retransmitted by the connection.
    puts $Out "$k \t $l \t [$self set starts] \t\
        [$ns now] \t $duration \t [$self set ndatapack_] \t\
        [$self set ndatabytes_] \t [$self set nrexmitbytes_]"
}

An example of the code from the Tcl script
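The average retrieval time could then be computed from the output file along the following lines. This sketch assumes each line holds the tab-separated fields written by the puts call, with the duration as the fifth field; the function name `average_duration` and the two sample lines are made up for illustration.

```python
def average_duration(lines):
    """Average the fifth tab-separated field (duration) over all sessions."""
    durations = []
    for line in lines:
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        durations.append(float(fields[4]))
    return sum(durations) / len(durations) if durations else 0.0


# Two invented output lines: node, session, start, end, duration,
# packets, data bytes, retransmitted bytes.
example = [
    "3 \t 0 \t 0.5 \t 0.75 \t 0.25 \t 4 \t 5247 \t 0",
    "7 \t 1 \t 1.0 \t 1.75 \t 0.75 \t 6 \t 9441 \t 0",
]
```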


Pruning Algorithms

We describe several cache replacement algorithms proposed in recent studies, which attempt to minimize various cost metrics such as miss ratio, byte miss ratio, average latency, and total cost.

These algorithms will be used in our simulation and will be compared at the end of the simulation.

In our implementation, each page has a pruning value field and this field holds varying information according to the specific pruning algorithm.

The HTML pages are kept sorted by this field, so pruning is simple and nearly identical for all algorithms.

The following algorithms were implemented and tested:

LRU: Least Recently Used

LRU evicts the document which was requested the least recently.

It is based on the observation that documents which have been referenced in the recent past will likely be referenced again in the near future.

We implemented this algorithm by holding a time stamp in the pruning value field of the page.

When a page in the cache is accessed, the value of this field is set to the current time.

The page with the lowest time stamp will be replaced.
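The time-stamp scheme can be sketched as follows. This is a simplified illustration rather than the project's code: capacity is counted in pages instead of bytes, a logical counter stands in for the current time, and the class name `LRUCache` is hypothetical.

```python
class LRUCache:
    """LRU driven by a time stamp stored as each page's pruning value."""

    def __init__(self, max_pages):
        self.max_pages = max_pages  # capacity in pages, to keep the sketch short
        self.stamp = {}             # page id -> time stamp of the last access
        self.clock = 0              # logical clock standing in for "current time"

    def access(self, page_id):
        """Return True on a hit, False on a miss (the page is then cached)."""
        self.clock += 1
        hit = page_id in self.stamp
        if not hit and len(self.stamp) >= self.max_pages:
            victim = min(self.stamp, key=self.stamp.get)  # lowest time stamp
            del self.stamp[victim]
        self.stamp[page_id] = self.clock  # set or refresh the time stamp
        return hit
```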

LFU: Least Frequently Used

The Least Frequently Used policy maintains a reference count for every object in the cache.

The object with the lowest reference count is selected for replacement.

The motivation for this algorithm is that some pages are accessed more frequently than others, so the reference counts can be used as an estimate of the probability of a page being referenced.

The page with the lowest probability to be referenced again will be replaced.
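A reference-count version of the same idea might look like this. As before, this is only a sketch: capacity is counted in pages, the class name `LFUCache` is hypothetical, and ties between equal counts are broken in favor of evicting the page inserted earliest, which the slides do not specify.

```python
class LFUCache:
    """LFU driven by a per-page reference count."""

    def __init__(self, max_pages):
        self.max_pages = max_pages  # capacity in pages, to keep the sketch short
        self.count = {}             # page id -> reference count

    def access(self, page_id):
        """Return True on a hit, False on a miss (the page is then cached)."""
        if page_id in self.count:
            self.count[page_id] += 1
            return True
        if len(self.count) >= self.max_pages:
            # Lowest reference count; on a tie min() returns the first
            # such key in insertion order.
            victim = min(self.count, key=self.count.get)
            del self.count[victim]
        self.count[page_id] = 1
        return False
```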

Hybrid Algorithm

The purpose of the HYB algorithm is to minimize the time that end users wait for a document to load.

HYB is a hybrid of several factors, considering not only download time but also the number of references to a document and the document size.

Each server in the serversDB holds the bandwidth and delay of the link which connects it to the proxy.

HYB selects for replacement the document i with the lowest value of the following expression:

$$\frac{\left(clat_{ser(i)} + W_B / cbw_{ser(i)}\right)\left(nref_i \cdot W_N\right)}{s_i}$$

clat_{ser(i)}: estimated latency (time) to open a connection to the server.

cbw_{ser(i)}: bandwidth of the connection (in megabytes/second).

nref_i: number of references to document i since it last entered the cache.

s_i: the size in bytes of document i.

W_B and W_N are constants.
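Evaluating this expression and picking a victim can be sketched as below. The sketch follows the expression as it appears in the transcript, with the reference count multiplied by WN; the function name, the example documents, and the values chosen for WB and WN are placeholders, since the slides do not give them.

```python
def hyb_value(clat, cbw, nref, size, wb, wn):
    """Value of the HYB expression; the document with the lowest value is evicted."""
    return (clat + wb / cbw) * (nref * wn) / size


# Two hypothetical documents on the same link, differing only in reference count.
docs = {
    "a": {"clat": 0.5, "cbw": 2.0, "nref": 3, "size": 1500},
    "b": {"clat": 0.5, "cbw": 2.0, "nref": 1, "size": 1500},
}
WB, WN = 1.0, 1.0  # placeholder constants, not the project's values
victim = min(docs, key=lambda d: hyb_value(wb=WB, wn=WN, **docs[d]))
```

With equal link cost and size, the less referenced document gets the lower value and is chosen for replacement.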

GreedyDual-Size

This algorithm combines locality, size and latency/cost concerns effectively to achieve the best overall performance.

The algorithm associates a value, H, with each cached page p.

Initially, when a page is brought into the cache, H is set to be the cost of bringing the page into the cache.

When a replacement needs to be made, the page with the lowest H value (minH) is replaced, and all pages reduce their H values by minH.

If a page is accessed, its H value is restored to the cost of bringing it into the cache.

Thus, the H values of recently accessed pages retain a larger portion of the original cost than those of pages that have not been accessed for a long time.

GreedyDual-Size selects for replacement the document i with the lowest value of the following expression:

$$\frac{clat_{ser(i)} + 1 / cbw_{ser(i)}}{s_i}$$
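The H-value bookkeeping can be sketched as follows. This is an illustrative version only: the class name `GreedyDualSize` is hypothetical, the caller is assumed to pass the cost term (for example clat + 1/cbw), and H is aged by subtracting minH from every remaining page exactly as the text describes, which is simple but not the most efficient way to do it.

```python
class GreedyDualSize:
    """GreedyDual-Size with H = cost / size and explicit aging by minH."""

    def __init__(self, capacity):
        self.capacity = capacity  # total bytes the cache may hold
        self.used = 0             # bytes currently occupied
        self.h = {}               # page id -> current H value
        self.size = {}            # page id -> size in bytes

    def access(self, page_id, size, cost):
        """Return True on a hit, False on a miss."""
        hit = page_id in self.h
        if not hit:
            if size > self.capacity:
                return False      # page can never fit; do not cache it
            while self.used + size > self.capacity:
                victim = min(self.h, key=self.h.get)   # lowest H value
                min_h = self.h.pop(victim)
                self.used -= self.size.pop(victim)
                for p in self.h:                        # age the survivors
                    self.h[p] -= min_h
            self.size[page_id] = size
            self.used += size
        self.h[page_id] = cost / size  # set on insert, restored on a hit
        return hit
```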

Size

The Size policy, designed specifically for web proxy caches, removes the largest object from the cache when space is needed for a new object.

We implemented this algorithm by holding the page size in the pruning value field of the page.

Data Structures

struct HtmlPage {
    long int id;              /* page id */
    long int size;            /* page size */
    double prunningValue;     /* value used by the pruning algorithm */
    int reference;
    long int timeStamp;
    struct HtmlPage *next;    /* next page in the sorted list */
};

The WebProxy holds a cache which is implemented as a sorted list of HtmlPages.

struct WebProxy {
    List* cache;              /* sorted list of HtmlPages */
    double currMemory;
    long int proxyHits;
    double byteHits;
    double inflationVal;
};

Data Structures (cont.)

struct WebServer {
    int sNum;                 /* the index in the servers array */
    double bandwidth;
    int delay;
};

We hold an array of WebServers. Each WebServer holds information about the delay and bandwidth of the link that connects it to the proxy.

Basic Implementation

The program consists of several stages:

• Create random values for the network configuration.
• Read request by request from a trace file created by ProwGen.
• For each request:
  1. The WebProxy first checks if the page is stored in its cache. If so, it records a proxy hit and updates the pruning value of the page according to the pruning algorithm.
  2. If the page is not in the cache, a miss is recorded and the request is written to the misses file.
  3. The WebProxy creates a new page and updates its pruning value according to the pruning algorithm.
  4. The WebProxy checks if there is enough memory in the cache for this page.
  5. If not, it removes pages from the cache according to the pruning algorithm, in such a way that the occupied memory in the cache after inserting the new page will not exceed TRESHOLD * CACHE_SIZE.
  6. The page is cached.
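The pruning step on a miss can be sketched with a list kept sorted by pruning value. This is a sketch under assumptions, not the project's code: the function name `insert_page`, the tuple layout, and the threshold value used in the usage example are hypothetical, and the lowest pruning value is taken to be the first candidate for removal.

```python
import bisect


def insert_page(cache, page, threshold, cache_size):
    """Insert page = (pruning_value, page_id, size) into the sorted list `cache`.

    If the page does not fit in the cache, pages with the lowest pruning
    value are removed until the occupied memory after the insert does not
    exceed threshold * cache_size. Returns the list of evicted page ids.
    """
    size = page[2]
    used = sum(s for _, _, s in cache)
    evicted = []
    if used + size > cache_size:                   # not enough memory
        limit = threshold * cache_size
        while cache and used + size > limit:
            _, victim_id, victim_size = cache.pop(0)   # lowest pruning value
            used -= victim_size
            evicted.append(victim_id)
    bisect.insort(cache, page)                     # keep the list sorted
    return evicted
```

For example, with a cache of 100 bytes holding two 40-byte pages and a threshold of 0.8, inserting a 30-byte page evicts the page with the lowest pruning value and leaves 70 bytes occupied. Pruning below the full cache size means that the next few misses can be inserted without triggering another round of removals.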


Performance analysis

This section evaluates the performance of the web proxy cache for each replacement policy.

We examined the replacement policies for different cache sizes: 4, 8, 16, 32, 64, 128 and 256 MB.

The simulations were executed in two different network topologies: 20 and 100 servers.

In this study we use three metrics to evaluate the performance of the proxy cache:

• Hit rate: the percentage of all requests that can be satisfied by searching the cache for a copy of the requested object.
• Byte hit rate: the percentage of all data that is transferred directly from the cache rather than from the origin server.
• Average response time: the average time it takes to bring a web page that caused a cache miss.

Hit Rate

The following table shows the hit rate of the tested algorithms. Network configuration: 20 servers in a star topology.

Cache size   LRU        LFU        SIZE       HYBRID     GREEDY
4 MB         0.252671   0.287193   0.301798   0.34772    0.33295
8 MB         0.300952   0.330513   0.341677   0.3972     0.42261
16 MB        0.381000   0.382784   0.372575   0.439393   0.48865
32 MB        0.45456    0.435052   0.466911   0.512769   0.54588
64 MB        0.51484    0.494374   0.517395   0.568799   0.60184
128 MB       0.567693   0.564018   0.57958    0.621293   0.64749
256 MB       0.614712   0.621101   0.643065   0.667906   0.67868

[Figure: hit rate (0 to 0.8) versus cache size (0 to 300 MB) for LRU, LFU, SIZE, HYBRID and GREEDY.]

Analyze results

As we expected, the GREEDY and HYBRID algorithms show the best hit rate (they were designed to maximize hit rate).

The graph shows that the hit rate grows as the cache size grows, but the slope decreases starting from a cache size of 64 MB.

Byte Hit Rate

The following table shows the byte hit rate of the tested algorithms. Network configuration: 20 servers in a star topology.

Cache size   LRU        LFU        SIZE       HYBRID     GREEDY
4 MB         0.120115   0.126885   0.079143   0.104738   0.123857
8 MB         0.165112   0.172519   0.090517   0.12052    0.170542
16 MB        0.2418     0.228114   0.102389   0.139016   0.230695
32 MB        0.315332   0.291938   0.150791   0.18945    0.288911
64 MB        0.37966    0.359124   0.182219   0.231607   0.345246
128 MB       0.443585   0.43493    0.242306   0.296114   0.401259
256 MB       0.49864    0.507124   0.337604   0.393601   0.455649

[Figure: byte hit rate (0 to 0.6) versus cache size (0 to 300) for LRU, LFU, SIZE, HYBRID and GREEDY.]

Analyze results

SIZE gets the lowest byte hit rate. This result is not surprising, since SIZE removes the largest pages from the cache.

GREEDY and HYBRID also consider the size of the page when calculating its pruning value, so these algorithms do not achieve the best byte hit rate.

Average time per request

The following table shows the average time per request of the tested algorithms. Network configuration: 20 servers in a star topology.

Cache size   LRU        LFU        SIZE       HYBRID     GREEDY
4 MB         0.068731   0.066438   0.069262   0.064136   0.065687
8 MB         0.066549   0.062011   0.064444   0.061539   0.062312
16 MB        0.061492   0.058304   0.061675   0.05746    0.056407
32 MB        0.056235   0.052722   0.050765   0.047955   0.049974
64 MB        0.045404   0.044138   0.043939   0.039417   0.037043
128 MB       0.031969   0.030623   0.038532   0.031464   0.031515
256 MB       0.020406   0.019244   0.02752    0.02243    0.026061

[Figure: average time per request (0 to 0.08) versus cache size (0 to 300 MB) for LRU, LFU, SIZE, HYBRID and GREEDY.]

Analyze results

SIZE gets the worst average time per request (the highest value at 4, 16, 128 and 256 MB). This result is not surprising, since SIZE showed the worst byte hit rate.

GREEDY and HYBRID show the best results although they do not have an optimal byte hit rate. This is because they take into consideration the cost (delay and bandwidth) of bringing a page from the origin server. In addition, they showed the best hit rate, which also affects this metric.

Hit Rate

The following table shows the hit rate of the tested algorithms. Network configuration: 100 servers in a star topology.

Cache size   LRU        LFU        SIZE       HYBRID     GREEDY
4 MB         0.25267    0.287193   0.301798   0.35164    0.353713
8 MB         0.30095    0.330513   0.341677   0.40519    0.423008
16 MB        0.38098    0.382784   0.372575   0.44308    0.488233
32 MB        0.45456    0.435052   0.466911   0.51653    0.542906
64 MB        0.51484    0.494374   0.517395   0.5661     0.595806
128 MB       0.56769    0.564018   0.57958    0.61475    0.639573

[Figure: hit rate (0 to 0.7) versus cache size (0 to 140) for LRU, LFU, SIZE, HYBRID and GREEDY.]

Analyze results

The GREEDY and HYBRID algorithms still show the best hit rate.

As expected, changing the network configuration had little influence on the hit rate.

Byte Hit Rate

The following table shows the byte hit rate of the tested algorithms. Network configuration: 100 servers in a star topology.

Cache size   LRU        LFU        SIZE       HYBRID     GREEDY
4 MB         0.120115   0.126885   0.079143   0.106309   0.124026
8 MB         0.165112   0.172519   0.090517   0.124456   0.170799
16 MB        0.241847   0.228114   0.102389   0.142384   0.232462
32 MB        0.315332   0.291938   0.150791   0.199012   0.291591
64 MB        0.37966    0.359124   0.182219   0.240538   0.346464
128 MB       0.443585   0.43493    0.242306   0.305341   0.397325

[Figure: byte hit rate (0 to 0.5) versus cache size (0 to 140) for LRU, LFU, SIZE, HYBRID and GREEDY.]

Analyze results

As in the graph for the 20-server network configuration, SIZE gets the lowest byte hit rate.

GREEDY and HYBRID also consider the size of the page when calculating its pruning value, so these algorithms do not achieve the best byte hit rate.

Average time per request

The following table shows the average time per request of the tested algorithms. Network configuration: 100 servers in a star topology.

Cache size   LRU        LFU        SIZE       HYBRID     GREEDY
4 MB         0.068731   0.066438   0.069262   0.044824   0.045984
8 MB         0.066549   0.062011   0.064444   0.042735   0.043542
16 MB        0.061492   0.058304   0.061675   0.039934   0.039591
32 MB        0.056235   0.052722   0.050765   0.033411   0.034010
64 MB        0.045404   0.044138   0.043939   0.029192   0.027259
128 MB       0.031969   0.030623   0.038532   0.023811   0.020275

[Figure: average time per request (0 to 0.08) versus cache size (0 to 140) for LRU, LFU, SIZE, HYBRID and GREEDY.]

Analyze results

GREEDY and HYBRID give the lowest average time per request.

These are the expected results, since they are the only algorithms that consider the cost of retrieving a page from an origin server.

In this network configuration the gap between GREEDY and HYBRID on one side and the other algorithms on the other is clear at every cache size.

Conclusions

The algorithm that gives the best overall results is GREEDY:

• Best hit rate
• Best average time per request

The algorithm that gives the worst results is SIZE:

• Worst byte hit rate
• Worst average time per request

LRU and LFU give the best byte hit rate. This can be explained by the fact that these are the only algorithms that do not take page size into account.

Conclusions (cont.)

The HYBRID algorithm shows good performance in the following metrics:

• Hit rate
• Average time per request

But across all the metrics, GREEDY generally shows better results.

For all the tested algorithms, the hit rate improves significantly when the cache size increases from 4 MB to 64 MB. Beyond this point the improvement is much more moderate.