Locality-Aware Request Distribution in Cluster-based Network Servers

Presented by: Kevin Boos
Authors: Vivek S. Pai, Mohit Aron, et al.
Rice University
ASPLOS 1998
*** Figures adapted from original presentation ***



Time Warp to 1998

Rapid Internet growth
Bandwidth limitations
“Cheap” PCs and “fast” LANs
Need for increased throughput


Clustered Servers

[Figure: cluster with clients, a front-end node, a LAN (switch), and back-end nodes]


Weighted Round Robin (WRR)


Pure Locality-Based Distribution


Motivation for Change

Weighted Round Robin
  Disregards content on back-end nodes
  Many cache misses
  Limited by disk performance

Pure Locality-Based Distribution
  Disregards current load on back-end nodes
  Uneven load distribution
  Inefficient use of resources
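
The contrast between the two baseline policies can be sketched in a few lines. This is only an illustration: the static per-node weights, the CRC32 hash, and the function names are assumptions made for the example, not details taken from the slides.

```python
import itertools
import zlib


def weighted_round_robin(weights):
    """Return an endless iterator of node indices.

    Each node appears in proportion to its (assumed, static) weight.
    The requested target is never consulted, so any node may end up
    serving any content.
    """
    schedule = [node for node, w in enumerate(weights) for _ in range(w)]
    return itertools.cycle(schedule)


def locality_based(target, num_nodes):
    """Pick a node from the target name alone, using a stable hash.

    The same target always lands on the same node, but the node's
    current load is never consulted.
    """
    return zlib.crc32(target.encode("utf-8")) % num_nodes


if __name__ == "__main__":
    rr = weighted_round_robin([2, 1])
    print([next(rr) for _ in range(6)])
    print(locality_based("/index.html", 3))
```

The first function never looks at the target and the second never looks at load, which are the two weaknesses listed above.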


LARD Concepts

Locality-Aware Request Distribution
Goal: improve performance
  Higher throughput
  Higher cache hit rates
  Reduced disk access
Even load distribution + content-based distribution
  The best of both algorithms


Outline

Basic LARD Algorithm
Improvements to LARD
TCP Handoff Protocol
Simulation and Results
Prototype Implementation and Testing


Basic LARD Algorithm

Front-end maps target content to back-end nodes
  1-to-1 mapping
First request for each target is assigned to the least-loaded back-end node
Subsequent requests are distributed to the same back-end node based on target content mapping
  Unless overloaded…
  Re-assigns target content to a new back-end node
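
The mapping described above can be sketched as a small dispatcher. The class name, method names, and the pluggable `is_overloaded` predicate are hypothetical, introduced only for illustration; the slides do not show the authors' code.

```python
class BasicLard:
    """Minimal sketch of a front-end that maps each target to one back-end."""

    def __init__(self, num_nodes, is_overloaded):
        self.load = [0] * num_nodes         # open connections per back-end
        self.server = {}                    # target -> back-end index (1-to-1)
        self.is_overloaded = is_overloaded  # callable(node, load) -> bool

    def least_loaded(self):
        # Ties go to the lowest node index.
        return min(range(len(self.load)), key=self.load.__getitem__)

    def dispatch(self, target):
        node = self.server.get(target)
        # First request for a target, or its node is overloaded:
        # (re-)assign the target to the least-loaded back-end.
        if node is None or self.is_overloaded(node, self.load):
            node = self.least_loaded()
            self.server[target] = node
        self.load[node] += 1
        return node

    def finish(self, node):
        # Call when a connection handled by `node` closes.
        self.load[node] -= 1


if __name__ == "__main__":
    lard = BasicLard(3, lambda node, load: load[node] > 2)
    print([lard.dispatch(t) for t in ["/a", "/b", "/a", "/a", "/a"]])
```

In this sketch the overload test is passed in as a function, so the threshold rule can be swapped without touching the mapping logic.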


Flow of Basic LARD

[Figure: client and front-end node]


Determining Load in Basic LARD

Ask the server?
  Introduces unnecessary communication
Current load = number of open connections
  Tracked in the front-end node
Use thresholds to determine when to re-balance
  Low, High, and Limit
  Re-balance when (load > T_limit) or (load > T_high and there is a “free” node with load < T_low)
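
The re-balance condition can be written as a single predicate. This is a sketch of the rule as stated on the slide; the function name and the threshold values in the usage lines are made up for the example.

```python
def should_rebalance(load, node, t_low, t_high, t_limit):
    """Return True if the target mapped to `node` should be re-assigned.

    `load` is the list of open-connection counts tracked by the front-end.
    """
    if load[node] > t_limit:
        return True
    # A "free" node is one whose load is below the low threshold.
    has_free_node = any(l < t_low for l in load)
    return load[node] > t_high and has_free_node


if __name__ == "__main__":
    # Hypothetical thresholds chosen only for illustration.
    T_LOW, T_HIGH, T_LIMIT = 25, 65, 130
    print(should_rebalance([70, 10, 70], 0, T_LOW, T_HIGH, T_LIMIT))
    print(should_rebalance([70, 30, 70], 0, T_LOW, T_HIGH, T_LIMIT))
```

Because load is counted as open connections at the front-end, the predicate needs no messages to or from the back-end nodes.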


LARD Needs Improvement

Only one back-end node per target content
  Working set is a single node
  Front-end must limit total connections
Still need to increase throughput
  One node per content type is unrealistic
  …add more back-end nodes?


LARD/R

LARD with Replication
Maps target content to a set of back-end nodes
  Working set is several nodes with similar cache content
Sends new requests to least-loaded node in set
Moves nodes to/from sets based on load imbalance
  Idle nodes in a low-load set are moved to higher-load set
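
A rough sketch of the replicated variant follows. The growth rule tracks the bullets above; the shrink rule (drop the most-loaded member once a set has been unchanged for `shrink_after` seconds) and all names are simplifying assumptions made for this example, not a statement of the authors' exact policy.

```python
class LardR:
    """Minimal sketch: each target maps to a set of back-end nodes."""

    def __init__(self, num_nodes, is_overloaded, shrink_after=20.0):
        self.load = [0] * num_nodes
        self.server_set = {}            # target -> set of node indices
        self.last_change = {}           # target -> time its set last changed
        self.is_overloaded = is_overloaded
        self.shrink_after = shrink_after  # assumed value, in seconds

    def dispatch(self, target, now):
        load = self.load.__getitem__
        nodes = self.server_set.setdefault(target, set())
        # Shrink: a multi-node set that has been stable for a while
        # gives up its most-loaded member (assumed rule).
        if len(nodes) > 1 and now - self.last_change[target] > self.shrink_after:
            nodes.discard(max(nodes, key=load))
            self.last_change[target] = now
        node = min(nodes, key=load) if nodes else None
        # Grow: no node yet, or even the best member is overloaded,
        # so recruit the least-loaded node in the whole cluster.
        if node is None or self.is_overloaded(node, self.load):
            node = min(range(len(self.load)), key=load)
            if node not in nodes:
                nodes.add(node)
                self.last_change[target] = now
        self.load[node] += 1
        return node


if __name__ == "__main__":
    lard_r = LardR(3, lambda node, load: load[node] >= 2)
    print([lard_r.dispatch("/a", t) for t in (0.0, 1.0, 2.0)])
    print(lard_r.server_set["/a"])
```

Passing `now` explicitly keeps the sketch deterministic; a real front-end would read a clock instead.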


Flow of LARD/R

[Figure: client and front-end node]


Determining Content Type

How do we determine content in the front-end?
  Front-end must see network traffic
Standard TCP Assumptions
  Requests are small and light
  Responses are big and heavy
How do we forward requests?


Potential TCP Solutions

Simple TCP Proxy
  Everything must flow through front-end node
  Can inspect all incoming content
  Cannot respond directly from back-end to client
    But front-end can also inspect all outgoing content
  Better for persistent connections


TCP Connection Handoff
  Front-end connects to client
  Inspects content
  Forwards request to back-end node
  Response is returned directly to the client from the back-end node
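
A back-of-the-envelope comparison shows why handing off the connection relieves the front-end when requests are small and responses are large. The function name and the byte counts are hypothetical, and the model ignores packet headers and acknowledgement traffic.

```python
def front_end_bytes(request_bytes, response_bytes, mode):
    """Payload bytes the front-end must carry for one request.

    mode "proxy":   request and response both pass through the front-end.
    mode "handoff": only the request does; the back-end replies to the
                    client directly.
    """
    if mode == "proxy":
        return request_bytes + response_bytes
    if mode == "handoff":
        return request_bytes
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Assumed sizes: a 300-byte request and a 10,000-byte response.
    for mode in ("proxy", "handoff"):
        print(mode, front_end_bytes(300, 10_000, mode))
```

Under these assumed sizes the proxy front-end moves 10,300 payload bytes per request while the handoff front-end moves 300.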


Evaluation Goals

Throughput
  Requests/second served by entire cluster
Hit rate
  (Requests that hit memory cache) / (total requests)
Underutilization time
  Time that a node’s load is ≤ 40% of T_low
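
The three metrics are simple ratios and can be sketched as follows. The function names are illustrative, and the underutilization helper assumes that load is sampled at equal intervals, which the slides do not state.

```python
def throughput(requests_served, elapsed_seconds):
    """Requests per second served by the whole cluster."""
    return requests_served / elapsed_seconds


def hit_rate(cache_hits, total_requests):
    """Fraction of requests served from a back-end's memory cache."""
    return cache_hits / total_requests


def underutilization_fraction(load_samples, t_low, fraction=0.4):
    """Fraction of equally spaced samples in which a node's load
    is at or below `fraction` of T_low."""
    threshold = fraction * t_low
    under = sum(1 for load in load_samples if load <= threshold)
    return under / len(load_samples)


if __name__ == "__main__":
    # Made-up numbers, only to show the calls.
    print(throughput(500, 10))
    print(hit_rate(80, 100))
    print(underutilization_fraction([0, 10, 11, 50], t_low=25))
```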


Simulation Model

300MHz Pentium II
32MB Memory (cache)
100Mbps Ethernet
Traces from web servers at Rice and IBM


Simulation Results – Prior Work

Weighted Round Robin
  Lowest throughput
  Highest cache miss ratio
  But lowest idle time

Pure Locality-Based
  More nodes lead to a lower cache miss ratio
  But idle time increases (unbalanced load)
  Only minor improvement over WRR


Simulation Results – LARD & LARD/R

Throughput ~4x better (8 nodes)
  WRR would need nodes with a 10x larger cache size
CPU bound after 8 nodes
Cache miss rate decreases
Only 1% idle time on average


Simulation Results – Throughput


Simulation Results – Cache Misses


Simulation Results – Idle Time


What Affects Performance?

WRR is disk-bound, LARD/R is CPU-bound
  Increasing CPU speed improves LARD/R, not WRR
  Adding more disks improves WRR, not LARD/R
    LARD/R shows no improvement if a node has > 2 disks
WRR is not scalable


Prototype Implementation

One front-end PC
  300MHz Pentium II, 128MB RAM
6 back-end PCs
7 client PCs
  166MHz Pentium Pro, 64MB RAM
100Mb Ethernet, 24-port switch


Prototype Testing Results


Evaluation Shortcomings

What influences the results more?
  LARD/R protocol?
  TCP handoff protocol?


Conclusion

LARD and LARD/R significantly better than WRR
  Higher throughput
  Better CPU utilization
  More frequent cache hits
  Reduced disk access
Benefits of Locality-Based and Load-Balanced
Scalable at low cost