Traditional Web Caching
  Goals
    Reduce browser latency
    Reduce aggregate bandwidth
    Reduce load on web servers
  Deployment
    Dedicated centralized machines
    Placed at local network boundaries
Squirrel Web Caching
  Decentralized caching
  Desktops cooperate in a peer-to-peer fashion
  Mutual sharing between hosts
  Hosts browse and cache
Pros and Cons
  Centralized (cons)
    Dedicated hardware: cost and administration
    Handling load bursts
    Single point of failure
  Decentralized (pros)
    No additional hardware
    More users, more resources
    Automatic scaling
    Self organizing
    Easy deployment
Assumptions
  Cooperative hosts
    No security issues
  Link and node failures
  Nodes are in a single geographic location
    Low internal network latencies
Design Goals
  Target environment: 100 to 100,000 machines
  Goal: achieve performance comparable to a centralized cache
Design Overview
  Built on top of Pastry
  Objects have 128-bit objectIds
    SHA-1 hash of URL
    Mapped to home node with closest nodeId
  Requests
    GET: new request
    cGET: conditional request
  Two schemes
    Home-store
    Directory
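The mapping from a URL to its home node can be sketched as follows. This is a minimal illustration, not Squirrel's or Pastry's actual code: the function names are hypothetical, truncating the 160-bit SHA-1 digest to 128 bits is an assumption, and the home node is picked from a global list of nodeIds, whereas Pastry finds it by routing hop by hop. The wrap-around distance on a circular id space is also an assumption of this sketch.

```python
import hashlib

ID_BITS = 128
ID_SPACE = 1 << ID_BITS  # size of the circular identifier space


def object_id(url: str) -> int:
    """Derive a 128-bit objectId from the SHA-1 hash of the URL.

    Assumption: the 160-bit digest is truncated to its first 128 bits.
    """
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def circular_distance(a: int, b: int) -> int:
    """Distance between two ids on the ring (assumed to wrap around)."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def home_node(url: str, node_ids: list[int]) -> int:
    """Return the nodeId numerically closest to the URL's objectId."""
    oid = object_id(url)
    return min(node_ids, key=lambda n: circular_distance(n, oid))
```

Because the objectId depends only on the URL, every client that requests the same URL computes the same home node without any coordination.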
Home-store
  Objects stored at client cache and home node
  External requests come through home node
  Cache replacement
    All objects are considered
  [Figure: home-store request flow. (a) home node fresh, (b) home node stale]
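The home-store flow can be illustrated with a small simulation. It is a sketch under stated assumptions: the class and field names are hypothetical, freshness is a plain expiry timestamp, each client is wired to a single home node instead of locating it through Pastry, and a stale copy is simply refetched in full instead of being revalidated with a cGET.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Entry:
    body: str
    expires: float  # the copy is stale once now >= expires


@dataclass
class HomeNode:
    # Hypothetical origin fetch: (url, now) -> fresh Entry
    fetch_origin: Callable[[str, float], Entry]
    cache: Dict[str, Entry] = field(default_factory=dict)
    origin_fetches: int = 0

    def handle(self, url: str, now: float) -> Entry:
        entry = self.cache.get(url)
        if entry is None or entry.expires <= now:
            # Missing or stale at the home node: go to the origin server.
            entry = self.fetch_origin(url, now)
            self.cache[url] = entry
            self.origin_fetches += 1
        return entry


@dataclass
class Client:
    home: HomeNode
    cache: Dict[str, Entry] = field(default_factory=dict)

    def get(self, url: str, now: float) -> str:
        entry = self.cache.get(url)
        if entry is not None and entry.expires > now:
            return entry.body  # served locally, 0 hops
        entry = self.home.handle(url, now)
        self.cache[url] = entry  # object now lives at client and home node
        return entry.body
```

With two clients sharing one home node, the second client's first request is answered from the home node's copy, so only one origin fetch occurs until that copy expires.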
Directory
  Home node keeps a directory of pointers
  Randomly redirect to delegates
  [Figure: directory request flow. (a) no directory, add new delegate; (b) cGET, not modified; (c) delegate fresh, get from delegate; (d) cGET and stale, update; (e) GET and stale, update]
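A rough sketch of the directory bookkeeping at a home node follows. The names are hypothetical, the directory size limit `max_delegates` and the oldest-first eviction are assumptions, and freshness handling (the cGET and stale cases) is left out; only the "no directory" and "redirect to a random delegate" cases are modelled.

```python
import random
from collections import defaultdict
from typing import Optional


class DirectoryHome:
    """Home node that stores pointers to delegates, not the objects."""

    def __init__(self, max_delegates: int = 4,
                 rng: Optional[random.Random] = None):
        self.max_delegates = max_delegates  # assumed limit, for illustration
        self.rng = rng or random.Random()
        self.directory = defaultdict(list)  # url -> list of delegate names

    def handle(self, url: str, client: str) -> str:
        """Return where the client should fetch from, then record the
        client as the newest delegate for this URL."""
        delegates = self.directory[url]
        candidates = [d for d in delegates if d != client]
        # No usable delegate: the client fetches from the origin server.
        source = self.rng.choice(candidates) if candidates else "origin"
        if client in delegates:
            delegates.remove(client)
        delegates.append(client)
        if len(delegates) > self.max_delegates:
            delegates.pop(0)  # drop the oldest pointer
        return source
```

Choosing the delegate at random spreads requests for a popular object across the nodes that recently fetched it.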
Evaluation Characteristics
  Compare the two schemes and a dedicated cache
  Performance
    Latency
    External bandwidth
    Hit ratio
  Overhead
    Load
    Storage
  Fault Tolerance
Bandwidth and Hit ratio
  Bytes transferred to origin servers and back, correlated with hit rate
  Centralized cache with infinite storage
  100 MB cache per node achieves optimal rates
  10 MB in-memory cache is reasonable
  Directory scheme
    Active nodes suffer from eviction
    Distributed LRU is worse than centralized
  Home-store
    More total storage required
Latency
  User-perceived time for a response
  With comparable hit ratios, only consider internal hops
  Many requests can be satisfied locally, with 0 hops
  Directory scheme latency is up to one hop greater
  Some requests can be satisfied by home node
Squirrel Latency
  Based on Pastry hops on a cache hit
  Overshadowed by the origin-server fetch on a cache miss
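To give a feel for how many internal hops a cache hit might cost, the sketch below computes the usual Pastry routing bound of ceil(log base 2^b of N) hops. The function name is hypothetical and b = 4 is assumed as a typical Pastry configuration; the hop counts actually measured in the evaluation are not reproduced here.

```python
def pastry_hop_bound(num_nodes: int, b: int = 4) -> int:
    """Smallest h such that (2**b)**h >= num_nodes, i.e. the
    ceil(log_{2^b} N) routing bound, computed with integers only."""
    hops, reach = 0, 1
    while reach < num_nodes:
        reach *= 2 ** b
        hops += 1
    return hops


# The two ends of the stated target environment.
for n in (100, 100_000):
    print(n, pastry_hop_bound(n))
```

Under these assumptions the bound is 2 hops for 100 machines and 5 hops for 100,000 machines, each hop being a low-latency transfer on the internal network.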
Load on Nodes (1/2)
  Bursty behavior observations
    Max objects served per second
    Up to 48 and 55 objects per second served for the two traces
  Directory scheme
    One delegate can get bombarded with requests from many home nodes
  Home-store scheme
    Replicate objects at request threshold
Load on Nodes (2/2)
  Sustained load measurements
    Max objects/minute
  Average load in any second or minute: 0.31 objects/minute
    Redmond trace, both models
Fault Tolerance
  Internet connection loss
  Internal partitioning
  Individual failure
    Desktop shutdown or reboot
  Graceful shutdown
    Pastry-aided content transfer
  Directory scheme
    More vulnerable to failures
Results
  The home-store model seems to outperform the directory model
    Hit ratio
    Load balancing
    Internal network latency
  Compared to centralized cache?