ICP and the Squid Web Cache
Duane Wessels
k Claffy
August 13, 1997
• Yuan Ze University Systems Laboratory, 宮春富, 2000/01/26
Outline
⊙ Introduction
⊙ Internet Cache Protocol
⊙ Implementation of ICP in Squid
⊙ ICP Delays
What is Caching
⊙ Caching has proven a useful technique for reducing the latency
end users experience on the Web.
⊙ Caching is effective because many Web documents are
requested more than once.
⊙ A cache is intermediate storage that keeps copies of popular Web
documents close to the end users.
HTTP
⊙ An HTTP request is comprised of three major parts: a request
method, a URL, and a set of request headers.
⊙ An HTTP reply consists of a numeric result code, a set of
reply headers, and an optional reply body.
⊙ GET -> download; POST -> upload.
⊙ Max-age directive: age refers to the time elapsed since the
origin server provided the data.
Cache Hierarchy
⊙ A set of child caches shares a common parent cache.
⊙ A simple hierarchy is not appropriate for all situations.
⊙ ICP provides a quick and efficient method of intercache
communication and offers a mechanism for establishing
complex cache hierarchies.
PARENT
CHILD   CHILD   CHILD
ICP Message Format
⊙ A cache will query its peers by sending each one an
ICP_QUERY message.
⊙ The peer will reply with either an ICP_HIT or ICP_MISS.
⊙ Other codes: ICP_DENIED, ICP_HIT_OBJ.
Bits 0 to 31, one 32-bit word per row:
OPCODE | VERSION | PACKET LENGTH
REQUEST NUMBER
OPTIONS
PADDING
SENDER HOST ADDRESS
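A minimal sketch of packing and parsing this header in Python. The 8/8/16-bit split of the first word, the opcode numbers, the version value, and the query payload layout (a 4-byte requester address followed by a NUL-terminated URL) are assumptions taken from RFC 2186, not from the slide; `build_query` and `parse_header` are hypothetical helper names.

```python
import socket
import struct

# Five 32-bit words in network byte order:
# opcode (8 bits), version (8), packet length (16),
# request number, options, padding, sender host address.
ICP_HEADER = struct.Struct("!BBHIII4s")

# Opcode numbers and version as assumed from RFC 2186.
ICP_QUERY, ICP_HIT, ICP_MISS = 1, 2, 3
ICP_VERSION = 2


def build_query(request_number: int, url: str, sender: str = "0.0.0.0") -> bytes:
    """Build an ICP_QUERY packet for the given URL."""
    # Assumed query payload: requester address (left as zeros), then the URL.
    payload = socket.inet_aton("0.0.0.0") + url.encode("ascii") + b"\x00"
    header = ICP_HEADER.pack(
        ICP_QUERY,
        ICP_VERSION,
        ICP_HEADER.size + len(payload),  # packet length covers header and payload
        request_number,
        0,  # options
        0,  # padding
        socket.inet_aton(sender),
    )
    return header + payload


def parse_header(packet: bytes) -> dict:
    """Decode the fixed-size header at the start of an ICP packet."""
    opcode, version, length, reqnum, options, _padding, sender = (
        ICP_HEADER.unpack_from(packet)
    )
    return {
        "opcode": opcode,
        "version": version,
        "length": length,
        "request_number": reqnum,
        "options": options,
        "sender": socket.inet_ntoa(sender),
    }
```

Because the header has a fixed size and fixed field positions, a single `unpack_from` call is enough to interpret it.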
ICP Transport
⊙ ICP could use TCP or UDP as the underlying delivery
protocol.
⊙ UDP is simpler to implement because each cache needs to
maintain only a single UDP socket.
⊙ ICP is intended as an unreliable protocol, so TCP would
actually be detrimental.
ICP vs. HTTP
⊙ One advantage: a cache can quickly parse and interpret an ICP
message.
⊙ Two disadvantages: ICP doesn't match HTTP; ICP increases
the request latency by at least the network round-trip time to a
neighbor cache.
ICP Query Algorithm
⊙ Squid supports the ability to restrict the range of ICP_QUERY
messages it will send to different peers.
⊙ The cache_host_domain option lets one specify which
domains to query for a given peer.
⊙ Another Squid configuration parameter, hierarchy_stoplist,
allows one to exclude certain requests from the ICP query
algorithm.
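The two restrictions above can be illustrated with a small Python sketch. It does not reproduce Squid's implementation: `peers_to_query` is a hypothetical function, the suffix matching for domains and the substring matching for the stoplist are assumptions, and the peer names are made up.

```python
from urllib.parse import urlsplit


def peers_to_query(url, peer_domains, stoplist):
    """Return the peers that should receive an ICP_QUERY for this URL.

    peer_domains: dict of peer name -> list of domain suffixes
                  (an empty list means the peer is queried for any domain).
    stoplist:     substrings that exclude a request from the ICP query
                  algorithm altogether.
    """
    # Requests matching the stoplist never go through ICP.
    if any(word in url for word in stoplist):
        return []

    host = urlsplit(url).hostname or ""
    selected = []
    for peer, domains in peer_domains.items():
        suffixes = [d.lstrip(".") for d in domains]
        if not suffixes or any(
            host == s or host.endswith("." + s) for s in suffixes
        ):
            selected.append(peer)
    return selected
```

For example, with `{"parent-a": [".edu"], "parent-b": []}` and a stoplist of `["cgi-bin", "?"]`, a request for `http://www.ucsd.edu/index.html` is sent to both peers, while any URL containing `cgi-bin` is sent to none.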
Processing an ICP query
⊙ Extract and parse the URL. (ICP_INVALID on failure)
⊙ Check local access controls. (ICP_DENIED if refused)
⊙ Look up the given URL. (ICP_MISS if not cached)
⊙ If the object is small enough, return an ICP_HIT_OBJ message.
⊙ Otherwise, return an ICP_HIT message.
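The sequence of checks for processing an ICP query can be sketched as one Python function. This is an illustration under assumptions: the cache is modeled as a plain dict of URL to object bytes, `is_allowed` is a hypothetical access-control callback, and `max_obj_size` stands in for whatever "small enough" means in a real configuration.

```python
from urllib.parse import urlsplit


def process_icp_query(raw_url, is_allowed, cache, max_obj_size):
    """Return the name of the ICP reply opcode for one incoming query."""
    # Step 1: extract and parse the URL.
    try:
        parts = urlsplit(raw_url)
    except ValueError:
        return "ICP_INVALID"
    if not parts.scheme or not parts.netloc:
        return "ICP_INVALID"

    # Step 2: check local access controls.
    if not is_allowed(raw_url):
        return "ICP_DENIED"

    # Step 3: look up the URL in the local cache.
    obj = cache.get(raw_url)
    if obj is None:
        return "ICP_MISS"

    # Steps 4 and 5: small objects ride along in the reply itself.
    if len(obj) <= max_obj_size:
        return "ICP_HIT_OBJ"
    return "ICP_HIT"
```

Each check returns as soon as it fails, so the cheaper tests (parsing, access control) run before the cache lookup.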
Collect ICP replies
⊙ Squid collects replies until it receives an ICP_HIT or until all
ICP_MISS replies arrive.
⊙ When it receives an ICP_HIT, Squid begins retrieving the
object from that peer.
⊙ If an ICP_HIT_OBJ reply arrives first, Squid just takes the
object data from the ICP message payload.
⊙ If no hit reply arrives, Squid retrieves the object from the parent.
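A minimal Python sketch of the reply-collection rules. It assumes replies are handed over in arrival order as `(peer, opcode, payload)` tuples and that a timeout simply ends the sequence early; `choose_source` and the tuple layout are hypothetical.

```python
def choose_source(replies, peer_count, parent):
    """Decide where to get the object from, given ICP replies in arrival order.

    Returns (action, peer, payload), where action is "payload" when the
    object came inside an ICP_HIT_OBJ reply, or "fetch" when it must be
    retrieved from the named peer.
    """
    misses = 0
    for peer, opcode, payload in replies:
        if opcode == "ICP_HIT_OBJ":
            # The object data is already in the ICP message payload.
            return ("payload", peer, payload)
        if opcode == "ICP_HIT":
            # Stop collecting and retrieve the object from this peer.
            return ("fetch", peer, None)
        if opcode == "ICP_MISS":
            misses += 1
            if misses == peer_count:
                break  # every peer has answered with a miss
    # No hit reply arrived: fall back to the parent.
    return ("fetch", parent, None)
```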
Detecting Unreachable Peers
⊙ A peer becoming unreachable would significantly increase
the chances of suffering the two-second timeout.
⊙ We designate a peer as dead when it fails to reply to 20
consecutive ICP queries.
⊙ We still send ICP_QUERY messages to dead peers, but we
don't expect to receive replies from them.
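The dead-peer rule can be sketched as a per-peer counter in Python. The threshold of 20 comes from the slide; the class and method names are hypothetical, and treating any reply as proof that the peer is alive again is an assumption.

```python
DEAD_THRESHOLD = 20  # consecutive unanswered ICP queries, per the slide


class PeerState:
    """Tracks consecutive unanswered ICP queries for one peer."""

    def __init__(self):
        self.unanswered = 0

    def record_timeout(self):
        # Called when a query to this peer received no reply.
        self.unanswered += 1

    def record_reply(self):
        # Any reply resets the counter (assumed to revive a dead peer).
        self.unanswered = 0

    @property
    def dead(self):
        return self.unanswered >= DEAD_THRESHOLD


def replies_expected(peers):
    """Queries still go to every peer; only live peers are waited for."""
    return sum(1 for p in peers if not p.dead)
```

Because dead peers are excluded from the expected-reply count, one unreachable peer no longer forces every request to wait for the full timeout.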
More Network Failure
⊙ Squid will return an ICP_MISS_NOFETCH message instead of
an ICP_MISS message.
⊙ This feature allows the parent to continue serving hits while
taking itself out of the peer selection process for misses.
[Figure: INTERNET, PARENT, ROUTER, and two CHILD caches, with links labeled A and B]
ICP_HIT_OBJ
⊙ One problem is that it makes the UDP packet considerably larger.
⊙ Another problem is that these replies require more time to generate.
⊙ The payload must consist of the URL followed by the
object data.
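A hedged Python sketch of an ICP_HIT_OBJ payload, showing why the packet grows with the object. The slide only says the payload is the URL followed by the object data; the NUL terminator after the URL, the 16-bit object size field, and the 16384-byte packet limit are assumptions taken from RFC 2186, and both function names are hypothetical.

```python
import struct

ICP_HEADER_SIZE = 20  # five 32-bit header words


def build_hit_obj_payload(url, data, max_packet=16384):
    """Return the ICP_HIT_OBJ payload, or None if the packet would be too big."""
    url_bytes = url.encode("ascii") + b"\x00"
    total = ICP_HEADER_SIZE + len(url_bytes) + 2 + len(data)
    if total > max_packet:
        return None  # caller should fall back to a plain ICP_HIT
    # URL, then a 16-bit object size, then the object data itself.
    return url_bytes + struct.pack("!H", len(data)) + data


def split_hit_obj_payload(payload):
    """Recover (url, object data) from an ICP_HIT_OBJ payload."""
    url_bytes, _, rest = payload.partition(b"\x00")
    (size,) = struct.unpack_from("!H", rest)
    return url_bytes.decode("ascii"), rest[2:2 + size]
```

Unlike an ICP_HIT, whose size depends only on the URL, this reply carries the whole object, so its size and generation cost scale with the object.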
ICP Delay
⊙ We don't claim these measurements prove that hierarchical
caching with ICP gives improved performance.
⊙ We suspect it depends on the regional and/or local network
situation.
⊙ We used a special program to alternate between sending
ICMP echo requests and ICP_QUERY messages.