A Taxonomy and Survey of Content Delivery Networks

Meng-Huan Wu 2011/10/26

Outline

• Introduction
• Request-routing mechanisms
• Content selection and delivery
• Content routing and delivery
• Caching techniques
• Conclusion & Future work
• References

Introduction

• A CDN is a collection of network elements arranged for more effective delivery of content to end-users.

• A CDN reduces the impact of the network on the response time of user requests.

• It helps a Web site withstand flash crowds (the SlashDot effect).

The three key components of a CDN architecture

• A content provider or customer is one who delegates the URI name space of the Web objects to be distributed. The origin server of the content provider holds those objects.

• A CDN provider is a proprietary organization or company that provides infrastructure facilities to content providers in order to deliver content in a timely and reliable manner.

• End-users or clients are the entities who access content from the content provider’s website.

Servers

• Origin server: the server where the definitive version of a resource resides.

• Replica server (or surrogate server): a server that holds a replica of a resource but may act as an authoritative reference for client responses.

Relationships

Abstract architecture of a Content Delivery Network (CDN)

Request-routing in a CDN environment

Content selection and delivery

Full-site content selection and delivery

[Figure with components Client, Surrogate Server, CDN, and Origin Server. Request labels: GET index.html; GET image1.gif, image2.gif. Transfer label: index.html, image1.gif, image2.gif. index.html embeds image1.gif and image2.gif.]

Partial site content selection and delivery

[Figure with components Client, Surrogate Server, CDN, and Origin Server. Request labels: GET index.html; GET image1.gif, image2.gif. Transfer label: image1.gif, image2.gif. index.html embeds image1.gif and image2.gif.]

Empirical-based approach

• In the empirical-based approach, the Web site administrator selects the content to be replicated to the edge servers, relying on heuristics to make that decision.

• The main drawback of this approach lies in the uncertainty in choosing the right heuristics.

Popularity-based approach

• In the popularity-based approach, the most popular objects are replicated to the surrogates.

• This approach is time-consuming, and reliable object request statistics are not guaranteed because the popularity of each object varies considerably.

• Moreover, such statistics are often not available for newly introduced content.
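
As a rough illustration of popularity-based selection, the sketch below counts requests in a log and picks the most requested objects for replication. The function name `select_popular`, the log format, and the sample URLs are assumptions made for this example; they do not come from the survey.

```python
from collections import Counter


def select_popular(request_log, k):
    """Return the k most requested object URLs, most popular first."""
    counts = Counter(request_log)
    return [url for url, _ in counts.most_common(k)]


# Hypothetical request log: one entry per request seen at the origin.
log = ["/a.gif", "/b.html", "/a.gif", "/c.css", "/a.gif", "/b.html"]
replicate = select_popular(log, 2)
```

An object that has just been published never appears in `log`, which is the missing-statistics problem noted above.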

Cluster-based approach

• In the cluster-based approach, Web content is grouped based on either correlation or access frequency and is replicated in units of content clusters.

Content routing and delivery

• If the local CDN server accepts a user’s request but does not have the requested content, it will perform content routing to locate and then deliver the content to the user.

The steps the CDN takes to serve a user’s request

• Step 1. Try to satisfy the user’s request using the local CDN server.

• Step 2. If step 1 fails, try to satisfy the user’s request using a CDN server inside the cluster that contains the local CDN server.

• Step 3. If step 2 fails, try to satisfy the user’s request using a CDN server inside a nearby cluster.

• Step 4. If step 3 fails, try to satisfy the user’s request using the origin server.
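
The four steps above can be sketched as a simple fallback chain. This is a minimal model, assuming each server's cache is a plain dictionary from URL to content; the function name `serve` and the sample data are hypothetical.

```python
def serve(url, local, cluster_peers, nearby_clusters, origin):
    """Return (source, content) by trying each tier in order."""
    # Step 1: the local CDN server.
    if url in local:
        return "local", local[url]
    # Step 2: other CDN servers in the same cluster.
    for peer in cluster_peers:
        if url in peer:
            return "cluster", peer[url]
    # Step 3: CDN servers in nearby clusters.
    for cluster in nearby_clusters:
        for server in cluster:
            if url in server:
                return "nearby cluster", server[url]
    # Step 4: fall back to the origin server.
    return "origin", origin[url]


origin = {"/index.html": "<html>", "/a.gif": "GIF-A", "/b.gif": "GIF-B"}
local = {"/index.html": "<html>"}
peers = [{"/a.gif": "GIF-A"}]
nearby = [[{"/b.gif": "GIF-B"}]]
```

How a server finds out which peer holds the content in steps 2 and 3 is left open here; the caching schemes on the following slides address that question.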

Caching techniques

Query-based scheme

• The most straightforward scheme is the query-based scheme, in which a CDN server broadcasts a query for the requested content to other CDN servers inside the same cluster if it does not have the content.
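
A minimal sketch of the query-based scheme follows, assuming each server's cache is a set of URLs. The class `CdnServer` and the function `broadcast_query` are hypothetical names. The message counter shows why this scheme floods the cluster: every miss costs one query per peer.

```python
class CdnServer:
    def __init__(self, name, contents=()):
        self.name = name
        self.cache = set(contents)


def broadcast_query(url, peers):
    """Ask every peer for url; return (names of holders, messages sent)."""
    messages = 0
    holders = []
    for peer in peers:
        messages += 1  # one query per peer, whether or not it has the content
        if url in peer.cache:
            holders.append(peer.name)
    return holders, messages


peers = [CdnServer("s1"), CdnServer("s2", ["/a.gif"]), CdnServer("s3")]
holders, messages = broadcast_query("/a.gif", peers)
```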

Digest-based scheme

• In order to avoid flooding queries, the digest-based scheme was proposed. Each CDN server maintains a content digest that includes the content information of the other CDN servers inside the same cluster. Once a CDN server has cached or deleted content, it notifies the other CDN servers so that they update their content digests.

• Hence, a CDN server knows where to locate the content by checking its content digest.
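
The digest-based scheme can be sketched as follows. This is an illustration only: a plain dictionary stands in for whatever compact summary a real system would use, and the names `DigestServer`, `store`, `evict`, `locate`, and `make_cluster` are assumptions.

```python
class DigestServer:
    def __init__(self, name):
        self.name = name
        self.cache = set()
        self.peers = []
        self.digest = {}  # url -> names of peers believed to hold it

    def _notify(self, url, present):
        # Tell every peer to update its digest entry for this server.
        for peer in self.peers:
            holders = peer.digest.setdefault(url, set())
            if present:
                holders.add(self.name)
            else:
                holders.discard(self.name)

    def store(self, url):
        self.cache.add(url)
        self._notify(url, True)

    def evict(self, url):
        self.cache.discard(url)
        self._notify(url, False)

    def locate(self, url):
        """Look up holders in the local digest; no query is sent."""
        return sorted(self.digest.get(url, ()))


def make_cluster(names):
    servers = [DigestServer(n) for n in names]
    for s in servers:
        s.peers = [p for p in servers if p is not s]
    return servers


a, b, c = make_cluster(["a", "b", "c"])
a.store("/a.gif")
```

Lookups are now local, but every `store` or `evict` sends one update to each peer, which is the update traffic the directory-based scheme tries to reduce.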

Directory-based scheme

• A centralized version of the digest-based scheme is the directory-based scheme, in which a directory server maintains the content information of the CDN servers inside the cluster. A CDN server only needs to notify the directory server when local updates occur, and queries the directory server when there is a local miss.

• Compared to the digest-based scheme, the update traffic is greatly reduced, but the directory server is a potential bottleneck and a single point of failure, because it must handle the update and query messages from all the cooperating CDN servers.
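
A minimal sketch of the directory-based scheme, assuming a single in-memory index; the class `DirectoryServer` and its methods are hypothetical. The `messages` counter makes the trade-off visible: each local change costs one update instead of one per peer, but every update and every query lands on the same server.

```python
class DirectoryServer:
    def __init__(self):
        self.index = {}    # url -> names of CDN servers holding it
        self.messages = 0  # every update and query is handled here

    def update(self, server, url, present):
        """Record that a server has cached (present=True) or deleted url."""
        self.messages += 1
        holders = self.index.setdefault(url, set())
        if present:
            holders.add(server)
        else:
            holders.discard(server)

    def query(self, url):
        """Called by a CDN server on a local miss."""
        self.messages += 1
        return sorted(self.index.get(url, ()))


directory = DirectoryServer()
directory.update("s1", "/a.gif", present=True)
directory.update("s2", "/a.gif", present=True)
directory.update("s1", "/a.gif", present=False)
holders = directory.query("/a.gif")
```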

Hashing-based scheme

• A more efficient scheme is the hashing-based scheme. The CDN servers inside a cluster share the same hashing function. Each content item is assigned to a designated CDN server based on the content’s URL (or other unique identification), the unique IDs (e.g., IP addresses) of the CDN servers, and the hashing function. All requests for the same content are redirected to the designated CDN server for that content.
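
One way to combine a URL, the server IDs, and a shared hash function is highest-random-weight (rendezvous) hashing, sketched below. This particular choice is an assumption made for illustration; the slide does not say which hashing function is used. The function name `designated_server` and the server addresses are hypothetical.

```python
import hashlib


def designated_server(url, server_ids):
    """Pick the server whose (server ID, URL) hash is highest."""
    def score(server_id):
        digest = hashlib.sha256(f"{server_id}|{url}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return max(server_ids, key=score)


servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
url = "http://www.foo.com/sports/highlight.mpg"
owner = designated_server(url, servers)
```

Every server that evaluates `designated_server` with the same inputs gets the same answer, so no query, digest, or directory traffic is needed to locate content.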

Semi-hashing-based scheme

• Under the semi-hashing-based scheme, a local CDN server allocates a certain portion, Plocal, of its disk space to cache the most popular contents for its local users, and the remaining portion to cooperate with other CDN servers via a hashing function.
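
The split described above can be sketched as follows, assuming cache space is counted in slots and that the cooperative part uses simple modulo hashing over the server list. Both assumptions, and the names `split_slots`, `hashed_owner`, and `where_to_look`, are illustrative only.

```python
import hashlib


def split_slots(total_slots, p_local):
    """Split cache slots into a local-popular part and a hashed part."""
    local_slots = int(total_slots * p_local)
    return local_slots, total_slots - local_slots


def hashed_owner(url, server_ids):
    """Designated server for url under a shared hash function."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return server_ids[int.from_bytes(digest[:8], "big") % len(server_ids)]


def where_to_look(url, local_popular, server_ids):
    """Serve locally popular content locally; otherwise go to the hashed owner."""
    if url in local_popular:
        return "local"
    return hashed_owner(url, server_ids)


servers = ["s1", "s2", "s3"]
local_slots, shared_slots = split_slots(8, 0.25)
```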

Cache update taxonomy

Periodic update

• The most common cache update method is the periodic update. To ensure content consistency and freshness, the content provider configures its origin Web servers to provide instructions to caches about what content is cacheable, how long different content is to be considered fresh, when to check back with the origin server for updated content, and so forth.

• With this approach, caches are updated at regular intervals. However, the update traffic generated at each interval is often unnecessary.
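
A minimal sketch of the freshness check behind periodic update, assuming the origin attaches a freshness lifetime to each object (similar in spirit to the HTTP `Cache-Control: max-age` directive). The names `CacheEntry` and `is_fresh` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    body: str
    fetched_at: float  # time of the last fetch, in seconds
    max_age: float     # freshness lifetime set by the origin, in seconds


def is_fresh(entry, now):
    """True while the entry is within its freshness lifetime."""
    return now - entry.fetched_at < entry.max_age


entry = CacheEntry(body="<html>", fetched_at=100.0, max_age=60.0)
```

Once `is_fresh` returns false the cache checks back with the origin, even if the object never changed; that is the unnecessary traffic mentioned above.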

Update propagation

• Update propagation is triggered by a change in content. It actively pushes content to the CDN cache servers: an updated version of a document is delivered to all caches whenever the document changes at the origin server.

• For frequently changing content, this approach generates excess update traffic.
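
Update propagation can be sketched as an origin that pushes every change to every cache. The class `PushOrigin` is hypothetical and caches are modeled as dictionaries. The `pushes` counter shows how traffic grows with the number of changes multiplied by the number of caches.

```python
class PushOrigin:
    def __init__(self, caches):
        self.caches = caches
        self.pushes = 0

    def publish(self, url, body):
        """Deliver the new version to all caches as soon as it changes."""
        for cache in self.caches:
            cache[url] = body
            self.pushes += 1


caches = [{}, {}]
origin = PushOrigin(caches)
for version in ("v1", "v2", "v3"):
    origin.publish("/news.html", version)
```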

On-demand update

• On-demand update is a cache update mechanism where the latest copy of a document is propagated to the surrogate cache server based on a prior request for that content. This approach follows an assume-nothing structure: content is not updated unless it is requested.

• The disadvantage of this approach is the back-and-forth traffic between the cache and the origin server needed to ensure that the delivered content is the latest.
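
A minimal sketch of on-demand update, assuming the origin tags each document with a version number that the cache can compare against. The function `get_on_demand` and the `stats` counters are hypothetical.

```python
def get_on_demand(url, cache, origin, stats):
    """Serve url, checking with the origin on every request."""
    stats["origin_checks"] += 1          # the back-and-forth traffic
    version, body = origin[url]
    cached = cache.get(url)
    if cached is None or cached[0] != version:
        cache[url] = (version, body)     # refresh only when requested
        stats["transfers"] += 1
    return cache[url][1]


origin = {"/news.html": (1, "v1")}
cache = {}
stats = {"origin_checks": 0, "transfers": 0}
first = get_on_demand("/news.html", cache, origin, stats)
second = get_on_demand("/news.html", cache, origin, stats)
origin["/news.html"] = (2, "v2")
third = get_on_demand("/news.html", cache, origin, stats)
```

The cache never holds a stale copy at delivery time, but every request costs at least one round trip to the origin.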

Invalidation

• Another cache update approach is invalidation, in which an invalidation message is sent to all surrogate caches when a document is changed at the origin server. The surrogate caches are blocked from accessing the document while it is being changed. Each cache needs to fetch an updated version of the document individually later.

• The drawback of this approach is that it does not make full use of the distribution network for content delivery, and belated fetching of content by the caches may make it inefficient to keep cached content consistent.
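
Invalidation can be sketched as follows, assuming the origin is a dictionary and each surrogate drops its copy on an invalidation message. The names `Surrogate` and `change_document` are hypothetical, and the blocking of access during the change is not modeled.

```python
class Surrogate:
    def __init__(self, origin):
        self.origin = origin
        self.store = {}
        self.fetches = 0

    def invalidate(self, url):
        """Drop the cached copy; nothing is fetched yet."""
        self.store.pop(url, None)

    def get(self, url):
        # Each surrogate fetches the new version on its own, later.
        if url not in self.store:
            self.store[url] = self.origin[url]
            self.fetches += 1
        return self.store[url]


def change_document(origin, surrogates, url, body):
    origin[url] = body
    for surrogate in surrogates:
        surrogate.invalidate(url)


origin = {"/news.html": "v1"}
s1, s2 = Surrogate(origin), Surrogate(origin)
s1.get("/news.html")
s2.get("/news.html")
change_document(origin, [s1, s2], "/news.html", "v2")
latest = s1.get("/news.html")
```

After the change, `s1` has refetched and `s2` still has no copy; each surrogate goes back to the origin separately.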

Taxonomy of request-routing mechanisms

DNS based Request-Routing

[Figure: Client (140.124.180.1), local DNS server, Akamai DNS, and two surrogates (145.155.10.15 and 58.15.100.152) in the Akamai CDN. Labels: DNS query: www.cnn.com; DNS response: 145.155.10.15; Session. Host names shown: delaware.cnn.akamai.com, california.cnn.akamai.com.]

DNS based Request-Routing

[Figure: Client (140.124.180.1), local DNS server, Akamai DNS, and two surrogates in the Akamai CDN. Labels: DNS query; DNS response; Session; Measure to Client DNS; Measurement results; Measurements.]

URL rewriting

[Figure: URL rewriting, involving the client, the origin server, the CDN’s authoritative DNS server, and a CDN server near the client.]

1. HTTP request for www.foo.com/sports/highlight.mpg => www.cdn.com/www.foo.com/sports/highlight.mpg
2. DNS query for www.cdn.com
3. HTTP request for www.cdn.com/www.foo.com/sports/highlight.mpg

Rewritten URL: http://www.foo.com/sports/highlight.mpg => http://www.cdn.com/www.foo.com/sports/highlight.mpg
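
The rewrite above can be sketched with Python's standard `urllib.parse` module. The function name `rewrite_url` is hypothetical; the host `www.cdn.com` and the sample URL are taken from the slide.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_url(url, cdn_host="www.cdn.com"):
    """Move the original host into the path and point the URL at the CDN."""
    parts = urlsplit(url)
    new_path = f"/{parts.netloc}{parts.path}"
    return urlunsplit((parts.scheme, cdn_host, new_path, parts.query, parts.fragment))


rewritten = rewrite_url("http://www.foo.com/sports/highlight.mpg")
```

Here `rewritten` is `http://www.cdn.com/www.foo.com/sports/highlight.mpg`. The CDN server can recover the original host from the first path segment.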

Content outsourcing

• Cooperative push-based:
  – This approach is based on the pre-fetching of content to the surrogates.

• Non-cooperative pull-based:
  – In this approach, client requests are directed to their closest surrogate servers.

• Cooperative pull-based:
  – The cooperative pull-based approach differs from the non-cooperative approach in the sense that surrogate servers cooperate with each other to get the requested content in case of a cache miss.

Conclusion & Future work

• Conclusion
  – CDNs offer fast and reliable applications and services
  – Reduce network impact on the response time
  – Enhance QoE

• Future work
  – Find a better approach to content placement

References

[1] A. K. Pathan and R. Buyya, “A Taxonomy and Survey of Content Delivery Networks,” Tech Report, Univ. of Melbourne, 2007.

[2] J. Ni and D. H. K. Tsang, “Large Scale Cooperative Caching and Application-level Multicast in Multimedia Content Delivery Networks,” IEEE Communications Magazine, Vol. 43, Issue 5, pp. 98-105, May 2005.

Q&A
