CSCI 599: Beyond Web Browsers

Professor Shahram Ghandeharizadeh, Computer Science Department, Los Angeles, CA 90089

Post on 02-Jan-2016

Page 1: CSCI 599:  Beyond Web Browsers

CSCI 599: Beyond Web Browsers

Professor Shahram Ghandeharizadeh
Computer Science Department
Los Angeles, CA 90089

Page 2: CSCI 599:  Beyond Web Browsers

QUIZ 1:

1. When you register your email with Google, Google emails you a key that must be included with each request to their web methods. (True)

2. The Google web service API can be invoked using ASP.NET. (True)

3. Once a client caches the results of a Google search, Google will invalidate this client’s cache when it detects updates to its information system. (False)

4. Napster employed a central server to store the index of all files available for download by a client. (True)

5. CAN assumes nodes that insert (key,value) pairs will periodically refresh their inserted entries. (True)

Page 3: CSCI 599:  Beyond Web Browsers

A Scalable Content Addressable Network (CAN)

by S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker.

Page 4: CSCI 599:  Beyond Web Browsers

CAN

CAN is composed of individual nodes.
CAN employs a hash function to insert, look up, and delete (key,value) pairs.
A node stores a chunk, termed a zone, of the entire hash table.
A node maintains information about its neighboring nodes.

Page 5: CSCI 599:  Beyond Web Browsers

EXAMPLE HASH FUNCTION

A two-dimensional hash function:
h(K) = a 6-bit unsigned integer
The low three and high three bits form the 2 dimensions of a hash index.
e.g., h(“Thriller”) = 111011
Low 3 bits = 011
High 3 bits = 111
Three bits range in value from 0 (000) to 7 (111).
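The bit split above can be sketched in a few lines of Python. This is a minimal sketch: the slide gives only the value of h(“Thriller”), not the hash function itself, so the function name split_hash is hypothetical and the 6-bit value is passed in directly.

```python
def split_hash(h: int) -> tuple[int, int]:
    """Split a 6-bit hash value into (high 3 bits, low 3 bits)."""
    if not 0 <= h < 64:
        raise ValueError("expected a 6-bit unsigned integer")
    high = h >> 3      # top three bits
    low = h & 0b111    # bottom three bits
    return high, low


# The slide's value for h("Thriller"); how h computes it is not specified.
high, low = split_hash(0b111011)
print(format(high, "03b"), format(low, "03b"))  # 111 011
```

Each half then serves as one coordinate in the range 0 to 7.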

Page 6: CSCI 599:  Beyond Web Browsers

ADDRESS SPACE

A 2-dimensional address space can be partitioned across 64 nodes.

[Figure: 8 x 8 grid; axes labeled "High bits" and "Low bits", each running from 000 to 111.]

Page 7: CSCI 599:  Beyond Web Browsers

EXAMPLE

A 2-dimensional space partitioned across six nodes.

[Figure: the grid divided into six zones owned by nodes 1 to 6; axes labeled "High bits" and "Low bits", each running from 000 to 111.]

Page 8: CSCI 599:  Beyond Web Browsers

EXAMPLE (CONT…)

h(“Thriller”) = 111011 = (111, 011) = node 6

[Figure: the six-zone grid, nodes 1 to 6; axes labeled "High bits" and "Low bits", each running from 000 to 111.]

Page 9: CSCI 599:  Beyond Web Browsers

NEIGHBORS

Two nodes are neighbors if their coordinate spans overlap along d-1 dimensions and abut along one dimension.

[Figure: six-zone layout, nodes 1 to 6.]

Page 10: CSCI 599:  Beyond Web Browsers

NEIGHBORS (CONT…)

3’s neighbors:

[Figure: six-zone layout, nodes 1 to 6.]

Page 11: CSCI 599:  Beyond Web Browsers

NEIGHBORS (CONT…)

5 is not 3’s neighbor because it does not overlap along one dimension; it only abuts along two dimensions.

[Figure: six-zone layout, nodes 1 to 6.]

Page 12: CSCI 599:  Beyond Web Browsers

NEIGHBORS (CONT…)

The coordinate space is a d-torus: it wraps around. For example, 5’s neighbors:

[Figure: six-zone layout, nodes 1 to 6.]
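The neighbor rule, including the wrap-around of the torus, can be checked mechanically. The sketch below assumes each zone is a tuple of half-open spans [lo, hi), one per dimension, on axes of 8 cells. The zone coordinates in the usage lines are hypothetical, since the layout in the slides' figure is not recoverable from the transcript.

```python
SIZE = 8  # cells per axis, matching the 3-bit coordinates 000 to 111


def overlaps(a, b):
    """True if half-open spans a and b share at least one cell."""
    return a[0] < b[1] and b[0] < a[1]


def abuts(a, b):
    """True if one span ends where the other begins, wrapping at SIZE."""
    return a[1] % SIZE == b[0] % SIZE or b[1] % SIZE == a[0] % SIZE


def are_neighbors(zone1, zone2):
    """Overlap along d-1 dimensions and abut along exactly one."""
    overlapping = abutting = 0
    for span1, span2 in zip(zone1, zone2):
        if overlaps(span1, span2):
            overlapping += 1
        elif abuts(span1, span2):
            abutting += 1
        else:
            return False  # the zones are separated along this dimension
    return abutting == 1 and overlapping == len(zone1) - 1


# Hypothetical zones, not the ones drawn on the slide.
a = ((0, 4), (0, 4))
b = ((0, 4), (4, 8))   # shares an edge with a
c = ((4, 8), (4, 8))   # touches a only at a corner
print(are_neighbors(a, b), are_neighbors(a, c))  # True False
```

A corner contact abuts along two dimensions and overlaps along none, so it fails the test, which is the same reasoning the slides apply to nodes 5 and 3.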

Page 13: CSCI 599:  Beyond Web Browsers

NEIGHBORS (CONT…)

A node maintains information about its neighbors in order to route a lookup, insert, and delete:

[Figure: six-zone layout, nodes 1 to 6.]

Page 14: CSCI 599:  Beyond Web Browsers

NODE ADDRESSING

CAN has an associated DNS domain name that resolves to the IP address of one or more CAN bootstrap nodes.

A bootstrap node maintains a partial list of CAN nodes it believes are currently in the system.

A request is routed to one of these nodes.

The contacted node applies the hash function and routes the request towards its target destination (using information about its neighbors).

Page 15: CSCI 599:  Beyond Web Browsers

EXAMPLE

A client looks up “Fragile”, h(“Fragile”) = 100010, i.e., (4,2), by contacting N5 at (7,0).
Reduce the y-value (high bits) from 7 to 4; increase the x-value (low bits) from 0 to 2.

[Figure: the six-zone grid, nodes 1 to 6; axes labeled "High bits" and "Low bits", each running from 000 to 111.]

Page 16: CSCI 599:  Beyond Web Browsers

EXAMPLE

A client looks up “Fragile”, h(“Fragile”) = 100010, by contacting N5.

[Figure: the six-zone grid, nodes 1 to 6; axes labeled "High bits" and "Low bits", each running from 000 to 111.]

Page 18: CSCI 599:  Beyond Web Browsers

EXAMPLE

A client looks up “Hey bebe!”, h(“Hey bebe!”) = 110101, by contacting N5; how is the request routed?

[Figure: the six-zone grid, nodes 1 to 6; axes labeled "High bits" and "Low bits", each running from 000 to 111.]
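One way to experiment with routing questions like this is a small greedy router. The sketch below does not reproduce the six-zone figure, whose layout is lost. It assumes instead the fully partitioned 8 x 8 space from the address-space slide, where each of the 64 cells is its own node. It also assumes each hop moves to an adjacent cell along the dimension with the larger remaining offset, and that the axes wrap. Points are written (high bits, low bits), as in the slides, and the function names are hypothetical.

```python
SIZE = 8  # cells per axis; the space wraps because it is a torus


def ring_delta(src, dst):
    """Signed shortest offset from src to dst on a ring of SIZE cells."""
    forward = (dst - src) % SIZE
    return forward if forward <= SIZE // 2 else forward - SIZE


def greedy_route(start, target):
    """Return every cell visited from start to target, one cell per hop."""
    y, x = start
    path = [start]
    while (y, x) != target:
        dy = ring_delta(y, target[0])
        dx = ring_delta(x, target[1])
        # Step along whichever dimension still has the larger offset.
        if abs(dy) >= abs(dx):
            y = (y + (1 if dy > 0 else -1)) % SIZE
        else:
            x = (x + (1 if dx > 0 else -1)) % SIZE
        path.append((y, x))
    return path


fragile = greedy_route((7, 0), (4, 2))   # h("Fragile") = 100010
hey_bebe = greedy_route((7, 0), (6, 5))  # h("Hey bebe!") = 110101
print(len(fragile) - 1, len(hey_bebe) - 1)  # 5 4
```

Under these assumptions the second lookup takes 4 hops rather than 6, because the low-bits coordinate wraps from 0 back to 7 instead of climbing from 0 to 5.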

Page 19: CSCI 599:  Beyond Web Browsers

OBSERVATIONS

Observation 1:
In a d-dimensional space, each node has 2d neighbors.
A node maintains information about its neighbors.
Thus, one may grow the number of nodes without increasing the node state.

Observation 2:
The average path length grows as O(n^{1/d}) as a function of the number of nodes, n.

Observation 3:
The path length is O(d n^{1/d}) hops for d dimensions and n nodes.
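To get a feel for these growth rates, the sketch below tabulates node state and path length for a few values of d. The slides give only the asymptotic forms. The constant factor d/4 used here comes from the CAN paper's analysis of a space partitioned into n equal zones, and the function names are hypothetical.

```python
def neighbors_per_node(d: int) -> int:
    """Per-node routing state: two neighbors along each dimension."""
    return 2 * d


def avg_path_length(n: int, d: int) -> float:
    """Average hops for n equal zones in d dimensions: (d/4) * n^(1/d)."""
    return (d / 4) * n ** (1 / d)


# Holding n fixed, more dimensions buy shorter paths at the cost of more state.
for d in (2, 4, 8):
    print(d, neighbors_per_node(d), round(avg_path_length(4096, d), 1))
```

The node state depends only on d, which is why the number of nodes can grow without growing the per-node state.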

Page 20: CSCI 599:  Beyond Web Browsers

NUMBER OF DIMENSIONS

Figure 4:
Substantial improvement in going from d=2 to 4. Beyond 4, the percentage improvement levels off.
This same observation is shown in Figure 6.

Page 21: CSCI 599:  Beyond Web Browsers

NEW NODE

CAN incorporates a new node, say N7, as follows:
N7 must find a CAN node.
N7 randomly chooses a point P that maps to a node, say N1, and sends it a join request. N1’s zone will be partitioned between N7 and N1.
N1 splits its zone in half, retains one half, and hands the other half to N7.
N7 identifies its neighbors.
Neighbors of N1 are notified to include N7 for routing.
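The zone split at the heart of the join can be sketched as follows. This assumes a zone is a tuple of half-open spans [lo, hi), one per dimension. The slides do not say which dimension is split, so it is left as a parameter. The zone in the usage lines and the function names are hypothetical.

```python
def split_zone(zone, dim):
    """Halve a zone along dimension dim; return (kept_half, handed_over_half)."""
    lo, hi = zone[dim]
    mid = (lo + hi) // 2
    if mid == lo:
        raise ValueError("zone is too small to split along this dimension")
    kept = zone[:dim] + ((lo, mid),) + zone[dim + 1:]
    handed_over = zone[:dim] + ((mid, hi),) + zone[dim + 1:]
    return kept, handed_over


def contains(zone, point):
    """True if the point falls inside the zone."""
    return all(lo <= c < hi for (lo, hi), c in zip(zone, point))


# Hypothetical zone for N1; N7 receives the half that N1 hands over.
n1_zone, n7_zone = split_zone(((0, 4), (0, 4)), dim=0)
print(n1_zone, n7_zone)
print(contains(n7_zone, (3, 1)))  # True: this point now belongs to N7
```

After the split, any (key,value) pair whose point falls in the handed-over half would move to the new node along with the zone.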

Page 22: CSCI 599:  Beyond Web Browsers

NEW NODE (Cont…)

Zone belonging to N1 is partitioned between N1 and N7.

[Figure: the six-zone grid with node 1's zone split to accommodate node 7; axes labeled "High bits" and "Low bits", each running from 000 to 111.]

Page 23: CSCI 599:  Beyond Web Browsers

Questions & Answers

Page 24: CSCI 599:  Beyond Web Browsers

NODE REMOVAL (FAILURE)

Page 25: CSCI 599:  Beyond Web Browsers

IMPROVEMENTS

Categories of improvements:

Replication: reduce path length
  Multiple realities: additional state information (Sec 3.2)
  Multiple hash functions (Sec 3.5)
  MAX replica based: additional state information (Sec 3.4)

Routing of requests: reduce path latency
  Route requests to a candidate neighbor with minimum RTT: additional state information (Sec 3.3)

Assignment of nodes to zones
  Uniform partitioning of space (Sec 3.7): load balancing
  Topologically close nodes are assigned to the same zone (Sec 3.6): reduce path latency

Page 26: CSCI 599:  Beyond Web Browsers

IMPROVEMENTS

A matrix perspective (missing: data caching)

One may consider a combination, e.g., data placement & replication.

[Matrix: goals (Reduce path length, Reduce path latency, Load balancing) against techniques (Replication, Better Routing, Data placement).]

Page 27: CSCI 599:  Beyond Web Browsers

REPLICATION: MULTIPLE REALITIES

Maintain multiple, independent coordinate spaces.
Each node is assigned a different zone in each coordinate space.
Here are two realities:

[Figure: two different six-zone layouts of nodes 1 to 6, labeled Reality-1 and Reality-2.]

Page 28: CSCI 599:  Beyond Web Browsers

MULTIPLE REALITIES

Replication increases availability of data in the presence of failures.

Figure 5:
The benefit (percentage improvement) with 2 and 3 realities is substantial. It levels off with 4 or more realities.

Figure 6:
Number of neighbors is plotted on the x-axis for both (a) d=2 & r=varying, and (b) d=varying & r=2.
To improve routing efficiency, increasing the number of dimensions is more beneficial than increasing the number of realities (given the same amount of space).
Qualitatively: additional realities provide a higher degree of data availability (in the presence of failures).
Notice the knee of both curves in Figure 6 (impact of realities & dimensions is marginal beyond a certain point).

Page 29: CSCI 599:  Beyond Web Browsers

REPLICATION: MULTIPLE HASH FUNCTIONS

Use k different hash functions to map a single key to k different points in the coordinate space.
This results in 0 to k replicas of a single key. In case of collisions to a single zone, do not construct replicas.
With a lookup, retrieve the entry from the closest node. (Alternatively, retrieve the entry from all k potential targets, consuming more bandwidth.)

Figure 7
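A rough sketch of the idea, assuming the k hash functions are derived by salting one hash with an index, and assuming "closest" means the smallest wrap-around distance in the 8 x 8 coordinate space. Neither choice comes from the slides, and the function names are hypothetical.

```python
import hashlib

SIZE = 8  # cells per axis, from the 3-bit coordinates


def hash_points(key: str, k: int) -> list[tuple[int, int]]:
    """Map one key to k points by salting SHA-256 with the index 0..k-1."""
    points = []
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
        h = digest[0] & 0b111111             # keep six bits
        points.append((h >> 3, h & 0b111))   # (high bits, low bits)
    return points


def torus_distance(a, b):
    """Sum of per-axis distances, taking the shorter way around each axis."""
    return sum(min((p - q) % SIZE, (q - p) % SIZE) for p, q in zip(a, b))


def closest_point(origin, points):
    """Pick the replica location nearest to the requesting node."""
    return min(points, key=lambda p: torus_distance(origin, p))


targets = hash_points("Thriller", k=3)
print(targets, closest_point((7, 0), targets))
```

Two of the k points may land in the same zone, which is the collision case in which no extra replica is constructed.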

Page 30: CSCI 599:  Beyond Web Browsers

CONTROLLED REPLICATION

Multiple nodes share the same zone.
These nodes are termed peers.
MAXPEERS is a system parameter to control the number of replicas.
Logically, a node has 2d(MAXPEERS) neighbors.
To maintain a fixed amount of state information per node:
  A node selects one neighbor from amongst the peers in each of its neighboring zones.

Page 31: CSCI 599:  Beyond Web Browsers

CONTROLLED REPLICATION

Neighbor selection:
  Periodically, a node requests its coordinate neighbor to transmit its peer list.
  Measure the RTT to all nodes in its neighboring zone.
  Retain the node with the lowest RTT as its neighbor.

Replication:
  Increases data availability
  Improves performance: path length, path latency, and load balancing
  Increases update overhead
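The neighbor-selection step amounts to a minimum over measured RTTs, once per neighboring zone. In the sketch below the zone names, peer names, and RTT values are all made up for illustration, and the measurement itself is assumed to have already happened.

```python
def select_neighbors(peer_rtts: dict[str, dict[str, float]]) -> dict[str, str]:
    """For each neighboring zone, retain the peer with the lowest RTT."""
    return {
        zone: min(rtts, key=rtts.get)
        for zone, rtts in peer_rtts.items()
        if rtts  # skip zones whose peer list came back empty
    }


# Hypothetical measurements in milliseconds, keyed by neighboring zone.
measured = {
    "north": {"peerA": 42.0, "peerB": 17.5},
    "east": {"peerC": 88.0},
    "south": {},
}
print(select_neighbors(measured))  # {'north': 'peerB', 'east': 'peerC'}
```

Keeping one entry per neighboring zone is what holds the per-node state fixed even though a node logically has 2d(MAXPEERS) neighbors.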

Page 32: CSCI 599:  Beyond Web Browsers

CONTROLLED REPLICATION

Table 2: per-hop latency

[Chart: Relative % Improvement (axis ticks 0 to 25) against MAXPEERS (2, 3, 4).]

Page 33: CSCI 599:  Beyond Web Browsers

Questions & Answers

Page 34: CSCI 599:  Beyond Web Browsers

BETTER ROUTING

Standard routing metric: progress towards the destination in terms of the Cartesian distance.

Better routing:
  Each node measures the network-level round-trip time (RTT) to each of its neighbors.
  For a given destination, a message is forwarded to the neighbor with the max(progress / RTT).

Table 1 (20-40% improvement)
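The max(progress / RTT) rule can be sketched as below. Progress is taken here to be the reduction in straight-line distance to the destination, and wrap-around is ignored to keep the example short. The neighbor names, coordinates, and RTT values are hypothetical.

```python
import math


def pick_next_hop(current, destination, neighbors):
    """Return the neighbor with the best progress-to-RTT ratio, or None.

    neighbors maps a name to (coordinates, rtt_ms). Neighbors that make no
    progress towards the destination are never chosen.
    """
    remaining = math.dist(current, destination)
    best_name, best_ratio = None, 0.0
    for name, (coords, rtt_ms) in neighbors.items():
        progress = remaining - math.dist(coords, destination)
        ratio = progress / rtt_ms
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name


# Hypothetical neighbors: "far" makes more progress, "near" answers faster.
neighbors = {
    "far": ((2, 0), 100.0),
    "near": ((1, 0), 10.0),
    "behind": ((-1, 0), 1.0),
}
print(pick_next_hop((0, 0), (4, 0), neighbors))  # near
```

Under the standard metric the message would go to "far"; the ratio favors "near" because its progress per millisecond is five times higher.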

Page 35: CSCI 599:  Beyond Web Browsers

Questions & Answers

Page 36: CSCI 599:  Beyond Web Browsers

Topologically sensitive routing

Assign zones to nodes in a manner that assigns neighboring zones to nodes that have a minimum RTT.
  USC’s neighboring node should be UCLA (instead of Cornell), assuming the RTT to UCLA is smaller.

How?
  Identify a well-known set of m machines as landmarks.
  Every CAN node measures its RTT to these machines and maintains a vector listing them from closest to farthest.
  m! orderings of the landmarks are possible.
  Partition the coordinate space into m! portions.
  When a new node joins, it is mapped to a portion with a matching landmark ordering.
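The mapping from RTT measurements to one of the m! portions can be sketched as follows. Numbering the portions by the lexicographic position of the ordering is an assumption made for illustration; the slides only require that nodes with the same ordering land in the same portion. The RTT values are made up.

```python
from itertools import permutations


def landmark_ordering(rtts):
    """Return landmark indices sorted from closest to farthest."""
    return tuple(sorted(range(len(rtts)), key=lambda i: rtts[i]))


def portion_index(rtts):
    """Map an RTT vector to one of the m! portions of the coordinate space."""
    orderings = list(permutations(range(len(rtts))))  # all m! orderings
    return orderings.index(landmark_ordering(rtts))


# Hypothetical RTTs in ms from a joining node to landmarks 0, 1, and 2.
rtts = [30.0, 10.0, 20.0]
print(landmark_ordering(rtts), portion_index(rtts))  # (1, 2, 0) 3
```

Two nodes that see the landmarks in the same order are likely to be close in the underlying network, so placing them in the same portion tends to make CAN neighbors topologically close.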

Page 37: CSCI 599:  Beyond Web Browsers

Topologically sensitive routing

Assuming 3 landmarks, the Cartesian space is divided into six portions: m splits along the x-axis, m-1 splits along the y-axis.

[Figure: the grid divided into six portions; axes labeled "High bits" and "Low bits", each running from 000 to 111.]

Page 38: CSCI 599:  Beyond Web Browsers

Topologically sensitive routing

Figure 8:
4 landmarks with a minimum hop distance of 5
Latency stretch = CAN network latency / average IP network latency
(2-d with landmark ordering) outperforms (4-d without landmark ordering)

Page 39: CSCI 599:  Beyond Web Browsers

Questions & Answers