Post on 20-Dec-2015
CS728
Lecture 17
Web Indexes III
• Last time
- Showed how to build indexes for graph connectivity
- Based on 2-hop covers
• Today
- Look at the more general problem of compact encodings for graphs and network problems
• Applications
- Fast queries for path information
- Routing and routing table construction
- Topology control
- Spanning trees
- Dominating sets and clustering
- Hierarchical clustering
Main Problem Considered
• Arbitrary topology
• Goal: small routing tables to find a path to the destination
• Related problem: finding the closest item of a certain type
Routing: how do I get there from here?
[Figure: network with labeled source and destination nodes]
Definitions:
Spanner: a subgraph in which the distance between any two nodes is close to their distance in the original graph
We will see that radio networks need energy-spanners, i.e., subgraphs that contain energy-efficient paths
Spanning Trees:
• minimum connected subgraph
• useful for routing
• single point of failure
• non-minimal routes
• many variants
K-Dominating Sets:
• set of nodes that are within K hops of every node
• used to define a partition of the network into zones
[Figure: 1-dominating set]
Graph Clustering:
• K-center problem (flat clustering): find k nodes (centers) that minimize the maximum distance from any node to its nearest center
• Hierarchical clustering: tree clustering with internal and border nodes and edges
Hierarchical Clustering
• The hierarchy imposes a natural addressing scheme
• Each node labeled with the path in the hierarchy tree
• Problem: give a compact labeling for a tree
– Clearly need log n bits to identify some nodes
– Need to add information about tree structure
– Complete binary tree
– Other n-node trees
• Interval labeling scheme
– Label the leaves of the tree uniquely: log n bits
– Label each internal node with the range of its descendants: 2 log n bits
– Given two nodes x, y and their labels
  • Can you test if x is an ancestor of y?
  • Can you describe the path from x to y?
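A minimal Python sketch of the interval scheme, assuming the tree is given as a dict mapping each node to its list of children (the names `interval_labels` and `is_ancestor` are illustrative, not from the lecture). Leaves are numbered left to right and each internal node stores the range of its leaves, so the ancestor test becomes range containment. The sketch assumes every internal node has at least two children; a node with a single child would receive the same range as that child.

```python
def interval_labels(children, root):
    """Return {node: (lo, hi)}; a leaf gets (i, i), an internal node the range of its leaves."""
    labels = {}
    next_leaf = 0

    def visit(node):
        nonlocal next_leaf
        kids = children.get(node, [])
        if not kids:
            # Leaf: assign the next unused leaf number.
            labels[node] = (next_leaf, next_leaf)
            next_leaf += 1
        else:
            # Internal node: range from first child's low end to last child's high end.
            ranges = [visit(kid) for kid in kids]
            labels[node] = (ranges[0][0], ranges[-1][1])
        return labels[node]

    visit(root)
    return labels


def is_ancestor(label_x, label_y):
    """True when x's range strictly contains y's range (assumes no single-child nodes)."""
    contains = label_x[0] <= label_y[0] and label_y[1] <= label_x[1]
    return contains and label_x != label_y


# Hypothetical example tree: r has children a and b; a has children c and d.
tree = {"r": ["a", "b"], "a": ["c", "d"]}
labels = interval_labels(tree, "r")
```

With these labels, testing whether x is an ancestor of y needs only the two labels and no access to the tree itself.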
• Greedy Dewey Labeling scheme
• Label each edge with small unique string
• Nodes are concatenation of edge labels
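[Figure: root v0 with children v1, v2, v3, v4 on edges labeled 00, 01, 10, ...]
Out-degree 4 requires edge labels of maximum length 2.
[Figure: root v0 with children v1 through v600, with edge labels such as 0 and 1101101110]
Out-degree 600 requires edge labels of maximum length 10.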
Theorem: An upper bound on the GDL label length of a node v, with unary delimiters, is $2(d_v + \log n)$ bits, where
- $d_v$ is the depth of v in T
- n is the number of nodes in T
• Alternative: use binary (fixed-length) delimiters for each edge
– Seems to do worse in practice
• Can remove the dependence on depth by converting encodings of long interior paths using count labels
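The labeling idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the lecture's exact encoding: the tree is a dict from node to child list, edge strings are drawn from all binary strings in order of length (a hypothetical choice), "greedy" is taken to mean that children with larger subtrees receive shorter strings, and a label is kept as a tuple of edge strings instead of a delimited bit string. All function names are made up for the example.

```python
from itertools import count, product


def short_binary_strings():
    """Yield '0', '1', '00', '01', '10', '11', '000', ... (shortest first)."""
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)


def subtree_sizes(children, root):
    """Return {node: number of nodes in its subtree}."""
    sizes = {}

    def visit(node):
        sizes[node] = 1 + sum(visit(c) for c in children.get(node, []))
        return sizes[node]

    visit(root)
    return sizes


def dewey_labels(children, root):
    """Label each node with the tuple of edge strings on its path from the root."""
    sizes = subtree_sizes(children, root)
    labels = {root: ()}
    stack = [root]
    while stack:
        node = stack.pop()
        # Assumed greedy rule: the child with the largest subtree gets the shortest string.
        ranked = sorted(children.get(node, []), key=lambda c: -sizes[c])
        for child, edge in zip(ranked, short_binary_strings()):
            labels[child] = labels[node] + (edge,)
            stack.append(child)
    return labels


def is_dewey_ancestor(label_x, label_y):
    """x is a proper ancestor of y exactly when x's label is a proper prefix of y's."""
    return len(label_x) < len(label_y) and label_y[:len(label_x)] == label_x
```

Keeping the edge strings in a tuple sidesteps the delimiter question; a bit-level encoding would need the unary or fixed-length delimiters discussed on the slide to make the prefix test unambiguous.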
Spanners and Stretch
• Stretch of a subgraph H is the maximum ratio of the distance between two nodes in H to that between them in G
– Extensively studied in the graph algorithms and graph theory literature [Eppstein 96]
• Distance stretch and topological stretch
• A spanner is a subgraph that has constant stretch
– The Delaunay triangulation yields a planar Euclidean distance-spanner
– The Yao-graph [Yao 82] is also a simple distance-spanner
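To make the definition concrete, here is a small Python sketch that computes the topological (hop-count) stretch of a subgraph H of G by running BFS from every node. It assumes both graphs are unweighted adjacency dicts over the same node set and that G is connected; `bfs_distances` and `hop_stretch` are illustrative names.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hop_stretch(adj_g, adj_h):
    """Maximum over node pairs of (distance in H) / (distance in G)."""
    worst = 1.0
    for u in adj_g:
        dist_g = bfs_distances(adj_g, u)
        dist_h = bfs_distances(adj_h, u)
        for v, d in dist_g.items():
            if v != u:
                # A pair disconnected in H counts as infinite stretch.
                worst = max(worst, dist_h.get(v, float("inf")) / d)
    return worst


# G is a 4-cycle; H drops the edge (0, 3), leaving a path 0-1-2-3.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(hop_stretch(cycle, path))  # prints 3.0: nodes 0 and 3 are 1 hop apart in G, 3 in H
```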
Energy Stretch and Energy Spanners
• Commonly adopted power attenuation model: $\text{Power Received} = \dfrac{\text{Transmit power}}{\text{distance}^{\alpha}}$
– $\alpha$ is between 2 and 4
• Assuming a uniform threshold for reception power and interference/noise levels, the energy consumed for transmitting from $u$ to $v$ needs to be proportional to $d(u,v)^{\alpha}$
• Power control: radios have the capability to adjust their power levels so as to reach the destination with the desired fidelity
• Energy consumed along a path is simply the sum of the transmission energies along the path links
• Define energy-stretch analogously to distance-stretch
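As a small worked illustration of this model, the Python sketch below sums $d(u,v)^{\alpha}$ over the links of a path, taking the proportionality constant to be 1 and nodes to be points in the plane (both assumptions made only for the example; `path_energy` is an illustrative name).

```python
import math


def path_energy(points, alpha=2.0):
    """Sum of d(u, v)**alpha over consecutive points on the path."""
    return sum(math.dist(p, q) ** alpha for p, q in zip(points, points[1:]))


direct = [(0, 0), (4, 0)]                            # one hop of length 4
relayed = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]   # four hops of length 1

print(path_energy(direct))   # prints 16.0
print(path_energy(relayed))  # prints 4.0
```

With $\alpha = 2$ the relayed path uses a quarter of the energy of the direct hop, and the gap widens as $\alpha$ grows toward 4.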
Energy-Aware Routing
• A path with many short hops consumes less energy than a path with a few large hops
– Which edges to use? (Considered in topology control)
– Can maintain “energy cost” information to find minimum-energy paths [Rodoplu-Meng 98]
• Routing to maximize network lifetime [Chang-Tassiulas 99]
– Formulate the selection of paths and power levels as an optimization problem
– Suggests the use of multiple routes between a given source-destination pair to balance energy consumption
• Energy consumption also depends on transmission rate
– Schedule transmissions lazily [Prabhakar et al 2001]
– Can split traffic among multiple routes at reduced rate [Shah-Rabaey 02]
Topology Control
• Given:
– A collection of nodes in the plane
– Transmission range of the nodes (assumed equal)
• Goal: to determine a subgraph of the transmission graph G that is
– Connected
– Low-degree
– Small stretch, hop-stretch, and power-stretch
The Yao Graph
• Divide the space around each node into sectors (cones) of angle $\theta$
• Each node has an edge to the nearest node in each sector
• Number of edges is $O(n)$
• For any edge (u,v) in the transmission graph
– There exists an edge (u,w) in the same sector such that w is closer to v than u is
• Theorem: The Yao graph has stretch $1/(1 - 2\sin(\theta/2))$
[Figure: nodes u, v, and w, with v and w in the same sector of u]
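The construction can be sketched in a few lines of Python. This is a simplified version under stated assumptions: nodes are points in the plane, cones are `num_cones` equal sectors starting at angle 0, and the transmission range is ignored (every node is treated as reachable from every other). The name `yao_graph` and its parameters are illustrative.

```python
import math


def yao_graph(points, num_cones=6):
    """Return the undirected edge set {(i, j)} with i < j of the Yao graph on the points."""
    width = 2 * math.pi / num_cones
    edges = set()
    for i, (xi, yi) in enumerate(points):
        nearest = {}  # cone index -> (distance, node index) of the nearest node seen so far
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            angle = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
            # min() guards against rounding that would push the index to num_cones.
            cone = min(int(angle / width), num_cones - 1)
            d = math.hypot(xj - xi, yj - yi)
            if cone not in nearest or d < nearest[cone][0]:
                nearest[cone] = (d, j)
        for _, j in nearest.values():
            edges.add((min(i, j), max(i, j)))
    return edges
```

Each node contributes at most `num_cones` edges, which is where the linear bound on the number of edges comes from when the cone angle is a constant.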
Dominating Set
• Applications: facility location
– A set of k-dominating centers can be selected to locate servers or copies of a distributed directory
– Dominating sets can serve as a location database for storing routing information in ad hoc networks [Liang Haas 00]
• NP-hard for general graphs
• Reduces to the minimum set cover problem
• Recall last time: greedy gives a log n approximation
• Admits a PTAS for planar graphs [Baker 94]
Greedy Algorithm
• An example
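A minimal Python sketch of the greedy heuristic for the 1-dominating set, following the set-cover view: repeatedly pick the node whose closed neighborhood covers the most still-uncovered nodes. The graph is assumed to be an undirected adjacency dict; `greedy_dominating_set` is an illustrative name, and ties are broken by dict order.

```python
def greedy_dominating_set(adj):
    """Return a list of nodes such that every node is chosen or adjacent to a chosen node."""
    uncovered = set(adj)
    chosen = []
    while uncovered:
        # Closed neighborhood of u is u together with its neighbors.
        best = max(adj, key=lambda u: len(({u} | set(adj[u])) & uncovered))
        chosen.append(best)
        uncovered -= {best} | set(adj[best])
    return chosen


# Star with center 0: the center alone dominates every node.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(greedy_dominating_set(star))  # prints [0]
```

The loop always terminates because any uncovered node covers at least itself, so each round shrinks the uncovered set.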
Hierarchical Network Decomposition
• Sparse neighborhood covers [Awerbuch-Peleg 89, Linial-Saks 92]
– Applications in location management, replicated data management, routing
– Provable guarantees, though difficult to adapt to a dynamic environment
• Routing scheme using hierarchical partitioning [Dolev et al 95]
– Adaptive to topology changes
– Weak guarantees in terms of stretch and memory per node
Sparse Neighborhood Covers
• An r-neighborhood cover is a set of overlapping clusters such that the r-zone of any node is in one of the clusters
• Aim: have covers that are low diameter and have small overlap
• Overlap is measured by the maximum number of clusters a node is in
• Tradeoff between diameter and overlap
– Set of all r-zones: diameter 2r but overlap n
– The entire network as a single cluster: overlap 1 but diameter could be n
• Sparse r-neighborhood covers with $O(r \log n)$-diameter clusters and $O(\log n)$ overlap [Peleg 89, Awerbuch-Peleg 90]
Sparse Neighborhood Covers
• Set of sparse neighborhood covers
– { $2^i$-neighborhood cover: $0 \le i \le \log n$ }
• For each node:
– For any $r$, the $r$-zone is contained within a cluster of diameter $O(r \log n)$
– The node is in $O(\log^2 n)$ clusters
• Applications:
– Tracking mobile users
– Distributed directories for replicated objects
Online Tracking of Mobile Users
• Given a fixed network with mobile users
• Need to support location query operations
• Home location register (HLR) approach:
– Whenever a user moves, the corresponding HLR is updated
– Inefficient if the user is near the seeker, yet the HLR is far
• Performance issues:
– Cost of query: ratio with “distance” between source and destination
– Cost of updating the data structure when a user moves
Mobile User Tracking: Initial Setup
• The sparse $2^i$-neighborhood cover forms a regional directory at level $i$
• At level $i$, each node u selects a home cluster that contains the $2^i$-zone of u
• Each cluster has a leader node.
• Initially, each user registers its location with the home cluster leader at each of the $O(\log n)$ levels
The Location Update Operation
• When a user X moves, X leaves a forwarding pointer at the previous host.
• User X updates its location at only a subset of home cluster leaders
– For every sequence of moves that add up to a distance of at least $2^i$, X updates its location with the leader at level $i$
• Amortized cost of an update is $O(d \log n)$ for a sequence of moves totaling distance $d$
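The lazy update rule can be sketched as a per-level distance counter. The Python below is only an illustration of which levels get refreshed after each move; it assumes a hypothetical `TrackedUser` class, resets a level's counter whenever that level is updated, and leaves out the directory, the leaders, and the forwarding pointers entirely.

```python
class TrackedUser:
    """Tracks, per level i, the distance moved since the level-i leader was last updated."""

    def __init__(self, num_levels):
        self.moved = [0.0] * num_levels

    def move(self, distance):
        """Record a move and return the list of levels whose leader must be updated."""
        updated = []
        for i in range(len(self.moved)):
            self.moved[i] += distance
            # Level i is refreshed once the accumulated distance reaches 2**i.
            if self.moved[i] >= 2 ** i:
                updated.append(i)
                self.moved[i] = 0.0
        return updated


user = TrackedUser(num_levels=4)
print(user.move(1))  # prints [0]
print(user.move(1))  # prints [0, 1]
print(user.move(2))  # prints [0, 1, 2]
```

Low levels are updated often but cheaply, while level i is touched only once per $2^i$ units of movement, which is the intuition behind the amortized bound.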
The Location Query Operation
• To locate user X, go through the levels starting from 0 until the user is located
• At level $i$, query each of the $O(\log n)$ clusters u belongs to in the $2^i$-neighborhood cover
• Follow the forwarding pointers, if necessary
• Cost of query: $O(d \log n)$, if $d$ is the distance between the querying node and the current location of the user
Comments on the Tracking Scheme
• Distributed construction of sparse covers in $O(m \log n + n \log^2 n)$ time [Awerbuch et al 93]
• The storage load for leader nodes may be excessive; use hashing to distribute the leadership role (per user) over the cluster nodes
• Distributed directories for accessing replicated objects [Awerbuch-Bartal-Fiat 96]
– Allows reads and writes on replicated objects
– An $O(\log n)$-competitive algorithm assuming each node has $O(\log n)$ times more memory than the optimal
• Unclear how to maintain sparse neighborhood covers in a dynamic network
Bubbles Routing and Partitioning Scheme
• Adaptive scheme by [Dolev et al 95]
• Hierarchical Partitioning of a spanning tree structure
• Provable bounds on efficiency for updates
[Figure: 2-level partitioning of a spanning tree, with the root marked]
Bubbles (cont.)
• Size of clusters at each level is bounded
• Cluster size grows exponentially
• # of levels equal to # of routing hops
• Tradeoff between number of routing hops and update costs
• Each cluster has a leader who has routing information
• General idea:
- route up the tree until in the same cluster as destination,
- then route down
- maintain by rebuilding/fixing things locally inside subtrees
Bubbles Algorithm
• A partition is an [x,y]-partition if all its clusters are of size between x and y
• A partition P is a refinement of another partition P’ if each cluster in P is contained in some cluster of P’.
• An (x_1, x_2, …, x_k)-hierarchical partitioning is a sequence of partitions P_1, P_2, .., P_k such that
- P_i is an [x_i, d x_i] partitioning (d is the degree)
- P_i is a refinement of P_(i-1)
• Choose x_(k+1) = 1 and x_i = x_(i+1) n^(1/k)
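Unrolling this recurrence gives x_i = n^((k+1-i)/k), so x_1 = n and the lower bounds on cluster size shrink by a factor of n^(1/k) per level. A short Python sketch that tabulates the bounds for given n and k (the function name is illustrative, and the values are left as floats without rounding):

```python
def cluster_size_bounds(n, k):
    """Return {i: x_i} for i = 1..k+1 with x_(k+1) = 1 and x_i = x_(i+1) * n**(1/k)."""
    sizes = {k + 1: 1.0}
    for i in range(k, 0, -1):
        sizes[i] = sizes[i + 1] * n ** (1.0 / k)
    return sizes


# Example with n = 1000 nodes and k = 3 levels: x_1 is about 1000, x_2 about 100, x_3 about 10.
bounds = cluster_size_bounds(1000, 3)
```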
Clustering Construction
• Build a spanning tree, say, using BFS
• Let P_1 be the cluster consisting of the entire tree
• Partition P_1 into clusters, resulting in P_2
• Recursively partition each cluster
• Maintenance rules:
- when a new node is added, try to include in existing cluster, else split cluster
- when a node is removed, if necessary combine clusters
Performance Bounds
• memory requirement
• adaptability
• k hops during routing
• matching lower bound for bounded degree graphs
• Note: Bubbles does not provide a non-trivial upper bound on stretch in the non-hop model
kk nd /123
nkdn k log/11