distributed systems principles and paradigms minqi zhou [email protected] chapter 05: naming

65
Distributed Systems Principles and Paradigms Minqi Zhou [email protected] Chapter 05: Naming

Upload: roger-evan-mcgee

Post on 02-Jan-2016

219 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Distributed SystemsPrinciples and Paradigms

Minqi Zhou

[email protected]

Chapter 05: Naming

Page 2: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming things

• User names– Login, email

• Machine names– rlogin, email, web

• Files• Devices• Variables in programs• Network services

2

Page 3: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming Service

Allows you to look up names– Often returns an address as a response

Might be implemented as– Search through file– Client-server program– Database query– …

3

Page 4: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

What’s a name?

Name: identifies what you want

Address: identifies where it is

Route: identifies how to get there

Binding: associates a name with an address– “choose a lower-level-implementation for

a higher-level semantic construct”

4RFC 1498: Inter-network Naming, addresses, routing

Page 5: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Names

Need names for:– Services: e.g., time of day– Nodes: computer that can run services– Paths: route– Objects within service: e.g. files on a file server

Naming convention can take any format– Ideally one that will suit application and user– E.g., human readable names for humans, binary

identifiers for machines

5

Page 6: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Uniqueness of names

Easy on a small scale

Problematic on a large scale

Hierarchy allows uniqueness to be maintained

compound name: set of atomic names connected with a name separator

6

Page 7: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Terms: Naming convention

Naming system determines syntax for a name

–Unix file names:Parse components from left to right separated by //home/paul/src/gps/gui.c

– Internet domain names:Ordered right to left and delimited by .www.cs.rutgers.edu

– LDAP namesAttribute/value pairs ordered right to left, delimited by , cn=Paul Krzyzanowski, o=Rutgers, c=US

7

Page 8: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Terms: Context

– A particular set of name object bindings

– Each context has an associated naming convention

– A name is always interpreted relative to some context• E.g., directory /usr in the UNIX file system

8

Page 9: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Terms: Naming System

Connected set of contexts of the same type (same naming convention) along with a common set of operations

For example:– System that implements DNS– System that implements LDAP

9

Page 10: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Terms: Name space

Set of names in the naming system

For example,– Names of all files and

directories in a UNIX file system

10

Page 11: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Terms: Resolution

Name lookup

–Return the underlying representation of the name

For example,–www.rutgers.edu 128.6.4.5

11

Page 12: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.1 Naming Entities

Naming Entities

Names, identifiers, and addressesName resolutionName space implementation

3 / 38

Page 13: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.1 Naming Entities

Naming

Essence

Names are used to denote entities in a distributed system. To operateon an entity, we need to access it at an access point. Access pointsare entities that are named by means of an address.

Note

A location-independent name for an entity E, is independent from theaddresses of the access points offered by E.

4 / 38

Page 14: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.1 Naming Entities

Identifiers

Pure name

A name that has no meaning at all; it is just a random string. Purenames can be used for comparison only.

Identifier

A name having the following properties:

P1: Each identifier refers to at most one entityP2: Each entity is referred to by at most one identifierP3: An identifier always refers to the same entity (prohibits reusingan identifier)

Observation

An identifier need not necessarily be a pure name, i.e., it may havecontent.

5 / 38

Page 15: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Flat naming

Problem

Given an essentially unstructured name (e.g., an identifier), how canwe locate its associated access point?

Simple solutions (broadcasting)Home-based approachesDistributed Hash Tables (structured P2P)Hierarchical location service

6 / 38

Page 16: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Simple solutions

Broadcasting

Broadcast the ID, requesting the entity to return its current address.

Can never scale beyond local-area networks

Requires all processes to listen to incoming location requests

Forwarding pointers

When an entity moves, it leaves behind a pointer to next location

Dereferencing can be made entirely transparent to clients by simplyfollowing the chain of pointers

Update a client’s reference when present location is found

Geographical scalability problems (for which separate chain reductionmechanisms are needed):

Long chains are not fault tolerantIncreased network latency at dereferencing

7 / 38

Page 17: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Simple solutions

Broadcasting

Broadcast the ID, requesting the entity to return its current address.

Can never scale beyond local-area networks

Requires all processes to listen to incoming location requests

Forwarding pointers

When an entity moves, it leaves behind a pointer to next location

Dereferencing can be made entirely transparent to clients by simplyfollowing the chain of pointers

Update a client’s reference when present location is found

Geographical scalability problems (for which separate chain reductionmechanisms are needed):

Long chains are not fault tolerantIncreased network latency at dereferencing

7 / 38

Page 18: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Home-based approaches

Single-tiered scheme

Let a home keep track of where the entity is:

Entity’s home address registered at a naming serviceThe home registers the foreign address of the entityClient contacts the home first, and then continues with foreignlocation

8 / 38

Page 19: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming

Home-based approaches

Host's home

5.2 Flat Naming

location 1. Send packet to host at its home

2. Return addressof current location

Client'slocation

3. Tunnel packet tocurrent location

4. Send successive packetsto current location

Host's present location

9 / 38

Page 20: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Home-based approaches

Two-tiered scheme

Keep track of visiting entities:

Check local visitor register firstFall back to home location if local lookup fails

Problems with home-based approaches

Home address has to be supported for entity’s lifetimeHome address is fixed ⇒ unnecessary burden when the entitypermanently movesPoor geographical scalability (entity may be next to client)

Question

How can we solve the “permanent move” problem?

10 / 38

Page 21: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Home-based approaches

Two-tiered scheme

Keep track of visiting entities:

Check local visitor register firstFall back to home location if local lookup fails

Problems with home-based approaches

Home address has to be supported for entity’s lifetimeHome address is fixed ⇒ unnecessary burden when the entitypermanently movesPoor geographical scalability (entity may be next to client)

Question

How can we solve the “permanent move” problem?

10 / 38

Page 22: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Home-based approaches

Two-tiered scheme

Keep track of visiting entities:

Check local visitor register firstFall back to home location if local lookup fails

Problems with home-based approaches

Home address has to be supported for entity’s lifetimeHome address is fixed ⇒ unnecessary burden when the entitypermanently movesPoor geographical scalability (entity may be next to client)

Question

How can we solve the “permanent move” problem?

10 / 38

Page 23: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Distributed Hash Tables (DHT)

Chord

Consider the organization of many nodes into a logical ring

Each node is assigned a random m-bit identifier.Every entity is assigned a unique m-bit key.Entity with key k falls under jurisdiction of node with smallestid ≥ k (called its successor).

Nonsolution

Let node id keep track of succ(id) and start linear search along thering.

11 / 38

Page 24: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

DHTs: Finger tables

Principle

Each node p maintains a finger table FTp [] with at most m entries:

FTp [i] = succ(p + 2i−1 )

Note: FTp [i] points to the first node succeeding p by at least 2i−1 .To look up a key k, node p forwards the request to node with indexj satisfying

q = FTp [j] ≤ k < FTp [j + 1]

If p < k < FTp [1], the request is also forwarded to FTp [1]

12 / 38

Page 25: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

12

Naming 5.2 Flat Naming

DHTs: Finger tables4 Finger table4

Actual node

3031 0 1

3

45

2

9918

i

1

su

9

cc(p

+

i-1 )2

12345

111414 27

28

29

Resolve k = 12

3

4

5

2345

991420

26

25

from node 28 6

7

24 812

1111

23 Resolve k = 26from node 1

9 345

141828

22 101 282345

282819

123

21

212828

20

19

1817 16 15

14

13

1

12

18

11 12345

1414182028

4 28 2 185 4 1

23

202028

345

1828

145

284

13 / 38

Page 26: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Exploiting network proximity

Problem

The logical organization of nodes in the overlay may lead to erratic messagetransfers in the underlying Internet: node k and node succ(k + 1) may bevery far apart.

Topology-aware node assignment: When assigning an ID to a node, makesure that nodes close in the ID space are also close in the network. Canbe very difficult.

Proximity routing: Maintain more than one possible successor, and forward tothe closest.Example: in Chord FTp [i] points to first node inINT = [p + 2i−1 , p + 2i − 1]. Node p can also store pointers to othernodes in INT .

Proximity neighbor selection: When there is a choice of selecting who yourneighbor will be (not in Chord), pick the closest one.

14 / 38

Page 27: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Exploiting network proximity

Problem

The logical organization of nodes in the overlay may lead to erratic messagetransfers in the underlying Internet: node k and node succ(k + 1) may bevery far apart.

Topology-aware node assignment: When assigning an ID to a node, makesure that nodes close in the ID space are also close in the network. Canbe very difficult.

Proximity routing: Maintain more than one possible successor, and forward tothe closest.Example: in Chord FTp [i] points to first node inINT = [p + 2i−1 , p + 2i − 1]. Node p can also store pointers to othernodes in INT .

Proximity neighbor selection: When there is a choice of selecting who yourneighbor will be (not in Chord), pick the closest one.

14 / 38

Page 28: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Exploiting network proximity

Problem

The logical organization of nodes in the overlay may lead to erratic messagetransfers in the underlying Internet: node k and node succ(k + 1) may bevery far apart.

Topology-aware node assignment: When assigning an ID to a node, makesure that nodes close in the ID space are also close in the network. Canbe very difficult.

Proximity routing: Maintain more than one possible successor, and forward tothe closest.Example: in Chord FTp [i] points to first node inINT = [p + 2i−1 , p + 2i − 1]. Node p can also store pointers to othernodes in INT .

Proximity neighbor selection: When there is a choice of selecting who yourneighbor will be (not in Chord), pick the closest one.

14 / 38

Page 29: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Exploiting network proximity

Problem

The logical organization of nodes in the overlay may lead to erratic messagetransfers in the underlying Internet: node k and node succ(k + 1) may bevery far apart.

Topology-aware node assignment: When assigning an ID to a node, makesure that nodes close in the ID space are also close in the network. Canbe very difficult.

Proximity routing: Maintain more than one possible successor, and forward tothe closest.Example: in Chord FTp [i] points to first node inINT = [p + 2i−1 , p + 2i − 1]. Node p can also store pointers to othernodes in INT .

Proximity neighbor selection: When there is a choice of selecting who yourneighbor will be (not in Chord), pick the closest one.

14 / 38

Page 30: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

Hierarchical Location Services (HLS)

Basic idea

Build a large-scale search tree for which the underlying network isdivided into hierarchical domains. Each domain is represented by aseparate directory node.

The root directorynode dir(T)

Top-leveldomain T

Directory node

dir(S) of domain S

A subdomain Sof top-level domain T(S is contained in T)

A leaf domain, contained in S

15 / 38

Page 31: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

HLS: Tree organization

Invariants

Address of entity E is stored in a leaf or intermediate node

Intermediate nodes contain a pointer to a child iff the subtree rooted atthe child stores an address of the entity

The root knows about all entitiesField with no data

Field for domaindom(N) withpointer to N

Location recordfor E at node M

M

N

Location recordwith only one field,containing an address

Domain D1Domain D2

16 / 38

Page 32: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.2 Flat Naming

HLS: Lookup operation

Basic principles

Start lookup at local leaf node

Node knows about E ⇒ follow downward pointer, else go up

Upward lookup always stops at root

Node knowsabout E, so request

Node has norecord for E, sothat request isforwarded toparent

Look-uprequest

is forwarded to child

M

Domain D

17 / 38

Page 33: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

HLS: Insert operation

Node knows

Naming 5.2 Flat Naming

Node has no about E, so requestrecord for E,so request isforwardedto parent

Insertrequest

is no longer forwarded

M

Domain D

(a)

Node creates record

and stores pointer

MNode createsrecord andstores address

(b)

18 / 38

Page 34: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name space

Essence

A graph in which a leaf node represents a (named) entity. A directory node isan entity that refers to other nodes.

Data stored in n1

n2: "elke" homen0

keys

n3: "max"n4: "steen"

elke

n1

maxsteen

n5"/keys""/home/steen/keys"

n2 n3 n4 keys

Leaf node

Directory node.twmrc mbox

"/home/steen/mbox"

Note

A directory node contains a (directory) table of (edge label, node identifier)pairs.

19 / 38

Page 35: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name space

Observation

We can easily store all kinds of attributes in a node, describing aspectsof the entity the node represents:

Type of the entityAn identifier for that entityAddress of the entity’s locationNicknames...

Note

Directory nodes can also have attributes, besides just storing adirectory table with (edge label, node identifier) pairs.

20 / 38

Page 36: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name space

Observation

We can easily store all kinds of attributes in a node, describing aspectsof the entity the node represents:

Type of the entityAn identifier for that entityAddress of the entity’s locationNicknames...

Note

Directory nodes can also have attributes, besides just storing adirectory table with (edge label, node identifier) pairs.

20 / 38

Page 37: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name resolution

Problem

To resolve a name we need a directory node. How do we actually find that(initial) node?

Closure mechanism

The mechanism to select the implicit context from which to start nameresolution:

www.cs.vu.nl: start at a DNS name server

/home/steen/mbox: start at the local NFS file server (possible recursivesearch)

0031204447784: dial a phone number

130.37.24.8: route to the VU’s Web server

Question

Why are closure mechanisms always implicit?

21 / 38

Page 38: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name resolution

Problem

To resolve a name we need a directory node. How do we actually find that(initial) node?

Closure mechanism

The mechanism to select the implicit context from which to start nameresolution:

www.cs.vu.nl: start at a DNS name server

/home/steen/mbox: start at the local NFS file server (possible recursivesearch)

0031204447784: dial a phone number

130.37.24.8: route to the VU’s Web server

Question

Why are closure mechanisms always implicit?

21 / 38

Page 39: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name resolution

Problem

To resolve a name we need a directory node. How do we actually find that(initial) node?

Closure mechanism

The mechanism to select the implicit context from which to start nameresolution:

www.cs.vu.nl: start at a DNS name server

/home/steen/mbox: start at the local NFS file server (possible recursivesearch)

0031204447784: dial a phone number

130.37.24.8: route to the VU’s Web server

Question

Why are closure mechanisms always implicit?

21 / 38

Page 40: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name linking

Hard link

What we have described so far as a path name: a name that isresolved by following a specific path in a naming graph from one nodeto another.

22 / 38

Page 41: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name linking

Soft link

Allow a node O to contain a name of another node:

First resolve O’s name (leading to O)Read the content of O, yielding nameName resolution continues with name

Observations

The name resolution process determines that we read the contentof a node, in particular, the name in the other node that we needto go to.One way or the other, we know where and how to start nameresolution given name

23 / 38

Page 42: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name linking

Soft link

Allow a node O to contain a name of another node:

First resolve O’s name (leading to O)Read the content of O, yielding nameName resolution continues with name

Observations

The name resolution process determines that we read the contentof a node, in particular, the name in the other node that we needto go to.One way or the other, we know where and how to start nameresolution given name

23 / 38

Page 43: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

n1

Name linking

Naming 5.3 Structured Naming

Data stored in n1

n2: "elke" homen0

keys

n3: "max"n4: "steen" n5 "/keys"

elkemax

steen

Leaf noden2 n3

.twmrc

n4

mbox keys

Data stored in n6

"/keys"Directory node

n6 "/home/steen/keys"

Observation

Node n5 has only one name

24 / 38

Page 44: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name-space implementation

Basic issue

Distribute the name resolution process as well as name space managementacross multiple machines, by distributing nodes of the naming graph.

Distinguish three levels

Global level: Consists of the high-level directory nodes. Main aspect isthat these directory nodes have to be jointly managed by differentadministrations

Administrational level: Contains mid-level directory nodes that can begrouped in such a way that each group can be assigned to a separateadministration.

Managerial level: Consists of low-level directory nodes within a singleadministration. Main issue is effectively mapping directory nodes to localname servers.

25 / 38

Page 45: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name-space implementation

Basic issue

Distribute the name resolution process as well as name space managementacross multiple machines, by distributing nodes of the naming graph.

Distinguish three levels

Global level: Consists of the high-level directory nodes. Main aspect isthat these directory nodes have to be jointly managed by differentadministrations

Administrational level: Contains mid-level directory nodes that can begrouped in such a way that each group can be assigned to a separateadministration.

Managerial level: Consists of low-level directory nodes within a singleadministration. Main issue is effectively mapping directory nodes to localname servers.

25 / 38

Page 46: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name-space implementation

Basic issue

Distribute the name resolution process as well as name space managementacross multiple machines, by distributing nodes of the naming graph.

Distinguish three levels

Global level: Consists of the high-level directory nodes. Main aspect isthat these directory nodes have to be jointly managed by differentadministrations

Administrational level: Contains mid-level directory nodes that can begrouped in such a way that each group can be assigned to a separateadministration.

Managerial level: Consists of low-level directory nodes within a singleadministration. Main issue is effectively mapping directory nodes to localname servers.

25 / 38

Page 47: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name-space implementation

Basic issue

Distribute the name resolution process as well as name space managementacross multiple machines, by distributing nodes of the naming graph.

Distinguish three levels

Global level: Consists of the high-level directory nodes. Main aspect isthat these directory nodes have to be jointly managed by differentadministrations

Administrational level: Contains mid-level directory nodes that can begrouped in such a way that each group can be assigned to a separateadministration.

Managerial level: Consists of low-level directory nodes within a singleadministration. Main issue is effectively mapping directory nodes to localname servers.

25 / 38

Page 48: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name-space implementation

Basic issue

Distribute the name resolution process as well as name space managementacross multiple machines, by distributing nodes of the naming graph.

Distinguish three levels

Global level: Consists of the high-level directory nodes. Main aspect isthat these directory nodes have to be jointly managed by differentadministrations

Administrational level: Contains mid-level directory nodes that can begrouped in such a way that each group can be assigned to a separateadministration.

Managerial level: Consists of low-level directory nodes within a singleadministration. Main issue is effectively mapping directory nodes to localname servers.

25 / 38

Page 49: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Name-space implementation

Globallayer

sun

com edu

yale

gov mil

acm

org net

ieee

jp

ac

us

co

nl

oce vu

Adminis-eng cs eng jack jill keio nec cs

trationallayer

ai lindacs csl

ftp www

pc24

Mana-

robot pub

globe

geriallayer Zone

index.txt

26 / 38

Page 50: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

1

3

4

Naming 5.3 Structured Naming

Name-space implementation

Item

2

5

6

Global

Worldwide

Few

Seconds

Lazy

Many

Yes

Administrational

Organization

Many

Milliseconds

Immediate

None or few

Yes

Managerial

Department

Vast numbers

Immediate

Immediate

None

Sometimes

1: Geographical scale

2: # Nodes

3: Responsiveness

4: Update propagation

5: # Replicas

6: Client-side caching?

27 / 38

Page 51: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

1

2

3

nl

ftp

Naming 5.3 Structured Naming

Iterative name resolution

resolve(dir,[name1,...,nameK]) sent to Server0 responsible for dir

Server0 resolves resolve(dir,name1) → dir1, returning the identification(address) of Server1, which stores dir1.

Client sends resolve(dir1,[name2,...,nameK]) to Server1, etc.

1. <nl,vu,cs,ftp>

2. #<nl>, <vu,cs,ftp>

3. <vu,cs,ftp>

Root

name server

Name server

Client'snameresolver

4. #<vu>, <cs,ftp>

5. <cs,ftp>

6. #<cs>, <ftp>

7. <ftp>

8. #<ftp>

nl node

Name servervu node

Name servercs node

vu

cs

<nl,vu,cs,ftp> #<nl,vu,cs,ftp>Nodes aremanaged bythe same server

28 / 38

Page 52: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

1

2

3

Naming 5.3 Structured Naming

Recursive name resolution

resolve(dir,[name1,...,nameK]) sent to Server0 responsible for dir

Server0 resolves resolve(dir,name1) → dir1, and sendsresolve(dir1,[name2,...,nameK]) to Server1, which stores dir1.

Server0 waits for result from Server1, and returns it to client.

1. <nl,vu,cs,ftp>Root

8. #<nl,vu,cs,ftp> name server 2. <vu,cs,ftp>

Client's

7. #<vu,cs,ftp> Name servernl node 3. <cs,ftp>

nameresolver 6. #<cs,ftp>

5. #<ftp>

Name servervu node

Name servercs node

4. <ftp>

<nl,vu,cs,ftp> #<nl,vu,cs,ftp>

29 / 38

Page 53: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Caching in recursive name resolution

Server

for node

cs

vu

Should

resolve

<ftp>

<cs,ftp>

Looks up

#<ftp>

#<cs>

Passes to

child

<ftp>

Receives

and caches

#<ftp>

Returns

to requester

#<ftp>

#<cs>

#<cs, ftp>

nl <vu,cs,ftp> #<vu> <cs,ftp> #<cs> #<vu>

#<cs,ftp> #<vu,cs>

#<vu,cs,ftp>

root <nl,vu,cs,ftp> #<nl> <vu,cs,ftp> #<vu> #<nl>#<vu,cs>

#<vu,cs,ftp>

#<nl,vu>

#<nl,vu,cs>

#<nl,vu,cs,ftp>

30 / 38

Page 54: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Scalability issues

Size scalability

We need to ensure that servers can handle a large number of requests pertime unit ⇒ high-level servers are in big trouble.

Solution

Assume (at least at global and administrational level) that content of nodeshardly ever changes. We can then apply extensive replication by mappingnodes to multiple servers, and start name resolution at the nearest server.

Observation

An important attribute of many nodes is the address where the representedentity can be contacted. Replicating nodes makes large-scale traditionalname servers unsuitable for locating mobile entities.

31 / 38

Page 55: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Scalability issues

Size scalability

We need to ensure that servers can handle a large number of requests pertime unit ⇒ high-level servers are in big trouble.

Solution

Assume (at least at global and administrational level) that content of nodeshardly ever changes. We can then apply extensive replication by mappingnodes to multiple servers, and start name resolution at the nearest server.

Observation

An important attribute of many nodes is the address where the representedentity can be contacted. Replicating nodes makes large-scale traditionalname servers unsuitable for locating mobile entities.

31 / 38

Page 56: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Scalability issues

Size scalability

We need to ensure that servers can handle a large number of requests pertime unit ⇒ high-level servers are in big trouble.

Solution

Assume (at least at global and administrational level) that content of nodeshardly ever changes. We can then apply extensive replication by mappingnodes to multiple servers, and start name resolution at the nearest server.

Observation

An important attribute of many nodes is the address where the representedentity can be contacted. Replicating nodes makes large-scale traditionalname servers unsuitable for locating mobile entities.

31 / 38

Page 57: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Scalability issues

Geographical scalability

We need to ensure that the name resolution process scales across largegeographical distances.

Recursive name resolution

R1

I1Name server

nl node

Client

Iterative name resolution

I2

I3

Name servervu node

Name servercs node

R2

R3

Long-distance communication

Problem

By mapping nodes to servers that can be located anywhere, we introduce animplicit location dependency.

32 / 38

Page 58: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Example: Decentralized DNS

Basic idea

Take a full DNS name, hash into a key k , and use a DHT-based system toallow for key lookups. Main drawback: You can’t ask for all nodes in asubdomain (but very few people were doing this anyway).

Information in a nodeSOA

A

MX

SRV

NS

CNAME

PTR

HINFO

TXT

Zone

Host

Domain

Domain

Zone

Node

Host

Host

Any kind

Holds info on the represented zone

IP addr. of host this node represents

Mail server to handle mail for this node

Server handling a specific service

Name server for the represented zone

Symbolic link

Canonical name of a host

Info on this host

Any info considered useful

33 / 38

Page 59: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

DNS on Pastry

Pastry

DHT-based system that works with prefixes of keys. Consider a system inwhich keys come from a 4-digit number space. A node with ID 3210 keepstrack of the following nodes

nk

n0

n2

n31

n320

n323

prefix of ID(nk )02

31320323

nk

n1

n30

n33

n322

prefix of ID(nk )1

3033322

Note

Node 3210 is responsible for handling keys with prefix 321. If it receives a

request for key 3012, it will forward the request to node n30 . For DNS: A noderesponsible for key k stores DNS records of names with hash value k.

34 / 38

Page 60: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Replication of records

Definition

Replicated at level i – record is replicated to all nodes with i matchingprefixes. Note: # hops for looking up record at level i is generally i.

Observation

Let xi denote the fraction of most popular DNS names of which therecords should be replicated at level i, then:

xi =d i (log N − C)

1 + d + · · · + d log N−1

1/(1−α)

with N is the total number of nodes, d = b(1−α)/α and α ≈ 1, assumingthat popularity follows a Zipf distribution:The frequency of the n-th ranked item is proportional to 1/nα

35 / 38

Page 61: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.3 Structured Naming

Replication of records

Meaning

If you want to reach an average of C = 1 hops when looking up a DNSrecord, then with b = 4, α = 0.9, N = 10, 000 and 1, 000, 000 recordsthat

61 most popular records should bereplicated at level 0

284 next most popular records at level 11323 next most popular records at level 26177 next most popular records at level 3

28826 next most popular records at level 4134505 next most popular records at level 5the rest should not be replicated

36 / 38

Page 62: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.4 Attribute-Based Naming

Attribute-based naming

Observation

In many cases, it is much more convenient to name, and look upentities by means of their attributes ⇒ traditional directory services(aka yellow pages).

Problem

Lookup operations can be extremely expensive, as they require tomatch requested attribute values, against actual attribute values ⇒inspect all entities (in principle).

Solution

Implement basic directory service as database, and combine withtraditional structured naming system.

37 / 38

Page 63: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.4 Attribute-Based Naming

Attribute-based naming

Observation

In many cases, it is much more convenient to name, and look upentities by means of their attributes ⇒ traditional directory services(aka yellow pages).

Problem

Lookup operations can be extremely expensive, as they require tomatch requested attribute values, against actual attribute values ⇒inspect all entities (in principle).

Solution

Implement basic directory service as database, and combine withtraditional structured naming system.

37 / 38

Page 64: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.4 Attribute-Based Naming

Attribute-based naming

Observation

In many cases, it is much more convenient to name, and look upentities by means of their attributes ⇒ traditional directory services(aka yellow pages).

Problem

Lookup operations can be extremely expensive, as they require tomatch requested attribute values, against actual attribute values ⇒inspect all entities (in principle).

Solution

Implement basic directory service as database, and combine withtraditional structured naming system.

37 / 38

Page 65: Distributed Systems Principles and Paradigms Minqi Zhou mqzhou@sei.ecnu.edu.cn Chapter 05: Naming

Naming 5.4 Attribute-Based Naming

Example: LDAP

C = NL

O = Vrije Universiteit

OU = Comp. Sc.

CN = Main server

N

Host_Name = star Host_Name = zephyr

Attribute

Country

Locality

Organization

OrganizationalUnit

CommonName

Host Name

Host Address

Value

NL

Amsterdam

Vrije Universiteit

Comp. Sc.

Main server

star

192.31.231.42

Attribute

Country

Locality

Organization

OrganizationalUnit

CommonName

Host Name

Host Address

Value

NL

Amsterdam

Vrije Universiteit

Comp. Sc.

Main server

zephyr

137.37.20.10

answer =search("&(C = NL) (O = Vrije Universiteit) (OU = *) (CN = Main server)")

38 / 38