
Internet-Based TSP Computation with Javelin++

Michael Neary & Peter Cappello

Computer Science, UCSB


Introduction: Goals

• Service parallel applications that are:
– Large: too big for a cluster
– Coarse-grain: to hide communication latency
• Simplicity of use
– Design focus: decomposition [composition] of computation.
• Scalable high performance
– despite large communication latency
• Fault-tolerance
– 1000s of hosts, each dynamically [dis]associates.


Introduction: Some Related Work


Introduction: Some Applications

• Search for extra-terrestrial life
• Computer-generated animation
• Computer modeling of drugs for:
– Influenza
– Cancer
– Reducing chemotherapy’s side-effects
• Financial modeling
• Storing nuclear waste


Outline

• Architecture

• Model of Computation

• API

• Scalable Computation

• Experimental Results

• Conclusions & Future Work


Architecture: Basic Components

Brokers

Clients

Hosts

Architecture: Broker Discovery

[Figure: animation over five slides showing brokers (B), a host (H), and the Broker Naming System; one frame is labeled PING(BID?).]
Architecture: Network of Broker-Managed Host Trees

• Each broker manages a tree of hosts
• Brokers form a network
• Client contacts broker
• Client gets host trees


Scalable Computation: Deterministic Work-Stealing Scheduler

[Figure: a HOST holding a task container with three operations.]

• addTask( task )
• getTask( )
• stealTask( )
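The task container on this slide can be sketched as a double-ended queue. This is only an illustration: the slides name the three operations but do not say which end each one uses. The sketch assumes the common work-stealing convention, where the owning host adds and takes tasks at the front and a thief takes the oldest task from the back. The class name `TaskContainer` and the use of `synchronized` are assumptions, not Javelin++ code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a per-host task container (not the Javelin++ class).
class TaskContainer<T> {
    private final Deque<T> deque = new ArrayDeque<>();

    // The owning host adds new tasks at the front.
    synchronized void addTask(T task) {
        deque.addFirst(task);
    }

    // The owning host takes its newest task; returns null when empty.
    synchronized T getTask() {
        return deque.pollFirst();
    }

    // Another host steals the oldest task; returns null when empty.
    synchronized T stealTask() {
        return deque.pollLast();
    }

    public static void main(String[] args) {
        TaskContainer<String> container = new TaskContainer<>();
        container.addTask("a");
        container.addTask("b");
        container.addTask("c");
        System.out.println(container.getTask());   // newest task: c
        System.out.println(container.stealTask()); // oldest task: a
    }
}
```

Taking from opposite ends keeps the owner and the thief from contending for the same task. In a branch-and-bound search the oldest task is usually the one closest to the root, so a steal tends to move a large piece of work.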


Scalable Computation: Deterministic Work-Stealing Scheduler

    Task getWork( )
    {
        if ( my deque has a task )
            return task;
        else if ( any child has a task )
            return child’s task;
        else
            return parent.getWork( );
    }

[Figure: the CLIENT and its HOSTS.]
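The getWork( ) pseudocode above can be turned into a small runnable sketch. The names `TreeHost`, `addTask`, and the `requester` parameter are hypothetical. Several details are assumptions because the slide does not specify them: only direct children are inspected, children are scanned in a fixed order (which is what makes the search deterministic here), and the root returns null when no work is left.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical host node in a host tree; not the Javelin++ implementation.
class TreeHost {
    final TreeHost parent;
    final List<TreeHost> children = new ArrayList<>();
    final Deque<String> deque = new ArrayDeque<>();

    TreeHost(TreeHost parent) {
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    void addTask(String task) {
        deque.addFirst(task);
    }

    String getWork() {
        return getWork(null);
    }

    // requester is the child that asked this host for work (null for a local call).
    private String getWork(TreeHost requester) {
        if (!deque.isEmpty()) {
            return deque.pollFirst();           // my deque has a task
        }
        for (TreeHost child : children) {       // fixed scan order
            if (child != requester && !child.deque.isEmpty()) {
                return child.deque.pollLast();  // steal the child's oldest task
            }
        }
        // Ask the parent; the root has no parent, so it reports "no work".
        return parent == null ? null : parent.getWork(this);
    }

    public static void main(String[] args) {
        TreeHost root = new TreeHost(null);
        TreeHost h1 = new TreeHost(root);
        TreeHost h2 = new TreeHost(root);
        TreeHost h3 = new TreeHost(h1);
        h2.addTask("a");
        h2.addTask("b");
        System.out.println(h3.getWork()); // stolen from h2 via h1 and root: a
        System.out.println(h2.getWork()); // h2's own newest task: b
    }
}
```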


Models of Computation

• Master-slave

– As far as we know, all proposed commercial applications use this model

• Branch-&-bound optimization

– A generalization of master-slave.

Models of Computation: Branch & Bound

[Figure: binary search tree, animated over seven slides. Node labels by level: root 0; then 2, 7; then 3, 6, 10, 8; leaves 3, 4, 8, 7, 12, 10, 9, 10.]

Bounds shown as the search advances:

1. UPPER = ∞, LOWER = 0
2. UPPER = ∞, LOWER = 2
3. UPPER = ∞, LOWER = 3
4. UPPER = 4, LOWER = 4
5. UPPER = 3, LOWER = 3
6. UPPER = 3, LOWER = 6
7. UPPER = 3, LOWER = 7


Models of Computation: Branch & Bound

• Tasks created dynamically
• Upper bound is shared
• To detect termination, the scheduler detects tasks that have been:
– Completed
– Killed (“bounded”)

[Figure: search tree showing the explored nodes: 0; then 2, 7; then 3, 6; leaves 3, 4.]
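The pruning rule on these slides can be checked with a short sequential sketch over the example tree. The node labels are copied from the slides (root 0; then 2, 7; then 3, 6, 10, 8; then the eight leaves). Everything else is an assumption made for illustration: the level-order array layout, the left-first depth-first order, and the class name `BranchAndBoundSketch`. A node is killed when its label is not below the current upper bound; a leaf that survives becomes the new upper bound.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sequential branch & bound over the slides' example tree.
class BranchAndBoundSketch {
    // Complete binary tree in level order; children of i are 2i+1 and 2i+2.
    static final int[] COST = {0, 2, 7, 3, 6, 10, 8, 3, 4, 8, 7, 12, 10, 9, 10};

    static int explored; // nodes that were not killed

    static int search() {
        int upper = Integer.MAX_VALUE; // no solution known yet
        explored = 0;
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);
        while (!stack.isEmpty()) {
            int node = stack.pop();
            if (COST[node] >= upper) {
                continue; // killed ("bounded"): cannot beat the best known cost
            }
            explored++;
            int left = 2 * node + 1;
            if (left >= COST.length) {
                upper = COST[node]; // leaf: a complete solution that improves the bound
            } else {
                stack.push(left + 1); // push right first so left is popped first
                stack.push(left);
            }
        }
        return upper;
    }

    public static void main(String[] args) {
        System.out.println(search()); // prints 3
    }
}
```

With this visiting order only four of the fifteen nodes are explored; the rest are killed directly or never generated because an ancestor was killed.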


API

    public class Host implements Runnable
    {
        . . .
        public void run()
        {
            // Keep asking the scheduler for work until none is left.
            while ( (node = jDM.getWork()) != null )
            {
                if ( isAtomic() )
                    compute(); // search space; return result
                else
                {
                    child = node.branch(); // put children in child array
                    for (int i = 0; i < node.numChildren; i++)
                        if ( child[i].setLowerBound() < UpperBound )
                            jDM.addWork( child[i] ); // else child is killed implicitly
                }
            }
        }


API

        private void compute()
        {
            . . .
            boolean newBest = false;

            while ( (node = stack.pop()) != null )
            {
                if ( node.isComplete() )
                {
                    if ( node.getCost() < UpperBound )
                    {
                        newBest = true;
                        UpperBound = node.getCost();
                        jDM.propagateValue( UpperBound );
                        best = node; // remember the complete node that set the new bound
                    }
                }
                else
                {
                    child = node.branch();
                    for (int i = 0; i < node.numChildren; i++)
                        if ( child[i].setLowerBound() < UpperBound )
                            stack.push( child[i] ); // else child is killed implicitly
                }
            }
            if ( newBest )
                jDM.returnResult( best );
        }
    } // end of class Host
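The stack-based loop in compute() can be illustrated with a self-contained TSP sketch. It does not use the Javelin++ classes: `TspSketch`, `Node`, and `solve` are hypothetical names, the distance matrix is made-up sample data, and the lower bound is simply the cost of the partial tour, which is valid only for non-negative distances. A real solver would use a much tighter bound.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical stand-alone TSP branch & bound; mirrors the shape of compute().
class TspSketch {
    static final class Node {
        final int[] tour; // cities visited so far, starting at city 0
        final int cost;   // cost of the partial tour (used as the lower bound)

        Node(int[] tour, int cost) {
            this.tour = tour;
            this.cost = cost;
        }
    }

    static boolean contains(int[] tour, int city) {
        for (int c : tour) {
            if (c == city) {
                return true;
            }
        }
        return false;
    }

    static int solve(int[][] dist) {
        int n = dist.length;
        int upperBound = Integer.MAX_VALUE;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(new Node(new int[] {0}, 0));
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            int last = node.tour[node.tour.length - 1];
            if (node.tour.length == n) {               // complete: close the tour
                int total = node.cost + dist[last][0];
                if (total < upperBound) {
                    upperBound = total;                // new best solution
                }
                continue;
            }
            for (int city = 1; city < n; city++) {     // branch on the next city
                if (contains(node.tour, city)) {
                    continue;
                }
                int lower = node.cost + dist[last][city];
                if (lower < upperBound) {              // else child is killed
                    int[] next = Arrays.copyOf(node.tour, node.tour.length + 1);
                    next[next.length - 1] = city;
                    stack.push(new Node(next, lower));
                }
            }
        }
        return upperBound;
    }

    public static void main(String[] args) {
        int[][] dist = {
            {0, 10, 15, 20},
            {10, 0, 35, 25},
            {15, 35, 0, 30},
            {20, 25, 30, 0}
        };
        System.out.println(solve(dist)); // prints 80 (tour 0-1-3-2-0)
    }
}
```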

Scalable Computation: Weak Shared Memory Model

• Slow propagation of bound affects performance, not correctness.

[Figure: animation over five slides labeled “Propagate bound”.]
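One way to see why a stale bound costs time but not correctness is a sketch in which every host keeps its own copy of the upper bound and forwards only strict improvements. The class `BoundHost` and its neighbor list are hypothetical, and the method name `propagateValue` is borrowed from the API slide without claiming this is how Javelin++ implements it. A host that has not yet heard of a better bound prunes less, but it never prunes a node that could lead to the optimum.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical host holding a local copy of the shared upper bound.
class BoundHost {
    final AtomicInteger upperBound = new AtomicInteger(Integer.MAX_VALUE);
    final List<BoundHost> neighbors = new ArrayList<>();

    // Accept the candidate only if it improves the local bound, then forward it.
    void propagateValue(int candidate) {
        int previous = upperBound.getAndAccumulate(candidate, Math::min);
        if (candidate < previous) {
            for (BoundHost neighbor : neighbors) {
                neighbor.propagateValue(candidate);
            }
        }
        // A stale or worse value is dropped, so forwarding always terminates.
    }

    static void link(BoundHost a, BoundHost b) {
        a.neighbors.add(b);
        b.neighbors.add(a);
    }

    public static void main(String[] args) {
        BoundHost a = new BoundHost();
        BoundHost b = new BoundHost();
        BoundHost c = new BoundHost();
        link(a, b);
        link(b, c);
        a.propagateValue(10);
        c.propagateValue(12); // late, worse value: ignored
        System.out.println(c.upperBound.get()); // prints 10
    }
}
```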


Scalable Computation: Fault Tolerance via Eager Scheduling

When:

• All tasks have been assigned

• Some results have not been reported

• A host wants a new task

Re-assign a task!

• Eager scheduling tolerates faults & balances the load.

– The computation completes if at least one host communicates with the client.
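The re-assignment rule above can be sketched as a small scheduler. This is an illustration under assumptions: tasks are plain integer ids, unfinished tasks are re-assigned oldest first, and the first reported result for a task wins. The class `EagerScheduler` and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical eager scheduler: hands out unfinished tasks again when none are new.
class EagerScheduler {
    private final Deque<Integer> unassigned = new ArrayDeque<>();
    private final Deque<Integer> outstanding = new ArrayDeque<>(); // assigned, no result yet

    EagerScheduler(int taskCount) {
        for (int id = 0; id < taskCount; id++) {
            unassigned.add(id);
        }
    }

    // Called when a host wants a new task; returns null when all work is done.
    synchronized Integer nextTask() {
        Integer task = unassigned.poll();
        if (task == null) {
            task = outstanding.poll(); // all assigned: re-assign an unfinished task
        }
        if (task != null) {
            outstanding.add(task);     // it is outstanding until a result arrives
        }
        return task;
    }

    // Returns true for the first result of a task, false for a duplicate.
    synchronized boolean reportResult(int task) {
        return outstanding.remove(Integer.valueOf(task));
    }

    synchronized boolean isDone() {
        return unassigned.isEmpty() && outstanding.isEmpty();
    }

    public static void main(String[] args) {
        EagerScheduler scheduler = new EagerScheduler(2);
        System.out.println(scheduler.nextTask()); // 0
        System.out.println(scheduler.nextTask()); // 1
        System.out.println(scheduler.nextTask()); // 0 again: re-assigned
    }
}
```

Because a slow or failed host's task is simply handed to whichever host asks next, the same mechanism covers both fault tolerance and load balancing.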


Scalable Computation: Fault Tolerance via Eager Scheduling

• Scheduler must know which:
– Tasks have completed
– Nodes have been killed
• Performance balance
– Centralized schedule info
– Decentralized computation

[Figure: search tree showing the explored nodes: 0; then 2, 7; then 3, 6; leaves 3, 4.]


Experimental Results

[Figure: speedup versus number of processors, both axes running from 0 to 100, with curves for graph22, graph24, and ideal speedup.]


Experimental Results

[Figure: the branch & bound search tree (root 0; then 2, 7; then 3, 6, 10, 8; leaves 3, 4, 8, 7, 12, 10, 9, 10).]

Example of a “bad” graph


Conclusions

• Javelin 2 relieves the designer/programmer of managing a set of [Inter-]networked processors that is:
– Dynamic
– Faulty
• A wide set of applications is covered by:
– Master-slave model
– Branch & bound model
• Weak shared memory performs well.
• Use multicast (?) for:
– Code distribution
– Propagating values


Future Work

• Improve support for long-lived computation:
– Do not require that the client run continuously.
• A dag model of computation
– with limited weak shared memory.


Future Work: Jini/JavaSpaces Technology

[Figure: a TaskManager (aka Broker) surrounded by hosts (H).]

“Continuously” disperse Tasks among brokers via a physics model


Future Work: Jini/JavaSpaces Technology

• TaskManager uses persistent JavaSpace
– Host management: trivial
– Eager scheduling: simple
• No single point of failure
– Fat tree topology


Future Work: Advanced Issues

• Privacy of data & algorithm
• Algorithms
– New computational complexity model: “Minimize” communication between machines
– N-body problem, …
• Accounting: Associate specific work with specific host
– Correctness
– Compensation (how to quantify?)
• Create international open source organization
– System infrastructure
– Application codes


Models of Computation: Branch & Bound

[Figure: two copies of the full search tree (root 0; then 2, 7; then 3, 6, 10, 8; leaves 3, 4, 8, 7, 12, 10, 9, 10).]

UPPER = 3, LOWER = 0

Architecture: Broker Name Service (BNS)

[Figure: BROKER, HOST, and BNS, with the steps below added one per slide.]

1. Register with BNS
2. Get broker list
3. Ping brokers on list
4. Connect to selected broker
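The four steps can be sketched as follows. The slides do not say who registers in step 1 or how a broker is selected in step 4, so this sketch assumes that brokers register themselves, that a host picks the broker with the shortest ping time, and that a negative ping value means the broker did not answer. `BnsSketch`, `Bns`, and `selectBroker` are hypothetical names, and the ping is passed in as a function rather than performed over a network.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical in-memory model of broker discovery; no real networking.
class BnsSketch {
    static final class Bns {
        private final List<String> brokers = new ArrayList<>();

        // Step 1 (assumed to be done by brokers): register with the BNS.
        void register(String brokerId) {
            brokers.add(brokerId);
        }

        // Step 2: a host gets a copy of the broker list.
        List<String> brokerList() {
            return new ArrayList<>(brokers);
        }
    }

    // Steps 3 and 4: ping every listed broker and choose the fastest responder.
    static String selectBroker(Bns bns, ToIntFunction<String> pingMillis) {
        String best = null;
        int bestMillis = Integer.MAX_VALUE;
        for (String broker : bns.brokerList()) {
            int millis = pingMillis.applyAsInt(broker);
            if (millis >= 0 && millis < bestMillis) { // negative means no answer
                best = broker;
                bestMillis = millis;
            }
        }
        return best; // null if no broker answered
    }

    public static void main(String[] args) {
        Bns bns = new Bns();
        Map<String, Integer> roundTrip = new HashMap<>();
        roundTrip.put("b1", 40);
        roundTrip.put("b2", 15);
        roundTrip.put("b3", -1);
        for (String id : new String[] {"b1", "b2", "b3"}) {
            bns.register(id);
        }
        System.out.println(selectBroker(bns, id -> roundTrip.get(id))); // prints b2
    }
}
```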