trust-sensitive scheduling on the open grid

Trust-Sensitive Scheduling on the Open Grid

Jon B. Weissman
with help from Jason Sonnek and Abhishek Chandra

Department of Computer Science
University of Minnesota

Trends in HPDC Workshop
Amsterdam 2006

Background

• Public donation-based infrastructures are attractive (UW-Condor, *@home, ...)
  – positives: cheap, scalable, fault tolerant
  – negatives: “hostile”: uncertain resource availability/connectivity, node behavior, end-user demand => best-effort service

Background

• Such infrastructures have been used for throughput-based applications
  – just make progress, all tasks equal
• Service applications are more challenging
  – all tasks not equal
  – explicit boundaries between user requests
  – may even have SLAs, QoS, etc.

Service Model

• Distributed Service
  – request -> set of independent tasks
  – each task mapped to a donated node
  – makespan
  – E.g. BLAST service
    • user request (input sequence) + chunk of DB form a task

BOINC + BLAST

workunit = input_sequence + chunk of DB
generated when a request arrives
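The mapping from a request to work units can be sketched as follows. This is a minimal illustration, assuming the database is an in-memory list of sequences that is cut into equal-sized chunks; `WorkUnit` and `make_workunits` are hypothetical names and are not part of the BOINC API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkUnit:
    """One independent task: the user's query plus one slice of the database."""
    input_sequence: str
    db_chunk: tuple


def make_workunits(input_sequence: str, database: List[str], chunk_size: int) -> List[WorkUnit]:
    """Split the database into chunks and pair each chunk with the query."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        WorkUnit(input_sequence, tuple(database[start:start + chunk_size]))
        for start in range(0, len(database), chunk_size)
    ]


# Illustrative request: one query against a five-entry database, two entries per chunk.
units = make_workunits("ACGT", ["seq%d" % i for i in range(5)], chunk_size=2)
```

Each resulting work unit would then be mapped to a donated node; the request completes when its last task returns, which is why makespan is the metric of interest.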

The Challenge

• Nodes are unreliable
  – timeliness: heterogeneity, bottlenecks, …
  – cheating: hacked, malicious (> 1% of SETI nodes), misconfigured
  – failure
  – churn
• For a service, this matters

Some data: timeliness

Computation Heterogeneity

- both across and within nodes

Communication Heterogeneity

- both across and within nodes

PlanetLab – lower bound

The Problem for Today

• Deal with node misbehavior

• Result verification
  – application-specific verifiers: not general
  – redundancy + voting
• Most approaches assume ad-hoc replication
  – under-replicate: task re-execution (higher latency)
  – over-replicate: wasted resources (lower throughput)

• Using information about the past behavior of a node, we can intelligently size the amount of redundancy
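The "redundancy + voting" verifier can be sketched as below. This is a minimal example that assumes results are hashable values that can be compared for equality and that a strict majority is required; `majority_vote` is a hypothetical helper name.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the result reported by a strict majority of replicas, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    # A strict majority means more than half of the replicas agree.
    return value if count > len(results) // 2 else None
```

A `None` return corresponds to the under-replication case: no result can be accepted, so the task has to be re-executed.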

System Model

Problems with ad-hoc replication

[Figure: worker groups made up of unreliable and reliable nodes; task x sent to group A, task y sent to group B]

Smart Replication

• Reputation
  – ratings based on past interactions with clients
  – simple sample-based prob. (ri) over window
  – extend to worker group (assuming no collusion) => likelihood of correctness (LOC)
• Smarter Redundancy
  – variable-sized worker groups
  – intuition: higher reliability clients => smaller groups
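A sample-based rating over a window might look like the following sketch. It assumes each past task yields a boolean outcome (for example, whether the client's answer matched the voted result) and that a client with no history starts from a neutral prior of 0.5; the class name `WindowedRating` and the prior are assumptions made for illustration.

```python
from collections import deque


class WindowedRating:
    """Fraction of correct results among a worker's last `window` tasks."""

    def __init__(self, window: int, prior: float = 0.5):
        self.outcomes = deque(maxlen=window)  # oldest outcomes fall off automatically
        self.prior = prior                    # assumed rating before any history exists

    def record(self, correct: bool) -> None:
        self.outcomes.append(1 if correct else 0)

    @property
    def rating(self) -> float:
        if not self.outcomes:
            return self.prior
        return sum(self.outcomes) / len(self.outcomes)
```

With a window of 4, recording correct, correct, wrong, correct gives a rating of 0.75; one more wrong result pushes the oldest outcome out and the rating drops to 0.5.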

Terms

• LOC (Likelihood of Correctness) of a group g
  – computes the ‘actual’ probability of getting a correct answer from a group of clients (group g)
• Target LOC (target)
  – the task success rate that the system tries to ensure while forming client groups
  – related to the statistics of the underlying distribution

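For a group g of 2k+1 clients with ratings r_1, ..., r_{2k+1}, decided by majority vote:

$$
\mathrm{LOC}(g) = \sum_{m=k+1}^{2k+1} \; \sum_{\{e \,:\, \sum_{i=1}^{2k+1} e_i = m\}} \; \prod_{i=1}^{2k+1} r_i^{e_i} (1 - r_i)^{1 - e_i}
$$

where e_i ∈ {0, 1} indicates whether client i returns the correct result.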
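Under the no-collusion assumption, the LOC of a group of 2k+1 clients is the probability that at least k+1 of them return the correct result. The sketch below computes that probability with a small dynamic program instead of enumerating every outcome; the function name `loc` is hypothetical, and independence between clients is assumed throughout.

```python
from typing import Sequence


def loc(ratings: Sequence[float]) -> float:
    """Probability that a strict majority of independent workers is correct."""
    n = len(ratings)
    if n % 2 == 0:
        raise ValueError("group size must be odd (2k + 1)")
    # dist[m] = probability that exactly m of the workers seen so far are correct
    dist = [1.0]
    for r in ratings:
        nxt = [0.0] * (len(dist) + 1)
        for m, p in enumerate(dist):
            nxt[m] += p * (1.0 - r)   # this worker is wrong
            nxt[m + 1] += p * r       # this worker is correct
        dist = nxt
    k = n // 2
    return sum(dist[k + 1:])          # m = k+1 .. 2k+1 correct answers
```

For three clients rated 0.9 each, this is 3 * 0.9^2 * 0.1 + 0.9^3 = 0.972, which is why a few highly rated clients can stand in for a much larger group of unreliable ones.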

Trust Sensitive Scheduling

• Guiding metrics
  – throughput: the number of successfully completed tasks in an interval
  – success rate s: ratio of throughput to number of tasks attempted

Scheduling Algorithms

• First-Fit
  – attempt to form the first group that satisfies target
• Best-Fit
  – attempt to form a group that best satisfies target
• Random-Fit
  – attempt to form a random group that satisfies target
• Fixed-size
  – randomly form fixed-size groups; ignore client ratings
• Random and Fixed are our baselines
• Min group size = 3
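One possible reading of First-Fit is sketched below. The slides do not spell out the procedure, so the details here are assumptions: clients are considered in order of decreasing rating, groups are kept odd-sized so that a majority vote cannot tie, and the group grows until its LOC reaches the target. `first_fit_group` and `loc` are hypothetical helper names.

```python
from typing import List, Optional, Sequence

MIN_GROUP_SIZE = 3  # minimum group size stated on the slide


def loc(ratings: Sequence[float]) -> float:
    """Majority-vote likelihood of correctness for independent workers."""
    dist = [1.0]  # dist[m] = probability that exactly m workers are correct
    for r in ratings:
        nxt = [0.0] * (len(dist) + 1)
        for m, p in enumerate(dist):
            nxt[m] += p * (1.0 - r)
            nxt[m + 1] += p * r
        dist = nxt
    return sum(dist[len(ratings) // 2 + 1:])


def first_fit_group(ratings: Sequence[float], target: float) -> Optional[List[int]]:
    """Return indices of the first odd-sized group whose LOC reaches target."""
    # Assumption: consider workers in order of decreasing rating.
    order = sorted(range(len(ratings)), key=lambda i: ratings[i], reverse=True)
    for size in range(MIN_GROUP_SIZE, len(order) + 1, 2):
        group = order[:size]
        if loc([ratings[i] for i in group]) >= target:
            return group
    return None  # no group of available workers satisfies the target
```

Best-Fit would differ by searching for the group whose LOC is closest to the target rather than stopping at the first one that qualifies, and Fixed-size would skip the LOC check entirely.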

Scheduling Algorithms

Scheduling Algorithms (cont’d)

Different Groupings

target = .5

Evaluation

• Simulated a wide variety of node reliability distributions
• Set target to be the success rate of Fixed
  – goal: match success rate of Fixed (which over-replicates) yet achieve higher throughput
  – if desired, can drive throughput even higher (but success rate would suffer)

Comparison

gain: 25-250%
open question: how much better could we have done?

Non-stationarity

• Nodes may suddenly shift gears
  – deliberately malicious, virus, detach/rejoin
  – underlying reliability distribution changes
• Solution
  – window-based rating (reduce from infinite)
• Experiment: “blackout” at round 300 (30% affected)
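The effect of shrinking the rating window can be seen in a small sketch. It assumes a node that answers correctly for 300 rounds and then incorrectly afterwards; the 20 bad rounds and the window of 20 are illustrative values, and `ratings_after_shift` is a hypothetical helper.

```python
from collections import deque


def ratings_after_shift(good_rounds: int, bad_rounds: int, window: int):
    """Return (cumulative, windowed) ratings for a worker that is always
    correct for good_rounds and then always wrong for bad_rounds."""
    history = [1] * good_rounds + [0] * bad_rounds
    if not history:
        raise ValueError("need at least one round of history")
    recent = deque(history, maxlen=window)  # keeps only the last `window` outcomes
    cumulative = sum(history) / len(history)
    windowed = sum(recent) / len(recent)
    return cumulative, windowed


cumulative, windowed = ratings_after_shift(good_rounds=300, bad_rounds=20, window=20)
```

Here the infinite-history rating is still 300/320 = 0.9375 twenty rounds after the shift, while the windowed rating has already fallen to 0, so the scheduler stops treating the node as reliable much sooner.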

Role of target

• Key parameter
• Too large
  – groups will be too large (low throughput)
• Too small
  – groups will be too small (low success rate)
• Adaptively learn it (parameterless)
  – maximizing throughput * s: “goodput”
  – or could bias toward throughput or s
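The goodput arithmetic can be written out directly from the metric definitions. The helper name `metrics` and the task counts in the usage note are made up for illustration and are not results from the talk.

```python
def metrics(completed_ok: int, attempted: int):
    """Return (throughput, success rate s, goodput) for one interval."""
    throughput = completed_ok                            # successfully completed tasks
    s = completed_ok / attempted if attempted else 0.0   # throughput / tasks attempted
    return throughput, s, throughput * s                 # goodput = throughput * s
```

With these illustrative counts, 10 tasks attempted and all 10 correct gives a goodput of 10.0, while 20 attempted with 18 correct gives 18 * 0.9 = 16.2: a slightly lower success rate can be worth it when it frees enough nodes to attempt more tasks.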

Adaptive algorithm

• Multi-objective optimization
  – choose target LOC to simultaneously maximize throughput and success rate s, combined as w1 * throughput + w2 * s
  – use weighted combination to reduce multiple objectives to a single objective
  – employ hill-climbing and feedback techniques to control dynamic parameter adjustment
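A hill-climbing loop over the target LOC might be sketched as follows. This is not the authors' controller: the step-halving rule, the weight names `w1` and `w2`, and the `measure` callback (which would run the scheduler for one interval and report throughput and success rate, both scaled to [0, 1]) are all assumptions made for illustration.

```python
from typing import Callable, Tuple


def adapt_target(
    measure: Callable[[float], Tuple[float, float]],
    target: float = 0.5,
    step: float = 0.05,
    rounds: int = 50,
    w1: float = 0.5,
    w2: float = 0.5,
) -> float:
    """Hill-climb the target LOC on the weighted objective w1*throughput + w2*s."""

    def objective(t: float) -> float:
        throughput, s = measure(t)  # feedback from one scheduling interval
        return w1 * throughput + w2 * s

    best = objective(target)
    direction = 1.0
    for _ in range(rounds):
        candidate = min(1.0, max(0.0, target + direction * step))
        score = objective(candidate)
        if score > best:
            target, best = candidate, score   # improvement: keep moving this way
        else:
            direction = -direction            # no improvement: turn around
            step /= 2                         # and take smaller steps
    return target
```

Setting `w1=1, w2=0` optimizes throughput alone and `w1=0, w2=1` optimizes success rate alone; in a real system the measurements are noisy, so the comparison would likely need smoothing over several intervals.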

Adapting target

• Blackout example

Throughput (throughput weight = 1, success-rate weight = 0)

[Figure: "Xput comparison - BF". Throughput of Best-Fit (BF) for the Uniform, NormLow, NormHigh, HeavyLow, HeavyHigh, and Bimodal distributions; series: Min, Adapt, Max; y-axis from 0 to 30.]

Current/Future Work

• Implementation of reputation-based scheduling framework (BOINC and PL)

• Mechanisms to retain node identities (hence ri) under node churn

– “node signatures” that capture the characteristics of the node

Current/Future Work (cont’d)

• Timeliness
  – extending reliability to encompass time
  – a node whose performance is highly variable is less reliable
• Client collusion
  – detection: group signatures
  – prevention:
    • combine quiz-based tasks with reputation systems
    • form random groupings

Thank you.
