Robust Asynchronous Optimization Using Volunteer Computing Grids

Travis Desell, Boleslaw Szymanski, Carlos Varela, Nathan Cole, Heidi Newberg, Malik Magdon-Ismail
Rensselaer Polytechnic Institute, Department of Computer Science
BOINC Workshop 2009, October 22, Barcelona, Spain


Page 1: Robust Asynchronous Optimization Using Volunteer Computing Grids

Robust Asynchronous Optimization Using Volunteer Computing Grids

Rensselaer Polytechnic Institute
Department of Computer Science

BOINC Workshop 2009
October 22
Barcelona, Spain

Travis Desell, Boleslaw Szymanski, Carlos Varela, Nathan Cole, Heidi Newberg, Malik Magdon-Ismail

Page 2: Robust Asynchronous Optimization Using Volunteer Computing Grids

Overview

• Motivation
• What is Optimization?
• Astro-Informatics at Milkyway@Home
• Making Optimization Asynchronous
• Partial Verification Strategies
• Results
• Future Work

Page 3: Robust Asynchronous Optimization Using Volunteer Computing Grids

Motivation

Distribution is essential for modern scientific computing
• Scientific models are becoming increasingly complex
• Rates of data acquisition are far exceeding increases in computing power

Scientists need easily accessible distributed optimization tools

Traditional optimization strategies are not well suited to large-scale computing
• They lack scalability and fault tolerance

Page 4: Robust Asynchronous Optimization Using Volunteer Computing Grids

What is Optimization?

What parameters x’ give the maximum (or minimum) value of f(x)?

f is typically very complex, with multiple minima

Values of x can be continuous or discrete

This talk focuses on continuous optimization

Page 5: Robust Asynchronous Optimization Using Volunteer Computing Grids

Astro-Informatics

• Being inside the Milky Way provides 3D data:
  • The Sloan Digital Sky Survey has collected over 10 TB of data.
  • Can determine its structure, which is not possible for other galaxies.
  • Very expensive: evaluating a single model of the Milky Way with a single set of parameters can take hours or days on a typical high-end computer.
• Models determine where different star streams are in the Milky Way, which helps us better understand its structure and how it was formed.

What is the structure and origin of the Milky Way galaxy?

Page 6: Robust Asynchronous Optimization Using Volunteer Computing Grids

Milkyway@Home Progress

Page 7: Robust Asynchronous Optimization Using Volunteer Computing Grids

Traditional Optimization Strategies

Individual-based Evolution:
‣ Differential Evolution
‣ Particle Swarm Optimization

Population-based Evolution:
‣ Genetic Search

Traditional continuous optimization strategies are evolutionary, imitating biology.

Individual members or entire populations improve monotonically, through recombination.

Page 8: Robust Asynchronous Optimization Using Volunteer Computing Grids

Issues With Traditional Optimization

Traditional global optimization techniques are dependent and iterative
• Current population (or individual) is used to generate the next population (or individual)

Dependencies and iterations limit scalability and impact performance
• With volatile hosts, what if an individual in the next generation is lost?
• Redundancy is expensive
• Scalability limited by population size

Page 9: Robust Asynchronous Optimization Using Volunteer Computing Grids

Asynchronous Optimization Strategy

Use an asynchronous methodology
• No dependencies on unknown results
• No iterations

Continuously updated population
• N individuals are generated randomly for the initial population
• Fulfill work requests by applying recombination operators to the population
• Update population with reported results

Page 10: Robust Asynchronous Optimization Using Volunteer Computing Grids

Asynchronous Search Architecture

[Figure: architecture diagram. Labels recovered from the slide:]
• Population: Individual (1) ... Individual (n), each with Fitness (1) ... Fitness (n)
• Unevaluated Individuals: Unevaluated Individual (1) ... Unevaluated Individual (n)
• Work Units
• Workers (Fitness Evaluation): BOINC Clients
• Assimilator
• Generate work when queue is low (WUs ready to send less than 500)
• Request Work (WU Request)
• Send Work (Send WUs)
• Validate and assimilate results
• Report results and update population

Page 11: Robust Asynchronous Optimization Using Volunteer Computing Grids

Genetic Search

Generate initial random population

Iteratively generate new populations:
• N best individuals survive through ‘selection’
• M individuals mutated
• O individuals generated through ‘recombination’
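The iteration described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `next_generation`, the mutation rule (resample one parameter uniformly within assumed bounds), the use of pairwise averaging for recombination, and the choice to minimize are all assumptions made for the example.

```python
import random

def sum_of_squares(p):
    """Toy fitness to minimize (an assumption; the talk fits a Milky Way model)."""
    return sum(v * v for v in p)

def next_generation(population, fitness, n_best, n_mutate, n_recombine,
                    rng, lo=-5.0, hi=5.0):
    """Build one new population from selection, mutation and recombination."""
    # Selection: the n_best lowest-fitness individuals survive unchanged.
    ranked = sorted(population, key=fitness)
    survivors = [list(p) for p in ranked[:n_best]]

    # Mutation (hypothetical rule): copy a random member, resample one parameter.
    mutants = []
    for _ in range(n_mutate):
        child = list(rng.choice(population))
        child[rng.randrange(len(child))] = rng.uniform(lo, hi)
        mutants.append(child)

    # Recombination (hypothetical rule): average two distinct random parents.
    children = []
    for _ in range(n_recombine):
        a, b = rng.sample(population, 2)
        children.append([(x + y) / 2 for x, y in zip(a, b)])

    return survivors + mutants + children
```

Because the best individuals always survive, the best fitness in the population cannot get worse from one iteration to the next. Note that the whole generation must be evaluated before the next call, which is the dependency the asynchronous variants remove.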

Page 12: Robust Asynchronous Optimization Using Volunteer Computing Grids

Genetic Search Example

Optimize sum of squares: f(p_i) = p_i[0]^2 + p_i[1]^2 + p_i[2]^2

Each iteration: sort, then selection (1 best), mutation (1 random), recombination (average 3 pairs)

Initial population:
f(p_i)    p_i
25        0, 4, -3
14        2, 3, -1
5         0, 1, -2
13        -2, 0, 3
26        -3, 1, -4

After iteration 1:
f(p_i)    p_i
5         0, 1, -2
9         2, -2, -1
6.75      -2.5, .5, -.5
12.5      0, 2.5, -2.5
10.5      -.5, 2, -2.5

After iteration 2:
f(p_i)    p_i
5         0, 1, -2
10        0, 1, 3
1.1875    -.25, -.75, -.75
3.625     .75, 0, -1.75
4.6875    -1.25, 0.25, -1.75
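The numbers in this example can be checked with a few lines of code. The sketch below recomputes the selected member and the three averaged children of the first iteration; the parent pairings are inferred from the values and are not stated on the slide.

```python
def f(p):
    """Sum of squares, the function minimized in the example."""
    return sum(v * v for v in p)

def average(a, b):
    """Recombination used in the example: parameter-wise mean of two parents."""
    return [(x + y) / 2 for x, y in zip(a, b)]

initial = [[0, 4, -3], [2, 3, -1], [0, 1, -2], [-2, 0, 3], [-3, 1, -4]]

# Selection keeps the single best (lowest f) member.
best = min(initial, key=f)

# Parent pairings are inferred from the values, not given on the slide.
children = [
    average([-2, 0, 3], [-3, 1, -4]),
    average([0, 4, -3], [0, 1, -2]),
    average([2, 3, -1], [-3, 1, -4]),
]

for child in children:
    print(f(child), child)
```

The selected member is 0, 1, -2 with f = 5, and the three children have f = 6.75, 12.5 and 10.5, matching the values listed for the first iteration.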

Page 13: Robust Asynchronous Optimization Using Volunteer Computing Grids

Alternate Recombination

Double Shot - two parents generate three children
• Average of the parents
• Outside the less fit parent, equidistant to parent and average
• Outside the more fit parent, equidistant to parent and average
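One reading of the Double Shot operator is sketched below. It assumes that "outside the parent, equidistant to parent and average" means the child lies on the line through both parents, beyond the parent, at the same distance from the parent as the parent is from the average. The function name `double_shot` is hypothetical.

```python
def double_shot(more_fit, less_fit):
    """Return the three Double Shot children of two parents.

    Assumed geometry: each outside child mirrors the average through a parent,
    so child = parent + (parent - average).
    """
    avg = [(a + b) / 2 for a, b in zip(more_fit, less_fit)]
    outside_less_fit = [2 * b - m for b, m in zip(less_fit, avg)]
    outside_more_fit = [2 * a - m for a, m in zip(more_fit, avg)]
    return avg, outside_less_fit, outside_more_fit
```

For parents (0, 0) and (2, 2) this yields the average (1, 1) and the two outside points (3, 3) and (-1, -1), so the three children probe the line through the parents on both sides as well as the middle.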

Page 14: Robust Asynchronous Optimization Using Volunteer Computing Grids

Alternate Recombination (2)

Randomized Simplex
• N parents generate one or more children
• Points randomly along the line created by the worst parent, and the centroid (average) of the remaining parents
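A minimal sketch of the Randomized Simplex operator follows. It assumes minimization, and it assumes the child is placed at centroid + r * (worst - centroid) with r drawn uniformly from [-1.5, 1.5], which is one interpretation of the range given on the Search Parameters extra slide. The function name and signature are hypothetical.

```python
import random

def randomized_simplex(parents, fitness, rng, lo=-1.5, hi=1.5):
    """Generate one child on the line through the worst parent and the
    centroid of the remaining parents."""
    ranked = sorted(parents, key=fitness)      # minimization: worst is last
    worst, rest = ranked[-1], ranked[:-1]
    centroid = [sum(column) / len(rest) for column in zip(*rest)]
    r = rng.uniform(lo, hi)                    # assumed position along the line
    return [c + r * (w - c) for c, w in zip(centroid, worst)]
```

With r near -1 the child is a reflection of the worst parent through the centroid, and with r near 0 it lies close to the centroid; these are the two regions compared on the Simplex Operator Analysis slide.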

Page 15: Robust Asynchronous Optimization Using Volunteer Computing Grids

Steady State and Asynchronous GS

Steady State is less parallel than Classical GS:
• Generate initial random population
• Randomly choose mutation or recombination to generate new individual
• If new individual improves population, insert it and remove worst member

We modify this approach for Asynchronous GS:
• Generate initial random population
• Randomly choose mutation or recombination to generate new individuals for work requests
• When fitness reported, insert members if they improve the population
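The asynchronous variant can be sketched as a small server-side class plus a simulation of unreliable workers. Everything here is illustrative: the class and method names, the toy fitness, the mutation and averaging operators, and the simulated loss rate are assumptions. The mutation rate of 0.3 is taken from the Search Parameters extra slide.

```python
import random

def sum_of_squares(p):
    """Toy fitness to minimize (an assumption for this sketch)."""
    return sum(v * v for v in p)

class AsyncGeneticSearch:
    """Hypothetical asynchronous genetic search; assumes size >= 2."""

    def __init__(self, size, dims, rng, mutation_rate=0.3, lo=-5.0, hi=5.0):
        self.size, self.dims, self.rng = size, dims, rng
        self.mutation_rate, self.lo, self.hi = mutation_rate, lo, hi
        self.members = []  # (fitness, params) pairs, best (lowest) first

    def generate_work(self):
        """Answer a work request at once, using only results already reported."""
        if len(self.members) < self.size:
            # Still filling the initial random population.
            return [self.rng.uniform(self.lo, self.hi) for _ in range(self.dims)]
        if self.rng.random() < self.mutation_rate:
            child = list(self.rng.choice(self.members)[1])
            child[self.rng.randrange(self.dims)] = self.rng.uniform(self.lo, self.hi)
            return child
        (_, a), (_, b) = self.rng.sample(self.members, 2)
        return [(x + y) / 2 for x, y in zip(a, b)]

    def report(self, params, fitness):
        """Insert a reported result only if it improves the population."""
        if len(self.members) < self.size:
            self.members.append((fitness, params))
        elif fitness < self.members[-1][0]:
            self.members[-1] = (fitness, params)   # replace the worst member
        else:
            return False
        self.members.sort(key=lambda m: m[0])
        return True

def simulate(seed=0, requests=500):
    """Workers report out of order, and about 10% of results never arrive."""
    rng = random.Random(seed)
    search = AsyncGeneticSearch(size=10, dims=3, rng=rng)
    in_flight = []
    for _ in range(requests):
        in_flight.append(search.generate_work())
        if len(in_flight) >= 8:
            params = in_flight.pop(rng.randrange(len(in_flight)))
            if rng.random() < 0.9:
                search.report(params, sum_of_squares(params))
    return search
```

A lost result costs nothing beyond the wasted evaluation, since no later work depends on it, and a late result is still inserted if it beats the current worst member.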

Page 16: Robust Asynchronous Optimization Using Volunteer Computing Grids


Asynchronous vs Iterative Genetic Search

Page 17: Robust Asynchronous Optimization Using Volunteer Computing Grids

Particle Swarm Optimization

Particles ‘fly’ around the search space.

Move according to their previous velocity and are pulled towards the global best found position and their locally best found position.

Analogies:
• cognitive intelligence (local best knowledge)
• social intelligence (global best knowledge)

Page 18: Robust Asynchronous Optimization Using Volunteer Computing Grids

Particle Swarm Optimization

PSO:
v_i(t+1) = w * v_i(t) + c1 * r1 * (l_i - p_i(t)) + c2 * r2 * (g - p_i(t))
p_i(t+1) = p_i(t) + v_i(t+1)

w, c1, c2 = constants
r1, r2 = random floats between 0 and 1
v_i(t) = velocity of particle i at iteration t
p_i(t) = position of particle i at iteration t
l_i = best position found by particle i
g = global best position found by all particles
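The two update equations translate directly into code. In this sketch the default values of w, c1 and c2 are placeholders, and drawing r1 and r2 separately for each parameter is an assumption; the slide only says they are random floats between 0 and 1.

```python
import random

def pso_step(position, velocity, local_best, global_best, rng,
             w=0.5, c1=2.0, c2=2.0):
    """Apply one PSO velocity and position update to a single particle."""
    new_velocity, new_position = [], []
    for p, v, l, g in zip(position, velocity, local_best, global_best):
        r1, r2 = rng.random(), rng.random()   # assumed: fresh draw per parameter
        nv = w * v + c1 * r1 * (l - p) + c2 * r2 * (g - p)
        new_velocity.append(nv)
        new_position.append(p + nv)
    return new_position, new_velocity
```

When a particle already sits on both its local best and the global best, the two attraction terms vanish and only the inertia term w * v_i(t) moves it.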

Page 19: Robust Asynchronous Optimization Using Volunteer Computing Grids

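Particle Swarm Optimization (Example)

[Figure: a particle's previous position p_i(t-1), current position p_i(t) and velocity v_i(t), together with the local best and global best. The vectors w * v_i(t), c1 * (l_i - p_i(t)) and c2 * (g - p_i(t)) bound the region of possible new positions.]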

Page 20: Robust Asynchronous Optimization Using Volunteer Computing Grids

Differential Evolution (In Brief)

Many variations:
• best/n/bin
• rand/n/bin
• best/n/exp
• rand/n/exp
• current/n/bin
• current/n/exp

In general:
Perform binary or exponential recombination between the current individual and another individual modified by a scaled difference between n pairs of other individuals

Page 21: Robust Asynchronous Optimization Using Volunteer Computing Grids

Differential Evolution (Details)

DE (best/1/bin):
p_{i,j}(t+1) = g_j(t) + c * (p_{r1,j}(t) - p_{r2,j}(t))    if r3 == j or r4 < cr
p_{i,j}(t+1) = p_{i,j}(t)                                   otherwise

if f(p_i(t+1)) < f(p_i(t)) then p_i(t+1) = p_i(t)

p_{i,j}(t) = jth parameter of ith member of population at iteration t
g_j(t) = jth parameter of global best member at iteration t
c = scaling factor
r1, r2 = random ints between 0 and population size, r1 != r2
r3 = random int between 0 and number of parameters
r4 = random float between 0 and 1
cr = crossover rate
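A sketch of the best/1/bin rule follows. The default values of c and cr are placeholders, and drawing r4 once per parameter is an assumption (the usual binomial crossover). The acceptance rule mirrors the slide as written, which keeps the old position when the new fitness is lower, so fitness is treated as a quantity to maximize here.

```python
import random

def de_best_1_bin(population, i, global_best, rng, c=0.5, cr=0.5):
    """Build the trial position for member i using best/1/bin."""
    r1, r2 = rng.sample(range(len(population)), 2)   # distinct indices
    dims = len(population[i])
    r3 = rng.randrange(dims)       # this parameter is always recombined
    trial = []
    for j in range(dims):
        r4 = rng.random()          # assumed: one draw per parameter
        if j == r3 or r4 < cr:
            trial.append(global_best[j]
                         + c * (population[r1][j] - population[r2][j]))
        else:
            trial.append(population[i][j])
    return trial

def accept(current, trial, fitness):
    """Slide's rule: revert to the current position if the trial is worse
    (lower fitness, i.e. maximization)."""
    return current if fitness(trial) < fitness(current) else trial
```

With cr = 1 every parameter comes from the perturbed global best, and with cr = 0 only parameter r3 does, so cr controls how much of the current individual survives into the trial.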

Page 22: Robust Asynchronous Optimization Using Volunteer Computing Grids

Asynchronous DE & PSO

Note that generating new positions does not necessarily require the fitness of the previous position

1. Generate new particle or individual positions to fill work queue
2. Update local and global best on results

DE:
If result improves individual, update individual’s position

PSO:
If result improves particle’s local best, update the local best and set the particle’s position and velocity to those of the result
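For PSO, the two steps above might look like the following sketch. The class name `AsyncSwarm`, its attributes, the constants and the choice to minimize are assumptions; each work unit carries the particle index together with the position and velocity that produced it, so the result can be applied whenever it arrives.

```python
import random

class AsyncSwarm:
    """Hypothetical asynchronous particle swarm (minimization assumed)."""

    def __init__(self, n, dims, rng, w=0.5, c1=2.0, c2=2.0, lo=-5.0, hi=5.0):
        self.rng, self.w, self.c1, self.c2 = rng, w, c1, c2
        self.pos = [[rng.uniform(lo, hi) for _ in range(dims)] for _ in range(n)]
        self.vel = [[0.0] * dims for _ in range(n)]
        self.local_best = [list(p) for p in self.pos]
        self.local_fit = [float("inf")] * n
        self.global_best, self.global_fit = list(self.pos[0]), float("inf")

    def generate_work(self, i):
        """Step 1: a new position for particle i, from the current state only."""
        vel = [self.w * v
               + self.c1 * self.rng.random() * (l - p)
               + self.c2 * self.rng.random() * (g - p)
               for p, v, l, g in zip(self.pos[i], self.vel[i],
                                     self.local_best[i], self.global_best)]
        pos = [p + v for p, v in zip(self.pos[i], vel)]
        return i, pos, vel

    def report(self, i, pos, vel, fitness):
        """Step 2: apply a result only if it improves particle i's local best."""
        if fitness >= self.local_fit[i]:
            return False
        self.local_best[i], self.local_fit[i] = list(pos), fitness
        self.pos[i], self.vel[i] = list(pos), list(vel)
        if fitness < self.global_fit:
            self.global_best, self.global_fit = list(pos), fitness
        return True
```

Several work units for the same particle can be outstanding at once; results that do not improve the local best are simply discarded, which is why lost or late results do not stall the search.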

Page 23: Robust Asynchronous Optimization Using Volunteer Computing Grids

Optimization Method Comparison

Tracked best fitness across 5 separate searches for each combination of search parameters.

Used Sagittarius stripe 22:
• 100,789 observed stars
• 3 streams
• 20 optimization parameters

Page 24: Robust Asynchronous Optimization Using Volunteer Computing Grids

Optimization Method Comparison

[Figure: four plots, one per method: DE best/p/bin, DE rand/p/bin, Particle Swarm, Genetic Search (Simplex & Mutation)]

Page 25: Robust Asynchronous Optimization Using Volunteer Computing Grids

Latency Effects

Is BOINC a good platform for optimization?
• Fast turnaround required to keep populations evolving
• Many slow clients -- are these resources wasted?

Page 26: Robust Asynchronous Optimization Using Volunteer Computing Grids

Operator Examination (1) - BlueGene

Page 27: Robust Asynchronous Optimization Using Volunteer Computing Grids

Operator Examination (2) - BOINC

Page 28: Robust Asynchronous Optimization Using Volunteer Computing Grids

Operator Examination (3) - BOINC

Page 29: Robust Asynchronous Optimization Using Volunteer Computing Grids

Operator Examination (4) - BOINC

Page 30: Robust Asynchronous Optimization Using Volunteer Computing Grids

Operator Examination (5) - BOINC

Page 31: Robust Asynchronous Optimization Using Volunteer Computing Grids

Operator Examination (6) - BOINC

Page 32: Robust Asynchronous Optimization Using Volunteer Computing Grids

Operator Examination (7) - BOINC

Page 33: Robust Asynchronous Optimization Using Volunteer Computing Grids

Partial Verification

Only results that will be inserted into the population need to be verified

BOINC verifies every work unit

Partial Verification:
• Ignore false-negatives (results that won’t be inserted)
• Verify results which potentially improve the search

Page 34: Robust Asynchronous Optimization Using Volunteer Computing Grids

Partial Verification Strategies (2)

Required combining assimilation and validation

Slow validation of good results slows convergence

Strategy:
• Queue potentially good results
• Randomly decide, according to a verification rate, whether to send out work for verification or for optimization
• Prematurely terminate unvalidated results if better results are received -- particularly beneficial for DE & PSO
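The strategy can be sketched as a small queue manager. This is an illustration under several assumptions: the class name `PartialVerifier`, keying pending results by the population slot they would replace, comparing fitness values within a tolerance, and minimization are all choices made for the example and are not taken from the talk or from BOINC.

```python
import random

class PartialVerifier:
    """Hypothetical queue of unverified results (minimization assumed)."""

    def __init__(self, verification_rate, rng, tolerance=1e-9):
        self.rate, self.rng, self.tolerance = verification_rate, rng, tolerance
        self.pending = {}   # slot -> (claimed fitness, params)

    def on_result(self, slot, params, fitness, current_fitness):
        """Queue a result only if it would improve the slot it targets."""
        pending_fit = self.pending.get(slot, (float("inf"), None))[0]
        if fitness >= min(current_fitness, pending_fit):
            return False    # would not be inserted, so it is never verified
        # A better result supersedes (terminates) a weaker unvalidated one.
        self.pending[slot] = (fitness, params)
        return True

    def next_work(self, make_individual):
        """Choose verification or optimization work for the next request."""
        if self.pending and self.rng.random() < self.rate:
            slot = self.rng.choice(sorted(self.pending))
            return "verify", slot, self.pending[slot][1]
        return "optimize", None, make_individual()

    def on_verification(self, slot, params, fitness):
        """Return the (fitness, params) to insert, or None if not confirmed."""
        claimed = self.pending.get(slot)
        if claimed is None or claimed[1] != params:
            return None     # superseded in the meantime
        del self.pending[slot]
        return claimed if abs(claimed[0] - fitness) <= self.tolerance else None
```

A higher verification rate confirms good results sooner at the cost of fewer new evaluations, which is the trade-off varied as v on the Limiting Redundancy slides.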

Page 35: Robust Asynchronous Optimization Using Volunteer Computing Grids

Limiting Redundancy (Genetic Search)

[Figure: three plots: Genetic Search (v = 0.9), Genetic Search (v = 0.6), Genetic Search (v = 0.3)]

Page 36: Robust Asynchronous Optimization Using Volunteer Computing Grids

Limiting Redundancy (PSO)

[Figure: three plots: Particle Swarm (v = 0.9), Particle Swarm (v = 0.6), Particle Swarm (v = 0.3)]

Page 37: Robust Asynchronous Optimization Using Volunteer Computing Grids

Limiting Redundancy (DE best/n/bin)

[Figure: three plots: DE best/n/bin (v = 0.9), DE best/n/bin (v = 0.6), DE best/n/bin (v = 0.3)]

Page 38: Robust Asynchronous Optimization Using Volunteer Computing Grids

Limiting Redundancy (DE rand/p/bin)

[Figure: three plots: DE rand/n/bin (v = 0.9), DE rand/n/bin (v = 0.6), DE rand/n/bin (v = 0.3)]

Page 39: Robust Asynchronous Optimization Using Volunteer Computing Grids

Conclusions

• BOINC is good for optimization
• BOINC’s redundancy is not optimal for optimization
• Global optimization requires lots of tuning
• Verifying results quickly can be especially important for optimization

Page 40: Robust Asynchronous Optimization Using Volunteer Computing Grids

Future Work

DNA@Home:
• Discrete parameter optimization

Generic optimization framework for BOINC

Compare limited verification to BOINC’s verification
• Adaptive verification strategies

Meta-Heuristics

Simulation with Benchmark Test Functions

Page 41: Robust Asynchronous Optimization Using Volunteer Computing Grids


Questions?

Page 42: Robust Asynchronous Optimization Using Volunteer Computing Grids

Thanks!

http://wcl.cs.rpi.edu
http://milkyway.cs.rpi.edu

Work partially supported by:
• NSF AST No. 0607618
• NSF IIS No. 0612213
• NSF MRI No. 0420703
• NSF CAREER CNS Award No. 0448407

Page 43: Robust Asynchronous Optimization Using Volunteer Computing Grids


Extra Slides

Page 44: Robust Asynchronous Optimization Using Volunteer Computing Grids

Search Parameters

• Population Size: 300
• Mutation Rate: 0.3
• Simplex:
  • 1 Child
  • 2 .. 5 Parents
  • Points generated between -1.5 * (worst - centroid) and 1.5 * (worst - centroid)

Page 45: Robust Asynchronous Optimization Using Volunteer Computing Grids


Asynchronous GS-Simplex on BlueGene

Page 46: Robust Asynchronous Optimization Using Volunteer Computing Grids


Asynchronous GS-Simplex on BOINC

Page 47: Robust Asynchronous Optimization Using Volunteer Computing Grids

Simplex Operator Analysis

• Even with a long time to report, results can still improve the population
• Generation near reflection has the highest insert rate
• Generation near centroid provides the most population improvement for fast report times
• Generation near reflection provides the most population improvement for long report times

Page 48: Robust Asynchronous Optimization Using Volunteer Computing Grids


Simplex Operator Improvement (2)

Page 49: Robust Asynchronous Optimization Using Volunteer Computing Grids


Simplex Operator Improvement (3)