
Page 1: Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms

Frank Hutter¹, Youssef Hamadi², Holger Hoos¹, and Kevin Leyton-Brown¹

¹University of British Columbia, Vancouver, Canada
²Microsoft Research Cambridge, UK

Page 2: Motivation: Performance Prediction

• Useful for research in algorithms
  - What makes problems hard?
  - Constructing hard benchmarks
  - Constructing algorithm portfolios (SATzilla)
  - Algorithm design

• Newer applications
  - Optimal restart strategies (see previous talk by Gagliolo et al.)
  - Automatic parameter tuning (this talk)

Page 3: Motivation: Automatic tuning

• Tuning parameters is a pain
  - Many parameters → combinatorially many configurations
  - About 50% of development time can be spent tuning parameters

• Examples of parameters
  - Tree search: variable/value heuristics, propagation, restarts, …
  - Local search: noise, tabu length, strength of escape moves, …
  - CP: modelling parameters + algorithm choice + algorithm params

• Idea: automate tuning with methods from AI
  - More scientific approach
  - More powerful: e.g. automatic per-instance tuning
  - Algorithm developers can focus on more interesting problems

Page 4: Related work

• Performance prediction
  [Lobjois and Lemaître '98; Horvitz et al. '01; Leyton-Brown, Nudelman et al. '02 & '04; Gagliolo & Schmidhuber '06]

• Automatic tuning
  - Best fixed parameter setting for an instance set
    [Birattari et al. '02; Hutter '04; Adenso-Díaz & Laguna '05]
  - Best fixed setting for each instance
    [Patterson & Kautz '02]
  - Changing the search strategy during the search
    [Battiti et al. '05; Lagoudakis & Littman '01 & '02; Carchrae & Beck '05]

Page 5: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models for parametric algorithms

• Automatic tuning based on these

• Ongoing Work and Conclusions

Page 6: Empirical hardness models: basics

• Training: Given a set of t instances inst1, ..., instt
  For each instance insti:
  - Compute instance features xi = (xi1, ..., xim)
  - Run algorithm and record its runtime yi
  Learn function f: features → runtime, such that yi ≈ f(xi) for i = 1, …, t

• Test / practical use: Given a new instance instt+1
  - Compute features xt+1
  - Predict runtime yt+1 = f(xt+1)

Page 7: Which instance features?

• Features should be computable in polytime
  - Basic properties, e.g. #vars, #clauses, clauses-to-vars ratio
  - Graph-based characteristics
  - Local search and DPLL probes

• Combine features to form more expressive basis functions φ = (φ1, ..., φq)
  - Can be arbitrary combinations of the features x1, ..., xm

• Basis functions used for SAT in [Nudelman et al. '04]
  - 91 original features xi
  - Pairwise products of features: xi * xj
  - Feature selection to pick the best basis functions (see the sketch below)
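To make this construction concrete, here is a minimal sketch (hypothetical helper, not the authors' code): the raw features are augmented with all their pairwise products, and feature selection would then prune the expanded set.

```python
import numpy as np

# Minimal sketch of the basis-function expansion described above
# (illustrative, not the authors' implementation): augment the raw
# features x1, ..., xm with all pairwise products xi * xj.
def quadratic_basis(x):
    """x: 1-D array of m raw features -> basis vector (x, all xi * xj)."""
    x = np.asarray(x, dtype=float)
    pairwise = np.outer(x, x)            # m x m matrix of products xi * xj
    iu = np.triu_indices(len(x))         # keep each unordered pair once
    return np.concatenate([x, pairwise[iu]])

# Example: 3 raw features -> 3 + 6 = 9 basis functions
print(quadratic_basis([2.0, 0.5, 1.0]).shape)  # (9,)
```

Feature selection (e.g. forward selection against validation error) would then pick the best subset of these basis functions, as the slide notes.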

Page 8: How to learn the function f: features → runtime?

• Runtimes can vary by orders of magnitude
  - Need to pick an appropriate model
  - Log-transform the output, e.g. runtime of 10³ sec → yi = 3

• Simple functions show good performance
  - Linear in the basis functions: yi ≈ f(φi) = wᵀφi
  - Learning: fit the weights w (ridge regression: w = (λI + ΦᵀΦ)⁻¹Φᵀy; sketch below)
  - Gaussian processes didn't improve accuracy
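A minimal sketch of this learning step, assuming log10-transformed runtimes and the ridge estimator above (the penalty value and helper names are illustrative):

```python
import numpy as np

# Sketch of the learning step: ridge regression on log10 runtimes over
# the basis functions. Phi is a t x k matrix (one row per instance),
# runtimes has length t; lam is an illustrative ridge penalty.
def fit_ridge(Phi, runtimes, lam=1e-2):
    y = np.log10(runtimes)                   # log-transform the output
    k = Phi.shape[1]
    w = np.linalg.solve(lam * np.eye(k) + Phi.T @ Phi, Phi.T @ y)
    return w

def predict_runtime(w, phi):
    return 10 ** (phi @ w)                   # undo the log transform

# Usage with the quadratic_basis() sketch from the previous slide:
# Phi = np.stack([quadratic_basis(x) for x in train_features])
# w = fit_ridge(Phi, train_runtimes)
# y_hat = predict_runtime(w, quadratic_basis(x_new))
```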

Page 9: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models for parametric algorithms

• Automatic tuning based on these

• Ongoing Work and Conclusions

Page 10: EH models for randomized algorithms

• We have incomplete, randomized local search algorithms
  - Can this same approach still predict the run-time? Yes!

• Algorithms are incomplete (local search)
  - Train and test on satisfiable instances only

• Randomized
  - Ultimately, want to predict the entire run-time distribution (RTD)
  - For our algorithms, RTDs are typically exponential
  - Can be characterized by a single sufficient statistic (e.g. median run-time)

Page 11: EH models: basics → sufficient stats for RTD

• Training: Given a set of t instances inst1, ..., instt
  For each instance insti:
  - Compute instance features xi = (xi1, ..., xim)
  - Compute basis functions φi = (φi1, ..., φik)
  - Run algorithm and record its runtime yi
  Learn function f: basis functions → runtime, such that yi ≈ f(φi) for i = 1, …, t

• Test / practical use: Given a new instance instt+1
  - Compute features xt+1
  - Compute basis functions φt+1 = (φt+1,1, ..., φt+1,k)
  - Predict runtime yt+1 = f(φt+1)

Page 12: EH models: basics → sufficient stats for RTD

• Training: Given a set of t instances inst1, ..., instt
  For each instance insti:
  - Compute instance features xi = (xi1, ..., xim)
  - Compute basis functions φi = (φi1, ..., φik)
  - Run algorithm multiple times and record its runtimes yi1, …, yik
  - Fit sufficient statistics si for the distribution from yi1, …, yik
  Learn function f: basis functions → sufficient statistics, such that si ≈ f(φi) for i = 1, …, t

• Test / practical use: Given a new instance instt+1
  - Compute features xt+1
  - Compute basis functions φt+1 = (φt+1,1, ..., φt+1,k)
  - Predict sufficient statistics st+1 = f(φt+1) (see the sketch below)
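A sketch of the extra step for randomized algorithms, under the slide's assumption of exponential RTDs (helper names are hypothetical): the median of repeated runs serves as the sufficient statistic, and the predicted statistic then determines the whole predicted distribution.

```python
import numpy as np

# The sufficient statistic si: median runtime over repeated runs.
def sufficient_statistic(runs):
    return np.median(np.asarray(runs, dtype=float))

# Regression targets are now the statistics rather than single runtimes:
# s = np.array([sufficient_statistic(runs_i) for runs_i in all_runs])
# w = fit_ridge(Phi, s)   # reusing the earlier ridge sketch

# An exponential RTD with median m has rate ln(2)/m, so a predicted
# median yields the full predicted distribution:
def rtd_cdf(t, median):
    lam = np.log(2.0) / median
    return 1.0 - np.exp(-lam * t)   # P(run finishes within time t)
```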

Page 13: Predicting median run-time

[Figures: median runtime of Novelty+ on CV-var; predictions based on single runs vs. based on 100 runs]

Page 14: Structured instances

[Figures: median runtime predictions based on 10 runs; SAPS on QWH and Novelty+ on SW-GCP]

Page 15: Predicting run-time distributions

[Figures: RTD of SAPS on the q0.25 and q0.75 instances of QWH]

Page 16: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models for parametric algorithms

• Automatic tuning based on these

• Ongoing Work and Conclusions

Page 17: EH models: basics → parametric algos

• Training: Given a set of t instances inst1, ..., instt
  For each instance insti:
  - Compute instance features xi = (xi1, ..., xim)
  - Compute basis functions φi = φ(xi)
  - Run algorithm and record its runtime yi
  Learn function f: basis functions → runtime, such that yi ≈ f(φi) for i = 1, …, t

• Test / practical use: Given a new instance instt+1
  - Compute features xt+1
  - Compute basis functions φt+1 = φ(xt+1)
  - Predict runtime yt+1 = f(φt+1)

Page 18: EH models: basics → parametric algos

• Training: Given a set of t instances inst1, ..., instt
  For each instance insti:
  - Compute instance features xi = (xi1, ..., xim)
  - For parameter settings pi1, ..., pini:
    compute basis functions φij = φ(xi, pij) of features and parameter settings
    (quadratic expansion of params, multiplied by instance features; see the sketch below)
  - Run algorithm with each setting pij and record its runtime yij
  Learn function f: basis functions → runtime, such that yij ≈ f(φij) for i = 1, …, t

• Test / practical use: Given a new instance instt+1
  - Compute features xt+1
  - For each parameter setting pj of interest, compute basis functions φt+1,j = φ(xt+1, pj)
  - Predict runtime yt+1,j = f(φt+1,j)
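A sketch of the joint basis described above (hypothetical helper; the authors' exact construction may differ): a quadratic expansion of the parameter setting, multiplied by each instance feature.

```python
import numpy as np

# Joint basis of instance features x and parameter setting p:
# quadratic expansion of p (a constant, p, and all pa * pb terms),
# multiplied by each instance feature. Illustrative only.
def joint_basis(x, p):
    p = np.asarray(p, dtype=float)
    quad = np.outer(p, p)[np.triu_indices(len(p))]    # pa * pb terms
    p_expanded = np.concatenate([[1.0], p, quad])     # 1, p, quadratic terms
    return np.outer(np.asarray(x, dtype=float), p_expanded).ravel()

# Training pairs range over (instance, setting) combinations:
# Phi = np.stack([joint_basis(x_i, p_ij) for (x_i, p_ij) in train_pairs])
# w = fit_ridge(Phi, runtimes)   # same ridge sketch as before
```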

Page 19: Predicting SAPS with different settings

• Train and test with 30 different parameter settings on QWH

• Show 5 test instances, each with a different symbol
  - Easiest, 25% quantile, median, 75% quantile, hardest

• More variation in harder instances

Page 20: One instance in detail

• Note: this is a projection from the 40-dimensional joint feature/parameter space

• Relative relationship predicted well (blue diamonds in previous figure)

Page 21: Automated parameter setting: results

Algo | Data set     | Speedup over default params | Speedup over best fixed params for data set
Nov+ | unstructured | 0.90                        | 0.90
Nov+ | structured   | 257                         | 0.94
Nov+ | mixed        | 15                          | 10
SAPS | unstructured | 2.9                         | 1.05
SAPS | structured   | 2.3                         | 0.98
SAPS | mixed        | 2.31                        | 1

(Slide annotations: "Do you have one?" / "Not the best algorithm to tune ;-)")

Page 22: Results for Novelty+ on Mixed

[Figures: performance compared to the best fixed parameters and to random parameters]

Page 23: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models for parametric algorithms

• Automatic tuning based on these

• Ongoing Work and Conclusions

Page 24: Ongoing work

• Uncertainty estimates
  - Bayesian linear regression vs. Gaussian processes
  - GPs are better at predicting uncertainty

• Active learning
  - For many problems, cannot try all parameter combinations
  - Dynamically choose the best parameter configurations to train on

• Want to try more problem domains (do you have one?)
  - Complete parametric SAT solvers
  - Parametric solvers for other domains (need features)
  - Optimization algorithms

Page 25: Conclusions

• Performance prediction
  - Empirical hardness models can predict the run-times of randomized, incomplete, parameterized local search algorithms

• Automated tuning
  - We automatically find parameter settings that are better than the defaults
  - Sometimes better than the best possible fixed setting

• There's no free lunch
  - Long initial training time
  - Need domain knowledge to define features for a domain (only once per domain)

Page 26: The End

• Thanks to
  - Holger Hoos, Kevin Leyton-Brown, Youssef Hamadi
  - Reviewers for helpful comments
  - You for your attention

Page 27: Backup

Page 28: Experimental setup: solvers

• Two SAT solvers
  - Novelty+ (WalkSAT variant)
    - Adaptive version won the SAT04 random competition
    - Six values for the noise between 0.1 and 0.6
  - SAPS (Scaling and Probabilistic Smoothing)
    - Second in the above competition
    - All 30 combinations of 3 values for α between 1.2 and 1.4 and 10 values for ρ between 0 and 0.9

• Runs cut off after 15 minutes
  - Cutoff is interesting (previous talk), but orthogonal

Page 29: Experimental setup: benchmarks

• Unstructured distributions:
  - SAT04: two generators from the SAT04 competition, random category
  - CV-fix: uf400 with c/v ratio 4.26
  - CV-var: uf400 with c/v ratio between 3.26 and 5.26

• Structured distributions:
  - QWH: quasigroups with holes, 25% to 75% holes
  - SW-GCP: graph colouring based on small-world graphs
  - QCP: quasigroup completion, 25% to 75% holes

• Mixed: union of QWH and SAT04

• All data sets split 50:25:25 for train/valid/test

Page 30: Predicting median run-time

[Figures: median runtime of SAPS on CV-fix; predictions based on single runs vs. based on 100 runs]

Page 31: Automatic tuning

• Algorithm design: new algorithm/application
  - A lot of time is spent on parameter tuning

• Algorithm analysis: comparability
  - Is algorithm A faster than algorithm B only because its developers spent more time tuning it?

• Algorithm use in practice
  - Want to solve MY problems fast, not necessarily the ones the developers used for parameter tuning

Page 32: Examples of parameters

• Tree search
  - Variable/value heuristic
  - Propagation
  - Whether and when to restart
  - How much learning

• Local search
  - Noise parameter
  - Tabu length in tabu search
  - Strength of penalty increase and decrease in DLS
  - Perturbation, acceptance criterion, etc. in ILS

Page 33: Which features are most important?

• Results consistent with those for deterministic tree-search algorithms
  - Graph-based and DPLL-based features
  - Local search probes are even more important here

• Only very few features are needed for good models
  - Previously observed for all SAT data [Nudelman et al. '04]
  - A single quadratic basis function is often almost as good as the best feature subset
  - Strong correlation between features: many choices yield comparable performance

Page 34: Algorithm selection based on EH models

• Given a portfolio of n different algorithms A1, ..., An
  - Pick the best algorithm for each instance, e.g. SATzilla

• Training: Learn n separate functions
  fj: features → runtime of algorithm j

• Test (for each new instance instt+1):
  - Predict runtime yt+1,j = fj(φt+1) for each algorithm
  - Choose the algorithm Aj with minimal yt+1,j (see the sketch below)
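A minimal sketch of this selection rule, assuming one ridge model per algorithm trained as in the earlier sketches:

```python
import numpy as np

# Portfolio selection: one learned model per algorithm; choose the
# algorithm with the smallest predicted runtime on this instance.
# `models` is a list of per-algorithm weight vectors from fit_ridge().
def select_algorithm(models, phi):
    predicted = [10 ** (phi @ w) for w in models]   # predicted runtimes
    return int(np.argmin(predicted))                # index of chosen algorithm

# j = select_algorithm([w_alg1, w_alg2], quadratic_basis(x_new))
```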

Page 35: Experimental setup: solvers

• Two SAT solvers
  - Novelty+ (WalkSAT variant)
    - Default noise setting 0.5 (= 50%) for unstructured instances
    - Noise setting 0.1 used for structured instances
  - SAPS (Scaling and Probabilistic Smoothing)
    - Default setting (α, ρ) = (1.3, 0.8)

Page 36: Results for Novelty+ on Mixed

[Figure: best vs. worst per-instance settings]

Page 37: Related work in automated parameter tuning

• Best default parameter setting for an instance set
  - Racing algorithms [Birattari et al. '02]
  - Local search in parameter space [Hutter '04]
  - Fractional experimental design [Adenso-Díaz & Laguna '05]

• Best parameter setting per instance: algorithm selection / algorithm configuration
  - Estimate size of the DPLL tree for some algorithms, pick the smallest [Lobjois and Lemaître '98]
  - Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. '02 & '04]
  - Auto-WalkSAT [Patterson & Kautz '02]

• Best sequence of operators / changing the search strategy during the search
  - Reactive search [Battiti et al. '05]
  - Reinforcement learning [Lagoudakis & Littman '01 & '02]

Page 38: Parameter setting based on runtime prediction

• Learn a function that predicts runtime from instance features and algorithm parameter settings (as before)

• Given a new instance
  - Compute the features (they are fixed)
  - Search for the parameter setting that minimizes predicted runtime for these features (see the sketch below)
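A sketch of this per-instance tuning loop, reusing the earlier joint_basis() and fit_ridge() sketches; the candidate grid mirrors the SAPS setup from the backup slides (3 α values × 10 ρ values), but any finite candidate set works:

```python
import numpy as np

# Per-instance tuning: evaluate the learned model at every candidate
# parameter setting and return the one with minimal predicted runtime.
def tune_for_instance(w, x, candidate_settings):
    preds = [10 ** (joint_basis(x, p) @ w) for p in candidate_settings]
    return candidate_settings[int(np.argmin(preds))]

# Candidate grid, mirroring the SAPS setup in the backup slides:
candidates = [np.array([alpha, rho])
              for alpha in np.linspace(1.2, 1.4, 3)
              for rho in np.linspace(0.0, 0.9, 10)]
# best = tune_for_instance(w, x_new, candidates)
```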

Page 39: Related work: best default parameters

Find a single parameter setting that minimizes expected runtime for a whole class of problems:

• Generate special-purpose code [Minton '93]
• Minimize estimated error [Kohavi & John '95]
• Racing algorithm [Birattari et al. '02]
• Local search [Hutter '04]
• Experimental design [Adenso-Díaz & Laguna '05]
• Decision trees [Srivastava & Mediratta '05]

Page 40: Related work: per-instance selection

Examine the instance, choose the algorithm that will work well for it:

• Estimate size of the DPLL search tree for each algorithm [Lobjois and Lemaître '98]
• [Sillito '00]
• Predict runtime for each algorithm [Leyton-Brown, Nudelman et al. '02 & '04]

Page 41: Performance Prediction

• Vision: situational awareness in algorithms
  - When will the current algorithm be done?
  - How good a solution will it find?

• A first step: instance-aware algorithms
  - Before you start: how long will the algorithm take?
  - Randomized → whole run-time distribution
  - For different parameter settings: can pick the one with the best predicted performance

Page 42: Automated parameter setting: results (old)

Algo | Data set     | Speedup over default params | Speedup over best fixed params for data set
Nov+ | unstructured | 0.89                        | 0.89
Nov+ | structured   | 177                         | 0.91
Nov+ | mixed        | 13                          | 10.72
SAPS | unstructured | 2                           | 1.07
SAPS | structured   | 2                           | 0.93
SAPS | mixed        | 1.91                        | 0.93

Page 43: Results for Novelty+ on Mixed (old)

[Figures: performance compared to the best fixed parameters and to random parameters]