
Page 1

Data-driven Meta-heuristic Search

Ke Tang

USTC-Birmingham Joint Research Institute in Intelligent Computation and Its Applications (UBRI) School of Computer Science and Technology

University of Science and Technology of China

December 2014 @ CityU of Hong Kong

Page 2

Outline

•  A data-driven perspective on Meta-heuristic Search
•  Speciation in DDMS
•  Algorithm Selection in DDMS
•  Identification of Interacting Decision Variables in DDMS
•  Summary

Page 3

Outline

•  A data-driven perspective on Meta-heuristic Search
•  Speciation in DDMS
•  Algorithm Selection in DDMS
•  Identification of Interacting Decision Variables in DDMS
•  Summary

Page 4

A data-driven perspective on MS

•  There are a lot of famous meta-heuristic search (MS) methods:
–  Simulated Annealing
–  Tabu Search
–  Scatter Search
–  Genetic Algorithms
–  Evolution Strategies
–  Evolutionary Programming
–  Particle Swarm Optimizer
–  Ant Colony Optimization
–  Differential Evolution
–  Estimation of Distribution Algorithms
–  etc.

Page 5

A data-driven perspective on MS

•  Despite their different historical backgrounds, most MS methods share a similar framework, i.e., they are Stochastic Generate-and-Test algorithms.

•  An MS method iteratively samples in a solution space, and thus can be viewed as a data-generating process.

(Figure: sampling produces a data table whose rows are individuals 1 … n and whose columns are x1 … xD and fitness.)

Page 6

A data-driven perspective on MS

•  The data contain a lot of information, such as:
–  The candidate solutions (individuals)
–  Their corresponding fitness
–  The "origin" of an individual (e.g., which operator was applied to which parents to generate it)

•  Data-driven Meta-heuristic Search (DDMS): exploiting the data generated by an MS during its search process to enhance the performance of the MS.

Page 7

A data-driven perspective on MS

•  Many existing works can be interpreted as DDMS:
–  Surrogate-Assisted Evolutionary Algorithms
–  Many parameter adaptation/self-adaptation schemes

•  Key questions of DDMS:
–  Q1: Why will an MS benefit from historical data?
–  Q2: What information is to be extracted from the data?
–  Q3: How should the required information be extracted from the data?

Page 8

A data-driven perspective on MS

•  The answers to Q1 and Q2 define the specific data analytics problem that needs to be addressed.

•  Data analytics problems in DDMS are likely to be intractable (e.g., NP-hard) themselves, or they may introduce substantial computational overhead. (No free lunch)

•  Hence, the trade-off between the overhead and the benefit of using historical data should always be kept in mind when answering Q3.

Page 9

Outline

•  A data-driven perspective on Meta-heuristic Search
•  Speciation in DDMS
•  Algorithm Selection in DDMS
•  Identification of Interacting Decision Variables in DDMS
•  Summary

Page 10

Speciation in DDMS

•  Challenges brought by a multimodal problem:
–  There might be more than one optimum that are (roughly) equally good.
–  The task becomes finding multiple optima of the problem.

•  Why?
–  Provides the user with a range of choices (more informed decisions)
–  Reveals insights into the problem (inspires innovations)

Page 11

Speciation in DDMS

•  When employing EAs to find multiple optima, a procedure called speciation is usually required.

•  Speciation: partitioning a population into a few species.
–  Niche: a region of attraction on the fitness landscape
–  Species: a group of individuals occupying the same niche
–  Species seed: the best (fittest) individual of a species

Page 12

Speciation in DDMS

•  A typical speciation procedure

Page 13

Speciation in DDMS

•  Most speciation methods rely on a sub-algorithm to determine whether two individuals are of the same species.

•  Speciation methods:
–  Distance-based: determines whether two individuals are of the same species according to their distance.
–  Topology-based: determines whether two individuals are of the same species according to the fitness landscape topography.

(Diagram: speciation methods split into distance-based and topology-based; the topology-based branch includes Hill-Valley and Recursive Middling.)

Page 14

Distance-based Speciation

•  Two individuals are assigned to the same species if their distance is smaller than a predefined threshold called the niche radius. (A minimal sketch follows below.)

•  Introduces an additional parameter (the niche radius), which is difficult to tune.

•  Makes strong assumptions, i.e., equally sized and spherically shaped niches.

Page 15

Topology-based Speciation

•  Topology-based methods:
–  Hill-Valley (HV)
–  Recursive Middling (RM)

•  Make weaker assumptions than distance-based speciation.

•  Sample new points in order to capture the landscape topography:
–  When more FEs are spent on speciation, fewer are available for the evolutionary algorithm to converge.
–  Not very attractive, especially when fitness evaluation is costly. (See the Hill-Valley sketch below.)
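A minimal sketch of the Hill-Valley test as commonly described in the literature: two points are judged to share a niche if no sampled interior point on the segment between them is worse than both endpoints. The evenly spaced sampling positions are an assumption for illustration.

```python
import numpy as np

def hill_valley(a, b, f, n_samples=3):
    """Return True if a and b appear to occupy the same niche (maximization).

    Evaluates n_samples interior points on the segment ab; each
    evaluation costs one fitness evaluation (FE) -- the overhead
    criticized on this slide.
    """
    threshold = min(f(a), f(b))
    for t in np.linspace(0, 1, n_samples + 2)[1:-1]:   # interior points only
        if f(a + t * (b - a)) < threshold:             # a valley lies between them
            return False
    return True
```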

Page 16

History-based Topological Speciation

•  Research question: could topology-based speciation be FE-free, so that its benefits can be better appreciated?

•  Approach: History-Based Topological Speciation (HTS)

•  Captures the landscape topography exclusively from the search history.

Page 17

History-based Topological Speciation

•  Topology-based speciation methods can be interpreted from the perspective of a sequence of points along the segment ab between two individuals:
–  The segment is an infinite sequence of points and cannot be tested directly.
–  RM "approximates" it by sampling a few points on ab.

•  Basic idea of HTS: approximate the sequence using only history data/points.

•  What is a "good" approximation? (Figure: examples of bad and good approximations of a segment by history points.)

Page 18

History-based Topological Speciation

•  Conceptually, HTS follows a two-step procedure:
1.  Construct a finite discrete approximate sequence.
2.  Test the approximate sequence to reach a final decision (trivial).

•  More formally, the problem of finding the best approximation can be stated as an optimization problem over candidate sequences of history points. (An illustrative sketch of step 1 follows.)
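A toy sketch of the flavor of step 1, under my own illustrative assumption that a candidate approximation consists of archived points lying close to the segment ab, ordered by their projection onto it; the actual criterion HTS optimizes is given in the paper cited on page 25.

```python
import numpy as np

def approximate_segment(a, b, archive, eps):
    """Pick archived points near segment ab, ordered along it.

    archive: (n, D) array of previously evaluated points.
    eps: how far from the segment a point may lie (illustrative knob).
    Returns indices into archive forming the approximate sequence.
    """
    d = b - a
    t = np.clip((archive - a) @ d / (d @ d), 0.0, 1.0)   # projection onto ab
    dist = np.linalg.norm(archive - (a + t[:, None] * d), axis=1)
    near = np.where(dist <= eps)[0]
    return near[np.argsort(t[near])]                     # order along the segment
```

Step 2 would then apply a hill-valley-style test to the fitnesses of the returned sequence, at zero additional FEs, since every point in the archive was already evaluated.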

Page 19

History-based Topological Speciation

Page 20

History-based Topological Speciation

Page 21

HTS: Experiments

Page 22

HTS: Experiments

•  Different methods are integrated into the same evolutionary framework for comparison:
–  Crowding Differential Evolution with Species Conservation

•  Benchmark functions:
–  F1-F6: six two-dimensional functions with various properties (number of optima: 2-10)
–  F7-F10: MMP functions in 4, 8, 16, and 32 dimensions, respectively (number of optima: 48)
–  F11: a composition of 50 random 32-dimensional shifted rotated ellipsoidal sub-functions coupled via the max operator (number of optima: 50)

•  The goal is to find all optima of the benchmark functions.

Page 23

HTS: Experiments

•  Performance measure: the distance error of the last generation is used to measure the performance of the algorithm.

•  Win/Draw/Lose of HTS versus every other method:
–  Both t-tests and Wilcoxon rank-sum tests are used.
–  A difference is considered statistically significant if it is asserted by both tests at the 0.05 significance level.
–  A draw is counted when no statistically significant difference is observed.

    DIS-    DIS    DIS+   HV1    HV3    HV5    RM     RM*
    9/0/2   2/5/4  4/5/2  9/0/2  9/0/2  9/0/2  9/0/2  1/3/7

Page 24

Summary of HTS

•  Q1: Why will an MS benefit from historical data?
A1: Because species can be formed without setting any parameter or consuming any additional FEs.

•  Q2: What information is to be extracted from the data?
A2: A few clusters of individuals (or a species tag for each individual).

•  Q3: How should the required information be extracted from the data?
A3: By finding an approximation to a line segment using only previously evaluated candidate solutions.

•  Representation of data: individuals and their fitness (e.g., an n-by-(D+1) matrix).

•  The generalization issue is not required/considered.

Page 25

For more details


•  L. Li and K. Tang, “History-Based Topological Speciation for Multimodal Optimization,” IEEE Transactions on Evolutionary Computation, in press (Early Access).

•  P. Yang, K. Tang and X. Lu, “Improving Estimation of Distribution Algorithm on Multi-modal Problems by Detecting Promising Areas,” IEEE Transactions on Cybernetics, accepted on 22 August 2014.

Page 26

Outline

•  A data-driven perspective on Meta-heuristic Search
•  Speciation in DDMS
•  Algorithm Selection in DDMS
•  Identification of Interacting Decision Variables in DDMS
•  Summary

Page 27

PAP - Background

•  A scenario frequently encountered in the real world:
–  A number of optimization problems
–  A time budget T
–  A number of optimization algorithms (e.g., GA, ES, EP, EDA, DE, PSO, …)

•  We want to obtain the best (or as good as possible) solutions for all the problems within T.

Page 28

PAP - Background

•  Intuitively, the total time budget T can be used for two purposes:
(1) to identify the best algorithm
(2) to search for the best solution

•  In general, the more time we spend on (2), the better the solutions we will achieve.

•  Different problems may favor different algorithms, and finding the best algorithm for a problem can be very time-consuming.

Page 29

PAP - Background

General thoughts:

•  Arbitrarily pick an algorithm for every problem?
–  T will solely be used to search for solutions, but this is too risky.

•  Carefully identify the best algorithm for each problem?
–  A lot of time will be used for algorithm selection; the time left for searching for good solutions might be insufficient.

•  Try to find a single algorithm suitable for all problems?
–  Sounds like a good trade-off, but the advantages of having a set of different algorithms are not fully utilized.

Page 30

PAP - Background

•  How about establishing a good "portfolio" of algorithms (i.e., a combination of multiple algorithms) for all problems?

Advantages:

•  Makes use of the advantages of different algorithms, rather than putting all the eggs (time) into a single basket (algorithm).

•  Hopefully not too time-consuming, since only one portfolio is needed for all problems.

Page 31

PAP - Background

•  Algorithm Portfolios "invest" limited time in multiple algorithms, to fully utilize the advantages of these algorithms and maximize the expected utility of a problem-solving episode.

•  Analogy to economics: one allocates money to different financial assets (stocks, bonds, etc.) in order to maximize the expected returns while minimizing risks.

•  Population-based Algorithm Portfolios (PAP):
–  Conceptually similar to Algorithm Portfolios
–  Aims to solve a set of problems rather than a single one
–  Focuses on population-based algorithms (e.g., EAs)

Page 32

PAP - Background

•  The general framework of PAP:
1.  Select the constituent algorithms from a pool of candidate algorithms.
2.  Construct a concrete PAP instantiation with the constituent algorithms.
3.  Apply the PAP instantiation to each problem.
4.  Output the best solution obtained for each problem.

Page 33

PAP - Background

Which candidate algorithms should serve as constituent algorithms depends on the way a PAP instantiation is built:

•  A PAP instantiation maintains multiple sub-populations.

•  Each sub-population is evolved with a constituent algorithm.

•  Information is shared among sub-populations by periodically activating a migration scheme.

Page 34

PAP - Background

Pseudo-code of a PAP instantiation:

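The pseudo-code figure did not survive this transcript. Below is a minimal Python sketch of a PAP instantiation as described on page 33 (multiple sub-populations, one constituent algorithm each, periodic migration); the interfaces, the migrate-best/replace-worst policy, and all names are assumptions for illustration.

```python
import random

def run_pap(problem, algorithms, max_gen, migration_interval=None):
    """Sketch of a PAP instantiation (minimization assumed).

    algorithms: constituent EAs; each is assumed to offer
    init_population() and step(pop, problem) -> pop (one generation).
    """
    if migration_interval is None:
        migration_interval = max(1, max_gen // 20)   # MAX_GEN/20, as on page 41
    subpops = [alg.init_population() for alg in algorithms]
    for gen in range(1, max_gen + 1):
        # Evolve each sub-population with its own constituent algorithm.
        subpops = [alg.step(pop, problem) for alg, pop in zip(algorithms, subpops)]
        if gen % migration_interval == 0:            # periodic migration, size 1
            for i, pop in enumerate(subpops):
                best = min(pop, key=problem.fitness)
                j = random.choice([t for t in range(len(subpops)) if t != i])
                worst = max(range(len(subpops[j])),
                            key=lambda t: problem.fitness(subpops[j][t]))
                subpops[j][worst] = best             # replace the target's worst
    return min((ind for pop in subpops for ind in pop), key=problem.fitness)
```

The migration settings mirror those reported on page 41 (interval MAX_GEN/20, size 1).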

Page 35

EPM-PAP

On Choosing Constituent Algorithms

•  Let F = {f_k | k = 1, 2, …, n} be a given problem set and A = {a_j | j = 1, 2, …, m} be a set of candidate EAs. Choosing constituent algorithms for PAP is formulated as seeking the subset Ã = {a_i | i = 1, 2, …, l} of A that leads to the best overall performance on F:

\tilde{A}_{opt} = \arg\max_{\tilde{A} \subseteq A} U(\tilde{A}, F, T)

•  The most straightforward approach: enumerate all possible subsets and employ a procedure like statistical racing to find the best one. This is even more time-consuming than selecting a single algorithm!

Page 36

EPM-PAP

•  Recall that we expect a good PAP instantiation to under-perform a candidate EA (say, a_j) only with small probability.

•  Assuming independence between constituent algorithms, the above statement can be written for algorithm a_j on problem f_k as:

R_{jk} = \prod_{i=1}^{l} (1 - P_{i,jk})

•  Averaging over all problems and all candidate EAs, we get:

R = \frac{1}{mn} \sum_{j=1}^{m} \sum_{k=1}^{n} \prod_{i=1}^{l} (1 - P_{i,jk})     (1)

Page 37

EPM-PAP

What is an Estimated Performance Matrix (EPM)?

•  A matrix that records the performance of each candidate EA.

•  For each aj, the corresponding EPM, denoted by EPMj, is an r-by-n matrix.

•  This matrix can be obtained by running aj on each of the n problems for r times.

•  Each element of EPMj is the objective value of the best solution that aj obtained on a problem in a single run.

•  Since each element of EPMj is obtained with a small portion of T, it can be viewed as a conservative estimate of the solution quality achieved by running aj with T on the same problem.
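A minimal sketch of constructing one EPM, assuming each candidate EA exposes a run(problem, budget) method that returns the best objective value of a single run; this interface is an assumption for illustration.

```python
import numpy as np

def build_epm(algorithm, problems, r, budget_per_run):
    """Build the r-by-n EPM for one candidate EA.

    Element [row, k] is the best objective value found by one
    independent run of the algorithm on problems[k].
    """
    n = len(problems)
    epm = np.empty((r, n))
    for k, problem in enumerate(problems):
        for row in range(r):
            epm[row, k] = algorithm.run(problem, budget_per_run)
    return epm
```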

Page 38

EPM-PAP

•  With the help of some statistical tests, EPMs provide all the information needed to calculate Eq. (1).

Good news:

•  No need to compare the performance of all possible subsets with a tedious procedure like statistical racing.

•  Estimating the performance of each single candidate EA is sufficient for selecting the constituent algorithm subset.

Page 39

EPM-PAP

Detailed steps for Choosing Constituent Algorithms

1.  Apply each candidate EA a_j to each problem for r independent runs. The final population obtained in each run is stored.

2.  Construct the EPM for each a_j based on the quality of the best solution obtained in each run.

3.  Enumerate all possible subsets of A and calculate the corresponding R for each using Eq. (1) and the EPMs.

4.  Select the subset with the smallest R as the constituent algorithms for PAP. (A sketch of steps 3-4 follows.)
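A sketch of steps 3 and 4, under my own simplifying assumption that P_{i,jk} is estimated as the fraction of run pairs in which constituent a_i's result is at least as good as candidate a_j's; the papers cited on page 46 use statistical tests for this estimate.

```python
from itertools import combinations
import numpy as np

def estimate_p(epm_i, epm_j, k):
    """P_{i,jk}: estimated probability that a_i does at least as well as
    a_j on problem k (minimization), from the r results of each."""
    runs_i, runs_j = epm_i[:, k], epm_j[:, k]
    return np.mean(runs_i[:, None] <= runs_j[None, :])

def select_subset(epms, l):
    """Return the index set of size l minimizing R of Eq. (1).

    epms: list of r-by-n EPMs, one per candidate EA.
    """
    m, n = len(epms), epms[0].shape[1]
    best_subset, best_r = None, float("inf")
    for subset in combinations(range(m), l):
        r_val = 0.0
        for j in range(m):                 # candidate EA a_j
            for k in range(n):             # problem f_k
                prod = 1.0
                for i in subset:           # constituent a_i
                    prod *= 1.0 - estimate_p(epms[i], epms[j], k)
                r_val += prod
        r_val /= m * n
        if r_val < best_r:
            best_subset, best_r = subset, r_val
    return best_subset
```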

Page 40

EPM-PAP: Experiments

•  4 candidate EAs: CMA-ES, G3PCX, SaNSDE, wPSO

•  Benchmark problems:
–  13 numerical problems from a classical benchmark suite
–  14 numerical problems from the CEC2005 benchmark suite
–  Dimension: 30

Page 41

EPM-PAP: Experiments

•  Total fitness evaluations (FEs) for each problem: 400000, 800000, and 1200000 (time budgets T1, T2, and T3), respectively.

•  25 independent runs on each problem.

•  For convenience of implementation, all constituent algorithms of a PAP instantiation evolve for the same number of generations.

•  Parameters of the constituent algorithms are not fine-tuned.

•  migration_interval = MAX_GEN/20, migration_size = 1

•  PAPs with 2 and 3 constituent algorithms are considered.

Page 42

EPM-PAP: Experiments

•  Wilcoxon test results (significance level 0.05); "w-d-l" stands for "win-draw-lose":

              Budget   SaNSDE   wPSO     G3PCX   CMA-ES   F-Race    Intra-AOTA
  EPM-PAP-2   T1       8-14-5   17-10-0  21-6-0  8-13-6   9-14-4    6-15-6
              T2       7-14-6   16-10-1  20-7-0  9-14-4   7-15-5    5-18-4
              T3       6-15-6   17-9-1   21-6-0  10-14-3  7-14-6    6-18-3
  EPM-PAP-3   T1       9-11-7   19-7-1   21-5-1  10-10-4  10-13-4   5-17-5
              T2       8-17-2   17-9-1   20-7-0  9-12-6   9-12-6    5-20-2
              T3       9-16-2   17-10-0  21-6-0  9-14-4   9-14-4    6-20-1

Page 43

EPM-PAP: Experiments

•  Performance ranking of all possible EPM-PAP-2 and EPM-PAP-3 instantiations:

With 2 constituent algorithms:

  Rank   Budget T1          Budget T2          Budget T3
  1      SaNSDE + CMA-ES    SaNSDE + CMA-ES    SaNSDE + CMA-ES
  2      wPSO + CMA-ES      wPSO + CMA-ES      wPSO + CMA-ES
  3      SaNSDE + wPSO      SaNSDE + wPSO      SaNSDE + wPSO
  4      SaNSDE + G3PCX     SaNSDE + G3PCX     SaNSDE + G3PCX
  5      G3PCX + CMA-ES     G3PCX + CMA-ES     G3PCX + CMA-ES
  6      wPSO + G3PCX       wPSO + G3PCX       wPSO + G3PCX

With 3 constituent algorithms:

  Rank   Budget T1              Budget T2              Budget T3
  1      SaNSDE+wPSO+CMA-ES     SaNSDE+wPSO+CMA-ES     SaNSDE+wPSO+CMA-ES
  2      SaNSDE+G3PCX+CMA-ES    SaNSDE+G3PCX+CMA-ES    SaNSDE+G3PCX+CMA-ES
  3      SaNSDE+wPSO+G3PCX      SaNSDE+wPSO+G3PCX      wPSO+G3PCX+CMA-ES
  4      wPSO+G3PCX+CMA-ES      wPSO+G3PCX+CMA-ES      SaNSDE+wPSO+G3PCX

Page 44

EPM-PAP: Experiments

•  Success rates of the EPM-based selection procedure: how likely did it select the best constituent algorithm subset?

              Budget   SR1    SR2
  EPM-PAP-2   T1       40%    88%
              T2       56%    100%
              T3       72%    100%
  EPM-PAP-3   T1       16%    84%
              T2       36%    88%
              T3       56%    100%

Page 45

Summary of EPM-PAP

•  Q1: Why will an MS benefit from historical data?
A1: Because a better subset of algorithms can be identified.

•  Q2: What information is to be extracted from the data?
A2: An m-dimensional binary vector (marking which candidate EAs are selected).

•  Q3: How should the required information be extracted from the data?
A3: Invest additional FEs to accumulate statistically meaningful estimates of the algorithms' performance.

•  Representation of data: the quality of the solutions found by each candidate algorithm on each problem (i.e., the EPMs).

•  It is implicitly assumed that performance achieved with a small number of FEs generalizes to cases with a larger number of FEs.

Page 46

For more details


•  F. Peng, K. Tang, G. Chen and X. Yao, “Population-based Algorithm Portfolios for Numerical Optimization,” IEEE Transactions on Evolutionary Computation, 14(5): 782-800, October 2010.

•  K. Tang, F. Peng, G. Chen and X. Yao, “Population-based Algorithm Portfolios with automated constituent algorithms selection,” Information Sciences, 279: 94-104, September 2014.

Page 47

Outline

•  A data-driven perspective on Meta-heuristic Search
•  Speciation in DDMS
•  Algorithm Selection in DDMS
•  Identification of Interacting Decision Variables in DDMS
•  Summary

Page 48

Background

•  Although EAs have achieved great success in the domain of optimization, most reported studies are based on small-scale problems (e.g., numerical optimization with fewer than 100 decision variables).

•  Most existing EAs suffer from the "curse of dimensionality".

•  On the other hand, large-scale problems have emerged in many areas.

Page 49

An example

•  Bird's Nest (China & Switzerland)

•  The irregular ordering of the beams posed an insoluble problem for the then-current CAD tools.

Page 50

Large Scale Optimization Problems

•  Research target of LSGO: to scale EAs up to problems at least one order of magnitude larger than the state-of-the-art (i.e., with about 1000 variables).

•  What makes large-scale problems difficult?
–  The solution space often increases exponentially with problem dimensionality.
–  Problem complexity may increase with dimensionality, e.g., the number of local optima.
–  Candidate search directions often increase exponentially, so EAs might fail to find the promising ones.

Page 51

EACC-G

•  Basic (and old) idea: divide-and-conquer.

•  Cooperative Coevolution (CC) is an ideal approach for implementing the idea:
–  Decompose the objective problem into several sub-problems;
–  Evolve each sub-problem separately using EAs;
–  Combine the solutions to all sub-problems to form the solution to the original problem.

•  By "decompose", we mean dividing the D decision variables into a few groups.

Page 52

EACC-G

•  More formally speaking…

•  The above approach is named EACC-G; it involves a predefined number of cycles.

•  Each cycle consists of the following steps:
–  Split the D decision variables into m groups, each containing s variables.
–  Optimize each sub-problem with an EA.
–  Solutions to each sub-problem are evaluated in combination with the best solutions obtained for the other sub-problems.

Page 53

EACC-G

•  The key question: how to decompose?

•  If a problem contains a nonseparable component, we say the decision variables in this component are interacting variables.

•  Intuitively, interacting variables should be grouped together by the decomposition procedure.

•  The simplest way to decompose is to group the decision variables randomly.
–  Sounds too straightforward to work properly.
–  But not as "silly" as it seems.

Page 54

EACC-G

Benefit of Random Grouping

•  The probability of EACC-G assigning two interacting variables x_i and x_j into the same group for at least k cycles is:

P_k = \sum_{r=k}^{N} \binom{N}{r} \left(\frac{1}{m}\right)^{r} \left(1 - \frac{1}{m}\right)^{N-r}

where N is the number of cycles and m is the number of groups.

•  For example, given a 1000-D problem, when m = 10: P_1 = 0.9948, P_2 = 0.9662.

•  Even the simple random grouping strategy has some chance to group two interacting variables together. (The quoted probabilities are checked numerically below.)
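The quoted numbers can be reproduced with a few lines of Python; N = 50 cycles is the setting that yields exactly P_1 = 0.9948 and P_2 = 0.9662 for m = 10 (an inference from the numbers themselves, since the slide does not state N).

```python
from math import comb

def p_at_least_k(k, n_cycles, m):
    """P_k: probability that two interacting variables share a group
    in at least k of n_cycles random-grouping cycles (m groups)."""
    q = 1.0 / m                      # chance of sharing a group in one cycle
    return sum(comb(n_cycles, r) * q**r * (1 - q)**(n_cycles - r)
               for r in range(k, n_cycles + 1))

print(round(p_at_least_k(1, 50, 10), 4))  # 0.9948
print(round(p_at_least_k(2, 50, 10), 4))  # 0.9662
```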

Page 55

EACC-G

•  With the random grouping scheme, each cycle of EACC-G becomes (see the sketch below):
–  Randomly split the D decision variables into m groups, each containing s variables.
–  Optimize each sub-problem with an EA.
–  Solutions to each sub-problem are evaluated in combination with the best solutions obtained for the other sub-problems.

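A minimal sketch of one such cycle, assuming a hypothetical sub_optimize(sub_f, dim) helper that runs an EA on the selected coordinates while the remaining variables stay fixed at the current best solution; the helper and all names are illustrative only.

```python
import numpy as np

def eaccg_cycle(f, best, m, rng):
    """One EACC-G cycle with random grouping (minimization sketch).

    f: objective over full D-dimensional vectors; best: current best
    solution (length-D array); m: number of groups;
    rng: e.g. numpy.random.default_rng().
    """
    D = best.size
    perm = rng.permutation(D)
    groups = np.array_split(perm, m)        # m groups of about s = D/m variables
    for idx in groups:
        # Optimize only the variables in idx; evaluation plugs the
        # candidate values into a copy of the current best solution.
        def sub_f(values, idx=idx):
            x = best.copy()
            x[idx] = values
            return f(x)
        best[idx] = sub_optimize(sub_f, dim=len(idx))   # hypothetical EA helper
    return best
```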

Page 56

Experimental Studies

•  Test suite: 13 minimization problems (1000-dimensional).

•  Baseline: applying Differential Evolution (DE) to each problem directly.

•  DECC-G: the CC framework using DE as the basic optimizer.

•  The number of FEs was set to 5e+06 for all algorithms.

•  Results of 25 independent runs were collected for each problem.

Page 57

Experimental Studies

Results (unimodal): comparison between DECC-G and SaNSDE on functions f1-f7 (unimodal), with dimension D = 1000, averaged over 25 runs.

        Dim    SaNSDE     DECC-G
  f1    1000   6.97E+00   2.17E-25
  f2    1000   1.24E+00   5.37E-14
  f3    1000   6.43E+01   3.71E-23
  f4    1000   4.99E+01   1.01E-01
  f5    1000   3.31E+03   9.87E+02
  f6    1000   3.93E+03   0.00E+00
  f7    1000   1.18E+01   8.40E-03

Page 58

Experimental Studies

Results (multimodal): comparison between DECC-G and SaNSDE on functions f8-f13 (multimodal), with dimension D = 1000, averaged over 25 runs.

        Dim    SaNSDE     DECC-G
  f8    1000   -372991    -418983
  f9    1000   8.69E+02   3.55E-16
  f10   1000   1.12E+01   2.22E-13
  f11   1000   4.80E-01   1.01E-15
  f12   1000   8.97E+00   6.89E-25
  f13   1000   7.41E+02   2.55E-21

Page 59

Drawbacks of Random Decomposition

•  The group size needs to be predefined, which is rather difficult.

•  All groups are assumed to be of the same size, which is probably unreasonable.

•  The nature of random grouping limits the chance of categorizing all interacting variables into the same group.

Page 60

Variable Interaction Learning

•  A bottom-up grouping approach:
1.  Start by treating each decision variable as a group.
2.  Learn the interactions between variables.
3.  Merge interacting variables/groups into the same group.
4.  Go to step 2 until a stopping criterion is met.

•  Benefits:
–  No need to specify the number of groups.
–  Groups can be of different sizes.
–  Once the learning phase finishes, there is no need to re-group the decision variables.

Page 61

Variable Interaction Learning

•  How to learn the interactions?
–  If two solution vectors x and x' differ only in the i-th dimension, and the i-th and j-th decision variables are NOT interacting,
–  then changing the value of the j-th decision variable (in both vectors) will NOT change the relative order of f(x) and f(x').

•  Hence, we may say that the i-th and j-th variables are interacting if the following condition holds: for some x and x' differing only in the i-th dimension, setting the j-th variable of both to a common new value (giving x̄ and x̄') flips the relative order, i.e.,

f(x) < f(x')  but  f(x̄) > f(x̄')

•  Every interaction learned by this mechanism is correct.

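A sketch of this pairwise test written as a standalone predicate; the perturbation values delta_i and delta_j and the four-evaluation probing scheme are illustrative assumptions (CCVIL embeds the check in its learning cycles).

```python
def interacts(f, x, i, j, delta_i, delta_j):
    """Test whether variables i and j interact at point x (costs 4 FEs).

    Compares f at x and at x with x_i perturbed; if changing x_j (in
    both vectors) flips the relative order, an interaction is detected.
    A True result is always correct; False may just mean "not detected
    at this point".
    """
    x_a = x.copy(); x_b = x.copy()
    x_b[i] += delta_i                      # the two vectors differ only in dimension i
    before = f(x_a) < f(x_b)
    x_a[j] += delta_j; x_b[j] += delta_j   # same change to dimension j in both
    after = f(x_a) < f(x_b)
    return before != after                 # order flipped => interaction
```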

Page 62

Variable Interaction Learning

Page 63

Variable Interaction Learning

CCVIL: A Two-stage Algorithm (Cooperative Coevolution with Variable Interaction Learning)

1.  Initialization: randomly initialize a population of solutions, and randomly choose an individual from the population.

2.  Learning stage: repeat a number of learning cycles; each learning cycle consists of the following steps:
(1) Randomly permute the sequence of decision variables.
(2) Scan over the permuted sequence and check the interaction between each pair of successive variables. If evidence of interaction is discovered, mark the two variables as "belonging to the same group". (A sketch of this cycle follows.)

3.  Optimization stage:
(1) Categorize the decision variables according to the information obtained in the learning stage.
(2) Solve the problem using the CC framework.
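A sketch of the learning-cycle bookkeeping, reusing the interacts() predicate from page 61 and a simple union-find to maintain groups; the union-find is my own illustrative choice, and the paper's bookkeeping differs in detail.

```python
def learning_cycle(f, x, parent, rng, delta=1.0):
    """One CCVIL-style learning cycle.

    parent: union-find array, initialized as list(range(D));
    rng: e.g. numpy.random.default_rng().
    """
    def find(v):                           # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    perm = rng.permutation(len(x))
    for a, b in zip(perm[:-1], perm[1:]):  # successive pairs in the permutation
        if interacts(f, x, a, b, delta, delta):
            parent[find(a)] = find(b)      # merge the two groups
```

After the learning stage, variables sharing a union-find root form one group for the optimization stage.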

Page 64

Variable Interaction Learning

No Free Lunch: The Learning Overhead

•  The learning stage costs FEs, so a trade-off between learning and evolution (optimization) needs to be set.

•  An appropriate setting of the learning cycles can deal with both separable and non-separable functions.

•  Termination conditions for the learning stage:
–  If no interaction has been learned after Kˇ cycles, the function is treated as separable and the learning stage terminates.
–  If any interaction has been learned before reaching Kˇ cycles, the function is treated as non-separable. In this case, the learning stage only stops when:
•  all N dimensions have been combined into one group, or
•  60% of the FEs have been consumed by the learning stage.

Page 65

Experimental Studies

Experimental Results

Page 66

Summary of VIL

•  Q1: Why will an MS benefit from historical data?
A1: Because interacting variables will be more likely to be grouped together.

•  Q2: What information is to be extracted from the data?
A2: A binary "interaction" matrix (D-by-D).

•  Q3: How should the required information be extracted from the data?
A3: Invest additional FEs to perform tests between variables.

•  Representation of data: individuals and their fitness (e.g., an n-by-(D+1) matrix).

•  The generalization issue is not required/considered.

Page 67

For more details


•  Z. Yang, K. Tang and X. Yao, “Large Scale Evolutionary Optimization Using Cooperative Coevolution,” Information Sciences, 178(15): 2985-2999, 2008.

•  W. Chen, T. Weise, Z. Yang and K. Tang, “Large-Scale Global Optimization using Cooperative Coevolution with Variable Interaction Learning,” in Proceedings of the 11th International Conference on Parallel Problem Solving From Nature (PPSN), Kraków, Poland, September 11–15, 2010, pp. 300–309.

Page 68

Outline

•  A data-driven perspective on Meta-heuristic Search
•  Speciation in DDMS
•  Algorithm Selection in DDMS
•  Identification of Interacting Decision Variables in DDMS
•  Summary

Page 69

Summary

•  Data-driven MS makes use of data analytics approaches to gain useful information from the data generated during search.

•  Three examples of DDMS have been introduced.

•  Different contexts in MS may induce significantly different data analytics problems, where a lot of work remains to be done.

Page 70

Collaborators

•  Collaborators at UBRI (ubri.ustc.edu.cn) –  Mr. Lingxi Li (HTS)

–  Dr. Fei Peng (EPM-PAP)

–  Prof. Xin Yao (EPM-PAP)

–  Prof. Guoliang Chen (EPM-PAP)

–  Mr. Wenxiang Chen (CCVIL)

–  Dr. Thomas Weise (CCVIL)

–  Dr. Zhenyu Yang (CCVIL)

Page 71

Thanks for your time! Q&A?
