downloads.hindawi.comdownloads.hindawi.com/.../complexity/2018/7242105.f1.docx · web viewas...

SUPPLEMENTARY MATERIALS

I. Exploring the Impact of Customized GA Choices

In Section 3, we provide an overview of our implementation of NSGA-II while Section 4 discusses the results for Phase I of the search. Here we discuss how we arrived at the choices made for the Phase I search.

The goal of Phase I of the two-phase GA search is to find a decent approximation of the true PF for a representative subset of 40 units from the stockpile. The solutions found to be on the PF at the end of the first phase will then be projected onto the full stockpile of 200 units to obtain starting points for the second phase of the GA search. Thus, the results of the first phase do not need to be perfect, and in selecting settings such as the number of generations to run the algorithm, the maximum number of offspring produced, and the maximum population size, we choose to balance quality of exploration of the true PF against run time. Specifically, in this section we experiment with the number of generations and the maximum number of offspring (maxoff) to be produced each generation by the algorithm. For most combinations considered, five instances of the GA were run in parallel. For those combinations where the GA was run multiple times, the PFs found by the individual runs were combined into an overall PF for that combination. As a baseline for comparison, the PFs found for all combinations explored were combined and the “Phase I Superfront” with 350 solutions was identified. Table SM1 summarizes the combinations of maxoff and number of generations considered and reports, for each combination, the run time required, the size of the first tier PFs found by each run, the size of the combined overall PF, and the percentage of solutions in common with the Phase I Superfront.

| Gen. | maxoff | Gen. when maxoff is reached | Time (min) | Size of PF | Size of Phase I PF | % Solutions Shared with “Phase I Superfront” |
|---|---|---|---|---|---|---|
| 200 | 100 | 14, 17, 18, 16, 10 | 17.87, 18.65, 18.52, 17.89, 18.53 | 183, 196, 194, 173, 203 | 242 | 1.1% (4) |
| 200 | 200 | 39, 34, 36, 49, 39 | 35.56, 36.14, 36.33, 34.97, 34.89 | 246, 240, 244, 222, 225 | 225 | 3.4% (12) |
| 200 | 500 | 137, 159, 144, 123, 147 | 76.66, 73.38, 76.18, 83.45, 77.99 | 266, 261, 275, 281, 268 | 326 | 54.2% (190) |
| 500 | 100 | 17, 24, 21, 18, 17 | 53.25, 57.14, 54.61, 56.32, 55.77 | 238, 275, 248, 265, 248 | 302 | 21.4% (75) |
| 2000 | 100 | 20 | 328.68 | 318 | 318 | 58.2% (185) |

Table SM1: Run time and size of first tier PF for all runs of the GA (Phase I) for the specified combinations of number of generations and maximum number of offspring. For each combination run multiple times, the combined Phase I PF and percentage of those solutions shared with the Phase I Superfront are presented.
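
The combination of PFs from several runs and the comparison against a Superfront can be sketched as follows. This is only an illustrative sketch, not the implementation used in the study: the function names are hypothetical, every objective is assumed to be minimized, solutions are represented by their objective tuples, and the percentage is taken relative to the Superfront size (an assumption that agrees with, for example, 4/350 ≈ 1.1% in Table SM1).

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]  # objective values of one solution (assumed minimized)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def first_tier(points: Iterable[Point]) -> List[Point]:
    """Return the non-dominated points, dropping exact duplicates."""
    unique = list(dict.fromkeys(points))  # preserves first-seen order
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def combine_fronts(fronts: Iterable[Iterable[Point]]) -> List[Point]:
    """Pool the PFs of several runs and keep only the overall first tier."""
    pooled = [p for front in fronts for p in front]
    return first_tier(pooled)


def shared_with_superfront(front: Iterable[Point],
                           superfront: Sequence[Point]) -> Tuple[int, float]:
    """Count solutions of `front` on `superfront`; percentage is of the superfront size."""
    shared = len(set(front) & set(superfront))
    return shared, 100.0 * shared / len(superfront)


# Toy data with two objectives (hypothetical values, not from the study).
run_a = [(1.0, 5.0), (2.0, 3.0), (4.0, 4.0)]
run_b = [(2.0, 3.0), (3.0, 1.0), (5.0, 5.0)]
combined = combine_fronts([run_a, run_b])
count, pct = shared_with_superfront(run_a, combined)
print(combined, count, round(pct, 1))
```

In this toy case the points (4.0, 4.0) and (5.0, 5.0) are dominated by (2.0, 3.0) and drop out of the combined front, which is why a combined PF can be smaller than some of the individual PFs that feed into it.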

As described in Section 3, we employ adaptive population sizing (Eskandari et al 2007) in our implementation of NSGA-II. Adaptive population sizing allows the GA to use smaller population sizes in early generations, when there are unlikely to be many solutions on the first tier PF, especially when random starting points are used, and in each generation grows the size of the population based on the size of the first tier PF. Adaptive population sizing, as described by Eskandari et al (2007), requires specifying a maximum population size (maxpop) which the algorithm never exceeds. At the end of each generation, we grow the population size for the next generation as

$$\mathit{popsize}_{g+1} = \min\{\, |(PF_1)_g| + \lfloor 0.2 \times \mathit{maxpop} \rfloor,\ \mathit{maxpop} \,\},$$

where $(PF_1)_g$ is the first tier PF at the end of generation $g$ and $|(PF_1)_g|$ is the size of the first tier PF. This ensures that, in each generation, the population size is slightly larger than the number of first tier PF solutions; how much larger depends on the value of maxpop.
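
As a minimal sketch of this sizing rule (the function name `next_popsize` is hypothetical and not taken from the authors' code), the population size for the next generation can be computed as:

```python
def next_popsize(first_tier_size: int, maxpop: int) -> int:
    """Population size for generation g+1 given |(PF_1)_g| and maxpop.

    maxpop // 5 equals floor(0.2 * maxpop) for non-negative integers and
    avoids floating-point rounding.
    """
    return min(first_tier_size + maxpop // 5, maxpop)


# With maxpop = 200 the increment is floor(0.2 * 200) = 40.
print(next_popsize(10, 200))   # small early front: 10 + 40 = 50
print(next_popsize(183, 200))  # 183 + 40 exceeds maxpop, so capped at 200
```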

A drawback of specifying a fixed maximum population size is that if the number of first tier PF solutions found by the algorithm exceeds the maximum population size, not all first tier PF solutions will be included, resulting in a truncated PF. This could impact the progress of the GA and prevent reasonable solutions from being presented to users. Thus, for all combinations of maxoff and number of generations considered, we initially specify maxpop = 200, but in the event that there are more solutions on the first tier PF than maxpop allows ($|(PF_1)_g| > \mathit{maxpop}$), we allow the maximum population size to grow based on the number of first tier PF solutions that exist in a given generation. Specifically, the updated maximum population size is computed as $\mathit{maxpop} = |(PF_1)_g| + \lfloor 0.2 \times \mathit{maxpop} \rfloor$. Because maxpop is allowed to grow, the choice of maxpop is less critical to the performance of the algorithm, and thus we did not experiment with other choices of maxpop.
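
A minimal sketch of this growth rule is given below. The function name `update_maxpop` is hypothetical, and the sketch assumes that the maxpop appearing on the right-hand side of the update is the value in force before the update.

```python
def update_maxpop(first_tier_size: int, maxpop: int) -> int:
    """Grow maxpop only when the first tier PF no longer fits within it.

    maxpop // 5 equals floor(0.2 * maxpop) for non-negative integers.
    Assumption: the right-hand side uses the current (pre-update) maxpop.
    """
    if first_tier_size > maxpop:
        return first_tier_size + maxpop // 5
    return maxpop


print(update_maxpop(150, 200))  # front fits, maxpop unchanged: 200
print(update_maxpop(246, 200))  # front too large: 246 + 40 = 286
```

Under this reading, maxpop changes only in generations where truncation would otherwise occur.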

Figure SM1: Population growth for all combinations where GA is run for 200 generations, where maxoff=100 (left), maxoff=200 (center), and maxoff=500 (right). Dashed lines indicate value of maxpop for each generation.

Figure SM1 illustrates population growth over the course of the GA for all combinations run for 200 generations, while Figure SM2 displays population growth for combinations run for 500 and 2000 generations. The multiple solid lines for each combination correspond to the different runs of the GA for that combination, while the dashed lines illustrate how maxpop grows when necessary for the different runs. Each time the size of the first tier PF exceeds the current value of maxpop, maxpop is increased (evidenced by a “jump” in the dashed line). Figures SM1 and SM2 both illustrate that when maxpop is allowed to grow, the initially chosen value of maxpop does not have a significant impact on the performance of the GA.

Further examination of Figure SM1 provides some insight into the impact of the maximum number of offspring, maxoff. From left to right, the three panels of Figure SM1 correspond to maxoff values of 100, 200, and 500, respectively. As seen in Figure SM1, the population size grows more quickly for larger values of maxoff. This is not surprising, as the more solutions that are generated, the more likely we are to find solutions on the PF. However, this faster population growth comes at the expense of computation time, as seen in Table SM1, because in later generations more function evaluations are required for the larger number of offspring that need to be examined. In our implementation of NSGA-II, we choose to use maxoff = 200 for the first phase of the search, since it represents a compromise between exploration of the search space and run time.

Figure SM2: Population growth when GA is run for 500 generations with maxoff=100 (top) and 2000 generations with maxoff=100 (bottom). Dashed lines indicate value of maxpop for each generation.

In all cases considered, we grew the number of offspring produced each generation according to the rule

$$\mathit{noff}_g = \min\{\, 20 + \lceil 2 \times |(PF_1)_g| \rceil,\ \mathit{maxoff} \,\}.$$

This rule guarantees that each generation at least 20 offspring solutions will be created. Further, for each solution on the first tier PF, two additional offspring solutions are created. This rule ensures that an even number of offspring are produced, and in our implementation, we create half of those offspring via recombination and half via mutation. Increasing either the slope or the intercept of this rule would likely result in a population that grows more quickly, and thus maxoff would be reached more quickly and run time would be increased (as more solutions are evaluated earlier).
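
The offspring rule and the half-and-half split can be sketched as follows. This is only an illustration: the function names are hypothetical, and how an odd count would be split (possible only if maxoff itself were odd) is an assumption.

```python
import math
from typing import Tuple


def n_offspring(first_tier_size: int, maxoff: int) -> int:
    """Offspring to create in generation g: 20 plus two per first tier solution,
    capped at maxoff."""
    return min(20 + math.ceil(2 * first_tier_size), maxoff)


def split_offspring(noff: int) -> Tuple[int, int]:
    """Split offspring into (recombination, mutation) halves.

    Assumption: if noff were odd, the extra offspring goes to mutation.
    """
    n_recombination = noff // 2
    return n_recombination, noff - n_recombination


print(n_offspring(10, 200))   # 20 + 2 * 10 = 40
print(n_offspring(90, 200))   # 20 + 2 * 90 = 200, the cap is reached
print(split_offspring(40))    # (20, 20)
```

Under this rule the cap is first reached once the first tier PF holds (maxoff − 20)/2 solutions, that is, 40 solutions for maxoff = 100 and 90 solutions for maxoff = 200.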

Figures SM1 and SM2 illustrate that the size of the first tier PF grows quickly in the early generations of the GA and that growth eventually slows. Not surprisingly, as seen in Table SM1, GAs run for more generations require more run time. Since the goal of Phase I of the search is to balance exploration of the true PF and run time, we choose to run the Phase I GA for 200 generations.

Figure SM3: Comparison of Phase I PF identified by different GA settings (gray) and the Phase I Superfront (black). Columns of the figure correspond to the GA settings explored.

Figure SM3 illustrates the practical consequences of the decisions made when implementing the GA. Each column of Figure SM3 corresponds to a different combination of GA inputs (number of generations and maxoff) while the rows correspond to pairwise scatterplots of the optimization metrics. Each plot displays the combined PF from all runs of the GA for that combination of inputs (gray points) and the 350 solutions on the Phase I Superfront (black points). Table SM1 summarizes the percentage of points on the Phase I Superfront in common with the combined PF for each combination of settings. As seen in Table SM1, when the GA is run for more generations or when maxoff is increased, the combined PF has more solutions in common with the Phase I Superfront. This can also be seen in Figure SM3, as the gray PF points more closely overlap with the black Superfront points for these combinations. However, even though the percent overlap with the Superfront is lower when fewer generations are used and maxoff is lower, practically speaking, there is still considerable similarity between the combined PFs and the Superfront, as evidenced by the visual overlap in the fronts in Figure SM3.

In summary, the goal of Phase I is to find a decent approximation of the “true” PF in a reasonable amount of time, and thus the choices we make when implementing the first phase of the GA reflect these trade-offs. Specifically, we choose to run the first phase of the GA for 200 generations, use a maximum of 200 offspring per generation, and specify the maximum population size to be 200. Even though a maximum population size is initially specified, we allow that maximum value to increase to avoid truncation of the first tier PF. We would not recommend similar flexibility on the maximum number of offspring, as that quantity would grow much more quickly and considerably increase the run time of the algorithm due to the large number of solutions that would need to be evaluated. In examining Figure SM3, the practical differences due to these choices are generally small, especially when one considers the likely numerical error due to the MCMC calculations used to evaluate solutions. The solutions identified on the Phase I PF will be projected onto the full stockpile of units to serve as the starting solutions for the second phase of the GA search.

II. Justification of Choices and Investigation of Effectiveness of Phase II of Search

The goal of Phase I of the GA search was to identify starting points for the second phase of the search. In Phase II the goal is to identify a Pareto front of possible solutions from which a user will ultimately decide which solution best fits their needs and priorities. As in our exploration of Phase I, we consider different settings for the GA (specifically, number of generations and maximum number of offspring) for Phase II of the GA. As in the Phase I exploration, we continue to initially specify the maximum population size as 200, but allow that quantity to grow to avoid truncation of the PF. For each combination considered, we run the GA five times and combine the results into the final Phase II PF for each setting. Additionally, we consider a single-phase approach where the GA is run on the entire stockpile of units using random starting solutions (thus avoiding Phase I). As in our exploration of Phase I, the PFs from all runs of the GA (all settings) are combined to identify an overall “Phase II Superfront” that serves as a basis for comparison. The results are contrasted in this section.

| Gen. | maxoff | Time (min) | Size of PF | Size of Phase II PF | % Solutions Shared with “Phase II Superfront” |
|---|---|---|---|---|---|
| 100 | 200 | 23.21, 23.05, 23.24, 23.51, 22.93 | 202, 196, 177, 201, 190 | 206 | 4.8% (11) |
| 200 | 200 | 47.89, 48.37, 48.94, 48.97, 49.47 | 202, 215, 205, 206, 219 | 229 | 5.6% (13) |
| 500 | 200 | 130.57, 125.91, 128.89, 131.84, 129.92 | 219, 210, 208, 205, 215 | 226 | 16.0% (37) |
| 800 | 200 | 207.71, 221.72, 233.49, 242.12, 223.81 | 225, 210, 210, 215, 202 | 233 | 29.0% (67) |
| 1000 | 200 | 315.95, 300.51, 321.12, 304.1, 270.83 | 214, 213, 215, 222, 216 | 234 | 44.6% (103) |
| 100 | 400 | 46.96, 46.21, 45.03, 45.31, 41.85 | 208, 198, 206, 202, 183 | 221 | 4.8% (11) |
| 200 | 400 | 97.11, 94.55, 99.42, 98.31, 97.21 | 205, 211, 216, 210, 193 | 232 | 15.6% (36) |

Table SM2: Run time and size of first tier PF for all runs of the GA (Phase II) for the specified combinations of number of generations and maximum number of offspring. For each combination run multiple times, the combined Phase II PF and percentage of those solutions shared with the Phase II Superfront are presented.

Table SM2 summarizes the combinations of maxoff and number of generations considered and reports, for each combination, the run time required, the size of the first tier PFs found by each run, the size of the combined overall PF, and the percentage of solutions in common with the Phase II Superfront (which contains 231 solutions). As seen in the exploration of Phase I, increasing either the number of generations or the maximum number of offspring to be produced increases both the run time of the GA and the percentage overlap with the Phase II Superfront. Thus, there is a trade-off between run time and good exploration of the PF.

Figure SM4: Comparison of Phase II PF identified by different GA settings (gray) and the Phase II Superfront (black). Columns of the figure correspond to the GA settings explored.

Even though the implementations with lower run time have a smaller percentage overlap with the Phase II Superfront, visually the differences between the combined PF from each setting and the Superfront are quite small (Figures SM4 and SM5). Figure SM4 displays pairwise scatterplots of the optimization metrics, with columns of the figure corresponding to settings of the second phase of the GA with maxoff = 200. Figure SM5 displays the same information for the second phase GA settings with maxoff = 400, as well as the results from a single-phase GA search on the entire stockpile of units with maxoff = 200 and varying numbers of generations. Each plot displays the combined PF from all runs of the GA for that combination of inputs (gray points) and the 231 solutions on the Phase II Superfront (black points). As seen in Figures SM4 and SM5, the final second phase PF for all combinations has considerable overlap with the Phase II Superfront, indicating that, for all practical purposes, the resulting PFs are likely comparable. Thus, for our implementation of the second phase we used maxoff = 200 for 200 generations.

The final combined PFs from the single-phase search, on the other hand, are clearly inferior to the Phase II Superfront (and to all combined PFs from the two-phase search), as evidenced by the very limited visual overlap with the Superfront (Figure SM5) and the fact that they have no solutions in common with the Superfront (Table SM3). Further, in the scatterplots of Consistency versus Uncertainty (top row of Figure SM5), the single-phase search fails to find the solutions with low uncertainty and high consistency (upper left-hand corner).

Figure SM5: Comparison of Phase II PF identified by different GA settings (gray) and the Phase II Superfront (black). Columns of the figure correspond to the GA settings explored. The last three columns correspond to single-phase GAs run for the entire stockpile of units.

Table SM3 further summarizes the single-phase GA search on the entire stockpile by providing run times and the sizes of the first tier and combined PFs for the combinations considered. The run times for the largest number of generations considered (1500) exceed the longest run times for Phase II of the two-phase search, and the resulting PF is considerably inferior. To improve the single-phase search, the GA would most assuredly need to be run for more generations, possibly with a larger value of maxoff. The run time necessary for these changes would far exceed the total run time for our two-phase search before a comparable PF is found, demonstrating the superiority of the two-phase search.
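
As a rough arithmetic check of this run-time comparison, the per-run times reported in Tables SM2 and SM3 can be averaged. The snippet below is only a sketch for the reader; the variable names are illustrative, and comparing means across the five runs is an assumption about how the comparison is intended.

```python
from statistics import mean

# Run times in minutes, copied from Table SM3 (single-phase, 1500 generations,
# maxoff = 200) and Table SM2 (Phase II, 1000 generations, maxoff = 200).
single_phase_1500 = [320.84, 329, 352.82, 367.39, 365.91]
phase2_1000 = [315.95, 300.51, 321.12, 304.1, 270.83]

mean_single = mean(single_phase_1500)  # about 347.2 minutes
mean_phase2 = mean(phase2_1000)        # about 302.5 minutes

print(round(mean_single, 2), round(mean_phase2, 2))
print(mean_single > mean_phase2)
```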

In summary, this exploration illustrated that running the second phase of the GA either for more generations or with a higher limit on the maximum number of offspring produced results in better exploration of the PF, but at the expense of run time. For all practical purposes, the PFs found by the second phase searches with shorter run times (either run for fewer generations or with a lower limit on maxoff) were comparable to those found by the longer running GAs. All PFs identified by the two-phase search were vastly superior to those found using a single-phase search, with both more thorough exploration of the front and shorter run time, indicating that the two-phase approach is better.

| Gen. | maxoff | Time (min) | Size of PF | Size of Combined PF | % Solutions Shared with “Phase II Superfront” |
|---|---|---|---|---|---|
| 500 | 200 | 94.24, 101.91, 92.76, 91.93, 96.36 | 118, 123, 110, 127, 116 | 125 | 0% (0) |
| 1000 | 200 | 159.05, 209.34, 224.54, 223.61, 218.41 | 85, 110, 122, 123, 124 | 121 | 0% (0) |
| 1500 | 200 | 320.84, 329, 352.82, 367.39, 365.91 | 122, 130, 130, 128, 123 | 136 | 0% (0) |

Table SM3: Run time and size of first tier PF for all runs of the single-phase GA on the entire stockpile of units for the specified combinations of number of generations and maximum number of offspring. For each combination run multiple times, the combined single-phase PF and percentage of those solutions shared with the Phase II Superfront are presented.
