Constrained Design Optimization Using Generative Topographic Mapping



Asha Viswanath,∗ A. I. J. Forrester,† and A. J. Keane‡

University of Southampton, Southampton, England SO17 1BJ, United Kingdom

DOI: 10.2514/1.J052414

High-dimensional design-optimization problems involving complex and time-consuming solvers present computational challenges and are expensive to execute. Even though surrogate models can replace these expensive problems with simpler models, the size of the initial design of experiment needed to construct these models effectively still grows exponentially with the dimension of the problem. Traditional screening methods in optimization reduce the dimension of the problem by discarding variables, which is undesirable. In this paper, a latent variable model called generative topographic mapping is proposed to reduce the dimension of the problem so as to facilitate an optimization search in a low-dimensional space without removing any variables from the design problem. The method works by transforming high-dimensional data to be embedded on a low-dimensional manifold. It is demonstrated on a two-dimensional Branin function subjected to nonlinear constraints and then applied to real engineering constrained optimization problems of an aircraft wing design and an aircraft compressor rotor. The model developed in this work proved to be more effective in dealing with constrained optimization problems by effectively learning the constraint boundary, hence finding feasible best designs when compared to other surrogate models like kriging.

Nomenclature

$C_D$ = drag coefficient
$D$ = dimension of training data set
$g(\cdot)$ = constraint function
$K$ = number of latent data points
$L$ = dimension of latent space
$\dot{m}$ = mass flow rate
$N$ = number of training data points
$n$ = number of constraints
$P$ = pressure
$P_r$ = pressure ratio
$q$ = dynamic pressure
$T$ = training data set
$T_{mp}$ = temperature
$W$ = matrix of weights
$x$ = latent data set
$Y$ = prediction data set
$\beta$ = inverse of variance of Gaussian mixture distribution
$\epsilon$ = expansion width of constraint limits
$\eta$ = adiabatic efficiency
$\mu$ = vector of basis function centers
$\Phi$ = matrix of basis function activations
$\sigma$ = width of basis function

Received 21 November 2012; revision received 31 July 2013; accepted for publication 20 September 2013; published online 28 February 2014. Copyright © 2013 by Rolls Royce. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission. Copies of this paper may be made for personal or internal use, on condition that the copier pay the $10.00 per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923; include the code 1533-385X/14 and $10.00 in correspondence with the CCC. AIAA Journal, Vol. 52, No. 5, May 2014.

∗Ph.D. Student, Computational Engineering and Design Group, School of Engineering Sciences.

†Lecturer, Computational Engineering and Design Group, School of Engineering Sciences. Member AIAA.

‡Professor, Computational Engineering and Design Group, School of Engineering Sciences. Member AIAA.

I. Introduction

In global optimization problems encountered in engineering design, the majority involve large numbers of design variables, complex nonlinear solvers for objective function evaluations, and costly nonlinear constraint functions. In such a scenario, engineers have traditionally preferred to work with simpler models of the problem by imposing assumptions on its physics. With the advent of computing technologies such as parallel computing and faster processors, high-fidelity simulation models are now used in industry with increasing confidence. The problem of computational time has yet to be tackled, however: a single simulation of these expensive models involves running complex finite-element or computational-fluid-dynamics analyses that take many hours, so the entire design process can run for days or even months. A promising remedy is the use of surrogate models, which simplify the problem by generating a model of the actual design problem based on a few initial runs of the expensive solver. Searching this model for the global optimum reduces the number of function evaluations required to optimize the actual problem. These models have been used quite effectively ever since their introduction by Myers and Montgomery [1], in the form of simple polynomial models, radial basis functions [2], kriging models [3], and so on. For a high-dimensional problem, however, the initial set of designs that must be evaluated with the expensive solver to build the surrogate can itself become prohibitively large. A reduction in the dimension of the actual design problem therefore appears to be the solution in all of these scenarios.

Factor screening methods [1] can be used to reduce dimensions. The common approach in most screening methods is to identify the variables most relevant to the design problem and to discard the remaining variables by fixing them at constant values during any optimization. This is not always attractive, because the relevance of the fixed variables may emerge later in the design process. Hence, the need is for a dimension reduction method that can capture underlying patterns in the variables in a reduced space without removing any variables from the design, thereby enabling optimization to be performed in the reduced space. In the broad field of statistics, many methods are used in data analysis to capture the important nonlinear relations and correlations in the data and then project the data to a lower-dimensional space. We studied one such nonlinear latent variable model (LVM), generative topographic mapping (GTM) [4], in our previous work [5]. LVMs [6] may be used to explain the observed data in terms of fewer latent (hidden) variables and have been used effectively as dimension reduction tools in pattern recognition and machine learning [7]. GTM finds a low-dimensional nonlinear manifold embedded in the high-dimensional space. In our previous work [5], GTM was applied to unconstrained optimization problems, first on the unconstrained Branin function and then on an application problem of optimizing an airfoil shape to minimize its drag coefficient. The GTM-based approach was found comparable, both in the quality of the optima produced and in the computational cost savings, with other surrogate methods such as kriging [3]. Here, we re-examine the benefits of GTM on constrained optimization problems by using it to generate a low-dimensional surrogate model of such problems and studying its effectiveness in finding an optimal design that satisfies the constraints.

Constraints are common in practical design work because there are always limits on the physical quantities that determine the optimum performance of a system. Many different techniques have been developed to tackle constrained optimization problems [8]: transforming the problem into an unconstrained one by applying a transformation to the variables, using Lagrangian multipliers to incorporate the constraints into the objective function and estimating the multipliers during optimization, using penalty functions to penalize the objective function whenever a constraint is violated, and so on. In this work, we approach the constrained problem by transforming the constrained design space to a low-dimensional manifold using the GTM method and then searching the low-dimensional space for the optimum. In constrained optimization problems, the constraints are in fact manifolds in the design space that separate the feasible and infeasible regions. Each active constraint essentially reduces the freedom of the designer by one dimension; the issue is identifying where the constraint boundaries lie. In many problems, the optimal value lies on the edges of one or more constraints, and hence any model that tries to learn the constrained design space should first learn the constraint boundaries successfully. The motive for using GTM is to learn these boundaries and to align the GTM manifold so that it passes not only through high-quality design space but also along all the constraint boundaries, so that the optimal point does not escape the manifold, whether it lies inside the space or on the edges of the constraint boundaries. The constraints are applied on the transformed design space using penalty functions during the search. The low-dimensional space is a simple predictor that stands in for the expensive solver and so can be searched effectively by random search strategies, which overcome the problem of the penalized manifold becoming discontinuous.

We begin in the next section with a description of the methodology of GTM. Constrained optimization using GTM is then discussed briefly. The method is then demonstrated on a two-dimensional (2-D) problem and applied to a transonic aircraft wing design problem and an aircraft rotor design problem.

II. Generative Topographic Mapping Methodology

A. Theory

The GTM method models the distribution of a high-dimensional data set in terms of low-dimensional latent variables. A GTM model represents a data set $T = \{t_1, \ldots, t_N\}$ of $N$ points in $D$ dimensions in terms of an $L$-dimensional space ($L < D$) of $K$ latent points $x = \{x_1, \ldots, x_K\}$. This is achieved by positioning the latent points in $L$ space in such a way that a mapping from latent space to data space models the probability distribution of the data. For this, the latent points are initially assumed to have a prior probability distribution $p(x)$ made up of discrete delta functions, and the data set $T$ is assumed to have a Gaussian mixture probability distribution. A mapping function $y(x, W)$ maps every point in latent space to the center of a component of the Gaussian mixture; the parameters of the model are the weights $W$, which define the mapping, and the inverse variance $\beta$ of the Gaussian mixture distribution. $y$ is $D$-dimensional like $T$, with $Y = \{y(x_1), \ldots, y(x_K)\}$. The parameters are determined by a training procedure that seeks to maximize the likelihood that the set of latent points generated the data set under the assumed distribution, and so acts as an update from prior to posterior probability. A Bayesian modeling approach thus forms the backbone of this method. In mathematical terms, the probability distribution of the data is given by

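$$p(t \mid x, W, \beta) = \left(\frac{2\pi}{\beta}\right)^{-D/2} \exp\left\{-\frac{\beta}{2}\sum_{d=1}^{D}\left[t_d - y_d(x, W)\right]^2\right\} \tag{1}$$

Integrating out the latent variable,

$$p(t \mid W, \beta) = \int p(t \mid x, W, \beta)\, p(x)\, dx \tag{2}$$

and choosing the prior probability $p(x)$ as a set of $K$ equally weighted delta functions on a regular grid, we get

$$p(x) = \frac{1}{K}\sum_{k=1}^{K}\delta(x - x_k) \tag{3}$$

$$p(t \mid W, \beta) = \frac{1}{K}\sum_{k=1}^{K} p(t \mid x_k, W, \beta) \tag{4}$$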

This model maps each latent point to the center of a Gaussian lying on the manifold embedded in $D$ space. The centers of the Gaussians cannot move independently of each other, so the model is a constrained mixture of Gaussians that depends only on the mapping $y$. Also, all components of the mixture have the same variance $\beta^{-1}$ and mixing coefficient $1/K$. GTM follows a Bayesian inference approach, using maximum likelihood estimation to determine the parameters $W$ and $\beta$. The likelihood function for the data set is

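$$\mathcal{L} = \prod_{n=1}^{N} p(t_n \mid W, \beta) = \prod_{n=1}^{N}\left[\frac{1}{K}\sum_{k=1}^{K} p(t_n \mid x_k, W, \beta)\right] \tag{5}$$

whose log likelihood is

$$\ell = \sum_{n=1}^{N} \ln\left[\frac{1}{K}\sum_{k=1}^{K} p(t_n \mid x_k, W, \beta)\right] \tag{6}$$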

which is maximized with respect to both $W$ and $\beta$ using an expectation maximization (EM) algorithm [9]. A linear regression model is chosen for $y(x, W)$, formed as a linear combination of a set of $M$ fixed nonlinear radial basis functions of the following form:

$$\phi_m(x_k) = \exp\left\{-\frac{\lVert x_k - \mu_m \rVert^2}{2\sigma^2}\right\} \tag{7}$$

where $\mu_m$ are the centers of the radial basis functions and $\sigma^2$ is their common variance. The regression equation in matrix form is

$$Y = \Phi W \tag{8}$$

$W$ is an $M \times D$ matrix of weights, $Y$ is a $K \times D$ matrix of the Gaussian mixture component centers, and $\Phi$ is a $K \times M$ matrix of basis function activations. At the start of the likelihood maximization, $W$ is initialized using the first $L$ principal components of the data set and is then updated during the EM algorithm. GTM acts here as a supervised learning process in which the training data include the variables along with their response values, making the data set dimension $D + 1$ during computations. A detailed mathematical derivation of the nonlinear relations and an explanation of the method are available in Bishop et al. [10]. Note in particular that, although $T$ contains variables modeling physical quantities in the real world, the latent space variables $x$ to which they are reduced have no physical meaning. They are purely mathematical transformations obtained through the procedure described previously. The only factor to be considered in this dimension reduction method is therefore the dimension to which the optimization problem can be reduced before the accuracy of the optima is lost. This is determined for each example problem by trial and error, analyzing the results obtained for each candidate dimension. Hence, there is no general rule for which dimension to choose, as it varies from problem to problem.
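The quantities in Eqs. (1) and (4) to (8) can be sketched numerically as follows. This is a minimal illustration in Python with NumPy, not the MATLAB toolbox used in the paper. The function names are hypothetical, the weights `W` are random placeholders rather than the principal-component initialization described above, and no EM update is performed.

```python
import numpy as np

def rbf_activations(X, centers, sigma):
    """Eq. (7): K x M matrix Phi of Gaussian basis activations for latent points X (K x L)."""
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def mixture_centers(Phi, W):
    """Eq. (8): K x D matrix Y of Gaussian mixture centers."""
    return Phi @ W

def gtm_log_likelihood(T, Y, beta):
    """Eq. (6): log likelihood of data T (N x D) for centers Y (K x D) and inverse variance beta."""
    N, D = T.shape
    K = Y.shape[0]
    # Squared distance between every center and every data point (K x N).
    sq = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(axis=2)
    # Log of the isotropic Gaussian density of Eq. (1) for each (center, point) pair.
    log_p = 0.5 * D * np.log(beta / (2.0 * np.pi)) - 0.5 * beta * sq
    # Log-sum-exp over the K components for numerical stability, then the 1/K mixing factor.
    m = log_p.max(axis=0)
    log_mix = m + np.log(np.exp(log_p - m).sum(axis=0)) - np.log(K)
    return float(log_mix.sum())

# Illustrative sizes only: K = 6 latent points on a 1-D grid, M = 3 basis centers, D = 3.
rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 6)[:, None]
centers = np.linspace(-1.0, 1.0, 3)[:, None]
Phi = rbf_activations(X, centers, sigma=1.0)
W = rng.normal(size=(3, 3))      # placeholder weights (assumption, not PCA-initialized)
Y = mixture_centers(Phi, W)
T = rng.normal(size=(5, 3))      # stand-in training data
print(gtm_log_likelihood(T, Y, beta=1.0))
```

In a full GTM implementation this log likelihood would be the quantity monitored for convergence while the EM algorithm updates `W` and `beta`.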

B. Generative-Topographic-Mapping-Based Constrained Optimization

The algorithm of GTM-based optimization (GTMBO) applied to constrained problems is shown in Fig. 1. The method starts with a design of experiment (DOE) sample, trains a GTM on the DOE to form a low-dimensional manifold embedded in the high-dimensional space, and searches the low-dimensional space for the optimum. The DOE, that is, the training data learned by the GTM, contains the variables, the objective function, and the constraints. Thus, the dimension of the DOE matrix is $N \times (D + 1 + n)$ when the problem has $n$ constraints. Data points in this initial DOE that do not satisfy the constraints are discarded, and the reduced DOE is used for GTM training. How the constraints are applied is discussed shortly. After GTM training, the objective function and constraint values are predicted on the same manifold. The low-dimensional space is then searched with any optimization algorithm for the best point, subject to the GTM-predicted constraint values. This point is evaluated with the expensive solver to obtain the actual objective function and constraint values, which update the training data set before the GTM is retrained. The search/update iterations continue until there is no further improvement in the design. As with any DOE study, it is important to repeat the procedure with a range of different DOEs and to average the optima obtained so as to confirm the result: different sample spaces should lead to the same optimum, which guards against mistaking a local optimum for the global one. The error of the model is quantified using the root-mean-square error (RMSE) on a separate validation data set. In most constrained optimization problems, the feasibility of the solution matters more than the accuracy of the global optimum; hence, the RMSE values for the constraints are examined specifically to show how well the method learns the constraint boundary and adheres to it.
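The bookkeeping in this step (assembling the $N \times (D + 1 + n)$ DOE matrix, discarding infeasible rows, and scoring predictions by RMSE) might be sketched as below. The helper names are hypothetical, and the feasibility convention $g_c >$ limit is an assumption borrowed from the demonstration problem later in the paper.

```python
import numpy as np

def build_doe(T, f, G):
    """Stack variables T (N x D), objective f (N,), and constraints G (N x n) column-wise."""
    return np.column_stack([T, f, G])

def keep_feasible(doe, D, limits):
    """Keep rows whose constraint columns all satisfy g_c > limit_c (assumed convention)."""
    G = doe[:, D + 1:]
    return doe[np.all(G > limits, axis=1)]

def rmse(predicted, actual):
    """Root-mean-square error between predictions and validation values."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))
```

Computing `rmse` separately on each constraint column of a validation set gives the per-constraint error that the text singles out as the measure of how well the boundary has been learned.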

To apply the constraints on the original function, the constraint boundary is slightly expanded so that some data points just outside the actual constraint boundary are included in the DOE used for GTM training. This enables the model to learn the design landscape on both sides of the boundary for equality constraints and on the outside of the boundary for inequality constraints. It helps the manifold to align itself so that it passes through regions on the boundary that may contain the optimal point. To decide how much each constraint function of the problem must be expanded to provide a sufficient number of samples for GTM training, we use the concept of $\epsilon$ tubes from support vector regression [11]. The size of $\epsilon$ is determined by fixing the number of samples required in the DOE for GTM training, which is set at approximately 10% more than the sample size retained after the actual constraints are applied. This number is simply a tolerance and is used for all of the problems in this paper; the idea is only to take a few more points outside the constraint boundary, and 10% seemed a reasonable choice. First, $\epsilon$ is initialized with a small value, which is added to the constraint value to make the boundary slightly larger, so as to accommodate 10% more points in the feasible region and include them in the DOE. The size of the DOE is then checked, and $\epsilon$ is increased if the 10% limit is not reached and decreased if the limit is exceeded. If there are multiple constraints, the value of $\epsilon$ is the same for all equality constraints but is taken differently for inequality constraints. With more than one inequality constraint, there are two ways of approaching the problem. Either the overall constrained design space, obtained after applying all the constraints to the problem, is considered and a single $\epsilon$ value is evaluated for the final feasible space, as described previously; or a separate $\epsilon$ value is applied to each constraint and the feasible space is decided from the combination of all the relaxed constrained spaces. We have adopted the latter method in this paper because, in the GTM reduction method, the constraints are applied on the transformed design space using penalty functions, which keeps each constraint separate. A weighted value of $\epsilon$ is used for each constraint, with the weights decided according to how strongly the constraint restricts the design space. If the constrained space is very small compared with the unconstrained design space, a smaller weight should be used to maintain the same ratio after the 10% relaxation, so that the corresponding $\epsilon$ is small; the converse holds for a constraint that does not reduce the design space much from the unconstrained case. This is shown schematically in Fig. 2, where constraint 1 has a narrow feasible design space and hence a smaller $\epsilon_1$, corresponding to a smaller weight. The weights are obtained by normalizing the sample size retained after each constraint is applied individually, as in the next equation:

$$\epsilon_c = \epsilon\, w_c, \quad c = 1, \ldots, n, \qquad w_c = \frac{n_c}{n_1 + \cdots + n_n} \tag{9}$$

Fig. 1 Algorithm for GTM-based constrained optimization.

Fig. 2 ϵ tubes for a design space with more than one constraint. The darkened area shows the feasible region of each constraint. Note how ϵ1 < ϵ2, as constraint 1 is stricter than constraint 2.


where $w_c$ is the weight and $n_c$ is the sample size retained when constraint $c$ is applied individually to the problem, with $c$ varying from 1 to $n$. The $\epsilon$ value is determined by initializing a value, multiplying it by the weights of the individual constraints to get $\epsilon_1, \ldots, \epsilon_n$ for the $n$ constraints, expanding or contracting the constraints by their respective $\epsilon$ values, and applying all of them to the problem so as to retain 10% extra designs. This process is explained in detail for the examples in the next section. Note that this 10% relaxation is intended to include the actual constraint boundary in the DOE sample, so that the optima can be found effectively if they lie on the boundary. If the optimum does not lie on an active constraint boundary, the GTM manifold need not lie along the boundary, and the 10% extra designs are not needed. Because the nature of the global optimum is not known in advance, however, the previous procedure is an effective way to determine the sample space.

A space-filling Latin hypercube DOE is used as the training sample. For the sample size $N$, a guideline of 10 times the dimension of the problem, as indicated by recent research [12], is used, although $N$ is tested in each case and chosen to suit the problem. The GTM training is implemented here in MATLAB using the toolbox routines provided by Svensén [13]. The GTM training parameters are the number of latent points, the number of basis function centers, and the number of EM cycles of training. For a low latent space dimension, say fewer than four, a grid structure of latent points is adopted, as in the original formulation of GTM. Beyond a four-dimensional latent space, calculations become cumbersome and time-consuming and cause storage issues because of insufficient computer memory. Hence, a random sample with an underlying uniform distribution, such as a space-filling Latin hypercube, is adopted for latent space dimensions up to 10, and a random sample from discrete uniformly distributed points is adopted for dimensions higher than 10 [14]. This form of random sampling can fail to capture some interactions of the latent variables, and hence many different random samples are used to construct the GTM manifold. The importance of averaging is thus reasserted, this time for the latent space, where it alleviates any bias due to randomness. The data are trained until the likelihood function converges to its maximum value. The minimum of the objective function given by the GTM prediction, subject to the GTM constraint prediction, is the first point used as an update. The shape the manifold takes does not depend directly on the function; it depends solely on the training data provided to it. Thus, a DOE sample that is spread evenly over the design space is desirable, so that the GTM model is trained to the landscape of the function and can find the optima. Note that GTM does not find the global optimum unless the manifold passes exactly over it. This is where updating the data set helps the search converge to the optimum. After the training set is updated, the GTM is retrained on it to obtain the updated manifold, rather than simply curve-fitting the update point to the existing manifold.
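The weighted $\epsilon$ relaxation of Eq. (9) can be sketched as follows for inequality constraints. Several choices here are assumptions rather than details from the paper: constraints are written as $g_c >$ limit, "approximately 10% more" is taken as the feasible count plus at least one point, and the increase/decrease adjustment of $\epsilon$ is implemented as a bracketing step followed by bisection. All names are hypothetical.

```python
import numpy as np

def constraint_weights(G, limits):
    """Eq. (9): w_c = n_c / (n_1 + ... + n_n), where n_c counts points satisfying constraint c alone."""
    n_c = (G > limits).sum(axis=0).astype(float)
    return n_c / n_c.sum()          # assumes at least one point satisfies some constraint

def retained(G, limits, eps_c):
    """Mask of points feasible under all relaxed constraints g_c > limit_c - eps_c."""
    return np.all(G > limits - eps_c, axis=1)

def tune_epsilon(G, limits, extra=0.10, eps0=0.1, max_iter=60):
    """Find a small eps whose weighted relaxation retains about `extra` more points."""
    w = constraint_weights(G, limits)
    n_feas = int(retained(G, limits, 0.0).sum())
    # Target size: feasible count plus roughly 10%, at least one extra point, capped at N.
    target = min(len(G), n_feas + max(1, int(round(extra * n_feas))))
    lo, hi = 0.0, eps0
    # Increase eps until the target is reached (bounded to guarantee termination).
    while retained(G, limits, hi * w).sum() < target and hi < 1e6:
        hi *= 2.0
    # Decrease eps by bisection while the target is still met.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if retained(G, limits, mid * w).sum() >= target:
            hi = mid
        else:
            lo = mid
    return hi, hi * w
```

Because the sample is finite, the retained count changes in jumps, so the result can overshoot the target when several points enter the relaxed region together; this mirrors the remark in the text that the exact number of points may not be attainable.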

III. Demonstration of Constrained Optimization Problems

A visual illustration of how constrained optimization works using GTM can be given with a two-dimensional problem reduced to a one-dimensional manifold embedded in the two-dimensional space of the objective function. This example does not require dimension reduction and is used here only to demonstrate the GTM-based optimization algorithm. We choose two artificial test problems, both involving the minimization of a two-dimensional modified Branin function [15] given by

$$f(t) = \left(t_2 - \frac{5.1}{4\pi^2}t_1^2 + \frac{5}{\pi}t_1 - 6\right)^2 + 10\left[\left(1 - \frac{1}{8\pi}\right)\cos t_1 + 1\right] + 5t_1, \quad t_1 \in [-5, 10], \; t_2 \in [0, 15] \tag{10}$$

This modification to the original Branin function makes the optimization problem harder, because the function now has two local minima and one global minimum instead of three equal global minima. An inefficient optimizer can easily get stuck in a local optimum and fail to locate the global one.
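For reference, the modified Branin function of Eq. (10) might be coded as below. The squared first term and the factor $1/(8\pi)$ are the standard Branin constants, assumed here where the extracted equation is illegible; the function name is hypothetical.

```python
import numpy as np

def modified_branin(t1, t2):
    """Modified Branin function: the standard Branin form plus the linear term 5*t1.

    Intended domain: t1 in [-5, 10], t2 in [0, 15]. Accepts scalars or NumPy arrays.
    """
    a = t2 - 5.1 / (4.0 * np.pi ** 2) * t1 ** 2 + 5.0 / np.pi * t1 - 6.0
    b = 10.0 * ((1.0 - 1.0 / (8.0 * np.pi)) * np.cos(t1) + 1.0)
    return a ** 2 + b + 5.0 * t1

# Evaluate on a coarse grid over the design space, e.g. for contour plotting.
t1_grid, t2_grid = np.meshgrid(np.linspace(-5.0, 10.0, 61), np.linspace(0.0, 15.0, 61))
f_grid = modified_branin(t1_grid, t2_grid)
```

The added term $5t_1$ tilts the surface toward low $t_1$, which is what breaks the tie between the three minima of the original function.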

A. Constraint 1

In the first test problem, a simple inequality product constraint is applied to the function, given by

$$g(x) = x_1 x_2, \quad x_1, x_2 \in [0, 1] \tag{11}$$

The constraint is satisfied for $g > 0.2$. Figure 3 shows the constraint applied to the Branin function and the feasible area of the design space that has to be searched. For comparison with the GTM method, an optimization using a genetic algorithm (GA) with 50 generations and a population size of 20 gives an optimum value of 19.8 at $(t_1, t_2) = (-1.4, 12.5)$.

A 10-point space-filling Latin hypercube DOE is used as the initial training sample. The constraint function reduces the DOE to around five points; hence, allowing for 10% extra designs, we fix a value of around six as the final DOE sample size with which the GTM will be trained (it may not be possible to get exactly six data points, so six or the next possible number of points with a relaxed constraint is taken). When the constraint is applied to the initial DOE, it is relaxed to $g > 0.2 - \epsilon$, with $\epsilon$ initially set to 0.1 and later adjusted so as to include a minimum of six data points in the reduced DOE. Figure 4 illustrates the extended constraint boundary. Here, we obtained seven data points with $\epsilon = 0.1$.
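The DOE filtering described above can be illustrated with a short sketch. It assumes that $x_1, x_2$ in Eq. (11) are the design variables $t_1, t_2$ scaled linearly to the unit square, which the text does not state explicitly. The Latin hypercube generator is a basic random variant rather than the space-filling design used in the paper, so the point counts will generally differ from the five and seven reported.

```python
import numpy as np

LOWER = np.array([-5.0, 0.0])    # lower bounds of (t1, t2)
UPPER = np.array([10.0, 15.0])   # upper bounds of (t1, t2)

def latin_hypercube(n, d, rng):
    """Basic Latin hypercube: one point in each of n equal strata per dimension."""
    u = rng.random((n, d))
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (strata + u) / n

def to_unit(t):
    """Scale design variables t to the unit square (assumed normalization)."""
    return (t - LOWER) / (UPPER - LOWER)

def product_constraint(x):
    """Eq. (11): g(x) = x1 * x2 for rows of x."""
    return x[:, 0] * x[:, 1]

rng = np.random.default_rng(0)
x = latin_hypercube(10, 2, rng)           # 10-point DOE in normalized variables
t = LOWER + x * (UPPER - LOWER)           # the same DOE in the original variables
g = product_constraint(to_unit(t))
strict = g > 0.2                          # actual constraint
relaxed = g > 0.2 - 0.1                   # relaxed with the initial epsilon = 0.1
print(int(strict.sum()), int(relaxed.sum()))
```

The relaxed mask always contains the strict one, so the extra points it admits are exactly the infeasible designs just outside the boundary that the GTM is meant to see during training.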

Fig. 3 Constraint function and optimum on constrained Branin function: a) constraint function plotted over (x1, x2), each ranging from 0 to 1; b) constrained Branin function plotted over t1 (from −5 to 10) and t2 (from 0 to 15).


Previous experience with the GTM method for unconstrained optimization [5] showed that the effectiveness of GTM can be improved by training only on the good designs of a typical DOE (i.e., those closest to the optima). This refines the manifold so that it is constructed in the part of the design space that contains optimal designs, and the optimum is found sooner. It is done by sorting a DOE by objective function value and using the top 10% of designs (a fraction again chosen intuitively) for GTM training. The idea is to show good designs to GTM so that it returns the best design. Here, for the constrained optimization, a visual examination showed that taking the best designs from the training data (those shown by crosses near the constraint boundary) can result in the GTM better fitting the constraint boundary. This is mostly not possible in constrained optimization, because the constraints already reduce the final DOE to a small size, and decreasing it further by keeping only the best designs may leave too few samples. Also, if some of the best designs so chosen are from the relaxed constraint region (which are infeasible and provided only for GTM to learn the landscape), then the training data set will not have enough feasible points to learn the actual constraint boundary. Hence, the best designs are taken for GTM training only when the final DOE is sufficiently large for the specific problem, so that the reduced set does not fall below the minimum required to generate the GTM manifold. Here, five designs are the minimum necessary for GTM training of this design space, and so we choose the best five designs out of the seven obtained. The GTM parameters used for this five-point DOE are six latent points, three basis function centers, and five EM cycles of training.
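The selection rule in this paragraph (keep only the best designs, but never drop below the minimum needed to train the manifold) might be sketched as follows. The function name `choose_training_set` and its arguments are hypothetical, and the objective values in the example are made up.

```python
import numpy as np

def choose_training_set(X, f, n_best, n_min):
    """Keep the n_best designs with the lowest objective, if that is safe.

    X: (N, D) designs from the relaxed DOE; f: (N,) objective values.
    If n_best is below the minimum n_min, or the DOE has no spare designs,
    the whole DOE is returned unchanged.
    """
    X = np.asarray(X, dtype=float)
    f = np.asarray(f, dtype=float)
    if n_best < n_min or len(f) <= n_best:
        return X, f
    idx = np.argsort(f)[:n_best]           # minimization: lowest values first
    return X[idx], f[idx]

# Seven relaxed-DOE designs reduced to the best five, as in the first test problem.
X_demo = np.arange(14, dtype=float).reshape(7, 2)
f_demo = np.array([30.0, 22.0, 25.0, 40.0, 21.0, 35.0, 24.0])
X_sel, f_sel = choose_training_set(X_demo, f_demo, n_best=5, n_min=5)
print(f_sel)
```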
The true function contours and the training of the GTM manifold for the five EM cycles, along with the optimization prediction after five updates of the function, are shown in Figs. 5a–5f. The crosses in the figure represent the training points, and white crosses show the update points. Figure 5a shows the initial GTM manifold formed from the principal components of the training points, as described in Sec. II.A. Figures 5b–5e show the training of this initial manifold to the data by maximization of the likelihood function, described in Sec. II.A. Notice how the manifold aligns more closely with the training points after each cycle. Though these figures may look similar, there are slight changes to the position of the manifold as it converges to its final shape (i.e., that corresponding to maximum likelihood). Once the GTM training is completed, the minimum of the function is used as the first update to the training set. The GTM is then retrained on the new training data set, as in Figs. 5a–5e. After five such updates, the minimum of the function given by the GTM prediction is 21.003 at (−1.12, 11.6), shown in Fig. 5f, which is 6% more than the value given by the GA. Figure 5f also shows all the update points. The RMSE with a validation set was 0.1. The results from 50 different initial DOEs were inconsistent because, in some cases, the DOE points did not spread evenly across the constraint boundary, and the GTM prediction was only a local optimum. To avoid this, a larger initial DOE sample was used so that the final five DOE points would be likely to sample the boundary well. The averaged result of 50 DOEs, with the initial DOE having 30 data points and the final DOE having five points, is 24.5, which is closer to the global optimum but not completely converged to it. It is worth noting, however, that the GTM manifold after updating tends to lie along the constraint boundary, as desired.
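The training loop described here (principal-component initialization followed by EM cycles) can be illustrated with a compact one-dimensional GTM. This is a sketch using the standard GTM E- and M-step updates, not the authors' implementation: the function names, the basis width, the starting value of β, and the small ridge term added for numerical stability are all assumptions.

```python
import numpy as np

def sq_dists(Y, T):
    """Squared distances between K manifold points and N data points, shape (K, N)."""
    return ((Y[:, None, :] - T[None, :, :]) ** 2).sum(axis=-1)

def fit_gtm_1d(T, K=6, M=3, n_cycles=5):
    """Fit a GTM with a one-dimensional latent space to data T of shape (N, D)."""
    T = np.asarray(T, dtype=float)
    N, D = T.shape
    x = np.linspace(-1.0, 1.0, K)[:, None]           # latent points
    mu = np.linspace(-1.0, 1.0, M)[None, :]          # basis function centers
    sigma = mu[0, 1] - mu[0, 0]                      # basis width (assumed)
    Phi = np.exp(-((x - mu) ** 2) / (2.0 * sigma**2))
    Phi = np.hstack([Phi, np.ones((K, 1))])          # bias column

    # Initialize W so the manifold lies along the first principal component.
    mean = T.mean(axis=0)
    _, s, Vt = np.linalg.svd(T - mean, full_matrices=False)
    target = mean + x * (s[0] / np.sqrt(N)) * Vt[0]
    W = np.linalg.lstsq(Phi, target, rcond=None)[0]
    Y = Phi @ W
    # Crude starting value for beta from nearest-point distances (assumed).
    beta = N * D / max(sq_dists(Y, T).min(axis=0).sum(), 1e-12)

    for _ in range(n_cycles):
        # E-step: responsibility of each latent point for each data point.
        log_r = -0.5 * beta * sq_dists(Y, T)
        log_r -= log_r.max(axis=0, keepdims=True)
        R = np.exp(log_r)
        R /= R.sum(axis=0, keepdims=True)
        # M-step: weighted least squares for W, then update beta.
        G = np.diag(R.sum(axis=1))
        A = Phi.T @ G @ Phi + 1e-8 * np.eye(Phi.shape[1])   # small ridge (assumed)
        W = np.linalg.solve(A, Phi.T @ R @ T)
        Y = Phi @ W
        beta = N * D / max((R * sq_dists(Y, T)).sum(), 1e-12)
    return Y, W, beta, R

rng = np.random.default_rng(0)
Y, W, beta, R = fit_gtm_1d(rng.random((5, 2)))       # five-point DOE in 2-D
print(Y.shape, beta)
```

The defaults mirror the settings quoted in the text (six latent points, three basis function centers, five EM cycles). The rows of `Y` are the manifold points in design space that would be searched for the next update.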

B. Constraint 2

Before applying GTM to practical engineering problems, we test its performance on one more mathematical constraint function, this time with a very small feasible space to sample from. The purpose is to study how the GTM manifold aligns in such a space and whether it can find the global optimum. The second test problem therefore uses a more complicated constraint, which generates islands of feasibility in the design space. The constraint function is a modified version of the Gomez #3 function [16,17], with a sine wave added to the original function, given by

$$g(\mathbf{x}) = \left(4 - 2.1x_1^2 + \tfrac{1}{3}x_1^4\right)x_1^2 + x_1 x_2 + \left(-4 + 4x_2^2\right)x_2^2 + 3\sin[6(1 - x_1)] + 3\sin[6(1 - x_2)]; \quad x_1, x_2 \in [0, 1] \tag{12}$$

The constraint is satisfied at g > 6. Figure 6 shows the constraint applied to the Branin function and the feasible area of the design space that has to be searched. A GA search using 50 generations and a population size of 20 gives an optimum value of 19.83 at (t1, t2) = (−0.085, 5.31).

A 30-point space-filling Latin hypercube DOE is used as the initial sample. More sample points are used in the initial DOE because of the small feasible area of the design problem. Applying the constraint left only around two points, and hence we add 10% extra designs and fix the final DOE size limit at four, which is also the minimum DOE size required for training the GTM for this constrained area. The constraint is relaxed to g > 6 − ϵ, with the initial value of ϵ set to 1, so as to include a minimum of four data points in the reduced DOE. As in the previous problem, the value of ϵ is adjusted to include four DOE points in the final sample if the DOE points run short; if there are more, the best four designs are included in the final DOE. The GTM parameters are the same as in the previous constraint example. The true function contours and the training of the GTM manifold for the five EM cycles, along with the prediction after five updates of the function, are shown in Figs. 7a–7f. It may be unclear how a single continuous manifold can model a feasible space that contains four disconnected regions. However, the GTM manifold is trained using the sample points (shown by crosses), and, as discussed in the methodology, a manifold represents the centers of the probability density function of the sample points and is an L-dimensional continuous function. Figure 7a shows the initial shape of the manifold, which is trained in Figs. 7b–7e. Each update point is used to retrain on the entire training set, and the minimum of the function obtained from GTM after five updates, 20.95 (5% more than the GA-found optimum), is shown in Fig. 7f. The RMSE with a validation set is 0.2, which is high for this function. Also, with different DOEs, the global optimum could not always be found because of the difficult feasible area of the problem. The global optimum lies on a small island, which the GTM manifold has to learn in order to find it.
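For readers who want to experiment with the second constraint, Eq. (12) can be coded directly as printed. The sketch below also estimates, on a regular grid, what fraction of the unit square satisfies g > 6; the names `gomez3_sine` and `feasible_fraction` and the grid resolution are illustrative assumptions, and no particular mapping between x and t is implied.

```python
import numpy as np

def gomez3_sine(x1, x2):
    """Constraint function of Eq. (12) as printed; x1, x2 in [0, 1]."""
    camel = ((4.0 - 2.1 * x1**2 + x1**4 / 3.0) * x1**2
             + x1 * x2
             + (-4.0 + 4.0 * x2**2) * x2**2)
    waves = 3.0 * np.sin(6.0 * (1.0 - x1)) + 3.0 * np.sin(6.0 * (1.0 - x2))
    return camel + waves

def feasible_fraction(n=201, limit=6.0):
    """Fraction of an n-by-n grid on the unit square where g > limit."""
    grid = np.linspace(0.0, 1.0, n)
    X1, X2 = np.meshgrid(grid, grid)
    return float((gomez3_sine(X1, X2) > limit).mean())

print(gomez3_sine(1.0, 1.0), feasible_fraction())
```

A grid estimate of this kind gives a rough idea of how many points of a space-filling DOE can be expected to land in the feasible region before any relaxation is applied.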
In our initial DOE, when we increase the relaxation of the constraint by increasing ϵ, the sample also includes some of the other islands, which pulls the manifold away from the global optimum. For this reason, taking the best four designs is essential in this case. The only solution to this specific problem is to increase the initial DOE sample size so that there are more samples around the island containing the global optimum. Here, increasing the initial DOE to 50 allowed GTM to find a good design comparable to that of the GA. This increase in sample size was affordable in the two mathematical problems because they are simple, low-dimensional functions. Thus,

Fig. 4 ϵ tube beyond the constraint boundary (axes t1 and t2).


the focus was on finding the optimum by experimenting with the sample size. This will not be continued in practical engineering problems, because it would defeat the objective of computational savings.

IV. Application to a Constrained Wing Design Problem

Having observed how GTM performs for two different constraints, we now apply the method to a real engineering design problem dominated by constraints. A transonic civil aircraft wing design problem is considered, which includes 11 independent variables (Fig. 8), one objective function, and four constraints, all tabulated in Table 1 with their baseline values. The wing design problem is multi-objective, with wing weight, drag levels, and cost as the objective functions (Keane and Nair [8]). Here, we limit ourselves to the minimization of D/q (drag/dynamic pressure), estimated with the wing concept design tool Tadpole [18]. Constraints exist on wing weight, fuel tank volume, pitch-up margin, and undercarriage bay length, and these values are also calculated from

Fig. 5 GTMBO with constraint 1 on the Branin function.


the wing design tool. The other important design parameters include the Mach number, which is taken as 0.785, and the Reynolds number, taken as 7.3 million.

The GTM model is built for this function using a sample size of 150 (10 times the number of variables plus constraints, i.e., 10 × (11 + 4)). The DOE contains the 11 variables, the objective function, and the four constraints in its columns. The constraints are to be relaxed by their respective ϵ values before training with GTM. These are calculated by estimating the weight of each constraint according to its relevance to the data set. The relevance of each constraint is found by applying each constraint individually to the function and noting the reduction in sample size. For example, it was found that, in a specific DOE of size 150, 56 designs satisfied the required wing weight, 83 designs satisfied the required wing volume, 66 designs satisfied the required pitch-up margin, and 121 designs satisfied the required undercarriage bay length. Thus, the hierarchy of relevance of constraints for this DOE is wing weight, pitch-up margin, wing volume, and undercarriage bay length, in this order.

The number of final DOE points after application of all constraints also has to be fixed. For this, the number of points retained in the data set when the actual constraints are applied is increased by around 10%. For the previous example, only 8% of data points were retained with the original constraints, with the relaxed DOE containing 18% of the available designs. Because most of the DOEs had a similar percentage, around 20% of 150 (i.e., 30 designs) was fixed as the final DOE size for all samples, to give a consistent value. We continue with this sample to construct the GTM manifold without further reduction to the best designs, because the feasible DOE size is small for the constrained optimization. The function is modeled with latent spaces of one to 10 dimensions. The latent space structure is a random Latin hypercube. The number of latent points is taken as 50 times the latent space dimension, the number of basis functions is half the number of latent points, 10 EM cycles are used, and the number of updates is 20. The validation set sample size is 50. The best minimum D/q value satisfying all of the constraint conditions did not improve beyond 7-D. This indicates that a 7-D latent space can effectively model this constrained 11-D design space. This is perhaps to be expected given four constraints being active at the best solution. The best known solution to this constrained problem is 2.76, obtained using a kriging surrogate model [17]. GTM gives 2.8 as the average minimum D/q, only 4% greater than the best known solution. This error is not very significant. The results are averaged over 50 DOE runs. The averaged value of D/q for each latent dimension is plotted in Fig. 9. The averaged values, along with the standard deviations, show that seven dimensions give the best converged result, which does not improve significantly in higher dimensions. The best minimum value found by 7-D GTM (2.78) was close to the best known solution for this problem. Figure 10 shows the optimization history of 7-D GTM for the 50 DOEs.
These DOEs are considered for averaging purposes, and this figure shows the convergence pattern of the updates of each DOE (i.e., from DOE size 50 to 60). The average optimum value is the average of the converged solutions of these DOEs, which can be read from the figure as 2.8. The constraints were mostly satisfied by the 20 updates predicted by GTM when evaluated with the Tadpole code. The averaged results of the 50 DOEs are shown in Fig. 11. Only the 20 updates are shown, because the DOE itself contains designs that are infeasible owing to the relaxation of constraints. Although one of the updates had a slightly higher wing weight value, all the other constraints were met. As mentioned before, this is the biggest advantage of modeling using GTM: it satisfies constraints very effectively compared with other methods known in the literature. The RMSE values calculated from the validation set are as follows: 0.05 for the objective function, 0.057 for the wing weight constraint, 0.21 for the wing volume constraint, 0.21 for the pitch-up margin, and 0.14 for the undercarriage bay length. This problem without constraints was discussed in a previous work [20], in which GTM was compared against principal component analysis (PCA), and PCA was found to be computationally more expensive than GTM for the problem. Hence, we do not attempt a comparison with PCA in this work.

In summary, GTMBO performed well for the constrained wing design optimization by reducing the dimension from 11 to 7. This is a significant reduction for expensive solver problems. Even though this problem did not involve an expensive solver that would show a considerable reduction of computational burden, it suggests that, when applied to problems with expensive solvers, like the next application in this work, this kind of reduction can reduce computational time significantly.
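The bookkeeping used in this section (DOE size, constraint relevance ranking, and GTM sizing) is simple enough to reproduce in a short sketch. The counts are those quoted in the text for one 150-point DOE; the function names `rank_constraints` and `gtm_sizes` are hypothetical.

```python
def rank_constraints(satisfied_counts):
    """Order constraints from most to least restrictive.

    A constraint satisfied by fewer designs removes more of the DOE and is
    therefore treated as more relevant.
    """
    return sorted(satisfied_counts, key=satisfied_counts.get)

def gtm_sizes(latent_dim):
    """Latent points (50 per latent dimension) and basis functions (half of that)."""
    n_latent = 50 * latent_dim
    return n_latent, n_latent // 2

n_variables, n_constraints = 11, 4
doe_size = 10 * (n_variables + n_constraints)       # 150 designs

counts = {
    "wing weight": 56,
    "wing volume": 83,
    "pitch-up margin": 66,
    "undercarriage bay length": 121,
}
print(doe_size, rank_constraints(counts), gtm_sizes(7))
```

With these counts the ranking reproduces the order given in the text: wing weight, pitch-up margin, wing volume, undercarriage bay length.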

V. Application on a Transonic Compressor Rotor Blade Design

We now apply the GTMBO technique to the design of a single rotor blade of an axial-flow compressor designed and studied by the NASA Lewis Research Center, known as NASA rotor 37. The test case is described first, the algorithm by which the GTM method is applied to it is then outlined, and finally the results are discussed in detail. As mentioned earlier, the ability of GTM to model constraints effectively, and hence to find good-quality solutions within the feasible regions of the constrained design space, is evident from this rotor design example.

A. NASA Rotor 37

The NASA rotor 37 is an isolated axial-flow compressor rotor, which was designed at the NASA Lewis Research Center (now the Glenn Research Center) by Reid and Moore in 1978 [21] to study compressor performance with various computational-fluid-dynamics (CFD) codes for turbomachinery problems. It is a low-aspect-ratio, high-pressure inlet fan of a four-stage axial-flow compressor, which

Fig. 6 Constraint function and optimum on constrained Branin function (panel axes: x1, x2 and t1, t2).


has 36 multiple circular arc blades with a design pressure ratio Pr of 2.106 at a mass flow rate ṁ of 20.19 kg/s. The specifications of the blade are found in [22].

To use the test case for our method, rotor 37 is first meshed and analyzed using the Rolls-Royce proprietary PADRAM-HYDRA software system [23,24], which covers the parameterization of the blade shape, meshing with PADRAM, CFD analysis with HYDRA, postprocessing, and objective/constraint evaluations. The computational mesh of the blade generated using PADRAM is shown in Fig. 12.

The design parameterization used here involves five blade parameters: axial movement of sections along the engine axis in millimeters (sweep), circumferential movement of sections in degrees (lean), solid-body rotation of sections in degrees (skew), and leading- and trailing-edge recambering in degrees [25]. The design variables were specified using six control sections at 0 (hub), 20, 40, 60, 80, and 100% (tip) along the span (Fig. 13). The sweep and lean were not used at the hub. Thus, the total number of independent design variables D was 28 (5 × 6 − 2). For generation of smooth design perturbations in the radial direction, B-spline interpolation was used

Fig. 7 GTMBO with constraint 2 on the Branin function.


through the control points. The objective function is the adiabatic efficiency η, which is to be maximized at the working conditions of 97.67% choke mass flow and 100% engine speed, given by

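$$\eta(\mathbf{x}) = \frac{\left[P_{\text{outlet}}(\mathbf{x})/P_{\text{inlet}}(\mathbf{x})\right]^{(\gamma-1)/\gamma} - 1}{\left[Tmp_{\text{outlet}}(\mathbf{x})/Tmp_{\text{inlet}}(\mathbf{x})\right] - 1} \tag{13}$$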

where P and Tmp are the total mass-averaged pressure and temperature, respectively. The equality constraints on Pr and ṁ are the values of the datum blade, 2.092 for Pr and 0.5635 for ṁ (single-blade ṁ). It is difficult to deal with equality constraints in such a high-dimensional space, and so the constraints are given a maximum tolerance of 1% for Pr and 0.5% for ṁ [25]. To make the limits tighter, only a 0.2% tolerance is applied in this work. The bounds on the variables and the constraint values are

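$$\begin{aligned} -3 \le \text{sweep} \le 3\ \text{mm}, &\quad -0.5 \le \text{lean} \le 0.5\ \text{deg} \\ -0.5 \le \text{skew} \le 0.5\ \text{deg}, &\quad -0.5 \le \text{TER}/\text{LER} \le 0.5 \\ 2.088 \le Pr \le 2.096, &\quad 0.563 \le \dot{m} \le 0.565\ \text{kg/s} \end{aligned} \tag{14}$$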
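Equation (13) and the 0.2% tolerance band on the pressure ratio can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the ratio of specific heats γ = 1.4 (air) is an assumption, since the text does not state the value used.

```python
def adiabatic_efficiency(p_ratio, t_ratio, gamma=1.4):
    """Adiabatic efficiency of Eq. (13) from total pressure and temperature ratios.

    gamma = 1.4 is an assumed value for air; it is not given in the text.
    """
    return (p_ratio ** ((gamma - 1.0) / gamma) - 1.0) / (t_ratio - 1.0)

def tolerance_band(datum, tol):
    """Lower and upper limits for a fractional tolerance about a datum value."""
    return datum * (1.0 - tol), datum * (1.0 + tol)

# 0.2% band about the datum pressure ratio of 2.092.
pr_low, pr_high = tolerance_band(2.092, 0.002)
print(round(pr_low, 3), round(pr_high, 3))

# Sanity check: an isentropic compression has an efficiency of one.
print(adiabatic_efficiency(2.0, 2.0 ** (0.4 / 1.4)))
```

Rounded to three decimals, the band about the datum pressure ratio reproduces the limits 2.088 and 2.096 of Eq. (14).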

B. Generative-Topographic-Mapping-Based Optimization of Rotor 37

As used here, the NASA rotor 37 test case is a 28-dimensional problem, and it employs a CFD solver that takes 50 min to run a single simulation. A typical optimization solver like a GA with a minimal population size of 10 and 50 generations requires 500 runs of the CFD solver, which takes 25,000 min (approximately 18 days). Parallelizing all of these runs is not possible because a GA generates random populations, and the accuracy of the optimum obtained may also be questionable because so few generations and such a small population are used. A remedy could be to provide an initial population for the GA, but it would still require a larger population size and number of generations to converge to the optimum. This design problem is a typical example where dimension reduction can help reduce the computational time required for optimization. The computational expense limits the number of samples available for our DOE for GTM training, averaging, and validation purposes. Thus, a 280-sample DOE (10 times the dimension) is generated first, and random subsets of this DOE are used for averaging. The 280 runs of the solver take 10 days, but by running 20 in parallel at a time, the computational time is reduced to 12 h. We select 20 random samples of 150 designs each from the master set of 280 designs and use these for averaging. For validation, 50 samples separate from the training DOE were taken. For each of the 150-sample DOEs, the GTM method is applied, with the number of updates limited to 10 fresh solutions for each DOE. Note that not all of the solver runs will be successful, and hence the 150-sample DOEs are sure to contain some failed HYDRA runs, which are discarded before the GTMBO process. This further reduces the sample size we start with.

A 150-point DOE selected randomly from the parent sample of 280 designs, generated by a space-filling Latin hypercube, is used as the initial training sample for GTM. The training sample consists of the normalized 28 variables, the constraints Pr and ṁ, and the η objective value, thus giving 31 columns. After any failed runs present in the DOE are removed, the normal procedure of our constrained GTMBO is performed. First, the relevance of each constraint is found by experiments on the initial 280-sample DOE. Here, on application of the actual constraints to the DOE, only 6% of samples remained,

Fig. 8 Wing variables [19].

Table 1 Values of design variables, constraint parameters, and objective function [8]

Lower limit   Value     Upper limit   Quantity
100           168       250           Wing area, m²
6             9.07      12            Aspect ratio = wingspan²/wing area
0.2           0.313     0.45          Kink position
25            27.1      45            Sweep angle, deg
0.4           0.598     0.7           Inboard taper ratio = root chord/kink chord
0.2           0.506     0.6           Outboard taper ratio = kink chord/tip chord
0.1           0.150     0.18          Root tip chord
0.06          0.122     0.14          Kink tip chord
0.06          0.122     0.14          Tip tip chord
4.0           4.5       5.0           Tip washout, deg
0.65          0.75      0.84          Kink washout fraction
-             127,984   135,000       Wing weight
40.0          41.73     -             Wing volume, m³
-             4.179     5.4           Pitch-up margin
2.5           2.693     -             Undercarriage bay length, m
-             3.145     -             D/q, m²

Fig. 9 Global optimum found by GTM for different latent dimensions (x axis: latent dimension L; y axis: D/q objective function).

Fig. 10 Optimization history of drag function for 7-D GTM for 50 DOEs (x axis: DOE; y axis: D/q function value).


and so the constraints were relaxed to retain approximately 20% of the data (10% extra, rounded to 20%). The relevance of the constraints is found by applying them individually to the DOE, which retains 77 and 67% of samples for Pr and ṁ, respectively. Because these did not differ by a large margin, the constraint weights are kept equal, and hence the same ϵ value, 0.003, is applied to both. The relaxed constraints were thus

$$2.085 \le Pr \le 2.099, \quad 0.56 \le \dot{m} \le 0.568 \tag{15}$$

These relaxed constraints on Pr and ṁ are next applied to the training data set, which reduces the design space to around 30 designs for each of the different 150-point initial DOEs. Because of the limited budget of function evaluations, the latent dimension of GTM was restricted to 10, 15, and 20. This wide range was adopted to get an idea of the dimension around which the optimum satisfying the constraints of the problem is obtained, and so to decide whether further experiments are needed with other latent dimensions (fewer than 10 or more than 20). The GTM parameters used are latent points numbering 50 times the latent dimension, randomly sampled from a uniform distribution, and 10 EM cycles of training. After training on the data, GTM forms a low-dimensional latent space, which is searched using the actual limits of Pr and ṁ on the GTM-predicted constraint values. The best point predicted by GTM is then evaluated using HYDRA to check performance and constraints and is appended to the training set. This is continued for 10 update points. Note again that some of these update points may fail when evaluated with HYDRA even if they were predicted as feasible by GTM. To exploit parallel processing of HYDRA runs, each update of all 20 independent DOEs is evaluated in parallel using 20 nodes of the HPC cluster, so that they are all evaluated in 1 h.
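The run-time figures quoted in this subsection follow from simple arithmetic on the 50 min cost of one CFD run. The sketch below reproduces them; the helper names are hypothetical, and the only inputs are numbers stated in the text.

```python
import math

CFD_MINUTES = 50  # cost of one CFD simulation, from the text

def serial_days(n_runs, minutes_per_run=CFD_MINUTES):
    """Wall-clock days if the runs are executed one after another."""
    return n_runs * minutes_per_run / (60 * 24)

def parallel_hours(n_runs, n_parallel, minutes_per_run=CFD_MINUTES):
    """Wall-clock hours if n_parallel runs are executed at a time."""
    return math.ceil(n_runs / n_parallel) * minutes_per_run / 60

ga_runs = 10 * 50                                   # population 10, 50 generations
print(ga_runs * CFD_MINUTES)                        # 25000 minutes
print(round(serial_days(ga_runs), 1))               # about 17.4 days
print(round(serial_days(280), 1))                   # about 9.7 days for the 280-point DOE
print(round(parallel_hours(280, 20), 1))            # about 11.7 hours with 20 parallel runs
```

The computed values (about 17.4 days, 9.7 days, and 11.7 h) are consistent with the rounded figures quoted in the text (approximately 18 days, 10 days, and 12 h).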

Fig. 11 Constraint function values for 20 updates of GTM averaged over 50 DOEs (panels: averaged wing weight, averaged wing volume, averaged pitch-up margin, and averaged undercarriage bay length, each plotted against update). Only one update exceeded the upper limit for the wing weight function, shown by the boundary line.

Fig. 12 Computational mesh of the rotor blade.

Fig. 13 Variables at a control point (labels: lean, sweep, skew, leading-edge recambering, trailing-edge recambering, flow, direction of rotation).


As a comparison for the results given by the GTMBO method, we use the recent study of this test case with standard kriging and multifidelity (MF) cokriging models performed by Brooks et al. [26]. The standard kriging models used for this test case did not give improved performance, and hence cokriging models were used to improve the efficiency while remaining within the constraint limits. Both sets of results are used for comparison here.

C. Results and Discussion

The results for all 20 DOEs were averaged for the GTM runs with latent dimensions of 10, 15, and 20, and the averaged results showed that 15-D GTM could find a feasible solution that did not improve with 20 dimensions. Hence, higher latent dimensions were not tried for this problem. The results for all of the GTM dimensions are compared with kriging results from the literature [26] in Table 2. Because 15-D GTM gave the best solution among all of the methods in terms of optimum, constraint feasibility, and speed, it is adopted as the best dimension to which GTM can reduce the problem, and all further results are plotted for this dimension. However, a best design geometry from 20-D GTM will also be shown later. In the 15-D GTM, note that different DOEs had different sample sizes, all around 30, and so, for consistency in plotting the various results, 30 sample points from each DOE plus their corresponding 10 updates are used for averaging and plotting. The figures and tabulated results are discussed in detail next.

1) The averaged results of the 20 DOEs of 15-D GTM, with their standard deviations as error bars, are shown in Fig. 14. The gaps in the figure at update points 34, 35, and 36 show that there were some failed runs of the solver at those updates, which made averaging impossible for these specific updates. This is due to error in the GTM prediction at those updates, but even so, the failure rate was only 2% over all 200 updates. This is an interesting observation, because the percentage of failed results was higher for both the kriging and cokriging methods [26] than for GTM (see Table 2), showing that GTM more effectively avoided the bad regions of the design space. The reduced standard deviations of the η value for the updates show convergence toward the best value of 86.5%. To show the performance of all 20 DOEs used for averaging, the history of each optimization is plotted in Fig. 15. Unlike in the previous examples, the convergence of the objective function toward the optimum shows that the optimum value was not present in the DOE sample provided for GTM training but was found during the updates. This indicates that GTM could successfully find an optimal value even though the initial training set did not sample the region containing it.

2) To study the constraint feasibility of GTMBO, the averaged results for Pr and ṁ are plotted with their standard deviations in Figs. 16 and 17, respectively. The actual constraint limits and the relaxed constraint limits provided for GTM training are shown in the figures. For Pr, the constraints were not violated by any of the updates, and the value corresponding to the best η value is on the boundary at 2.088, which is −0.2% from the Pr of the datum blade. For ṁ, two updates violated the constraints slightly, and the feasible value for ṁ was on the boundary, at −0.2% from the datum blade. Note that the DOE has points outside the actual boundary because of the relaxation allowed, but the updates do not, because they are checked against the actual constraints. Thus, the best value of efficiency η satisfying the constraints is given by the ninth update of the 15-D GTM. The important observation here is that, when the method is compared with the kriging models, the kriging models have a higher efficiency but at the cost of violating the Pr constraint by a large margin, −0.8 and −0.98% from the datum blade (highlighted in Table 2). The efficiency gain given by GTM (1.56% over the datum) is only around 12% lower than that of the standard kriging model (1.79%), but its Pr deviation from the datum (−0.2%) is 75% smaller than that of the kriging model (−0.8%). This is the main advantage of GTM: it has effectively learned the constraint boundary in the design space, which kriging models lack and which is a very desirable quality. When the constraint limit of the design problem was limited to 0.2% for kriging by those authors, the design problem

Table 2 Comparison of different methods for NASA rotor 37 efficiency

Method                  Percent failed   Response   Percent datum   RMSE, m²   Time per run, min
10-D GTM                2                η          1.14            0.0078     0.5
10-D GTM                2                Pr         −0.05           0.015      0.5
10-D GTM                2                ṁ          0.2             0.004      0.5
15-D GTM                2                η          1.56            0.0087     3
15-D GTM                2                Pr         −0.2            0.015      3
15-D GTM                2                ṁ          −0.2            0.005      3
20-D GTM                1                η          1.56            0.0091     6.5
20-D GTM                1                Pr         −0.2            0.016      6.5
20-D GTM                1                ṁ          −0.2            0.005      6.5
Standard kriging [26]   3.1              η          1.79            0.0065     6.5
Standard kriging [26]   3.1              Pr         −0.8            0.16       6.5
Standard kriging [26]   3.1              ṁ          0.2             0.28       6.5
MF kriging [26]         3.9              η          2.34            0.0022     -
MF kriging [26]         3.9              Pr         −0.98           0.0014     -
MF kriging [26]         3.9              ṁ          0.27            0.0006     -

Fig. 14 The objective function η for 15-D GTM averaged over 20 DOEs (x axis: 30-point DOE with 10 updates; y axis: efficiency). The gaps in the updates show the runs that failed with HYDRA.

Fig. 15 Optimization history of all the 20 DOEs for 15-D GTM (x axis: 30-point DOE with 10 updates; y axis: efficiency).


required 120 updates for a 280-point initial DOE to obtain a good design, which proved very costly for this expensive-solver design problem. Thus, GTM could not only satisfy the constraints but also do so with fewer updates.

3) Table 2 compares the GTM models of different latent dimensions with each other and with the kriging models. The 10-D GTM gives the least improvement in efficiency, even though it has the best feasibility, the shortest time, and a low error for the validation set. The 15-D GTM gives a higher efficiency while satisfying the constraints, and 20-D GTM does not improve on it, either in validation error or in time; hence 15-D GTM can be considered a good model for this test case because it balances optimality, feasibility, validation error, and time. Compared with the kriging models, apart from the significant margin of constraint feasibility, the computational time for the 15-D GTM was also less. As discussed before, parameter tuning has to be performed independently for the constraints in kriging, and this increases its computational effort.

4) Finally, the best design from the updates of 15-D GTM and a further improved design found in the updates of 20-D GTM are shown in Figs. 18 and 19, along with the datum blade geometry. These figures show the rotor geometries and the static pressure contours. It can be observed that the shock lines and separated regions have shifted slightly backward toward the trailing edge for the optimized designs. The optimized geometry can also clearly be differentiated from the datum. The radial efficiencies along the blade height of both of these best designs and of the kriging models are compared against the datum in Fig. 20. It shows that, while the MF kriging model gives the best efficiency over most of the rotor (but violates the through-flow constraints), 15-D GTM gives an improved result in the hub region of the rotor and satisfies the tight constraints as well. Both the 15-D and 20-D GTM showed the best overall efficiency for the blade, particularly notable at 5 to 35% blade height.

Apart from the 12 h for the CFD runs of the initial DOE of 280, the computational expense for each latent dimension is the time for 200 GTM training runs (10 GTM trainings for each of the 20 averaging DOEs) and 50 min for the CFD run of each of the 10 updates. The 20 DOEs for averaging are run in parallel when the updates are calculated with the CFD solver, and so no extra time is used for this. Table 2 shows the training run times for each latent dimension. Thus, 15-D GTM would take 10 h for GTM training and 8 h for update calculation (i.e., a total of 18 h). This is a major improvement over conventional optimization, which takes days to finish, and slightly better

0 5 10 15 20 25 30 35 402.084

2.086

2.088

2.09

2.092

2.094

2.096

2.098

2.1

30 point DOE with 10 updates

Pre

ssur

e ra

tio

Fig. 16 Pr constraint with 15DGTMfor 20DOEs. The actual boundarylines (solid) and GTM relaxed boundary lines (dashed) are shown. Thegaps show failed runs.

Fig. 18 Datumgeometry (left) and a best optimizeddesign geometrywith 15-DGTM(right) of rotor blade. η is improvedby 1.85%while both constraintsare satisfied.

0 5 10 15 20 25 30 35 400.559

0.56

0.561

0.562

0.563

0.564

0.565

0.566

0.567

0.568

30 point DOE with 10 updates

Mas

s flo

w r

ate

Fig. 17 _m constraint with 15DGTM for 20 DOEs. The actual boundarylines (solid) and GTM relaxed boundary lines (dashed) are shown. Thegaps show failed runs.

Fig. 19 Datum geometry (left) and one of the best optimized geometry with 20-D GTM (right) of rotor blade. η is improved by 2.07% while bothconstraints are satisfied.

VISWANATH, FORRESTER, AND KEANE 1021

Dow

nloa

ded

by U

NIV

ER

SIT

Y O

F C

AL

IFO

RN

IA -

DA

VIS

on

May

10,

201

4 | h

ttp://

arc.

aiaa

.org

| D

OI:

10.

2514

/1.J

0524

14

Page 13: Constrained Design Optimization Using Generative Topographic Mapping

than other surrogate models like kriging, which takes more time forits training.
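The wall-clock accounting quoted for the 15-D GTM case can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text (12 h initial DOE, 10 h of GTM training, 10 updates at 50 min each); the function and constant names are illustrative and not taken from the authors' code.

```python
# Wall-clock accounting for the 15-D GTM case, using figures quoted in the text.
# All names here are illustrative assumptions, not the authors' implementation.

INITIAL_DOE_HOURS = 12.0     # CFD runs for the 280-point initial DOE
GTM_TRAINING_HOURS = 10.0    # 200 GTM training runs for the 15-D model
N_UPDATES = 10               # number of update (infill) designs
MINUTES_PER_UPDATE = 50.0    # CFD time per update; averaging DOEs run in parallel


def update_hours(n_updates: int, minutes_per_update: float) -> float:
    """CFD time spent on the updates, in hours."""
    return n_updates * minutes_per_update / 60.0


def optimization_hours(training_hours: float, n_updates: int,
                       minutes_per_update: float) -> float:
    """GTM training time plus update CFD time, in hours."""
    return training_hours + update_hours(n_updates, minutes_per_update)


updates = update_hours(N_UPDATES, MINUTES_PER_UPDATE)
total = optimization_hours(GTM_TRAINING_HOURS, N_UPDATES, MINUTES_PER_UPDATE)

print(f"update CFD time:       {updates:.2f} h")
print(f"training + updates:    {total:.2f} h")
print(f"including initial DOE: {total + INITIAL_DOE_HOURS:.2f} h")
```

The unrounded update time is 500 min, which the text rounds to 8 h to arrive at the quoted total of 18 h.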

VI. Conclusions

A nonlinear dimension-reduction method from the field of machine learning, generative topographic mapping (GTM), has been employed in this work for design space reduction in constrained optimization problems. The motive behind the work was to reduce the computational effort involved in evaluating expensive objective functions in optimization by reducing the dimension of the problem. GTM effectively provided a low-dimensional surrogate model to work with, which gives predictions of the expensive function through simple, easy-to-evaluate regression functions. It learns from the initial DOE of the actual expensive simulations and provides a transformation of this data onto a low-dimensional manifold, which then serves as the predictor of the actual objective function. The most effective optimizer for the search for the optimum on the surrogate was a GA because it had to sample only a lower-dimensional space. The method was first illustrated on a two-dimensional function to visually study its performance using two test constraint functions varying in complexity. GTM effectively found the global optimum in both cases but was not consistent in its results when averaged. This defect was alleviated in this problem by increasing the number of initial DOE points. This option of increasing the sample size was not employed for expensive solvers, however, because it would then not help in reducing computational time.

The method was then applied to a transonic aircraft wing design problem with the objective of minimizing drag on the wing subject to constraints on its weight, volume, pitch-up margin, and undercarriage bay length. GTM found the global optimum with a seven-dimensional latent space while simultaneously satisfying the constraint functions very effectively. This is because GTM was able to learn the constrained design space well from the provided DOE and hence find the global optimum in the constrained space. This is the advantage of the constrained GTMBO over other models: its ability to find not just good designs but feasible designs. It is shown with the example of an aircraft compressor rotor.

The test case of a transonic aircraft compressor rotor blade involves an expensive solver for its objective and constraint evaluations and is subjected to tight constraint limits. The advantage of the GTM model on this test case was its effectiveness in learning the constraint boundaries closely so as to give optimal points within the specified constraint limits, violating them only for a very small percentage of predictions. The optimum obtained was slightly worse than that of the kriging models applied to this test case, but the kriging models failed to satisfy the very tight constraints. The reduced GTM model for this test case took only half the time of a kriging model to find an update, thus also saving computational effort.
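The conclusions describe GTM as a low-dimensional surrogate that predicts through simple regression functions. A minimal sketch of that forward mapping is given below, assuming the standard GTM form in which each latent point is passed through Gaussian basis functions (centers μ, width σ) to form the activation matrix Φ, and the data-space prediction is Y = ΦW. The added bias column, the symbol M for the number of basis functions, and the random placeholder weights are assumptions made for illustration; in GTM the weights would be fitted to the DOE by the EM algorithm.

```python
import numpy as np


def rbf_activations(latent_points, centers, sigma):
    """Return the K x (M + 1) matrix Phi of Gaussian basis activations.

    A constant bias column is appended (an assumption of this sketch).
    """
    diff = latent_points[:, None, :] - centers[None, :, :]   # K x M x L
    sq_dist = np.sum(diff ** 2, axis=2)                       # K x M
    phi = np.exp(-sq_dist / (2.0 * sigma ** 2))
    bias = np.ones((latent_points.shape[0], 1))
    return np.hstack([phi, bias])


def gtm_map(latent_points, centers, sigma, weights):
    """Map latent points into data space: Y = Phi W."""
    return rbf_activations(latent_points, centers, sigma) @ weights


# Illustrative sizes: L = 2 latent dimensions, D = 5 data dimensions.
D, M = 5, 4
side = np.linspace(-1.0, 1.0, 4)
latent = np.array([[a, b] for a in side for b in side])                # K = 16 points
centers = np.array([[a, b] for a in (-0.5, 0.5) for b in (-0.5, 0.5)])  # M = 4 centers

rng = np.random.default_rng(0)
weights = rng.normal(size=(M + 1, D))   # placeholder; GTM fits these with EM

Y = gtm_map(latent, centers, sigma=1.0, weights=weights)
print(Y.shape)   # one D-dimensional prediction per latent point
```

Because the optimizer only has to propose points in the L-dimensional latent space, each candidate costs one matrix product to evaluate, which is what makes a population-based search such as a GA affordable on the surrogate.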

In summary, the GTM method could reduce the dimension of the problem, find the optimum of the function, and satisfy constraints effectively. It does not help in reducing the sample size needed to get to the optimum. On the contrary, for the mathematical two-dimensional test problems, the sample size had to be increased to get to the optimum. Thus, the method does not guide DOE selection, but with the given sample, it effectively finds the optimum while simultaneously satisfying constraints. A study of DOE selection for such surrogate methods could be a topic of future research in this area.

In industry, satisfying constraint boundaries is a major challenge because most optimization problems have constraints, which are often equality constraints or have very tight limits that are hard to satisfy. In such problems, a near-optimal design that meets the constraints is more important, and the GTM model performs well in this respect.
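The preference for a feasible near-optimal design over an infeasible better one can be illustrated with a small selection routine. The sketch below picks the candidate with the highest predicted objective among those whose predicted constraint values lie inside the limits, optionally widened by an expansion width ϵ. The form of the relaxation (a fraction of the limit range added on each side), the function names, and all numerical values are assumptions for illustration only; they are not the limits or data of the rotor test case.

```python
import numpy as np


def relaxed_limits(lower, upper, eps):
    """Widen each [lower, upper] band by eps times its width on both sides.

    This particular relaxation rule is an assumption of the sketch.
    """
    width = upper - lower
    return lower - eps * width, upper + eps * width


def best_feasible(objective, constraints, lower, upper, eps=0.0):
    """Index of the highest-objective candidate within the (relaxed) limits.

    Returns None when no candidate is feasible.
    """
    lo, hi = relaxed_limits(np.asarray(lower, dtype=float),
                            np.asarray(upper, dtype=float), eps)
    constraints = np.asarray(constraints, dtype=float)
    feasible = np.all((constraints >= lo) & (constraints <= hi), axis=1)
    if not feasible.any():
        return None
    idx = np.flatnonzero(feasible)
    return int(idx[np.argmax(np.asarray(objective, dtype=float)[idx])])


# Hypothetical predictions: efficiency, and (pressure ratio, mass flow rate).
efficiency = [0.880, 0.895, 0.888, 0.891]
predicted = [[2.0920, 0.5630],
             [2.0990, 0.5660],
             [2.0910, 0.5640],
             [2.0945, 0.5635]]
lower, upper = [2.090, 0.562], [2.094, 0.565]   # illustrative limits only

strict = best_feasible(efficiency, predicted, lower, upper)
relaxed = best_feasible(efficiency, predicted, lower, upper, eps=0.2)
print("strict limits :", strict)
print("relaxed limits:", relaxed)
```

In this toy data the candidate with the highest efficiency is rejected under both settings because it violates the pressure-ratio limit by a wide margin, while a slightly infeasible candidate becomes selectable only once the limits are relaxed.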

Acknowledgments

This work is supported by Rolls-Royce Plc and the U.K. Department of Trade and Industry.


Fig. 20 Radial profiles of efficiency for datum and best designs of different methods. (Axes: efficiency, horizontal; % blade height, vertical. Curves: datum, 15-D GTM, 20-D GTM, standard kriging, MF kriging.)



J. Martins
Associate Editor
