


SIAG/OPT Views-and-News
A Forum for the SIAM Activity Group on Optimization

Volume 13 Number 1 March 2002

Contents

Optimization under Uncertainty
  Optimization under Uncertainty: An Overview
    Urmila Diwekar ............................... 1
  Two-Stage Stochastic Linear Programming: A Tutorial
    Vicente Rico-Ramirez ......................... 8
  Combinatorial Optimization under Uncertainty
    Ki-Joo Kim .................................. 15
  Stochastic Nonlinear Optimization: The BONUS Algorithm
    Kemal Sahin ................................. 18
Bulletin ....................................... 21
Comments from the Editors ...................... 22

Optimization under Uncertainty

Optimization Under Uncertainty: An Overview

Urmila Diwekar
CUSTOM (Center for Uncertain Systems: Tools for Optimization and Uncertainty), Carnegie Mellon University, Pittsburgh, PA 15213 ([email protected]).1

1 This overview is based on the chapter entitled “Optimization Under Uncertainty” from Introduction to Applied Optimization [1].

1. Introduction

The deterministic optimization literature classifies problems as Linear Programming (LP), NonLinear Programming (NLP), Integer Programming (IP), Mixed Integer LP (MILP), and Mixed Integer NLP (MINLP), depending on the decision variables, objectives, and constraints. However, the future cannot be perfectly forecast and should instead be considered random or uncertain. Optimization under uncertainty refers to the branch of optimization where there are uncertainties in the data or the model; such problems are popularly known as Stochastic Programming or stochastic optimization problems. In this terminology, stochastic refers to the randomness, and programming refers to mathematical programming techniques like LP, NLP, IP, MILP, and MINLP. In discrete (integer) optimization there are probabilistic techniques like Simulated Annealing and Genetic Algorithms; these are sometimes referred to as stochastic optimization techniques because of the probabilistic nature of the method. In general, however, Stochastic Programming and stochastic optimization involve optimal decision making under uncertainty.

2. Types of Problems and Generalized Representation

The need for including uncertainty in complex decision models arose early in the history of mathematical programming. The first model forms, involving action followed by observation and reaction (or recourse), appear in [2, 3]. The commonly used example of a recourse problem is the news vendor or newsboy problem described below. This problem has a rich history that has been traced back to the economist Edgeworth [4], who applied a variance


to a bank cash-flow problem. However, it was not until the 1950s that this problem, like many other OR/MS models seeded by the war effort, became a topic of serious and extensive study by academicians [5].

Example: The simplest form of a stochastic program may be the news vendor (also known as the newsboy) problem. In the news vendor problem, the vendor must determine how many papers (x) to buy now, at a cost of c cents each, for a demand which is uncertain. The selling price is sp cents per paper. For a specific problem, whose weekly demand is shown below, the cost of each paper is c = 20 cents and the selling price is sp = 25 cents. Solve the problem given that the news vendor knows the demand uncertainties but does not know the demand curve for the coming week a priori (Table 1). Assume no salvage value, s = 0, so that any papers bought in excess of demand are simply discarded with no return.

Table 1: Weekly demand and its uncertainties.

      Weekly demand              Uncertainty
  i   Day        Demand (di)     j   Demand (dj)   Probability (pj)
  1   Monday     50              1   50            5/7
  2   Tuesday    50              2   100           1/7
  3   Wednesday  50              3   140           1/7
  4   Thursday   50
  5   Friday     50
  6   Saturday   100
  7   Sunday     140

Solution: In this problem, we want to find how many papers the vendor must buy (x) to maximize the profit. Let r be the effective sales and w be the excess, which is going to be thrown away. This problem falls under the category of Stochastic Programming with recourse, where there is action (x), followed by observation (profit), and reaction or recourse (r and w). We know that any papers bought in excess are simply thrown away, so it is obvious that we should minimize the waste while increasing the sales. Our first instinct is to find the average demand and the optimal supply x corresponding to that demand. Since the average demand from Table 1 is 70 papers, x = 70 should be the solution. Let us see if this represents the optimal solution for the problem. Table 2 shows the observation (profit function) for this action.

Table 2: Supply and profit.

  i   Day        Supply (xi)   Profit (cents)
  1   Monday     70            -150
  2   Tuesday    70            -150
  3   Wednesday  70            -150
  4   Thursday   70            -150
  5   Friday     70            -150
  6   Saturday   70            350
  7   Sunday     70            350
      Average Weekly   -       -50

From Table 2, it is obvious that if we take the average demand as the solution, then the news vendor will be making a loss of 50 cents per week. This probably is not the optimal solution. Can we do better? For that we need to propagate the uncertainty in the demand to see the effect of uncertainty on the objective function and then find the optimum value of x. This formulation is shown below.

Maximize_x   Z = Profitavg(u)

subject to

Profitavg(u) = ∫ from 0 to 1 of [−c x + Sales(r, w, p(u))] dp
             = Σ_j pj Sales(r, w, dj) − c x

Sales(r, w, dj) = sp rj + s wj

where

rj = min(x, dj) = x,   if x ≤ dj
                = dj,  if x ≥ dj

wj = max(x − dj, 0) = 0,       if x ≤ dj
                    = x − dj,  if x ≥ dj

The above information can be transformed into the daily profit as follows: if d1 ≤ x ≤ d2,

Profit = −c x + (5/7) sp d1 + (1/7) sp x + (1/7) sp x,   (1)


or if d2 ≤ x ≤ d3,

Profit = −c x + (5/7) sp d1 + (1/7) sp d2 + (1/7) sp x.   (2)

Notice that the problem involves two expressions for the objective function, Equations 1 and 2, making the objective function nonsmooth (piecewise linear), so the problem is no longer a single LP. Special methods like the L-shaped decomposition or stochastic decomposition are required to solve such problems. However, since this problem is simple, we can solve it as two separate LPs. The two possible solutions to the above LPs are x = d1 = 50 and x = d2 = 100, respectively. This provides the news vendor with an optimum profit of 1750 cents per week from Equation 1 and with a loss of 2750 cents per week from Equation 2. Obviously, Equation 1 provides the global optimum for this problem.

The difference between taking the average value of the uncertain variable as the solution and using stochastic analysis (propagating the uncertainties through the model and finding the effect on the objective function, as above) is defined as the Value of the Stochastic Solution, VSS. If we take the average value of the demand, i.e. x = 70, as the solution, we obtain a loss of 50 cents per week. Therefore, the value of the stochastic solution is VSS = 1750 − (−50) = 1800 cents per week.
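The numbers above are easy to verify by direct computation. The following sketch evaluates the expected weekly profit of the example for the candidate supplies x = 50, 70, and 100, using exact fractions for the 5/7 and 1/7 probabilities from Table 1:

```python
from fractions import Fraction as F

c, sp = 20, 25                            # cost and selling price in cents
demand = [50, 100, 140]                   # possible daily demands dj (Table 1)
prob = [F(5, 7), F(1, 7), F(1, 7)]        # probabilities pj

def weekly_profit(x):
    """Expected weekly profit when buying x papers every day:
    7 * (E[sp * min(x, d)] - c * x), with zero salvage value."""
    daily = sum(p * sp * min(x, d) for p, d in zip(prob, demand)) - c * x
    return 7 * daily

# stochastic solution (x = 50) versus mean-value solution (x = 70)
vss = weekly_profit(50) - weekly_profit(70)
```

Evaluating `weekly_profit` at 50, 70, and 100 reproduces the 1750, -50, and -2750 cents per week quoted in the text, and `vss` reproduces the 1800 cents per week.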

Now consider the case where the vendor knows the exact demand (Table 1) a priori. This is the perfect information problem, where we want to find the solution xi for each day i. Let us formulate the problem in terms of xi.

Maximize_xi   Profiti = −c xi + Sales(r, w, di)

subject to

Sales(r, w, di) = sp ri + s wi

ri = min(xi, di) = xi,  if xi ≤ di
                 = di,  if xi ≥ di

wi = max(xi − di, 0) = 0,        if xi ≤ di
                     = xi − di,  if xi ≥ di

Here we need to solve each problem (for each i) separately, leading to the decisions shown in Table 3.

Table 3: Supply and profit.

  i   Day        Supply (xi)   Profit (cents)
  1   Monday     50            250
  2   Tuesday    50            250
  3   Wednesday  50            250
  4   Thursday   50            250
  5   Friday     50            250
  6   Saturday   100           500
  7   Sunday     140           700
      Average Weekly   -       2450

One can see that the difference between the two values, (1) when the news vendor has perfect information and (2) when he does not have perfect information but can represent it using probabilistic functions, is the Expected Value of Perfect Information, EVPI. The EVPI is 2450 − 1750 = 700 cents per week for this problem.
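The perfect-information value in Table 3 also follows from a one-line computation: with xi = di each day, the vendor simply earns the margin (sp − c) on every paper sold. A minimal check, reusing the numbers from the example:

```python
c, sp = 20, 25                            # cents, from the example
week = [50, 50, 50, 50, 50, 100, 140]     # exact daily demands (Table 1)

# With perfect information the vendor buys xi = di each day,
# earning (sp - c) cents per paper.
perfect = sum((sp - c) * d for d in week)

# EVPI relative to the best "here and now" profit (1750 cents, from above)
evpi = perfect - 1750
```

This reproduces the weekly total of 2450 cents and the EVPI of 700 cents quoted in the text.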

The literature on optimization under uncertainty very often divides the problems into categories such as “wait and see”, “here and now”, and “chance constrained optimization” [6, 7]. In “wait and see” we wait until an observation is made on the random elements, and then solve the deterministic problem. The last formulation described above, in terms of the problem under perfect information, falls under this category. This “wait and see” problem of Madansky [8], originally called “Stochastic Programming” by Tintner [9], is not, in a sense, one of decision analysis. In decision making, the decisions have to be made “here and now” about the activity levels. The “here and now” problem involves optimization over some probabilistic measure, usually the expected value. By this definition, chance constrained optimization problems can be included in this particular category of optimization under uncertainty. Chance constrained optimization involves constraints which are not expected to always be satisfied, but only in a proportion of cases, or “with given probabilities”.

These various categories require different methods for obtaining their solutions. As stated earlier, we can easily divide these problems into two main categories: (1) here and now problems, and (2) wait and see problems. It should be noted that many problems have both here and now, and wait and see, problems embedded in them. The trick is to divide the decisions into these two categories and use a coupled approach.

Here and Now Problems: The “here and now” problems require that the objective function and constraints be expressed in terms of some probabilistic representation (e.g. expected value, variance, fractiles, most likely values). For example, in chance constrained programming the objective function is expressed in terms of the expected value, while the constraints are expressed in terms of fractiles (probability of constraint violation); in Taguchi’s off-line quality control method [10], the objective is to minimize the variance. These problems can be classified as here and now problems.

The “here and now” problem, where the decision variables and uncertain parameters are separated, can then be viewed as:

Optimize   J = P1(j(x, u))   (3)

subject to

P2(h(x, u)) = 0
P3(g(x, u) ≥ 0) ≥ α

where u is the vector of uncertain parameters and P represents a cumulative distribution functional such as the expected value, mode, variance, or fractiles.

Unlike the deterministic optimization problem, in stochastic optimization one has to consider the probabilistic functional of the objective function and constraints. The generalized treatment of such problems is to use probabilistic or stochastic models instead of the deterministic model inside the optimization loop.

Figure 1a represents the generalized solution procedure, where the deterministic model is replaced by an iterative stochastic model with a sampling loop representing the discretized uncertainty space. The uncertainty space is represented in terms of moments like the mean, fractiles, or the standard deviation of the output. In the chance constrained formulation, the uncertainty surface is translated into input moments, resulting in an equivalent deterministic optimization problem. This is discussed in the next section.
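The here and now loop can be sketched in a few lines. The model, its uncertain parameter u ~ Normal(5, 1), and the grid-search “optimizer” below are all illustrative assumptions, not part of the original article; the point is only the structure: an outer optimizer ranking decisions by an expectation estimated from a fixed sample of the uncertainty space.

```python
import random

def model(x, u):
    """Hypothetical profit-like objective with uncertain parameter u (assumed)."""
    return -(x - u) ** 2

rng = random.Random(0)
samples = [rng.gauss(5.0, 1.0) for _ in range(2000)]  # discretized uncertainty space

def expected_objective(x):
    """Stochastic-modeler step: propagate all samples through the model
    and return the sample-average objective (a probabilistic functional)."""
    return sum(model(x, u) for u in samples) / len(samples)

# crude grid search standing in for the outer optimizer loop
best_x = max([i / 10 for i in range(101)], key=expected_objective)
```

For this toy objective the expectation is maximized near the mean of u, so `best_x` lands close to 5.0; a real application would replace both the model and the grid search with the problem's own simulator and NLP solver.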

Wait and See: In contrast to here and now problems, which yield optimal solutions that achieve a given level of confidence, wait and see problems involve a category of formulations that show the effects of uncertainty on the optimum design. A wait and see problem involves a deterministic optimal decision at each scenario or random sample, equivalent to solving several deterministic optimization problems. The generalized representation of this problem is given below.

[Figure 1: Pictorial representation of the stochastic programming framework. (a) Here and Now: the optimizer passes decision variables to a stochastic modeler that wraps the model, samples the uncertain variables, and returns a probabilistic objective function and constraints, yielding a single optimal design. (b) Wait and See: the stochastic modeler forms the outer loop, sampling the uncertain variables, while the optimizer solves the objective function and constraints for each sample, yielding a distribution of optimal designs.]

Optimize   Z = z(x, u∗)   (4)

subject to

h(x, u∗) = 0
g(x, u∗) ≤ 0

where u∗ is the vector of values of the uncertain variables corresponding to each scenario or sample. This optimization procedure is repeated for each sample of uncertain variables u, and a probabilistic representation of the outcome is obtained.
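A wait and see sketch for the same toy objective used above (an assumption for illustration, not the article's model): for -(x - u)² the deterministic optimum at a fixed scenario u is simply x* = u, so solving one deterministic problem per sample turns the scenario distribution directly into a distribution of optimal designs.

```python
import random
import statistics

def scenario_optimum(u):
    """Deterministic optimal decision for the fixed scenario u.
    For the toy objective -(x - u)**2 this is x* = u in closed form."""
    return u

rng = random.Random(1)
scenarios = [rng.gauss(5.0, 1.0) for _ in range(1000)]   # samples of u

# one deterministic optimization per scenario -> distribution of designs
optima = [scenario_optimum(u) for u in scenarios]
mean_opt = statistics.mean(optima)
```

The output is not a single design but the list `optima`, whose spread (here, roughly the spread of u itself) shows the effect of uncertainty on the optimum design.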


Figure 1b represents the generalized solution procedure, where the deterministic problem forms the inner loop and the stochastic modeling forms the outer loop. The difference between the two solutions obtained using the two frameworks is the Expected Value of Perfect Information (EVPI). The concept of EVPI was first developed in the context of decision analysis and can be found in classical references such as [11]. From Figure 1 it is clear that, by simply interchanging the positions of the uncertainty analysis framework and the optimization framework, one can solve many problems in the stochastic optimization and Stochastic Programming domain [1]. Recourse problems with multiple stages involve decisions that are taken before the uncertainty is realized (here and now) and recourse actions that can be taken when information is disclosed (wait and see). These problems can be solved using decomposition methods.

As can be seen from the above description, both here and now and wait and see problems require representation of the uncertainties in the probabilistic space and then propagation of these uncertainties through the model to obtain a probabilistic representation of the output. This is the major difference between stochastic and deterministic optimization problems. Is it possible to propagate the uncertainty using moments (like mean and variance), thereby obtaining a deterministic representation of the problem? This is the basis of the chance constrained programming method, developed very early in the history of optimization under uncertainty, principally by Charnes and Cooper [12].

3. Chance Constrained Programming Method

In the chance constrained programming (CCP) method, some of the constraints need not always hold, as we had assumed in the earlier problems. Chance constrained problems can be represented as follows:

Optimize   J = P1(j(x, u)) = E(z(x, u))   (5)

subject to

P(g(x) ≤ u) ≤ α   (6)

In the above formulation, Equation 6 is the chance constraint. In the chance constraint formulation, this constraint (or constraints) is (are) converted into a deterministic equivalent under the assumption that the distribution of the uncertain variables u is a stable distribution. Stable distributions are such that the convolution of two distribution functions F((x − m1)/υ1) and F((x − m2)/υ2) is of the form F((x − m)/υ), where mi and υi are the two parameters of the distribution [13]. Normal, Cauchy, Uniform, and Chi-square are all stable distributions that allow the conversion of probabilistic constraints into deterministic ones. The deterministic constraints are in terms of moments of the uncertain variable u (input uncertainties). For example, if the constraint g in Equation 6 has a cumulative probability distribution F, then the deterministic equivalent of the chance constraint 6 is:

g(x) ≤ F−1(α)   (7)

where F−1 is the inverse of the cumulative distribution function F.
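Numerically, obtaining the deterministic bound amounts to one inverse-CDF evaluation. A minimal sketch, assuming u is normally distributed with the (hypothetical) parameters below and a probability level α = 0.95:

```python
from statistics import NormalDist

# Assumed for illustration: u ~ Normal(mu, sigma) and level alpha.
mu, sigma, alpha = 10.0, 2.0, 0.95

F_inv = NormalDist(mu, sigma).inv_cdf   # inverse CDF F^{-1} of u

# The probabilistic constraint is replaced by the deterministic bound
# g(x) <= F_inv(alpha), which an ordinary deterministic solver can handle.
bound = F_inv(alpha)
```

For these numbers the bound is mu + sigma * z(0.95), i.e. about 13.29, so the chance constraint collapses to an ordinary inequality on g(x).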

The major restrictions in applying the CCP formulation are that the uncertainty distributions should be stable distribution functions, that the uncertain variables should appear in linear terms in the chance constraint, and that the problem needs to satisfy general convexity conditions. The advantage of the method is that one can apply deterministic optimization techniques to solve the problem.

4. Uncertainty Analysis and Sampling

The probabilistic or stochastic modeling (Figure 1) is an iterative procedure that involves:

1. Specifying the uncertainties in key input parameters in terms of probability distributions.

2. Sampling the distribution of the specified parameters in an iterative fashion.

3. Propagating the effects of the uncertainties through the model and applying statistical techniques to analyze the results.

4.1 Specifying Uncertainty Using Probability Distributions

To accommodate the diverse nature of uncertainty, different distributions can be used. The type of distribution chosen for an uncertain variable reflects the amount of information that is available. For example, the uniform and log-uniform distributions represent an equal likelihood of a value lying anywhere within a specified range, on either a linear or logarithmic scale, respectively. Further, a normal (Gaussian) distribution reflects a symmetric but varying probability of a parameter value being above or below the mean value. In contrast, lognormal and some triangular distributions are skewed such that there is a higher probability of values lying on one side of the median than the other. Once probability distributions are assigned to the uncertain parameters, the next step is to perform a sampling operation over the multi-variable uncertain parameter domain.

4.2 Sampling Techniques

One of the most widely used techniques for sampling from a probability distribution is Monte Carlo sampling, which is based on a pseudo-random generator used to approximate a uniform distribution (i.e., having equal probability in the range from 0 to 1). The specific values for each input variable are selected by inverse transformation over the cumulative probability distribution.
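The inverse-transformation step can be sketched for a distribution whose CDF inverts in closed form. The exponential distribution here is an illustrative choice, not one from the article: a uniform draw u in (0, 1) is mapped through F⁻¹(u) = −ln(1 − u)/λ.

```python
import math
import random

def sample_exponential(rate, n, seed=0):
    """Monte Carlo draws via inverse transformation over the cumulative
    distribution: u ~ Uniform(0,1) mapped through F^{-1}(u) = -ln(1-u)/rate."""
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]

xs = sample_exponential(rate=2.0, n=10000)
```

The sample mean converges to 1/rate = 0.5, which is a quick sanity check that the transformation reproduces the target distribution.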

Importance Sampling: Crude Monte Carlo methods can result in large error bounds (confidence intervals) and variance. Variance reduction techniques are statistical procedures designed to reduce the variance in Monte Carlo estimates [14]. Importance sampling, Latin Hypercube Sampling (LHS) [15, 16], Descriptive Sampling [17], and Hammersley Sequence Sampling [18] are examples of variance reduction techniques. In importance Monte Carlo sampling, the goal is to replace a sample using the distribution of u with one that uses an alternative distribution placing more weight in the areas of importance. Obviously, such a distribution function is problem dependent and is difficult to find. The following two sampling methods provide a generalized approach to improving the computational efficiency of sampling.

Latin Hypercube Sampling: The main advantage of the Monte Carlo method lies in the fact that the results from any Monte Carlo simulation can be treated using classical statistical methods; thus results can be presented in the form of histograms, and methods of statistical estimation and inference are applicable. Nevertheless, in most applications the actual relationship between successive points in a sample has no physical significance; hence the randomness/independence used for approximating a uniform distribution is not critical [19]. Once it is apparent that the uniformity properties are central to the design of sampling techniques, constrained or stratified sampling techniques become appealing [20].

Latin hypercube sampling [15, 16] is one form of stratified sampling that can yield more precise estimates of the distribution function. In Latin hypercube sampling, the range of each uncertain parameter Xi is sub-divided into non-overlapping intervals of equal probability. One value from each interval is selected at random with respect to the probability distribution within the interval. The n values thus obtained for X1 are paired in a random manner (i.e., equally likely combinations) with the n values of X2. These n pairs are then combined with the n values of X3 to form n triplets, and so on, until n k-tuplets are formed. In median Latin Hypercube sampling (MLHS), the value from each interval is chosen as the mid-point of the interval. MLHS is similar to the descriptive sampling described by [17]. The main drawback of this stratification scheme is that it is uniform in one dimension but does not provide uniformity properties in k dimensions.
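The stratify-then-pair procedure just described can be sketched on the unit hypercube (uniform marginals; for other distributions each coordinate would then be pushed through the marginal inverse CDF):

```python
import random

def latin_hypercube(n, k, seed=0):
    """Plain LHS on the unit hypercube: one uniform draw from each of the
    n equal-probability strata per dimension, then random pairing of the
    columns across the k dimensions."""
    rng = random.Random(seed)
    columns = []
    for _ in range(k):
        col = [(i + rng.random()) / n for i in range(n)]  # one value per stratum
        rng.shuffle(col)                                  # random pairing
        columns.append(col)
    return list(zip(*columns))   # n points, each a k-tuple

pts = latin_hypercube(10, 2)
```

Each dimension of the result contains exactly one value per interval [i/n, (i+1)/n), which is the defining property of the scheme; the drawback noted above is visible too, since the pairing across dimensions is purely random.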

Hammersley Sequence Sampling: Recently, an efficient sampling technique (Hammersley sequence sampling) based on Hammersley points has been developed by Kalagnanam and Diwekar [18], which uses an optimal design scheme for placing the n points on a k-dimensional hypercube. This scheme ensures that the sample set is more representative of the population, showing uniformity properties in multiple dimensions, unlike Monte Carlo, Latin hypercube, and its variant, median Latin hypercube sampling. The paper by Kalagnanam and Diwekar [18] provides a comparison of the performance of the Hammersley sequence sampling (HSS) technique with that of the other techniques. It was found that the HSS technique is at least 3 to 100 times faster than the LHS and Monte Carlo techniques and hence is a preferred technique for uncertainty analysis as well as optimization under uncertainty.
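The underlying Hammersley point set (not the full HSS design of [18], which adds further machinery) is easy to generate: the first coordinate of point i is i/n and the remaining coordinates are radical inverses of i in successive prime bases.

```python
def radical_inverse(i, base):
    """Van der Corput radical inverse: reflect the base-b digits of i
    about the radix point, e.g. i=3, base=2 -> 0.11 (binary) = 0.75."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * f
        i //= base
        f /= base
    return inv

def hammersley(n, k, primes=(2, 3, 5, 7, 11)):
    """n Hammersley points on the k-dimensional unit hypercube (k - 1 <= 5
    with the default prime bases)."""
    return [tuple([i / n] + [radical_inverse(i, primes[d]) for d in range(k - 1)])
            for i in range(n)]

pts = hammersley(8, 2)
```

The resulting low-discrepancy points fill the hypercube far more evenly than pseudo-random draws, which is the source of the uniformity-in-multi-dimensions property described above.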


4.3 Sampling Accuracy and Different Algorithms

As stated earlier, Stochastic Programming formulations often include some approximation of the underlying probability distribution. The disadvantage of sampling approaches that solve the γ-th approximation completely is that some effort might be wasted on optimizing when the approximation is not accurate [21]. For specific structures where the L-shaped method is applicable, two approaches avoid these problems by embedding sampling within another algorithm without complete optimization. These two approaches are the method of Dantzig and Infanger [22], which uses importance sampling to reduce the variance in each cut based on a large sample, and the stochastic decomposition method proposed by Higle and Sen [23]. Please refer to the article by Dr. Rico-Ramirez on stochastic linear programming for details.

In almost all stochastic optimization problems, the major bottleneck is the computational time involved in generating and evaluating probabilistic functions of the objective function and constraints. The accuracy of the estimates of the actual mean (µ) and the actual standard deviation (σ) is particularly important to obtain realistic estimates of any performance or economic parameter. However, this accuracy depends on the number of samples. The number of samples required for a given accuracy in a stochastic optimization problem depends upon several factors, such as the type of uncertainty and the point values of the decision variables [24]. Especially for optimization problems, the number of samples required also depends on the location of the trial point solution in the optimization space. Figure 2 shows how the shape of the surface over a range of uncertain parameter values changes as one moves to a different iteration (different values of the decision variables) in the optimization loop. Therefore, the selection of the number of samples for the stochastic optimization procedure is a crucial and challenging problem. A combinatorial optimization algorithm that automatically selects the number of samples and provides the trade-off between accuracy and efficiency is presented in the article by Dr. Ki-Joo Kim in this issue.

[Figure 2: Uncertainty space at different optimization iterations. Two surface plots of cost versus the uncertain parameters U1 and U2, one for each iteration.]

5. Summary

The problems in optimization under uncertainty involve probabilistic objective functions and constraints. These problems can be categorized as (1) here and now problems and (2) wait and see problems. Many problems involve both here and now, and wait and see, decisions. The difference in the solutions of these two formulations is the expected value of perfect information (EVPI). Recourse problems normally involve both here and now, and wait and see, decisions and hence are normally solved by decomposition strategies like the L-shaped method. The major bottleneck in solving stochastic optimization (programming) problems is the propagation of uncertainties. In chance constrained programming, the uncertainties are propagated as moments, resulting in a deterministic equivalent problem. However, chance constrained programming methods are applicable to a limited number of problems. A generalized approach to uncertainty propagation involves sampling methods that are computationally intensive. New sampling techniques like Hammersley sequence sampling reduce the computational intensity of the sampling approach. Sampling error bounds can be used to reduce the computational intensity of the stochastic optimization procedure further. This strategy is used in some of the decomposition methods and in the stochastic annealing algorithm described in the next two articles. For stochastic nonlinear problems, a new algorithm called Better Optimization of Nonlinear Uncertain Systems (BONUS) is presented in the last article by Dr. Kemal Sahin.


REFERENCES

[1] Diwekar U. M., Introduction to Applied Optimization, to be published by Kluwer Academic Publishers, Boston, MA, 2002.

[2] Beale E. M. L., On minimizing a convex function subject to linear inequalities, J. Royal Statistical Society, 17B: 173.

[3] Dantzig G. B., Linear programming under uncertainty, Management Science, 1: 197.

[4] Edgeworth F. Y., The mathematical theory of banking, J. Royal Statistical Society, 1888, 51: 113.

[5] Petruzzi N. C. and M. Dada, Pricing and the newsvendor problem: a review with extensions, Operations Research, 1999, 47(2): 183.

[6] Vajda S., Probabilistic Programming, Academic Press, New York, 1972.

[7] Nemhauser G. L., A. H. G. Rinnooy Kan, and M. J. Todd, Optimization: Handbooks in Operations Research and Management Science, Vol. 1, North-Holland Press, New York, 1989.

[8] Madansky A., Inequalities for stochastic linear programming problems, Management Science, 1960, 6: 197.

[9] Tintner G., Stochastic linear programming with applications to agricultural economics, in Proc. 2nd Symp. Linear Programming, Washington, 1955, 197.

[10] Taguchi G., Introduction to Quality Engineering, Asian Productivity Center, Tokyo, Japan, 1986.

[11] Raiffa H. and R. Schlaifer, Applied Statistical Decision Theory, Harvard University, Boston, MA, 1961.

[12] Charnes A. and W. W. Cooper, Chance-constrained programming, Management Science, 1959, 5: 73.

[13] Lukacs E., Characteristic Functions, Charles Griffin, London, 1960.

[14] James B. A. P., Variance reduction techniques, J. Operations Research Society, 1985, 36(6): 525.

[15] McKay M. D., R. J. Beckman, and W. J. Conover, A comparison of three methods of selecting values of input variables in the analysis of output from a computer code, Technometrics, 1979, 21(2): 239.

[16] Iman R. L. and W. J. Conover, Small sample sensitivity analysis techniques for computer models, with an application to risk assessment, Communications in Statistics, 1982, A17: 1749.

[17] Saliby E., Descriptive sampling: A better approach to Monte Carlo simulations, J. Operations Research Society, 1990, 41(12): 1133.

[18] Kalagnanam J. R. and U. M. Diwekar, An efficient sampling technique for off-line quality control, Technometrics, 1997, 39(3): 308.

[19] Knuth D. E., The Art of Computer Programming, Volume 1: Fundamental Algorithms, Addison-Wesley, 1973.

[20] Morgan G. and M. Henrion, Uncertainty: A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis, Cambridge University Press, Cambridge, UK, 1990.

[21] Birge J. R. and F. Louveaux, Introduction to Stochastic Programming, Springer-Verlag, New York, 1997.

[22] Dantzig G. B. and G. Infanger, Large scale stochastic linear programs: importance sampling and Benders decomposition, Computational and Applied Mathematics, C. Brezinski and U. Kulisch, Eds., 1991, 111.

[23] Higle J. L. and S. Sen, Stochastic decomposition: an algorithm for two-stage linear programs with recourse, Mathematics of Operations Research, 1991, 16(3): 650-669.

[24] Painton L. A. and U. M. Diwekar, Stochastic annealing under uncertainty, European Journal of Operational Research, 1995, 83: 489.

Two-Stage Stochastic Linear Programming: A Tutorial

Vicente Rico-Ramirez
Instituto Tecnologico de Celaya, Mexico ([email protected]).1

1. Introduction

There is a huge body of literature on stochastic linear programming, including surveys and numerous articles (e.g., [2, 3, 6]). The main class of stochastic linear problems with recourse (SLPwR) involves two stages. In the first stage, the choice of the decision variable x is made. In the second stage, following the observation of the value of u and the evaluation of the objective function, a corrective action (represented by the second stage decisions, v) is suggested.

The standard mathematical form of a SLPwR problem is given by Equations (1) and (2). Equation (1) is referred to as the first stage problem:

Min  c^T x + Q(x)
s.t. A x α b                                  (1)
     x ≥ 0

where A is a coefficient matrix, c is a coefficient vector, α represents any of =, ≥, or ≤, Q(x) is the recourse function defined by Q(x) = E_u[Q(x, u)], and Q(x, u) is obtained from the second stage problem, Equation (2):

Q(x, u) = Min  q(u)^T v
          s.t. W(u) v α h(u) − T(u) x         (2)
               v ≥ 0

1 Visiting scientist, CUSTOM, Carnegie Mellon University.

where q is a coefficient vector, W and T are coefficient matrices, and h is a right-hand-side vector; in principle all of these may depend on the random variables u. The matrix W is of particular interest and is known as the recourse matrix.

Observe that the problem given by Equation (1) has linear constraints and a convex objective function, so there is a rather complete optimization theory with necessary and sufficient conditions. However, SLPwR problems are difficult to solve since, in general, it is very demanding to generate evaluations of the recourse function Q(x) and its gradient. Some of the difficulties disappear when the recourse structure is simple, as in problems with fixed and complete recourse. Fixed recourse means that the recourse matrix W is independent of u (simply a constant coefficient matrix), whereas complete recourse means that any set of values we choose for the first stage decisions x leaves us with a feasible second stage problem.

The two main algorithms for stochastic linear programming with fixed recourse are the L-Shaped method [1, 2] and the Stochastic Decomposition (SD) algorithm [4, 5]:

1. The L-Shaped method is used when the random variables of the problem are described by discrete distribution functions. As a result, exact computations of the lower bound of the recourse function are possible.

2. The SD algorithm, on the other hand, uses sampling when the random variables are represented by continuous distribution functions. As a result, estimates of the lower bound of the recourse function are based on expectation.
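The difference between the two settings can be made concrete in a few lines of code: with a discrete distribution the expectation of the recourse cost is an exact finite sum, while with a continuous distribution it must be estimated from samples. In the sketch below, the recourse cost Q(x, u) = max(0, u − x) and all numerical values are hypothetical, chosen only for illustration:

```python
import random

def Q(x, u):
    # Hypothetical second-stage (recourse) cost: shortfall of x below u.
    return max(0.0, u - x)

def exact_expectation(x, realizations, probs):
    # L-Shaped setting: discrete distribution, so E[Q] is an exact sum.
    return sum(p * Q(x, u) for u, p in zip(realizations, probs))

def sampled_expectation(x, sampler, n):
    # SD setting: continuous distribution, so E[Q] is estimated by sampling.
    return sum(Q(x, sampler()) for _ in range(n)) / n

us = [0.0, 1.0, 2.0]      # discrete realizations
ps = [1/3, 1/3, 1/3]      # their probabilities
exact = exact_expectation(0.5, us, ps)      # (0 + 0.5 + 1.5)/3 = 2/3

rng = random.Random(0)
est = sampled_expectation(0.5, lambda: rng.uniform(0.0, 2.0), 10000)
# est is only close to the true value 0.5625; a new sample changes it.
print(exact, est)
```

The exact sum is what the L-Shaped method exploits to compute exact bounds; the sample average is what forces SD to work with statistical estimates.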

2. Feasibility and Optimality Cuts

Both algorithms (L-Shaped and SD) are based on the addition of linear constraints (known as cuts) to the first stage problem. Two types of cuts are successively added during the solution procedure: feasibility cuts and optimality cuts.

A feasibility cut is a linear constraint which ensures that a first stage decision is second stage feasible. Notice that a complete recourse problem does not need the addition of feasibility cuts. There is a formal notation to indicate that any decision we take in the first stage should leave us with a feasible second stage problem:

Min  c^T x + Q(x)
s.t. x ∈ (K_1 ∩ K_2)                          (3)

where the set K_1 contains the decisions which satisfy the constraints of the first stage, while the set K_2 contains those decisions which are second stage feasible. That is, K_1 = {x | A x = b, x ≥ 0} and K_2 = {x | Q(x) < ∞}. Very often, the problem in Equation (3) is reformulated as:

Min  c^T x + θ
s.t. Q(x) ≤ θ
     x ∈ (K_1 ∩ K_2)                          (4)

On the other hand, an optimality cut is a linear approximation of Q(x) on its domain of finiteness, and is determined from the dual of the second stage problem. As such, each optimality cut provides a lower bound (linear support) on Q(x). The dual of the second stage problem defined by Equation (2) is given by:

Max  π^T (h − T x)
s.t. π^T W ≤ q                                (5)

where the dual variables π also correspond to the Lagrange multipliers of the problem given by Equation (2). Next we show how these two types of cuts are derived [2].

2.1 Optimality Cut

Assume that we are solving a SLPwR problem with either the L-Shaped or the SD method, and that we are performing the ν-th iteration of the algorithm, which considers the k-th realization of the random variables. Using the duality theorem, the value of the objective function of the second stage problem is given by:

Q(x^ν, u_k) = (π^ν_k)^T (h_k − T_k x^ν)       (6)

and therefore, because of convexity:

Q(x, u_k) ≥ (π^ν_k)^T (h_k − T_k x)           (7)


If we now consider a discrete distribution for u (L-Shaped method) and assume that the probability of the k-th realization of u is p_k, then the expected value of the objective function given by Equation (6) is:

Q(x^ν) = E_u[Q(x^ν, u_k)] = E_u[(π^ν_k)^T (h_k − T_k x^ν)]
       = Σ_{k=1}^{K} p_k (π^ν_k)^T (h_k − T_k x^ν)        (8)

Hence, also because of convexity:

Q(x) ≥ Σ_{k=1}^{K} p_k (π^ν_k)^T (h_k − T_k x)            (9)

Finally, by defining

g = Σ_{k=1}^{K} p_k (π^ν_k)^T h_k,   G = Σ_{k=1}^{K} p_k (π^ν_k)^T T_k

and recalling that θ ≥ Q(x), then by substituting g and G in Equation (9) we obtain the optimality cut:

G x + θ ≥ g                                   (10)
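The assembly of the cut from the second stage duals can be sketched directly from the definitions of g and G above. The two-scenario data below are hypothetical, chosen only to exercise the formulas:

```python
def optimality_cut(probs, duals, hs, Ts):
    # g = sum_k p_k pi_k^T h_k ,  G = sum_k p_k pi_k^T T_k ,
    # giving the cut  G x + theta >= g  (Equation (10), scalar x).
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    g = sum(p * dot(pi, h) for p, pi, h in zip(probs, duals, hs))
    G = sum(p * dot(pi, T) for p, pi, T in zip(probs, duals, Ts))
    return g, G

# Hypothetical two-scenario data (not from the article):
probs = [0.5, 0.5]
duals = [[2.0, 1.0], [1.0, 0.0]]        # pi_k from the second stage duals
hs    = [[0.0, 1.0], [-1.0, 0.0]]       # h_k
Ts    = [[-0.5, -0.25], [-0.5, -0.25]]  # T_k
g, G = optimality_cut(probs, duals, hs, Ts)
theta_lower = lambda x: g - G * x       # the cut rewritten as theta >= g - G x
print(g, G, theta_lower(1.0))
```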

2.2 Feasibility Cut

As explained above, the role of a feasibility cut is to ensure that a first stage decision is second stage feasible. A first stage decision x^ν is second stage feasible if a finite vector v exists such that the constraints of the second stage problem are satisfied. To test the feasibility of these constraints, one can solve the problem:

z = Min  e^T (v^+ + v^−)
s.t. W v + v^+ − v^− = h − T x^ν              (11)
     v ≥ 0,  v^+ ≥ 0,  v^− ≥ 0

or its equivalent dual:

z = Max  σ^T (h − T x^ν)
s.t. σ^T W ≤ 0                                (12)
     |σ| ≤ e

where e is the vector of ones. Note that the problem given by Equation (11) comes from modifying the second stage problem by adding the nonnegative variables v^+ and v^−. Clearly, z ≥ 0. If z = 0, the second stage constraints are satisfied and, therefore, x^ν is second stage feasible. However, if z > 0, then the original second stage problem is infeasible (and its dual unbounded): z = (σ^ν)^T (h − T x^ν) is greater than zero (recall the duality theorem). Therefore, in order to ensure the feasibility of the second stage problem, the constraint

(σ^ν)^T (h − T x) ≤ 0                         (13)

must be added. If for some k-th realization of the random variable the second stage problem is infeasible then, following Equation (13), we define

D = (σ^ν)^T T_k,   d = (σ^ν)^T h_k

in order to obtain the feasibility cut:

D x ≥ d                                       (14)
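For intuition, in the simplest scalar case W = [1] the phase-one problem of Equation (11) has a closed-form value: the second stage needs v = h − T x with v ≥ 0, so z is the amount by which h − T x falls below zero. A minimal sketch (the scalar data are hypothetical):

```python
def phase_one_z(h, T, x):
    # z = min e^T (v+ + v-)  s.t.  v + v+ - v- = h - T x, all vars >= 0.
    # With scalar W = [1], the optimum is the shortfall below zero:
    rhs = h - T * x
    return max(0.0, -rhs)

print(phase_one_z(1.0, 1.0, 0.5))  # rhs = 0.5 >= 0: feasible, z = 0.0
print(phase_one_z(1.0, 1.0, 2.0))  # rhs = -1: infeasible, z = 1.0
```

A positive z signals that a feasibility cut must be generated for this choice of x.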

3. SLPwR Algorithms

In this section the algorithmic steps of the L-Shapedmethod and the SD method are given.

3.1 L-Shaped Algorithm

The steps of the L-Shaped method are the following([2]):

• Step 0 Set r = s = ν = 0

• Step 1 Set ν = ν + 1 and solve the so-called Current Problem (CP):

Min  c^T x + θ
s.t. A x = b
     D_l x ≥ d_l,        l = 1, …, r
     G_l x + θ ≥ g_l,    l = 1, …, s

Let x^ν and θ^ν be the optimal solution. If no optimality cuts exist (s = 0), set θ^ν = −∞ and do not consider θ in CP.

• Step 2 For k = 1, …, K (K is the number of discrete realizations) solve the problem:

z = Min  e^T (v^+_k + v^−_k)
s.t. W v_k + v^+_k − v^−_k = h_k − T_k x^ν
     v_k ≥ 0,  v^+_k ≥ 0,  v^−_k ≥ 0

If for any k the optimal value is z > 0, add a feasibility cut, set r = r + 1, and return to Step 1. Otherwise go to Step 3. The feasibility cut is given by

D_{r+1} x ≥ d_{r+1}

where D_{r+1} = (σ^ν_k)^T T_k, d_{r+1} = (σ^ν_k)^T h_k, and σ^ν_k are the Lagrange multipliers of the above problem.

• Step 3 For k = 1 · · · K solve the problem:

Min  q^T_k v_k
s.t. W v_k = h_k − T_k x^ν
     v_k ≥ 0

(or its dual) to calculate the Lagrange multipliers π^ν_k, and define:

g_{s+1} = Σ_{k=1}^{K} p_k (π^ν_k)^T h_k       (15)

G_{s+1} = Σ_{k=1}^{K} p_k (π^ν_k)^T T_k       (16)

u^ν = g_{s+1} − G_{s+1} x^ν

If θ^ν ≥ u^ν, stop; x^ν is the optimal solution. Otherwise, add the optimality cut

θ ≥ g_{s+1} − G_{s+1} x

Set s = s+ 1, and return to Step 1.

3.2 SD Algorithm

In the SD algorithm it is necessary to sample, at each iteration, from a continuous probability distribution. For that reason, the estimate of the lower bound of Q(x) is based on expectation. In the L-Shaped algorithm the optimality cut is calculated in terms of Equations (15) and (16); in the SD algorithm, on the other hand, it is calculated in terms of:

g^ν = (1/ν) Σ_{k=1}^{ν} (π^ν_k)^T h_k,   G^ν = (1/ν) Σ_{k=1}^{ν} (π^ν_k)^T T_k

where ν is the number of samples (which is equal to the number of iterations). The algorithm presented here corresponds to that provided by [5]. Higle and Sen [4, 5] assume complete recourse (no feasibility cuts are needed) and provide a simplification involving a restricted solution set for the dual of the second stage problem in order to decrease the computational effort. The steps of the algorithm are given next.

• Step 0 Set ν = 0, V_0 = ∅, θ^ν = −∞; x^1 is assumed given.

• Step 1 Set ν = ν + 1 and sample to generate an observation u^ν independent of any previous observation.

• Step 2 Determine the coefficients of a piecewise linear approximation to Q(x):

a) Solve the program

Max  π^T (h_ν − T_ν x^ν)
s.t. π^T W ≤ q

to find the solution vector π^ν_ν, and set V_ν = V_{ν−1} ∪ {π^ν_ν}.

b) Get the coefficients of the optimality cut

g^ν_ν = (1/ν) Σ_{k=1}^{ν} (π^ν_k)^T h_k,   G^ν_ν = (1/ν) Σ_{k=1}^{ν} (π^ν_k)^T T_k

where π^ν_k is the solution to the problem (for all k ≠ ν):

Max  π^T (h_k − T_k x^ν)
s.t. π ∈ V_ν

Observe that the solution vector to these problems can only be one of the vectors already included in the set of solutions, V_ν (the solution set is restricted to decrease effort).

c) Update the coefficients of previous cuts

g^ν_k = ((ν − 1)/ν) g^{ν−1}_k,   G^ν_k = ((ν − 1)/ν) G^{ν−1}_k,   k = 1, …, ν − 1

• Step 3 Solve

Min  c^T x + θ_ν
s.t. A x = b
     θ_ν + G^ν_k x ≥ g^ν_k,   k = 1, …, ν

to obtain x^{ν+1}. Go to Step 1.

• The algorithm stops if the change in the objective function is small or if no new dual vectors are added to the set of dual solutions, V.
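The rescaling in step (c) is exactly what keeps older cuts consistent with the growing sample average: after ν iterations, cut k retains the fraction k/ν of its original coefficient. A toy sketch, with scalars standing in for the (π^ν_k)^T h_k terms:

```python
def sd_running_cuts(values):
    # values[k] stands in for (pi_k)^T h_k (scalars for clarity).
    # Each iteration rescales old cut coefficients by (nu-1)/nu and
    # appends the new cut's coefficient, the mean of the first nu terms.
    g = []
    for nu in range(1, len(values) + 1):
        g = [(nu - 1) / nu * gk for gk in g]        # step (c)
        g.append(sum(values[:nu]) / nu)             # step (b)
    return g

print(sd_running_cuts([1.0, 2.0, 3.0]))
# cut k keeps k/nu of its original value: approximately [1/3, 1.0, 2.0]
```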

4. Illustrative Examples

This section presents in detail the beginning of the solution procedure for two illustrative examples, in order to show the application of both algorithms.

4.1 L-Shaped Method

The example consists of solving the problem:

Min  −0.75 x + E_u[Q(x, u)]
s.t. x ≤ 5                                    (17)
     x ≥ 0

where

Q(x, u) = Min  −v_1 + 3 v_2 + v_3 + v_4
          s.t. −v_1 + v_2 − v_3 + v_4 = u + (1/2) x
               −v_1 + v_2 + v_3 − v_4 = 1 + u + (1/4) x
                v_1, v_2, v_3, v_4 ≥ 0        (18)

and u is defined by a discrete uniform distribution with K = 11 realizations between 0 and −1: {0, −0.1, −0.2, −0.3, −0.4, −0.5, −0.6, −0.7, −0.8, −0.9, −1.0}. Each realization has a probability p_k = 1/11. The dual of the second stage problem, Equation (18), is given by:

Q(x, u) = Max  π_1 (u + (1/2) x) + π_2 (1 + u + (1/4) x)
          s.t. −π_1 − π_2 ≤ −1
                π_1 + π_2 ≤ 3
               −π_1 + π_2 ≤ 1
                π_1 − π_2 ≤ 1                 (19)

Solution

• Step 0 r = s = ν = 0

• Step 1 ν = 1, θ1 = −∞. Solve:

Min  −0.75 x
s.t. x ≤ 5
     x ≥ 0

The solution is x^1 = 5.

• Step 2 For k = 1 · · · K solve:

z = Min  v^+_k1 + v^+_k2 + v^−_k1 + v^−_k2
s.t. −v_k1 + v_k2 − v_k3 + v_k4 + v^+_k1 − v^−_k1 = u_k + (1/2) x^1
     −v_k1 + v_k2 + v_k3 − v_k4 + v^+_k2 − v^−_k2 = 1 + u_k + (1/4) x^1
      v_k1, v_k2, v_k3, v_k4 ≥ 0
      v^+_k1, v^+_k2, v^−_k1, v^−_k2 ≥ 0      (20)

Recall that x^1 = 5. For instance, for k = 1, u_k = 0 (the first discrete value), problem (20) becomes:

z = Min  v^+_11 + v^+_12 + v^−_11 + v^−_12
s.t. −v_11 + v_12 − v_13 + v_14 + v^+_11 − v^−_11 = 2.5
     −v_11 + v_12 + v_13 − v_14 + v^+_12 − v^−_12 = 2.25
      v_11, v_12, v_13, v_14 ≥ 0
      v^+_11, v^+_12, v^−_11, v^−_12 ≥ 0      (21)

The solution to (21) is z = 0. The results of all the problems (k = 1, …, 11) are summarized in Table 1; variables not shown are equal to zero. Observe that no feasibility cuts are needed in this iteration (z = 0 in all cases). In fact, the example used here is a problem with complete recourse.

• Step 3 For k = 1 · · · K solve:

z = Min  −v_k1 + 3 v_k2 + v_k3 + v_k4
s.t. −v_k1 + v_k2 − v_k3 + v_k4 = u_k + (1/2) x^1
     −v_k1 + v_k2 + v_k3 − v_k4 = 1 + u_k + (1/4) x^1
      v_k1, v_k2, v_k3, v_k4 ≥ 0              (22)

Recall that x^1 = 5. For instance, for k = 1, u_k = 0, Equation (22) becomes:

z = Min  −v_11 + 3 v_12 + v_13 + v_14
s.t. −v_11 + v_12 − v_13 + v_14 = 2.5
     −v_11 + v_12 + v_13 − v_14 = 2.25
      v_11, v_12, v_13, v_14 ≥ 0              (23)

The solution to Equation (23) is z = 3.5, v_12 = 2.375, v_14 = 0.125, π^1_11 = 2, π^1_12 = 1. The results of all the problems (k = 1, …, 11) are summarized in Table 2.

Table 1: Determining feasibility of the second stage.
k    u_k    z_k   v_k2    v_k4
1     0.0   0     2.375   0.125
2    −0.1   0     2.275   0.125
3    −0.2   0     2.175   0.125
4    −0.3   0     2.075   0.125
5    −0.4   0     1.975   0.125
6    −0.5   0     1.875   0.125
7    −0.6   0     1.775   0.125
8    −0.7   0     1.675   0.125
9    −0.8   0     1.575   0.125
10   −0.9   0     1.475   0.125
11   −1.0   0     1.375   0.125

Recall that p_k = 1/11. Also note that, for instance,

(π^1_1)^T h_1 = π^1_11 (u_1) + π^1_12 (1 + u_1)
(π^1_1)^T T_1 = π^1_11 (−1/2) + π^1_12 (−1/4)

so that g_1 = −0.5 and G_1 = −1.25. Therefore, the optimality cut is θ ≥ −0.5 + 1.25 x. Now we return to Step 1 and begin another iteration.

• Step 1 ν = 2. Solve:

Min  −0.75 x + θ
s.t. x ≤ 5
     θ ≥ −0.5 + 1.25 x                        (24)
     x ≥ 0

The solution is x^2 = 0 and θ^2 = −0.5. The procedure continues until the optimal solution is reached; the solution to the problem is x = 0.533 and θ = 1.142.
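The first iteration above can be reproduced numerically. The sketch below takes the dual solution π_k = (2, 1), reported for every realization, assembles g_1 and G_1 from their definitions, and solves the one-dimensional master problem by comparing the endpoints of [0, 5] (sufficient here because the objective is piecewise linear in x). Note that with G_1 = Σ_k p_k (π^1_k)^T T_k = −1.25, the cut θ ≥ g_1 − G_1 x is exactly θ ≥ −0.5 + 1.25 x:

```python
K = 11
us = [-0.1 * k for k in range(K)]     # realizations 0, -0.1, ..., -1.0
p = 1.0 / K
pi = (2.0, 1.0)                       # dual solution, the same for every k
T = (-0.5, -0.25)                     # so h_k - T x = (u_k + x/2, 1 + u_k + x/4)

g1 = sum(p * (pi[0] * u + pi[1] * (1 + u)) for u in us)
G1 = K * p * (pi[0] * T[0] + pi[1] * T[1])

# Master problem: min -0.75 x + theta  s.t.  theta >= g1 - G1 x, 0 <= x <= 5.
def objective(x):
    return -0.75 * x + (g1 - G1 * x)

x2 = min([0.0, 5.0], key=objective)   # piecewise linear: check the endpoints
theta2 = g1 - G1 * x2
print(round(g1, 6), round(G1, 6), x2, round(theta2, 6))
```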

4.2 Stochastic Decomposition

The example consists of solving again the problem defined by Equations (17) and (18). However, u is now defined by a uniform continuous distribution on (−1, 0), so that sampling is necessary. Here, a single-dimension uniform sampling is used.

Table 2: Obtaining an optimality cut.
k    u_k    z_k   v_k2    v_k4    π^1_k1   π^1_k2
1     0.0   3.5   2.375   0.125   2        1
2    −0.1   3.2   2.275   0.125   2        1
3    −0.2   2.9   2.175   0.125   2        1
4    −0.3   2.6   2.075   0.125   2        1
5    −0.4   2.3   1.975   0.125   2        1
6    −0.5   2.0   1.875   0.125   2        1
7    −0.6   1.7   1.775   0.125   2        1
8    −0.7   1.4   1.675   0.125   2        1
9    −0.8   1.1   1.575   0.125   2        1
10   −0.9   0.8   1.475   0.125   2        1
11   −1.0   0.5   1.375   0.125   2        1

Solution. Iteration 1

• Step 0 ν = 0, θ^0 = −∞, V_0 = ∅, x^1 = 0

• Step 1 ν = 1. From the sampling, u1 = −0.3.

• Step 2

a) Solve the problem

Max  π^1_11 (u_1 + (1/2) x^1) + π^1_12 (1 + u_1 + (1/4) x^1)
   = −0.3 π^1_11 + 0.7 π^1_12
s.t. −π^1_11 − π^1_12 ≤ −1
      π^1_11 + π^1_12 ≤ 3
     −π^1_11 + π^1_12 ≤ 1
      π^1_11 − π^1_12 ≤ 1

The solution is π^1_1 = [π^1_11  π^1_12]^T = [1  2]^T. Set V_1 = V_0 ∪ {π^1_1} = {(1, 2)}.

b) Coefficients of the cut for the first iteration

g^1_1 = (1/1)[(π^1_1)^T h_1] = π^1_11 (u_1) + π^1_12 (1 + u_1)
      = (1)(−0.3) + (2)(0.7) = 1.1
G^1_1 = (1/1)[(π^1_1)^T T_1] = π^1_11 (−1/2) + π^1_12 (−1/4)
      = (1)(−1/2) + (2)(−1/4) = −1

The resulting cut is then g^1_1 − G^1_1 x = 1.1 + x.

c) No updating of cuts is necessary in the firstiteration.


• Step 3 Solve:

Min  −0.75 x + θ_1
s.t. x ≤ 5
     θ_1 ≥ 1.1 + x
     x ≥ 0

The solution is x^2 = 0 and θ_1 = 1.1.

Iteration 2

• Step 1 ν = 2. From sampling, u_2 = −0.9. Recall that x^2 = 0 and u_1 = −0.3.

• Step 2

a) Solve the problem:

Max  π^2_21 (u_2 + (1/2) x^2) + π^2_22 (1 + u_2 + (1/4) x^2)
   = −0.9 π^2_21 + 0.1 π^2_22
s.t. −π^2_21 − π^2_22 ≤ −1
      π^2_21 + π^2_22 ≤ 3
     −π^2_21 + π^2_22 ≤ 1
      π^2_21 − π^2_22 ≤ 1

The solution is π^2_2 = [π^2_21  π^2_22]^T = [0  1]^T. Set V_2 = V_1 ∪ {π^2_2} = {(1, 2), (0, 1)}.

b) Get the coefficients of the cut for the second iteration. It is necessary to calculate π^2_1 by using the first sample u_1. Solve:

Max  π^2_11 (u_1 + (1/2) x^2) + π^2_12 (1 + u_1 + (1/4) x^2)
   = −0.3 π^2_11 + 0.7 π^2_12
s.t. π^2_1 ∈ V_2 = {(1, 2), (0, 1)}

The solution is π^2_1 = [π^2_11  π^2_12]^T = [1  2]^T.

Then, the coefficients of the optimality cut for the second iteration are

g^2_2 = (1/2){[1(−0.3) + 2(0.7)] + [0(−0.9) + 1(0.1)]} = 0.6
G^2_2 = (1/2){[1(−1/2) + 2(−1/4)] + [0(−1/2) + 1(−1/4)]} = −0.625

The resulting cut is then g^2_2 − G^2_2 x = 0.6 + 0.625 x.

c) Update previous cuts

g^2_1 = (1/2) g^1_1 = 0.55,   G^2_1 = (1/2) G^1_1 = −0.5

g^2_1 − G^2_1 x = 0.55 + 0.5 x

• Step 3 Solve the problem:

Min  −0.75 x + θ_2
s.t. x ≤ 5
     θ_2 ≥ 0.55 + 0.5 x
     θ_2 ≥ 0.6 + 0.625 x
     x ≥ 0

The solution is x^3 = 5 and θ_2 = 3.725. The procedure continues in the same way. The solution to the problem after 1000 samples is x = 0.4646 and θ = 1.077744.
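The two SD iterations above can be replicated in a few lines. The dual feasible region of (19) has only four vertices, so the exact dual solution is simply the best vertex; this hand enumeration is an illustration device for this tiny example, not part of the SD algorithm itself:

```python
T = (-0.5, -0.25)
dot = lambda a, b: a[0] * b[0] + a[1] * b[1]

us = [-0.3, -0.9]        # the two samples drawn in the example
xs = [0.0, 0.0]          # x^1 = 0 and x^2 = 0
# The four vertices of the dual feasible region of (19), listed by hand:
vertices = [(1.0, 2.0), (0.0, 1.0), (2.0, 1.0), (1.0, 0.0)]

V, cuts = [], []         # dual solutions seen so far; cuts as (g, G) pairs
for nu in range(1, len(us) + 1):
    x = xs[nu - 1]
    h = lambda u: (u + 0.5 * x, 1 + u + 0.25 * x)
    # Step (a): exact dual solution for the new sample = best vertex.
    V.append(max(vertices, key=lambda pi: dot(pi, h(us[nu - 1]))))
    # Step (c): rescale the previous cuts by (nu-1)/nu.
    cuts = [((nu - 1) / nu * g, (nu - 1) / nu * G) for g, G in cuts]
    # Step (b): for old samples, the dual is restricted to the set V.
    pis = [max(V, key=lambda pi: dot(pi, h(uk))) for uk in us[:nu]]
    g = sum(dot(pi, h(uk)) for pi, uk in zip(pis, us[:nu])) / nu
    G = sum(dot(pi, T) for pi in pis) / nu
    cuts.append((g, G))

print(cuts)  # two cuts: theta >= 0.55 + 0.5 x and theta >= 0.6 + 0.625 x
```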

5. Conclusion

This article presented the fundamentals and a description of the two main algorithms for solving SLPwR problems: the L-Shaped method and the Stochastic Decomposition method. Two examples were used to illustrate in detail the application of each algorithm.

REFERENCES

[1] Birge, J. R. and F. Louveaux, A multicut algorithm for two-stage stochastic linear programming, European Journal of Operational Research, 1988, 34: 384-392.

[2] Birge, J. R. and F. Louveaux, Introduction to stochastic programming, Springer-Verlag, New York, 1997.

[3] Bouza, C., Stochastic programming: the state of the art, Revista Investigacion Operacional, 1993, 14(2).

[4] Higle, J. L. and S. Sen, Stochastic decomposition: an algorithm for two-stage linear programs with recourse, Mathematics of Operations Research, 1991, 16(3): 650-669.

[5] Higle, J. L. and S. Sen, Stochastic Decomposition, Kluwer Academic Publishers, 1996.

[6] Haneveld, W. K. K. and M. H. van der Vlerk, Stochastic integer programming: general models and algorithms, Annals of Operations Research, 1999, 85, 39-57.

Combinatorial Optimization under Uncertainty

Ki-Joo Kim

CUSTOM, Carnegie Mellon University, Pittsburgh, PA 15213 ([email protected])1

1. Introduction

Stochastic integer programming (SIP) refers to the branch of integer programming problems where there are uncertainties involved in the data or the model.

The main difficulty of stochastic programming stems from evaluating the uncertain functions and their expectations. A common method to propagate the uncertainties is to use a sampling method, and a generalized stochastic framework for solving optimization under uncertainty problems involves two recursive loops: (1) the inner sampling loop and (2) the outer optimization loop (Figure 1). Since propagating uncertainties and evaluating the uncertain functions are computationally very intensive, this article presents a computationally efficient SIP algorithm based on a new sampling method, called Hammersley sequence sampling (HSS), used in the inner sampling loop and in the interaction between the inner sampling loop and the outer optimization loop.

Figure 1: Pictorial representation of stochastic programming framework. [Diagram: the Optimizer passes decision variables to the Model and the Stochastic Modeler; the Stochastic Modeler propagates the uncertain variables and returns the probabilistic objective function and constraints, yielding the optimal design.]

2. Stochastic Annealing - Theory

The stochastic annealing (STA) algorithm [1, 2, 7] is a variant of simulated annealing [6], designed to efficiently optimize stochastic integer programming problems. In the stochastic annealing algorithm, the optimizer obtains not only the decision variables but also the number of samples required for the stochastic model. Furthermore, the stochastic annealing algorithm reduces CPU time by balancing the trade-off between computational efficiency and solution accuracy through a penalty function introduced in the objective. This is necessary since, at high temperatures, the algorithm is mainly exploring the solution space and does not require precise estimates of any probabilistic function; the algorithm must then select a greater number of samples as the solution nears the optimum.

1 Research supported by the National Science Foundation GOALI Project (CTS-9729074).

The annealing temperature schedule (more precisely, the cooling schedule) is used to decide the weight b(t) on the penalty term for imprecision in the probabilistic objective function. The choice of penalty term also depends on the error bandwidth ε of the function being optimized, and must incorporate the effect of the number of samples. The new objective function in stochastic annealing therefore consists of a probabilistic objective value P and the penalty function b(t)ε:

min Z = P(x; u) + b(t) ε                      (1)

The weighting function b(t) can be expressed in terms of the temperature level t as b(t) = b_0 / k^t, where b_0 and k are constants. The error bandwidth of the MCS samples (ε_MCS) is estimated from the central limit theorem, while the error bandwidth of the HSS samples (ε_HSS) is determined from a fractal dimension analysis [2, 3].

3. Efficiency Improvements in the HSTA Algorithm

As SA is a probabilistic method, several random probability functions are involved in the algorithm. The random probability A_ij is used for acceptance determination in the Metropolis criterion, while the random generation probabilities G_ij are used to generate subsequent configurational moves. It is known that SA is little affected by the use of different acceptance probability distributions [8]. However, G_ij, the probability of generating configuration j from configuration i, can significantly affect the overall efficiency of the annealing process.

Page 16: SIAG/OPT Views-and-News Optimization under Uncertainty

16 SIAG/OPT Views-and-News

Figure 2: G_ij from MCS and HSS. [Scatter plot: G_ij for random selection (horizontal axis) versus G_ij for random bump (vertical axis), showing G_ij by HSS, G_ij by MCS, and the targets for random selection and random bump.]

The G_ij of conventional SA algorithms rely on pseudo-random number generators (as in MCS), which results in clustered moves over the configuration surface. Therefore, a larger number of moves or generations is required to cover the whole configuration surface evenly, and this results in a longer Markov chain length at each temperature level. The HSS technique, a quasi-random number generator, can generate uniform samples over the k-dimensional hypercube. In this work, we have used the HSS technique for the generation probabilities G_ij to develop a new SA algorithm called efficient simulated annealing (ESA).
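The HSS design points are built from radical inverses in prime bases. The sketch below is the standard textbook Hammersley construction, not necessarily the exact implementation used in this work:

```python
def radical_inverse(i, base):
    # Reflect the base-b digits of i about the radix point,
    # e.g. i = 6 = 110 in base 2 -> 0.011 in base 2 = 0.375.
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * f
        i //= base
        f /= base
    return inv

def hammersley(n, dim, primes=(2, 3, 5, 7, 11, 13)):
    # n points in [0,1)^dim: first coordinate i/n, remaining
    # coordinates radical inverses in successive prime bases.
    return [[i / n] + [radical_inverse(i, p) for p in primes[:dim - 1]]
            for i in range(n)]

print([p[1] for p in hammersley(8, 2)])
# base-2 coordinate: [0.0, 0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
```

The stratified spread of these points is the uniformity property that reduces the number of moves needed per temperature level.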

Figure 2 shows the G_ij probabilities of HSS and MCS for the test function f(y) = Σ_{i=1}^{10} y_i^2. Because there are 10 elements in the discrete decision vector y, the ideal probability of selecting any element is 0.1, and the value of a selected element can be randomly bumped up or down with a probability of 0.5. The circle and cross symbols in the figure are the G_ij values from HSS and MCS, respectively. Since HSS generates more uniform samples in the multivariate space, the G_ij from HSS are closer to the ideal probabilities than those from MCS. Thus HSS requires fewer moves to approximate the ideal probabilities.

Figure 3 shows trajectories of the objective value for the test function with different Markov chain lengths. ESA found the global solution with a Markov chain length of 45 at each temperature, while the traditional SA required a Markov chain length of 75 to reach the same solution. Thus ESA provides a significant reduction in moves at each temperature, and is approximately 30-54% more efficient than the conventional SA [4].

Figure 3: Objective trajectories of ESA and SA. [Plot: objective value versus Markov chain length for SA and ESA.]

The same idea of using the HSS technique for the generation probability G_ij can be applied to STA. The efficient stochastic annealing (ESTA) algorithm integrates HSS for the generation probability, but still uses the central limit theorem to evaluate the sampling errors. A new variant of stochastic annealing, HSTA (Hammersley stochastic annealing), therefore incorporates (i) HSS for the generation probability G_ij, (ii) HSS in the inner sampling loop for N_samp determination, and (iii) the HSS-specific error bandwidth (ε_HSS) in the penalty term.

Table 1: Efficiency improvements.
Stochastic algorithm       Total moves
SA + fixed N_samp          274,200
ESA + fixed N_samp         170,000
STA                          5,670
ESTA                         3,265
HSTA                         1,793

Test function: z = Σ_{i=1}^{10} (u_i x_i − i/10)^2 + Σ_{i=1}^{10} (u_i y_i^2) − Π_{i=1}^{10} cos(4π u_i y_i)

Table 1 shows the total number of configurational moves for different stochastic optimization methods. The first two algorithms are (conventional) stochastic optimization algorithms with a fixed N_samp, while the last three are stochastic annealing algorithms with a varying N_samp. There is a significant improvement when the STA algorithms are used instead of the conventional stochastic optimization methods. In addition, HSTA is 68% more efficient than the basic STA algorithm.


4. Numerical Example

This section describes step-by-step simulations of the HSTA algorithm. The test function is

min  Σ_{i=1}^{2} u_i y_i^2
     −20 ≤ y ≤ 20                             (2)
     u ∼ N(0.5, 0.16)

The initial configuration of y is (20, 20), and the uncertain variable u follows a normal distribution with a mean of 0.5 and a standard deviation of 0.16.2 The simulation conditions for HSTA are summarized in Table 2.

Table 2: HSTA conditions.
Initial temperature           1
Temperature decrement (α)     0.85
Markov chain length (I)       20
Initial N_samp                30
b_0                           0.005
k                             0.940

Step 1 is to generate 20 (i.e., the Markov chain length) sets of six random numbers using HSS. Two random numbers are used for the G_ij, three random numbers (H_k) for determining N_samp, and one (A_ij) for the Metropolis criterion.

Step 2 is to generate the next configuration. If G_ij,1 (random selection) is less than 0.5, then y_1 is selected for the random bump. If G_ij,2 (random bump) is less than 0.5, the value of the selected y_i is decreased; otherwise it is increased. If the bumped value falls outside the bounds, the bump is applied in the opposite direction. In this example, since the G_ij are 0.2031 and 0.6914, y_1 is increased to 21; this is outside the bounds, so y_1 becomes 19.

Step 3 is to determine N_samp, trading off efficiency against solution accuracy. Three random numbers (H_k) generated by the HSS technique are used: if H_1 is less than 0.5, then N_samp + 5 × H_2 becomes the new N_samp; otherwise N_samp − 5 × H_3 does. The new N_samp is 28.

Step 4 is to generate N_samp samples of the uncertain variable u using HSS. Then the probabilistic objective, constraints, expected value, and penalty function are evaluated. The expected value of the objective is 219.36, and the penalty term is 1.3E-5. The penalty term is very small compared to the expected value because of the HSS-specific ε.

Step 5 is to determine whether the current configuration is accepted, based on the Metropolis criterion. Since ΔE[z] = E[z]_new − E[z]_old is negative, the current configuration (19, 20) is accepted.

Step 6 is to repeat Steps 2 to 5 while the iteration count is smaller than the Markov chain length. Table 3 shows the full 20 iterations at the first temperature level (TL).

Step 7 is to check the stopping criteria and decrease the temperature. If any stopping criterion is satisfied, the simulation terminates with a successful result. Otherwise the new temperature becomes T = αT, and the simulation goes back to Step 2. Table 4 shows the simulation results with respect to the temperature level (TL). The optimum solution is found at the 10th temperature level. The table also shows the STA simulation results for comparison: since MCS is used for the G_ij and the error bandwidth, STA requires a greater number of temperature levels (15 in this table).

Table 3: Configurations at the first TL.
I    y1   y2   G_ij,1   G_ij,2   N_samp   Penalty   E[z]
0    20   20                                        800.00
1    19   20   0.2031   0.6914   28       1.3E-5    219.36
2    19   19   0.8281   0.3580   27       1.4E-5    210.86
3    18   19   0.3281   0.0247   31       1.1E-5    196.78
4    18   20   0.5781   0.9753   35       0.9E-5    208.18
5    18   19   0.0781   0.6420   30       1.2E-5    208.06
6    18   18   0.8906   0.3086   26       1.5E-5    189.02
7    19   18   0.3906   0.8642   21       2.2E-5    202.63
…
19   15   16   0.3672   0.7160   27       1.4E-5    140.30
20   14   15   0.6172   0.3827   24       1.7E-5    121.28

2 The 0.001 and 0.999 quantiles of this distribution are 0 and 1.
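Steps 2 and 3 can be sketched in a few lines. Two caveats: the out-of-bounds rule (apply the bump in the opposite direction) is inferred from the example's move 21 → 19, and the H values below are hypothetical, chosen to reproduce N_samp = 28:

```python
def bump(y, g1, g2, lo=-20, hi=20):
    # Step 2: g1 picks the element, g2 picks the bump direction;
    # a bump that leaves the bounds is applied the other way instead
    # (rule inferred from the example's move 21 -> 19).
    i = 0 if g1 < 0.5 else 1
    step = -1 if g2 < 0.5 else 1
    new = y[i] + step
    if not lo <= new <= hi:
        new = y[i] - step
    out = list(y)
    out[i] = new
    return tuple(out)

def new_nsamp(nsamp, h1, h2, h3):
    # Step 3: grow or shrink the sample size by at most 5.
    return nsamp + 5 * h2 if h1 < 0.5 else nsamp - 5 * h3

print(bump((20, 20), 0.2031, 0.6914))  # (19, 20), as in the article
print(new_nsamp(30, 0.7, 0.0, 0.4))    # 28.0 (hypothetical H values)
```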

5. Conclusion

This article presented hierarchical improvements in the SA-based algorithms for solving large-scale combinatorial optimization problems under uncertainty, and also presented step-by-step example calculations of the HSTA algorithm. The HSTA algorithm incorporates the uniformity and fast convergence properties of the HSS technique, together with the error bandwidth based on the HSS technique; as a result, the HSTA algorithm is found to be 68% more efficient than the conventional STA algorithm. The HSTA algorithm can be a useful tool for large-scale combinatorial stochastic programming problems, and a real world case study of computer-aided molecular design under uncertainty can be found in the author's paper [5].

Table 4: HSTA and STA simulation results.
          HSTA                      STA with MCS
TL   E[z]     b(t)ε   y1   y2      E[z]     b(t)ε   y1   y2
1    121.28   2E-5    14   15      140.82   0.126   16   15
2    53.41    4E-5     9   10       80.92   0.104   13   10
3    12.04    4E-5     4    5       38.40   0.074   10    5
4    0.00     4E-5     0    0       16.99   0.040    7    2
5    0.59     3E-5     1   −1        5.46   0.015    4    1
6    0.59     3E-5     1   −1        0.32   0.001    1    0
7    0.31     3E-5     1    0        0.32   0.001   −1    0
8    0.31     3E-5     1    0        0.32   0.001   −1    0
9    0.31     4E-5     1    0        0.61   0.001   −1   −1
10   0.00     5E-5     0    0        0.61   0.001   −1   −1
11                                   0.61   0.002   −1   −1
12                                   0.61   0.002   −1   −1
13                                   0.61   0.002   −1   −1
14                                   0.61   0.002   −1   −1
15                                   0.00   0.000    0    0

REFERENCES

[1] Chaudhuri, P. and U.M. Diwekar, Process synthesis under uncertainty: A penalty function approach, AIChE Journal, 1996, 42(3): 742-752.

[2] Chaudhuri, P. and U.M. Diwekar, Synthesis approach to the determination of optimal waste blends under uncertainty, AIChE Journal, 1999, 45(8): 1671-1687.

[3] Diwekar, U.M., An efficient approach to optimization under uncertainty, Computational Optimization and Applications (submitted), 2001.

[4] Kim, K.-J. and U.M. Diwekar, Efficient combinatorial optimization under uncertainty - Part I. Algorithmic development (accepted), Ind. & Eng. Chem. Res., 2001.

[5] Kim, K.-J. and U.M. Diwekar, Efficient combinatorial optimization under uncertainty - Part II. Application to stochastic solvent selection (accepted), Ind. & Eng. Chem. Res., 2001.

[6] Kirkpatrick, S., C.D. Gellat, Jr., and M.P. Vecchi, Optimization by simulated annealing, Science, 1983, 220: 671-680.

[7] Painton, L.A. and U.M. Diwekar, Stochastic annealing for synthesis under uncertainty, European Journal of Operational Research, 1995, 83: 489-502.

[8] van Laarhoven, P.J.M. and E.H.L. Aarts, Simulated Annealing: Theory and Applications, Reidel Publishing Company, 1987.

Stochastic Nonlinear Optimization: The BONUS Algorithm

Kemal H. Sahin
CUSTOM, Carnegie Mellon University, Pittsburgh, PA 15213 ([email protected]).1

1. BONUS Background

This article describes a new algorithm for the optimization of nonlinear, uncertain systems developed by our research center [1]. General techniques for these types of optimization problems determine a statistical representation of the objective, such as the maximum expected value or the minimum variance. Several decomposition algorithms have been developed for linear problems; nonlinear formulations, however, are solved by evaluating the model for a series of samples. Once embedded in an optimization framework, an iterative loop structure emerges: decision variables are determined, a sample set based on these decision variables is generated, the model is evaluated for each of these sample points, and the probabilistic objective function value and constraints are evaluated, as shown by the black arrows in the center of Figure 1. When one considers that nonlinear optimization techniques rely on objective function and constraint evaluations at each iteration, along with derivative estimation through perturbation analysis, the sheer number of model evaluations rises to render this approach ineffective for even moderately complex models.

The Better Optimization of Nonlinear Uncertain Systems (BONUS) algorithm, indicated in Figure 1 by the thick grey arrows, samples the solution space of the objective function at the beginning of the analysis by using a base distribution covering the entire feasible range. As decision variables change, the underlying distributions for the uncertain values change, and the proposed algorithm estimates the objective

1Research supported by Sandia National Laboratories.


Figure 1: Density estimation approach to optimization under uncertainty. [Diagram: a sampling loop links the Stochastic Modeler and the Model; a reweighting scheme relates the base case input probability distribution (f_base) and the base case output cumulative distribution (F_base) to a changed input probability distribution (f_test) and an estimated output cumulative distribution (F_test).]

function value based on the ratios of the probabilities for the current and the base distributions, which are approximated using kernel density estimation (KDE) techniques. Thus, BONUS avoids sample model runs in subsequent iterations.

The algorithm can be summarized in the following outline:

1. Generate base samples, calculate the respective KDEs, and evaluate the model for each base sample.

2. Optimization.

a. Generate a sample around the current decision variables.

b. Determine KDEs for each sample point with respect to the base distribution.

c. Estimate the objective function value through the reweighting scheme.

d. SQP and similar state-of-the-art nonlinear optimization approaches that rely on quasi-Newton methods perturb each decision variable to estimate gradient information. Use the same approach as in steps (a)-(c) to estimate the objective function value, using KDE and reweighting, at each perturbed point.

e. The optimizer generates a new vector of decision variables; repeat until convergence.

As seen above, the model is only evaluated for the base sample; the objective function value is then estimated using the base sample data together with the respective probabilities of the base sample points and of the samples generated during optimization.
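The outline above can be sketched in code. The following is a minimal, hypothetical sketch (not the authors' implementation; the function names and the use of independent one-dimensional Gaussian KDEs per variable are my assumptions, mirroring the worked example in Section 2):

```python
import numpy as np

def silverman_h(data):
    """Kernel width h = 1.06 * sigma * N^(-1/5), Eq. (7)."""
    return 1.06 * data.std(ddof=1) * len(data) ** (-1 / 5)

def kde_at(points, data, h):
    """Gaussian-kernel density estimate of `data`, evaluated at `points` (Eq. (6))."""
    u = (points[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2.0 * np.pi))

def bonus_estimate(base, z_base, current):
    """Estimate E[Z] for the `current` sample using only base-sample model
    outputs `z_base`; variables are treated independently, so the weights
    are products of per-variable density ratios (Eq. (8))."""
    w = np.ones(len(base))
    for i in range(base.shape[1]):
        f_base = kde_at(base[:, i], base[:, i], silverman_h(base[:, i]))
        f_new = kde_at(base[:, i], current[:, i], silverman_h(current[:, i]))
        w *= f_new / f_base       # reweighting ratio for variable i
    w /= w.sum()                  # normalize the weights
    return float(w @ z_base)      # reweighted expectation, Eq. (9)
```

If `current` coincides with the base sample, the density ratios are 1, the weights are uniform, and the estimate reduces to the plain sample mean of `z_base`; an optimizer would call `bonus_estimate` in place of new model runs at each iterate and at each perturbed point.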

2. Illustrative Numerical Example

To illustrate the concepts behind the BONUS algorithm, we present a numerical example to explain the approach:

2.1 Model

min E[Z] = E[(x̃1 − 7)² + (x̃2 − 4)²]          (1)

s.t.  x̃1 ∈ N[µ = x1*, σ = 0.033 · x1*]        (2)

      x̃2 ∈ U[0.9 · x2*, 1.2 · x2*]            (3)

      4 ≤ x1 ≤ 10                              (4)

      0 ≤ x2 ≤ 5                               (5)

Here, E represents the expected value, and the goal is to minimize the mean of the objective function calculated for two uncertain decision variables, x1 and x2. The optimizer determines the nominal value x1*; the realized x̃1 has an underlying normal distribution whose upper and lower 0.1% quantiles lie at ±10% of x1*. Similarly, x̃2 is uniformly distributed around x2*, with cutoff ranges at [−10%, +20%].
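For reference, the brute-force Monte Carlo treatment of Equations 1-5 (the approach BONUS is designed to avoid) can be sketched as follows; the function names are mine, and the 0.1% quantile condition is encoded directly as σ = 0.033 · x1*, as in Equation 2:

```python
import numpy as np

def sample_inputs(x1_star, x2_star, n, rng):
    """Draw the uncertain inputs of Eqs. (2)-(3)."""
    x1 = rng.normal(loc=x1_star, scale=0.033 * x1_star, size=n)   # Eq. (2)
    x2 = rng.uniform(0.9 * x2_star, 1.2 * x2_star, size=n)        # Eq. (3)
    return x1, x2

def expected_z(x1_star, x2_star, n=100_000, seed=0):
    """Brute-force Monte Carlo estimate of E[Z] in Eq. (1):
    every call costs n model evaluations."""
    x1, x2 = sample_inputs(x1_star, x2_star, n, np.random.default_rng(seed))
    return float(np.mean((x1 - 7) ** 2 + (x2 - 4) ** 2))
```

At the nominal point (x1*, x2*) = (7, 4), the analytic value is (0.033 · 7)² + Var[U(3.6, 4.8)] + (4.2 − 4)² ≈ 0.213, which the estimate reproduces; repeating this at every optimizer iterate and perturbed point is the expense BONUS sidesteps.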

Step 1: The first step in BONUS is determining the base distributions for the decision variables, followed by generating the output values for this model. In order to capture the entire possible range of the uncertain variables, these base distributions have to cover the entire range, including variations. For instance, for x2, the range extends to (0 · 0.9) ≤ x2 ≤ (5 · 1.2) to account for the uniformly distributed uncertainty. Due to space limitations, the illustrative presentation of the kernel density and reweighting approach is performed for a sample size of 10, while the remainder of the work uses N = 100 samples. A sample realization using Monte Carlo sampling is given in Table 1.

After this sample is generated, KDE for the base sample is applied to determine the probability of each sample point with respect to the sample set. This is performed for each decision variable separately by approximating each point through a Gaussian kernel and adding these kernels to generate the probability distribution for each point, as given in


Table 1: Base sample.

Sample No.    x1        x2        Z
1             5.6091    0.3573    15.2035
2             3.7217    1.9974    14.7576
3             6.2927    4.2713     0.5738
4             7.2671    3.3062     0.5527
5             4.1182    1.3274    15.4478
6             7.7831    1.5233     6.7472
7             6.9578    1.1575     8.0818
8             5.4475    3.6813     2.5119
9             8.8302    2.9210     4.5137
10            6.9428    3.7507     0.0654

Mean          6.2970    2.4293    —
Std. Dev.     1.5984    1.3271    —

Equation 6 [2].

f̂(xi(k)) = (1 / (N · h)) · Σ_{j=1}^{N} (1/√(2π)) · exp[ −(1/2) · ((xi(k) − xi(j)) / h)² ]       (6)

Here, h is the width of the Gaussian kernel and depends on the standard deviation σ and the sample size N of the data set as follows:

h = 1.06 · σ · N^(−1/5)                                (7)

For our example, h(x1) = 1.06 · 1.5984 · 10^(−0.2) = 1.0690 and h(x2) = 1.06 · 1.3271 · 10^(−0.2) = 0.8876. Using the first value, one can calculate f̂(x1(1)) = (1 / (10 · 1.0690)) · Σ_{j=1}^{10} (1/√(2π)) · exp[ −(1/2) · ((5.6091 − x1(j)) / 1.0690)² ] = 0.1769. This step is repeated for every point, resulting in the kernel density estimates provided in Table 2.
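The kernel-width and density computations above can be checked with a few lines of Python (a sketch; the data are the x1 column of Table 1):

```python
import numpy as np

# x1 column of the base sample (Table 1)
x1 = np.array([5.6091, 3.7217, 6.2927, 7.2671, 4.1182,
               7.7831, 6.9578, 5.4475, 8.8302, 6.9428])

# Eq. (7): h = 1.06 * 1.5984 * 10^(-0.2)
h = 1.06 * x1.std(ddof=1) * len(x1) ** (-0.2)

# Eq. (6) evaluated at the first sample point, x1(1) = 5.6091
u = (x1[0] - x1) / h
f_hat = np.exp(-0.5 * u ** 2).sum() / (len(x1) * h * np.sqrt(2.0 * np.pi))

print(round(h, 4), round(f_hat, 4))   # → 1.069 0.1769
```

Repeating the same computation at each of the ten points reproduces the f̂(x1) column of Table 2.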

Table 2: Base sample kernel density estimates.

x1        f̂(x1)     x2        f̂(x2)
5.6091    0.1769    0.3573    0.1277
3.7217    0.0932    1.9974    0.2114
6.2927    0.2046    4.2713    0.1602
7.2671    0.2000    3.3062    0.2190
4.1182    0.1110    1.3274    0.2068
7.7831    0.1711    1.5233    0.2117
6.9578    0.2090    1.1575    0.1992
5.4475    0.1691    3.6813    0.2100
8.8302    0.0920    2.9210    0.2152
6.9428    0.2092    3.7507    0.2063

Step 2: All preceding steps were preparation for the optimization algorithm, in which repeated calculations of the objective function will be bypassed through the reweighting scheme.

Step 2a: For the first iteration, assume that the initial values of the decision variables are x1 = 5 and x2 = 5. For these values, another sample set is generated, as shown in Table 3, accounting for the uncertainties described in Equations 2 and 3.

Table 3: Sample - optimization iteration 1.

Sample No.    x̃1        x̃2
1             4.7790    5.7625
2             4.9029    5.5740
3             5.0347    5.9199
4             4.9686    5.8697
5             4.9001    5.9967
6             4.9819    5.1281
7             5.0316    5.4877
8             5.0403    5.4841
9             4.9447    5.7557
10            5.0344    4.7531

Mean          4.9618    5.5731
Std. Dev.     0.0836    0.3862

The expected value of Z, which is to be minimized during optimization, is calculated as 6.7695. This value will be estimated using a reweighting approach [3], given in Steps 2b and 2c.

Step 2b: Now, the KDE for the sample generated around the decision variables, f(xi), has to be calculated. The Gaussian kernel width is h(x̃1) = 1.06 · 0.0837 · 10^(−0.2) = 5.598 · 10^(−2). Using this value, one can calculate f(x1(1)) = (1 / (10 · 5.598 · 10^(−2))) · Σ_{j=1}^{10} (1/√(2π)) · exp[ −(1/2) · ((5.6091 − x̃1(j)) / (5.598 · 10^(−2)))² ] = 5.125 · 10^(−23). Again, this step is repeated for every point of the base sample with respect to the new sample data, resulting in the kernel density estimates provided in Table 4.

Step 2c: Using these and the base KDE values, weights are calculated for each sample point j as

ωj = [ f(x1(j)) / f̂(x1(j)) ] · [ f(x2(j)) / f̂(x2(j)) ],   j = 1, ..., N       (8)

In our illustrative example, the only two non-zero weights are ω5 = 1.699 · 10^(−68) and ω8 = 4.152 · 10^(−15). These weights are normalized and multiplied with the output of the base distribution to estimate the


Table 4: Optimization iteration 1 - KDE.

No    x1        f(x1)            x2        f(x2)
1     5.6091    5.125·10^(−23)   0.3573    0
2     3.7217    0                1.9974    2.989·10^(−26)
3     6.2927    0                4.2713    2.777·10^(−2)
4     7.2671    0                3.3062    2.376·10^(−8)
5     4.1182    3.918·10^(−31)   1.3274    9.958·10^(−40)
6     7.7831    0                1.5233    1.745·10^(−35)
7     6.9578    0                1.1575    1.303·10^(−43)
8     5.4475    5.218·10^(−12)   3.6813    2.826·10^(−5)
9     8.8302    0                2.9210    1.844·10^(−12)
10    6.9428    0                3.7507    8.311·10^(−5)

objective function value:

Eest[Z] = Σ_{j=1}^{N} ωj · Z(j)                          (9)

For our illustrative example, this reduces to

Eest[Z] = ω8 · Z(8) = 1.0000 · 2.5119 = 2.5119          (10)

as the normalization eliminates all but one weight. Note that this illustrative example was developed with an unrealistically small sample size; hence, the efficiency of the estimation technique cannot be judged from this example. Further, due to the inaccuracy of the estimate resulting from the small sample size, we will not present results for Steps 2d and 2e for just 10 samples, but use 100 samples. Also note that Steps 2d and 2e basically repeat the procedures in Steps 2a through 2c for a new sample set around a perturbed point, for instance x1 + ∆x1 = 5 + 0.001 · 5 = 5.005.
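Steps 2b and 2c can be verified numerically straight from the tables. A sketch (array literals transcribed from Tables 1, 2, and 4):

```python
import numpy as np

# Model outputs Z for the base sample (Table 1)
z = np.array([15.2035, 14.7576, 0.5738, 0.5527, 15.4478,
              6.7472, 8.0818, 2.5119, 4.5137, 0.0654])

# Base-sample density estimates fhat(x1), fhat(x2) (Table 2)
f1_base = np.array([0.1769, 0.0932, 0.2046, 0.2000, 0.1110,
                    0.1711, 0.2090, 0.1691, 0.0920, 0.2092])
f2_base = np.array([0.1277, 0.2114, 0.1602, 0.2190, 0.2068,
                    0.2117, 0.1992, 0.2100, 0.2152, 0.2063])

# Densities of the base points under the iteration-1 sample (Table 4)
f1_new = np.array([5.125e-23, 0, 0, 0, 3.918e-31, 0, 0, 5.218e-12, 0, 0])
f2_new = np.array([0, 2.989e-26, 2.777e-2, 2.376e-8, 9.958e-40,
                   1.745e-35, 1.303e-43, 2.826e-5, 1.844e-12, 8.311e-5])

w = (f1_new / f1_base) * (f2_new / f2_base)   # Eq. (8)
w /= w.sum()                                  # normalize the weights
est = w @ z                                   # Eq. (9)
print(round(float(est), 4))                   # → 2.5119
```

The normalized weights recover ω8 ≈ 1, so the estimate collapses to Z(8) = 2.5119, matching Equation 10.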

Table 5: Optimization progress at N = 100.

Iteration    x1       x2       Eest[Z]
0            5.000    5.000    5.958
1            9.610    2.353    9.238
2            7.065    3.814    0.258

The results obtained using the BONUS algorithm converge to the same optimal solution as a 'brute force' analysis, normally used in stochastic NLPs, in which the objective is computed at each iteration by calculating the objective function value for every generated sample point. In this example, BONUS used only 100 model runs, while the brute-force optimization evaluated the model 600 times over the two iterations.

The BONUS algorithm has been applied to extend studies in off-line quality control of continuously stirred tank reactors [4] to include optimization, and to capacity expansion studies for electric utilities. For further information, please contact the author at [email protected].

REFERENCES

[1] Sahin, K.H. and U.M. Diwekar, A New Algorithm for Stochastic Nonlinear Programming: A Reweighting Approach for Derivative Approximation, Session 290, Annual Meeting, American Institute of Chemical Engineers (AIChE), Reno, NV, 2001.

[2] Silverman, B.W., Density Estimation for Statistics and Data Analysis, Chapman & Hall (CRC Reprint), Boca Raton, USA, 1998.

[3] Hesterberg, T., Weighted Average Importance Sampling and Defensive Mixture Distributions, Technometrics, 1995, 37: 185.

[4] Kalagnanam, J.R. and U.M. Diwekar, An Efficient Sampling Technique for Off-line Quality Control, Technometrics, 1997, 39: 308.

Bulletin

Call for Papers for Special Issue of Annals of Operations Research on "Applied Optimization under Uncertainty"

Robust decision making under uncertainty is of fundamental importance in numerous disciplines and application areas. For many practical issues, decision making involves multiple, often conflicting, goals and poses a challenging and complex optimization problem. Recent innovations in the techniques underlying multi-objective optimization, in the characterization of uncertainties, and in the ability to develop and apply these methods outside of traditional application domains greatly enhance their utility and promise. The main focus of this special issue of Annals of Operations Research on "Applied Optimization Under Uncertainty" is to provide research


papers related to systematic algorithms, methods, and approaches for rapid and reliable multi-objective decision making under uncertainty, and to successfully apply these methods in diverse application areas. The wide scope of this area can be overwhelming and mandates extensive interactions between various disciplines, both in the development of the methods and in their fruitful application to real-world problems. The issue is dedicated to bringing together papers on the development of the theory of optimization under uncertainty methods and applications to practical problems. Papers on perspectives and overviews that pose new challenges for the development of the theory and methods are also encouraged.

Annals of Operations Research is one of the most renowned journals in the field. The Center for Uncertain Systems: Tools for Optimization and Management (CUSTOM), Carnegie Mellon University, is holding a mini-conference, sponsored by Sandia National Laboratories, on the topic of "Applied Optimization Under Uncertainty" in December 2001. This special issue and mini-conference provide an excellent opportunity to advance the knowledge of this evolving and widely applicable area.

AREAS OF INTEREST. For the special issue, we expect original papers of high quality providing theoretical and/or computational results for algorithms and applications of optimization under uncertainty and multi-objective optimization. We invite papers in the following general areas:

1. Algorithms, Methods, and Tools for Multi-objective, Stochastic Programming and Optimization

2. Challenges and Opportunities for Multi-objective, Stochastic Programming and Optimization

3. Materials and Molecular Modeling

4. Manufacturing, Planning, and Management

5. Energy and Environment

REVIEWING. The submitted papers will be peer-reviewed in the same manner as any other submission to a leading international journal. The major acceptance criterion for a submission is the quality and originality of the contribution.

SUBMISSION. The deadline for submission is May 1, 2002.

Manuscripts must be written in English and should be submitted electronically in a platform-independent format such as PostScript or PDF. Please send your submission to [email protected]. Please follow the instructions for authors for Annals of Operations Research.

Best regards,

Urmila Diwekar, CUSTOM, Carnegie Mellon University, Pittsburgh, PA 15213.
Reha Tutuncu, CUSTOM, Carnegie Mellon University, Pittsburgh, PA 15213.
Andrew Schaefer, CUSTOM, University of Pittsburgh, PA 15261.

Comments from the Chair and Editor

Seventh SIAM Conference on Optimization

Greetings to all members of SIAG/OPT and readers of the Newsletter.

After Sept. 11, many of us contacted each other by email just to say hello to friends. As one email message stated: "It is hard to imagine the hatred that could lead people to do this." and: "Give hugs to everyone you know today!" and: "The world has changed forever today." The events following Sept. 11 were also harrowing and stressful. The organizers (including me) of the upcoming conference on optimization (in May in Toronto) were expecting a poor turnout, with few people risking air travel. However, I am happy to announce that this is not the case. The program schedule for the Seventh SIAM Conference on Optimization is ready, and it is full; talks start at 8:15 AM and end at 7 PM (followed by receptions, poster sessions, and meetings for some of us). One problem was that there were not enough slots for all the contributed talks. (Thanks go to Ariela Sofer and Tom Coleman for all the hard work involved in making up the schedule.) We (the organizers) hope to see you all in Toronto this spring.

Two short courses (on Numerical Optimization: Algorithms and Software and on Automatic


Differentiation) immediately precede the conference, on Sunday, May 19. Sunday is also Don Goldfarb Day, with Technical Sessions from 5:15 PM to 7:00 PM and a Banquet at 7:30 PM. In addition, there will be a group of visitors (in optimization) before/during/after this SIAM Conference on Optimization. The group will be visiting the Fields Institute, Toronto, Ontario; see URL: www.fields.utoronto.ca/. This is part of the Thematic Year on Numerical and Computational Challenges in Science and Engineering (NCCSE), from August 2001 to July 2002. A schedule of visitors and talks is available at URL: http://orion.math.uwaterloo.ca/~hwolkowi/henry/reports/talks.d/t02talks.d/02optfields.d/group.html. Following the conference, there will also be two (Fields) workshops: May 23-25, 2002, Workshop on Validated Computing, Harbour Castle Hotel, Toronto, organizers: George Corliss, Ken Jackson, Baker Kearfott, Vladik Kreinovich, Weldon Lodwick; and May 27-31, 2002, Informal Working Group on Validated Methods for Optimization, organizers: George Corliss, Tibor Csendes, Ken Jackson, Baker Kearfott.

SIAM 50th Anniversary and 2002 Annual Meeting. The annual meeting this summer is also a celebration of SIAM's 50th birthday. This special annual meeting will look at the strides made by industrial and applied mathematics during the past 50 years. The meeting themes cover SIAM's interests, including optimization.

Miscellaneous. We need to thank Juan Meza again for his excellent job as editor of our Newsletter. This is the last issue edited by him. I am pleased to announce that Jos Sturm has accepted the SIAG/OPT committee's invitation to be the new editor.

The SIAG/OPT Web site, URL: www.siam.org/siags/siagopt.htm, points to our own SIAG/OPT web page, handled by Natalia Alexandrov, URL: mdob.larc.nasa.gov/staff/natalia/siagopt/. We will continue to make this web page useful and interesting.

We continue to encourage members with home pages to register their home pages with SIAM. If you wish to be listed, send a message to Laura Helfrich at [email protected] with your name and the URL for your Web page.

The e-mail forum continues. You can use this forum for technical questions, announcements of papers, conferences, books, and software. In particular, technical questions are encouraged. Perhaps this will lead to interesting discussions. You can use this forum by sending a message to [email protected].

Here is one question that I find interesting. The class of problems that we can solve today in optimization has grown tremendously, due both to software and hardware. In particular, the size of problems has grown and the speed has increased dramatically. The motivation behind many papers now is larger and faster. However, there seems to be much less emphasis put on the quality of the solution. (Though some work on robust optimization is being done.)

Question: Which algorithms can solve large classes of large-scale problems robustly? Which property is more important for an algorithm: speed or robustness?

Multi-Media

1. Optimization Online deserves being mentioned again, URL: www.optimization-online.org/. This is a repository for eprints.

2. Eric Weisstein's World of Mathematics is (back) at: mathworld.wolfram.com/. This is definitely worth a look - every day.

3. The Mathematics Genealogy Project is at: hcoonce.math.mankato.msus.edu/. The intent of this project is to compile information about all the mathematicians of the world. They solicit information from all schools who participate in the development of research-level mathematics and from all individuals who may know desired information. (Go and research your genealogy tree!)