bayesian optimal design for prediction of correlated processes

1
Bayesian optimal design for prediction of correlated processes Maria Adamou Sue Lewis, Sujit Sahu and Dave Woods University of Southampton [email protected] Introduction Data collected from correlated processes arise in many diverse application areas including studies in environmental and ecological science and in both real and computer experiments. Often the main aim of the study is to predict the process at unobserved points. In this work, we illustrate a new approach for the selection of Bayesian optimal designs using applications from both spatial statistics and computer experiments. AMotivating Example Networks of monitoring stations are regularly used to collect data to measure pollutant levels in water or air. Figure 1 shows such a network in the Eastern USA for monitoring chemical deposition. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * monitoring stations Figure 1: Network of monitoring stations in the Eastern USA. An important problem is to identify the best locations of stations across the region of interest (Diggle and Lophaven, 2006; Zimmerman, 2006). Usually, observations at geographically close stations are assumed to be highly correlated, and this correlation helps to predict deposition levels at unobserved locations. Statistical model We assume a Gaussian regression model Y(x)|θ, σ 2 , φ, τ 2 N (X θ, σ 2 C(φ)+ τ 2 I ) (1) where X is the model matrix, θ contains regression coecients, C is a correlation matrix from a stationary and isotropic correlation function determined by decay parameter φ, and σ 2 , τ 2 are the spatial and pure error (nugget) terms respectively. If a Bayesian approach is adopted, the model specification is completed by assignment of prior distributions to the unknown parameters. Bayesian optimal design The aim of our experiments is to enable accurate prediction of the response Y(x) at unobserved x. We adopt a decision theoretic approach (Chaloner and Verdinelli, 1995) to find Bayesian optimal designs that minimise the posterior predictive variance. To avoid the computational burden usually associated with Bayesian designs we have developed a new closed form approximation that allows quick calculation of the variance. The designs are found using the coordinate exchange algorithm (Meyer and Nachtsheim, 1995), and illustrated with the following examples. Example 1-Optimal designs for spatial data Here observations are assumed to follow model (1) with I covariance function ρ(d ij ; φ)= exp(-φd ij ), where d ij is the Euclidean distance between points x i , x j [-1, 1] 2 I prior distributions θ|σ 2 N (0, σ 2 I ) , σ 2 Inverse Gamma(3, 1) and φ Uniform(0.1, 1) I ν 2 = τ 2 2 is fixed at one of four values: 0, 0.5, 1, 2.5. Optimal designs with n = 10 points are shown in Figure 2. 0.2 0.4 0.6 0.8 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 ν 2 = 0 x 1 x 2 0.2 0.4 0.6 0.8 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 ν 2 = 0.5 x 1 x 2 0.2 0.4 0.6 0.8 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 ν 2 = 1 x 1 x 2 0.2 0.4 0.6 0.8 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 ν 2 = 2.5 x 1 x 2 Figure 2: Optimal designs and average correlation surfaces for known ratio ν 2 = τ 2 2 . The filled contours in Figure 2 give the correlation between each point of the design region and the centre of the design region, averaged across the prior for φ; the minimum values of these correlations are 0.5, 0.34, 0.25 and 0.13 for ν 2 = 0, 0.5, 1, 2.5 respectively. Clearly, the optimal designs change with ν 2 and the correlation structure. Low values of ν 2 , which result in high correlation, give rise to designs where points are spread across the design region. In contrast, high values of ν 2 (low correlation) produce designs with points at the boundaries and corners of the design region. For ν 2 = 2.5, the design is similar to a standard optimal design for a regression model with uncorrelated errors. Example 2-Computer experiments Model (1) is commonly used to describe data from a deterministic computer experiment. Our example uses data from a simple simulator of a helical compression spring (Tudose and Jucan, 2007; Forrester et al., 2008). The example has three variables and I ρ(x i , x j ; φ)= Q 3 k=1 exp(-φ k |x ik - x jk |) for x i , x j [-1, 1] 3 I priors for θ, σ 2 as in the first example Figure 3 shows Bayesian optimal designs with n = 10 runs for two dierent prior distributions prior 1 : ν 2 =0, φ 1 Unif(1, 3), φ 2 Unif(3, 5), φ 3 0 prior 2 : ν 2 =0.5, φ 1 Unif(1, 3), φ 2 Unif(1, 3), φ 3 = 0 These priors were obtained by analysing data from a maximin Latin hypercube design (LHD) (Morris and Mitchell, 1995), also shown in Figure 3. x 1 -1.0 -0.5 0.0 0.5 -1.0 0.0 0.5 -1.0 0.0 0.5 x 2 -1.0 -0.5 0.0 0.5 -1.0 -0.5 0.0 0.5 1.0 -1.0 0.0 1.0 x 3 x 1 -1.0 -0.5 0.0 0.5 -1.0 0.0 0.5 -1.0 0.0 0.5 x 2 -1.0 -0.5 0.0 0.5 -1.0 -0.5 0.0 0.5 1.0 -1.0 0.0 1.0 x 3 Figure 3: Bayesian optimal design ( ) and Latin hypercube designs ( ) for prior 1 (left) and prior 2 (right). For prior 1, the Bayesian optimal design has similar space-filling properties as the LHD (average intra-point distance of 1.43 vs 1.40) but has 30% smaller average posterior predictive variance. For prior 2, where there is little change in correlation with x 3 , the design points in the x 3 dimension collapse onto the extremes (determined by the linear trend). This second design has posterior predictive variance 18% lower than that of the LHD. Conclusions We have investigated Bayesian optimal design for collecting correlated data when the aim of the experiment is accurate prediction. The designs we have studied are influenced by the degree of correlation, with higher correlation leading to designs which are close to space-filling, and provide lower prediction variance than LHDs. References Chaloner, K. and Verdinelli, I. (1995) Bayesian experimental design: a review. Statistical Science, 10, 273–304. Diggle, P. and Lophaven, S. (2006) Bayesian geostatistical design. Scandinavian Journal of Statistics, 33, 53–64. Forrester, A., Sobester, A. and Keane, A. (2008) Engineering Design via Surrogate Modelling. Chichester: Willey. Meyer, R. and Nachtsheim, C. (1995) The coordinate-exchange algorithm for constructing exact optimal experimental designs. Technometrics, 37, 60–69. Morris, D. and Mitchell, J. (1995) Exploratory designs for computational experiments. Journal of Statistical Planning and Inference, 43, 381–402. Tudose, L. and Jucan, D. (2007) Pareto approach in multi-objective optimal design of helical compression springs. Annals of the Oradea University, Fascicle of Management and Technological Engineering, 6(16), 991–998. Zimmerman, L. (2006) Optimal network design for spatial prediction, covariance parameter estimation and empirical prediction. Envirometrics, 17, 635–652. Design and Analysis of Experiments 2012 University of Georgia, Athens, October 2012

Upload: others

Post on 04-Nov-2021

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Bayesian optimal design for prediction of correlated processes

Bayesian optimal design for prediction of correlated processesMaria Adamou

Sue Lewis, Sujit Sahu and Dave WoodsUniversity of [email protected]

IntroductionData collected from correlated processes arise in many diverseapplication areas including studies in environmental and ecologicalscience and in both real and computer experiments. Often the main aimof the study is to predict the process at unobserved points.

In this work, we illustrate a new approach for the selection of Bayesianoptimal designs using applications from both spatial statistics andcomputer experiments.

A Motivating ExampleNetworks of monitoring stations are regularly used to collect data tomeasure pollutant levels in water or air. Figure 1 shows such a networkin the Eastern USA for monitoring chemical deposition.

*

**

*

**

*

*

*

**

*

*

*

**

*

*

**

* *

*

*

* *

*

**

* ***

*

**

** *

***

***

**

*

*

*

*

*

*

**

*

*** *

* *

**

*

*

* *

*

*

***

****

*

*

*

**

**

*

* *

*

*

**

***

*

*****

*

**

****

**

** *

*

*

*

***

***

**

** *

* monitoring stations

Figure 1: Network of monitoring stations in the Eastern USA.

An important problem is to identify the best locations of stations acrossthe region of interest (Diggle and Lophaven, 2006; Zimmerman, 2006).Usually, observations at geographically close stations are assumed to behighly correlated, and this correlation helps to predict deposition levelsat unobserved locations.

Statistical modelWe assume a Gaussian regression model

Y(x)|θ,σ2,φ, τ2 ∼ N(Xθ,σ2C(φ) + τ2I) (1)

where X is the model matrix, θ contains regression coefficients, C is acorrelation matrix from a stationary and isotropic correlation functiondetermined by decay parameter φ, and σ2, τ2 are the spatial and pureerror (nugget) terms respectively.

If a Bayesian approach is adopted, the model specification is completedby assignment of prior distributions to the unknown parameters.

Bayesian optimal designThe aim of our experiments is to enable accurate prediction of theresponse Y(x) at unobserved x. We adopt a decision theoretic approach(Chaloner and Verdinelli, 1995) to find Bayesian optimal designs thatminimise the posterior predictive variance. To avoid thecomputational burden usually associated with Bayesian designs wehave developed a new closed form approximation that allows quickcalculation of the variance.The designs are found using the coordinate exchange algorithm (Meyerand Nachtsheim, 1995), and illustrated with the following examples.

Example 1 - Optimal designs for spatial dataHere observations are assumed to follow model (1) withI covariance function ρ(dij;φ) = exp(−φdij), where dij is the Euclidean

distance between points xi, xj ∈ [−1, 1]2

Iprior distributions θ|σ2 ∼ N(0,σ2I) , σ2 ∼ Inverse Gamma(3, 1) andφ ∼ Uniform(0.1, 1)Iν2 = τ2/σ2 is fixed at one of four values: 0, 0.5, 1, 2.5.Optimal designs with n = 10 points are shown in Figure 2.

0.2

0.4

0.6

0.8

1.0

-1.0 -0.5 0.0 0.5 1.0

-1.0

-0.5

0.0

0.5

1.0

ν2 = 0

x1

x 2

0.2

0.4

0.6

0.8

1.0

-1.0 -0.5 0.0 0.5 1.0

-1.0

-0.5

0.0

0.5

1.0

ν2 = 0.5

x1

x 2

0.2

0.4

0.6

0.8

1.0

-1.0 -0.5 0.0 0.5 1.0

-1.0

-0.5

0.0

0.5

1.0

ν2 = 1

x1

x 2

0.2

0.4

0.6

0.8

1.0

-1.0 -0.5 0.0 0.5 1.0

-1.0

-0.5

0.0

0.5

1.0

ν2 = 2.5

x1

x 2

Figure 2: Optimal designs and average correlation surfaces for knownratio ν2 = τ2/σ2.

The filled contours in Figure 2 give the correlation between each pointof the design region and the centre of the design region, averagedacross the prior for φ; the minimum values of these correlations are 0.5,0.34, 0.25 and 0.13 for ν2 = 0, 0.5, 1, 2.5 respectively.Clearly, the optimal designs change with ν2 and the correlationstructure. Low values of ν2, which result in high correlation, give riseto designs where points are spread across the design region. Incontrast, high values of ν2 (low correlation) produce designs withpoints at the boundaries and corners of the design region. For ν2 = 2.5,the design is similar to a standard optimal design for a regressionmodel with uncorrelated errors.

Example 2 - Computer experimentsModel (1) is commonly used to describe data from a deterministiccomputer experiment. Our example uses data from a simple simulator ofa helical compression spring (Tudose and Jucan, 2007; Forrester et al.,2008). The example has three variables andIρ(xi, xj;φ) =

∏3k=1 exp(−φk|xik − xjk|) for xi, xj ∈ [−1, 1]3

Ipriors for θ, σ2 as in the first exampleFigure 3 shows Bayesian optimal designs with n = 10 runs for twodifferent prior distributionsprior 1 : ν2=0, φ1 ∼ Unif(1, 3), φ2 ∼ Unif(3, 5), φ3 ' 0prior 2 : ν2=0.5, φ1 ∼ Unif(1, 3), φ2 ∼ Unif(1, 3), φ3 = 0These priors were obtained by analysing data from a maximin Latinhypercube design (LHD) (Morris and Mitchell, 1995), also shown inFigure 3.

x1

-1.0 -0.5 0.0 0.5

-1.0

0.0

0.5

-1.0

0.00.5

x2

-1.0 -0.5 0.0 0.5 -1.0 -0.5 0.0 0.5 1.0

-1.0

0.0

1.0

x3

x1

-1.0 -0.5 0.0 0.5

-1.0

0.00.5

-1.0

0.00.5

x2

-1.0 -0.5 0.0 0.5 -1.0 -0.5 0.0 0.5 1.0

-1.0

0.0

1.0

x3

Figure 3: Bayesian optimal design ( ) and Latin hypercube designs ( )for prior 1 (left) and prior 2 (right).

For prior 1, the Bayesian optimal design has similar space-fillingproperties as the LHD (average intra-point distance of 1.43 vs 1.40) buthas 30% smaller average posterior predictive variance. For prior 2, wherethere is little change in correlation with x3, the design points in the x3dimension collapse onto the extremes (determined by the linear trend).This second design has posterior predictive variance 18% lower than thatof the LHD.

ConclusionsWe have investigated Bayesian optimal design for collecting correlateddata when the aim of the experiment is accurate prediction. The designswe have studied are influenced by the degree of correlation, with highercorrelation leading to designs which are close to space-filling, andprovide lower prediction variance than LHDs.

ReferencesChaloner, K. and Verdinelli, I. (1995) Bayesian experimental design: a review. StatisticalScience, 10, 273–304.

Diggle, P. and Lophaven, S. (2006) Bayesian geostatistical design. Scandinavian Journal ofStatistics, 33, 53–64.

Forrester, A., Sobester, A. and Keane, A. (2008) Engineering Design via Surrogate Modelling.Chichester: Willey.

Meyer, R. and Nachtsheim, C. (1995) The coordinate-exchange algorithm for constructingexact optimal experimental designs. Technometrics, 37, 60–69.

Morris, D. and Mitchell, J. (1995) Exploratory designs for computational experiments.Journal of Statistical Planning and Inference, 43, 381–402.

Tudose, L. and Jucan, D. (2007) Pareto approach in multi-objective optimal design ofhelical compression springs. Annals of the Oradea University, Fascicle of Management andTechnological Engineering, 6(16), 991–998.

Zimmerman, L. (2006) Optimal network design for spatial prediction, covarianceparameter estimation and empirical prediction. Envirometrics, 17, 635–652.

Design and Analysis of Experiments 2012 University of Georgia, Athens, October 2012