

https://doi.org/10.1007/s10596-019-09849-0

ORIGINAL PAPER

Seismic data assimilation with an imperfect model

Miguel Alfonzo 1,2 · Dean S. Oliver 2

Correspondence: Miguel Alfonzo, [email protected]

1 University of Bergen, Bergen, Norway
2 Norwegian Research Centre (NORCE), Bergen, Norway

Received: 12 July 2018 / Accepted: 17 June 2019 / Published online: 10 July 2019
© The Author(s) 2019

Computational Geosciences (2020) 24:889–905

Abstract
Data assimilation methods often assume perfect models and uncorrelated observation error. The assumption of a perfect model is probably always wrong for applications to real systems, and since model error is known to generally induce correlated effective observation errors, the common assumption of uncorrelated observation errors is probably almost always wrong, too. The standard approach to dealing with correlated observation errors, which simply ignores the correlation, leads to suboptimal assimilation of observations. In this paper, we examine the consequences of model errors on assimilation of seismic data. We show how to recognize the existence of correlated error through model diagnostics modified for large numbers of data, how to estimate the correlation in the error, and how to use a model with correlated errors in a perturbed observation form of an iterative ensemble smoother to improve the quantification of a posteriori uncertainty. The methodologies for each of these aspects have been developed to allow application to problems with very large numbers of model parameters and large amounts of data with correlated observation error. We applied the methodologies to a small toy problem with a linear relationship between data and model parameters, and to synthetic seismic data from the Norne Field model. To provide a controlled investigation in the seismic example, we investigate an application of data assimilation with two sources of model error: errors in seismic resolution and errors in the petro-elastic model. Both types of model error result in effectively correlated observation errors, which must be accounted for in the data assimilation scheme. Although the data are synthetic, parameters of the seismic resolution and the observation noise are estimated from the actual inverted acoustic impedance data. Using a structured approach, we are able to assimilate approximately 115,000 observations with correlated total observation error efficiently without neglecting correlations. We show that the application of this methodology leads to less overfitting to the observations and results in an ensemble estimate with smaller spread than the initial ensemble of predictions, but that the final estimate of uncertainty is consistent with the truth.

Keywords Model calibration · History matching · Model error · Observation bias · Predictability · Randomized maximum likelihood · Data assimilation · Model improvement · Acoustic impedance · Seismic noise · Norne Field

1 Introduction

Data assimilation is the process of modifying uncertain parameters in a model in order to match predictions to data to within a specified tolerance. The term data assimilation is often applied in the weather prediction community to situations in which data are sequentially assimilated as they become available. In this paper, we take the more general approach in which all data might be assimilated simultaneously. In this case, data assimilation for reservoir models becomes synonymous with history matching [44, 45]. Data assimilation also generally implies a Bayesian approach to uncertainty estimation [20], which we take as a key component.

In order to use a Bayesian approach, it is necessary to characterize the prior uncertainty in model parameters, the accuracy of the observations, and limitations in the model so that one is able to weight the observations appropriately. If the magnitude of the data mismatch after model calibration is larger than expected based on measurement error, then one might conclude that the difference is at least partially due to deficiencies in the model: missing physical processes, missing parameters, or errors in the forward model.


Model error is an inherent problem in all real-world applications of data assimilation. In this paper, our focus is on a methodology for treatment of model error whose origin is unknown or of a type for which removal of the source of error is impractical. In the examples, the actual model error is of two types. The first is often termed 'error of representation' [28, 52]. This error is often a result of the neglect of small-scale processes and heterogeneity. To mimic representation error, we examine situations in which the data are obtained from a system with a finer resolution than the predictive model. The second source of error in the examples is an error in parameters of the forward simulator.

In applications of seismic history matching, model error may occur because of errors in the petro-elastic model (PEM), inappropriate filtering, missing parameters, or many other factors [58]. Furthermore, there are differences in resolution (frequency content) between seismic data and seismic predictions made from the PEM and reservoir simulation models. On the one hand, seismic data show a higher lateral resolution than that of simulation models. Therefore, for a better match with seismic observations, it is common practice to downscale simulation results (e.g., pressures and saturations) to the resolution of the finer reservoir model prior to estimation of impedances [50, 55]. On the other hand, different approaches have been proposed to address the issue of the limited vertical resolution of seismic data [16]. These include the use of stochastic seismic inversion workflows to improve the vertical resolution of seismic [24, 32, 36], or Bayesian downscaling of inverted impedance data [4]. Furthermore, due to discontinuities in the simulation model (gridding, faults, etc.), seismic predictions may contain high-frequency features that are outside of the frequency spectrum of the actual seismic data [2]. In this paper, we perform the comparison between seismic observations and predictions at the seismic scale, so no downscaling of simulation results or seismic data is needed. Instead, seismic predictions are filtered (smoothed) within the seismic frequency bandwidth [50].

In history matching of production data and in numerical weather prediction, model error is often either ignored or treated in an ad hoc manner by simply inflating the observation error covariance beyond the actual measurement error level [7, 11]. While this approach can reduce the tendency to underestimate uncertainty, it also tends to exclude small-scale information in the data [49, 57]. The effect of model error is more difficult to ignore when assimilating seismic data, partially because the amount of data is often much larger than when only assimilating production data, and partially because the information provided by the seismic appears to be complementary to the information provided by the production data [56]. When the amount of data is large, as it is for seismic, even small amounts of model error can have a large impact on the results of data assimilation [13, 39]. We note that errors in production data tend to be correlated in time [22], while errors in seismic data tend to be spatially correlated [48]. Since all data in history matching to production and seismic data are assimilated simultaneously, the type of correlation (spatial or temporal) is irrelevant to the process.

There are at least three unfortunate consequences of ignoring model error in data assimilation: (1) parameters of the model are adjusted to incorrect and often implausible values, (2) the uncertainty in predictions tends to be underestimated, and (3) the forecasts are biased [5, 15, 64]. Although the importance of recognizing model error and its effect on correlation of observation errors is widely recognized [54, 60, 61], methods for treating model error and the subsequent autocorrelation in residual data errors are uncommon in practice. Accounting correctly for the effects of autocorrelation in the observation error model has, however, been shown to significantly improve calibrated parameter values and uncertainty estimates [23].

There are many ways to identify the presence of significant model error, but one standard method for diagnosing a model is through comparison of the model predictions with actual observations [31, 63]. If the model parameters that we generate do not in fact reproduce the data within the expected tolerance, then we must conclude that either the data assimilation has been done poorly, or that the assumptions that were made about the magnitude of the measurement error were wrong, or that the forward model might be incapable of reproducing important processes. Unfortunately, most methods for evaluating the expected value of a model diagnostic based on the magnitude of the residuals require the inverse of the observation error covariance matrix [38], which is difficult when the errors are correlated and the number of data is large.

When a model has been judged to be deficient, the first course of action would be to attempt improvement by adding model parameters, reassessing the prior uncertainty, refining the model discretization, or by including physical processes that were previously neglected. Typically, however, there is a limit to the improvements that can be made to the model, and when that limit has been reached, the model may still be noticeably imperfect. The only available recourse at this point is to reduce the expectation of the quality of the match of the model to the data. Modification of the estimate of the observation error covariance matrix, C_D, can be used to modify the relative weighting of various types of data and prevent overly large changes in the model parameters in an attempt to make predictions from an imperfect model match data [29, 37].

We previously demonstrated the benefits of a workflow that includes model criticism and iterative model improvement on a small flow problem with model error due to the onset of turbulence and errors in measurement [43]. The methodology was shown to result in better quantification of uncertainty and in more robust forecasts. Because the test problem was small in those examples, the ability to apply the methodology to large reservoir problems was not obvious. In this paper, we present new methods for treating several aspects of the problem of data assimilation with imperfect models that are significantly more difficult when the number of model parameters and the number of data are large, as is the case when assimilating seismic data: recognition of the need for model improvement through evaluation of a model diagnostic for the posterior ensemble, the iterative improvement of an estimate of the total observation error covariance, the generation of samples of correlated observation errors, and a methodology for computation of the gain matrix when observation errors are correlated.

2 Methodology

In this section, we summarize key elements of a methodology for dealing with correlated observation errors in large data assimilation problems for which formation of the observation error covariance matrix may be problematic and for which sampling of noise from the observation noise distribution will be difficult. We also describe a methodology for checking the posterior ensemble by comparing the magnitude of data mismatch with the expected magnitude. Finally, we develop a method for updating the estimate of the total observation error that accounts for the presence of model error.

2.1 Assimilation of data with non-diagonal C_D

The most straightforward approach to the use of correlated observation error (non-diagonal observation error covariance matrix C_D) in ensemble-based data assimilation is to simply generate perturbed observations with correlated errors, then use a low-rank representation of the observation error covariance matrix as described by Evensen [20, sec. 14.3]. Instead of forming the non-diagonal matrix C_D, we generate an ensemble of independent realizations of correlated observation error E from the distribution N(e | 0, C_D), then use the sample covariance matrix E E^T / (N_e − 1) to represent the data error covariance matrix C_D, where N_e is the number of ensemble members, or samples. An equation for updating the ensemble of parameters Θ^pr for consistency with observations d_obs can be obtained easily in the linear case.

Θ^post = Θ^pr + ΔΘ^pr (ΔD)^T [ΔD (ΔD)^T + E E^T]^{-1} (d_obs − (D + E))
       ≈ Θ^pr + ΔΘ^pr (ΔD)^T U_0 W_0^{-T} (I + W_0^{-1} U_0^T E E^T U_0 W_0^{-T})^{-1} W_0^{-1} U_0^T (d_obs − (D + E))    (1)

where Θ^post is the ensemble of model parameters conditioned to observations, ΔΘ^pr is the matrix of deviations of Θ^pr from the mean of the ensemble, D is the matrix whose columns are realizations of predicted data, and ΔD is the matrix of deviations of D from the mean of the ensemble. We have used a truncated singular value decomposition (SVD)

ΔD ≈ U_0 W_0 V_0^T

in the second line of Eq. 1. Although the number of data (N_d) will be quite large in practical cases, it is only necessary to compute the SVD of a matrix of dimension N_d × N_e. The effort required for computation is O(N_d N_e^2) [59], which is only linear in the number of data.

Algorithm 1 summarizes the steps for data assimilation with correlated observation error. To make the assimilation feasible for large models and for large amounts of data, we apply localization of the Kalman gain matrix. Although there will be an efficiency gain from using local analysis [8], in Algorithm 1 we simply assume that updates of model parameters are computed row-by-row so that it is not necessary to form the entire Kalman gain matrix [17]. Also, in any practical application, one would replace Algorithm 1 with an iterative ensemble smoother (e.g., [6, 18]) and one would apply scaling to the data before computation of the truncated SVD. For clarity of exposition, none of these important steps are shown here.
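For concreteness, a minimal Python/NumPy sketch of the low-rank update in Eq. 1 (the core of Algorithm 1, shown below) is given here. It omits localization, data scaling, and iteration, and the function and variable names are illustrative assumptions rather than names taken from any existing implementation.

```python
import numpy as np

def assimilate_correlated_noise(theta_pr, D, E, d_obs, svd_trunc=0.99):
    """Perturbed-observation ensemble update with correlated observation error (Eq. 1).

    theta_pr : (Nm, Ne) prior ensemble of model parameters
    D        : (Nd, Ne) ensemble of predicted observations
    E        : (Nd, Ne) realizations of correlated observation noise
    d_obs    : (Nd,)    vector of observations
    """
    dTheta = theta_pr - theta_pr.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    dE = E - E.mean(axis=1, keepdims=True)
    innov = d_obs[:, None] - (D + E)                 # perturbed innovations

    # Truncated SVD of the data deviations; only an (Nd x Ne) matrix is decomposed
    U0, w0, _ = np.linalg.svd(dD, full_matrices=False)
    keep = np.cumsum(w0**2) / np.sum(w0**2) <= svd_trunc
    keep[0] = True
    U0, w0 = U0[:, keep], w0[keep]

    # Project the noise ensemble into the data subspace and take a second SVD
    X0 = (U0.T @ dE) / w0[:, None]                   # X0 = W0^{-1} U0^T dE
    U1, w1, _ = np.linalg.svd(X0, full_matrices=False)

    # Low-rank approximation of (dD dD^T + E E^T)^{-1} applied to the innovations
    X1 = (U0 / w0) @ U1                              # X1 = U0 W0^{-1} U1
    b = X1.T @ innov
    scaled = X1 @ (b / (1.0 + w1**2)[:, None])

    return theta_pr + dTheta @ (dD.T @ scaled)
```

In practice this update would be wrapped in an iterative scheme such as ES-MDA and the gain would be tapered row-by-row, as discussed above.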

Algorithm 1 Assimilation of observations with correlated observation error

1: function AssimilateData(Θ^pr, D, E, d_obs, ρ)
   input: Θ^pr is the initial ensemble of model parameters, D is a matrix of predicted observations, E is a matrix of observation noise realizations, d_obs is the vector of observations, ρ is the taper or localization matrix
2:   ΔE ← E − Ē
3:   ΔD_obs ← d_obs − (D + E)
4:   U_0 W_0 V_0^T ← ΔD                          ▷ Truncated singular value decomposition
5:   X_0 ← W_0^{-1} U_0^T ΔE
6:   U_1 W_1 V_1^T ← X_0                         ▷ Singular value decomposition
7:   X_1 ← U_0 W_0^{-1} U_1
8:   C ← X_1 (I + W_1^2)^{-1} X_1^T              ▷ C ≈ (H P H^T + C_D^(k))^{-1}
9:   Θ^post ← Θ^pr + ρ ∘ (ΔΘ ΔD^T C) ΔD_obs      ▷ Localized update to parameters
10:  return Θ^post
11: end function

2.2 Evaluation of a model diagnostic for criticism of the posterior ensemble

A very standard way of evaluating history-matched models is to compare predictions from the ensemble of calibrated models with observations. In order to determine if the magnitude of data residuals after calibration is consistent with model assumptions, we compute a particular model diagnostic

S_d^obs = (h(θ^post) − d_obs)^T C_D^{-1} (h(θ^post) − d_obs)    (2)

where h is the observation operator that defines the relationship between model parameters and data predictions, and θ^post is a sample of model parameters from the posterior distribution, after conditioning to data. Note that S_d^obs evaluates the mismatch between calibrated realizations and the actual observations. If the model is correctly specified and the minimization is performed well, then the expected magnitude of S_d^obs is N_d.

When C_D is diagonal or when the number of data is small, computation of the model diagnostic S_d^obs is trivial. When the number of data is large and the observation errors are correlated, however, computation is not so straightforward. One possible approach for observation errors with stationary covariance is to take advantage of the matrix structure to compute the inverse efficiently [40, 42]. In our methodology, however, the observation error covariance matrix itself is never computed. Instead, it is approximated from an ensemble of residuals. Our metric is then similar to the Mahalanobis distance, except for the fact that the sample covariance is not full rank, so it is not invertible.

In those cases where an estimate of the magnitude of S_d^obs is needed, but the representation of C_D is not full rank, there are several options for computing an approximation of C_D^{-1}. The most obvious is to simply use the pseudo-inverse of E E^T / (N_e − 1), but if this approach is taken, the expected value of S_d^obs is no longer N_d. Engel et al. [19] provide a recent survey of methods for estimating the precision matrix (the inverse of the covariance matrix) for the case in which the number of parameters is large and the number of samples is small. We have chosen to shrink the low-rank sample covariance estimate towards the diagonal matrix whose diagonal elements are all equal to the average of the diagonal elements of the sample covariance matrix [34, 53]. If the variance is thought to be nonstationary, then it might be appropriate to use a different target, e.g., Target D: "diagonal, unequal variance" [53].

Let the shrinkage-based estimate of the covariance matrix for vector x be

Σ = δ(νI) + (1 − δ) (1/(N_e − 1)) X X^T

where νI is the target covariance matrix, X is the N_d × N_e matrix whose columns are mean-removed random samples of the vector x, X X^T / (N_e − 1) is the sample covariance matrix, and δ is the shrinkage parameter. Since our choice of the target matrix is diagonal, the formula for the inverse can be simplified using the Sherman-Morrison-Woodbury formula:

Σ^{-1} = (1/(δν)) I − (1/(δ²ν²)) ((1 − δ)/(N_e − 1)) X ( I + (1/(δν)) ((1 − δ)/(N_e − 1)) X^T X )^{-1} X^T
       = (1/(δν)) I − (1/(δν)) X ( δν ((N_e − 1)/(1 − δ)) I + X^T X )^{-1} X^T.    (3)

Schafer and Strimmer [53] discuss a method for estimating the optimal shrinkage, but since it involves computation of the sample covariance, we instead use the simpler estimate of Leung and Chan [35],

δ* = 2 / (N_e + 2).

For this particular shrinkage target, an estimate of the Mahalanobis distance can be computed efficiently,

D_M^2(x) = (x − μ)^T Σ^{-1} (x − μ)
         = (1/(δν)) (x − μ)^T (x − μ) − (1/(δν)) ( L^{-1} X^T (x − μ) )^T ( L^{-1} X^T (x − μ) )    (4)

where we have performed a Cholesky factorization of ( δν ((N_e − 1)/(1 − δ)) I + X^T X ) = L L^T. Note that the matrix that requires factorization (or inversion) is generally quite small (N_e × N_e) even for very large data assimilation problems, so computation is very fast.
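A small Python/NumPy sketch of this computation, assuming the Leung and Chan shrinkage δ = 2/(N_e + 2) and the equal-average-variance target νI, might look as follows; the function and variable names are illustrative.

```python
import numpy as np

def shrinkage_mahalanobis(X, x, mu):
    """Approximate squared Mahalanobis distance of x using Eq. 4.

    X  : (Nd, Ne) mean-removed samples defining the covariance estimate
    x  : (Nd,)    vector to be tested
    mu : (Nd,)    mean of the distribution represented by X
    """
    Nd, Ne = X.shape
    delta = 2.0 / (Ne + 2.0)                      # shrinkage of Leung and Chan
    nu = np.sum(X * X) / (Nd * (Ne - 1))          # average sample variance (target nu*I)

    # Cholesky factor of the small (Ne x Ne) matrix appearing in Eq. 4
    A = delta * nu * (Ne - 1) / (1.0 - delta) * np.eye(Ne) + X.T @ X
    L = np.linalg.cholesky(A)

    z = x - mu
    b = np.linalg.solve(L, X.T @ z)               # b = L^{-1} X^T (x - mu)
    return (z @ z - b @ b) / (delta * nu)
```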

Although the use of shrinkage provides an improved estimate of both C_D and C_D^{-1}, the magnitude of D_M^2(x) obtained using Eq. 4 may still be far from the value that would be obtained using the true value of C_D^{-1}, so direct comparison of D_M^2(x) with the number of data is not possible. As a final step, we compute an empirical probability distribution for values of D_M^2(x) based on samples from the true distribution. Denoting the ensemble of residuals after data assimilation with perturbed observations by r (that is, the ith column of r is the mismatch r_i = h(θ_i^post) − d_obs) and the ensemble of observation perturbations by E, we apply the function ComputeDiagnostic(E, r) from Algorithm 2 to obtain two distributions of squared Mahalanobis distances. One distribution corresponds to realizations of data residuals (the columns of r) and the second distribution corresponds to independent samples of the observation error (derived from the columns of E). In Algorithm 2, p denotes the probability that the ensemble of observation perturbations E is drawn from the same distribution as the ensemble of data residuals r. Larger values of p therefore suggest that the distance between the two distributions is smaller, and hence that E is more likely to be drawn from the same distribution as r. Our metric is then based on the modified Z-score computed for the distance between the medians of the two distributions [30]. Other nonparametric tests, such as the two-sample Kolmogorov-Smirnov test, might be more appropriate. The steps are summarized in Algorithm 2.
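Only the final step of Algorithm 2 (shown below) is sketched here: converting the two samples of squared distances into an empirical probability through the modified Z-score. The function name and arguments are illustrative.

```python
from math import erf, sqrt
import numpy as np

def diagnostic_probability(d_true, d_test):
    """Probability that the 'test' distances come from the same distribution
    as the 'true' distances (Algorithm 2, lines 16-19).

    d_true : squared distances computed for samples of the observation error
    d_test : squared distances computed for the data residuals
    """
    mad = np.median(np.abs(d_true - np.median(d_true)))
    sigma = 1.4826 * mad                           # robust spread of the 'true' distances
    z = abs(np.median(d_test) - np.median(d_true)) / (sqrt(2.0) * sigma)
    return 1.0 - erf(z)
```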

Algorithm 2 Computation of an approximation to S_x = (1/N_e) Σ_{i=1}^{N_e} y_i^T C_xx^{-1} y_i for observations with correlated error

1: function ComputeMahalanobis(X, z, L)
2:   b ← L^{-1} X^T z
3:   D_M ← (1/(δν)) (z^T z − b^T b)
4:   return D_M                                  ▷ Approximation to the squared Mahalanobis distance
5: end function

6: function ComputeDiagnostic(X, Y)
   input: columns of X are mean-removed samples from the 'true' distribution; columns of Y are mean-removed samples from the 'test' distribution
7:   δ ← 2/(N_e + 2)                             ▷ Shrinkage from Leung and Chan [35]
8:   A ← δν ((N_e − 1)/(1 − δ)) I + X^T X        ▷ See Eq. 3
9:   L L^T ← A                                   ▷ Cholesky factorization of A
10:  for j ← 1, N_e do
11:    x_s ← random roll of the jth column of X
12:    y ← Y_j                                   ▷ jth column of Y
13:    D_j^x ← ComputeMahalanobis(X, x_s, L)
14:    D_j^y ← ComputeMahalanobis(X, y, L)
15:  end for
16:  σ_Dx ← 1.4826 MAD(D^x)                      ▷ Median absolute deviation
17:  μ_Dx ← med(D^x)
18:  μ_Dy ← med(D^y)
19:  p ← 1 − erf( |μ_Dy − μ_Dx| / (√2 σ_Dx) )    ▷ Empirical estimate of p(μ_Dy | X)
20:  return p
21: end function

2.3 Generation of perturbed observations

In the perturbed observation forms of the iterative ensemble smoothers with model improvement, it is necessary to generate ensembles of observation perturbations multiple times. When the observation errors are independent, the cost is small, but when the errors are correlated the cost of generating the perturbations for large data sets can be significant. In our applications, we use two different methods depending on the representation of the estimate of the observation error covariance matrix.

In most cases, the initial estimate of observation error covariance will be obtained using an assumption of stationarity, so that standard methods for generating unconditional realizations of Gaussian random fields can be employed. Because the size of the data may be large, we use the method of circulant embedding of the covariance matrix [14] to generate observation perturbations.


The key to the efficiency of the algorithm is that the covariance matrix is symmetric block-circulant with circulant blocks and is completely specified by its first block row [33].
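In one dimension, circulant embedding reduces to a short FFT recipe. The sketch below is a 1-D analogue only (the seismic application uses the block-circulant form on a 3-D grid), and the function name and the use of the exponential covariance from the 1D example in Section 3.1 are illustrative assumptions.

```python
import numpy as np

def sample_stationary_noise_1d(cov_fn, n, n_real, rng=None):
    """Draw realizations of a stationary 1-D Gaussian noise field by circulant
    embedding of the Toeplitz covariance matrix.

    cov_fn : function returning the covariance at integer lag h
    n      : number of grid points (observations)
    n_real : number of realizations to generate
    """
    rng = np.random.default_rng() if rng is None else rng
    lags = np.arange(n, dtype=float)
    c = cov_fn(lags)
    c_emb = np.concatenate([c, c[-2:0:-1]])       # first column of the circulant embedding
    m = c_emb.size

    lam = np.fft.fft(c_emb).real
    lam = np.clip(lam, 0.0, None)                 # guard against tiny negative eigenvalues

    out = np.empty((n, n_real))
    for k in range(0, n_real, 2):
        eps = rng.standard_normal(m) + 1j * rng.standard_normal(m)
        f = np.fft.fft(np.sqrt(lam) * eps) / np.sqrt(m)
        out[:, k] = f.real[:n]                    # real and imaginary parts give two
        if k + 1 < n_real:                        # independent realizations each
            out[:, k + 1] = f.imag[:n]
    return out

# Example: exponential covariance with variance 4 and range parameter 25
# E = sample_stationary_noise_1d(lambda h: 4.0 * np.exp(-3.0 * h / 25.0), 150, 60)
```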

An alternative method for generating samples of total observation error is required after updating the estimate of C_D. Following the methodology of Oliver and Alfonzo [43], we compute the updated estimate of the total covariance from the ensemble of data mismatch between model predictions and actual observations after data assimilation with perturbed observations. Thus, if

θ_i^post = argmin_θ [ (θ − θ_i^pr)^T C_θ^{-1} (θ − θ_i^pr) + (h(θ) + ε_i − d_obs)^T C_D^{-1} (h(θ) + ε_i − d_obs) ]    (5)

where ε_i is a perturbation to the predicted observations and ε_i ∼ N[0, C_D], and if the residual for updated ensemble member θ_i^post is defined to be (see Appendix)

r_i = h(θ_i^post) − d_obs,    (6)

then the maximum likelihood estimate of the total observation error covariance is

C_D = (1/N_e) r r^T    (7)

where r = (r_1, r_2, . . . , r_{N_e}). In all practical cases involving seismic data, N_e ≪ N_d, so the maximum likelihood estimate of C_D will be rank deficient. If, however, all we need are samples of observation error from C_D for the perturbed observation data assimilation, then it is not necessary to estimate a full-rank C_D to generate new realizations, as the r_i are themselves samples from C_D.

If only a single ensemble of realizations from N[ε | 0, C_D] is required, it would be possible to use the columns of r. In some methods, such as the perturbed observation form of multiple data assimilation (MDA), however, it is necessary to generate independent perturbations at each iteration [18]. To generate an additional ensemble of samples, we apply a transformation that generates a new ensemble of observation perturbations from the original ensemble by circularly shifting and recombining perturbations. The circular shift simply moves elements from one position on the grid to another. The random recombination recombines the realizations without affecting the covariance.

The method that uses shifting to generate new realizations is very fast and does not require estimation or fitting of covariance models. It is limited, however, to rectangular regular grids. Also, the shift introduces a discontinuity in the perturbation that may in some cases be significant. It is not noticeable in our applications because it is followed by the random recombination step from a QR factorization of a random matrix. The methodology is summarized in Algorithm 3.
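A compact Python/NumPy sketch of this shift-and-recombine step (Algorithm 3, shown below) could be written as follows, assuming the residuals are stored as one flattened column per ensemble member on a regular grid; names are illustrative.

```python
import numpy as np

def generate_perturbations(r, grid_shape, rng=None):
    """New correlated observation perturbations from an ensemble of residuals.

    r          : (Nd, Ne) matrix whose columns are data residuals
    grid_shape : regular grid dimensions, with prod(grid_shape) == Nd
    """
    rng = np.random.default_rng() if rng is None else rng
    Nd, Ne = r.shape

    # 1. Random circular shift of each residual field along every grid axis
    shifted = np.empty_like(r)
    for j in range(Ne):
        field = r[:, j].reshape(grid_shape)
        shifts = [rng.integers(0, s) for s in grid_shape]
        field = np.roll(field, shifts, axis=tuple(range(len(grid_shape))))
        shifted[:, j] = field.ravel()

    # 2. Random recombination by the orthogonal factor of a random matrix
    A = rng.standard_normal((Ne, Ne))
    Q, _ = np.linalg.qr(A)
    return shifted @ Q
```

Because Q is orthogonal, (rQ)(rQ)^T = r r^T, so the recombination mixes the realizations without changing the sample covariance of the perturbations.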

2.4 Model improvement workflow

The functionalities described in Algorithms 1–3 are parts of a structured data assimilation methodology that provides for both model criticism and model improvement. A fairly complete, but necessarily simplified, workflow is shown in Algorithm 4. In summary, this workflow comprises data assimilation with correlated observation errors (line 6, Algorithm 1), a model diagnostic for the posterior ensemble (line 8, Algorithm 2), the iterative estimation of the total observation error from the data residuals, and the generation of samples of correlated observation errors (lines 7 and 9, Algorithm 3). The model improvement cycles stop when the model diagnostic suggests that the model is adequate (the value of the probability p is larger than the threshold of 0.05, or any other defined threshold), or the maximum number of user-defined iterations (k_max) is reached.

In practice, each of the steps in the workflow is more complex than described in this paper. For example, it is common to perform a first check of the adequacy of the initial ensemble against the actual observations. One possible test is to evaluate the probability that the observations are drawn from the same distribution as the ensemble of model predictions (function CheckModel in Algorithm 4, not shown in this paper). If this test suggests that the ensemble of model predictions is inconsistent with the observations (low probability values, p_init), one should attempt to improve the model prior to data assimilation (lines 1 to 3 in Algorithm 4). Possible avenues for model improvement may include the addition of new model parameters, modification of the prior distribution of model parameters, or modification of the distribution of observation errors. Furthermore, for any practical subsurface data assimilation problem, it is necessary to apply an iterative data assimilation method to deal with nonlinearity instead of the ensemble smoother method used for illustration in Algorithm 1. In our seismic test case, we have applied ES-MDA [18]. In that case, after each cycle of data assimilation, the data perturbations must be randomized and rescaled by an inflation factor α. If, instead, an iterative ensemble smoother (e.g., [6]) is used for the data assimilation, then a scalar inflation would be applied at each iteration, but resampling the data noise would not be necessary. Also, in Step 9 of Algorithm 1, we show gain localization for the entire gain matrix. For large data assimilation problems, it is impractical to form the entire gain matrix.


Algorithm 3 Generation of observation noise samples from data residuals

1: function GeneratePerturbations(r)
2:   Apply a random shift to the columns of r (ensemble of residuals) along each axis after reshaping
3:   Create a random matrix A whose elements are independent standard normal
4:   Q R ← A                                     ▷ QR decomposition
5:   E ← rQ                                      ▷ New error perturbations from residuals
6:   return E
7: end function

Algorithm 4 Assimilation of observations with correlated observation error

1: Θ^pr, D, E^(0) ← CreateModel(N_e, h, θ^pr, C_θ, C_D^(0))
   input: N_e is the ensemble size, h is the forward operator, θ^pr is the prior mean for model parameters, C_θ is the prior model covariance matrix, C_D^(0) is the initial estimate of observation error covariance, and k_max is the maximum number of model improvement iterations (user-defined).
   output: Θ^pr is the ensemble of initial model realizations, D is the matrix of the ensemble of predicted observations, E^(0) is a matrix of observation noise realizations.
2: p_init ← CheckModel(d_obs, E^(0), D)
3: if p_init < 0.05 then return to Step 1
4: k ← 0; p^(0) ← 0.00001
5: while k < k_max and p^(k) < 0.05 do
6:   Θ^post ← AssimilateData(Θ^pr, D, E^(k), d_obs, ρ)         ▷ Algorithm 1
7:   r ← h(Θ^post) − d_obs
8:   p^(k+1) ← ComputeDiagnostic(E^(k), r)                     ▷ Algorithm 2
9:   E^(k+1) ← GeneratePerturbations(r)                        ▷ Algorithm 3
10:  k ← k + 1
11: end while

In the seismic test problem, the gain matrix is formed one row at a time (as in [17]), or the analysis can be done using local analysis, updating model parameters column-by-column [8]. Local analysis with observation tapering is used for the linear 1D example.
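To show how the pieces fit together, a schematic Python sketch of the outer loop of Algorithm 4 (lines 4–11) is given below. It assumes the helper functions sketched in the previous subsections, plus a compute_diagnostic function implementing Algorithm 2 and a forward operator h acting on the whole ensemble; it is an illustration of the control flow only, not the authors' implementation.

```python
def model_improvement_loop(theta_pr, D, E0, d_obs, h, grid_shape,
                           k_max=5, p_threshold=0.05):
    """Outer model-improvement loop of Algorithm 4, written in terms of the
    helpers sketched earlier (assimilate_correlated_noise, generate_perturbations)
    and an assumed compute_diagnostic implementing Algorithm 2."""
    E, p, k = E0, 0.0, 0
    theta_post = theta_pr
    while k < k_max and p < p_threshold:
        theta_post = assimilate_correlated_noise(theta_pr, D, E, d_obs)  # Algorithm 1
        r = h(theta_post) - d_obs[:, None]         # residuals of calibrated predictions
        p = compute_diagnostic(E, r)               # Algorithm 2: consistency of residuals
        E = generate_perturbations(r, grid_shape)  # Algorithm 3: fresh perturbations
        k += 1
    return theta_post, p
```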

3 Example applications

We illustrate the methodology with two numerical examples. In the first example, the observations are linear and the prior distribution is Gaussian, so the posterior distribution will also be Gaussian. The difficulty is that the observation operator (i.e., the true sensitivity of the data to the model parameters) is different from the observation operator that is assumed for the data assimilation. The problem is small enough that the results can be examined analytically and the true total covariance can be computed. The second example is based on synthetic seismic data for the Norne Field case. Here, the number of data is large so that it becomes impractical to compute the inverse of the covariance matrix when the observation errors are correlated. Model error affects the synthetic seismic through an error in the petro-elastic model and through inaccuracy of the observation operators.

3.1 Linear 1D test case

The synthetic truth for this test problem is a realization of a correlated Gaussian random vector of length 150 with prior mean 0 and exponential covariance

cov(x, x') = 4 exp( −3 |x − x'| / 25 ).

Observations of the truth are made at every second lattice point with independent measurement errors characterized by a variance of 0.04.

To mimic the effect of error in the forward model, we incorrectly assume that the observations were obtained from a smoothed model (Hann filter with width 31). Thus the approximate observation operator is

h(x_n) = Σ_{i=−15}^{15} w_i x_{n−i}    for    w_i = (1 + cos(2πi/30)) / 30

while the true observation operator is h(x_n) = x_n. This is roughly equivalent to using a coarse model for data assimilation. Figure 1a shows the true property field and the noisy observations from that field. Figure 1b shows the true observation operator (a delta function) and the approximate observation operator that is used in the data assimilation.


Fig. 1 The linear 1D example with model error. a The true property field and the noisy observations. b True and approximate observation operators for n = 75

For a uniform field, both operators would return the same values, but they return different values when applied to heterogeneous fields.
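A short Python/NumPy sketch of this synthetic setup (true field, noisy observations at every second point, and the incorrect Hann-filter observation operator) is given below. The random seed, the choice of starting the observations at the first lattice point, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 150

# True/prior covariance: exponential, variance 4, range parameter 25
x = np.arange(n)
C = 4.0 * np.exp(-3.0 * np.abs(x[:, None] - x[None, :]) / 25.0)
truth = np.linalg.cholesky(C + 1e-10 * np.eye(n)) @ rng.standard_normal(n)

# Observations at every second lattice point, measurement variance 0.04
obs_idx = np.arange(0, n, 2)
d_obs = truth[obs_idx] + 0.2 * rng.standard_normal(obs_idx.size)

# Incorrect (smoothing) observation operator: Hann filter of width 31
i = np.arange(-15, 16)
w = (1.0 + np.cos(2.0 * np.pi * i / 30.0)) / 30.0    # weights sum to 1

def h_approx(field):
    """Apply the Hann-smoothed operator, then sample at the observed points."""
    smoothed = np.convolve(field, w, mode="same")
    return smoothed[obs_idx]
```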

Starting with an initial (correct) estimate of the measurement noise, the workflow of Algorithm 4 is applied to iteratively estimate the total observation error covariance. The data assimilation in each loop is performed using the ensemble Kalman smoother described in Algorithm 1 with local analysis as described in Chen and Oliver [8]. The ensemble size was 60. The Gaspari-Cohn correlation function [25] with range 2a = 45 was used for observation tapering. We assume that the covariance of the total observation error is stationary so that we can use the methodology of Algorithm 3 to generate new realizations of data noise by shifting current realizations. Figure 2a shows the ensemble of predicted observations (h(θ)) after assimilation using C_D^(0), which was based only on measurement noise. To make a comparison with the observations, we must perturb the predicted observations (after data assimilation) with noise from the estimated distribution. The comparison in Fig. 2b shows that the spread in the posterior ensemble is far too small and does not cover the observations. After iteratively improving the estimate of the total observation error covariance, we again compute the ensemble of predicted observations (Fig. 2c) and the a posteriori ensemble of perturbed predictions (Fig. 2d). The improvement in the coverage is clear.

Although we do not require or form the estimates of C_D^(i) for i > 0, it is instructive to visualize the evolution of the estimates in this test case (Fig. 2e). Because we assume stationarity of the total observation error covariance, a single row of the covariance matrix is sufficient to characterize the entire matrix. Note that at iteration 0 the variance is 0.04, and that the value of the variance increases quickly to a value near 1.0. The covariance also quickly develops a short-range correlation. The lack of smoothness is due to the small domain and the influence of the single realization of the "true" observations.

In practice, iterative improvement would terminate when the a posteriori model diagnostic (Eq. 2) attains a value that is consistent with the expected distribution of values. Figure 2f shows that the value of the model diagnostic (probability p in Algorithms 2 and 4) is acceptable (larger than 0.05) after one iteration. This is consistent with the findings of Michel [38], who showed in a related approach that a single iteration was often sufficient to obtain a good estimate of a scale factor for the error covariance.

Because the test problem is relatively small (75 observations and 150 model parameters), it is possible to compute the iterative improvement in the estimate of the covariance for total observation error that would be obtained if it were possible to use Eq. 17 directly. The initial estimate of observation error covariance (iteration 0) is pure nugget with a variance of 0.04. It is the variance of the actual measurement error for the true observation operator. After a single iteration, the estimate of observation error variance increases to nearly 1.0 and the errors become correlated (Fig. 3a). Further iterations slowly approach the theoretical total observation error covariance, which has a variance of approximately 1.75 (Fig. 3b). Although convergence is slow, we note that even a single iteration results in an estimate of the covariance that is far better than the initial estimate.

The purpose of data assimilation is to make predictions. Because this is a synthetic example, however, it is possible to compare updated estimates of the property field with the true property field from which the observations were obtained. In the linear 1D test case, the observation operator used for data assimilation was intentionally made to be incorrect. If data assimilation is performed using the incorrect forward model with the actual observation error covariance, the results are an underestimation of uncertainty in the updated model parameters and biased predictions (Fig. 4a). If the methodology described in Algorithm 4 is used, however, the spread in the final ensemble is increased and the ensemble of updated parameters appears to be consistent with the truth (Fig. 4b).

The quality of the final ensemble is quantified using a formulation of the Mahalanobis distance for high dimensions. We use cross-validation to estimate the probability that the truth was drawn from the same distribution as the ensemble of posterior realizations.


Fig. 2 Results from the linear 1D example. In this example, the number of observations is 75 and the ensemble size is 60. Local analysis with data tapering was used for data assimilation. a Predicted observations from the updated ensemble using initial C_D. b Perturbed predicted observations from the updated ensemble using initial C_D. c Predicted observations from the updated ensemble after correcting C_D. d Perturbed predicted observations from the updated ensemble after correcting C_D. e Several iterations of the first row of the sample estimate of C_D. Errors are correlated. f The first 25 values of the posterior model diagnostic (probability p in Algorithms 2 and 4) for data residuals

Using the initial estimate of C_D, the probability for Fig. 4a is less than 0.0005, while the probability for the truth to be drawn from the same distribution as the ensemble in Fig. 4b is approximately 0.49.

3.2 Norne Field example

In this section, we apply the workflow and methodology of Algorithm 4 to a much larger problem for which the computation and use of a non-diagonal observation error covariance matrix, C_D, might be problematic using standard methods.


Fig. 3 Iterative estimation of C_D for infinite ensemble size. a Early iterations (but the initial guess is not shown). b Later iterations and the theoretical covariance

Fig. 4 Comparison of the true model (blue curve) with the ensemble of model parameters after data assimilation. a Updated model parameters after data assimilation with initial estimate of C_D. b Updated model parameters after data assimilation with final estimate of C_D

Fig. 5 True reservoir model property fields in layer 11 of the Norne Field model

Fig. 6 Model error due to imperfect vertical filtering. a Frequency spectrum computed from the Norne 2001 impedance volume. b True and approximate observation operators


Fig. 7 Two realizations of seismic noise in layer 11 generated from the noise model estimated using factorial co-kriging on the 2001 and 2006 seismic surveys. Units of impedance: (m/s)·(g/cc)

The synthetic test problem that we have chosen is based on the Norne Field benchmark case [51]. In our test problem, we assimilate synthetic acoustic impedance data to update estimates of porosity and net-to-gross ratio in each active cell of the Norne simulation model at one reservoir condition. We intentionally add model error to the problem so that an accurate estimate of observation noise is not sufficient to explain the discrepancy between the observations and the model predictions.

Our models for assimilation of seismic data are based on the Norne reservoir simulation model provided by the I/O Centre at the Norwegian University of Science and Technology (NTNU). The model is composed of 46 cells by 112 cells by 22 cells in the i, j, and k directions, respectively, with a total of 44,927 active cells. The fourth layer, representing the Not Shale Formation, consists entirely of inactive cells. The average horizontal cell dimensions in this model are approximately 100 ft by 100 ft (30.48 m by 30.48 m). One realization of porosity and NTG from Chen and Oliver [7] has been used to define the true porosity and true NTG fields in this work (Fig. 5). This realization is not a part of the 100 model realizations that are updated during data assimilation.

In order to simulate seismic data from a reservoir simulation model, we minimally require a rock-physics or petro-elastic model (PEM), a filtering function, a seismic noise model, and time-to-depth conversion. A PEM is a set of relationships that aim to convert certain reservoir properties (e.g., porosity and NTG) and reservoir variables (e.g., saturation, pressure) into elastic properties, such as velocities and density. Synthetic seismic acoustic impedance data are created using a "true" petro-elastic model based on Gassmann's equations [26, 62], the Hashin-Shtrikman bound models [27] for a mixture of sands and shales, and the fluid and mineral parameters proposed for the Norne Field [12]. For the Hashin-Shtrikman bounds, we assume that the volume of shale in each of the active cells in the model is given by 1 − NTG [3]. The true PEM is then used to simulate the true impedance field from this realization (Fig. 8, top row).

Fig. 8 True model impedance (top row) is computed at the simulator gridblock level. True seismic impedance (middle row) is obtained by filtering the true model impedance using the "true" filter. Observed seismic impedance (bottom row) is obtained by adding a realization of correlated seismic noise to the true seismic impedance. Only horizontal layer 11 and vertical section 11 are shown. Units of impedance: (m/s)·(g/cc)

Table 1 Test cases used for the Norne Field example. The first case (Case 1) is characterized by a large model error coming from a biased petro-elastic model (PEM) and filtering step. In the second example (Case 2), with smaller model error, we use a biased vertical (frequency) filter

Test cases                           Case 1                              Case 2          Truth
Petro-elastic model                  Biased dry moduli vs. porosity      True PEM        True PEM
                                     relationships; error in shale volume
Horizontal filter (moving average)   5 × 5                               3 × 3           3 × 3
Vertical filter (Ormsby)             0-0-80-100 Hz                       0-0-60-80 Hz    0-0-100-120 Hz
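The calibrated Norne PEM itself is not reproduced here. Purely as an illustration of the structure of such a model, a generic Gassmann-type impedance calculation (with placeholder mineral and fluid properties, not the parameters of [12]) can be sketched as:

```python
import numpy as np

def acoustic_impedance_gassmann(phi, k_dry, mu_dry, rho_dry,
                                k_min=36.6e9, k_fl=2.7e9, rho_fl=1000.0):
    """Illustrative Gassmann fluid substitution and acoustic impedance.

    phi     : porosity
    k_dry   : dry-rock bulk modulus [Pa] (e.g., from a modulus-porosity relation)
    mu_dry  : dry-rock shear modulus [Pa] (unchanged by fluid substitution)
    rho_dry : dry-rock bulk density [kg/m^3]
    k_min, k_fl, rho_fl : placeholder mineral/fluid moduli and fluid density
    """
    # Gassmann equation: saturated bulk modulus from the dry-rock modulus
    b = 1.0 - k_dry / k_min
    k_sat = k_dry + b**2 / (phi / k_fl + (1.0 - phi) / k_min - k_dry / k_min**2)

    rho = rho_dry + phi * rho_fl                 # saturated bulk density
    vp = np.sqrt((k_sat + 4.0 * mu_dry / 3.0) / rho)
    return rho * vp                              # acoustic impedance = density x P-velocity
```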

In this paper, seismic observations and predictions are compared at the scale of the Norne seismic impedance data. A "true" vertical frequency filter is created based on the frequency spectrum of the inverted acoustic impedance data from the 2001 Norne seismic dataset (Fig. 6a). To approximately match the frequency spectrum of the real Norne impedance data, we selected a low-pass Ormsby filter [46] with cut frequencies at 0-0-100-120 Hz. For a sandstone velocity of 3500 m/s, this corresponds approximately to the sensitivity function in depth shown by the solid curve in Fig. 6b. To model the horizontal resolution of the inverted acoustic impedance, we further apply a horizontal moving-average filter with a window of three cells by three cells. The Norne Field data set only provides a reservoir simulation model, not a geomodel. In order to apply the filtering to generate seismic predictions from the simulation model, it was necessary to assign impedance values to inactive cells in the model for which no petrophysical properties were assigned. In this study, we assigned the Not Shale acoustic impedance value of 7360 (m/s)·(g/cc) from the Norne dataset [12] to inactive cells. The filtered true seismic impedance is shown in the middle row of Fig. 8.
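An Ormsby filter is defined by four corner frequencies that specify a trapezoidal amplitude spectrum. The sketch below applies such a zero-phase filter to a single trace in the frequency domain; it is an illustration of the filter definition, not the depth-domain sensitivity function actually used in the paper.

```python
import numpy as np

def ormsby_filter(trace, dt, f1, f2, f3, f4):
    """Apply a zero-phase Ormsby (trapezoidal band-pass) filter to a trace.

    trace : sampled signal (e.g., an impedance trace in time)
    dt    : sample interval in seconds
    f1-f4 : corner frequencies in Hz; the amplitude ramps up from f1 to f2,
            is flat from f2 to f3, and ramps down from f3 to f4
            (0-0-100-120 Hz gives a low-pass filter)
    """
    n = trace.size
    freq = np.fft.rfftfreq(n, d=dt)

    amp = np.zeros_like(freq)
    amp[(freq >= f2) & (freq <= f3)] = 1.0
    up = (freq > f1) & (freq < f2)
    if f2 > f1:
        amp[up] = (freq[up] - f1) / (f2 - f1)
    down = (freq > f3) & (freq < f4)
    amp[down] = (f4 - freq[down]) / (f4 - f3)

    return np.fft.irfft(np.fft.rfft(trace) * amp, n=n)
```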

Noise is often estimated from seismic data by assuming that the noise is either spatially uncorrelated or that it is of a much different frequency than the signal. Because the Norne data set has multiple vintages of seismic data, it is possible to use factorial co-kriging (FCK) [1, 10, 47] to separate noise from signal without making those assumptions. We used FCK to decompose the 2001 baseline and the 2006 monitor survey into three parts: a common part between the two surveys, and two spatially independent residuals from the baseline and the monitor survey. Our seismic noise model for the observations in the test case is based on the residuals extracted from the 2001 baseline survey. The covariance estimate for the seismic noise has a nugget effect with amplitude 983 (units of impedance squared) and a cubic covariance [9] with amplitude 8347 and ranges in the i, j, and k directions of 10, 8, and 4 grid cells, respectively. Figure 7 shows two realizations of seismic noise generated from the initial data error covariance estimate. Seismic impedance observations, obtained by adding a realization of correlated noise to the true seismic impedance field, are shown in the bottom row of Fig. 8.

Fig. 9 Top row shows the difference between the true porosity field and the ensemble mean of the porosity field for the example with large model error (Case 1). Bottom row shows spread in the ensemble of porosity

Fig. 10 Top row shows the difference between the true porosity field and the ensemble mean of the porosity field for the example with small model error (Case 2). Bottom row shows spread in the ensemble of porosity

3.2.1 Test case 1: Large model error

In this test case, we consider two sources of model error: an incorrect parameter value in the petro-elastic model and a difference in scale between impedance observations and impedance predictions (Table 1). Firstly, we constructed a biased PEM by modifying the 'true' relationships between the dry moduli and the porosity provided for the Norne Field [12], and by adding an error in the assignment of the shale volumes in the Norne model. Secondly, we biased the true filters by reducing the cut frequencies in the vertical filtering to 0-0-80-100 Hz, and by increasing the moving-average window of the horizontal filter to five cells by five cells. This resulted in impedance predictions that are smoother than the impedance observations along both the horizontal and vertical axes.

In this case of large model error (Case 1), the stopping criterion for estimation of total observation error covariance based on the magnitude of the model diagnostic p was satisfied after four iterations (Fig. 11a), but we continued the model improvement cycle for one additional iteration of Algorithm 4. The sequence of model diagnostics based on the ensemble of residuals is shown by the solid curve in Fig. 11a. An increasing probability suggests that the magnitude of the data residuals is consistent with the estimated total observation error. Updated estimates of the total observation error covariance are shown in Fig. 12a. The result of the iteration is to monotonically increase the magnitude of the variance and to increase the range of the correlation, thus placing less weight on the observations in each cycle of model improvement. The initial estimate of C_D based on actual measurement error (pink curve) is much smaller than the final estimate. Subsequent changes in C_D are smaller than the first change, and changes in iterations 3–5 are very small.

Because this is a synthetic example, it is possible to compare the true property fields with the ensemble of predictions. Our comparison is based on the probability that the truth is drawn from the same distribution as the ensemble of predictions.

Fig. 11 Evaluation of the model diagnostic used for the stopping criterion in Algorithm 4 (a), and comparison of the true porosity distribution with the ensemble after history matching for the Norne seismic test cases (b). Case 1 has large model error. Case 2 has smaller model error


Fig. 12 Experimental covariance in the j-direction of data residuals computed using the entire ensemble. Curves show the iterative changes in the estimate of the covariance with iterations of Algorithm 4

When the actual measurement error is used in data assimilation, the truth is quite different from the predictions, but after modification of the estimate of C_D to account for model error, the difference between the truth and the ensemble of predictions becomes much smaller (Figs. 9 and 11b).

In summary, the model error in Case 1 was very large, so large that after data assimilation with the estimated total observation error the data were effectively ignored. We emphasize that there was nothing 'wrong' with the data in Case 1, but because the model was deficient, improvements in predictions from the model after data assimilation were small. If this situation were to occur in practice, the first recourse should be to try to improve the model by identifying missing parameters, by increasing uncertainty, or by improving the forward simulator. After that has been done as well as possible, the next step should be to update the estimate of C_D as was done here. The result is an ensemble estimate with smaller spread than the initial ensemble (Fig. 9), but one which is still consistent with the true porosity.

3.2.2 Test case 2: Small model error

In the second Norne Field example, the only source of model error is a difference in scale in the forward filtering operator (Table 1). The vertical filter used for prediction has cut frequencies of 0-0-60-80 Hz instead of the true cut frequencies of 0-0-100-120 Hz. In this case, there is no error in the PEM. Similar to Case 1, the stopping criterion for estimation of total observation error covariance based on the magnitude of data residuals was satisfied in Case 2 after four iterations (Fig. 11a), but we again continued the model improvement cycle for one additional iteration of Algorithm 4. The sequence of model diagnostics based on the ensemble of residuals is shown by the dashed curve in Fig. 11a.

The changes in the iterative estimates of the total observation error covariance (Fig. 12b) are not as large as in Case 1 because the magnitude of the model error is smaller. We still see, however, a change from an initial estimate of C_D that is the sum of a small "nugget" effect and a cubic covariance with a range of approximately 8 grid cells (pink curve) to an estimate that is spatially correlated with a range of approximately 20 grid cells (Fig. 12b). The range of the correlated error in Case 2 is substantially smaller than the range observed in Case 1, where the sources of the model error included an error in the PEM. Changes to the estimate of C_D in iterations 2–5 are very small. The difference between true porosity and the ensemble of predictions becomes smaller after using the total estimate of C_D based on the data residuals (Fig. 10). Furthermore, the spread increases with the new total estimate of C_D as less weight is placed on the observations. The probability that the truth is drawn from the same distribution as the ensemble of calibrated predictions is smaller than expected in this case (dashed curve in Fig. 11b), possibly due to inadequate localization. The final ensemble, however, becomes closer to the truth with the iterations of Algorithm 4 (Fig. 11b).

4 Conclusions

All models have errors. The purpose of model calibration, model criticism, and model improvement is to reduce the effects of those errors so that forecasts made from calibrated models provide reliable predictions of future behavior. If model errors are not accounted for properly, the result is overconfidence in incorrect predictions. The ultimate result is suboptimal decisions on reservoir management and development.

In this paper, we presented a procedure for identification of the presence of model error and a method for reducing the effect of model error when it cannot be removed through model improvement. The method involves iteratively updating an estimate of an effective observation error that accounts for the presence of model error.

In order to handle model error and correlated observation error in very large problems, it was necessary to develop methods that allow model checking for large models. This requires estimation of the inverse of a low-rank approximation of the observation error covariance matrix C_D. We used a shrinkage-based regularization of the covariance to develop model diagnostics that can be used in very high dimensions.

Comput Geosci (2020) 24:889–905902

Page 15: Seismic data assimilation with an imperfect model · Data assimilation also generally implies a Bayesian approach to uncertainty estimation [20], which we take as a key component

used in very high dimensions. It was also necessary toperform data assimilation without forming the observationerror covariance matrix. That was accomplished usinga low-rank updating scheme [21], with a new methodfor generating unconditional realizations of sample errorfrom a distribution for which we had only a low-rankapproximation of the covariance matrix.
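As a rough illustration of these two ingredients, the sketch below builds a shrinkage-regularized covariance of the form $(1-\delta)AA^T + \delta\,\mathrm{diag}(D)$ and draws error realizations from it without ever forming the full matrix. The factor $A$, the diagonal target $D$, and the shrinkage weight $\delta$ are stand-ins for quantities that would be estimated from the data residuals; this is a minimal sketch, not the implementation used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_d, r, n_e = 5000, 50, 100                       # data size, rank, ensemble size (illustrative)
A = rng.standard_normal((n_d, r)) / np.sqrt(r)    # stands in for a low-rank factor of C_D
D = np.full(n_d, 0.05)                            # diagonal shrinkage target (illustrative)
delta = 0.2                                       # shrinkage weight (illustrative)

def sample_errors(n_samples):
    """Draw e ~ N(0, (1-delta) A A^T + delta diag(D)) using only A and D."""
    z = rng.standard_normal((r, n_samples))       # low-rank component
    w = rng.standard_normal((n_d, n_samples))     # diagonal component
    return np.sqrt(1.0 - delta) * (A @ z) + np.sqrt(delta * D)[:, None] * w

E = sample_errors(n_e)                            # realizations with the required covariance
```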

The methodology was tested on several synthetic problems for which the model error could be controlled and for which errors in predictions could be evaluated. In the 1D linear example, the value of the model diagnostic is acceptable after just one iteration. The second example was based on the Norne Field benchmark case. The number of observations was about 114,000 and the number of model parameters was about 90,000. Model errors were caused by errors in the petro-elastic model and by errors in the filtering function (forward operator). In both cases, the model improvement iterations (using the total $C_D$) increased the spread in the ensemble, which led to less overfitting to the observations. In real applications with actual inverted seismic data, this outcome is desirable in order to avoid overfitting to biased observations. Furthermore, the resulting ensembles of porosity and NTG have a spread that is smaller than that in the initial ensemble, but that is consistent with their true fields.

We showed that the methodology described in this paper is feasible for large problems (large numbers of model parameters and large amounts of data), and that the application resulted in improved predictability and a reduced tendency for overconfidence.

Acknowledgments We are grateful to Geovariances for providing a license for the use of Isatis for factorial co-kriging. Additionally, the authors thank Equinor (operator of the Norne Field) and its license partners Eni Norge and Petoro for the release of the Norne data. The authors acknowledge the Center for Integrated Operations at NTNU for cooperation and coordination of the Norne Cases. The views expressed in this paper are the views of the authors and do not necessarily reflect the views of Equinor and the Norne license partners.

Funding information Primary support for the authors has been provided by the CIPR/IRIS cooperative research project "4D Seismic History Matching", which is funded by industry partners Eni Norge, Petrobras, and Total, as well as by the Research Council of Norway through the Petromaks2 program.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix: Expectation for residuals with incorrect forward model

The differences between observations and predictions from models provide information about the magnitude of model error and the magnitude of observation error, but the relationship is not simple. In this appendix, we show that the expected covariance of the data residuals, defined as the difference between predictions from a perturbed observation method of data assimilation and the actual observations, takes a relatively simple form:

\[ r_\text{post} = \tilde{H}\,\theta_\text{post} - d_\text{obs} \tag{8} \]

where $\tilde{H}$ is an approximation to the 'true' forward model $H$, $\theta_\text{post}$ is the ensemble of model parameters conditioned to the observations, $d_\text{obs}$ is the vector of observations, and $r_\text{post}$ is the ensemble of data residuals.

For simplicity, we assume that the 'truth' can be obtained from a model that has the same parameters as our approximate model and that the model error is due entirely to a deficiency of the forward model. Define

\[ d_\text{obs} = d_\text{true} + \epsilon_d = H(\theta_\text{pr} + \epsilon_{\theta,2}) + \epsilon_d \tag{9} \]

where $\epsilon_d \sim N(0, R)$ and $\epsilon_{\theta,2} \sim N(0, P)$.

Because we are assuming a Gaussian prior and a linear observation operator with Gaussian errors, we use the method of randomized maximum likelihood (RML) to sample from the posterior conditional to observations [41, 45]. In this case, however, we allow for the possibility that our forward model is imperfect and that our estimate of the observation error is incorrect. A sample of the model parameters from the conditional pdf can be generated by minimizing the objective function

\[ \theta_\text{post} = \arg\min_\theta \Big( (\theta - \theta^*)^T P^{-1} (\theta - \theta^*) + (\tilde{H}\theta + \epsilon_d - d_\text{obs})^T \tilde{R}^{-1} (\tilde{H}\theta + \epsilon_d - d_\text{obs}) \Big), \tag{10} \]

the solution of which is

\[ \begin{aligned} \theta_\text{post} &= \theta^* - P\tilde{H}^T(\tilde{H} P \tilde{H}^T + \tilde{R})^{-1}(\tilde{H}\theta^* + \epsilon_d - d_\text{obs}) \\ &= \theta^* - \tilde{K}(\tilde{H}\theta^* + \epsilon_d - d_\text{obs}), \end{aligned} \tag{11} \]

where we defined the gain matrix based on the approximate forward model

\[ \tilde{K} = P\tilde{H}^T(\tilde{H} P \tilde{H}^T + \tilde{R})^{-1}. \tag{12} \]
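A small, self-contained numerical sketch of Eqs. 10–12 for a linear-Gaussian toy problem is given below; it is not the code used for the examples in this paper. Here H_tilde and R_tilde play the roles of $\tilde{H}$ and $\tilde{R}$, the perturbation eps_d is drawn from the assumed observation error distribution, and all dimensions and noise levels are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_m, n_d = 20, 15
P = np.eye(n_m)                                              # prior covariance
H_true = rng.standard_normal((n_d, n_m))                     # "true" forward model H
H_tilde = H_true + 0.3 * rng.standard_normal((n_d, n_m))     # imperfect forward model used for calibration
R = 0.1 * np.eye(n_d)                                        # true observation error covariance
R_tilde = 0.1 * np.eye(n_d)                                  # assumed observation error covariance

theta_true = rng.standard_normal(n_m)
d_obs = H_true @ theta_true + rng.multivariate_normal(np.zeros(n_d), R)

# One RML sample: draw theta_star from the prior, perturb the data, then apply Eq. 11.
theta_star = rng.multivariate_normal(np.zeros(n_m), P)
eps_d = rng.multivariate_normal(np.zeros(n_d), R_tilde)
K_tilde = P @ H_tilde.T @ np.linalg.inv(H_tilde @ P @ H_tilde.T + R_tilde)   # Eq. 12
theta_post = theta_star - K_tilde @ (H_tilde @ theta_star + eps_d - d_obs)   # Eq. 11
r_post = H_tilde @ theta_post - d_obs                                        # data residual, Eq. 8
```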


A.1 The difference between predictions from calibrated models and observations

The difference between the predictions from the model calibrated to perturbed observations (RML) and the actual observations is

\[ \begin{aligned} r &= \tilde{H}\theta_\text{post} - d_\text{obs} \\ &= \tilde{H}\theta^* - d_\text{obs} - \tilde{H}\tilde{K}(\tilde{H}\theta^* + \epsilon_d - d_\text{obs}) \\ &= (I - \tilde{H}\tilde{K})(\tilde{H}\theta^* + \epsilon_d - d_\text{obs}) - \epsilon_d. \end{aligned} \tag{13} \]

The covariance of these residuals is approximately

\[ E[rr^T] = (I - \tilde{H}\tilde{K})\big((HPH^T + R) - (\tilde{H}P\tilde{H}^T + \tilde{R})\big)(I - \tilde{H}\tilde{K})^T + \tilde{R}, \tag{14} \]

where we assume that $(H - \tilde{H})\theta_\text{pr} = 0$. Note that if $\tilde{H} = H$ and $\tilde{R} = R$, then Eq. 14 simplifies to

\[ E[rr^T] = R, \tag{15} \]

which is what we expect. Also, note that if $\tilde{H}P\tilde{H}^T + \tilde{R} = HPH^T + R$, then $E[rr^T] = \tilde{R}$. For this case, we have
\[ \tilde{R} = R + HPH^T - \tilde{H}P\tilde{H}^T. \tag{16} \]

A.2 Iteration

We use an iterative fixed-point scheme for estimation of the total observation error covariance based on the residuals, as shown in Eq. 14:

\[ \tilde{R}_{\ell+1} = (I - \tilde{H}\tilde{K}_\ell)\big((HPH^T + R) - (\tilde{H}P\tilde{H}^T + \tilde{R}_\ell)\big)(I - \tilde{H}\tilde{K}_\ell)^T + \tilde{R}_\ell, \tag{17} \]
where $\tilde{K}_\ell$ is the gain of Eq. 12 computed with the current estimate $\tilde{R}_\ell$.

The iterative scheme has a stable fixed-point solution when

\[ R + HPH^T - \tilde{H}P\tilde{H}^T \tag{18} \]

is positive definite. See the discussion of convergence on page 262 of Menard [37].
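The following sketch implements the fixed-point iteration of Eq. 17 for a small linear toy setting (it can be used, for example, with the matrices defined in the earlier RML sketch). In practice the bracketed term is not available and is replaced by the sample covariance of the data residuals; this version, with the true quantities known, is only meant to illustrate the fixed point of Eq. 16 under the assumptions stated above.

```python
import numpy as np

def fixed_point_R(P, H_true, H_tilde, R_true, R0, n_iter=20):
    """Iterate Eq. 17, recomputing the gain from the current estimate of the total covariance."""
    R_l = R0.copy()
    n_d = R0.shape[0]
    for _ in range(n_iter):
        K_l = P @ H_tilde.T @ np.linalg.inv(H_tilde @ P @ H_tilde.T + R_l)   # Eq. 12 with R_l
        A = np.eye(n_d) - H_tilde @ K_l
        mismatch = (H_true @ P @ H_true.T + R_true) - (H_tilde @ P @ H_tilde.T + R_l)
        R_l = A @ mismatch @ A.T + R_l                                        # Eq. 17
    return R_l

# At the fixed point, R_l equals R_true + H_true P H_true^T - H_tilde P H_tilde^T (Eq. 16),
# provided that this matrix is positive definite (the stability condition cited from Menard [37]).
```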

References

1. Abreu, C.E., Lucet, N., Nivlet, P., Royer, J.-J.: Improving 4D seismic data interpretation using geostatistical filtering. In: 9th International Congress of the Brazilian Geophysical Society (2005)

2. Amini, H.: A pragmatic approach to simulator-to-seismic modelling for 4D seismic interpretation. Ph.D. thesis, Heriot-Watt University (2014)

3. Amini, H., Alvarez, E., MacBeth, C., Shams, A.: Finding a petro-elastic model suitable for sim2seis calculation. In: 74th EAGE Conference and Exhibition incorporating EUROPEC 2012 (2012)

4. Blouin, M., Le Ravalec, M., Gloaguen, E., Adelinet, M.: Porosity estimation in the Fort Worth Basin constrained by 3D seismic attributes integrated in a sequential Bayesian simulation framework. Geophysics 82(4), M67–M80 (2017)

5. Brynjarsdottir, J., O'Hagan, A.: Learning about physical parameters: the importance of model discrepancy. Inverse Probl. 30(11), 114007 (2014)

6. Chen, Y., Oliver, D.S.: Levenberg-Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput. Geosci. 17(4), 689–703 (2013)

7. Chen, Y., Oliver, D.S.: History matching of the Norne full-field model with an iterative ensemble smoother. SPE Reserv. Eval. Eng. 17(2), 244–256 (2014)

8. Chen, Y., Oliver, D.S.: Localization and regularization for iterative ensemble smoothers. Comput. Geosci. 21(1), 13–30 (2017)

9. Chiles, J.-P., Delfiner, P.: Geostatistics: Modeling Spatial Uncertainty, 2nd edn. Wiley, New York (2012)

10. Coleou, T., Hoeber, H., Lecerf, D., et al.: Multivariate geostatistical filtering of time-lapse seismic data for an improved 4D signature. In: 73rd Ann. Intern. Mtg., SEG, Expanded Abstracts (2002)

11. Courtier, P., Andersson, E., Heckley, W., Vasiljevic, D., Hamrud, M., Hollingsworth, A., Rabier, F., Fisher, M., Pailleux, J.: The ECMWF implementation of three-dimensional variational assimilation (3D-Var). I: Formulation. Q. J. Roy. Meteorol. Soc. 124(550), 1783–1807 (1998)

12. Dadashpour, M.: Reservoir characterization using production data and time-lapse seismic data. Ph.D. dissertation, NTNU, Trondheim, Norway (2009)

13. Del Giudice, D., Honti, M., Scheidegger, A., Albert, C., Reichert, P., Rieckermann, J.: Improving uncertainty estimation in urban hydrological modeling by statistically describing bias. Hydrol. Earth Syst. Sci. 17(10), 4209–4225 (2013)

14. Dietrich, C.R., Newsam, G.N.: Fast and exact simulation of stationary Gaussian processes through circulant embedding of the covariance matrix. SIAM J. Sci. Comput. 18(4), 1088–1107 (1997)

15. Doherty, J., Welter, D.: A short exploration of structural noise. Water Resour. Res. 46(5), W05525 (2010)

16. Doyen, P.M., Psaila, D.E., den Boer, L.D., Jans, D.: Reconciling data at seismic and well log scales in 3-D earth modelling. In: Proc. of the SPE Annual Technical Conference and Exhibition, pp. 5–8, San Antonio (1997)

17. Emerick, A.A.: Analysis of the performance of ensemble-based assimilation of production and seismic data. J. Pet. Sci. Eng. 139, 219–239 (2016)

18. Emerick, A.A., Reynolds, A.C.: Ensemble smoother with multiple data assimilation. Comput. Geosci. 55, 3–15 (2013)

19. Engel, J., Buydens, L., Blanchet, L.: An overview of large-dimensional covariance and precision matrix estimators with applications in chemometrics. J. Chemometr. 31(4), e2880 (2017)

20. Evensen, G.: Data Assimilation: The Ensemble Kalman Filter, 2nd edn. Springer (2009)

21. Evensen, G.: The ensemble Kalman filter for combined state and parameter estimation: Monte Carlo techniques for data assimilation in large systems. IEEE Control Syst. Mag. 29(3), 83–104 (2009)

22. Evensen, G., Eikrem, K.S.: Conditioning reservoir models on rate data using ensemble smoothers. Comput. Geosci. 22(5), 1251–1270 (2018)

23. Evin, G., Thyer, M., Kavetski, D., McInerney, D., Kuczera, G.: Comparison of joint versus postprocessor approaches for hydrological uncertainty estimation accounting for error autocorrelation and heteroscedasticity. Water Resour. Res. 50(3), 2350–2375 (2014)

24. Francis, A.M.: Understanding stochastic inversion: Part 2. First Break 24(12), 79–84 (2006)

25. Gaspari, G., Cohn, S.E.: Construction of correlation functions in two and three dimensions. Q. J. R. Meteorol. Soc. 125(554), 723–757 (1999)


26. Gassmann, F.: Elastic waves through a packing of spheres. Geophysics 16, 673–685 (1951)

27. Hashin, Z., Shtrikman, S.: A variational approach to the theory of the elastic behaviour of multiphase materials. J. Mech. Phys. Solids 11(2), 127–140 (1963)

28. Hodyss, D., Nichols, N.: The error of representation: basic understanding. Tellus Series A: Dynamic Meteorology and Oceanography 67, 24822 (2015)

29. Howes, K.E., Fowler, A.M., Lawless, A.S.: Accounting for model error in strong-constraint 4D-Var data assimilation. Q. J. Roy. Meteorol. Soc. 143(704), 1227–1240 (2017)

30. Iglewicz, B., Hoaglin, D.C.: How to Detect and Handle Outliers, vol. 16. ASQ Press (1993)

31. Janjic, T., Bormann, N., Bocquet, M., Carton, J.A., Cohn, S.E., Dance, S.L., Losa, S.N., Nichols, N.K., Potthast, R., Waller, J.A., Weston, P.: On the representation error in data assimilation. Q. J. Roy. Meteorol. Soc. 144(713), 1257–1278 (2018)

32. Kalla, S., White, C.D., Gunning, J., Glinsky, M.: Downscaling multiple seismic inversion constraints to fine-scale flow models. SPE J. 14(4), 746–758 (2009)

33. Kroese, D.P., Botev, Z.I.: Spatial process simulation. In: Schmidt, V. (ed.) Stochastic Geometry, Spatial Statistics and Random Fields: Models and Algorithms, pp. 369–404. Springer International Publishing (2015)

34. Ledoit, O., Wolf, M.: A well-conditioned estimator for large-dimensional covariance matrices. J. Multivariate Anal. 88(2), 365–411 (2004)

35. Leung, P.L., Chan, W.Y.: Estimation of the scale matrix and its eigenvalues in the Wishart and the multivariate F distributions. Ann. Inst. Stat. Math. 50(3), 523–530 (1998)

36. Liu, M., Grana, D.: Stochastic nonlinear inversion of seismic data for the estimation of petroelastic properties using the ensemble smoother and data reparameterization. Geophysics 83(3), M25–M39 (2018)

37. Menard, R.: Error covariance estimation methods based on analysis residuals: theoretical foundation and convergence properties derived from simplified observation networks. Q. J. Roy. Meteorol. Soc. 142(694), 257–273 (2016)

38. Michel, Y.: Diagnostics on the cost-function in variational assimilations for meteorological models. Nonlinear Processes Geophys. 21(1), 187–199 (2014)

39. Miller, J.W., Dunson, D.B.: Robust Bayesian inference via coarsening. arXiv:1506.06101 (2015)

40. Mirouze, I., Weaver, A.T.: Representation of correlation functions in variational assimilation using an implicit diffusion operator. Q. J. Roy. Meteorol. Soc. 136(651B), 1421–1443 (2010)

41. Oliver, D.S.: On conditional simulation to inaccurate data. Math. Geol. 28(6), 811–817 (1996)

42. Oliver, D.S.: Calculation of the inverse of the covariance. Math. Geol. 30(7), 911–933 (1998)

43. Oliver, D.S., Alfonzo, M.: Calibration of imperfect models to biased observations. Comput. Geosci. 22(1), 145–161 (2018)

44. Oliver, D.S., Chen, Y.: Recent progress on reservoir history matching: a review. Comput. Geosci. 15(1), 185–221 (2011)

45. Oliver, D.S., Reynolds, A.C., Liu, N.: Inverse Theory for Petroleum Reservoir Characterization and History Matching. Cambridge University Press, Cambridge (2008)

46. Ormsby, J.F.A.: Design of numerical filters with applications to missile data processing. J. ACM 8(3), 440–466 (1961)

47. Pardo-Iguzquiza, E., Dowd, P.A.: FACTOR2D: a computer program for factorial cokriging. Comput. Geosci. 28(8), 857–875 (2002)

48. Philip, N., Dyce, M., Whitcombe, D.: 4D amplitude significance – a technique for suppressing noise in 4D seismic surveys. In: 71st EAGE Conference and Exhibition (2009)

49. Rainwater, S., Bishop, C.H., Campbell, W.F.: The benefits of correlated observation errors for small scales. Q. J. Roy. Meteorol. Soc. 141(693), 3439–3445 (2015)

50. Roggero, F., Lerat, O., Ding, D.Y., Berthet, P., Bordenave, C., Lefeuvre, F., Perfetti, P.: History matching of production and 4D seismic data: application to the Girassol Field, offshore Angola. Oil Gas Sci. Technol. – Rev. IFP Energies nouvelles 67(2), 237–262 (2012)

51. Rwechungura, R.W., Suwartadi, E., Dadashpour, M., Kleppe, J., Foss, B.A.: The Norne Field case – a unique comparative case study. In: SPE Intelligent Energy Conference and Exhibition. Society of Petroleum Engineers (2010)

52. Satterfield, E., Hodyss, D., Kuhl, D.D., Bishop, C.H.: Investigating the use of ensemble variance to predict observation error of representation. Mon. Weather Rev. 145(2), 653–667 (2017)

53. Schafer, J., Strimmer, K.: A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat. Appl. Genet. Mol. Biol. 4(1), Article 32, 1–30 (2005)

54. Seaman, R.S.: Absolute and differential accuracy of analyses achievable with specified observational network characteristics. Mon. Weather Rev. 105(10), 1211–1222 (1977)

55. Sengupta, M., Mavko, G., Mukerji, T.: Quantifying subresolution saturation scales from time-lapse seismic data: a reservoir monitoring case study. Geophysics 68(3), 803–814 (2003)

56. Stephen, K.D., Shams, A., MacBeth, C.: Faster seismic history matching in a United Kingdom continental shelf reservoir. SPE Reserv. Eval. Eng. 12(4), 586–594 (2007). Presented at the 2007 SPE Europec/EAGE Annual Conference and Exhibition, London, England, June 11–14, 2007

57. Stewart, L.M., Dance, S.L., Nichols, N.K.: Correlated observation errors in data assimilation. Int. J. Numer. Methods Fluids 56(8), 1521–1527 (2008)

58. Thore, P.: Uncertainty in seismic inversion: what really matters? Lead. Edge 34(9), 1000–1004 (2015)

59. Trefethen, L.N., Bau, D. III: Numerical Linear Algebra, vol. 50. SIAM (1997)

60. Waller, J.A., Ballard, S.P., Dance, S.L., Kelly, G., Nichols, N.K., Simonin, D.: Diagnosing horizontal and inter-channel observation error correlations for SEVIRI observations using observation-minus-background and observation-minus-analysis statistics. Remote Sens. 8(7) (2016)

61. Waller, J.A., Dance, S.L., Nichols, N.K.: Theoretical insight into diagnosing observation error correlations using observation-minus-background and observation-minus-analysis statistics. Q. J. Roy. Meteorol. Soc. 142(694), 418–431 (2016)

62. Wang, Z.Z.: Y2K tutorial: fundamentals of seismic rock physics. Geophysics 66(2), 398–412 (2001)

63. Watson, J., Holmes, C.: Approximate models and robust decisions. Statist. Sci. 31(4), 465–489 (2016)

64. White, J.T., Doherty, J.E., Hughes, J.D.: Quantifying the predictive consequences of model error with linear subspace analysis. Water Resour. Res. 50(2), 1152–1173 (2014)

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
