On internal consistency, conditioning and models of uncertainty · 2011-04-22
TRANSCRIPT
On internal consistency, conditioning and models of uncertainty
Jef Caers, Stanford University
Abstract
Recent research has been tending towards building models of uncertainty of the Earth, not just building a single (or
few) detailed Earth models. However, like any model, models of uncertainty often need to be
constrained/conditioned to data to have any predictive power or be useful in decision making. In this
presentation, I propose the concept of “internal consistency” as a scientific basis to study prior (unconditional) and
posterior (conditional) models of uncertainty as well as the various sampling techniques involved. In statistical
science, internal consistency is the extent to which tests or procedures assess the same characteristic, skill or
quality. In the context of uncertainty, I will therefore define internal consistency as the degree to which sampling
methods honor the relationship between the unconditional model of uncertainty (prior) and conditional model of
uncertainty (posterior) as specified under a (subjectively) chosen “theory” (for example: Bayes’ rule). The “tests”
performed are then various different ways of sampling from the same (conditional or unconditional) distributions.
If these distributions are related to each other via a theory, then such “tests” should yield similar results. I propose
various such tests using Bayes’ rule as the “theory”. A first test is simply to generate unconditional models, extract
data from them using a forward model and generate conditional models from this randomized data. Internal
consistency with Bayes’ rule would mean that both sets of conditional and unconditional models span the exact
same space of uncertainty, simply because the data spans the uncertainty in the prior. I show that this is not true for
a number of popular conditional stochastic modeling methods: sequential simulation with hard data, gradual
deformation and ensemble Kalman filters for solving inverse problems. I also show that in some cases lack of
internal consistency leads to a considerable artificial reduction of (conditional) uncertainty that may have
important consequences if such models are used for prediction purposes. A case involving predicting flow behavior
is presented. Finally, I offer some discussion on the importance of internal consistency in practical applications and
introduce some novel approaches to conditioning that are internally consistent as well as computationally
efficient.
Introduction
There has been a shift in recent years towards building models of uncertainty instead of just building
models. What do I mean by that? In 3D modeling, one is interested in creating a 3D gridded model of the
Earth representing the data and geological understanding of the spatial structures and components. In a
way such a model is "a model of what we know". In modeling uncertainty, we are not merely interested
in building a 3D model; we are interested in covering the uncertainty about a certain response evaluated
on these models, such as flow simulations. Such a model is "a model of what we don't know". Therefore,
modeling uncertainty is more than just cranking the random numbers of the same model a few more
times (Caers, 2011); it requires a conceptual change in thinking. What is our state of knowledge, what is
our lack of understanding, and how do we quantify this?
Often critical parameters in this uncertainty model need to be identified because too many parameters
are uncertain for any model of uncertainty to be useful. Just like 3D models that are constrained to data,
models of uncertainty need to be constrained to data. But what does this mean? Does it simply mean
that every model in the set of 3D models generated needs to match the data in the same fashion? To
judge such conditioning, we introduce the concept of “internal consistency”. In order not to invent yet
another term, I borrowed this notion from statistical science where this refers to the extent to which
tests or procedures assess the same characteristic, skill or quality. It is a measure of the precision
between the observers or of the measuring instruments used in a study. For example, a researcher
designs a questionnaire to find out about college students' dissatisfaction with a particular textbook.
Analyzing the internal consistency of the survey items dealing with dissatisfaction will reveal the extent
to which items on the questionnaire focus on the notion of dissatisfaction. In this paper we will not
necessarily follow this literal interpretation, but will focus on the fact that if two procedures test the
same “skill” or “property”, then their scores should be similar. The property being studied here is the
conditioning of various methods in reservoir modeling, be it to well-log or production data.
At first, we will focus on a simple test: if models of uncertainty are conditioned to a random set of data,
then the conditional model of uncertainty should be the same as the unconditional model of
uncertainty. If that is not the case, then the conditioning method is internally inconsistent with the
theory that links conditional and unconditional models. We will see, for example, that conditioning
techniques that perfectly match the data can in fact be internally inconsistent.
Internal consistency for Earth models
Two schools of thought
Internal consistency requires a theory, hypothesis or objective; therefore we call it "internal". There is
no such thing as absolute internal consistency or "external consistency". In conditioning we can use
Bayes' rule as such a theory. But it should be stated that we do not need to use Bayes' rule; we could
invent other rules. As long as we stay consistent with those rules, we know what we are doing and
are following a scientifically rigorous path. Bayes' rule is very simple:
f_{M|D}(\mathbf{m}\,|\,\mathbf{d}) = \frac{f_{D|M}(\mathbf{d}\,|\,\mathbf{m})\, f_M(\mathbf{m})}{f_D(\mathbf{d})}
where M is the random vector describing a gridded Earth model, D is the random vector describing the
data outcomes. Bayes’ rule basically states the relationship between the prior or unconditional model of
uncertainty fM and the conditional or posterior model of uncertainty fM|D. Once you choose the prior and
the likelihood function, the posterior is fixed; you cannot choose it any longer independently without
being internally inconsistent with Bayes’ rule.
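As a concrete illustration of this last point, Bayes' rule can be evaluated on a tiny discrete example: once the prior and the likelihood of the observed data are chosen, the posterior follows mechanically. This is a minimal sketch; the three candidate models and all probability values are invented for illustration only.

```python
import numpy as np

# Invented discrete illustration of Bayes' rule (not from the paper):
# three candidate Earth models and one observed datum d_obs.
prior = np.array([0.2, 0.5, 0.3])          # f_M(m) over three candidate models
likelihood = np.array([0.7, 0.2, 0.1])     # f_{D|M}(d_obs | m) for each model

evidence = np.sum(likelihood * prior)      # f_D(d_obs), the normalizing constant
posterior = likelihood * prior / evidence  # f_{M|D}(m | d_obs), fully determined

print(posterior)        # no freedom left: any other choice is inconsistent
print(posterior.sum())  # 1.0, as any proper posterior must satisfy
```

Choosing any other posterior for these same prior and likelihood values would, in the terminology above, be internally inconsistent with Bayes' rule.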
Figure 1: flowchart depiction of two schools of thought
Generally two schools of thought have prevailed in geostatistics, see Figure 1. The Bayesian school of
thinking prescribed that one should explicitly state the prior and likelihood, then use Bayes’ rule to
determine the posterior and then sample as accurately as possible from this posterior distribution. By
“accurate” sampling, we mean uniform sampling. This way of thinking is very rigorous, but also very
cumbersome. First, very few explicit multi-variate distributions are known. Secondly, parameterizing and
stating explicitly the parameters for these distributions is tedious and, as a result, simple models, such as
the multi-Gaussian with simplified parameter assumptions (such as homoscedasticity in the variance), are
assumed. Last but not least, sampling methods such as MCMC are impractical if the data-model
relationship is complex or CPU-demanding to evaluate. But the Bayesian view is internally consistent.
In juxtaposition, there is a more “frequentist approach” to modeling, namely, that any set of 3D Earth
models represents a model of uncertainty, whether or not these models are conditional or
unconditional. In their view, there is no need for specifying any distribution functions explicitly. As long
as one can create models, one is fine. Any “algorithm” can be viewed as a model and this algorithm can
be conditional or unconditional. Bayes’ rule is used in various ways, for example to constrain to data, but
it is generally not used in an explicit way to link conditional and unconditional models of uncertainty. But
are models created in this way internally consistent? Let’s consider an example.
Testing for internal consistency
We design a simple test for internal consistency between the conditioning mechanism, the theory and
the models of uncertainty. Recall that as theory we chose Bayes' rule, a subjective choice, but a choice
nonetheless. Our test works as follows, for both the Bayesian and frequentist views. We assume
that the data-model relationship is given by a forward model, namely

\mathbf{d} = g(\mathbf{m})
In the Bayesian world the test would be:
1. Sample m from the prior fM
2. Generate d from d = g(m)
3. Specify the likelihood fD|M and therefore the posterior
4. Sample from the posterior fM|D(m|d)
For the frequentist view, one has the same test, but with different ways of expressing it in practice:
1. Generate an unconditional simulation m using an unconditional algorithm
2. Generate d from d=g(m)
3. Generate a conditional simulation m|d using the conditional algorithm
What is the purpose here? In repeating this workflow, we obtain multiple conditional Earth models. The
distribution of these conditional Earth models should be exactly the same as the unconditional or prior
model, indeed,

\int f_{M|D}(\mathbf{m}\,|\,\mathbf{d})\, f_D(\mathbf{d})\, d\mathbf{d} = \int \frac{f_{D|M}(\mathbf{d}\,|\,\mathbf{m})\, f_M(\mathbf{m})}{f_D(\mathbf{d})}\, f_D(\mathbf{d})\, d\mathbf{d} = f_M(\mathbf{m}) \int f_{D|M}(\mathbf{d}\,|\,\mathbf{m})\, d\mathbf{d} = f_M(\mathbf{m}) \qquad (1)
This makes sense, since we randomize the data in such a way that it is consistent with the prior. Note
that in this derivation we used “our theory”, namely Bayes’ rule.
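Eq. (1) can be verified numerically in a toy setting where the posterior is known in closed form. The sketch below assumes a Gaussian prior m ~ N(0,1) and a linear forward model d = g(m) = m plus Gaussian noise, an invented setup rather than the paper's reservoir models: conditional samples, marginalized over the randomized data, should reproduce the prior exactly.

```python
import numpy as np

# Assumed toy setup: prior m ~ N(0,1), forward model d = m + noise, noise ~ N(0, s2).
rng = np.random.default_rng(0)
n, s2 = 200_000, 0.5

m_prior = rng.normal(0.0, 1.0, n)               # step 1: sample m from f_M
d = m_prior + rng.normal(0.0, np.sqrt(s2), n)   # step 2: generate d = g(m)

# steps 3-4: the exact conjugate posterior is m|d ~ N(d/(1+s2), s2/(1+s2))
post_mean = d / (1.0 + s2)
post_var = s2 / (1.0 + s2)
m_post = rng.normal(post_mean, np.sqrt(post_var))

# Internal consistency: the conditional samples, pooled over the randomized
# data, should have the same distribution as the prior N(0,1).
print(m_post.mean(), m_post.var())   # close to 0.0 and 1.0
```

An internally inconsistent sampler, such as one that systematically shrinks the posterior, would fail this check: the pooled variance would come out well below 1.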
Figure 2: outline of the simple Boolean model
A simple example
To illustrate this internal consistency test, we use a simple but perhaps baffling example of what can go
wrong. Consider a 1D grid, see Figure 2. The prior model is a simple Boolean model: it consists of placing
exactly five objects on this line, where each object has a length drawn from a uniform distribution on
[minL, maxL]. The objects can overlap. As data, we consider the exact observation of absence/presence
of an object at the middle location, see Figure 2. We define the random variable A(i), where A(i)=1
indicates that an object is present at location i, and A(i)=0 that it is absent.
Consider a simple conditioning method, i.e. a method for generating conditional realizations. Clearly we
have two cases: either A(50)=1 or A(50)=0. In the first case, we generate a conditional model
as follows:
1. Draw an object with a certain length from the uniform distribution
2. Place it uniformly around the conditioning location 50
3. Generate the four remaining objects
When A(50)=0, we do the following:
1. Draw the length of a single object
2. Place it uniformly at those locations that do not violate the conditioning data A(50)=0
3. Repeat this until you have five objects
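The Boolean prior and the naive conditioning method for the A(50)=1 case can be sketched as follows. The grid size of 100 cells and the length bounds [5, 15] are assumed values for illustration, since [minL, maxL] is not specified in the text.

```python
import numpy as np

# Assumed toy parameters: 100-cell grid, 5 objects, lengths uniform on [5, 15].
rng = np.random.default_rng(1)
NGRID, MINL, MAXL, NOBJ = 100, 5, 15, 5

def unconditional(rng):
    """Boolean prior: place exactly five objects of uniform random length."""
    a = np.zeros(NGRID, dtype=int)
    for _ in range(NOBJ):
        length = rng.integers(MINL, MAXL + 1)
        start = rng.integers(0, NGRID - length + 1)
        a[start:start + length] = 1
    return a

def naive_conditional_present(rng, loc=50):
    """Naive conditioning to A(loc)=1: one object placed uniformly over loc,
    then the four remaining objects placed unconditionally."""
    a = np.zeros(NGRID, dtype=int)
    length = rng.integers(MINL, MAXL + 1)
    # any start covering loc: loc-length+1 <= start <= loc (clipped to the grid)
    start = rng.integers(max(0, loc - length + 1), min(loc, NGRID - length) + 1)
    a[start:start + length] = 1
    for _ in range(NOBJ - 1):
        length = rng.integers(MINL, MAXL + 1)
        start = rng.integers(0, NGRID - length + 1)
        a[start:start + length] = 1
    return a

models = [naive_conditional_present(rng) for _ in range(1000)]
print(np.mean([a[50] for a in models]))   # 1.0: every realization honors the datum
```

Note that every realization matches the datum perfectly; as the text shows next, perfect matching is no guarantee of internal consistency.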
Clearly this will generate conditional simulations that, it appears, follow the Boolean model. Not quite.
Let's run our consistency test. First we need to know the marginal P(A(i)=1). This is easy: we simply
generate 1000 unconditional models and calculate their ensemble average. To run our test, we
generate N conditional models with A(50)=1 and N conditional models with A(50)=0, calculate their
ensemble averages EA_1 and EA_0, and average them according to the marginal, namely

EA(i) = P(A(50)=1)\, EA_1(i) + (1 - P(A(50)=1))\, EA_0(i), \qquad i = 1, \dots, 100

If the conditioning method were internally consistent, one should obtain that

EA(i) = P(A(i)=1)

that is, get back the marginal as stated in Eq. (1). Figure 3 shows that this is not the case: the marginal is
shown in red, while the result from our test is shown in blue. The conclusion is that this method is not
internally consistent. What happened? To further analyze this puzzling result, we use a sampler in the
rigorous Bayesian sense that is known to be exact: the rejection sampler. In rejection sampling we
simply generate a model and accept it if it matches the data, and reject it otherwise. Consider first rejection
sampling when A(50)=0. In Figure 4 we plot the ensemble average (conditional mean) of 1000 rejection
sampler results together with the ensemble average of the simple conditioning method. We get a perfect
match. The result is however different for the case A(50)=1, see Figure 4. Indeed, it seems that the
average size of objects generated with our technique is too small. This makes sense, because an
observation of an object is more likely when the object is large than when it is small, a fact that
was not considered in our simple conditioning method.
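The rejection sampler for this toy case can be sketched as follows, again under assumed parameters (100-cell grid, five objects, lengths uniform on [5, 15]): draw unconditional Boolean realizations and keep only those that honor the datum A(50)=1.

```python
import numpy as np

# Assumed toy parameters, as before: 100-cell grid, 5 objects, lengths on [5, 15].
rng = np.random.default_rng(2)
NGRID, MINL, MAXL, NOBJ = 100, 5, 15, 5

def unconditional(rng):
    """Boolean prior: place exactly five objects of uniform random length."""
    a = np.zeros(NGRID, dtype=int)
    for _ in range(NOBJ):
        length = rng.integers(MINL, MAXL + 1)
        start = rng.integers(0, NGRID - length + 1)
        a[start:start + length] = 1
    return a

# Rejection sampling: accept a model if it matches the datum, reject otherwise.
accepted = []
while len(accepted) < 500:
    a = unconditional(rng)
    if a[50] == 1:
        accepted.append(a)

cond_mean = np.mean(accepted, axis=0)   # conditional mean, exact up to MC error
print(cond_mean[50])                    # 1.0 by construction
```

Because accepted models are drawn directly from the prior, large objects are accepted more often, exactly the size effect the naive method misses; comparing this `cond_mean` against the naive method's ensemble average reproduces the discrepancy shown in Figure 4.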
Figure 3: results of the internal consistency test
(Figure 3 axes: location along the 1D grid, 0 to 100; probability, roughly 0.20 to 0.50; curves for the marginal P(A=1) and the test result.)
Figure 4: comparing rejection sampler with simple conditioning for both cases A(50)=0 and A(50)=1
Figure 5: example TI and single hard data
More examples
In the current and previous SCRF reports we have provided several example illustrations of this internal
consistency property. I will briefly summarize the results.
Figure 6: ensemble average of the rejection sampler and conditional MPS using snesim
Conditional MPS simulation
The conditional MPS simulation algorithms often differ from the unconditional ones, just as in the
Boolean example, this can lead to internal inconsistency. Consider the simple example shown in Figure
5. A single hard conditioning data indicating sand is located in the center. A training image is given on
the left of simple sinuous channels. Consider now conditioning first using rejection sampler. 150 models
are created that match the data. The rejection sampler uses the unconditional version of snesim. The
ensemble average is provided on the left in Figure 6. Next 150 models are created using the same
snesim algorithm, but now the conditional version. Clearly the ensemble average in Figure 6 differs from
the rejection sampler, meaning that there exists an internal consistency problem between conditional
and unconditional snesim. Where does this problem occur? Since snesim works on multi-grids, the single
hard data needs to be relocated to the nearest coarse grid node. This data re-location does not occur
when performing unconditional simulation; hence this is the source of the discrepancy. In the work of
Honarkhah (2011), this problem is resolved using a different data relocation algorithm; see his work for
details. Figure 7 shows that his dispat code indeed has results comparable to rejection sampling,
even in cases with complex data configurations, as opposed to the simple data relocation implemented
in snesim.
Figure 7: results from dispat
History matching by ensemble Kalman filters
Ensemble Kalman filters (EnKF) have recently become popular in research on methods for obtaining
multiple history-matched models. In this method a set of initial or prior reservoir models is generated.
Next, a first time step of the flow simulation is executed and dynamic variables are calculated. This initial
set of models is then updated (linearly) as a whole using the difference between the field data and the
response of the initial set, as well as the covariance matrix between the static and dynamic variables.
The theory requires that the models are multi-Gaussian and that the relationship between data and model
is linear (or almost linear). This update is repeated until the last time step of the flow simulation. Consider
an example in Figure 8. A simple injector and producer configuration is shown on a 31x31 grid. As prior
model, we have a training image, see Figure 9, from which a set of initial reservoir models can be
generated, see Figure 10. To apply the EnKF method to these clearly non-Gaussian fields, we use the
metric ensemble Kalman filter (Park, 2011), an adaptation of the ensemble Kalman filter performed
after kernel transformation. Figures 11-12 show that "success" is obtained by generating 30 models that
reproduce the training image patterns as well as match the data. So, to the eye of the innocent
bystander, everything seems perfect. Consider however generating 30 history-matched models using the
rejection sampler. When comparing the conditional variance of the permeability fields generated
using these two techniques, see Figure 13, we notice the low variance of the ensemble Kalman filter as
compared to the rejection sampler. Clearly the linear and Gaussian hypotheses of the Kalman filter lead
to internal inconsistency.
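The linear ensemble update described above can be sketched as a single EnKF analysis step. The ensemble sizes, the stand-in linear forward model and the noise levels below are all assumptions made for illustration; this is the plain linear, Gaussian update the text refers to, not the metric EnKF of Park (2011).

```python
import numpy as np

# Assumed shapes: ensemble of static model vectors M (n_m x n_e) and their
# simulated dynamic responses D (n_d x n_e); d_obs is the field data.
rng = np.random.default_rng(3)
n_m, n_d, n_e = 50, 4, 30

M = rng.normal(size=(n_m, n_e))                  # prior ensemble of models
G = rng.normal(size=(n_d, n_m)) / np.sqrt(n_m)   # stand-in linear forward model
D = G @ M                                        # ensemble of responses
d_obs = rng.normal(size=n_d)                     # synthetic "field" data
R = 0.1 * np.eye(n_d)                            # observation-error covariance

# Ensemble anomalies and covariances between static and dynamic variables
Ma = M - M.mean(axis=1, keepdims=True)
Da = D - D.mean(axis=1, keepdims=True)
C_md = Ma @ Da.T / (n_e - 1)                     # cov(model, response)
C_dd = Da @ Da.T / (n_e - 1)                     # cov(response, response)

# Kalman gain and linear update of the whole ensemble (perturbed observations)
K = C_md @ np.linalg.inv(C_dd + R)
D_pert = d_obs[:, None] + rng.multivariate_normal(np.zeros(n_d), R, n_e).T
M_updated = M + K @ (D_pert - D)

print(M_updated.shape)   # (50, 30): same ensemble, shifted toward the data
```

The update is a linear combination of the prior ensemble members, which is exactly why it can only characterize a multi-Gaussian posterior; applied to channelized fields, this linearity is the source of the variance underestimation seen in Figure 13.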
Figure 8: setup problem for the EnKf
Figure 9: training image
In the work of Park (2011), a different approach is taken that requires neither linearity nor Gaussianity.
His distance-based approach has been tested on the same example, resulting in Figure 13. Clearly, while
not yet perfect, he achieves a much greater randomization than the EnKF, while matching the data
equally well.
Figure 10: unconditional simulation
Figure 11: conditional simulation
Figure 12: history matching with rejection sampler and Enkf
Figure 13: comparing conditional variance
(Figure 13 panels: rejection sampler, 10,000 forward simulations; EnKF, 30 forward simulations; distance methods.)
Figure 14: 9 wells of a real case and the surface-based model
History matching by means of optimization
The shortcoming of the EnKF method is that it is basically a kriging-type approach (linear updating,
covariances); hence it only works for posterior determination if that posterior is multi-Gaussian. One can
therefore question how well other "optimization" techniques work in terms of internal consistency when
applied to the history matching problem, since EnKF, like kriging, is basically an optimization technique.
Figure 15: (top) matching performance, (bottom)prediction performance
(Figures 14-15 annotations: the data are rock type and boundaries of surfaces at the wells; the model is surface-based; the questions are how well we match the 8 wells and how well we predict the one remaining well; continued matching increases bias and decreases variance.)
Consider therefore a more complex example, studied in Bertoncello (2011). In his work, a complex
surface-based forward model is built, see Figure 14, based on various input parameters such as length
and height of the lobes, origin location, migration and progradation statistics as well as several rules
related to the deposition and erosion of these bodies. Spatial uncertainty is modeled by placing the lobe
surfaces in various different positions. A complex but geologically realistic model can be created. In
order to fit such a model to data, such as well data, one can execute an iterative trial-and-error type
optimization algorithm that modifies the lobe parameters and placement such that some objective
function quantifying the mismatch between data and model is minimized.
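The trial-and-error matching loop described above can be sketched with an invented one-parameter stand-in for the lobe model and a quadratic mismatch objective; the real surface-based forward model of Bertoncello (2011) is of course far more complex, with many parameters and geological rules.

```python
import numpy as np

# Invented stand-in: a single "lobe parameter" theta, a toy forward model,
# and synthetic well data generated at a known theta plus noise.
rng = np.random.default_rng(4)

def forward(theta):
    """Stand-in forward model: predicted well responses from the lobe parameter."""
    return np.array([np.sin(theta), np.cos(theta), theta ** 2])

d_wells = forward(0.8) + rng.normal(0.0, 0.01, 3)   # synthetic "well" data

theta, best = 0.0, np.inf
for _ in range(2000):
    trial = theta + rng.normal(0.0, 0.1)            # perturb the parameter
    mismatch = np.sum((forward(trial) - d_wells) ** 2)
    if mismatch < best:                             # keep only improvements
        theta, best = trial, mismatch

print(best)   # the objective function decreases as iterations proceed
```

A single such run yields one well-matched model; the text's point is that repeating this from different starting solutions tends to produce a set of matched models whose spread is too narrow to be a valid model of uncertainty.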
In order to check how predictive such optimization is, consider a realistic case outlined in Bertoncello
(2011). A surface-based model is matched iteratively to 8 wells, with the aim of predicting the outcome
of a 9th well in a real-case dataset. Figure 15 shows that as the iteration proceeds, the match is getting
better. Several models were matched, each starting from a different initial solution, providing an
envelope of mismatches. All models reached a good match as long as the optimization is run long
enough. How well do these models predict the 9th well? The prediction is good up to a certain amount of
matching. Clearly when the iterations are run a long time, the predictions start to deteriorate,
meaning that the solutions span an uncertainty space that has become too narrow and also biased.
Focusing too much on matching data may therefore create poor models of uncertainty. There may be
various reasons for this in this example. The forward model may not accurately capture reality, hence
any strict matching leads to a bias. Secondly, optimization methods tend to provide a too narrow space
of uncertainty.
Does it matter?
One can wonder whether this sudden focus on internal consistency really matters. Consider again the
MPS conditional simulation and consider now predicting flow in a neighboring producer well, see Figure
16. Clearly the uncertainty in terms of flow can be highly affected.
Nevertheless, one should have a broader discussion on internal consistency than this simple example. If
we return to Bayes' rule, then the role of the prior becomes important. Often we have a good handle on
the likelihood, that is, how well we should match the data. The problem in modeling uncertainty often
lies in the prior. Clearly in the above examples, the data was matched, but it was matched "incorrectly".
In previous years, we put a lot of emphasis on geological consistency. This geological consistency issue is
not the cause of the incorrect matching: all models reproduce the prior statistics and the data, but the
posterior has become inconsistent with the prior. Does this matter? It matters if considerable effort has
been put into constructing the prior, as is for example the case with multiple training images. If
however the prior is multi-Gaussian, then, in my opinion, there is no need to be consistent with it since
it is already a fabricated model that is not very much in tune with reality.
Figure 16: the consequence in terms of flow prediction of internal inconsistency
References
Bertoncello, A., 2011. Conditioning of Surface-Based Models to Wells and Thickness Maps. PhD
dissertation, Stanford University.
Caers, J., 2011. Modeling Uncertainty in the Earth Sciences. Wiley-Blackwell. 250p.
Honarkhah, M., 2011. Stochastic Simulation of Patterns Using Distance-based Pattern Modeling. PhD
dissertation, Stanford University.
Park, K., 2011. Modeling Uncertainty in Metric Space. PhD dissertation, Stanford University.
(Figure 16 details: injector and producer on the grid with a single hard data; plot of % water produced versus time, 0 to 8000, with P10 and P90 curves comparing the rejection sampler and conditional simulation.)