Post on 20-Dec-2015
Lecture 4:
Practical Examples
Remember this?
$$m^{est} = m_A + M\,[\,d^{obs} - G m_A\,]$$
where $M = [\,G^T C_d^{-1} G + C_m^{-1}\,]^{-1} G^T C_d^{-1}$
It’s exactly the same as solving this equation
$$\begin{bmatrix} C_d^{-1/2} G \\ C_m^{-1/2} \end{bmatrix} m = \begin{bmatrix} C_d^{-1/2} d \\ C_m^{-1/2} m_A \end{bmatrix}$$
which has the form Fm=h by simple least-squares!
$$m = [F^T F]^{-1} F^T h$$
This form of the equation is usually easier to set up
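In the uncorrelated case, the equation simplifies to
$$\begin{bmatrix} \sigma_d^{-1} G \\ \sigma_m^{-1} I \end{bmatrix} m = \begin{bmatrix} \sigma_d^{-1} d \\ \sigma_m^{-1} m_A \end{bmatrix}$$
each data equation is weighted by the reciprocal of the standard deviation, $\sigma_d$, of that datum
each prior equation is weighted by the reciprocal of the standard deviation, $\sigma_m$, of that prior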
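The equivalence of the two forms can be checked numerically. The following is a minimal sketch, assuming a small random problem with uncorrelated errors; the sizes, the random values and the names `sigma_d`, `sigma_m` and `Mmat` are illustrative choices, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small problem: N data, M model parameters (illustrative values)
N, M = 8, 3
G = rng.normal(size=(N, M))
d_obs = rng.normal(size=N)
m_A = rng.normal(size=M)              # prior model
sigma_d = np.full(N, 0.5)             # data standard deviations
sigma_m = np.full(M, 2.0)             # prior standard deviations

# Form 1: m_est = m_A + M [d_obs - G m_A]
Cd_inv = np.diag(1.0 / sigma_d**2)
Cm_inv = np.diag(1.0 / sigma_m**2)
Mmat = np.linalg.solve(G.T @ Cd_inv @ G + Cm_inv, G.T @ Cd_inv)
m_est_1 = m_A + Mmat @ (d_obs - G @ m_A)

# Form 2: stack the weighted data and prior equations into F m = h
F = np.vstack([G / sigma_d[:, None], np.diag(1.0 / sigma_m)])
h = np.concatenate([d_obs / sigma_d, m_A / sigma_m])
m_est_2, *_ = np.linalg.lstsq(F, h, rcond=None)

print(np.allclose(m_est_1, m_est_2))
```

The second form only needs rows to be appended to `F` and `h`, which is why it tends to be easier to set up when several kinds of prior information are combined.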
Example 1
1D Interpolation
Find a function f(x) that
1) goes through all your data points (observations)
2) does something smooth in between (prior information)
This is interpolation … but
why not just use least-squares?
m – a vector of all the points at which you want to estimate the function, including the points for which you have observations
d – a vector of just those points where you have observations
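So the equation Gm=d is very simple: a model parameter equals a datum wherever the corresponding observation is available:
$$\begin{bmatrix} & \vdots & \\ \cdots\;0\;\cdots\;0 & 1 & 0\;\cdots\;0\;\cdots \\ & \vdots & \end{bmatrix} \begin{bmatrix} \vdots \\ m_i \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots \\ d_j \\ \vdots \end{bmatrix}$$
Just a single “1” per row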
You then implement a smoothness constraint by first developing a matrix D that computes the non-smoothness of m
One possibility is to use the finite-difference approximation of the second derivative
$$D = \begin{bmatrix} & & & \vdots & & & \\ \cdots & 0 & 1 & -2 & 1 & 0 & \cdots \\ & & & \vdots & & & \end{bmatrix}$$
And by realizing that:
maximizing smoothness is the same as minimizing $|Dm|^2$
and minimizing $|Dm|^2$ is the same as choosing $C_m^{-1} \propto D^T D$ (along with $m_A = 0$).
First derivative
$$[dm/dx]_i \approx (1/\Delta x)\,(m_i - m_{i-1}) \propto m_i - m_{i-1}$$
Second derivative
$$[d^2m/dx^2]_i \propto [dm/dx]_{i+1} - [dm/dx]_i = m_{i+1} - m_i - m_i + m_{i-1} = m_{i+1} - 2m_i + m_{i-1}$$
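As a quick check of the second-difference operator, here is a small sketch that builds D for the interior points only and applies it to a straight line and to a parabola. The helper name `second_difference_matrix` and the test functions are assumptions made for illustration.

```python
import numpy as np

def second_difference_matrix(n):
    """Rows (1, -2, 1) centered on each interior point; shape (n-2, n)."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

x = np.linspace(0.0, 1.0, 11)           # spacing dx = 0.1
D = second_difference_matrix(len(x))

rough_line = D @ (3.0 * x + 2.0)        # a straight line has zero roughness
rough_parab = D @ (x**2)                # a parabola gives the constant 2*dx**2

print(np.abs(rough_line).max(), rough_parab[0])
```

Because the factor 1/Δx² is dropped, D measures roughness only up to a constant, which is harmless here since that constant is absorbed into the weight given to the smoothness equations.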
So the F m = h equation is:
$$\begin{bmatrix} G \\ \varepsilon D \end{bmatrix} m = \begin{bmatrix} d \\ 0 \end{bmatrix}$$
$\varepsilon$ is a damping parameter that represents the relative weight of the smoothness constraint, that is, how certain we are that the solution is smooth.
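$$\begin{bmatrix}
1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 0 & \cdots & 0 & 1 & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & 0 & 0 & 1 \\
-\varepsilon & \varepsilon & 0 & \cdots & 0 & 0 \\
\varepsilon & -2\varepsilon & \varepsilon & \cdots & 0 & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & \cdots & \varepsilon & -2\varepsilon & \varepsilon \\
0 & 0 & \cdots & 0 & -\varepsilon & \varepsilon
\end{bmatrix} m = \begin{bmatrix} d_1 \\ d_7 \\ \vdots \\ d_N \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}$$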
Example
101 points equally spaced along the x-axis
So 101 values of the function f(x)
40 of these values measured (the data, d); the rest are unknown
Two kinds of prior information:
minimize 2nd derivative for the 99 interior x’s
minimize 1st derivative at the left and right x’s
(nice to have the same number of priors as unknowns, but not required)
$\varepsilon = 10^{-6}$
[Figure: data and result, f(x) versus x]
$\varepsilon$ can be chosen by trial and error
but usually the result is fairly insensitive to $\varepsilon$, as long as it's small
[Figure: $\log_{10}$(total error) versus $\log_{10}(\varepsilon)$, with $\varepsilon$ varied over six orders of magnitude]
A purist might say that this is not really interpolation, because the curve goes through the data only in the limit $\varepsilon \to 0$
but for small $\varepsilon$’s the error is extremely small
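The interpolation example can be sketched in a few lines. This is a minimal sketch under stated assumptions: the test function `sin(2πx)`, the random choice of the 40 observed points and the helper name `interpolate` are all hypothetical; only the sizes (101 points, 40 data, 99 + 2 priors) and the small damping value follow the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 101                                    # number of model parameters
x = np.linspace(0.0, 1.0, n)
f_true = np.sin(2 * np.pi * x)             # hypothetical test function
obs_idx = np.sort(rng.choice(n, size=40, replace=False))
d = f_true[obs_idx]                        # 40 noise-free observations

# G: a single 1 per row, picking out the observed model parameters
G = np.zeros((len(obs_idx), n))
G[np.arange(len(obs_idx)), obs_idx] = 1.0

# D: first differences at the two ends, second differences in the interior
D = np.zeros((n, n))
D[0, :2] = [-1.0, 1.0]
for i in range(1, n - 1):
    D[i, i - 1:i + 2] = [1.0, -2.0, 1.0]
D[-1, -2:] = [-1.0, 1.0]

def interpolate(eps):
    """Solve [G; eps*D] m = [d; 0] by least squares."""
    F = np.vstack([G, eps * D])
    h = np.concatenate([d, np.zeros(n)])
    m, *_ = np.linalg.lstsq(F, h, rcond=None)
    return m

m_est = interpolate(1e-6)
misfit = np.max(np.abs(m_est[obs_idx] - d))
print(misfit)
```

Calling `interpolate` with several damping values and comparing the misfits is one way to repeat the trial-and-error experiment described above.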
Example 2
Reconstructing 2D data known to obey a differential equation
$\nabla^2 f = 0$
e.g. f(x,y) could be temperature
[Figure: square grid with 21 unknowns along each side]
21×21=441 unknowns
44 observed data
Prior information:
$\nabla^2 f = d^2f/dx^2 + d^2f/dy^2 = 0$ in interior of the box
$n \cdot \nabla f = 0$ on edges of box
(sides of box are insulating)
The biggest issue here is bookkeeping
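Conceptually, the model parameters are on an n×m grid $m_{ij}$
But they have to be reorganized into a vector $m_k$ to do the calculations
$$\begin{bmatrix} m_{11} & m_{12} & m_{13} & \cdots & m_{1n} \\ m_{21} & m_{22} & m_{23} & \cdots & m_{2n} \\ m_{31} & m_{32} & m_{33} & \cdots & m_{3n} \\ \vdots & & & & \vdots \\ m_{m1} & m_{m2} & m_{m3} & \cdots & m_{mn} \end{bmatrix} \;\rightarrow\; \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ \vdots \\ m_{nm} \end{bmatrix}$$
e.g. $m_{ij} \rightarrow m_k$ with k=(i-1)*n+j, counting row by row with n entries per row (for a square grid, as in this example, n and m are interchangeable)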
Thus a large percentage of the code is concerned with converting back and forth between positions in the grid and positions in the corresponding vector. It can look pretty messy!
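A sketch of that bookkeeping is shown below, assuming 1-based indices, row-by-row ordering and unit grid spacing. The helper names `ij_to_k` and `k_to_ij` are hypothetical, and only the interior Laplacian equations are built; the edge conditions are left out to keep the example short.

```python
import numpy as np

def ij_to_k(i, j, ncols):
    """1-based grid position (i, j) -> 1-based vector index k, row by row."""
    return (i - 1) * ncols + j

def k_to_ij(k, ncols):
    """Inverse of ij_to_k."""
    i = (k - 1) // ncols + 1
    j = (k - 1) % ncols + 1
    return i, j

nrows, ncols = 21, 21

# One prior equation per interior grid point: the 5-point Laplacian is zero
rows = []
for i in range(2, nrows):
    for j in range(2, ncols):
        row = np.zeros(nrows * ncols)
        row[ij_to_k(i, j, ncols) - 1] = -4.0
        for ii, jj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            row[ij_to_k(ii, jj, ncols) - 1] = 1.0
        rows.append(row)
L = np.array(rows)

print(L.shape, ij_to_k(21, 21, ncols), k_to_ij(441, ncols))
```

Keeping the two conversions in small functions confines the index arithmetic to one place, so the rest of the code can refer to grid positions only.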
[Figure: results]
[Figure: comparison]
Example 3
Linear Systems
Scenario 1: no past history needed
Flame with time-varying heat h(t)
Thermometer measuring temperature $\theta(t)$
Flame instantaneously heats the thermometer
Thermometer retains no heat
$\theta(t) \propto h(t)$
Scenario 2: past history needed
Flame with time-varying heat h(t)
Thermometer measuring temperature $\theta(t)$
Heat takes time to seep through the steel plate
Plate retains heat
$\theta(t=t')$ depends on the history of h(t) for time t<t’
How to write a Linear System
$\theta(t)$ depends on the history of h(t’) for all times in the past
$$\theta(t_0) = \cdots + g_0 h(t_0) + g_1 h(t_{-1}) + g_2 h(t_{-2}) + g_3 h(t_{-3}) + g_4 h(t_{-4}) + \cdots$$
$$\theta(t_1) = \cdots + g_0 h(t_1) + g_1 h(t_0) + g_2 h(t_{-1}) + g_3 h(t_{-2}) + g_4 h(t_{-3}) + \cdots$$
g is called the “impulse response” of the system
Matrix formulations
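$$\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_N \end{bmatrix} = \begin{bmatrix} g_0 & 0 & 0 & \cdots & 0 \\ g_1 & g_0 & 0 & \cdots & 0 \\ \vdots & & & & \vdots \\ g_N & \cdots & g_2 & g_1 & g_0 \end{bmatrix} \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_N \end{bmatrix} = G h$$
Note problem with parts of the equation being “off the ends” of the matrix
This formulation might be especially useful when we know $\theta$ and g and want to find h
$$\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_N \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 & \cdots & 0 \\ h_1 & h_0 & 0 & \cdots & 0 \\ \vdots & & & & \vdots \\ h_N & \cdots & h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} g_0 \\ g_1 \\ \vdots \\ g_N \end{bmatrix} = H g$$
This formulation might be especially useful when we know $\theta$ and h and want to find g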
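The two matrix formulations describe the same convolution, which a short check makes concrete. This is a sketch with random placeholder values for g and h; the helper name `convolution_matrix` is an assumption, and samples before the start of the record are taken to be zero, which is the “off the ends” issue noted above.

```python
import numpy as np

def convolution_matrix(v, n):
    """Lower-triangular Toeplitz matrix: v[0] on the diagonal, v[1] below it, ..."""
    T = np.zeros((n, n))
    for lag in range(min(len(v), n)):
        T += np.diag(np.full(n - lag, v[lag]), k=-lag)
    return T

rng = np.random.default_rng(2)
n = 6
g = rng.normal(size=n)                  # placeholder impulse response
h = rng.normal(size=n)                  # placeholder heating history

G = convolution_matrix(g, n)
H = convolution_matrix(h, n)

theta_G = G @ h                         # theta = G h
theta_H = H @ g                         # theta = H g
theta_conv = np.convolve(g, h)[:n]      # same thing via direct convolution

print(np.allclose(theta_G, theta_H), np.allclose(theta_G, theta_conv))
```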
Thermometer measuring plate temperature $\theta$
Goal: infer “physics” of plate, as embodied in its impulse response function, g
Thermometer measuring flame heat h
[Figure: g(t), $h^{true}(t)$ and $\theta^{true}(t)$]
Set up of problem
$\theta^{obs}(t) = \theta^{true}(t) + \text{noise}$
$h^{obs}(t) = h^{true}(t) + \text{noise}$
Simulate noisy data
Results
$g^{true}(t)$ and $g^{est}(t)$ … yuck!
Fix-up: try for shorter g(t) and use 2nd derivative damping
Damping: $\varepsilon^2 = 100$
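The fix-up can be sketched as a damped least-squares problem in the form F g = rhs. Everything synthetic here is an assumption made for illustration: the impulse response `g_true`, the random-walk heating history, the noise level and the filter length `ng`. The damping weight is set so that its square is 100, as quoted above; whether that value is suitable depends on how the data are scaled.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 200                                    # samples in h and theta
ng = 30                                    # assumed (short) filter length
t = np.arange(ng)
g_true = t * np.exp(-t / 4.0)              # hypothetical impulse response
g_true /= g_true.sum()
h_true = rng.normal(size=n).cumsum()       # hypothetical heating history
theta_true = np.convolve(h_true, g_true)[:n]

# Simulate noisy data
theta_obs = theta_true + 0.01 * rng.normal(size=n)
h_obs = h_true + 0.01 * rng.normal(size=n)

# H: n x ng matrix with H[i, j] = h_obs[i - j] (zero when i < j)
H = np.zeros((n, ng))
for j in range(ng):
    H[j:, j] = h_obs[:n - j]

# D: second differences of g (2nd derivative damping)
D = np.zeros((ng - 2, ng))
for i in range(ng - 2):
    D[i, i:i + 3] = [1.0, -2.0, 1.0]

eps = 10.0                                 # eps**2 = 100
F = np.vstack([H, eps * D])
rhs = np.concatenate([theta_obs, np.zeros(ng - 2)])
g_est, *_ = np.linalg.lstsq(F, rhs, rcond=None)

print(np.linalg.norm(g_est - g_true) / np.linalg.norm(g_true))
```

Using a tall, narrow H (200 equations for 30 unknowns) is what “try for shorter g(t)” amounts to in this sketch.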
Example 4
prediction error filter
how well does the past predict the present?
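$$\theta_5 = g_1\theta_4 + g_2\theta_3 + g_3\theta_2 + g_4\theta_1 + \cdots$$
$$\theta_6 = g_1\theta_5 + g_2\theta_4 + g_3\theta_3 + g_4\theta_2 + \cdots$$
$$\theta_7 = g_1\theta_6 + g_2\theta_5 + g_3\theta_4 + g_4\theta_3 + \cdots$$
or equivalently, with $g_0 = -1$,
$$0 = g_0\theta_5 + g_1\theta_4 + g_2\theta_3 + g_3\theta_2 + g_4\theta_1 + \cdots$$
$$0 = g_0\theta_6 + g_1\theta_5 + g_2\theta_4 + g_3\theta_3 + g_4\theta_2 + \cdots$$
$$0 = g_0\theta_7 + g_1\theta_6 + g_2\theta_5 + g_3\theta_4 + g_4\theta_3 + \cdots$$
Solve $\Theta g = 0$ by least squares with prior information $g_0 = -1$, where $\Theta$ is the matrix of $\theta$’s
use large damping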
[Figure: 20 years of Laguardia Airport temperatures; panels labeled g, filter length M = 10 days]
[Figure: panels labeled g, filter length M = 100 days]
$\theta * g$ is the unpredictable part of $\theta$
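A prediction error filter can be estimated with the same stacked least-squares machinery. The sketch below uses a synthetic second-order autoregressive series as a stand-in for real data (an assumption; the lecture uses airport temperatures), and imposes the prior g0 = -1 with a large weight `eps`, whose value is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-in for a real time series: an AR(2) process
n = 2000
theta = np.zeros(n)
e = rng.normal(size=n)
for i in range(2, n):
    theta[i] = 1.5 * theta[i - 1] - 0.7 * theta[i - 2] + e[i]

M = 10                                     # filter length
# Row for time i holds theta[i], theta[i-1], ..., theta[i-M+1]
Theta = np.array([theta[i - M + 1:i + 1][::-1] for i in range(M - 1, n)])

# Prior information g0 = -1, given a large weight
eps = 1e3 * np.abs(theta).max()
prior = np.zeros((1, M))
prior[0, 0] = eps
F = np.vstack([Theta, prior])
rhs = np.concatenate([np.zeros(len(Theta)), [-eps]])
g, *_ = np.linalg.lstsq(F, rhs, rcond=None)

pred_error = Theta @ g                     # theta * g, the unpredictable part
print(g[:3], pred_error.var(), theta.var())
```

For this synthetic series the prediction error has a much smaller variance than the series itself, since most of the series is predictable from its own past.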
Let’s try it with the Neuse River Hydrograph Dataset
Filter length M=100
What’s that?
[Figure: panels labeled g, including a close-up of the first year of data]
Note that the prediction error, $\theta * g$, is spikier than the hydrograph data, $\theta$. I think that this means that some of the dynamics of the river flow are being captured by the filter, g, and that the unpredictable part is mostly the forcing, that is, precipitation
Example 5
Tomography
Tomography: reconstructing an image from measurements made along rays
CAT scan: density image, reconstructed from X-ray absorption
Seismic Tomography: velocity image, reconstructed from seismic ray travel times
MRI: proton density image, reconstructed from radio wave emission intensity along lines of constant precession frequency
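[Figure: a ray running from source to receiver]
$$\text{data}_{\text{ray } i} = \int_{\text{ray } i} \text{model}(x,y)\, dL$$
where L is arc length
Discretize image into pixels or voxels
[Figure: the same ray, from source to receiver, crossing a grid of voxels]
$$\text{data}_{\text{ray } i} = \sum_{\text{voxel } j} \text{model}_j\, L_{ij}$$
where $L_{ij}$ is the arc length of ray i in voxel j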
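So the data kernel, G, is very simple
$$G = \begin{bmatrix} \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & L_{ij} & \cdots & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{bmatrix}$$
where $L_{ij}$ is the arc length of ray i in voxel j
Many elements will be zero
$$G = \begin{bmatrix} \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & 0 & \cdots & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{bmatrix}$$
wherever ray i does not go through voxel j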
the hard parts are:
1. computing the ray paths, if they are more complicated than straight lines
2. book-keeping, e.g. figuring out which rays pass through which voxels
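For straight-line rays the bookkeeping can be sidestepped, at some cost in accuracy, by sampling each ray densely and adding up the step length in whichever voxel each sample falls. The sketch below takes that shortcut; it is an approximation rather than an exact ray-voxel intersection, and the function name `ray_lengths`, the unit-cell grid and the two sample rays are assumptions made for illustration.

```python
import numpy as np

def ray_lengths(src, rec, nx, ny, n_samples=2000):
    """Approximate arc length of a straight ray in each cell of an nx-by-ny
    grid of unit cells covering [0, nx] x [0, ny], by dense sampling."""
    src, rec = np.asarray(src, float), np.asarray(rec, float)
    total = np.linalg.norm(rec - src)
    # Midpoints of n_samples equal sub-segments of the ray
    s = (np.arange(n_samples) + 0.5) / n_samples
    pts = src + s[:, None] * (rec - src)
    ix = np.clip(np.floor(pts[:, 0]).astype(int), 0, nx - 1)
    iy = np.clip(np.floor(pts[:, 1]).astype(int), 0, ny - 1)
    row = np.zeros(nx * ny)
    np.add.at(row, iy * nx + ix, total / n_samples)   # voxel index j = iy*nx + ix
    return row

nx, ny = 4, 3
rays = [((0.0, 0.5), (4.0, 0.5)),      # horizontal ray through the bottom row
        ((0.0, 0.0), (4.0, 3.0))]      # diagonal ray
G = np.array([ray_lengths(s, r, nx, ny) for s, r in rays])

slowness = np.full(nx * ny, 0.5)       # uniform model, m = 1/velocity
travel_time = G @ slowness
print(G[0], travel_time)
```

Each row of G sums to the total length of its ray, which is a convenient sanity check on any ray-tracing code.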
Sample seismic tomography problem
Here’s the true model, $m^{true}$
[Figure: true model, with sources and receivers]
Note: for the equation Gm=d to be linear, m must be 1/velocity, or “slowness”
Straight line ray paths
The true traveltime data, dtrue
In the previous plot, each ray is indexed by its closest distance to the origin, R, and its orientation, $\theta$
[Figure: a ray, with R and $\theta$ marked]
Each ray plots as one point on the image, with its travel time indicated by its color
[Figure: true model, $m^{true}$, and estimated model, $m^{est}$ (solution via damped least squares)]
After doubling the station/receiver density …
[Figure: true model, $m^{true}$, and estimated model, $m^{est}$]