Environmental Data Analysis with MatLab
Lecture 21: Interpolation
SYLLABUS
Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectral Density
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps
purpose of the lecture
to introduce interpolation:
the process of filling in missing data points
Scenario 1: data are collected at irregular time intervals, but you want to compute power spectral density, which requires evenly sampled data.
[Figure: A(t) versus time, sampled at irregular times; the corresponding plot of psd versus frequency is marked “?”]
Scenario 2: two datasets are collected with different sampling intervals, but you want to combine them into a scatter plot
[Figure: A(t) and B(t) versus time, sampled at different intervals; the scatter plot of A against B is marked “?”]
in both scenarios
the times at which the data are collected are inconvenient
we encountered a similar problem back in Lecture 8,
where we used prior information to fill in data gaps
[Figure: observed data with missing points, $d^{obs}(t)$ versus time]
[Figure: estimated data with missing points filled in, $d^{est}(t)$ versus time]
find $d_i^{est}$ so that
$d_i^{est} \approx d_i^{obs}$ at the observation points
and
roughness of $d_i^{est} \approx 0$ everywhere
the solution is inexact
$d_i^{est} \ne d_i^{obs}$ everywhere
and
roughness of $d_i^{est} \ne 0$ everywhere
but the inexactness isn’t a problem
because both observations and prior information have error
now we examine an alternative approach
traditional interpolation
similar, but subtly different
find $d(t)$ so that
$d(t_i) = d_i^{obs}$ at the observation points
and
roughness of $d(t) = 0$ in between the observation points
both conditions are exact
$d(t)$ is called the “interpolant”
advantage
the interpolant $d(t)$ is an analytic function that is known everywhere
can evaluate $d(t)$ at any time, $t$
can differentiate $d(t)$, integrate it, etc.
disadvantage
the observation points are singled out as special
$d(t)$ behaves differently at the observation points than between them
the interpolation problem
find an interpolant $d(t)$ that goes through all the data points
and “does something sensible” or “satisfies some prior information” between them
some obvious ideas don’t work at all
an (N-1) order polynomial can easily be constructed so that it passes through N points
so use a polynomial for d(t)
[Figure: example of a polynomial interpolant, d(t) versus time, t; two places on the curve are annotated “what happened here?” and “and here?”]
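The behavior in the example can be reproduced with a short experiment. The sketch below is not the lecture's dataset: the test function, the number of points N and the variable names are assumptions chosen for illustration. It fits an (N-1) order polynomial through N evenly spaced samples of a smooth function and compares the polynomial with that function between the samples.

```matlab
% Synthetic example (an assumption, not the lecture's data):
% N evenly spaced samples of a smooth, bump-shaped function.
N = 11;
tobs = linspace(-1, 1, N)';
dobs = 1 ./ (1 + 25*tobs.^2);

% An (N-1) order polynomial has N coefficients, so it can be made
% to pass through all N points.
p = polyfit(tobs, dobs, N-1);

% Evaluate the polynomial on a fine grid and compare it with the
% function that generated the data.
t = linspace(-1, 1, 401)';
dpoly = polyval(p, t);
dtrue = 1 ./ (1 + 25*t.^2);

% Misfit at the data points (should be near zero) and the largest
% departure between them (large near the ends of the interval).
misfit_at_data = max(abs(polyval(p, tobs) - dobs));
swing_between = max(abs(dpoly - dtrue));
fprintf('misfit at data: %.2e, largest swing: %.2f\n', ...
    misfit_at_data, swing_between);
```

The polynomial honors every data point yet departs strongly from the underlying function near the ends of the interval, which is the kind of wild swing the slide points at.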
solution
a low-order polynomial
has less potential for wild swings
so use many low-order polynomials
each valid in a small time interval
such a function is called a “spline”
simplest case
set of linear polynomials
each valid between two data points
“connect the data points with straight lines”
[Figure: d(t) versus t, a straight line connecting the observations at $t_i$ and $t_{i+1}$]
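Written out, the straight line between two neighboring observations is the standard linear interpolation formula. The symbols $d_i$ and $d_{i+1}$ for the observations at $t_i$ and $t_{i+1}$ are notation assumed here.

```latex
d(t) = d_i \, \frac{t_{i+1} - t}{t_{i+1} - t_i}
     + d_{i+1} \, \frac{t - t_i}{t_{i+1} - t_i},
\qquad t_i \le t \le t_{i+1}
```

At $t = t_i$ the formula returns $d_i$ and at $t = t_{i+1}$ it returns $d_{i+1}$, so the line passes through both observations.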
advantages
conceptually very simple
always get what you expect
zero roughness between observations
disadvantage
d(t) has kinks at observation points
[Figure: example of linear interpolation, d(t) versus time, t; a kink at an observation point is marked]
in MatLab
observations
times of interpolation
interpolated observations
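A call consistent with the three labels above can be sketched with MatLab's built-in `interp1`. The variable names `tobs`, `dobs`, `test` and `dest` follow the naming used later in the lecture, and the data values are hypothetical.

```matlab
% observations (hypothetical values, irregularly spaced in time)
tobs = [0.0; 0.1; 0.35; 0.5; 0.8; 1.0];
dobs = [1.0; 2.5; -1.0; 0.5; 3.0; -2.0];

% times of interpolation (evenly spaced)
test = (0:0.05:1)';

% interpolated observations: straight lines between the data points
dest = interp1(tobs, dobs, test, 'linear');
```

The result `dest` is evenly sampled, so it could be passed on to a calculation such as power spectral density that requires even sampling.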
getting rid of the kinks
use cubic polynomials
$S_i(t) = c_0 + c_1 t + c_2 t^2 + c_3 t^3$
each valid between two data points
cubic polynomial has 4 coefficients
two constrained by the need to pass through two data points
two to implement prior information
no kinks in d(t) or its first derivative
the trick
the second derivative of a cubic is linear
so use the linear interpolation formula for the second derivative
[Figure: 2nd derivative versus t, varying linearly between the values $y_{i-1}$, $y_i$, $y_{i+1}$ at the times $t_{i-1}$, $t_i$, $t_{i+1}$]
the second derivatives at the observation points, denoted $y_i$,
become unknowns in the problem
the second derivative is now integrated twice to give the spline function
here ai and bi are two more unknowns that arise from the integration constants
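A sketch of these two steps in one common form is given below. The notation $h_i = t_{i+1} - t_i$ is an assumption, and the particular way the integration constants $a_i$ and $b_i$ multiply the linear terms is one convenient choice; see the text for the author's own version.

```latex
% linear interpolation of the second derivative on t_i <= t <= t_{i+1}
\frac{d^2 S_i}{dt^2} = y_i \, \frac{t_{i+1} - t}{h_i}
                     + y_{i+1} \, \frac{t - t_i}{h_i},
\qquad h_i = t_{i+1} - t_i

% integrating twice gives a cubic plus a linear term
S_i(t) = y_i \, \frac{(t_{i+1} - t)^3}{6 h_i}
       + y_{i+1} \, \frac{(t - t_i)^3}{6 h_i}
       + a_i \, (t_{i+1} - t) + b_i \, (t - t_i)
```

Differentiating the second expression twice recovers the first, since the linear term in $a_i$ and $b_i$ has zero second derivative.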
finally, one finds the y’s, a’s and b’s
so that the spline
1. goes through the observations
and
2. has a first derivative that is continuous across the observation points
the solution involves solving a matrix equation for the unknowns
(see text for details)
in MatLab
observations
times of interpolation
interpolated observations
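As with linear interpolation, a call consistent with these labels can be sketched with `interp1`, here with the `'spline'` method. The data values are hypothetical. MatLab's built-in spline uses its own end conditions (not-a-knot), which may differ from the ones used in the text, so results near the ends of the time series can differ slightly from a hand-built spline.

```matlab
% observations (hypothetical values, irregularly spaced in time)
tobs = [0.0; 0.1; 0.35; 0.5; 0.8; 1.0];
dobs = [1.0; 2.5; -1.0; 0.5; 3.0; -2.0];

% times of interpolation (evenly spaced)
test = (0:0.05:1)';

% interpolated observations: piecewise cubic polynomials, no kinks
dest = interp1(tobs, dobs, test, 'spline');

% the spline still goes through the observations exactly
dcheck = interp1(tobs, dobs, tobs, 'spline');
```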
[Figure: example of cubic spline interpolation, d(t) versus time, t; the curve has no kinks]
interpolation involves
prior information of smoothness
in generalized least squares
the prior information of smoothness is quantified by a roughness matrix, $\mathbf{H}$, applied to the model as $\mathbf{Hm}$
then we minimize the overall roughness, which is to say the overall error in the prior information
$(\mathbf{Hm})^T (\mathbf{Hm})$
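note that
$(\mathbf{Hm})^T (\mathbf{Hm}) = \mathbf{m}^T (\mathbf{H}^T \mathbf{H}) \mathbf{m}$
but in generalized least squares the error in the prior information also has the form
$\mathbf{m}^T \mathbf{C}_m^{-1} \mathbf{m}$
where $\mathbf{C}_m$ is a covariance matrix
so in this case
$\mathbf{C}_m = (\mathbf{H}^T \mathbf{H})^{-1}$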
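The identity between the two forms of the roughness can be checked numerically. In this sketch the roughness matrix is taken to be a second-difference operator with unit sample spacing, and the model vectors are made up for the test; both are assumptions for illustration.

```matlab
N = 6;

% second-difference roughness matrix, (N-2) x N, unit spacing assumed
H = zeros(N-2, N);
for i = 1:N-2
    H(i, i:i+2) = [1, -2, 1];
end

m_line = (1:N)';               % straight line: no roughness
m_rough = [0; 1; 0; 1; 0; 1];  % zig-zag: very rough

% overall roughness computed two ways
R1 = (H*m_rough)' * (H*m_rough);
R2 = m_rough' * (H'*H) * m_rough;

% roughness of the straight line
Rline = (H*m_line)' * (H*m_line);
fprintf('R1 = %g, R2 = %g, Rline = %g\n', R1, R2, Rline);
```

Because a straight line has zero roughness, this particular H'*H is singular and cannot be inverted as it stands. Forming a covariance matrix from it would need extra rows in H, for instance ones that constrain the ends of the time series, which is an assumption beyond what the slides state.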
so the prior information that the data are smooth
is equivalent to the requirement that they have a specific covariance matrix
which for stationary time series is equivalent to saying that they have a specific autocorrelation function
so an alternative, more flexible way of interpolating data
is by specifying the autocorrelation function that we want the results to have
this is called Kriging (after Danie G. Krige, its inventor)
Kriging
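estimate data at arbitrary time $t_0$
as a weighted average of the observations, with weights $w_i$
determine the weights by
minimizing the variance of the error, $d_0^{est} - d_0^{true}$,
with respect to $w_i$
we’ll find that we don’t need to know $d_0^{true}$,
only its autocorrelation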
steps of the derivation
1. under two simplifying assumptions, the means approximately cancel
2. expand the square
3. insert the weighted average formula
4. identify the terms proportional to the autocorrelation
now differentiate with respect to the weight, $w_k$,
which yields the matrix equation
$\mathbf{Mw} = \mathbf{v}$
note that the autocorrelation appears on both sides of the equation, so that its overall normalization cancels out
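The derivation can be sketched as follows. This is a reconstruction consistent with the steps named above, not a copy of the slide equations; the symbol $a(\cdot)$ for the autocorrelation function, the expectation notation $E[\cdot]$ and the index conventions are assumptions.

```latex
% weighted average of the observations
d_0^{est} = \sum_i w_i \, d_i^{obs}

% variance of the error, with the means assumed to cancel
\sigma^2 \approx E\left[ (d_0^{est} - d_0^{true})^2 \right]
  = E\left[ (d_0^{est})^2 \right]
  - 2 E\left[ d_0^{est} d_0^{true} \right]
  + E\left[ (d_0^{true})^2 \right]

% insert the weighted average and identify autocorrelations
\sigma^2 \propto \sum_i \sum_j w_i w_j \, a(|t_i - t_j|)
  - 2 \sum_i w_i \, a(|t_i - t_0|) + a(0)

% differentiate with respect to w_k and set to zero
\frac{\partial \sigma^2}{\partial w_k} \propto
  2 \sum_i w_i \, a(|t_i - t_k|) - 2 \, a(|t_k - t_0|) = 0

% which is Mw = v with
M_{ki} = a(|t_k - t_i|), \qquad v_k = a(|t_k - t_0|)
```

Only the autocorrelation evaluated at time differences appears, so $d_0^{true}$ itself is never needed, and multiplying $a$ by a constant scales $\mathbf{M}$ and $\mathbf{v}$ equally and leaves $\mathbf{w}$ unchanged.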
all we need do now is specify an autocorrelation function
for example, we could use the Normal function
the variance, $L^2$, controls the width of the autocorrelation and hence the smoothness of the interpolation
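Written out, a Normal autocorrelation function with variance $L^2$ has the form below. The normalization is omitted because it cancels out of the matrix equation, and the symbol $a$ for the autocorrelation is assumed notation.

```latex
a(|t_i - t_j|) \propto \exp\left( -\frac{(t_i - t_j)^2}{2 L^2} \right)
```

A larger $L^2$ makes observations that are farther apart in time influence one another, which gives a smoother interpolant.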
In MatLab
observations: tobs, dobs
interpolated values: test, dest
Normal autocorrelation function with variance L2
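A minimal sketch of Kriging with these variable names is given below. The data are synthetic, and the value of `L2` and the small damping term `epsi` added to the diagonal of `M` are assumptions chosen to keep the example well behaved, not values taken from the lecture.

```matlab
% observations: tobs, dobs (synthetic and irregularly spaced)
tobs = [0; 10; 22; 32; 45; 55; 68; 78; 90; 100];
dobs = sin(2*pi*tobs/50);
N = length(tobs);

% times at which interpolated values are wanted
test = (0:1:100)';

% Normal autocorrelation function with variance L2
L2 = 100;

% M(i,j) = a(|tobs(i) - tobs(j)|), an N x N matrix
[TJ, TI] = meshgrid(tobs, tobs);
M = exp(-(TI - TJ).^2 / (2*L2));

% V(i,k) = a(|tobs(i) - test(k)|), one column v per interpolation time
[T0, TI0] = meshgrid(test, tobs);
V = exp(-(TI0 - T0).^2 / (2*L2));

% solve M w = v for all interpolation times at once;
% epsi is a small damping (hypothetical value) for numerical stability
epsi = 1e-6;
W = (M + epsi*eye(N)) \ V;

% interpolated values: weighted averages of the observations
dest = W' * dobs;
```

Solving for all columns of `V` in one backslash operation avoids a loop over interpolation times. At the observation times the estimate reproduces the data to within the small error introduced by the damping.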
[Figure: Example. A) Kriging, B) Generalized Least Squares; each panel shows d(t) versus time, t]
Interpolation in two dimensions
construct an interpolant $d(x,y)$ that goes through the observations
and does something sensible in between
1 dimension: the point of interest, $t_0$, is bracketed by the observations at $t_i$ and $t_{i+1}$ (a segment of the t-axis)
2 dimensions: the notion of bracketing observations is more complicated; the point of interest, $(x_0, y_0)$, is enclosed by a triangular tile
Delaunay triangles
the set of most nearly equilateral triangles connecting the data points
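The triangulation and the search for the enclosing triangle can be sketched with the built-in functions `delaunay` and `tsearchn`. The observation locations and the point of interest `(x0, y0)` are hypothetical.

```matlab
% hypothetical observation locations: four corners and one interior point
xobs = [0; 40; 40; 0; 18];
yobs = [0; 0; 40; 40; 22];

% each row of tri holds the indices of the three vertices of a triangle
tri = delaunay(xobs, yobs);

% find the triangle that encloses a point of interest
x0 = 30;
y0 = 15;
k = tsearchn([xobs, yobs], tri, [x0, y0]);

% indices of the three observations that bracket (x0, y0)
vertices = sort(tri(k, :));
```

The three observations listed in `vertices` play the same role in two dimensions that the pair $t_i$, $t_{i+1}$ plays in one dimension.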
[Figure: A) Observations, B) Delaunay triangles, with the triangle enclosing a point of interest marked, C) Linear Splines, D) Cubic Splines; axes x and y, each from 0 to 40]
In MatLab
linear splines
cubic splines
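Both cases can be sketched with the built-in function `griddata`, which triangulates the observation locations and interpolates within each triangle. The observation locations, the planar test field and the output grid are assumptions chosen so that the result is easy to check.

```matlab
% hypothetical scattered observations of a planar field
xobs = [0; 40; 40; 0; 18; 10; 30; 25; 8];
yobs = [0; 0; 40; 40; 22; 30; 12; 33; 9];
dobs = 0.1*xobs + 0.05*yobs;

% regular grid, inside the region covered by the observations
[X, Y] = meshgrid(5:5:35, 5:5:35);

% linear splines
dlin = griddata(xobs, yobs, dobs, X, Y, 'linear');

% cubic splines
dcub = griddata(xobs, yobs, dobs, X, Y, 'cubic');

% linear interpolation within triangles reproduces a plane exactly
dplane = 0.1*X + 0.05*Y;
maxerr = max(abs(dlin(:) - dplane(:)));
```

Grid points outside the convex hull of the observations would be returned as NaN, which is why the grid here is kept inside the region spanned by the data.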