Functional Data Analysis in Matlab and R
James Ramsay, Professor, McGill U., Montreal
Hadley Wickham, Grad student, Iowa State, Ames, IA
Spencer Graves, Statistician, PDF Solutions, San José, CA
Outline
• What is Functional Data Analysis?
• FDA and Differential Equations
• Examples:
  – Squid Neurons
  – Continuously Stirred Tank Reactor (CSTR)
• Conclusions
• References
What is FDA?
• Functional data analysis is a collection of techniques to model data from dynamic systems (possibly governed by differential equations) in terms of some set of basis functions
• The ‘fda’ package supports the use of 8 different types of basis functions: constant, monomial, polynomial, polygonal, B-splines, power, exponential, and Fourier.
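As a minimal sketch of the basis-function idea, the following R code fits a B-spline expansion to noisy samples of a smooth curve with the ‘fda’ package. The data are simulated, and the number of basis functions (12) is an arbitrary illustrative choice, not a recommendation from the slides.

```r
library(fda)  # assumes the 'fda' package is installed

set.seed(1)
tt <- seq(0, 1, length.out = 101)               # observation times
y  <- sin(2 * pi * tt) + rnorm(101, sd = 0.1)   # simulated noisy curve

# Cubic B-spline basis (order 4) with 12 basis functions on [0, 1]
basis <- create.bspline.basis(rangeval = c(0, 1), nbasis = 12, norder = 4)

# Least-squares estimate of the 12 basis coefficients
fit <- smooth.basis(argvals = tt, y = y, fdParobj = basis)

yhat <- eval.fd(tt, fit$fd)               # fitted curve at the observation times
dy   <- eval.fd(tt, fit$fd, Lfdobj = 1)   # first derivative of the fitted curve
```

The curve is now represented by `fit$fd$coefs`, a short coefficient vector, and derivatives come from the basis expansion rather than from differencing the raw data.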
Observations of different lengths
• Observation vectors of different lengths can be mapped to coordinates of a fixed basis set
• All examples in the ‘fda’ package have the same numbers of observations
• No conceptual obstacles to handling observation vectors of different lengths
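The point about different lengths can be illustrated with a small sketch: two simulated series with 20 and 57 observations are each projected onto the same 8-function B-spline basis, giving two coefficient vectors of equal length. The series, grid sizes and basis size are hypothetical choices made for this example.

```r
library(fda)  # assumes the 'fda' package is installed

set.seed(2)
basis <- create.bspline.basis(rangeval = c(0, 1), nbasis = 8)

# Two series observed on grids of different lengths
t1 <- seq(0, 1, length.out = 20)
t2 <- seq(0, 1, length.out = 57)
y1 <- exp(-t1)     + rnorm(20, sd = 0.02)
y2 <- exp(-2 * t2) + rnorm(57, sd = 0.02)

# Each series maps to 8 coefficients of the shared basis
c1 <- smooth.basis(t1, y1, basis)$fd$coefs
c2 <- smooth.basis(t2, y2, basis)$fd$coefs

# Both curves can then live in one functional data object
both <- fd(cbind(c1, c2), basis)
```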
Time Warping
• “start” and “stop” are sometimes determined by certain transitions
• Example: growth spurts in the life cycle of various species do not occur at exactly the same ages in different individuals (even within the same species)
10 Girls: Berkeley Growth Study
• Tuddenham, R. D., and Snyder, M. M. (1954) "Physical growth of California boys and girls from birth to age 18", _University of California Publications in Child Development_, 1, 183-364.
[Figure: height (cm) versus age for the 10 girls]
Acceleration
• Growth spurts occur at different ages
• Average shows the basic trend, but features are damped by improper registration
[Figure: height (cm) versus age, and growth acceleration (cm/year^2) versus age]
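A sketch of how acceleration curves like these might be computed from the `growth` data shipped with the ‘fda’ package follows. The basis order, the penalty on the 4th derivative and the value `lambda = 0.01` are assumptions chosen for illustration; the slides do not state the settings used for the figure.

```r
library(fda)  # assumes the 'fda' package is installed

age  <- growth$age            # ages at which heights were measured
hgtf <- growth$hgtf[, 1:10]   # heights (cm) of the first 10 girls

# Order-6 B-splines with a knot at every age, so the 2nd derivative is smooth
basis <- create.bspline.basis(rangeval = range(age), norder = 6, breaks = age)

# Penalize the 4th derivative; lambda is an illustrative value
hfd <- smooth.basis(age, hgtf, fdPar(basis, Lfdobj = 4, lambda = 0.01))$fd

agefine    <- seq(min(age), max(age), length.out = 101)
accel      <- eval.fd(agefine, hfd, Lfdobj = 2)   # acceleration, one column per girl
mean_accel <- rowMeans(accel)                     # unregistered cross-sectional mean
```

Plotting `accel` with `matplot(agefine, accel, type = "l")` and overlaying `mean_accel` shows how averaging unaligned curves flattens the pubertal growth spurt.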
Registration
• register.fd registers all curves to the mean
• Not perfect, but better
[Figure: growth acceleration (cm/year^2) versus age, and versus warped age]
A Stroll Along the Beach
• Light intensity over 365 days at each of 190*143 = 27170 pixels was
  – smoothed
  – analyzed with functional principal components
• http://www.stat.berkeley.edu/~wickham/userposter.pdf
Other fda capabilities
• Correlations – even with series of different lengths!
• Phase plane plots – good estimates of derivatives
[Figure: Montreal average daily temperature, deviation from average (C): mean temperature versus month (Jan to Dec), and phase-plane plot of acceleration versus temperature (C) with points labeled by month]
Script files: afda-ch03.R, fda-ch01.R, fda-ch02.R
Script files for fda books
• Ramsay and Silverman
  – (2002) Applied Functional Data Analysis (Springer)
  – (2006) Functional Data Analysis, 2nd ed. (Springer)
• ~R\library\fda\scripts
  – Some but not all data sets discussed in the books are in the ‘fda’ package
  – Script files are available to reproduce some but not all of the analyses in the books.
  – plus CSTR demo
FDA and Differential Equations
• Many dynamic systems are believed to follow processes where output changes are a function of the outputs, x, and inputs, u (and unknown parameters θ):
$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}, \mathbf{u}, t \mid \boldsymbol{\theta}), \quad t \in [0, T]$$
• Matlab was designed in part for these types of models
Squid Neurons
• FitzHugh (1961) - Nagumo et al. (1962) Equations: Estimate a, b and c in:
$$\dot{V} = c\left(V - \frac{V^3}{3} + R\right)$$
$$\dot{R} = -\frac{1}{c}\left(V - a + bR\right)$$
• V: voltage across axon membrane
• R: recovery via outward currents
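To make the model concrete, here is a small base-R sketch that integrates the FitzHugh-Nagumo equations, dV/dt = c(V - V^3/3 + R) and dR/dt = -(V - a + bR)/c, with a hand-written fixed-step Runge-Kutta scheme. The parameter values, initial state, time span and step size are illustrative assumptions, and the helper `rk4` is written for this example rather than taken from any package.

```r
# FitzHugh-Nagumo right-hand side; state x = c(V, R), parameters p = c(a, b, c)
fhn <- function(t, x, p) {
  V <- x[1]; R <- x[2]
  c(p[["c"]] * (V - V^3 / 3 + R),
    -(V - p[["a"]] + p[["b"]] * R) / p[["c"]])
}

# Classical 4th-order Runge-Kutta on a given time grid (illustrative helper)
rk4 <- function(f, x0, times, p) {
  out <- matrix(NA_real_, nrow = length(times), ncol = length(x0))
  out[1, ] <- x0
  for (i in seq_along(times)[-1]) {
    h <- times[i] - times[i - 1]
    t <- times[i - 1]
    x <- out[i - 1, ]
    k1 <- f(t,         x,              p)
    k2 <- f(t + h / 2, x + h / 2 * k1, p)
    k3 <- f(t + h / 2, x + h / 2 * k2, p)
    k4 <- f(t + h,     x + h * k3,     p)
    out[i, ] <- x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
  }
  out
}

p     <- c(a = 0.2, b = 0.2, c = 3)   # assumed values, for illustration only
times <- seq(0, 20, by = 0.01)
sol   <- rk4(fhn, c(-1, 1), times, p) # column 1 = V, column 2 = R
```

In an estimation setting, a function like `fhn` would be solved repeatedly inside an optimizer that adjusts a, b and c to match observed V and R.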
Tank Reactions
• Continuously Stirred Tank Reactor (CSTR)
[Figure labels: Temperature, Concentration]
Functional Data Analysis Process
1. Select Basis Set
2. Select Smoothing Operator
   – e.g., differential equation
   – equivalent to a Bayesian prior over coefficients to estimate
3. Estimate coefficients to optimize some objective function
4. Model criticism, residual plots, etc.
5. Hypothesis testing
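The first four steps can be sketched with the ‘fda’ package on simulated daily data. The Fourier basis size, the harmonic acceleration operator used as the smoothing operator, and `lambda = 1e4` are assumptions made for this illustration, not settings taken from the slides.

```r
library(fda)  # assumes the 'fda' package is installed

set.seed(5)
day <- seq(0.5, 364.5, by = 1)                            # 365 daily midpoints
y   <- 10 * sin(2 * pi * day / 365) + rnorm(365, sd = 2)  # simulated annual cycle

# 1. Basis set: Fourier basis with period 365
basis <- create.fourier.basis(rangeval = c(0, 365), nbasis = 65)

# 2. Smoothing operator: harmonic acceleration, L = (2*pi/365)^2 D + D^3
harmL <- vec2Lfd(c(0, (2 * pi / 365)^2, 0), c(0, 365))

# 3. Estimate coefficients by penalized least squares
fit <- smooth.basis(day, y, fdPar(basis, Lfdobj = harmL, lambda = 1e4))

# 4. Model criticism: residuals and effective degrees of freedom
res <- as.vector(y - eval.fd(day, fit$fd))
c(df = fit$df, resid_sd = sd(res))
```

The penalty shrinks the fit toward functions annihilated by the operator (here, a constant plus a single annual sinusoid), which is the sense in which the operator acts like a prior on the coefficients.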
Inputs to Tank Reaction Simulation
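Computations: Nonlinear ODE
$$\frac{dC}{dt} = -\beta_{CC}(T, F_{\mathrm{in}})\,C + F_{\mathrm{in}} C_{\mathrm{in}}$$
$$\frac{dT}{dt} = -\beta_{TT}(F_{\mathrm{co}}, F_{\mathrm{in}})\,T + \beta_{TC}(T, F_{\mathrm{in}})\,C + F_{\mathrm{in}} T_{\mathrm{in}} + \alpha(F_{\mathrm{co}})\,T_{\mathrm{co}}$$
where
$$\beta_{CC}(T, F_{\mathrm{in}}) = \kappa \exp\left[-10^{4}\,\tau \left(\frac{1}{T} - \frac{1}{T_{\mathrm{ref}}}\right)\right] + F_{\mathrm{in}}$$
$$\beta_{TT}(F_{\mathrm{co}}, F_{\mathrm{in}}) = \alpha(F_{\mathrm{co}}) + F_{\mathrm{in}}$$
$$\beta_{TC}(T, F_{\mathrm{in}}) = 130\,\beta_{CC}(T, F_{\mathrm{in}})$$
$$\alpha(F_{\mathrm{co}}) = \frac{a\,F_{\mathrm{co}}^{\,b+1}}{F_{\mathrm{co}} + a\,F_{\mathrm{co}}^{\,b}/2}$$
4 parameters: κ, τ, a, b
• Compute input vectors
• Define functions
• Call differential equation solver
• Summarize, plot
[Figure labels: Temperature, Concentration]
• Estimate parameters (κ, τ, a, b)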
Three problems
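• Estimate (κ, τ, a, b) to minimize SSE in Temperature only

| Software | Function | Method | SSE | SSE-min |
|---|---|---|---|---|
| Matlab | lsqnonlin | | 5.09888 | 0.00236 |
| R | nls | | 5.09652 | 0 |
| R | optim | Nelder-Mead | 5.09652 | 0 |
| R | optim | BFGS | 5.09652 | 0 |
| R | optim | CG | 5.09900 | 0.00248 |
| R | optim | SANN | 5.17504 | 0.07852 |
| R | nlminb | | 5.09652 | 0 |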
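The same kind of comparison can be reproduced on a toy problem in base R. The sketch below fits a simulated exponential decay (a stand-in for the CSTR model, which is not reproduced here) with `nls`, four `optim` methods and `nlminb`, then tabulates each SSE and its gap from the best one. The data, model and starting values are all hypothetical.

```r
set.seed(3)
tt <- seq(0, 10, by = 0.5)
y  <- 5 * exp(-0.3 * tt) + rnorm(length(tt), sd = 0.1)   # simulated data

# Sum of squared errors as a function of p = c(A, k)
sse   <- function(p) sum((y - p[1] * exp(-p[2] * tt))^2)
start <- c(A = 4, k = 0.5)

fit_nls <- nls(y ~ A * exp(-k * tt), start = as.list(start))

res <- c(
  nls = sum(resid(fit_nls)^2),
  sapply(c("Nelder-Mead", "BFGS", "CG", "SANN"),
         function(m) optim(start, sse, method = m)$value),
  nlminb = nlminb(start, sse)$objective
)

print(cbind(SSE = res, "SSE-min" = res - min(res)))
```

On a well-behaved problem like this one the deterministic methods usually agree closely; stochastic (SANN) or iteration-limited (CG) runs may stop slightly short of the minimum.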
SSE(Temp, Conc)
• Matlab: lsqnonlin
• R: nls
[Figure: two sets of panels showing Concentration C(t) and Temperature T(t) versus time, 0 to 60 (red = true, blue = estimated)]

Median absolute relative error

| | Matlab | R |
|---|---|---|
| Concentration | 1.149E-03 | 1.145E-03 |
| Temperature | 2.640E-04 | 2.636E-04 |
R vs. Matlab
• Gave comparable answers
• R code for CSTR slightly more accurate but requires much more compute time
  – coded by different people
• R has helper functions not so easily replicated in Matlab
  – summary.nls
  – confint.nls
  – profile.nls

| | Estimate | StdErr | t | Pr(>\|t\|) |
|---|---|---|---|---|
| kref | 0.466 | 0.004 | 113.0 | < 2e-16 *** |
| EoverR | 0.840 | 0.009 | 94.7 | < 2e-16 *** |
| a | 1.720 | 0.232 | 7.4 | 8.2e-13 *** |
| b | 0.496 | 0.050 | 10.0 | < 2e-16 *** |
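For readers who have not used these helpers, the following base-R sketch shows where a coefficient table of this shape comes from. It fits a simulated exponential decay rather than the CSTR model; the data and starting values are hypothetical.

```r
set.seed(4)
tt <- seq(0, 10, by = 0.25)
y  <- 5 * exp(-0.3 * tt) + rnorm(length(tt), sd = 0.2)   # simulated data

fit <- nls(y ~ A * exp(-k * tt), start = list(A = 4, k = 0.5))

# summary() on an nls fit gives estimates, standard errors,
# t statistics and p-values in one matrix
tab <- coef(summary(fit))
print(tab)
```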
confint.nls
• Likelihood-based confidence intervals: generally more accurate than Wald intervals
  – Wald subject to parameter effects curvature
  – Likelihood: only affected by intrinsic curvature

> confintNlsFit

| | 2.5% | 97.5% |
|---|---|---|
| kref | 0.458 | 0.474 |
| EoverR | 0.823 | 0.858 |
| a | 1.300 | 2.222 |
| b | 0.401 | 0.599 |
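The difference between the two kinds of interval can be seen on a small example. This base-R sketch computes Wald intervals by hand (estimate plus or minus a t quantile times the standard error) and profile-likelihood intervals with `confint` for a simulated exponential decay; the data set and its size are hypothetical.

```r
set.seed(5)
tt <- seq(0, 5.5, by = 0.5)                               # 12 observations
y  <- 5 * exp(-0.3 * tt) + rnorm(length(tt), sd = 0.3)    # simulated data

fit <- nls(y ~ A * exp(-k * tt), start = list(A = 4, k = 0.5))

# Wald intervals: symmetric about the estimate by construction
est  <- coef(summary(fit))
half <- qt(0.975, df.residual(fit)) * est[, "Std. Error"]
wald <- cbind(lower = est[, "Estimate"] - half,
              upper = est[, "Estimate"] + half)

# Profile-likelihood intervals: may be asymmetric when the model is curved
prof <- confint(fit, level = 0.95)

print(cbind(wald, prof))
```

With few observations or strong nonlinearity the profile intervals tend to be visibly asymmetric about the estimate, which the Wald construction cannot capture.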
plot.profile.nls
• for a plot showing the sqrt(log(LR))
[Figure: profile plots for kref, EoverR, a and b, with confidence levels 50, 80, 90, 95 and 99 marked]
Conclusions
• R and Matlab give comparable answers
• R:nls has helper functions absent from Matlab:lsqnonlin
• Functional data analysis tools are key for
  – estimating derivatives and
  – working with differential operators
References
• www.functionaldata.org
• Ramsay and Silverman (2006) Functional Data Analysis, 2nd ed. (Springer)
• Ramsay and Silverman (2002) Applied Functional Data Analysis (Springer)
• Ramsay, J. O., Hooker, G., Cao, J. and Campbell, D. (2007) Parameter estimation for differential equations: A generalized smoothing approach (with discussion). Journal of the Royal Statistical Society, Series B. To appear.
NOT free-knot splines
• For this, see
  – DierckxSpline package
  – Companion to Dierckx, P. (1993). Curve and Surface Fitting with Splines. Oxford Science Publications, New York.
• R package by Sundar Dorai-Raj
  – links to Fortran code by Dierckx available from www.netlib.org/dierckx
• soon to appear on CRAN