Improving an Ecosystem Model Using Earth Science Data
Kazumi Saito (NTT Communication Science Laboratories)Pat Langley (CSLI Stanford University)
Trond E. Grenager (Stanford University)
Introduction• Developing computational methods for
discovering knowledge in communicable forms.
• Improving CASA using observed data. • CASA: an existing computational model of
aspect of the Earth ecosystem developed by Christopher Potter and his colleagues at NASA Ames.
Portion of CASA
Known ConstantObserved VariableVariable
E IPAR
W T2 T1
PET
FPAR_FAS
SR_FAS SRDIFF
sol_conv
srmin
e_max
eet
ahi
tempcA
Topt
PET_TW_M umd_vegFAS_NDVI
solar
NPPc
Some EquationsNPPc: net primary production.
( )max 0,NPPc = E IPAR×
_ max 1 2E = e T T W × × ×
0 5IPAR = FPAR_FAS Solar sol_conv . × × ×
E: value of maximum possible photosynthetic efficiency under temperature and moisture stress scalars.
IPAR: converter for intercepted photosyntheticallyactive radiation by the vegetation cover.
General Problem• Revisions to the model must be consistent
with existing knowledge of Earth science and, ideally, retain similarity to the current model.
• Our research involves attempting to improve the CASA model’s predictive accuracy.
Outline of Approach• Transforming the equations into a neural network• Revising weights in that network• Transforming the network back into equations
( )NPPc= f
L
L
L
originalequations
( )NPPc= g
L
L
L
revisedequations
Some Types of Neural NetworksStandard (sigma-sigma) net:
( )j j jk kw f w x∑ ∑Sigma-pi net (generalized polynomial):
( )exp lnjkwj k j jk kw x w w x=∑ ∑ ∑∏
Pi-sigma net (this talk):
( )j j jk kw f w x∑∏
Transforming Equations
Known ConstantObserved VariableVariable
E IPAR
W T2 T1
PET
FPAR_FAS
SR_FAS SRDIFF
sol_conv
srmin
e_max
eet
ahi
tempcA
Topt
PET_TW_M umd_vegFAS_NDVI
solar
NPPcStress scalars
Intrinsic property
Stress ScalarsOriginal equations:
( )
( )
2
2
1 0 8 0 02 0 00051 ( 0 4472 0 0224 )
12 1 18141 exp 0 2 ( 10 )
11 exp 0 3 ( 10 )
0 5 0 5
T = . + . Topt . Topt= - . . Topt
T = .. Topt tempc
. Topt tempc
eetW = . + .PET
× − ×− + ×
×+ × − + −
×+ × − − +
×
_ max 1 2E = e T T W × × ×
Transformation into Network
( ) ( )
( ) ( ) ( ) ( ) ( )( ) ( )( )
( )
( )
21 1 11 12
2
2 21 22
21 21 21 22
3 3 31 32
1 1
1 ( 0 4472 0 0224 )1 12
1 exp 1 exp
11 exp 2 0.2 ( )
0 5 0 5
T = f x x f w w Topt
= - . . Topt
T = f x f x f xx x
f x f w w Topt tempc
Topt tempc
eet eetW = f x x f w w . + .PET PET
= − = + ×
− + ×
= × = ×+ − + −
= + −
=+ − × −
= = + = ×
0_ max 1 2 iE = e T T W w f× × × = ×∏
Intrinsic Values for Vegetation Type
min 0 95
1 ( )
SR_FAS srminFPAR_FAS , .SRDIFF
SR_FAS - srminSRDIFF
− =
≈ ×
FPAR_FAS: fraction of absorbed photosynthetically active radiation by the vegetation cover
SRDIFF: map from the ground cover to an srmax-srmin value
Transformation into Network
( )exp log( ) log( )
SR_FAS - srmin SRDIFF
= SR_FAS srmin SRDIFF− −
1 if_
0 othewiseumd_veg i
umd_veg i=
=
( )13
1log _ _
i- SRDIFF = v_i umd veg i
=
×∑
: weight in neural networkv_i
Revising weights in Networks
supervised learningStep-length
search direction
1st-order2nd-order
BP, etc.Newton method
variablefixed(constant)
Silva-Almeida algorithm,etc.SCG, OSS, BPQ, etc.
2nd-order learning algorithm
Gauss-Newton method
applicability to large-scale problems
×△○
○△×
quasi-Newton methodconjugate gradient method
performance with inaccurate step-length
BPQ Algorithm• The search direction is calculated on the
basis of partial BFGS update.
• The step-length is calculated by using a second-order approximation.
Demonstration ProblemSample set
1
2
3
4
5
y x1
2
3
4
5
0.73
0.88
0.95
0.980.99
1 Hidden unit
Input unit
Output unit
w
x(t)
1
w2
z(t)
1Sample
Learning Neural Network: Result
-0.5
0
0.5
1 -0.5 0 0.5 1
00.10.2
0.3
0.4
0.5
0.6
0.7
2W
1W
BPQ
BP + momentum(1st-order method)
(2nt-order method)Squared error
Experimental Result
360.00
400.00
440.00
480.00
520.00
560.00
RMSE RMSE RMSE
original apparent LOO CV
The RMSE of the original model was reduced by 15 percent, as measured using cross validation.
2o b served p red ic ted
sam p les
(N P P -N P P )R M S E =
n u m b er o f sam ples
∑
Intrinsic Values
0
1
2
3
4
5
6
1 2 3 4 5 6 7 8 9 10 11
initial obtained
The intrinsic values associated with vegetation types obtained in this way were consistently lower
Transforming Network
Step1. Quantize by using a clustering method.
Step2. Determine an adequate number of rulesby using cross-validation.
Step3. Generate nominal conditionby solving a standard classification problem.
( ){ }( )exp : 1,nkl klkl
v q n N=∑ L
Clustering Analysis
380.00
400.00
420.00
440.00
460.00
480.00
500.00
520.00
1 2 3 4 5
apparent RMSE LOO CV RMSE
Number of clusters
Evaluating Experimental Result
380
400
420
440
460
480
500
520
540
before clustering after clustering
initial RMSE appar LOO CV RMSE
Obtained Decision Tree8 = 1 : 0 ( 1 0 . 0 )8 = 0 :
| 9 = 1 : 1 ( 5 8 . 0 )| 9 = 0 :| | 7 = 1 : 1 ( 4 6 . 0 )| | 7 = 0 :| | | 1 1 = 1 : 1 ( 1 1 . 0 )| | | 1 1 = 0 :| | | | 1 = 0 : 2 ( 1 6 8 . 0 / 1 . 0 )| | | | 1 =
tt
tt
tt
tt
tt 1 : 1 ( 1 0 . 0 )
Clustered Intrinsic Values
0
1
2
3
4
5
6
1 2 3 4 5 6 7 8 9 10 11
initial obtained
Conclusion• This talk described an approach to
improving the predictive accuracy of the existing ecosystem model.
• In the experiments, we can reduce the mean squared error of the original model by 15 percent, as measured using cross validation
• In the future, we’ll carry out further experiments along this direction