SIMO Python/XML Simulator Current situation 28/10/2005 SIMO Seminar 28.10.2005 Antti Mäkinen Dept. of Forest Resource Management / University of Helsinki

SIMO Python/XML Simulator

Current situation 28/10/2005

SIMO Seminar 28.10.2005

Antti Mäkinen

Dept. of Forest Resource Management /

University of Helsinki

What can be calculated at the moment?

Development of different variables at stand level...

[Figure: SP_1_Age_BA, BA against Age]

[Figure: SP_1_Age_D_gM, D_gM against Age]

[Figure: SP_1_Age_H_dom, H_dom against Age]

[Figure: SP_1_Age_V, V against Age]

[Figure: id:1 stratum:0, i_BA against Age]

What can be calculated at the moment?

Diameter distributions and tree level attributes

[Figure: id:100040543 year:2005 stratum:0, N against d]

Just for comparison with J simulator...

[Figure: four panels, Pine BA vs. Age for ClT, CT, VT and MT; series: J simulator and Python/XML Simulator]

[Figure: two panels, Pine H_dom vs. Age MT and Pine V vs. Age MT; series: J simulator and Python/XML Simulator]

What can be calculated at the moment?

Estimating forest variable development at both stand level and tree level is possible at the moment (300+ models implemented), but:

Forestry operations are not yet implemented in the simulator

→ "real world" simulations are not yet possible

Bucking models are still not ready

Optimizing module is still missing

How does the simulation process work in SIMO?

[Diagram: SIMULATION PROCESS]

XML Files

SIMULATOR. IN: data, simulation control, model chains, model definitions. OUT: results

MODEL LIBRARY. IN: model name, input variables. OUT: model result, warnings & errors

Reporter Module. IN: XML data. OUT: transformed XML, graphs

[Figure: id:100040543 year:2005 stratum:0, N against d]

What is missing?

XML Files

SIMULATOR

MODEL LIBRARY (four boxes in the diagram)

Reporter Module

Optimizer Module

Validator Module

XML Files

Data XML

Simulation control XML

Model Chain XML

Model XML

Result XML

Model Library

Includes all models used in the simulator

Programmed in C as a Dynamic Link Library (DLL)

Models are C functions that are called from the simulator (model definitions are also in the Model.xml)

Users can add new models to the library or create additional model libraries

Reports warnings and errors to the simulator

Risk level models not yet implemented
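
A minimal sketch of how a C model function could be called from Python through ctypes, the Python/C interface named later in these slides. The C prototype, the status codes and the toy formula are assumptions made for illustration and are not SIMO's actual interface. A Python callback stands in for a function exported by the compiled library, so the example runs without a DLL.

```python
import ctypes

# Hypothetical C prototype (NOT taken from SIMO):
#   int model(const double *inputs, int n, double *result);
# returning 0 on success, 1 for a warning and 2 for an error.
MODEL_PROTOTYPE = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_double),
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_double),
)


def _toy_growth(inputs, n, result):
    """Stand-in for a compiled model; the formula and limits are made up."""
    if n != 2:
        return 2  # error: wrong number of inputs
    age, ba = inputs[0], inputs[1]
    result[0] = ba / age if age > 0 else 0.0
    return 0 if 0 < age <= 150 else 1  # 1 = warning: age out of bounds


# With a real library the function object would instead come from
# something like ctypes.CDLL("<path to the model library>").
toy_growth = MODEL_PROTOTYPE(_toy_growth)


def call_model(func, values):
    """Call a model with a list of floats; return (result, status code)."""
    arr = (ctypes.c_double * len(values))(*values)
    out = ctypes.c_double(0.0)
    status = func(arr, len(values), ctypes.byref(out))
    return out.value, status
```

Returning a status code next to the result is one simple way for a library to report warnings and errors back to the caller, for example `call_model(toy_growth, [50.0, 20.0])`.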

SIMULATOR

The first version of the simulator was programmed in C/C++

Later the programming language was changed to Python (http://www.python.org), because of:

Simple and concise syntax → more readable code and faster development of the simulator

Good compatibility with the C language

A number of useful ready-made open source tools for a variety of purposes

Code documentation is underway

SIMULATOR

Takes in simulation control instructions, model chains, model definitions and data in XML format

Processes the user-defined model chains for each computing unit in the data

Calls the model library whenever a value needs to be calculated (Python/C interface: ctypes)

Writes the resulting values into a result XML file

Transforms the XML data from the different files into the simulator's own data structure (more efficient than the ElementTree data structure)
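
The steps above can be sketched roughly as follows. This is an illustration only: the XML element names, the dict-based data structure and the two toy models are assumptions, since the slides do not show SIMO's real schemas or internal structures.

```python
import xml.etree.ElementTree as ET

# Hypothetical data XML; element and attribute names are invented.
DATA_XML = """
<data>
  <unit id="1"><var name="Age">40</var><var name="BA">18.0</var></unit>
  <unit id="2"><var name="Age">60</var><var name="BA">24.0</var></unit>
</data>
"""


def load_units(xml_text):
    """Convert the XML tree into plain dicts keyed by computing unit id."""
    root = ET.fromstring(xml_text)
    return {
        unit.get("id"): {v.get("name"): float(v.text) for v in unit.findall("var")}
        for unit in root.findall("unit")
    }


def grow_ba(variables):
    return variables["BA"] * 1.1  # made-up growth, not a real model


def grow_age(variables):
    return variables["Age"] + 5


# A model chain is represented here as an ordered list of
# (target variable, model function) pairs.
MODEL_CHAIN = [("BA", grow_ba), ("Age", grow_age)]


def simulate(units, chain):
    """Run every model in the chain for every computing unit."""
    for variables in units.values():
        for target, model in chain:
            variables[target] = model(variables)
    return units


def to_result_xml(units):
    """Write the simulated values back out as a result XML string."""
    root = ET.Element("result")
    for uid, variables in units.items():
        unit = ET.SubElement(root, "unit", id=uid)
        for name, value in variables.items():
            ET.SubElement(unit, "var", name=name).text = f"{value:.2f}"
    return ET.tostring(root, encoding="unicode")
```

Copying the parsed tree into plain dicts before simulating mirrors the idea of a simulator-side data structure that is cheaper to work with than the XML tree itself.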

Reporting Module

Used for visualizing data and transforming the results from XML format to other formats

Takes in data and processing instructions in XML format

At the moment it can plot different kinds of graphs of given variables (using the matplotlib toolset)

XML transformations to be implemented later...

Missing modules

Optimizer module

• Finds the best alternative among the alternatives generated by the simulator

• Possibly many alternative optimizing methods?

Validator module

• Validates the XML files against XSD (Schema) files and external rules

• Makes sure that the XML files are well-formed and contain all necessary elements
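
A small sketch of the second validator task, using only the Python standard library. It checks well-formedness and the presence of required child elements; it does not perform XSD validation, which would need a schema-aware library. The element names in the usage line are hypothetical.

```python
import xml.etree.ElementTree as ET


def validate(xml_text, required_children):
    """Return a list of error messages; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The parser refuses anything that is not well-formed XML.
        return [f"not well-formed: {exc}"]
    errors = []
    for tag in required_children:
        if root.find(tag) is None:
            errors.append(f"missing element <{tag}>")
    return errors
```

For example, `validate("<sim><data/></sim>", ["data", "chain"])` reports the missing `<chain>` element.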

Strengths of SIMO XML Simulator

Virtually any kind of model can be used in the simulations and added to the model library

The user can define the model chains freely for different kinds of simulations

The user can define correction/rectification factors for the models (e.g. different factors for different geographical areas)

Extensive warning and error reporting system (risk control coming later...)

Data levels are not confined to a strict predefined standard

Model risk management – individual variables

Minimum and maximum limits of individual variables have been defined

Documented in ModelXML

Limits have been coded into ModelLibrary -> it throws warnings if the individual parameter values are out of bounds

How are the minimum and maximum limits defined?

Limits defined by the author (caused by data, model shape, …)

Limits of the modeling data

The model is tested with those limits using NFI data as test data.

Does the model function properly if the individual parameter values are out of bounds?

For example: Basal area growth model (Vuokila & Väliaho) for Scots pine on mineral soils
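
A minimal sketch of such a bounds check. The variable names and limit values below are invented for illustration; the real limits are the ones documented in ModelXML.

```python
# Hypothetical (minimum, maximum) limits per input variable.
LIMITS = {"Age": (10.0, 150.0), "BA": (1.0, 60.0)}


def check_limits(inputs, limits):
    """Return one warning string for each input value outside its limits."""
    warnings = []
    for name, value in inputs.items():
        low, high = limits[name]
        if not low <= value <= high:
            warnings.append(f"{name}={value} outside [{low}, {high}]")
    return warnings
```

The check only warns and leaves it to the caller to decide whether the model estimate is still used.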

Model risk management – interaction

Interaction between variables

[Figure: accepted combinations of variables, axes age and ba; (120, 5) not accepted, (20, 32) not accepted]

Solution alternatives:

Logit model: probability that the estimate is in the acceptable area (at least linear regression was not flexible enough)

Grid: the area of combinations of variables is divided into cells. Every cell records whether the estimate is acceptable or not
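
The grid alternative could look roughly like this. It is a sketch under assumptions: a cell is treated as acceptable when at least one reference observation falls into it, and the reference points and cell size are made up. The slides do not say how cells would actually be labelled.

```python
# Made-up (age, ba) reference observations and cell size (age step, ba step).
REFERENCE = [(20, 8), (40, 18), (60, 25), (80, 30), (100, 33), (120, 35)]
CELL = (20, 10)


def cell_of(point, cell):
    """Map a (age, ba) point to the integer index of its grid cell."""
    return (int(point[0] // cell[0]), int(point[1] // cell[1]))


def build_grid(points, cell):
    """Collect the cells that contain at least one reference point."""
    return {cell_of(p, cell) for p in points}


def is_acceptable(grid, point, cell):
    """A combination is acceptable if its cell is in the grid."""
    return cell_of(point, cell) in grid
```

With these toy values, `is_acceptable(build_grid(REFERENCE, CELL), (120, 5), CELL)` is false, matching the rejected combination shown on the slide.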

Model risk management

Two levels

1. Individual parameter values out of bounds

2. All individual parameter values acceptable, but is the specific combination of them acceptable?

Case 1: already in the simulator

Case 2: Suggestion

1. Get the k nearest neighbours from the VMI (Finnish National Forest Inventory) data

2. Evaluate the model for the data point and the k nearest neighbours

3. If the difference in the model estimate between the data point and the neighbours is too big, generate an event of "unacceptable" model estimate
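
The suggested procedure for case 2 can be sketched as below. The assumptions are: plain Euclidean distance on unscaled variables, comparison against the mean estimate of the neighbours, a fixed tolerance, and a toy model with made-up reference data. None of these choices are specified in the slides.

```python
import math

# Made-up (age, ba) reference points standing in for inventory plots.
REFERENCE = [(40, 18), (50, 20), (60, 24), (70, 26)]


def toy_model(age, ba):
    return ba * 100 / age  # invented formula, for illustration only


def k_nearest(point, reference, k):
    """Step 1: the k reference points closest to the data point."""
    return sorted(reference, key=lambda ref: math.dist(point, ref))[:k]


def risk_event(model, point, reference, k, tolerance):
    """Steps 2 and 3: True if the estimate deviates too much from the neighbours."""
    neighbours = k_nearest(point, reference, k)
    mean_estimate = sum(model(*n) for n in neighbours) / len(neighbours)
    return abs(model(*point) - mean_estimate) > tolerance
```

In practice the variables would probably need scaling before distances are computed, because age and basal area are measured in different units.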

Isn't that procedure too heavy computationally?

Probably, not yet evaluated

But what if we store the risk evaluation results and use those primarily:

1. Is it safe to call ModelA with parameters (5, 6, 10) when we accept risk level X?

2. Has the risk been evaluated with parameter values (5, 6, 10) and risk level X before? If yes, get the answer from a table of risk evaluations

3. If not, get the k nearest neighbours for data point (5, 6, 10), and evaluate the model with (5, 6, 10) and the k neighbours

4. Store the risk evaluation result and the mean model result for the k neighbours for the data point (5, 6, 10) and risk level X
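
The four steps amount to a lookup table placed in front of the expensive evaluation. A minimal sketch, assuming an in-memory dict as the table and a stand-in evaluator (`toy_evaluate` is hypothetical and does no real neighbour search):

```python
class RiskCache:
    """Table of risk evaluations keyed by (model name, parameters, risk level)."""

    def __init__(self, evaluate):
        # evaluate(model_name, params, risk_level) -> (is_safe, mean_neighbour_result)
        self._evaluate = evaluate
        self._table = {}
        self.misses = 0  # how many times the expensive evaluation ran

    def is_safe(self, model_name, params, risk_level):
        key = (model_name, tuple(params), risk_level)
        if key not in self._table:  # steps 2 and 3: evaluate only when unseen
            self.misses += 1
            self._table[key] = self._evaluate(model_name, params, risk_level)  # step 4
        return self._table[key][0]


def toy_evaluate(model_name, params, risk_level):
    """Stand-in for the k nearest neighbour evaluation; the rule is invented."""
    mean_result = sum(params) / len(params)
    return (risk_level >= 0.05, mean_result)
```

Exact-match keys only help when the same parameter values recur; rounding parameters into bins before the lookup would be one way to raise the hit rate.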

[Figure: two panels, axes PPA_KUORETON and IKA_B]

Open questions:

When evaluating a model result, shall we compare it to:

values derived directly from the nearest VMI permanent sample plots

OR

model estimates for the nearest VMI sample plots?

Software license for SIMO

Types of Open Source licenses

MIT & Co: "Do whatever you want"

LGPL: "Everything you do to the original code must be open source, anything on top of that can be closed"

GPL & Co: "Everything you do is open source, …well almost"

GPL under the hood: "derivative work" or "mere aggregation"? Derivative work must be open source, but aggregation can be closed source

The case of MySQL

Dual licensing: open source under the GPL, commercial development with a commercial license that allows closed source

General software architecture

Individual components that communicate over the network

Validator

Simulator – this is well underway

Optimiser

Reporter – simulation results to figures and other data formats than XML, or a different XML format etc.

Implications for licensing? What if one of the components uses a subcomponent that is published under the GPL?

Architecture continued

TCP/IP based communication

Security issues?

secured traffic (SSL, SSH)

inside firewall

Scalable