
Page 1: 2011 UseR AQP Talk

Investigating soil data with R

Pierre Roudier (1) and Dylan E. Beaudette (2)

(1) Landcare Research – Manaaki Whenua, New Zealand
(2) Natural Resources Conservation Service, USDA, USA

Page 2: 2011 UseR AQP Talk

Acknowledgement

Page 3: 2011 UseR AQP Talk

Talk Outline

1 The soil resource
    Soil matters!
    Soil science
    Soil data and its analysis

2 The aqp package
    Visualisation
    Classification
    Harmonisation
    Analysis and modelling

3 Recent and future developments
    Introduction of S4 classes
    Design

4 Conclusions and further work

Page 5: 2011 UseR AQP Talk

Soil matters!

Soil is:

not dirt,

a very thin layer between parent rock and atmosphere,

a very complex body – physical, chemical and biological interactions,

a support for almost all terrestrial ecosystems – and thus food production.

Page 7: 2011 UseR AQP Talk

A threatened resource

How to feed 7 billion+ people?

Growing tensions on arable land

Urbanisation

Erosion

Etc.

→ It is important to provide soil information to a wide range of decision makers.

Page 8: 2011 UseR AQP Talk

A threatened resource

“Man has only a thin layer of soil between himself and starvation.” –Bard of Cincinnati

Page 9: 2011 UseR AQP Talk

Soils today

Soil has long been regarded only as a producer of crops

But soils are back on the global agenda

New challenges through global projects

Page 11: 2011 UseR AQP Talk

GlobalSoilMap.net – a leading project

International network of soil scientists

Generate and provide a global grid, at ≈ 100 m resolution, of 9 key soil attributes

R has been identified as a key platform for the various stages involved

Data courtesy of Tomislav Hengl.

→ The way soil science is done is changing significantly (“pedometrics”)

Page 14: 2011 UseR AQP Talk

The soil profile

Soil data:

Highly multi-dimensional

Point support

Soft and hard data

Importance of legacy data

Most of the time associated with environmental covariates

Page 16: 2011 UseR AQP Talk

The aqp package

aqp – Algorithms for Quantitative Pedology

Page 17: 2011 UseR AQP Talk

Soil profile sketches

An example of profile sketches manually created from soil profile observations collected as part of the Pinnacles National Monument soil survey. Horizon designations, sequences, boundaries, and soil texture classes are usually sufficient for describing complex soil-landscape relationships.

Page 19: 2011 UseR AQP Talk

(Digital) Soil profile sketches

[Figure: digital sketches of 15 soil profiles (01 to 15), each drawn with its horizon designations (for example A, AB, Bt1, Bw2, Cr, R) on a depth axis from 0 cm to 180 cm. Profiles are labelled by landscape position (summit, backslope, swale) and ordered by relative topographic position (50 meter radius), with values from 98 to −97.]

Page 20: 2011 UseR AQP Talk

Numerical soil classification

Compare profile j with profiles j+1, j+2, ..., n:

1. Evaluate a dissimilarity metric for each depth increment
2. This gives dissimilarity matrices for all depth increments
3. Apply depth-weighting and sum the dissimilarities
4. Result: the final between-profile dissimilarity matrix
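The workflow above can be sketched in base R. This is only an illustration under stated assumptions, not the aqp implementation: the toy clay values, the exponential form of the depth weights, the `decay` rate and the use of Euclidean distance are all made up for the example.

```r
# Hypothetical toy data: clay (%) for 3 profiles, already sliced into
# 1 cm depth increments (rows = depth increments, columns = profiles).
depths_cm <- 1:50
clay <- cbind(
  P1 = seq(10, 30, length.out = 50),
  P2 = seq(12, 32, length.out = 50),
  P3 = rep(40, 50)
)

# Depth weights: exponential decay with depth (form and rate are assumptions).
decay <- 0.01
w <- exp(-decay * depths_cm)

# Steps 1-3: one dissimilarity matrix per depth increment,
# weighted by depth and summed over all increments.
n <- ncol(clay)
d_total <- matrix(0, n, n, dimnames = list(colnames(clay), colnames(clay)))
for (i in seq_along(depths_cm)) {
  d_i <- as.matrix(dist(clay[i, ]))  # Euclidean distance at this depth
  d_total <- d_total + w[i] * d_i
}

# Step 4: final between-profile dissimilarity matrix, usable for clustering.
d_final <- as.dist(d_total)
groups <- cutree(hclust(d_final, method = "average"), k = 2)
print(groups)
```

Because the result is an ordinary `dist` object, any clustering or ordination method that accepts one can be applied to it afterwards.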

Page 21: 2011 UseR AQP Talk

Numerical soil classification

[Figure: sketches of the 15 sampled soil profiles (01 to 15) and of 4 soil survey legend profiles (16 Vista, 17 Ahwahn, 18 Chualr, 19 Tollhs), with their horizon designations, arranged into Group 1 to Group 4. Upper axis from 0 to 30; depth axis from 0 cm to 200 cm. Legend: Sampled Soils, Soil Survey Legend.]

Page 22: 2011 UseR AQP Talk

Soil data harmonisation

Harmonisation of heterogeneous data sets

Legacy data

Diversity of measurement methods

Etc.

Page 25: 2011 UseR AQP Talk

Soil data harmonisation

Harmonisation of the depth support

[Figure: three panels showing organic carbon (%), from 0.0 to 1.5, against depth (cm), from 0 to 120.]
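One simple way to bring horizon data onto a common depth support is a thickness-weighted average over standard intervals. The sketch below uses that approach as a stand-in; the slides do not say which method the talk used, and the horizon values, the target intervals and the helper `resample_interval` are hypothetical.

```r
# Hypothetical horizon data for one profile: organic carbon (%) by horizon.
hz <- data.frame(
  top    = c(0, 12, 35, 70),
  bottom = c(12, 35, 70, 120),
  oc     = c(1.4, 0.9, 0.4, 0.2)
)

# Target depth support in cm (these standard intervals are an assumption).
target <- data.frame(top = c(0, 20, 40, 80), bottom = c(20, 40, 80, 120))

# Thickness-weighted mean of the horizons overlapping one target interval.
resample_interval <- function(top, bottom, hz, var) {
  overlap <- pmin(bottom, hz$bottom) - pmax(top, hz$top)
  overlap <- pmax(overlap, 0)              # non-overlapping horizons get weight 0
  if (sum(overlap) == 0) return(NA_real_)  # interval not covered by any horizon
  sum(overlap * hz[[var]]) / sum(overlap)
}

target$oc <- mapply(resample_interval, target$top, target$bottom,
                    MoreArgs = list(hz = hz, var = "oc"))
print(target)
```

Once every profile is expressed on the same intervals, profiles described with different horizon boundaries can be compared or pooled directly.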

Page 26: 2011 UseR AQP Talk

Aggregation of soil properties

1. Segment each profile
2. Collect the profiles
3. Compute summary statistics by segment

→ Representative depth function and characteristic variability
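A minimal base R sketch of these steps follows. It assumes synthetic clay data already sliced into 1 cm increments and a 10 cm segment size; both are choices made for illustration only, and this is not the aqp code.

```r
set.seed(1)

# Hypothetical collection: 5 profiles, clay (%) per 1 cm slice down to 100 cm.
profiles <- do.call(rbind, lapply(1:5, function(id) {
  data.frame(id = id, depth = 1:100,
             clay = 15 + 0.2 * (1:100) + rnorm(100, sd = 3))
}))

# Segment: assign every slice to a 10 cm depth segment.
profiles$segment <- cut(profiles$depth, breaks = seq(0, 100, by = 10))

# Collect all profiles and compute summary statistics by segment.
stats <- aggregate(clay ~ segment, data = profiles,
                   FUN = quantile, probs = c(0.25, 0.5, 0.75))
print(stats)
```

The median column gives a representative depth function, and the spread between the 25th and 75th percentiles describes its variability at each depth.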

Page 27: 2011 UseR AQP Talk

Aggregation of soil properties

[Figure: 25th percentile, median and 75th percentile of soil properties against depth (cm), from 0 to 100, for two groups: Meta-basalt and Granodiorite. Panels: Sand (%), Clay (%), Total Carbon (%), CIE Redness, pH, CEC (cmol [+] / kg), Fe Crystallinity, Mn Crystallinity, Total Ca (%), Total Fe (%), Total Zr (ppm), Total Si (%).]

Page 28: 2011 UseR AQP Talk

Pedotransfer functions

“Pedotransfer functions” are models that predict a soil attribute from one or more other soil attributes and environmental covariates.

From simple approaches using multivariate linear models...

... to state-of-the-art machine learning algorithms (ANN, random forests, etc.)

→ R is of course a natural platform for these tasks.
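As an illustration of the simple end of that range, here is a sketch of a linear pedotransfer function fitted with `lm()`. The data are synthetic and the relationship between CEC, clay and organic carbon is invented purely to generate a toy response; none of the numbers come from the talk.

```r
set.seed(42)

# Synthetic data, made up for illustration (not real measurements).
n <- 200
soil <- data.frame(
  clay = runif(n, 5, 60),   # clay content (%)
  oc   = runif(n, 0.2, 4)   # organic carbon (%)
)

# Assumed relationship, used only to simulate a response variable.
soil$cec <- 2 + 0.5 * soil$clay + 3 * soil$oc + rnorm(n, sd = 2)

# A simple pedotransfer function: multivariate linear model.
ptf <- lm(cec ~ clay + oc, data = soil)
print(coef(ptf))

# Predict CEC for a new, hypothetical sample.
new_sample <- data.frame(clay = 25, oc = 1.5)
pred <- predict(ptf, newdata = new_sample)
print(pred)
```

Swapping `lm()` for another model with a `predict()` method keeps the rest of the workflow unchanged.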

Page 30: 2011 UseR AQP Talk

Structuring soil data

Site Data (scale label: 100 meters)
    x, y, z
    slope, erosion, drainage, geology, ...
    species, abundance

Horizon Morphology Data (scale label: 50 - 200 cm)
    name, depths, roots, color, structure, texture, ...
    feature, abundance

Laboratory Data
    available {Na, K, Mg, Ca}, total {Si, Al, Fe}, CEC, pH, PSD, EC, SAR, ...

Page 31: 2011 UseR AQP Talk

Spatial analysis

Spatial analysis is an example where we need bindings to other packages.

Slice the soil collection on any depth interval

Cast it as an sp object (SpatialPointsDataFrame)

Apply sp/raster methods for spatial data analysis
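The first two steps can be sketched without any package. The tables and the helper `slice_at` below are hypothetical and only mimic the idea of slicing; they are not the aqp interface. The final promotion to an sp object is left as a comment because it requires the sp package.

```r
# Hypothetical horizon and site tables for 3 profiles.
horizons <- data.frame(
  id     = c("P1", "P1", "P2", "P2", "P3"),
  top    = c(0, 20, 0, 30, 0),
  bottom = c(20, 60, 30, 80, 50),
  clay   = c(12, 25, 18, 30, 22)
)
sites <- data.frame(id = c("P1", "P2", "P3"),
                    x = c(100, 250, 400), y = c(50, 80, 20))

# Slice: keep, for each profile, the horizon containing the requested depth.
slice_at <- function(hz, depth) {
  hz[hz$top <= depth & depth < hz$bottom, ]
}
sl <- slice_at(horizons, depth = 25)

# Attach site coordinates so the slice can be handed to spatial packages.
pts <- merge(sites, sl[, c("id", "clay")], by = "id")
print(pts)

# With the sp package installed, the table could then be promoted to a
# SpatialPointsDataFrame:
# library(sp)
# coordinates(pts) <- ~ x + y
```

Repeating the slice at several depths gives one point data set per depth, which is the kind of input a per-depth variogram analysis needs.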

Page 32: 2011 UseR AQP Talk

Spatial analysis

[Figure: Si:Al variogram models. Semivariance (0.00 to 0.30) against distance (50 to 250), with one curve per depth: 10 cm, 20 cm, 30 cm, 40 cm, 50 cm and 60 cm.]

Page 35: 2011 UseR AQP Talk

S4 Classes and Methods for Soil Information

Formal class 'SoilProfileCollection' [package ".GlobalEnv"] with 4 slots
  ..@ id      : Named chr "P001"
  .. ..- attr(*, "names")= chr "id"
  ..@ depths  : int [1:6, 1:2] 0 2 14 49 57 89 2 14 49 57 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:6] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:2] "top" "bottom"
  ..@ units   : chr "cm"
  ..@ horizons:'data.frame': 6 obs. of 4 variables:
  .. ..$ texture   : Factor w/ 14 levels "C","CBVSCL","FSL",..: 8 8 8 8 9 NA
  .. ..$ stickiness: Factor w/ 4 levels "MS","SO","SS",..: NA NA NA NA NA NA
  .. ..$ plasticity: Factor w/ 4 levels "MP","PO","SP",..: NA NA NA NA NA NA
  .. ..$ field_ph  : num [1:6] 7.9 7.7 8 7.9 7.4 NA
  ..@ site:'data.frame': 6 obs. of 2 variables:
  .. ..$ bound_topography: Factor w/ 3 levels "B","S","W": 2 2 2 2 2 NA
  .. ..$ elevation: int [1:6] 13 7 9 14 21 NA
  ..@ spatial:'SpatialPoints'
  ..@ metadata:'data.frame'

Page 38: 2011 UseR AQP Talk

S4 Classes and Methods for Soil Information

# Initialization
depths(spc) <- id ~ top + bottom

# Adding site data
site(spc) <- ~ slope + aspect + curvature + x + y + z

# Adding spatial coordinates (sp)
coordinates(spc) <- ~ x + y

# Accessors
units(spc)        # the units for depths
depths(spc)       # get depth matrix
depthsnames(spc)  # get names of depth columns
profile_id(spc)   # get profile IDs
horizons(spc)     # get horizon data as dataframe

# Overloads
min(spc)     # min depth within collection
max(spc)     # max depth within collection
length(spc)  # number of profiles

# Coercion
as.data.frame(spc)  # convert back to original dataframe

# dataframe-like interface
spc$property                        # read property
spc$property <- runif(length(spc))  # write property
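To show how such an interface can be built, here is a minimal S4 sketch using only the methods package. The class `MiniProfileCollection`, its slots and its validity rules are hypothetical simplifications for illustration; they are not the definition used in aqp.

```r
library(methods)

# Hypothetical, stripped-down class loosely modelled on the slots shown earlier.
setClass("MiniProfileCollection",
  representation(
    ids      = "character",   # profile IDs
    units    = "character",   # depth units
    horizons = "data.frame"   # one row per horizon: id, top, bottom, properties
  ),
  validity = function(object) {
    h <- object@horizons
    if (!all(c("id", "top", "bottom") %in% names(h)))
      return("horizons must contain id, top and bottom columns")
    if (any(h$top >= h$bottom))
      return("each horizon top must be shallower than its bottom")
    TRUE
  }
)

# Accessor: a generic plus a method for the class.
setGeneric("horizons", function(object) standardGeneric("horizons"))
setMethod("horizons", "MiniProfileCollection", function(object) object@horizons)

# Overloads: length() = number of profiles, max() = deepest horizon bottom.
setMethod("length", "MiniProfileCollection", function(x) length(x@ids))
setMethod("max", "MiniProfileCollection",
          function(x, ..., na.rm = FALSE) max(x@horizons$bottom, na.rm = na.rm))

hz <- data.frame(id = c("P001", "P001", "P002"),
                 top = c(0, 2, 0), bottom = c(2, 14, 30),
                 field_ph = c(7.9, 7.7, 8.0))
spc <- new("MiniProfileCollection",
           ids = as.character(unique(hz$id)), units = "cm", horizons = hz)

length(spc)  # number of profiles
max(spc)     # max depth within collection
```

The validity function runs whenever an object is created with `new()`, so malformed horizon tables are rejected before any method sees them.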

Page 40: 2011 UseR AQP Talk

Conclusions

Soil science is changing, and needs new tools

aqp aims to provide soil scientists with an R-based pedometrics platform

R is a great platform for soil science:

The ultimate Excel-killer!
Data and covariates in the same object
Provides advanced visualisations of soil data
Huge source of regression/classification methods...
... but also allows you to test, extend and create methods
Spatial-literate environment
Supports big data (parallelisation backends)
