trim for first users

TRIM Workshop Arco van Strien Wildlife statistics Statistics Netherlands (CBS)


Post on 28-Dec-2016


Page 1: Trim for first users

TRIM Workshop
Arco van Strien
Wildlife statistics, Statistics Netherlands (CBS)

Page 2: Trim for first users

What is TRIM?
• TRends and Indices for Monitoring data
• Computer program for the analysis of time series of count data with missing observations
• Loglinear, Poisson regression (GLM)
• Made for the production of wildlife statistics by Statistics Netherlands (Jeroen Pannekoek / freeware / version 3.0)

Introduction

Page 3: Trim for first users

Why TRIM?
• To get better indices? No, a GLM in a statistical package (S-PLUS, Genstat, ...) may produce similar results
• But statistical packages are often impractical for large datasets
• TRIM is easier to use

Introduction

Page 4: Trim for first users

The program of this workshop
Aim: a basic understanding of TRIM
• basic theory of imputation
• how to use TRIM to impute missing counts and to assess indices etc.
• basic theory of the weighting procedure to cope with unequal sampling of areas & how to use TRIM to weight particular sites

Introduction

Page 5: Trim for first users

Site    Year 1  Year 2  Year 3  Year 4  Year 5
1       20      10      8       2       3
2       20      10      12      3       2
3       16      8       10      3       3
4       8       4       6       6       5
5       10      5       7       7       8
Sum     74      37      43      21      21
Index   100     50      58      28      28

Introduction

INDEX: the total (= sum of all sites) for a year divided by the total of the base year, expressed as a percentage (base year = 100)
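The index definition can be sketched in a few lines of Python. This is only an illustration on the five-site example table; the names `counts`, `yearly_totals` and `indices` are invented for this sketch and are not part of TRIM.

```python
# Counts per site (rows) and year (columns), copied from the example table.
counts = [
    [20, 10, 8, 2, 3],   # site 1
    [20, 10, 12, 3, 2],  # site 2
    [16, 8, 10, 3, 3],   # site 3
    [8, 4, 6, 6, 5],     # site 4
    [10, 5, 7, 7, 8],    # site 5
]


def yearly_totals(table):
    """Sum the counts of all sites for each year (column sums)."""
    return [sum(year) for year in zip(*table)]


def indices(totals, base_year=0):
    """Express each yearly total as a rounded percentage of the base year."""
    base = totals[base_year]
    return [round(100 * total / base) for total in totals]


totals = yearly_totals(counts)
print("Sum:  ", totals)
print("Index:", indices(totals))
```

With complete data the sums and indices follow directly from the counts; the rest of the workshop deals with what to do when some counts are missing.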

Page 6: Trim for first users

Site    Year 1  Year 2  Year 3  Year 4  Year 5
1       20      10      8       2       3
2       20      10      12      3       2
3       16      8       10      3       3
4       8       4       6       -       5
5       10      5       7       -       8
Sum     74      37      43      8??     21
Index   100     50      58      11??    28

Missing values affect indices

Theory imputation

Page 7: Trim for first users

Site    Year 1  Year 2
1       2       4
2       1       ? >> 2
Sum     3       ? >> 6
Index   100     ? >> 200

How to impute missing values?

ESTIMATION OF SITE 2 IN YEAR 2? SITE 1 SUGGESTS: TWICE THE NUMBER OF YEAR 1 (site & year effect taken into account)

Theory imputation

Page 8: Trim for first users

Site    Year 1  Year 2
1       1       2
2       3       ? >> 6
Sum     4       ? >> 8
Index   100     ? >> 200

Another example ...

ESTIMATION OF SITE 2 IN YEAR 2? SITE 1 SUGGESTS: TWICE THE NUMBER OF YEAR 1

Theory imputation

Page 9: Trim for first users

Site    Year 1  Year 2
1       1       3
2       3       ? >> 9
Sum     4       ? >> 12
Index   100     ? >> 300

And another example ...

ESTIMATION OF SITE 2 IN YEAR 2? SITE 1 SUGGESTS: THREE TIMES AS MANY AS IN YEAR 1
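The reasoning in these three small examples can be written as a short Python sketch: the year-to-year change at site 1 is applied to the year 1 count of site 2. The function names and the tuple layout are assumptions made for this illustration; TRIM itself fits a model rather than copying the change of a single site.

```python
def impute_from_reference(ref_year1, ref_year2, target_year1):
    """Apply the reference site's year-to-year change to the target site."""
    change = ref_year2 / ref_year1
    return target_year1 * change


def year2_index(site1_year1, site1_year2, site2_year1):
    """Return (imputed count, year 2 sum, year 2 index) for a two-site table."""
    imputed = impute_from_reference(site1_year1, site1_year2, site2_year1)
    sum_year1 = site1_year1 + site2_year1
    sum_year2 = site1_year2 + imputed
    return imputed, sum_year2, 100 * sum_year2 / sum_year1


# (site 1 year 1, site 1 year 2, site 2 year 1) for the three examples.
examples = [(2, 4, 1), (1, 2, 3), (1, 3, 3)]
results = [year2_index(*example) for example in examples]
for imputed, total, index in results:
    print(f"imputed = {imputed:g}, sum = {total:g}, index = {index:g}")
```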

Theory imputation

Page 10: Trim for first users

Site    Year 1  Year 2
1       ?       4
2       1       ?
Sum     ?       ?
Index   100     ?

Try this one ...

THERE IS NO UNIQUE SOLUTION (TRIM will report an ERROR)

Theory imputation

Page 11: Trim for first users

Site    Year 1  Year 2  Year 3  Year 4  Year 5
1       20      10      8       2       3
2       -       -       12      3       2
3       16      8       10      3       3
4       8       4       6       -       5
5       10      5       7       -       8
Sum     ?       ?       43      8??     21
Index   100     50      58      11??    28

Difficult to guess the missing values here ...

Theory imputation

Page 12: Trim for first users

Site    Year 1  Year 2  Margin
1       2       4       6
2       1       ?       1
Margin  3       4       7

Estimating missing values by an iterative procedure (REQUIRED IN CASE OF MORE THAN A FEW MISSING VALUES)

Theory imputation

Page 13: Trim for first users

Site    Year 1  Year 2      Margin
1       2       4           6
2       1       ? >> 0.6    1 >> 1.6
Margin  3       4 >> 4.6    7 >> 7.6

First estimate of site 2, year 2: 1 x 4/7 = 0.6

RECALCULATE THE MARGIN TOTALS AND REPEAT THE ESTIMATION OF THE MISSING VALUE

Theory imputation

Page 14: Trim for first users

Site    Year 1  Year 2          Margin
1       2       4               6
2       1       0.96 >>>> 2     >>>> 3
Margin  3       >>>> 6          >>>> 9
Index   100     200

2nd estimate of site 2, year 2: 1.6 x 4.6/7.6 = 0.96

REPEAT AGAIN: MISSING VALUE = 1.22, 1.40, 1.54 ETC. ... >> 2
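The iterative procedure can be sketched in Python as follows. Each step estimates the missing cell as (row margin x column margin) / grand total, adds that estimate to the margins and repeats. The function name, the tolerance and the iteration cap are assumptions of this sketch; as noted in the workshop, TRIM uses a much faster algorithm.

```python
def impute_missing_cell(row_observed, col_observed, grand_observed,
                        tol=1e-9, max_iter=1000):
    """Iteratively estimate one missing cell of a site-by-year table.

    The three arguments are the margin totals of the observed counts only,
    i.e. with the missing cell left out.
    """
    estimate = 0.0
    history = []
    for _ in range(max_iter):
        # Margins including the current estimate of the missing cell.
        row = row_observed + estimate
        col = col_observed + estimate
        grand = grand_observed + estimate
        updated = row * col / grand
        history.append(updated)
        converged = abs(updated - estimate) < tol
        estimate = updated
        if converged:
            break
    return estimate, history


# Site 2 has 1 observed count, year 2 has 4, the whole table has 7.
estimate, history = impute_missing_cell(1, 4, 7)
print([round(value, 2) for value in history[:5]])
print(round(estimate, 3))
```

The first estimates climb slowly, in line with the sequence quoted above up to rounding, and the procedure settles at 2, which gives a year 2 total of 6 and an index of 200.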

Theory imputation

Page 15: Trim for first users

• To get proper indices, it is necessary to estimate (impute) missing values
• Missing values may be estimated from the margin totals using an iterative procedure (taking into account both the site effect and the year effect). Note: TRIM uses a much faster algorithm to impute missing values.
• Assumption: year-to-year changes are similar for all sites (this assumption will be relaxed later!)
• Test this assumption using a goodness-of-fit test (X2 test)

Theory imputation

Page 16: Trim for first users

Site    Year 1    Year 2    Margin
1       2 (1.8)   4 (4.2)   6
2       1 (1.2)   3 (2.8)   4
Margin  3         7         10

(expected counts in parentheses)

X2: COMPARE EXPECTED COUNTS WITH REAL COUNTS PER CELL

X2 IS THE SUMMATION OF (COUNTED - EXPECTED VALUE)^2 / EXPECTED VALUE:
(2-1.8)^2/1.8 + (4-4.2)^2/4.2 ETC. >> X2 = 0.08 WITH A P-VALUE OF 0.78 >> MODEL NOT REJECTED (FITS, but note: cell values in this example are too small for a proper X2 test)

Theory imputation
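The X2 calculation for this 2 x 2 table can be reproduced with a short Python sketch. Expected counts are taken as row margin x column margin / grand total, and the p-value uses the closed form for a chi-square distribution with 1 degree of freedom, which is what a 2 x 2 table has. The function names are invented for this illustration, and the p-value helper is valid for 1 degree of freedom only.

```python
import math


def expected_counts(table):
    """Expected cell counts under the 'same year-to-year change' model."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (counted - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


def p_value_one_df(x2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x2 / 2))


observed = [[2, 4], [1, 3]]  # sites in rows, years in columns
x2 = chi_square(observed)
p = p_value_one_df(x2)
print(round(x2, 2), round(p, 2))
```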

Page 17: Trim for first users

Site    Year 1  Year 2    Year 3  Year 4    Year 5
1       20      10        8       2         3
2       20      10        12      3         2
3       16      ? (7.5)   10      3         3
4       8       4         6       ? (2.3)   5
5       10      5         7       7         8
Sum     74      36        43      17        21
Index   100     49        58      23        28

Imputation without covariate (X2 = 18 and p-value = 0.18)

Theory imputation

Page 18: Trim for first users

Site    Year 1  Year 2        Year 3  Year 4        Year 5
1       20      10            8       2             3
2       20      10            12      3             2
3       16      7.5 >> 9.1    10      3             3
4       8       4             6       2.3 >> 5.4    5
5       10      5             7       7             8
Sum     74      36 >> 38      43      17 >> 20      21
Index   100     49 >> 51      58      23 >> 28      28

Using a covariate: better imputations & indices, X2 = 1.7, p = 0.99
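How the imputed values feed into the sums and indices can be checked with a small Python sketch. The imputed values (7.5 and 2.3 without a covariate, 9.1 and 5.4 with a covariate) are copied from the tables; the sketch does not compute them, since that requires fitting the model as TRIM does. All names are assumptions of this illustration.

```python
# Observed counts; None marks a missing count (site 3 year 2, site 4 year 4).
observed = [
    [20, 10, 8, 2, 3],
    [20, 10, 12, 3, 2],
    [16, None, 10, 3, 3],
    [8, 4, 6, None, 5],
    [10, 5, 7, 7, 8],
]


def imputed_indices(table, imputations, base_year=0):
    """Fill missing cells with the given values and return rounded indices.

    `imputations` maps (row, column) positions to the imputed value.
    """
    filled = [
        [imputations[(i, j)] if value is None else value
         for j, value in enumerate(row)]
        for i, row in enumerate(table)
    ]
    totals = [sum(year) for year in zip(*filled)]
    return [round(100 * total / totals[base_year]) for total in totals]


without_covariate = imputed_indices(observed, {(2, 1): 7.5, (3, 3): 2.3})
with_covariate = imputed_indices(observed, {(2, 1): 9.1, (3, 3): 5.4})
print(without_covariate)
print(with_covariate)
```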

Theory imputation

Page 19: Trim for first users

Model   X2    df    p-value
1       191   140   0.0026    <<< rejected
2       154   133   0.09      < not rejected
3       161   143   0.14      < not rejected

What is the best model?

Both model 2 and 3 are valid

Theory imputation

Page 20: Trim for first users

Summary imputation theory
• To get proper indices, it is necessary to impute missing values
• Assumption: year-to-year changes are similar for all sites of the same covariate category
• Test the assumption using a GOF test; if the p-value < 0.05, try better covariates
• If these cannot be found, the resulting indices may be of low quality (and standard errors high). See also the FAQs!

Theory imputation

Page 21: Trim for first users

The program of this workshop
Aim: a basic understanding of TRIM
• basic theory of imputation
• how to use TRIM to impute missing counts and to assess indices etc.
• basic theory of the weighting procedure to cope with unequal sampling of areas & how to use TRIM to weight particular sites

Using TRIM

Page 22: Trim for first users

Using TRIM
• several statistical models (time effects, linear model)
• statistical complications (overdispersion, serial correlation) taken into account
• Wald tests to test significance
• model versus imputed indices
• interpretation of slope

Using TRIM

Page 23: Trim for first users

Time effects model (skylark data) without covariate

Using TRIM

Page 24: Trim for first users

Time effects model with covariate (0 = total, 1 = dunes, 2 = heathland)

Using TRIM

Page 25: Trim for first users

Linear trend model (uses trend estimate to impute missing values)

Using TRIM

Page 26: Trim for first users

Linear trend model with a changepoint at year 2

Using TRIM

Page 27: Trim for first users

Linear trend model with changepoints at year 2 and 3

Using TRIM

Page 28: Trim for first users

Linear trend model with all changepoints = time effects model
Use the linear trend model when:
• data are too sparse for the time effects model
• one is interested in testing trends, e.g. trends before and after a particular year (or let TRIM search stepwise for relevant changepoints)

But be careful with simple linear models!

Using TRIM

Page 29: Trim for first users

Statistical complications:
• Serial correlation: dependence of counts on the counts of earlier years (0 = no correlation)
• Overdispersion: deviation from the Poisson distribution (1 = Poisson)

Using TRIM

Run TRIM with overdispersion = on and serial correlation = on; otherwise standard errors and statistical tests are usually invalid

Page 30: Trim for first users

Running TRIM features
• TRIM command file
• output: GOF (as X2) test and Wald tests
• output (fitted values, indices)
• indices, time totals
• overall trend slope
• Frequently Asked Questions
• different models (linear trend model, changepoints, covariate)

Using TRIM

Page 31: Trim for first users

Model run                             X2    df    p-value   Akaike's Information Criterion
1, all changepoints                   191   140   0.0026    -85
2, all changepoints plus covariate    154   133   0.09      -106
3, two changepoints plus covariate    161   143   0.14      -125

Using TRIM

Both 2 and 3 are valid. Model 3 is the sparsest (most parsimonious) model.

What is the best model?
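The comparison of the three model runs can be written down as a small Python sketch. It assumes the 0.05 rejection threshold mentioned in the imputation summary and the usual convention that a lower Akaike's Information Criterion is preferable; the dictionary layout and variable names are invented here. With substantial overdispersion, the Wald tests should be relied on instead.

```python
# Goodness-of-fit results of the three model runs, copied from the table.
model_runs = [
    {"run": "1, all changepoints", "x2": 191, "df": 140, "p": 0.0026, "aic": -85},
    {"run": "2, all changepoints plus covariate", "x2": 154, "df": 133, "p": 0.09, "aic": -106},
    {"run": "3, two changepoints plus covariate", "x2": 161, "df": 143, "p": 0.14, "aic": -125},
]

REJECTION_LEVEL = 0.05  # assumed threshold for the GOF test

# Keep the runs whose GOF test does not reject the model.
valid_runs = [run for run in model_runs if run["p"] >= REJECTION_LEVEL]

# Among the valid runs, prefer the one with the lowest AIC.
best_run = min(valid_runs, key=lambda run: run["aic"])

print([run["run"] for run in valid_runs])
print(best_run["run"])
```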

Page 32: Trim for first users

Model choice
• The indices depend on the statistical model!
• TRIM allows you to search for the best model using the GOF test, Akaike's Information Criterion and Wald tests
• In case of substantial overdispersion, one has to rely on the Wald tests

Using TRIM

Page 33: Trim for first users

Wald tests

Different Wald tests to test for the significance of:
• the trend slope parameters
• changes in the slope
• deviations from a linear trend
• the effect of each covariate

Using TRIM

Page 34: Trim for first users

TRIM generates both model indices and imputed indices

Using TRIM

Page 35: Trim for first users

Imputed vs model indices
Imputed indices: summation of real counts plus, for missing counts, model predictions. Closer to real counts (more realistic course in time).
Model indices: summation of model predictions of all sites. Often more stable.

Using TRIM

Usually model and imputed indices hardly differ!

Page 36: Trim for first users

TRIM computes both additive and multiplicative slopes

Additive  s.e.     Multiplicative  s.e.
0.0485    0.0124   1.0497          0.0130

Relation: ln(1.0497) = 0.0485

Using TRIM

Multiplicative parameters are easier to understand

Page 37: Trim for first users

Interpretation multiplicative slope
Slope of 1.05 means 5% increase a year

Using TRIM

A standard error of 0.013 means an approximate 95% confidence interval of the slope plus or minus 2 x 0.013 = 0.026. Thus, slope between 1.024 and 1.076

Or, a 2% to 8% increase a year = significantly different from 1 (the interval does not include 1, i.e. no change)
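The relation between the additive and the multiplicative slope, and the rule-of-thumb confidence interval, can be checked with a few lines of Python. The standard error of the multiplicative slope is approximated here with the delta method (multiplicative slope x additive standard error). That is an assumption of this sketch, chosen because it reproduces the reported 0.0130; it is not taken from the TRIM documentation.

```python
import math

additive_slope = 0.0485
additive_se = 0.0124

# The multiplicative slope is the exponent of the additive slope.
multiplicative_slope = math.exp(additive_slope)

# Delta-method approximation of its standard error (assumption of this sketch).
multiplicative_se = multiplicative_slope * additive_se

# Approximate 95% confidence interval: estimate plus or minus 2 standard errors.
lower = multiplicative_slope - 2 * multiplicative_se
upper = multiplicative_slope + 2 * multiplicative_se

# The trend is significant when the interval excludes 1 (no change).
significant = lower > 1 or upper < 1

print(round(multiplicative_slope, 4), round(multiplicative_se, 4))
print(round(lower, 3), round(upper, 3), significant)
```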

Page 38: Trim for first users

Summary use of TRIM:
• choice between time effects and linear trend model
• include overdispersion & serial correlation in models
• use GOF and Wald tests for better models and indices & to test hypotheses
• choice between model and imputed indices
• use multiplicative slope

Using TRIM

Page 39: Trim for first users

The program of this workshop
Aim: a basic understanding of TRIM
• basic theory of imputation
• how to use TRIM to impute missing counts and to assess indices etc.
• basic theory of the weighting procedure to cope with unequal sampling of areas & how to use TRIM to weight particular sites

Weighting

Page 40: Trim for first users

Unequal sampling due to
• stratified random site selection, with oversampling of particular strata. Weighting results in unbiased national indices.
• site selection by the free choice of observers, with oversampling of particular regions & attractive habitat types. Weighting reduces the bias of indices.

Weighting

Page 41: Trim for first users

To cope with unequal sampling:
• stratify the data, e.g. into regions and habitat types
• strata are expected to have different indices & trends
• weight strata according to (1) the number of sample sites in the stratum and (2) the area of the stratum
• or weight by population size per stratum

Weighting

Page 42: Trim for first users

Stratum  Total area  Area sampled        Weight factor
i        50          5 (undersampled)    2 (or 10)
k        50          10 (oversampled)    1 (or 5)

Weighting factor for each stratum

Weighting factor for stratum i = total area of i / area of i sampled

Weighting

Page 43: Trim for first users

Stratum  Total area  Area sampled        Weight factor
i        100         5 (undersampled)    100/5 = 20 (or 4)
k        50          10 (oversampled)    50/10 = 5 (or 1)

Another example ...

Weighting factor for stratum i = total area of i / area of i sampled

Weighting
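The weighting factor formula can be illustrated with a short Python sketch covering both examples. The alternative values given with "or" appear to be the same weights rescaled so that the smallest weight equals 1; the sketch computes both forms. The function names and the dictionary layout are assumptions made for this illustration.

```python
def weight_factors(strata):
    """Weight per stratum = total area / area sampled.

    `strata` maps a stratum name to (total area, area sampled).
    """
    return {name: total / sampled for name, (total, sampled) in strata.items()}


def relative_weights(weights):
    """Rescale the weights so that the smallest weight equals 1."""
    smallest = min(weights.values())
    return {name: weight / smallest for name, weight in weights.items()}


first_example = weight_factors({"i": (50, 5), "k": (50, 10)})
second_example = weight_factors({"i": (100, 5), "k": (50, 10)})

print(first_example, relative_weights(first_example))
print(second_example, relative_weights(second_example))
```

Only the ratio between the weights of the strata matters for the combined index, which is why either form can be used.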

Page 44: Trim for first users

Weighting in TRIM
• include the weight factor (different per stratum) in the data file for each site and year record
• weight strata and combine the results to produce a weighted total (= run TRIM with weighting = on and covariate = on)

Weighting

Page 45: Trim for first users

Indices for Skylark, unweighted (0 = total index, 1 = dunes, 2 = heathland)

Weighting

Page 46: Trim for first users

Indices for Skylark with weight factor for each dune site = 10 (0 = total index, 1 = dunes, 2 = heathland)

Weighting

Page 47: Trim for first users

Final remarks

To facilitate the calculation of many indices on a routine basis:
• TRIM in batch mode, using TRIM Command Language (see manual)
• Option to incorporate TRIM in your own automation system (Access or Delphi or so) (not in manual)

Page 48: Trim for first users

That's all, but:
• if you have any questions about TRIM, see the manual, the FAQs in TRIM or mail Arco van Strien [email protected]

Success!