Simultaneous estimation of monotone trends and seasonal patterns in time series of environmental data By Mohamed Hussian and Anders Grimvall


Post on 17-Jan-2016



Outline

Examples of monotone relationships in environmental data

Monotone regression in one or more independent variables

Simultaneous estimation of monotone trends and seasonal patterns

Monte Carlo methods for constrained least squares regression

Simple averaging techniques

[Figure: Tot-P concentrations (Brunsbüttel) versus water discharge (Neu Darchau) in the Elbe River, mean values for April 1985-2000. x-axis: monthly mean runoff (m3/s); y-axis: monthly mean tot-P concentrations (mg/l).]

[Figure: Average monthly ozone concentrations versus humidity at Ähtäri in central Finland. x-axis: humidity (%); y-axis: monthly average ozone (µg/m3); series: season 1 (Jan-Mar), season 2 (Apr-Jun), season 3 (Jul-Sept), season 4 (Oct-Dec).]

[Figure: Monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River. Axes: year, month, Tot-N concentrations (mg/l).]

[Figure: Tot-N concentrations (Brunsbüttel) in the Elbe River, mean values for July 1985-2000. x-axis: year; y-axis: Tot-N concentrations (mg/l).]

Monotone regression (isotonic or antitonic regression)

Given a set of two-dimensional data $\{(x_i, y_i)\}_{i=1}^{n}$

Sort the data by $x$ into $\{(x_{(i)}, y_{(i)})\}_{i=1}^{n}$

Minimise

$$\frac{1}{n} \sum_{i=1}^{n} (y_{(i)} - \hat{y}_{(i)})^2$$

under the constraints

$$\hat{y}_{(1)} \le \hat{y}_{(2)} \le \cdots \le \hat{y}_{(n)}$$

or

$$\hat{y}_{(1)} \ge \hat{y}_{(2)} \ge \cdots \ge \hat{y}_{(n)}$$

A well-known algorithm used to solve the problem is the PAV algorithm (pool-adjacent-violators algorithm) (Ayer, 1955; Barlow et al., 1972; Hanson et al., 1973).
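As an illustration of the pooling idea behind the PAV algorithm, here is a minimal Python sketch. It is not taken from the cited references; the function names `pav` and `pav_decreasing` and the optional `weights` argument are assumptions made for this example, and the input is assumed to be already sorted by x with strictly positive weights.

```python
def pav(y, weights=None):
    """Least-squares nondecreasing fit to y (assumed already sorted by x)."""
    n = len(y)
    if weights is None:
        weights = [1.0] * n  # assumption: equal, strictly positive weights
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for value, w in zip(y, weights):
        blocks.append([float(value), float(w), 1])
        # Pool adjacent blocks for as long as they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the blocks back to one fitted value per observation.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def pav_decreasing(y, weights=None):
    """Nonincreasing (antitonic) fit, obtained by negating the data."""
    return [-v for v in pav([-v for v in y], weights)]
```

For example, `pav([1, 3, 2, 4])` pools the violating pair (3, 2) into its mean and returns `[1.0, 2.5, 2.5, 4.0]`.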

[Figure: Tot-N concentrations (Brunsbüttel) in the Elbe River, mean values for July 1985-2000, together with the PAV output. x-axis: year; y-axis: Tot-N concentrations (mg/l); series: Tot-N concentrations, the PAV output.]

The PAV Algorithm

If the data are already monotone, the PAV algorithm will reproduce them.

The solution is a step function.

If there are outliers, the PAV algorithm will produce long, flat levels.

The impact of outliers can be reduced by first smoothing the data (Friedman and Tibshirani, 1984).
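The first and third properties can be checked numerically. The sketch below does not use pooling; it evaluates the classical max-min representation of the isotonic least-squares fit (the fitted value at position i is the maximum over start points s ≤ i of the minimum over end points t ≥ i of the mean of y[s..t]), which gives the same solution as PAV for equally weighted data. The function name `isotonic_minmax` is an assumption for this example, and the cubic running time makes it suitable for small illustrations only.

```python
def isotonic_minmax(y):
    """Nondecreasing least-squares fit via the max-min formula (O(n^3))."""
    n = len(y)
    # Prefix sums let us compute the mean of any block y[start..end] quickly.
    prefix = [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)

    def block_mean(start, end):
        return (prefix[end + 1] - prefix[start]) / (end - start + 1)

    fitted = []
    for i in range(n):
        best = max(
            min(block_mean(s, t) for t in range(i, n))
            for s in range(i + 1)
        )
        fitted.append(best)
    return fitted


# Already monotone data are reproduced unchanged.
monotone_fit = isotonic_minmax([1, 2, 3, 5])

# A single high outlier (10) is averaged with everything after it,
# which produces one long flat level at the end of the series.
outlier_fit = isotonic_minmax([1, 2, 10, 3, 4, 5])
```

In the second call the outlier and the three values that follow it share one fitted level, which is the "long, flat level" behaviour described above.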

Monotone regression in two independent variables

An algorithm for computing the least squares regression function that is constrained to be nondecreasing in each of several independent variables was developed by R. Dykstra & T. Robertson, 1984. The algorithm was written specifically for two independent variables, and it produces the solution of

$$\min_{G \in K} \sum_{i,j} (g_{ij} - x_{ij})^2 \, w_{ij},$$

where $(x_{ij})$ is a given two-dimensional array of the original values; $(w_{ij})$ is a nonnegative array of weights; and $K$ is the class of two-dimensional arrays $G = (g_{ij})$ such that $g_{ij} \le g_{kl}$ whenever $i \le k$ and $j \le l$.
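One way to approach this two-variable problem is to alternate one-dimensional monotone fits along rows and along columns, carrying a correction term from each step into the next (Dykstra's alternating-projection scheme). The sketch below follows that idea under simplifying assumptions: all weights are equal to one, and the names `monotone_2d`, `sweeps` and `tol` are invented for this example. Whether it matches the cited algorithm in every detail is an assumption.

```python
def pav(values):
    """One-dimensional nondecreasing least-squares fit (equal weights)."""
    blocks = []  # each block: [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Compare block means without dividing: s1/c1 > s2/c2.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out


def transpose(a):
    return [list(col) for col in zip(*a)]


def monotone_2d(x, sweeps=500, tol=1e-12):
    """Fit g with g[i][j] <= g[k][l] whenever i <= k and j <= l (unit weights)."""
    rows, cols = len(x), len(x[0])
    col_corr = [[0.0] * cols for _ in range(rows)]
    g = [[float(v) for v in row] for row in x]
    for _ in range(sweeps):
        # Row step: monotone fit along each row of x plus the column correction.
        target = [[x[i][j] + col_corr[i][j] for j in range(cols)] for i in range(rows)]
        g_row = [pav(row) for row in target]
        row_corr = [[g_row[i][j] - target[i][j] for j in range(cols)] for i in range(rows)]
        # Column step: monotone fit down each column of x plus the row correction.
        target = [[x[i][j] + row_corr[i][j] for j in range(cols)] for i in range(rows)]
        g_col = transpose([pav(col) for col in transpose(target)])
        col_corr = [[g_col[i][j] - target[i][j] for j in range(cols)] for i in range(rows)]
        change = max(abs(g_col[i][j] - g[i][j]) for i in range(rows) for j in range(cols))
        g = g_col
        if change < tol:
            break
    return g
```

After each sweep the columns are exactly monotone and the rows are monotone up to the convergence tolerance, so the result should be read as an approximation whose accuracy depends on `sweeps` and `tol`.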

Limitations of classical algorithms for monotone regression in two or more explanatory variables

Inefficient for relatively small data sets

Cannot handle typical multiple regression data where at least one of the explanatory variables is continuous

Unclear how seasonality can be handled

[Figure: Monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River. Axes: year, month, Tot-N concentrations (mg/l).]

[Figure: Example of linear trend with a superimposed trigonometric seasonal pattern. Axes: year, month (Jan, Apr, Jul, Oct), y.]

Monte Carlo methods for constrained least squares regression (Method I)

Let $y_1, y_2, \ldots, y_N$ denote a time series of data collected over $m$ seasons.

Let $\hat{y}_i$ denote the sum of the trend and seasonal components at time $i$.

Determine $\hat{y}_i$ by minimising

$$S = \sum_i (y_i - \hat{y}_i)^2$$

under the following constraints:

Monte Carlo methods for constrained least squares regression (Method I)

Monotonicity constraints

$\hat{y}_i$ is either decreasing or increasing for each season:

$$\hat{y}_i \ge \hat{y}_{i+m}, \quad i = 1, \ldots, N - m,$$

or

$$\hat{y}_i \le \hat{y}_{i+m}, \quad i = 1, \ldots, N - m.$$

Seasonality constraints

The seasonal pattern is composed of convex and concave curve pieces, i.e.,

$$(\hat{y}_{i-1} - 2\hat{y}_i + \hat{y}_{i+1})(\hat{y}_{i-1+m} - 2\hat{y}_{i+m} + \hat{y}_{i+1+m}) \ge 0, \quad i = 2, \ldots, N - m - 1,$$

for all time points belonging to a given season.
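Both families of constraints compare a fitted value with the value m time steps later, so they are easy to verify for a candidate fit. The following sketch checks them for a plain Python list; the function names are assumptions for this example, and the list indices are 0-based, so the range i = 2, ..., N - m - 1 in the notation above becomes 1, ..., N - m - 2 in the code.

```python
def is_monotone_by_season(fit, m, increasing=True):
    """Check that each season's values are monotone from year to year."""
    pairs = list(zip(fit, fit[m:]))  # (fit[i], fit[i + m]) for all valid i
    if increasing:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)


def has_stable_curvature(fit, m):
    """Check that the second difference never changes sign between
    a time point and the same season one year later."""
    def second_difference(i):
        return fit[i - 1] - 2 * fit[i] + fit[i + 1]

    return all(
        second_difference(i) * second_difference(i + m) >= 0
        for i in range(1, len(fit) - m - 1)
    )


# Two years of four seasons: the same seasonal shape, shifted upwards by 0.5.
good = [1.0, 3.0, 2.0, 0.0, 1.5, 3.5, 2.5, 0.5]
# Second year with a different seasonal shape and a drop in season 2.
bad = [1.0, 3.0, 2.0, 0.0, 1.0, 0.0, 2.0, 3.0]
```

Here `good` satisfies both checks with `m = 4`, while `bad` fails both.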

Algorithm (Method I)

General information

The problem is a classical quadratic optimisation problem.

The computational burden increases rapidly with the number of variables and constraints.

This burden can be a serious problem if the suggested algorithms do not take into consideration the special features of the constraints.

Algorithm (Method I)

Theoretical solution: given a crude initial estimate $\hat{y}_i^0$, $i = 1, \ldots, N$, of $\hat{y}_i$, $i = 1, \ldots, N$, form new estimates $\hat{y}_i^k$, $i = 1, \ldots, N$, $k = 1, 2, \ldots$, by employing an updating formula

$$\hat{y}_i^{k+1} = \hat{y}_i^k + h \, b_i^k, \quad i = 1, \ldots, N,$$

where

$b_i^k$, $i = 1, \ldots, N$, is a vector defining the shape of the adjustment;

$h$ is a scaling factor.

Algorithm (Method I)

[Figure: Shapes of the functions used for updating the response surface. Two panels; axes: year, season.]

Algorithm (Method I)

$h$ is determined in such a way that

$$S(h) = \sum_i (y_i - \hat{y}_i^k - h \, b_i^k)^2$$

is minimised and the desired constraints are satisfied.

Applying such a solution reduces the original multivariate optimisation problem to a sequence of univariate optimisation problems.
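The update scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the adjustment shape is a randomly placed tent function over the time index, only the "increasing for each season" constraint is enforced, and the constrained step is approximated by halving the unconstrained minimiser h = Σ (y_i − ŷ_i) b_i / Σ b_i² until the candidate is feasible. All function and parameter names are invented for this example.

```python
import random


def sse(y, fit):
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fit))


def is_feasible(fit, m):
    # Assumed constraint: each season is nondecreasing from year to year.
    return all(a <= b for a, b in zip(fit, fit[m:]))


def monte_carlo_fit(y, m, iterations=2000, seed=0):
    rng = random.Random(seed)
    n = len(y)
    fit = [sum(y) / n] * n  # crude initial estimate: constant, hence feasible
    for _ in range(iterations):
        # Random tent-shaped adjustment vector b (an assumed shape).
        centre = rng.randrange(n)
        half_width = rng.randint(1, n)
        b = [max(0.0, 1.0 - abs(i - centre) / half_width) for i in range(n)]
        # Unconstrained minimiser of S(h) = sum (y_i - fit_i - h * b_i)^2.
        num = sum((yi - fi) * bi for yi, fi, bi in zip(y, fit, b))
        den = sum(bi * bi for bi in b)  # positive, since b[centre] == 1
        h = num / den
        # Shrink h towards zero until the constraints hold. S(h) is a convex
        # quadratic, so any step between 0 and the minimiser cannot increase it.
        for _ in range(30):
            candidate = [fi + h * bi for fi, bi in zip(fit, b)]
            if is_feasible(candidate, m):
                fit = candidate
                break
            h *= 0.5
    return fit
```

Each iteration is a one-dimensional problem in h, which is the reduction to univariate optimisation described above. A more careful version would search for the largest feasible step instead of halving.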

[Figure (Method I): Response surface satisfying monotonicity and convexity constraints. Axes: year, month, fitted Tot-N concentration (mg/l).]

Simple averaging techniques (Method II)

Consider $y$ satisfying $y = m(x) + \varepsilon$, where $x = (x_1, x_2, \ldots, x_m)$ denotes a vector of $m$ explanatory variables and $m(x)$ is assumed to be monotone in $x$ (nondecreasing or nonincreasing).

Nondecreasing case: let $Z_x$ be an initial estimate of $m(x)$, which could be the data itself, then consider

$$M_1(x) = \min\{Z_{x'} : x \le x'\}$$

and

$$M_2(x) = \max\{Z_{x'} : x' \le x\}.$$
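For a small data set the two bounds can be computed directly from their definitions. The sketch below assumes that the ordering between points is the componentwise order (x ≤ x' when every coordinate of x is at most the corresponding coordinate of x'), that M1 is the minimum over points at or above x, and that M2 is the maximum over points at or below x. The names `leq` and `lower_upper` are invented for this example.

```python
def leq(a, b):
    """Componentwise partial order: a <= b in every coordinate."""
    return all(ai <= bi for ai, bi in zip(a, b))


def lower_upper(points, z):
    """Return (M1, M2) evaluated at every observed point.

    points: list of coordinate tuples; z: initial estimate at each point.
    M1(x) = min of z over points x' with x <= x'  (lower bound)
    M2(x) = max of z over points x' with x' <= x  (upper bound)
    """
    m1 = [min(zj for pj, zj in zip(points, z) if leq(p, pj)) for p in points]
    m2 = [max(zj for pj, zj in zip(points, z) if leq(pj, p)) for p in points]
    return m1, m2


# One explanatory variable, four observations.
points = [(1,), (2,), (3,), (4,)]
z = [2, 1, 4, 3]
m1, m2 = lower_upper(points, z)
```

Every point is comparable with itself, so neither set is ever empty and M1(x) ≤ Z_x ≤ M2(x) holds at each observed point. The quadratic cost of this direct evaluation is acceptable for illustration only.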

Simple averaging techniques (Method II)

[Figure: Tot-N concentrations with the lower limit (M1 values) and the upper limit (M2 values). x-axis: year; y-axis: Tot-N concentrations (mg/l); series: Tot-N conc., lower limit, upper limit.]

Simple averaging techniques (Method II)

For $0 \le \alpha \le 1$, the set of estimators

$$M_\alpha(x) = (1 - \alpha) \, M_1(x) + \alpha \, M_2(x)$$

are nondecreasing in $x$, and work well for light-tailed error distributions (Strand, 2003; Mukerjee & Stern, 1994).

The estimate of $\alpha$ is the value that minimises

$$\sum_x [Z_x - M_\alpha(x)]^2,$$

which is

$$\hat{\alpha} = \frac{\sum_x [Z_x - M_1(x)] \, [M_2(x) - M_1(x)]}{\sum_x [M_2(x) - M_1(x)]^2}.$$
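A short one-dimensional sketch shows how the weight can be computed. It assumes the convention M_α = (1 − α) M1 + α M2, so that the least-squares weight is Σ (Z − M1)(M2 − M1) / Σ (M2 − M1)², and it assumes the observations are already sorted by x. The names `running_bounds`, `alpha_hat` and `averaged_fit` are invented for this example, and returning 0 when M1 and M2 coincide is a convention chosen here.

```python
def running_bounds(z):
    """One-dimensional M1 (suffix minimum) and M2 (prefix maximum)."""
    n = len(z)
    m1 = list(z)
    for i in range(n - 2, -1, -1):
        m1[i] = min(m1[i], m1[i + 1])
    m2 = list(z)
    for i in range(1, n):
        m2[i] = max(m2[i], m2[i - 1])
    return m1, m2


def alpha_hat(z, m1, m2):
    """Least-squares weight for M_alpha = (1 - alpha) * M1 + alpha * M2."""
    num = sum((zi - lo) * (up - lo) for zi, lo, up in zip(z, m1, m2))
    den = sum((up - lo) ** 2 for lo, up in zip(m1, m2))
    if den == 0:
        return 0.0  # data already monotone: M1 = M2 = Z, any alpha works
    return num / den


def averaged_fit(z):
    m1, m2 = running_bounds(z)
    alpha = alpha_hat(z, m1, m2)
    return [(1 - alpha) * lo + alpha * up for lo, up in zip(m1, m2)]
```

For `z = [2, 1, 4, 3]` the bounds are `[1, 1, 3, 3]` and `[2, 2, 4, 4]`, the weight is 0.5, and the fit is `[1.5, 1.5, 3.5, 3.5]`. Because M1 ≤ Z ≤ M2 at every point, the computed weight always lies between 0 and 1 without clipping.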

Simple averaging techniques (Method II)

Nonincreasing case: follow the same steps to create estimates based on $(X_i, -Y_i)$ instead of $(X_i, Y_i)$, and change the signs of the estimates back at the end to get the nonincreasing function.

Seasonality was handled by defining two monotone functions with respect to the seasons having high and low concentration values.

Simple averaging techniques (Method II)

[Figure: Observed response together with two monotone functions (function1, function2); the maximum and the minimum are marked.]

[Figure (Method II): Monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River. Axes: year, month, monthly mean Tot-N concentration (mg/l).]

[Figure (Method II): Fitted monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River, based on the observed data. Axes: year, month, fitted monthly mean Tot-N concentration (mg/l).]

[Figure (Method II): Smoothed monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River, lightly smoothed (bandwidth 0.05). Axes: year, month, monthly mean Tot-N concentration (mg/l).]

[Figure (Method II): Smoothed monthly mean concentrations of total nitrogen at Brunsbüttel in the Elbe River, strongly smoothed (bandwidth 0.3). Axes: year, month, monthly mean Tot-N concentration (mg/l).]

Simple averaging techniques in multidimensional case (Method II)

[Figure: Smoothed Tot-N transport (kton/month), February values 1985-2000. Axes: year, water discharge levels (10^9 m^3/month).]

Simple averaging techniques in multidimensional case (Method II)

[Figure: Fitted Tot-N transport (kton/month), February values 1985-2000. Axes: year, water discharge levels (10^9 m^3/month).]

Results

The two algorithms performed satisfactorily on water quality data from the Elbe River and other rivers.

Regardless of the features of the data sets that were examined, the obtained sequences of fitted surfaces converged to a function that could be interpreted as a sum of trend and seasonal components.

The components representing irregular variation provided a good starting point for the detection of outliers.

The major drawback of the Monte Carlo algorithm was the computational burden.

Results

Simple averaging techniques are efficient and work well for initial estimates that have light-tailed error distributions.

Simple averaging techniques are sensitive to outliers, and can have problems with sparse data.

For monotone functions, an alternative to using a large bandwidth is to use a slightly smaller bandwidth and then improve the accuracy by making the estimates monotone.
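The last point can be illustrated with a two-stage sketch: a light kernel smooth followed by a monotonising step. Everything in it is an assumption made for illustration: a Gaussian kernel, a bandwidth expressed in the units of x, and a fixed equal-weight average of the running lower and upper bounds instead of an estimated weight. The function names are invented for this example.

```python
import math


def kernel_smooth(x, y, bandwidth):
    """Gaussian-kernel weighted average of y evaluated at each x."""
    smoothed = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        smoothed.append(sum(wj * yj for wj, yj in zip(w, y)) / sum(w))
    return smoothed


def monotonise(z, alpha=0.5):
    """Make z nondecreasing by averaging its running lower and upper bounds."""
    n = len(z)
    lower = list(z)  # suffix minimum
    for i in range(n - 2, -1, -1):
        lower[i] = min(lower[i], lower[i + 1])
    upper = list(z)  # prefix maximum
    for i in range(1, n):
        upper[i] = max(upper[i], upper[i - 1])
    return [(1 - alpha) * lo + alpha * up for lo, up in zip(lower, upper)]


def smooth_then_monotonise(x, y, bandwidth):
    return monotonise(kernel_smooth(x, y, bandwidth))
```

With a small bandwidth the smoothed curve still wiggles, and the second stage removes the remaining reversals without widening the kernel.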


Conclusions

It is possible to combine non-parametric procedures with very natural constraints on the trend and seasonal components of time series of environmental data

The proposed procedures are so generally applicable that they can form the basis of fully-automatic systems for quality assessment and decomposition of time series of environmental data

Applications involving several explanatory variables or sparse data sets require further methodological work