11imsc, edinburgh, uk, 12-16 july 2010
DESCRIPTION
New techniques for detection and adjustment of shifts in daily precipitation series Xiaolan L. Wang 1,2 , H. Chen 3 , Y. Wu 2 , Y. Feng 1 , and P. Qiang 2. 1. Climate Research Division, Science & Technology Branch, Environment Canada - PowerPoint PPT PresentationTRANSCRIPT
New techniques for detection and adjustment of shifts in New techniques for detection and adjustment of shifts in
daily precipitation seriesdaily precipitation series
Xiaolan L. WangXiaolan L. Wang1,21,2, H. Chen, H. Chen33, Y. Wu, Y. Wu22, Y. Feng, Y. Feng11, and P. Qiang, and P. Qiang22
11IMSC, Edinburgh, UK, 12-16 July 201011IMSC, Edinburgh, UK, 12-16 July 2010
1.1. Climate Research Division, Science & Technology Branch, Environment CanadaClimate Research Division, Science & Technology Branch, Environment Canada
2. Department of Mathematics & Statistics, York University, Toronto, Canada2. Department of Mathematics & Statistics, York University, Toronto, Canada
3. Department of Mathematics & Statistics, Bowling Green State University, Ohio, USA3. Department of Mathematics & Statistics, Bowling Green State University, Ohio, USA
J. Appl. Meteor. Climatol. (accepted)
Our recent studies (Wang et al. 2007, Wang 2008a,b) 1. Propose two penalized tests, PMT and PMF, to even out the distribution of false alarm rates 2. Extend these penalized tests to account for the first order autocorrelation (red noise) 3. Propose a stepwise testing algorithm for detecting multiple changepoints in a single series
3. transPMF3. transPMFredred algorithm algorithm for detecting changepoints - in non-zero - in non-zero daily precipitationdaily precipitation series – typically non-Gaussian data series – typically non-Gaussian data - for use without a reference series - for use without a reference series
This study:
1. PMT1. PMTredred algorithm algorithm - for detecting mean shifts in - for detecting mean shifts in zero-trendzero-trend series with series with independent or AR(1)independent or AR(1) Gaussian noise Gaussian noise - for use with reference series - for use with reference series
k1 k2…
RHtestsV3 software package (R and FORTRAN; 220+ users from 55+ countries so far)
Background information:
2. PMF2. PMFredred algorithm algorithm - for detecting mean shifts in - for detecting mean shifts in constant trendconstant trend series with series with independent or AR(1)independent or AR(1) Gaussian noise Gaussian noise - can be used - can be used withoutwithout a reference series a reference series
k1 k2…
* Quantile Matching (QM) algorithm for adjusting quantile-dependent artificial shiftsQuantile Matching (QM) algorithm for adjusting quantile-dependent artificial shifts* RHtests_dlyPrcp software package for homogenization of RHtests_dlyPrcp software package for homogenization of dailydaily precipitation series precipitation series
tc - an unknown changepoint time
Nict
citXH
tXH
ii
iiia
iii
1 ,
1 ,:
against : test to
2
1
0
The PMF (TPR3) model for constant trend series (Wang 2008a and 2003):
The relevant model with Gaussian noise:
The test statistic for an unknown changepoint is a maximal F, not regular F statistic,
because of the need to search for the most probable point of change in a time series
t - independent or AR(1) Gaussian noise
tc
?
Also applicable to TPR3b model (Solow 1987) for a trend-change without an accompanying mean shift:tc
and TPR4 model (Lund & Reeves 2002) for a mean shift that may be accompanied by a trend-changetc
?
?
- Log transformation is often sufficient for monthly/annual total precipitation (Prcp) data series
recommend use the RHtestsV3 functions to test a log-transformed monthly/annual Prcp series
- Integrate a Box-Cox transformation in the PMFred algorithm, developing the transPMFred algorithm
& RHtests_dlyPrcp package for homogenization of daily precipitation data series
alleviates the limitation of the assumption of normal distribution in the RHtestsV3 package
Precipitation is typically not normally distributed; daily precipitation is not a continuous variable!
- Homogenization of daily precipitation data is much more challenging, and yet much needed for
characterizing extremes
Log-transformation is often not good enough; a data-adaptive transformation procedure is needed.
),...,2,1( 0 NiYi where is a series of non-zero daily precipitation amounts
0 ,log
0 ,/)1();(
i
iii
Y
YYhXBox-Cox transformation:
Yi can be other positive values, e.g., non-zero wind speedsThe gist of the transPMFred algorithm:
- For a set of trial λ values, use the PMFred algorithm to test each transformed series Xi- Use a profile log-likelihood statistic to find the best λ for the series being tested
A data-adaptive transformation, because different λ values (transformations) may be chosen for different series
To assess detection power of the transPMFred (nominal significance level: 5%)
Consider daily precipitation of 5 different distribution types (i.e., of different λ values: -0.2, -0.1, 0.0, 0.1, 0.2) log-normal distribution
For each distribution type (each λ): Block bootstrap 1000 surrogate series of N=600 from a homogeneous real precip. series whose λ is one of the five values
► False Alarm Rates (FARs) – apply the transPMFred to each of the homogenous surrogate series
Results: FARs are around the nominal level (5% here)
transPMFred for shift size:
transPMFred
PMFred
below 70%
above 95%
shorter upper tail longer upper tail
► Hit Rates (HRs):
– insert, at a randomly chosen position,
one shift to each surrogate series of N=600
then apply the new and old methods to detect
the inserted shift
Results: hit rates as a function of λ value
]10,10[ˆ :Hit KKk
HRs are all above 95%except for very small shifts
transPMFredfor shift size:
as a function of shift position K
HRs are basically independent of K
Quantile Matching (QM) algorithm – for adjusting quantile-dependent shifts,
i.e. shifts that affect not only the mean, but also the entire distribution of the data.
- regime dependent shifts- seasonality of shifts, e.g., …
Gist of QM adjustments – to match the distributions of different segments of the de-trended base series,
i.e., to diminish differences in the distribution caused by non-climatic factors.
to preserve in the QM-adjusted series the linear trend estimated from a multi-phase regression fit- important not to remove the natural trend!
Site moves at an Australian station quantile-dependent shifts:
de-seasonalized daily Tmin
QM-adjusted daily Tmin
Largest diff in the lowest 10% of daily Tmin
Mean-adjusted daily Tmin
var. diff. remains
Lord Howe Island daily Tmin
Site moves in Jan 1955 and Dec 1988
different variances
Larger effects onlow extremes
Seg. 1 Seg. 2 Seg. 3
Do these for each valueto be adjusted
Probability Distribution at Mq categories for each segment between-segment differences for each category & interpolate them by fitting splines (Mq=8 here):
Adjust to Seg. 3
For daily precipitation, the QM adjustments are estimated this way:
Precipitation trend component: );ˆ(ˆ 1b
tri
tri XhY i
tri tX ̂ˆ
triNi
tr YY ˆmaxˆ1max 0ˆ
m̂ax tri
tr YYDe-trended precipitation series:
0)ˆˆ(ˆmax tr
itr
idtri YYYY
Add these to the original series to make it homogeneous
Different quantiles inthe same segment could
be adjusted differently
Empirical Cumulative Frequency of the value to be adjusted
Adjustment if in Seg. 1
Adjustment if in Seg. 2
Examples to show:
1. The proposed new algorithm works well in detecting changepoints in real daily P
2. Small P are harder to measure with accuracy than larger P (larger %error)
– discontinuities often exist in freq. series of measured small P (e.g., P < 1 mm)
3. In the presence of frequency discontinuity,
any adjustment derived from the measured daily P is not good.
(e.g., ratio-based, Quantile-Matching)
One must address the issue of freq. discontinuity first!
The RHtestsV3 functions can be used to detect frequency discontinuities
Daily precipitation recorded at The Pas (Manitoba, Canada) for Jun 1st, 1910 to Dec 31st, 2007
- snowfall water equivalent; rainfall adjusted for wetting loses and gauge undercatch
(Mekis & Hogg 1999; and updates by Mekis)
- joining of two stns: 5052864 for up to 31 Dec. 1945, 5052880 1 Jan 1946 to 31 Dec. 2007
Same three changepoints detected
Examples of application
Both series have a very significant
changepoint near the time of joining of stations
Before including trace precipitation amounts, we have two Prcp data series for this site:
1. not adjusted for joining (noT_naJ)
2. has been adjusted for joining (noT_aJ)
Vincent & Mekis (2009):Ratio-based adjustments (used one rainfall ratio & one snowfall ratio for all data)
Results for the two series not including trace amounts(noT series):
Type Date Documented date of change(s) 1 4 Jul 1938 9 Oct 1937 to 8 Aug 1938: changes in gauge type, rim
height, observing frequency; poor gauge condition reported on 9 Oct 1937
1 24 Oct 1946 31 Dec 1945: joining of two nearby stations (5052864 + 5052880)
1 4 Oct 1976 16 Oct 1975 to 18 Oct 1977: gauge type change (standard at 12” rim height to Type B at 16” rim height)
Changes in the min. measurable amount (precision, unit) 1976-77
1945-46joining
1937-38
transPMFred detected the same 3 changepoints:
noT_naJ
-0.76 mm
1. noT_naJ (closest to original measurements):
2. noT_aJ (aJ changed the mean shift size from -0.76 mm to -0.73 mm)
The ratio-based adjustments for station joining failed to homogenize the series, because …
The discontinuities are mainly in the measurements of small precipitation (P ≤ 3 mm), especially in the frequency of measured small precipitation: Series of daily P > 3 mm – homogeneous!
noT_naJ > 3mm
noT_naJ > 3mm
No P < 0.3 mm or 0.4 < P ≤ 0.5 mm before 1976
noT_naJ
Much fewer0.3 ~ 0.4 mmuntil 1945 -joining point
0.21 mm from SWE
Much fewer0.5 ~ 1 mmuntil 1937
Good news for studying extreme (high) precipitation
Any ratio-based adjustments for joining are not good in this case, because larger P are adjusted more than smaller P when they should not be adjusted at all!
noT_aJ
The above frequency discontinuities largely remain:
b) IBC adjustments the Inverse Box-Cox (IBC) transformation of the fitted multi-phase regression lines
Homogenization of daily precipitation series – very challenging!!
Happy? – No!Because large P are adjusted similarly,while they should not be adjusted at all
wT_naJ
homogeneous
Seg. 1 Seg. 2
a) Ratio-based adjustments – bad in the presence of frequency discontinuity
We also tried
wT_naJ
c) QM adjustments
e.g., Quartile-Matching:(4 categories)
Adjust to last Seg. - Seg. 3
inhomogeneous
This is worse than the simple IBC adjustments!- still inhomogeneous; - larger absolute adjustments made to larger P
Seg. 1 Seg. 3Seg. 2
Quantile matching algorithms would work only if there is no discontinuity in the frequency, becausethey line up the adjustments by empirical frequency, implicitly assuming homogeneous frequencies. they should be used after all frequency discontinuities have been diminished!
How to address the issue of frequency discontinuity?
noT_naJ
1945-46
1955-56
PMFred algorithm
Flag more days with T? – which dates to flag? Needs obs’ of other variables, such as cloud, humidity…
At least, monthly and annual total Prcp can be adjusted to account for the frequency discontinuities,e.g., adjust the total trace amount in each month to that month’s current trace amount when
no trend in trace frequency
In spite of the uncertainty in the date of trace Prcp, adding days of a trace amount in the serieswould help obtain more accurate adjustments for other discontinuities using quantile-matching
Apply a homogeneity test to the frequency series, and homogenize the series if necessary, e.g.:
The frequency of reported trace occurrence at station The Pas is not homogeneous!
No trendAdding a trace amount for T-flagged days is
not good enoughin this case
Need to address the issue of frequency discontinuity!
But how?
Concluding remarks
Shall aim to get better insight into the cause (metadata) and characteristics of discontinuity
(e.g., freq.) before any attempt to adjust daily precipitation data!
In the presence of frequency discontinuity, any adjustment derived from the measured daily P is not good, no matter how it was derived!One must address the issue of frequency discontinuity before doing any adjustment (incl. QM)!
2) also test the frequency series of zero P and small P (e.g. Trace, ≤0.3 mm, 0.3-0.5 mm, …) (e.g., using the PMFred algorithm)
- Homogenization of precipitation data, especially daily P, is very challenging would recommend: 1) use transPMFred to test series of P > Pmin with different Pmin values
(e.g. 0.0, 0.3 mm, 0.4 mm, 0.5 mm, 1.0 mm, …)
should reflect changes in measurement precision/unit
- the new method, transPMFred, works well for both simulated and real daily precipitation data
References:Wang, X. L., H. Chen, Y. Wu, Y. Feng, and Q. Pu, 2010: New Techniques for detection and adjustment of shifts in daily precipitation data series. J. App. Meteor. Climatol, accepted subject to revision.
Wang, X. L., 2008a: Penalized maximal F test for detecting undocumented mean-shift without trend change.
J. Atmos. Oceanic Technol., 25 (No. 3), 368-384. DOI:10.1175/2007/JTECHA982.1Wang, X. L., 2008b: Accounting for autocorrelation in detecting mean-shifts in climate data series
using the penalized maximal t or F test. J. App. Meteor. Climatol, 47, 2423–2444.
Wang, X. L., Q. H. Wen, and Y. Wu, 2007: Penalized Maximal t-test for Detecting Undocumented Mean Change
in Climate Data Series. J. App. Meteor. Climatol., 46 (No. 6), 916-931. DOI:10.1175/JAM2504.1
Wan, H., X. L. Wang, and V. R. Swail, 2007: A Quality Assurance System for Canadian Hourly Pressure Data.
J. App. Meteor. Climatol., 46 (No. 11), 1804-1817.
Thank you very much for your attention!Thank you very much for your attention!Questions and/or comments?Questions and/or comments?
The RHtestsV3 and RHtests_dlyPrcp software packages are available free of charge at
http://cccma.seos.uvic.ca/ETCCDMI/software.shtml
- used by WMO ETCCDI in 12 training workshops so far (Expert Team on Climate Change Detection and Indices)