Appendix A Algorithms
This section includes a detailed description of data handling and processing for the determination of the maximum (MS), beginning (BS) and end (ES) of the SMD.
On request, program code is available from the authors in the open source data analysis and programming language R (R Development Core Team 2005).
Method A: points of inflexion
MS:
1) logarithmic transformation of values with ln(x+1) in order to diminish the importance of outliers and
counting uncertainties in case of low values and to ensure positive values;
2) kernel smoothing with Gaussian filter (cf. Venables and Ripley 1999) and bandwidth h = 4 · dm,
where dm is the median sampling interval; this is regarded as strong smoothing to suppress twofold
maxima;
3) determination of time window between ice-out and midyear;
4) if values decline after ice-out, the time window starts at the first pit (local minimum);
5) determination of the next peak as MSA.
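
The MS steps above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code. It assumes that the bandwidth h is used directly as the standard deviation of the Gaussian kernel (the original implementation may scale it differently) and that the smoothed curve is evaluated at the sampling dates. The function names are hypothetical.

```python
import math
from statistics import median


def gaussian_smooth(times, values, h):
    """Kernel-weighted mean with Gaussian weights (assumption: h = kernel SD)."""
    smoothed = []
    for t0 in times:
        weights = [math.exp(-0.5 * ((t - t0) / h) ** 2) for t in times]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


def find_ms(times, values, ice_out, midyear):
    """Return the date of the first peak of the smoothed ln(x+1) series."""
    logged = [math.log(v + 1.0) for v in values]            # step 1
    dm = median(b - a for a, b in zip(times, times[1:]))     # median interval
    smooth = gaussian_smooth(times, logged, 4.0 * dm)        # step 2
    window = [i for i, t in enumerate(times)
              if ice_out <= t <= midyear]                    # step 3
    k = 0
    # Step 4: while values decline after ice-out, move on to the first pit.
    while k + 1 < len(window) and smooth[window[k + 1]] < smooth[window[k]]:
        k += 1
    # Step 5: climb from there to the next peak.
    while k + 1 < len(window) and smooth[window[k + 1]] >= smooth[window[k]]:
        k += 1
    return times[window[k]]
```

Dates are plain numbers here (for example day of year), so the returned value is one of the input sampling dates.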
BS:
1) kernel smoothing with Gaussian filter and bandwidth h = 2 · dm; this is regarded as minimum
smoothing in order not to distort the time series and to obtain an equidistant time spacing;
2) logarithmic transformation of values with ln(x+1); here performed after smoothing to keep
deviations from the original data small;
3) differencing with a time lag of one week;
4) determination of difference peaks (i.e. the inflexion points) in the time window from ice-out to MSA;
5) if necessary (i.e. no peak after ice-out), the time window is extended to the beginning of the year;
6) selection of the highest difference peak in the time window (i.e. the strongest increase) as BSA.
ES:
1) using differences of smoothed logarithmic values from BS determination;
2) determination of difference pits in the time window from MSA to midyear;
3) if necessary (i.e. no peak until midyear), the time window is extended to the end of the year;
4) selection of the lowest difference pit in the time window (i.e. the strongest decrease) to be ESA.
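
The BS and ES steps of Method A can be sketched together, since both work on the same differenced series. The sketch below is illustrative Python, not the authors' R code. It assumes a smoothed logarithmic series on a daily equidistant grid, so that list indices are days and a lag of 7 is one week, and it assigns each difference to the first day of its one-week step. All function names are hypothetical.

```python
def lagged_diff(series, lag):
    """Difference of an equidistant series with the given lag."""
    return [series[i + lag] - series[i] for i in range(len(series) - lag)]


def strongest_change(diffs, start, stop, sign):
    """Index of the highest local peak (sign=+1) or lowest local pit
    (sign=-1) of diffs within [start, stop]; None if there is none."""
    best = None
    lo, hi = max(start, 1), min(stop, len(diffs) - 2)
    for i in range(lo, hi + 1):
        here = sign * diffs[i]
        left = sign * diffs[i - 1]
        right = sign * diffs[i + 1]
        is_extremum = here > left and here >= right
        if is_extremum and (best is None or here > sign * diffs[best]):
            best = i
    return best


def find_bs_es(smooth_log, ice_out, ms, midyear, lag=7):
    """Return (BS, ES) as indices into the daily series."""
    d = lagged_diff(smooth_log, lag)
    bs = strongest_change(d, ice_out, ms, +1)
    if bs is None:                      # extend window to start of year
        bs = strongest_change(d, 0, ms, +1)
    es = strongest_change(d, ms, midyear, -1)
    if es is None:                      # extend window to end of year
        es = strongest_change(d, ms, len(d) - 1, -1)
    return bs, es
```

Either result can still be None if even the extended window contains no local extremum; a caller would have to handle that case.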
Method B: Weibull function
This method is based on a nonlinear regression of the values to function w (Eq. 1) and requires linear interpolation between values in some cases. It proved to be advantageous when fewer than 35 samples were taken until midyear or when the median sampling interval was longer than 10 days. It is also recommended when very few dates form the peak, i.e. when fewer than 5 values are above the mean value. For the cardinal date determination, the fitted curve wf is used.
MS:
1) the maximum value of wf is MSB.
BS:
1) determination of the integral of wf that denotes the area below the peak;
2) subtraction of the baselines (d+1) ·(1−a) before and d after MSB;
3) BSB denotes the 10 % quantile of the area before the peak.
ES:
1) ESB denotes the 90 % quantile of the area after the peak.
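
The area quantiles of Method B can be illustrated with a short Python sketch (not the authors' R code). It takes an already fitted curve wf sampled on a time grid and does not perform the Weibull regression itself. Several points are assumptions made for illustration: the two baselines are passed in as plain numbers, values below the baseline are clipped to zero, the area is integrated with the trapezoidal rule, and "10 % quantile" is read as the date by which 10 % of the area before the peak has accumulated. The function names are hypothetical.

```python
def area_quantile(times, curve, baseline, q):
    """Time at which the cumulative baseline-corrected area reaches q."""
    excess = [max(c - baseline, 0.0) for c in curve]   # assumption: clip at 0
    cum = [0.0]
    for i in range(1, len(times)):
        step = 0.5 * (excess[i] + excess[i - 1]) * (times[i] - times[i - 1])
        cum.append(cum[-1] + step)
    target = q * cum[-1]
    for i in range(1, len(times)):
        if cum[i] >= target:
            span = cum[i] - cum[i - 1]
            frac = 0.0 if span == 0 else (target - cum[i - 1]) / span
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return times[-1]


def cardinal_dates_b(times, wf, base_before, base_after):
    """Return (MS, BS, ES) from a fitted curve wf sampled at times."""
    k = max(range(len(wf)), key=wf.__getitem__)        # MS: curve maximum
    ms = times[k]
    bs = area_quantile(times[: k + 1], wf[: k + 1], base_before, 0.10)
    es = area_quantile(times[k:], wf[k:], base_after, 0.90)
    return ms, bs, es
```

For a symmetric triangular test curve with zero baselines, BS and ES come out equally far from the peak, which is a quick plausibility check of the quantile logic.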
Method C: linear segments
This method is based upon the assumption of an exponential increase and decline of the SMD and uses the logarithmic values.
MS:
1) selection of one of the dates with a value higher than half the maximum value as preliminary MSC.
BS:
1) the first constant segment is the median in the interval between the first and the next sampling
date;
2) the second increasing segment connects the median at the respective sampling date with MSC;
3) repetition of both steps for each sampling date;
4) determination of the mean square error between the combined segments and measured values;
5) selection of date BSC for which the deviation is at its minimum.
ES: ESC is determined analogously to BSC, with a declining and a constant segment.
The determination of MS, BS and ES is conducted for all dates with values higher than half the maximum value. The final choice is the set of dates with the minimum deviation from the measured values.
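
The BS search of Method C can be sketched for one preliminary MSC as follows. This is illustrative Python, not the authors' R code, and it reflects one reading of the steps: for each candidate date k, the constant segment is the median of the logarithmic values from the first date up to k, and the increasing segment joins that level at k to the measured value at MSC. The function name and arguments are hypothetical.

```python
import math
from statistics import median


def fit_bs(times, logv, m):
    """Return (index of BSC, mean square error) for the MSC at index m.

    times: sampling dates, logv: logarithmic values, m: index of MSC.
    """
    best_k, best_mse = None, math.inf
    for k in range(1, m):
        level = median(logv[: k + 1])                  # constant segment
        slope = (logv[m] - level) / (times[m] - times[k])
        fitted = [
            level if i <= k else level + slope * (times[i] - times[k])
            for i in range(m + 1)                      # increasing segment
        ]
        mse = sum((f - y) ** 2
                  for f, y in zip(fitted, logv[: m + 1])) / (m + 1)
        if mse < best_mse:
            best_k, best_mse = k, mse
    return best_k, best_mse
```

Under the same reading, ESC would be obtained by running this search on the time-reversed series after MSC, and the whole procedure would be repeated for every candidate MSC, keeping the combination with the smallest error.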
Appendix B Mann-Kendall Statistics
The description of the statistics follows Yue and Wang (2004).
Given a time series x of length n with equal time intervals, the slope of its trend, b, results from
$$ b = \operatorname{median}\left( \frac{x_j - x_l}{j - l} \right) \quad \forall\; l < j \le n $$
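
As a minimal sketch of the slope estimate, the median of all pairwise slopes can be computed directly in Python. The function name is hypothetical, and list positions stand in for the equally spaced time steps.

```python
from statistics import median


def sen_slope(x):
    """Median of (x[j] - x[l]) / (j - l) over all pairs with l < j."""
    n = len(x)
    return median(
        (x[j] - x[l]) / (j - l)
        for l in range(n - 1)
        for j in range(l + 1, n)
    )
```

Because a median is used, a single outlier changes the estimate very little: the series 1, 3, 50, 7, 9 yields the same slope as 1, 3, 5, 7, 9.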
The corresponding MK statistic is defined as
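$$ S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sign}(x_j - x_i) \quad \text{where} \quad \operatorname{sign}(z) = \begin{cases} 1 & z > 0 \\ 0 & z = 0 \\ -1 & z < 0 \end{cases} $$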
S is assumed to be approximately normally distributed when n ≥ 8, with mean E(S) = 0 and variance
$$ V(S) = \frac{1}{18} \left[ n(n-1)(2n+5) - \sum_{i=1}^{m} t_i (t_i - 1)(2 t_i + 5) \right], $$
where m is the number of ties of extent ti.
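
The statistic S and its tie-corrected variance can be sketched in a few lines of Python. The function names are hypothetical. The sketch assumes that ties are groups of exactly equal values and that t_i is the size of the i-th group.

```python
from collections import Counter


def sign(z):
    """Return 1, 0 or -1 for positive, zero or negative z."""
    return (z > 0) - (z < 0)


def mk_s(x):
    """Mann-Kendall statistic S: sum of signs over all pairs i < j."""
    n = len(x)
    return sum(
        sign(x[j] - x[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )


def mk_variance(x):
    """Variance of S with correction for tied groups."""
    n = len(x)
    ties = [t for t in Counter(x).values() if t > 1]   # sizes of tied groups
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in ties)
    return (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
```

For a strictly increasing series of length 8, every one of the 28 pairs contributes +1, so S = 28 and V(S) = 8 · 7 · 21 / 18.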
The standardised test statistic of the MK test is given by
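$$ z = \begin{cases} (S - 1) / \sqrt{V(S)} & S > 0 \\ 0 & S = 0 \\ (S + 1) / \sqrt{V(S)} & S < 0 \end{cases} $$
and the corresponding p-value for the one-tailed test is determined by
$$ p = 0.5 - \Phi(|z|) \quad \text{with} \quad \Phi(|z|) = \frac{1}{\sqrt{2\pi}} \int_0^{|z|} \exp\left( -\frac{t^2}{2} \right) dt $$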
At the 5 % significance level (p < 0.05), an existing trend is considered statistically significant, i.e. unlikely to be caused by random sampling.
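
The standardised statistic and the one-tailed p-value can be sketched in Python as follows. The function names are hypothetical. The integral from 0 to |z| of the standard normal density is expressed through the error function as 0.5 · erf(|z| / √2), which is a standard identity rather than something stated in the source.

```python
import math


def mk_z(s, var_s):
    """Standardised Mann-Kendall statistic with continuity correction."""
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


def mk_p_one_tailed(z):
    """One-tailed p-value: 0.5 minus the normal area between 0 and |z|."""
    phi = 0.5 * math.erf(abs(z) / math.sqrt(2.0))
    return 0.5 - phi


# Example: S = 28 and V(S) = 1176 / 18 for a strictly increasing
# series of length 8.
z = mk_z(28, 1176 / 18)
p = mk_p_one_tailed(z)
significant = p < 0.05
```

In this example the trend is significant at the 5 % level, as expected for a strictly monotonic series.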