
Terminus Estimation from Landsat Image Intensity Profiles

Joseph Usset1, Arnab Maity2, Ana-Maria Staicu3 and Armin Schwartzman4

Abstract

Mountain glacier retreat is an important problem related to temperature increase caused by global climate change. The retreat of mountain glaciers has been studied from the ground, but there exists a need for automated methods to catalog glacial change with a wider scope. A viable approach is to extract intensity profiles from Landsat images along the glacial flowline and follow the terminus location over time. We propose a new robust and accurate statistical algorithm to estimate the movement of glacial termini over time from these extracted image intensity profiles. The method we propose first uses regression splines to smooth the image intensity profiles. For each profile the glacial terminus location is assumed to lie near a point of high negative change in the smoothed profiles. An approximate path of termini locations over time is obtained by an algorithm that seeks to minimize the cumulative first derivative value across the profiles. Spline smoothing is applied to this pilot path for estimation of long-term terminus movement. The predictions from the method are evaluated on simulated data and compared to available ground measurements for the Nigardsbreen, Gorner, Rhone, and Franz Josef glaciers.

Keywords: Change-Point Estimation; Cross-Validation; Non-parametric Regression; Satellite Imagery; Spline smoothing.

1Department of Biostatistics, University of Kansas, Kansas City, USA (E-mail: [email protected]). 2Department of Statistics, North Carolina State University, Raleigh, USA (E-mail: [email protected]). 3Department of Statistics, North Carolina State University, Raleigh, USA (E-mail: [email protected]). 4Department of Statistics, North Carolina State University, Raleigh, USA (E-mail: [email protected]).

1 Introduction

Mountain glacier retreat is an important problem for several reasons. Glaciers can be used as a proxy indicator for global climate change, and in particular temperature change [17]. Also, glacial retreat affects water supply in communities where glacial melt is a source of freshwater [5, 21]. Over the past several decades, there has been a widely reported retreat of mountain glaciers, and this retreat has accelerated in the past decade [18, 6, 17, 21, 5]. For these reasons, there is an interest in tracking mountain glacier retreat worldwide. One method for tracking glacial retreat has been ground measurements, but these can be difficult to obtain. Instead, [12] and [11] have proposed to locate and track glacier termini over time using Landsat images, which have been available to the public since 2009. This approach is based on image intensity profiles extracted from the Landsat images along the glacier flowline. In this paper we present an improved statistical algorithm used to locate and track glacial termini over time from the extracted image intensity profiles.

Figure 1 demonstrates the basic elements of our analysis, using the Nigardsbreen mountain glacier as an example. The top left plot shows a 2-dimensional (2-D) false color composite image of the glacier cropped from the Landsat archive at a given time frame. Drawn on it in yellow is a 1-dimensional (1-D) outline of the glacier flowline, produced semi-manually by the method of [12]. The red arrow indicates the terminus location, at the boundary of the ice and a sub-. The top right plot displays the extracted intensity profiles along the flowline from 15 spatially registered Landsat images (thermal band B62) between 2000 and 2012. The terminus location corresponds to the high-to-low transition in each profile. The bottom plots present the image profiles over time, where the terminus location corresponds to the blue-to-brown transition.
The estimated glacier terminus path from our proposed statistical algorithm is shown in yellow, with confidence bands shown in black and compared to ground measurements represented by the red dots. Note that the ground measurements were not used at all in the estimation, indicating how remarkably accurate the proposed method can be based on the Landsat images alone. In this paper we describe the statistical algorithm that produced the estimated glacier terminus path and confidence bands. The algorithm we propose receives as input the time series of 1-D spatial intensity profiles and gives as output the estimated terminus path over time. The proposed algorithm contains three main steps:

1. Smooth the intensity profiles and obtain the first derivative estimates of the smoothed profiles.

2. Input the estimated first derivative profiles into an algorithm that outputs a pilot path of the terminus location over time by minimizing the integrated first derivative.

3. Globally smooth the pilot path to estimate the long-term advance or retreat of the terminus.

Figure 1: Top left: 2-D cropped Landsat image of Nigardsbreen glacier with marked flowline (yellow) and terminus (red arrow). Top right: Extracted 1-D intensity profiles along the flowline. Bottom left: Extracted 1-D intensity profiles laid out over time, along with the estimated terminus path (yellow), confidence intervals (black), and ground measurements (red dots), with evaluated metrics in the legend (MIAE = 7.7, MAD = 12.9, COV = .90). Obstructed profiles appear as light blue stripes. Bottom right: Zoomed in display of estimated path, confidence intervals, and ground measurements. Time points where the image profiles were observed are marked by purple hash marks on the left of the image. Horizontal axes show space along the glacial path (meters); vertical axes show profile intensity (top right) and year (bottom panels).

The above algorithm, beginning with Step 1, is motivated by the fact that the terminus of the glacier is often marked by a sharp high-to-low transition of the intensity profile, which corresponds to a highly negative local minimum of the first derivative after smoothing (Figure 1). However, as we shall see in other glaciers, sharp local minima can also occur due to obstructive elements such as shadows from nearby mountains and debris. In addition, the glacier terminus itself may be occluded by clouds and seasonal snowfall, producing flat profiles such as those observed in Figure 1 (top right panel) and as blue horizontal stripes in Figure 1 (bottom panels). These sources of error, which we call systematic, are mostly deterministic but hard to model because not enough information about them is available. Step 2 overcomes the systematic error by finding a controlled sequence of locations over time that captures persistent local minima. Step 3 removes local noise.

The proposed algorithm follows roughly the same framework as [11], although there are some important differences in the approaches. In Step 1 of [11], first derivatives of the mean intensity profiles are estimated directly with local kernel smoothing. A drawback of this approach is the difficulty of determining a fixed local smoothing bandwidth that will provide a good fit globally over the entire spatial domain of the intensity profiles. We correct this by using global spline smoothing instead, fitted by penalized weighted least squares with generalized cross-validation. In Step 2 of [11], a tracking algorithm is applied to connect local minima of the first derivative between consecutive time points. However, because it forces the pilot path to pass through those local minima, it is sensitive to systematic noise. We fix this by globally minimizing the integrated first derivative over time. The proposed pilot path algorithm robustly finds the terminus path despite systematic error in the derivative profiles. Finally, in Step 3, [11] estimate the long-term terminus location change by local kernel smoothing of the pilot path. In addition to the bandwidth selection problem mentioned before, local smoothing gives imprecise or non-existent estimates over time periods where the image data is sparse. Instead, we gain more precise estimates of long-term glacial retreat by smoothing the pilot path with global spline smoothing.

From a broader statistical viewpoint, the work presented here relates to the problem of change point detection, which has been well studied in the context of nonparametric regression [15, 13, 10]. However, the specific data analysis problem treated here presents a unique methodological challenge; as opposed to identifying change points in an individual continuous curve, we seek a path of change points from a time series of continuous curves. Moreover, while multiple change points may exist in each intensity profile (e.g. due to shadows in the original images), we are interested in a sequence of change points whose location changes smoothly over time. The 3-step algorithm presented above solves this problem.

The paper is organized as follows. Section 2 provides a description of the data and how it was obtained. Section 3 discusses smoothing of the intensity profiles with penalized regression splines (Step 1), and shows how this facilitates smooth derivative estimates. Section 4 describes the pilot path algorithm that gives an initial estimate of the terminus path (Step 2), and compares this method to the tracking algorithm of [11]. Section 5 discusses temporal spline smoothing of the pilot path (Step 3), and evaluates several weighting schemes that can be used to estimate the long-term trend of the terminus location. Sections 6 and 7 include a numerical study and real data analyses of the Nigardsbreen, Gorner, Rhone, and Franz Josef glaciers. Section 8 summarizes our method and presents future directions for research.

2 Data Description

For a given mountain glacier, the proposed method takes as inputs Landsat image intensity profiles extracted along the glacier flowline. The image profiles used here were obtained using a Matlab graphical user interface (GUI) designed for this purpose; see [12]. Briefly, for every glacier considered, Landsat scenes taken at different time points were spatially registered and cropped around the glacier using user-defined geographical coordinates. The glacier flowline was outlined manually in the GUI on an arbitrary image of good quality and automatically smoothed; see Figure 1, top left. The flowline was then applied to all the spatially registered images to extract the intensity profiles at each time frame. To reduce the influence of shadows, the images used for each glacier were taken from either the thermal frequency band B62 or the Normalized Difference Snow Index (NDSI) processed band (defined as the difference between the visual frequency band B20 and the infrared frequency band B50, divided by their sum [9, 20, 12]).

For each glacier, the intensity profiles were extracted by sampling the registered images at $n_s$ locations $s_i$, $i = 1, \ldots, n_s$, along the flowline at constant arclength intervals, measured in meters. For $n_t$ time frames, this produced a data array $Y_{ij}$, $i = 1, \ldots, n_s$, $j = 1, \ldots, n_t$, where each column $j$ corresponds to the intensity profile at time $t_j$. While the locations $s_i$ are equally spaced, the times $t_j$ are not. According to the Landsat orbit, the satellite passes over the same location on the Earth every 16 days, but most of these images are not available to the public in the Landsat catalog, possibly because of quality issues.

The proposed method is applied here to four glaciers: Nigardsbreen (Norway), Gorner (Switzerland), Rhone (Switzerland), and Franz Josef (New Zealand). For each glacier, ground measurements for the location of the terminus were obtained from the World Glacier Monitoring Service database, and are used to evaluate the performance of the proposed method.
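As a small illustration of the NDSI definition above, the following sketch computes the index sample by sample along a profile. The function names and the toy band values are hypothetical, and the handling of a zero denominator is our own assumption rather than part of the processing in [12].

```python
def ndsi(visible, infrared):
    """NDSI for one sample: (visible - infrared) / (visible + infrared)."""
    total = visible + infrared
    if total == 0:
        return 0.0  # assumption: treat a zero denominator as "no signal"
    return (visible - infrared) / total


def ndsi_profile(visible_profile, infrared_profile):
    """Apply the index to paired samples taken along the flowline."""
    return [ndsi(v, r) for v, r in zip(visible_profile, infrared_profile)]


# Hypothetical band values at three locations along a flowline.
profile = ndsi_profile([200.0, 180.0, 60.0], [40.0, 60.0, 60.0])
```

Large values indicate a strong contrast between the two bands; the last sample, where both bands are equal, gives an index of zero.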

3 Spatial Smoothing

3.1 Model

In this section we discuss the modeling approach. At each time point $t_j$, the intensity profiles $(s_i, Y_{ij})$, $i = 1, \ldots, n_s$, are modeled non-parametrically as

$$Y_{ij} = \xi(s_i, t_j) + \epsilon_{ij}, \qquad \epsilon_{ij} \sim (0, \sigma_j^2), \tag{1}$$

where $\xi(s_i, t_j)$ represents the profile's mean function at spatial location $s_i$ and time $t_j$, and $\epsilon_{ij}$ is independently distributed error with variance $\sigma_j^2$. To facilitate the estimation of derivatives, we model the mean function of each profile as

$$\xi(s, t_j) = \sum_{k=1}^{K} \psi_k(s)\, a_{jk},$$

where $\psi_1(s), \ldots, \psi_K(s)$ are smooth pre-specified basis functions and $a_{j1}, \ldots, a_{jK}$ are unknown scalar coefficients; the dependence on $j$ is to emphasize that the coefficients are allowed to vary with $t_j$. The basis expansion approach is ideal because it allows us to calculate derivatives of the profile intensities by simply taking derivatives of the basis functions; i.e. for $l \geq 0$,

$$\xi^{(l)}(s, t_j) = \frac{\partial^l \xi(s, t_j)}{\partial s^l} = \sum_{k=1}^{K} \psi_k^{(l)}(s)\, a_{jk}, \tag{2}$$

where the superscript $(l)$ denotes the $l$th derivative. For the basis functions, we use B-spline bases of order 6 with knots placed at equally spaced quantiles. The specification of order 6 facilitates smooth estimates of the third derivatives, which we show in Section 5 can be used to improve the precision of the estimation procedure in the time domain. Next we discuss estimation of model components.

3.2 Estimation

For profile $j$, define $Y_j = (Y_{1j}, \ldots, Y_{n_s j})^T$ to be a column vector of observations, and $\Psi$ to be an $n_s \times K$ matrix whose entry $(i, k)$ equals $\psi_k(s_i)$. For each profile, the coefficients $a_j = (a_{j1}, \ldots, a_{jK})$ are found by the penalized weighted least squares criterion

$$\operatorname*{argmin}_{a_j} \left\{ (Y_j - \Psi a_j)^T (Y_j - \Psi a_j) + \lambda_j a_j^T a_j \right\},$$

where $\lambda_j$ is a roughness penalty parameter that controls the smoothness of the regressor function. The solution to this criterion is the well-known ridge regression estimator

$$\hat{a}_j = (\Psi^T \Psi + \lambda_j I)^{-1} \Psi^T Y_j.$$

The coefficients provide estimates of the derivatives (2) via

$$\hat{\xi}^{(1)}(s, t_j) = \sum_{k=1}^{K} \psi_k^{(1)}(s)\, \hat{a}_{jk}, \qquad \hat{\xi}^{(3)}(s, t_j) = \sum_{k=1}^{K} \psi_k^{(3)}(s)\, \hat{a}_{jk}. \tag{3}$$

Note that only the derivatives of order 1 and 3 are needed in what follows.
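The estimator and the derivative formulas in (3) can be sketched numerically as follows. This is a minimal illustration, not the implementation used in the paper: to keep it self-contained we swap the order-6 B-spline basis for a simple monomial basis whose derivatives are available in closed form, and the helper names (`monomial_basis`, `ridge_fit`), the rescaled coordinate `u`, and the toy profile are all assumptions.

```python
import numpy as np


def monomial_basis(u, K, deriv=0):
    """n x K matrix whose column p is the deriv-th derivative of u**p.

    Stand-in for the B-spline basis matrix Psi and its derivatives.
    """
    u = np.asarray(u, dtype=float)
    cols = []
    for p in range(K):
        if p < deriv:
            cols.append(np.zeros_like(u))
        else:
            # p * (p-1) * ... * (p-deriv+1); an empty product equals 1
            coef = np.prod(np.arange(p, p - deriv, -1))
            cols.append(coef * u ** (p - deriv))
    return np.column_stack(cols)


def ridge_fit(Psi, y, lam):
    """Ridge estimator (Psi^T Psi + lam I)^{-1} Psi^T y."""
    K = Psi.shape[1]
    return np.linalg.solve(Psi.T @ Psi + lam * np.eye(K), Psi.T @ y)


# Toy noiseless profile on a rescaled spatial coordinate u in [0, 1].
u = np.linspace(0.0, 1.0, 50)
y = 200.0 - 100.0 * u ** 2
Psi = monomial_basis(u, K=4)
a_hat = ridge_fit(Psi, y, lam=1e-8)

# Derivative estimates: differentiate the basis, keep the coefficients.
d1_hat = monomial_basis(u, K=4, deriv=1) @ a_hat
d3_hat = monomial_basis(u, K=4, deriv=3) @ a_hat
```

With a near-zero penalty the fit essentially reproduces the quadratic, so `d1_hat` is close to $-200u$. The point of the sketch is that the same coefficient vector serves every derivative order, which is what makes the basis expansion convenient.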

3.3 Basis and Penalty Specification

An important aspect of the spatial smoothing is the specification of the number of knots determining the regression matrix $\Psi$ and the penalty parameter $\lambda_j$ for each profile $j = 1, \ldots, n_t$. These model components determine the trade-off between the regression fit to the data and the smoothness of the fitted functions. Because the spatial sampling is the same for all the profiles, we use the same number of knots (and thus the same regression matrix $\Psi$) for all profiles and let the penalty parameter $\lambda_j$ adjust the fit to each profile as needed. To specify the bases and penalty we follow the established framework of [7] and [19], and set $\Psi$ to be a basis capable of capturing a flexible enough fit to the data so that the amount of smoothness is driven by the selection of $\lambda_j$. Specifically, we follow a rule of thumb set by [19] and let $K = \min(35, n_s/4)$, selecting the smoothing parameters $\lambda_j$ for each profile with the generalized cross-validation criterion (GCV) [8]. The GCV smoothing parameter tends to fit a regression function whose residuals resemble white noise, in agreement with our modeling assumptions in (1).
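For concreteness, a sketch of how the rule of thumb and the GCV selection might look in code is given below. It uses the standard GCV score $n \cdot \mathrm{RSS}(\lambda) / (n - \mathrm{tr}\, H_\lambda)^2$ with hat matrix $H_\lambda = \Psi(\Psi^T\Psi + \lambda I)^{-1}\Psi^T$. The grid of candidate values, the integer division in `num_basis`, the polynomial stand-in basis, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def num_basis(n_s):
    """Rule of thumb K = min(35, n_s / 4); integer division is an assumption."""
    return min(35, n_s // 4)


def gcv_score(Psi, y, lam):
    """GCV(lam) = n * RSS / (n - trace(H))^2 for the ridge hat matrix H."""
    n, K = Psi.shape
    H = Psi @ np.linalg.solve(Psi.T @ Psi + lam * np.eye(K), Psi.T)
    resid = y - H @ y
    return n * float(resid @ resid) / (n - np.trace(H)) ** 2


def select_lambda(Psi, y, grid):
    """Return the grid value with the smallest GCV score."""
    scores = [gcv_score(Psi, y, lam) for lam in grid]
    return grid[int(np.argmin(scores))]


# Toy profile: smooth signal plus white noise, with a polynomial stand-in basis.
rng = np.random.default_rng(0)
u = np.linspace(0.0, 1.0, 80)
Psi = np.vander(u, 6, increasing=True)
y = np.sin(2 * np.pi * u) + 0.1 * rng.standard_normal(u.size)
grid = [10.0 ** p for p in range(-8, 3)]  # hypothetical candidate penalties
lam_hat = select_lambda(Psi, y, grid)
```

A grid search is the simplest choice; a one-dimensional numerical optimizer over $\log \lambda$ would serve equally well.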

4 Pilot Path Algorithm

In this section, we describe a pilot path algorithm that takes as input the time series of first derivative profile estimates, and outputs a rough estimate of the glacier terminus location over time. We first give the motivation behind the algorithm and then describe the algorithm itself.

4.1 A global optimization criterion based on first derivatives

We are interested in the profile means $\xi(s, t_j)$ in (1) only insofar as they provide knowledge about the location of the glacier terminus over time. An essential assumption is that underlying each profile is a transition from glacial ice to land or water, causing a sharp downward trend in the intensity values over the terminus location. Therefore, in an ideal situation where the image profile is not obstructed by systematic error, the terminus location $g(t_j)$ at time $t_j$ is marked by the location where the first spatial derivative reaches its minimum, i.e.

$$g(t_j) = \operatorname*{argmin}_{s}\, \xi^{(1)}(s, t_j). \tag{4}$$

An example of such an ideal situation is shown in Figure 2. The top plot displays an intensity profile obtained from the Gorner glacier in June 1987. The ideal profile shape is illustrated by the blue sigmoid, fitted here with a logistic function with four parameters (left height, right height, transition location and transition slope). The nonparametric penalized spline fit $\hat{\xi}(s, t)$ in orange follows this general shape. More importantly, the estimated first derivative $\hat{\xi}^{(1)}(s, t)$ has its minimum nearly at the same location as the inflection point of the sigmoid, finding agreement in the estimation of the terminus location.

Figure 2: Top: Individual profile for Gorner from June 1987 (black circles), superimposed penalized regression spline fit (orange) and ideal mean function (blue). Bottom: Estimated first derivative (orange) and ideal first derivative (blue); hypothesized terminus location is marked in red. Horizontal axes show space in meters; vertical axes show intensity (top) and first derivative (bottom).

Unfortunately, as seen in the top right panel of Figure 1 and other examples below, most intensity profiles do not conform to the ideal sigmoidal shape because of systematic noise. Thus, (4) cannot be taken as a hard modeling assumption. Nevertheless, the intensity transition corresponding to the terminus location does appear often as a local minimum of the first derivative and tends to persist over time (except for occasional occlusions), while other intensity transitions do not.

Based on these observations, the pilot path algorithm is guided by two principles. The first stems from our main assumption in (4) and states that for a given time $t_j$, the terminus location will tend to lie near the point in the profile where the first derivative is minimal. Second, the location of the glacier terminus moves at a finite speed over time. This second principle helps rule out spurious intensity transitions (e.g. caused by shadows or debris) that are too far from the true terminus location to be reached by actual glacial retreat or advance. Based on the two principles, we define the pilot path as a sequence of locations over time that results in a minimal cumulative first derivative value across profiles, and moves at a finite speed. Mathematically, this pilot path is defined as a function $\tilde{g}(t)$ from the temporal to the spatial domain that minimizes

$$\tilde{g}(\cdot) = \operatorname*{argmin}_{g} \sum_{j=1}^{n_t} \hat{\xi}^{(1)}(g(t_j), t_j) \tag{5}$$

subject to the constraint

$$\frac{|\tilde{g}(t_{j+1}) - \tilde{g}(t_j)|}{t_{j+1} - t_j} < \rho \quad \text{for all } j. \tag{6}$$

The rate $\rho$ represents the maximum possible annual advance or retreat of the terminus location, and can be based on prior knowledge. Here we use a conservative value of $\rho = 2$ km/year, substantially higher than the speed of advance or retreat of most mountain glaciers.

4.2 The algorithm

To minimize criterion (5) subject to constraint (6) we take the following approach. We run $n_s$ paths from unique start locations in space $(s_1, \ldots, s_{n_s})$, both forward and backward in time. From each unique starting point, we seek a path that takes a minimum cumulative first derivative value over time. Each individual path is found with the following greedy approach:

1 Initialize $g_{Fi}(t_1) = s_i$.

for $j \in \{1, \ldots, n_t - 1\}$ do

    2 Choose a set $\{k\}$ such that $|s_{i+\{k\}} - g_{Fi}(t_j)| / (t_{j+1} - t_j) < \rho$.

    3 For $s^* \in (s_{i+\{k\}})$ define $g_{Fi}(t_{j+1}) = \operatorname*{argmin}_{s^*} \hat{\xi}^{(1)}(s^*, t_{j+1})$.

Store $g_{Fi}(t)$.

The backward algorithm is run similarly to the forward algorithm. Then, the collection of forward $\{g_{Fi}(t)\}_{i=1}^{n_s}$ and backward $\{g_{Bi}(t)\}_{i=1}^{n_s}$ paths are evaluated over the first derivative profiles at time points $(t_1, \ldots, t_{n_t})$, and their cumulative first derivative values are recorded. The path with the lowest cumulative first derivative score is set to be the pilot path:

$$\tilde{g}(\cdot) = \operatorname*{argmin}_{g \in (g_{F1}, \ldots, g_{Fn_s};\, g_{B1}, \ldots, g_{Bn_s})} \sum_{j=1}^{n_t} \hat{\xi}^{(1)}(g(t_j), t_j).$$

The algorithm is illustrated in Figure 3.
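The forward and backward greedy search described above can be sketched in a few lines. This is an illustrative reading of the steps, not the authors' code: the function names, the representation of the derivative profiles as a list of rows `d1[j][i]`, and the toy numbers (meters, years, and a gate of 150 m/year) are assumptions chosen so that the result can be checked by hand.

```python
def greedy_path(d1, s, t, start, rho):
    """One greedy path of location indices, starting at index `start`.

    d1[j][i] is the estimated first derivative at location s[i], time t[j].
    """
    path = [start]
    for j in range(len(t) - 1):
        dt = abs(t[j + 1] - t[j])
        # Gate: locations reachable from the current one at speed below rho.
        gate = [i for i in range(len(s))
                if abs(s[i] - s[path[-1]]) / dt < rho]
        # Move to the gated location with the lowest first derivative.
        path.append(min(gate, key=lambda i: d1[j + 1][i]))
    return path


def pilot_path(d1, s, t, rho):
    """Best of all forward and backward greedy paths, as spatial locations."""
    candidates = []
    for start in range(len(s)):
        candidates.append(greedy_path(d1, s, t, start, rho))
        backward = greedy_path(d1[::-1], s, t[::-1], start, rho)
        candidates.append(backward[::-1])  # restore chronological order

    def score(path):
        return sum(d1[j][i] for j, i in enumerate(path))

    best = min(candidates, key=score)
    return [s[i] for i in best]


# Toy example: the terminus moves from 200 m to 300 m, while a spurious
# deep minimum appears at 0 m in the second profile.
s = [0, 100, 200, 300, 400]
t = [0, 1, 2]
d1 = [
    [0, -1, -5, 0, 0],
    [-9, 0, -1, -6, 0],
    [0, 0, -1, -7, 0],
]
path = pilot_path(d1, s, t, rho=150)
```

In this toy example the selected path is 200, 300, 300: the spurious value of -9 at location 0 attracts the paths that start nearby, but their cumulative scores are worse than that of the persistent transition, which is the behavior the global criterion (5) is meant to produce.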

Figure 3: Left: The estimated profile first derivatives displayed over time for the Gorner glacier. From each start point, a path moves across time to the subsequent point with the lowest first derivative, subject to the rate constraint (shown by the gates). The red dots represent the estimated pilot path. Right: Observed intensity profiles for the Gorner glacier laid out over time, along with the corresponding pilot path in red. Horizontal axes show space along the glacial path (meters); vertical axes show year. The legend marks the pilot path, start location, and gates.

4.3 Pilot Path versus Tracking

This pilot path algorithm is a robust alternative to the tracking algorithm of [11]. The key difference is that we base estimation on the global criterion in (5). This leads to two specific differences between the approaches: 1) the starting points of the algorithm, and 2) the movement of the pilot path across time.

1) Starting Points

The pilot path algorithm begins at $2n_s$ unique start locations that correspond to combinations across spatial values $(s_1, \ldots, s_{n_s})$ and the time endpoints $t_1$ and $t_{n_t}$. In contrast, the tracking algorithm of [11] begins at the individual point that takes the lowest first derivative value across time and space for all the profiles. For tracking, heavy systematic error can cause the individual point with lowest estimated first derivative value to lie far from the true terminus location, and lead the path astray.

2) Movement Across Time Points

The pilot path algorithm moves across adjacent time points towards the point within the gated window that takes the lowest first derivative value. Tracking moves across adjacent time points to the nearest inflection point within the gated window; if no inflection point exists, the path maintains the same spatial location. A drawback of the tracking algorithm is that a spurious inflection within the gated window may lead the path off-track with an inability to recover. The pilot path algorithm, on the other hand, is not forced to go through inflection points, and recovery from spurious inflection points is possible because many alternative paths are compared according to a global optimization criterion.

In summary, the pilot path algorithm is a robust alternative to tracking that requires negligible extra computation. The main innovation is that we seek a minimum cumulative first derivative value in the profiles over time.

5 Temporal Smoothing

5.1 Model

The pilot path locations directly relate to the change in glacier length over time. However, these locations are observed at discrete times and estimated with error. Furthermore, there is not enough image data in the Landsat catalog (a few images a year at best) to track seasonal variability. However, we have enough data to estimate the long term trend.

To estimate the long term trend, we propose a nonparametric regression fit with global spline smoothing. In contrast to the spatial noise model, we assume independent heteroscedastic temporal noise between time frames. We propose the model

$$\tilde{g}(t_j) = g(t_j) + \theta_j, \qquad j = 1, \ldots, n_t, \tag{7}$$

where $g(t)$ is a smoothed representation of the glacial terminus location over time, $\tilde{g}(t)$ is the pilot path obtained earlier, and each $\theta_j$ is an independent noise term distributed $(0, \delta_j \sigma^2)$. The independence assumption in the errors is reasonable given that the time points are often months apart. The heterogeneous variance weights $\delta_j$ reflect the fact that some pilot path points are more accurate than others in their estimation of the terminus location. For example, pilot path points with high negative derivative correspond to sharper intensity transitions and thus tend to be closer to the glacier terminus location than points with first derivatives that are closer to zero.

5.2 Weights

To account for the heteroscedasticity, we consider three approaches to estimating proportional variances in the temporal smoothing model.

5.2.1 Equal Weights

The simplest approach is to give equal weights to each of the pilot path points in the least squares fitting: δj = 1 for all j. In this scenario, the long-term trend is estimated with ordinary least squares.

5.2.2 Third Derivative Weights

An alternative approach is obtained as a local approximation to the optimization criterion (5). Consider a quadratic approximation of the estimated first derivative profile $\hat{\xi}^{(1)}(s, t_j)$ about $s = \tilde{g}(t_j)$ (e.g. in the neighborhood of the red mark in the bottom panel of Figure 2). Evaluated at $s = g(t_j)$, this approximation states:

$$\hat{\xi}^{(1)}(g(t_j), t_j) \approx \hat{\xi}_0^{(1)}(t_j) + \hat{\xi}_0^{(2)}(t_j)\,\big(g(t_j) - \tilde{g}(t_j)\big) + \frac{\hat{\xi}_0^{(3)}(t_j)}{2}\,\big(g(t_j) - \tilde{g}(t_j)\big)^2, \tag{8}$$

where $\hat{\xi}_0^{(1)}(t_j) = \hat{\xi}^{(1)}(\tilde{g}(t_j), t_j)$, $\hat{\xi}_0^{(2)}(t_j) = \hat{\xi}^{(2)}(\tilde{g}(t_j), t_j)$, and $\hat{\xi}_0^{(3)}(t_j) = \hat{\xi}^{(3)}(\tilde{g}(t_j), t_j)$. Since the pilot path $\tilde{g}(t_j)$ is usually near a local minimum of the first derivative, we can further approximate $\hat{\xi}_0^{(2)}(t_j) \approx 0$. Plugging (8) into (5), and noting that the term $\hat{\xi}_0^{(1)}(t_j)$ does not depend on $g$, we obtain that the optimization criterion (5) becomes

$$\hat{g}(\cdot) \approx \operatorname*{argmin}_{g} \sum_{j=1}^{n_t} \hat{\xi}_0^{(3)}(t_j)\,\big(\tilde{g}(t_j) - g(t_j)\big)^2.$$

This corresponds to the weighted least squares fitting criterion in the proposed temporal smoothing model (7) with $\delta_j = 1/\hat{\xi}_0^{(3)}(t_j)$. Intuitively, the third derivative weights are desirable because they pull the smooth path $g(t_j)$ more strongly towards pilot path points $\tilde{g}(t_j)$ that are near local minima with sharp inflections and are thereby more accurate. Note that approximation (8) is representative of the terminus location only if $\tilde{g}(t_j)$ is near a local minimum rather than a local maximum, i.e. if $\hat{\xi}_0^{(3)}(t_j) > 0$. Therefore, pilot path points whose third derivative is negative (due to systematic noise) must be dropped from the model.
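The weighting rule and the exclusion of points with a non-positive third derivative can be written compactly. The sketch below is illustrative only; the function name and the toy third-derivative values are hypothetical.

```python
def third_derivative_weights(d3_at_pilot):
    """Return (kept_indices, delta) for the temporal model (7).

    d3_at_pilot[j] is the estimated third derivative at the pilot path
    location for time j. Points with a non-positive value are dropped;
    kept points receive the variance weight delta_j = 1 / third derivative.
    """
    kept = [j for j, v in enumerate(d3_at_pilot) if v > 0]
    delta = [1.0 / d3_at_pilot[j] for j in kept]
    return kept, delta


# Hypothetical third derivatives at three pilot path points.
kept, delta = third_derivative_weights([0.5, -0.1, 2.0])
```

Because $\delta_j$ is a variance weight, the least squares weight of a point is $1/\delta_j$, which is the third derivative itself: the sharper the inflection, the more strongly the point pulls the smooth path.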

5.2.3 Propagation

The final weights considered are adapted from [12] as another way of incorporating the accuracy of the pilot path estimates. This weighting scheme assumes the location of the pilot path is an inflection point in the estimated spatial intensity profile. By considering local minima of the first derivative as zero crossings of the second derivative, [12] shows that the location of an estimated inflection point $\tilde{g}(t_j)$ has an associated standard error of the form $\mathrm{SE}\big(\hat{\xi}^{(2)}(\tilde{g}(t_j), t_j)\big) / \hat{\xi}^{(3)}(\tilde{g}(t_j), t_j)$. For spatial model (1) with independent and identically distributed errors,

$$\widehat{\mathrm{Var}}\big\{\hat{\xi}^{(2)}(\tilde{g}(t_j), t_j)\big\} = \|\psi^{(2)}(t_j)\|^2\, \hat{\sigma}_j^2,$$

where $\psi^{(2)}(t_j) = \big(\psi_1^{(2)}(\tilde{g}(t_j)), \ldots, \psi_K^{(2)}(\tilde{g}(t_j))\big)^T$ is the vector of spatial spline derivatives used in (2) and (3).

In model (7), the propagation weight at each time $t_j$ is set equal to the variance of the inflection point location, $\delta_j = \widehat{\mathrm{Var}}\big[\hat{\xi}^{(2)}(\tilde{g}(t_j), t_j)\big] / \big[\hat{\xi}^{(3)}(\tilde{g}(t_j), t_j)\big]^2$.

5.3 Estimation

Similar to spatial smoothing of the image profiles described in Section 3, we use a spline basis expansion to facilitate long term estimation of the terminus location. From equation (7) we assume

$$g(t) = \sum_{l=1}^{L} \phi_l(t)\, \nu_l,$$

where $\phi_l(t)$ are cubic spline basis functions and $\nu = [\nu_1, \ldots, \nu_L]$ are scalar coefficients. Define $\Phi$ to be an $n_t \times L$ matrix whose entry $(j, l)$ equals $\phi_l(t_j)$, let $V$ be a diagonal matrix filled with weights $[\delta_1, \ldots, \delta_{n_t}]$ from Section 5.2, and $\tilde{g} = [\tilde{g}(t_1), \ldots, \tilde{g}(t_{n_t})]$. The coefficients are estimated with penalized weighted least squares as

$$\hat{\nu} = (\Phi^T V^{-1} \Phi + \lambda I)^{-1} \Phi^T V^{-1} \tilde{g}$$

and the estimated path of glacier termini is

$$\hat{g}(t) = \Phi \hat{\nu}.$$

The specification of the number of basis functions $L$, and selection of $\lambda$, follows the framework from subsection 3.3.

5.4 Confidence Intervals

To account for the bias induced by the smoothing parameter, we follow the standard approach to quantify uncertainty with penalized splines and use the Bayesian covariance matrix [22, 19, 23]. The estimated Bayesian covariance matrix takes the form

$$\widehat{\mathrm{Cov}}(\hat{\nu}) = (\Phi^T V^{-1} \Phi + \hat{\lambda} I)^{-1}\, \hat{\sigma}^2,$$

where we use the residual variance estimate

$$\hat{\sigma}^2 = \frac{\sum_{j=1}^{n_t} \big(\hat{g}(t_j) - \tilde{g}(t_j)\big)^2}{n_t - L}.$$

At time $t_j$, we find the standard error of the path estimate via

$$\widehat{\mathrm{SE}}(\hat{g}(t_j)) = \sqrt{\Phi(t_j)\, \widehat{\mathrm{Cov}}(\hat{\nu})\, \Phi(t_j)^T},$$

where $\Phi(t_j)$ is the $j$-th row of $\Phi$. The level $1 - \alpha$ confidence intervals are set as

$$\hat{g}(t_j) \pm z_{\alpha/2}\, \widehat{\mathrm{SE}}(\hat{g}(t_j)),$$

where $z_{\alpha/2}$ represents the $\alpha/2$ quantile of a standard normal. Confidence intervals fit with penalized regression splines are best interpreted in the "across-the-function" sense described by [22, 16, 14]. This interpretation states that the average coverage probability should be approximately $1 - \alpha$, but due to the bias in the fitting procedure the pointwise coverage may be uneven across time. These intervals are applied and evaluated in the simulation and real data analyses presented in the following sections.
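The estimation and interval formulas of Sections 5.3 and 5.4 translate almost line by line into code. The sketch below is a minimal illustration under stated assumptions: a two-column linear basis stands in for the cubic spline basis, the penalty is set to zero so the toy fit can be checked by hand, and the function name `smooth_path`, the critical value 1.96 (95% level), and the pilot path values are hypothetical.

```python
import numpy as np


def smooth_path(Phi, g_tilde, delta, lam, z=1.96):
    """Penalized weighted least squares fit with pointwise intervals.

    Phi: n_t x L basis matrix; g_tilde: pilot path; delta: variance weights.
    Returns the fitted path and the lower and upper interval limits.
    """
    n_t, L = Phi.shape
    V_inv = np.diag(1.0 / np.asarray(delta, dtype=float))
    M = np.linalg.inv(Phi.T @ V_inv @ Phi + lam * np.eye(L))
    nu_hat = M @ Phi.T @ V_inv @ g_tilde
    g_hat = Phi @ nu_hat
    sigma2 = np.sum((g_hat - g_tilde) ** 2) / (n_t - L)
    cov = M * sigma2  # estimated covariance of nu_hat
    # Row-wise quadratic form Phi(t_j) cov Phi(t_j)^T
    se = np.sqrt(np.einsum("jl,lm,jm->j", Phi, cov, Phi))
    return g_hat, g_hat - z * se, g_hat + z * se


# Toy pilot path with equal weights and a linear stand-in basis [1, t].
t = np.arange(5.0)
Phi = np.column_stack([np.ones_like(t), t])
g_tilde = np.array([2400.0, 2397.0, 2397.0, 2393.0, 2392.0])
g_hat, lower, upper = smooth_path(Phi, g_tilde, delta=np.ones(5), lam=0.0)
```

With equal weights and no penalty this reduces to an ordinary least squares line, here with slope $-2$ and intercept $2399.8$; unequal `delta` values would downweight the less reliable pilot path points.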

6 Simulation Study

This simulation study serves to evaluate the viability of our method, and to guide the specification of weights on the pilot path points.

6.1 Data Generation

We consider profiles observed at times $t_j$, $j \in \{1, 2, \ldots, 50\}$, and at equally spaced points $s_i$, $i \in \{1, 2, \ldots, 200\}$. The time intervals between the sampled profiles, $t_{j+1} - t_j$, follow a Poisson distribution with mean 12 months, while the spatial intervals are non-random and equal to 200 meters. As described in Section 2, the images in the real data are unequally spaced in time, and Poisson sampling is used as a simple way to simulate this unknown process. Within each profile intensity values are evaluated over the same spatial coordinates.

Each profile at time $t_j$ is generated independently according to model (1) with $\sigma_j^2 = 1$. The signal $\xi$ is generated as the sum of a sigmoidal function $\mu$ representing an underlying transition from glacial ice to land or water, and a systematic noise component $\zeta$. For each time period $t_j$, the sigmoidal function $\mu$ is generated independently as

$$\mu(s_i, t_j) = a_j + b_j\, \frac{1 - z_{ij}}{\sqrt{1 + z_{ij}^2}}, \qquad z_{ij} = \frac{s_i - g(t_j)}{200},$$

where $a_j$ is a random intercept distributed $N(200, 30)$ and $b_j$ is a random slope distributed $N(60, 20)$. The location of the glacier terminus $g(t_j)$ corresponds to the point of lowest first derivative in the sigmoid. The locations of the glacier terminus as a function of time take two forms:

1. $g_1(t_j) = 2400 - 2t_j$.

2. $g_2(t_j) = 2100 - 2t_j - 75 \sin(t_j/30)$.

Figure 4: Left: Simulated profiles for systematic noise $\zeta_1$. Middle: Profiles over time with terminus location $g_1(t)$ (red). Right: Profiles when terminus location is $g_2(t)$ (red). Horizontal axes show space along the path in meters; vertical axes show profile intensity (left) and months (middle and right).

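To generate systematic error we consider two settings:

1. $\zeta_1(s_i, t_j) = \delta_{j1} + \sum_{q=1}^{6} \big[ \delta_{j(2q+1)} \cos(q s_i) + \delta_{j(2q)} \sin(q s_i) \big]$

   • $\delta_{jq} \sim \mathrm{gamma}(\text{shape} = 2, \text{rate} = 250) - 500$

2. $\zeta_2(s_i, t_j) = \begin{cases} \zeta_1(s_i, t_j), & \text{with probability } 0.9 \\ 100, & \text{with probability } 0.1. \end{cases}$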

The cosine and sine waves that generate the systematic noise $\zeta_1$ correspond to Fourier basis functions. The weights for the noise are generated from a centered gamma distribution with mean 0. The specified asymmetry of the weights, and number of basis functions, are somewhat arbitrary choices, but were made to approximate visually the profiles from the real data analyses. The flat profiles introduced with probability 0.1 reflect the fact that some satellite images are cloudy and do not show the desired downward transition in the intensity profile. Examples of profiles generated by the above recipe are shown in Figure 4. The profiles exhibit visual resemblance to the real data profiles of Figure 1 and Figures 5, 6 and 7 below.

At each combination of path and systematic error we generate 200 Monte Carlo samples. For each sample we apply the versions of our algorithm where temporal smoothing uses the three types of weights (equal, third derivative, and propagation) detailed in Sections 5.2.1 to 5.2.3. Results are compared with the tracking procedure (as described in Section 4) in combination with the propagation weights, as done in [11].

6.2 Simulation Results

The results of the simulation are assessed in terms of three criteria:

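1. Mean integrated absolute error:

$$\mathrm{MIAE}(\hat g) = \frac{1}{200\, n_t} \sum_{i=1}^{200} \sum_{j=1}^{n_t} \left| \hat g_i(t_j) - g(t_j) \right|,$$

2. Mean maximum absolute deviation over the path:

$$\mathrm{MMAD}(\hat g) = \frac{1}{200} \sum_{i=1}^{200} \max_j \left| \hat g_i(t_j) - g(t_j) \right|,$$

3. Mean confidence coverage over the path:

$$\mathrm{MCI}(\hat g) = \frac{1}{200\, n_t} \sum_{i=1}^{200} \sum_{j=1}^{n_t} I\left[ g(t_j) \in \left\{ \hat g_i(t_j) \pm 2\, \widehat{SE}(\hat g_i(t_j)) \right\} \right].$$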
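The three criteria can be computed directly from the Monte Carlo output. The Python sketch below assumes that the estimated paths and their standard errors are stored as lists of lists indexed by Monte Carlo sample and time point; the function name `simulation_metrics` and the toy numbers are hypothetical and serve only to show the arithmetic.

```python
def simulation_metrics(estimates, std_errors, truth):
    """Return (MIAE, MMAD, MCI) for Monte Carlo path estimates.

    estimates[i][j]  : estimated terminus for sample i at time t_j
    std_errors[i][j] : estimated standard error of estimates[i][j]
    truth[j]         : true terminus location g(t_j)
    MCI is returned as a fraction; multiply by 100 for a percentage.
    """
    n_mc = len(estimates)
    n_t = len(truth)
    total_abs_error = 0.0
    total_max_error = 0.0
    total_covered = 0
    for g_hat, se in zip(estimates, std_errors):
        abs_errors = [abs(est - g) for est, g in zip(g_hat, truth)]
        total_abs_error += sum(abs_errors)
        total_max_error += max(abs_errors)
        # The truth lies in g_hat +/- 2 SE exactly when |g_hat - g| <= 2 SE.
        total_covered += sum(err <= 2 * s for err, s in zip(abs_errors, se))
    miae = total_abs_error / (n_mc * n_t)
    mmad = total_max_error / n_mc
    mci = total_covered / (n_mc * n_t)
    return miae, mmad, mci


# Toy example with 2 samples and 3 time points (hypothetical values).
truth = [1.0, 2.0, 4.0]
estimates = [[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]]
std_errors = [[0.4, 0.4, 0.4], [0.6, 0.6, 0.6]]
miae, mmad, mci = simulation_metrics(estimates, std_errors, truth)
```

In the simulation study the outer dimension would have length 200, matching the number of Monte Carlo samples.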

Table 1 displays the results for the case where the terminus location change is linear, g1(t). The pilot path algorithm outperforms tracking for each application of least squares weights and according to all metrics. Given that the pilot path algorithm is applied, when no flat profiles are present (error ζ1), temporal smoothing with third derivative weights performs best across all metrics, followed by the propagation weights. These methods outperform the equal weighting procedure because the third derivative weights pull the paths toward points of high inflection, which, on average, correspond to the true glacial terminus location.

Path   Error  Method      Weights           MIAE (m)  MMAD (m)  MCI (%)
g1(t)  ζ1     Pilot Path  Equal                  9.8      18.8     96.1
g1(t)  ζ1     Pilot Path  Third Derivative       6.0      11.5     97.5
g1(t)  ζ1     Pilot Path  Propagation            7.7      14.7     92.5
g1(t)  ζ1     Tracking    Propagation           45.9     100.1     83.2
g1(t)  ζ2     Pilot Path  Equal                 21.0      40.6     93.2
g1(t)  ζ2     Pilot Path  Third Derivative       8.9      16.6     96.7
g1(t)  ζ2     Pilot Path  Propagation           87.6     303.5     78.8
g1(t)  ζ2     Tracking    Propagation          114.4     360.6     86.8

Table 1: Simulation results for g1(t). All standard errors are less than 1.

Flat profiles (error ζ2) adversely affect all methods, but to varying degrees. The temporal smoothing procedure that uses third derivative weights is the most robust to flat profiles, as indicated by the small MIAE and MMAD, and it is the only method to maintain 95% confidence coverage on average. The equal weighting procedure is also fairly robust to the flat profiles, but not to the same extent. The fitting procedure based on propagation weights lacks robustness and performs poorly with flat profiles. Further investigation found that for flat profiles, the propagation weights tended to be very large. This gives undesirably high leverage to the flat profiles, which contain no information on the terminus location. Finally, the tracking method performs poorly because of the propagation weights and because a substantial number of pilot paths are led astray by systematic error.
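The leverage effect described above can be illustrated with a toy weighted least squares fit. The Python sketch below swaps in a straight-line fit for the temporal spline smoother and uses made-up weights, so it is not the weighting scheme of this paper; it only shows how a single uninformative pilot point pulls the fitted path when it carries a large weight. The names `weighted_line_fit` and `max_deviation`, the 500 meter displacement, and the weight values are hypothetical.

```python
def weighted_line_fit(t, y, w):
    """Weighted least squares fit of y = a + b * t; returns (a, b)."""
    sw = sum(w)
    t_bar = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, t))
    sxy = sum(wi * (ti - t_bar) * (yi - y_bar) for wi, ti, yi in zip(w, t, y))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def max_deviation(t, y, w, truth):
    """Largest absolute gap between the fitted line and the true path."""
    a, b = weighted_line_fit(t, y, w)
    return max(abs(a + b * ti - gi) for ti, gi in zip(t, truth))


t = list(range(0, 100, 10))
truth = [2400 - 2 * ti for ti in t]      # linear path g1(t)
pilot = list(truth)
pilot[5] += 500                          # hypothetical pilot point from a flat profile


def weights(w_bad):
    """Unit weights everywhere except the displaced point."""
    return [w_bad if j == 5 else 1.0 for j in range(len(t))]


dev_equal = max_deviation(t, pilot, weights(1.0), truth)    # equal weights
dev_heavy = max_deviation(t, pilot, weights(50.0), truth)   # large weight on the bad point
dev_light = max_deviation(t, pilot, weights(0.01), truth)   # bad point downweighted
```

In this toy setting the fit with the heavily weighted bad point deviates most from the true path, and the fit that downweights it deviates least, which mirrors the ordering of the propagation, equal, and third derivative weights under flat profiles.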

For all the other simulations (Table 2), when the terminus location is given by g2(t), with systematic error ζ1 and ζ2, the performance of all methods is slightly worse than with g1(t), since it is more difficult to estimate a glacial path that contains short-term variability. However, the overall story is similar. The pilot path algorithm and temporal fitting with third derivative weights provide the most robust and precise fits that maintain the desired 95% confidence coverage on average. These results guide our real data analyses, discussed in the following section.

Path   Error  Method      Weights           MIAE (m)  MMAD (m)  MCI (%)
g2(t)  ζ1     Pilot Path  Equal                 17.2      52.6     95.4
g2(t)  ζ1     Pilot Path  Third Derivative      12.4      39.1     97.4
g2(t)  ζ1     Pilot Path  Propagation           14.5      42.9     91.5
g2(t)  ζ1     Tracking    Propagation           50.5     129.4     91.7
g2(t)  ζ2     Pilot Path  Equal                 36.1     100.2     91.3
g2(t)  ζ2     Pilot Path  Third Derivative      22.2      67.4     95.5
g2(t)  ζ2     Pilot Path  Propagation           88.0     297.9     79.2
g2(t)  ζ2     Tracking    Propagation          137.7     417.2     85.0

Table 2: Simulation results for g2(t). All standard errors are less than 1.

7 Real Data Analyses

The results from the simulation study guide our real data analyses. The version of our fitting method with third derivative weights is applied to image intensity data collected for the Nigardsbreen, Gorner, Rhone, and Franz Josef glaciers. The results are compared to independent ground measurements sampled over time.

The results are displayed in Figures 1, 5, 6, and 7. All four figures are presented in the same format as Figure 1, described in the Introduction: the top left plot displays a 2-D Landsat image at an individual time point, with the coordinates of the 1-D extraction; the time series of image profiles from the 1-D extraction are shown in the top right plot; the bottom plots show the profiles laid out over time, the estimated glacial path, the estimated confidence bands and the ground measurements for comparison. The ground measurements are originally given as relative changes in the terminus location between adjacent time points; to facilitate comparison, these measurements are shifted by a constant offset so that their average is equal to the average path estimate calculated over the years of available data. The plot legends contain the mean integrated absolute error (MIAE) over the path and the maximum absolute deviation (MAD), both in meters, of the estimated path with respect to the ground measurements, along with the empirical coverage (COV) of the confidence bands.

Broadly, the estimates for each data set follow the ground measurements well: the estimated terminus locations for the Nigardsbreen, Gorner, and Rhone glaciers retreated over the measured time period, while the estimated Franz Josef terminus retreated and then advanced in equal measure. The performance of the method varies across the four glaciers, depending on the image quality, the density of the profile samples over time and the amount of systematic error in the profiles. For the Nigardsbreen, Gorner and Rhone glaciers the performance is best, with an accuracy of about 8 to 20 meters. This is remarkable given that the resolution of the Landsat images is 30 meters. Note that the Gorner and Rhone glaciers have periods of several years with no image data, yet the temporal spline model is able to interpolate in these periods. For Franz Josef the results are less accurate; here the data suffer from heavy systematic noise due to cloud cover and shadows from an adjacent mountain.
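The alignment of the ground measurements and the legend metrics can be sketched in a few lines of Python. The sketch below assumes that the relative changes are cumulated from an arbitrary reference position of 0, that the path estimates and confidence bands are evaluated at the ground measurement times, and that all names and numbers are hypothetical; it is meant as an illustration of the description above rather than the code used to produce the figures.

```python
from itertools import accumulate
from statistics import mean


def align_ground_measurements(relative_changes, path_estimates):
    """Turn relative changes into positions sharing the mean of the path estimates."""
    # ASSUMPTION: the first ground position is an arbitrary reference (0),
    # so n relative changes give n + 1 positions.
    positions = list(accumulate(relative_changes, initial=0.0))
    if len(positions) != len(path_estimates):
        raise ValueError("need one path estimate per ground position")
    # Constant offset that makes the two averages equal.
    offset = mean(path_estimates) - mean(positions)
    return [p + offset for p in positions]


def comparison_metrics(path_estimates, lower, upper, ground):
    """Return (MIAE, MAD, COV) of the estimated path against ground measurements."""
    abs_dev = [abs(e - g) for e, g in zip(path_estimates, ground)]
    miae = mean(abs_dev)
    mad = max(abs_dev)
    cov = mean(1.0 if lo <= g <= hi else 0.0
               for lo, g, hi in zip(lower, ground, upper))
    return miae, mad, cov


# Hypothetical example: three yearly changes and four path estimates (meters).
changes = [-10.0, -20.0, -30.0]
path = [2000.0, 1985.0, 1975.0, 1940.0]
ground = align_ground_measurements(changes, path)
lower = [p - 4.0 for p in path]
upper = [p + 4.0 for p in path]
miae, mad, cov = comparison_metrics(path, lower, upper, ground)
```

Because only a constant offset is applied, the alignment leaves the shape of the ground measurement series unchanged and affects only its vertical position relative to the estimated path.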

8 Discussion

In this paper we have proposed a statistical algorithm that estimates glacial recession as a path of near-inflection points from a time series of image intensity profiles. The proposed method involves three stages. First, the intensity profiles are spatially smoothed using penalized spline smoothing, and the first derivatives of the smoothed profiles are extracted. Second, a fast and robust algorithm is applied to the collection of first derivative profiles to identify a rough estimate of the glacial terminus location over time. The guiding framework for the proposed algorithm is that the temporal terminus location corresponds to a path that takes the lowest summed first derivative values, and that the terminus changes at a limited speed over time. Third, the rough pilot path of terminus locations is globally smoothed to obtain an estimate of the long-term terminus trend.

The viability of our method was shown through simulation and real data analyses. Through simulation we showed that our approach is more robust to systematic error in the intensity profiles than tracking. We also considered different weighting schemes for the pilot path points. Weighting by third derivatives gave the best results, since pilot path points near the true terminus location have higher inflection and are more accurate. The data analyses showed that the performance of the method can vary depending on the glacier of interest. For glaciers with high image quality and densely sampled intensity profiles, such as Nigardsbreen and Gorner, the estimates from our algorithm matched very closely those obtained from ground measurements. For the Rhone and Franz Josef glaciers the image quality was corrupted by more systematic noise, and the profiles were observed less densely. Therefore, the estimates were somewhat less accurate. However, in these situations the method qualitatively showed fidelity to empirical measurements of terminus location.

A limitation of our approach is that the pilot path algorithm is conditional on the smooth first derivative estimates of the intensity profiles, and the temporal smoothing is in turn conditional on the output from the path algorithm, which makes it difficult to properly estimate standard errors. While we showed the viability of our method through simulation, it would be ideal to incorporate and propagate the uncertainty that exists in the initial stages of estimation. To address this limitation we considered bootstrap and jackknife procedures to account for estimation uncertainty. For the bootstrap, the data are difficult to resample effectively due to the systematic error in the profiles, whose underlying distribution is unknown. For jackknife estimation, the leave-one-out path estimates are highly correlated, but this correlation is challenging to quantify due to the algorithmic nature of the fitting procedure. Without knowledge of the correlation of the leave-one-out estimates, estimating the inflation factor for the jackknife variance is difficult.

An alternative direction we attempted for estimation of the target terminus location is to use maximum likelihood. In this framework, each profile is represented by a multivariate normal distribution whose mean is a sigmoid function with an inflection point that corresponds to the true glacial terminus location. By building a likelihood of the profiles observed over time, the terminus path can be estimated. An advantage of this approach is that standard likelihood-based tools for uncertainty quantification are available. However, we decided against this approach because the systematic error in the profiles is difficult to specify in a general way.
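The guiding framework of the second stage summarized in this section (a path with the lowest summed first derivative values whose location changes by a limited amount between consecutive time points) can be illustrated with a small dynamic programming sketch in Python. This is one possible formulation under that description and is not necessarily the pilot path algorithm used in this paper; `min_cost_path`, `deriv`, and `max_step` are hypothetical names, and the derivative values are made up.

```python
def min_cost_path(deriv, max_step):
    """Spatial indices, one per time point, minimizing the summed derivative.

    deriv[j][i] : smoothed first derivative of profile j at spatial index i
    max_step    : largest allowed index change between consecutive profiles
    """
    n_s = len(deriv[0])
    cost = list(deriv[0])          # best cumulative cost ending at each index
    back = []                      # back-pointers for each later time point
    for row in deriv[1:]:
        new_cost, pointers = [], []
        for i in range(n_s):
            lo, hi = max(0, i - max_step), min(n_s, i + max_step + 1)
            k = min(range(lo, hi), key=cost.__getitem__)   # best reachable predecessor
            new_cost.append(cost[k] + row[i])
            pointers.append(k)
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest end point.
    i = min(range(n_s), key=cost.__getitem__)
    path = [i]
    for pointers in reversed(back):
        i = pointers[i]
        path.append(i)
    path.reverse()
    return path


# Hypothetical derivative profiles: 4 time points, 5 spatial locations.
deriv = [
    [0, -5, 0, 0, 0],
    [0, 0, -5, 0, 0],
    [0, 0, 0, -1, -9],   # strong dip at index 4, e.g. caused by systematic error
    [0, 0, -1, -5, 0],
]
slow_path = min_cost_path(deriv, max_step=1)
fast_path = min_cost_path(deriv, max_step=2)
```

With `max_step=1` the path cannot reach the isolated dip at index 4 of the third profile without giving up more elsewhere, so it passes through index 3 instead; with `max_step=2` the jump is allowed. The limited-speed constraint is what keeps a single misleading profile from dominating the path.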

References

[1] Landsat-US National Aeronautics and Space Administration (NASA). http://landsat.gsfc.nasa.gov/.

[2] Landsat.org. http://landat.org/.

[3] US Geological Survey (USGS). http://landat.usgs.gov/.

[4] The Swiss Glacier Inventory 2000 (SGI 2000), 2000.

[5] R. Bintanja, R. S. van de Wal, and J. Oerlemans. Modelled atmospheric temperatures and global sea levels over the past million years. Nature, 437(7055):125–128, 2005.

[6] R. S. Bradley, M. Mann, and M. K. Hughes. Northern hemisphere temperatures during the past millennium: inferences, uncertainties, and limitations. Geophysical research letters, 26(6):759–762, 1999.

[7] P. H. Eilers and B. D. Marx. Flexible smoothing with B-splines and penalties. Statistical Science, pages 89–102, 1996.

[8] G. H. Golub, M. Heath, and G. Wahba. Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics, 21(2):215–223, 1979.

[9] R. R. Irish. Landsat 7 automatic cloud cover assessment. In AeroSense 2000, pages 348–355. International Society for Optics and Photonics, 2000.

[10] J.-H. Joo and P. Qiu. Jump detection in a regression curve and its derivative. Technometrics, 51(3):289–305, 2009.

[11] N. N. Kachouie, T. Gerke, P. Huybers, and A. Schwartzman. Nonparametric regression for estimation of spatiotemporal mountain glacier retreat from satellite images. IEEE Transactions on Geosciences and Remote Sensing (accepted), 2014.

[12] N. N. Kachouie, P. Huybers, and A. Schwartzman. Localization of mountain glacier termini in landsat multi-spectral images. Pattern Recognition Letters, 2012.

[13] C. R. Loader. Change point estimation using nonparametric regression. The Annals of Statistics, 24(4):1667–1678, 1996.

[14] G. Marra and S. N. Wood. Coverage properties of confidence intervals for generalized additive model components. Scandinavian Journal of Statistics, 39(1):53–74, 2012.

[15] H.-G. Muller. Change-points in nonparametric regression analysis. The Annals of Statistics, 20(2):737–761, 1992.

[16] D. Nychka. Bayesian confidence intervals for smoothing splines. Journal of the American Statistical Association, 83(404):1134–1143, 1988.

[17] J. Oerlemans. Holocene glacier fluctuations: is the current rate of retreat exceptional? Annals of Glaciology, 31(1):39–44, 2000.

[18] J. Oerlemans and R. van de Wal. Response of glaciers to climate change and kinematic waves: a study with a numerical ice-flow model. Journal of glaciology, 41(137):142–152, 1995.

[19] D. Ruppert. Selecting the number of knots for penalized splines. Journal of Computational and Graphical Statistics, 11(4):735–757, 2002.

[20] V. Salomonson and I. Appel. Estimating fractional snow cover from modis using the normalized difference snow index. Remote sensing of environment, 89(3):351–360, 2004.

[21] K. Seidel and J. Martinec. Remote sensing in snow hydrology: runoff modelling, effect of climate change. Springer, 2004.

[22] G. Wahba. Bayesian "confidence intervals" for the cross-validated smoothing spline. Journal of the Royal Statistical Society. Series B (Methodological), pages 133–150, 1983.

[23] S. N. Wood. Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC, Boca Raton, FL, 2006.

[Figure 5 panels. Top right: profile intensity versus space along glacier path. Bottom: year versus space along the glacial path (meters). Legend: MIAE = 12.2, MAD = 35.7, COV = .79.]

Figure 5: Gorner results. Top left: Specified 1-D extraction from an individual 2-D Landsat image of Gorner. Top right: Extracted 1-D intensity profiles. Bottom left: Estimated path (yellow), confidence intervals (black), ground measurements (red dots), with evaluated metrics in the legend. Bottom right: Zoomed in display of estimated path, confidence intervals, and ground measurements. The purple hash marks on the left show the time points where the image profiles were observed.

[Figure 6 panels. Top right: profile intensity versus space along glacier path. Bottom: year versus space along glacial path (meters). Legend: MIAE = 17.9, MAD = 30.4, COV = .50.]

Figure 6: Rhone results. Top left: Specified 1-D extraction from an individual 2-D Landsat image of Rhone. Top right: Extracted 1-D intensity profiles. Bottom left: Estimated path (yellow), confidence intervals (black), ground measurements (red dots), with evaluated metrics in the legend. Bottom right: Zoomed in display of estimated path, confidence intervals, and ground measurements. The purple hash marks on the left show the time points where the image profiles were observed.

[Figure 7 panels. Top right: profile intensity versus space along glacier path. Bottom: year versus space along the glacial path (meters). Legend: MIAE = 64.4, MAD = 137.8, CI = .38.]

Figure 7: Franz Josef results. Top left: Specified 1-D extraction from an individual 2-D Landsat image of Franz Josef. Top right: Extracted 1-D intensity profiles. Bottom left: Estimated path (yellow), confidence intervals (black), ground measurements (red dots), with evaluated metrics in the legend. Bottom right: Zoomed in display of estimated path, confidence intervals, and ground measurements. The purple hash marks on the left show the time points where the image profiles were observed.
