Water Quality Trend and Change-Point Analyses Using Integration of Locally Weighted Polynomial Regression and Segmented Regression

Environ Sci Pollut Res DOI 10.1007/s11356-017-9188-x RESEARCH ARTICLE Water quality trend and change-point analyses using integration of locally weighted polynomial regression and segmented regression Hong Huang1,2 & Zhenfeng Wang1,2 & Fang Xia1,2 & Xu Shang1,2 & YuanYuan Liu 3 & Minghua Zhang1,2 & Randy A. Dahlgren1,2,4 & Kun Mei1,2 Received: 11 January 2017 /Accepted: 2 May 2017 # Springer-Verlag Berlin Heidelberg 2017 Abstract Trend and change-point analyses of water quality The reliability was verified by statistical tests and prac- time series data have important implications for pollution con- tical considerations for Shanxi Reservoir watershed. The trol and environmental decision-making. This paper devel- newly developed integrated LWPR-SegReg approach is oped a new approach to assess trends and change-points of not only limited to the assessment of trends and change-points water quality parameters by integrating locally weighted poly- of water quality parameters but also has a broad application to nomial regression (LWPR) and segmented regression other fields with long-term time series records. (SegReg). Firstly, LWPR was used to pretreat the original water quality data into a smoothed time series to represent Keywords Water quality . Long-term trend assessment . the long-term trend of water quality. Then, SegReg was used Change-point analysis . Locally weighted polynomial to identify the long-term trends and change-points of the regression . Segmented regression smoothed time series. Finally, statistical tests were applied to determine the significance of the long-term trends and change- points. The efficacy of this approach was validated using a 10- Introduction year record of total nitrogen (TN) and chemical oxygen de- mand (CODMn) from Shanxi Reservoir watershed in eastern Many countries and regions in the world suffer from chronic China. Results showed that this approach was straightforward water shortages, often from a scarcity of clean drinking water and reliable for assessment of long-term trends and (Mukheibir 2010; Tiwari and Joshi 2012;WangandYu2014). change-points on irregular water quality datasets. Therefore, water quality protection and remediation are very important aspects of sustainable social-economic develop- ment (Zhou et al. 2014). As a result, the protection and envi- Responsible editor: Marcus Schulz ronmental remediation of drinking water sources have received considerable attention in research (Basu et al. 2014), * Kun Mei since drinking water quality is closely related to human health [email protected] (Corlin et al. 2016). In general, these research efforts include Hong Huang environmental policy and legislation planning (Syme and [email protected] Nancarrow 2013;Xuetal.2016), water quality standard de- velopment (Goncharuk 2013), and health risk assessment 1 Southern Zhejiang Water Research Institute, Wenzhou Medical (Chiang et al. 2010; Houtman et al. 2014; Sun et al. 2015), University, Wenzhou 325035, People’sRepublicofChina water quality monitoring and modeling (Tropea et al. 2007; 2 Key Laboratory of Watershed Environmental Science and Health of Sokolova et al. 2013;Chenetal.2015), and protection and Zhejiang Province, Wenzhou 325035, People’sRepublicofChina remediation technology (Zhang et al. 2011;Basuetal.2014). 3 School of Public Health and Management, Wenzhou Medical An important aspect of water quality modeling for drinking water University, Wenzhou 325035, People’sRepublicofChina source protection is identifying changes in water quality trends 4 Department of Land, Air, and Water Resources, University of and specific change-point timing, which are important information California, Davis, CA 95616, USA for environmental protection performance evaluation and Environ Sci Pollut Res environmental decision-making (Bodo 1989; Chowdhury and Al- into different intervals where the relationships between the Zahrani 2014). variables are different. Nowadays, segmented regression Water quality trend and change-point analyses commonly (SegReg) has been widely used in trend and change-point employ traditional statistical methods including parametric analyses in research fields including medicine (Kazemnejad (i.e., linear regression, polynomial regression) and nonpara- et al. 2014), hydrology (Shao and Campbell 2002), economics metric (e.g., variants of the Mann-Kendall test) statistical (Wu and Chang 2012), society (Mathews and Hamilton methods (Donohue et al. 2001; Chang 2008; Kisi and Ky 2005), etc. In comparison with graphical methods, major ad- 2014). However, water quality is strongly influenced by both vantages of SegReg are that the regression function is defined natural and human factors, and data records are often chal- by a mathematical formula to describe the relationship, the lenging to analyze from a statistical perspective as the data is significance levels of the trend can be determined statistically often non-normally distributed, nonlinear fashion, non- (Kazemnejad et al. 2014), and the change-point can be quan- monotonic trend, uneven time spacing, and with large season- titatively defined (Taljaard et al. 2014). Since water quality al variations. Linear regression may not be appropriate in time series are often statistically compromised, the direct ap- cases of the data in nonlinear and non-normal distribution, plication of SegReg might produce invalid results. However, while Mann-Kendall test may not be applicable in cases of SegReg would be an effective approach for trend and non-monotonic trends and uneven sampling time spacing. change-point analyses of water quality if the data could Polynomial/logistic regression is based on global estimation, be appropriately pretreated. whereas data with these nonlinear patterns usually display Change-point detection is the identification of an local characteristics. Meanwhile, the water quality monitoring abrupt variation in process behavior due to distributional frequency is often at monthly and/or bimonthly which or structural change, whereas a trend can be defined as the resulting in discrete and irregular time series data records at estimation of a gradual departure from past norms various sampling dates and sampling intervals. In general, (Sharma et al. 2016). For water quality time series analy- trend assessment of statistically compromised time series sis, identifying changes in long-term trends is important, datasets such as those that do not fully meet required statistical yet identifying specific change-points is also important. assumptions should not rely solely on abstract test statistics The objective of this paper is to provide an integrated (Bodo 1989). LWPR-SegReg approach to analyze water quality trends Appropriate graphing techniques may be a powerful data and change-points. The efficacy of this approach was evaluation tool. As early as 1983, Chambers et al. (1983) demonstrated by trend and change-point analyses for a noted that There is no single statistical tool as powerful as a 10-year record of key water quality parameters (TN and well-chosen graph.^ The most classical graphical method is CODMn) in Shanxi Reservoir watershed of Zhejiang Cleveland’s(1979, 1981) locally weighted scatterplot smooth- Province, China. Innovative and importance aspects of er (LOWESS), which was further developed by Cleveland and the integrated LWPR-SegReg approach include its ability Devlin (1988) as the locally weighted polynomial regression to define change-points and trends visually and to quanti- (LWPR) procedure. LWPR can be directly applied to graphi- tatively define the change-points and trends with statistical analysis of 2D scatterplots for a series xi versus corre- cal rigor. sponding times ti. Soon afterwards, LOWESS was developed for seasonal-trend decomposition using loess (STL) (Cleveland et al. 1990). These graphical methods are based on various local smoothing techniques (Harding et al. 2016) Materials and methods and have been widely applied in environmental quality trend analysis including air quality (Li et al. 2014; Gong et al. Locally weighted polynomial regression(LWPR) approach 2015), water quality (Lee et al. 2010;Stowetal.2015), etc. In general, graphical methods can play an important role in Trend analysis determines whether the measured values of a trend analysis in typical time series data, both as a diag- water quality variable increase or decrease during a given time nostic tool and as visual corroborative evidence when re- period (Naddafi et al. 2007). For a water quality series WQi,a quired assumptions for formal statistical tests are not met basic linear trend analysis model is: (Bodo 1989). However, graphical methods are not able to ¼ α þ β þ ε ; ð Þ WQi ti i 1 produce regression functions that can be mathematical described to determine the significance levels and define where ti is time, a is the regression coefficient indicating the change-points (Liang 2014). slope of the line, β is the regression constant, and εi is an Segmented regression, also known as Bpiecewise irregular noise term. regression^ or Bbreak-point regression^, is a regression meth- LWPR usually employs a local linear polynomial regres- od applied to cases where independent variables are clustered sion model, but a local nonlinear regression model can also be Environ Sci Pollut Res used in some circumstances (Bodo 1989). For a water quality independently (Liang 2014). However, with the integra- series WQi,LWPRis: tion of LWPR and SegReg, it is possible to deal with many of the problems associated with statistical

Water Quality Trend and Change-Point Analyses Using Integration of Locally Weighted Polynomial Regression and Segmented Regression

How to Assess the Impact of Quality and Patient Safety Interventions with Routinely Collected Longitudinal Data

Process Control Charts and ITS Analysis of Epworth MEER Trial

Original Research Article Analysing Interrupted Time Series with A

An Exact Algorithm for Estimating Breakpoints in Segmented Generalized Linear Models

Fitting Segmented Regression Curves

Computing Primer for Applied Linear Regression, Third Edition Using R and S-Plus

Testing the Significance of the Improvement of Cubic Regression

Correctional Data Analysis Systems. INSTITUTION Sam Honston State Univ., Huntsville, Tex

Parameter Estimation in Linear-Linear Segmented Regression

T2S General Functional Specifications Table of Contents

Regression Analysis of Hydrologic Data

Nonlinear Regression