Fast Algorithms for Segmented Regression


Jayadev Acharya [email protected]
Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Ilias Diakonikolas [email protected]
University of Southern California, Los Angeles, CA 90089, USA

Jerry Li [email protected]
Ludwig Schmidt∗ [email protected]
Massachusetts Institute of Technology, Cambridge, MA 02139, USA

∗Authors ordered alphabetically.

Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 48. Copyright 2016 by the author(s).

Abstract

We study the fixed design segmented regression problem: Given noisy samples from a piecewise linear function $f$, we want to recover $f$ up to a desired accuracy in mean-squared error. Previous rigorous approaches for this problem rely on dynamic programming (DP) and, while sample efficient, have running time quadratic in the sample size. As our main contribution, we provide new sample near-linear time algorithms for the problem that, while not being minimax optimal, achieve a significantly better sample-time tradeoff on large datasets compared to the DP approach. Our experimental evaluation shows that, compared with the DP approach, our algorithms provide a convergence rate that is only off by a factor of 2 to 4, while achieving speedups of three orders of magnitude.

1. Introduction

We study the regression problem, a fundamental inference task with numerous applications that has received tremendous attention in machine learning and statistics during the past fifty years (see, e.g., (Mosteller & Tukey, 1977) for a classical textbook). Roughly speaking, in a (fixed design) regression problem, we are given a set of $n$ observations $(x_i, y_i)$, where the $y_i$'s are the dependent variables and the $x_i$'s are the independent variables, and our goal is to model the relationship between them. The typical assumptions are that (i) there exists a simple function $f$ that (approximately) models the underlying relation, and (ii) the dependent observations are corrupted by random noise. More specifically, we assume that there exists a family of functions $F$ such that for some $f \in F$ the following holds: $y_i = f(x_i) + \epsilon_i$, where the $\epsilon_i$'s are i.i.d. random variables drawn from a "tame" distribution such as a Gaussian (later, we also consider model misspecification).

Throughout this paper, we consider the classical notion of Mean Squared Error (MSE) to measure the performance (risk) of an estimator. As expected, the minimax risk depends on the family $F$ that $f$ comes from. The natural case that $f$ is linear is fully understood: It is well-known that the least-squares estimator is statistically efficient and runs in sample-linear time. The more general case that $f$ is non-linear, but satisfies some well-defined structural constraint, has been extensively studied in a variety of contexts (see, e.g., (Gallant & A., 1973; Feder, 1975; Friedman, 1991; Bai & Perron, 1998; Yamamoto & Perron, 2013; Kyng et al., 2015; Avron et al., 2013; Meyer, 2008; Chatterjee et al., 2015)). In contrast to the linear case, this more general setting is not well-understood from an information-theoretic and/or computational aspect.

In this paper, we focus on the case that the function $f$ is promised to be piecewise linear with a given number $k$ of unknown pieces (segments). This is known as fixed design segmented regression, and has received considerable attention in the statistics community (Gallant & A., 1973; Feder, 1975; Bai & Perron, 1998; Yamamoto & Perron, 2013). The special case of piecewise polynomial functions (splines) has been extensively used in the context of inference, including density estimation and regression, see, e.g., (Wegman & Wright, 1983; Friedman, 1991; Stone, 1994; Stone et al., 1997; Meyer, 2008).

Information-theoretic aspects of the segmented regression problem are well-understood: Roughly speaking, the minimax risk is inversely proportional to the number of samples. In contrast, the computational complexity of the problem is poorly understood: Prior to our work, known algorithms for this problem with provable guarantees were quite slow. Our main contribution is a set of new provably fast algorithms that outperform previous approaches both in theory and in practice. Our algorithms run in time that is nearly-linear in the number of data points $n$ and the number of intervals $k$. Their computational efficiency is established both theoretically and experimentally. We also emphasize that our algorithms are robust to model misspecification, i.e., they perform well even if the function $f$ is only approximately piecewise linear.

Note that if the segments of $f$ were known a priori, the segmented regression problem could be immediately reduced to $k$ independent linear regression problems. Roughly speaking, in the general case (where the location of the segment boundaries is unknown) one needs to "discover" the right segments using information provided by the samples. To address this algorithmic problem, previous works (Bai & Perron, 1998; Yamamoto & Perron, 2013) relied on dynamic programming that, while being statistically efficient, is computationally quite slow: its running time scales at least quadratically with the size $n$ of the data, hence it is rather impractical for large datasets.

Our main motivation comes from the availability of large datasets that has made computational efficiency the main bottleneck in many cases. In the words of (Jordan, 2013): "As data grows, it may be beneficial to consider faster inferential algorithms, because the increasing statistical strength of the data can compensate for the poor algorithmic quality." Hence, it is sometimes advantageous to sacrifice statistical efficiency in order to achieve faster running times because we can then achieve the desired error guarantee faster (provided more samples). In our context, instead of using a slow dynamic program, we employ a subtle iterative greedy approach that runs in sample-linear time.

Our iterative greedy approach builds on the work of (Acharya et al., 2015a;b), but the details of our algorithms here and their analysis are substantially different. In particular, as we explain in the body of the paper, the natural adaptation of their analysis to our setting fails to provide any meaningful statistical guarantees. To obtain our results, we introduce novel algorithmic ideas and carefully combine them with additional probabilistic arguments.

2. Preliminaries

In this paper, we study the problem of fixed design segmented regression. We are given samples $x_i \in \mathbb{R}^d$ for $i \in [n]$ ($= \{1, \ldots, n\}$), and we consider the following classical regression model:

$$y_i = f(x_i) + \epsilon_i. \qquad (1)$$

Here, the $\epsilon_i$ are i.i.d. sub-Gaussian noise variables with variance proxy $\sigma^2$, mean $\mathbb{E}[\epsilon_i] = 0$, and variance $s^2 = \mathbb{E}[\epsilon_i^2]$ for all $i$. We will let $\epsilon = (\epsilon_1, \ldots, \epsilon_n)$ denote the vector of noise variables. We also assume that $f : \mathbb{R}^d \to \mathbb{R}$ is a $k$-piecewise linear function. Formally, this means:

Definition 1. The function $f : \mathbb{R}^d \to \mathbb{R}$ is a $k$-piecewise linear function if there exists a partition of the real line into $k$ disjoint intervals $I_1, \ldots, I_k$, $k$ corresponding parameters $\theta_1, \ldots, \theta_k \in \mathbb{R}^d$, and a fixed, known $j$ such that for all $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ we have that $f(x) = \langle \theta_i, x \rangle$ if $x_j \in I_i$. Let $L_{k,j}$ denote the space of $k$-piecewise linear functions with partition defined on coordinate $j$.

Moreover, we say $f$ is flat on an interval $I \subseteq \mathbb{R}$ if $I \subseteq I_i$ for some $i = 1, \ldots, k$; otherwise, we say that $f$ has a jump on the interval $I$.

In the full paper, we also discuss the agnostic setting where the ground truth $f$ is not piecewise linear itself but only well-approximated by a $k$-piecewise linear function. For simplicity of exposition, we assume that the partition coordinate $j$ is 1 in the rest of the paper. We remark that this model also contains the problem of (fixed design) piecewise polynomial regression as an important subcase (see the full paper for details).

Following this generative model, a regression algorithm receives the $n$ pairs $(x_i, y_i)$ as input. The goal of the algorithm is then to produce an estimate $\hat{f}$ that is close to the true, unknown $f$ with high probability over the noise terms $\epsilon_i$ and any randomness in the algorithm. We measure the distance between our estimate $\hat{f}$ and the unknown function $f$ with the classical mean-squared error:

$$\mathrm{MSE}(\hat{f}) = \frac{1}{n} \sum_{i=1}^{n} \left( f(x_i) - \hat{f}(x_i) \right)^2.$$

Throughout this paper, we let $X \in \mathbb{R}^{n \times d}$ be the data matrix, i.e., the matrix whose $j$-th row is $x_j^T$ for every $j$, and we let $r$ denote the rank of $X$.

The following notation will also be useful. For any function $f : \mathbb{R}^d \to \mathbb{R}$, we let $f \in \mathbb{R}^n$ denote the vector with components $f_i = f(x_i)$ for $i \in [n]$. For any interval $I$, we let $X^I$ denote the data matrix consisting of all data points $x_i$ for $i \in I$, and for any vector $v \in \mathbb{R}^n$, we let $v^I \in \mathbb{R}^{|I|}$ be the vector of $v_i$ for $i \in I$.

2.1. Our Contributions

Our main contributions are new, fast algorithms for the aforementioned segmented regression problem. We now informally state our main results and refer to later sections for more precise theorems.

Theorem 2 (informal statement of Theorems 13 and 14). There is an algorithm GREEDYMERGE, which, given $X$ (of rank $r$), $y$, a target number of pieces $k$, and the variance …

…ing the rest, we now split the candidates into $\log n$ buckets based on the lengths of the candidate intervals. In this scheme, bucket $\alpha$ contains all candidates with length between $2^{\alpha}$ and $2^{\alpha+1}$, for $\alpha = 0, \ldots, \log n - 1$.