The Use of Receiver Operating Characteristic Curve Analysis for Academic Progress and Degree Completion

Total Page:16

File Type:pdf, Size:1020Kb

The Use of Receiver Operating Characteristic Curve Analysis for Academic Progress and Degree Completion The University of Southern Mississippi The Aquila Digital Community Dissertations Spring 5-2016 The Use of Receiver Operating Characteristic Curve Analysis for Academic Progress and Degree Completion Joshua William Schutts University of Southern Mississippi Follow this and additional works at: https://aquila.usm.edu/dissertations Part of the Educational Assessment, Evaluation, and Research Commons, Educational Leadership Commons, Educational Methods Commons, Educational Psychology Commons, Higher Education Commons, Higher Education Administration Commons, and the Social Statistics Commons Recommended Citation Schutts, Joshua William, "The Use of Receiver Operating Characteristic Curve Analysis for Academic Progress and Degree Completion" (2016). Dissertations. 357. https://aquila.usm.edu/dissertations/357 This Dissertation is brought to you for free and open access by The Aquila Digital Community. It has been accepted for inclusion in Dissertations by an authorized administrator of The Aquila Digital Community. For more information, please contact [email protected]. THE USE OF RECEIVER OPERATING CHARACTERISTIC CURVE ANALYSIS FOR ACADEMIC PROGRESS AND DEGREE COMPLETION by Joshua William Schutts A Dissertation Submitted to the Graduate School and the Department of Educational Studies and Research at The University of Southern Mississippi in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy Approved: ________________________________________________ Dr. Richard Mohn, Committee Chair Associate Professor, Educational Studies and Research ________________________________________________ Dr. Forrest Lane, Committee Member Assistant Professor, Educational Leadership, Sam Houston State University ________________________________________________ Dr. Thomas Lipscomb, Committee Member Professor, Educational Studies and Research ________________________________________________ Dr. Kyna Shelley, Committee Member Professor, Educational Studies and Research ________________________________________________ Dr. Karen S. Coats Dean of the Graduate School May 2016 ABSTRACT THE USE OF RECEIVER OPERATING CHARACTERISTIC CURVE ANALYSIS FOR ACADEMIC PROGRESS AND DEGREE COMPLETION by Joshua William Schutts May 2016 College student retention and graduation are important to students, institutions, and the community. Institutions must commit to understanding why students persist and depart in order to address student success. As a result, institutions and governmental entities have increased the emphasis they place on using data to improve student success and degree completion. An abundance of research suggests that background factors (such as high school GPA and ACT score) combined with environmental factors (such as one’s major and first semester GPA) are predictive of student success. However, the literature has yet to explore the value of ROC curve analysis as a statistical technique to improve decision making. The purpose of this study was to identify the variables that best predicted satisfactory academic progress and degree completion, and model the use of ROC curve analysis. This study utilized a quantitative approach and secondary data from the institutional research office of a State University System institution in Florida. Logistic regression was successful in identifying factors that were predictive of each outcome. ROC curve analysis successfully discriminated the success/non-success groups based on predicted probability scores and a cumulative risk index. Students with three or more risk factors were less likely to make academic progress or graduate in four years. Students with two or more risk factors were less likely to graduate in six years. ii iii COPYRIGHT BY JOSHUA WILLIAM SCHUTTS 2016 ACKNOWLEDGMENTS Thank you, Dr. Rich Mohn, for being an insightful and supportive mentor to me. You pushed me when I needed it, and likewise gave me space when work and life necessitated it. Most of all, you are a fantastic teacher. You made it fun to learn, and challenged me to be the best student I could be . I am indebted to you, and look forward to pursuing a faculty career in a similar manner to yours. Thank you, Drs. Kyna Shelley, Forrest Lane, and Thomas Lipscomb for your advice, challenge, support, and suggestions along the way. I count you all among my friends and mentors–and most importantly as individuals I hold in the highest regard. Your willingness to serve on my committee is demonstrative of the care and dedication you show to students, and I am extremely honored and grateful for your support Thank you, Dr. Gentry McCreary, Dr. Martha Saunders, Dr. Michael White, Dr. Dorothy Minear, Dr. Kim LeDuff, Dr. Ben Stubbs, Dr. Ruth Davison, Ms. Cindy Cattelier, Ms. Lauren Loeffler, Ms. Robin Zimmern, and all of my coleagues at USM and UWF, for your support, flexibility, and understanding. I am encouraged by all of you, and have felt the warmth of your support throughout all points of this journey. Thank you, Mom, Pop, and Bear, for instilling the value of education in me at such a young age, and encouraging me to learn everything I could. Growing up, I asked “why” endlessly. You always made time to answer me. I cannot possibly quantify the impact you had on me, and love I have for you, in so few words. Thank you, Katie, for being the person most responsible for my perseverance in this effort. You understood when I needed support, listened to me, and pushed me to always make time to write this dissertation. I love you so much. iii TABLE OF CONTENTS ABSTRACT…………………………………………………………..………………… ii ACKNOWLEDGMENTS ……………………………………………………………. iii LIST OF TABLES……………………………………………………………………… vi LIST OF ILLUSTRATIONS…………………………………………………………… vii LIST OF ABBREVIATIONS ………………………………………………………….. xiii CHAPTER I. INTRODUCTION……………………………………………………… 1 Background Statement of the Problem Conceptual Framework Purpose of the Study Research Questions Operational Definitions Delimitations and Assumptions Justification and Significance of the Study Organization of the Dissertation II. REVIEW OF RELATED LITERATURE……………………………… 18 Introduction Overview of Studies on Retention and Degree Completion Major Theoretical Models of Retention and Degree Completion Methods of Studying Retention and Degree Completion Predictors of Retention and Degree Completion Specific Studies Using Logistic Regression on Cohort Data ROC Curves in the Literature Summary III. METHODOLOGY….………………………………………………….. 41 Research Questions Research Design Subjects and Setting Variables Data Collection Procedures Data Analysis iv IV. RESULTS….…………………………………………………………… 61 Data Screening Satisfactory Academic Progress Degree Completion in Four Years Degree Completion in Six Years V. DISCUSSION.………………………………………………………… .133 Satisfactory Academic Progress Degree Completion in Four Years Degree Completion in Six Years Advantages and Disadvantages of Method Limitations Implications for Future Research Implications for Policy and Practice Conclusions APPENDICES…………………………….…………………………………………… 156 REFERENCES…………………………………………………………………....…… 159 v LIST OF TABLES Table 1. Methodological approaches Found in Prior Research……………………….. .. 27 2. General Nature of Relationships between Input and Outcome Variables of this Study…………………………………………………………………… … 30 3. General Nature of Relationships between Environmental and Outcome Variables of this Study……………………………………………………….... 32 4. General Nature of Relationships between Outcome Variables of this Study …. 33 5. Expected Relationship between Predictors and Retention to the Second Year.. 34 6. Expected Relationship between Predictors and Degree Completion in Four Years…………………………………………………………………………… 35 7. Expected Relationship between Predictors and Degree Completion in Six Years…………………………………………………………………………… 36 8. Variables in the Dataset…..…………………….……………………………… 44 9. Statistical Analyses Conducted by Research Question………………………… 48 10. General Guidelines for Sample and Model Evaluation……………………… .. 50 11. Confusion Matrix…..…………………….……………………………..……... 51 12. Comparison of Common Effect Sizes..…………………….……...…………… 53 13. Intercorrelations Among the Predictor Variables……………………………… 65 14. Comparison of Retained and Removed Student Records……………………… 67 15. Group Comparisons, Academic Progress……………………………………… 70 16. Correlations, Academic Progress……………………………………………… 72 17. Classification Statistics for High School GPA and Academic Progress……… 74 18. Classification Statistics for Application Date and Academic Progress……….. 75 19. Classification Statistics for Scholarship Aid and Academic Progress…...……. 76 vi 20. Classification Statistics for First Fall Term GPA and Academic Progress……. 77 21. Classification Statistics for Inputs Model of Academic Progress……………… 79 22. Logistic Regression for Inputs Model of Academic Progress…………………. 79 23. Classification Statistics for Final Model of Academic Progress………………. 82 24. Final Logistic Regression, Academic Progress………………………………… 83 25. Predicted Probabilities, Academic Progress…………………………………… 84 26. Evaluation of Sample and Model, Academic Progress………………………… 85 27. Results by Research Question, Academic Progress…………………………… 89 28. Group Comparisons, Four-Year Degree Completion……………………… …. 91 29. Correlations, Four-Year Degree Completion………………………………….. 93 30. Classification Statistics for High School GPA and Four-Year Degree Completion.……………………………………….………………………….... 95 31. Classification Statistics for Application Date and Four-Year Degree Completion……………………………………………………………………. 96 32. Classification Statistics for Amount of Scholarship
Recommended publications
  • Descriptive Statistics (Part 2): Interpreting Study Results
    Statistical Notes II Descriptive statistics (Part 2): Interpreting study results A Cook and A Sheikh he previous paper in this series looked at ‘baseline’. Investigations of treatment effects can be descriptive statistics, showing how to use and made in similar fashion by comparisons of disease T interpret fundamental measures such as the probability in treated and untreated patients. mean and standard deviation. Here we continue with descriptive statistics, looking at measures more The relative risk (RR), also sometimes known as specific to medical research. We start by defining the risk ratio, compares the risk of exposed and risk and odds, the two basic measures of disease unexposed subjects, while the odds ratio (OR) probability. Then we show how the effect of a disease compares odds. A relative risk or odds ratio greater risk factor, or a treatment, can be measured using the than one indicates an exposure to be harmful, while relative risk or the odds ratio. Finally we discuss the a value less than one indicates a protective effect. ‘number needed to treat’, a measure derived from the RR = 1.2 means exposed people are 20% more likely relative risk, which has gained popularity because of to be diseased, RR = 1.4 means 40% more likely. its clinical usefulness. Examples from the literature OR = 1.2 means that the odds of disease is 20% higher are used to illustrate important concepts. in exposed people. RISK AND ODDS Among workers at factory two (‘exposed’ workers) The probability of an individual becoming diseased the risk is 13 / 116 = 0.11, compared to an ‘unexposed’ is the risk.
    [Show full text]
  • Spatial Duration Data for the Spatial Re- Gression Context
    Missing Data In Spatial Regression 1 Frederick J. Boehmke Emily U. Schilling University of Iowa Washington University Jude C. Hays University of Pittsburgh July 15, 2015 Paper prepared for presentation at the Society for Political Methodology Summer Conference, University of Rochester, July 23-25, 2015. 1Corresponding author: [email protected] or [email protected]. We are grateful for funding provided by the University of Iowa Department of Political Science. Com- ments from Michael Kellermann, Walter Mebane, and Vera Troeger greatly appreciated. Abstract We discuss problems with missing data in the context of spatial regression. Even when data are MCAR or MAR based solely on exogenous factors, estimates from a spatial re- gression with listwise deletion can be biased, unlike for standard estimators with inde- pendent observations. Further, common solutions for missing data such as multiple im- putation appear to exacerbate the problem rather than fix it. We propose a solution for addressing this problem by extending previous work including the EM approach devel- oped in Hays, Schilling and Boehmke (2015). Our EM approach iterates between imputing the unobserved values and estimating the spatial regression model using these imputed values. We explore the performance of these alternatives in the face of varying amounts of missing data via Monte Carlo simulation and compare it to the performance of spatial regression with listwise deletion and a benchmark model with no missing data. 1 Introduction Political science data are messy. Subsets of the data may be missing; survey respondents may not answer all of the questions, economic data may only be available for a subset of countries, and voting records are not always complete.
    [Show full text]
  • Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic, Prognostic, Or Screening Marker
    American Journal of Epidemiology Vol. 159, No. 9 Copyright © 2004 by the Johns Hopkins Bloomberg School of Public Health Printed in U.S.A. All rights reserved DOI: 10.1093/aje/kwh101 Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic, Prognostic, or Screening Marker Margaret Sullivan Pepe1,2, Holly Janes2, Gary Longton1, Wendy Leisenring1,2,3, and Polly Newcomb1 1 Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA. 2 Department of Biostatistics, University of Washington, Seattle, WA. 3 Division of Clinical Research, Fred Hutchinson Cancer Research Center, Seattle, WA. Downloaded from Received for publication June 24, 2003; accepted for publication October 28, 2003. A marker strongly associated with outcome (or disease) is often assumed to be effective for classifying persons http://aje.oxfordjournals.org/ according to their current or future outcome. However, for this assumption to be true, the associated odds ratio must be of a magnitude rarely seen in epidemiologic studies. In this paper, an illustration of the relation between odds ratios and receiver operating characteristic curves shows, for example, that a marker with an odds ratio of as high as 3 is in fact a very poor classification tool. If a marker identifies 10% of controls as positive (false positives) and has an odds ratio of 3, then it will correctly identify only 25% of cases as positive (true positives). The authors illustrate that a single measure of association such as an odds ratio does not meaningfully describe a marker’s ability to classify subjects. Appropriate statistical methods for assessing and reporting the classification power of a marker are described.
    [Show full text]
  • Guidance for Industry E2E Pharmacovigilance Planning
    Guidance for Industry E2E Pharmacovigilance Planning U.S. Department of Health and Human Services Food and Drug Administration Center for Drug Evaluation and Research (CDER) Center for Biologics Evaluation and Research (CBER) April 2005 ICH Guidance for Industry E2E Pharmacovigilance Planning Additional copies are available from: Office of Training and Communication Division of Drug Information, HFD-240 Center for Drug Evaluation and Research Food and Drug Administration 5600 Fishers Lane Rockville, MD 20857 (Tel) 301-827-4573 http://www.fda.gov/cder/guidance/index.htm Office of Communication, Training and Manufacturers Assistance, HFM-40 Center for Biologics Evaluation and Research Food and Drug Administration 1401 Rockville Pike, Rockville, MD 20852-1448 http://www.fda.gov/cber/guidelines.htm. (Tel) Voice Information System at 800-835-4709 or 301-827-1800 U.S. Department of Health and Human Services Food and Drug Administration Center for Drug Evaluation and Research (CDER) Center for Biologics Evaluation and Research (CBER) April 2005 ICH Contains Nonbinding Recommendations TABLE OF CONTENTS I. INTRODUCTION (1, 1.1) ................................................................................................... 1 A. Background (1.2) ..................................................................................................................2 B. Scope of the Guidance (1.3)...................................................................................................2 II. SAFETY SPECIFICATION (2) .....................................................................................
    [Show full text]
  • Sensitivity Analysis
    Chapter 11. Sensitivity Analysis Joseph A.C. Delaney, Ph.D. University of Washington, Seattle, WA John D. Seeger, Pharm.D., Dr.P.H. Harvard Medical School and Brigham and Women’s Hospital, Boston, MA Abstract This chapter provides an overview of study design and analytic assumptions made in observational comparative effectiveness research (CER), discusses assumptions that can be varied in a sensitivity analysis, and describes ways to implement a sensitivity analysis. All statistical models (and study results) are based on assumptions, and the validity of the inferences that can be drawn will often depend on the extent to which these assumptions are met. The recognized assumptions on which a study or model rests can be modified in order to assess the sensitivity, or consistency in terms of direction and magnitude, of an observed result to particular assumptions. In observational research, including much of comparative effectiveness research, the assumption that there are no unmeasured confounders is routinely made, and violation of this assumption may have the potential to invalidate an observed result. The analyst can also verify that study results are not particularly affected by reasonable variations in the definitions of the outcome/exposure. Even studies that are not sensitive to unmeasured confounding (such as randomized trials) may be sensitive to the proper specification of the statistical model. Analyses are available that 145 can be used to estimate a study result in the presence of an hypothesized unmeasured confounder, which then can be compared to the original analysis to provide quantitative assessment of the robustness (i.e., “how much does the estimate change if we posit the existence of a confounder?”) of the original analysis to violations of the assumption of no unmeasured confounders.
    [Show full text]
  • Adaptive Wavelet Clustering for Highly Noisy Data
    Adaptive Wavelet Clustering for Highly Noisy Data Zengjian Chen Jiayi Liu Yihe Deng Department of Computer Science Department of Computer Science Department of Mathematics Huazhong University of University of Massachusetts Amherst University of California, Los Angeles Science and Technology Massachusetts, USA California, USA Wuhan, China [email protected] [email protected] [email protected] Kun He* John E. Hopcroft Department of Computer Science Department of Computer Science Huazhong University of Science and Technology Cornell University Wuhan, China Ithaca, NY, USA [email protected] [email protected] Abstract—In this paper we make progress on the unsupervised Based on the pioneering work of Sheikholeslami that applies task of mining arbitrarily shaped clusters in highly noisy datasets, wavelet transform, originally used for signal processing, on which is a task present in many real-world applications. Based spatial data clustering [12], we propose a new wavelet based on the fundamental work that first applies a wavelet transform to data clustering, we propose an adaptive clustering algorithm, algorithm called AdaWave that can adaptively and effectively denoted as AdaWave, which exhibits favorable characteristics for uncover clusters in highly noisy data. To tackle general appli- clustering. By a self-adaptive thresholding technique, AdaWave cations, we assume that the clusters in a dataset do not follow is parameter free and can handle data in various situations. any specific distribution and can be arbitrarily shaped. It is deterministic, fast in linear time, order-insensitive, shape- To show the hardness of the clustering task, we first design insensitive, robust to highly noisy data, and requires no pre- knowledge on data models.
    [Show full text]
  • Design of Experiments for Sensitivity Analysis of a Hydrogen Supply Chain Design Model
    OATAO is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible This is an author’s version published in: http://oatao.univ-toulouse.fr/27431 Official URL: https://doi.org/10.1007/s41660-017-0025-y To cite this version: Ochoa Robles, Jesus and De León Almaraz, Sofia and Azzaro-Pantel, Catherine Design of Experiments for Sensitivity Analysis of a Hydrogen Supply Chain Design Model. (2018) Process Integration and Optimization for Sustainability, 2 (2). 95-116. ISSN 2509-4238 Any correspondence concerning this service should be sent to the repository administrator: [email protected] https://doi.org/10.1007/s41660-017-0025-y Design of Experiments for Sensitivity Analysis of a Hydrogen Supply Chain Design Model 1 1 1 J. Ochoa Robles • S. De-Le6n Almaraz • C. Azzaro-Pantel 8 Abstract Hydrogen is one of the most promising energy carriersin theq uest fora more sustainableenergy mix. In this paper, a model ofthe hydrogen supply chain (HSC) based on energy sources, production, storage, transportation, and market has been developed through a MILP formulation (Mixed Integer LinearP rograrnming). Previous studies have shown that the start-up of the HSC deployment may be strongly penaliz.ed froman economicpoint of view. The objective of this work is to perform a sensitivity analysis to identifythe major parameters (factors) and their interactionaff ectinga n economic criterion, i.e., the total daily cost (fDC) (response), encompassing capital and operational expenditures. An adapted methodology for this SA is the design of experiments through the Factorial Design and Response Surface methods.
    [Show full text]
  • 1 Global Sensitivity and Uncertainty Analysis Of
    GLOBAL SENSITIVITY AND UNCERTAINTY ANALYSIS OF SPATIALLY DISTRIBUTED WATERSHED MODELS By ZUZANNA B. ZAJAC A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2010 1 © 2010 Zuzanna Zajac 2 To Król Korzu KKMS! 3 ACKNOWLEDGMENTS I would like to thank my advisor Rafael Muñoz-Carpena for his constant support and encouragement over the past five years. I could not have achieved this goal without his patience, guidance, and persistent motivation. For providing innumerable helpful comments and helping to guide this research, I also thank my graduate committee co-chair Wendy Graham and all the members of the graduate committee: Michael Binford, Greg Kiker, Jayantha Obeysekera, and Karl Vanderlinden. I would also like to thank Naiming Wang from the South Florida Water Management District (SFWMD) for his help understanding the Regional Simulation Model (RSM), the great University of Florida (UF) High Performance Computing (HPC) Center team for help with installing RSM, South Florida Water Management District and University of Florida Water Resources Research Center (WRRC) for sponsoring this project. Special thanks to Lukasz Ziemba for his help writing scripts and for his great, invaluable support during this PhD journey. To all my friends in the Agricultural and Biological Engineering Department at UF: thank you for making this department the greatest work environment ever. Last, but not least, I would like to thank my father for his courage and the power of his mind, my mother for the power of her heart, and my brother for always being there for me.
    [Show full text]
  • Relative Risks, Odds Ratios and Cluster Randomised Trials . (1)
    1 Relative Risks, Odds Ratios and Cluster Randomised Trials Michael J Campbell, Suzanne Mason, Jon Nicholl Abstract It is well known in cluster randomised trials with a binary outcome and a logistic link that the population average model and cluster specific model estimate different population parameters for the odds ratio. However, it is less well appreciated that for a log link, the population parameters are the same and a log link leads to a relative risk. This suggests that for a prospective cluster randomised trial the relative risk is easier to interpret. Commonly the odds ratio and the relative risk have similar values and are interpreted similarly. However, when the incidence of events is high they can differ quite markedly, and it is unclear which is the better parameter . We explore these issues in a cluster randomised trial, the Paramedic Practitioner Older People’s Support Trial3 . Key words: Cluster randomized trials, odds ratio, relative risk, population average models, random effects models 1 Introduction Given two groups, with probabilities of binary outcomes of π0 and π1 , the Odds Ratio (OR) of outcome in group 1 relative to group 0 is related to Relative Risk (RR) by: . If there are covariates in the model , one has to estimate π0 at some covariate combination. One can see from this relationship that if the relative risk is small, or the probability of outcome π0 is small then the odds ratio and the relative risk will be close. One can also show, using the delta method, that approximately (1). ________________________________________________________________________________ Michael J Campbell, Medical Statistics Group, ScHARR, Regent Court, 30 Regent St, Sheffield S1 4DA [email protected] 2 Note that this result is derived for non-clustered data.
    [Show full text]
  • Cluster Analysis, a Powerful Tool for Data Analysis in Education
    International Statistical Institute, 56th Session, 2007: Rita Vasconcelos, Mßrcia Baptista Cluster Analysis, a powerful tool for data analysis in Education Vasconcelos, Rita Universidade da Madeira, Department of Mathematics and Engeneering Caminho da Penteada 9000-390 Funchal, Portugal E-mail: [email protected] Baptista, Márcia Direcção Regional de Saúde Pública Rua das Pretas 9000 Funchal, Portugal E-mail: [email protected] 1. Introduction A database was created after an inquiry to 14-15 - year old students, which was developed with the purpose of identifying the factors that could socially and pedagogically frame the results in Mathematics. The data was collected in eight schools in Funchal (Madeira Island), and we performed a Cluster Analysis as a first multivariate statistical approach to this database. We also developed a logistic regression analysis, as the study was carried out as a contribution to explain the success/failure in Mathematics. As a final step, the responses of both statistical analysis were studied. 2. Cluster Analysis approach The questions that arise when we try to frame socially and pedagogically the results in Mathematics of 14-15 - year old students, are concerned with the types of decisive factors in those results. It is somehow underlying our objectives to classify the students according to the factors understood by us as being decisive in students’ results. This is exactly the aim of Cluster Analysis. The hierarchical solution that can be observed in the dendogram presented in the next page, suggests that we should consider the 3 following clusters, since the distances increase substantially after it: Variables in Cluster1: mother qualifications; father qualifications; student’s results in Mathematics as classified by the school teacher; student’s results in the exam of Mathematics; time spent studying.
    [Show full text]
  • Missing Data Module 1: Introduction, Overview
    Missing Data Module 1: Introduction, overview Roderick Little and Trivellore Raghunathan Course Outline 9:00-10:00 Module 1 Introduction and overview 10:00-10:30 Module 2 Complete-case analysis, including weighting methods 10:30-10:45 Break 10:45-12:00 Module 3 Imputation, multiple imputation 1:30-2:00 Module 4 A little likelihood theory 2:00-3:00 Module 5 Computational methods/software 3:00-3:30 Module 6: Nonignorable models Module 1:Introduction and Overview Module 1: Introduction and Overview • Missing data defined, patterns and mechanisms • Analysis strategies – Properties of a good method – complete-case analysis – imputation, and multiple imputation – analysis of the incomplete data matrix – historical overview • Examples of missing data problems Module 1:Introduction and Overview Module 1: Introduction and Overview • Missing data defined, patterns and mechanisms • Analysis strategies – Properties of a good method – complete-case analysis – imputation, and multiple imputation – analysis of the incomplete data matrix – historical overview • Examples of missing data problems Module 1:Introduction and Overview Missing data defined variables If filled in, meaningful for cases analysis? • Always assume missingness hides a meaningful value for analysis • Examples: – Missing data from missed clinical visit( √) – No-show for a preventive intervention (?) – In a longitudinal study of blood pressure medications: • losses to follow-up ( √) • deaths ( x) Module 1:Introduction and Overview Patterns of Missing Data • Some methods work for a general
    [Show full text]
  • Understanding Relative Risk, Odds Ratio, and Related Terms: As Simple As It Can Get Chittaranjan Andrade, MD
    Understanding Relative Risk, Odds Ratio, and Related Terms: As Simple as It Can Get Chittaranjan Andrade, MD Each month in his online Introduction column, Dr Andrade Many research papers present findings as odds ratios (ORs) and considers theoretical and relative risks (RRs) as measures of effect size for categorical outcomes. practical ideas in clinical Whereas these and related terms have been well explained in many psychopharmacology articles,1–5 this article presents a version, with examples, that is meant with a view to update the knowledge and skills to be both simple and practical. Readers may note that the explanations of medical practitioners and examples provided apply mostly to randomized controlled trials who treat patients with (RCTs), cohort studies, and case-control studies. Nevertheless, similar psychiatric conditions. principles operate when these concepts are applied in epidemiologic Department of Psychopharmacology, National Institute research. Whereas the terms may be applied slightly differently in of Mental Health and Neurosciences, Bangalore, India different explanatory texts, the general principles are the same. ([email protected]). ABSTRACT Clinical Situation Risk, and related measures of effect size (for Consider a hypothetical RCT in which 76 depressed patients were categorical outcomes) such as relative risks and randomly assigned to receive either venlafaxine (n = 40) or placebo odds ratios, are frequently presented in research (n = 36) for 8 weeks. During the trial, new-onset sexual dysfunction articles. Not all readers know how these statistics was identified in 8 patients treated with venlafaxine and in 3 patients are derived and interpreted, nor are all readers treated with placebo. These results are presented in Table 1.
    [Show full text]