
Multiple Comparisons Using R

© 2011 by Taylor and Francis Group, LLC

Multiple Comparisons Using R

Frank Bretz Torsten Hothorn Peter Westfall


Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1

International Standard Book Number: 978-1-58488-574-0 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging‑in‑Publication Data

Bretz, Frank.
Multiple comparisons using R / Frank Bretz, Torsten Hothorn, Peter Westfall.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-58488-574-0 (hardcover : alk. paper)
1. Multiple comparisons (Statistics) 2. R (Computer program language) I. Hothorn, Torsten. II. Westfall, Peter H., 1957- III. Title.

QA278.4.B74 2011 519.5--dc22 2010023538

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com


To Jiamei and my parents — F.B.

To Carolin, Elke and Ludwig — T.H.

To Marilyn and Amy — P.W.

Contents

List of Figures ix

List of Tables xi

Preface xiii

1 Introduction 1

2 General Concepts 11
  2.1 Error rates and general concepts 11
    2.1.1 Error rates 11
    2.1.2 General concepts 17
  2.2 Construction methods 20
    2.2.1 Union intersection test 20
    2.2.2 Intersection union test 22
    2.2.3 Closure principle 23
    2.2.4 Partitioning principle 29
  2.3 Methods based on Bonferroni's inequality 31
    2.3.1 Bonferroni test 31
    2.3.2 Holm procedure 32
    2.3.3 Further topics 34
  2.4 Methods based on Simes' inequality 35

3 Multiple Comparisons in Parametric Models 41
  3.1 General linear models 41
    3.1.1 Multiple comparisons in linear models 41
    3.1.2 The example revisited using R 45
  3.2 Extensions to general parametric models 48
    3.2.1 Asymptotic results 48
    3.2.2 Multiple comparisons in general parametric models 50
    3.2.3 Applications 52
  3.3 The multcomp package 53
    3.3.1 The glht function 53
    3.3.2 The summary method 59
    3.3.3 The confint method 64


4 Applications 69
  4.1 Multiple comparisons with a control 70
    4.1.1 Dunnett test 71
    4.1.2 Step-down Dunnett test procedure 77
  4.2 All pairwise comparisons 82
    4.2.1 Tukey test 82
    4.2.2 Closed Tukey test procedure 93
  4.3 Dose response analyses 99
    4.3.1 A dose response study on litter weight in mice 100
    4.3.2 Trend tests 103
  4.4 Variable selection in regression models 108
  4.5 Simultaneous confidence bands 111
  4.6 Multiple comparisons under heteroscedasticity 114
  4.7 Multiple comparisons in logistic regression models 118
  4.8 Multiple comparisons in survival models 124
  4.9 Multiple comparisons in mixed-effects models 125

5 Further Topics 127
  5.1 Resampling-based multiple comparison procedures 127
    5.1.1 Permutation methods 127
    5.1.2 Using R for permutation multiple testing 137
    5.1.3 Bootstrap testing – Brief overview 140
  5.2 Group sequential and adaptive designs 140
    5.2.1 Group sequential designs 141
    5.2.2 Adaptive designs 150
  5.3 Combining multiple comparisons with modeling 156
    5.3.1 MCP-Mod: Dose response analyses under model uncertainty 157
    5.3.2 A dose response example 162

Bibliography 167

Index 183

List of Figures

1.1 Probability of committing at least one Type I error 2
1.2 Scatter plot of the thuesen data 4
1.3 Impact of multiplicity on selection bias 8

2.1 Visualization of two null hypotheses in the parameter space 24
2.2 Venn-type diagram for two null hypotheses 25
2.3 Schematic diagram of the closure principle for two null hypotheses 25
2.4 Schematic diagram of the closure principle for three null hypotheses 27
2.5 Partitioning principle for two null hypotheses 30
2.6 Schematic diagram of the closure principle for three null hypotheses under the restricted combination condition 33
2.7 Rejection regions for the Bonferroni and Simes tests 36
2.8 Power comparison for the Bonferroni and Simes tests 37

3.1 Boxplots of the warpbreaks data 54
3.2 Two-sided simultaneous confidence intervals for the Tukey test in the warpbreaks example 65
3.3 Compact letter display for all pairwise comparisons in the warpbreaks example 67

4.1 Boxplots of the recovery data 70
4.2 One-sided simultaneous confidence intervals for the Dunnett test in the recovery example 76
4.3 Closed Dunnett test in the recovery example 81
4.4 Two-sided simultaneous confidence intervals for the Tukey test in the immer example 87
4.5 Compact letter display for all pairwise comparisons in the immer example 89
4.6 Alternative compact letter display for all pairwise comparisons in the immer example 90
4.7 Mean-mean multiple comparison and tiebreaker plot for all pairwise comparisons in the immer example 92
4.8 Mean-mean multiple comparison plot for selected orthogonal contrasts in the immer example 94


4.9 Schematic diagram of the closure principle for all pairwise comparisons of four means 96
4.10 Summary plot of the Thalidomide dose response study 102
4.11 Plot of contrast coefficients for the Williams and modified Williams tests 105
4.12 Simultaneous confidence band for the difference of two linear regression models 114
4.13 Boxplots of alpha synuclein mRNA expression levels 115
4.14 Comparison of simultaneous confidence intervals based on different estimates of the covariance matrix 119
4.15 Association between smoking behavior and disease status stratified by gender 121
4.16 Simultaneous confidence intervals for the probability of suffering from Alzheimer's disease 123
4.17 Layout of the trees513 field 125
4.18 Probability of roe deer browsing damage 126

5.1 Schematic diagram of the closed hypotheses set for the adverse event data 130
5.2 of hypergeometric distributions 132
5.3 Wang-Tsiatis boundary examples 143
5.4 Hwang-Shih-DeCani spending function examples 144
5.5 Stopping boundaries for group sequential design example 147
5.6 Power plots for group sequential design example 149
5.7 Average sample sizes for group sequential design example 150
5.8 Schematic diagram of the closure principle for testing adaptively two null hypotheses 153
5.9 Disjunctive power for different adaptive design options 156
5.10 Schematic overview of the MCP-Mod procedure 158
5.11 Model shapes for the selected candidate model set 164
5.12 Fitted model for the biom data 166

List of Tables

2.1 Type I and Type II errors in multiple hypotheses testing 12
2.2 Comparison of the Holm and Hochberg procedures 39

3.1 Multiple comparison procedures available in multcomp 62

4.1 Comparison of several multiple comparison procedures for the recovery data 82
4.2 Comparison of several multiple comparison procedures for the immer data 100
4.3 Summary data of the Thalidomide dose response example 101
4.4 Summary of the alzheimer data 120

5.1 Mock-up adverse event data 128
5.2 Adverse event contingency tables 131
5.3 Upper-tailed p-values for Fisher's exact test 133
5.4 Mock-up dataset with four treatment groups 136
5.5 Summary results of a study with two treatments and four outcome variables 139
5.6 Dose response models implemented in the DoseFinding package 160


Preface

Many scientific experiments subject to rigorous statistical analyses involve the simultaneous evaluation of more than one question. Multiplicity therefore becomes an inherent problem with various unintended consequences. The most widely recognized result is that the findings of an experiment can be misleading: Seemingly significant effects occur more often than expected by chance alone, and not compensating for multiplicity can have important real world consequences. For instance, when the multiple comparisons involve drug efficacy, they may result in approval of a drug as an improvement over existing drugs, when there is in fact no beneficial effect. On the other hand, when drug safety is involved, it could happen by chance that the new drug appears to be worse for some side effect, when it is actually not worse at all. By contrast, multiple comparison procedures adjust statistical inferences from an experiment for multiplicity. Multiple comparison procedures thus enable better decision making and prevent the experimenter from declaring an effect when there is none.

Other books on multiple comparison procedures

There are different ways to approach the subject of multiple comparison procedures. Hochberg and Tamhane (1987) provided an extensive mathematical treatise on this topic. Westfall and Young (1993) focused on resampling-based multiple test approaches that incorporate stochastic dependencies in the data. Another approach is to introduce multiple comparison procedures by comparison type (multiple comparisons with a control, multiple comparisons with the best, etc.), as done in Hsu (1996). Westfall, Tobias, Rom, Wolfinger, and Hochberg (1999) chose an example-based approach, starting with simple problems and progressing to more challenging applications while introducing new methodology as needed. Dudoit and van der Laan (2008) described multiple comparison procedures relevant to genomics. Finally, Dmitrienko, Tamhane, and Bretz (2009) reviewed multiple comparison procedures used in pharmaceutical statistics by focusing on different drug development applications.

How is this book different?

The emphasis of this book is similar to the previous books in the development of the theory, but differs in the way that applications of multiple comparison procedures are presented. Too often, potential users of these methods are overwhelmed by the bewildering number of possible approaches. We adopt a unifying theme based on maximum statistics that, for example, shows Dunnett's and Tukey's methods to be essentially no different. This theme is emphasized throughout the book, as we describe the common underlying theory of multiple comparison procedures by use of several examples. In addition, we give a detailed description of available software implementations in R, a language and environment for statistical computing and graphics (Ihaka and Gentleman 1996). This book thus provides a self-contained introduction to multiple comparison procedures, with discussion and analysis of many examples using available software. In this book we focus on "classical" applications of multiple comparison procedures, where the number of comparisons is moderate and/or where strong evidence is needed.

How to read this book

In Chapter 1 we discuss the "difficult and ubiquitous problems of multiplicity" (Berry 2007). We give several characterizations and provide examples to motivate the discussion in the subsequent chapters. In Chapter 2 we give a general introduction to multiple hypotheses testing. We describe different error rates and introduce standard terminology. We also cover basic multiple comparison procedures, including the Bonferroni method and the Simes test. Chapter 3 provides the theoretical framework for the applications in Chapter 4. We briefly review standard linear model theory and show how to perform multiple comparisons in this framework. The resulting methods require the common analysis-of-variance assumptions and thus extend the basic approaches of Bonferroni and Simes. We then extend this framework and consider multiple comparison procedures in parametric models relying only on standard asymptotic normality assumptions. We further give a detailed introduction to the multcomp package in R, which provides a convenient interface to perform multiple comparisons in this general context. Applications to illustrate the results from this chapter are given in Chapter 4. Examples include the Dunnett test, Tukey's all-pairwise comparisons, and general multiple contrast tests, both for one-way layouts as well as for more complicated models with factorial treatment structures and/or covariates. We also give examples using mixed-effects models and applications to survival data. Finally, we review in Chapter 5 a selection of further multiple comparison procedures, which do not quite fit into the framework of Chapters 3 and 4. This includes the comparison of means with two-sample multivariate data using resampling-based procedures, methods for group sequential or adaptive designs, and the combination of multiple comparison procedures with modeling techniques.
Those who are facing multiple comparison problems for the first time might want to glance through Chapter 2, skip the detailed theoretical framework of Chapter 3 and proceed directly to the applications in Chapter 4. The quick reader may start with Chapter 4, which can be read mostly on its own; necessary theoretical results are linked backwards to Chapters 2 and 3. Similarly,

the selected multiple comparison procedures in Chapter 5 can be read without necessarily having read the previous chapters.

Who should read this book

This book is for statistics teachers and students, undergraduate or graduate, who are learning about multiple comparison procedures, whether in a stand-alone course on multiple testing, a course in analysis-of-variance, a course in multivariate analysis, or a course in nonparametrics. Additionally, the book is helpful for scientists who need to use multiple comparison procedures, including biometricians, clinicians, medical doctors, molecular biologists, agricultural analysts, etc. It is oriented towards users (i) who have only limited knowledge of multiple comparison procedures but need to apply those techniques in R, and (ii) who are already familiar with multiple comparison procedures but lack the related implementations in R. We assume that the reader has a basic knowledge of R and can perform elementary data handling steps. Otherwise, we recommend standard textbooks on R for a quick introduction; see Dalgaard (2002) and Everitt and Hothorn (2009) among others.

Conventions used in this book

We use bold-face capital letters (e.g., C, R, X, ...) to denote matrices and bold-face small letters (e.g., c, x, y, ...) to indicate vectors. If the distinction between column and row vectors matters, we introduce vectors as column vectors. A transpose of a vector or matrix is indicated by the superscript ⊤; for example, c⊤ denotes the transpose of the (column) vector c. To simplify the notation, we do not distinguish between random variables and their observed values, unless noted explicitly. We use many code examples in R throughout this book. Samples of code that could be entered interactively at the R command line are formatted as follows:

R> library("multcomp")

Here, R> denotes the prompt sign from the R command line and the user enters everything else. In some instances the expressions to be entered will be longer than a single line and it will appear as follows:

R> summary(glht(recovery.aov, linfct = mcp(blanket = contr),
+     alternative = "less"))

The symbol + indicates additional lines which are appropriately indented. Finally, output produced by function calls is shown below the associated code:

R> rnorm(10)
 [1]  0.0310  1.3378  0.4158  0.0413 -1.4383  1.0761 -1.2687
 [8]  0.8262 -0.8603 -0.7134

Computational details and reproducibility

In this book, we use several R packages to access different example datasets (such as ISwR, MASS, etc.), standard functions for the general parametric analyses (such as aov, lm, nlme, etc.) and the multcomp package to perform the multiple comparison procedures. All of the packages used in this book are available at the Comprehensive R Archive Network (CRAN), which can be accessed from http://CRAN.R-project.org. The source code for the analyses presented in this book is available from the multcomp package. A demo containing the R code to reproduce the individual results is available for each chapter by invoking

R> library("multcomp")
R> demo("Ch_Intro")
R> demo("Ch_Theory")
R> demo("Ch_GLM")
R> demo("Ch_Appl")
R> demo("Ch_Misc")

The results presented in this book and obtained with the multcomp package in general have been validated to the extent possible against the SAS macros described in Westfall et al. (1999). The multcomp package also includes SAS code for most of the examples presented in Chapters 1, 3 and 4. It can be accessed in R via

R> file.show(system.file("MCMT", "multcomp.sas",
+     package = "multcomp"))

Readers should not expect to get exactly the same results as stated in this book. For example, differences in random number generating seeds or truncating the number of significant digits in the results may result in slight differences to the output shown here.

Acknowledgments

This book started as a presentation at the 2004 UseR! conference in Vienna, Austria. Many individuals have influenced the writing of this book since then. Our joint work over many years with Werner Brannath, Edgar Brunner, Alan Genz, Ekkehard Glimm, Gerhard Hommel, Ludwig Hothorn, Jason Hsu, Franz König, Willi Maurer, José Pinheiro, Martin Posch, Stephen Senn, Klaus Strassburger, Ajit Tamhane, Randy Tobias, and James Troendle had a direct impact on this book and we are grateful for the many fruitful discussions. We are indebted to Sylvia Davis and Kevin Henning for their close reading and criticism of an earlier version of this book. Their comments and suggestions have greatly improved the presentation of the material. Björn Bornkamp, Richard Heiberger, Wei Liu, Hans-Peter Piepho, and Gernot Wassmer read sections of the manuscript and provided much needed help during the progress of this book. We are especially indebted to Richard Heiberger who, over many years, provided valuable feedback on new functionality in multcomp and donated code to this project. Achim Zeileis gave suggestions on the new user interface and was, is, and hopefully will continue being someone to discuss computational and other problems with. André Schützenmeister developed the code for producing compact letter displays in multcomp. We also thank the book's acquisitions editor, Rob Calver, for his support and his patience throughout this book publishing project.

Frank Bretz, Torsten Hothorn and Peter Westfall
Basel, München and Lubbock

CHAPTER 1

Introduction

Many scientific experiments subject to rigorous statistical analyses involve the simultaneous evaluation of more than one question. For example, in clinical trials one may compare more than one treatment group with a control group, assess several outcome variables, measure at various time points, analyze multiple subgroups or look at any combination of these and related questions; but multiplicity problems occur if we want to make simultaneous inference across multiple questions. Similar problems may arise in agricultural field experiments which simultaneously compare several irrigation systems, investigate the dose response relationship of a fertilizer, involve repeated assessments of growth curves for a particular culture, etc. Recently, high-dimensional screening studies have become widely available in molecular biology and its applications, such as gene expression experiments and high throughput screenings in early drug discovery. Those screening studies have in common the problem of identifying a small subset of relevant variables from a huge set of candidate variables (e.g., genes, compounds, proteins). Scientific research provides many examples of well-designed experiments involving multiple investigational questions. Multiplicity is likely to become important when strong evidence and good decision making is required. In hypothesis testing problems involving a single null hypothesis H, the statistical tests are often chosen to control the Type I error rate of incorrectly rejecting H at a pre-specified significance level α. If multiple hypotheses, m say, are tested simultaneously and the final inferences should be valid across all experimental questions of interest, the probability of declaring non-existing effects significant increases in m. Assume, for example, that m = 2 hypotheses H1 and H2 are each tested at level α = 0.05 using independent test statistics.
For example, let Hi, i = 1, 2, denote the null hypotheses that a drug does not show a beneficial effect over placebo for two primary outcome variables. Assume further that both H1 and H2 are true. Then the probability of retaining both hypotheses is (1 − α)^2 = 0.9025 under the independence assumption. The complementary probability of incorrectly rejecting at least one null hypothesis is 1 − (1 − α)^2 = 2α − α^2 = 0.0975. This is substantially larger than the initial significance level of α = 0.05. In general, when testing m null hypotheses using independent test statistics, the probability of committing at least one Type I error is 1 − (1 − α)^m, which reduces to the previous expression for m = 2. Figure 1.1 displays the probability of committing at least one Type I error for m = 1, ..., 100 and α = 0.01, 0.05, and 0.10. Clearly, the probability quickly reaches 1 for sufficiently large values of m. In other words, if there is


a large number of experimental questions and no multiplicity adjustment, the decision maker will commit a Type I error almost surely and conclude that there is a seemingly significant effect when there is none.


Figure 1.1 Probability of committing at least one Type I error for different sig- nificance levels α and number of hypotheses m.
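The familywise error probability just discussed is easy to verify numerically. The following short sketch (ours, not part of the book's demo code) recomputes the m = 2 case and the quantity plotted in Figure 1.1:

```r
# P(at least one Type I error) among m independent tests,
# each performed at significance level alpha: 1 - (1 - alpha)^m
fwer <- function(m, alpha = 0.05) 1 - (1 - alpha)^m

fwer(2)    # 0.0975, the value derived above for m = 2
fwer(100)  # close to 1: a Type I error is almost certain

# Reproduces the alpha = 0.05 curve of Figure 1.1
curve(fwer(x), from = 1, to = 100, xlab = "m",
      ylab = "P(at least one Type I error)")
```

Since fwer is vectorized in m, the same function serves both for single values and for plotting the whole curve.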

To further illustrate the multiplicity issue, consider the problem of testing whether or not a given coin is fair. One may conclude that the coin was biased if after 10 flips the coin landed heads at least 9 times. Indeed, if one assumes as a null hypothesis that the coin is fair, then the likelihood that a fair coin would come up heads at least 9 out of 10 times is 11 · (0.5)^10 = 0.0107. This is relatively unlikely, and under common statistical criteria such as whether the p-value is less than or equal to 0.05, one would reject the null hypothesis and conclude that the coin is unfair. While this approach may be appropriate for testing the fairness of a single coin, applying the same approach to test the fairness of many coins could lead to a multiplicity problem. Imagine if one

was to test 100 fair coins by this method. Given that the probability of a fair coin coming up heads 9 or 10 times in 10 flips is 0.0107, one would expect that in flipping 100 fair coins 10 times each, it would still be very unlikely to see a particular (i.e., pre-specified) coin come up heads 9 or 10 times. However, it is more likely than not that at least one coin will behave this way, no matter which one. To be more precise, the likelihood that all 100 fair coins are identified as fair by this criterion is only (1 − 0.0107)^100 = 0.34. In other words, the likelihood of declaring at least one of the 100 fair coins as unfair is 0.66; this result can also be approximated from Figure 1.1. Therefore, the application of our single-test coin-fairness criterion to multiple comparisons would more likely than not lead to a false conclusion: We would mistakenly identify a fair coin as unfair.

Let us consider a real data example to illustrate yet a different perspective of the multiplicity problem. Consider the thuesen regression example from Altman (1991) and reanalyzed by Dalgaard (2002). The data are available from the ISwR package (Dalgaard 2010) and contain the ventricular shortening velocity and blood glucose measurements for 23 Type I diabetic patients with complete observations. In Figure 1.2 we show a scatter plot of the data including the regression line. Assume that we are interested in fitting a linear regression model and subsequently in testing whether the intercept or the slope equal 0, resulting in two null hypotheses of interest. We will consider this example in more detail in Chapter 3, but we use it here to illustrate some of the multiplicity issues and how to address them in R. Consider the linear model fit

R> thuesen.lm <- lm(short.velocity ~ blood.glucose,
+     data = thuesen)
R> summary(thuesen.lm)

Call:
lm(formula = short.velocity ~ blood.glucose, data = thuesen)

Residuals:
    Min      1Q  Median      3Q     Max
 -0.401  -0.148  -0.022   0.030   0.435

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)     1.0978     0.1175    9.34  6.3e-09 ***
blood.glucose   0.0220     0.0105    2.10    0.048 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.217 on 21 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared: 0.174,  Adjusted R-squared: 0.134
F-statistic: 4.41 on 1 and 21 DF,  p-value: 0.0479


Figure 1.2 Scatter plot of the thuesen data with regression line.

From the p-values associated with the intercept (p < 0.001) and the slope (p = 0.048) one might be tempted to conclude that both parameters differ significantly from zero at the significance level α = 0.05. However, these assessments are based on the marginal p-values, which do not account for a multiplicity adjustment. A better approach is to adjust the marginal p-values with a suitable multiple comparison procedure, such as the Bonferroni test. When applying the Bonferroni method, the marginal p-values are essentially multiplied by the number of null hypotheses to be tested (that is, by 2 in our example). One possibility to perform the Bonferroni test in R is to use the multcomp package. Understanding the details of the statements below will become clear when introducing the multcomp package formally in Chapter 3. For now we prefer to illustrate the key ideas without getting distracted by theoretical details.

R> library("multcomp")
R> thuesen.mc <- glht(thuesen.lm, linfct = diag(2))
R> summary(thuesen.mc, test = adjusted(type = "bonferroni"))

	Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = short.velocity ~ blood.glucose, data = thuesen)

Linear Hypotheses:
       Estimate Std. Error t value Pr(>|t|)
1 == 0   1.0978     0.1175    9.34  1.3e-08 ***
2 == 0   0.0220     0.0105    2.10    0.096 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- bonferroni method)

In the column entitled Pr(>|t|), the slope parameter now fails to be significant. The associated adjusted p-value is 0.096, twice as large as the unadjusted p-value from the previous analysis. This is because now the p-value is corrected (i.e., adjusted) for multiplicity using the Bonferroni test. Thus, if the original aim was to draw a simultaneous inference across both experimental questions (that is, whether intercept or slope equal zero), one would conclude that the intercept is significantly different from 0 but the slope is not.

The multcomp package provides a variety of powerful improvements over the Bonferroni test. As a matter of fact, when calling the glht function above, the default approach accounts for the correlations between the parameter estimates. This leads to smaller p-values compared to the Bonferroni test, as shown in the output below:

R> summary(thuesen.mc)

	Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = short.velocity ~ blood.glucose, data = thuesen)

Linear Hypotheses:
       Estimate Std. Error t value Pr(>|t|)
1 == 0   1.0978     0.1175    9.34    1e-08 ***
2 == 0   0.0220     0.0105    2.10    0.064 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

The adjusted p-value for the slope parameter is now 0.064 and thus considerably smaller than the 0.096 from the Bonferroni test, although we still have not achieved significance. During the course of this book we will discuss even more powerful methods, which allow the user to safely claim that the slope is significant at the level α = 0.05 after having adjusted for multiplicity.
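As a quick cross-check (our sketch, not from the book's demos), the Bonferroni-adjusted p-values can also be reproduced with the p.adjust function from base R, which simply multiplies each marginal p-value by the number of hypotheses, truncating at 1. The correlation-aware single-step p-values, by contrast, require the joint distribution of the test statistics and are not available this way:

```r
# Marginal p-values reported by summary(thuesen.lm):
# intercept and slope, in that order
p_raw <- c(6.3e-09, 0.048)

# Multiply by the number of hypotheses (here 2), capped at 1
p.adjust(p_raw, method = "bonferroni")
# [1] 1.26e-08 9.60e-02
```

These values match the adjusted p-values from the glht call with type = "bonferroni" (1.3e-08 after rounding, and 0.096).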

In summary, this example illustrates (i) the necessity of using proper multiple comparison procedures when aiming at simultaneous inference; (ii) a variety of different multiplicity adjustments, some of which are less powerful than others and should be avoided whenever possible; and (iii) the availability of flexible interfaces in R, such as the multcomp package, which provide fast results for data analysts interested in simultaneous inferences of multiple hypotheses.

We conclude this chapter with a few general considerations. Broadly speaking, multiplicity is a concern whenever reproducibility of results is required. That is, if an experiment needs to be repeated, not initially adjusting for multiplicity may increase the likelihood of observing different results in later experiments. Because scientific research and development can often be regarded as a sequence of experiments that confirm or add to the understanding of previously established results, reproducibility is an important aspect. According to Westfall and Bretz (2010), failure can happen in at least three ways. Assume that we are interested in assessing the effect of multiple conditions (treatments, genes, ...). It may then happen that

(i) a condition has an effect in the opposite direction than was reported in an experiment;
(ii) a condition has an effect, but one that is substantially smaller than was reported in an experiment;
(iii) a condition is ineffective, despite being reported as efficacious in an experiment.

Such failures to replicate are commonly attributed to flawed study designs and various types of biases; however, multiplicity is as likely a culprit. Further details about the three types of replication errors given above follow.

Errors of declaring effects in the "wrong direction". This is related to the problem of testing two-sided hypotheses with associated directional decisions.
Although one might not believe point-zero null hypotheses can truly exist, two-sided tests are common and directional errors are real. It is likely that directions of the claims are erroneous when multiple comparisons are performed without multiplicity adjustment. For example, if a test of a two-sided hypothesis is done at the 0.05 level, then there is (at most) a 0.025 probability that the test will be declared significant, but in the wrong direction. When multiple tests are performed, this probability increases, so that if 40 tests are performed, we may expect one directional error (in a worst case). Note that although the probability of committing a directional error may be small, it has a severe impact once it is made, because it leads to a decision in the wrong direction.

Errors of declaring inflated effect sizes. A second characterization of the multiplicity problem is the impact of selection effects. In this scenario, we need not postulate directional errors. In fact, we may believe with a priori certainty that all effects are in their expected directions. Nevertheless, when we isolate a single, "most significant" comparison from this collection, we can only presume that the estimated effect size is biased upward due to selection effects.

Assume, for example, an experiment where we investigate m treatments. If at the final analysis we select the "best" treatment based on the observed results, then the associated naïve treatment estimate will be biased upward. To illustrate this phenomenon, we assume that m random variables x1, . . . , xm associated with m distinct treatments are independently standard normally distributed. We consider selecting the treatment with the maximum observed value max{x1, . . . , xm}. Figure 1.3 displays the resulting density curves for different values of m. The shift in distribution towards larger values for increasing m is evident. In other words, if we would repeat an experiment a large number of times for m = 5 (say) treatment groups and each time report the observed estimate for the selected treatment (that is, the one with the largest observed value), then the average reported estimate would be around 1 instead of 0 (which is the true value). The same pattern holds if the true effects are of equal size, but different from 0. Note that in practice the true bias may even be larger than suggested in Figure 1.3, if one were to only report the treatment effect estimates from successful experiments with statistically significant results. That is, in practice the selection bias observed in Figure 1.3 may be confounded with reporting bias (Bretz, Maurer, and Gallo 2009c). Finally, note from Figure 1.3 that multiplicity does not impact only the location of the distribution, but leads also to a reduction in the variability and an increase in the skewness as m increases.

Errors of declaring effects when none exist. The classical characterization of multiplicity is in terms of the "1 in 20" significance criterion: In 20 tests of hypotheses, all of which are (unknown to the analyst) truly null, we expect to commit one Type I error and incorrectly reject one of the 20 null hypotheses.
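The upward bias of the selected "best" estimate is easy to reproduce by simulation. The following sketch is written in Python rather than the R used throughout this book, and the simulation sizes are arbitrary; it draws m standard normal "treatment estimates", keeps the maximum, and averages over many repetitions:

```python
import random

random.seed(1)

def mean_of_max(m, n_sim=20000):
    """Average of max(x_1, ..., x_m) over n_sim simulated experiments,
    where each x_i is a standard normal treatment estimate."""
    total = 0.0
    for _ in range(n_sim):
        total += max(random.gauss(0.0, 1.0) for _ in range(m))
    return total / n_sim

# The true effect is 0 in every arm, yet the reported estimate for the
# selected ("best") arm drifts upward as the number of arms m grows.
for m in (1, 2, 5, 10):
    print(m, round(mean_of_max(m), 2))
```

For m = 5 the average selected estimate comes out close to 1, matching the shift visible in Figure 1.3, even though every true effect is exactly 0.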
Thus, multiple testing can increase the likelihood of Type I errors. This characterization is closely related to the motivating discussion at the beginning of this chapter and the probabilities displayed in Figure 1.1. In fact, much of the material in this book is devoted to this classical characterization and the description of suitable multiple comparison procedures, which account for the impact of multiple significance testing.

While we often attribute lack of replication to poor designs and data collection procedures, we should also consider selection effects related to multiplicity as a cause. In many cases these effects can be subtle. Consider, for example, a clinical efficacy measure taken one month after administration of a drug. The efficacy can be determined (a) using the raw measure, (b) using the percentage change from baseline, (c) using the actual change from baseline or (d) using the baseline covariate-adjusted raw measure. If we follow an aggressive strategy and choose the "best" (and most significant) measure, then the reported effect size measure will clearly be inflated, because the maximal statistic capitalizes (unfairly) on random variations in the data. In such a case, it is not surprising that follow-up studies may produce less stellar results; this phenomenon is an example of regression to the mean.

In all three characterizations above, there is a concern that the presentation of the scientific findings from an experiment may be exaggerated. In some areas

Figure 1.3 Density curves of the effect estimate after selecting the "best" treatment with the maximum observed response for different numbers m of treatments.

(especially in the health care environment) regulatory agencies recognized this problem and released corresponding (international) guidelines to ensure good statistical practice. In 1998, the International Conference on Harmonization published a tripartite guideline for statistical principles in clinical trials (ICH 1998), which reflects concerns with multiplicity:

When multiplicity is present, the usual frequentist approach to the analysis of data may necessitate an adjustment to the Type I error. Multiplicity may arise, for example, from multiple primary variables, multiple comparisons of treatments, repeated evaluation over time, and/or interim analyses ... [multiplicity] adjustment should always be considered and the details of any adjustment procedure or an explanation of why adjustment is not thought to be necessary should be set out in the analysis plan.

In addition, the European Medicines Agency in its Committee for Proprietary

Medicinal Products document "Points to Consider on Multiplicity Issues in Clinical Trials" (EMEA 2002), states that

... multiplicity can have a substantial influence on the rate of false positive conclusions whenever there is an opportunity to choose the most favorable results from two or more analyses ...

and later echoes the ICH recommendation to state details of the multiple comparison procedure in the analysis plan. While these documents allow that multiplicity adjustment might not be necessary, they also request justifications for such action. As a result, pharmaceutical companies have routinely begun to incorporate adequate multiple comparison procedures in their statistical analyses. But even if guidelines are not available or do not apply, control of multiplicity is to the experimenter's advantage as it ensures better decision making and safeguards against selection bias.

CHAPTER 2

General Concepts and Basic Multiple Comparison Procedures

The objective of this chapter is to introduce general concepts related to multiple testing as well as to describe basic strategies for constructing multiple comparison procedures. In Section 2.1 we introduce relevant error rates used for multiple hypotheses test problems. In addition, we discuss general concepts, such as adjusted and unadjusted p-values, single-step and stepwise test procedures, etc. In Section 2.2 we review different principles of constructing multiple comparison procedures, including union-intersection tests, intersection-union tests, the closure principle and the partitioning principle. We then review commonly used multiple comparison procedures constructed from marginal p-values. Methods presented here include Bonferroni-type methods and their improvements (Section 2.3) as well as modified Bonferroni methods based on the Simes inequality (Section 2.4). For each method, we describe its assumptions, advantages and limitations.

2.1 Error rates and general concepts

In this section we introduce relevant error rates for multiple comparison procedures, which extend the familiar error rates used when testing a single null hypothesis. We also introduce general concepts important to multiple hypotheses testing, including weak and strong error rate control, adjusted and unadjusted p-values, single-step and stepwise test procedures, simultaneous confidence intervals, coherence, consonance and the properties of free and restricted combinations. For more detailed discussions of these and related topics we refer the reader to the books by Hochberg and Tamhane (1987) and Hsu (1996) as well as to the articles on general multiple test methodology by Lehmann (1957a,b); Gabriel (1969); Marcus, Peritz, and Gabriel (1976); Sonnemann (1982); Stefansson, Kim, and Hsu (1988); Finner and Strassburger (2002) and the references therein.

2.1.1 Error rates

Type I error rates

Let m ≥ 1 denote the number of null hypotheses H1,...,Hm to be tested. Assume each elementary hypothesis Hi, i = 1, . . . , m, is associated with a given experimental question of interest. The number m of null hypotheses is


application specific and can vary substantially from one case to another. For example, standard clinical dose finding studies compare a small number of distinct dose levels of a new drug, m = 4 say, with a control treatment. The elementary hypothesis Hi then states that the treatment effect at dose level i is not better than the effect under the control, i = 1, . . . , 4. The number m of hypotheses can be in the 1,000s in a microarray experiment, where a null hypothesis Hi may state that gene i is not differentially expressed under two comparative conditions (Dudoit and van der Laan 2008). Finally, the number of hypotheses is infinite when constructing simultaneous confidence bands over a continuous covariate region (Liu 2010).

For any test problem, there are three types of errors. A false positive decision occurs if we declare an effect when none exists. Similarly, a false negative decision occurs if we fail to declare a truly existing effect. In hypotheses test problems, these errors are denoted as Type I and Type II errors, respectively. The correct rejection of a null hypothesis coupled with a wrong directional decision is denoted as a Type III error. The related notation is summarized in Table 2.1. Let M = {1, . . . , m} denote the index set associated with the null hypotheses H1, . . . , Hm and let M0 ⊆ M denote the set of m0 = |M0| true hypotheses, where |A| denotes the cardinality of a set A. In Table 2.1, V denotes the number of Type I errors and R the number of rejected hypotheses. Note that R is an observable random variable, S, T, U, and V are all unobservable random variables, while m and m0 are fixed numbers, where m0 is unknown.

Hypotheses   Not Rejected   Rejected   Total
True         U              V          m0
False        T              S          m − m0
Total        W              R          m

Table 2.1 Type I and II errors in multiple hypotheses testing.
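The quantities in Table 2.1 can be made concrete with a small simulation. This is an illustrative Python sketch, not code from the book; the split into 80 true and 20 false null hypotheses and the effect size 3 are arbitrary choices. Only R is observable; tallying U, V, T and S requires knowing the (normally unknown) truth:

```python
import math
import random

random.seed(42)

m, m0, alpha = 100, 80, 0.05            # 80 true nulls, 20 false nulls

def p_value(z):
    """One-sided p-value of a z-statistic, P(Z >= z) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

truth = [True] * m0 + [False] * (m - m0)        # True = null hypothesis holds
zs = [random.gauss(0.0 if null else 3.0, 1.0) for null in truth]
reject = [p_value(z) <= alpha for z in zs]      # unadjusted level-alpha tests

V = sum(1 for r, null in zip(reject, truth) if r and null)      # Type I errors
S = sum(1 for r, null in zip(reject, truth) if r and not null)  # correct rejections
U, T, R = m0 - V, (m - m0) - S, sum(reject)
print("U =", U, "V =", V, "T =", T, "S =", S, "R =", R)
```

The row sums U + V = m0 and T + S = m − m0 hold by construction, and R = V + S is the only quantity the analyst actually gets to see.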

A standard approach in univariate hypothesis testing (m = 1) is to choose an appropriate test, which maintains the Type I error rate at a pre-specified significance level α. Different extensions of the univariate Type I error rate to multiple test problems are possible. In the following we review several error rate definitions commonly used for multiple comparison procedures. The per-comparison error rate

PCER = E(V/m)

is the expected proportion of Type I errors among the m decisions. If each of the m hypotheses is tested separately at a pre-specified significance level α, then PCER = αm0/m ≤ α. In many applications, however, controlling the per-comparison error rate at

level α is not considered adequate. Instead, the hypotheses H1, . . . , Hm are considered jointly as a family, where a single Type I error already leads to an incorrect decision. This motivates the use of the familywise error rate

FWER = P(V > 0),

which is the probability of committing at least one Type I error. The familywise error rate is the most common error rate used in multiple testing, particularly historically, and also in current practice where the number of comparisons is moderate and/or where strong evidence is needed. The familywise error rate is closely related to the motivating considerations from Chapter 1. From Figure 1.1 we can see that FWER approaches 1 for moderate to large numbers of hypotheses m if there is no multiplicity adjustment. Note that the familywise error rate reduces to the common Type I error rate for m = 1.

When the number m of hypotheses is very large and/or when strong evidence is not required (as is typically the case in high-dimensional screening studies in molecular biology or early drug discovery), control of the familywise error rate can be too strict. That is, if the probability of missing differential effects is too high, the use of the familywise error rate may not be appropriate. In this case, a straightforward extension of the familywise error rate is to consider the probability of committing more than k Type I errors for a pre-specified number k: If the total number of hypotheses m is large, a small number k of Type I errors may be acceptable. This leads to the generalized familywise error rate

gFWER = P(V > k);

see Victor (1982); Hommel and Hoffmann (1988), and Lehmann and Romano (2005) for details.
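A short simulation illustrates how quickly the familywise error rate escalates without adjustment, and how the simple Bonferroni correction (testing each hypothesis at level α/m; see Section 2.3) restores control. This is an illustrative Python sketch with arbitrary simulation sizes, not code from the book:

```python
import math
import random

random.seed(0)

def fwer(m, alpha=0.05, adjust=False, n_sim=5000):
    """Estimate P(V > 0) when all m null hypotheses are true, using two-sided
    z-tests, optionally at the Bonferroni-adjusted level alpha/m."""
    level = alpha / m if adjust else alpha
    hits = 0
    for _ in range(n_sim):
        ps = (math.erfc(abs(random.gauss(0.0, 1.0)) / math.sqrt(2))
              for _ in range(m))            # two-sided p-values under the null
        hits += any(p <= level for p in ps)
    return hits / n_sim

# Unadjusted FWER follows 1 - (1 - alpha)^m; Bonferroni stays near alpha.
for m in (1, 10, 100):
    print(m, round(fwer(m), 3), round(fwer(m, adjust=True), 3))
```

For independent tests the unadjusted estimates track 1 − 0.95^m (about 0.40 for m = 10 and 0.99 for m = 100), which is exactly the escalation displayed in Figure 1.1.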
Control of the generalized familywise error rate is less stringent than control of the familywise error rate at a common significance level α: A method controlling the generalized familywise error rate may allow, with high probability, a few Type I errors, provided that the number is less than or equal to the pre-specified number k. On the other hand, methods controlling the familywise error rate ensure that the probability of committing any Type I error is bounded by α.

An alternative approach is to relate the number V of false positives to the total number R of rejections. Let Q = V/R if R > 0 and Q = 0 otherwise. The false discovery rate

FDR = E(Q) = E(V/R | R > 0) P(R > 0) + 0 · P(R = 0) = E(V/R | R > 0) P(R > 0)

is then the expected proportion of falsely rejected hypotheses among the rejected hypotheses (Benjamini and Hochberg 1995). Earlier ideas related to the false discovery rate can be found in Seeger (1968) and Sorić (1989). The introduction of the false discovery rate has initiated the investigation of alternative error control criteria and many further measures have been proposed recently. Storey (2003), for example, proposed the positive false discovery rate pFDR =


E(V/R | R > 0), which is defined as the expected proportion of falsely rejected hypotheses among the rejected hypotheses given that some are rejected. A different concept is to control the proportion V/R directly: Korn, Troendle, McShane, and Simon (2004) and van der Laan, Dudoit, and Pollard (2004) independently introduced computer-intensive multiple comparison procedures to control the proportion of false positives PFP = P(V/R > g), 0 < g < 1. For a recent review we refer the reader to Benjamini (2010) and the references therein.

In general, PCER ≤ FDR ≤ FWER for a given multiple comparison procedure. This can be seen by noting that V/m ≤ Q ≤ 1{V > 0}, where the indicator function 1A = 1 if an event A is true and 1A = 0 otherwise, and that PCER = E(V/m), FDR = E(Q), and FWER = E(1{V > 0}). Thus, a multiple comparison procedure which controls the familywise error rate also controls the false discovery rate and the per-comparison error rate, but not vice versa. In contrast, familywise error rate controlling procedures are more conservative than false discovery rate controlling procedures in the sense that they lead to a smaller number of rejected hypotheses. In any case, good scientific practice requires the specification of the Type I error rate control to be done prior to the data analysis.

For any of the error concepts above, the error control is denoted as weak, if the Type I error rate is controlled only under the global null hypothesis

H = ∩_{i ∈ M0} Hi,   M0 = M,

which assumes that all null hypotheses H1, . . . , Hm are true. Consequently, in case of controlling the familywise error rate in the weak sense, it is required that P(V > 0 | H) ≤ α. Consider an experiment investigating a new treatment for multiple outcome variables. Controlling the familywise error rate in the weak sense then implies the control of the probability of declaring an effect for at least one outcome variable, when there is no effect on any variable.

In practice, however, it is unlikely that all null hypotheses are true and the global null hypothesis H is rarely expected to hold. Thus, a stronger error rate control under less restrictive assumptions is often necessary. If, for a given multiple comparison procedure, the Type I error rate is controlled under any partial configuration of true and false null hypotheses, the error control is denoted as strong. For example, to control the familywise error rate strongly it is required that

max_{I ⊆ M} P(V > 0 | ∩_{i ∈ I} Hi) ≤ α,

where the maximum is taken over all possible configurations ∅ ≠ I ⊆ M of true null hypotheses. In the previous example, controlling the familywise error

rate in the strong sense implies the control of the probability of declaring an effect for at least one outcome variable, regardless of the effect sizes for any of the outcome variables.

Note that if m0 = 0, then V = 0 and FDR = 0. If m0 = m, then FDR = E(1 | R > 0) P(R > 0) = P(R > 0) = FWER. Hence, any false discovery rate controlling multiple comparison procedure also controls the familywise error rate in the weak sense.

Having introduced different measures for Type I errors in multiple test problems, we are now able to formally define a multiple comparison procedure as any statistical test procedure designed to account for and properly control the multiplicity effect through a suitable error rate. In this book, we focus on applications where the number of comparisons is moderate and/or where strong evidence is needed. Thus, we restrict our attention to multiple comparison procedures controlling the familywise error rate in the strong sense.

The appropriate choice of null hypotheses being of primary interest is a controversial question. That is, it is not always clear which set of hypotheses should constitute the family H1, . . . , Hm. This topic has often been in dispute and there is no general consensus. Any solution will necessarily be application specific and at its best serve as an example for other areas. Westfall and Bretz (2010), for example, provided some guidance on when and how to adjust for multiplicity at different stages of drug development.
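The ordering PCER ≤ FDR ≤ FWER also means that false discovery rate control does not imply familywise error rate control. The sketch below (illustrative Python, not code from the book; the 10/10 split of true and false nulls and the effect size 3 are arbitrary) implements the Benjamini-Hochberg step-up procedure and estimates both error rates; the empirical FDR stays below the nominal q = 0.05 while the empirical FWER does not:

```python
import math
import random

random.seed(7)

def bh_reject(ps, q=0.05):
    """Benjamini-Hochberg step-up: with sorted p-values p_(1) <= ... <= p_(m),
    find the largest k with p_(k) <= k*q/m and reject the k smallest."""
    m = len(ps)
    order = sorted(range(m), key=lambda i: ps[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if ps[i] <= rank * q / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected

def simulate(n_sim=2000, m=20, m0=10):
    """Empirical FDR = E(Q) and FWER = P(V > 0) for the BH procedure."""
    sum_q, fwer_hits = 0.0, 0
    for _ in range(n_sim):
        truth = [i < m0 for i in range(m)]              # first m0 nulls are true
        zs = [random.gauss(0.0 if null else 3.0, 1.0) for null in truth]
        ps = [0.5 * math.erfc(z / math.sqrt(2)) for z in zs]  # one-sided p-values
        rej = bh_reject(ps)
        V = sum(1 for r, null in zip(rej, truth) if r and null)
        R = sum(rej)
        sum_q += V / R if R > 0 else 0.0
        fwer_hits += V > 0
    return sum_q / n_sim, fwer_hits / n_sim

fdr, fwer = simulate()
print("FDR ~", round(fdr, 3), " FWER ~", round(fwer, 3))
```

Under independence the BH procedure controls the FDR at q·m0/m (here about 0.025), yet its probability of at least one Type I error is far above 0.05, illustrating why this book restricts attention to procedures controlling the familywise error rate when strong evidence is required.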

Type II error rates

A common requirement for any statistical test is to maximize the power and thereby to minimize the Type II error rate for a given Type I error criterion. Power considerations are thus an integral part of designing a scientific experiment. Analogous to extending the Type I error rate, power can be generalized in various ways when moving from single to multiple hypotheses test problems. Power concepts to measure an experiment's success are then associated with the probability of rejecting an elementary null hypothesis Hi, i ∈ M, when in fact Hi is not true. The problem is that the individual events

"Hi is rejected", i ∈ M,

can be combined in different ways, thus leading to different measures of success. Below we use the notation from Table 2.1 to briefly review some common power concepts. The individual power

π_i^ind = P(reject Hi), i ∈ M1 = M \ M0,

is the rejection probability for a false hypothesis Hi. The concept of average power is closely related to individual power. It is defined as the expected number of correct rejections divided by the number of false null hypotheses, that is,

π^ave = E(S)/m1 = (1/m1) Σ_{i ∈ M1} π_i^ind,

where m1 = |M1| denotes the number of false null hypotheses. Alternatively,

the disjunctive power

π^dis = P(S ≥ 1)

is the probability of rejecting at least one false null hypothesis. An appealing feature of the disjunctive power is that π^dis decreases to the familywise error rate as the effect sizes related to the false null hypotheses Hi, i ∈ M1, decrease. In contrast, the conjunctive power

π^con = P(S = m1)

is the probability of rejecting all false null hypotheses. Note that disjunctive and conjunctive power have also been referred to as multiple (or minimal) and total (or complete) power, respectively; see Maurer and Mellein (1988) and Westfall et al. (1999). But since π^dis ≥ π^con, these naming conventions often lead to confusion (Senn and Bretz 2007). When the family of tests consists of pairwise mean comparisons, the previously mentioned power measures have been introduced as per-pair power, any-pair power, and all-pairs power (Ramsey 1978). Finally, it should be noted that these power definitions are readily extended to any subset M1′ ⊆ M1 of false null hypotheses. Note also that all of these probabilities are conditional on which null hypotheses are true and which are false.

The relevant practical question is to determine the appropriate power concept to use for a given study. One may argue that conjunctive power should be used in studies that aim at detecting all existing effects, such as in intersection union settings, see Section 2.2.2. Disjunctive power is recommended in studies that aim at detecting at least one true effect, such as in union intersection settings, see Section 2.2.1. Individual power is appealing in clinical trials with multiple secondary outcome variables (Bretz, Maurer, and Hommel 2010) and average power can be useful for comparing different multiple comparison procedures. In general, however, a suitable power definition can be given only on a case-by-case basis by choosing power measures tailored to the study objectives.
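The different power concepts can be compared directly by simulation. The following Python sketch (illustrative only; the three effect sizes 1, 2 and 3 are arbitrary) estimates average, disjunctive and conjunctive power for m1 = 3 false null hypotheses tested by unadjusted one-sided z-tests:

```python
import random

random.seed(3)

def powers(effects, n_sim=4000):
    """Estimate average (E(S)/m1), disjunctive (P(S >= 1)) and conjunctive
    (P(S = m1)) power for one-sided 5% z-tests of m1 false null hypotheses."""
    m1 = len(effects)
    crit = 1.645                        # one-sided 5% normal critical value
    sum_s, dis, con = 0, 0, 0
    for _ in range(n_sim):
        s = sum(1 for mu in effects if random.gauss(mu, 1.0) > crit)
        sum_s += s
        dis += s >= 1
        con += s == m1
    return sum_s / (n_sim * m1), dis / n_sim, con / n_sim

ave, dis, con = powers([1.0, 2.0, 3.0])
print(round(ave, 2), round(dis, 2), round(con, 2))   # expect con < ave < dis
```

With these effects the individual powers are roughly 0.26, 0.64 and 0.91, so π^con comes out near 0.15 while π^dis is close to 0.98, which illustrates how misleading the labels "minimal" and "total" power can be.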

Directional errors

A particular issue arises in two-sided test problems, when the elementary hypotheses H1, . . . , Hm are point-null hypotheses. Having rejected Hi, the natural inclination is to make a directional decision on the sign of the effect being tested. This requires control of both Type I errors and errors in determining the sign of non-null effects. A directional error (also known as Type III error) is defined as the rejection of a false null hypothesis, where the sign of the true effect parameter is opposite to the one of its sample estimate.

Let A1 denote the event of at least one Type I error such that P(A1) = FWER. Let further A2 denote the event that there is at least one sign error among the true non-null effects. The problem becomes how to control the combined error rate P(A1 ∪ A2) at a pre-specified level. Stepwise test procedures (see Section 2.1.2 for a definition) are powerful methods for controlling the familywise error rate, but do not necessarily control the combined error rate.

Shaffer (1980) gave a counterexample involving shifted Cauchy distributions; however, she also noted that for independent test statistics satisfying certain distributional conditions (which include the normal but rule out the Cauchy case), the combined error rate is controlled by the Holm procedure from Section 2.3.2 (Holm 1979a). Moreover, Holm (1979b) noted the control of the combined error rate for conditionally independent tests including noncentral multivariate t with identity dispersion matrix. Finner (1999) further extended the class of stepwise tests that control the combined error rate to include some step-up tests, closed F tests, and modified S method tests. He also noted that while specialized procedures guaranteeing combined error rate control have been developed (see, for example, Bauer, Hackl, Hommel, and Sonnemann (1986)), they are often less powerful than standard closed and stepwise tests. Westfall, Tobias, and Bretz (2000) systematically investigated combined error rates of stepwise test procedures relevant to analysis-of-variance models involving correlated comparisons, using both analytic and simulation-based methods. No cases of excess directional error were found for typical applications involving noncentral multivariate t distributions.
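The directional error scenario from Chapter 1 — a two-sided test at level 0.05 leaves up to 0.025 probability for a significant result in the wrong direction, so 40 unadjusted tests can be expected to produce about one sign error — can be checked numerically. This is an illustrative Python sketch; the tiny positive effect 0.01 is an arbitrary choice to mimic a near-null situation:

```python
import random

random.seed(11)

def directional_errors(m=40, eps=0.01, n_sim=3000):
    """Average number of tests (out of m) declared significant in the wrong
    direction when every true effect is a tiny positive eps."""
    crit = 1.96                         # two-sided 5% normal critical value
    total = 0
    for _ in range(n_sim):
        total += sum(1 for _ in range(m) if random.gauss(eps, 1.0) < -crit)
    return total / n_sim

print(round(directional_errors(), 2))   # close to 40 * 0.025 = 1
```

Each individual sign error is rare, but across the 40 comparisons the expected count accumulates to roughly one, in line with the worst-case calculation in Chapter 1.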

2.1.2 General concepts

Single-step and stepwise procedures

One possibility of classifying multiple comparison procedures is to divide them into single-step and stepwise procedures. Single-step procedures are characterized by the fact that the rejection or non-rejection of a null hypothesis does not take the decision for any other hypothesis into account. Thus, the order in which the hypotheses are tested is not important and one can think of the multiple inferences as being performed in a single step. A well-known example of a single-step procedure is the Bonferroni test. In contrast, for stepwise procedures the rejection or non-rejection of a null hypothesis may depend on the decision of other hypotheses. The equally well-known Holm procedure is a stepwise extension of the Bonferroni test using the closure principle under the free combination condition (see Section 2.3 for a description of these procedures).

Stepwise procedures are further divided into step-down and step-up procedures. Both types of procedures assume a sequence of hypotheses H1 ≺ . . . ≺ Hm, where the ordering "≺" of the hypotheses can be data dependent. Step-down procedures start testing the first ordered hypothesis H1 and step down through the sequence while rejecting the hypotheses. The procedure stops at the first non-rejection (at Hi, say), and H1, . . . , Hi−1 are rejected. The Holm procedure is an example of a step-down procedure. Step-up procedures start testing Hm and step up through the sequence while retaining the hypotheses. The procedure stops at the first rejection (at Hi, say), and H1, . . . , Hi are all rejected.

Single-step procedures are generally less powerful than their stepwise extensions in the sense that any hypothesis rejected by the former will also be

rejected by the latter, but not vice versa. This will become clear when introducing the closure principle in Section 2.2.3. The power advantage of stepwise test procedures, however, comes at the cost of increased difficulties in constructing compatible simultaneous confidence intervals for the parameter of interest, which have a joint coverage probability of at least 1 − α (see below).
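As a preview of Section 2.3, the relationship between the single-step Bonferroni test and its step-down Holm extension can be sketched in a few lines (Python rather than the book's R; the four p-values are made up for illustration). Every hypothesis rejected by Bonferroni is also rejected by Holm, but not conversely:

```python
def bonferroni_reject(ps, alpha=0.05):
    """Single-step: compare every p-value with alpha/m."""
    m = len(ps)
    return [p <= alpha / m for p in ps]

def holm_reject(ps, alpha=0.05):
    """Step-down: compare the i-th smallest p-value with alpha/(m - i + 1)
    and stop at the first non-rejection."""
    m = len(ps)
    order = sorted(range(m), key=lambda i: ps[i])
    rejected = [False] * m
    for step, i in enumerate(order):
        if ps[i] <= alpha / (m - step):
            rejected[i] = True
        else:
            break
    return rejected

ps = [0.001, 0.010, 0.020, 0.030]
print(bonferroni_reject(ps))   # [True, True, False, False]
print(holm_reject(ps))         # [True, True, True, True]
```

Here Bonferroni compares everything with 0.05/4 = 0.0125 and rejects only two hypotheses, while Holm's increasing thresholds 0.0125, 0.0167, 0.025, 0.05 reject all four.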

Adjusted p-values

The computation of p-values is a common exercise in univariate hypothesis test problems. Therefore, it is desirable to also compute adjusted p-values for a given multiple comparison procedure, which are directly comparable with the significance level α. An adjusted p-value qi is defined as the smallest significance level for which one still rejects the elementary hypothesis Hi, i ∈ M, given a particular multiple comparison procedure. In case of the familywise error rate,

qi = inf{α ∈ (0, 1) | Hi is rejected at level α},

if such an α exists, and qi = 1 otherwise; see Westfall and Young (1993) and Wright (1992). Adjusted p-values capture by construction the multiplicity adjustment induced through a given multiple comparison procedure and incorporate the potentially complex underlying decision rules. Consequently, whenever qi ≤ α, the associated elementary null hypothesis Hi can be rejected while controlling the familywise error rate at level α. Examples of computing adjusted p-values are given later when we describe the individual multiple comparison procedures. The marginal p-values pi are denoted as unadjusted p-values in this book.
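For the Bonferroni and Holm procedures the adjusted p-values have simple closed forms: qi = min(1, m·pi) for Bonferroni, and a running maximum of min(1, (m − i + 1)·p_(i)) over the sorted p-values for Holm. The following Python sketch illustrates both (the example p-values are made up; R users would typically obtain the same numbers from p.adjust):

```python
def adjusted_bonferroni(ps):
    """Bonferroni adjusted p-values: q_i = min(1, m * p_i)."""
    m = len(ps)
    return [min(1.0, m * p) for p in ps]

def adjusted_holm(ps):
    """Holm adjusted p-values: running maximum of min(1, (m - i + 1) * p_(i))
    over the sorted p-values, mapped back to the original order."""
    m = len(ps)
    order = sorted(range(m), key=lambda i: ps[i])
    q = [0.0] * m
    running_max = 0.0
    for step, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - step) * ps[i]))
        q[i] = running_max
    return q

ps = [0.001, 0.010, 0.020, 0.030]
print(adjusted_bonferroni(ps))   # approximately 0.004, 0.040, 0.080, 0.120
print(adjusted_holm(ps))         # approximately 0.004, 0.030, 0.040, 0.040
```

Whenever qi ≤ α the corresponding hypothesis is rejected by the procedure at familywise level α, which is exactly the defining property above; note how the Holm adjusted values never exceed the Bonferroni ones.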

Simultaneous confidence intervals

The duality between testing and confidence intervals is well established in univariate hypotheses test problems (Lehmann 1986). A general method for constructing a confidence set from a significance test is as follows. Let θ denote the parameter of interest. For each parameter point θ0, test the hypothesis H : θ = θ0 at level α. The set of all parameter points θ0, for which H : θ = θ0 is accepted, constitutes a confidence set for the true value of θ with coverage probability 1 − α. This method is essentially based on partitioning the parameter space into subsets, where each subset consists of a single parameter point.

The partitioning principle described in Section 2.2.4 provides a natural extension for deriving compatible simultaneous confidence intervals in multiple test problems, which have a joint coverage probability of at least 1 − α for the parameters of interest (Finner and Strassburger 2002). Here, compatibility between a multiple comparison procedure and a set of simultaneous confidence intervals means that if a null hypothesis is rejected with the test procedure, then the associated multiplicity corrected confidence interval excludes all parameter values for which the null hypothesis is true (Hayter and Hsu 1994). Applying the partitioning principle, the parameter space is partitioned into

small disjoint subhypotheses, where each is tested appropriately. The union of all non-rejected hypotheses then yields a confidence set C for the parameter vector of interest. Note that the finest possible partition is given by a pointwise partition such that each point of the parameter space represents an element of the partition. Most of the classical (simultaneous) confidence intervals can be derived by using the finest partition and an appropriate family of one- or two-sided tests. In general, however, this may not be the case, although a confidence set C can always be used to construct simultaneous confidence intervals by simply projecting C on the coordinate axes. Compatibility can be ensured by enforcing mild conditions on the partition and the test family (Strassburger, Bretz, and Hochberg 2004).

Simultaneous confidence intervals are available in closed form for many standard single-step procedures and will be used in Chapters 3 and 4. Simultaneous confidence intervals for stepwise procedures, however, are usually more difficult to derive and often have limited practical use. We refer to Strassburger and Bretz (2008) and Guilbaud (2008, 2009) for recent results and discussions.

Free and restricted combinations

A family of null hypotheses Hi, i ∈ M, satisfies the free combination condition if for any subset I ⊆ M the simultaneous truth of Hi, i ∈ I, and falsehood of the remaining hypotheses is a plausible event. Otherwise, the hypotheses H1, . . . , Hm satisfy the restricted combination condition (Holm 1979a; Westfall and Young 1993).

As an example of a hypotheses family satisfying the free combination condition, consider the comparison of two treatments with a control treatment (resulting in m = 2 null hypotheses). Any of the three events "none/one/both of the treatments is better than the control treatment" is then a plausible configuration and likely to be true in practice. As an example of a hypotheses family satisfying the restricted combination condition, consider all pairwise comparisons of three treatment means θ1, θ2, and θ3 (resulting in m = 3 null hypotheses). In this example, not all configurations of null and alternative hypotheses are logically possible. For example, if θ1 ≠ θ2, then θ1 = θ3 and θ2 = θ3 cannot be true simultaneously, thus restricting the possible configurations of true and false null hypotheses.

The motivation for the distinction between free and restricted combinations will become clear later when deriving stepwise test procedures based on the closure principle. The Holm procedure (Section 2.3.2) and the step-down Dunnett procedure (Section 4.1.2) are both examples of a large class of closed test procedures tailored to hypotheses satisfying the free combination condition. Correspondingly, the Shaffer procedure (Section 2.3.2) and the closed Tukey test (Section 4.2.2) are examples of closed test procedures for restricted hypotheses.

Coherence and consonance

A multiple comparison procedure is called coherent if it has the following property: If Hi ⊆ Hj and Hj is rejected, then Hi is rejected as well (Gabriel 1969). Coherence is an important requirement for any multiple comparison procedure. If coherence is not satisfied, problems with the interpretation of the test results may occur. The closed test procedures described in Section 2.2.3 are coherent by construction. By contrast, the Holm procedure described in Section 2.3.2 is coherent with free combinations, but in special cases involving restricted combinations it may not be coherent (Hommel and Bretz 2008). Note that any non-coherent multiple comparison procedure can be replaced by a coherent procedure which is uniformly at least as powerful (Sonnemann and Finner 1988). Consonance is another desirable property of multiple comparison procedures, although it is not as important as coherence. Let HI = ∩i∈I Hi denote the intersection hypothesis for an index set I ⊆ M. Furthermore, denote a hypothesis HI as non-maximal if there is at least one J ⊆ M with HJ ⊋ HI; otherwise, denote HI as maximal. Consonance implies that if a non-maximal hypothesis HI is rejected, one can reject at least one maximal hypothesis HJ ⊇ HI (Gabriel 1969). In many applications, the elementary null hypotheses H1, ..., Hm are maximal. Consonance then ensures that if an intersection hypothesis HI is rejected, at least one elementary hypothesis Hi with i ∈ I can be rejected as well. Consonance will become important later when describing max-t tests that allow one to draw inferences about the individual null hypotheses Hi (Section 2.2.1), and it will become the basis to construct efficient shortcut procedures (Section 2.2.3). A more rigorous discussion of consonance can be found in Brannath and Bretz (2010).

2.2 Construction methods for multiple comparison procedures

We now consider different methods to construct multiple comparison procedures. These include union intersection (and intersection union) tests to construct multiple comparison procedures for the intersection (union) of several elementary null hypotheses. The closure principle and the recently introduced partitioning principle are powerful tools, which extend the union intersection test principle to obtain individual assessments for the elementary null hypotheses.

2.2.1 Union intersection test

Historically, the union intersection test was the first construction method for multiple comparison procedures (Roy 1953; Roy and Bose 1953). Assume, for example, that several irrigation systems are compared with a control. It is natural to claim success if at least one of the comparative irrigation systems leads to better results than the control. If Hi denotes the elementary hypothesis of no difference in effect between irrigation system i and the control, we wish to correctly reject any (but at least one) false Hi. To formalize this multiple comparison problem, consider a family of null hypotheses Hi with associated alternative hypotheses Ki, i ∈ M. We are then interested in testing the intersection null hypothesis H = ∩i∈M Hi. One approach is to use test statistics ti, i ∈ M, and reject H if any ti exceeds its associated critical value ci. The rejection region is thus a union of rejection regions, ∪i∈M {ti > ci}, giving rise to the "union" in the "union intersection" term. In summary, this construction leads to union intersection tests, which test the intersection of the null hypotheses Hi against the union of the alternative hypotheses Ki, that is,

H = ∩i∈M Hi  against  K = ∪i∈M Ki.

Note that union intersection tests consider the global intersection null hypothesis H without formally allowing individual assessments of the elementary hypotheses H1, ..., Hm. That is, if H is rejected by a union intersection test, one is still left with the question which of the elementary hypotheses Hi should be rejected. This shortcoming can be dealt with in numerous ways, including the simultaneous confidence interval construction in Section 2.1.2, or by applying the closure principle described in Section 2.2.3. It turns out that the union intersection test dovetails nicely with multiple comparison procedures in general; many of the procedures described in this book take the union intersection method as a foundation. Max-t tests form an important class of union intersection tests and will become essential in Chapters 3 and 4. Let t1, ..., tm denote the individual test statistics associated with the hypotheses H1, ..., Hm. Assume without loss of generality that larger values of ti favor the rejection of Hi. A natural approach is then to consider the maximum of the individual test statistics ti, leading to the max-t test

tmax = max{t1, ..., tm}.  (2.1)

We then reject the global null hypothesis H if and only if tmax ≥ c, where the common constant c is chosen to control the Type I error rate at level α, that is, P(tmax ≥ c | H) = α. The critical value c is calculated from the joint distribution of the random variables t1, ..., tm. Determining c is often difficult or sometimes even impossible if that joint distribution is not available or numerically intractable, and conservative solutions have to be applied. It follows from Gabriel (1969) that applying max-t tests often leads to coherent and consonant multiple comparison procedures, giving rise to their practical importance. Many popular multiple comparison procedures are in fact max-t tests by construction, such as the Bonferroni test (Section 2.3.1), the Dunnett test (Section 4.1.1), or the Tukey test (Section 4.2.1). In addition, powerful stepwise test procedures can be derived based on the closure

principle, which allows us to assess the elementary hypotheses H1, ..., Hm (Hochberg and Tamhane 1987, p. 55); see Section 2.2.3 for further details. Note that if smaller values of ti favor the rejection of Hi, the minimum of the individual test statistics ti has to be taken instead. Because this is conceptually similar to Equation (2.1), we use the common term max-t tests, regardless of the sidedness of the test problem. In two-sided test problems the maximum is taken over the absolute values of the individual test statistics, that is, tmax = max{|t1|, ..., |tm|}. Finally, note that the individual test statistics ti (or, equivalently, their associated p-values pi) can be combined in other ways instead of taking their maximum; see Westfall (2005) for an overview.
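To make the calculation of c concrete, consider a special case not taken from the text: m independent standard normal test statistics under H. Then P(tmax ≤ c | H) = Φ(c)^m, so that c = Φ⁻¹((1 − α)^{1/m}) in closed form, which a short simulation can confirm. The following R sketch is purely illustrative:

```r
## Max-t critical value under an assumed independence setting:
## m independent standard normal statistics under the global null H.
## (Illustrative only; correlated cases are handled numerically, e.g.,
## by qmvt() in the mvtnorm package.)
alpha <- 0.05
m <- 3

## Exact value: P(tmax <= c | H) = pnorm(c)^m = 1 - alpha
c_exact <- qnorm((1 - alpha)^(1 / m))

## Monte Carlo confirmation
set.seed(1)
tmax <- apply(matrix(rnorm(1e5 * m), ncol = m), 1, max)
c_mc <- unname(quantile(tmax, 1 - alpha))

round(c(exact = c_exact, monte_carlo = c_mc), 3)
## The Bonferroni critical value qnorm(1 - alpha / m) is slightly
## larger, i.e., conservative, even in this independent special case.
```

For correlated statistics the closed form no longer applies and c must be computed from the joint distribution, as discussed above.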

2.2.2 Intersection union test

Consider the following example from drug development. International guidelines require that combination therapies (that is, the simultaneous administration of two or more medications) show a clinical benefit against all individual monotherapies before being considered for market release (EMEA 2009). In contrast to the union intersection settings considered in Section 2.2.1, here it is required that all null hypotheses of no beneficial effect are rejected in order to claim that the combination therapy has a beneficial effect. Formally, we are given the test problem

H′ = ∪i∈M Hi  versus  K′ = ∩i∈M Ki.

The intersection union test then rejects the union null hypothesis H′ at overall level α if all elementary hypotheses Hi are rejected by their local α-level tests (Berger 1982). If all test statistics ti, i ∈ M, have the same marginal distribution, the intersection union test rejects H′ if and only if mini∈M ti ≥ c, where c is the (1 − α)-quantile of that marginal distribution. In this particular case the intersection union test is also known as the min test, as coined by Laska and Meisner (1989). Suppose that in the example above we have t test statistics comparing the combination therapy with the individual monotherapies. The hypothesis H′ is rejected, and we conclude that the combination therapy has a beneficial effect, if the smallest t statistic is larger than the (1 − α)-quantile of the univariate t distribution. Note that if only some of the null hypotheses Hi are rejected locally (for example, ti > c for some but not all i in case of the min test), the union null hypothesis H′ is retained and no individual assessments are possible. In such cases no elementary null hypothesis Hi can be rejected, because otherwise the familywise error rate may not be controlled. This property often leads to the misconception that the intersection union test is conservative in the sense that the nominal Type I error rate is not exhausted and that the test would therefore lack power. In fact, the intersection union test fully exploits the Type I error rate and is moreover uniformly most powerful within a certain class of monotone α-level tests (Laska and Meisner 1989). Improvements, which discard

the monotonicity condition or restrict the parameter space, can be found in Sarkar, Snapinn, and Wang (1995) and Patel (1991), respectively. Compatible confidence intervals for intersection union tests involving two hypotheses are given by Strassburger et al. (2004).
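The min test decision rule can be sketched in a few lines of R; the t statistics and degrees of freedom below are hypothetical numbers chosen for illustration:

```r
## Min test (intersection union test): combination therapy vs. two
## monotherapies. Inputs are hypothetical one-sided t statistics.
alpha  <- 0.025
df     <- 50                   # assumed residual degrees of freedom
t_stat <- c(2.40, 2.95)        # comparisons with monotherapies 1 and 2

crit   <- qt(1 - alpha, df)    # unadjusted (1 - alpha)-quantile
reject <- min(t_stat) >= crit  # reject H' only if *all* comparisons do
reject                         # TRUE here: both statistics exceed crit
```

Note that no multiplicity adjustment of the critical value is needed here; the price is paid instead by the all-must-reject decision rule.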

2.2.3 Closure principle

The union intersection tests from Section 2.2.1 test the global null hypothesis H without formally assessing the individual hypotheses H1, ..., Hm. That is, if H = ∩i∈M Hi is rejected by a union intersection test, we cannot make any conclusions about the elementary hypotheses Hi. The closure principle introduced by Marcus et al. (1976) is a general construction method which leads to stepwise test procedures (Section 2.1.2) and allows one to draw individual conclusions about the elementary hypotheses Hi. To describe the closure principle, we initially consider the case of m = 2 null hypotheses H1 and H2 and discuss the general case later. Suppose we want to assess whether either of two treatments (for example, two new drugs or irrigation systems) is better than a control treatment. Let µj denote the mean effect for treatment j, where j = 0 (control), 1, 2. Let further θi = µi − µ0 denote the mean effect difference between treatment i = 1, 2 and the control. The θi are the parameters of interest and the resulting elementary null hypotheses are Hi : θi ≤ 0, i = 1, 2. When using the Bonferroni test (which is formally described in Section 2.3), each hypothesis Hi is tested at level α/2 in order to control the familywise error rate at level α. However, the Bonferroni test can be improved by applying the closure principle, as described now.

It is useful to formally consider the hypotheses Hi as subsets of the parameter space, about which we want to draw our inferences. Let Θ = R² denote the parameter space with θ = (θ1, θ2) ∈ Θ. Figure 2.1 visualizes the hypotheses Hi = {θ ∈ R² : θi ≤ 0}, i = 1, 2, as subsets of the real plane (the parameter space). Clearly, the two elementary hypotheses H1 and H2 are not disjoint: The intersection of both is given by H12 = H1 ∩ H2 = {θ ∈ R² : θ1 ≤ 0 and θ2 ≤ 0}, which is the lower left quadrant in Figure 2.1. Testing the intersection hypothesis H12 requires an adjustment for multiplicity. This is taken into account by the Bonferroni test, which actually tests the entire union H1 ∪ H2 at level α/2 and not just the intersection hypothesis H12. However, Figure 2.1 suggests that the remaining parts H1 \ H12 and H2 \ H12 can each be tested at full level α, without the need to adjust further for multiplicity. This leads to the "natural" test strategy of first testing the intersection hypothesis H12 with an appropriate union intersection test and, if this is significant, continuing to test H1 and H2, each at full level α. The null hypothesis H1 is rejected (while controlling the familywise error rate strongly at level α) if and only if both H1 and H12 are rejected, each at (local) level α. Conversely, H2 is rejected if both H2 and H12 are rejected. If H12 is not rejected in the first place, further testing is unnecessary; otherwise, if H1 (say) is rejected, but H12 is not, this

would lead to interpretation problems (coherence property, see Section 2.1.2). This construction method is the key idea of the closure principle.


Figure 2.1 Visualization of two null hypotheses H1 and H2 and their intersection H12 in the parameter space R².

There are alternative possibilities for visualizing the closure principle besides the parameter-space plot in Figure 2.1. In Figure 2.2 we provide a Venn-type diagram for two null hypotheses H1 and H2 and their intersection H12. In Figure 2.3 we show a related schematic diagram, which provides a convenient way of visualizing the test dependencies among the hypotheses. The intersection hypothesis H12 is shown at the top, while the two elementary hypotheses H1 and H2 are shown at the bottom of the diagram. Testing occurs in a "top-down" fashion: as described above, H12 is tested with a union intersection test at level α. If H12 is not rejected, further testing is unnecessary. Otherwise, H1 and H2 are each tested at level α. Finally, H1 is rejected (while controlling the familywise error rate strongly at level α) if H12 and H1 are both locally rejected. A similar decision rule holds for H2.

Figure 2.2 Visualization of two null hypotheses H1 and H2 and their intersection H12 using a Venn-type diagram.

Figure 2.3 Schematic diagram of the closure principle for two null hypotheses H1 and H2 and their intersection: H12 = H1 ∩ H2 is drawn at the top, with arrows pointing down to H1 and H2.

We now consider the case of testing any number m of null hypotheses Hi, i = 1, ..., m (such as comparing m treatments with a control). The closure principle considers all intersection hypotheses constructed from the initial hypothesis set. Each intersection hypothesis is tested at local level α. Note that we can specify any α-level union intersection test for the intersection hypotheses. In particular, different tests can be used for different hypotheses. For the final inference, an elementary null hypothesis Hi is rejected if and only if all intersection hypotheses implying Hi are rejected by their individual tests at local level α. It can be shown that this procedure controls the familywise error rate strongly at level α (Marcus et al. 1976); a proof is sketched following the example below. Multiple comparison procedures based on the closure principle are called

closed test procedures below. Operationally, closed test procedures are performed as follows:

(i) Define a set H = {H1, ..., Hm} of elementary hypotheses.

(ii) Construct the closure set

H̄ = {HI = ∩i∈I Hi : I ⊆ {1, ..., m}, HI ≠ ∅}

of all non-empty intersection hypotheses HI.

(iii) For each intersection hypothesis HI ∈ H̄ find a suitable (local) α-level test.

(iv) Reject Hi if all hypotheses HI ∈ H̄ with i ∈ I are rejected, each at (local) level α.

If a closed test procedure is performed, adjusted p-values qi for the null hypotheses Hi are computed as follows. An elementary null hypothesis Hi is rejected only if all HI ∈ H̄ with i ∈ I are rejected, that is, the maximum p-value over this set needs to be less than or equal to α. Let pI denote the p-value for a given intersection hypothesis HI, I ⊆ {1, ..., m}. Then the adjusted p-value for Hi is formally defined as

qi = max_{I : i∈I} pI,  i = 1, ..., m.  (2.2)

Example 2.1. We provide an example of the closure principle with m = 3 hypotheses in Figure 2.4. Accordingly, we have three hypotheses of interest in the family H = {H1, H2, H3}. These three hypotheses could involve, for example, the comparison of three treatments with a control, although the following considerations are generic. This completes step (i) from above. For step (ii), we need to consider all intersection hypotheses HI = ∩i∈I Hi, I ⊆ {1, ..., m}. In this example, m = 3 and four additional intersection hypotheses have to be considered to obtain the closed hypotheses set. Specifically, the full closure H̄ = {H1, H2, H3, H12, H13, H23, H123} contains all seven intersection hypotheses, where Hij = Hi ∩ Hj, 1 ≤ i ≠ j ≤ 3, and H123 = H1 ∩ H2 ∩ H3 is the global null hypothesis. Note that H̄ is closed under intersection. That is, any intersection HI ∩ HI′ of some HI, HI′ ∈ H̄ is contained in H̄. For step (iii), we assume suitable (local) α-level tests for each of the seven intersection hypotheses. A simple approach is to use the Bonferroni test, which is formally introduced in Section 2.3. Finally, according to step (iv), we reject H1 (while controlling the familywise error rate strongly at level α) if H123, H12, H13 and H1 are all rejected by their α-level tests. Similarly, we reject H2 if H123, H12, H23 and H2 are all rejected, and we reject H3 if H123, H13, H23 and H3 are all rejected. In Figure 2.4, the arrows linking two hypotheses at a time reflect that the hypotheses are nested, with the smallest hypothesis H123 drawn on top. For example, the arrow between H1 and H12, with H1 drawn one level below H12, indicates that H1 ⊇ H12 and that H1 can only be rejected if H12 is also rejected (coherence property, see Section 2.1.2). □
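The closed Bonferroni test of this example and the adjusted p-values from Equation (2.2) can be sketched directly in R. This is a didactic implementation with illustrative p-values; the multcomp package provides efficient production versions:

```r
## Closed test procedure with local Bonferroni tests (didactic sketch).
p <- c(0.01, 0.015, 0.005)   # illustrative p-values for H1, H2, H3
m <- length(p)

## All non-empty index sets I of {1, ..., m}: the closure set
subsets <- lapply(seq_len(2^m - 1),
                  function(k) which(intToBits(k)[1:m] == 1))

## Local Bonferroni p-value for each intersection hypothesis H_I
p_I <- sapply(subsets, function(I) min(1, length(I) * min(p[I])))

## Adjusted p-values q_i = max over all I with i in I, Equation (2.2)
q <- sapply(seq_len(m), function(i)
  max(p_I[sapply(subsets, function(I) i %in% I)]))
q                            # 0.020 0.020 0.015
```

For these p-values the result coincides with the Holm adjusted p-values, anticipating the shortcut property discussed in Section 2.3.2.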


Figure 2.4 Schematic diagram of the closure principle for three null hypotheses H1, H2, and H3 and their intersections: the global intersection H123 at the top, the pairwise intersections H12, H13, and H23 in the middle, and the elementary hypotheses H1, H2, and H3 at the bottom, with arrows pointing from each hypothesis down to the hypotheses it contains.

The closure principle can be seen to control the familywise error rate in the example above as follows. First, recall that the familywise error rate is the probability of rejecting at least one null hypothesis incorrectly. We wish to control this probability at level α, regardless of which set M0 ⊆ M = {1, 2, 3} of null hypotheses happens to be true. Suppose, in the example above, that H12 happens to be true. Then M0 = {1, 2} and a Type I error occurs if either H1 or H2 is rejected. If the closure principle is used, H1 is rejected if and only if H123, H12, H13 and H1 are all rejected. Note that the set of experiments for which H123, H12, H13 and H1 are all rejected is a subset of the set of experiments for which H12 is rejected. Mathematically, {H1 rejected using closure} = {H123 rejected} ∩ {H12 rejected} ∩ {H13 rejected} ∩ {H1 rejected} ⊆ {H12 rejected}. Similarly, H2 is rejected if and only if H123, H12, H23 and H2 are all rejected. Note that the set of experiments for which H123, H12, H23 and H2 are all rejected is also a subset of the set of experiments for which H12 is rejected. Mathematically, {H2 rejected using closure} = {H123 rejected} ∩ {H12 rejected} ∩ {H23 rejected} ∩ {H2 rejected} ⊆ {H12 rejected}. Since both

{H1 rejected using closure} ⊆ {H12 rejected} and

{H2 rejected using closure} ⊆ {H12 rejected},

it follows that

{{H1 rejected using closure} ∪ {H2 rejected using closure}}

⊆ {H12 rejected} and therefore that

P({H1 rejected using closure} ∪ {H2 rejected using closure}) ≤ P(H12 rejected) ≤ α. Hence the method controls the familywise error rate at a level less than or equal to α when the true null state of nature is H12. An analogous argument shows that we have familywise error rate control when the true state of nature is H13, H23, or H123. This argument also clarifies why all intersection hypotheses have to be tested: one needs to protect oneself against every possible combination of true and false nulls if one wishes to control the familywise error rate in the strong sense. The closure principle is a flexible construction method which can be tailored to a variety of applications. In particular, it provides a large degree of flexibility to map the difference in importance as well as the relationship between the various study objectives onto an adequate multiple test procedure. Many common multiple comparison procedures are in fact closed test procedures, such as the step-down procedures of Holm (1979a), Shaffer (1986), and Westfall and Young (1993), fixed sequence tests (Maurer, Hothorn, and Lehmacher 1995; Westfall and Krishen 2001), fallback procedures (Wiens 2003; Wiens and Dmitrienko 2005), and gatekeeping procedures (Bauer, Röhmel, Maurer, and Hothorn 1998; Westfall and Krishen 2001; Dmitrienko, Offen, and Westfall 2003). One disadvantage of the closure principle is that related simultaneous confidence intervals for the parameters of interest θi are difficult to derive and often have limited practical use; see Strassburger and Bretz (2008) for a recent discussion. Note that for the closure principle the number of operations is in general of order 2^m, where m is the number of hypotheses of interest. It is often useful to find shortcut procedures that can reduce the number of operations to the order of m in the best case (Grechanovsky and Hochberg 1999).
The aim of shortcut procedures is to reach decisions for the elementary hypotheses without necessarily testing the entire closure set. By construction, the decisions resulting from a shortcut procedure coincide with those of the corresponding closed test procedure. Shortcut procedures thus reduce the computational demand, which can be substantial for large numbers m of hypotheses and/or if resampling-based tests are used together with the closure principle. We refer to Romano and Wolf (2005), Hommel, Bretz, and Maurer (2007), Westfall and Tobias (2007), and Brannath and Bretz (2010) for further details. Related graphical approaches for sequentially rejective closed test procedures have been described by Bretz, Maurer, Brannath, and Posch (2009b), Bretz et al. (2010), and Burman, Sonesson, and Guilbaud (2009). Shortcut procedures will become essential in Chapters 3 and 4, where we use the multcomp package in

R, which implements efficient methods to reduce the complexity of the underlying closed test procedures. Related details are described in Sections 4.1.2 and 4.2.2, where we consider closed test procedures based on max-t tests in the form of Equation (2.1) for hypotheses satisfying the free and restricted combination condition, respectively.

2.2.4 Partitioning principle

The partitioning principle was formally introduced by Finner and Strassburger (2002), with early ideas dating back to Takeuchi (1973, 2010), Stefansson et al. (1988) and Hayter and Hsu (1994). Using the partitioning principle, one can derive powerful multiple test procedures and simultaneous confidence intervals, which otherwise would be difficult to obtain when applying the closure principle. The key idea of the partitioning principle is to partition the parameter space into disjoint subsets and to test each partition element with a suitable test at level α. Because the partition elements are pairwise disjoint, only one of them contains the true parameter vector and can lead to a Type I error. Hence, test procedures based on the partitioning principle strongly control the familywise error rate at level α. To motivate the partitioning principle, we consider again the problem from Section 2.2.3 of comparing two treatments with a control. As before, let θi = µi − µ0, i = 1, 2, denote the parameters of interest. Let further Θ = R² denote the parameter space with θ = (θ1, θ2) ∈ Θ. Recall the two hypotheses of interest, Hi = {θ ∈ R² : θi ≤ 0}, i = 1, 2, and let Ki = Θ \ Hi denote the associated alternative hypotheses. Figure 2.1 visualizes the hypotheses H1 and H2 as subsets of the real plane (the parameter space). We now partition the parameter space Θ into the following sets (see also Figure 2.5): Θ1 = H1, Θ2 = H2 ∩ K1, and Θ3 = K1 ∩ K2. Since the Θi, i = 1, 2, 3, are disjoint and Θ1 ∪ Θ2 ∪ Θ3 = Θ, these sets constitute a partition of the parameter space Θ. The true parameter vector θ thus lies in one and only one of the disjoint subsets Θi. Applying tests at (local) level α to each of these subsets therefore leads to a multiple test procedure which controls the familywise error rate in the strong sense at level α.
Note that the resulting test procedure is at least as powerful as the corresponding closed test procedure based on H1 and H2, because it is sufficient to control the Type I error rate over Θ1 = H1 and the smaller subspace Θ2 ⊂ H2. In addition, a confidence set for the parameter vector θ is obtained by intersecting the complementary regions of those hypotheses which have been rejected. Note that the interpretation of Θ1 is the same as that of H1 ("treatment 1 is not better than the control"). However, the interpretation of Θ2 ("treatment 2 is not better than the control, but treatment 1 is") has changed as compared with that of H2 ("treatment 2 is not better than the control"). There are many possibilities for partitioning the parameter space and it is not always clear which partition to apply for a given test problem. In Figure 2.5 we illustrate one example of partitioning the real plane (which has a straightforward

interpretation, as described above). Other partitions may give similar or even additional information about the parameter vector θ.

[Figure: the (θ1, θ2)-plane partitioned into Θ1 = H1 (the half-plane θ1 ≤ 0), Θ2 = H2 ∩ K1 (the region θ1 > 0, θ2 ≤ 0), and Θ3 = K1 ∩ K2 (the region θ1 > 0, θ2 > 0).]

Figure 2.5 Partitioning principle for two null hypotheses H1 and H2.

The previous example illustrates the basic partitioning principle for two hypotheses. In the general case of m hypotheses H1, ..., Hm, the partitioning principle can be performed operationally as follows:

(i) Choose an appropriate partition {Θℓ : ℓ ∈ L} of the parameter space Θ for some index set L.

(ii) Test each Θℓ with an α-level test.

(iii) Reject the null hypothesis Hi if all Θℓ with Θℓ ∩ Hi ≠ ∅ are rejected.

(iv) The union of all retained Θℓ constitutes a confidence set for θ at level 1 − α.

For extensions of the basic partitioning principle we refer to Finner and Strassburger (2002). Applications of the partitioning principle have been investigated for a number of test problems, such as dose response analyses (Hsu and Berger 1999; Bretz, Hothorn, and Hsu 2003; Liu, Hsu, and Ruberg 2007b; Strassburger, Bretz, and Finner 2007), multiple outcome analyses (Liu and Hsu 2009), intersection union tests (Strassburger et al. 2004), equivalence tests (Finner, Giani, and Strassburger 2006), and simultaneous confidence intervals for step-up and step-down procedures (Finner and Strassburger 2006; Strassburger and Bretz 2008).
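For the two-treatment partition of Figure 2.5, these steps reduce to a simple decision rule. The sketch below uses one-sided z tests and hypothetical test statistics; it is our illustration, not code from the text:

```r
## Partitioning principle for two treatment-vs-control comparisons with
## Theta1 = H1, Theta2 = H2 int K1, Theta3 = K1 int K2 (Figure 2.5).
alpha <- 0.025
crit  <- qnorm(1 - alpha)    # one-sided critical value, about 1.96
z     <- c(2.5, 1.2)         # hypothetical z statistics for theta1, theta2

## Level-alpha tests of the partition elements:
## reject Theta1 (theta1 <= 0) if z1 > crit;
## reject Theta2 (theta1 > 0, theta2 <= 0) if z2 > crit.
reject_Theta1 <- z[1] > crit
reject_Theta2 <- z[2] > crit

## Step (iii): H1 = Theta1 meets only Theta1; H2 meets Theta1 and Theta2.
reject_H1 <- reject_Theta1
reject_H2 <- reject_Theta1 && reject_Theta2
c(H1 = reject_H1, H2 = reject_H2)   # here: H1 rejected, H2 not
```

Note that each partition element is tested at the unadjusted level α, yet the familywise error rate is controlled because only one element contains the true θ.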

2.3 Methods based on Bonferroni's inequality

The Bonferroni method is probably the best known multiplicity adjustment. Although more powerful multiple comparison procedures exist, the Bonferroni test continues to be very popular because of its simplicity and its mild assumptions. In this section we describe the Bonferroni test and a stepwise extension (Holm 1979a), and discuss further topics related to these methods. We also describe briefly some software implementations in R, leaving the details for Section 3.3.

2.3.1 Bonferroni test

The Bonferroni test is a single-step procedure, which compares the unadjusted p-values p1, ..., pm with the common threshold α/m, where m is the number of hypotheses under investigation. Equivalently, a null hypothesis Hi, i ∈ M, is rejected if the adjusted p-value qi = min{m pi, 1} ≤ α. Here, the minimum is used to ensure that the resulting adjusted p-value qi is not larger than 1. The strong control of the familywise error rate follows directly from Bonferroni's inequality

P(V > 0) = P(∪i∈M0 {qi ≤ α}) ≤ Σi∈M0 P(qi ≤ α) ≤ m0 α/m ≤ α,  (2.3)

where the probability expressions are conditional on ∩i∈M0 Hi and M0 ⊆ M denotes the set of m0 = |M0| true null hypotheses (recall Table 2.1 for the related notation).

Example 2.2. Consider the following numerical example with m = 3 null hypotheses being tested at level α = 0.025. Let p1 = 0.01, p2 = 0.015, and p3 = 0.005 denote the unadjusted p-values. Because p3 < α/3 = 0.0083, but p1, p2 > α/3, only the null hypothesis H3 is rejected. Alternatively, the adjusted p-values q1 = 0.03, q2 = 0.045, and q3 = 0.015 can be compared with α = 0.025, leading to the same test decisions. □

A convenient way to perform the Bonferroni test in R is to call the p.adjust function from the stats package. Its use is self-explanatory. To calculate the adjusted p-values qi in the previous example, we first define a vector containing the unadjusted p-values and subsequently call the p.adjust function as follows:

R> p <- c(0.01, 0.015, 0.005)
R> p.adjust(p, "bonferroni")
[1] 0.030 0.045 0.015

In Section 3.3 we provide a detailed description of the multcomp package in

R, which provides an interface to the multiplicity adjustments implemented in the p.adjust function and also allows one to perform more sophisticated multiple comparison procedures. The Bonferroni method is a very general approach, which is valid for any correlation structure among the test statistics. However, the Bonferroni test is conservative in the sense that other test procedures exist which reject at least as many hypotheses as the Bonferroni test (often at the cost of additional assumptions on the joint distribution of the test statistics). In the following we describe some improvements or generalizations of the Bonferroni approach.
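Both points, validity under arbitrary correlation and the conservatism this generality costs, can be illustrated by a small simulation; the correlated z statistics below are a hypothetical construction:

```r
## Monte Carlo sketch: familywise error rate of the Bonferroni test for
## two strongly correlated one-sided z statistics under the global null.
set.seed(42)
alpha <- 0.05; m <- 2; rho <- 0.9; n_sim <- 2e5

z1 <- rnorm(n_sim)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n_sim)  # Corr(z1, z2) = rho
p1 <- 1 - pnorm(z1)   # one-sided p-values under the null
p2 <- 1 - pnorm(z2)

## Familywise error rate: at least one p-value below alpha / m
fwer <- mean(p1 < alpha / m | p2 < alpha / m)
fwer   # controlled (< alpha), but visibly below the nominal 0.05
```

As rho approaches 1 the two tests become one test, and the attained error rate approaches α/2; procedures exploiting the joint distribution recover this lost power.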

2.3.2 Holm procedure

Holm (1979a) introduced a multiple comparison procedure which uniformly improves the Bonferroni approach. The Holm procedure is a step-down procedure, which basically consists of repeatedly applying Bonferroni's inequality while testing the hypotheses in a data-dependent order. Let p(1) ≤ ... ≤ p(m) denote the ordered unadjusted p-values with associated hypotheses H(1), ..., H(m). Then H(i) is rejected if p(j) ≤ α/(m − j + 1) for all j = 1, ..., i. That is, H(i) is rejected if p(i) ≤ α/(m − i + 1) and all hypotheses H(j) preceding H(i) are also rejected. Adjusted p-values for the Holm procedure are given by

q(i) = min{1, max[(m − i + 1) p(i), q(i−1)]},  with q(0) = 0.  (2.4)

Alternatively, the Holm procedure can be described as the following sequentially rejective test procedure. Start by testing the null hypothesis H(1) associated with the smallest p-value p(1). If p(1) > α/m, the procedure stops and no hypothesis is rejected. Otherwise, H(1) is rejected and the procedure continues by testing H(2) at the larger significance level α/(m − 1). These steps are repeated until either the first non-rejection occurs or all null hypotheses H(1), ..., H(m) are rejected.
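Equation (2.4) translates directly into R and can be checked against the built-in p.adjust function; holm_adjust is our own helper name for this didactic re-implementation:

```r
## Didactic implementation of the Holm adjusted p-values in (2.4).
holm_adjust <- function(p) {
  m <- length(p)
  o <- order(p)                            # sort the p-values
  q <- pmin(1, (m - seq_len(m) + 1) * p[o])
  q <- cummax(q)                           # enforce monotonicity
  out <- numeric(m)
  out[o] <- q                              # back to the original order
  out
}

p <- c(0.01, 0.015, 0.005)
holm_adjust(p)                             # 0.020 0.020 0.015
```

The result agrees with p.adjust(p, "holm"); the cummax() call realizes the maximum argument in (2.4).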

Example 2.3. Consider again the p-values from Example 2.2. Since p(1) = 0.005 < 0.0083 = α/3, H(1) = H3 is rejected at α = 0.025. At the second step, p(2) = 0.01 < 0.0125 = α/2 and H(2) = H1 is also rejected. At the final step, p(3) = 0.015 < 0.025 = α and H(3) = H2 is also rejected. Alternatively, the adjusted p-values are calculated as q(1) = 0.015, q(2) = 0.02, and q(3) = 0.02, which are all smaller than α = 0.025 and thus lead to the same test decisions. Using the p.adjust function introduced in Section 2.3.1, we obtain these adjusted p-values by calling

R> p.adjust(p, "holm")
[1] 0.020 0.020 0.015

Note that q(3) = 0.02 (and not 0.015, as one might expect) due to the monotonicity enforcement induced by the maximum argument in (2.4). □

It becomes clear from this example that the Holm procedure is a step-down procedure which, by construction, rejects all hypotheses rejected by the Bonferroni test and possibly others. We now give a different perspective on the Holm

procedure by considering the closure principle from Section 2.2.3. For null hypotheses H1, ..., Hm satisfying the free combination condition (Section 2.1.2), the Holm procedure is a shortcut of the closure principle when applying the Bonferroni test to each intersection hypothesis HI = ∩_{i∈I} Hi, I ⊆ {1, . . . , m} (Holm 1979a). That is, the Holm procedure leads to the same decisions for the elementary hypotheses H1, ..., Hm as a Bonferroni-based closed test procedure. To see this, consider again the example from Section 2.2.3 comparing two treatments with a control. The related closure principle for the two null hypotheses H1 and H2 is shown in Figure 2.3. Assume now that the Bonferroni method is used to test the intersection hypothesis H12 = H1 ∩ H2, that is, H12 is rejected if min(p1, p2) ≤ α/2. If H12 is rejected, then one of the elementary hypotheses H1 and H2 is immediately rejected as well and only the remaining hypothesis needs to be tested at level α: If, say, p1 < α/2, then trivially p1 < α and H1 is rejected; H2 remains to be tested at level α. More generally, if HI = ∩_{i∈I} Hi, I ⊆ M, is rejected, then there exists an index i* ∈ I such that p_{i*} ≤ α/|I|, and all hypotheses HJ with i* ∈ J ⊆ I are also rejected, because |J| ≤ |I| and consequently p_{i*} ≤ α/|I| ≤ α/|J|.

Because the Holm procedure is based on the conservative Bonferroni inequality (2.3), more powerful step-down procedures can be obtained by accounting for the stochastic dependencies between the test statistics. In Section 4.1.2 we discuss a parametric extension of the Holm procedure based on max-t tests in the form (2.1). We also show how to implement these methods using R.

Figure 2.6 Schematic diagram of the closure principle for the three null hypotheses Hij : µi = µj, 1 ≤ i < j ≤ 3 (the intersection hypothesis H123 at the top, with arrows to H12, H13, and H23 below it).

If the null hypotheses H1, ..., Hm satisfy the restricted combination condition, however, the Holm procedure is still applicable, but conservative. To see this, consider all pairwise comparisons of three treatment means µ1, µ2, and µ3, resulting in m = 3 null hypotheses Hij : µi = µj. The related closure principle for the three null hypotheses H12, H13, and H23 is shown in Figure 2.6, where H123 : µ1 = µ2 = µ3 is the only non-trivial intersection

hypothesis. Applying the Bonferroni test leads to the rejection of H123 if min(p1, p2, p3) < α/3. Once H123 has been rejected, one of the elementary hypotheses is immediately rejected as well and only the two remaining hypotheses need to be tested at level α. Note that for the ordered set of p-values (p(1), p(2), p(3)) the Holm procedure would apply the significance thresholds (α/3, α/2, α), whereas the thresholds (α/3, α, α) would suffice, as seen from this example. Note also that in some cases the Holm procedure may not be coherent under the restricted combination condition (Hommel and Bretz 2008).

Shaffer (1986) extended the Holm procedure, utilizing logical restrictions to improve the power of closed tests. Her method is a "truncated" closed test procedure: Closed testing is performed, in sequence H(1), H(2), ... corresponding to the ordered p-values, and testing stops at the first insignificance. Truncation ensures that H(i′) cannot be rejected if H(i) is not rejected, for i′ > i. Without truncation, a closed test procedure can result in the possibly undesirable outcome that H(i′) is rejected and H(i) is not rejected, for i′ > i; see Bergmann and Hommel (1988), Hommel and Bernhard (1999), and Westfall and Tobias (2007) for details. When the tests satisfy the free combination condition, all these procedures reduce to the ordinary Holm procedure. Although based on the conservative Bonferroni inequality, Shaffer's method can be much more powerful than standard methods when combinations are restricted. In Section 4.2.2, we discuss a parametric extension of Shaffer's method (called the extended Shaffer-Royen procedure by Westfall and Tobias (2007)) that is yet more powerful than Shaffer's method. The increase in power comes from using specific dependence information rather than the conservative Bonferroni inequality.
We also show how to implement this powerful method using the multcomp package in R.

2.3.3 Further topics

In this section we describe additional extensions of the procedures by Bonferroni and Holm. Simultaneous confidence intervals compatible with the Bonferroni test are easily obtained by computing the marginal confidence intervals at the adjusted significance levels α/m. For the Holm procedure, such confidence intervals are more difficult to construct (Section 2.1.2). We refer to Strassburger and Bretz (2008) and Guilbaud (2008), who independently applied the partitioning principle from Section 2.2.4 to derive compatible simultaneous confidence intervals for the Holm procedure and other Bonferroni-based closed test procedures.

If the test statistics were independent, the familywise error rate would become

FWER = P(V > 0) = 1 − P(V = 0) = 1 − (1 − α)^m.

This gives the motivation for the Šidák (1967) approach, which rejects a null hypothesis Hi if pi ≤ 1 − (1 − α)^{1/m}, or, equivalently, if the adjusted p-value qi = 1 − (1 − pi)^m ≤ α. The Šidák approach also holds for non-negatively correlated test statistics (Tong 1980). Note that the Šidák approach is more

powerful than the Bonferroni test, although the gain in power is marginal in practice. A further modification of the Bonferroni method, which has attracted substantial research interest, is based on the Simes inequality described in Section 2.4. Further improvements of the Bonferroni method, which incorporate stochastic dependencies in the data, are parametric approaches described in Chapter 3 and resampling-based approaches reviewed in Section 5.1.

It follows from the inequality (2.3) that the Bonferroni test controls the familywise error rate strongly at level α or less, that is, FWER ≤ m0α/m ≤ α. If the number m0 of true null hypotheses were known, more powerful procedures could be obtained by applying the Bonferroni test at level m0α/m instead of α. Several methods are available to estimate m0, and five of them were compared in Hsueh, Chen, and Kodell (2003). The authors concluded that the approach proposed by Benjamini and Hochberg (2000) gives satisfactory empirical results. The latter considered the slopes of the lines passing through the points (m + 1, 1) and (i, p(i)), and took the lowest slope estimator to approximate m0. More recently, Finner and Gontscharuk (2009) derived conditions which ensure familywise error rate control of Bonferroni-based test procedures using plug-in estimates similar to those proposed by Schweder and Spjøtvoll (1982) and Storey (2002).

Westfall, Kropf, and Finos (2004) described weighted methods that are useful when some hypotheses Hi are deemed more important than others. For example, in clinical trials the various patient outcomes might be ranked a priori, and the test procedure designed to give more power to the more important hypotheses. The simplest weighted multiple comparison procedure is the weighted Bonferroni test, discussed in, for example, Rosenthal and Rubin (1983).
The weighted Bonferroni test divides the overall significance level α into portions w1α, . . . , wmα, such that Σi wiα = α. Accordingly, an elementary hypothesis Hi is rejected if pi ≤ wiα. Holm (1979a) introduced the following weighted test procedure, which extends the weighted Bonferroni test described above. Order the weighted p-values p̃i = pi/wi as p̃(1) ≤ ... ≤ p̃(m), where p̃(j) = p̃_{ij} (that is, ij denotes the index of the j-th ordered weighted p-value). Define the sets Sj = {ij, . . . , im}, j = 1, . . . , m. Let H^w_(j) denote the hypothesis corresponding to p̃(j). The weighted Holm procedure rejects H^w_(j) if p̃(i) ≤ α/Σ_{h∈Si} wh for all i = 1, . . . , j. This method controls the familywise error rate strongly at level α, and when the weights are equal, it reduces to the ordinary Holm procedure. Closed test procedures based on weighted Bonferroni tests have recently attracted much attention for the analysis of multiple outcome variables; see Dmitrienko et al. (2003), Hommel et al. (2007), and Bretz et al. (2009b).
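To make the weighted description concrete, here is a base-R sketch of the weighted Holm procedure. The helper weighted_holm is hypothetical (it is neither book code nor a multcomp function); with equal weights wi = 1/m it reduces to the ordinary Holm procedure, which gives a convenient check against p.adjust:

```r
# Weighted Holm adjusted p-values: order p_i / w_i, multiply each ordered
# weighted p-value by the total weight still "in play" (the set S_j),
# enforce monotonicity with a running maximum, and cap at 1.
weighted_holm <- function(p, w) {
  stopifnot(length(p) == length(w), all(w > 0), abs(sum(w) - 1) < 1e-8)
  o <- order(p / w)                        # i_1, ..., i_m
  s <- rev(cumsum(rev(w[o])))              # s[j] = sum of w_h for h in S_j
  q_ord <- pmin(cummax((p / w)[o] * s), 1)
  q <- numeric(length(p))
  q[o] <- q_ord                            # back to the original order
  q
}

p <- c(0.01, 0.02, 0.022, 0.09)
weighted_holm(p, rep(1/4, 4))              # equal weights: 0.04 0.06 0.06 0.09
```

With equal weights, p̃(j) · s_j = m p(j) · (m − j + 1)/m = (m − j + 1) p(j), which is exactly the Holm multiplier.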

2.4 Methods based on Simes' inequality

Simes (1986) proposed the following modification of the Bonferroni method to test the global intersection hypothesis H = ∩_{i∈M} Hi. Let again p(1) ≤ ... ≤ p(m) denote the ordered unadjusted p-values with associated hypotheses


H(1),...,H(m). Using the Simes test, one rejects H if there is an index j ∈ M = {1, . . . , m}, such that p(j) ≤ jα/m. However, one cannot assess the elementary hypotheses Hi with the Simes test. In particular, one cannot reject H(i) if p(i) ≤ iα/m for some i ∈ M, because the familywise error rate is not controlled in this case (Hommel 1988). By construction, the Simes test is more powerful than the Bonferroni test in the sense that whenever H is rejected by the latter it will also be rejected by the former, but not vice versa. Figure 2.7 compares the rejection regions of the Bonferroni and the Simes tests for m = 2. Recall that when m = 2 the Bonferroni test rejects the global intersection hypothesis H if either p1 ≤ α/2 or p2 ≤ α/2. The Simes test rejects H if either p(1) ≤ α/2 or p(2) ≤ α. As seen from Figure 2.7, the Simes test “adds” the square [α/2, α] × [α/2, α] to the rejection region of the Bonferroni test and is therefore more powerful.
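The two global tests compared in Figure 2.7 can be sketched in a few lines of base R; simes_reject and bonferroni_reject are hypothetical helper names chosen for illustration:

```r
# Global Simes test: reject H if p_(j) <= j * alpha / m for some j in M.
simes_reject <- function(p, alpha) {
  m <- length(p)
  any(sort(p) <= seq_len(m) * alpha / m)
}

# Global Bonferroni test: reject H if min(p_i) <= alpha / m.
bonferroni_reject <- function(p, alpha) min(p) <= alpha / length(p)

# A pair inside the extra square [alpha/2, alpha] x [alpha/2, alpha]
# that Simes adds to the Bonferroni rejection region (alpha = 0.1):
p <- c(0.0807, 0.0602)
bonferroni_reject(p, 0.1)   # FALSE
simes_reject(p, 0.1)        # TRUE
```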


Figure 2.7 Rejection regions for the Bonferroni test (horizontal lines) and the Simes test (horizontal and vertical lines).

To illustrate the impact on the (disjunctive) power, Figure 2.8 displays 100

simulated pairs of independent p-values (p1, p2) under a given alternative. Setting α = 0.1, the Bonferroni test rejects in those 88 cases where, in Figure 2.8, the dots lie in the lower left "L-shaped" region {p1 ≤ α/2} ∪ {p2 ≤ α/2}. The Simes test rejects in one additional case, where (p1, p2) = (0.0807, 0.0602), confirming its slight power advantage. In the remaining 11 cases neither the Bonferroni nor the Simes test rejects.


Figure 2.8 Illustration of the power difference for the Bonferroni and Simes tests by plotting 100 pairs of independent p-values (p1, p2).

Note that the power advantage of the Simes test comes at the cost of additional assumptions on the underlying dependency structure. Simes (1986) proved that for his test FWER = α under independence of p1, . . . , pm. If the independence assumption is not met, Type I error rate control is not always clear. In many practically relevant cases the inflation in size is marginal (Samuel-Cahn 1996), although pathological counter-examples exist where the impact can be substantial (Hommel 1983). For one-sided tests, it can be shown

that the Simes test controls the familywise error rate if, for example, the test statistics are multivariate normally distributed with non-negative correlations, or multivariate t distributed with the associated normals having non-negative correlations, as long as α < 1/2. We refer to Sarkar and Chang (1997) and Sarkar (1998) for further results.

Hochberg (1988) proposed a step-up extension of the Simes test, which allows one to make inferences about the elementary null hypotheses Hi, i ∈ M. The Hochberg procedure can be seen as a reversed Holm procedure, since it uses the same critical values, but in a reversed test sequence: H(i) is rejected if there is a j = i, . . . , m, such that p(j) ≤ α/(m − j + 1). Adjusted p-values for the Hochberg procedure are given by

q(i) = min{1, min[(m − i + 1)p(i), q(i+1)]}.

Alternatively, the Hochberg procedure can be described by the following sequentially rejective test procedure. Start testing the null hypothesis H(m) associated with the largest p-value p(m). If p(m) ≤ α, the procedure stops and all hypotheses H(1), ..., H(m) are rejected. Otherwise, H(m) is retained and the procedure continues testing H(m−1) at the smaller significance level α/2. If p(m−1) ≤ α/2, the procedure stops and all hypotheses H(1), ..., H(m−1) are rejected. These steps are repeated until either the first rejection occurs or all null hypotheses H(1), ..., H(m) are retained. By construction, the Hochberg procedure is more powerful than the Holm procedure.
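The step-up recursion can be sketched analogously to Holm; hochberg_adjust below is a hypothetical helper mirroring what p.adjust(p, "hochberg") computes:

```r
# Hochberg adjusted p-values: q_(i) = min{1, min[(m - i + 1) p_(i), q_(i+1)]},
# i.e., a running minimum taken from the largest ordered p-value downwards.
hochberg_adjust <- function(p) {
  m <- length(p)
  o <- order(p)
  q_ord <- pmin(rev(cummin(rev((m - seq_len(m) + 1) * p[o]))), 1)
  q <- numeric(m)
  q[o] <- q_ord
  q
}

p <- c(0.01, 0.02, 0.022, 0.09)   # the four p-values from Example 2.4
hochberg_adjust(p)                # 0.040 0.044 0.044 0.090
```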

Example 2.4. Consider the following numerical example with m = 4 null hypotheses tested at the significance level α = 0.05. Let p1 = 0.022, p2 = 0.02, p3 = 0.01, and p4 = 0.09 be the unadjusted p-values. Consider first the Holm procedure. Since p(1) = p3 = 0.01 < 0.0125 = α/4, H(1) = H3 is rejected. At the second step, p(2) = p2 = 0.02 > 0.0167 = α/3 and the Holm procedure stops with no further rejections. The Hochberg procedure, however, starts with the largest p-value, steps through the ordered p-values, and stops with the first significant result. Because p(4) = p4 = 0.09 > 0.05 = α but p(3) = p1 = 0.022 < 0.025 = α/2, the Hochberg procedure rejects H1, H2, and H3 but not H(4) = H4. Table 2.2 summarizes this example and gives the associated adjusted p-values.

Hommel (1988) introduced an improved procedure by applying the Simes test to each intersection hypothesis of a closed test procedure. This procedure can be shown to be uniformly more powerful than the Hochberg procedure (Hommel 1989). As an example consider the case m = 3 and assume that p1 ≤ p2 ≤ p3. The Hochberg procedure then rejects H1 if any of the events

{p3 ≤ α} or {p2 ≤ α/2 and p1 ≤ α/2} or {p1 ≤ α/3}


i    p(i)    Threshold α/(4 − i + 1)    Decision (Holm / Hochberg)    Adjusted p-value (Holm / Hochberg)
1    0.01    0.0125                     Rej. / Rej.                   0.04 / 0.04
2    0.02    0.0167                     N.R. / Rej.                   0.06 / 0.044
3    0.022   0.025                      N.R. / Rej.                   0.06 / 0.044
4    0.09    0.05                       N.R. / N.R.                   0.09 / 0.09

Table 2.2 Comparison of the Holm and Hochberg procedures for m = 4 hypothe- ses and α = 0.05. Rej. = rejection, N.R. = no rejection.

is true. In contrast, the procedure by Hommel rejects H1 if any of the events

{p3 ≤ α} or {p2 ≤ 2α/3 and p1 ≤ α/2} or {p1 ≤ α/3}

is true. Thus, if α/2 < p2 ≤ 2α/3 and p1 ≤ α/2, the Hommel procedure rejects H1, but the Hochberg procedure does not. Both the Hochberg and the Hommel procedures are available in R with the p.adjust function introduced in Section 2.3.1. For example, calling
R> p <- c(0.01, 0.02, 0.022, 0.09)
R> p.adjust(p, "hochberg")
[1] 0.040 0.044 0.044 0.090
gives the adjusted p-values from Table 2.2. Similarly, the Hommel procedure is available with
R> p.adjust(p, "hommel")
[1] 0.030 0.040 0.044 0.090
Note that in this example the Hommel procedure leads to smaller adjusted p-values for H1 and H2 than the Hochberg procedure, reflecting the previous comment about its power advantage.
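The m = 3 comparison between Hommel and Hochberg can also be checked numerically. The p-values below are hypothetical, chosen so that p1 ≤ α/2 and α/2 < p2 ≤ 2α/3 at α = 0.05, which is exactly the region where the two procedures disagree:

```r
alpha <- 0.05
# hypothetical p-values: p1 = 0.02 <= alpha/2, alpha/2 < p2 = 0.031 <= 2*alpha/3
p <- c(0.02, 0.031, 0.2)

p.adjust(p, "hochberg")   # 0.0600 0.0620 0.2000 -> no rejection at alpha
p.adjust(p, "hommel")     # 0.0465 0.0620 0.2000 -> H1 is rejected at alpha
```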

CHAPTER 3

Multiple Comparison Procedures in Parametric Models

In this chapter we introduce a general framework for multiple hypotheses testing in parametric and semi-parametric models. This chapter provides the theoretical basis for the applications analyzed in Chapter 4. In Section 3.1 we briefly review the standard linear model theory and show how to perform multiple comparisons in this framework, including analysis-of-variance (ANOVA), analysis-of-covariance (ANCOVA) and regression models as special cases. We extend the basic approaches from Chapter 2 by using inherent distributional assumptions, particularly by accounting for the structural correlation between the test statistics, thus achieving larger power. In addition, we revisit the linear regression example from Chapter 1 to illustrate the resulting methods. In Section 3.2 we extend the previous linear model framework and introduce multiple comparison procedures for general parametric models relying on standard asymptotic normality results. The methods apply, for example, to generalized linear models, linear and non-linear mixed-effects models as well as survival data. Again, the use of the inherent stochastic dependencies leads to powerful methods. The multcomp package in R provides a convenient interface to perform multiple comparisons for the parametric models considered in Sections 3.1 and 3.2. An in-depth introduction to the multcomp package is given in Section 3.3. Detailed examples to illustrate and extend the results from this chapter are left for Chapter 4.

3.1 General linear models In Section 3.1.1 we introduce the canonical framework for multiple hypotheses testing in general linear models. We refer to several standard results from the theory of linear models which will be used subsequently; see Searle (1971) for a detailed mathematical treatise of this subject. In Section 3.1.2 we revisit the linear regression example from Chapter 1 to illustrate some of the results using R.

3.1.1 Multiple comparisons in linear models Multiple comparisons in linear models have been considered previously in the literature; see, for example, the standard textbooks from Hochberg and Tamhane (1987) and Hsu (1996). Here, we follow the description from Bretz,


Hothorn, and Westfall (2008a) and consider the common linear model

y = Xβ + ε, (3.1)

where y = (y1, . . . , yn)⊤ denotes the n × 1 vector of observations, X = (xij)ij denotes the fixed and known n × p design matrix, and β = (β1, . . . , βp)⊤ denotes the fixed and unknown parameter vector. The random, unobservable n × 1 error vector ε is assumed to follow an n-dimensional normal distribution with mean vector 0 = (0, . . . , 0)⊤ and covariance matrix σ²In, ε ∼ Nn(0, σ²In) for short, where Nn denotes an n-dimensional normal distribution, σ² the common variance, and In the identity matrix of dimension n. Model (3.1) implies that each individual observation yi follows the linear model

yi = β1xi1 + ... + βpxip + εi,

where εi ∼ N(0, σ²). Extensions of this linear model will be considered in Section 3.2 and a variety of applications will be discussed in Chapter 4.

Assume that we want to perform pre-specified comparisons among the parameters β1, . . . , βp. To accomplish this, define a p × 1 vector c = (c1, . . . , cp)⊤ of known constants. If the vector c is chosen such that c⊤1 = 0, where 1 = (1, . . . , 1)⊤, it is called a contrast vector. The vector c thus reflects a single experimental comparison of interest through the linear combination c⊤β with associated null hypothesis

H : c⊤β = a    (3.2)

for a fixed and known constant a. In the following, we refer to c⊤β as the (linear) function of interest; all such functions are assumed to be estimable as defined below. If we have multiple experimental questions, m say, we obtain m vectors c1, ..., cm, which can be summarized by the matrix C = (c1, ..., cm), resulting in the m elementary null hypotheses

Hj : cj⊤β = aj,   j = 1, . . . , m.

Example 3.1. Recall the linear regression example from Chapter 1. There we considered the multiple test problem of whether the intercept or the slope of a linear regression model differs significantly from zero. Based on the thuesen data, we want to predict the ventricular shortening velocity y from blood glucose x using the model

yi = β1 + β2xi + εi for the i-th patient, where β1 denotes the intercept and β2 denotes the slope. We can thus assume that the n = 23 individual measurements follow the linear

© 2011 by Taylor and Francis Group, LLC GENERAL LINEAR MODELS 43 model (3.1), where  1.76   1 15.3   1.34   1 10.8       1.27   1 8.1       β   .   . .  1 y =  .  , X =  . .  , and β = .     β2  1.03   1 4.9       1.12   1 8.8  1.70 1 9.5 For the thuesen data example we are interested in testing the m = 2 elemen- tary null hypotheses

H1 : β1 = 0 and H2 : β2 = 0, where

a = (a1, a2)⊤ = (0, 0)⊤ and C = I2, the 2 × 2 identity matrix,

in the notation from Equation (3.2).

Standard linear model theory provides the usual least squares estimates

β̂ = (X⊤X)⁻X⊤y    (3.3)

and

σ̂² = (y − Xβ̂)⊤(y − Xβ̂)/ν,    (3.4)

where ν = n − rank(X) and (X⊤X)⁻ denotes some generalized inverse of X⊤X. Under the model assumptions (3.1), σ̂² is an unbiased estimate of σ². If rank(X) = p, the estimate β̂ = (X⊤X)⁻¹X⊤y is also unbiased. We can test the hypotheses Hj using the quantities

tj = (cj⊤β̂ − aj) / (σ̂ √(cj⊤(X⊤X)⁻cj)),   j = 1, . . . , m,    (3.5)

one for each experimental question defined through cj. By construction, each test statistic tj, j = 1, . . . , m, follows under the null hypothesis (3.2) a central univariate t distribution with ν degrees of freedom. When the null hypothesis Hj : cj⊤β = aj is not true, tj follows a non-central univariate t distribution with ν degrees of freedom and noncentrality parameter

τj = (cj⊤β − aj) / (σ √(cj⊤(X⊤X)⁻cj)),   j = 1, . . . , m.
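As a quick sanity check on (3.3) through (3.5), the test statistics tj can be computed by hand and compared with the t values reported by lm. The tiny data set below is hypothetical (it is not the thuesen data), with C = I2 and a = 0:

```r
# Hypothetical toy data: simple linear regression with n = 5 observations.
x <- c(1, 2, 3, 4, 5)
y <- c(1.2, 1.9, 3.1, 3.9, 5.2)
fit <- lm(y ~ x)

betahat  <- coef(fit)                  # least squares estimates (3.3)
sigmahat <- summary(fit)$sigma         # square root of (3.4)
XtX_inv  <- summary(fit)$cov.unscaled  # (X'X)^{-1}

# Test statistics (3.5) for H1: beta1 = 0 and H2: beta2 = 0,
# i.e., c1 = (1, 0)' and c2 = (0, 1)':
t_manual <- betahat / (sigmahat * sqrt(diag(XtX_inv)))

# They agree with the t values reported by summary(fit):
cbind(t_manual, summary(fit)$coefficients[, "t value"])
```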

The joint distribution of t1, . . . , tm is multivariate t with ν degrees of freedom and correlation matrix

R = D C⊤(X⊤X)⁻C D,

where D = diag(cj⊤(X⊤X)⁻cj)^{−1/2}. If σ is known or in the asymptotic case ν → ∞, the (limiting) multivariate normal distribution holds.


Let u1−α denote the critical value derived from the multivariate normal or t distribution for a pre-specified significance level α. We then reject the null hypothesis Hj if |tj| ≥ u1−α. Alternatively, if qj denotes the adjusted p-value for Hj computed from either the multivariate normal or t distribution, we reject Hj if qj ≤ α (recall Section 2.1.2 for a generic definition of adjusted p-values). Confidence intervals for cj⊤β − aj, j = 1, . . . , m, with simultaneous coverage probability 1 − α are given through

[ cj⊤β̂ − aj − u1−α σ̂ √(cj⊤(X⊤X)⁻cj) ;  cj⊤β̂ − aj + u1−α σ̂ √(cj⊤(X⊤X)⁻cj) ].

We obtain similar results for one-sided test problems by reformulating the null hypotheses, taking the test statistics tj instead of their absolute values |tj|, and computing the associated one-sided critical values and/or adjusted p-values. Genz and Bretz (1999, 2002, 2009) described numerical integration methods to calculate multivariate normal and t probabilities. For a general overview of these distributions we refer the reader to the books of Tong (1990), Kotz, Balakrishnan, and Johnson (2000), and Kotz and Nadarajah (2004).

The methods discussed in this section lead to powerful multiple comparison procedures. These procedures extend the classical Bonferroni method from Section 2.3.1 by considering the joint multivariate normal or t distribution of the individual test statistics (3.5). However, the methods presented here belong to the class of single-step procedures. As explained in Section 2.2.3, more powerful closed test procedures based on max-t tests of the form (2.1) can be constructed, which utilize the correlation structure involved in the joint distribution of the test statistics (either multivariate t or multivariate normal). It is the combination of the methods from this section with the closure principle described earlier that results in powerful stepwise procedures and forms the core methodology for the application analyses in Chapter 4.

Example 3.2. Revisiting the thuesen data from Example 3.1, we have β̂ = (1.098, 0.022)⊤ and σ̂ = 0.217. In Section 3.1.2 we will extract this information from the fitted linear model in R. These numbers can also be computed using Equations (3.3) and (3.4). Plugging the estimates β̂ and σ̂ into Equation (3.5) results in the test statistics t1 = 9.345 and t2 = 2.101 with ν = 21 degrees of freedom. Based on the correlation of −0.923 between the two test statistics, we can then compute the required bivariate t probabilities for the multiplicity adjusted p-values, as shown in Section 3.1.2.

We conclude this section by reviewing the estimability conditions for linear functions in general linear models. We again refer to Searle (1971) for further details. A function c⊤β is called estimable if there exists an n × 1 vector z such that E(z⊤y) = E(Σ_{i=1}^n ziYi) = c⊤β. If such a vector z cannot be found, the function c⊤β is not estimable. A necessary and sufficient condition for the estimability of c⊤β is given by

c⊤(X⊤X)⁻(X⊤X) = c⊤.

Further, if c⊤β is estimable, then E(z⊤y) = c⊤β, where

z⊤ = c⊤(X⊤X)⁻X⊤.

Moreover, if c⊤β is estimable, then c⊤β̂ = z⊤y is the best linear unbiased estimator of c⊤β and its variance

V(z⊤y) = σ² c⊤(X⊤X)⁻c

does not depend on the particular choice of the generalized inverse. Thus, once the estimability of c⊤β is confirmed, we obtain a good and unique estimate of the function c⊤β of interest through c⊤(X⊤X)⁻X⊤y. Note that we have used this result already in Equations (3.3) and (3.5).
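The estimability condition can be verified numerically. The sketch below uses a hypothetical overparameterized one-way layout (intercept plus three group indicators, so X is rank deficient) and a Moore-Penrose inverse computed from the singular value decomposition; neither the data nor the helper ginv2 is from the book:

```r
# Moore-Penrose generalized inverse via the SVD (a hypothetical helper,
# similar in spirit to MASS::ginv).
ginv2 <- function(A) {
  s <- svd(A)
  pos <- s$d > max(dim(A)) * max(s$d) * .Machine$double.eps
  s$v[, pos, drop = FALSE] %*% ((1 / s$d[pos]) * t(s$u[, pos, drop = FALSE]))
}

# Overparameterized one-way layout: intercept plus one indicator per group.
g <- factor(rep(1:3, each = 2))
X <- cbind(1, outer(as.integer(g), 1:3, "=="))   # n = 6, p = 4, rank 3
XtX <- crossprod(X)
G <- ginv2(XtX)

c_est <- c(0, 1, -1, 0)   # group difference alpha1 - alpha2: estimable
c_ne  <- c(0, 1, 0, 0)    # alpha1 alone: not estimable

max(abs(t(c_est) %*% G %*% XtX - t(c_est)))   # ~ 0: condition holds
max(abs(t(c_ne)  %*% G %*% XtX - t(c_ne)))    # clearly > 0: condition fails
```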

3.1.2 The linear regression example revisited using R

As a direct consequence of the linear model theory reviewed in Section 3.1.1, the implementation of any advanced multiple comparison procedure requires careful consideration of several individual steps. Flexible interfaces, such as the multcomp package in R, facilitate the use of such advanced methods. In this section we revisit the linear regression example based on the thuesen data from Chapter 1 to illustrate the key steps when using the multcomp package. A detailed introduction to the package is given in Section 3.3.

If the multcomp package were not available, we would need to make the necessary calculations step by step based on the methods from Section 3.1.1. The necessary estimates of the regression coefficients β and their covariance matrix can be extracted from the previously fitted model (see Chapter 1 for the associated lm fit) by calling
R> betahat <- coef(thuesen.lm)
R> Vbetahat <- vcov(thuesen.lm)
Given these quantities we can compute the vector t containing the two individual t test statistics and its associated correlation matrix as (see Section 3.1.1 for the theoretical background):
R> C <- diag(2)
R> Sigma <- diag(1 / sqrt(diag(C %*% Vbetahat %*% t(C))))
R> t <- Sigma %*% C %*% betahat
R> Cor <- Sigma %*% (C %*% Vbetahat %*% t(C)) %*% t(Sigma)
Note that t = (9.345, 2.101)⊤ with associated correlation matrix
      [,1]   [,2]
[1,]  1.000 -0.923
[2,] -0.923  1.000
Adjusted p-values are finally computed from the underlying bivariate t distribution using the pmvt function of the mvtnorm package (Hothorn, Bretz, and Genz 2001; Genz and Bretz 2009):

R> library("mvtnorm")
R> thuesen.df <- nrow(thuesen) - length(betahat)
R> q <- sapply(abs(t), function(x)
+      1 - pmvt(-rep(x, 2), rep(x, 2), corr = Cor,
+               df = thuesen.df))
We obtain the multiplicity adjusted p-values

q1 < 0.001 and q2 = 0.064,    (3.6)

indicating that the intercept is significantly different from 0 but the slope is not. Alternatively, we can compute a critical value u1−α derived from the bivariate t distribution and compare the test statistics t = (t1, t2)⊤ against it. Using the function
R> delta <- rep(0, 2)
R> myfct <- function(x, conf) {
+      lower <- rep(-x, 2)
+      upper <- rep(x, 2)
+      pmvt(lower, upper, df = thuesen.df, corr = Cor,
+           delta, abseps = 0.0001)[1] - conf
+ }
we can compute the critical value u1−α with the uniroot function
R> u <- uniroot(myfct, lower = 1, upper = 5, conf = 0.95)$root
R> round(u, 3)
[1] 2.23

In our example we set the confidence level to 1 − α = 0.95 and obtain u1−α = 2.229. Because the test statistics for the two-sided test problem are |t1| = 9.345 > u1−α and |t2| = 2.101 < u1−α, we obtain the same test decisions as in (3.6). In addition, the critical value u1−α = 2.229 can be used to compute simultaneous confidence intervals for the parameters β1 and β2. Note that because the parameter estimates are highly correlated, the critical value 2.229 from the bivariate t distribution is considerably smaller than the Bonferroni critical value t1−α/4,ν = 2.414 from the univariate t distribution.

As seen from the thuesen example, performing advanced multiple comparisons can involve a number of individual steps. The multcomp package provides a formal framework to replace the previous calculations with standardized function calls. Details of the multcomp package are given in Section 3.3, but for illustration purposes we apply it now to the thuesen data. The following lines replicate some of the calculations from Chapter 1, where we first used the multcomp package.

The glht function from multcomp takes a fitted response model and a matrix C defining the hypotheses of interest to perform the multiple comparisons:

R> library("multcomp")
R> thuesen.mc <- glht(thuesen.lm, linfct = C)
R> summary(thuesen.mc)

Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = short.velocity ~ blood.glucose, data = thuesen)

Linear Hypotheses:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept) == 0     1.0978     0.1175    9.34    1e-08 ***
blood.glucose == 0   0.0220     0.0105    2.10    0.064 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

For each parameter βi, i = 1, 2, multcomp reports its estimate and standard error. Taking the ratio of these two values for each parameter results in the reported test statistics. Adjusted p-values are given in the last column of the standard output from multcomp. As expected, they are the same as calculated in (3.6). In addition, simultaneous confidence intervals can be calculated for each parameter using the confint method:
R> confint(thuesen.mc)

Simultaneous Confidence Intervals

Fit: lm(formula = short.velocity ~ blood.glucose, data = thuesen)

Quantile = 2.23
95% family-wise confidence level

Linear Hypotheses:
                   Estimate      lwr      upr
(Intercept) == 0    1.09781  0.83591  1.35972
blood.glucose == 0  0.02196 -0.00134  0.04527

The two-sided confidence interval for the slope includes 0, thus reflecting the previous test decision that the slope is not significantly different from 0. We further conclude that the intercept lies roughly between 0.84 and 1.36.

So far we have only illustrated single-step procedures, which account for the correlation among the test statistics. As we know from Chapter 2, associated closed test procedures are available and uniformly more powerful. In multcomp we can call, for example,

R> summary(thuesen.mc, test = adjusted(type = "Westfall"))

Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = short.velocity ~ blood.glucose, data = thuesen)

Linear Hypotheses:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept) == 0     1.0978     0.1175    9.34    1e-08 ***
blood.glucose == 0   0.0220     0.0105    2.10    0.048 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- Westfall method)

to extract related adjusted p-values. Note that the p-value associated with the slope parameter is now p = 0.048 < 0.05, and one can thus safely claim that the slope is significant at level α = 0.05 after having adjusted for multiplicity. Details on the stepwise procedures implemented in the multcomp package are given in Section 3.3 and Chapter 4.

3.2 Extensions to general parametric models

In Section 3.1 we introduced multiple comparison procedures, which are well established for common regression and ANOVA models allowing for covariates and/or factorial treatment structures with independent and identically distributed normal errors and constant variance. In this section we extend these results and provide a unified description of multiple comparison procedures in parametric models with generally correlated parameter estimates. We relax the standard ANOVA assumptions, such as normality and homoscedasticity, thus allowing for simultaneous inference in generalized linear models, mixed-effects models, survival models, etc. As before, we assume that each individual null hypothesis is specified through a linear combination of p elemental model parameters and we simultaneously test m null hypotheses.

In Section 3.2.1 we introduce the necessary asymptotic results for the linear functions of interest under rather weak conditions. In Section 3.2.2 we describe the framework for multiple comparison procedures in general parametric models. We give important applications of the methodology in Section 3.2.3. Numerical examples to illustrate the methods using the multcomp package are left for Chapter 4. Much of the following material follows the outline of Hothorn, Bretz, and Westfall (2008).

3.2.1 Asymptotic results

We extend the notation from Section 3.1 to cope with the generality required below. Let M({z1, ..., zn}, θ, η) denote a parametric or semi-parametric statistical model, where {z1, ..., zn} denotes the set of n observation vectors, θ denotes the p × 1 fixed parameter vector and the vector η contains other

(random or nuisance) parameters. We are primarily interested in the linear functions ϑ = C⊤θ of the parameter vector θ as specified through the p × m matrix C with fixed and known constants. Assume that we are given an estimate θ̂n of the parameter vector θ. In what follows we describe the underlying model assumptions, the limiting distribution for the estimates of our parameters of interest, ϑ = C⊤θ, as well as the corresponding test statistics for hypotheses involving ϑ and their limiting joint distribution.

Consider, for example, a standard regression model to illustrate the new notation and how it relates to the one used in Section 3.1. Here, the observations zi of subject i = 1, ..., n consist of a response variable yi and a vector of covariates xi = (xi1, ..., xip), such that zi = (yi, xi) and xi1 = 1 for all i. The response is modeled by a linear combination of the covariates with normal error εi ∼ N(0, σ²) and constant variance σ²,

    yi = β1 + Σ_{j=2}^{p} βj xij + εi.

The parameter vector is then θ = (β1, ..., βp), resulting in the linear functions of interest ϑ = C⊤θ.

Coming back to the general theory, suppose θ̂n is an estimate of θ and Sn ∈ R^{p×p} is an estimate of cov(θ̂n) with

    an Sn →P Σ ∈ R^{p×p}                                         (3.7)

for some positive, nondecreasing sequence an. Furthermore, we assume that the multivariate central limit theorem holds, that is,

    an^{1/2} (θ̂n − θ) →d Np(0, Σ).                                (3.8)

If both (3.7) and (3.8) are fulfilled, we write θ̂n ∼a Np(θ, Sn). Then, by Theorem 3.3.A in Serfling (1980), the estimate of our parameter of interest, ϑ̂n = C⊤θ̂n, is approximately multivariate normally distributed, that is,

    ϑ̂n = C⊤θ̂n ∼a Nm(ϑ, Sn*)

with covariance matrix Sn* = C⊤SnC for any fixed matrix C. Thus, we do not need to distinguish between the model parameter θ and the derived parameter ϑ = C⊤θ. In analogy to (3.7) and (3.8) we can thus assume that ϑ̂n ∼a Nm(ϑ, Sn*) holds, where

    an Sn* →P Σ* = C⊤ΣC ∈ R^{m×m},

and that the m parameters in ϑ are the parameters of interest. It is assumed that the diagonal elements of the covariance matrix are positive, that is, Σ*jj > 0 for j = 1, ..., m. Consequently, the standardized estimate ϑ̂n is again asymptotically multivariate normally distributed. Following Hothorn et al. (2008),

    tn = Dn^{−1/2} (ϑ̂n − ϑ) ∼a Nm(0, Rn)

where Dn = diag(Sn*) contains the diagonal elements of Sn* and

    Rn = Dn^{−1/2} Sn* Dn^{−1/2} ∈ R^{m×m}

is the correlation matrix of the m-dimensional statistic tn. This leads to the main asymptotic result in this section,

    tn = (an Dn)^{−1/2} an^{1/2} (ϑ̂n − ϑ) →d Nm(0, R).

For the purpose of multiple comparisons, we need convergence of multivariate probabilities calculated for the vector tn, where tn is assumed normally distributed and Rn treated as if it was the true correlation matrix. However, the necessary probabilities are continuous functions of Rn (and the associated critical value) which converge by Rn →P R as a consequence of Theorem 1.7 in Serfling (1980). In cases where tn is assumed to be multivariate t distributed with Rn treated as the estimated correlation matrix, we have similar convergence as the degrees of freedom approach infinity.

Since we only assume that the parameter estimates θ̂n are asymptotically normally distributed with an available consistent estimate of the associated covariance matrix, the framework in this section covers a wide range of statistical models, including linear regression and ANOVA models, generalized linear models, linear mixed-effects models, Cox models, robust linear models, etc. Standard software packages can be used to fit such models and obtain the estimates θ̂n and Sn, which are the only two quantities that are needed here.
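To make the two required quantities concrete, the following sketch (our illustration, not an example from the text; any model offering coef and vcov methods would do) computes ϑ̂n, Sn*, Rn and the standardized statistics tn for a simple one-way model:

```r
## Illustrative sketch: theta-hat and S_n extracted from a fitted model,
## and the derived quantities of Section 3.2.1.
fit   <- lm(breaks ~ tension, data = warpbreaks)
theta <- coef(fit)   # parameter estimate theta-hat
S     <- vcov(fit)   # covariance estimate S_n

## Each ROW below is one linear function of interest, i.e. this matrix
## plays the role of C-transpose in the text (all pairwise differences
## of the tension effects).
Ct <- rbind("M - L" = c(0,  1, 0),
            "H - L" = c(0,  0, 1),
            "H - M" = c(0, -1, 1))

vartheta <- drop(Ct %*% theta)            # estimates of the linear functions
Sstar    <- Ct %*% S %*% t(Ct)            # covariance matrix S*_n
Rn       <- cov2cor(Sstar)                # correlation matrix R_n
tn       <- vartheta / sqrt(diag(Sstar))  # standardized statistics t_n
```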

3.2.2 Multiple comparisons in general parametric models

Based on the asymptotic results from Section 3.2.1, we can derive suitable multiple comparison procedures for the class of parametric models introduced there. We start by considering the global null hypothesis

    H : ϑ = a,

where, as before, ϑ = C⊤θ and a = (a1, ..., am)⊤ denotes a vector of fixed and known constants. Under the conditions of H it follows from Section 3.2.1 that

    tn = Dn^{−1/2} (ϑ̂n − a) ∼a Nm(0, Rn).

This approximating distribution will be used as the reference distribution when constructing suitable multiple comparison procedures below. Note that the asymptotic results can be sharpened if we assume exact normality θ̂n ∼ Np(θ, Σ) instead of the asymptotic normality assumption (3.8). If the covariance matrix Σ is known, it follows by standard arguments that tn ∼ Nm(0, R), where tn is normalized using fixed and known variances. Otherwise, in the typical situation of linear models with independent, normally distributed errors and Σ = σ²A, where σ² is unknown but A is fixed and known, the exact distribution of tn is multivariate t; see Section 3.1.1.

The global null hypothesis H can be tested using standard tests, such as F or χ² tests; see Hothorn et al. (2008) for analytical expressions. An alternative

approach is to consider the maximum of the individual components of the vector tn = (t1n, ..., tmn), leading to the max-t test tmax = max |tn| (see also Section 2.2.1 for a brief discussion about max-t tests). The distribution of this statistic under the conditions of H can be computed using the m-dimensional distribution function

    P(tmax ≤ t) = ∫_{−t}^{t} ··· ∫_{−t}^{t} φm(x; R, ν) dx =: gν(R, t)        (3.9)

for some t ∈ R, where φm denotes the density function of either the limiting m-dimensional normal (with ν = ∞ degrees of freedom and the "≈" operator) or the exact multivariate t distribution (with ν < ∞ and the "=" operator). Because R is usually unknown, we plug in the consistent estimate Rn, as discussed in Section 3.2.1. The resulting (exact or asymptotic) p-value for H is then given by 1 − gν(Rn, max |t^obs|), where t^obs = (t1^obs, ..., tm^obs) denotes the vector of observed test statistics tj^obs. Efficient methods to calculate multivariate normal and t integrals are described in Genz and Bretz (1999, 2002, 2009).

Note that in contrast to standard global F or χ² tests, max-t tests of the form tmax = max |tn| additionally provide information about the elementary null hypotheses (consonance property, see Section 2.2.1). To this end, recall that C = (c1, ..., cm) and let ϑj = cj⊤θ denote the j-th linear function of interest, j = 1, ..., m. The elementary null hypotheses of interest are given by

    Hj : cj⊤θ = aj,   j = 1, ..., m,

for fixed and known constants aj, such that H = ∩_{j=1}^{m} Hj. Adjusted p-values (exact or asymptotic) are given by

    qj = 1 − gν(Rn, |tj^obs|),   j = 1, ..., m,

and calculated from expression (3.9). By construction, we reject an elementary null hypothesis Hj whenever the associated adjusted p-value qj is less than or equal to the pre-specified significance level α, that is, qj ≤ α.
Alternatively, if u1−α denotes the (1−α)-quantile of the distribution (asymptotic, if necessary) of tn, we reject Hj if |tj^obs| ≥ u1−α. In addition, simultaneous confidence intervals for ϑj with coverage probability 1 − α can be constructed from

    ϑ̂n ± u1−α diag(Dn)^{1/2}.

Similar results also hold for one-sided test problems.

Similar to Section 3.1.1, the methods discussed here can be used to construct more powerful closed test procedures, as first discussed by Westfall (1997). That is, applying closed test procedures (Section 2.2.3) based on max-t tests of the form (2.1) in combination with the results from here gives powerful stepwise procedures for a large class of parametric models. The multcomp package implements these methods while exploiting logical constraints, leading to computationally efficient, yet powerful truncated closed test procedures (Westfall and Tobias 2007), as described in Section 3.3 and illustrated with examples in Chapter 4.
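The probabilities gν(R, t) in expression (3.9), and hence the adjusted p-values qj, can be evaluated with the mvtnorm package. A minimal sketch, using an assumed two-dimensional correlation matrix and illustrative observed statistics (not values taken from the text):

```r
## Illustrative sketch: q_j = 1 - g_nu(R_n, |t_j^obs|) via mvtnorm::pmvt.
library(mvtnorm)

Rn    <- matrix(c(1.0, 0.5,
                  0.5, 1.0), nrow = 2)  # assumed estimate R_n
t_obs <- c(2.53, -1.19)                 # assumed observed statistics
nu    <- 51                             # degrees of freedom (nu < infinity)

## g_nu(R_n, t): central m-dimensional t probability over [-t, t]^m
g <- function(t) pmvt(lower = rep(-t, 2), upper = rep(t, 2),
                      corr = Rn, df = nu)

## adjusted p-values, one per elementary hypothesis
q <- sapply(abs(t_obs), function(t) 1 - g(t))
```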

3.2.3 Applications

The methodological framework described in Section 3.2.1 is very general and thus applicable to a wide range of statistical models. Many estimation techniques, such as (restricted) maximum likelihood and M estimates, provide at least asymptotically normal estimates of the original parameter vector θ and a consistent estimate of the covariance matrix. In this section we review some potential applications. Detailed numerical examples are discussed in Chapter 4.

In Section 3.2.1 we already provided a connection between the general parametric framework and standard regression models. Similarly, ANOVA models are easily embedded in that framework as well. With an appropriate change of notation, one can verify that the results from Section 3.1 are a special case of the general framework considered in Section 3.2.1.

In generalized linear models, the exact distribution of the parameter estimates is usually unknown and the asymptotic normal distribution is the basis for all inference procedures. If inferences about model parameters corresponding to levels of a certain factor are of interest, one can use the multiple comparison procedures described above.

Similarly, in linear and non-linear mixed-effects models fitted by restricted maximum likelihood, the asymptotic normal distribution using consistent covariance matrix estimates forms the basis for inference. All assumptions of the general parametric framework are satisfied again and one can set up simultaneous inference procedures for these models as well. The same is true for either the Cox model or other parametric survival models such as the Weibull survival model.

Yet another application is to use robust variants of the previously discussed statistical models. One possibility is to consider the use of sandwich estimators Sn for the covariance matrix cov(θ̂n) when the variance homogeneity assumption is questionable.
An alternative approach is to apply robust estimation techniques in linear models (such as S-, M- or MM-estimates), which again provide asymptotically normal estimates (Rousseeuw and Leroy 2003).

Herberich (2009) investigated through an extensive simulation study the operating characteristics (size and power) of the multiple comparison procedures described in this section for a variety of statistical models (generalized linear models, mixed effects models, survival data, etc.). It transpires that the familywise error rate is generally well maintained even for moderate to small sample sizes, although under certain models and scenarios the resulting tests may become either conservative (especially for binary data with small sample sizes) or liberal (survival data, depending on the censoring mechanism). Further simulations indicated a good overall performance of the proposed parametric multiple comparison procedures compared with existing procedures, such as the simultaneous confidence intervals for binomial parameters introduced by Agresti, Bini, Bertaccini, and Ryu (2008).
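As a sketch of these applications (our illustration, not an example from the book), a logistic regression fits directly into the framework, and a sandwich estimate of the covariance matrix can be supplied through the argument that glht passes on to modelparm:

```r
## Illustrative sketch: multiple comparisons in a GLM, plus a robust
## (sandwich) variant. Assumes the multcomp and sandwich packages.
library(multcomp)
library(sandwich)

## Logistic regression on a dichotomized response: inference rests on
## the asymptotic normality of the parameter estimates, as above.
fit <- glm(I(breaks > 26) ~ tension, data = warpbreaks,
           family = binomial())
summary(glht(fit, linfct = mcp(tension = "Tukey")))

## Robust variant: plug in a sandwich estimate of cov(theta-hat);
## the vcov. argument is passed on to modelparm().
summary(glht(fit, linfct = mcp(tension = "Tukey"), vcov. = sandwich))
```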

3.3 The multcomp package

In this section we introduce the multcomp package in R to perform multiple comparisons under the parametric model framework described in Sections 3.1 and 3.2. In Section 3.3.1 we detail the glht function, which provides the core functionality to perform single-step tests based on a given matrix C reflecting the experimental questions of interest and the underlying multivariate normal or t distribution. In Section 3.3.2 we describe the summary method associated with the glht function, which provides detailed output information, including the results of several p-value adjustment methods and stepwise test procedures. Finally, we describe in Section 3.3.3 the confint method associated with the glht function, which provides the functionality to compute and plot simultaneous confidence intervals for some of the multiple comparison procedures.

The multcomp package includes additional functionality not covered in this section. In Chapter 4 we illustrate some of its enhanced capabilities. The complete documentation is available with the package (Hothorn, Bretz, Westfall, Heiberger, and Schützenmeister 2010a). As any other package used in this book, the multcomp package can be downloaded from CRAN.

3.3.1 The glht function

In this section we consider the warpbreaks data from Tukey (1977), which are also available from the base R datasets package. The data give the number of breaks in yarn during weaving for two types of wool (A and B) and three levels of tension (L, M, and H). For illustration purposes, here we only consider the effect of the tension on the number of breaks, neglecting the potential effect of the wool type. Figure 3.1 shows the associated boxplots for the data. Assume that we are interested in assessing whether the three levels of tension differ from each other with respect to the number of breaks. In other words, if µj denotes the mean number of breaks for tension level j = L, M, H, we are interested in testing the three null hypotheses

Hij : µi − µj = 0 (3.10) against the alternative hypotheses

Kij : µi − µj ≠ 0,   i, j ∈ {L, M, H}, i ≠ j.

In the following we use this example to illustrate the multcomp package. To analyze the warpbreaks data we use the well-known Tukey test, which performs all pairwise comparisons between the three treatments and which we formally introduce in Section 4.2. We use the aov function to fit the one-factor ANOVA model

R> warpbreaks.aov <- aov(breaks ~ tension, data = warpbreaks)
R> summary(warpbreaks.aov)
            Df Sum Sq Mean Sq F value Pr(>F)
tension      2   2034    1017    7.21 0.0018 **

Figure 3.1 Boxplots of the warpbreaks data.

Residuals   51   7199     141
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Based on the ANOVA F test we conclude that the factor tension has an overall significant effect.

The glht function from the multcomp package provides a convenient and general framework in R to test multiple hypotheses in parametric models, including the general linear models introduced in Section 3.1, linear and non-linear mixed-effects models as well as survival models. Generally speaking, the glht function takes a fitted response model and a matrix C defining the hypotheses of interest to perform the multiple comparisons. The general syntax for the glht function is

glht(model, linfct,
     alternative = c("two.sided", "less", "greater"),
     rhs = 0, ...)

In this call, the model argument is a fitted model, such as an object returned by lm, glm, or aov. It is assumed that the parameter estimates and their covariance matrix are available for the model argument. That is, the model argument needs suitable coef and vcov methods to be available. The argument linfct specifies the matrix C introduced in Section 3.1. There are alternative ways of specifying the matrix C and we illustrate them below using the warpbreaks example. The alternative argument is a character string specifying the alternative hypothesis, which must be one of "two.sided" (default), "greater" or "less", depending on whether two-sided or one-sided hypotheses are of interest. The rhs argument is an optional numeric vector specifying the right hand side of the hypothesis (in Equation (3.10) the right hand side is 0 for all three hypotheses). Finally, with "..." additional arguments can be passed on to the modelparm function in all glht methods.

We now review the different possibilities of specifying the matrix C, which can, but does not have to be, a contrast matrix; see Section 3.1.1 for the definition of a contrast. The most convenient way is to use the mcp function. Multiple comparisons of means are defined by objects of the mcp class as returned by the mcp function. For each factor contained in model as an independent variable, a (contrast) matrix or a symbolic description of the comparisons of interest can be specified as an argument to mcp. A symbolic description may be a character or an expression where the factor levels are only used as variable names. In addition, the type argument to the (contrast) matrix generating function contrMat may serve as a symbolic description as well.
To illustrate the latter, we invoke the Tukey test consisting of all pairwise comparisons for the factor tension by using a symbolic description, that is, using the type argument to the contrMat function

R> glht(warpbreaks.aov, linfct = mcp(tension = "Tukey"))
General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Linear Hypotheses:
           Estimate
M - L == 0   -10.00
H - L == 0   -14.72
H - M == 0    -4.72

Note that the contrMat function also permits pre-specifying other contrast matrices, such as "Dunnett", "Williams", and "Changepoint"; see Section 4.3.2

for further examples and the online documentation for a complete list (Hothorn et al. 2010a).

Another way of defining the matrix C is to use a symbolic description, where either a character or an expression vector is passed to glht via its linfct argument. A symbolic description must be interpretable as a valid R expression consisting of both the left and the right hand side of the expression for the null hypotheses. Only the names of coef(beta) can be used as variable names. The alternative hypotheses are given by the direction under the null hypothesis (= or == refer to "two.sided", <= refers to "greater" and >= refers to "less"). Numeric vectors of length one are valid arguments for the right hand side. If we call

R> glht(warpbreaks.aov,
+       linfct = mcp(tension = c("M - L = 0",
+                                "H - L = 0",
+                                "H - M = 0")))
General Linear Hypotheses

Multiple Comparisons of Means: User-defined Contrasts

Linear Hypotheses:
           Estimate
M - L == 0   -10.00
H - L == 0   -14.72
H - M == 0    -4.72

we obtain the same results as before. Alternatively, the matrix C can be defined directly by calling

R> contr <- rbind("M - L" = c(-1, 1, 0),
+                 "H - L" = c(-1, 0, 1),
+                 "H - M" = c( 0, -1, 1))
R> contr
      [,1] [,2] [,3]
M - L   -1    1    0
H - L   -1    0    1
H - M    0   -1    1

and again we obtain the same results as before:

R> glht(warpbreaks.aov, linfct = mcp(tension = contr))
General Linear Hypotheses

Multiple Comparisons of Means: User-defined Contrasts

Linear Hypotheses:
           Estimate

M - L == 0   -10.00
H - L == 0   -14.72
H - M == 0    -4.72

Finally, the matrix C can be specified directly via the linfct argument. In this case, the number of columns of the matrix needs to match the number of parameters estimated by model. It is assumed that suitable coef and vcov methods are available for model. In the warpbreaks example, we can specify

R> glht(warpbreaks.aov,
+       linfct = cbind(0, contr %*% contr.treatment(3)))
General Linear Hypotheses

Linear Hypotheses:
           Estimate
M - L == 0   -10.00
H - L == 0   -14.72
H - M == 0    -4.72

and again we obtain the same results as before. Note that in the last statement we used the so-called treatment contrasts

R> contr.treatment(3)
  2 3
1 0 0
2 1 0
3 0 1

which are used as default in R to fit ANOVA and regression models. The first group is treated as a control group, to which the other groups are compared. To be more specific, the analysis is performed as a multiple regression by introducing two dummy variables which are 1 for observations in the relevant group and 0 elsewhere.

We now turn our attention to the description of the output from the glht function and the available methods for further analyses. Using the glht function, a list is returned with the elements

R> warpbreaks.mc <- glht(warpbreaks.aov,
+                        linfct = mcp(tension = "Tukey"))
R> names(warpbreaks.mc)
[1] "model"       "linfct"      "rhs"         "coef"
[5] "vcov"        "df"          "alternative" "type"
[9] "focus"

where print, summary, and confint methods are available for further information handling. If glht is called with linfct as an mcp object, the additional element focus is available, which stores the names of the factors tested. In the following we describe the output in more detail. Some of the associated methods will be discussed in the subsequent sections. The first element of the returned object

R> warpbreaks.mc$model
Call:
   aov(formula = breaks ~ tension, data = warpbreaks)

Terms:
                tension Residuals
Sum of Squares     2034      7199
Deg. of Freedom       2        51

Residual standard error: 11.9
Estimated effects may be unbalanced

gives the fitted model, as used in the glht call. The element

R> warpbreaks.mc$linfct
      (Intercept) tensionM tensionH
M - L           0        1        0
H - L           0        0        1
H - M           0       -1        1
attr(,"type")
[1] "Tukey"

returns the matrix C used for the definition of the linear functions of interest, as discussed above. Each row of this matrix essentially defines the left hand side of the hypotheses defined in equation (3.2). The associated right hand side of equation (3.2) is returned by

R> warpbreaks.mc$rhs
[1] 0 0 0

The next two elements of the list returned by the glht function give the estimates and the associated covariance matrix of the parameters specified through model,

R> warpbreaks.mc$coef
(Intercept)    tensionM    tensionH
       36.4       -10.0       -14.7
R> warpbreaks.mc$vcov
            (Intercept) tensionM tensionH
(Intercept)        7.84    -7.84    -7.84
tensionM          -7.84    15.68     7.84
tensionH          -7.84     7.84    15.68

As an optional element the degrees of freedom

R> warpbreaks.mc$df
[1] 51

are returned if the multivariate t distribution is used for simultaneous inference. Because we have 54 observations in total for the warpbreaks data and the

factor tension has 3 levels, we obtain the displayed 51 degrees of freedom for the underlying one-way layout. Note that inference is based on the multivariate t distribution whenever a linear model with normally distributed errors is applied; the limiting multivariate normal distribution is used in all other cases. The element

R> warpbreaks.mc$alternative
[1] "two.sided"

is a character string specifying the sideness of the test problem. Finally,

R> warpbreaks.mc$type
[1] "Tukey"

optionally specifies the name of the applied multiple comparison procedure.
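These elements are all that is needed to reproduce the reported estimates and test statistics by hand. The following sketch illustrates the arithmetic (our illustration, not multcomp internals):

```r
## Illustrative sketch: reconstructing the estimates and t statistics
## from the elements stored in the glht object.
library(multcomp)
warpbreaks.aov <- aov(breaks ~ tension, data = warpbreaks)
warpbreaks.mc  <- glht(warpbreaks.aov, linfct = mcp(tension = "Tukey"))

K   <- warpbreaks.mc$linfct                                # the matrix C, row-wise
est <- drop(K %*% warpbreaks.mc$coef) - warpbreaks.mc$rhs  # estimates minus rhs
se  <- sqrt(diag(K %*% warpbreaks.mc$vcov %*% t(K)))       # standard errors
est / se                                                   # t statistics
```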

3.3.2 The summary method

The multcomp package provides a summary method to summarize and display the results returned from the glht function. In addition, the summary method provides several functionalities to perform further analyses and adjustments for multiplicity. We start applying the summary method to the object returned by glht as

R> summary(warpbreaks.mc)
Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = breaks ~ tension, data = warpbreaks)

Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
M - L == 0   -10.00       3.96   -2.53   0.0385 *
H - L == 0   -14.72       3.96   -3.72   0.0014 **
H - M == 0    -4.72       3.96   -1.19   0.4631
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

As seen from the output above, for each of the m = 3 linear functions of interest the estimates cj⊤β̂ are reported together with the associated standard errors √V̂(cj⊤β̂), resulting in the test statistics

    tj = cj⊤β̂ / √V̂(cj⊤β̂),   j = 1, 2, 3;

recall Equation (3.5) for the general expression (note that a = 0 in the warpbreaks example). Multiplicity adjusted p-values qj are reported in the last column. These p-values are calculated from the underlying multivariate t (or normal, if appropriate) distribution and can be directly compared with the significance level α. In the warpbreaks example, we conclude that both tension levels M (medium) and H (high) are significantly different from L (low) at level α = 0.05, because q1 = qML = 0.0385 < 0.05 = α and q2 = qHL = 0.0014 < 0.05. Note that q3 = qHM = 0.4631 > 0.05 and the tension levels M and H are not declared to be significantly different.

If we save the information provided by the summary method into the object

R> warpbreaks.res <- summary(warpbreaks.mc)

we can extract the numerical information from the resulting list element warpbreaks.res$test for further analyses. For example,

R> warpbreaks.res$test$pvalues
[1] 0.03840 0.00144 0.46307
attr(,"error")
[1] 0.000184

returns the adjusted p-values from the previous analysis.

In addition to summarizing and displaying the information from the object returned by the glht function, the summary method provides a test argument, which allows one to perform further analyses. Recall Equation (3.10) for the elementary null hypotheses of interest in the warpbreaks example. The global null hypothesis H = HML ∩ HHL ∩ HHM can then be tested with the common F test by calling

R> summary(warpbreaks.mc, test = Ftest())
General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Linear Hypotheses:
           Estimate
M - L == 0   -10.00
H - L == 0   -14.72
H - M == 0    -4.72

Global Test:
     F DF1 DF2  Pr(>F)
1  7.2   2  51 0.00175

Note that the above F test result coincides with the result from the analysis following Equation (3.10), where we used the aov function to fit the one-factor ANOVA model. Similarly, a χ² test can be performed by specifying Chisqtest(). If no multiplicity adjustment is foreseen, the univariate() option can be passed to the test argument

R> summary(warpbreaks.mc, test = univariate())
Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = breaks ~ tension, data = warpbreaks)

Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
M - L == 0   -10.00       3.96   -2.53   0.0147 *
H - L == 0   -14.72       3.96   -3.72   0.0005 ***
H - M == 0    -4.72       3.96   -1.19   0.2386
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Univariate p values reported)

This results in unadjusted p-values, as if we had performed m separate t tests without accounting for multiplicity. If we compare the unadjusted p-values pj from the output above with the adjusted p-values qj from the Tukey test, we conclude that pj < qj, j = 1, 2, 3, as expected.

Table 3.1 summarizes the multiplicity adjustment methods available with the summary method via the adjusted argument to the test function. In particular, an interface to the multiplicity adjustments implemented in the p.adjust function from the stats package is available. Given a set of unadjusted p-values, the p.adjust function provides the resulting adjusted p-values using one of several methods. Currently, the following methods implemented in p.adjust are available: "none" (no multiplicity adjustment), "bonferroni" (Section 2.3.1), "holm" (Section 2.3.2), "hochberg" (Section 2.4), "hommel" (Section 2.4), "BH" (Benjamini and Hochberg 1995), and "BY" (Benjamini and Yekutieli 2001). The last two methods are tailored to control the false discovery rate; see Section 2.1.1.

To illustrate the functionality, assume that we want to apply the Bonferroni correction to the warpbreaks data. This can be achieved by specifying the type argument to the adjusted option as

R> summary(warpbreaks.mc, test = adjusted(type = "bonferroni"))
Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = breaks ~ tension, data = warpbreaks)

Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
M - L == 0   -10.00       3.96   -2.53   0.0442 *
H - L == 0   -14.72       3.96   -3.72   0.0015 **
H - M == 0    -4.72       3.96   -1.19   0.7158

---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- bonferroni method)

Table 3.1 Multiple comparison procedures available with the type argument to adjusted and the test argument to the summary method from the multcomp package; section numbers in parentheses. In addition, global F and χ² tests are available as direct arguments to summary.

type            Reference   Comments
"single-step"   3.1, 3.2    default option; incorporates correlations and is
                            more powerful than "bonferroni"
"free"          4.1.2       stepwise extension of "single-step" and more
                            powerful than "holm"
"Shaffer"       2.3.2       stepwise extension of "bonferroni" under the
                            restricted combination condition
"Westfall"      4.2.2       extension of "Shaffer" that incorporates
                            correlations
"bonferroni"    2.3.1
"holm"          2.3.2       stepwise extension of "bonferroni" and more
                            powerful
"hochberg"      2.4         based on Simes test; more powerful than "holm"
                            but has additional assumptions
"hommel"        2.4         more powerful than "hochberg"
"BH", "BY"      2.1.1       procedures controlling the false discovery rate
"none"                      no multiplicity adjustment; identical to
                            test = univariate()

Note that all methods from the p.adjust function are based on unadjusted p-values. More powerful methods are available by taking the correlations between the test statistics into account, as described in Sections 3.1 and 3.2. In multcomp, four additional options for the type argument to adjusted are implemented to accomplish this. The type = "single-step" option specifies a single-step test (Section 2.1.2), which incorporates the correlations between the test statistics using either the multivariate normal or t distribution implemented in the mvtnorm package (Genz, Bretz, and Hothorn 2010). Consequently, calling

R> summary(warpbreaks.mc, test = adjusted(type = "single-step"))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = breaks ~ tension, data = warpbreaks)

Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
M - L == 0   -10.00       3.96   -2.53   0.0385 *
H - L == 0   -14.72       3.96   -3.72   0.0014 **
H - M == 0    -4.72       3.96   -1.19   0.4631
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

results in the same adjusted p-values and test decisions as for the Tukey test considered previously. The type = "free" option leads to a step-down test procedure under the free combination condition (Section 2.1.2), which incorporates correlations and is thus more powerful than the Holm procedure; see Section 4.1.2 for a detailed description of its usage. Under the restricted combination condition, the type = "Shaffer" option performs the S2 procedure from Shaffer (1986), which uses Bonferroni tests for each intersection hypothesis of the underlying closed test procedure (Section 2.3.2). When the tests satisfy the free combinations condition instead, Shaffer's procedure reduces to the ordinary Holm procedure. Finally, type = "Westfall" is yet more powerful than Shaffer's procedure. The increase in power comes from using specific dependence information rather than the conservative Bonferroni inequality (Westfall 1997; Westfall and Tobias 2007); see Section 4.2.2 for a detailed discussion.
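The stepwise options can be tried on the warpbreaks comparisons with calls of the following form (a sketch; output omitted):

```r
## Illustrative sketch: correlation-based stepwise adjustments applied
## to the warpbreaks Tukey comparisons.
library(multcomp)
warpbreaks.aov <- aov(breaks ~ tension, data = warpbreaks)
warpbreaks.mc  <- glht(warpbreaks.aov, linfct = mcp(tension = "Tukey"))

summary(warpbreaks.mc, test = adjusted(type = "free"))     # step-down, free combinations
summary(warpbreaks.mc, test = adjusted(type = "Shaffer"))  # S2 procedure
summary(warpbreaks.mc, test = adjusted(type = "Westfall")) # uses dependence information
```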

3.3.3 The confint method

So far we have focused on the capabilities of multcomp to provide adjusted p-values for a variety of multiple comparison procedures. In this section we describe the available functionality to compute and plot simultaneous confidence intervals using the confint method. As mentioned in Section 2.1.2, simultaneous confidence intervals are available in closed form for many standard single-step procedures, but they are usually more difficult to derive for stepwise procedures. Consequently, the confint method implements simultaneous confidence intervals for single-step tests in the parametric model framework from Sections 3.1 and 3.2. Note that the confint method is only available for glht objects. Simultaneous confidence intervals for the Bonferroni test, for example, are not directly available but can be computed by specifying the calpha argument; see further below.

Consider again the warpbreaks example analyzed previously in this chapter. We can calculate simultaneous confidence intervals for the Tukey test by applying the confint method to the warpbreaks.mc object from the glht function:

R> warpbreaks.ci <- confint(warpbreaks.mc, level = 0.95)
R> warpbreaks.ci
Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = breaks ~ tension, data = warpbreaks)

Quantile = 2.41
95% family-wise confidence level

Linear Hypotheses:
           Estimate     lwr     upr
M - L == 0  -10.000 -19.559  -0.441
H - L == 0  -14.722 -24.281  -5.164
H - M == 0   -4.722 -14.281   4.836

We conclude from the output that the upper confidence bounds for the pairwise differences µM − µL and µH − µL are negative, indicating that both tension levels M (medium) and H (high) significantly reduce the number of breaks as compared to tension level L (low). Moreover, we cannot conclude at the 95% confidence level that the groups M and H differ significantly. In addition, we can display the confidence intervals graphically using the associated plot method

R> plot(warpbreaks.ci, main = "", ylim = c(0.5, 3.5),
+      xlab = "Breaks")

see Figure 3.2 for the resulting plot. An improved graphical display of the confidence intervals is available with the plot.matchMMC command from the HH package (Heiberger 2009); see Section 4.2.1 for an example of its use.


Figure 3.2 Two-sided simultaneous confidence intervals for the Tukey test in the warpbreaks example.
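The interval bounds shown above are simply estimate ± quantile × standard error. A quick arithmetic check (in Python, for illustration; the quantile u0.95 = 2.4142 and the common standard error 3.96 are rounded values taken from the output and text, so the bounds agree with multcomp only to about two decimal places):

```python
# Two-sided simultaneous CI: estimate +/- quantile * SE.
# quantile and SE are rounded values from the multcomp output above.
quantile = 2.4142  # 95% two-sided quantile of the trivariate t distribution
se = 3.96          # common standard error of each pairwise difference
estimates = {"M - L": -10.000, "H - L": -14.722, "H - M": -4.722}

ci = {name: (est - quantile * se, est + quantile * se)
      for name, est in estimates.items()}
for name, (lwr, upr) in ci.items():
    print(f"{name}: ({lwr:.2f}, {upr:.2f})")
```

For M − L this reproduces (−19.56, −0.44), matching the multcomp output up to rounding.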

Note that unadjusted (marginal) confidence intervals can be computed by specifying calpha = univariate_calpha() to confint. Alternatively, the critical value can be directly specified as a scalar to calpha. In the previous example, the 95% critical value u0.95 = 2.4142 was calculated from the multivariate t distribution using the mvtnorm package. If instead

R> cbon <- qt(1-0.05/6, 51)
R> cbon
[1] 2.48

was specified to calpha, one would obtain the two-sided 95% simultaneous confidence intervals for the Bonferroni test with

R> confint(warpbreaks.mc, calpha = cbon)

Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = breaks ~ tension, data = warpbreaks)

Quantile = 2.48
95% confidence level

Linear Hypotheses:
           Estimate     lwr     upr
M - L == 0  -10.000 -19.804  -0.196
H - L == 0  -14.722 -24.526  -4.919
H - M == 0   -4.722 -14.526   5.081

In case of all pairwise comparisons among several treatments, the multcomp package also provides the option of plotting the results using a compact letter display (Piepho 2004). With this display, treatments that are not significantly different are assigned a common letter. In other words, significantly different treatments have no letters in common. This type of graphical display has advantages when a large number of treatments is being compared with each other, as it summarizes the test results more efficiently than a simple collection of simultaneous confidence intervals. Using multcomp, the cld function extracts the necessary information from glht, summary.glht or confint.glht objects to create a compact letter display of all pairwise comparisons. In case of confint.glht objects, a pairwise comparison is reported significant if the associated simultaneous confidence interval does not contain 0. Otherwise, the associated adjusted p-value is compared with the given significance level α. For the warpbreaks example, we can extract the necessary information from the glht object by calling

R> warpbreaks.cld <- cld(warpbreaks.mc)

Once this information has been extracted, we can use a plot method associated with cld objects to create the compact letter display of all pairwise comparisons. If the fitted model contains any covariates, boxplots for each level are plotted. Otherwise, different types of plots are used, depending on the class of the response variable and the cld object; see the online documentation for further details (Hothorn et al. 2010a).
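The letter assignment itself can be derived with the insert-and-absorb idea of Piepho (2004). The following Python sketch is a hypothetical helper (not the cld implementation): it starts from one column containing all treatments, splits a column whenever it still contains a significantly different pair, and absorbs columns made redundant by larger ones.

```python
def letter_display(treatments, significant_pairs):
    """Minimal insert-and-absorb sketch (Piepho 2004): treatments sharing
    a letter are not significantly different."""
    columns = [set(treatments)]            # start: everyone shares one letter
    for a, b in significant_pairs:
        new_cols = []
        for col in columns:
            if a in col and b in col:      # "insert": split the offending column
                new_cols += [col - {a}, col - {b}]
            else:
                new_cols.append(col)
        # "absorb": drop columns that are proper subsets of another column
        new_cols = [c for c in new_cols
                    if not any(c < d for d in new_cols)]
        columns = []
        for c in new_cols:                 # drop duplicates, keep order
            if c not in columns:
                columns.append(c)
    letters = "abcdefghijklmnopqrstuvwxyz"
    return {t: "".join(letters[k] for k, col in enumerate(columns) if t in col)
            for t in treatments}

# warpbreaks: M - L and H - L are significant, H - M is not
print(letter_display(["L", "M", "H"], [("M", "L"), ("H", "L")]))
# -> {'L': 'a', 'M': 'b', 'H': 'b'}
```

Level L receives its own letter while M and H share one, in line with the test decisions above.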
Figure 3.3 reproduces the boxplots for each of the three tension levels from Figure 3.1 together with the letter display when using the command

R> plot(warpbreaks.cld)

Because tension level L has no letter in common with any other tension level, it is significantly different at the chosen significance level (α = 0.05 by default).



Figure 3.3 Compact letter display for all pairwise comparisons in the warpbreaks example.

Furthermore, the groups M and H do not differ significantly, as they share a common letter. These conclusions are in line with the previous results obtained in Figure 3.2.

CHAPTER 4

Applications

In this chapter we use several applications to illustrate the multiple hypotheses testing framework developed in Chapter 3 for general parametric models. The combination of the methods from Chapter 3 with the closure principle described in Section 2.2.3 leads to powerful stepwise procedures, which form the core methodology for analyzing the examples in this chapter. At the same time we illustrate the capabilities of the multcomp package in R, which provides an efficient implementation of these methods. In Section 4.1 we approach the common problem of comparing several groups with a control and describe the Dunnett test. In Section 4.2 we consider the equally important problem of all pairwise comparisons for several groups and describe the Tukey test. Sections 4.1 and 4.2 both focus on important applications and accordingly we discuss the illustrating examples in detail. In addition, we describe stepwise extensions, which lead to more powerful procedures than the original Dunnett and Tukey tests. The developed methods are very general, hold for any type of comparison within the framework from Chapter 3, and are available in multcomp. In the remainder of this chapter we illustrate the conduct of multiple comparison procedures for the general linear and parametric models from Chapter 3, including a number of applications beyond the common ANOVA or regression setting, as discussed in Hothorn et al. (2008). In Section 4.3 we consider an ANCOVA example of a dose response study with two covariates. Here, we focus on the evaluation of non-pairwise contrast tests as opposed to the preceding two sections, where we were only interested in pairwise comparisons. In Section 4.4 we use a body fat prediction example to illustrate the application of multiple comparison procedures to variable selection in linear regression models. In Section 4.5 we consider the comparison of two linear regression models using simultaneous confidence bands.
In Section 4.6 we look at all pairwise comparisons of expression levels for various genetic conditions of alcoholism in a heteroscedastic one-way ANOVA model using sandwich estimators. In Section 4.7 we use logistic regression to estimate the probability of suffering from Alzheimer’s disease. In Section 4.8 we compare several risk factors for survival of leukemia patients using a Weibull model. Finally, in Section 4.9 we obtain probability estimates of deer browsing for various tree species from mixed-effects models. The examples in this chapter are self-contained. Readers who only glanced through Chapters 2 and 3 should be able to follow the examples and apply the techniques (and, in particular, the relevant function calls using



Figure 4.1 Boxplots of the recovery data.

multcomp) to their own problems. Theoretical results are linked back to previous chapters, where necessary. For details on the multcomp package we refer to Section 3.3.

4.1 Multiple comparisons with a control

In this section we consider the problem of comparing several groups with a common control group in an unbalanced one-way layout. In Section 4.1.1 we introduce the well-known Dunnett test, which is the standard method in this situation. In Section 4.1.2 we then consider a stepwise extension of the Dunnett test based on the closure principle, which is more powerful than the original Dunnett test.

4.1.1 Dunnett test

To illustrate the Dunnett test we consider the recovery data from Westfall et al. (1999). A company developed specialized heating blankets designed to help the body heat following a surgical procedure. Four types of blankets b0, b1, b2, and b3 were tested on surgical patients to assess recovery times. The blanket b0 was a standard blanket already in use at various hospitals. The primary outcome of interest was recovery time in minutes of patients allocated randomly to one of the four treatments. Lower recovery times would indicate a better treatment effect. The recovery dataset is available from the multcomp package,

R> data("recovery", package = "multcomp")
R> summary(recovery)

 blanket    minutes    
 b0:20   Min.   : 5.0  
 b1: 3   1st Qu.:12.0  
 b2: 3   Median :13.0  
 b3:15   Mean   :13.5  
         3rd Qu.:16.0  
         Max.   :19.0  

As seen from the summary above, the group sample sizes differ considerably: 20 patients received blanket b0, 3 patients received blanket b1, another 3 patients received blanket b2, and 15 patients received blanket b3. Figure 4.1 displays the boxplots for the recovery data. We conclude from the boxplots that the observations are approximately normally distributed with equal group variances. Blanket b2 seems to reduce the mean recovery time as compared to the standard blanket b0, but we want to make this claim while controlling the familywise error rate with a suitable multiple comparison procedure. To analyze the data more formally we assume the unbalanced one-way layout

yij = γ + µi + εij   (4.1)

with independent, homoscedastic and normally distributed residual errors εij ∼ N(0, σ²). In Equation (4.1), yij denotes the j-th observation in treatment group i, j = 1, . . . , ni, where ni denotes the sample size of group i, γ denotes the intercept (i.e., the overall mean), and µi denotes the mean effect of treatment group i = 0, . . . , 3. Note that model (4.1) is a special case of the

general linear model (3.1), where

\[
y = \begin{pmatrix} 15 \\ 13 \\ \vdots \\ 12 \\ 13 \\ \vdots \\ 9 \\ 14 \\ \vdots \\ 13 \end{pmatrix}, \quad
X = \begin{pmatrix}
1 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 0 & 0 & 0 & 1
\end{pmatrix}, \quad \text{and} \quad
\beta = \begin{pmatrix} \gamma \\ \mu_0 \\ \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix}.
\]

The natural question of this study is whether any of the blanket types b1, b2, or b3 significantly reduces the recovery time compared with b0. This is the classical many-to-one problem of comparing several treatments with a control. Thus, we are interested in testing the three one-sided null hypotheses

Hi : µ0 ≤ µi, i = 1, 2, 3.

The null hypothesis Hi therefore states that the mean recovery time for blanket b0 is less than or equal to that of blanket bi. Accordingly, the alternative hypotheses are given by

Ki : µ0 > µi, i = 1, 2, 3.

Rejecting any of the three null hypotheses Hi thus ensures that at least one of the new blankets is better than the standard b0 at a given confidence level 1 − α, if suitable multiple comparison procedures are employed. The standard multiple comparison procedure to address the many-to-one problem is the Dunnett (1955) test. In essence, the one-sided Dunnett test takes the minimum (or the maximum, depending on the sidedness of the test problem) of the m, say, pairwise t tests

\[ t_i = \frac{\bar{y}_i - \bar{y}_0}{s\sqrt{\frac{1}{n_i} + \frac{1}{n_0}}}, \quad i = 1, \ldots, m, \qquad (4.2) \]

where ȳi = Σ_{j=1}^{ni} yij/ni denotes the sample mean of group i = 0, . . . , m, and s² = Σ_{i=0}^{m} Σ_{j=1}^{ni} (yij − ȳi)²/ν denotes the pooled variance estimate with ν = Σ_{i=0}^{m} ni − (m + 1) degrees of freedom. Note that these are the standard expressions for one-way ANOVA models, which are special cases of the more general expressions (3.3) and (3.4). In the recovery data example, we have m = 3 treatment-control comparisons, leading to three test statistics of the form (4.2). As immediately seen from expression (4.2), each test statistic ti is univariate t distributed. The vector of test statistics t = (t1, . . . , tm) follows an m-variate t distribution with ν degrees of freedom and correlation matrix R = (ρij)ij,

© 2011 by Taylor and Francis Group, LLC MULTIPLE COMPARISONS WITH A CONTROL 73 where for i 6= j r ni r nj ρij = , i, j = 1, . . . , m. (4.3) ni + n0 nj + n0

In the balanced case, n0 = n1 = . . . = nm and the correlations are constant, ρij = 0.5 for all i ≠ j. As discussed in Chapter 3, either multidimensional integration routines, such as those from Genz and Bretz (2009), or user-friendly interfaces on top of such routines, such as the multcomp package in R, can be used to calculate adjusted p-values or critical values. In the following we show how to analyze the recovery data with multcomp. We start fitting an ANOVA model by calling the aov function

R> recovery.aov <- aov(minutes ~ blanket, data = recovery)

The glht function from multcomp takes the fitted response model to perform the multiple comparisons through

R> library("multcomp")
R> recovery.mc <- glht(recovery.aov,
+     linfct = mcp(blanket = "Dunnett"),
+     alternative = "less")

In the previous call, we used the mcp function for the linfct argument to specify the comparisons type (i.e., the contrast matrix) we are interested in. The syntax is almost self-descriptive: Specify the factor of relevance (blanket in our example) and select one of several pre-defined contrast matrices; see the applications discussed in the subsequent sections of this chapter for other examples of pre-defined comparison types. Because we have a one-sided test problem and we are interested in showing a reduction in recovery time, we have to pass the alternative = "less" argument to glht. We obtain a detailed summary of the results by using the summary method associated with the glht function,

R> summary(recovery.mc)

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Dunnett Contrasts

Fit: aov(formula = minutes ~ blanket, data = recovery)

Linear Hypotheses:
             Estimate Std. Error t value Pr(<t)    
b1 - b0 >= 0   -2.133      1.604   -1.33  0.241    
b2 - b0 >= 0   -7.467      1.604   -4.66 <0.001 ***
b3 - b0 >= 0   -1.667      0.885   -1.88  0.092 .  
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

The main part of the output consists of a table with three rows, one for each of the m = 3 hypotheses. From left to right we have a short descriptor of the comparisons, the effect estimates with associated standard errors and the test statistics, as defined in Equation (4.2). Multiplicity adjusted p-values are reported in the last column. By default, these p-values are calculated from the underlying multivariate t distribution (thus accounting for the correlations between the test statistics) and can be compared directly with the pre-specified significance level α. For the recovery example we conclude at level α = 0.05 that blanket b2 leads to significantly lower recovery times as compared with the standard blanket b0. For the purpose of illustration, we compare the Dunnett test with the standard Bonferroni approach. Recall that the Bonferroni approach does not account for the correlations between the test statistics (Section 2.3.1) and is thus less powerful than the Dunnett test. With the multcomp package we can apply the Bonferroni approach by using the adjusted option from the summary method,

R> summary(recovery.mc, test = adjusted(type = "bonferroni"))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Dunnett Contrasts

Fit: aov(formula = minutes ~ blanket, data = recovery)

Linear Hypotheses:
             Estimate Std. Error t value Pr(<t)    
b1 - b0 >= 0   -2.133      1.604   -1.33    0.29    
b2 - b0 >= 0   -7.467      1.604   -4.66 6.1e-05 ***
b3 - b0 >= 0   -1.667      0.885   -1.88    0.10    
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- bonferroni method)

As expected, the adjusted p-values for the Dunnett test are uniformly smaller than those from the Bonferroni adjustment. The differences in p-values are not that large in this example because of the relatively small correlations ρ12 = 0.13, ρ13 = 0.236, and ρ23 = 0.236; see Equation (4.3). Next, we compute 95% one-sided simultaneous confidence intervals for the mean effect differences µi − µ0, i = 1, 2, 3. It follows from Section 3.1.1 that for the Dunnett test these are given by

\[ \left( -\infty;\; \bar{y}_i - \bar{y}_0 + u_{1-\alpha}\, s \sqrt{\frac{1}{n_i} + \frac{1}{n_0}} \right], \]

where u1−α denotes the (1 − α)-quantile from the multivariate t distribution with the correlations (4.3). Alternatively, we can use the confint method associated with the glht function,

R> recovery.ci <- confint(recovery.mc, level = 0.95)
R> recovery.ci

Simultaneous Confidence Intervals

Multiple Comparisons of Means: Dunnett Contrasts

Fit: aov(formula = minutes ~ blanket, data = recovery)

Quantile = 2.18
95% family-wise confidence level

Linear Hypotheses:
             Estimate    lwr    upr
b1 - b0 >= 0   -2.133   -Inf  1.367
b2 - b0 >= 0   -7.467   -Inf -3.966
b3 - b0 >= 0   -1.667   -Inf  0.265

We conclude from the output that only the upper confidence bound for µ2 − µ0 is negative, reflecting the previous test decision that blanket b2 leads to a significant reduction in recovery time compared with the standard blanket b0. Moreover, we conclude at the designated confidence level of 95% that the mean recovery time for b2 is at least 4 minutes shorter than for b0. In addition, we can display the confidence intervals graphically with

R> plot(recovery.ci, main = "", ylim = c(0.5, 3.5),
+      xlab = "Minutes")

see Figure 4.2 for the resulting plot. As explained in Section 3.3.1, there are several ways to specify the comparisons of interest. A convenient way is to call pre-defined contrast matrices, as done above when selecting the mcp(blanket = "Dunnett") option for the linfct argument. Alternatively, we can directly specify the contrast matrix C reflecting the comparisons of interest. The contrast matrix associated with the many-to-one comparisons for the levels of the factor blanket is given by

\[ C = \begin{pmatrix} -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix}, \]

such that

\[ C \begin{pmatrix} \mu_0 \\ \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix} = \begin{pmatrix} \mu_1 - \mu_0 \\ \mu_2 - \mu_0 \\ \mu_3 - \mu_0 \end{pmatrix} \]

leads to the pairwise comparisons of interest. Using multcomp, we first define the contrast matrix manually and then call the glht function,



Figure 4.2 One-sided simultaneous confidence intervals for the Dunnett test in the recovery example.

R> contr <- rbind("b1 -b0" = c(-1, 1, 0, 0),
+                 "b2 -b0" = c(-1, 0, 1, 0),
+                 "b3 -b0" = c(-1, 0, 0, 1))
R> summary(glht(recovery.aov, linfct = mcp(blanket = contr),
+     alternative = "less"))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: User-defined Contrasts

Fit: aov(formula = minutes ~ blanket, data = recovery)

Linear Hypotheses:
            Estimate Std. Error t value Pr(<t)    
b1 -b0 >= 0   -2.133      1.604   -1.33  0.241    
b2 -b0 >= 0   -7.467      1.604   -4.66 <0.001 ***
b3 -b0 >= 0   -1.667      0.885   -1.88  0.093 .  
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

As expected, we obtain the same adjusted p-values as in the previous call with the mcp(blanket = "Dunnett") option for the linfct argument. The advantage of defining the contrast matrix manually is the retained

flexibility to perform multiple comparisons which are less often applied in practice and thus not pre-defined in multcomp. To illustrate this advantage, consider the following modification of the recovery example. Suppose that both blankets b0 and b1 are standard treatments. Assume further that we are interested in comparing the two new blankets b2 and b3 with both b0 and b1. The original Dunnett test is not applicable, because it assumes only a single control treatment. Comparing several treatments with more than one control group was investigated by Solorzano and Spurrier (1999) and Spurrier and Solorzano (2004). With multcomp we can simply specify the comparison type ourselves and run the glht function as done above for the Dunnett test. To illustrate this, assume that we want to compare each of the new blankets b2 and b3 with both b0 and b1, resulting in a total of m = 4 comparisons. Accordingly,

R> contr2 <- rbind("b2 -b0" = c(-1, 0, 1, 0),
+                  "b2 -b1" = c( 0, -1, 1, 0),
+                  "b3 -b0" = c(-1, 0, 0, 1),
+                  "b3 -b1" = c( 0, -1, 0, 1))

defines the contrast matrix of interest and we pass contr2 to mcp,

R> summary(glht(recovery.aov, linfct = mcp(blanket = contr2),
+     alternative = "less"))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: User-defined Contrasts

Fit: aov(formula = minutes ~ blanket, data = recovery)

Linear Hypotheses:
            Estimate Std. Error t value Pr(<t)    
b2 -b0 >= 0   -7.467      1.604   -4.66 <0.001 ***
b2 -b1 >= 0   -5.333      2.115   -2.52  0.027 *  
b3 -b0 >= 0   -1.667      0.885   -1.88  0.105    
b3 -b1 >= 0    0.467      1.638    0.28  0.915    
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

The output gives the correct results for this multiple comparison problem and we conclude that blanket b2 is better than both b0 and b1.
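Contrast matrices of this kind follow a simple pattern (+1 for the treatment, −1 for the control, 0 elsewhere), so they can also be generated programmatically. The following hypothetical Python helper mirrors the structure of contr2 above; the name is made up and it is not a multcomp function:

```python
def treatment_vs_controls(levels, treatments, controls):
    """Build contrast rows comparing each treatment with each control:
    +1 on the treatment, -1 on the control, 0 elsewhere."""
    rows = {}
    for t in treatments:
        for c in controls:
            row = [0] * len(levels)
            row[levels.index(t)] = 1
            row[levels.index(c)] = -1
            rows[f"{t} - {c}"] = row
    return rows

contr2 = treatment_vs_controls(["b0", "b1", "b2", "b3"],
                               treatments=["b2", "b3"],
                               controls=["b0", "b1"])
print(contr2["b2 - b0"])  # -> [-1, 0, 1, 0]
```

Generating the rows programmatically avoids transcription errors when the number of comparisons grows.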

4.1.2 Step-down Dunnett test procedure

In Section 4.1.1 we considered the original Dunnett test, which is a single-step test. As explained in Section 2.1.2, stepwise extensions of single-step multiple comparison procedures are often available and lead to more powerful methods

in the sense that they reject at least as many hypotheses as their single-step counterparts. Step-down extensions of the original Dunnett test were investigated by Naik (1975); Marcus et al. (1976); Dunnett and Tamhane (1991) and others. Step-up versions of the Dunnett test are also available (Dunnett and Tamhane 1991) but will not be considered here. In the following, we apply the closure principle described in Section 2.2.3 to the many-to-one comparison problem, give a general step-down testing algorithm based on max-t tests (which includes the Dunnett test as a special case) and then come back to the recovery data example to illustrate the step-down Dunnett test, which is also available in the multcomp package. To make the ideas concrete, recall the m = 3 null hypotheses Hi : µ0 ≤ µi, i = 1, 2, 3, in the recovery example from Section 4.1.1. Following the closure principle, we construct from the set H = {H1, H2, H3} of elementary null hypotheses the full closure H̄ = {H1, H2, H3, H12, H13, H23, H123} of all intersection hypotheses HI = ∩_{i∈I} Hi, I ⊆ {1, 2, 3}. The closure principle demands that we reject an elementary null hypothesis Hi, i = 1, 2, 3, only if all intersection hypotheses HI ∈ H̄ with i ∈ I are rejected by their local α-level tests. For example, in order to reject H1, we need to ensure that all intersection hypotheses contained in H1 are also rejected (i.e., H12, H13, and H123). This induces a natural top-down testing order; see also Figure 4.3 for a visualization of the resulting closed test procedure. We start with testing H123. If we do not reject H123, none of the elementary hypotheses H1, H2, H3 can be rejected and the procedure stops. Otherwise, we reject H123 and continue testing one level down, that is, all pairwise intersection hypotheses H12, H13, H23; and so on. We have not mentioned before which tests to use for the intersection hypotheses.
Consider in more detail the global intersection hypothesis

H123 = H1 ∩ H2 ∩ H3 : µ0 ≤ µ1 and µ0 ≤ µ2 and µ0 ≤ µ3,   (4.4)

which states that the blanket types b1, b2, and b3 are all inferior to the standard blanket b0. If at least one of the new blankets is superior to b0 and reduces the recovery time, the null hypothesis H123 would no longer be true and we would like to reject it with high probability. This is the union-intersection setting discussed in Section 2.2.1. A natural approach is to consider the minimum of the m standardized pairwise differences ti from Equation (4.2), leading to max-t tests of the form (2.1); recall that we keep the common term “max-t” regardless of the sidedness of the test problem. Thus, we can use the single-step Dunnett test (which, of course, is a max-t test) for the global intersection hypothesis H123. Similarly, we can apply the Dunnett test for any of the pairwise intersection hypotheses H12, H13, and H23 while only accounting for the pair of treatment groups involved in the respective intersection. Finally, the elementary hypotheses H1, H2, and H3 are tested using the t tests from Equation (4.2). This leads to a closed test procedure which uses Dunnett tests for each intersection hypothesis (adjusted for the number of treatments entering the corresponding intersection).
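For m = 3 hypotheses the closure is small enough to enumerate directly. A Python sketch of the bookkeeping (illustration only):

```python
from itertools import combinations

def closure(m):
    """All non-empty index sets I of the closed test procedure for m
    elementary hypotheses (2^m - 1 intersection hypotheses in total)."""
    return [set(I) for r in range(1, m + 1)
            for I in combinations(range(1, m + 1), r)]

intersections = closure(3)
print(len(intersections))  # -> 7

# H1 may be rejected only if every H_I with 1 in I is rejected locally:
needed = [sorted(I) for I in intersections if 1 in I]
print(needed)  # -> [[1], [1, 2], [1, 3], [1, 2, 3]]
```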

As mentioned in Section 2.2.1, however, a particularly appealing property of max-t tests is that they admit shortcuts of the full closure, significantly reducing the complexity. In other words, if m null hypotheses satisfying the free combination condition (which is true for many-to-one comparison problems; see Section 2.1.2) are tested with max-t tests, shortcuts of size m can be applied. This avoids testing the 2^m − 1 intersection hypotheses in the full closure. Below we give a general step-down algorithm to test m hypotheses under the free combination condition, assuming that larger values of ti favour the rejection of Hi. For two-sided and lower-sided test problems the arg max_i t_i operations have to be modified accordingly.

Step-down algorithm based on max-t tests under the free combination condition

Step 1: Test the global intersection hypothesis H_{I_1} = ∩_{i∈I_1} H_i, I_1 = {1, . . . , m}, with a suitable max-t test, resulting in the p-value p_{I_1}; if p_{I_1} ≤ α, reject H_{i_1} with adjusted p-value q_{i_1} = p_{I_1} and proceed, where i_1 = arg max_{i∈I_1} t_i; otherwise stop.

Step 2: Let I_2 = I_1 \ {i_1}. Test H_{I_2} = ∩_{i∈I_2} H_i with a suitable max-t test, resulting in the p-value p_{I_2}; if p_{I_2} ≤ α, reject H_{i_2} with adjusted p-value q_{i_2} = max{q_{i_1}, p_{I_2}} and proceed, where i_2 = arg max_{i∈I_2} t_i; otherwise stop.

. . .

Step j: Let I_j = I_{j−1} \ {i_{j−1}}. Test H_{I_j} = ∩_{i∈I_j} H_i with a suitable max-t test, resulting in the p-value p_{I_j}; if p_{I_j} ≤ α, reject H_{i_j} with adjusted p-value q_{i_j} = max{q_{i_{j−1}}, p_{I_j}} and proceed, where i_j = arg max_{i∈I_j} t_i; otherwise stop.

. . .

Step m: Let I_m = I_{m−1} \ {i_{m−1}} = {i_m}. Test H_{i_m} with a t test, resulting in the p-value p_{i_m}; if p_{i_m} ≤ α, reject H_{i_m} with adjusted p-value q_{i_m} = max{q_{i_{m−1}}, p_{i_m}}; the procedure stops in any case.
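The algorithm translates almost line by line into code. The Python sketch below is for illustration only (multcomp's implementation works on the multivariate t distribution directly); it returns adjusted p-values and, when instantiated with Bonferroni local tests, yields Holm-type results. The unadjusted p-values used in the example are rounded values from the recovery data, so the output differs from multcomp in the last digit.

```python
def step_down(t_obs, local_p):
    """Step-down algorithm above, expressed via adjusted p-values.
    t_obs[i]: observed statistic of H_i (larger values favour rejection);
    local_p(I): p-value of a suitable max-t test of the intersection of
    the H_i, i in I.  Assumes the free combination condition."""
    remaining = list(range(len(t_obs)))
    q = [0.0] * len(t_obs)
    q_prev = 0.0
    while remaining:
        i = max(remaining, key=lambda k: t_obs[k])   # arg max_{i in I_j} t_i
        q_prev = max(q_prev, local_p(remaining))     # enforce monotonicity
        q[i] = q_prev
        remaining.remove(i)
    return q

# Bonferroni local tests turn this into a Holm-type procedure.  Rounded
# unadjusted p-values and |t| statistics from the recovery example:
p = [0.096, 0.001, 0.034]
holm = step_down([1.33, 4.66, 1.88],
                 lambda I: min(1.0, len(I) * min(p[j] for j in I)))
print([round(x, 3) for x in holm])  # -> [0.096, 0.003, 0.068]
```

Replacing the Bonferroni local test by a Dunnett test of the corresponding sub-family gives the step-down Dunnett procedure discussed next.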

The Holm procedure described in Section 2.3.2 is a well-known example of this algorithm, because it repeatedly applies Bonferroni’s inequality in at most m steps (recall that the Bonferroni method is also a max-t test). However, because the Bonferroni method does not account for the correlations between the test statistics, the Holm procedure can be improved. By applying Dunnett tests at each step for the many-to-one comparison problem of the recovery example, we obtain the powerful step-down Dunnett procedure. To illustrate the step-down Dunnett test, consider Figure 4.3, which shows

the closed representation of the Dunnett tests for the recovery example. The observed one-sided t statistics shown at the bottom level for the comparisons with the control are t1^obs = −1.33, t2^obs = −4.656, and t3^obs = −1.884, respectively for the b1 - b0, b2 - b0, and b3 - b0 comparisons. At the top level, the global intersection hypothesis (4.4) is H_{I_1} in Step 1 of the step-down algorithm, where I_1 = {1, 2, 3}. We test H123 using t123^obs = −4.656, which is highly significant with p123 = P(min{t1, t2, t3} ≤ −4.656) < 0.001. This allows one to use a shortcut for the rest of the tree for tests that include the b2 - b0 comparison: It follows from P(min{t1, t2, t3} ≤ −4.656) < 0.001 that P(min{t1, t2} ≤ −4.656) < 0.001, P(min{t2, t3} ≤ −4.656) < 0.001, and P(t2 ≤ −4.656) < 0.001. Hence, all intersection hypotheses including the b2 - b0 comparison can be rejected by virtue of rejecting the global null hypothesis. Thus, the b2 - b0 comparison can itself be deemed significant. This logic explains Step 1 of the step-down algorithm above. The remaining steps are explained similarly. In the notation from the step-down algorithm, i_1 = 2 and consequently I_2 = I_1 \ {2} = {1, 3}. Consider Figure 4.3 again. The remaining non-rejected pairwise intersection hypothesis is H13 : µ0 ≤ µ1 and µ0 ≤ µ3. The related test statistic is t13^obs = −1.884 with p13 = P(min{t1, t3} ≤ −1.884) = 0.064. Thus, we cannot reject H13 at significance level α = 0.05, which renders further testing unnecessary. This logic explains Step 2 of the algorithm, and, because we cannot reject at this step, we have to terminate the step-down procedure. In summary, we conclude from this example that step-down procedures are special cases of closed test procedures and that the use of max-t tests allows for shortcuts to the full closure, where only the subsets corresponding to the ordered test statistics need to be tested.
Note that the step-down algorithm presented here is valid under the free combination condition; for restricted hypotheses we refer to the discussion in Section 4.2.2. We conclude this section by showing how to invoke the step-down Dunnett test with the multcomp package using the recovery data. The step-down algorithm for hypotheses satisfying the free combination condition is available with the adjusted(type = "free") option in the summary method

R> summary(recovery.mc, test = adjusted(type = "free"))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Dunnett Contrasts

Fit: aov(formula = minutes ~ blanket, data = recovery)

Linear Hypotheses:
             Estimate Std. Error t value Pr(<t)    
b1 - b0 >= 0   -2.133      1.604   -1.33  0.096 .  
b2 - b0 >= 0   -7.467      1.604   -4.66  6e-05 ***
b3 - b0 >= 0   -1.667      0.885   -1.88  0.064 .  
---


[Diagram: three-level closure tree with arrows from each intersection hypothesis to the hypotheses it contains. Top: H123 (t123^obs = −4.656). Middle: H12 (t12^obs = −4.656), H13 (t13^obs = −1.884), H23 (t23^obs = −4.656). Bottom: H1 (t1^obs = −1.33), H2 (t2^obs = −4.656), H3 (t3^obs = −1.884).]

Figure 4.3 Closed Dunnett test in the recovery example.

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- free method)

The results from the output above match the results from the previous discussion. In particular, the step-down Dunnett test provides substantially smaller adjusted p-values than the single-step Dunnett test from Section 4.1.1; see also Table 4.1 for a side-by-side comparison. In Table 4.1 we have also included for illustration purposes the unadjusted p-values pi as well as the adjusted p-values from the Bonferroni test and the Holm procedure

R> summary(recovery.mc, test = adjusted(type = "holm"))

Both step-down procedures (Holm, step-down Dunnett) are clearly more powerful than their single-step counterparts (Bonferroni, Dunnett). In addition, we conclude from Table 4.1 that methods accounting for correlations between the test statistics (Dunnett, step-down Dunnett) are more powerful than methods which do not (Bonferroni, Holm). As explained in Section 4.1.1, the correlations are fairly small in this example and thus the Bonferroni-based procedures behave reasonably well. Finally, note that simultaneous confidence intervals for stepwise procedures are not implemented in multcomp, since they are typically hard to compute, if available at all. Simultaneous confidence intervals have been investigated by Bofinger (1987) for the step-down Dunnett procedure and by Strassburger and Bretz (2008) and Guilbaud (2008) for the Holm procedure.


                     Adjusted p-values qi
Comparison  pi     Bonferroni  Dunnett  Holm   step-down Dunnett
b1 - b0     0.096  0.287       0.241    0.096  0.096
b2 - b0     0.001  0.001       0.001    0.001  0.001
b3 - b0     0.034  0.101       0.092    0.067  0.064

Table 4.1 Comparison of several multiple comparison procedures for the recovery data.

4.2 All pairwise comparisons

In this section we consider all pairwise comparisons of several means in a two-way layout. In Section 4.2.1 we introduce the well-known Tukey test, which is the standard procedure in this situation. In Section 4.2.2 we consider a stepwise extension based on the closure principle, which is more powerful than the original Tukey test.

4.2.1 Tukey test

To illustrate the Tukey test we consider the immer data from Venables and Ripley (2002) describing a field experiment on barley yields. Five varieties of barley were grown in six locations in both 1931 and 1932. Following Venables and Ripley (2002), we average the results for the two years. To analyze the data we consider the two-way layout

yij = γ + µi + αj + εij    (4.5)

with independent, homoscedastic, and normally distributed residual errors εij ∼ N(0, σ²). In Equation (4.5), yij denotes the average barley yield for variety i at location j, γ the intercept (i.e., the overall mean), µi the mean effect of variety i = 1, . . . , 5, and αj the mean effect of location j = 1, . . . , 6. Note that model (4.5) is a special case of the general linear model (3.1), with response vector and design matrix

y = (109.35, 77.30, . . . , 123.50, 100.25, . . . , 131.45)',

X = | 1 1 0 ... 0 1 0 ... 0 |
    | 1 1 0 ... 0 0 1 ... 0 |
    | .                   . |
    | 1 1 0 ... 0 0 0 ... 1 |
    | 1 0 1 ... 0 1 0 ... 0 |
    | .                   . |
    | 1 0 0 ... 1 0 0 ... 1 |

where the first column of X corresponds to the intercept, the next five columns to the variety indicators, and the last six columns to the location indicators,

and β = (γ, µ1, . . . , µ5, α1, . . . , α6)'. The immer data are available with the MASS package (Venables and Ripley 2002). Consequently, we can run a standard analysis of variance using the aov function as follows:

R> data("immer", package = "MASS")
R> immer.aov <- aov((Y1 + Y2)/2 ~ Var + Loc, data = immer)
R> summary(immer.aov)
            Df Sum Sq Mean Sq F value  Pr(>F)
Var          4   2655     664    5.99  0.0025 **
Loc          5  10610    2122   19.15 5.2e-07 ***
Residuals   20   2217     111
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The ANOVA F tests indicate that both Var and Loc have a significant effect on the average barley yield. Following Venables and Ripley (2002), we are interested in comparing the k = 5 varieties with each other while averaging the results for the two years, resulting in k(k−1)/2 = 10 pairwise comparisons. We can retrieve the mean yields for the five varieties by a call to model.tables

R> model.tables(immer.aov, type = "means")$tables$Var
Var
    M     P     S     T     V
 94.4 102.5  91.1 118.2  99.2

This leads to the classical all-pairwise comparison problem. We are thus interested in testing the 10 two-sided null hypotheses

Hij : µi = µj, i, j ∈ {M, P, S, T, V}, i ≠ j.

The null hypothesis Hij indicates that the mean yield in barley does not differ for the varieties i and j. Accordingly, the alternative hypotheses are given by

Kij : µi ≠ µj, i, j ∈ {M, P, S, T, V}, i ≠ j.

Rejecting any of the 10 null hypotheses Hij ensures that at least two of the five varieties differ with respect to their average yield. We can make this claim at a given confidence level 1 − α if we employ suitable multiple comparison procedures. The standard multiple comparison procedure to address the all-pairwise comparison problem is the Tukey (1953) test, also known as the studentized range test. Note that some textbooks suggest performing the Tukey test only after observing a significant ANOVA F test result. We recommend using the methods described below, regardless of whether or not the ANOVA F test is significant.

In essence, the Tukey test takes the maximum over the absolute values of all pairwise test statistics, which in the completely randomized layout of the immer example take the form

tij = (ȳi − ȳj) / (s √(2/n)).    (4.6)

In the last expression, ȳi denotes the least squares mean estimate for variety i (which coincides with the usual arithmetic mean estimate in this case), s the pooled standard deviation, and n the common sample size, that is, n = 6 as

we have six locations and one observation for each combination of variety and location; see Equations (3.3) and (3.4) for the more general expressions. Each test statistic tij is univariate t distributed. The vector of the test statistics follows a multivariate t distribution.

Until the emergence of modern computing facilities, the exact calculation of critical values for the Tukey test was only possible in limited cases. One example is the balanced independent one-way layout. Hayter (1984) proved analytically that in cases where the covariance matrix of the mean estimates is diagonal (which includes the unbalanced one-way layout), using the critical values from the balanced case is conservative. In fact, it has been conjectured that using the balanced critical points will always be conservative. This was proved for k = 3 (Brown 1984) and in a more general context by Hayter (1989). However, efficient numerical integration methods, such as those from Bretz, Hayter, and Genz (2001) and Genz and Bretz (2009), or user-friendly interfaces on top of such routines, such as the multcomp package in R, can be used to calculate adjusted p-values or critical values without restrictions.

In the following we show how to analyze the immer data with multcomp. For details on the multcomp package we refer to Section 3.3. We use the fitted ANOVA model immer.aov and apply the glht function to perform the multiple comparisons through

R> immer.mc <- glht(immer.aov, linfct = mcp(Var = "Tukey"))

In the previous call, we used the mcp function for the linfct argument to specify the comparison type (i.e., the contrast matrix) that we are interested in. The syntax is almost self-descriptive: Specify the factor of relevance and select one of several pre-defined contrast matrices (Var = "Tukey" in our example).
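As a quick plausibility check of Equation (4.6), the pairwise t statistics can be computed directly from the rounded variety means and the residual mean square 111 read off the aov output above. This is only an illustrative sketch (in Python rather than R); it does not replace the multivariate t computation that glht performs for the adjusted p-values:

```python
import math
from itertools import combinations

# variety means and error mean square, read off the aov output above
means = {"M": 94.4, "P": 102.5, "S": 91.1, "T": 118.2, "V": 99.2}
mse, n = 111.0, 6                  # residual mean square, common sample size

s = math.sqrt(mse)                 # pooled standard deviation
se = s * math.sqrt(2.0 / n)        # common standard error of a difference

# t_ij = (ybar_i - ybar_j) / (s * sqrt(2/n)), Equation (4.6)
t = {(i, j): (means[i] - means[j]) / se
     for i, j in combinations(means, 2)}
```

With these numbers, se is about 6.08 and t for the M−T difference about −3.91, matching (up to rounding) the Std. Error and t value columns in the summary output of this section.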
We obtain a detailed summary of the results by using the summary method associated with the glht function,

R> summary(immer.mc)

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = (Y1 + Y2)/2 ~ Var + Loc, data = immer)

Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
P - M == 0     8.15       6.08    1.34   0.6701
S - M == 0    -3.26       6.08   -0.54   0.9824
T - M == 0    23.81       6.08    3.92   0.0067 **
V - M == 0     4.79       6.08    0.79   0.9310
S - P == 0   -11.41       6.08   -1.88   0.3607
T - P == 0    15.66       6.08    2.58   0.1132
V - P == 0    -3.36       6.08   -0.55   0.9803


T - S == 0    27.07       6.08    4.45   0.0020 **
V - S == 0     8.05       6.08    1.32   0.6798
V - T == 0   -19.02       6.08   -3.13   0.0377 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

The main part of the output consists of a table with 10 rows, one for each pairwise comparison. From left to right we have a short descriptor of the comparisons, the effect estimates with associated standard errors, and the test statistics, as defined in Equation (4.6). Multiplicity adjusted p-values are reported in the last column. By default, these p-values are calculated from the underlying multivariate t distribution (thus accounting for the correlations between the test statistics) and can be compared directly with the pre-specified significance level α. For the immer example we conclude at level α = 0.05 that the pairwise comparisons T-M, T-S, and V-T are all significantly different.

As an aside, we point out that the immer example can also be analyzed using the TukeyHSD function from the stats package. Calling

R> immer.mc2 <- TukeyHSD(immer.aov, which = "Var")
R> immer.mc2$Var
       diff    lwr    upr   p adj
P-M    8.15 -10.04 26.338 0.67006
S-M   -3.26 -21.45 14.929 0.98242
T-M   23.81   5.62 41.996 0.00678
V-M    4.79 -13.40 22.979 0.93102
S-P  -11.41 -29.60  6.779 0.36068
T-P   15.66  -2.53 33.846 0.11322
V-P   -3.36 -21.55 14.829 0.98035
T-S   27.07   8.88 45.254 0.00203
V-S    8.05 -10.14 26.238 0.67982
V-T  -19.02 -37.20 -0.829 0.03771

we conclude that the results obtained with glht are the same as those from TukeyHSD. Note however that the TukeyHSD function is specifically designed to perform the Tukey test for models with one-way structures (Hsu 1996, Section 7.1). Consequently, the TukeyHSD function lacks the flexibility of the multcomp package, which allows one to perform the Tukey test (and other types of contrast tests) for the general parametric models described in Chapter 3.
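The lwr and upr columns in the TukeyHSD output are, in this balanced layout, essentially the estimate plus or minus a common half-width u · s · √(2/n), where u ≈ 2.99 is the simultaneous quantile reported later in this section. A quick numeric sketch (Python for illustration; the mean square 111 and the quantile are read off the R output):

```python
import math

mse, n, u = 111.0, 6, 2.99      # error mean square, group size, quantile
se = math.sqrt(mse) * math.sqrt(2.0 / n)
half_width = u * se             # common half-width u * s * sqrt(2/n)

# P - M comparison: estimate 8.15, interval roughly (-10.04, 26.34)
lwr, upr = 8.15 - half_width, 8.15 + half_width
```

The half-width comes out near 18.19, reproducing the P-M interval (−10.04, 26.34) up to rounding.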
For example, the previous call to glht can easily be changed to

R> glht(immer.aov, linfct = mcp(Var = "Tukey"),
+      alternative = "greater")

in order to perform the one-sided studentized range test investigated by Hayter (1990), which is a one-sided version of the Tukey test with potential application to dose response analyses. We now consider the computation of 95% two-sided simultaneous confidence

intervals for the mean effect differences µi − µj. It follows from Equation (4.6) that in the completely randomized layout of the immer example the simultaneous confidence intervals for the Tukey test are given by

[ ȳi − ȳj − u1−α s √(2/n) ;  ȳi − ȳj + u1−α s √(2/n) ],

where u1−α denotes the (1 − α)-quantile of the multivariate t distribution. Alternatively, we can use the confint method associated with the glht function,

R> immer.ci <- confint(immer.mc, level = 0.95)
R> immer.ci

Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = (Y1 + Y2)/2 ~ Var + Loc, data = immer)

Quantile = 2.99
95% family-wise confidence level

Linear Hypotheses:
           Estimate     lwr     upr
P - M == 0    8.150 -10.038  26.338
S - M == 0   -3.258 -21.446  14.930
T - M == 0   23.808   5.620  41.996
V - M == 0    4.792 -13.396  22.980
S - P == 0  -11.408 -29.596   6.780
T - P == 0   15.658  -2.530  33.846
V - P == 0   -3.358 -21.546  14.830
T - S == 0   27.067   8.879  45.255
V - S == 0    8.050 -10.138  26.238
V - T == 0  -19.017 -37.205  -0.829

We conclude from the output that only the confidence intervals for the T-M, T-S, and V-T comparisons exclude 0, which is consistent with the previously established test results. Moreover, we conclude at the designated confidence level of 95% that the average barley yield for variety T is larger than the yield for the varieties M, S, and V. The critical value is u0.95 = 2.993. In addition, we can display the confidence intervals graphically with

R> plot(immer.ci, main = "", xlab = "Yield")

see Figure 4.4 for the resulting plot. The plot.matchMMC method from the HH package orders the confidence intervals in a different, often more intuitive way and can be used alternatively; see the bottom plot of Figure 4.7. If the number k of treatments or conditions is large, the number of pairwise


Figure 4.4 Two-sided simultaneous confidence intervals for the Tukey test in the immer example.

comparisons, k(k − 1)/2, increases rapidly. In such cases, plotting the resulting set of simultaneous confidence intervals or adjusted p-values might be too extensive, and more efficient methods to display the treatment effects together with the associated significances are required. Compact letter displays are a convenient tool to provide the necessary summary information about the obtained significances. With this display, treatments are assigned letters to indicate significant differences: Treatments that are not significantly different are assigned a common letter. In other words, two treatments without a common letter differ significantly at the chosen significance level. Using such a letter display, an efficient presentation of the test results is available, which is often easier to communicate than extensive tables and confidence interval plots, such as those presented in Figure 4.4. Piepho (2004) provided an efficient algorithm for a compact letter-based representation of all pairwise comparisons, which is also implemented in the multcomp package. This

method works for general lists of pairwise p-values, regardless of how they were obtained. In particular, this method is applicable to the case of unbalanced data. The cld function extracts the necessary information from glht objects that is required to create a compact letter display of all pairwise comparisons. Using the immer example from above, we can call

R> immer.cld <- cld(immer.mc)

Next, we can use the plot method associated with cld objects. Figure 4.5 shows the boxplots for each of the five varieties together with the letter display when calling

R> plot(immer.cld)

Because variety T has no letter in common with any variety other than P, it differs significantly from M, S, and V, but not from P at the chosen significance level (α = 0.05 by default). No further significant differences can be declared, because the remaining four varieties all share the common letter b. These conclusions are in line with the previous results obtained in Figure 4.4. More details about the cld function are given in Section 3.3.3.

One advantage of using R is the retained flexibility to fine-tune standard output. For example, one may wish to improve Figure 4.5 by ordering the boxplots according to the mean yields for the five varieties. Such an ordering may enhance readability, as treatments with common letters are more likely to be displayed side-by-side. One way of doing this is shown below; see Figure 4.6 for the resulting boxplots.
R> data("immer", package = "MASS")
R> library("HH")
R> immer2 <- immer
R> immer2$Var <- ordered(immer2$Var,
+      levels = c("S", "M", "V", "P", "T"))
R> immer2.aov <- aov((Y1 + Y2)/2 ~ Var + Loc, data = immer2)
R> position(immer2$Var) <- model.tables(immer2.aov,
+      type = "means")$tables$Var
R> immer2.mc <- glht(immer2.aov, linfct = mcp(Var = "Tukey"))
R> immer2.cld <- cld(immer2.mc)
R> immer2.cld$pos.x <- immer2.cld$x
R> position(immer2.cld$pos.x) <- position(immer2$Var)
R> lab <-
+      immer2.cld$mcletters$monospacedLetters[levels(immer2$Var)]
R> bwplot(lp ~ pos.x, data = immer2.cld,
+      panel = function(...){
+          panel.bwplot.intermediate.hh(...)
+          cpl <- current.panel.limits()
+          pushViewport(viewport(xscale = cpl$xlim,
+                                yscale = cpl$ylim,

Figure 4.5 Compact letter display for all pairwise comparisons in the immer example.

+                                clip = "off"))
+          panel.axis("top", at = position(immer2$Var),
+                     labels = lab, outside = TRUE)
+          upViewport()
+      },
+      scales = list(x = list(limits = c(90, 120),
+                             at = position(immer2$Var),
+                             labels = levels(immer2$Var))),
+      main = "", xlab = "Var", ylab = "linear predictor")
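The letter assignment produced by cld can be understood with a simple "split and absorb" scheme in the spirit of Piepho (2004). This is an illustrative Python sketch, not the actual multcomp implementation: start with one group containing all treatments, split every group that contains both members of a significantly different pair, and drop groups that have become subsets of others; each surviving group corresponds to one letter.

```python
def letter_groups(treatments, significant_pairs):
    """Group treatments so that no group contains a pair declared
    significantly different; each group corresponds to one letter."""
    groups = [frozenset(treatments)]
    for a, b in significant_pairs:
        new_groups = []
        for g in groups:
            if a in g and b in g:
                new_groups.extend([g - {a}, g - {b}])  # split the group
            else:
                new_groups.append(g)
        # absorb groups that are strict subsets of another group
        groups = [g for g in new_groups
                  if not any(g < h for h in new_groups)]
    return set(groups)

# immer example: T differs significantly from M, S and V (Section 4.2.1)
groups = letter_groups("MPSTV", [("T", "M"), ("T", "S"), ("T", "V")])
```

For the immer significances this yields the two groups {M, P, S, V} and {P, T}, i.e., the letter pattern of Figure 4.5: T shares a letter only with P, and the remaining four varieties share the other letter.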

Figure 4.6 Alternative compact letter display for all pairwise comparisons in the immer example.

An alternative graphical representation of the Tukey test was developed by Hsu and Peruggia (1994) and extended to arbitrary contrasts by Heiberger and Holland (2004, 2006). The resulting mean-mean multiple comparison plot allows the display of several pieces of relevant information in the same plot, including the observed group mean values, with correct relative distances, and point estimates for arbitrary contrasts together with their associated simultaneous confidence intervals. In the following we illustrate the mean-mean multiple comparison plot with the immer data. Mean-mean multiple comparison plots are available with the HH package (Heiberger 2009) via the glht.mmc function. Its syntax is very similar to the one used for glht,

R> library("HH")
R> immer.mmc <- glht.mmc(immer.aov, linfct = mcp(Var = "Tukey"),
+      focus = "Var", lmat.rows = 2:5)

In the previous statement, the focus argument defines the factor whose contrasts are to be computed. In addition, the lmat.rows argument specifies the rows in lmat for the focus factor, where by convention lmat is the transpose

of the linfct component produced by glht (see Section 3.3.1 for details on glht objects). In our example,

R> t(immer.mc$linfct)
            P-M S-M T-M V-M S-P T-P V-P T-S V-S V-T
(Intercept)   0   0   0   0   0   0   0   0   0   0
VarP          1   0   0   0  -1  -1  -1   0   0   0
VarS          0   1   0   0   1   0   0  -1  -1   0
VarT          0   0   1   0   0   1   0   1   0  -1
VarV          0   0   0   1   0   0   1   0   1   1
LocD          0   0   0   0   0   0   0   0   0   0
LocGR         0   0   0   0   0   0   0   0   0   0
LocM          0   0   0   0   0   0   0   0   0   0
LocUF         0   0   0   0   0   0   0   0   0   0
LocW          0   0   0   0   0   0   0   0   0   0
attr(,"type")
[1] "Tukey"

and we see that the Tukey contrasts for the focus factor Var are specified in rows 2 through 5; thus the lmat.rows = 2:5 specification in the glht.mmc call above. With these preparations, we can display the mean-mean multiple comparison plot for the immer data using

R> plot(immer.mmc, ry = c(85, 122), x.offset = 8,
+      main = "", main2 = "")

where ry and x.offset are arguments to fine-tune the appearance of the plot; see the top graph in Figure 4.7 for the results. The 95% confidence intervals on µT − µM, µT − µS, and µT − µV lie entirely to the right of 0, while all other confidence intervals include 0. We thus conclude at the 5% level that the average barley yield for variety T is larger than the yield for the varieties M, S, and V. No further significances between the varieties are uncovered, which reflects the test decisions obtained from Figure 4.4. Each of the simultaneous confidence intervals for the 10 pairwise differences µi − µj in Figure 4.7 is centered at a point whose height on the left y-axis is equal to the average of the corresponding means ȳi and ȳj and whose location along the x-axis is at distance ȳi − ȳj from the vertical line at 0. Note that the confidence intervals on µP − µS and µV − µM appear almost at the same height in Figure 4.7, leading to an uninformative overprinting of the contrast names at the right y-axis.
In cases of overprinting, Heiberger and Holland (2006) suggested a tiebreaker plot of the simultaneous confidence intervals similar to Figure 4.4, where the contrasts are evenly spaced vertically in the same order as in the mean-mean multiple comparison plot and on the same horizontal scale. The tiebreaker plot for the immer data displayed at the bottom of Figure 4.7 was obtained with the command

R> plot.matchMMC(immer.mmc$mca, main = "")

The isomeans grid in the center of Figure 4.7 is particularly useful when

Figure 4.7 Mean-mean multiple comparison plot (top) and tiebreaker plot (bottom) for all pairwise comparisons in the immer example.

plotting simultaneous confidence intervals for orthogonal or more general contrasts, as shown now. It is well known that the use of carefully selected orthogonal contrasts is often preferable to a redundant set of contrasts. Because we have a total of five varieties, we can choose a basis of four orthogonal contrasts, such that any other contrast can be constructed as a linear combination of the elements of this basis set. Assume that we are interested in the four orthogonal contrasts

R> immer.lmat <- cbind("M-S"    = c(1,  0, -1,  0,  0),
+                      "MS-V"   = c(1,  0,  1,  0, -2),
+                      "VMS-P"  = c(1, -3,  1,  0,  1),
+                      "PVMS-T" = c(1,  1,  1, -4,  1))
R> row.names(immer.lmat) <- c("M", "P", "S", "T", "V")

We would like to repeat a similar analysis based on a mean-mean multiple comparison plot, as done above for the Tukey test. The statement

R> immer.mmc2 <- glht.mmc(immer.aov, linfct = mcp(Var = "Tukey"),
+      focus.lmat = immer.lmat)

is sufficient to construct the plot in Figure 4.8. Clearly, the T-PVMS comparison is highly significant, thus underpinning the previously observed good result of variety T. By comparing Figures 4.7 and 4.8, one notices that the point estimates associated with the pairwise contrasts are constructed as linear combinations of the point estimates of the orthogonal contrasts. Note also that the confidence intervals in Figure 4.8 are not necessarily centered anymore at the intersections of the isomeans grid. Finally, we note that intervals obtained with the glht.mmc statement above are by default based on the conservative generalized Tukey test, which allows for selection of contrasts based on findings suggested by the pairwise analysis (Hochberg and Tamhane 1987, p. 80). As mentioned previously, the mean-mean multiple comparison plots implemented in the HH package are applicable to general contrasts in factorial designs. We refer to Heiberger and Holland (2004, 2006) for further details and examples.
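The four contrasts collected in immer.lmat sum to zero and are mutually orthogonal, which is easy to verify numerically. A small Python check for illustration (the coefficient vectors are copied from the cbind call above, rows ordered M, P, S, T, V; in the balanced immer design, plain dot products suffice to check orthogonality):

```python
contrasts = {
    "M-S":    (1,  0, -1,  0,  0),
    "MS-V":   (1,  0,  1,  0, -2),
    "VMS-P":  (1, -3,  1,  0,  1),
    "PVMS-T": (1,  1,  1, -4,  1),
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# every contrast sums to zero ...
assert all(sum(c) == 0 for c in contrasts.values())
# ... and every pair of contrasts is orthogonal
names = list(contrasts)
assert all(dot(contrasts[a], contrasts[b]) == 0
           for i, a in enumerate(names) for b in names[i + 1:])
```

Any of the 10 pairwise contrasts can then be written as a linear combination of these four basis vectors, which is exactly the relationship visible between Figures 4.7 and 4.8.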

4.2.2 Closed Tukey test procedure

In Section 4.2.1 we considered the original Tukey test, which is a single-step test. Similar to the Dunnett test described in Section 4.1, stepwise extensions are also available for the Tukey test, leading to more power; see, for example, Finner (1988). As explained in Section 2.1.2, stepwise test procedures reject at least as many hypotheses as their single-step counterparts. In the following, we apply the closure principle described in Section 2.2.3 to the all-pairwise comparison problem, discuss efficient methods to prune the full closure, and then come back to the immer data example to illustrate the closed Tukey test, which is also available in the multcomp package.

Recall from Section 2.2.3 that the closure principle constructs all possible intersection hypotheses from an initial set of elementary hypotheses. In some


Figure 4.8 Mean-mean multiple comparison plot for selected orthogonal contrasts in the immer example.

cases, depending upon the structure of the hypotheses tested, the closure tree can be pruned considerably, resulting in much smaller adjusted p-values and hence more powerful tests. For example, consider the pairwise comparisons case with three groups: The basic hypotheses are H12 : µ1 = µ2, H13 : µ1 = µ3, and H23 : µ2 = µ3. As seen in Figure 2.6, these are the three elementary hypotheses. However, unlike the many-to-one case displayed in Figure 4.3, there is only one possible intersection hypothesis, rather than four, because every intersection hypothesis has the form H123 : µ1 = µ2 = µ3. Having fewer intersections to test implies greater power. In this case, the adjusted p-values for each of the three elementary hypotheses Hij are the maximum of just two p-values: the unadjusted p-value from Hij and the p-value from the global intersection hypothesis H123; see Equation (2.2) for a general definition of adjusted p-values using the closure principle. Pruning of the closure set happens when HI = HI′ for distinct subsets I

and I′, and in this case the hypotheses are said to be restricted; see also Section 2.1.2. On the other hand, if HI ≠ HI′ for each pair of distinct subsets I and I′, then the hypotheses are said to satisfy the free combination condition. Examples of free combinations include the many-to-one comparisons considered in Section 4.1 as well as comparisons of means with two-sample multivariate data (as discussed, for example, in Section 5.1). Examples of restricted combinations include all-pairwise comparisons, response surface comparisons, and general complex contrast sets.

Figure 4.9 shows how the pruned tree looks in the case of all pairwise comparisons among four groups. The size of the tree is considerably smaller than the 2^6 − 1 = 63 intersection hypotheses that would apply if the tests were in free combinations. Because in the restricted combination case many of the intersections produce identical intersection hypotheses, there are only 14 distinct subsets in the closure. Shaffer (1986) presented a method for using the pruned closure tree in the case of restricted combinations to produce more powerful tests, using Bonferroni tests for each node; see also Section 2.3.2. Bergmann and Hommel (1988) extended the concept of the Shaffer procedure and found an algorithm for pruning the closure test, which was subsequently implemented by Bernhard (1992); see also Hommel and Bernhard (1992). Royen (1991) noted that the Shaffer method is in fact a truncated closed test procedure, where the individual tests are performed in the order of the unadjusted p-values, stopping as soon as an insignificance occurs. Hommel and Bernhard (1999) investigated more powerful closed test procedures that are obtained if testing is done without truncation, but the resulting inference may no longer be monotone in the p-values.
Hommel and Bretz (2008) gave an example where a non-monotonic decision pattern still leads to a meaningful interpretation. Finally, Westfall (1997) and Westfall and Tobias (2007) showed how to perform the truncated closed test procedure for general contrasts. The method of Westfall (1997) is available in the multcomp package and will be illustrated below with the immer data example. Recall from Section 4.1.2 that the step-down Dunnett procedure may be obtained as follows: (i) order the unadjusted p-values p(1) ≤ ... ≤ p(m) as usual, corresponding to hypotheses H(1),...,H(m); (ii) test H(1),...,H(m) sequentially and obtain the adjusted p-values q(i) = maxI:i∈I pI , stopping as soon as q(i) > α, where α is the desired significance level to control the familywise error rate. Truncated closed test procedures use the same method, but apply it to restricted combinations, where the set of subsets I considered in the computation of q(i) = maxI:i∈I pI is often much smaller than the same set under the free combination condition, leading to smaller adjusted p-values q(i) and hence more powerful tests. The method is called “truncated” because it is applied in the order of the adjusted p-values; it is possible to make the method even more powerful by removing this constraint. On the other hand, truncation ensures that the order of the adjusted p-values is the same as the

Figure 4.9 Closure principle for all pairwise comparisons of four means (schematic diagram). Here, H1234 : µ1 = µ2 = µ3 = µ4, Hijk : µi = µj = µk, H{ij},{kℓ} : µi = µj and µk = µℓ, and Hij : µi = µj for suitable indices i, j, k, and ℓ. Different line types are used to distinguish the individual implications between hypotheses and have no other meaning.
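The counts quoted above, 2^6 − 1 = 63 nonempty index sets but only 14 distinct intersection hypotheses for four groups, can be checked by brute force. An illustrative Python sketch: each subset of the six pairwise hypotheses is reduced to the partition of {0, 1, 2, 3} it implies (the transitive closure of the stated equalities), and distinct partitions are counted.

```python
from itertools import combinations

groups = range(4)
pairs = list(combinations(groups, 2))   # the 6 elementary hypotheses

def implied_partition(pair_subset):
    """Merge groups linked by the selected equalities; return the
    resulting partition as a hashable frozenset of blocks."""
    blocks = [{g} for g in groups]
    for a, b in pair_subset:
        ba = next(blk for blk in blocks if a in blk)
        bb = next(blk for blk in blocks if b in blk)
        if ba is not bb:
            blocks.remove(ba)
            blocks.remove(bb)
            blocks.append(ba | bb)
    return frozenset(frozenset(blk) for blk in blocks)

subsets = [c for r in range(1, 7) for c in combinations(pairs, r)]
distinct = {implied_partition(s) for s in subsets}
```

Here len(subsets) is 63, while len(distinct) is 14: every nonempty subset of pairwise hypotheses collapses to one of the 14 partitions of four elements that merge at least two groups (Bell(4) − 1 = 15 − 1).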

order of the unadjusted p-values; without truncation the orders may differ, leading to possible interpretation difficulties.

To illustrate the method, consider the immer data from above together with model (4.5). Of interest are the 10 elementary hypotheses Hij : µi = µj, i, j ∈ {M, P, S, T, V}, i ≠ j. As with the step-down Dunnett procedure, it is helpful here to express the procedure in terms of t statistics rather than p-values. Let tij^obs denote the observed t statistics for testing Hij and let tij denote the random counterparts from Equation (4.6). For the immer data, the observed absolute t statistics are given in order as

tST^obs = 4.453, tMT^obs = 3.917, tTV^obs = 3.129, tPT^obs = 2.576, tPS^obs = 1.877, tMP^obs = 1.341, tSV^obs = 1.324, tMV^obs = 0.788, tPV^obs = 0.553, tMS^obs = 0.536.

These values can be obtained from, for example, the summary output of the immer.mc object, as described in Section 4.2.1. Using the truncated closed test procedure, we step through the related sequence of hypotheses HST, HMT, . . . , HMS and test relevant subsets from the full closure using max-t tests of the form (2.1) while incorporating the parametric model results from Chapter 3. Applied to the immer example, we obtain the following individual steps:

(i) All subsets are strictly smaller than the global intersection set. Test HST using q(1) = P(maxij tij ≥ 4.453) = 0.002. (ii) Relevant subsets and their p-values are given by

P(max{tMT, tTV, tPT, tMP, tMV, tPV} ≥ 3.917) = 0.004, P(max{tMT, tPT, tMP, tSV} ≥ 3.917) = 0.003, P(max{tMT, tTV, tPS, tMV} ≥ 3.917) = 0.003, and P(max{tMT, tPS, tSV, tPV} ≥ 3.917) = 0.003.

Test HMT using

q(2) = max{0.002, 0.004, 0.003} = 0.004. (iii) Relevant subsets and their p-values are given by

P(max{tTV, tPS, tMP, tMS} ≥ 3.129) = 0.019 and P(max{tTV, tPT, tPV, tMS} ≥ 3.129) = 0.019.

Test HTV using q(3) = max{0.004, 0.019} = 0.019. (iv) There is only one relevant subset here, and its p-value is given by

P(max{tPT, tSV, tMV, tMS} ≥ 2.576) = 0.062.

Test HPT using q(4) = max{0.019, 0.062} = 0.062. (v) There is only one relevant subset here, and its p-value is given by

P(max{tPS, tMP, tSV, tMV, tPV, tMS} ≥ 1.877) = 0.269.


Test HPS using q(5) = max{0.062, 0.269} = 0.269. (vi) Relevant subsets and their p-values are given by

P(max{tMP, tMV, tPV} ≥ 1.341) = 0.39, and P(max{tMP, tSV} ≥ 1.341) = 0.347.

Test HMP using q(6) = max{0.269, 0.39} = 0.39. (vii) There is only one relevant subset here, and its p-value is given by

P(max{tSV, tMV, tMS} ≥ 1.324) = 0.399.

Test HSV using q(7) = max{0.39, 0.399} = 0.399. (viii) There is only one relevant subset here, the singleton. Its p-value is given by P(tMV ≥ 0.788) = 0.44, that is, the unadjusted p-value. Test HMV using

q(8) = max{0.399, 0.44} = 0.44. (ix) There is only one relevant subset here, and its p-value is given by

P(max{tPV, tMS} ≥ 0.553) = 0.826.

Test HPV using q(9) = max{0.44, 0.826} = 0.826. (x) There is only one relevant subset here, the singleton. Its p-value is given by P(tMS ≥ 0.536) = 0.598, that is, the unadjusted p-value. Test HMS using

q(10) = max{0.826, 0.598} = 0.826.

In this example, this is the only case where "truncation" took place, since the adjusted p-value was truncated upward to account for the ordered testing. Westfall (1997) explained how to automate the process of finding the relevant subsets by using rank conditions involving contrast matrices for the various subsets. In R, truncated closed test procedures are implemented in the multcomp package via the test = adjusted(type = "Westfall") option. The code below provides exactly the results from the manual calculations above:

R> summary(immer.mc, test = adjusted(type = "Westfall"))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = (Y1 + Y2)/2 ~ Var + Loc, data = immer)

Linear Hypotheses:
         Estimate Std. Error t value Pr(>|t|)
P-M == 0     8.15       6.08    1.34   0.3899
S-M == 0    -3.26       6.08   -0.54   0.8257
T-M == 0    23.81       6.08    3.92   0.0045 **
V-M == 0     4.79       6.08    0.79   0.4397
S-P == 0   -11.41       6.08   -1.88   0.2691
T-P == 0    15.66       6.08    2.58   0.0617 .
V-P == 0    -3.36       6.08   -0.55   0.8257
T-S == 0    27.07       6.08    4.45   0.0020 **
V-S == 0     8.05       6.08    1.32   0.3985
V-T == 0   -19.02       6.08   -3.13   0.0192 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- Westfall method)

In Table 4.2 we summarize the results from four methods applied to the immer dataset. To be precise, Table 4.2 reports side-by-side the unadjusted p-values pi from

R> summary(immer.mc, test = adjusted(type = "none"))

together with the adjusted p-values from the Tukey test, its stepwise extension using truncated closure, and the S2 procedure (Shaffer 1986),

R> summary(immer.mc, test = adjusted(type = "Shaffer"))

The S2 procedure differs from type = "Westfall" by using Bonferroni tests instead of the parametric results from Chapter 3. We first conclude that the pairwise comparisons T-M, T-S, and V-T are all significant at level α = 0.05, irrespective of which multiple comparison procedure is used. At an unadjusted level, T-P is also significant, although the familywise error rate may not be controlled in this case. Note further that the truncated closed test procedure from Westfall (1997) – based on max-t tests while exploiting logical constraints and the parametric results from Chapter 3 – leads to uniformly smaller adjusted p-values than both the Tukey test and the Shaffer procedure.
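The "truncation" in steps (i)–(x) above amounts to a running maximum over the per-step max-t p-values, which is what makes the reported adjusted p-values monotone in the testing order. A short Python sketch for illustration, using the ten step p-values copied from the manual calculation above:

```python
from itertools import accumulate

# max-t p-values from steps (i)-(x), in testing order HST, HMT, ..., HMS
raw = [0.002, 0.004, 0.019, 0.062, 0.269, 0.390,
       0.399, 0.440, 0.826, 0.598]

# truncated (monotone) adjusted p-values: running maximum over the steps
adjusted = list(accumulate(raw, max))
```

Only the last entry changes (0.598 is pushed up to 0.826), matching step (x), where the adjusted p-value for HMS was truncated upward.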

4.3 Dose response analyses

In this section we consider an ANCOVA model for a dose response study with two covariates. In Section 4.3.1 we introduce the example and apply the


                     Adjusted p-values qi
Comparison     pi    Tukey   Shaffer   Westfall
P-M         0.195    0.670     0.585      0.390
S-M         0.598    0.982     1.000      0.826
T-M         0.001    0.007     0.005      0.004
V-M         0.440    0.931     0.601      0.440
S-P         0.075    0.361     0.451      0.269
T-P         0.018    0.113     0.072      0.062
V-P         0.587    0.980     1.000      0.826
T-S         0.001    0.002     0.002      0.002
V-S         0.200    0.680     0.601      0.399
V-T         0.005    0.038     0.021      0.019

Table 4.2 Comparison of several multiple comparison procedures for the immer data. Tukey: original Tukey (1953) test (single-step); Shaffer: S2 procedure from Shaffer (1986); Westfall: truncated closed test procedure from Westfall (1997).
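As an aside, the p-value columns of Table 4.2 can be assembled programmatically. The following sketch assumes the immer.mc object from above and reads the pvalues element of the summary method's test slot (an internal detail of multcomp); the column labels are ours, and "single-step" yields the Tukey test:

R> ## Sketch: collect unadjusted and adjusted p-values for Table 4.2
R> types <- c(none = "none", Tukey = "single-step",
+             Shaffer = "Shaffer", Westfall = "Westfall")
R> pvals <- sapply(types, function(type)
+      as.vector(summary(immer.mc,
+          test = adjusted(type = type))$test$pvalues))
R> round(pvals, 3)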

Dunnett test based on pairwise differences against the control. In Section 4.3.2 we focus on evaluating non-pairwise contrast tests for detecting a dose related trend. In Section 5.3 we extend the discussion and describe related approaches which combine multiple comparison procedures with modeling techniques.

4.3.1 A dose response study on litter weight in mice

Understanding the dose response relationship is a fundamental step in investigating any new compound, whether it is an herbicide or fertilizer, a molecular entity, an environmental toxin, or an industrial chemical (Ruberg 1995a,b; Bretz, Hsu, Pinheiro, and Liu 2008b). For example, determining an adequate dose level for a medicinal drug, and, more broadly, characterizing its dose response relationship with respect to both efficacy and safety, are key objectives of any clinical drug development program. If the dose is set too high, safety and tolerability problems may result, but if the dose is too low it may become difficult to establish adequate efficacy. There is a vast literature on dose finding, especially in the area of pharmaceutical statistics, and we refer the reader to the edited books by Ting (2006), Chevret (2006), and Krishna (2006) for general reading on this topic.

We use a dose response study involving two covariates to illustrate the methods from Chapter 3. Consider the summary statistics in Table 4.3 of a Thalidomide study taken from Westfall and Young (1993) and reanalyzed by Bretz (2006). Four dose levels (0, 5, 50 and 500 units of the study compound) were administered to pregnant mice. Their litters were evaluated for defects and weights. According to Westfall and Young (1993), the primary response

variable was the average post-birth weight in the entire litter for the first three weeks of the study. Since the litter weight may depend on gestation time and number of animals in the litter, these two variables were included as covariates in the analysis. We thus consider the ANCOVA model

yij = β0 + βi + β5 z1ij + β6 z2ij + εij,   (4.7)

where yij denotes the weight of the j-th litter at dose level i, z1 and z2 the two covariates and β1, ..., β4 the parameters of interest, namely, the effect of dose level i after adjusting for the effects of the covariates. Finally, the random errors are assumed to be independent, homoscedastic and normally distributed, εij ∼ N(0, σ²). Note that model (4.7) is a special case of the general linear model (3.1) and the methods from Section 3.1.1 thus apply to the Thalidomide example. Figure 4.10 displays the covariate-adjusted litter weight data from the Thalidomide study, that is, the least squares means of the four treatment groups together with the associated marginal confidence intervals.

Dose                        0      5     50    500
Group sample size          20     19     18     17
Weight
  Mean                  32.31  29.31  29.87  29.65
  Standard deviation     2.70   5.09   3.76   5.40
Gestation time
  Mean                  22.08  22.21  21.89  22.18
  Standard deviation     0.44   0.45   0.40   0.43
Litter size
  Mean                  13.40  13.11  14.67  12.53
  Standard deviation     3.02   2.58   1.50   2.58

Table 4.3 Summary data of the Thalidomide dose response example from Westfall and Young (1993).

The experimental question is whether one can claim a statistically significant decrease in average post-birth weight with increasing doses of Thalidomide. A straightforward approach is to perform the pairwise comparisons between the three active dose levels 5, 50, 500 and the zero-dose control, adjusted for covariate effects. In this case the single-step Dunnett test introduced in Section 4.1 is a reasonable approach. To this end, we first fit the ANCOVA model (4.7) to the litter data with the aov function,

R> data("litter", package = "multcomp")
R> litter.aov <- aov(weight ~ dose + gesttime + number,
+                    data = litter)

and then use the multcomp package to perform the Dunnett test,

R> litter.mc <- glht(litter.aov, linfct = mcp(dose = "Dunnett"),
+                    alternative = "less")
R> summary(litter.mc, test = adjusted(type = "single-step"))


Figure 4.10 Summary plot of the Thalidomide dose response study: covariate-adjusted litter weight LSMEANS (g) plotted against dose (0 to 500 units).

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Dunnett Contrasts

Fit: aov(formula = weight ~ dose + gesttime + number, data = litter)

Linear Hypotheses:
             Estimate Std. Error t value Pr(<t)
5 - 0 >= 0      -3.35       1.29   -2.60  0.016 *
50 - 0 >= 0     -2.29       1.34   -1.71  0.112
500 - 0 >= 0    -2.68       1.33   -2.00  0.063 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

As seen from the output, the smallest multiplicity adjusted p-value is 0.016. We conclude that there is an overall dose related significant decrease in litter weight. Note that the model fit litter.aov accounts for the two covariates gestation time gesttime and litter size number. The arithmetic means ȳi and the pooled variance estimate s² from the test statistics (4.2) for the Dunnett test in an ANOVA model without covariates have therefore to be replaced by

the least squares estimates from the more general expressions (3.3) and (3.4), which also include covariate information. Consequently, the correlations between the test statistics do not follow the simple pattern from Equation (4.3). Using the adjusted(type = "single-step") option in multcomp ensures that the stochastic dependencies are all taken into account while performing the closed test procedure, as described in Chapter 3. In addition to the overall significant result, we can assess the adjusted p-values individually. We therefore conclude that the smallest dose level leads to a significant result and the remaining two dose levels lead to only borderline results. We specified alternative = "less" because we are interested in testing for a decreasing trend, that is, whether there was a significant reduction in litter weight. Note that the more powerful step-down Dunnett test can be applied instead of the single-step Dunnett test by using the adjusted(type = "free") option; see also Section 4.1.2. Using the step-down Dunnett test, the other two dose levels become barely significant: p = 0.046 and p = 0.045 for the medium and high dose, respectively.
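The step-down call mirrors the single-step analysis; only the adjustment type changes. A sketch, assuming the litter.mc object from above:

R> ## Step-down Dunnett test (free combinations, Section 4.1.2);
R> ## this should reproduce the p-values 0.016, 0.046 and 0.045
R> ## quoted in the text.
R> summary(litter.mc, test = adjusted(type = "free"))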

4.3.2 Trend tests

The Dunnett test considered in Section 4.3.1 uses only pairwise comparisons of the individual dose levels with the control. A variety of trend tests have been investigated, which borrow strength from neighboring dose levels when testing for a dose response effect. The approaches proposed by, for example, Bartholomew (1961), Williams (1971), and Marcus (1976) are popular trend tests, which are known to be more powerful than the Dunnett test. However, due to the inclusion of covariates and unequal group sample sizes, these trend tests cannot be applied to the Thalidomide example. In the following we will demonstrate how the framework from Chapter 3 can be applied to the current problem, leading to powerful trend tests for general parametric models.

Consider model (4.7), where the parameter vector β consists of p = 7 elements βi. When testing for a dose related trend, one is typically interested in testing the null hypothesis of no treatment effect,

H : β1 = ... = β4,

against the ordered alternative

K : β1 ≥ ... ≥ β4 with β1 > β4.

We conclude in favor of K if the related trend test is significant. Westfall (1997) proposed the use of three trend contrasts

              ( 0    1.5      0.5     -0.5     -1.5    0 0 )
C⊤Westfall =  ( 0  138.75   133.75    88.75  -361.25   0 0 )
              ( 0    0.795    0.105   -0.305   -0.595  0 0 )

to reflect the uncertainty about the underlying dose response shape. The first row of C⊤Westfall is a simple ordinal trend contrast; the second row is a trend

contrast suggested by the arithmetic dose levels; and the third row suggests a log-ordinal dose response relationship. Thus, the use of non-pairwise contrasts allows one to set each of the individual tests into correspondence to some dose response shapes. By using information from all dose levels under investigation, non-pairwise contrast tests tend to be more powerful than the Dunnett test, which is based on pairwise contrasts. Note that in the definition of the matrix C⊤Westfall the weights for the intercept β0 and the two covariate parameters β5 and β6 are 0, so that the linear function C⊤Westfall β compares only the parameters of interest β1, ..., β4. In Section 5.3 we will extend these ideas to formally include dose response modeling in a rigid hypotheses testing framework.

The approach from Westfall (1997) sought to let the contrast coefficients mirror potential dose response shapes. Alternatively, the contrast matrix can be specified in a more principled way by reflecting general comparisons among the dose levels, as considered by Williams (1971) and Marcus (1976). Due to the presence of covariates and unequal sample sizes, however, their original approaches cannot be applied to the Thalidomide example. Instead, we follow the proposal of Bretz (2006) and describe the original trend tests of Williams (1971) and Marcus (1976) as contrast tests, so that the results from Chapter 3 are applicable.

We first consider the Williams test extended to the framework of multiple contrast tests. With the sample sizes taken from Table 4.3, the contrast coefficients are given by

              ( 0  1   0        0       -1       0 0 )
C⊤Williams =  ( 0  1   0       -0.5143  -0.4857  0 0 )
              ( 0  1  -0.3519  -0.3333  -0.3148  0 0 )

Figure 4.11 displays the contrast coefficients for β1, ..., β4 in the balanced case. Each single contrast test consists of comparisons between the control group and the weighted average over the last ℓ treatment groups, ℓ = 1, ..., 3, respectively. To illustrate this, consider the second row of C⊤Williams. The control group is compared with the weighted average of the two highest dose levels. The weight for the control group is 1. The weights for the highest and second highest dose levels are −17/(17+18) = −0.4857 and −18/(17+18) = −0.5143, respectively; here, 17 and 18 are the respective group sample sizes taken from Table 4.3. The lowest dose level is not included in this comparison and is therefore assigned the weight 0. Note that the weights for the intercept β0 and the two covariate parameters β5 and β6 are set to 0, so that C⊤Williams β involves only the comparison of the parameters of interest. As shown by Bretz (2006), using the matrix C⊤Williams ensures that the same type of comparison is performed as with the original test of Williams (1971) while taking into account the information of the covariates through the computation of the least squares estimates β̂ and σ̂.

We can compute the contrast matrix C⊤Williams with the contrMat function from the multcomp package. This function computes the contrast matrices


Figure 4.11 Plot of contrast coefficients for the Williams test (left, dot-dashed lines with circles) and the modified Williams test (right, solid lines with triangles and dot-dashed lines with circles), in the balanced case.

for several multiple comparison procedures, including, among others, the tests of Dunnett and Tukey, as well as the trend tests of Williams and Marcus (contrast version). Its syntax is straightforward and we obtain the contrast matrix C⊤Williams with the statements

R> n <- c(20, 19, 18, 17)
R> -contrMat(n, type = "Williams")

Multiple Comparisons of Means: Williams Contrasts

    1      2      3      4
C 1 1  0.000  0.000 -1.000
C 2 1  0.000 -0.514 -0.486
C 3 1 -0.352 -0.333 -0.315

Note that contrMat computes the contrast coefficients under the assumption of an increasing trend. In the Thalidomide example we are interested in detecting a reduction in litter weight and therefore the resulting coefficients from contrMat need to be multiplied by −1.

The modified Williams test (Williams 1971; Marcus 1976) can be extended
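To see where these numbers come from, the Williams weights can be reproduced in base R directly from the group sample sizes; a sketch (the loop structure is ours):

R> ## Each row compares the control (weight 1) with the weighted
R> ## average of the l highest dose groups, l = 1, 2, 3.
R> n <- c(20, 19, 18, 17)
R> w <- t(sapply(1:3, function(l) {
+      idx <- (4 - l + 1):4           # the l highest dose levels
+      out <- c(1, 0, 0, 0)           # weight 1 for the control
+      out[idx] <- -n[idx] / sum(n[idx])
+      out
+  }))
R> round(w, 3)

The result agrees with the output of -contrMat(n, type = "Williams") shown above.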

similarly and the corresponding contrast coefficients are given by

                  ( 0  1       0        0       -1       0 0 )
                  ( 0  1       0       -0.5143  -0.4857  0 0 )
C⊤mod Williams =  ( 0  1      -0.3519  -0.3333  -0.3148  0 0 )
                  ( 0  0.5128  0.4872  -0.5143  -0.4857  0 0 )
                  ( 0  0.5128  0.4872   0       -1       0 0 )
                  ( 0  0.3509  0.3333   0.3158  -1       0 0 )

Figure 4.11 also plots the coefficients for the contrast version of the modified Williams test. Note that C⊤Williams is a subset of C⊤mod Williams in the sense that the weights for the former test are all contained in the contrast matrix of the latter test (the first three rows). These weights are particularly suitable for testing concave dose response shapes, as the higher dose groups are pooled and compared with the zero dose group. Rows five and six of C⊤mod Williams are appropriate for detecting convex shapes, as they average the lower treatments. The fourth row is particularly powerful for linear or approximately linear relationships. To illustrate the computation of the coefficients for the fourth contrast, we note that the weighted average of the two higher dose levels is compared with the weighted average of the remaining treatments. With the group sample sizes taken from Table 4.3, we therefore obtain 20/(20 + 19) = 0.5128, 19/(20 + 19) = 0.4872, 18/(18 + 17) = 0.5143, and 17/(18 + 17) = 0.4857. We refer to Bretz (2006) for the analytical expressions of the contrast coefficients in general linear models. Using the contrMat function, we can compute C⊤mod Williams with the call

R> -contrMat(n, type = "Marcus")

Multiple Comparisons of Means: Marcus Contrasts

        1      2      3      4
C 1 1.000 -0.352 -0.333 -0.315
C 2 1.000  0.000 -0.514 -0.486
C 3 0.513  0.487 -0.514 -0.486
C 4 1.000  0.000  0.000 -1.000
C 5 0.513  0.487  0.000 -1.000
C 6 0.351  0.333  0.316 -1.000

We now illustrate how the multcomp package can be used to analyze the Thalidomide example with the contrast versions of the trend tests from Williams (1971) and Marcus (1976). Similar to what was done in Section 4.3.1, we can apply the glht function to the fitted aov object. In order to perform the Williams contrast test we pass the mcp(dose = "Williams") option to the linfct argument,

R> litter.mc2 <- glht(litter.aov, alternative = "less",
+                     linfct = mcp(dose = "Williams"))
R> summary(litter.mc2)

Simultaneous Tests for General Linear Hypotheses


Multiple Comparisons of Means: Williams Contrasts

Fit: aov(formula = weight ~ dose + gesttime + number, data = litter)

Linear Hypotheses:
         Estimate Std. Error t value Pr(<t)
C 1 >= 0    -2.68       1.33   -2.00 0.0436 *
C 2 >= 0    -2.48       1.12   -2.21 0.0285 *
C 3 >= 0    -2.79       1.05   -2.66 0.0097 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

The adjusted p-value for the Williams contrast test is the minimum of the set of three p-values listed in the last column. In our example, the adjusted p-value is 0.01 and we conclude for a significant dose response signal at the 5% significance level. Note that this p-value is smaller than the adjusted p-value 0.016 for the Dunnett test from Section 4.3.1. This indicates that trend tests indeed tend to be more powerful than the pairwise comparisons from Dunnett. In analogy to the previous glht call, we can also apply the modified Williams test by calling

R> glht(litter.aov, linfct = mcp(dose = "Marcus"),
+       alternative = "less")

We can improve the Williams test by applying the closure principle from Section 2.2.3. Technically, the m = 3 elementary one-sided hypotheses are given by Hj : c⊤j β ≤ 0, where β denotes the 7 × 1 parameter vector introduced in Section 4.3.1 and c⊤j is the j-th row of C⊤Williams, j = 1, 2, 3. Note that the contrast coefficients have been scaled such that large values of c⊤j β indicate a dose related weight loss. With the closure principle, we consider in addition all pairwise intersection hypotheses Hi ∩ Hj, 1 ≤ i < j ≤ 3, and the global intersection null hypothesis H1 ∩ H2 ∩ H3. The Williams contrasts satisfy the free combination property (Section 2.1.2), because there is one additional free parameter in each successive contrast. We can thus apply the adjusted(type = "free") option discussed in Section 4.1.2,

R> summary(litter.mc2, test = adjusted(type = "free"))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Williams Contrasts

Fit: aov(formula = weight ~ dose + gesttime + number, data = litter)

Linear Hypotheses:
         Estimate Std. Error t value Pr(<t)
C 1 >= 0    -2.68       1.33   -2.00 0.0245 *
C 2 >= 0    -2.48       1.12   -2.21 0.0237 *
C 3 >= 0    -2.79       1.05   -2.66 0.0091 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- free method)

As seen from the output, the application of the closure principle leads to a reduction of the adjusted p-values.
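Finally, the trend contrasts of Westfall (1997) displayed at the beginning of this section can be analyzed along the same lines, since mcp also accepts a user-specified contrast matrix over the factor levels. A sketch (the row names and the direction of testing are our choices; the coefficients are the four dose-level columns of C⊤Westfall, scaled so that large values indicate a dose related weight loss):

R> ## User-defined trend contrasts over the four dose levels
R> K <- rbind("ordinal"     = c(1.5, 0.5, -0.5, -1.5),
+             "arithmetic"  = c(138.75, 133.75, 88.75, -361.25),
+             "log-ordinal" = c(0.795, 0.105, -0.305, -0.595))
R> summary(glht(litter.aov, linfct = mcp(dose = K),
+               alternative = "greater"))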

4.4 Variable selection in regression models

Garcia, Wagner, Hothorn, Koebnick, Zunft, and Trippo (2005) applied predictive regression equations to body fat content with nine common anthropometric measurements, which were obtained from 71 healthy German women. In addition, the body composition was measured by Dual Energy X-Ray Absorptiometry (DXA). This reference method is very accurate in measuring body fat but has limited practical application because of its high costs and challenging methodology. Therefore, a simple regression equation for predicting DXA body fat measurements is of special interest for the practitioner. Backward elimination was applied to select important variables from the available anthropometric measurements. Garcia et al. (2005) reported a final linear model utilizing hip circumference, knee breadth and a compound covariate defined as the sum of the logarithmized chin, triceps and subscapular skinfolds.

Here, we fit the saturated model to the data and use adjusted p-values to select important variables, where the multiplicity adjustment accounts for the correlation between the test statistics; see Chapter 3. We use the lm function such that the linear model including all covariates with the unadjusted p-values is given by

R> data("bodyfat", package = "mboost")
R> bodyfat.lm <- lm(DEXfat ~ ., data = bodyfat)
R> summary(bodyfat.lm)

Call:
lm(formula = DEXfat ~ ., data = bodyfat)

Residuals:
   Min     1Q Median     3Q    Max
-6.954 -1.949 -0.219  1.169 10.812

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -69.0283     7.5169   -9.18  4.2e-13 ***
age           0.0200     0.0322    0.62   0.5378
waistcirc     0.2105     0.0671    3.13   0.0026 **

hipcirc        0.3435     0.0804    4.27  6.9e-05 ***
elbowbreadth  -0.4124     1.0229   -0.40   0.6883
kneebreadth    1.7580     0.7250    2.42   0.0183 *
anthro3a       5.7423     5.2075    1.10   0.2745
anthro3b       9.8664     5.6579    1.74   0.0862 .
anthro3c       0.3874     2.0875    0.19   0.8534
anthro4       -6.5744     6.4892   -1.01   0.3150
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.28 on 61 degrees of freedom
Multiple R-squared: 0.923, Adjusted R-squared: 0.912
F-statistic: 81.3 on 9 and 61 DF, p-value: <2e-16

The matrix C, which defines the experimental questions of interest, is essentially the identity matrix, except for the intercept, which is omitted. This reflects our interest in assessing each individual variable:

R> K <- cbind(0, diag(length(coef(bodyfat.lm)) - 1))
R> rownames(K) <- names(coef(bodyfat.lm))[-1]
R> K

             [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
age             0    1    0    0    0    0    0    0    0     0
waistcirc       0    0    1    0    0    0    0    0    0     0
hipcirc         0    0    0    1    0    0    0    0    0     0
elbowbreadth    0    0    0    0    1    0    0    0    0     0
kneebreadth     0    0    0    0    0    1    0    0    0     0
anthro3a        0    0    0    0    0    0    1    0    0     0
anthro3b        0    0    0    0    0    0    0    1    0     0
anthro3c        0    0    0    0    0    0    0    0    1     0
anthro4         0    0    0    0    0    0    0    0    0     1

Once the matrix C is defined, it can be used to set up the multiple comparison problem using the glht function in multcomp,

R> bodyfat.mc <- glht(bodyfat.lm, linfct = K)

Traditionally, one would perform an F test to check if any of the regression coefficients is significant,

R> summary(bodyfat.mc, test = Ftest())

General Linear Hypotheses

Linear Hypotheses:
                  Estimate
age == 0            0.0200
waistcirc == 0      0.2105
hipcirc == 0        0.3435
elbowbreadth == 0  -0.4124
kneebreadth == 0    1.7580
anthro3a == 0       5.7423

anthro3b == 0       9.8664
anthro3c == 0       0.3874
anthro4 == 0       -6.5744

Global Test:
     F DF1 DF2   Pr(>F)
1 81.3   9  61 1.39e-30

As seen from the output, the F test is highly significant. However, because the F test is an omnibus test, we cannot assess which covariates significantly deviate from 0 while controlling the familywise error rate. Calculating the individual adjusted p-values provides the necessary information,

R> summary(bodyfat.mc)

Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = DEXfat ~ ., data = bodyfat)

Linear Hypotheses:
                  Estimate Std. Error t value Pr(>|t|)
age == 0            0.0200     0.0322    0.62    0.996
waistcirc == 0      0.2105     0.0671    3.13    0.022 *
hipcirc == 0        0.3435     0.0804    4.27   <0.001 ***
elbowbreadth == 0  -0.4124     1.0229   -0.40    1.000
kneebreadth == 0    1.7580     0.7250    2.42    0.132
anthro3a == 0       5.7423     5.2075    1.10    0.895
anthro3b == 0       9.8664     5.6579    1.74    0.478
anthro3c == 0       0.3874     2.0875    0.19    1.000
anthro4 == 0       -6.5744     6.4892   -1.01    0.930
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

We conclude that only two covariates, waist and hip circumference, seem to be relevant and contribute to the significant F test. Note that this conclusion is drawn while controlling the familywise error rate in the strong sense; see Section 2.1.1 for a description of suitable error rates in multiple hypotheses testing.

Alternatively, MM-estimates (Yohai 1987) can be applied and we can use the results from Section 3.2 to perform suitable multiple comparisons for the robustified linear model. We use the lmrob function from the robustbase package (Rousseeuw, Croux, Todorov, Ruckstuhl, Salibian-Barrera, Verbeke, and Maechler 2009) to fit a robust version of the previous linear model. Interestingly enough, the results coincide rather nicely,

R> vcov.lmrob <- function(object) object$cov
R> summary(glht(lmrob(DEXfat ~ ., data = bodyfat),
+               linfct = K))

Simultaneous Tests for General Linear Hypotheses


Fit: lmrob(formula = DEXfat ~ ., data = bodyfat)

Linear Hypotheses:
                  Estimate Std. Error z value Pr(>|z|)
age == 0            0.0264     0.0193    1.37   0.7242
waistcirc == 0      0.2318     0.0665    3.49   0.0041 **
hipcirc == 0        0.3293     0.0755    4.36   <0.001 ***
elbowbreadth == 0  -0.2549     0.9245   -0.28   1.0000
kneebreadth == 0    0.7738     0.5280    1.47   0.6523
anthro3a == 0       2.0706     3.6597    0.57   0.9974
anthro3b == 0      10.1589     4.7770    2.13   0.2190
anthro3c == 0       1.9133     1.4040    1.36   0.7262
anthro4 == 0       -5.6737     5.5795   -1.02   0.9194
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

4.5 Simultaneous confidence bands for the comparison of two linear regression models

Kleinbaum, Kupper, Muller, and Nizam (1998) considered how systolic blood pressure changes with age for both females and males. From the data they have collected it is clear that the relationship between systolic blood pressure y and age x can be reasonably described by a linear regression model of y on x for both gender groups. A natural question is whether the two linear regression models for females and males are the same. That is, we want to compare the two linear regression models for a continuous range of a covariate x (that is, age in our example). Let

yij = β0i + β1i xij + εij   (4.8)

denote two models, one for females (i = F) and one for males (i = M). Each gender has its own regression parameters, the intercept β0i and the slope β1i. In Equation (4.8), the index j denotes the j-th observation (within gender i = F, M). Finally, we assume independent and normally distributed residuals εij ∼ N(0, σ²). In the matrix notation from Section 3.1, model (4.8) can be written as yi = Xi βi + εi, i = F, M.

The frequently used approach to the problem of comparing two linear regression models is to use an F test for the null hypothesis H : β1 = β2. If H is rejected then the two regression models are deemed different. Otherwise, if H is not rejected, then there is insufficient statistical evidence to conclude the two regression models are different. But no tangible information on the magnitude of the difference between the two models is provided by this hypothesis testing approach, whether H is rejected or not. For instance, in the example from Kleinbaum et al. (1998) the p-value for the F test is less than 0.0001 and so there is a strong statistical indication that the two regression models are

different. But no measurement is provided for the degree of difference between these two regression models.

Following Liu, Jamshidian, Zhang, Bretz, and Han (2007a), we now discuss the construction of a two-sided simultaneous confidence band as a more intuitive alternative to the F test. The null hypothesis H is rejected if the y = 0 line does not lie completely inside the confidence band. The advantage of this confidence band approach over the F test is that it provides information on the magnitude of the difference between the two regression models, whether or not H is rejected. Note that in practice we often have linear regression models that are of interest only over a restricted region of the covariates. In the systolic blood pressure example above, it is natural to exclude negative x (= age) values and perhaps restrict the range for x even further. The part of the confidence band outside this restricted region is useless for inference. It is therefore unnecessary to guarantee the 1 − α simultaneous coverage probability over the entire range of the covariate. Furthermore, inferences deduced from the confidence band outside the restricted region, such as the rejection of H, may not be valid since the assumed model may be wrong outside the restricted region. This calls for the construction of a 1 − α simultaneous confidence band over a restricted region of the covariate. Note that unlike other applications covered in this book, this is an example of inferences for infinite sets of linear functions. Because we test for each single x ∈ [a, b], say, whether the two regression models are the same, we have an infinite number of hypotheses. The simultaneous confidence band to be constructed is of the form

x⊤β1 − x⊤β2 ∈ [ x⊤β̂1 − x⊤β̂2 ± u1−α s √(x⊤∆x) ],

where x = (1, x)⊤, x ∈ [a, b], ∆ = (X1⊤X1)⁻¹ + (X2⊤X2)⁻¹, and s² denotes the pooled variance estimate. The construction of the simultaneous confidence bands depends on the critical value u1−α. In order to ensure a simultaneous coverage probability of at least 1 − α, the critical value u1−α is chosen such that P(tmax ≤ u1−α) = 1 − α, where

tmax = sup_{x ∈ [a, b]} |x⊤[(β̂1 − β1) − (β̂2 − β2)]| / (s √(x⊤∆x)).   (4.9)

Because in the present case the distribution of tmax is not available in closed form, efficient simulation methods have to be used to calculate the critical value u1−α. Liu et al. (2007a), extending earlier methods from Liu, Jamshidian, and Zhang (2004), suggested simulating a large number R, say, of replicates of the random variable tmax from (4.9) and taking the empirical (1 − α)-quantile of the simulated values as the critical value u1−α.

The method of Liu et al. (2007a) is rather complex and requires special software implementations. Here, we approximate the critical value u1−α by discretizing the covariate region [a, b]. That is, instead of considering the supremum over the continuous interval [a, b] in (4.9), we suggest considering the maximum over the equally spaced values a = x1 < x2 < ... < xk = b; see

also Westfall et al. (1999) for a related approach to construct simultaneous confidence bands for a single linear regression model. The resulting critical value should be very close to the correct one when k is large. The advantage of this approach is that one can use the multcomp package for the necessary calculations.

Suppose we fit linear regression models for both genders with the lm function to the sbp data from Kleinbaum et al. (1998),

R> data("sbp", package = "multcomp")
R> sbp.lm <- lm(sbp ~ gender * age, data = sbp)
R> coef(sbp.lm)

     (Intercept)     genderfemale              age genderfemale:age
        110.0385         -12.9614           0.9614          -0.0120

We then define a grid for the covariate region, thereby defining the linear functions, by calling

R> age <- seq(from = 17, to = 47, by = 1)
R> K <- cbind(0, 1, 0, age)
R> rownames(K) <- paste("age", age, sep = "")

Finally, we call the glht function and obtain an approximate 99% simultaneous confidence band with

R> sbp.mc <- glht(sbp.lm, linfct = K)
R> sbp.ci <- confint(sbp.mc, level = 0.99)

Note that the resulting critical value is

R> attr(sbp.ci$confint, "calpha")
[1] 2.97

which is very close to the value u1−α = 2.969 obtained by Liu et al. (2007a) based on R = 1,000,000 simulations using their sophisticated method. The resulting confidence band can be plotted using

R> plot(age, coef(sbp.mc), type = "l", ylim = c(-30, 2))
R> lines(age, sbp.ci$confint[, "upr"])
R> lines(age, sbp.ci$confint[, "lwr"])
R> abline(h = 0, lty = 2)

It follows from Figure 4.12 that H is rejected (at α = 0.01) because the y = 0 line is not included in the band for any x with 20 ≤ x ≤ 47. Furthermore, one can infer from the band that females tend to have significantly lower blood pressure than males between the ages of 20 and 47, because the upper curve of the band lies below the zero line for 20 ≤ x ≤ 47.
This inference cannot be made from the F test, because it does not provide information beyond whether or not two regression models are the same. This illustrates the advantage of using a simultaneous confidence band instead of the F test.
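For completeness, the F test discussed at the beginning of this section can be obtained as a comparison of nested fits; a sketch using the sbp data (the object names are ours):

R> ## Classical F test of H: identical regression lines for both
R> ## genders, via the extra sum of squares from gender-specific
R> ## intercept and slope terms.
R> data("sbp", package = "multcomp")
R> sbp.common <- lm(sbp ~ age, data = sbp)
R> sbp.separate <- lm(sbp ~ gender * age, data = sbp)
R> anova(sbp.common, sbp.separate)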

Figure 4.12 Simultaneous confidence band for the difference of two linear regression models over the observed range 17 ≤ age ≤ 47 for the sbp data (α = 0.01).

We conclude this example by noting that the approximation methods presented here can be extended easily using multcomp to accommodate more than one covariate, construct one-sided simultaneous confidence bands, and derive simultaneous confidence bands for other types of applications, such as a single linear regression model. We refer to Liu (2010) for details on exact simultaneous inference in regression models.

4.6 Multiple comparisons under heteroscedasticity

Various studies have linked alcohol dependence phenotypes to chromosome 4. One candidate gene is NACP (non-amyloid component of plaques), coding for alpha synuclein. Bönsch, Lederer, Reulbach, Hothorn, Kornhuber, and Bleich (2005) found longer alleles of NACP-REP1 in alcohol-dependent patients compared with healthy volunteers. They reported that allele lengths show some association with the expression level of alpha synuclein mRNA in alcohol-dependent subjects (see Figure 4.13). Allele length is measured as a summary score from additive dinucleotide repeat length and categorized into three groups: short (0–4, n = 24 subjects), medium (5–9, n = 58), and long (10–12, n = 15). Here, we are interested in comparing the mean expression level of alpha synuclein mRNA in the three groups of subjects defined by allele length. To start with, we load the dataset,

R> data("alpha", package = "coin")


Figure 4.13 Distribution of expression levels for alpha synuclein mRNA in three groups defined by the NACP-REP1 allele lengths.

fit a simple one-way ANOVA model to the data with the aov function

R> alpha.aov <- aov(elevel ~ alength, data = alpha)

and apply the (single-step) Tukey test using the glht function in multcomp

R> alpha.mc <- glht(alpha.aov, linfct = mcp(alength = "Tukey"))

We therefore let the contrast matrix, which defines the experimental questions of interest, contain all pairwise differences between the three groups and use the mcp(alength = "Tukey") option for the linfct argument; see Section 4.2 for a detailed description of the Tukey test. As explained in Section 3.3.1, the default in R to fit ANOVA and regression models is to use a suitable reparametrization of the parameter vector based on the so-called treatment contrasts. The first group is treated as a control group, with which the other groups are compared. Accordingly, the contrast matrix accounting for the reparametrization is given by

R> alpha.mc$linfct
             (Intercept) alengthmed alengthlong
med - short            0          1           0
long - short           0          0           1
long - med             0         -1           1
attr(,"type")
[1] "Tukey"

Within the framework of model (3.1) we obtain all pairwise comparisons of the mean expression levels through

    ( 0  1  0 )   (    β0   )   ( β2 − β1 )
    ( 0  0  1 ) × ( β2 − β1 ) = ( β3 − β1 ),
    ( 0 −1  1 )   ( β3 − β1 )   ( β3 − β2 )

where β0 denotes the intercept and βi denotes the mean effect of group i = 1, 2, 3. The alpha.mc object also contains information about the estimated linear functions of interest and the associated covariance matrix. These quantities can be retrieved with the coef and vcov methods:

R> coef(alpha.mc)
 med - short long - short   long - med
       0.434        1.189        0.755
R> vcov(alpha.mc)
             med - short long - short long - med
med - short       0.1472        0.104    -0.0431
long - short      0.1041        0.271     0.1666
long - med       -0.0431        0.167     0.2096

The summary and confint methods can be used to compute the summary statistics, including adjusted p-values and simultaneous confidence intervals,

R> confint(alpha.mc)

Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = elevel ~ alength, data = alpha)

Quantile = 2.37
95% family-wise confidence level

Linear Hypotheses:
                  Estimate     lwr    upr
med - short == 0    0.4342 -0.4758 1.3441
long - short == 0   1.1888 -0.0452 2.4227
long - med == 0     0.7546 -0.3314 1.8406

R> summary(alpha.mc)

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = elevel ~ alength, data = alpha)

Linear Hypotheses:
                  Estimate Std. Error t value Pr(>|t|)
med - short == 0     0.434      0.384    1.13    0.492
long - short == 0    1.189      0.520    2.28    0.061 .
long - med == 0      0.755      0.458    1.65    0.227
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

From this output we conclude that there is no significant difference between any combination of the three allele lengths. Looking at Figure 4.13, however, the variance homogeneity assumption is questionable and one might challenge the validity of these results. One may argue that a sandwich estimate is more appropriate in this situation. Based on the results from Section 3.2, we use the sandwich function from the sandwich package (Zeileis 2006), which provides a heteroscedasticity-consistent estimate of the covariance matrix. The vcov argument of glht can be used to specify the alternative estimate,

R> alpha.mc2 <- glht(alpha.aov, linfct = mcp(alength = "Tukey"),
+                    vcov = sandwich)
R> summary(alpha.mc2)

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = elevel ~ alength, data = alpha)

Linear Hypotheses:
                  Estimate Std. Error t value Pr(>|t|)
med - short == 0     0.434      0.424    1.02    0.559
long - short == 0    1.189      0.443    2.68    0.023 *
long - med == 0      0.755      0.318    2.37    0.050 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

Now, having applied the sandwich estimate, the group with long allele lengths is significantly different from the other two groups at the 5% significance level. This result matches previously published study results based on non-parametric analyses. A comparison of the simultaneous confidence intervals based on the ordinary estimate of the covariance matrix and the sandwich estimate is given in Figure 4.14.
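To see what the sandwich estimate buys here, the contrast between a pooled (homoscedastic) standard error and a per-group (heteroscedasticity-consistent) one can be sketched outside R. The data below are invented for illustration and are not the alpha data; the function names are ad hoc.

```python
from statistics import mean

# hypothetical expression levels for three groups with very unequal spread;
# invented numbers, not the alpha data
groups = {
    "short": [0.1, 0.4, -0.2, 0.3, 0.2, 0.0],
    "med":   [0.5, 0.9, 0.2, 0.7, 0.4, 0.6],
    "long":  [2.9, -0.6, 3.1, 0.4, 2.2, -1.0],  # much larger variance
}

def pooled_se_diff(a, b):
    """SE of mean(a) - mean(b) under the homoscedastic one-way ANOVA model."""
    n = sum(len(v) for v in groups.values())
    sse = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    s2 = sse / (n - len(groups))              # pooled error variance
    return (s2 / len(groups[a]) + s2 / len(groups[b])) ** 0.5

def hc_se_diff(a, b):
    """Heteroscedasticity-consistent SE: each group keeps its own variance,
    which is what a sandwich estimator delivers for a cell-means model."""
    def vm(g):
        xs = groups[g]
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1) / len(xs)
    return (vm(a) + vm(b)) ** 0.5

# the pooled SE misstates the uncertainty of comparisons involving "long"
print(round(pooled_se_diff("long", "short"), 3))
print(round(hc_se_diff("long", "short"), 3))
```

The pooled estimate spreads the large variance of the third group over all comparisons, so it understates the standard error of comparisons with that group and overstates it elsewhere, which mirrors why the adjusted p-values change once the sandwich estimate is used.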

4.7 Multiple comparisons in logistic regression models

Salib and Hillier (1997) reported the results of a case-control study to investigate Alzheimer's disease and the smoking behavior of 198 female and male Alzheimer patients and 164 controls. The Alzheimer data shown in Table 4.4 have been reconstructed from Table 4 in Salib and Hillier (1997) and are depicted in Figure 4.15; see also Hothorn, Hornik, van de Wiel, and Zeileis (2006). Originally, the authors were interested in assessing whether there is any association between smoking and Alzheimer's disease (or other types of dementia). After analyzing the study data, they concluded that cigarette smoking is less frequent in men with Alzheimer's disease. In this section we describe how a potential association can be investigated in subgroup analyses using suitable multiple comparison procedures. In what follows we consider a logistic regression model with the two factors gender (male and female) and smoking. Smoking habit was classified into four levels, depending on daily cigarette consumption: no smoking, less than 10, between 10 and 20, and more than 20 cigarettes per day. The response is a binary variable describing the diagnosis of the patient (either suffering or not from Alzheimer's disease). Using the glm function, we fit a logistic regression model that includes both main effects and an interaction effect of smoking and gender,



Figure 4.14 Simultaneous confidence intervals based on the ordinary estimate of the covariance matrix (top) and a sandwich estimate (bottom).

R> data("alzheimer", package = "coin")
R> y <- factor(alzheimer$disease == "Alzheimer",
+             labels = c("other", "Alzheimer"))
R> alzheimer.glm <- glm(y ~ smoking * gender,
+                       data = alzheimer, family = binomial())

and use the summary method associated with the glm function to get a detailed output:

R> summary(alzheimer.glm)

Call:
glm(formula = y ~ smoking * gender, family = binomial(),
    data = alzheimer)

Deviance Residuals:


                         No. of cigarettes daily
                        None   <10   10–20   >20
Female  Alzheimer         91     7      15    21
        Other dementias   55     7      16     9
        Other diagnoses   80     3      25     9

Male    Alzheimer         35     8      15     6
        Other dementias   24     1      17    35
        Other diagnoses   24     2      22    11

Table 4.4 Summary of the alzheimer data.

  Min     1Q Median     3Q    Max
-1.61  -1.02  -0.79   1.31   2.08

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
(Intercept)              -0.3944     0.1356   -2.91  0.00364
smoking<10                0.0377     0.5111    0.07  0.94114
smoking10-20             -0.6111     0.3308   -1.85  0.06473
smoking>20                0.5486     0.3487    1.57  0.11565
genderMale                0.0786     0.2604    0.30  0.76287
smoking<10:genderMale     1.2589     0.8769    1.44  0.15111
smoking10-20:genderMale  -0.0286     0.5012   -0.06  0.95457
smoking>20:genderMale    -2.2696     0.5995   -3.79  0.00015

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 707.90 on 537 degrees of freedom
Residual deviance: 673.55 on 530 degrees of freedom
AIC: 689.5

Number of Fisher Scoring iterations: 4

The negative regression coefficient for males who are heavy smokers indicates that Alzheimer's disease might be less frequent in this group. But the model remains difficult to interpret based on the coefficients and corresponding p-values only. Therefore, we compute simultaneous confidence intervals on the probability scale for the different risk groups. For each factor combination of gender and smoking, the probability of suffering from Alzheimer's disease can be estimated by applying the inverse logit function to the linear predictor from the alzheimer.glm model fit. Using the predict method for generalized linear models is a convenient way to compute these probability estimates.

Figure 4.15 Association between smoking behavior and disease status stratified by gender.

Alternatively, we can set up a matrix C such that, in the notation from Section 3.2,

    1 / (1 + exp(−ϑ̂n))

is the vector of estimated probabilities with simultaneous confidence intervals

 1  ; (4.10)  ˆ 1/2 1 + exp − ϑn − u1−αdiag(D)n  1  .  ˆ 1/2 1 + exp − ϑn + u1−αdiag(D)n In our model, C is given by

R> K
             (Icpt) s<10 s10-20 s>20 gMale s<10:gMale s10-20:gMale s>20:gMale
None:Female       1    0      0    0     0          0            0          0
<10:Female        1    1      0    0     0          0            0          0
10-20:Female      1    0      1    0     0          0            0          0
>20:Female        1    0      0    1     0          0            0          0
None:Male         1    0      0    0     1          0            0          0
<10:Male          1    1      0    0     1          1            0          0
10-20:Male        1    0      1    0     1          0            1          0
>20:Male          1    0      0    1     1          0            0          1

where the rows represent the eight factor combinations of gender and smoking and the columns match the parameter vector. Recall from Section 3.3.1 that the default in R to fit ANOVA and regression models is to use a suitable reparametrization of the parameter vector based on the so-called treatment contrasts. In the alzheimer example, the non-smoking female group None:Female is treated as a common reference group, to which the other groups are compared. For example, to obtain the effect <10:Male of males smoking less than 10 cigarettes, we need to add the female effect s<10 of smoking less than 10 cigarettes, the male gender effect gMale and their interaction effect s<10:gMale. Thus, row 6 in the matrix above has the entries 1 in the associated columns and 0 otherwise. The other seven rows are interpreted similarly. Having set up the matrix C this way, we can call

R> alzheimer.ci <- confint(glht(alzheimer.glm, linfct = K))

where

R> attr(alzheimer.ci$confint, "calpha")
[1] 2.73

gives the critical value from the joint asymptotic multivariate normal distribution. Note, however, that we have eight factor combinations with independent patients. The test statistics are therefore uncorrelated as long as no covariates are included in the analysis. The Šidák approach described in Section 2.3.3 is exact in this situation and the critical value above can also be obtained by calling

R> qnorm((1-(1-0.05)^(1/8))/2, lower.tail = FALSE)
[1] 2.73
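The Šidák computation can be cross-checked without R; the following is a small sketch using only the Python standard library, mirroring the qnorm call above.

```python
from statistics import NormalDist

# Šidák adjustment for m = 8 uncorrelated test statistics at familywise
# level 0.05; this mirrors the qnorm() call shown above
alpha, m = 0.05, 8
alpha_per_test = 1 - (1 - alpha) ** (1 / m)    # per-comparison error rate
calpha = NormalDist().inv_cdf(1 - alpha_per_test / 2)
print(round(calpha, 2))  # 2.73
```

The adjusted critical value 2.73 is only slightly larger than the unadjusted two-sided 5% value of 1.96, yet it controls the familywise error rate exactly for independent statistics.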

This example shows that although multcomp always gives correct results, it can be replaced by more efficient direct approaches in special situations. Finally, we can compute the simultaneous confidence intervals (4.10) and plot them with

R> alzheimer.ci$confint <- apply(alzheimer.ci$confint, 2,
+                                binomial()$linkinv)
R> plot(alzheimer.ci, main = "", xlim = c(0, 1))

see Figure 4.16. Using this graphical display of the results, it becomes evident that Alzheimer's disease is less frequent in men who smoke heavily compared to all other combinations of the two covariates. This result matches the findings from Salib and Hillier (1997). Note that these conclusions are drawn while controlling the familywise error rate in the strong sense.
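The back-transformation performed by binomial()$linkinv can be mimicked in a few lines of Python; the interval values below are invented for illustration. Because the inverse logit is strictly increasing, the endpoints keep their order on the probability scale and no relabeling is needed.

```python
import math

def inv_logit(x):
    """Inverse logit, i.e. what binomial()$linkinv computes in R."""
    return 1.0 / (1.0 + math.exp(-x))

# hypothetical interval on the linear-predictor scale: estimate, lower, upper
est, lwr, upr = -0.39, -0.66, -0.13
p_est, p_lwr, p_upr = (inv_logit(v) for v in (est, lwr, upr))

# inv_logit is strictly increasing, so the endpoints keep their order
print(round(p_lwr, 3), round(p_est, 3), round(p_upr, 3))
```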


Figure 4.16 Simultaneous confidence intervals for the probability of suffering from Alzheimer’s disease.

4.8 Multiple comparisons in survival models

The treatment of patients suffering from acute myeloid leukemia (AML) is determined by a tumor classification scheme which takes various cytogenetic aberration statuses into account. Bullinger, Döhner, Bair, Fröhlich, Schlenk, Tibshirani, Döhner, and Pollack (2004) investigated an extended tumor classification scheme incorporating molecular subgroups of the disease obtained by gene expression profiling. The analyses reported here are based on clinical data (thus omitting available gene expression data) published online at www.ncbi.nlm.nih.gov/geo, accession number GSE425. The overall survival time and censoring indicator as well as the clinical variables age, sex, lactic dehydrogenase level (LDH), white blood cell count (WBC), and treatment group are taken from Supplementary Table 1 in Bullinger et al. (2004). In addition, this table provides two molecular markers, the fms-like tyrosine kinase 3 (FLT3) and the mixed-lineage leukemia (MLL) gene, as well as cytogenetic information helpful in defining a risk score (low: karyotype t(8;21), t(15;17) and inv(16); intermediate: normal karyotype and t(9;11); and high: all other forms). One interesting question focuses on the usefulness of this risk score. Using the survreg function from the survival package, we fit a Weibull survival model that includes all above mentioned covariates as well as their interactions with the patient's gender. We use the results from Chapter 3 to apply a suitable multiple comparison procedure, which accounts for the asymptotic correlations between the test statistics. We compare the three groups low, intermediate, and high using Tukey's all pairwise comparisons method described in Section 4.2. The output below indicates a difference when comparing high to both low and intermediate.
The output also shows that low and intermediate are indistinguishable,

R> library("survival")
R> aml.surv <- survreg(Surv(time, event) ~ Sex +
+     Age + WBC + LDH + FLT3 + risk,
+     data = clinical)
R> summary(glht(aml.surv, linfct = mcp(risk = "Tukey")))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: survreg(formula = Surv(time, event) ~ Sex + Age + WBC + LDH + FLT3 + risk, data = clinical)

Linear Hypotheses:
                         Estimate Std. Error z value Pr(>|z|)
intermediate - high == 0    1.110      0.385    2.88   0.0108
low - high == 0             1.477      0.458    3.22   0.0036
low - intermediate == 0     0.367      0.430    0.85   0.6692
(Adjusted p values reported -- single-step method)
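With treatment contrasts, the third comparison is a linear function of the other two, and its standard error follows from var(c'ϑ) = c'Σc. The sketch below reuses the two estimates from the output above; the correlation between them is an assumed value for illustration only (the fit's actual covariance is not reproduced here).

```python
# (low - intermediate) = (low - high) - (intermediate - high),
# with variance var(c'theta) = c' Sigma c for c = (−1, 1).
est_int_high, se_int_high = 1.110, 0.385   # from the output above
est_low_high, se_low_high = 1.477, 0.458   # from the output above
rho = 0.5                                  # assumed correlation, illustrative

diff = est_low_high - est_int_high
cov = rho * se_int_high * se_low_high
se_diff = (se_low_high ** 2 + se_int_high ** 2 - 2 * cov) ** 0.5
print(round(diff, 3))                      # 0.367, matching the output above
print(round(se_diff, 3))
```

Because the two contrast estimates share the "high" reference group, they are positively correlated, which is why the standard error of their difference is smaller than the sum of the individual standard errors.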

4.9 Multiple comparisons in mixed-effects models

In most parts of Germany, the natural or artificial regeneration of forests is difficult because of intensive animal browsing. Young trees suffer from browsing damage, mostly by roe and red deer. In order to estimate the browsing intensity for several tree species, the Bavarian State Ministry of Agriculture and Forestry conducts a survey every three years. Based on the estimated percentage of damaged trees, the ministry makes suggestions for implementing or modifying deer management programs. The survey takes place in all 756 game management districts in Bavaria. Here, we focus on the 2006 data of the game management district number 513 "Hegegemeinschaft Unterer Aischgrund",

R> data("trees513", package = "multcomp")

The data of 2700 trees include the species and a binary variable indicating whether or not the tree suffers from damage caused by deer browsing. Figure 4.17 displays the layout of the field experiment. A pre-specified, equally spaced grid was laid over the district. Within each of the resulting 36 rectangular areas, a 100m transect was identified on which 5 equidistant plots were determined (that is, a plot after every 25m). At each plot, 15 trees were observed, resulting in 75 trees for each area, or 2700 trees in total.


Figure 4.17 Layout of the trees513 field experiment. Each open dot denotes a plot with 15 trees. Further explanations are provided in the text.

We are interested in the probability estimates and the confidence intervals for each of six tree species under investigation. We fit a mixed-effects logistic regression model using the lmer function from the lme4 package (Bates and Sarkar 2010). We exclude the intercept but include random effects accounting for the spatial variation of the trees. Thus, we have six fixed parameters, one for each of the six tree species, and C = diag(6) is the matrix defining our experimental questions,

R> trees513.lme <- lmer(damage ~ species - 1 + (1 | lattice/plot),
+                       data = trees513, family = binomial())
R> K <- diag(length(fixef(trees513.lme)))

We first compute simultaneous confidence intervals for the linear functions of interest and then transform these into the required probabilities,

R> trees513.ci <- confint(glht(trees513.lme, linfct = K))
R> prob <- binomial()$linkinv(trees513.ci$confint)
R> trees513.ci$confint <- 1 - prob
R> trees513.ci$confint[, 2:3] <- trees513.ci$confint[, 3:2]

The results are shown in Figure 4.18. Browsing is less frequent in hardwood. In particular, small oak trees are at high risk. As a consequence, the ministry has decided to increase the harvest of roe deer in the following years. Relatively small sample sizes caused the large confidence intervals for ash, maple, elm and lime trees.
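The 1 - prob step and the swap of the two interval columns in the code above can be seen with a tiny sketch; inv_logit and the interval values below are illustrative only, not taken from the trees513 fit.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

# hypothetical interval for one species on the linear-predictor scale,
# where the model parametrizes the probability of NO damage
est, lwr, upr = 1.2, 0.8, 1.6

# damage probability is 1 - inv_logit(theta), a strictly DECREASING map,
# so after transforming, the former upper limit becomes the lower one
p = [1.0 - inv_logit(v) for v in (est, lwr, upr)]
p_est, p_upr, p_lwr = p                 # relabel: the limits swap places
print(round(p_lwr, 3), round(p_est, 3), round(p_upr, 3))
```

Without the swap, the "lower" bound would exceed the "upper" bound on the probability scale, which is exactly what the column exchange in the R code corrects.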

[Figure 4.18 displays the intervals for spruce (119), pine (823), beech (266), oak (1258), ash/maple/elm (30), and hardwood (191).]

Figure 4.18 Probability of roe deer browsing damage for six tree species. Sample sizes are given in brackets.

CHAPTER 5

Further Topics

In this chapter we review a selection of multiple comparison problems, which do not quite fit into the framework of Chapters 3 and 4. The selection reflects a biased choice from a large number of possible topics and was partly driven by the availability of software implementations in R. In Section 5.1 we discuss the comparison of means with two-sample multivariate data using resampling-based methods. In Section 5.2 we review group sequential and adaptive designs, which allow for repeated significance testing while controlling the Type I error rate. Finally, in Section 5.3 we extend the ideas from Section 4.3 and describe a hybrid methodology combining multiple comparisons with modeling techniques for dose finding experiments. Because the individual sections in this chapter are not directly linked to previous parts of this book, separate notation is introduced where needed.

5.1 Resampling-based multiple comparison procedures

Resampling-based multiple comparison procedures have been investigated systematically since Westfall and Young (1993) and Troendle (1995); see also Dudoit and van der Laan (2008) for a description of such multiple comparison procedures relevant to genomics. In this section we discuss in detail the comparison of means with two-sample multivariate data using resampling-based methods. The general case of multi-sample multivariate data is not considered here and we refer to Westfall and Troendle (2008) for a recent discussion. In Section 5.1.1, we describe the technical details underlying permutation-based multiple test procedures for two-sample multivariate data. To illustrate the methodology, we analyze two datasets in Section 5.1.2 using the coin package (Hothorn, Hornik, van de Wiel, and Zeileis 2010b), which allows us to use R for permutation multiple testing. In Section 5.1.3 we briefly review related bootstrap-based multiple test procedures.

5.1.1 Permutation methods

General considerations

With the closure principle from Section 2.2.3 in place, permutation-based multiple test procedures are often quite simple. All that is required is a permutation test for every intersection null hypothesis. Advantages of permutation tests are


(i) the resulting multiple test inference procedure is exact, despite non-normal characteristics of the data, and despite unknown covariance structures;
(ii) the resulting multiple comparison procedures incorporate correlations, like the max-t-based methods, to become less conservative than typical Bonferroni-based procedures;
(iii) in cases with sparse data, permutation-based methods offer quantum improvements over standard methods, in the sense that the effective number of tests m0 can be much less than the actual number of tests m.

Coupled with (i) and (ii), the final point (iii) suggests a major benefit. Assume, for example, that m = 30 tests are performed for a specific analysis. Applying the simple-minded Bonferroni test leads to a critical value at level 0.05/30 = 0.0017. But using permutation tests the effective number of tests can be substantially lower. If the data are sparse it might be as small as m0 = 4 (or even smaller, as seen in the example from Table 5.1). This gives a Bonferroni critical value at level 0.05/4 = 0.0125, thus leading to considerably higher power. A further benefit is that permutation-based methods use the correlation structure, so that the actual critical value will be higher still than the Bonferroni-based 0.05/4 = 0.0125, as high as 0.05 itself (that is, no adjustment at all), depending on the nature of the dependence structure. To understand the key concepts, it helps to go through an example, step-by-step. The dataset displayed in Table 5.1 is a mock-up adverse event (side effects) dataset from a safety analysis for a new drug, with three possible events E1, E2, and E3. There are four total adverse events in the treatment group for E1, and none in the control group. There is only one total adverse event observed for E2 and E3.

Group   E1   E2   E3
Trt      1    1    1
Trt      1    0    0
Trt      1    0    0
Trt      1    0    0
Ctrl     0    0    0
Ctrl     0    0    0
Ctrl     0    0    0

Table 5.1 Mock-up dataset from a safety analysis for a new drug. The entry “1” denotes an observed adverse event; otherwise “0” is used.

Typically for two-sample binary data permutation tests, the test statistic is the total number of occurrences in the treatment group, and the one-sided p-value is the proportion of permutations yielding a total greater than or equal to the observed total. This is known as Fisher's exact test. In the example dataset

from Table 5.1, there are 7! possible permutations of the observations, but many of these are redundant. For example, the permutation that switches observations 1 and 2 (i.e., the first two rows), but leaves the remaining observations unchanged, is redundant with the original dataset, since the same observations are in the treatment group as before, and the test statistic cannot change. There are 7!/(4!3!) = 35 non-redundant permutations of the observations. For E1, only one of these permutations yields 4 occurrences in the treated group, hence the Fisher exact upper-tailed p-value is 1/35 = 0.0286. For E2 and E3, the single observed occurrence can take place in the treatment group for 4/7 = 0.5714 of the permutations, hence the upper-tailed p-values for E2 and E3 are 0.5714. To implement the closed test procedure to judge the significance of the tests when the familywise error rate is controlled, we need to specify suitable tests for all intersection hypotheses. Figure 5.1 shows all the intersection hypotheses schematically. The null hypotheses are

    H_I : F_I^T = F_I^C,  I ⊆ {1, 2, 3},

which state that the joint distributions F_I^T and F_I^C of the measurements in the set I are identical for the treated and control groups, respectively. Each of these hypotheses can be tested "exactly" by

(i) selecting a statistic t_I to test H_I;
(ii) enumerating the permutations of the treatment/control labels, and finding the value of t*_I for each permutation;
(iii) rejecting H_I at level α if the proportion of permutation samples yielding t*_I ≥ t_I is less than or equal to α.

The reason that "exactly" is used in quotes is that the familywise error rate may be strictly less than α due to discreteness of the permutation distribution of t_I. However, the familywise error rate is guaranteed to be no more than α. Another reason for quotes around the term "exactly" is that with a large sample size, typically the permutation distribution must be randomly sampled rather than enumerated completely, and in such cases the method is not "exact", even though the simulation error can be reduced with a sufficiently large number of random permutations. With these caveats, the method controls the familywise error rate in the strong sense, with no approximation, provided that the true null state of nature is one of the states shown in the closure tree of Figure 5.1.

A comment is in order concerning the statement, "provided that the true null state of nature is one of the states shown in the closure tree." This statement is in fact an assumption needed for permutation tests based on the closure principle. It is possible, for example, that the marginal distributions are equal but the joint distributions are not, and the methods discussed in this chapter do not control the familywise error rate under this scenario. On the other hand, if the joint distributions are not equal for some subsets, then it is a fact that the treatment differs from the control, so the determination of some significant difference in that case is correct, even though some specific determinations regarding marginal significances might be erroneous. An example where this assumption fails is where the treatment has no effect on the marginal rates of occurrences of adverse events E1 and E2, but does affect the joint distribution of the occurrences. This assumption is called the joint exchangeability assumption by Westfall and Troendle (2008).

                            H123: F_123^T = F_123^C
                           /            |            \
    H12: F_12^T = F_12^C     H13: F_13^T = F_13^C     H23: F_23^T = F_23^C
                           \            |            /
      H1: F_1^T = F_1^C       H2: F_2^T = F_2^C       H3: F_3^T = F_3^C

Figure 5.1 Schematic diagram of the closed hypotheses set for the adverse event data.

In the following we continue the discussion in terms of min-p tests instead of the max-t tests introduced in Equation (2.1) and used throughout this book. The results are equivalent in standard parametric applications when one takes t_i = F_i^(−1)(1 − p_i), where F_i denotes the distribution of the test statistic t_i. The explanation that follows differs slightly in that permutation-based p-values are used.

In the adverse events example above, the hypothesis H123: F_123^T = F_123^C is tested using the min-p statistic t123 = min{p1, p2, p3}. Note that for each permutation of the dataset, different p-values are obtained; call p*_i the p-value for a random permutation. The global p-value for testing H123 is then the proportion of the 7!/(4!3!) = 35 permutations of the dataset (where the rows are kept intact and only the treatment/control labels are permuted) for which min{p*_1, p*_2, p*_3} is less than or equal to min{p1, p2, p3}. In this example min{p1, p2, p3} = min{0.0286, 0.5714, 0.5714} = 0.0286. Here is where the advantage of permutation testing for sparse data becomes evident. Notice that the possible p-values p*_1 for E1 are either 1.0000, 0.8857, 0.3714, or 0.0286, corresponding to 1, 2, 3, or 4 occurrences in the treatment

group, respectively. For E2 and E3, the possible p*_i are either 1.0000 or 0.5714, corresponding to 0 or 1 occurrences in the treatment group, respectively. Thus, it is impossible for p*_i, i = 2, 3, to be smaller than 0.0286, hence the p-value for H123 is p123 = P(min{p*_1, p*_2, p*_3} ≤ 0.0286) = P(p*_1 ≤ 0.0286) = 0.0286. Similarly, p12 = 0.0286, p13 = 0.0286, and of course p1 = 0.0286. Hence, the exact multiplicity-adjusted p-value for testing E1 is identical to the unadjusted p-value, due to the sparseness of the data in the E2 and E3 variables. Even when the data are not sparse, there are still benefits from permutation tests. To illustrate, consider the 2 × 2 contingency tables in Table 5.2 that show the pattern of occurrences for two variables E1 and E2. The Fisher exact upper-tailed p-value for E1 is p1 = 0.0261, and that for E2 is p2 = 0.0790. The standard Holm and Hochberg methods fail to find significances at level α = 0.05, with adjusted p-values q1 = 0.0523 and q2 = 0.0790.
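The counting argument for the sparse dataset of Table 5.1 can be cross-checked outside R with a few lines of Python; the data are the seven rows of Table 5.1, and everything else (function names, structure) is just illustration.

```python
from itertools import combinations
from math import comb

# event indicators for the 7 subjects of Table 5.1; rows stay intact and
# only the 4 treatment labels move; E3 is identical to E2 and is omitted
e1 = [1, 1, 1, 1, 0, 0, 0]
e2 = [1, 0, 0, 0, 0, 0, 0]
assigns = list(combinations(range(7), 4))      # 35 label assignments
assert len(assigns) == comb(7, 4) == 35

def perm_p(events, idx):
    """Upper-tailed permutation p-value of the treatment total for 'idx'."""
    stat = sum(events[i] for i in idx)
    return sum(1 for j in assigns
               if sum(events[i] for i in j) >= stat) / len(assigns)

obs = tuple(range(4))                          # the original treatment labels
print(round(perm_p(e1, obs), 4))               # 1/35  = 0.0286
print(round(perm_p(e2, obs), 4))               # 20/35 = 0.5714

# the attainable p-values p*_1 for E1, as listed in the text
print(sorted({round(perm_p(e1, idx), 4) for idx in assigns}))
# [0.0286, 0.3714, 0.8857, 1.0]

# min-p test of H123: no p*_2 (or p*_3) can fall below 0.0286, so the
# adjusted p-value collapses to the unadjusted E1 p-value
p123 = sum(1 for idx in assigns
           if min(perm_p(e1, idx), perm_p(e2, idx)) <= perm_p(e1, obs)) \
       / len(assigns)
print(round(p123, 4))                          # 0.0286
```

The enumeration makes the "effective number of tests" idea concrete: E2 and E3 cannot produce a p-value below 0.0286 under any relabeling, so they contribute nothing to the multiplicity adjustment for E1.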

                 E1                            E2
       Event  Non-Event  Total      Event  Non-Event  Total
Trt        7         43     50          8         43     51
Ctrl       1         51     52          3         52     55
Total      8         94    102         11         95    106

Table 5.2 Two contingency tables for two types of adverse events E1 (left) and E2 (right). Differing sample sizes are due to missing values.

However, using permutation tests within the closure paradigm we obtain q1 ≤ 0.0451 and q2 = 0.0790. To see why, examine the graphs in Figure 5.2 of the distribution of the possible number of treatment occurrences of E1 and E2, based on the hypergeometric distributions H(x, 50, 8, 102) and H(x, 51, 11, 106) when the total number of events is 8 and 11, respectively. From these distributions, the tail probabilities give the possible Fisher upper-tailed p-values for each of the two tests; see Table 5.3. The p-value for H12 is P(min{p*_1, p*_2} ≤ 0.0261). But

    P(min{p*_1, p*_2} ≤ 0.0261) ≤ P(p*_1 ≤ 0.0261) + P(p*_2 ≤ 0.0261)
                                = 0.0261 + 0.0190
                                = 0.0451,

where, from Table 5.3, 0.0190 is the probability that p*_2 is less than or equal to 0.0261, and where the inequality in the first line follows from the Bonferroni inequality (2.3). Thus, the adjusted p-value for H1 is max{P(min{p*_1, p*_2} ≤ 0.0261), P(p*_1 ≤ 0.0261)} ≤ 0.0451. Permutation tests take advantage of discreteness of the permutation distributions to reduce the p-values and hence improve power.
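The hypergeometric tail probabilities behind this bound can be recomputed directly; the sketch below uses only the Python standard library, with the H(x, n, K, N) parametrization of the text.

```python
from math import comb

def hyper_tail(x, n, K, N):
    """P(X >= x) where X counts how many of the K total events fall among
    the n treated subjects out of N (the H(x, n, K, N) of the text)."""
    return sum(comb(K, k) * comb(N - K, n - k)
               for k in range(x, K + 1)) / comb(N, n)

p1 = hyper_tail(7, 50, 8, 102)         # observed E1 p-value
p2_reach = hyper_tail(9, 51, 11, 106)  # P(p*_2 <= 0.0261) = P(X >= 9)
bound = p1 + p2_reach                  # Bonferroni bound on the min-p p-value
print(round(p1, 4), round(p2_reach, 4), round(bound, 4))
```

Only three of the twelve attainable E2 p-values (those for 9, 10, or 11 treatment events) can undercut 0.0261, which is why the effective Bonferroni penalty is 0.0190 rather than a full second test's worth.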


Figure 5.2 Histograms of the hypergeometric distributions H(x, 50, 8, 102) and H(x, 51, 11, 106) for the adverse event data in Table 5.2.

The permutation multiple test algorithm

Recently, there has been great interest in high-dimensional multiple testing, where the number of variables far exceeds the sample size. Gene expression is a prototype application, but the applications are much broader. Permutation-based methods have become popular for "-omics" applications because they (i) require fewer assumptions (such as normality) about the data-generating process, thereby yielding procedures that are more robust, and (ii) utilize data-based distributional characteristics, such as discreteness and correlation structure, to make tests more powerful. It might seem from Figure 5.1, which shows the closure tree for three variables, that the closure method would become difficult to compute for large m, since the number of nodes in the tree is in general 2^m − 1. However, it turns out that the method scales up, computationally, to very large m, provided one


            E1                      E2
Trt Events   p-value    Trt Events   p-value
         0   1.00                0   1.00
         1   0.9966              1   0.9996
         2   0.9661              2   0.9942
         3   0.8523              3   0.9650
         4   0.6201              4   0.8738
         5   0.3357              5   0.6914
         6   0.1222              6   0.4465
         7   0.0261              7   0.2211
         8   0.0025              8   0.0790
         -                       9   0.0190
         -                      10   0.0027
         -                      11   0.0002

Table 5.3 Upper-tailed p-values for the adverse event data in Table 5.2.

can make two simplifying assumptions that drastically reduce the computing burden from O(2^m) to O(m). These concepts are described here briefly, but see Westfall and Troendle (2008) for further details. The first simplifying assumption, one made throughout this book, is that each intersection hypothesis H_I is tested using either a min-p statistic min_{i∈I} p_i or a max-t statistic max_{i∈I} t_i; see also Equation (2.1). The second is the "subset pivotality" assumption coined by Westfall and Young (1993), which states that the distribution of max_{i∈I} t_i is identical under H_I and H = H_M for all I ⊆ M = {1, . . . , m}. If the first assumption is met, one need only test the m hypotheses corresponding to the ordered t_i rather than all 2^m − 1 intersections. In addition, if the second assumption is met, permutation resampling can be done simultaneously for all m tests using the global null hypothesis H, rather than performing permutation resampling separately for each of the m intersection hypotheses corresponding to the ordered test statistics. The subset pivotality assumption fails notably when testing pairwise comparisons of distributions in the case of multiple treatment groups. In this case, the global permutation distribution, which assumes exchangeability among all treatment groups, cannot be used to test for exchangeability between two of the treatment groups, as differences in variability among the treatment groups can cause bias in the statistical tests. There are several ways that one might perform permutation tests correctly in this case; see Petrondas and Gabriel (1983) for one solution.

The subset pivotality assumption is mainly used to simplify computations in the closed test algorithm. Along with the use of max-t statistics, it provides a shortcut procedure that allows the researcher to test the m hypotheses corresponding to the ordered p-values, rather than all 2^m − 1 intersection hypotheses. If it is possible to test all 2^m − 1 hypotheses using valid permutation tests, then subset pivotality is not needed for multiple testing using permutation tests. Note that shortcut procedures can also be derived under assumptions other than subset pivotality; see Calian, Li, and Hsu (2008) for a recent discussion.

Suppose the observed test statistics are t_1^obs ≥ . . . ≥ t_m^obs, corresponding to hypotheses H_1, . . . , H_m, and that larger t_i^obs suggest alternative hypotheses. Suppose a p-value for testing H_I using the statistic max_{i∈I} t_i is available. Then,

    p_I = P(max_{i∈I} t_i ≥ max_{i∈I} t_i^obs | H_I),

and H_I is rejected at unadjusted level α if p_I ≤ α. The closure method, along with the assumptions of use of max-t tests and of subset pivotality, gives the following algorithm for rejecting sequentially the ordered hypotheses H_1, H_2, . . .

Step 1: By closure,

    reject H_1 if max_{I : I⊇{1}} P(max_{i∈I} t_i ≥ max_{i∈I} t_i^obs | H_I) ≤ α.

But if I ⊇ {1}, then max_{i∈I} t_i^obs = t_1^obs and the rule is

    reject H_1 if max_{I : I⊇{1}} P(max_{i∈I} t_i ≥ t_1^obs | H_I) ≤ α.

Using subset pivotality, the rule becomes

    reject H_1 if max_{I : I⊇{1}} P(max_{i∈I} t_i ≥ t_1^obs | H) ≤ α.

Use of the max-t statistic implies

    P(max_{i∈I} t_i ≥ t_1^obs | H) ≤ P(max_{i∈J} t_i ≥ t_1^obs | H) for I ⊆ J.

Hence, by subset pivotality and by use of the max-t statistic, the rule by which we reject H_1 simplifies to

    reject H_1 if P(max_{i∈M} t_i ≥ t_1^obs | H) ≤ α.

Step 2: Again by closure and subset pivotality,

    reject H_2 if max_{I : I⊇{2}} P(max_{i∈I} t_i ≥ max_{i∈I} t_i^obs | H) ≤ α.

If I ⊇ {1}, then max_{i∈I} t_i^obs = t_1^obs; else max_{i∈I} t_i^obs = t_2^obs. Partitioning the set {I : I ⊇ {2}} into two sets,

    S_1 = {I : I ⊇ {1, 2}} and S_2 = {I : I ⊇ {2}, I ⊉ {1}},

we require

    P(max_{i∈I} t_i ≥ t_1^obs | H) ≤ α for all I ∈ S_1

and

    P(max_{i∈I} t_i ≥ t_2^obs | H) ≤ α for all I ∈ S_2.

Since we are using the max-t statistic, these conditions allow us to reject H_2 if

    P(max_{i∈M} t_i ≥ t_1^obs | H) ≤ α

and

    P(max_{i∈{2,...,m}} t_i ≥ t_2^obs | H) ≤ α.

. . .

Step j: Continuing in this fashion, we reject H_j if

    P(max_{i∈M} t_i ≥ t_1^obs | H) ≤ α,

and

    P(max_{i∈{2,...,m}} t_i ≥ t_2^obs | H) ≤ α,

. . . , and

    P(max_{i∈{j,...,m}} t_i ≥ t_j^obs | H) ≤ α.

Note that at step j in the algorithm we can equivalently reject H_j if max_{i≤j} p_{i···m} ≤ α, where p_{i···m} = P(max_{l∈{i,...,m}} t_l ≥ t_i^obs | H). Hence, the rejection rule reduces to

    reject H_j if q_j ≤ α,

where q_j = max_{i≤j} p_{i···m} denotes the adjusted p-value. One need not use resampling at all to apply the algorithm above. It only becomes a resampling-based procedure if one uses resampling to obtain the probabilities P(max_{i∈{j,...,m}} t_i ≥ t_j^obs | H). In Section 4.1.2 we presented a similar algorithm, where the necessary probabilities are computed from the multivariate normal or t distribution. In addition to subset pivotality and use of max-t statistics, we also assumed that there are no logical constraints among the hypotheses (recall Section 2.1.2 for the definition of the restricted combination property). This assumption is needed to assert that the algorithm is identical to closed testing using max-t statistics. If there are logical constraints, power can be improved by restricting attention only to admissible subsets I. In this case, shortcut procedures are more difficult to derive; see Brannath and Bretz (2010) for details. The algorithm above can still be used for logically constrained hypotheses, though, despite not being fully closed, as it provides a conservative procedure relative to the fully closed algorithm.
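Under these assumptions, the step-down algorithm can be sketched in a few lines of R. The function below is our own illustration (not part of multcomp or coin); it takes the observed statistics and a matrix of statistics recomputed under permutations of the data under the global null hypothesis H:

```r
# Step-down max-t adjusted p-values (illustrative sketch).
# t_obs:  vector of m observed test statistics
# t_perm: B x m matrix, row b holding the m statistics from the b-th
#         permutation of the data under the global null H
maxt_stepdown <- function(t_obs, t_perm) {
  m <- length(t_obs)
  ord <- order(t_obs, decreasing = TRUE)    # order hypotheses by t^obs
  t_srt <- t_obs[ord]
  t_perm <- t_perm[, ord, drop = FALSE]
  p <- numeric(m)
  for (j in 1:m) {
    # p_{j...m}: tail probability of the max over the remaining hypotheses
    maxj <- apply(t_perm[, j:m, drop = FALSE], 1, max)
    p[j] <- mean(maxj >= t_srt[j])
  }
  q <- cummax(p)                            # q_j = max_{i <= j} p_{i...m}
  q[order(ord)]                             # back to the original order
}
```

Note that only m tail probabilities are computed, one per "remaining" set {j, . . . , m}, which is the O(m) shortcut described above; cummax enforces the monotonicity of the adjusted p-values.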

Comments on subset pivotality

Subset pivotality is most problematic in cases where there are multiple (i.e., m > 2) groups, especially when there are drastic differences in variability between the groups. Specifically, suppose the data are univariate, in three groups i = 1, 2, 3. Assume further that we are interested in the pairwise equality hypotheses H_ij : F_i = F_j, tested using permutation distributions of the test statistics t_ij = x̄_i − x̄_j. Subset pivotality requires, for example, that the permutation distribution of x̄_1 − x̄_2 is identical, no matter whether one permutes the data between groups 1 and 2 alone, excluding group 3, or permutes the entire dataset. In the latter case, data from all three groups are involved; in the former case, only data from groups 1 and 2 are involved. When data from group 3 differ dramatically, particularly in the variance, the global permutation distribution can be far different from the local distribution, and provide very bad results. The problem is nearly identical to the use of a pooled variance in linear model contrasts when the variances are quite different. An example similar to one found in Westfall et al. (1999) illustrates the problem. Suppose there are four treatment groups A, B, C, and D with the outcomes summarized in Table 5.4. The goal is to compare the three treatment groups B, C, and D with A, using permutation tests. The unadjusted upper-tailed Fisher exact p-values are pBA = 0.9041, pCA = 0.0297, and pDA = 0.1811. If one uses the algorithm above blindly, without considering the subset pivotality assumption, then the adjusted p-value for comparing C with A would be qCA = P(min_i p_i ≤ 0.0297 | HABCD) = 0.0049, which is smaller than the unadjusted p-value.
The problem is caused by the large amount of data in the B group with small variance; pooling this data with all other groups and permuting similarly drives down the variability, inappropriately, for the individual comparisons.

Group   Events   Sample Size
A            1            50
B            5           500
C            7            50
D            4            50

Table 5.4 Mock-up dataset with four treatment groups.

There are various fixes to the problem. One is to fit a binomial model allowing different variances for the groups, and proceed with the multiple comparisons in an approximate (asymptotic) fashion as described in Section 3.2. If exact inference is desired, it appears necessary to use the full closure. For example, to compare C with A, one needs the exact p-value for HAC, which is pAC = 0.0297, as well as those for the intersection hypotheses HABC, HACD, and HABCD. The p-value for HABCD is shown above as pABCD = 0.0049. Deleting groups B and D successively and recomputing, the remaining intersection p-values are pACD = 0.0296 and pABC = 0.0012. Hence, the correct, exact adjusted p-value is qCA = max{0.0297, 0.0296, 0.0012, 0.0049} = 0.0297. Once again, we see the fortunate case where the adjusted and unadjusted p-values do not differ when exact permutation-based tests are used.
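For reference, the unadjusted Fisher p-values quoted above can be recomputed from Table 5.4; the 2 × 2 layout passed to fisher.test, and the helper function name, are our reconstruction:

```r
# Upper-tailed Fisher exact p-value comparing a treatment group
# (e1 events out of n1) with control A (e0 events out of n0).
fisher_up <- function(e1, n1, e0, n0) {
  tab <- matrix(c(e1, n1 - e1, e0, n0 - e0), nrow = 2)
  fisher.test(tab, alternative = "greater")$p.value
}
fisher_up(5, 500, 1, 50)   # B vs A, quoted above as 0.9041
fisher_up(7, 50, 1, 50)    # C vs A, quoted above as 0.0297
fisher_up(4, 50, 1, 50)    # D vs A, quoted above as 0.1811
```

The intersection p-values of the full closure additionally require the joint permutation distribution of the minimum p-value over the comparisons retained in each subset, as sketched in the algorithm of Section 5.1.1.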

5.1.2 Using R for permutation multiple testing

R software for resampling-based multiple testing includes the multtest package (Pollard, Gilbert, Ge, Taylor, and Dudoit 2010); see Dudoit and van der Laan (2008) for mathematical details. Here, we analyze adverse event and multiple endpoint data with the coin package in R, which uses slightly different test statistics than discussed in Section 5.1.1. Suppose that we are given observations (y_i, x_i), where y_i denotes the individual, possibly multivariate outcome and x_i denotes a binary treatment indicator, i = 1, . . . , n. The multivariate linear test statistic

    t = vec( Σ_{i=1}^n I(x_i = 1) y_i^T )

is a vector containing the sum over all observations within each treatment group. Strasser and Weber (1999) derived the conditional expectation µ and covariance matrix Σ under the null hypothesis of independence between y and x; see also Hothorn et al. (2006). Thus, we can use the standardized test statistic

    (t − µ) / diag(Σ)^{1/2}

and therefore obtain one statistic for each of the multiple outcome variables in y_i. For binary responses, this corresponds (up to a factor (n − 1)/n) to the root of the χ² test statistic in a two-by-two contingency table. For continuous responses, this is roughly equivalent to a t statistic.
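For a single binary outcome, this standardization can be carried out by hand. The toy data and object names below are ours; coin performs the same computation for all outcomes simultaneously:

```r
# Toy computation of the standardized linear statistic for one binary
# outcome y and a binary treatment indicator x (illustrative sketch).
x <- rep(c(0, 1), each = 10)                       # treatment indicator
y <- c(0,1,0,0,1,0,1,0,0,0, 1,1,0,1,1,0,1,1,0,1)   # binary outcome
n  <- length(y)
n1 <- sum(x == 1)
t_lin  <- sum(y[x == 1])                  # linear statistic: sum in arm 1
mu     <- n1 * mean(y)                    # conditional expectation
s2pop  <- mean((y - mean(y))^2)           # population variance of y
sigma2 <- n1 * (n - n1) / (n - 1) * s2pop # permutation variance of t_lin
z <- (t_lin - mu) / sqrt(sigma2)          # standardized statistic
```

The variance line is the usual finite-population (sampling without replacement) variance of a sum of n1 of the n responses, which is what conditioning on the observed data implies.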

Adverse events data

Testing adverse events in clinical trials is a common application. The dataset provided by Westfall et al. (1999, p. 242) contains binary indicators of 28 adverse events, denoted as E1 through E28. In R, we can set up a formula with the adverse event data on the left hand side and the treatment indicator on the right hand side:

R> data("adevent", package = "multcomp")
R> library("coin")
R> fm <- as.formula(paste(
+      paste("E", 1:28, sep = "", collapse = "+"),
+      "~ group"))
R> fm

© 2011 by Taylor and Francis Group, LLC 138 FURTHER TOPICS E1 + E2 + E3 + E4 + E5 + E6 + E7 + E8 + E9 + E10 + E11 + E12 + E13 + E14 + E15 + E16 + E17 + E18 + E19 + E20 + E21 + E22 + E23 + E24 + E25 + E26 + E27 + E28 ~ group

The permutation distribution of the data (based on 10,000 random permutations) is evaluated using the independence_test function. Its output provides the necessary information to inspect the standardized statistics and the corresponding adjusted p-values:

R> it <- independence_test(fm, data = adevent,
+      distribution = approximate(B = 10000))
R> statistic(it, "standardized")

E1.no event E2.no event E3.no event E4.no event E5.no event A 3.31 -0.256 0.311 -0.342 1.16 E6.no event E7.no event E8.no event E9.no event E10.no event A 2.02 -0.453 2.26 -1.01 1.42 E11.no event E12.no event E13.no event E14.no event A -1 -1.42 -1 -1.42 E15.no event E16.no event E17.no event E18.no event A 1.42 0 -1 -1 E19.no event E20.no event E21.no event E22.no event A -1.42 1 0 -1 E23.no event E24.no event E25.no event E26.no event A 1 -1 -1 -1 E27.no event E28.no event A -1 0.318

R> pvalue(it, method = "single-step")

E1.no event E2.no event E3.no event E4.no event E5.no event A 0.003 1 1 1 1 E6.no event E7.no event E8.no event E9.no event E10.no event A 0.276 1 0.113 1 0.708 E11.no event E12.no event E13.no event E14.no event A 1 0.708 1 0.708 E15.no event E16.no event E17.no event E18.no event A 0.708 1 1 1 E19.no event E20.no event E21.no event E22.no event A 0.708 1 1 1 E23.no event E24.no event E25.no event E26.no event A 1 1 1 1 E27.no event E28.no event A 1 1

The null hypothesis of equal distribution of the events in both treatment arms can be rejected only for E1. The above analysis is the single-step test procedure implemented in the coin package. Step-down testing can be accomplished by successively excluding adverse events according to the size of the test statistics, and repeating the analysis.
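One such manual step-down step might look as follows, reusing fm, adevent, and the call above, and dropping E1 (the event with the largest standardized statistic); this is our illustration, not output from the book:

```r
# Exclude E1 and repeat the single-step analysis on the remaining
# 27 adverse events (one step of the manual step-down procedure).
keep <- paste("E", 2:28, sep = "")
fm2 <- as.formula(paste(paste(keep, collapse = "+"), "~ group"))
it2 <- independence_test(fm2, data = adevent,
                         distribution = approximate(B = 10000))
pvalue(it2, method = "single-step")
```

Iterating in this way, excluding one variable per step, reproduces the step-down algorithm of Section 5.1.1 for these data.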

Analysis of multiple endpoints

A dataset described in Westfall, Krishen, and Young (1998) contains measurements of patients in treatment (active drug) and control (placebo) groups, with four outcome variables (i.e., endpoints) labeled E1, E2, E3, and E4. Table 5.5 displays the summary statistics.

               Sample Size     Mean   Standard deviation
Control    E1           54   2.4444               4.3683
           E2           54   3.2222               1.4623
           E3           54   2.7778               1.6673
           E4           54   3.2593               1.6390

Treatment  E1           57   0.9298               1.3740
           E2           57   2.5439               1.3372
           E3           57   2.4035               1.3740
           E4           57   2.5088               1.6810

Table 5.5 Summary results of a study with two treatments and four outcome variables.

Using the coin package, we again apply the independence_test function to investigate deviations from the null hypotheses that each response variable's distribution is the same in both arms. Here, the permutation distributions of all four test statistics are approximated by 50,000 random permutations of the data:

R> data("mtept", package = "multcomp")
R> it <- independence_test(E1 + E2 + E3 + E4 ~ treatment,
+      data = mtept, distribution = approximate(B = 50000))
R> statistic(it, "standardized")

E1 E2 E3 E4 Drug -2.49 -2.43 -1.29 2.33

R> pvalue(it, method = "single-step")

E1 E2 E3 E4 Drug 0.0343 0.0409 0.489 0.0525

Differences between treatment groups can be postulated for all except the third outcome variable.

5.1.3 Bootstrap testing – Brief overview

Bootstrap testing is related to permutation testing. The simplest comparison is that bootstrap sampling is done with replacement, while permutation sampling is done without replacement, but there is more to it than that. Briefly, one advantage of bootstrap testing is that it can be performed using a wider class of models, for example, those with covariates, while permutation testing applies to a more rigid class of models. A disadvantage of bootstrap testing is that it is always inexact and approximate, especially for the popular separate-sample type of bootstrap. Troendle, Korn, and McShane (2004) demonstrated spectacular failure of the separate-sample bootstrap for genomics applications compared to the more reasonable approximations of the pooled bootstrap and the exactness of the permutational approach. For typical linear model applications involving pairwise comparisons, the parametric approximation and bootstrap approximation give reasonably similar results, despite nonnormality (Bretz and Hothorn 2003). A case where bootstrapping is necessary for multiple comparisons, works well, and where other methods are not available is given by Westfall and Young (1993, Section 2.5.1).

5.2 Methods for group sequential and adaptive designs

In standard experiments, the sample size is fixed and calculated based on the assumed effect sizes, variability, etc., to achieve a designated power level while controlling the Type I error rate. In sequential sampling schemes, the sample size is not fixed, but is instead a random variable. Interim analyses are performed during an ongoing experiment to reach conclusions before the end of the experiment or to allow one to adjust for incorrect assumptions made at the design stage. For example, when interim results suggest a change in sample sizes for the subsequent stages, sample size reestimation methods become an important tool (Chuang-Stein, Anderson, Gallo, and Collins 2006). However, repeatedly looking at the data with the possibility for interim decision making may inflate the overall Type I error rate, and appropriate analysis methods are required to guarantee its strong control at a pre-specified significance level α. Group sequential and adaptive designs are particularly widely applied in clinical trials, for ethical and financial reasons. Patients should not be treated with inefficacious treatments, and interim analyses offer the possibility to stop a trial early for futility (i.e., lack of efficacy). On the other hand, efficacious and safe treatments should be released quickly to the market so that patients in need can benefit from the potentially ground-breaking therapy. In Section 5.2.1 we describe the basic theory of group sequential designs and illustrate the methodology with a numerical example using the gsDesign package in R (Anderson 2010). In Section 5.2.2 we briefly review adaptive designs and describe the asd package (Parsons 2010).

5.2.1 Group sequential designs

In this section we focus on group sequential designs, which have received much attention since Pocock (1977) and O'Brien and Fleming (1979). It is not the aim of this section to develop the theory in detail. Instead, we refer to the books by Jennison and Turnbull (2000) and Proschan, Lan, and Wittes (2006) for a complete mathematical treatment of this subject. For details on gsDesign we refer to the manual accompanying the package (Anderson 2010).

Basic theory

Consider normally distributed responses with unknown mean θ and known variance σ². Assume that we are interested in testing the null hypothesis H : θ = θ0 (the one-sided case is treated similarly). Assume further that there is interest in stopping the trial early either for success (reject the null hypothesis H) or for futility (retain H). In group sequential designs, inspections are made after groups of observations. Assume a maximum number k of inspections for some positive integer k. The first k − 1 analyses are referred to as interim analyses, while the k-th analysis is referred to as the final analysis. Let n_i, i = 1, . . . , k, denote the sample sizes in the k sequences of observations. Further, let x̄_i denote the mean value at stage i. For i = 1, . . . , k consider the overall standardized test statistics

    z_i* = ( Σ_{j=1}^i √n_j z_j ) / √( Σ_{j=1}^i n_j ),

where z_i = √n_i (x̄_i − θ0)/σ denotes the test statistic from the i-th stage of the trial. The test statistics z_1*, . . . , z_k* are jointly multivariate normally distributed with means

    η_i = E(z_i*) = ((θ − θ0)/σ) √( Σ_{j=1}^i n_j )

and covariances

    Cov(z_i*, z_{i'}*) = √( t_i / t_{i'} ),

where i ≤ i' and t_i = Σ_{j=1}^i n_j / Σ_{j=1}^k n_j denotes the information fraction at stage i. As shown in Jennison and Turnbull (2000), this set-up is asymptotically valid in many practically relevant situations, including two-armed trials with normal, binary or time-to-event outcomes; see also Wassmer (1999) and Wassmer (2009). A group sequential design consists of specifying the continuation regions C_i, i = 1, . . . , k − 1, at the interim analyses and the acceptance region C_k at the final analysis. If at any interim analysis z_i* ∉ C_i, the trial stops and the null hypothesis H is rejected. Otherwise, the trial is continued as long as z_i* ∈ C_i. If the trial continues until the final analysis and z_k* ∈ C_k, the null hypothesis H is retained. That is, in this simplest case, the rejection region is

the complement of the continuation region C_i, i = 1, . . . , k − 1, and no other stopping rule is considered. Accordingly, any group sequential design needs to satisfy

    P_H( ∩_{i=1}^k {z_i* ∈ C_i} ) = 1 − α.
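This condition can be evaluated numerically from the covariance structure above, for example with the mvtnorm package; the function below and the boundary value used are our illustration:

```r
library("mvtnorm")

# Type I error rate of a two-sided design with continuation regions
# C_i = (-u_i, u_i), using Cov(z_i*, z_i'*) = sqrt(t_i / t_i').
gs_alpha <- function(n, u) {
  t_i <- cumsum(n) / sum(n)               # information fractions
  R <- outer(t_i, t_i,
             function(a, b) sqrt(pmin(a, b) / pmax(a, b)))
  1 - pmvnorm(lower = -u, upper = u, sigma = R)[1]
}

# Four equally sized stages with the constant Pocock boundary 2.361
# should give a level close to alpha = 0.05 (two-sided):
gs_alpha(rep(1, 4), rep(2.361, 4))
```

Searching over the boundary constant until gs_alpha returns the target α is, in essence, how the designs discussed next are calibrated.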

The crossing boundaries defining the decision regions C_i can be computed efficiently by accounting for the special structure of the covariance matrix described above (Armitage, McPherson, and Rowe 1969). Once a group sequential design has been defined, the average sample size ASN is readily given by

k i−1  X \ ∗ ASN = n1 + niPη  {zi ∈ Cj} , i=2 j=1 where η = (η1, . . . , ηk) and ηi is defined above. As seen from the example fur- ther below, the average sample size can be used to assees the efficiency of group sequential designs as compared to fixed designs without interim analyses. Many choices of Ci lead to valid group sequential designs, thus allowing the investigator to fine tune the trial design according to the study objectives. A common choice of group sequential designs is the ∆-class of boundaries introduced by Wang and Tsiatis (1987). They introduced a one-parameter class of symmetric boundaries, which in case of two-sided testing and equally spaced interim analyses are given by

    C_i = (−u_i; u_i),   u_i = c(k, α, ∆) i^{∆−0.5},

where the constant c(k, α, ∆) is chosen appropriately to control the Type I error rate. Note that for ∆ = 0.5 the boundary values are all equal and thus lead to the well-known design of Pocock (1977). The value ∆ = 0 generates the design of O'Brien and Fleming (1979) as a special case. Figure 5.3 displays the upper rejection boundaries from Wang and Tsiatis (1987) for ∆ = 0, 0.25, 0.5 in a group sequential trial with k = 4 analyses and α = 0.025 (one-sided test problem). The error spending function method proposed by Lan and DeMets (1983) is an alternative approach to define a group sequential design. The idea is to specify the cumulative Type I error rate α*(t_i) spent up to the i-th interim analysis and derive the critical values based on these values. The function α*(t_i) must be specified in advance and is assumed to be non-decreasing with α*(0) = 0 and α*(1) = α. The time points t_i do not need to be pre-specified before the actual course of the trial. Consequently, the number of observations at the i-th analysis and the maximum number k of analyses are flexible, although their determination must be independent of any emerging data. In the two-sided case, the critical value for the first analysis is given by u_1 = Φ^{-1}(1 − α*(t_1)/2), where Φ denotes the cumulative distribution function of the standard normal distribution. The remaining critical values



Figure 5.3 Several examples of rejection boundaries from Wang and Tsiatis (1987). Here, ∆ = 0 generates an O’Brien-Fleming design, while ∆ = 0.5 produces a Pocock design.

u2, . . . , uk are computed successively through

    P_H( ( ∩_{j=1}^{i−1} {|z_j*| < u_j} ) ∩ {|z_i*| ≥ u_i} ) = α*(t_i) − α*(t_{i−1}).

The one-sided case is treated analogously. Several choices for the form of the error spending function α*(t_i) were proposed. The choices

    α*(t_i) = α ln(1 + (e − 1) t_i)

and

    α*(t_i) = 2( 1 − Φ( u_{1−α/2} / √t_i ) )   (one-sided case)
    α*(t_i) = 4( 1 − Φ( u_{1−α/4} / √t_i ) )   (two-sided case)

approximate the group sequential boundaries from Pocock (1977) and O'Brien and Fleming (1979), respectively, where u_{1−p} = Φ^{-1}(1 − p). Likewise, Hwang, Shih, and DeCani (1990) introduced the one-parameter family

    α*(γ, t_i) = α (1 − exp(−γ t_i)) / (1 − exp(−γ))   for γ ≠ 0,
    α*(γ, t_i) = α t_i                                  for γ = 0,

and showed that its use yields approximately optimal plans similar to the ∆-class of Wang and Tsiatis (1987). A value of γ = −4 is used to approximate an O'Brien-Fleming design, while a value of γ = 1 approximates a Pocock design well. Figure 5.4 displays the error spending functions from Hwang et al. (1990) for γ = −4, −2, 1.
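This family is easy to evaluate directly; the function name below is ours:

```r
# Hwang-Shih-DeCani error spending function alpha*(gamma, t).
hsd_spend <- function(gamma, t, alpha = 0.025) {
  if (gamma == 0) alpha * t
  else alpha * (1 - exp(-gamma * t)) / (1 - exp(-gamma))
}
hsd_spend(-4, c(0.25, 0.5, 0.75, 1))  # O'Brien-Fleming-like: back-loaded
hsd_spend( 1, c(0.25, 0.5, 0.75, 1))  # Pocock-like: spends error earlier
```

By construction, hsd_spend(γ, 1) = α for every γ, so the full level α is always spent by the final analysis.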


Figure 5.4 Several examples of error spending functions from Hwang et al. (1990) for α = 0.025. Here, γ = −4 approximates an O’Brien-Fleming design, while γ = 1 approximates a Pocock design.

An example

In this section we use a numerical example to illustrate the above methodology with the gsDesign package, which supports the design of group sequential trials in R. Assume that we plan a one-sided group sequential design with 90% power to detect a standardized treatment effect size of θ1 = 0.3 to compare patients under treatment (active drug) and control (placebo). In the following, we show details for an O'Brien-Fleming design, which employs more stringent boundaries at early interim analyses than at later ones. We plan for k = 4 equally spaced analyses and α = 0.025, although the concepts below also apply to other design options. The gsDesign function from the gsDesign package provides sample size and boundaries for a group sequential design based on treatment effects, spending functions for boundary crossing probabilities, and relative timing of each analysis. According to the trial specifications above, the call

R> library("gsDesign")
R> gsd.OF <- gsDesign(k = 4, test.type = 1, sfu = "OF",
+      alpha = 0.025, beta = 0.1, timing = 1,
+      delta = 0.15)

provides the relevant information. In this call, test.type = 1 gives a one-sided test problem, timing = 1 produces equally spaced analyses and sfu defines the error spending function (that is, O'Brien-Fleming in our example). All of the spending functions described in Section 5.2.1 as well as many other functions are implemented in the gsDesign package. Finally, delta defines the standardized treatment effect size for which the design is powered. Because of a different standardization used by the gsDesign function, delta has to be set to θ1/2 = 0.15; see Anderson (2010) for details. The next line prints a summary of gsd.OF using the print method associated with gsDesign objects,

R> gsd.OF
One-sided group sequential design with
90 % power and 2.5 % Type I Error.

Analysis    N     Z  Nominal p   Spend
       1  120  4.05     0.0000  0.0000
       2  239  2.86     0.0021  0.0021
       3  359  2.34     0.0097  0.0083
       4  478  2.02     0.0215  0.0145
   Total                        0.0250
++ alpha spending: O'Brien-Fleming boundary

Boundary crossing probabilities and expected sample size assume any cross stops the trial

Upper boundary (power or Type I Error)

Analysis
 Theta      1      2      3      4  Total  E{N}
  0.00  0.000 0.0021 0.0083 0.0145  0.025   476
  0.15  0.008 0.2850 0.4031 0.2040  0.900   358

The upper table of the output contains for each of the k = 4 analyses the cumulative total sample size N, the (upper) rejection bound Z, the one-sided significance level Nominal p, and the error spent. In this example, roughly 120 patients are required at each of the four stages to ensure a power of 90% (recall that we asked for equally spaced interim analyses). This leads to a maximum of 478 patients that are required for this design. For comparison, the sample size in a fixed design (without interim analyses) is 467. Thus, the maximum sample size in the group sequential design is 1.02 times the sample size in a fixed-sample design. This inflation factor relates the sample size of a group sequential design to its corresponding fixed design and can be obtained with gsDesign by setting delta = 0:

R> gsd.OF2 <- gsDesign(k = 4, test.type = 1,
+      sfu = "OF", alpha = 0.025, beta = 0.1, timing = 1,
+      delta = 0)
R> gsd.OF2$n.I[4]

[1] 1.02

Figure 5.5 plots the upper stopping boundaries

[1] 4.05 2.86 2.34 2.02

against the cumulative sample sizes

[1] 119 239 358 477

for the selected O'Brien-Fleming design. The area above the displayed curve is the rejection region: Whenever the test statistic crosses the boundary at an interim analysis, one can reject the null hypothesis and stop the trial early for success. Figure 5.5 was produced using the plot method associated with the gsDesign function; see below for further plotting capabilities. The inflation factor introduced above is independent of the standardized effect size, the test and the outcome of interest, and serves as the basis for sample size calculations in group sequential designs. However, it relates to the maximum sample size and ignores the fact that the sample size is a random variable, as the study could be stopped early for success.
The average sample size ASN introduced in Section 5.2.1 is a more realistic measure and is displayed in the bottom table of the gsd.OF summary output shown above. It displays the boundary crossing probabilities at each interim analysis and the expected sample size E(N), assuming any crossing stops the trial, under the null and the alternative hypothesis. If there is no treatment effect, the average sample size of 476 is slightly lower than the maximum sample size of 478 patients. However, under the alternative the advantage of group sequential designs becomes evident, as the average sample size is reduced to 358 patients, almost 25% less than in a fixed design. It should be remembered,


Figure 5.5 Upper stopping boundaries for the O'Brien-Fleming design used in the numerical example.

however, that the computations assume normally distributed outcomes with known variance. These are approximate results that should be viewed with caution if the sample sizes are small. The previous considerations can be extended by using the gsProbability function. This function computes boundary crossing probabilities and expected sample size of a design for arbitrary user-specified treatment effects, bounds, and interim analysis sample sizes. Accordingly, we can call

R> gsd.OF3 <- gsProbability(theta = gsd.OF$delta*seq(0,2,0.25),
+      d = gsd.OF)
R> gsd.OF3

One-sided group sequential design with
90 % power and 2.5 % Type I Error.

Analysis N Z Nominal p Spend

       1  120  4.05     0.0000  0.0000
       2  239  2.86     0.0021  0.0021
       3  359  2.34     0.0097  0.0083
       4  478  2.02     0.0215  0.0145
   Total                        0.0250
++ alpha spending: O'Brien-Fleming boundary

Boundary crossing probabilities and expected sample size assume any cross stops the trial

Upper boundary (power or Type I Error)
Analysis
  Theta       1      2      3      4  Total  E{N}
 0.0000  0.0000 0.0021 0.0083 0.0145  0.025   476
 0.0375  0.0001 0.0111 0.0429 0.0700  0.124   470
 0.0750  0.0006 0.0437 0.1395 0.1811  0.365   450
 0.1125  0.0024 0.1281 0.2924 0.2569  0.680   411
 0.1500  0.0080 0.2850 0.4031 0.2040  0.900   358
 0.1875  0.0227 0.4910 0.3752 0.0931  0.982   307
 0.2250  0.0558 0.6745 0.2428 0.0251  0.998   267
 0.2625  0.1188 0.7648 0.1123 0.0041  1.000   239
 0.3000  0.2202 0.7416 0.0378 0.0004  1.000   217

for the grid of θ values

R> gsd.OF3$theta

[1] 0.0000 0.0375 0.0750 0.1125 0.1500 0.1875 0.2250 0.2625
[9] 0.3000

to obtain a better understanding of the operating characteristics of the selected design. In addition, the gsDesign package provides extensive plotting capabilities. Several plot types are available and each of them highlights a different aspect of the selected design. Figure 5.5 shows the default plot by displaying the O'Brien-Fleming stopping boundaries. We now present two further relevant plots. Figure 5.6 displays the cumulative boundary crossing probabilities (i.e., power) separately for the three interim analyses and the final analysis as a function of the treatment effects. Note that the x-axis is scaled relative to the effect size delta for which the trial is powered. Finally, Figure 5.7 displays the average sample sizes by treatment difference (same scale for the x-axis as in Figure 5.6). The horizontal line indicates the sample size of 467 for a comparable fixed design without interim analyses. The gsDesign package provides more functionalities than presented here. For example, conditional power can be computed for evaluating interim trends in a group sequential design, but may also be used to adapt a trial design at an interim analysis using the methods of Müller and Schäfer (2001). The gsCP function provides the basis for applying these adaptive methods by computing


Figure 5.6 Cumulative boundary crossing probabilities for the O’Brien-Fleming design used in the numerical example.

the conditional probability of future boundary crossing given the interim analysis results. The gsDesign package can also be used with decision theoretic methods to derive optimal designs. Anderson (2010) describes, for example, the application of Bayesian computations to update the probability of a successful trial based on knowing a bound has not been crossed, but without knowledge of unblinded treatment results. We conclude this section by referring to the AGSDest package in R (Hack and Brannath 2009), which allows the computation of repeated confidence intervals as well as confidence intervals based on the stagewise ordering in group sequential designs and adaptive group sequential designs; see Mehta, Bauer, Posch, and Brannath (2007) and Brannath, Mehta, and Posch (2009a) for the technical background.


Figure 5.7 Average sample sizes by treatment difference for the O’Brien-Fleming design used in the numerical example. The horizontal line indicates the sample size of a comparable fixed design.

5.2.2 Adaptive designs

Adaptive designs use accumulating data of an ongoing trial to decide how to modify design aspects without undermining the validity and integrity of the trial. If appropriate methods are used, adaptive designs provide more flexibility than the group sequential designs described in Section 5.2.1 while still controlling the Type I error rate. For this reason, adaptive designs providing confirmatory evidence have gained much attention in recent years. Especially in the drug development area, the expectation has arisen that carefully planned and conducted studies based on adaptive designs help increase efficiency by making better use of the observed data. There is a rich literature on such approaches for trials with a single null hypothesis, including p-value combination methods (Bauer and Köhne 1994),

self-designing trials (Fisher 1998), and methods based on the conditional error function (Proschan and Hunsberger 1995; Müller and Schäfer 2001). In this section, we consider adaptive designs with multiple hypotheses. Bauer and Kieser (1999) proposed an analysis method for adaptive designs involving treatment selection at interim. Kieser, Bauer, and Lehmacher (1999) used a similar approach in the context of multiple outcome variables. Hommel (2001) subsequently formulated a general framework to select and add hypotheses in an adaptive design. Posch, König, Branson, Brannath, Dunger-Baldauf, and Bauer (2005) derived simultaneous confidence intervals and adjusted p-values for adaptive designs with treatment selection. In the sequel, we follow the description of Bretz, Schmidli, König, Racine, and Maurer (2006) and show how to test multiple hypotheses adaptively by applying p-value combination methods to closed test procedures. We refer to Bretz, König, Brannath, Glimm, and Posch (2009a) and the references therein for details on adaptive designs providing confirmatory evidence.

Basic theory

For simplicity, we start by considering a single one-sided null hypothesis H in a two-stage design, that is, a design with a single interim analysis. Based on the data from the first stage, it is decided at the interim analysis whether the study is continued (conducting the second stage) or not (early stopping either due to futility or due to early rejection of H). If the study continues with the second stage, the final analysis at the end of the study combines the results of both stages. Let pj denote the p-value for stage j = 1, 2. Following Bauer and Köhne (1994), an adaptive design is specified as follows:

(i) Define a test procedure for stage 1, determine the stopping rules for the interim decision and pre-specify a combination function C = C(p1, p2) of p1 and p2 for the final analysis.

(ii) Conduct stage 1 of the study, resulting in p1.

(iii) Based on p1, decide whether to stop at interim (either rejecting or retaining H) or to continue the study.

(iv) If the study is continued, use all information (also external to the study, if available) to design the second stage, such as reassessing the stage 2 sample size.

(v) Conduct stage 2 of the study, resulting in p2 with p1 being independent of p2 under H.

(vi) Combine p1 and p2 using C and test H by comparing C with an appropriate critical value.

Adaptive designs offer a high degree of flexibility during the conduct of the trial. Among multistage designs, they require the least amount of decision rule pre-specification prior to a study. Furthermore, the total information available at the interim analysis can be used in designing the second stage.

Different possibilities exist to combine the data from both stages. A common choice is Fisher's combination test which rejects H at the final stage if

C(p1, p2) = p1 p2 ≤ c = exp(−χ²4,1−α / 2),

where χ²ν,1−α denotes the critical value of the χ² distribution with ν degrees of freedom at level 1 − α (Bauer and Köhne 1994). Assume that early stopping boundaries α0 and α1 are available such that (i) if p1 ≤ α1 the trial stops after the interim analysis with an early rejection of H, and (ii) if p1 ≥ α0 the trial stops after the interim analysis for futility (H is retained). In order to maintain the Type I error rate at the pre-specified level α simultaneously across both stages, α1 is computed by solving

α1 + c(ln α0 − ln α1) = α

for given α and α0. Note that if α0 = 1 no stopping for futility is foreseen and if α1 = 0 no early rejection of H is possible. Another popular choice is the weighted inverse normal combination function

C(p1, p2) = 1 − Φ[w1 Φ⁻¹(1 − p1) + w2 Φ⁻¹(1 − p2)],

where w1 and w2 denote pre-specified weights such that w1² + w2² = 1 and Φ denotes the cumulative distribution function of the standard normal distribution (Lehmacher and Wassmer 1999; Cui, Hung, and Wang 1999). Note that for the one-sided test of the mean of normally distributed observations with known variance, the inverse normal combination test with pre-planned stagewise sample sizes n1, n2 and weights w1² = n1/(n1 + n2), w2² = n2/(n1 + n2) is equivalent to a classical two-stage group sequential test if no adaptations are performed (the term in square brackets is simply the standardized total mean). Thus, the quantities α1, α0 and c required for the inverse normal method can be computed with standard software for group sequential trials; see Section 5.2.1. We now consider the case of testing m elementary null hypotheses H1, . . . , Hm.
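As a small numerical sketch of the boundary computation for Fisher's combination test, the equation α1 + c(ln α0 − ln α1) = α can be solved with uniroot; the choices α = 0.025 and α0 = 0.5 below are illustrative assumptions, not values taken from the text:

```r
## Early rejection boundary alpha1 for Fisher's combination test,
## solving alpha1 + c * (log(alpha0) - log(alpha1)) = alpha.
## alpha and alpha0 are illustrative choices.
alpha  <- 0.025                            # overall one-sided level
alpha0 <- 0.5                              # futility boundary
cc <- exp(-qchisq(1 - alpha, df = 4) / 2)  # critical value c
f  <- function(a1) a1 + cc * (log(alpha0) - log(a1)) - alpha
## the relevant root lies between c and alpha
alpha1 <- uniroot(f, interval = c(cc, alpha))$root
round(c(c = cc, alpha1 = alpha1), 4)
```

For these choices the interim rejection boundary works out to roughly α1 ≈ 0.0102, with c ≈ 0.0038.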
To make the ideas concrete, we assume the comparison of m treatments with a control, although the methodology reviewed below holds more generally and covers many other applications; see Schmidli, Bretz, Racine, and Maurer (2006); Wang, O'Neill, and Hung (2007); and Brannath, Zuber, Branson, Bretz, Gallo, Posch, and Racine-Poon (2009b) for examples. Let Hi : θi ≤ 0, i = 1, . . . , m, denote the related m one-sided null hypotheses, where θi denotes the mean effect difference of treatment i against control. The general rule is to apply the closure principle from Section 2.2.3 by constructing all intersection hypotheses and to test each of them with a suitable combination test (Bauer and Kieser 1999; Kieser et al. 1999; Hommel 2001). Following the closure principle, a null hypothesis Hi is rejected if all intersection hypotheses contained in Hi are also rejected by their combination tests. Consider Figure 5.8 for an example of testing adaptively m = 2 hypotheses. Let H1 and H2 denote the elementary hypotheses and H12 the single intersection hypothesis to be tested according to the closure principle. Let further pi,j denote the one-sided p-value for hypothesis Hi, i ∈ {1, 2, 12}, at stage j = 1, 2. Finally, let C(pi,1, pi,2), i ∈ {1, 2, 12}, denote the combination function C applied to the p-values pi,j from stage j = 1, 2. Note that different combination functions as well as different stopping boundaries could be used within the closed hypotheses set (for simplicity we omit this generalization here). According to the closure principle, H1 (say) is rejected at familywise error rate α if H1 and H12 are both rejected at level α. That is, H1 is rejected if C(p1,1, p1,2) ≤ c1 and C(p12,1, p12,2) ≤ c12, where c1 and c12 are suitable critical values, as described above.
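As an illustrative sketch of this closed adaptive test, consider the inverse normal combination with equal weights (corresponding to equal pre-planned stage sizes); all stagewise p-values below are made-up numbers and do not come from any example in the text:

```r
## Closed adaptive test of H1 via the inverse normal combination with
## equal weights w1 = w2 = sqrt(1/2); without early stopping the full
## level alpha is spent at the final analysis. P-values are made up.
invnorm <- function(p1, p2, w1 = sqrt(0.5), w2 = sqrt(0.5))
  1 - pnorm(w1 * qnorm(1 - p1) + w2 * qnorm(1 - p2))
alpha <- 0.025
p1 <- c(H1 = 0.010, H12 = 0.015)  # stage-1 p-values (H12 from, e.g., a Bonferroni test)
p2 <- c(H1 = 0.020, H12 = 0.030)  # stage-2 p-values
Cvals <- invnorm(p1, p2)
## H1 is rejected iff both C(H1) and C(H12) fall below alpha
reject_H1 <- all(Cvals <= alpha)
reject_H1
```

With these numbers both combination tests are clearly significant, so H1 is rejected by the closed test.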

Figure 5.8 Schematic diagram of the closure principle for testing adaptively two null hypotheses H1 and H2 and their intersection H12 = H1 ∩ H2. At each stage j = 1, 2 the p-values p1,j, p2,j and p12,j are computed, and for each hypothesis the stagewise p-values are combined via C(pi,1, pi,2).

An example

In this section we use a numerical example to illustrate the above methodology with the asd package, which supports the design of adaptive trials in R. For details on asd we refer to Parsons, Friede, Todd, Valdes-Marquez, Chataway, Nicholas, and Stallard (2010) and the manual accompanying the package (Parsons 2010). For adaptive designs involving a single null hypothesis one can alternatively use the adaptTest package in R (Vandemeulebroecke 2009). The asd.sim function from the asd package runs simulations for a trial design that tests a number of treatments against a single control. Test treatments are compared to the control treatment using the Dunnett test from Section 4.1. An interim analysis is undertaken using an early outcome measure, assumed to be normally distributed, and characterized by standardized treatment effects with variances assumed to be equal to one, for each treatment (and control). A decision is made on which of the treatments to take

forward, using a pre-defined selection rule. Data are simulated for the final outcome measure, also characterized by standardized treatment effects, with variance assumed equal to one and a fixed correlation between the final and the early outcomes. Data from the interim and final analyses for the final outcome measure are combined using either the inverse normal or Fisher combination test and hypotheses are tested at the selected level.

For illustration, suppose that we plan an adaptive trial design to compare m = 2 dose levels with placebo for a normally distributed outcome variable. Assume that we target 90% power to detect a standardized treatment effect size of 0.3 when comparing patients under treatment (one of the two dose levels) and control (placebo). For simplicity, we assume further that at the interim analysis we already observe the final outcome measure and that we select the treatment with the larger observed effect size for the second stage. Assuming equal treatment effect sizes and a pre-planned group sample size of 110 patients per stage, we can call

R> library("asd")
R> res <- asd.sim(nsamp = c(110, 110), early = c(0.3, 0.3),
+      final = c(0.3, 0.3), nsim = 10000, corr = 1, select = 1,
+      ptest = c(1, 2))
R> res
$count.total
      1 2
n 10000 0

$select.total
     1    2
n 4996 5004

$reject.total
    H1   H2
n 4525 4496

$sim.reject
  Total
n  9021

In this call, early = c(0.3, 0.3) and final = c(0.3, 0.3) specify the vectors with the standardized treatment effects for the early and the final outcome variable, respectively. Further, the correlation corr = 1 between the early and final outcome measure reflects the fact that at the interim analysis we already observe the final outcome measure. The select = 1 option ensures that we select the treatment with the larger observed effect size for the second stage. Finally, ptest = c(1, 2) will count rejections for testing treatments 1 and 2 against control. We conclude from the output that each treatment has a chance of 50% to be selected for the second stage, as seen from res$select.total. Further, res$reject.total gives the individual power of about 45% to reject

a selected treatment at study end and res$sim.reject gives the disjunctive power of about 90% to declare any of the two effective treatments significantly better than placebo (recall Section 2.1.1 for an overview of power in multiple hypotheses testing). In practice, when designing a real clinical trial, uncertainty about the true effect sizes and other parameters exists. In addition, the interim decision rule of selecting the better of the two doses may not apply because of unforeseen safety signals. Extensive clinical trial simulations therefore have to be conducted to investigate the operating characteristics of an adaptive design (Bretz and Wang 2010). To give an example of what could be done at the planning stage, we investigate in Figure 5.9 the disjunctive power as a function of θ1 for different design options:

(A) select the best treatment (select = 1),

(B) select both treatments (select = 2), and

(C) select randomly one of the two treatments at the interim analysis (select = 5).

Further design options are available with the asd package but will not be considered here. Note that the total sample size n is not the same for the three options. For example, we have n = 3 × 110 + 3 × 110 = 660 for option (B), but only n = 3 × 110 + 2 × 110 = 550 for option (A). Although the unequal total sample sizes make it difficult to compare the three options, Figure 5.9 reflects current practice of calculating sample sizes and is therefore of relevance. With the methods implemented in asd, sample size reallocation could be applied to ensure a constant total sample size. For illustration, we include in Figure 5.9 design option (D), which selects the best treatment at the interim analysis and performs a sample size reallocation for the second stage such that we have 330/2 = 165 patients for each of the two remaining treatment arms (selected treatment and control).

We conclude from Figure 5.9 that design option (C) is considerably less powerful than the competing designs. This is not surprising, as for small values of θ1 there is a chance that the truly inferior treatment is selected with option (C). As mentioned above, however, unforeseen safety signals may in practice lead to the selection of the treatment with the smaller observed interim effect, which may lead to a potentially substantial reduction in power, as seen in Figure 5.9. Options (A) and (B) have similar power performance, although option (A) saves about 16% of the sample size, which will be perceived as an efficiency gain. As expected, option (D) is more powerful than option (A) because of the additional 110 patients in the second stage. Interestingly, comparing options (B) and (D) reveals that for a given constant total sample size, selecting the better of the two treatments at the interim analysis leads to considerably higher power than continuing with both treatments in the second stage.


Figure 5.9 Disjunctive power for four design options: (A) select the best treat- ment without sample size reallocation, (B) select both treatments, (C) select randomly one of the two treatments, and (D) select the best treatment with sample size reallocation.

5.3 Combination of multiple comparison procedures with modeling techniques

In Section 4.3 we described a variety of powerful trend tests to detect significant dose response signals. Using these methods, we can assess whether or not changes in dose lead to desirable changes in an (efficacy or safety) outcome variable of interest. Once such a dose response signal has been shown (that is, once the so-called proof-of-concept (PoC) has been established), the question arises of what comprises a "good" dose. To address this question, stepwise test procedures based on the closure principle can be used. To mention only a few papers, we refer the reader to Tamhane, Hochberg, and Dunnett (1996); Bauer (1997); Hsu and Berger (1999); Tamhane, Dunnett, Green, and Wetherington

(2001); Bretz et al. (2003), and Strassburger et al. (2007), which all address different aspects of dose finding using multiple comparison procedures; see also Tamhane and Logan (2006) for a review. Multiple comparison procedures regard the dose as a qualitative factor and generally make few, if any, assumptions about the underlying dose response relationship. However, inferences about the target dose are restricted to the discrete, possibly small set of doses used in the trial. On the other hand, modeling approaches can be used which assume a parametric (typically non-linear) functional relationship between dose and response. The dose is assumed to be a quantitative factor, allowing greater flexibility for estimating a target dose. But their validity depends on an appropriately pre-specified dose response model for the final analysis. In this section we describe a hybrid methodology combining multiple comparison procedures with modeling techniques, denoted as the MCP-Mod procedure (Bretz, Pinheiro, and Branson 2005). This approach maintains the flexibility of modeling dose response relationships, while preserving robustness with respect to model misspecification through the use of appropriate multiple comparison procedures. In Section 5.3.1, we describe the MCP-Mod procedure in more detail. To illustrate the methodology, we analyze in Section 5.3.2 a dose response study and describe the DoseFinding package (Bornkamp, Pinheiro, and Bretz 2010), which allows one to design and analyze dose finding studies using the MCP-Mod procedure.

5.3.1 MCP-Mod: Dose response analyses under model uncertainty

The motivation for MCP-Mod is based on the work by Tukey, Ciminera, and Heyse (1985), who recognized that the power of standard dose response trend tests depends on the (unknown) dose response relationship. They proposed to simultaneously use several trend tests and subsequently adjust the resulting p-values for multiplicity. Related ideas were also proposed by Westfall (1997) and Stewart and Ruberg (2000), who all use contrast tests and aim at letting the contrast coefficients mirror potential dose response shapes. Bretz, Pinheiro, and Branson (2004) and Bretz et al. (2005) formalized these ideas, leading to the MCP-Mod procedure described below. We refer to Pinheiro, Bornkamp, and Bretz (2006a) for advice on design issues when planning a dose finding experiment using MCP-Mod and to Dette, Bretz, Pepelyshev, and Pinheiro (2008) for related optimal designs under model uncertainty. This section can be viewed as a continuation of the discussion started in Section 4.3, where we described the approach from Westfall (1997) and contrasted it with the trend tests from Williams (1971) and Marcus (1976). Note that the theoretical framework from Chapter 3 also applies to the MCP-Mod procedure, thus leading to a powerful method combining multiple comparisons and modeling in general parametric models. We begin with an overview of the MCP-Mod procedure in Figure 5.10, which has several steps. The procedure starts by defining a set M of candidate

models covering a suitable range of dose response shapes. Each of the dose response shapes in the candidate set is tested using optimal contrasts, employing a max-t test of the form (2.1) while controlling the familywise error rate. A dose response signal (i.e., PoC) is established when at least one of the model tests is significant. Once PoC is verified, either a "best" model or a weighted average over the set of models corresponding to the significant contrasts is used to estimate the dose response profile and the target doses of interest.

Set of candidate models M
        ↓
Optimum contrast coefficients
        ↓
Test for a significant dose response signal (PoC)
        ↓
Model selection or averaging
        ↓
Dose response and target dose estimation

Figure 5.10 Schematic overview of the MCP-Mod procedure.

Below, we assume that we observe a response y for a given set of parallel groups of subjects corresponding to doses d2, d3, . . . , dk plus placebo d1, for a total of k arms. For the purpose of testing a dose response signal and estimating target doses, we consider the one-way layout

yij = µdi + εij,    (5.1)

where µdi = f(di) denotes the mean response at dose di for some dose response model f(·), ni denotes the number of subjects allocated to dose di, n = n1 + · · · + nk denotes the total sample size, and εij ∼ N(0, σ²) denotes the error term for subject j = 1, . . . , ni, within dose group i = 1, . . . , k. Following Bretz et al. (2005), we note that most parametric dose response models f(d, θ) used in practice can be written as

f(d, θ) = θ0 + θ1 f⁰(d, θ*),    (5.2)

where f⁰(d, θ*) denotes the standardized model, parameterized by the vector θ*. In this parametrization, θ0 is a location and θ1 is a scale parameter, so only the parameter θ* determines the shape of the model function. For the derivation of optimal contrasts further below it is sufficient to use the standardized model f⁰ instead of the full model f, which motivates the reparametrization in (5.2). Table 5.6 summarizes a selection of linear and non-linear regression models frequently used to represent dose response relationships, together with their respective standardized versions. A common problem is the specification of the model contrasts based on knowledge about the standardized model parameters θ* before the study begins. Such best guesses ("guesstimates") are typically derived from available information about the expected percentage p* of the maximum response associated with a given dose d*. We refer to Pinheiro, Bretz, and Branson (2006b) for strategies to elicit the necessary information. Assume that a set M of M parameterized candidate models is given, with corresponding model functions fm(d, θ*m), m = 1, . . . , M, and parameters θ*m of the standardized models f⁰, which determine the model shapes. Each of the dose response models in the candidate set is tested using a single contrast test statistic

tm = (Σi cmi x̄i) / (s √(Σi c²mi/ni)),    m = 1, . . . , M,

where s² = Σi Σj (xij − x̄i)²/(n − k) is the pooled variance estimate and cm1, . . . , cmk denote contrast coefficients subject to Σi cmi = 0, with sums over i = 1, . . . , k and j = 1, . . . , ni. As described below, the contrast coefficients cm1, . . . , cmk are chosen to maximize the power of detecting the m-th model. Every single contrast test thus translates into a decision procedure to determine whether the given dose response shape is statistically significant, based on the observed data. It can be shown that the optimal contrast coefficients for detecting the m-th model depend only on the model shape, that is, only on the parameters of the standardized models f⁰. Letting

(µ⁰m1, . . . , µ⁰mk) = (f⁰m(d1, θ*m), . . . , f⁰m(dk, θ*m)),

the i-th entry of the optimal contrast cm for detecting the shape m is proportional to

ni(µ⁰mi − µ̄),    i = 1, . . . , k,

where µ̄ = n⁻¹ Σi µ⁰mi ni (Bretz et al. 2005; Bornkamp 2006). A unique representation of the optimal contrast can be obtained by imposing the constraint Σi c²mi = 1. The detection of a significant dose response signal is based on the max-t test statistic tmax = max{t1, . . . , tM}.
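To make the computation concrete, the following sketch derives the optimal contrast for an Emax shape by hand; the doses, group sizes and the ED50 guesstimate are those of the example in Section 5.3.2 (in practice the DoseFinding package performs this calculation automatically):

```r
## Optimal contrast for an Emax model shape: c_i proportional to
## n_i * (mu0_i - mubar), normalized so that sum(c^2) = 1.
## Doses, group sizes and ED50 = 0.2 follow the example in Section 5.3.2.
doses <- c(0, 0.05, 0.2, 0.6, 1)
n     <- rep(20, length(doses))             # 20 subjects per dose group
f0    <- function(d, ed50) d / (ed50 + d)   # standardized Emax model
mu0   <- f0(doses, ed50 = 0.2)
mubar <- sum(n * mu0) / sum(n)
cm    <- n * (mu0 - mubar)
cm    <- cm / sqrt(sum(cm^2))               # unique representation
round(cm, 3)   # -0.643 -0.361  0.061  0.413  0.530
```

These coefficients agree with the emax column of the "Optimal Contrasts" output shown in Section 5.3.2.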

Under the null hypothesis of no dose response effect, µd1 = ... = µdk , and the distributional assumptions stated in (5.1) it follows that t1, . . . , tM are jointly

multivariate t distributed with n − k degrees of freedom and correlation matrix R = (ρij)ij, where

ρij = (Σl cil cjl/nl) / √[(Σl c²il/nl)(Σl c²jl/nl)],

with sums over l = 1, . . . , k.

Table 5.6 Dose response models implemented in the DoseFinding package. For the quadratic model the standardized model function is given for the concave-shaped form (β2 < 0), with δ = β2/|β1|; for the beta model, B(δ1, δ2) = (δ1 + δ2)^(δ1+δ2)/(δ1^δ1 δ2^δ2).

Name          f(d, θ)                                      f⁰(d, θ*)
linear        E0 + δd                                      d
linlog        E0 + δ log(d + c)                            log(d + c)
quadratic     E0 + β1 d + β2 d²                            d + δd²
emax          E0 + Emax d/(ED50 + d)                       d/(ED50 + d)
logistic      E0 + Emax/{1 + exp[(ED50 − d)/δ]}            1/{1 + exp[(ED50 − d)/δ]}
sigEmax       E0 + Emax d^h/(ED50^h + d^h)                 d^h/(ED50^h + d^h)
betaMod       E0 + Emax B(δ1, δ2)(d/D)^δ1 (1 − d/D)^δ2     B(δ1, δ2)(d/D)^δ1 (1 − d/D)^δ2
exponential   E0 + E1(exp(d/δ) − 1)                        exp(d/δ) − 1
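As a self-contained sketch (doses, group sizes and guesstimates as in the example of Section 5.3.2), the correlation between two optimal contrasts follows directly from this formula:

```r
## Correlation between the optimal contrasts of the linear and Emax shapes,
## rho_ij = sum(ci*cj/n) / sqrt(sum(ci^2/n) * sum(cj^2/n)).
## Doses, group sizes and the ED50 guesstimate follow Section 5.3.2.
doses <- c(0, 0.05, 0.2, 0.6, 1)
n     <- rep(20, length(doses))
optC  <- function(mu0, n) {                 # optimal contrast from model means
  cm <- n * (mu0 - sum(n * mu0) / sum(n))
  cm / sqrt(sum(cm^2))
}
c.lin  <- optC(doses, n)                    # linear standardized model: f0(d) = d
c.emax <- optC(doses / (0.2 + doses), n)    # Emax standardized model, ED50 = 0.2
rho <- sum(c.lin * c.emax / n) /
  sqrt(sum(c.lin^2 / n) * sum(c.emax^2 / n))
round(rho, 3)   # 0.912
```

The value 0.912 matches the linear/emax entry of the "Contrast Correlation" output in Section 5.3.2.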

Note that the description above parallels the development of multiple comparison procedures for the general linear models in Section 3.1, and, more broadly speaking, for the general parametric models in Section 3.2. Numerical integration methods to calculate the required multivariate t (or, if required, normal) probabilities are implemented in the mvtnorm package (Hothorn et al. 2001); see Genz and Bretz (1999, 2002, 2009) for the mathematical details. Proof-of-concept is established if tmax ≥ u1−α, where u1−α denotes the multiplicity adjusted critical value at level 1 − α from the multivariate t distribution. Furthermore, all dose response shapes with associated contrast test statistics larger than u1−α can be declared statistically significant at level 1 − α under strong control of the familywise error rate. These models then form a reference set M* = {M1, . . . , ML} ⊆ M of L significant models. If no contrast test is statistically significant, the procedure stops, indicating that a dose response relationship cannot be established from the observed data. If a significant dose response signal is established, the next step is to estimate the dose response curve and the target doses of interest. This can be achieved by either selecting a single model out of M* or applying model averaging techniques to M*. A common target dose of interest is the minimum effective dose (MED), which is defined as the smallest dose ensuring a clinically relevant and statistically significant improvement over placebo (Ruberg 1995a). Formally,

MED = min{d ∈ (d1, dk] : f(d) > f(d1) + ∆},

where ∆ is a given relevance threshold. A common estimate for the MED is

M̂ED = min{d ∈ (d1, dk] : f̂(d) > f̂(0) + ∆, L(d) > f̂(0)},

where f̂(d) is the predicted response at dose d, and U(d) and L(d) are the corresponding pointwise upper and lower confidence limits of level 1 − 2γ. The choice of γ is not driven by the purpose of controlling Type I error rates, in contrast to the selection of α for controlling the familywise error rate in the PoC declaration. Other estimates are possible and we refer to Bretz et al. (2005) for further details. Several possibilities exist to select a single model from M* for the target dose estimation step. Because max-t tests have been used above to detect a significant dose response signal, a natural approach is to select the "most significant" model shape, that is, the one associated with the largest contrast test statistic. Alternatively, standard information criteria like the AIC or BIC

might be used. The estimate of the model function is obtained by maximizing the likelihood of the model with respect to its parameters θ. An alternative to selecting a single dose response model is to apply model averaging and produce weighted estimates across all models in M* for a given quantity ψ of interest. In dose response analyses, the parameter ψ could be either a target dose of interest (such as the MED) or the mean responses at specific doses d ∈ [d1, dk]. Buckland, Burnham, and Augustin (1997) proposed using the weighted estimate

ψ̂ = Σℓ wℓ ψ̂ℓ,

where ψ has the same meaning under all models and ψ̂ℓ is the estimate of ψ under model ℓ for given probabilities wℓ. The idea is to use estimates for the final data analysis which rely on the averaged estimates across all L models. Buckland et al. (1997) proposed using the weights

wℓ = exp(−ICℓ/2) / [exp(−IC1/2) + · · · + exp(−ICL/2)],    ℓ = 1, . . . , L,

which depend on a common information criterion IC, such as the AIC or BIC, applied to each of the L models.
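For illustration, these weights are straightforward to compute; the AIC values below are made-up numbers, not results from any model fit in the text:

```r
## Model averaging weights w_l = exp(-IC_l/2) / sum_j exp(-IC_j/2).
## The AIC values are made up for illustration; subtracting the minimum
## first avoids numerical underflow for large IC values.
aic <- c(emax = 100.0, logistic = 101.5, linear = 103.2)
w   <- exp(-(aic - min(aic)) / 2)
w   <- w / sum(w)
round(w, 3)
```

For these numbers the model with the smallest AIC (here emax) receives the largest weight, about 0.60, with the remaining weight split between the other two models.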

5.3.2 A dose response example

To illustrate the MCP-Mod procedure we use the biom dose response data from Bretz et al. (2005). The data are from a randomized double-blind parallel group trial with a total of 100 patients allocated to either placebo or one of four active doses coded as 0.05, 0.20, 0.60, and 1, with 20 patients per group. The clinical threshold is ∆ = 0.4, the significance level is α = 0.05 and the dose estimation model is selected according to the maximum test statistic. We use the DoseFinding package to perform the computations. A detailed description of this package is given in Bornkamp, Pinheiro, and Bretz (2009). As described in Section 5.3.1, the MCP-Mod procedure starts by defining a set M of candidate models covering a suitable range of dose response shapes. The candidate set of models needs to be constructed as a list, where the list elements should be named after the underlying dose response model function (see Table 5.6) and the individual list entries should correspond to the specified guesstimates. Suppose we want to include in our candidate set a linear model, an Emax model and a logistic model. Concluding from the standardized model functions in Table 5.6, we need to specify one value for the Emax model shape (for the ED50 parameter), two values for the logistic model shape (for the ED50 and δ parameters), but no guesstimate for the linear model as its standardized model function does not contain an unknown parameter. These guesstimates are used below for the calculation of optimal contrast coefficients. Suppose our guesstimate for the ED50 parameter of the Emax model is 0.2, while the

guesstimate for (ED50, δ) for the logistic model is (0.25, 0.09). The model list is then defined through

R> library("DoseFinding")
R> candMods <- list(linear = NULL, emax = 0.2,
+      logistic = c(0.25, 0.09))

We visualize the model shapes from the candidate set with the plotModels function. As the candidate model shapes do not determine the location (baseline effect) and scale (maximum effect) of the model, we also need to specify those parameters via the base and maxEff arguments. Using the candidate set candMods from above, we define a vector for the dose levels and then call plotModels,

R> doses <- c(0, 0.05, 0.2, 0.6, 1)
R> plotModels(candMods, doses, base = 0, maxEff = 1)

see Figure 5.11 for the plot. After these preparations, we can now use the MCPMod function, which implements the full MCP-Mod procedure. Thus, we can call

R> data("biom", package = "DoseFinding")
R> res <- MCPMod(resp ~ dose, biom, candMods, alpha = 0.05,
+      pVal = TRUE, clinRel = 0.4)

to run the MCP-Mod procedure. In the previous call the arguments to the MCPMod function are mostly self-explanatory; the pVal = TRUE option specifies that multiplicity adjusted p-values for the max-t test should be calculated. A brief summary of the results is available via the print method for MCPMod objects

R> res
MCPMod

PoC (alpha = 0.05, one-sided): yes
Model with highest t-statistic: emax
Model used for dose estimation: emax
Dose estimate:
MED2,80%
    0.17

We conclude from the PoC line that there is a significant dose response signal, which means that the max-t test is significant at the one-sided significance level α = 0.05. Additionally, the Emax contrast has the largest test statistic and consequently the Emax model was used for the dose estimation step. The MED estimate is 0.17. A more detailed summary of the results is available via the summary method

R> summary(res)
MCPMod

Figure 5.11 Model shapes for the selected candidate model set. [Plot of the model means against dose (0 to 1) in three panels: linear, emax, and logistic.]

Input parameters:
 alpha = 0.05
 alternative: one.sided
 model selection: maxT
 clinical relevance = 0.4
 dose estimator: MED2 (gamma = 0.1)
 optimizer: nls

Optimal Contrasts:
     linear   emax logistic
0    -0.437 -0.643   -0.478
0.05 -0.378 -0.361   -0.435
0.2  -0.201  0.061   -0.147
0.6   0.271  0.413    0.519
1     0.743  0.530    0.540

Contrast Correlation:
         linear  emax logistic
linear    1.000 0.912    0.945
emax      0.912 1.000    0.956
logistic  0.945 0.956    1.000

Multiple Contrast Test:
         Tvalue pValue
emax       3.46  0.001
logistic   3.23  0.002
linear     2.97  0.003

Selected for dose estimation: emax

Parameter estimates:
emax model:
   e0  eMax  ed50
0.322 0.746 0.142

Dose estimate:
MED2,80%
    0.17

From the output above, we first obtain information about the main input parameters used when performing the MCP-Mod procedure. The output then displays the optimal contrasts and their correlations. The contrast test statistics and their multiplicity adjusted p-values are shown in the next table. Finally, the output contains information about the fitted dose response model, its parameter estimates and the target dose estimate. The core results are of course the same as those displayed earlier with the print method. A graphical display of the dose response model used for dose estimation can be obtained via the plot method for MCPMod objects,

R> plot(res, complData = TRUE, clinRel = TRUE, CI = TRUE,
+       doseEst = TRUE)

The resulting plot is shown in Figure 5.12. When set to TRUE, the options complData, CI, clinRel, and doseEst specify that the plot includes, respectively, the full dose response dataset (instead of only the group means), the pointwise confidence intervals for the mean response, the clinical relevance threshold, and the dose estimate.
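The optimal contrasts shown in the summary output can be verified by hand. For a balanced one-way layout with homoscedastic errors (as in the biom trial), the optimal contrast for a candidate shape is, up to scaling, the centered vector of standardized model means. The short sketch below recomputes the contrast matrix from the guesstimates alone, without using the DoseFinding package; the standardized model functions are those of Table 5.6.

```r
## Standardized candidate shapes evaluated at the dose levels
doses <- c(0, 0.05, 0.2, 0.6, 1)
mu <- list(
  linear   = doses,                                # f0(d) = d
  emax     = doses / (0.2 + doses),                # guesstimate ED50 = 0.2
  logistic = 1 / (1 + exp((0.25 - doses) / 0.09))  # guesstimates (ED50, delta) = (0.25, 0.09)
)

## Balanced design: center each mean vector and scale it to unit length
optC <- sapply(mu, function(m) {
  cm <- m - mean(m)
  round(cm / sqrt(sum(cm^2)), 3)
})
rownames(optC) <- doses
optC
```

The rounded columns reproduce the Optimal Contrasts table from the summary output, e.g. (-0.643, -0.361, 0.061, 0.413, 0.530) for the Emax shape.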

Figure 5.12 Fitted model for the biom data. [Plot of the fitted emax model against dose: individual responses, model predictions, pointwise confidence intervals, and the estimated dose.]
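As a rough cross-check of the dose estimate, we can solve the fitted Emax model from the summary output for the dose at which the predicted effect over placebo reaches the clinical relevance threshold ∆ = 0.4. This is only the point-estimate part of the MED2 rule; MED2 additionally involves a confidence bound condition (here gamma = 0.1), which is why the reported estimate of 0.17 differs slightly from the value obtained here.

```r
## Fitted Emax model from summary(res): f(d) = e0 + eMax * d / (ed50 + d)
e0 <- 0.322; eMax <- 0.746; ed50 <- 0.142
Delta <- 0.4  # clinical relevance threshold

## Solve eMax * d / (ed50 + d) = Delta for d:
##   d = Delta * ed50 / (eMax - Delta)
d.star <- Delta * ed50 / (eMax - Delta)
round(d.star, 3)  # 0.164, close to the reported MED2 estimate of 0.17
```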

Bibliography

Agresti, A., Bini, M., Bertaccini, B., and Ryu, E. (2008), "Simultaneous confidence intervals for comparing binomial parameters," Biometrics, 64, 1270–1275. Cited on p. 52.
Altman, D. G. (1991), Practical Statistics for Medical Research, London: Chapman & Hall. Cited on p. 3.
Anderson, K. (2010), gsDesign: Group Sequential Design, URL http://CRAN.R-project.org/package=gsDesign, R package version 2.3. Cited on p. 140, 141, 145, 149.
Armitage, P., McPherson, C. K., and Rowe, B. C. (1969), "Repeated significance tests on accumulating data," Journal of the Royal Statistical Society, Series A, 132, 235–244. Cited on p. 142.
Bartholomew, D. J. (1961), "Ordered tests in the analysis of variance," Biometrika, 48, 325–332. Cited on p. 103.
Bates, D. and Sarkar, D. (2010), lme4: Linear Mixed-Effects Models Using S4 Classes, URL http://CRAN.R-project.org/package=lme4, R package version 0.999375-33. Cited on p. 126.
Bauer, P. (1997), "A note on multiple testing procedures in dose finding," Biometrics, 53, 1125–1128. Cited on p. 156.
Bauer, P., Hackl, P., Hommel, G., and Sonnemann, E. (1986), "Multiple testing of pairs of one-sided hypotheses," Metrika, 33, 121–127. Cited on p. 17.
Bauer, P. and Kieser, M. (1999), "Combining different phases in the development of medical treatments within a single trial," Statistics in Medicine, 18, 1833–1848. Cited on p. 151, 152.
Bauer, P. and Köhne, K. (1994), "Evaluation of experiments with adaptive interim analyses," Biometrics, 50, 1029–1041. Cited on p. 150, 151, 152.
Bauer, P., Röhmel, J., Maurer, W., and Hothorn, L. (1998), "Testing strategies in multi-dose experiments including active control," Statistics in Medicine, 17, 2133–2146. Cited on p. 28.
Benjamini, Y. (2010), "Simultaneous and selective inference: current successes and future challenges," Biometrical Journal, (in press). Cited on p. 14.
Benjamini, Y. and Hochberg, Y. (1995), "Controlling the false discovery rate: A practical and powerful approach to multiple testing," Journal of the Royal Statistical Society, Series B, 57, 289–300. Cited on p. 13, 61.


Benjamini, Y. and Hochberg, Y. (2000), "On the adaptive control of the false discovery rate in multiple testing with independent statistics," Journal of Educational and Behavioral Statistics, 25, 60–83. Cited on p. 35.
Benjamini, Y. and Yekutieli, D. L. (2001), "The control of the false discovery rate in multiple testing under dependency," The Annals of Statistics, 29, 1165–1188. Cited on p. 61.
Berger, R. L. (1982), "Multi-parameter hypothesis testing and acceptance sampling," Technometrics, 24, 295–300. Cited on p. 22.
Bergmann, B. and Hommel, G. (1988), "Improvements of general multiple test procedures for redundant systems of hypotheses," in Multiple Hypothesenprüfung - Multiple Hypotheses Testing, eds. P. Bauer, G. Hommel, and E. Sonnemann, Heidelberg: Springer, pp. 100–115. Cited on p. 34, 95.
Bernhard, G. (1992), Computergestützte Durchführung von multiplen Testprozeduren, PhD thesis, Universität Mainz, Germany. Cited on p. 95.
Berry, D. (2007), "The difficult and ubiquitous problems of multiplicities," Pharmaceutical Statistics, 6, 155–160. Cited on p. xiv.
Bofinger, E. (1987), "Step down procedures for comparison with a control," Australian Journal of Statistics, 29, 348–364. Cited on p. 81.
Bönsch, D., Lederer, T., Reulbach, U., Hothorn, T., Kornhuber, J., and Bleich, S. (2005), "Joint analysis of the NACP-REP1 marker within the alpha synuclein gene concludes association with alcohol dependence," Human Molecular Genetics, 14, 967–971. Cited on p. 114.
Bornkamp, B. (2006), Comparison of model-based and model-free approaches for the analysis of dose-response studies, Diploma thesis, Fachbereich Statistik, Universität Dortmund, Germany. Cited on p. 159.
Bornkamp, B., Pinheiro, J., and Bretz, F. (2009), "MCPMod - an R package for the design and analysis of dose-finding studies," Journal of Statistical Software, 29, 1–23. Cited on p. 162.
Bornkamp, B., Pinheiro, J., and Bretz, F. (2010), DoseFinding: Planning and Analyzing Dose Finding Experiments, URL http://CRAN.R-project.org/package=DoseFinding, R package version 0.2-1. Cited on p. 157.
Brannath, W. and Bretz, F. (2010), "Shortcuts for locally consonant closed test procedures," Journal of the American Statistical Association, (in press). Cited on p. 20, 28, 135.
Brannath, W., Mehta, C. R., and Posch, M. (2009a), "Exact confidence bounds following adaptive group sequential tests," Biometrics, 65, 539–546. Cited on p. 149.
Brannath, W., Zuber, E., Branson, M., Bretz, F., Gallo, P., Posch, M., and Racine-Poon, A. (2009b), "Confirmatory adaptive designs with Bayesian decision tools for a targeted therapy in oncology," Statistics in Medicine, 28, 1445–1463. Cited on p. 152.

Bretz, F. (2006), "An extension of the Williams trend test to general unbalanced linear models," Computational Statistics & Data Analysis, 50, 1735–1748. Cited on p. 100, 104, 106.
Bretz, F., Hayter, A., and Genz, A. (2001), "Critical point and power calculations for the studentized range test for generally correlated means," Journal of Statistical Computation and Simulation, 71, 85–99. Cited on p. 84.
Bretz, F. and Hothorn, L. A. (2003), "Comparison of exact and resampling based multiple test procedures," Communications in Statistics - Computation and Simulation, 32, 461–473. Cited on p. 140.
Bretz, F., Hothorn, L. A., and Hsu, J. C. (2003), "Identifying effective and/or safe doses by stepwise confidence intervals for ratios," Statistics in Medicine, 22, 847–858. Cited on p. 30, 157.
Bretz, F., Hothorn, T., and Westfall, P. (2008a), "Multiple comparison procedures in linear models," in COMPSTAT 2008 - Proceedings in Computational Statistics, ed. P. Brito, Heidelberg: Physica Verlag, pp. 423–431. Cited on p. 41.
Bretz, F., Hsu, J. C., Pinheiro, J. C., and Liu, Y. (2008b), "Dose finding - a challenge in statistics," Biometrical Journal, 50, 480–504. Cited on p. 100.
Bretz, F., König, F., Brannath, W., Glimm, E., and Posch, M. (2009a), "Adaptive designs for confirmatory clinical trials," Statistics in Medicine, 28, 1181–1217. Cited on p. 151.
Bretz, F., Maurer, W., Brannath, W., and Posch, M. (2009b), "A graphical approach to sequentially rejective multiple test procedures," Statistics in Medicine, 28, 586–604. Cited on p. 28, 35.
Bretz, F., Maurer, W., and Gallo, P. (2009c), "Discussion of some controversial multiple testing problems in regulatory applications by H.M.J. Hung and S.J. Wang," Journal of Biopharmaceutical Statistics, 19, 25–34. Cited on p. 7.
Bretz, F., Maurer, W., and Hommel, G. (2010), "Test and power considerations for multiple endpoint analyses using sequentially rejective graphical procedures," Statistics in Medicine, (in press). Cited on p. 16, 28.
Bretz, F., Pinheiro, J. C., and Branson, M. (2004), "On a hybrid method in dose finding studies," Methods of Information in Medicine, 43, 457–461. Cited on p. 157.
Bretz, F., Pinheiro, J. C., and Branson, M. (2005), "Combining multiple comparisons and modeling techniques in dose-response studies," Biometrics, 61, 738–748. Cited on p. 157, 158, 159, 161, 162.
Bretz, F., Schmidli, H., König, F., Racine, A., and Maurer, W. (2006), "Confirmatory seamless phase II/III clinical trials with hypotheses selection at interim: General concepts (with discussion)," Biometrical Journal, 48, 623–634. Cited on p. 151.

Bretz, F. and Wang, S. J. (2010), "Recent developments in adaptive designs. Part II: Success probabilities and effect estimates for Phase III development programs via modern protocol design," Drug Information Journal, 44, 333–342. Cited on p. 155.
Brown, L. D. (1984), "A note on the Tukey-Kramer procedure for pairwise comparisons of correlated means," in Design of Experiments: Ranking and Selection (Essays in Honour of Robert E. Bechhofer), eds. T. J. Santner and A. C. Tamhane, New York: Marcel Dekker, pp. 1–6. Cited on p. 84.
Buckland, S. T., Burnham, K. P., and Augustin, N. H. (1997), "Model selection: An integral part of inference," Biometrics, 53, 603–618. Cited on p. 162.
Bullinger, L., Döhner, K., Bair, E., Fröhlich, S., Schlenk, R. F., Tibshirani, R., Döhner, H., and Pollack, J. R. (2004), "Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia," New England Journal of Medicine, 350, 1605–1616. Cited on p. 124.
Burman, C. F., Sonesson, C., and Guilbaud, O. (2009), "A recycling framework for the construction of Bonferroni-based multiple tests," Statistics in Medicine, 28, 739–761. Cited on p. 28.
Calian, V., Li, D., and Hsu, J. C. (2008), "Partitioning to uncover conditions for permutation tests to control multiple testing error rates," Biometrical Journal, 50, 756–766. Cited on p. 134.
Chevret, S. (2006), Statistical Methods for Dose Finding Experiments, New York: Wiley. Cited on p. 100.
Chuang-Stein, C., Anderson, K., Gallo, P., and Collins, S. (2006), "Sample size re-estimation: A review and recommendations," Drug Information Journal, 40, 475–484. Cited on p. 140.
Cui, L., Hung, H. M. J., and Wang, S. (1999), "Modification of sample size in group sequential clinical trials," Biometrics, 55, 321–324. Cited on p. 152.
Dalgaard, P. (2002), Introductory Statistics with R, New York: Springer. Cited on p. xv, 3.
Dalgaard, P. (2010), ISwR: Introductory Statistics with R, URL http://CRAN.R-project.org/package=ISwR, R package version 2.0-5. Cited on p. 3.
Dette, H., Bretz, F., Pepelyshev, A., and Pinheiro, J. C. (2008), "Optimal designs for dose finding studies," Journal of the American Statistical Association, 103, 1225–1237. Cited on p. 157.
Dmitrienko, A., Offen, W. W., and Westfall, P. H. (2003), "Gatekeeping strategies for clinical trials that do not require all primary effects to be significant," Statistics in Medicine, 22, 2387–2400. Cited on p. 28, 35.
Dmitrienko, A., Tamhane, A. C., and Bretz, F. (2009), Multiple Testing Problems in Pharmaceutical Statistics, Boca Raton: Taylor and Francis. Cited on p. xiii.

Dudoit, S. and van der Laan, M. J. (2008), Multiple Testing Procedures with Applications to Genomics, New York: Springer. Cited on p. xiii, 12, 127, 137.
Dunnett, C. W. (1955), "A multiple comparison procedure for comparing several treatments with a control," Journal of the American Statistical Association, 50, 1096–1121. Cited on p. 72.
Dunnett, C. W. and Tamhane, A. C. (1991), "Step-down multiple tests for comparing treatments with a control in unbalanced one-way layouts," Statistics in Medicine, 10, 939–947. Cited on p. 78.
EMEA (2002), CPMP points to consider on multiplicity issues in clinical trials, European Agency for the Evaluation of Medicinal Products, London, URL http://www.emea.europa.eu/pdfs/human/ewp/090899en.pdf. Cited on p. 9.
EMEA (2009), CHMP guideline on clinical development of fixed combination medicinal products, European Agency for the Evaluation of Medicinal Products, London, URL http://www.emea.europa.eu/pdfs/human/ewp/024095enfin.pdf. Cited on p. 22.
Everitt, B. S. and Hothorn, T. (2009), A Handbook of Statistical Analyses Using R, Boca Raton, Florida: Chapman & Hall/CRC Press, 2nd edition. Cited on p. xv.
Finner, H. (1988), "Abgeschlossene multiple Spannweitentests," in Multiple Hypothesenprüfung - Multiple Hypotheses Testing, eds. P. Bauer, G. Hommel, and E. Sonnemann, Berlin: Springer, pp. 10–32. Cited on p. 93.
Finner, H. (1999), "Stepwise multiple test procedures and control of directional errors," The Annals of Statistics, 27, 274–289. Cited on p. 17.
Finner, H., Giani, G., and Strassburger, K. (2006), "Partitioning principle and selection of good populations," Journal of Statistical Planning and Inference, 136, 2053–2069. Cited on p. 30.
Finner, H. and Gontscharuk, V. (2009), "Controlling the familywise error rate with plug-in estimator for the proportion of true null hypotheses," Journal of the Royal Statistical Society, Series B, 71, 1031–1048. Cited on p. 35.
Finner, H. and Strassburger, K. (2002), "The partitioning principle: A powerful tool in multiple decision theory," The Annals of Statistics, 30, 1194–1213. Cited on p. 11, 18, 29, 30.
Finner, H. and Strassburger, K. (2006), "On δ-equivalence with the best in k-sample models," Journal of the American Statistical Association, 101, 737–746. Cited on p. 31.
Fisher, L. D. (1998), "Self-designing clinical trials," Statistics in Medicine, 17, 1551–1562. Cited on p. 151.
Gabriel, K. R. (1969), "Simultaneous test procedures - some theory of multiple comparisons," The Annals of Mathematical Statistics, 40, 224–250. Cited on p. 11, 20, 21.

Garcia, A. L., Wagner, K., Hothorn, T., Koebnick, C., Zunft, H. J. F., and Trippo, U. (2005), "Improved prediction of body fat by measuring skinfold thickness, circumferences, and bone breadths," Obesity Research, 13, 626–634. Cited on p. 108.
Genz, A. and Bretz, F. (1999), "Numerical computation of multivariate t-probabilities with application to power calculation of multiple contrasts," Journal of Statistical Computation and Simulation, 63, 361–378. Cited on p. 44, 51, 161.
Genz, A. and Bretz, F. (2002), "Methods for the computation of multivariate t-probabilities," Journal of Computational and Graphical Statistics, 11, 950–971. Cited on p. 44, 51, 161.
Genz, A. and Bretz, F. (2009), Computation of Multivariate Normal and t Probabilities, volume 195 of Lecture Notes in Statistics, Heidelberg: Springer. Cited on p. 44, 45, 51, 73, 84, 161.
Genz, A., Bretz, F., and Hothorn, T. (2010), mvtnorm: Multivariate Normal and t Distribution, URL http://CRAN.R-project.org/package=mvtnorm, R package version 0.9-91. Cited on p. 63.
Grechanovsky, E. and Hochberg, Y. (1999), "Closed procedures are better and often admit a shortcut," Journal of Statistical Planning and Inference, 76, 79–91. Cited on p. 28.
Guilbaud, O. (2008), "Simultaneous confidence regions corresponding to Holm's step-down procedure and other closed-testing procedures," Biometrical Journal, 50, 678–692. Cited on p. 19, 34, 81.
Guilbaud, O. (2009), "Alternative confidence regions for Bonferroni-based closed-testing procedures that are not alpha-exhaustive," Biometrical Journal, 51, 721–735. Cited on p. 19.
Hack, N. and Brannath, W. (2009), AGSDest: Estimation in Adaptive Group Sequential Trials, URL http://CRAN.R-project.org/package=AGSDest, R package version 1.0. Cited on p. 149.
Hayter, A. J. (1984), "A proof of the conjecture that the Tukey-Kramer multiple comparisons procedure is conservative," The Annals of Statistics, 12, 61–75. Cited on p. 84.
Hayter, A. J. (1989), "Pairwise comparisons of generally correlated means," Journal of the American Statistical Association, 84, 208–213. Cited on p. 84.
Hayter, A. J. (1990), "A one-sided studentized range test for testing against ordered alternatives," Journal of the American Statistical Association, 85, 778–785. Cited on p. 85.
Hayter, A. J. and Hsu, J. C. (1994), "On the relationship between stepwise decision procedures and confidence sets," Journal of the American Statistical Association, 89, 128–136. Cited on p. 18, 29.
Heiberger, R. M. (2009), HH: Statistical Analysis and Data Display: Heiberger and Holland, URL http://CRAN.R-project.org/package=HH, R package version 2.1-32. Cited on p. 65, 90.

Heiberger, R. M. and Holland, B. (2004), Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R, and SAS, New York: Springer. Cited on p. 89, 93.
Heiberger, R. M. and Holland, B. (2006), "Mean-mean multiple comparison displays for families of linear contrasts," Journal of Computational and Graphical Statistics, 15, 937–955. Cited on p. 89, 91, 93.
Herberich, E. (2009), Niveau und Güte simultaner parametrischer Inferenzverfahren, Diploma thesis (in German), Institut für Statistik, Universität München, Germany. Cited on p. 52.
Hochberg, Y. and Tamhane, A. C. (1987), Multiple Comparison Procedures, New York: John Wiley & Sons. Cited on p. xiii, 11, 22, 41, 93.
Hochberg, Y. (1988), "A sharper Bonferroni procedure for multiple significance testing," Biometrika, 75, 800–802. Cited on p. 38.
Holm, S. (1979a), "A simple sequentially rejective multiple test procedure," Scandinavian Journal of Statistics, 6, 65–70. Cited on p. 17, 19, 28, 31, 32, 33, 35.
Holm, S. (1979b), A stagewise directional test based on t-statistics, Statistical Research Report 1979-3, Institute of Mathematics, Chalmers University of Technology, Gothenburg. Cited on p. 17.
Hommel, G. (1983), "Tests of the overall hypothesis for arbitrary dependence structures," Biometrical Journal, 25, 423–430. Cited on p. 37.
Hommel, G. (1988), "A stagewise rejective multiple test procedure based on a modified Bonferroni test," Biometrika, 75, 383–386. Cited on p. 36, 38.
Hommel, G. (1989), "A comparison of two modified Bonferroni procedures," Biometrika, 76, 624–625. Cited on p. 38.
Hommel, G. (2001), "Adaptive modifications of hypotheses after an interim analysis," Biometrical Journal, 43, 581–589. Cited on p. 151, 152.
Hommel, G. and Bernhard, G. (1992), "Multiple hypotheses testing," in Computational Aspects of Model Choice, ed. J. Antoch, Heidelberg: Physica, pp. 211–235. Cited on p. 95.
Hommel, G. and Bernhard, G. (1999), "Bonferroni procedures for logically related hypotheses," Journal of Statistical Planning and Inference, 82, 119–128. Cited on p. 34, 95.
Hommel, G. and Bretz, F. (2008), "Aesthetics and power considerations in multiple testing – a contradiction?" Biometrical Journal, 50, 657–666. Cited on p. 20, 34, 95.
Hommel, G., Bretz, F., and Maurer, W. (2007), "Powerful short-cuts for multiple testing procedures with special reference to gatekeeping strategies," Statistics in Medicine, 26, 4063–4073. Cited on p. 28, 35.
Hommel, G. and Hoffmann, T. (1988), "Controlled uncertainty," in Multiple Hypothesenprüfung - Multiple Hypotheses Testing, eds. P. Bauer, G. Hommel, and E. Sonnemann, Heidelberg: Springer, pp. 154–161. Cited on p. 13.

Hothorn, T., Bretz, F., and Genz, A. (2001), "On multivariate t and Gauß probabilities in R," R News, 1, 27–29. Cited on p. 45, 161.
Hothorn, T., Bretz, F., and Westfall, P. (2008), "Simultaneous inference in general parametric models," Biometrical Journal, 50, 346–363. Cited on p. 48, 49, 50, 69.
Hothorn, T., Bretz, F., Westfall, P., Heiberger, R. M., and Schützenmeister, A. (2010a), multcomp: Simultaneous Inference for General Linear Hypotheses, URL http://CRAN.R-project.org/package=multcomp, R package version 1.1-7. Cited on p. 53, 56, 66.
Hothorn, T., Hornik, K., van de Wiel, M., and Zeileis, A. (2010b), coin: Conditional Inference Procedures in a Permutation Test Framework, URL http://CRAN.R-project.org/package=coin, R package version 1.0-11. Cited on p. 127.
Hothorn, T., Hornik, K., van de Wiel, M. A., and Zeileis, A. (2006), "A Lego system for conditional inference," The American Statistician, 60, 257–263. Cited on p. 118, 137.
Hsu, J. C. (1996), Multiple Comparisons, London: Chapman & Hall. Cited on p. xiii, 11, 41, 85.
Hsu, J. C. and Berger, R. L. (1999), "Stepwise confidence intervals without multiplicity adjustment for dose-response and toxicity studies," Journal of the American Statistical Association, 94, 468–482. Cited on p. 30, 156.
Hsu, J. C. and Peruggia, M. (1994), "Graphical representations of Tukey's multiple comparison method," Journal of Computational and Graphical Statistics, 3, 143–161. Cited on p. 89.
Hsueh, H., Chen, J. J., and Kodell, R. L. (2003), "Comparison of methods for estimating the number of true null hypotheses in multiplicity testing," Journal of Biopharmaceutical Statistics, 13, 675–689. Cited on p. 35.
Hwang, I. K., Shih, W. J., and DeCani, J. S. (1990), "Group sequential designs using a family of Type I error probability spending functions," Statistics in Medicine, 9, 1439–1445. Cited on p. 144.
ICH (1998), ICH Topic E9: Notes for Guidance on Statistical Principles for Clinical Trials, International Conference on Harmonization, London, URL http://www.emea.europa.eu/pdfs/human/ich/036396en.pdf. Cited on p. 8.
Ihaka, R. and Gentleman, R. (1996), "R: A language for data analysis and graphics," Journal of Computational and Graphical Statistics, 5, 299–314. Cited on p. xiv.
Jennison, C. and Turnbull, B. W. (2000), Group Sequential Methods with Applications to Clinical Trials, Boca Raton: Chapman and Hall. Cited on p. 141.
Kieser, M., Bauer, P., and Lehmacher, W. (1999), "Inference on multiple endpoints in clinical trials with adaptive interim analyses," Biometrical Journal, 41, 261–277. Cited on p. 151, 152.

Kleinbaum, D. G., Kupper, L. L., Muller, K. E., and Nizam, A. (1998), Applied Regression Analysis and Other Multivariable Methods, North Scituate, MA: Duxbury Press. Cited on p. 111, 113.
Korn, E. L., Troendle, J. F., McShane, L. M., and Simon, R. (2004), "Controlling the number of false discoveries: Application to high-dimensional genomic data," Journal of Statistical Planning and Inference, 124, 379–398. Cited on p. 14.
Kotz, S., Balakrishnan, N., and Johnson, N. L. (2000), Continuous Multivariate Distributions. Volume 1: Models and Applications, New York: Wiley. Cited on p. 44.
Kotz, S. and Nadarajah, S. (2004), Multivariate t Distributions and Their Applications, Cambridge: Cambridge University Press. Cited on p. 44.
Krishna, R. (2006), Dose Optimization in Drug Development, New York: Informa Healthcare. Cited on p. 100.
Lan, K. K. G. and DeMets, D. L. (1983), "Discrete sequential boundaries for clinical trials," Biometrika, 70, 659–663. Cited on p. 142.
Laska, E. M. and Meisner, M. J. (1989), "Testing whether an identified treatment is best," Biometrics, 45, 1139–1151. Cited on p. 22.
Lehmacher, W. and Wassmer, G. (1999), "Adaptive sample size calculations in group sequential trials," Biometrics, 55, 1286–1290. Cited on p. 152.
Lehmann, E. L. (1957a), "A theory of some multiple decision problems I," The Annals of Mathematical Statistics, 28, 1–25. Cited on p. 11.
Lehmann, E. L. (1957b), "A theory of some multiple decision problems II," The Annals of Mathematical Statistics, 28, 547–572. Cited on p. 11.
Lehmann, E. L. (1986), Testing Statistical Hypotheses, New York: Wiley. Cited on p. 18.
Lehmann, E. L. and Romano, J. P. (2005), "Generalizations of the familywise error rate," The Annals of Statistics, 33, 1138–1154. Cited on p. 13.
Liu, W. (2010), Simultaneous Inference for Regression, Boca Raton: Taylor and Francis. Cited on p. 12, 114.
Liu, W., Jamshidian, M., and Zhang, Y. (2004), "Multiple comparison of several linear regression models," Journal of the American Statistical Association, 99, 395–403. Cited on p. 112.
Liu, W., Jamshidian, M., Zhang, Y., Bretz, F., and Han, X. (2007a), "Some new methods for the comparison of two linear regression models," Journal of Statistical Planning and Inference, 137, 57–67. Cited on p. 112, 113.
Liu, Y. and Hsu, J. (2009), "Testing for efficacy in primary and secondary endpoints by partitioning decision paths," Journal of the American Statistical Association, 104, 1661–1670. Cited on p. 30.
Liu, Y., Hsu, J. C., and Ruberg, S. (2007b), "Partition testing in dose-response studies with multiple endpoints," Pharmaceutical Statistics, 6, 181–192. Cited on p. 30.

Marcus, R. (1976), "The power of some tests of the equality of normal means against an ordered alternative," Biometrika, 63, 177–183. Cited on p. 103, 104, 105, 106, 157.
Marcus, R., Peritz, E., and Gabriel, K. R. (1976), "On closed testing procedures with special reference to ordered analysis of variance," Biometrika, 63, 655–660. Cited on p. 11, 23, 25, 78.
Maurer, W., Hothorn, L., and Lehmacher, W. (1995), "Multiple comparisons in drug clinical trials and preclinical assays: a-priori ordered hypotheses," in Biometrie in der chemisch-pharmazeutischen Industrie, ed. J. Vollmar, Stuttgart: Fischer Verlag, pp. 3–18. Cited on p. 28.
Maurer, W. and Mellein, B. (1988), "On new multiple test procedures based on independent p-values and the assessment of their power," in Multiple Hypothesenprüfung - Multiple Hypotheses Testing, eds. P. Bauer, G. Hommel, and E. Sonnemann, Heidelberg: Springer, pp. 48–66. Cited on p. 16.
Mehta, C. R., Bauer, P., Posch, M., and Brannath, W. (2007), "Repeated confidence intervals for adaptive group sequential trials," Statistics in Medicine, 26, 5422–5433. Cited on p. 149.
Müller, H. H. and Schäfer, H. (2001), "Adaptive group sequential designs for clinical trials: combining the advantages of adaptive and of classical group sequential approaches," Biometrics, 57, 886–891. Cited on p. 148, 151.
Naik, U. D. (1975), "Some selection rules for comparing p processes with a standard," Communications in Statistics, 4, 519–535. Cited on p. 78.
O'Brien, P. C. and Fleming, T. R. (1979), "A multiple testing procedure for clinical trials," Biometrics, 35, 549–556. Cited on p. 141, 142, 144.
Parsons, N. (2010), asd: Simulations for Adaptive Seamless Designs, URL http://CRAN.R-project.org/package=asd, R package version 1.0. Cited on p. 140, 153.
Parsons, N., Friede, T., Todd, S., Valdes-Marquez, E., Chataway, J., Nicholas, R., and Stallard, N. (2010), "An R package for implementing simulations for seamless phase II/III clinical trials using early outcomes for treatment selection," (submitted). Cited on p. 153.
Patel, H. I. (1991), "Comparison of treatments in a combination therapy trial," Journal of Biopharmaceutical Statistics, 1, 171–183. Cited on p. 23.
Petrondas, D. A. and Gabriel, K. R. (1983), "Multiple comparisons by rerandomization," Journal of the American Statistical Association, 78, 949–957. Cited on p. 133.
Piepho, H.-P. (2004), "An algorithm for a letter-based representation of all-pairwise comparisons," Journal of Computational and Graphical Statistics, 13, 456–466. Cited on p. 66, 87.
Pinheiro, J., Bornkamp, B., and Bretz, F. (2006a), "Design and analysis of dose finding studies combining multiple comparisons and modeling procedures," Journal of Biopharmaceutical Statistics, 16, 639–656. Cited on p. 157.

Pinheiro, J., Bretz, F., and Branson, M. (2006b), "Analysis of dose-response studies: Modeling approaches," in Dose Finding in Drug Development, ed. N. Ting, New York: Springer Verlag, pp. 146–171. Cited on p. 159.
Pocock, S. J. (1977), "Group sequential methods in the design and analysis of clinical trials," Biometrika, 64, 191–199. Cited on p. 141, 142, 144.
Pollard, K. S., Gilbert, H. N., Ge, Y., Taylor, S., and Dudoit, S. (2010), multtest: Resampling-based Multiple Hypothesis Testing, URL http://CRAN.R-project.org/package=multtest, R package version 2.4.0. Cited on p. 137.
Posch, M., König, F., Branson, M., Brannath, W., Dunger-Baldauf, C., and Bauer, P. (2005), "Testing and estimation in flexible group sequential designs with adaptive treatment selection," Statistics in Medicine, 24, 3697–3714. Cited on p. 151.
Proschan, M., Lan, K. K. G., and Wittes, J. T. (2006), Statistical Monitoring of Clinical Trials - A Unified Approach, New York: Springer. Cited on p. 141.
Proschan, M. A. and Hunsberger, S. A. (1995), "Designed extension of studies based on conditional power," Biometrics, 51, 1315–1324. Cited on p. 151.
Ramsey, P. H. (1978), "Power differences between pairwise multiple comparisons," Journal of the American Statistical Association, 73, 479–487. Cited on p. 16.
Romano, J. P. and Wolf, M. (2005), "Exact and approximate stepdown methods for multiple hypothesis testing," Journal of the American Statistical Association, 100, 94–108. Cited on p. 28.
Rosenthal, R. and Rubin, D. B. (1983), "Ensemble-adjusted p-values," Psychological Bulletin, 94, 540–541. Cited on p. 35.
Rousseeuw, P., Croux, C., Todorov, V., Ruckstuhl, A., Salibian-Barrera, M., Verbeke, T., and Maechler, M. (2009), robustbase: Basic Robust Statistics, URL http://CRAN.R-project.org/package=robustbase, R package version 0.5-0-1. Cited on p. 110.
Rousseeuw, P. J. and Leroy, A. M. (2003), Robust Regression and Outlier Detection, New York: John Wiley & Sons. Cited on p. 52.
Roy, S. N. (1953), "On a heuristic method of test construction and its use in multivariate analysis," Annals of Mathematical Statistics, 24, 220–238. Cited on p. 20.
Roy, S. N. and Bose, R. C. (1953), "Simultaneous confidence interval estimation," Annals of Mathematical Statistics, 24, 513–536. Cited on p. 20.
Royen, T. (1991), "Generalized maximum range tests for pairwise comparisons of several populations," Biometrical Journal, 31, 905–929. Cited on p. 95.
Ruberg, S. J. (1995a), "Dose-response studies. I. Some design considerations," Journal of Biopharmaceutical Statistics, 5, 1–14. Cited on p. 100, 161.

Ruberg, S. J. (1995b), "Dose-response studies. II. Analysis and interpretation," Journal of Biopharmaceutical Statistics, 5, 15–42. Cited on p. 100.
Salib, E. and Hillier, V. (1997), "A case-control study of smoking and Alzheimer's disease," International Journal of Geriatric Psychiatry, 12, 295–300. Cited on p. 118, 123.
Samuel-Cahn, E. (1996), "Is the Simes' improved Bonferroni procedure conservative?" Biometrika, 83, 928–933. Cited on p. 37.
Sarkar, S. and Chang, C. K. (1997), "Simes' method for multiple hypothesis testing with positively dependent test statistics," Journal of the American Statistical Association, 92, 1601–1608. Cited on p. 38.

Sarkar, S. K. (1998), “Some probability inequalities for ordered MTP2 random variables: A proof of the Simes conjecture,” The Annals of Statistics, 26, 494–504. Cited on p. 38.

Sarkar, S. K., Snapinn, S., and Wang, W. (1995), “On improving the min test for the analysis of combination drug trials,” Journal of Statistical Computation and Simulation, 51, 197–213; correction: 60 (1998), 180–181. Cited on p. 23.

Schmidli, H., Bretz, F., Racine, A., and Maurer, W. (2006), “Confirmatory seamless phase II/III clinical trials with hypotheses selection at interim: Applications and practical considerations,” Biometrical Journal, 48, 635–643. Cited on p. 152.

Schweder, T. and Spjøtvoll, E. (1982), “Plots of p-values to evaluate many tests simultaneously,” Biometrika, 69, 493–502. Cited on p. 35.

Searle, S. R. (1971), Linear Models, New York: John Wiley & Sons. Cited on p. 41, 44.

Seeger, P. (1968), “A note on a method for the analysis of significance en masse,” Technometrics, 10, 586–593. Cited on p. 13.

Senn, S. and Bretz, F. (2007), “Power and sample size when multiple endpoints are considered,” Pharmaceutical Statistics, 6, 161–170. Cited on p. 16.

Serfling, R. J. (1980), Approximation Theorems of Mathematical Statistics, New York: John Wiley & Sons. Cited on p. 49, 50.

Shaffer, J. P. (1980), “Control of directional errors with stagewise multiple test procedures,” The Annals of Statistics, 8, 1342–1347. Cited on p. 17.

Shaffer, J. P. (1986), “Modified sequentially rejective multiple test procedures,” Journal of the American Statistical Association, 81, 826–831. Cited on p. 28, 34, 63, 95, 99, 100.

Šidák, Z. (1967), “Rectangular confidence regions for the means of multivariate normal distributions,” Journal of the American Statistical Association, 62, 626–633. Cited on p. 34.

Simes, R. J. (1986), “An improved Bonferroni procedure for multiple tests of significance,” Biometrika, 73, 751–754. Cited on p. 35, 37.

Solorzano, E. and Spurrier, J. D. (1999), “One-sided simultaneous comparisons with more than one control,” Journal of Statistical Computation and Simulation, 63, 37–46. Cited on p. 77.

Sonnemann, E. (1982), “Allgemeine Lösungen multipler Testprobleme,” EDV in Medizin und Biologie, 13, 120–128; translated by H. Finner and reprinted in: Sonnemann, E. (2008), “General solutions to multiple testing problems,” Biometrical Journal, 50, 641–656. Cited on p. 11.

Sonnemann, E. and Finner, H. (1988), “Vollständigkeitssätze für multiple Testprobleme,” in Multiple Hypothesenprüfung - Multiple Hypotheses Testing, eds. P. Bauer, G. Hommel, and E. Sonnemann, Heidelberg: Springer, pp. 121–135. Cited on p. 20.

Sorić, B. (1989), “Statistical ‘discoveries’ and effect size estimation,” Journal of the American Statistical Association, 84, 608–610. Cited on p. 13.

Spurrier, J. D. and Solorzano, E. (2004), “Multiple comparisons with more than one control,” in Recent Developments in Multiple Comparison Procedures, eds. Y. Benjamini, F. Bretz, and S. Sarkar, volume 47 of IMS Lecture Notes - Monograph Series, pp. 119–128. Cited on p. 77.

Stefansson, G., Kim, W. C., and Hsu, J. C. (1988), “On confidence sets in multiple comparisons,” in Statistical Decision Theory and Related Topics IV, eds. S. S. Gupta and J. O. Berger, New York: Academic Press, pp. 89–104. Cited on p. 11, 29.

Stewart, W. H. and Ruberg, S. (2000), “Detecting dose-response with contrasts,” Statistics in Medicine, 19, 913–921. Cited on p. 157.

Storey, J. D. (2002), “A direct approach to false discovery rates,” Journal of the Royal Statistical Society, Series B, 64, 479–498. Cited on p. 35.

Storey, J. D. (2003), “The positive false discovery rate: A Bayesian interpretation and the q-value,” The Annals of Statistics, 31, 2013–2035. Cited on p. 13.

Strassburger, K. and Bretz, F. (2008), “Compatible simultaneous lower confidence bounds for the Holm procedure and other Bonferroni based closed tests,” Statistics in Medicine, 27, 4914–4927. Cited on p. 19, 28, 31, 34, 81.

Strassburger, K., Bretz, F., and Finner, H. (2007), “Ordered multiple comparisons with the best and their applications to dose-response studies,” Biometrics, 63, 1143–1151. Cited on p. 30, 157.

Strassburger, K., Bretz, F., and Hochberg, Y. (2004), “Compatible confidence intervals for intersection union tests involving two hypotheses,” in Recent Developments in Multiple Comparison Procedures, eds. Y. Benjamini, F. Bretz, and S. Sarkar, Beachwood, Ohio: Institute of Mathematical Statistics, volume 47 of IMS Lecture Notes - Monograph Series, pp. 129–142. Cited on p. 19, 23, 30.

Strasser, H. and Weber, C. (1999), “On the asymptotic theory of permutation statistics,” Mathematical Methods of Statistics, 8, 220–250. Cited on p. 137.

Takeuchi, K. (1973), Studies in Some Aspects of Theoretical Foundations of Statistical Data Analysis, Tokyo: Toyo Keizai Shinposha (in Japanese). Cited on p. 29.

Takeuchi, K. (2010), “Basic ideas and concepts for multiple comparison procedures,” Biometrical Journal (in press). Cited on p. 29.

Tamhane, A. C., Dunnett, C. W., Green, J. W., and Wetherington, J. D. (2001), “Multiple test procedures for identifying the maximum safe dose,” Journal of the American Statistical Association, 96, 835–843. Cited on p. 156.

Tamhane, A. C., Hochberg, Y., and Dunnett, C. W. (1996), “Multiple test procedures for dose finding,” Biometrics, 52, 21–37. Cited on p. 156.

Tamhane, A. C. and Logan, B. (2006), “Multiple comparison procedures for dose response studies,” in Dose Finding in Drug Development, ed. N. Ting, New York: Springer Verlag, pp. 172–183. Cited on p. 157.

Ting, N. (2006), Dose Finding in Drug Development, New York: Springer. Cited on p. 100.

Tong, Y. L. (1980), Probability Inequalities in Multivariate Distributions, New York: Academic Press. Cited on p. 34.

Tong, Y. L. (1990), The Multivariate Normal Distribution, New York: Springer Verlag. Cited on p. 44.

Troendle, J. (1995), “A stepwise resampling method of multiple hypothesis testing,” Journal of the American Statistical Association, 90, 370–378. Cited on p. 127.

Troendle, J., Korn, E., and McShane, L. (2004), “An example of slow convergence of the bootstrap in high dimensions,” The American Statistician, 58, 25–29. Cited on p. 140.

Tukey, J. W. (1953), The Problem of Multiple Comparisons, unpublished manuscript; reprinted in: The Collected Works of John W. Tukey, Volume 8, 1994, H. I. Braun (ed.), New York: Chapman and Hall. Cited on p. 83, 100.

Tukey, J. W. (1977), Exploratory Data Analysis, Reading, MA: Addison-Wesley. Cited on p. 53.

Tukey, J. W., Ciminera, J. L., and Heyse, J. F. (1985), “Testing the statistical certainty of a response to increasing doses of a drug,” Biometrics, 41, 295–301. Cited on p. 157.

van der Laan, M. J., Dudoit, S., and Pollard, K. S. (2004), “Augmentation procedures for control of the generalized family-wise error rate and tail probabilities for the proportion of false positives,” Statistical Applications in Genetics and Molecular Biology, 3(1). Cited on p. 14.

Vandemeulebroecke, M. (2009), adaptTest: Adaptive Two-stage Tests, URL http://CRAN.R-project.org/package=adaptTest, R package version 1.0. Cited on p. 153.

Venables, W. N. and Ripley, B. D. (2002), Modern Applied Statistics with S, New York: Springer. Cited on p. 82, 83.

Victor, N. (1982), “Exploratory data-analysis and clinical research,” Methods of Information in Medicine, 21, 53–54. Cited on p. 13.

Wang, S. J., O’Neill, R. T., and Hung, H. M. J. (2007), “Approaches to evaluation of treatment effect in randomized clinical trials with genomic subset,” Pharmaceutical Statistics, 6, 227–244. Cited on p. 152.

Wang, S. K. and Tsiatis, A. A. (1987), “Approximately optimal one parameter boundaries for group sequential trials,” Biometrics, 43, 193–199. Cited on p. 142, 143, 144.

Wassmer, G. (1999), Statistische Testverfahren für gruppensequentielle und adaptive Pläne in klinischen Studien. Theoretische Konzepte und deren praktische Umsetzung mit SAS, Köln: Verlag Alexander Mönch. Cited on p. 141.

Wassmer, G. (2009), “Group sequential designs,” in Encyclopedia of Clinical Trials, eds. R. D’Agostino, L. Sullivan, and J. Massaro, Hoboken: Wiley. Cited on p. 141.

Westfall, P. (2005), “Combining p-values,” in Encyclopedia of Biostatistics, eds. P. Armitage and T. Colton, Chichester: Wiley, pp. 987–991. Cited on p. 22.

Westfall, P., Krishen, A., and Young, S. S. (1998), “Using prior information to allocate significance levels for multiple endpoints,” Statistics in Medicine, 17, 2107–2119. Cited on p. 139.

Westfall, P., Tobias, R., and Bretz, F. (2000), Estimating directional error rates of stepwise multiple comparison methods using distributed computing and variance reduction, Technical Report, URL http://support.sas.com/rnd/app/papers/directional.pdf. Cited on p. 17.

Westfall, P. H. (1997), “Multiple testing of general contrasts using logical constraints and correlations,” Journal of the American Statistical Association, 92, 299–306. Cited on p. 51, 63, 95, 98, 99, 100, 103, 104, 157.

Westfall, P. H. and Bretz, F. (2010), “Multiplicity in clinical trials,” in Encyclopedia of Biopharmaceutical Statistics, ed. S. C. Chow, New York: Marcel Dekker Inc. (in press). Cited on p. 6, 15.

Westfall, P. H. and Krishen, A. (2001), “Optimally weighted, fixed sequence, and gatekeeping multiple testing procedures,” Journal of Statistical Planning and Inference, 99, 25–40. Cited on p. 28.

Westfall, P. H., Kropf, S., and Finos, L. (2004), “Weighted FWE-controlling methods in high-dimensional situations,” in Recent Developments in Multiple Comparison Procedures, eds. Y. Benjamini, F. Bretz, and S. Sarkar, Beachwood, Ohio: Institute of Mathematical Statistics, volume 47 of IMS Lecture Notes - Monograph Series, pp. 143–154. Cited on p. 35.

Westfall, P. H. and Tobias, R. D. (2007), “Multiple testing of general contrasts: Truncated closure and the extended Shaffer-Royen method,” Journal of the American Statistical Association, 102, 487–494. Cited on p. 28, 34, 51, 63, 95.

Westfall, P. H., Tobias, R. D., Rom, D., Wolfinger, R. D., and Hochberg, Y. (1999), Multiple Comparisons and Multiple Tests Using the SAS System, Cary, NC: SAS Institute Inc. Cited on p. xiii, xvi, 16, 71, 113, 136.

Westfall, P. H. and Troendle, J. (2008), “Multiple testing with minimal assumptions,” Biometrical Journal, 50, 745–755. Cited on p. 127, 130, 133.

Westfall, P. H. and Young, S. S. (1993), Resampling-Based Multiple Testing, New York: Wiley. Cited on p. xiii, 18, 19, 28, 100, 101, 127, 133.

Wiens, B. L. (2003), “A fixed sequence Bonferroni procedure for testing multiple endpoints,” Pharmaceutical Statistics, 2, 211–215. Cited on p. 28.

Wiens, B. L. and Dmitrienko, A. (2005), “The fallback procedure for evaluating a single family of hypotheses,” Journal of Biopharmaceutical Statistics, 15, 929–942. Cited on p. 28.

Williams, D. A. (1971), “A test for difference between treatment means when several dose levels are compared with a zero dose control,” Biometrics, 27, 103–117. Cited on p. 103, 104, 105, 106, 157.

Wright, S. P. (1992), “Adjusted p-values for simultaneous inference,” Biometrics, 48, 1005–1013. Cited on p. 18.

Yohai, V. J. (1987), “High breakdown-point and high efficiency estimates for regression,” The Annals of Statistics, 15, 642–665. Cited on p. 110.

Zeileis, A. (2006), “Object-oriented computation of sandwich estimators,” Journal of Statistical Software, 16, 1–16. Cited on p. 117.
