Bias and Causation, Models and Judgment for Valid Comparisons By

Weisberg review by J. Pearl, pages 659-660. TECHNICAL REPORT Biometrics 68, 657–660 R-399 June 2012 June 2012 BOOK REVIEWS Frailty Models in Survival Analysis Design of Experiments: An Introduction Based on (A. Wienke) David Oakes Linear Models (M.D.Morris) Gary W. Oehlert Logistic Regression Models (J.M.Hilbe) Annette J. Dobson Models and Judgment for Valid Comparisons (H. I. Weisberg, Bias and Causation,) Judea Pearl WIENKE, A. Frailty Models in Survival Analysis. It is important to emphasize what this book is not. It does Chapman and Hall/CRC, Boca Raton, Florida, 2011. xxi + not give a structured “how-to” guide for the practitioner. Use- 301 pp. $104.95/£66.99, ISBN 9781420073881. ful references are given to packages and programs in SAS, STATA, and R, but no specific examples of code are presented. As usually understood in survival analysis, frailty models are The relevant chapter in Therneau and Grambsch (2000) is extensions of the proportional hazards model that incorpo- still an excellent source in this regard. Nor is the book a text- rate unobserved random multiplicative components into the book in the conventional sense. There are no exercises for hazard function. Frailty models have become quite popular in the reader, to develop and test comprehension. My biggest recent years both as a way of modeling otherwise unexplained complaint is that access to the raw datasets analyzed in the heterogeneity in survival data and as an approach to modeling book is not provided. These datasets will mostly be too large associations between survival times, as may arise for example to reproduce in print. However the book would have been in the study of lifetimes of twins or the times to loss of visual very much more useful if some form of web access had been acuity in the left and right eyes of diabetics. Unlike previ- provided to at least some of the datasets. Without access to ous books on this topic—Duchateau and Janssen (2008) and the raw data, the reader cannot verify the numerical results Hougaard (2000) come to mind—this book has a special focus presented or examine the sensitivity of the analyses to dif- on correlated frailty models for bivariate survival data. These ferent model specifications. And think how many additional models include separate independent frailty components for citations the author could accumulate if his work could be the pair and for each member of the pair. A strength of the referenced as a source for interesting data sets! (The same book is the wide variety of real datasets used to illustrate mod- criticism also applies to the book by Duchateau and Janssen, els and methods. Occasionally some idiosyncratic choices are though these authors do provide access to “pseudo datasets” made—for example the treatment variable was omitted from of the same structure as the real data sets to enable the reader a cancer dataset because the author believes that conclusions at least to go through the motions of a statistical analysis). about the effectiveness of a treatment should be based only on The exposition in Wienke’s book is generally clear, al- randomized studies, and the present data were observational. though it is obvious that the author is not writing in his native This leaves the reader wondering whether an important vari- language. The proof reading is good. Overall, I would recom- able might have been omitted from the analysis—thus void- mend this book to specialists for the breadth of its coverage ing any substantive conclusions that could be drawn from the of the literature and to other readers seeking to sample the data. flavor of ongoing methodological research in frailty models. This book will be a very useful reference for researchers in For practitioners wanting a more discursive introduction to the area. The concise summaries of relevant literature that the general concepts—as well as to the apparent paradoxes appear at intervals throughout the text are particularly valu- and interpretive issues that permeate this fascinating area—I able in this regard. Many formulas are presented. As a word recommend they first look at Chapters 6–8 of Aalen, Borgan, of warning, some of these formulas are very complicated. Of- and Gjessing. (2008). ten their derivation may be simple conceptually but complex and tedious in practice (at least if done by hand, it would be interesting to know if any use has been made of symbolic References manipulation software in developing or checking this work). Aalen, O. O., Borgan, O. and Gjessing, H. K. (2008). Survival and Event Although the author often makes valiant efforts to explain the History Analysis: A Process Viewpoint. New York: Springer. meaning of the formulas, they can still be quite forbidding. Duchateau, L. and Janssen, P. (2008). The Frailty Model. New York: Needless to say, apart from recognizing some “old friends,” Springer. this reviewer has not made any systematic attempt to verify Hougaard, P. (2000). Analysis of Multivariate Survival Data. New York: the derivations. Springer. C 2012, The International Biometric Society 657 658 Biometrics, June 2012 Therneau, T. and Grambsch, P. (2000) Modeling Survival Data: Ex- In addition to the expected topics such as assessing model tending the Cox Model. New York: Springer. fit and modelling overdispersion, some less common topics are covered in depth. These include a variety of alternative ways of specifying and fitting models for variables with more than David Oakes two categories, and a chapter on panel/clustered/longitudinal Department of Biostatistics and Computational Biology data. University of Rochester The supporting material on the website covers not just ty- Rochester, New York, USA pographical errors (which are extremely rare) and datasets, but revised sections of the book which the author felt needed clarification, as well as updates of newer work and software enhancements. HILBE, J. M. Logistic Regression Models. Chapman The focus is on helping the reader to learn and understand & Hall/CRC, Boca Raton, Florida, 2009. xviii + 637 pp. logistic regression. The audience is not just students meeting $83.95/£53.99. ISBN 9781420075755. the topic for the first time, but also experienced users. I be- lieve the book really does meet the author’s goal “to serve This book really does cover everything you ever wanted to as a handbook that researchers can turn to as a resource for know about logistic regression, at least what was worth know- applying this range of models to their data.” ing until 2008—with updates available on the author’s website http://works.bepress.com/joseph hilbe/. Hilbe, a former nnette J. obson national athletics champion, philosopher, and expert in as- A D tronomy, is a master at explaining statistical concepts and Professor of Biostatistics methods. Readers familiar with his other expository work will School of Population Health know what to expect—great clarity. University of Queensland, Australia The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see MORRIS, M. D. Design of Experiments: An Introduc- everything spelt out, while a person familiar with some of the tion Based on Linear Models. Chapman & Hall/CRC, topics has the option to skip “obvious” sections. The mate- Boca Raton, FL, 2011. xviii + 355 pp. $89.95/£59.99, ISBN rial has been thoroughly road-tested through classroom and 9781584889236. web-based teaching. Suggestions from students are graciously acknowledged and incorporated; for instance, there is an Morris’ text targets second or third year graduate students appendix of the Greek alphabet with the names of the let- who are getting their first look at experimental design after ters and how they are used in the statistical literature. having had a few core courses in statistical theory and meth- The prior knowledge assumed of readers is a solid back- ods, in particular, basic linear models. As such, it is more ground in linear regression, with an additional recommenda- mathematical than most introductory books on design of ex- tion of calculus and probability theory (which I think would periments, but it is not a “theory of design” text. As might also be essential, together with introductory level familiarity be expected, the book relies heavily on matrix algebra, but with the principles of statistical inference). There are exercises there are no advanced methods involved. at the end of every chapter with a solutions manual available There are three distinguishing characteristics that separate from the author or the publisher. this text from many related books. First, this book empha- Stata is used extensively, as might be expected given the sizes design quality, or more precisely, how features of vari- author’s long involvement with this and other statistical soft- ous designs such as nuisance parameters and random or fixed ware. There is a 27-page introduction to Stata illustrated with blocks affect the precision of estimates and the power of tests. numerical examples of logistic regression. Almost every sec- Second, this is not a book from which to learn data analysis tion has examples using Stata. The corresponding R code is for designed experiments. Chapter 6 covers residual analysis, provided at the end of every chapter. SAS and SPSS proce- multiple comparisons, and transformations in just a dozen dures are mentioned, mainly where it is necessary to point out pages. There is, however, at least one “real world” experi- differences from Stata. More detail on the use of SAS, SPSS, ment in each chapter to motivate the designs being used, and and other software for logistic regression is available on the usually one or two designs with data in the chapter prob- author’s website.

Bias and Causation, Models and Judgment for Valid Comparisons By

JSM 2017 in Baltimore the 2017 Joint Statistical Meetings in Baltimore, Maryland, Which Included the CONTENTS IMS Annual Meeting, Took Place from July 29 to August 3

2016 IAA Awards Outstanding Contributions to Astrostatistics Award

Cou Tdata Modeling Count Data

LRP2020: Astrostatistics in Canada Arxiv:1910.08857V1 [Astro-Ph.IM

Our Place in Space N Ted’S Tour America’S Future N New Deals in Scholarships on the Final Frontier

Data Analysis Using Regression and Multilevel/Hierarchical Models

STATA July 1993 TECHNICAL STB-14 BULLETIN a Publication to Promote Communication Among Stata Users

July 2008 Article in the IMS Bulletin

Faculty and Staff Arts and Humanities Ivisiond

Front Matter

Survival Analysis for Epidemiologic and Medical Research Steve Selvin Cambridge University Press, Cambridge, UK, 2008

STATA May 1992 TECHNICAL STB-7 BULLETIN a Publication to Promote Communication Among Stata Users