Biostatistics for Oral Healthcare

Biostatistics for Oral Healthcare Jay S. Kim, Ph.D. Loma Linda University School of Dentistry Loma Linda, California 92350 Ronald J. Dailey, Ph.D. Loma Linda University School of Dentistry Loma Linda, California 92350 Biostatistics for Oral Healthcare Biostatistics for Oral Healthcare Jay S. Kim, Ph.D. Loma Linda University School of Dentistry Loma Linda, California 92350 Ronald J. Dailey, Ph.D. Loma Linda University School of Dentistry Loma Linda, California 92350 JayS.Kim, PhD, is Professor of Biostatistics at Loma Linda University, CA. A specialist in this area, he has been teaching biostatistics since 1997 to students in public health, medical school, and dental school. Currently his primary responsibility is teaching biostatistics courses to hygiene students, predoctoral dental students, and dental residents. He also collaborates with the faculty members on a variety of research projects. Ronald J. Dailey is the Associate Dean for Academic Affairs at Loma Linda and an active member of American Dental Educational Association. C 2008 by Blackwell Munksgaard, a Blackwell Publishing Company Editorial Offices: Blackwell Publishing Professional, 2121 State Avenue, Ames, Iowa 50014-8300, USA Tel: +1 515 292 0140 9600 Garsington Road, Oxford OX4 2DQ Tel: 01865 776868 Blackwell Publishing Asia Pty Ltd, 550 Swanston Street, Carlton South, Victoria 3053, Australia Tel: +61 (0)3 9347 0300 Blackwell Wissenschafts Verlag, Kurfürstendamm57, 10707 Berlin, Germany Tel: +49 (0)30 32 79 060 The right of the Author to be identified as the Author of this Work has been asserted in accordance with the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher. First published 2008 by Blackwell Munksgaard, a Blackwell Publishing Company Library of Congress Cataloging-in-Publication Data Kim, Jay S. Biostatistics for oral healthcare / Jay S. Kim, Ronald J. Dailey. – 1st ed. p. ; cm. Includes bibliographical references and index. ISBN-13: 978-0-8138-2818-3 (alk. paper) ISBN-10: 0-8138-2818-X (alk. paper) 1. Dentistry–Statistical methods. 2. Biometry. I. Dailey, Ronald. II. Title. [DNLM: 1. Biometry–methods. 2. Dentistry. WA 950 K49b 2008] RK52.45.K46 2008 617.60072–dc22 2007027800 978-0-8138-2818-3 Set in 10/12pt Times by Aptara Inc., New Delhi, India Printed and bound in by C.O.S. Printers PTE LTD For further information on Blackwell Publishing, visit our website: www.blackwellpublishing.com Disclaimer The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting a specific method, diagnosis, or treatment by practitioners for any particular patient. The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. Readers should consult with a specialist where appropriate. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising herefrom. The last digit is the print number: 987654321 Contents Preface ix 3.12 Box-Whisker Plot 41 3.13 Variance and Standard Deviation 43 3.14 Coefficient of Variation 46 1 Introduction 1 3.15 Variance of Grouped Data 48 1.1 What Is Biostatistics? 1 3.16 Skewness 48 1.2 Why Do I Need Statistics? 2 3.17 Exercises 50 1.3 How Much Mathematics Do I Need? 2 3.18 References 53 1.4 How Do I Study Statistics? 2 1.5 Reference 3 4 Probability 55 4.1 Introduction 55 2 Summarizing Data and Clinical Trials 5 4.2 Sample Space and Events 55 2.1 Raw Data and Basic Terminology 5 4.3 Basic Properties of Probability 56 2.2 The Levels of Measurements 7 4.4 Independence and Mutually 2.3 Frequency Distributions 9 Exclusive Events 61 2.3.1 Frequency Tables 9 4.5 Conditional Probability 62 2.3.2 Relative Frequency 12 4.6 Bayes Theorem 65 2.4 Graphs 13 4.7 Rates and Proportions 69 2.4.1 Bar Graphs 13 4.7.1 Prevalence and Incidence 69 2.4.2 Pie Charts 14 4.7.2 Sensitivity and Specificity 70 2.4.3 Line Graph 14 4.7.3 Relative Risk and Odds Ratio 73 2.4.4 Histograms 15 4.8 Exercises 75 2.4.5 Stem and Leaf Plots 19 4.9 References 79 2.5 Clinical Trials and Designs 20 2.6 Confounding Variables 22 5 Probability Distributions 81 2.7 Exercises 22 5.1 Introduction 81 2.8 References 24 5.2 Binomial Distribution 81 5.3 Poisson Distribution 86 3 Measures of Central Tendency, 5.4 Poisson Approximation to Dispersion, and Skewness 27 Binomial Distribution 87 3.1 Introduction 27 5.5 Normal Distribution 88 3.2 Mean 27 5.5.1 Properties of Normal 3.3 Weighted Mean 30 Distribution 88 3.4 Median 31 5.5.2 Standard Normal Distribution 90 3.5 Mode 33 5.5.3 Using Normal Probability 3.6 Geometric Mean 34 Table 91 3.7 Harmonic Mean 34 5.5.4 Further Applications of 3.8 Mean and Median of Grouped Data 35 Normal Probability 94 − α th 3.9 Mean of Two or More Means 37 5.5.5 Finding the (1 ) 100 3.10 Range 38 Percentiles 95 3.11 Percentiles and Interquartile Range 39 5.5.6 Normal Approximation to the Binomial Distribution 96 v vi Contents 5.6 Exercises 99 9.2 Two-Sample Z Test for 5.7 References 102 Comparing Two Means 159 9.3 Two-Sample t Test for 6 Sampling Distributions 103 Comparing Two Means with 6.1 Introduction 103 Equal Variances 161 6.2 Sampling Distribution of the Mean 103 9.4 Two-Sample t Test for Comparing 6.2.1 Standard Error of the Two Means with Unequal Variances 163 Sample Mean 104 9.5 The Paired t Test 165 6.2.2 Central Limit Theorem 106 9.6 Z Test for Comparing Two 6.3 Student t Distribution 108 Binomial Proportions 168 6.4 Exercises 110 9.7 The Sample Size and Power of a 6.5 References 111 Two-Sample Test 170 9.7.1 Estimation of Sample Size 170 9.7.2 The Power of a 7 Confidence Intervals and Sample Size 113 Two-Sample Test 172 7.1 Introduction 113 9.8 The F Test for the Equality of 7.2 Confidence Intervals for the Mean Two Variances 173 μ and Sample Size n When σ 9.9 Exercises 176 Is Known 113 9.10 References 179 7.3 Confidence Intervals for the Mean μ and Sample Size n When σ Is Not Known 117 10 Categorical Data Analysis 181 7.4 Confidence Intervals for the 10.1 Introduction 181 Binomial Parameter p 119 10.2 2 × 2 Contingency Table 181 7.5 Confidence Intervals for the 10.3 r × c Contingency Table 187 Variances and Standard Deviations 121 10.4 The Cochran-Mantel-Haenszel 7.6 Exercises 124 Test 189 7.7 References 126 10.5 The McNemar Test 191 10.6 The Kappa Statistic 194 χ 2 8 Hypothesis Testing: One-Sample Case 127 10.7 Goodness-of-Fit Test 195 8.1 Introduction 127 10.8 Exercises 198 8.2 Concepts of Hypothesis Testing 128 10.9 References 201 8.3 One-Tailed Z Test of the Mean of σ 2 a Normal Distribution When 11 Regression Analysis and Correlation 203 Is Known 131 11.1 Introduction 203 8.4 Two-Tailed Z Test of the Mean of 11.2 Simple Linear Regression 203 σ 2 a Normal Distribution When 11.2.1 Description of Is Known 137 Regression Model 206 8.5 t Test of the Mean of a Normal 11.2.2 Estimation of Distribution 141 Regression Function 207 8.6 The Power of a Test and Sample 11.2.3 Aptness of a Model 209 Size 144 11.3 Correlation Coefficient 212 8.7 One-Sample Test for a Binomial 11.3.1 Significance Test of Proportion 148 Correlation Coefficient 215 χ 2 8.8 One-Sample Test for the 11.4 Coefficient of Determination 217 Variance of a Normal Distribution 150 11.5 Multiple Regression 219 8.9 Exercises 153 11.6 Logistic Regression 221 8.10 References 157 11.6.1 The Logistic Regression Model 221 9 Hypothesis Testing: Two-Sample Case 159 11.6.2 Fitting the Logistic 9.1 Introduction 159 Regression Model 222 Contents vii 11.7 Multiple Logistic Regression 14.10 The Squared Rank Test for Model 223 Variances 272 11.8 Exercises 223 14.11 Spearman’s Rank Correlation 11.9 References 225 Coefficient 274 14.12 Exercises 275 14.13 References 277 12 One-way Analysis of Variance 227 12.1 Introduction 227 12.2 Factors and Factor Levels 227 15 Survival Analysis 279 12.3 Statement of the Problem and 15.1 Introduction 279 Model Assumptions 228 15.2 Person-Time Method and 12.4 Basic Concepts in ANOVA 228 Mortality Rate 279 12.5 F Test for Comparison of k 15.3 Life Table Analysis 281 Population Means 229 15.4 Hazard Function 282 12.6 Multiple Comparison Procedures 234 15.5 Kaplan-Meier Product

Biostatistics for Oral Healthcare

Wavelet Entropy Energy Measure (WEEM): a Multiscale Measure to Grade a Geophysical System's Predictability

Big Data for Reliability Engineering: Threat and Opportunity

Volatility Estimation of Stock Prices Using Garch Method

A Python Library for Neuroimaging Based Machine Learning

BIOSTATS Documentation

Public Health & Intelligence

Measuring Predictability: Theory and Macroeconomic Applications," Journal of Applied Econometrics, 16, 657-669

Central Limit Theorems When Data Are Dependent: Addressing the Pedagogical Gaps

Measuring Complexity and Predictability of Time Series with Flexible Multiscale Entropy for Sensor Networks

Incubation Periods Impact the Spatial Predictability of Cholera and Ebola Outbreaks in Sierra Leone

Understanding Predictability

Randomness, Predictability, and Complexity in Repeated Interactions V0.02 Preliminary Version, Please Do Not Cite