Convergence of the Biostatistical and Survey Worlds

Thomas A. Louis, PhD
Department of Biostatistics, Johns Hopkins Bloomberg SPH
[email protected]
Research & Methodology, U. S. Census Bureau

T. A. Louis: Johns Hopkins Biostatistics & Census Bureau
McGill, Epidemiology/Biostatistics 50th, 2015

Outline
- The Census Bureau
- A sampling of research at Census
- Adaptive design
- Disclosure avoidance
- A few other topics
- Design-based/Model-based
- Convergence of the biostatistical and survey cultures

Preamble
- Historically, survey, biostatistical and epidemiological methods and cultures were quite distinct, or at least appeared to be so
- However, service as Associate Director for Research & Methodology and Chief Scientist at the U. S. Census Bureau has heightened my awareness of the similarities of goals and methods, and of the many potentials
- Convergence steadily increases to the benefit of all
- I highlight some examples, but first

HAPPY 50th!

The U. S. Census Bureau

Employees
- ≈ 15,000 employees, of whom ≈ 5,000 are on permanent appointments
- The remainder are primarily part-time interviewers and other field staff
- Central office in Suitland MD, and 6 Regional offices

Censuses
- The decennial census (the only activity embedded in the U. S. Constitution)
- The Population & Housing Census: every 10 years
- The Economic Census: every 5 years
- The Census of Governments: every 5 years
- Monthly Import/Export compilations

Selected surveys (of ≈ 130/yr)
- The American Community Survey (continuous)
- The Current Population Survey (CPS): includes Health Insurance Qs
- The Survey of Income and Program Participation (SIPP): ditto
- The National Survey of College Graduates
- The National Crime Victimization Survey (NCVS)
- The National Survey of Family Growth (NSFG)
- The Health Interview Survey
- International surveys and censuses

Adaptive Design

Goals & Methods
- Reduce the time/expense from the start of data collection to completion
- Efficiently allocate data collection resources
- Use dynamic mode-switching to increase efficiency and enhance quality (dynamic treatment regimens)
- Employ stopping rules (possibly stratum-specific)

Necessary Inputs
- Sampling frame (under-utilized in clinical and field trials)
- Paradata $\Rightarrow$ propensity models
- Cost & Quality metrics
- Measures of statistical information
- Timely and accurate data

R-indicators: Overview
- Based on the sampling frame and attributes, R-indicators quantify representativeness of survey coverage
- They identify the attributes that drive variation in response propensities and support adaptation by evaluating which subgroups are over/under represented
- The sample R-indicator, with $\rho_i$ the estimated (possibly adjusted) response propensity for group $i$ and $\bar{\rho}$ the mean:

  $$R(\rho) = 1 - 2\sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(\rho_i - \bar{\rho})^2}$$

- $R(\rho) = 1$ indicates that the sample is fully representative

The National Survey of College Graduates¹
- Data are collected by a variety of modes: web, telephone, ...
- The 2013 NSCG uses monitoring to identify target cases for mode-switching, with the goal of moving a case to the mode with the highest response propensity, or of controlling costs by not moving it
- Hold a case in web if it is "low impact"
- Switch to CATI (computer-assisted telephone interview) if it has not responded via web and is "high impact"
- Put a CATI case on hold (no contacts) if the "R-indicator" shows that the group is over-represented
- Strike an effective cost/quality tradeoff

¹ Thanks to Ben Reist

Comparison of incentive approaches in the NSCG
- 4 separate surveys, each using a different set of incentives, but with the same attributes used in the propensity model

Partial, unconditional, R-indicators
- Identify subgroups that are over/under represented and use the information to encourage or "not encourage" specific cases or groups
- Adapt by switching modes, incentives, etc.
- With $\rho_k$ the estimated (possibly adjusted) response propensity for group $X = k$, $\rho$ the composed vector, and $\bar{\rho}$ the (weighted) mean, the (partial) unconditional R-indicator is

  $$R_u(X = k; \rho) = \left(\frac{N_k}{N_+}\right)^{1/2}(\rho_k - \bar{\rho})$$

- It's a residual, and $R_u = 0 \Rightarrow$ balance

NSCG Data Monitoring Example
[Figure]

How long to wait before sending hard copy? Event-time analysis
- In the American Community Survey (ACS), we need to determine how long to wait for an internet response before sending hard copy
- Demographic group-specific event-time distributions were estimated, with the event being "answered via the internet"
- The event-time is administratively censored via sending hard copy, contacting by phone, etc.
- With $T$ the internet return time, compute

  $$P(s; d) = \mathrm{pr}(T \le s + d \mid T > s)$$

- Switch to hard copy if $P(s; d) < \gamma$ for a specified delay $d$
- Optimize wrt $(d, \gamma)$ to reduce delay and control costs

Internet response time distributions²
[Figure]

² From ACS Memorandum #ACS13-RER-18

Stopping rules
- When is there sufficient information to stop conducting interviews?
- The "stop and impute rule" $\hat{\theta}_{\mathrm{now}}$: use currently collected data, augmented by imputation of missing values
- The "project rule" $\hat{\theta}_{\mathrm{future}}$: collect a specified number of additional interviews, and then augment by imputation of missing values
- If a prediction model indicates that, for a specified tolerance $\epsilon$,

  $$\mathrm{pr}\left(|\hat{\theta}_{\mathrm{now}} - \hat{\theta}_{\mathrm{future}}| > \epsilon\right) < \gamma$$

  then stop and use $\hat{\theta}_{\mathrm{now}}$
- Similar to futility assessment in a clinical trial

Issues with adaptive designs
- Need robust approaches to avoid degrading quality due to inappropriate adaptation wrt identified subgroups of interest
- To avoid degrading coverage for other subgroups
- You are creating the database; don't mess it up!

Learning from data generated by an adaptive design is complex
- Adaptation may induce confounding that needs to be removed
- The good news is that the propensities are available
- The database may be less useful for learning than one produced non-adaptively
- There is a trade-off between generating a learning database and optimizing survey performance
- A single mode becomes a "vector mode"
- Very similar to issues in adaptive clinical trials
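
The two R-indicator formulas from the R-indicator slides can be sketched in a few lines of Python. This is only an illustrative sketch, not the Census Bureau's implementation: the function names, the input layout, and the example numbers are all hypothetical, and the weighted mean is assumed to use the group sizes $N_k$ as weights.

```python
from math import sqrt


def r_indicator(propensities):
    """Sample R-indicator: 1 - 2 * sqrt(sum_i (rho_i - rho_bar)^2 / (N - 1))."""
    n = len(propensities)
    if n < 2:
        raise ValueError("need at least two propensities")
    rho_bar = sum(propensities) / n
    variance = sum((rho - rho_bar) ** 2 for rho in propensities) / (n - 1)
    return 1.0 - 2.0 * sqrt(variance)


def partial_unconditional(groups):
    """Partial unconditional R-indicator for each group k.

    groups maps a group label k to the pair (N_k, rho_k).
    Assumption: rho_bar is the N_k-weighted mean of the rho_k.
    """
    n_plus = sum(n_k for n_k, _ in groups.values())
    rho_bar = sum(n_k * rho_k for n_k, rho_k in groups.values()) / n_plus
    return {
        k: sqrt(n_k / n_plus) * (rho_k - rho_bar)
        for k, (n_k, rho_k) in groups.items()
    }


# Hypothetical inputs, for illustration only.
unit_propensities = [0.4, 0.5, 0.6]
overall = r_indicator(unit_propensities)

group_summary = {"A": (100, 0.6), "B": (300, 0.4)}
partials = partial_unconditional(group_summary)
```

Under these assumptions, a positive partial value flags a group whose propensity is above the overall mean (a candidate for "not encouraging"), and a negative value flags a group that may need a mode switch or incentive.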
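
The ACS hard-copy rule, switch if $P(s; d) = \mathrm{pr}(T \le s + d \mid T > s) < \gamma$, might be sketched as follows. The slides do not say which estimator was used for the event-time distributions; this sketch assumes a Kaplan-Meier estimate of $S(t) = \mathrm{pr}(T > t)$, so that $P(s; d) = 1 - S(s + d)/S(s)$. All function names and the toy data are hypothetical.

```python
from bisect import bisect_right


def kaplan_meier(times, events):
    """Kaplan-Meier estimate for right-censored data.

    times  : follow-up time per case (hypothetical unit: days)
    events : 1 if the case answered via the internet at that time,
             0 if it was censored (hard copy sent, phone contact, ...)
    Returns (grid, curve): distinct event times and S(t) just after each.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    grid, curve = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        responded = 0
        removed = 0
        # Process all cases tied at time t together.
        while i < len(data) and data[i][0] == t:
            responded += data[i][1]
            removed += 1
            i += 1
        if responded > 0:
            surv *= 1.0 - responded / at_risk
            grid.append(t)
            curve.append(surv)
        at_risk -= removed
    return grid, curve


def survival_at(grid, curve, t):
    """S(t) = pr(T > t); equals 1.0 before the first observed event."""
    idx = bisect_right(grid, t)
    return 1.0 if idx == 0 else curve[idx - 1]


def conditional_response_prob(grid, curve, s, d):
    """P(s; d) = pr(T <= s + d | T > s) = 1 - S(s + d) / S(s)."""
    s_now = survival_at(grid, curve, s)
    if s_now == 0.0:
        return 0.0  # no one is estimated to still be waiting to respond
    return 1.0 - survival_at(grid, curve, s + d) / s_now


def switch_to_hard_copy(grid, curve, s, d, gamma):
    """Apply the rule: switch when P(s; d) falls below gamma."""
    return conditional_response_prob(grid, curve, s, d) < gamma


# Hypothetical data for one demographic group, for illustration only.
times = [2, 3, 3, 5, 8, 10, 10, 12]
events = [1, 1, 0, 1, 1, 0, 1, 0]
grid, curve = kaplan_meier(times, events)
p_early = conditional_response_prob(grid, curve, s=3, d=5)
p_late = conditional_response_prob(grid, curve, s=10, d=5)
```

Fitting one curve per demographic group and scanning a grid of $(d, \gamma)$ values would be one way to explore the delay/cost trade-off the slide mentions; the cost model itself is not specified in the source.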