
ANALYTICAL METHODS IN NUTRITIONAL EPIDEMIOLOGY
NUTR 818, Fall 2009
Mondays 9:00-10:30, Tuesdays 5:00-6:30 in the microcomputer lab

Linda Adair, Ph.D. Anna Maria Siega-Riz 405 University Square 205I University Square East Phone: 966-4449 E-mail [email protected] E-mail: [email protected]

Course Description: This course will blend lecture/discussion and “hands on” laboratory assignments as a means to learn about analytic methods in epidemiology. Students will gain basic proficiency in the methods by conducting statistical analyses using nutrition data selected for the task.

Topics to be covered include:

I. DATA: Basics of data management, data analysis using Stata
   Protecting confidentiality of data: human subjects issues, deductive disclosure

II. DIET: From assessment of intake to diet-disease analysis
   Selection, creation, and validation of dietary assessment tools
   Use of food composition tables to calculate intakes
   Combining data from 24 hour recalls with FFQ to improve nutrient estimates
   Defining nutrition exposures: timing of measurement
   Measurement error, over and under-reporting of dietary intake, adjustment methods
   Use of biomarkers

III. ANTHROPOMETRY and BODY COMPOSITION
   What is measured and why?
   Validation of anthropometric data
   Use of reference data: selection of reference data, calculation of Z-scores
   CDC and WHO growth charts, IOTF reference data, cut points and definitions of overweight and obesity

IV. GENETICS and GENE-ENVIRONMENT INTERACTIONS IN NUTRITION EPIDEMIOLOGY

V. PHYSICAL ACTIVITY: MEASUREMENT AND ANALYSIS OF PHYSICAL ACTIVITY DATA

VI. APPLICATION OF METHODS RELEVANT TO NUTRITION EPIDEMIOLOGY
   Elucidating pathways: translating conceptual models into statistical models and dealing with confounders, mediators, endogeneity, multilevel models
   Working with large samples: design effects, use of sample weights, etc.
   Sample selectivity: loss to follow-up, missing data, generalizability
   Longitudinal analysis

At the end of this course the student should be able to:
1) select the appropriate diet assessment tool, including designing/updating a food frequency questionnaire, and apply the appropriate statistical method in the analysis;
2) appropriately categorize nutritional exposures;
3) calculate z scores for anthropometric data and understand their use in analysis;
4) apply statistical techniques for sample design effects;
5) understand the basics of how to build statistical models;
6) analyze and interpret environment-gene interactions;
7) select appropriate physical activity indicators for epidemiologic studies; and
8) assess generalizability and bias related to sample selectivity and loss to follow-up.

Requirements and Grading:

Students are expected to attend all classes and labs, read assigned materials, and participate in class discussions. Most of the assigned reading materials will be available in electronic form and will be posted on the Blackboard site for the course.

Students may find Willett’s textbook (Nutritional Epidemiology, second edition, 1998) useful as a background reference.

Written assignments:

1. Labs: Labs will be started in class, and then students will complete the assignment and analysis and write up their results. Instructions, questions and data sets will be provided for each lab. Students may work together to discuss concepts and methods, but individuals must do their own write-ups.

2. Paper/presentation: Each student will select a topic related to one of the methods discussed in the class. Students should ask and then answer a specific methodological question by conducting data analysis. The main topic/question and data to be used for the analysis should be selected in consultation with and approved in advance by the course instructors. Students are expected to meet with course instructors to discuss their ongoing analysis. Results should be written up in a paper (suggested length not to exceed 10 pages of text, double spaced), and presented to the class in a 10 minute talk. Where applicable, students are responsible for obtaining IRB approval for their projects.

Papers and presentations should be organized as follows:

1. Background: what motivates the research question? why is it important? what will you contribute?
2. Statement of purpose/aim/research question: 1 sentence statement of main objective
3. Methods
   a. description of sample and key variables
   b. analytic methods
4. Results
5. Discussion

Due Dates:
September 8: Identify data set and main question. Please submit 1-2 paragraphs describing: (1) the main issue you will address and why it is important; (2) the data set you will use. Please fill out the table at the end of this syllabus to provide the information about your data.
September 21: Progress report. Brief (1 paragraph) description of what you have done, questions you may have about how to proceed, etc.
December 7, 8: Presentations
December 14: Final papers due

Unless otherwise noted on the syllabus, LABS will be due one week after the lab session.

Grades will be based on: labs (50%), class participation (10%) and the final project (40%: 30% written, 10% oral)

Dates and Topics

T Aug 25 Introduction and confidentiality, data protections, management, creation of data files, and proper documentation (lab/demo) (Adair)

A useful overview/review of basic issues in nutritional epidemiology:

Sempos CT, Liu K, Ernst ND. Food and nutrient exposures: what to consider when evaluating epidemiologic evidence. Am J Clin Nutr. 1999 Jun;69(6):1330S-1338S. sempos1.pdf

Guidelines related to data security, need for IRB review. datasecurity.pdf student_research_irb_guidance.pdf deidentified data.pdf

M Aug 31 Anthropometry, body composition, and controversies related to the use of reference data and cutpoints (Adair)

Sources for growth reference data:

CDC/NCHS 2000: http://www.cdc.gov/growthcharts/
WHO: http://www.who.int/childgrowth/mgrs/en/

IOTF: Cole TJ, Bellizzi MC, Flegal KM, Dietz WH. Establishing a standard definition for child overweight and obesity worldwide: international survey. BMJ 2000;320:1240-1243 (6 May). ColeIOTF.pdf

Cole TJ, Flegal KM, Nicholls D, Jackson AA. Body mass index cut offs to define thinness in children and adolescents: international survey. BMJ. 2007 Jul 28;335(7612):194. Coleundernutrition.pdf

Comparison of results from new and old reference data: de Onis M, Garza C, Onyango AW, Borghi E. Comparison of the WHO child growth standards and the CDC 2000 growth charts. J Nutr. 2007 Jan;137(1):144-8 deOnis.pdf

Representing BMI in adolescents: Berkey CS, Colditz GA. Adiposity in adolescents: change in actual BMI works better than change in BMI z score for longitudinal studies. Ann Epidemiol. 2007 Jan;17(1):44-50. berkey&colditz.pdf

The debate about cut points: Razak F, Anand SS, Shannon H, Vuksan V, Davis B, Jacobs R, Teo KK, McQueen M, Yusuf S. Defining obesity cut points in a multiethnic population. Circulation. 2007 Apr 24;115(16):2111-8. Razak.pdf

T Sept 1 Lab #1 Anthropometry Lab (Adair)

M Sept 7 LABOR DAY: No Class

T Sept 8 Diet Assessment – new tools and methodological techniques (Siega-Riz)

Automated Self-Administered 24 hour dietary recall (ASA24) http://riskfactor.cancer.gov/tools/instruments/asa24/

Subar AF, Dodd KW, Guenther PM, Kipnis V, Midthune D, McDowell M, Tooze JA, Freedman LS, Krebs-Smith SM. The food propensity questionnaire: concept, development, and validation for use as a covariate in a model to estimate usual food intake. J Am Diet Assoc. 2006 Oct;106(10):1556-63. SubarFPQ.pdf

M Sep 14 Estimating Usual Intake and Identifying Outliers (Siega-Riz)

Dodd K, Guenther P, et al. Statistical methods for estimating usual intake of nutrients and foods: A review of the theory. JADA 2006;106:1640-1650. Dodd_2006JADA_usualintakes.pdf

Tooze J, Midthune D, Dodd K, et al. A new statistical method for estimating the usual intake of episodically consumed foods with application to their distribution. JADA 2006;106:1575-87. Tooze.pdf

Kipnis V, Midthune D, Buckman DW, et al. Modeling data with excess zeros and measurement error: Application to evaluating relationships between episodically consumed foods and health outcomes. Biometrics 2009. Kipnis_biometrics_2009.pdf

Huang TTK. Effect of Screening Out Implausible Energy Intake Reports on Relationships between Diet and BMI. Obesity Research 2005;13:1205-17. Huang2005.pdf

T Sep 15 Lab # 2: Assessing over and under-reporting of dietary intake and maybe usual intakes (Siega-Riz)

Lissner L, Troiano RP, Midthune D, Heitmann BL, Kipnis V, Subar AF, Potischman N. OPEN about obesity: recovery biomarkers, dietary reporting errors and BMI. Int J Obes (Lond). 2007 Jun;31(6):956-61. Lissner.pdf

Huang TTK. Effect of Screening Out Implausible Energy Intake Reports on Relationships between Diet and BMI. Obesity Research 2005;13:1205-17. Huang2005.pdf

M Sep 21 Discussion of projects: Turn in your progress report, and we will discuss ideas, approaches, etc. (Adair)

T Sep 22 Lab # 3 Energy adjustment (Siega-Riz)

Hu FB, et al. Dietary fat and coronary heart disease: A comparison of approaches for adjusting for total energy intake and modeling repeated dietary measurements. Am J Epidemiol 1999;149:531-40. Hu Dietary Fat.pdf

Bellach B, Kohlmeier L. Energy Adjustment Does Not Control for Differential Recall Bias in Nutritional Epidemiology. J Clin Epid 1998; 51:393-398. Bellach.pdf

M Sep 28 Representing data to test hypotheses: continuous, categorical, clusters, factors, etc. (Adair)

Note: In this paper, please focus on the ways in which the dietary data were categorized for the analysis, in particular the use of categories of intake and quintiles. Think about the statistical methods that must be used to deal with the different categorical variables, and how the categorization might affect the results.
Sempos CT, Flegal KM, Johnson CL et al. Issues in the long-term evaluation of diet in longitudinal studies. J Nutr 1993;123:406-12. sempos2.pdf

Dietary patterns: overview of the issues
Moeller SM, Reedy J, Millen AE, Dixon LB, Newby PK, Tucker KL, Krebs-Smith SM, Guenther PM. Dietary patterns: challenges and opportunities in dietary patterns research, an Experimental Biology workshop, April 1, 2006. J Am Diet Assoc. 2007 Jul;107(7):1233-9. MoellerDietaryPatterns.pdf

Comparison of factor and cluster results
Newby PK, Muller D, Tucker KL. Associations of empirically derived eating patterns with plasma biomarkers: a comparison of factor and cluster analysis methods. Am J Clin Nutr. 2004 Sep;80(3):759-67. Newby.pdf

Reduced rank regression: description of the method
Hoffmann K, Schulze MB, Schienkiewitz A, Nöthlings U, Boeing H. Application of a new statistical method to derive dietary patterns in nutritional epidemiology. Am J Epid. 2004 May 15;159(10):935-44. HoffmannRRR.pdf

Reduced rank regression: an example where it was used
Liese AD, Weis KE, Schulz M, Tooze JA. Food intake patterns associated with incident type 2 diabetes: the Insulin Resistance Atherosclerosis Study. Diabetes Care. 2009 Feb;32(2):263-8. liese.pdf

Examples of papers that use factor or cluster analysis (the following papers were based on work done for this class!)
Monda KL, Popkin BM. Cluster analysis methods help to clarify the activity-BMI relationship of Chinese youth. Obes Res. 2005 Jun;13(6):1042-51. Monda&Popkin.pdf
Austin GL, Adair LS, Galanko JA, Martin CF, Satia JA, Sandler RS. A diet high in fruits and low in meats reduces the risk of colorectal adenomas. J Nutr. 2007 Apr;137(4):999-1004. Austin.pdf
Williams CD, Satia JA, Adair LS, Stevens J, Galanko J, Keku TO, Sandler RS. Dietary patterns, food groups, and rectal cancer risk in Whites and African-Americans. Cancer Epidemiol Biomarkers Prev. 2009 May;18(5):1552-61. Williams2009.pdf

T Sep 29 Lab # 5 Data representation (Adair)

M Oct 5 Genetic Epidemiology and modeling interactions. (Dr. Keri Monda)

Andreasen CH, Andersen G. Gene-environment interactions and obesity: Further aspects of genomewide association studies. Nutrition 2009; Jul 11. Andreasen.pdf

Vimaleswaran KS, Li S, Zhao JH, et al. Physical activity attenuates the body mass index–increasing influence of genetic variation in the FTO gene. Am J Clin Nutr 2009;90:425-428. Vimaleswaran.pdf

Burton PR, Tobin MD, Hopper JL. Genetic Epidemiology 1: Key concepts in genetic epidemiology. Lancet 2005;366:941–51. burton.pdf

T Oct 6 Lab # 6 Modeling genetic data and testing gene-environment and other interactions (Keri Monda)

M Oct 12 (University Day, class ends at 10 am) Design Effects (Adair)

Chromy J. and Abeyasekera, S. Household Sample Surveys in Developing and Transition Countries, Chapter XIX: Statistical analysis of survey data. http://unstats.un.org/unsd/HHsurveys/pdf/Chapter_19.pdf Chromy surveydata.pdf

Chapters from Encyclopedia of Biostatistics (together in file called software.pdf):
Pitfalls of Using Standard Statistical Software Packages for Sample Survey Data. Donna J. Brogan
Software for Statistical Analysis of Sample Survey Data. Barbara Lepidus Carlson

An example of how sample weights are created and used:
National Health and Nutrition Examination Survey III: Weighting and estimation methodology. http://archive.nlm.nih.gov/proj/dxpnet/nhanes/docs/doc/nhanes_analysis/wgt_exec.pdf NHANES weights.pdf

Analysis of Add Health data:
Guidelines for Analyzing Add Health Data, by Kim Chantala. http://www.cpc.unc.edu/projects/addhealth/files/wt_guidelines.pdf Chantala AddHealth.pdf

T Oct 13 Lab #7 Design effects, SVY commands, sample weights (Adair)

M Oct 19 Sample selectivity, missing data (Adair)

Schulz KF, Grimes DA. Sample size slippages in randomized trials: exclusions and the lost and wayward. Lancet. 2002 Mar 2;359(9308):781-5. Shulzlancet.pdf

Hara, Megumi, Satoshi Sasaki, Tomotaka Sobue, Seiichiro Yamamoto, Shoichiro Tsugane, for the JPHC Study Group. Comparison of cause-specific mortality between respondents and nonrespondents in a population-based prospective study: ten-year follow-up of JPHC Study Cohort I. Journal of Clinical Epidemiology 2002;55(2):150-156. Hara.pdf

Cheung YB. Adjustment for selection bias in cohort studies: an application of a probit model with selectivity to life course epidemiology. J Clin Epidemiol. 2001 Dec;54(12):1238-43. cheung.pdf

Haukoos JS, Newgard CD. Advanced statistics: missing data in clinical research, part 1: an introduction and conceptual framework. Acad Emerg Med. 2007 Jul;14(7):662-8. haukoos.pdf

Newgard CD, Haukoos JS. Advanced statistics: missing data in clinical research, part 2: multiple imputation. Acad Emerg Med. 2007 Jul;14(7):669-78. newgard.pdf

Kenward MG, Carpenter J. Multiple imputation: current perspectives. Stat Methods Med Res. 2007 Jun;16(3):199-218. kenward.pdf

T Oct 20 Lab #8 Sample selection and missing data (Adair)

Alonso A, Segui-Gomez M, de Irala J, Sanchez-Villegas A, Beunza JJ, Martinez-Gonzalez MA. Predictors of follow-up and assessment of selection bias from dropouts using inverse probability weighting in a cohort of university graduates. Eur J Epidemiol. 2006;21(5):351-8. Alonso.pdf

M Oct 26 Measuring and representing physical activity data (Dr Derek Hales) TBA

T Oct 27 Lab # 9 Physical Activity (Dr. Derek Hales)

M Nov 2 Longitudinal modeling (Adair)

Fitzmaurice GM, Ravichandran C. A primer in longitudinal data analysis. Circulation. 2008;118:2005-2010. Fitzmaurice.pdf

Bhargava, Alok. A longitudinal analysis of the risk factors for diabetes and coronary heart disease in the Framingham Offspring Study. Population Health Metrics, 2003, Vol 1(1):3. Bhargava.pdf

Moore AA, Gould R, Reuben DB, Greendale GA, Carter MK, Zhou K, Karlamangla A. Longitudinal patterns and predictors of alcohol consumption in the United States. Am J Public Health. 2005 Mar;95(3):458-65. Moore.pdf

Good for those who want to use longitudinal models in Stata: Multilevel and Longitudinal Modeling Using Stata, 2nd Edition. Sophia Rabe-Hesketh and Anders Skrondal. Stata Press, 2008

T Nov 3 Lab #10 Longitudinal modeling lab (Adair)

M Nov 9 Discussion, review of prior labs

T Nov 10 Progress reports, discussion of papers

M Nov 16 Multilevel Modeling (Dr. Shuwen Ng)

Diez-Roux, A. V. 2000. "Multilevel analysis in Public Health Research". Annu. Rev. Public Health 21:171-192. diezroux_ARPH2000.pdf

Ball, K., D. Crawford, G. Mishra. 2005. "Socio-economic inequalities in women's fruit and vegetable intakes: a multilevel study of individual, social and environmental mediators." Public Health Nutrition 9(5): 623-630. Balletal_PHN2005.pdf

Monda, K. L., P. Gordon-Larsen, J. Stevens, B. M. Popkin. 2006. "China's transition: The effect of rapid urbanization on adult occupational physical activity." Social Science and Medicine 64(4): 858-870. Monda_SSM2007.pdf

The following paper is a good resource for those who wish to use multilevel methods, but is quite technical.
Angeles, G., D. K. Guilkey, T. A. Mroz. 2005. "The Impact of Community-level variables on Individual-level outcomes: Theoretical Results and Applications." Sociological Methods & Research 34: 76-121. Gustavo_etal_SMR2005.pdf

Optional example
Griffiths, P., N. Madise, A. Whitworth, Z. Matthews. 2004. "A tale of two continents: a multilevel comparison of the determinants of child nutritional status from selected African and Indian regions." Health & Place 10: 183-199. Griffiths_H&P2004.pdf

T Nov 17 Lab # 11 Multilevel Modeling Lab

M Nov 23 No class: work on your paper!

T Nov 24 No class: work on your paper!

M Nov 30 Latent growth curve models (Meghan Slining)

Tu YK, D'Aiuto F, Baelum V, Gilthorpe MS. An introduction to latent growth curve modeling for longitudinal continuous data in dental research. Eur J Oral Sci. 2009 Aug;117(4):343-50 TuLatentCurves.pdf

Definitive text for those who want to use this method: Bollen, KA and Curran P. Latent Curve Models: A Structural Equation Perspective. Wiley Series in Probability and Statistics. 2007

T Dec 1 Structural Equation Modeling (Adair: lecture only. No more labs!)

Mishra G, Nitsch D, Black S, De Stavola B, Kuh D, Hardy R. A structured approach to modelling the effects of binary exposure variables over the life course. Int J Epidemiol. 2009 Apr;38(2):528-37. Mishra.pdf

De Stavola BL, Nitsch D, dos Santos Silva I, McCormack V, Hardy R, Mann V, Cole TJ, Morton S, Leon DA. Statistical issues in life course epidemiology. Am J Epidemiol. 2006 Jan 1;163(1):84-96. DeStavola.pdf

M Dec 7 Student presentations

T Dec 8 Student presentations

Student Name
1. Project (student’s research topic)
2. Data source
3. Publicly available data?
4. IRB approved?/PI name
5. Permission/data use agreement obtained?
6. Does data set include personal identifiers?
7. How confidentiality is protected
8. Data disposition at end of class
9. HIPAA?
