<<

European Journal of Clinical Nutrition (1997) 51, 437±442 ß 1997 Stockton Press. All rights reserved 0954±3007/97 $12.00 of a food frequency questionnaire used in the New York University Women's Health Study: Effect of self-selection by study subjects

E Riboli1, P Toniolo2, R Kaaks1, RE Shore2, C Casagrande1 and BS Pasternack2

1Unit of Nutrition and Cancer, International Agency for Research on Cancer, 150 cours Albert-Thomas, 69372 Lyon Cedex 08, France; and 2Nelson Institute of Environmental Medicine and Kaplan Comprehensive Cancer Center, New York University School of Medicine, 341 East 25th Street, New York, NY 10010-2598, USA

Objective: The aim of the study was to investigate the reproducibility of a food frequency questionnaire (FFQ) used in a cohort of 14 290 women enrolled in the NYU Women's Health Study. Design: A subset of 474 cohort subjects who completed the dietary questionnaire at baseline (FFQ-1) were approached again on the occasion of a second visit to the mammography study centre and asked to complete the questionnaire a second time (FFQ-2) two to four years after FFQ-1. Two to three months later the questionnaire was mailed to the subjects, and they were asked to complete it a third time (FFQ-3). Setting: A breast cancer screening clinic. Subjects: Of the 474 subjects selected, 100% completed the second questionnaire while only 56% completed and mailed back FFQ-3. This made it possible to compare the long-term reproducibility of dietary intake measurements and baseline dietary habits between the two groups of subjects: `respondents', who agreed to complete the questionnaire a third time, and `non-respondents', who did not. Results: Among respondents (56% of study subjects), energy-adjusted correlation coef®cients for short-term reproducibility between FFQ-2 and FFQ-3 ranged from 0.50±0.64 for nutrients, and from 0.44±0.67 for foods. The long-term reproducibility was lower, ranging from 0.36±0.53 for nutrients, and from 0.31±0.48 for speci®c food groups. Among those who did not respond to FFQ-3, crude correlations for long-term reproducibility, unadjusted for energy intake, were generally lower than among respondents. Nevertheless, after adjustment for energy intake, correlations for long-term reproducibility (FFQ-2 to FFQ-1) were of similar magnitude in both groups. In addition, `non-respondents' reported lower intake of fruit and vegetables and higher intake of meat. Conclusions: The results of this study suggest that subjects who volunteer to participate in substudies on the validity or reproducibility of dietary questionnaire measurements may tend to provide more accurate responses to the questionnaire. The phenomenon seems related more to accuracy of reporting of absolute intake levels than of the relative composition of diet. In addition, self-selection may be associated with differences in dietary habits. Sponsorship: This study was supported by Public Health Service (PHS) grants CA 34588, CA 13343 and CA 16087 from the National Cancer Institute, National Institutes of Health (NIH), Department of Health and Human Services (DHHS); and by PHS ES 00260 from the National Institute of Environmental Health Sciences, NIH, DHHS. Descriptors: dietary measurements; reproducibility; ; dietary questionnaires; cohort studies; dietary patterns

Introduction over a six- to twelve-month interval (Hankin et al, 1983; Wu et al, 1988; Jain et al, 1989; JaÈrvinen et al, 1993) or a In nutritional studies, in which disease risk is `validity' study, using a comparison with records of actual estimated as a function of an individual's consumption food consumption during a number of days as well (Willett levels of speci®c foods or nutrients, it is essential that et al, 1985; Pietinen et al, 1988a, 1988b; Munger et al, measurement errors leading to incorrect ranking of indivi- 1992; Rimm et al, 1992; Callmer et al, 1993). In both types duals by levels of dietary intake be kept to a minimum, so of substudy, with or without additional reference measure- as to preserve the highest possible statistical power to ments, there is the potential problem that participation rates demonstrate the presence of relevant diet-disease associa- may be lower than 100% (in some studies, participation tions. As some measurement error will, however, always rates as low as 20% have been reported (Munger et al, occur, additional substudies may be conducted to estimate 1992)), and that there may be self-selection of volunteers the magnitude of statistical power loss or bias in estimates whose capacity or motivation to respond accurately to the of diet-disease associations, incurred by the inaccuracy of dietary questionnaires is above average. In some of the dietary questionnaire measurements. These substudies can articles reporting the results of validation studies, the main take the form either of a reproducibility study, based on argument supporting the absence of bias due to `self- successive administrations of the questionnaire, usually selection' was that nutrient intakes did not appreciably differ between participants in the validation study and the Correspondence: Dr E Riboli. overall study population (Munger et al, 1992). We are not Received 24 May 1996; revised 10 March 1997; accepted 14 March 1997 aware, however, of any study that has evaluated the Reproducibility of FFQ: Subjects' self selection E Riboli et al 438 potential magnitude of over- or under-estimation of the time they returned to the clinic for mammographic screen- validity or reproducibility of questionnaire measurements ing. as a result of such self-selection. The study design is summarized in Table 1. In brief, In this paper, we report results on the reproducibility of a during a two-month period (April±May 1990) all 474 dietary questionnaire used to collect baseline exposure data cohort members returning to the screening clinic one to in the New York University Women's Health Study, a ®ve years after being enrolled in the study were asked to prospective designed to investigate the rela- complete the dietary questionnaire a second time (FFQ-2) tions between diet, endogenous hormones, and of while waiting for a mammographic examination in the cancers of the breast and other organs. The reproducibility clinic. In this situation, all of them complied. Two study, which included 475 women, was designed to com- months later, the dietary questionnaire was mailed to the pare the concordance between repeat measurements of same 474 women, and they were asked to complete it for usual diet over a longer (2±5 y) as well as over a shorter the third time (FFQ-3) and send it back in a prepaid return (2±3 months) period of time. The ®rst two questionnaire envelope. Without further solicitation, 267 women (56.3%) measurements were obtained from a random subsample of responded to the FFQ-3. We excluded six subjects who the cohort, under conditions which ensured near perfect either completed FFQ-2 less than one year after FFQ-1 or participation. The third measurement required an additional returned FFQ-3 more than six months after FFQ-2. The effort on the part of the subjects as they were asked to mean time between FFQ-1 and FFQ-2 was 47.5 months complete the questionnaire at home and mail it back. This (Æ 11.4) for those who responded to FFQ-3, and 46.2 simple procedure meant that the subjects could be separated months (Æ 12.9) for those who did not. The mean time a posteriori into two subgroups: those who ®lled out the between FFQ-2 and FFQ-3 was 3.4 months (Æ 0.9). questionnaire and mailed it back (respondents), and those Food consumption reported by the study subjects was who failed to do so (non-respondents). By analyzing results ®rst checked for completeness and missing values. The for respondents and non-respondents separately, we also number of lines in the questionnaire with missing values give an estimate of the potential effect of self-selection of was used as an indicator of carefulness. In FFQ-1 and FFQ- study participants on the evaluation of the longer-term 2, approximately 10% of the subjects had four or more reproducibility of questionnaire measurements. missing values, and 3% were missing 10 or more. The frequency of blank lines was higher in FFQ-3, for which about 20% had four or more missing values, and 8.7% had 10 or more missing. Further analyses were restricted to 411 Materials and methods subjects (223 `respondents' and 188 `non-respondents') Study subjects who had no more than seven missing values in any of the The subjects were participants in a prospective cohort study questionnaires. Thus, subjects in the upper 5% of the on hormones, diet and cancer, the New York University distribution by number of missing values in any of the Women's Health Study (Toniolo et al, 1991, 1995). three questionnaires were excluded, which is in line with Healthy women between 35 and 65 y of age, attending a rules applied in analyses of data on diet and breast cancer in large breast cancer screening clinic in New York City, were this and other cohort studies (Willett et al, 1987; Toniolo et invited to enroll in the prospective study. Those who agreed al, 1994). There were no differences in the proportions of were asked to donate 30 ml of peripheral venous blood and excluded subjects between respondents and non-respon- to answer a demographic and medical history questionnaire dents. as well as a self-administered food-frequency questionnaire Daily intake of foods and nutrients was calculated on the (FFQ-1). The dietary questionnaire was similar to that basis of food lists and food composition tables elaborated at developed at the US National Cancer Institute (Block, the NCI (Block & Hartman, 1989), with minor modi®ca- 1989; Block & Hartman, 1989), with minor modi®cations tions. to the food list. Between March 1985 and June 1991, 14 290 women were recruited. In an effort to obtain as Statistical analysis many repeated blood specimens as possible from study Mean differences and 95% con®dence intervals were com- subjects, all cohort members were contacted again each puted for the differences between FFQ-3 minus FFQ-1,

Table 1 Study design for the evaluation of the effect of subjects' self-selection on the estimation of the reproducibility of a food frequency questionnaire (FFQ). The New York University Women's Health Study (NYU-WHS)

Period FFQ Where FFQ was How FFQ was returned No. of subjects who participated ®lled in

Phase 1 FFQ-1: Administered to all women Screening clinic Handed to a nurse 14 400 1986±1990 enrolled in the NYU-WHS

Phase 2 FFQ-2: Administered to all women Screening clinic Handed to a nurse 474 April±May 1990 returning to the clinic during a 2- month period

Phase 3 FFQ-3: Mailed to the women who Subject's home Mailed back to investigators 267 out of 474 (207 did not June±July 1990 had completed FFQ-2 return the FFQ)

Time between FFQ-1 and FFQ-2: Subjects who returned FFQ-3 (267): 3 years and 11.5 months (Æ 11.4 months) Subjects who did not return FFQ-3 (207): 3 years and 10.2 months (Æ 12.9 months) Time between FFQ-2 and FFQ-3 (267): 3.35 months (Æ 0.89) Reproducibility of FFQ: Subjects' self selection E Riboli et al 439 FFQ-2 minus FFQ-1, and FFQ-3 minus FFQ-2. To estimate assessment of consumption of energy, fat, carbohydrates agreement in relative ranking, Spearman correlations were and protein remained remarkably stable. An exception is computed for the absolute intake variables as well as for the cholesterol, for which there was a 20% decrease from FFQ- energy-adjusted variables, using the linear regression `resi- 1 to both FFQ-2 and FFQ-3 but no difference between dual' approach described by Rosner & Willett (1988). FFQ-2 and FFQ-3. Similar considerations apply to the data Missing answers for speci®c food items were treated as for food groups reported in Table 4. The fewer speci®c zero consumption. In a few cases, where the frequency of foods for which there is a suggestion of a difference consumption was reported but the portion size was missing, between FFQ-1 and FFQ-2 or FFQ-3 (not shown in Table the `medium' portion was used for calculation of total 4) are beef, eggs and apples which decreased by about 20, quantities of a given type of food consumed. 45 and 28%, respectively, and spaghetti and fruit juice which increased by 54 and 32%. Comparison between respondents and non-respondents Results suggested that total energy intake was slightly lower for Means and standard deviations of daily intake of energy non-respondents (761 Kcal per day). However, no mean- and main nutrients are shown in Table 2 for the entire ingful differences in intake of any of the other nutrients cohort and for the subjects included in the reproducibility reported in Table 2 were observed between respondents and study. We compared the 411 subjects included in the non-respondents. As for foods, respondents reported present study to the entire cohort, and found no signi®cant slightly higher intakes of vegetables and ¯our products differences in the nutrient intake in FFQ-1. Regarding the and lower meat intake. Mean differences were statistically nutrient pattern, it is worth noting that energy intake is signi®cant for meat (P ˆ 0.01) or close to statistical sig- somewhat lower than would be expected on the basis of the ni®cance for ¯our products (P ˆ 0.07) in FFQ-1 and for subjects' age and weight. This is typical of the under- vegetables (P ˆ 0.09) in FFQ-2. Respondents and non- assessment of food consumption observed regularly in respondents did not differ in age, height or weight. No studies conducted with similar food frequency question- information was available on study subjects' physical naires (Willett et al, 1985; Munger et al, 1992). Never- activity. theless, the nutrient intake pattern seems characteristic of Spearman rank correlations between repeat measure- middle-aged north American women, about 40% of the ments are given in Tables 5 and 6 for nutrients and energy being provided by fat and 17% by protein. Intake of foods. Crude correlations for nutrients and foods were other nutrients is also similar to what has been measured always higher for short-term (FFQ-2 vs FFQ-3) than for with food frequency questionnaires in similar populations. long-term reproducibility (FFQ-1 vs FFQ-2). Crude corre- In particular, the estimated average intake of ®ber is lations for energy were 0.54 for FFQ-1 vs FFQ-3 and 0.45 relatively low (about 14 g/d) while the average dietary for FFQ-1 vs FFQ-2. Crude correlations for energy-provid- intake of vitamin C (supplements excluded) is relatively ing nutrients ranged between 0.39 and 0.49 for FFQ-1 vs high (about 167 mg/d). The relatively low intake of vitamin FFQ-2 and between 0.54 and 0.59 for FFQ-2 vs FFQ-3. E (about 8 mg/d) may be due to an underestimation of Adjustment for energy intake led to a modest decrease in vegetable oil consumption (one of the main sources of the correlation coef®cients for all nutrients except saturated vitamin E) as the questionnaire included speci®ed questions fat and cholesterol, for which the correlations were slightly on seasoning fats but not on oil used for cooking. improved. After energy adjustment, the short-term correla- When the means and standard deviations of daily nutri- tions (average 0.54) remained higher than long-term corre- ent intake obtained on the three occasions among respon- lations (average 0.37). Results for foods and food groups dents are compared (Table 3), no meaningful changes are followed a similar pattern, but with correlations slightly observed over time for the intake of most nutrients; higher than for nutrients.

Table 2 Average daily intake of selected nutrients estimated in three repeated administrations of the same food frequency questionnaire (FFQ) among 223 subjects who agreed to complete the questionnaire three times and 188 subjects who refused to participate the third time

Respondents to FFQ-3 Non-respondents to FFQ-3 Total Cohort (No. ˆ 223) (No. ˆ 188) (No. ˆ 13 370)

Nutrients FFQ-1 FFQ-2 FFQ-3 FFQ-1 FFQ-2 FFQ-1 1985±1988 April±May 1990 June±July 1990 1985±1988 April±May 1990 1985±1991 Mean Æ s.d. Mean Æ s.d. Mean Æ s.d. Mean Æ s.d. Mean Æ s.d. Mean Æ s.d.

Energy (Kcal 1526.1 Æ 710.0 1503.7 Æ 596.4 1541.1 Æ 714.3 1464.4 Æ 592.5 1487.5 Æ 762.1 1494.0 Æ 671.0 Protein (g) 64.6 Æ 31.0 64.2 Æ 27.6 65.0 Æ 30.9 66.6 Æ 31.3 63.9 Æ 31.4 64.7 Æ 30.3 Carbohydrates (g) 165.8 Æ 82.9 170.5 Æ 71.5 171.6 Æ 81.5 156.8 Æ 66.2 163.2 Æ 79.3 161.5 Æ 77.7 Total fat (g) 68.8 Æ 36.6 64.9 Æ 29.5 67.9 Æ 36.5 65.3 Æ 29.9 66.3 Æ 40.9 67.5 Æ 34.7 Saturated fat (g) 27.7 Æ 14.6 24.8 Æ 11.9 26.6 Æ 14.5 26.0 Æ 12.3 25.7 Æ 16.1 27.0 Æ 14.4 Oleic acid (g) 23.2 Æ 12.3 22.3 Æ 10.8 23.3 Æ 12.7 22.2 Æ 10.7 22.8 Æ 14.7 22.8 Æ 12.2 Linoleic acid (g) 9.9 Æ 6.3 10.1 Æ 4.9 10.5 Æ 7.2 9.4 Æ 4.9 10.0 Æ 6.7 9.9 Æ 5.9 Cholesterol (mg) 277.7 Æ 213.9 224.9 Æ 129.0 227.2 Æ 138.6 283.5 Æ 183.2 251.3 Æ 167.3 279.0 Æ 183.1 Dietary ®bre (g) 14.3 Æ 8.9 14.9 Æ 8.2 13.9 Æ 7.5 13.7 Æ 7.2 13.5 Æ 6.8 13.7 Æ 7.7 Carotene (mg) 3.7 Æ 3.7 3.8 Æ 2.9 3.5 Æ 3.0 3.6 Æ 2.7 3.9 Æ 5.4 3.6 Æ 3.2 Retinol (mg) 0.5 Æ 0.3 0.5 Æ 0.4 0.5 Æ 0.4 0.8 Æ 1.9 0.5 Æ 0.4 0.6 Æ 0.5 Vitamin C (mg) 170.0 Æ 113.8 168.0 Æ 92.5 163.0 Æ 98.8 164.3 Æ 87.6 161.7 Æ 87.3 164.2 Æ 103.1 Vitamin E (mg) 7.8 Æ 4.1 8.1 Æ 4.3 8.3 Æ 5.0 7.8 Æ 4.5 7.6 Æ 4.3 7.7 Æ 4.3 Calcium (mg) 808.8 Æ 474.2 759.4 Æ 413.2 809.2 Æ 461.7 755.5 Æ 510.0 744.9 Æ 446.5 753.7 Æ 468.2 Iron (mg) 10.4 Æ 5.0 10.9 Æ 5.6 10.8 Æ 5.7 10.8 Æ 5.7 10.3 Æ 5.2 10.6 Æ 5.5 Sodium (mg) 2681.2 Æ 1365.4 2746.9 Æ 1255.9 2831.4 Æ 1481.3 2532.4 Æ 1182.3 2690.4 Æ 1684.3 2596.8 Æ 1350.5 Potassium (mg) 2491.6 Æ 1272.5 2498.1 Æ 1018.4 2514.2 Æ 1204.8 2418.1 Æ 1094.0 2402.8 Æ 1026.0 2403.7 Æ 1099.6 Reproducibility of FFQ: Subjects' self selection E Riboli et al 440 Table 3 Means and 95% con®dence intervals of the individual differences in nutrient intakes from questionnaires 3, 2 and 1. Analyses restricted to 223 subjects who responded to all three questionnaires

Short-term reproducibility Long-term reproducibility (2±5 years) (2±3 months)

Difference FFQ-2±FFQ-1 Difference FFQ-3±FFQ-1 Difference FFQ-3±FFQ-2

95% CI 95% CI 95% CI

Mean Lower Upper Mean Lower Upper Mean Lower Upper

Total calories (Kcal) 722.3 7111.9 67.3 15.0 786.7 116.7 37.3 749.8 124.5 Protein (g) 71.4 75.2 2.4 70.6 74.9 3.8 0.9 73.0 4.8 Carbohdyrate (g) 4.7 75.6 15.0 5.8 75.4 17.0 1.1 78.4 10.7 Total fat (g) 74.0 78.8 0.9 70.9 76.4 4.6 3.0 71.6 7.6 Saturated fat (g) 72.8 74.9 70.8 71.1 73.3 1.2 1.8 0.0 3.5 Oleic acid (g) 70.9 72.6 0.9 0.1 71.8 2.0 1.0 70.7 2.6 Linoleic acid (g) 0.2 70.6 1.1 0.6 70.4 1.5 0.3 70.6 1.3 Cholesterol (mg) 752.8 778.3 727.2 750.5 777.0 723.5 2.2 714.3 18.8 Dietary ®bre (g) 0.6 70.5 1.6 70.4 71.5 0.7 71.0 72.0 0.0 Carotene (mg) 0.06 70.4 0.5 0.2 70.7 0.4 70.2 70.7 0.2 Retinol (mg) 70.04 70.09 0.02 0.004 70.06 0.05 0.03 70.03 0.09 Vitamin C (mg) 71.9 716.4 12.5 77.0 721.4 7.5 75.0 718.4 8.4 Vitamin E (mg) 0.3 70.3 0.9 0.5 70.2 1.1 0.2 70.4 0.8 Calcium (mg) 749.4 7114.9 16.2 0.4 765.7 66.5 49.8 712.9 112.5 Iron (mg) 0.5 70.1 1.2 0.4 70.4 1.1 70.1 70.8 0.6 Sodium (mg) 65.7 7118.4 249.8 150.2 763.8 364.2 84.5 7100.9 269.8 Potassium (mg) 6.5 7142.1 155.1 22.6 7142.1 187.3 16.1 7133.6 165.8

Table 4 Average daily intake of main groups of foods estimated in three repeated administrations of the same food frequency questionnaire (FFQ) among 223 subjects who agreed to complete the questionnaire three times and 188 subjects who refused to participate the third time

Respondents to FFQ-3 (No. ˆ 223) Non-respondents to FFQ-3 (No. ˆ 188)

FFQ-1 FFQ-2 FFQ-3 FFQ-1 FFQ-2 Food groups 1985±1988 April±May 1990 June±July 1990 1985±1988 April±May 1990 Mean Æ s.d. Mean Æ s.d. Mean Æ s.d. Mean Æ s.d. Mean Æ s.d.

Fruit 193.6 Æ 218.1 175.7 Æ 121.0 146.3 Æ 121.2 172.8 Æ 130.3 158.5 Æ 112.9 Fruit juices 143.8 Æ 149.7 147.1 Æ 162.2 156.7 Æ 174.2 149.3 Æ 148.4 140.8 Æ 161.3 Vegetables 255.1 Æ 211.9 262.4 Æ 200.5 259.9 Æ 168.9 239.2 Æ 173.4 245.5 Æ 211.0 Cereals 112.7 Æ 93.6 132.3 Æ 123.9 124.8 Æ 102.5 105.7 Æ 82.5 120.1 Æ 109.1 Pastries 20.5 Æ 23.9 18.0 Æ 19.5 17.9 Æ 21.7 17.6 Æ 20.3 20.9 Æ 25.7 Potatoes 40.7 Æ 59.6 41.0 Æ 31.0 37.1 Æ 32.8 36.4 Æ 46.0 42.6 Æ 49.9 Meat 33.3 Æ 35.8 24.5 Æ 25.5 25.0 Æ 23.2 35.2 Æ 30.9 26.7 Æ 26.9 Poultry 28.2 Æ 31.2 33.5 Æ 41.1 27.8 Æ 24.5 32.1 Æ 31.4 34.8 Æ 48.0 Fish 31.8 Æ 30.8 32.7 Æ 27.4 37.2 Æ 59.0 34.7 Æ 44.0 33.6 Æ 29.3 Eggs 21.8 Æ 36.1 12.2 Æ 16.9 11.7 Æ 18.4 21.6 Æ 27.1 17.6 Æ 27.9 Ice Cream 20.2 Æ 30.8 14.0 Æ 27.9 17.4 Æ 32.4 14.6 Æ 26.0 14.7 Æ 24.5 Dairy products 295.5 Æ 248.5 282.8 Æ 242.9 314.1 Æ 288.1 297.6 Æ 326.6 285.5 Æ 266.8 Spaghetti & Pizza 48.6 Æ 50.5 72.1 Æ 103.8 73.9 Æ 87.5 42.7 Æ 76.0 51.5 Æ 45.8

Table 5 Spearman rank correlations between nutrient intake estimated in repeated administrations of the same food frequency questionnaire (FFQ) within a short time period (2±3 months) for 223 `respondents' and within a long time period (2±5 y) for respondents and `non-respondents'

Respondents Non-respondents

Short-term (2±3 months) Long-term (2±5 y) Long-term (2±5 y) FFQ-2 vs FFQ-3 FFQ-1 vs FFQ-2 FFQ-1 vs FFQ-2

Crude Energy-adjusted Crude Energy-adjusted Crude Energy-adjusted

Energy 0.54 Ð 0.45 Ð 0.33 Ð Protein 0.56 0.50 0.45 0.34 0.34 0.25 Carbohydrates 0.59 0.64 0.49 0.38 0.33 0.41 Total fat 0.54 0.57 0.41 0.35 0.35 0.36 Saturated fat 0.54 0.59 0.39 0.41 0.35 0.44 Oleic acid 0.54 0.57 0.41 0.35 0.35 0.33 Linoleic acid 0.53 0.48 0.47 0.37 0.36 0.18 Cholesterol 0.66 0.67 0.47 0.50 0.40 0.36 Fibre 0.66 0.56 0.53 0.32 0.44 0.46 Carotene 0.57 0.50 0.36 0.32 0.36 0.29 Retinol 0.57 0.50 0.44 0.34 0.37 0.23 Vitamin C 0.62 0.59 0.52 0.41 0.45 0.35 Vitamin E 0.61 0.47 0.48 0.31 0.44 0.42 Calcium 0.56 0.57 0.49 0.44 0.42 0.37

Average correlation 0.58 0.54 0.46 0.37 0.38 0.34 Reproducibility of FFQ: Subjects' self selection E Riboli et al 441 Table 6 Spearman rank correlation between food groups estimated in repeated administrations of the same food frequency questionnaire (FFQ) within a short time period (2±3 months) for 223 `respondents' and within a long time period (2±5 y) for respondents and `non-respondents'

Respondents Non-respondents

Short-term (2±3 months) Long-term (2±5 y) Long-term (2±5 y) FFQ-2 vs FFQ-3 FFQ-1 vs FFQ-2 FFQ-1 vs FFQ-2

Crude Energy-adjusted Crude Energy-adjusted Crude Energy-adjusted

Fruit 0.77 0.71 0.65 0.52 0.50 0.48 Fruit juices 0.71 0.72 0.54 0.56 0.49 0.42 Vegetables 0.65 0.61 0.51 0.37 0.42 0.40 Cereals 0.58 0.54 0.40 0.42 0.32 0.32 Pastries 0.62 0.52 0.40 0.32 0.47 0.38 Potatoes 0.63 0.54 0.43 0.33 0.30 0.27 Meat 0.80 0.78 0.60 0.52 0.57 0.44 Poultry 0.58 0.50 0.41 0.35 0.44 0.44 Fish 0.69 0.51 0.53 0.48 0.46 0.36 Eggs 0.68 0.63 0.47 0.44 0.40 0.34 Ice cream 0.64 0.54 0.43 0.37 0.53 0.32 Dairy products 0.59 0.57 0.47 0.44 0.46 0.43 Spaghetti & pizza 0.59 0.47 0.47 0.36 0.49 0.41

Average correlation 0.65 0.59 0.48 0.43 0.44 0.39

Without adjustment for total energy intake, correlation some of their previous answers and did not try to give a coef®cients for nutrient intake were slightly higher for second, more independent evaluation of their dietary habits. respondents than for non-respondents, for all nutrients While we have no data for discriminating between different except carotene (Table 5). When adjusted for energy explanations, the higher reproducibility over short than intake, however, the correlation coef®cients were very over long periods is consistent with the ®ndings of previous similar for both groups. As for foods, the correlations studies (Friedenreich et al, 1992). were higher among respondents for a few food groups, A potentially important issue in the extrapolation of the particularly fruit and vegetables, and quite similar for the results of a methodological study to the actual epidemio- others. Adjustment for energy lowered the correlation logical investigation, in which a diet questionnaire is used coef®cients and generally made them more similar in to estimate of disease, is whether the subjects both groups. included in the methodological study are representative of those in the main etiological investigation. Methodological studies to evaluate the reproducibility and validity of diet- Discussion ary questionnaire measurements have so far been based We investigated the long- and short-term reproducibility of mostly on self-selected volunteers. In some cases the the dietary measurements obtained with the food frequency subjects were a relatively small subgroup of very large questionnaire used in the New York University Women's cohort studies (Willett et al, 1985; Rimm et al, 1992; Health Study. Reproducibility was assessed at the group Munger et al, 1992), while in others the subjects were level by comparison of group means, and by calculation of drawn independently of any on-going epidemiological correlation coef®cients between the individuals' repeat study, even though the source population was that targeted measurements. Mean estimated intake levels remained for future studies on diet and cancer (Pietinen et al, 1988a, relatively stable over time, although there was a small 1988b; Callmer et al, 1993). Concordance, either between decrease in the reporting of food items that might be repeat questionnaire measurements (in a reproducibility perceived as `bad for health', such as beef and eggs, and study) or between questionnaire and reference measure- an increase in food items that may be seen as `healthy' ments (in a validity study), depends not only on the foods such as fruit juice and spaghetti. There was, however, characteristics of the dietary questionnaire method being no meaningful change in the reported consumption of fresh evaluated and the true dietary habits of the study subjects fruit and vegetables. The energy-adjusted correlation coef- but also on the accuracy and dedication with which the ®cients obtained for short-term reproducibility in the pre- subjects ®ll in the questionnaires. One might therefore sent study are similar to (or just slightly lower than) those suspect that the reproducibility and relative validity derived reported in several other studies of similar design (Pietinen from studies on self-selected volunteers will be overesti- et al, 1988a, 1988b; Rimm et al, 1992) where correlations mated as compared to those which would actually be found ranged between 0.60 and 0.70 for most of the nutrients in a large, less selective study population. Overestimation considered. of the reproducibility or validity of questionnaire measure- The short-term reproducibility, over a period of 2±3 ments may have consequences for the evaluation of the true months, was somewhat higher than the long-term reprodu- statistical power of studies on diet and disease to detect an cibility, after 2±5 y. One possible explanation for this association, or may lead to an under-evaluation of attenua- difference is that the long-term reproducibility may have tion biases in relative risk estimates. In our study, 56% of been in¯uenced more by changes in true dietary habits over the women contacted agreed to ®ll out the FFQ for a third time. Another view is that the short-term reproducibility time. This participation rate is similar to that obtained in may have been favoured by the fact that measurements previous studies, and would in practice be considered as were obtained during the same season (assuming that the quite satisfactory in this type of methodological study. measurements more strongly re¯ected dietary habits during However, a comparison between the results of subjects the more recent months), or that some women still recalled who provided full participation and those who did not Reproducibility of FFQ: Subjects' self selection E Riboli et al 442 respond to FFQ-3 suggests that self-selection may play a Callmer E, Riboli E, Saracci R, aÊkesson B & LindgaÈrde F (1993): Dietary role, namely that volunteers may provide more reliable assessment methods evaluated in the MalmoÈ food study. J. Intern. Med. 233, 53±57. information than subjects less strongly motivated to parti- Friedenreich CM, Slimani N & Riboli E (1992): Measurement of past diet: cipate in repeat dietary measurements. In addition, subjects Review of previous and proposed methods. Epidemiol. Rev. 14, 177± who complied with the mailed questionnaire had reported 186. somewhat higher vegetable intake, and a lower meat intake Hankin JH, Nomura AMY, Lee J, Hirohata T & Kolonel LN (1983): at baseline. These differences suggest that people who are Reproducibility of a diet history questionnaire in a case-control study of breast cancer. Am. J. Clin. Nutr. 37, 981±985. willing to participate in a study on diet in relation to health Jain M, Howe GR, Harrison L & Miller AB (1989): A study of repeat- may be more sensitive to dietary guidelines, which cur- ability of dietary data over a seven-year period. Am. J. Epidemiol. 129, rently recommend eating more vegetables and reducing 422±429. meat intake. The results of our study also suggest, there- JaÈrvinen R, SeppaÈnen R & Knekt P (1993): Short-term and long-term reproducibility of dietary history interview data. Int. J. Epidemiol. 22, fore, that those involved in a reproducibility or validity 520±527. study may not be a truly representative sample of the Kaaks R, Plummer M, Riboli E, EsteÁve J & van Staveren W (1994): targeted study population when the rate of non-participa- Adjustment for bias due to errors in exposure assessments in multicenter tion in the substudy is high. cohort studies on diet and cancer: calibration approach. Am. J. Clin. In our view, in order to ensure high compliance, future Nutr. 59, 245s±250s. Kaaks R, Riboli E & van Staveren WA (1995): Calibration of dietary studies on the accuracy of dietary intake measurements intake measurements in prospective cohort studies. Am. J. Epidemiol. should preferably be based on a reference method which 142, 548±556. does not require a large number of repeat measurements Munger RG, Folson AR, Kushi LH, Kaye SA & Sellers TA (1992): (for example several weeks of food recording) from the Dietary assessment of older Iowa women with a food frequency questionnaire: nutrient intake, reproducibility, and com- study participants. To achieve this, one approach is to parison with 24-hour dietary recall interviews. Am. J. Epidemiol. 136, include more subjects in the substudy, but with a smaller 192±200. possible number of repeat reference measurements Pietinen P, Hartman AM, Haapa E, RaÈsaÈnen L, Haapakoski J, Palmgren J, (weighed food records or 24 h recalls) per person. If the Albanes D, Virtamo J & Huttunen JK (1988a): Reproducibility and aim is to estimate the correlation between dietary ques- validity of dietary assessment instruments. I. A self-administered food use questionnaire with a portion size picture booklet. Am. J. Epidemiol. tionnaire measurements and individuals' intake levels, a 128, 667±676. minimum of two additional measurements with indepen- Pietinen P, Hartman AM, Haapa E, RaÈsaÈnen L, Haapakoski J, Palmgren J, dent random errors per person is required (Kaaks et al, Albanes D, Virtamo J & Huttunen JK (1988b): Reproducibility and 1995). If the aim is limited to estimating the variation in validity of dietary assessment instruments. II. A qualitative food frequency questionnaire. Am. J. Epidemiol. 128, 655±666. true intake levels actually distinguished by questionnaire Riboli E & Kaaks R (1997): The EPIC Project: Rationale and study design. measurements (as can be the case in on-going prospective Int. J. Epidemiol. 26, Suppl. 1, S6±S14. cohort studies) then a `calibration' approach requiring only Rimm EB, Giovannucci EL, Stampfer MJ, Colditz GA, Litin LB & Willett a single daily intake record or 24 h diet recall per person WC (1992): Reproducibility and validity of an expanded self-adminis- can be used (Kaaks et al, 1994, 1995). Such an approach is tered semiquantitative food frequency questionnaire among male health professionals. Am. J. Epidemiol. 135, 1114±1126. now being implemented in the `European Prospective Rosner R & Willett WC (1988): Interval estimates for correlation coef®- Investigation into Cancer and Nutrition', a prospective cients corrected for within-person variation: Implications for study cohort study in progress in nine European countries design and hypothesis testing. Am. J. Epidemiol. 127, 377±386. (Riboli & Kaaks, 1997). However, as the reference mea- Toniolo PG, Pasternack BS, Shore RE, Sonnenschein E, Koenig KL, Rosenberg C, Strax P & Strax S (1991): Endogenous hormones and surement is based on a single day's diet measurement per breast cancer: a prospective cohort study. Breast Cancer Res. Treat. 18, subject and it is not possible to estimate the correlation 23s±26s. between dietary questionnaire measurements and indivi- Toniolo P, Riboli E, Shore RE & Pasternack BS (1994): Consumption of duals' true intake levels, this method may be less suitable meat, animal products, protein and fat and risk of breast cancer: a when the prime objective is to develop a new questionnaire prospective cohort study in New York. Epidemiology 5, 391±397. Toniolo PG, Levitz M, Zeleniuch-Jacquotte A, Banerjee S, Koenig KL, for which estimates of validity of measurement of speci®c Shore RE, Strax P & Pasternack BS (1995): A prospective study of foods, food groups and nutrients are needed for subsequent endogenous estrogens and breast cancer in postmenopausal women. J. improvement of the questionnaire itself. Natl. Cancer Inst. 87, 190±197. Willett WC, Sampson L, Stampfer MJ, Rossner B, Bain C, Witschi J, Hennekens CH & Speizer FE (1985): Reproducibility and validity of a References semiquantitative food frequency questionnaire. Am. J. Epidemiol. 122, Block G (1989): Health Habits and History Questionnaire: Diet History 51±65. and Other Risk Factors. Personal Computer System Packet. Bethesda, Willett WC, Stampfer MJ, Colditz GA, Rosner BA, Hennekens CH & MD: National Cancer Institute, Division of Cancer Prevention and Speizer FE (1987): Dietary fat and the risk of breast cancer. N. Engl. J. Control. Med. 316, 22±28. Block G & Hartman AM (1989): Issues in reproducibility and validity of Wu ML, Whittemore AS & Jung DL (1988): Errors in reported dietary dietary studies. Am. J. Clin. Nutr. 50, 1133±1138. intakes. II. Long-term recall. Am. J. Epidemiol. 128, 1137±1145.