UNIVERSITY OF NEW SOUTH WALES Thesis/Project Report Sheet

Surname or Family name: DEAKIN
First name: Vicki
Other name/s:
Abbreviation for degree as given in the University calendar: M.Sc in Food Sc Tech
School: Food Science and Technology
Faculty: Applied Science
Title: Validation of a Checklist for Estimating Fat Intake

Abstract 350 words maximum:

The objective of this study was to develop a short-term fat checklist (FC) and to examine whether it could replace a three-day food record (FR). The intended use of the FC was to measure and monitor current fat intake over three consecutive days in patients attending a local clinic in Canberra during a four-month lifestyle intervention program, and to provide an education message. A self-administered FC was developed in the form of a semi-quantitative food frequency questionnaire (FFQ) comprising 43 food items derived from the food consumption data of two Australian population studies of adults.

The FC was compared with a three-day estimated food record (FR) in 19 volunteer clinic patients (9 men, 10 women) and also, to broaden its application, with a three-day weighed FR in 42 tertiary nutrition students (12 men, 30 women). Correlations for fat intake between the FC and FR were 0.86 for clinic patients and 0.68 for nutrition students. Mean differences in fat intake estimated by the two methods did not differ significantly from zero, although Bland-Altman plots showed large differences in fat intake for individual subjects. Ninety-two percent of all subjects were classified by the FC into the same quartile of fat intake as by the FR, or into an adjacent quartile. Fifty-four percent of subjects were classified into the same quartile, although overall ranking of quartiles suggested 'poor' but significant agreement (kappa statistic, 0.33). The FC classified subjects consuming ≥70 g fat/d with 74% sensitivity. A test-retest reproducibility procedure at three weeks in 49 tertiary physiology students gave a Pearson's correlation of 0.65. With few exceptions, the FC reflected the food choices of all subjects tested.

It was concluded that the FC cannot replace a three-day FR for measuring the absolute fat intake of individuals but was acceptable for measuring fat intake at the group level and for ranking individuals into broad categories of fat intake. Its sensitivity of 74% for fat intakes ≥70 g fat/d suggested that the FC detected around three out of four subjects who, according to the FR, had a high fat consumption. It was also useful for screening between 'high' and 'low' fat intakes in groups.

Declaration relating to disposition of project report/thesis

I am fully aware of the policy of the University relating to the retention and use of higher degree project reports and theses, namely that the University retains the copies submitted for examination and is free to allow them to be consulted or borrowed. Subject to the provisions of the Copyright Act 1968, the University may issue a project report or thesis in whole or in part, in photostat or microfilm or other copying medium.

I also authorise the publication by University Microfilms of a 350 word abstract

The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing to the Registrar. Requests for a longer period of restriction may be considered in exceptional circumstances if accompanied by a letter of support from the Supervisor or Head of School. Such requests must be submitted with the thesis/project report.

FOR OFFICE USE ONLY
Date of completion of requirements for Award

Registrar and Deputy Principal

THIS SHEET IS TO BE GLUED TO THE INSIDE FRONT COVER OF THE THESIS

Validation of a Checklist for Estimating Fat Intake

Vicki Deakin

A thesis submitted for the degree of

Master of Science

The University of New South Wales

September 2000

Contents and abstract

Certificate of originality

I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person nor material which to a substantial extent has been accepted for the award of any other degree or diploma of a university or other institute of higher learning, except where due acknowledgment is made in the text.

I also declare that the intellectual content of this thesis is the product of my own work, even though I may have received assistance from others on style, presentation and language expression.

Vicki Deakin

September 2000

Department of Food Science and Technology

The University of New South Wales

ACKNOWLEDGMENTS

This thesis was undertaken while the author was working in a full-time capacity at the University of Canberra. The University of Canberra provided three months of study leave to undertake data collection and I would like to thank Mrs Janice Plain, in the first instance, for acting in my position while I was on this leave. Associate Professor Peter Greenham kindly granted access to his class of undergraduate students during lecture periods.

The work was jointly supervised by Associate Professor Heather Greenfield (UNSW), who willingly and enthusiastically accepted the role of primary supervisor of an ex-campus student, an undoubtedly daunting task, and by Dr Karen Cashel from the University of Canberra. I am grateful to Dr Greenfield for her dedication to this task, her thoroughness and her attention to detail. Her numerous visits to Canberra and the feedback she provided, considering the circumstances, went beyond the role of any supervisor. Dr Greenfield is indeed an outstanding mentor. Dr Cashel has been invaluable in providing scholarly input, personal support, motivation and encouragement. Her contribution and assistance to me, both personally and professionally, are worth more than a simple acknowledgment.

Mrs Deidre Briscombe and Mrs Helen Cooke, both nurses of extensive experience in patient counselling and management, voluntarily acted as data collectors at The Cardiovascular Health Risk Management Clinic and assisted in processing of data. I would like to thank Miss Maria Roberts, an undergraduate student at The University of Canberra, for voluntarily cross-checking, double-coding and assisting in the analysis of the food records.

Dr Katrine Baghurst from CSIRO Division of Human Nutrition in Adelaide made available unpublished information about food sources of fat intake in her surveys of the population groups in SA and Victoria. Both Dr Baghurst and Professor Tony Worsley from NCEPH provided direction and encouragement to pursue this difficult topic. Dr George Cho and Ms Cathy Hales from the University of Canberra were invaluable in their assistance in data analysis and statistical procedures and I appreciate the resources and advice they provided in setting up a database and solving problems with reading the data. Wayne Robertson, a statistician at the University of Canberra, resolved a statistical problem that delayed the progress of this thesis for many months. Thank you, Wayne.

Finally I dedicate this thesis to my son, Lachlan Deakin, who has, at the young age of six, tolerated and endured my long absences involved with completing this thesis, especially in the last few months.

TABLE OF CONTENTS

Certificate of originality
Acknowledgments
Contents
Appendices
List of tables
List of figures
Glossary
Acronyms
Abstract

CHAPTER 1: Introduction and aims
1.1 Introduction
1.2 Background
1.2.1 Fat intake in the Australian population
1.2.2 High fat intake and disease
1.2.3 The need for a simple method to assess fat intake in local programs
1.3 Methods for measuring dietary intake
1.3.1 Traditional methods
1.3.2 The need for new methods for measuring dietary intakes
1.3.3 The importance of testing the validity of new methods for measuring dietary intake
1.4 Aims and objectives

CHAPTER 2: Literature review: methods for measuring dietary intakes and their validation
2.1 Introduction
2.2 Reasons for measuring food intake
2.3 Methods for measuring food consumption
2.3.1 Measuring food consumption
2.3.2 Conversion of food into nutrients
2.3.2.1 Food composition data
2.4 The design, uses and limitations of FFQs
2.4.1 Format of FFQs
2.4.2 Uses of FFQs
2.4.2.1 Use of FFQs in clinics
2.4.3 Limitations of FFQs
2.4.4 Issues in design of FFQs
2.4.4.1 Food items for inclusion
a. FFQs derived from food composition tables
b. FFQs derived from food consumption data
c. FFQs derived from statistical procedures linking nutrient composition with consumption practices
2.4.4.2 Number of food items
2.4.4.3 Use of portion sizes
a. Does the inclusion of portion sizes increase accuracy?
b. Are some foods more accurately quantified than others?
c. Does inclusion of 'standard' portion size introduce bias?
2.4.4.4 Influence of grouping foods of similar nutrient content
a. Order of foods presented on FFQs
b. Consensus of opinion for grouping foods of similar nutrient content in a FFQ
2.4.4.5 Number of response options
2.5 Assessing agreement between two methods of measurement
2.5.1 The use and limitations of approaches and statistical tests for measuring agreement in method comparison studies
2.5.1.1 Hypothesis testing and use of P values
2.5.1.2 Standard deviation, standard error and confidence intervals
2.5.1.3 Comparison of means
2.5.1.4 Correlation
2.5.1.5 Regression analysis
2.5.1.6 Differences in means for each method
2.5.1.7 Classification techniques
Percent agreement
Kappa statistic
2.5.1.8 Specificity, sensitivity and predictive values
2.5.1.9 Determining cut-off for nutrients that are used for calculating sensitivity and specificity of a dietary assessment method
2.6 Validity
2.6.1 Definition of terms
2.6.1.1 Reproducibility
2.6.1.2 Validity
Content validity
Criterion validity
2.7 Validity of current and retrospective methods used for measuring food consumption
2.7.1 Validations of current methods
2.7.2 Validations of retrospective methods
2.7.3 Choice of another dietary measure as the criterion method for validation
2.7.4 Use of biomarkers as the criterion method
2.7.5 Constraints to validity testing
2.8 Validation studies of FFQs
2.8.1 Reproducibility studies of FFQs
2.8.1.1 Use and interpretation of statistical tests for measuring reproducibility
2.8.1.2 Comparison of time intervals between tests
2.8.2 Content validity studies of FFQs
2.8.3 Criterion validity studies of FFQs
2.8.3.1 Use and interpretation of statistical tests for measuring criterion validity of FFQs
2.8.3.2 Validity studies of FFQs that measure nutrients in the 'usual' diet (long-term), including fat
a. Australian studies of FFQs that measure 'usual' diet
2.8.3.3 Validity studies of long-term FFQs designed to measure fat intakes in the 'usual' diet
2.8.3.4 Validity studies of short-term FFQs, including fat
a. FFQs that measure most nutrients
b. FFQs that measure one or a few nutrients
c. FFQs that measure fat
d. Characteristics of short-term FFQs designed to measure fat intake
2.9 Summary of the literature

CHAPTER 3: Methods
3.1 The development of the FC
3.1.1 Design of the FC
3.1.2 Selection of foods for inclusion in the FC
3.1.3 Number and order of food items in the FC
3.1.4 Food grouping
3.1.5 Options for serve size information
3.1.6 Inclusion of fat content of foods
3.1.7 Frequency response options
3.2 Study design and subjects
3.2.1 Study design
3.2.1.1 Selection of subjects for study of the pilot FC
3.2.1.2 Selection of subjects for the reproducibility and validity study of the final version of the FC
3.2.1.3 Determination of sample size
Effect size
Power
Significance
3.2.1.4 Sample size estimations for correlation analysis
3.2.1.5 Sample size estimates for linear (least squares) regression
3.2.1.6 Exclusion and withdrawal criteria for subjects
3.2.3 Consent
3.2.4 Ethics approval
3.3 Data collection
3.3.1 Pilot testing of the FC
3.3.1.1 Protocol for pilot testing of the FC
3.3.2 Reproducibility of the final version of the FC
3.3.2.1 Data collection for the reproducibility study
3.3.3 Criterion validity of the final FC
3.3.3.1 Data collection for the criterion validity study
3.3.4 Content validity of the final FC
3.4 Dietary instruments
3.4.1 The final FC
3.4.1.1 Scoring of data from the FC
3.4.2 Three-day food record (criterion method)
3.4.2.1 Nutrient analysis
3.5 Data analysis and statistical methods
3.5.1 Reproducibility of the final FC
3.5.2 Criterion validity of the final FC
a. Descriptive statistics of fat intake from the FC and FR in Group 2 and Group 3
b. Association and agreement between the FC and FR in Group 2 and Group 3
Determination of the measurement error of the FC
Determination of the measurement bias of the FC
Precision of the limits of agreement of the FC
Least squares regression analysis
Sensitivity, specificity and predictive values
c. Comparison of classification of individuals into quartiles of fat intake between the FC and FR
d. Comparison of serve sizes between the FC and FR
e. Identification of food items erroneously recalled or misclassified on the FC
3.5.3 Content validity of the final FC
3.5.3.1 Statistical procedures used for measuring content validity of the FC
a. Frequency of consumption of individual foods on the FC
b. Frequency of alteration to serve sizes in each group
c. Identification of FC food items that were major contributors to 'true' daily fat intake

CHAPTER 4: Results of validity testing of the final fat checklist
4.1 Final numbers of participants
4.1.1 Profile of study participants
4.2 Reproducibility of the final fat checklist
4.2.1 Agreement assessment between test 1 and test 2
4.2.2 Test-retest results (individual responses to each food item)
4.2.3 Summary of reproducibility results
4.3 Criterion validity of the final FC
4.3.1 Standards used for measuring criterion validity
4.3.2 Characteristics of the dietary intake in Group 2 and Group 3
4.3.2.1 The daily energy and nutrient content of the diet derived from the three-day FR in Group 2 and Group 3
4.3.2.2 Comparison of mean fat intake (g fat/d) between the FR and FC in Group 2 and Group 3
4.3.3 Tests of association and agreement between the methods
4.3.3.1 Comparison of individual fat intake (g fat/d) between the FR and FC
For Group 2
For Group 3
4.3.3.2 Correlation coefficients between the FR and FC
4.3.3.3 Measurement error of the FC for Group 2 and Group 3
4.3.3.4 Least squares (linear) regression analysis
4.3.3.5 Comparison between methods using cross-classification techniques
4.3.3.6 Sensitivity, specificity and predictive value of the FC
4.3.4 Sources of measurement bias/error of the FC
4.3.4.1 Comparison of misreporting of serve sizes on the FCs with FRs in Group 2 and Group 3
4.3.4.2 Identification of specific food items erroneously recalled on the FCs in Group 2 and Group 3
4.3.5 Fat score from the FC of the 'non-responders'
4.3.6 Summary of the criterion validity studies of the FC
4.4 Content validity of the final FC
4.4.1 Data used for measuring content validity
4.4.2 Frequency of consumption of food items listed on the FC in Group 2 and Group 3
4.4.2.1 Food items that were major contributors to daily fat intake based on frequency of consumption of foods
4.4.2.2 Suitability of the frequency format used
4.4.2.3 Alterations to the standard amount (serve size)
4.4.3 Comparisons of food choices between Group 2 and Group 3
4.4.3.1 Comparisons of food choices from selected food groups
4.4.3.2 Comparisons of food choices from individual food items
4.4.4 Summary of content validity results

CHAPTER 5: The validity and reliability of the final FC
5.1 Introduction
5.2 Reproducibility of the final FC
5.3 Criterion validity of the final FC
5.3.1 Comparison of measures of agreement between the FRs and FCs in Group 2 and Group 3
5.3.1.1 Comparison of mean fat intake
5.3.1.2 Comparison of individual difference in fat intake between the FR and FC
a. Scatterplots of the FR and FC
b. Correlation coefficients of the FR and FC
c. Regression
d. Difference in fat intake of individuals between the FR and FC
5.3.1.3 Cross classification techniques
5.3.2 External validity of energy intake
5.4 Content validity of the final FC
5.4.1 Range of daily fat intake in Group 2 and Group 3
5.4.2 Design of the final FC
5.4.2.1 Suitability of the FC to reflect the food choices of Group 2 and Group 3
5.4.2.2 Suitability of the frequency format for Group 2 and Group 3
5.4.2.3 Grouping foods: the effect of grouping meat on the FC
5.5 Factors affecting the interpretation of validity tests of the FC in Group 2 and Group 3
5.5.1 Constraints to the study design
5.5.2 Subject selection and sampling bias
5.5.3 Implications of the decrease in sample size on validity
5.5.3.1 For the reproducibility study
5.5.3.2 For the criterion validity study
5.5.3.3 Effect of the decreased sample size on the power of the measures used to detect differences between the FC and FR in the criterion validity study
5.5.4 Effects of investigator bias and training effects on agreement measures
5.5.5 Effects of a three day time frame for the study on agreement measures
5.5.6 Effects of using consecutive days of testing on agreement measures
5.5.7 Influence of food records on agreement measures
5.5.7.1 Reference method for validation of FC
5.5.7.2 Sources of inaccuracy and bias in recording food intakes
5.5.7.3 Computation of fat intakes
5.5.7.4 Exclusion of food records from the study
5.5.8 Influence of the FCs on the agreement measures
5.5.8.1 Timing of administration of the FC
5.5.8.2 Format and design of the FC
5.5.8.3 Influence of memory or recall bias
5.5.8.4 Errors in reporting serve size

CHAPTER 6: Conclusions and suggestions for future work
6.1 Conclusions
6.1.1 Detailed conclusions
Reproducibility
Criterion validity
Content validity
6.1.2 Problems with the study design
6.1.3 Problems with respondent bias
6.2 Suggested modifications to the FC
6.2.1 Summary of suggested modifications for clinic use
6.3 Recommendations for use of the FC
6.4 Suggestions for future testing

Bibliography

Appendices
Appendix 1 Determining sample size and power
Appendix 2 Background information to subjects involved in the pilot study
Appendix 3 Evaluation form for the pilot fat checklist
Appendix 4 Background information for subjects involved in the criterion validity study
Appendix 5 Consent forms for the validity study
Appendix 6 Ethics approval
Appendix 7 Example of a completed FC (final version)
Appendix 8 Instructions for recording food records by weighed methods
Appendix 9 Instructions for recording food records by estimated methods
Appendix 10 Measures of agreement using Bland-Altman techniques
Appendix 11 Differences in fat intake (g fat/d) of individual food items between Test 1 and Test 2
Appendix 12 Calculations for sensitivity, specificity and predictive values of the FC
Appendix 13 Frequency distribution of responses in Group 1
Appendix 14 Mean fat intake (g fat/d) from individual food items on the fat checklist for Group 2 and Group 3
Appendix 15 Basis for categories of 'selected' food items on the fat checklist for comparing differences in food choice between Group 2 and Group 3

List of tables

Table 2.1 Uses and applications of food consumption surveys
Table 2.2 Overview of the main applications, strengths and limitations of dietary survey methods
Table 2.3 Characteristics of criterion validity studies of FFQs with other dietary measures that measure nutrients in the 'usual' diet (for retrospective periods greater than one month) between 1968-1995
Table 2.4 Characteristics of criterion validity studies of short-term FFQs that measure the 'usual' or one or more nutrients over periods of less than four weeks
Table 2.5 General approaches to measuring agreement
Table 2.6 Approaches used to validate a FFQ
Table 2.7 Characteristics of reproducibility studies of FFQs between 1984-1994 using a test-retest technique (including a range of correlations for nutrients tested and total fat, if reported)
Table 3.1 Pilot fat checklist
Table 3.2 Foods used for grouping 'trimmed' meat for Q3 on the pilot FC
Table 3.3 Criteria for selection of population groups for evaluation of the FC
Table 3.4 Timetable and protocol for data collection for the criterion validity study of the FC for Group 2 and Group 3
Table 3.5 The final fat checklist
Table 3.6 Statistical procedures used to measure the criterion validity of the final FC
Table 3.7 Methods used to analyse content validity of the final FC in Groups 2 and 3
Table 4.1 Participation and demographic characteristics of the study samples
Table 4.2 Comparison of descriptive statistics for fat intake (g fat/d) from the FC at Test 1 and Test 2 for Group 1 (n=49, matched subjects), including unmatched samples
Table 4.3 Statistical parameters for the FC used for assessing agreement in fat intake between Test 1 and Test 2 (n=49)
Table 4.4 Summary statistics of macronutrient and energy contribution in Group 2 and Group 3 calculated from the FR, including estimated Basal Metabolic Rate
Table 4.5 Mean daily fat intake (g fat/d) and descriptive statistics between the FR and FC in Group 1 and Group 2
Table 4.6 Comparison of the interclass correlations for fat intake (g fat/d) between the FR and FC for Group 2 and Group 3
Table 4.7 Measurements used for assessing agreement between the FR and FC, including the measurement error of the FC
Table 4.8 Analysis of variance corresponding to the regression of the FC on the FR
Table 4.9 Variables in the regression equation used for determining 'true' fat intake from the FC in Group 2 and Group 3
Table 4.10 The 25th, 50th and 75th percentile from the FR and FC for Group 2 and Group 3, pooled (n=61)
Table 4.11 Cross-classification of ranked fat intake (based on the mean daily fat intake (g fat/d)) between the FR and FC in Group 2 and Group 3, pooled (n=61)
Table 4.12 Mean fat intake (g fat/d) in FR quartiles and FC quartiles for subjects in Group 2 and Group 3, pooled (n=61)
Table 4.13 Cross classification of 'lower' and 'higher' fat consumers between the FC and FR for Group 2 and Group 3, pooled (n=61)
Table 4.14 Indicators of performance of the FC in 61 subjects at different cut-offs
Table 4.15 Site and frequency of errors in recalling food accurately on the FC by subjects in Group 2 (n=30) and Group 3 (n=19)
Table 4.16 Comparison of descriptive statistics of fat intake (g fat/d) from the FC between the matched and unmatched subjects in Group 2 and Group 3
Table 4.17 Frequency distribution of responses (%) to FCs for Group 2 (n=42) and Group 3 (n=19)
Table 4.18 Food items on the FC which contribute around 70% of the 'true' daily fat from the FR intake in Group 2 and Group 3
Table 4.19 Differences in mean fat intake (g fat/d) from selected food groups for the FC between Group 2 and Group 3 (n=61 FCs in 61 subjects)
Table 4.20 Difference in mean fat intake (g fat/d) from individual foods that were significantly different between Group 2 and Group 3
Table 5.1 Power recalculations for the final subject numbers used for testing agreement between the FR and FC

List of figures

Figure 3.1 Study design of the pilot study, reproducibility and validity studies of the fat checklist
Figure 3.2 Food models used for cross-checking the estimated FRs
Figure 4.1 Repeated measures of fat intake using the FC in the same subjects in Group 1, with the line of equity
Figure 4.2 Agreement assessment between differences in daily fat intake (g fat/d) of test 1 minus test 2 plotted against the mean fat intake in both tests for Group 1 (n=49). The centre line represents the mean differences between the two tests and the other two lines represent 2SDs from the mean
Figure 4.3 The loge transformed data of Figure 4.2
Figure 4.4 Distribution of individual estimated energy intake (EEI): predicted basal metabolic rate (PBMR) from the FR in Group 2
Figure 4.5 Distribution of individual estimated energy intake (EEI): predicted basal metabolic rate (PBMR) calculated from the FR in Group 2
Figure 4.6 Mean daily fat intake (g fat/d) for Group 2 measured by the FR and FC with the line of equity and 95% prediction intervals (n=42)
Figure 4.7 Data of Figure 4.6 after logarithmic transformation (n=42)
Figure 4.8 Mean daily fat intake (g fat/d) for Group 3 measured by the FR and FC with the line of equity and 95% prediction intervals (n=19)
Figure 4.9 Data of Figure 4.8 after logarithmic transformation
Figure 4.10 Agreement assessment between the two methods. Differences in fat intake (g fat/d) between the two methods plotted against the means for Group 2 (n=42). The centre line represents the mean differences between the two tests and the other two lines represent 2SDs from the mean differences
Figure 4.11 Agreement assessment between the two methods. Differences in fat intake (g fat/d) between the two methods plotted against the means for Group 3 (n=19). The centre line represents the mean differences between the two tests and the other two lines represent 2SDs from the mean differences
Figure 4.12 The fitted regression line for individual subjects for Group 2, including 95% prediction intervals for individuals
Figure 4.13 The fitted regression line for individual subjects for Group 3, including 95% prediction intervals for individuals
Figure 4.14 Studentised residuals from the regression line plotted against the FC values (or unstandardised predicted values) for Group 2
Figure 4.15 Studentised residuals from the regression line plotted against the FC values (or unstandardised predicted values) for Group 3
Figure 4.16 Normal plot of residuals from Figure 4.14 for Group 2
Figure 4.17 Normal plot of residuals from Figure 4.15 for Group 3
Figure 4.18 A box and whiskers plot of fat intake of the FR quartiles for Group 2 and Group 3 (pooled), showing 2.5, 25, 50, 75, 97.5 cumulative relative frequencies (percentiles) (n=61)
Figure 4.19 A box and whiskers plot of fat intake of the FC quartiles for Group 2 and Group 3 (pooled), showing 2.5, 25, 50, 75, 97.5 cumulative relative frequencies (percentiles) (n=61)

Glossary

Anthropometric measures
Measurements of the size, weight and proportions of the human body.

Area under curve (AUC)
The area under a ROC curve (AUC) is the probability that a test will correctly identify the disease or the true measure of the variable measured.

Bias
Deviation from the truth or accuracy of the measure.

Systematic bias
One-sided deviation or deviation in the same direction from the accuracy of the measure.

Biochemical markers
See biomarker.

Biomarker
A biological or physiological measurement, used as an indicator of nutrient intakes, nutritional status, metabolic effects of exposure to substances and susceptibility to, or presence of, diet-related diseases.

Cohen's kappa
See kappa statistic.

Confidence interval
A range of values that is expected to include the true value. A 95% CI for a statistic will not include the population value 5% of the time. This means that with 95% confidence the true value will be included in the defined range provided. Its calculation is linked to the SE and t tests. For example, the 95% CI for the observed mean of a variable provides a range of 'acceptable' values from around 2SE below the observed mean (lower limit) to around 2SE above it (upper limit) (Gardner and Altman 1989, 93-4).

Content validity
Refers to the extent to which the measurement "incorporates the domain of the phenomenon under study" (Last 1988, 132). Content validity of a dietary method reflects the ability of the method to define or reflect the types and quantities of foods or meals consumed within a specified study population. This is also a measure of the 'internal' validity, 'face' validity or 'demonstrated' validity of the method (Block and Hartman 1989).

Criterion validity
Criterion validity relates to the extent to which the measurement correlates, or agrees with, an "external criterion of the phenomenon under study" (Last 1988, 132). Criterion validity of dietary methods, often called external validity in this context, refers to the comparison of a new dietary method with another dietary method considered more 'accurate', or with an external measure such as a biomarker (Lee and Neiman 1993, 63).

Demographic
The characteristics of populations or groups, especially in relation to sex, age and vital statistics.

Demonstrated validity
See content validity.

Epidemiology
The study of the distribution and determinants of health characteristics of specified populations, and the application of these determinants to control health problems (Last 1988, 42).

Face validity
See content validity.


Gross misclassification
The term 'gross misclassification' describes a situation in which subjects are classified in extreme and opposing categories by two different methods (Feunekes et al. 1993).

Hypothesis tests
Tests of significance (see significance test).

Internal validity
Internal validity relates to the 'accuracy' of the measurement within the confines of the study and the study population (Gehlbach 1988, 3).

Kappa statistic
A kappa statistic assesses the degree of agreement between two measures by ranking and comparing each measure. This statistic is used where there are more than two categories for comparison (for example, tertiles or quartiles). It is analogous to the Pearson correlation coefficient but is applied to nominal (categorical) variables.

Negative predictive value
The proportion of subjects with negative test results who are correctly diagnosed.

Null hypothesis
A null hypothesis states that there is no difference or no association between variables. If a significant difference or association is found then the null hypothesis is rejected (Munro 1997, p. 68).

P-values
The areas in the extremes or tails of the distribution beyond the value of the test statistic. P-values denote the probability of obtaining a value of the test statistic at least this extreme by chance alone (ie if the null hypothesis were true).

Positive predictive value
The proportion of subjects with positive test results who are correctly diagnosed.

Power
The power of a test is the probability that a test of a specified sample size would detect, as statistically significant, a real difference of a given magnitude (Altman 1991, 455).

PBMR
Predicted Basal Metabolic Rate.

Receiver Operator Curves (ROC)
A ROC is generated by plotting the sensitivity versus 1-specificity for each possible cut-off and then joining the points. These plots determine the best possible cut-off (ie the point where the sum of sensitivity and specificity is maximised) based on the numerical information entered.

Regression (linear)
Regression is used to describe the relationship between two continuous variables in order to predict the value of one variable for an individual using another known variable (Altman 1991, 299).

Reliability
Reliability relates to the reproducibility of measurements and refers to the degree of stability of a measurement repeated under identical conditions (Last 1988, 114).

Repeatability
See reproducibility.

Reproducibility
Reproducibility, often called reliability or repeatability, is defined as the ability of a dietary method to produce the same quantitative estimate of food intake on two or more different occasions from the same individuals (ie a different concept from validity) (Block and Hartman 1989; Lee and Nieman 1993, 66).

Residuals from the regression model
Residuals are the differences between the observed value of the dependent variable and the value predicted by the regression line after the regression model is fitted or determined.

Semi-quantitative
Usually involves a combination of measures that are mostly quantifiable and some measures that are qualitative.

Sensitivity
The proportion or percentage of true cases correctly categorised by the instrument or test (= true positive rate).

Significance level
The level that corresponds to the area in the critical region at the extremes of the distribution of data. When a test statistic falls in this area, which is usually infrequent, the results are referred to as significant at the alpha level. The significance level is the probability of rejecting a true null hypothesis (ie a Type I error or alpha) (Cohen 1988, 4).

Significance test (= test of significance)
A procedure that is used to establish whether the test statistic falls within the critical region. If it does, the result is significant (Kuzma 1992, 122).

Specificity
The proportion or percentage of non-cases correctly categorised by the instrument or test (= true negative rate).

Standard deviation
A measure of the variability between individuals in the level of the factor being investigated (eg variability of individual scores around the mean) (Altman 1991, 16).

Standard error
Depends on the sample size and the standard deviation and is a measure of the uncertainty of the test statistic. It is a measure of the imprecision of the sample statistics (Altman 1991, 16).

Studentised residuals
Residuals adjusted for differences in variability from point to point. Each is calculated by dividing the observed residual by an estimate of the standard deviation of the residual at that point. Studentised residuals are preferred to standardised residuals because plots of them make it easier to see violations of the regression assumptions (ie normality).

Validation
Validation is the process of establishing that a method is sound or accurate (Last 1988). The validity of any dietary method to measure food consumption is defined as 'its ability to measure the intakes of foods or nutrients with accuracy' (Hankin 1988), or the ability of a dietary method to measure 'what it is intended to measure' (Block 1982, 1988; Burema et al. 1988, 171; Block and Hartman 1989).

Validity
The measurement of validity is "the degree to which a measurement measures what it purports to measure" (Last 1988).
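Several of the agreement measures defined in this glossary (sensitivity, specificity, predictive values and the kappa statistic) can be computed directly from a cross-classification table. The following is a minimal sketch in Python. The counts are hypothetical and are not data from this study, the function names are illustrative only, and the kappa shown is the unweighted form, which is an assumption about the variant intended.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and predictive values from 2x2 counts.

    tp/fn: true cases classified as positive/negative by the test.
    fp/tn: non-cases classified as positive/negative by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def cohens_kappa(table):
    """Unweighted Cohen's kappa for a square cross-classification table.

    table[i][j] = number of subjects placed in category i by one method
    and category j by the other (eg quartiles of fat intake).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 40 true 'high fat' cases and 60 non-cases
measures = diagnostic_measures(tp=30, fp=5, fn=10, tn=55)
kappa = cohens_kappa([[30, 10], [5, 55]])
print(measures["sensitivity"], round(kappa, 2))
```

With these hypothetical counts, 30 of the 40 true cases are detected, so the sensitivity is 0.75. The predictive values depend on how common the condition is in the sample, whereas sensitivity and specificity do not.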


Acronyms

AUC The area under a receiver operator curve

BMI Body Mass Index

CHRMC Cardiovascular Health Risk Management Clinic

EEI Estimated energy intake

FC Fat checklist

FR Food record. In this study a three-day food record

FFQ Food frequency questionnaires

INFOODS International directory of food composition tables

LSAS Low Sodium Advisory Service

NUTTAB Nutritional tables (Australian data base)

PBMR (Predicted) Basal Metabolic Rate

P:M:S Polyunsaturated:monounsaturated:saturated fatty acid ratio

ROC Receiver operator curve

SODA Systems On-Line Dietary Analysis

WHO World Health Organisation

RDI Recommended Dietary Intakes

RDA Recommended Dietary Allowances

NHANES National Health and Nutrition Examination Survey


Abstract

The objective of this study was to develop a short-term fat checklist (FC) and to examine whether it could replace a three-day food record (FR). The intended use of the FC was to measure and monitor current fat intake over three consecutive days in patients attending a local clinic in Canberra during a four-month lifestyle intervention program, and to provide an education message. A self-administered FC, in the form of a semi-quantitative FFQ, comprising 43 food items derived from food consumption data of two Australian population studies of adults was developed.

The FC was compared with a three-day estimated food record (FR) in 19 volunteer clinic patients (9 men, 10 women) and also, to broaden its application, in 42 tertiary nutrition students (12 men, 30 women) against a three-day weighed FR. Correlations for fat intake between the FC and FR were 0.86 for clinic patients and 0.68 for nutrition students. Mean differences in fat intake estimated by the two methods did not differ significantly from zero, although Bland-Altman plots showed large differences in fat intake between individuals. When classified by the FR, 92% of all subjects fell into the same or adjacent quartile when classified by the FC. Fifty-four percent of subjects were classified in the same quartiles for fat intake, although overall ranking of quartiles suggested a 'poor' but significant agreement (kappa statistic, 0.33). The FC classified subjects consuming ≥70 g fat/d with 74% sensitivity. A Pearson's correlation of 0.65 was calculated by a test-retest reproducibility procedure at three weeks in 49 tertiary physiology students. The FC reflected the food choices of all subjects tested with few exceptions.

It was concluded that the FC cannot replace a three-day FR to measure absolute fat intake of individuals but was acceptable for measuring fat intake at the group level and for ranking individuals into broad categories of fat intake. Its sensitivity of 74% to fat intakes ≥70 g fat/d suggested that the FC detected around three out of four subjects identified by the FR as having a high fat consumption. It was also useful for screening between 'high' and 'low' fat intakes in groups.


CHAPTER 1: INTRODUCTION AND AIMS

1.1 Introduction

Diet plays an important role in the promotion of health and prevention of many diseases. Measurement of dietary intakes is crucial to establishing the diet-disease links, to identifying risk groups in the population, and to assessment of individuals. Nutrition education and intervention programs for individuals and small groups are needed in Australia, in clinical practice and in community-wide health promotion programs conducted by dietitians and other health professionals. Traditional methods of measuring dietary intakes of individuals in health management clinics are time-consuming, expensive, require trained dietitians to interview subjects, and may require extensive coding and analysis (Bingham 1985, 1987). The development of methods for local use that streamline the collection and analysis of dietary data would help professionals deliver and evaluate such programs.

1.2 Background

1.2.1 Fat intake in the Australian population

Australian adults are eating high-fat diets. An analysis of the dietary intakes from two population surveys of Australian adults, the National Dietary Survey of 1983 and the Victorian Nutrition Survey, suggests that fat intakes are similar and that fat contributes around 37-38% of total energy intake (English et al. 1987, 33; Baghurst et al. 1988). In Australia, a reduction in fat intake by the Australian population from around 38% of total energy in the national diet to a target of 30% of total energy has been recommended (Nutbeam et al. 1993). Similar population targets have been recommended in the US (US Department of Health and Human Services 1988; US Department of Health and Human Services 1990) and Canada (Department of National Health and Welfare 1990). The World Health Organisation (WHO) also suggests a population target for fat intake of 15-30% of energy in the diet (WHO Study Group 1990).
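The population targets above are expressed as the percentage of total energy supplied by fat. As a worked illustration, the sketch below converts between grams of fat and percentage of energy. It assumes the general energy factor of 37 kJ per gram of fat (about 9 kcal/g), and the intake figures are hypothetical rather than survey values. The function names are illustrative only.

```python
FAT_KJ_PER_G = 37.0  # assumed general energy factor for fat (about 9 kcal/g)


def percent_energy_from_fat(fat_g, total_energy_kj):
    """Percentage of total energy intake contributed by fat."""
    if total_energy_kj <= 0:
        raise ValueError("total energy must be positive")
    return 100.0 * fat_g * FAT_KJ_PER_G / total_energy_kj


def fat_g_for_target(target_percent, total_energy_kj):
    """Grams of fat per day that supply target_percent of total energy."""
    return target_percent / 100.0 * total_energy_kj / FAT_KJ_PER_G


# Hypothetical adult: 90 g fat/d on a total intake of 9000 kJ/d
print(percent_energy_from_fat(90, 9000))      # 37% of energy from fat
print(round(fat_g_for_target(30, 9000), 1))   # grams of fat at a 30% target
```

For this hypothetical intake of 9000 kJ/d, 90 g of fat corresponds to 37% of energy, and a 30% target corresponds to roughly 73 g of fat per day.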

1.2.2 High fat intake and disease

Reducing fat intake in individuals and groups is a major target of intervention programs to decrease the prevalence and incidence of many diet-related diseases. There is a strong association between high fat intake, in combination with high serum cholesterol, and risk of cardiovascular disease (Lipid Research Clinics Program 1984; Martin et al. 1986). The rationale for reducing high fat intakes, particularly certain types of saturated fat, to reduce serum cholesterol and reduce the morbidity and mortality due to cardiovascular disease, is well documented in numerous population studies (Keys 1980; Hjermann et al. 1981; Shekelle et al. 1981). In the Oslo study by Hjermann and colleagues (1981), a 17% decrease in total serum cholesterol was reported in the intervention group (n = 604 men, aged 40-49 years) who consumed around 28% of energy from fat compared to the matched control group (n = 628, 40-49 years) consuming 44% of energy from fat, irrespective of the type of fatty acids consumed.

High fat diets are also linked to hypertension (Bjorntorp 1986), cancer of the breast (Goodwin and Boyd 1987; Schatzkin et al. 1989), cancer of the large intestine (McMichael 1988; Graham et al. 1988), obesity and weight gain (Danforth 1985), and Type II diabetes mellitus (Zimmet et al. 1986; Zimmet 1988).

1.2.3 The need for a simple method to assess fat intake in local programs

A number of lifestyle intervention programs to reduce risk factors for diet-related diseases have been set up in Australia. One such program is The Cardiovascular Health Risk Management Clinic (CHRMC) (formerly The Low Sodium Advisory Service) initiated in 1989 in Canberra. Patients are medically referred to the clinic to undertake programs targeted at weight loss, lowering elevated serum cholesterol, stress management, control of hypertension and cessation of smoking. Patients attend the CHRMC for regular assessment, education and monitoring over four months.

New resources were required for the weight loss and cholesterol-lowering programs of the CHRMC that could be used by the staff and patients to assess dietary behaviour and lifestyle habits, to self-monitor dietary behaviour, to provide educational input, and to evaluate the intervention programs. A food frequency questionnaire (FFQ) relating to sodium had previously been developed and tested by researchers at this clinic and was found to be a valid and reliable method, in combination with external measures, for estimating sodium intakes (Millar and Beard 1988). Staff at the CHRMC requested the development of a resource to measure absolute fat intakes of individuals based on a three-day retrospective intake of diet (the same format as the sodium checklist in use at the time).

1.3 Methods for measuring dietary intake

1.3.1 Traditional methods

Measurement of dietary intake is difficult, making the process of patient assessment and dietary intervention frustrating and time-consuming. There are numerous methods to assess dietary intakes of individuals and groups, each with advantages and disadvantages. Several reviews have been written about the reliability and validity of the most common methods: 24-hour recall, weighed and estimated diet records, diet history method and FFQs (Block 1982; 1989; Block and Hartman 1989; Bingham 1987; Lee and Nieman 1993, 63-8). For individuals, dietary histories and food records are used to measure food habits and behaviours and to determine nutrient intake. FFQs are infrequently used for assessment of dietary intakes of individuals. For groups, the common methods used for measuring dietary intakes are quantitative, semi-quantitative or qualitative approaches, including 24-hour recalls, 24-hour records, and FFQs.

Most data for food intake are translated into nutrient intakes and involve coding every food item prior to conversion to nutrients. When used to measure 'usual' diet, these methods produce very detailed nutrient information, which is not always required for evaluating a nutrition intervention program or for studies where ranking of only one or a few nutrients is the objective.

1.3.2 The need for new methods for measuring dietary intakes

A number of shorter, simpler methods of assessing dietary intake that measure one or two nutrients have been developed and validated in the US and Europe (see Chapter 2, Table 2.5, p. 34). These reduce and may eliminate the need for coding and extensive nutrient analysis if a simple scoring system is included on the questionnaire itself (Angus et al. 1989). A food checklist that targets foods that are good sources of a particular nutrient, and that is a valid and reliable replacement for the more time-consuming methods (eg diet histories, food records), could provide a tool for dietary management of individuals and could have multiple uses in the clinic for nutrition education and dietary assessment.

A specific local fat checklist, in contrast to a three-day food record, could be used by CHRMC dietitians or health educators at the time of consultation to measure and monitor quantities and sources of fat in the diet. They could then provide immediate feedback for the patient on eating behaviour. Such a method could also be used for measuring intake, identifying food sources of a nutrient and for monitoring of food choices, modification of inappropriate eating habits, and nutrition education. This would offer a number of advantages for the CHRMC and may have potential for application in a wider Australian population.

1.3.3 The importance of testing the validity of new methods for measuring dietary intake

The term validity, in relation to dietary intake methods, is defined as the ability of the method 'to measure the intakes of foods and nutrients with accuracy' (Hankin 1988, 183) (see Chapter 2, Section 2.5, p.16-17). Validation testing is an important component in the acceptance of any new method of measuring dietary intake and provides, if the measure agrees favourably with other more 'accurate' measures of dietary intake, scientific credibility for its use. Techniques used to test the validity of a dietary method, particularly a FFQ, are described in Chapter 2 (Section 2.5, p.16-19).

Although numerous methods for measuring dietary intake of specific nutrients have been developed and are in use in programs throughout Australia, very few are published in detail or adequately validated. This may reflect the logistics and cost associated with the design, sampling and implementation of any validity study.

1.4 Aims and objectives

The aims and objectives of this project were:

• to develop a fat checklist in the form of a food frequency questionnaire (FFQ) based on local Australian food habits and food composition data;

• to pilot the fat checklist for ease of completion, clarity of directions, understanding of the format and for adequacy of the frequency options and serve sizes listed in a sub-sample of university students and clinic patients;

• to assess whether the fat checklist could be substituted for a three-day food record as a valid measure of current fat intake of individuals participating in a longitudinal dietary intervention program in a local clinic (clinic patients);

• to measure the validity of the fat checklist in another sample of subjects (university students studying nutrition), with an assumed wider range of energy intakes than clinic patients;

• to assess reproducibility of the fat checklist in a discrete study group (university students studying physiology);

• to assess criterion validity of the fat checklists in clinic patients involved in a community­ and also in university students studying nutrition;

• to identify foods misreported on the fat checklists by university students studying nutrition;

• to assess content validity of the fat checklist in university students studying nutrition and clinic patients; and

• to make recommendations for any modifications of the fat checklist found necessary based on the outcomes of validity testing.


CHAPTER 2: Literature review

METHODS FOR MEASURING DIETARY INTAKES AND THEIR VALIDATION

2.1 Introduction

Measurement of dietary intake in combination with biochemical, anthropometric and clinical information is used to assess the nutritional status of individuals and groups, eg in national nutritional monitoring surveys, in nutrition intervention programs, epidemiological surveys, and in clinical assessments. Improving accuracy of the methods used for measuring dietary intake is crucial although it is currently impossible to measure 'true' dietary intake with total accuracy in free-living people. New methods such as checklists or questionnaires that are quick and accurate are needed, especially in clinical practice where time-consuming methods such as food records and diet histories are traditionally used. Any new method needs to be compared and tested against existing, more accurate, methods to evaluate agreement and validity.

This chapter reviews the design of food frequency questionnaires (FFQs) and their limitations in clinic situations, particularly those that focus on measurement of fat. This chapter also reviews the concept of method validation and the statistical methods used to assess agreement between measurement methods, together with their application and limitations in validating dietary methods. The constraints of undertaking validity testing of a new dietary method are also identified.

2.2 Reasons for measuring food intake

Reliable measures of food consumption are important for measuring intakes of nutrients and other food components for individual assessment. For relative assessment of group intake, qualitative measures of food consumption (eg more or less fat/meat) are often used where ranking of food or nutrient intake is the objective. However, the usual objective for measuring food consumption is to provide a quantitative estimate of the 'usual' food or nutrient intake of an individual or group over a specified period (Block et al. 1989). Table 2.1 outlines the major uses and applications for food consumption data.


Table 2.1 Uses and application of food consumption data

Use: National food and nutrition planning
Application: Checking adequacy of the food supply. Planning food production. Establishing the rationale for nutrient enrichment of the food supply.

Use: Assist in determining nutritional status
Application: Calculating average nutrient intakes in population groups. Comparing adequacy of group nutrient intake with population standards (eg Recommended Dietary Intakes (RDIs)). Combination with other parameters (eg biochemical, anthropometric, clinical) to assess nutritional status of individuals and groups.

Use: Assessing the links between diet, health and disease
Application: Comparing and contrasting indices of nutritional status with incidence and prevalence of disease in groups.

Use: Evaluating nutrition education, intervention or food fortification programs
Application: Providing feedback on the efficacy of programs in individuals and groups.

Use: Monitoring toxic substances in the food supply
Application: Estimating average contribution to the diet from unwanted substances (eg pesticides, industrial chemicals) and their risk to health for groups.

Adapted from Sabry (1988)

Apparent food consumption data are used to estimate changes in the national food supply and patterns of food availability rather than actual food consumption practices of the population (McLennan and Podger 1993). This information is also used to assist in planning international nutrition policy as well as national agricultural policies for the production and distribution of food (Pekkarinen 1970). Comparisons between countries can be used in ecological studies of diet-disease links. An example is that of Armstrong and Doll (1975) who demonstrated, from a compilation of studies, strong positive correlations between meat consumption and colon cancer incidence across several countries.

Dietary surveys conducted on large population groups such as National Dietary Surveys of Adults in Australia (Cashel et al. 1986), and National Health and Nutrition Examination Surveys (NHANES) in the US (Woteki et al. 1988) have provided a national estimate of the intakes of individual foods, the nutrient intakes from those foods and the total nutrient intake from food. Murphy and Sempos (1989) describe four major uses for national dietary data. These include:

• assessment and monitoring;

• regulatory policy;

• epidemiological research; and

• commercial applications.

In national dietary surveys of individuals, comparisons of the data with national dietary standards such as Recommended Dietary Intakes (RDIs) or Recommended Dietary Allowances (RDAs) are used to identify population groups at greatest risk of nutrient deficiencies.

It is possible with repeated surveys to monitor trends in nutrient and food intakes over time (Sempos et al. 1992). Such data are useful in evaluation of national nutrition programs, and in monitoring progress towards meeting national nutrition and health objectives (Nutbeam et al. 1993). For epidemiological research, 'usual' dietary intake data in combination with morbidity and mortality data, and/or biochemical and clinical data, are used to assess the links between nutrients, food factors and the risk of developing diseases.

Determining nutritional status of individuals is important in a clinical situation to assess nutrient adequacy and dietary balance and the need for, or effects of, dietary intervention (Table 2.1). Of interest to dietitians in clinical practice is collection of food intake data on individuals for the purpose of assessment of nutrients/food choice, prevention of risk-related dietary behaviours, monitoring of food choices and evaluation of dietary intervention programs.

2.3 Methods for measuring food consumption

2.3.1 Measuring food consumption

Methods for measuring food consumption can be categorised into two major types: current dietary intakes (eg food records); or past dietary intakes (eg retrospective short- or long-term recall of foods consumed). Methods for measuring current diet include weighed diet records (including computerised scales), estimated diet records, and duplicate diets. These methods record food intake at the time of consumption. Retrospective methods include the 24-hour recall, diet history and FFQs that measure intakes from periods of one day (Kristal et al. 1990c) to 10-15 years (Sobell et al. 1989). FFQs include a list of foods to measure 'usual' intakes over the previous six to 12 months (Larkin et al. 1989; Willett et al. 1985; Pietinen et al. 1988a, 1988b).

Techniques for measuring dietary intake have been reviewed extensively (Marr 1971; Block 1982; Krantzler et al. 1982; Roberge et al. 1984; Daniels 1984; Bingham et al. 1988, 53-106; Lee and Nieman 1993, 49-62). Further review of all these methods and techniques is outside the scope of this thesis but the main applications, strengths and limitations of these methods are summarised in Table 2.2. More detailed discussion of FFQs is found in Section 2.4, p. 11-25.


Table 2.2 Overview of the main applications, strengths and limitations of dietary survey methods

CURRENT FOOD CONSUMPTION

Data collection method: Food records, weighed method using scales or computerised approaches
Application: Assess food choices and eating habits mostly from 1 to 7 days.
Strengths: Weighed is considered an 'accurate' method [4,5]. Provides information about eating habits.
Limitations: Time consuming to conduct and analyse, require trained personnel [1,2]. Require literate and co-operative respondents. Distortion of food choice [2,11]. Poor compliance after 4 days [9,10].

Data collection method: Food records, estimated method using household measures
Application: Estimated (household measures) method acceptable for research because of better compliance than weighed method [7,8].
Strengths: Fairly valid up to 5 days [8]. Provides detailed information.
Limitations: Not representative of 'usual' diet, unless repeated [12,13]. Underestimates energy intake from 20-50% [6,11].

Data collection method: Duplicate food collections
Application: Uses duplicate meals/foods for direct chemical analysis.
Strengths: Most accurate method.
Limitations: High respondent burden. Expensive to analyse. Distorts food choice. Underreporting [14]. Other biases poorly documented.

FOOD CONSUMPTION IN THE PAST (Recall methods)

Data collection method: 24-hour recall
Application: Used mainly to rank food or nutrient intakes of groups of people. Can be used to rank food and nutrient intakes of individuals, if repeated at random [15].
Strengths: Minimal distortion of food intake. Low respondent burden. Good response rate. Low administration cost.
Limitations: Not representative of 'usual' intake of individuals unless repeated at random [15]. Memory/recall bias. Underestimates total energy intakes [19].

Data collection method: Food frequency questionnaires (FFQ) (see Section 2.4 for more detail)
Application: Mainly for ranking 'usual' food or nutrient intakes of groups of people in qualitative or semi-quantitative terms. As screening tools to detect, measure or rank specific nutrients or food intakes in groups or individuals. As an adjunct to educating, documenting and modifying dietary behaviour of individuals in clinics.
Strengths: Similar advantages to the 24-hour recall. Measures 'usual' diet and may be more representative of 'usual' intake than repeated diet records [7]. Quick to administer. Cost effective.
Limitations: Similar limitations to the 24-hour recall. Less accurate than record methods. List of foods may not fully represent foods consumed by respondents. Difficulty quantifying portion sizes. Overestimates at low energy intake and in long-term studies [20,21] and underestimates at high energy intakes [10,16,17,18].

Data collection method: Dietary history
Application: Combines a 24 hr diet recall and a FFQ. Assessment of 'usual' intakes of individuals mainly for clinical use.
Strengths: Comprehensive assessment of the 'usual' nutrient intake, including seasonal changes.
Limitations: Time-consuming to conduct. Dependent on a highly trained interviewer. Dependent on memory and cooperation of the respondent. Tend to overestimate nutrient intakes [8].

Sources: [1] Block 1989, [2] Pekkarinen 1970, [3] Willett et al. 1987, [4] Marr 1971, [5] Bingham 1985, [6] Rutishauser 1988, [7] Lee & Nieman 1993, 53-4, 58-9, [8] Bingham et al. 1988, 67, [9] Daniels 1984, [10] Gersovitz et al. 1978, [11] Stockley 1985, [12] Willett 1990, 63, [13] Block 1989, [14] Lee-Han et al. 1989, [15] Sempos et al. 1985, [16] Stunkard & Waxman 1981, [17] Carter et al. 1981, [18] Faggiano et al. 1992, [19] Krall & Dwyer 1987, [20] Sorenson et al. 1985, [21] Larkin et al. 1985.

Although there is no truly accurate measure of dietary intake in free-living people, a weighed food record is considered the 'gold' standard or most accurate and feasible method against which other or new methods for measuring dietary intake are compared or validated (Marr 1971; Bingham 1987).

For assessment of food choices and eating patterns in dietetic practice and the quality of the diet, either diet histories or three- or four-day food records (mostly using household measures) taken over consecutive days are traditionally used. Brekke et al. (1992), in a sample of 224 adults, reported a correlation of 0.91 for fat intake between a seven-day diet record and three consecutive days (Sunday, Monday and Tuesday). In another study of 40 women, Stuff and colleagues (1983) suggested that three days can provide as good an estimate of total fat intake in groups as that obtained from seven days of continuous recording.
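A comparison of this kind, between mean fat intake over three days and over seven days, can be summarised with a correlation coefficient. The sketch below assumes a Pearson coefficient and uses hypothetical intakes (g fat/d) for six subjects. The values are not taken from Brekke et al. (1992) or Stuff and colleagues (1983), and the function name is illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (n >= 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical mean fat intakes (g/d) for the same six subjects
three_day_mean = [62.0, 85.5, 71.0, 110.0, 54.5, 93.0]
seven_day_mean = [66.0, 80.0, 75.5, 104.0, 58.0, 97.5]

print(round(pearson_r(three_day_mean, seven_day_mean), 2))
```

A high correlation shows that the shorter record ranks subjects similarly to the longer one. It does not by itself show that the two records give the same absolute intake for an individual.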

The advantages of recall methods compared with methods measuring current food consumption include a short data collection time, good response rates, lower respondent burden, lower administration cost, and minimal distortion of food intake behaviour (Pekkarinen 1970; Marr 1971; Rutishauser 1988). Long-term FFQs that measure most nutrients in the 'usual' diet have been criticised for their lower accuracy compared with diet records and the 24-hour recall. Recall bias (Marr 1971; Dwyer and Krall 1988) and varying ability of respondents to quantify foods accurately (Guthrie 1984; Willett, 1990, 80; Fogelholm and Lahti-Koski 1991; Tjonneland et al. 1992) are the major reasons for this. Approaches to reducing this bias include the use of food photographs and providing subjects with options to change the portion sizes.

Most dietary survey methods, however, tend to underestimate food and nutrient intakes as seen in Table 2.2, limitations column, p.8. The FFQ has been reported to over-estimate intakes of people with low energy intakes and under-estimate intakes of those with high energy intakes (Gersovitz et al. 1978; Stunkard and Waxman 1981; Carter et al. 1981). Similar trends were observed using a 24-hour recall (Krall and Dwyer 1987). A more recent study of 103 Italian volunteers (51 men and 52 women), using a 24-hour recall, showed a similar tendency (Faggiano et al. 1992). This was related to errors in quantifying portion size or food quantities.

2.3.2 Conversion of foods into nutrients

Food composition tables or databases are used to translate food intake data into nutrient intake data, although in some circumstances, such as metabolic studies, they are not sufficiently accurate so direct analysis of duplicate foods or diets is necessary (Paul and Southgate 1988; Lee-Han et al. 1989).

2.3.2.1 Food composition data

The conversion of food into nutrients is the second ranked major source of error in dietary surveys (Kohlmeier 1992, 73) and is a reflection of the skills and knowledge of the researcher, the method of data collection and the available food composition tables.

In Australia, new food tables, Composition of Foods, Australia (COFA), were published in 1989 and 1990 for Australian foods (Cashel et al. 1989; Commonwealth Department of Community Services and Health 1990a, 1990b, 1990c, 1990d). The sampling procedures and compilation of these tables have been documented elsewhere (Cunningham 1990; Cashel 1990). A simplified issue of COFA called Nutritional Values of Australian Foods (NVAF) was released in 1989 and updated in 1992 (English and Lewis 1992). The first database version of COFA called NUTTAB, available for computer analysis, provided nutrient information on nearly 1300 Australian foods (Cashel and Lester 1987). The version currently available (NUTTAB95) contains 1805 foods, including all foods published in Volumes 1 to 7 of COFA (Lewis et al. 1995).

Converting foods into nutrients is not simply a matter of multiplying the amount of food eaten by the nutrient composition derived from the food composition database. Errors in the conversion are introduced by a lack of specificity in the description of foods or quantities consumed, together with insufficient information about preparation methods and edible portion weight. Food composition databases do not contain data for all of the large number of foods consumed, so inappropriate food substitutes, omission of foods and nutrients, and estimates are common practice.
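The conversion step and two of the error sources named above (omitted foods and substitute foods) can be illustrated with a minimal sketch. The composition values, food names and the function `fat_intake` are hypothetical and are not taken from NUTTAB or COFA; the sketch only shows how an uncoded food lowers the estimate and how a substitute changes it.

```python
# Hypothetical composition table: g of fat per 100 g edible portion.
# Values are illustrative only and are not taken from NUTTAB or COFA.
FAT_PER_100G = {
    "bread, white": 2.5,
    "milk, whole": 3.8,
    "cheese, cheddar": 33.0,
}


def fat_intake(records, table, substitutes=None):
    """Return (total fat in g, list of recorded foods that could not be coded)."""
    substitutes = substitutes or {}
    total = 0.0
    uncoded = []
    for food, grams in records:
        # Use the food itself if it is in the table, otherwise a nominated substitute.
        key = food if food in table else substitutes.get(food)
        if key is None or key not in table:
            uncoded.append(food)  # omitted food: the total becomes an underestimate
            continue
        total += grams * table[key] / 100.0
    return total, uncoded


# One day's record as (food description, grams eaten)
records = [("bread, white", 60), ("milk, whole", 250), ("cheese, brie", 30)]

# Without a substitute, the brie is omitted from the total.
total, uncoded = fat_intake(records, FAT_PER_100G)

# With cheddar standing in for brie, the total depends on how apt the substitute is.
total_sub, uncoded_sub = fat_intake(
    records, FAT_PER_100G, substitutes={"cheese, brie": "cheese, cheddar"}
)
```

Reporting the list of uncoded foods alongside the total is one simple way of making the effect of omissions visible when nutrient data are interpreted.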

Analytical techniques are constantly changing because of the development of new technologies so food tables may not be keeping up with these changes. For these reasons, as well as the large biological, agricultural and commercial variability in nutrient content of foods between and within countries of origin, nutrient values documented in national food tables are not appropriate for use in other countries (Greenfield and Southgate 1992).

Local food composition data of known reliability should be used for estimating nutrient composition or for comparing nutrient content between dietary methods. Even within the same country of origin, nutrient composition of raw food is variable because of natural, biological, geographical and agricultural factors (Cashel 1990). Cooking and processing methods, which may differ between countries, lead to wide differences in food composition between the databases of different countries (Cashel 1990).

In summary, conversion of food intake data into nutrient data can only provide estimates of nutrient intake of the actual foods consumed. However, although researchers usually report the food composition database and coding protocols used in a study, few acknowledge the potential effect of these errors (eg use of substitute foods, incomplete information on food, loss of accuracy using standard serves, use of different cooking methods, fat-modified foods), or explain how they were addressed, when interpreting nutrient data. For example, there are substantial gaps in the foods and nutrients included in the Australian tables of food composition as analysis of all foods is not possible and analysis of all nutrients has not been undertaken. Currently information on many commercial foods and miscellaneous items (eg seasonings and spices) is limited.

2.4 The design, uses and limitations of FFQs:

2.4.1 Format of FFQs

FFQs (often called list-based diet histories, food intake checklists or semi-quantitative food frequency questionnaires) incorporate a pre-determined food list with or without portion sizes plus a frequency response option for respondents to report how often (eg per day, week, month) each food was eaten. This is often supplemented by questions on other foods eaten by the respondent but not on the list, including questions about food preparation, supplement use and other food-related behaviours. In its early form, the FFQ was designed as a qualitative method, seeking information on the frequency of consumption of specific food items without specification of the actual serve or portion sizes usually consumed. More recent versions of this approach use portion sizes (Baghurst and Baghurst 1981; Willett et al. 1988; Pietinen et al. 1988a; Fogelholm and Lahti-Koski 1991). Some FFQs provide options for modifying foods and portion sizes (Baghurst and Baghurst 1981; Willett 1990, 79) although the effect of this on improving accuracy is still uncertain.
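As an illustration of the semi-quantitative format described above, the following sketch turns frequency and portion responses into an estimated daily fat intake. The frequency categories, food list, portion weights, fat values and portion multipliers are all assumptions made for the example and do not come from any of the questionnaires cited.

```python
# Times per day implied by each frequency response (hypothetical categories).
TIMES_PER_DAY = {
    "never": 0.0,
    "once a month": 1 / 30,
    "once a week": 1 / 7,
    "once a day": 1.0,
    "twice a day": 2.0,
}

# Hypothetical food list: food -> (standard portion in g, g fat per 100 g).
FOOD_LIST = {
    "butter": (10, 80.0),
    "whole milk": (250, 4.0),
    "beef sausage": (75, 20.0),
}

# Multipliers applied when the respondent modifies the standard portion.
PORTION_FACTOR = {"small": 0.5, "medium": 1.0, "large": 1.5}


def daily_fat(responses):
    """Estimate g fat/day from responses mapping food -> (frequency, portion size)."""
    total = 0.0
    for food, (frequency, size) in responses.items():
        portion_g, fat_per_100g = FOOD_LIST[food]
        grams_per_day = TIMES_PER_DAY[frequency] * portion_g * PORTION_FACTOR[size]
        total += grams_per_day * fat_per_100g / 100.0
    return total


responses = {
    "butter": ("twice a day", "medium"),
    "whole milk": ("once a day", "large"),
    "beef sausage": ("never", "medium"),
}
estimate = daily_fat(responses)  # butter 16 g + milk 15 g
```

A purely qualitative FFQ corresponds to fixing every portion at "medium", so that only the frequency responses distinguish one respondent from another.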

2.4.2 Uses of FFQs

Most FFQs are designed to measure 'usual' or habitual diet over six to 12 months (Willett et al. 1987; Pietinen et al. 1988a; 1988b; Block et al. 1989) for epidemiological purposes. Some researchers have used a FFQ technique to assess nutrients or foods consumed in the short term, eg in the past 24 hours (Kristal et al. 1994), in the previous seven days (Eck et al. 1991; Curtis et al. 1992) and in the preceding month (Welten et al. 1995) (see Table 2.4, p. 22).

FFQs have also been adapted for new and different uses. These include:

• as screening tools to detect, measure or rank specific nutrient or food intakes in groups or individuals, eg fat (Block et al. 1989; van Assema et al. 1992; Dobson et al. 1993; Feunekes et al. 1993), iron (Hertzler and McAnge 1986), and calcium (Nelson et al. 1988; Angus et al. 1989; Musgrave et al. 1989; Cummings et al. 1987); and

• as adjuncts to educating, documenting and modifying dietary behaviour in individuals in clinic or dietetic practice (Millar and Beard 1988; Kristal et al. 1990b; 1994).

2.4.2.1 Use of FFQs or food checklists in clinics

The literature on the use of food checklists or FFQs in clinics is scarce, although Block et al. (1992) suggest that FFQs which measure nutrient intakes of groups can also provide useful nutrient data for individuals. One of the major reasons for the paucity of literature is that validity testing of a new dietary intake method is now usually required prior to publication in most peer-reviewed journals. It is likely that many dietitians use either existing or new dietary assessment tools in their practice but are not validating those techniques because of the constraints of undertaking validity testing as described later (see Section 2.7.5, p. 39).

Food checklists, or FFQs adapted to measure current eating behaviour, are a useful alternative to the more traditional and time-consuming methods of assessing diets of individuals in clinics or dietetic practice. For individual assessment in these environments where current diet is of interest, short-term recall techniques, 24-hour recall or diet histories (Munro et al. 1995), and food records (Kristal et al. 1990c; Eck et al. 1991; Buzzard et al. 1994; Munro et al. 1995) are usually used.

The major objectives of dietary assessment in a clinical situation are to document an individual's current eating behaviour and to encourage a new behaviour that improves food choice. The reported specific purposes and uses of food checklists or FFQs for individuals in clinics include:

• to evaluate the impact of dietary intervention counselling or intervention programs (Strohmeyer et al. 1984; Vailas et al. 1987; Millar and Beard 1988);

• as an education resource, for self-help intervention (Musgrave et al. 1989; Kristal et al. 1990c);

• to document and monitor adherence to diet intervention (Vailas et al. 1987; Lee-Han et al. 1989; Curtis et al. 1992);

• as a rapid screening device (Strohmeyer et al. 1984);

The food composition approach, by itself, is not applicable when the measurement of a large number of nutrients in the diet is needed, as it does not reflect food consumption practices of populations, is subjective in its determination, and may be lengthy and time-consuming to complete. This approach is more applicable to FFQs that estimate one or two nutrients or when used in combination with other approaches.

b. FFQs derived from food consumption data

Another approach to compiling a food list is based on dietary surveys of large population studies. Willett (1990, 74) describes this as 'an open-ended approach' to questionnaire design that identifies foods that contribute to the total absolute intake of groups of people rather than individuals. One advantage of this method is that a comprehensive list of foods is obtained and important nutrient contributors are unlikely to be missed. However, a comprehensive list of foods is usually lengthy. Therefore, further collapsing of food items into groups is required and pilot studies are needed to assess the impact of this. The major investment of time needed to undertake this task often precludes the application of this method.

Using a population approach, Larkin et al. (1989) derived a FFQ from the US National Food Consumption Survey (NFCS), 1977-78 which described dietary intakes of a randomised sample of adults, 23 to 74 years, representative of the US population. Based on these data, only 116 foods were included on the food list. This was all that was needed to reflect the nutrient and energy content of the 'usual' diet, which was the purpose of the intended FFQ. The foods were then grouped into major food groups on the basis of nutrient content.
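One way to picture a food list derived from consumption data is to rank foods by their contribution to the group's total intake of a nutrient and keep the top contributors. The sketch below assumes a cumulative cut-off of 90% of total fat; the cut-off, the function `select_foods` and the survey totals are hypothetical and are not the selection rule of any study cited here.

```python
def select_foods(fat_by_food, coverage=0.9):
    """Return the top contributors, in rank order, needed to reach `coverage` of total fat."""
    total = sum(fat_by_food.values())
    if total <= 0:
        return []
    # Rank foods from largest to smallest contribution to the group total.
    ranked = sorted(fat_by_food.items(), key=lambda item: item[1], reverse=True)
    selected = []
    running = 0.0
    for food, grams in ranked:
        if running / total >= coverage:
            break  # the foods already selected reach the coverage target
        selected.append(food)
        running += grams
    return selected


# Hypothetical survey totals: g of fat contributed by each food across all respondents.
survey_fat = {
    "margarine": 400,
    "cheese": 250,
    "beef": 200,
    "milk": 100,
    "rice": 45,
    "apples": 5,
}
food_list = select_foods(survey_fat, coverage=0.9)
```

Because the ranking reflects the group's total intake, a food eaten heavily by only a few individuals may fall below the cut-off, which is consistent with the point that this approach describes groups rather than individuals.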

This approach is likely to be more accurate than using a food composition approach provided the study population has similar characteristics (ie age, sex ratio, cultural background, socio-economic status) to the original population. If using this approach, differences in demographic characteristics need to be compared and taken into consideration when evaluating the performance and suitability of the FFQ within the study population.

c. FFQs derived from statistical procedures linking nutrient composition with consumption practices

The third approach is to apply statistical procedures to dietary intake data and develop predictive equations between foods and nutrient contributions. Numerous researchers have used this technique to shorten lengthy questionnaires by determining foods most representative of nutrient intakes (Richard and Roberge 1986; Hankin et al. 1968; Willett et al. 1985). Hankin and colleagues (1968) used multiple regression analysis of dietary data to develop predictive equations for inclusion of food in a FFQ. Using seven-day weighed food records of 73 Japanese-American men, 35 to 55 years, 67 to 73 percent of the variance of the measured intakes of calories, fat, carbohydrate and sodium was accounted for by a regression equation. This level of variance was considered by these authors to be reasonably predictive of intakes of these nutrients. Willett et al. (1985) applied stepwise regression to the responses of 2000 women to a 100-item FFQ. For each individual, specific nutrients were correlated with the total nutrient intake and, in this way, identification of the foods that most discriminated between individuals was possible. Ultimately, foods that had the highest discriminatory capacity were selected and then collapsed into several nutritionally similar items to form the final FFQ containing 61 items (see Table 2.3, p. 19).

A large sample is necessary to apply these statistical procedures with confidence. Therefore, for most researchers with limited resources, this procedure is not possible. Although Hankin and colleagues explored these procedures as early as 1968 when FFQs originated as a research tool, they have been infrequently used even in the large epidemiological studies conducted in the 1980s and 1990s.
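The idea of selecting the foods that explain most of the between-person variance in a nutrient can be sketched as follows. This is forward stagewise selection, a simplified stand-in for the stepwise multiple regression reported in the studies above: each step fits one food to the variance left unexplained, whereas a full stepwise procedure refits all selected foods jointly. The data and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def simple_fit(x, y):
    """Least-squares slope and intercept for predicting y from a single x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return 0.0, my  # a food eaten identically by everyone explains nothing
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def forward_stagewise(foods, total, max_items):
    """Return [(food, cumulative proportion of variance explained), ...]."""
    grand_mean = mean(total)
    ss_total = sum((t - grand_mean) ** 2 for t in total)
    if ss_total == 0:
        return []
    residual = list(total)
    chosen = []
    for _ in range(max_items):
        best = None
        for name, x in foods.items():
            if any(name == picked for picked, _ in chosen):
                continue
            slope, intercept = simple_fit(x, residual)
            sse = sum((r - (intercept + slope * a)) ** 2 for a, r in zip(x, residual))
            if best is None or sse < best[1]:
                best = (name, sse, slope, intercept)
        if best is None:
            break  # no foods left to add
        name, sse, slope, intercept = best
        # Remove what this food explains, then carry the remainder to the next step.
        residual = [r - (intercept + slope * a) for a, r in zip(foods[name], residual)]
        chosen.append((name, 1 - sse / ss_total))
    return chosen


# Hypothetical intakes (serves/day) for five respondents and their total fat (g/day).
foods = {
    "butter": [1, 2, 3, 4, 5],
    "cheese": [2, 1, 2, 1, 2],
    "fruit": [3, 3, 1, 2, 2],
}
total_fat = [14, 19, 30, 35, 46]
steps = forward_stagewise(foods, total_fat, max_items=2)
```

With only five hypothetical respondents the result is purely illustrative; as noted above, a large sample is needed before such procedures can be applied with confidence.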

2.4.4.2 Number of food items

Table 2.3 (p. 19-21) and Table 2.4 (p. 22) show characteristics of long-term and short-term FFQs, respectively. The number of food items included in published FFQs varies considerably with up to 276 food items reported for studies estimating nutrient contents of 'usual' diets (see Table 2.3) compared with smaller numbers of 12-46 food items when only a few selected nutrients are of interest (Table 2.4). 'Usual' diet in this table applies to FFQs that measure dietary intakes retrospectively for more than one month. The list of foods for single or a few nutrients is a reflection of the distribution of the nutrient(s) in the food supply.

Historically, the earlier FFQs which were designed to measure a large number of nutrients used a long list of foods. However, by the 1990s, the number of food items used for this purpose decreased (see Table 2.3, p. 19-21). This was the outcome of applying a regression model to these data to determine foods most predictive of the nutrient/s of interest. For example, Pietinen et al. (1988a) reduced their original 276-item FFQ for measuring 'usual' diet to a 44-item FFQ which had similar validity to the original. This number of food items was similar to the FFQ of Willett and colleagues (1985) who also applied a regression model. However, shortened food lists resulted in a decreased validity in other studies. In contrast to the good validity of the Pietinen questionnaire, validity of the 61-item FFQ designed by Willett et al. (1985) was poor compared to the criterion method (four repeated one-week diet food records) and did not reflect sufficient detail. Subsequent questionnaires conducted by the Willett group showed an improved validity when the food list was expanded from 61 to 116 food items (Willett et al. 1987, 1988).

It is apparent that the minimum number of foods required to estimate nutrient intakes with the highest level of accuracy varies considerably but favours longer lists. This list is considerably reduced if only a few nutrients are measured. Byers et al. (1985), for example, using a regression analysis model, found that as few as 15-20 food items were all that were required to rank intakes of vitamin A and C, dietary fibre, protein, fat and energy. This condensed food list was originally based on 128 foods.

Other investigators have also concluded that a large fraction of the variability of 'usual' intakes in long-term studies of one or two nutrients can be explained by a small number of foods (Block 1989). Two studies, for example, that have specifically measured total fat intakes contained a small list of foods. Block et al. (1989) used 13 food items to rank women, aged 45 years and over, according to fat intake, while van Assema et al. (1992) used 25 items covering 12 groups of food products to rank individual Dutch adults aged 18-35 years, according to fat 'score' (see Table 2.4, p. 22). However, short-term studies that measure dietary intake to estimate fat intake in individuals over the previous seven days in intervention programs use higher numbers of food items, presumably to capture more detail about food choice. Kristal et al. (1990a) used 46 items in 96 US women, aged 45-59 years, whereas Curtis et al. (1992) used 78 food items in 29 patients referred to a lipid management clinic for lowering fat and cholesterol intake (see Table 2.4).

In summary, if the purpose of the FFQ is to rank fat intake, a short food list is all that is required. Conversely, if quantitative measures of fat intake are of interest, more foods are required on the food list. Therefore, the number of foods listed on the FFQ varies with the purpose of the FFQ (to rank or quantify), the nutrient or nutrients of interest and the variability in consumption of the nutrients of interest.
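When the purpose is to rank rather than quantify, agreement between a FFQ and a criterion method is commonly summarised by cross-classification, one of the tests listed in Tables 2.3 and 2.4. The sketch below assigns quartiles by rank and reports the proportions of subjects in the same quartile, within one quartile, and grossly misclassified (opposite extreme quartiles). The intake values are hypothetical, and ties are broken by original order, which is an assumption of this example.

```python
def quartiles(values):
    """Assign each value a quartile 0-3 by rank (ties broken by original order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    assigned = [0] * len(values)
    for rank, index in enumerate(order):
        assigned[index] = rank * 4 // len(values)
    return assigned


def cross_classify(ffq, record):
    """Return proportions (same quartile, same or adjacent quartile, grossly misclassified)."""
    q_ffq, q_record = quartiles(ffq), quartiles(record)
    n = len(ffq)
    gaps = [abs(a - b) for a, b in zip(q_ffq, q_record)]
    same = sum(gap == 0 for gap in gaps) / n
    same_or_adjacent = sum(gap <= 1 for gap in gaps) / n
    gross = sum(gap == 3 for gap in gaps) / n
    return same, same_or_adjacent, gross


# Hypothetical fat intakes (g/day) for eight subjects by each method.
ffq_fat = [40, 55, 60, 72, 80, 95, 110, 130]
record_fat = [45, 50, 75, 65, 70, 120, 90, 100]
same, same_or_adjacent, gross = cross_classify(ffq_fat, record_fat)
```

In this example half of the subjects fall in the same quartile and all fall within one quartile, even though individual estimates differ by up to 25 g/day, which illustrates why a tool adequate for ranking may still be inadequate for quantifying intake.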

2.4.4.3 Use of portion sizes

FFQs may or may not include an option for quantifying foods consumed. Questions still exist concerning the respondent's ability to provide, as well as the researcher's need to collect, the actual amount or portion size of the food consumed (Guthrie 1984; Hunter et al. 1988). Both standard portion sizes and individually estimated portion sizes have been used, with or without the aid of measuring devices for estimating portion sizes. A standard portion is one with a specified amount such as a glass of milk or a chicken leg. These standards are usually based on quantities that are easily recognised by consumers or are amounts based on average serve sizes of population groups (eg Pao et al. 1975). They are not the serve sizes consumed by individuals. Often portions are represented by small, medium and large amounts and then translated into quantifiable values based on characteristics of population consumption practices.

a. Does the inclusion of portion sizes increase accuracy?

The inclusion of portion sizes to enhance validity of the FFQ is still controversial. Some studies suggest that individually estimated portion sizes are of minor importance to validity measures compared with the use of standard portion sizes (Cummings et al. 1987; Tjonneland et al. 1992) while others have shown significant differences (Clapp et al. 1991; Block and Hartman 1989). In a comparison of the addition of portion size information using food models and frequency information alone, Samet et al. (1984) found only small differences when estimating intake of vitamin A. In another study investigating the accuracy of quantifying food portions, from 0 to 67% of respondents overestimated portion sizes by more than 51%, and from 0 to 25% of respondents underestimated by more than 51% (Guthrie 1984). Another validation study of a FFQ on 103 volunteers found that subjects who ate small portions tended to overestimate portion size while those who ate large portions underestimated portion size (Faggiano et al. 1992). Other researchers suggest that FFQs should be modified for portion size and frequency options to account for differences in food choice due to sex, age, culture and ethnicity (Cade 1988; Cade and Margetts 1988). However, few FFQs address these characteristics for differences in food consumption.

Table 2.3 Characteristics of validity studies of FFQs validated against other dietary measures (not biomarkers) that measured most nutrients in the 'usual' diet (for retrospective periods greater than one month) between 1968-1995

Source | Population | FFQ design | Measurement period of FFQ | Criterion method | Interval between methods | Statistical tests for testing agreement between methods | Correlation of nutrients measured | Correlation for fat

'Usual' diet for most available nutrients

Balogh et al. 1968 | Israeli men (S) (n=14) | portion options, food models, all nutrients measured | usual diet | 8 x 1-day FRs | week after completion of FFQ | correlation, regression | 0.69 PUFA - 0.94 alcohol | 0.94
Stuff et al. 1983 | Lactating women (S) (US) (n=40) | 105 food items, food models, options for serve sizes, all nutrients measured | usual diet | 1-week FR | simultaneous | comparison of means (t test, P-values), ICC, correlation, cross classification (% agreement using RDAs) | 0.00 iron - 0.24 calcium | 0.04
Mullen et al. 1984 | College students (S), (US) (n=31) | 278 items, 1~ food categories, all nutrients measured | 28 days | indirect observation (foods consumed were immediately cross checked after consumption in a cafeteria environment) | simultaneous | correlation (Pearson), regression, frequency distribution of errors | 0.66 all foods | na
Willett et al. 1985 | Nurses (R), (US) (n=173) | 61 items, 9 response categories, standard portions, mailed, focus on nutrients related to cancer, all nutrients measured | previous year | 4 x 1-week FR (weighed) | 1 month - 1 year | correlation, cross classification (% agreement using quintiles), means for each quintile | 0.36 vitamin A - 0.75 vitamin C | 0.48* (monthly), 0.53* (yearly)
Willett et al. 1987 | Males and females (R), (US) (n=54) | 116 items, expanded version of Willett (1985), all nutrients measured | 1 year | daily weighed FRs for 1 year | na but one year assumed | correlation (Pearson's) adjusted for age, sex and calorie intake, no P-values or hypothesis testing | 0.47 iron - 0.76 total fat; *0.21 linoleic acid - *0.68 vitamin A | 0.76, 0.51*
Pietinen et al. 1988a | Finnish men (R), 55-69 years (n=103) | 276 items, photos for portion estimation with a dietitian, all nutrients measured | usual diet | 12 x 2-day FRs | 1-6 months | means (SD), correlation (Pearson) (adjusted for energy), cross-classification (quintiles of nutrient intake distribution), kappa statistic | 0.51 vitamin A, 0.73 PUFA, 0.8 alcohol | 0.42, 0.47*
Willett et al. 1988 | Registered nurses (R), (US), 39-63 years (n=150) | 116 items, similar to 1985 study in format and design, all nutrients measured | 1 year (3-4 years earlier) | 4 x 1-week FRs | 3-4 years | mean (SD), correlation (interclass) (crude, adjusted for energy), cross classification of nutrient scores (quintiles) | 0.28 iron to 0.61 carbohydrate | 0.48, 0.45*
Larkin et al. 1989 | Men and women (R), (US), 24-51 years (n=228) | 116 food items, standard portions, all nutrients measured | usual diet | 4 x 3-day FRs (household measures), 4 x 24-hour diet recalls (3 months apart) | same or next day | comparison of means (t-tests, P values), correlation, mean differences (t-test), % difference (the mean difference divided by the record mean) | na |
Horwath and Worsley 1990 | Elderly Australian adults, 70+ years (n=200) (R) | 90-item mailed monthly qualitative questionnaire, some quantitative questions, all nutrients measured | usual diet | direct observation and itemisation of domestic food stores for 4 weeks | 2 weeks after FFQ | % agreement, kappa | 0.42 vegs, 0.85 variety of foods | not analysed
Fogelholm and Lahti-Koski 1991 | Finnish male athletes (n=84) (S) | 122 items, two versions, 1) reported portions, 2) standard portions, all nutrients measured | usual diet | 7-day FR | 1 week after FFQ | mean + 95% CI, cross classification (similar/opposite), correlation (Pearson) | micronutrients, thiamin, Fe, Zn, Ca, Mg, Vit C |
Munger et al. 1992 | Iowa women (S) (US), 55-69 years (n=44) | 126 items: modified form of Willett questionnaire (1987), administered by telephone, all nutrients measured | 1 year (usual diet) | 5 x 24-hour diet recalls by telephone | 6 months | mean (SD), median, regression, correlation (crude, energy adjusted), no P-values included, no hypothesis testing | 0.0 iron, 0.95 thiamin | 0.27, 0.62*
Callmer et al. 1993 | Malmo food study (nurses), Sweden (S) (n=206) | 250 items, standard portion, food photographs, all nutrients measured | usual diet | 3 x 6-day FRs (weighed) | 2 months | correlation (crude, adjusted for energy), % overestimation/underestimation compared with reference method | 0.28 energy - 0.80 ascorbic acid | 0.61* (male), 0.58* (female)
Callmer et al. 1993 | Malmo food study (nurses), Sweden (S) (n=206) | 130 items + 2 weeks FR measuring intake at each hot meal | usual diet | 3 x 6-day FRs (weighed) | 2 months | as above | 0.27 retinol - 0.81 tocopherol | 0.64* (male), 0.67* (female)
Nes et al. 1993 | Elderly women (S), 67-80 years (n=38) | 180 items, quantitative, grouped on meal patterns | usual diet | 2 x 3-day FR, 2 x 4-day FRs, weighed (total 14 days over 6 months) | 1 month after last FR, 3 months before FR (1st FFQ) | median (95% CI), median differences (P-values), correlation (crude, adjusted), cross classification (quintiles) | 0.31 vitamin C (1st FFQ), 0.79* carbohydrate | 0.66 (1st FFQ), 0.55 (2nd FFQ)
Thompson and Margetts 1993 | UK smokers (n=301, 122 men, 179 women), 40-59 years, randomly chosen from a clinic and volunteers from newspaper | 84 foods | usual diet | 10-day FR (weighed) | no information, FR after FFQ | mean (95% CI), mean differences, correlation (crude, energy adjusted), Bland-Altman technique | 0.18 (vit A) to 0.83 (alcohol) (men) (women similar) | 0.34 (men), 0.61* (men), 0.44 (women), 0.53* (men)
Feunekes et al. 1993 | men and women (n=191), (Netherlands) (R), 96 women, 95 men | 104 food items, semi-quantitative, portion serve with options to change | usual diet | diet history; biomarker (linoleic acid in erythrocyte membranes and sub-cutaneous adipose tissue, from a sub-sample) | 8 weeks | mean (SD) (t-test, P-values), differences between means (SD), correlation (Pearson), cross classification (quintiles), same/adjacent category, gross misclassification | linoleic acid, total fat, MUFA, SFA | 0.78 (fat), 0.75 (sat fat)
Goldbohm et al. 1994 | adults (Netherlands) (S) (n=109), 59 men, 50 women, 55-69 years | 150 items (mailed) | usual diet | 9-day FR (3 x 3 days, 4-5 months apart) | 3 months after last FR | mean (SD), ratio of means (±2SD), correlation (95% CI) (crude, energy adjusted), cross classification (quintiles), regression | 0.40 (vitamin A) to 0.86 (alcohol) | 0.61, 0.64*
Bingham et al. 1994 | UK women (n=160), (S), volunteers | 1. Oxford FFQ (usual diet), portions; 2. Cambridge FFQ (24 hr recall), no portions | differences between usual diet and other methods, seasonal differences | a. 24 hr recall (blank); b. 24 hr recall (structured); c. 16 d weighed FR; d. 7 day FFQ (no portions); e. 7 day checklist (portions) | variable | means (SD), comparison of mean (t tests, P-values), cross classification (quartiles), within and between person coefficient of variation, correlation between nutrient on 16 day FR, correlation, scatterplots for specific nutrients between 7 d and 16 d FR | with 16 d FRs: 0.21 beta-carotene in a. to 0.90 alcohol in d. and 1. | with 16 d FRs: 1. 0.52; 2. 0.35; a. 0.40; b. 0.4; d. 0.62; e. 0.52
Wheeler et al. 1995 | Aust adults (n=207), (R), 90 males, 117 females, 18-62 years | modified CSIRO FFQ, list-based (300 foods); FFQ (meal-based, 172 food items); portions, options for changing serve size | usual intake (3 months (meal-based), 6 months (list-based)) | 4 x 4-day FR over 1 year (weighed) | 1 year (list), 3 month (meal) | median, differences in median intake, cross classification (tertiles), correlation, linear regression, kappa statistic | eg 0.41 (cholesterol) to 0.78 (fibre) (males, meal based, excluding alcohol) | not reported

'Usual' diet to measure one or two nutrients

Block et al. 1989 | Women (S), (US), 45+ years (n=101) | 13 items, 13 food groups, portion serve (small, med, large), no option to change | usual intake over 1 year | 3 x 4-day FRs, 0, 6 and 12 months | 1 year | correlation (g fat, % fat) | total fat, saturated fat | 0.58 total fat
as above | | 53 items, portions included | usual intake | 2 x 4-day FR | within 10 days | | calcium | 0.73 winter (Ca), 0.83 winter (Ca)
Van Assema et al. 1992 | Dutch adults (R), 18-93 years (n=52, 29 females and 23 males) | 25 items (qualitative), by telephone, 12 food categories, response converted to a fat score | usual diet over 6 months | 7-day FRs, household measures | 2 x 1 week, 1 x 1 month | correlation, cross classification (gross misclassification), unweighted kappa statistic | total fat | 0.59 total fat
Pietinen et al. 1988b | Finnish men (R), 59-69 years (n=187) | 44 items, frequency only, standard portion sizes assumed, interviewed by dietitian + food use questionnaire | 1-6 months | 12 x 2-day FR distributed evenly over 6 months | 6 months | means (SD), % of food record measure, correlation (crude, energy adjusted), cross-classification (quintiles) | total fat, saturated and polyunsat fat, vitamin A, C, E, selenium, fibre | 0.42 total fat, 0.47* total fat, 0.65 PUFAs
Nelson et al. 1988 | Women 72-90 years (UK), (R) (n=28) | weekly quantitative and qualitative design, portion serves, food photographs, options for changing serve sizes, 9 food categories, number of foods on list (na) | 20 years | 5-day duplicate diet | na | means (SD), comparison of means (t test, P values), cross classification (tertiles), correlation | calcium | 0.76
as above | Women 65-74 years (n=32) | as above | usual diet over a week | 7-day weighed record | 4 weeks | | calcium | 0.69
Angus et al. 1989 | Australian women (S), 29-72 years (n=28) | 34 items, 11 food categories, portions, no option to change, self-administered | usual diet over 1 week | 4-day FR | within 1 week | mean (SEM), linear regression, correlation, cross classification (quartiles) | calcium | 0.81
as above | 29-72 years (n=28) | as above | as above | | at 1-12 months | | calcium | 0.78
Kemppainen et al. 1993 | Finnish adults (na), 51 women, 31 men, 16-71 years (sub-sample from a larger randomised survey n=872) | a) 21 food items (short questionnaire), household measures, portion size drawings, six frequency categories; b) qualitative fat index based on four questions | a) 3 months 'usual' diet; b) as above | a) 3 d FR (household measures) filled in after the FFQ; b) as above | a) 3 months; b) 3 month | mean (SD), mean difference (SD of the difference) (Bland Altman), correlation, cross classification (high, medium, low), multiple regression (age, BMI and fat intake with total serum cholesterol) | total fat, sat, monounsat and polyunsat fat | 0.79 total fat

* adjusted for energy, (R) = randomised sample, (S) = selected sample, US = United States of America, UK = United Kingdom, na = not available or not reported, FR = food record, SEM = standard error of the mean, SD = standard deviation, BMI = Body Mass Index, CI = confidence intervals

Table 2.4 Characteristics of criterion validity studies of short-term FFQs that measured 'usual' diet or one or more nutrients over periods of less than 4 weeks

Source | Population | FFQ design | Reference period of FFQ | Criterion method | Interval between methods | Nutrient(s) measured by the FFQ | Statistical tests reported for testing validity | Correlation of nutrients measured

Hertzler and McAnge 1986 | Para professionals (US) (S) (n=89) | 93 items, 4 food categories, standard portion, no option to change categories | previous 24 hours | 24-hour dietary recall | 0 | iron | frequency distribution of iron points, correlation (Pearson's) | 0.77
Krall and Dwyer 1987 | Males and females (US), (S), 16-69 years (n=19) | 39 items | 1 week to 4 weeks | 2 x 3-day FR | 1 month | energy nutrients + Fe, Ca, vit A & C in 'usual' diet | mean (SD), % of actual intake, t-test, P-values for differences in mean % | not analysed
Millar and Beard 1988 | Australian adults, men and women (S) (n=39) | 21 items, qualitative data, no portion control | 3 days | 24-hour urinary sodium excretion | 0 | sodium | comparison of scores, comparisons of means (t-test, P-values), correlation (Pearson) | 0.7
Kristal et al. 1990a | Women (US), (S) (n=96) (45-59 years) | 46 items, qualitative, no portions, by telephone | 1 week | 2 x 4-day FR plus another FFQ (modified from FFQ by Block et al. 1986) | 6 weeks | total fat, % calories from fat, sat fat, fibre | correlation, weighted kappa (quartile classification), % agreement, gross misclassification | 0.57 total fat
Kristal et al. 1990b | US women (S) (n=96) (45-59 years) | 28 items, broken into 5 scales of dietary fat use patterns, qualitative assessment | 4 weeks | 2 x 4-day FR plus another FFQ (modified from FFQ by Block et al. 1986) | 3 months | dietary patterns relating to selection of a low fat diet | factor analysis, correlation | not undertaken
Kristal et al. 1990c | US women (n=96) (S) (45-59 years) | 19 questions (qualitative) | 1 day | 24 hour recall | 3 months | fibre and fat | % agreement between methods, kappa statistic, 95% CI for differences between methods | not undertaken
Eck et al. 1991 | US college students (n=40) (S), 21 women, 20 men (17-42 years) | modification of FFQ by Willett et al. (1985, 1987, 1988) | 7 days | 3 x 24 hour recall | last day of study period | most available nutrients | comparisons of means (SD) (t-tests, P-values), correlation (Pearson), agreement between quartiles for each method | not undertaken
Curtis et al. 1992 | 26 women, 3 men by referral to a lipid lowering clinic (US), (S) | 78-item FFQ based on foods with >3 g total fat, >1.5 g sat fat, >20 mg cholesterol | 7 days | 4-day FR | FR administered immediately after the FFQ | dietary intake relating to fat, cholesterol and sat fat | comparison of means (SE) (t-tests, P-values), correlation | 0.53 total fat
Dobson et al. 1993 | Australian adults, 53 men, 71 women (S) | 17 food items, qualitative measure of dietary behaviour relating to fat | 1 week | 179 food item FFQ (Frequan) | at same time | fat, cholesterol, fatty acids | correlation + CI around r, cross-classification (quartiles) | 0.55* total fat, 0.67* sat fat, 0.44 P:S ratio
Welten et al. 1995 | (n=160), 74 male and 86 female, 27-29 year old (Netherlands) | quantitative dairy questionnaire | 4 weeks | diet history | 2 years | calcium | means (SD) and mean differences (CI), Bland-Altman plots, correlation (Pearson) (CI), % of gross misclassification, kappa statistic | 0.64 (CI 0.53, 0.72)

* adjusted for energy, (R) = randomised sample, (S) = selected sample, US = United States of America, UK = United Kingdom, P:S ratio = polyunsaturated fatty acids : saturated fatty acids, sat fat = saturated fat, PUFA = polyunsaturated fatty acids, na = not available, Bland and Altman (1986), SD = standard deviation, CI = confidence interval

b. Are some foods more accurately quantified than others?

Generally subjects do not describe portion serves accurately except for foods consumed in defined units such as a slice of bread, an egg, a carton of yoghurt (Guthrie 1984). People tend to have particular difficulty quantifying foods without such defined units, eg meat (Willett 1990, 80). Only one study has investigated and reported problems in accurately recalling meat and poultry intake (Fogelholm and Lahti-Koski 1991). These foods, therefore, need special attention to improve accuracy. This is crucial in individuals whose diets regularly include meat since small differences in meat serve and fat trimming may have large effects on the amount of fat consumed.

To overcome these difficulties and improve accuracy of data collection, many investigators use food models (Balogh et al. 1968; Stuff et al. 1983) or standard portions with photographs (Callmer et al. 1993; Nelson et al. 1988). Options have also been incorporated for altering portion sizes into FFQs (Block et al. 1986; Fogelholm and Lahti-Koski 1991). However, many questionnaires provide no option for describing or changing portion serves. For large epidemiological studies, providing this option imposes additional costs to coding and processing of data.

In clinical studies using FFQs, where current food intake of individuals is quantified, standard portion sizes (Herzler and McAnge 1986; Curtis et al. 1992) or estimated portion sizes (Krall and Dwyer 1987; Welten et al. 1995) have been used.

c. Does inclusion of 'standard' portion size introduce bias?

Use of standard portion sizes either guessed by the respondent or estimated with the aid of food models or photographs may also introduce systematic bias or error. It is hypothesised that systematic error would be much less likely to occur if subjects self-reported portion size information. This theory has been infrequently tested. A major confounding factor in testing this hypothesis is that few subjects alter portion serves on a FFQ when given the option (Baghurst 1993) so systematic bias would not necessarily be eliminated.

Willett (1990, 83) suggests that most of the variation in intake of any food is associated with frequency of consumption rather than actual amount consumed. Ideally, the effect of inclusion of portion size options on validity needs to be investigated in the development of any new FFQ using the target population, as well as applying food models or other methods that probe portion sizes.


2.4.4.4 Influence of grouping foods of similar nutrient content

Many of the earlier FFQs included long lists of individual food items to capture as many of the foods consumed as possible, as previously discussed (see Section 2.4.4.2, p. 16). More recently, foods of similar nutrient content have been combined on the basis of traditional food groups to reduce respondent burden (Serdula et al. 1992; Willett 1990, 76). For example, foods such as beef, pork and lamb are often combined as a single item. In the original FFQ designed by Willett et al. (1985), a number of fruits (peaches, apricots and plums) were grouped. As the questionnaire was designed to measure consumption of these foods over a one-year period, respondents had difficulty assessing each food separately and then describing a summary frequency. This explains the increase from 61 to 116 food items in subsequent questionnaires developed by Willett's group (see Section 2.4.4.2, p. 16).

Despite the speculation about loss of accuracy with food grouping, only one published study investigating fat intake has addressed this issue. In this study, two FFQs developed by the National Cancer Institute for ranking fat intake using a telephone survey were compared in two randomly assigned groups of around 450 respondents each (Serdula et al. 1992). One FFQ comprised 29 questions about separate high-fat foods, whereas the other grouped the same foods into 14 questions. The results, based on a comparison of ranking the responses in each FFQ, showed that the respondents to the grouped-foods questionnaire (ie 14 questions) were less likely to report consumption of high-fat foods in most of the food groups than those respondents replying to the separated-foods questionnaire (29 questions). Validity of responses was not assessed because there was no criterion measure for fat intake. The study assumed that the two groups did not differ in their intake of high-fat foods, but without an external or criterion measure this assumption could not be checked and may be false.

a. Order of foods presented in FFQs

The way foods are presented and ordered in FFQs appears to follow an ad hoc or subjective approach, often using traditional food groups as a basis. The validity of this practice has not been addressed in the literature and needs to be tested.

b. Consensus of opinion for grouping foods of similar nutrient content in a FFQ

From the limited information available, it is difficult to draw any conclusions about a loss of accuracy caused by grouping foods into a few or several categories or food types. On the other hand, if the purpose of a FFQ is to monitor change in food choice, long lists of foods are likely to be needed to assess these changes. Research from the cognitive area of psychology suggests that subjects often break down a question into several steps anyway (Bradburn et al. 1987). If, however, the purpose of the study is specifically testing the relationships between single nutrients and disease, the grouped approach is adequate according to Byers et al. (1985). The number of types of foods selected for grouping needs to be pre-tested in the target population for any new questionnaire. This would be particularly important for studies of nutrients or foods that have variable consumption patterns such as vitamin A and iron (Basiotis et al. 1987).

2.4.4.5 Number of response options

Depending on the study objectives, a FFQ usually contains from five to nine options for reporting the frequency of consumption of a particular food item. Too few options provide insufficient detail and result in a serious loss of information (Willett 1990, 77). Too many frequency options (ie beyond nine), according to classical measurement theory, result in small and insignificant gains in recovered information (Nunnally 1978, 121) as well as increasing respondent burden and potentially decreasing compliance. Generally, five to nine categories are optimal in dietary survey research (Worsley 1981).

2.5 Assessing agreement between two methods of measurement

The approaches used for measuring validity of any new method of measurement involve comparing two or more methods and determining the extent to which they agree. Numerous approaches are used to assess agreement, as shown in Table 2.5, p. 26. A number of these approaches are used and reported in the dietary, medical/clinical and epidemiological literature (see Section 2.5.1, p. 27-33) but no single approach has been adopted. Usually a combination of approaches is used. Many clinical studies relating to dietary intakes of individuals report agreement in the same way as the epidemiological studies where group intake is of interest. Agreement techniques often differ in clinical studies involving small samples. Recently, there has been a move away from conventional hypothesis testing in method comparison studies (ie reporting of significance results using P-values) and a move towards reporting confidence intervals and other statistical estimates.

In comparisons of dietary measures in both clinical/medical and epidemiological literature, the traditional techniques used are comparison of group means (for nutrients of interest), standard deviations, correlation coefficients, cross-classification and regression. More recently, these measures have been complemented by greater emphasis on reporting confidence intervals for means or differences between means, the kappa statistic, Bland-Altman techniques for studies of small sample sizes, and the sensitivity, specificity and predictive values of the new method.

Table 2.5 General approaches used for measuring agreement

General approach      Statistical test

Hypothesis testing    • comparison of means
                      • correlation (intraclass, interclass)
                      • regression

Estimations           • SD, SE, 95% CI
                      • differences between means, and SD of the differences
                      • cross-classification (% agreement, kappa statistic)
                      • sensitivity, specificity and predictive values

SD = standard deviation, SE = standard error, CI = confidence interval

Some of the traditional hypothesis testing procedures including comparison of means, correlation and regression analysis have been criticised (Bland and Altman 1986; Hebert and Miller 1991; Bellach 1993) on the grounds that they do not address the degree of agreement nor detect measurement bias of the new method. Garrow (1995), the editor of the European Journal of Clinical Nutrition, recommends that papers submitted to that journal, which describe a validation of a food questionnaire method, should also report the degree of agreement, the means and their standard deviations and the percentage of subjects classified into the same/opposite quartiles or tertiles by each method.

For clinical studies, Lee et al. (1989) have advocated three criteria for evaluating agreement between two methods of measurement:

• there should be no marked systematic bias of the variables in each method;

• there should be no statistically significant difference between mean values obtained by the two methods; and

• the lower limit of the 95% CI for the intra-class correlation for more than two repeated measures should be at least 0.75.

Bland and Altman (1990) claim that crude comparisons of means and correlation coefficients (r), especially an intra-class correlation based on the Lee et al. (1989) criteria, are unsatisfactory as measures of agreement. Bland and Altman (1986) and others (Burema et al. 1988, 74) recommend that, for satisfactory agreement, the mean difference between the paired observations of the two methods should not differ significantly from zero. This is an extension of the Lee et al. (1989) criteria and accounts for measurement error or bias.

2.5.1 The use and limitations of approaches and statistical tests for measuring agreement in method comparison studies

This section describes the application and limitations of the approaches for measuring agreement in method comparison studies.

2.5.1.1 Hypothesis testing and use of P-values

Hypothesis testing, based on the conventional null hypothesis approach (see Glossary, p. xvii) (ie no significant difference between variables), and the generation of probability (or P) values, despite criticisms (see Section 2.5.1.3, p. 28), remains an important and accepted statistical approach in method comparison studies. Generally, P-values are based on two-sided tests rather than one-sided tests (Altman 1991, 171). When P is below the conventional cut-off (ie 0.05), the result is statistically significant and the null hypothesis is rejected. When P is above 0.05, the null hypothesis is not rejected. However, a significant result may not be a real effect. Similarly, a non-significant result does not necessarily indicate 'no' effect.

While P values are informative, if meaningful biologically, they are limited to determining whether to reject or accept a null hypothesis. Hypothesis testing does not address the magnitude of the effect of interest (eg differences between two methods).

2.5.1.2 Standard deviation, standard error and confidence intervals

The standard deviation (SD), standard error (SE) and confidence intervals (CI) are estimates of data variability (see Glossary p. xvi). These values are useful to distinguish between hypothesis testing (statistical significance) and scientific importance (Curran-Everett et al. 1998). Numerous journals, particularly in the medical area, are encouraging contributors to present CIs in papers where appropriate (Gardner and Altman 1989, 4).

Confidence intervals provide the same statistical information as a hypothesis test but focus attention on the magnitude and uncertainty (or lack of precision) of the variables (Curran-Everett et al. 1998). The confidence interval defines the range where the true value is likely to be. The 95% CI for a statistic (eg mean, difference between means, correlation), defined by the upper and lower limits of the range, is interpreted as a range of values which contains the true value with a probability of 95% (Altman 1991, 163) (see Glossary p. xvi). The sample size affects the size of the standard error, which in turn affects the width of the CI (Gardner and Altman 1989, 16). When small samples are used the CI is wide, which may make it difficult to draw any meaningful conclusions (Bland and Altman 1990). A larger sample size gives a narrower CI.
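The effect of sample size on the width of the CI can be illustrated with a minimal sketch. The function name, the default critical value and the fat intakes below are hypothetical and are not taken from the studies reviewed; the value 1.96 is the large-sample normal approximation, and for small samples the t critical value with n - 1 degrees of freedom would be substituted.

```python
import math
import statistics


def mean_ci(values, crit=1.96):
    """Return (mean, lower, upper) for a confidence interval around the mean.

    crit=1.96 is the large-sample normal value for a 95% CI (an assumption);
    for small samples pass the t critical value with n - 1 degrees of freedom.
    """
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return mean, mean - crit * se, mean + crit * se


# Hypothetical fat intakes (g/d); the larger sample repeats the same values.
small = [62.0, 75.0, 81.0, 58.0, 90.0, 70.0]
large = small * 4

_, lo_s, hi_s = mean_ci(small)
_, lo_l, hi_l = mean_ci(large)
print("CI width, n = 6: ", round(hi_s - lo_s, 1))
print("CI width, n = 24:", round(hi_l - lo_l, 1))
```

The second interval is narrower only because the standard error shrinks as n grows; the spread of the individual values is essentially unchanged.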

Although the 95% CI values are statistically 'acceptable', they also need to be acceptable in clinical or scientific terms. For generalisation to a wider population, this interpretation of 'acceptability' of a new method depends on the assumption that the subjects tested are representative of that population group (Altman 1991, 164).

For comparing differences between means in method comparison studies, such as the validity testing of a new dietary method, two approaches are recommended for calculating and interpreting CIs (Curran-Everett et al. 1998):

(1) calculation of 95% CIs for each population mean; and

(2) calculation of 95% CIs for the differences in means between the two methods.

If the two population CIs fail to overlap in approach (1), the researcher would conclude that the population means differ. If the CI excludes zero in approach (2), this supports the interpretation that the two methods differ at the 0.05 level. The second approach imparts more confidence (at the 95% level) that an actual difference exists between the methods (Curran-Everett et al. 1998).
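Approach (2) can be sketched as follows for paired observations, where each subject is measured by both methods. The data and names are hypothetical, and the default critical value of 1.96 is a large-sample assumption; with only eight subjects, as here, the t critical value for 7 degrees of freedom would normally be passed instead.

```python
import math
import statistics


def paired_difference_ci(method_a, method_b, crit=1.96):
    """Return (mean difference, lower, upper) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff - crit * se, mean_diff + crit * se


# Hypothetical fat intakes (g/d) for the same eight subjects by two methods.
new_method = [70.0, 82.0, 65.0, 91.0, 77.0, 60.0, 88.0, 73.0]
reference = [68.0, 85.0, 61.0, 95.0, 75.0, 63.0, 84.0, 74.0]

mean_diff, lower, upper = paired_difference_ci(new_method, reference)
includes_zero = lower <= 0.0 <= upper  # True: no evidence the methods differ
print(round(mean_diff, 3), round(lower, 2), round(upper, 2), includes_zero)
```

An interval that includes zero says only that the group means do not differ detectably; it gives no information about how far apart the two methods are for any one subject.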

2.5.1.3 Comparison of means

Crude comparison of means by a hypothesis test, often using a dependent sample t test, is frequently used to assess agreement between two measures. Comparison of mean nutrient intakes between methods discriminates between groups but gives no information about individual differences, so it has limited value for interpreting individual differences in nutrient intake. This type of comparison was, and still is, often reported in dietary method agreement studies despite its limitations (Millar and Beard 1988; Larkin et al. 1989; Eck et al. 1991; Bingham et al. 1994).

2.5.1.4 Correlation

The most common statistical test used for assessing agreement between methods is the interclass correlation coefficient (Pearson's Product Moment correlation (r) or Spearman's Rho correlation) (Altman 1991, 284). The intraclass correlation (ICC) is calculated in a different way to the interclass correlation and was devised to deal with the relationship between variables within classes. The intraclass correlation is used, for example, as an index of correlation between a number of repeated measures performed by the same method (Bland and Altman 1990; Lashinger 1992) and has been used in reproducibility studies of dietary measures for this purpose (Pietinen et al. 1988a, 1988b; Wheeler et al. 1994).

Correlation coefficients estimate the strength of the association between the methods, with a correlation coefficient of zero indicating no association between the variables. The closer the correlation coefficient comes to either +1 or -1, the stronger the association (Kuzma 1992, 200). A correlation of around 0.5 is considered a moderate association while values greater than 0.7 are considered a good to excellent association (Kuzma 1992, 202). Correlation alone, however, provides no indication of the significance of the 'strength' of the relationship. Further significance testing (ie using a t test) is required. There are some situations where low correlation values do not necessarily indicate a weak relationship, and high correlation values do not necessarily indicate a strong relationship (Diekhoff 1992, 220).
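As an illustration of the two steps described above, the sketch below calculates Pearson's r and the t statistic commonly used to test the null hypothesis of zero correlation, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The function names and data are hypothetical; the resulting t would be compared with a tabulated critical value, which is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic for H0: true correlation is zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))


# Hypothetical paired scores from two methods.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
t = t_for_r(r, len(x))
print(round(r, 3), round(t, 3))
```

With only five pairs, a correlation that looks 'good' by the conventions quoted above yields a modest t statistic, which is one reason the sample size should always be reported alongside r.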

Limitations of the use of correlation

Both correlation and regression have been criticised as inappropriate for assessing the validity and reproducibility of dietary measurement methods (Hebert and Miller 1991; Delcourt et al. 1994). Correlation measures the strength of the association between two methods, not how closely they agree (Bland and Altman 1986; Altman 1991, 401). Its misuse in other method comparison studies, particularly of clinical measures, has been well documented (Bland and Altman 1986; Karras 1997). It is erroneous to assume that two measures of the same dietary intake would not be related, so a significant correlation is to be expected. Correlation is affected by factors unique to each study, such as sample size and the distribution of the dietary variable in the subjects (Delcourt et al. 1994), and does not detect systematic bias in either method (Bland and Altman 1990).
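The distinction between association and agreement can be shown with a small constructed example. In the hypothetical data below, one method reads exactly 20 g/d higher than the other for every subject: the correlation is perfect, yet the methods never agree.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical fat intakes (g/d); method B has a constant bias of +20 g/d.
method_a = [50.0, 60.0, 70.0, 80.0, 90.0]
method_b = [a + 20.0 for a in method_a]

r = pearson_r(method_a, method_b)
mean_difference = statistics.mean(b - a for a, b in zip(method_a, method_b))
print("r =", r, "mean difference =", mean_difference)
```

A correlation coefficient alone would suggest the two methods are equivalent; only the mean difference exposes the systematic bias.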

Most statisticians suggest that if correlation coefficients are presented in method comparison studies, the 95% confidence interval around r should be included (Hebert and Miller 1991; Gardner and Altman 1989, 34). However, the CI around r is also affected by sample size, being wide with small samples and narrow with large samples, as previously described (see Section 2.5.1.2, p. 27). This is another reason why correlation based on a test of significance may not be the preferred method in method comparison studies.
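One common way of calculating a CI around r is Fisher's z transformation, sketched below. The text above does not name a particular method, so the choice of Fisher's z, the function name and the example values (r = 0.7 with n = 20 and n = 200) are assumptions made for illustration only.

```python
import math


def r_confidence_interval(r, n, crit=1.96):
    """Approximate CI for a correlation using Fisher's z transformation.

    crit=1.96 gives an approximate 95% interval; requires n > 3.
    """
    z = math.atanh(r)              # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# The same hypothetical correlation observed in a small and a large sample.
lo_20, hi_20 = r_confidence_interval(0.7, 20)
lo_200, hi_200 = r_confidence_interval(0.7, 200)
print("n = 20: ", round(lo_20, 2), round(hi_20, 2))
print("n = 200:", round(lo_200, 2), round(hi_200, 2))
```

The interval is asymmetric around r because the back-transformation compresses values near +1 and -1.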

2.5.1.5 Regression analysis

Another method used for method comparison studies is regression analysis. The standard method used in method comparison studies is the least squares regression model (see Glossary, p. xiv). It is more useful for testing association than correlation, which reduces a set of data to a single number. In regression analysis, the strength of the relationship is clear and the level of uncertainty can be determined using confidence intervals and prediction intervals. The prediction interval reflects individual variability about the fitted regression line and is independent of sample size (Altman 1991, 307).

The relationship between two measures can be described as a simple regression equation, that is, using the FFQ (x) to predict true intake (y) as determined by the gold standard (Willett 1990, 120). In the absence of any measurement error, two dietary intake methods could be interchangeable if they show a linear regression coefficient that is not statistically different from 1.0 (Lee et al. 1983). To apply linear regression analysis, no systematic errors in the data should be evident (Slinker and Glantz 1990, 16) so these and other criteria necessary to apply regression analysis should be tested before interpretation of the results.
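The check on the regression coefficient described above can be sketched with ordinary least squares. The data and names are hypothetical; the critical value 3.182 is the two-sided 95% t value for 3 degrees of freedom (n - 2 with five subjects) and would need to be changed for other sample sizes.

```python
import math


def slope_with_ci(x, y, crit):
    """Least squares fit of y on x; returns slope, intercept and CI for the slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares around the fitted line.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, slope - crit * se_slope, slope + crit * se_slope


# Hypothetical fat intakes (g/d): FFQ (x) and reference method (y).
ffq = [50.0, 60.0, 70.0, 80.0, 90.0]
reference = [54.0, 62.0, 71.0, 77.0, 91.0]

slope, intercept, lower, upper = slope_with_ci(ffq, reference, crit=3.182)
slope_compatible_with_one = lower <= 1.0 <= upper
print(round(slope, 2), round(intercept, 1), slope_compatible_with_one)
```

A slope whose CI includes 1.0 is a necessary but not sufficient condition for interchangeability, since a non-zero intercept would still indicate a constant bias.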

2.5.1.6 Differences in means between each method

Bland and Altman (1986) and others (Lee et al. 1989) have advocated an approach that examines measurement differences between methods. With the measurement difference approach, systematic bias and the measurement error (or lack of agreement of individual scores) can be detected and interpreted (Burema et al. 1988). Such an approach allows separate analysis of systematic and random error (Irwig and Simpson 1989) and is frequently used in comparisons of clinical measures (eg different instruments for measuring blood pressure). It is well suited to dietary method comparisons involving small samples or only a few nutrients. The technique, often referred to as the Bland-Altman technique after its originators, involves evaluating a scatterplot of the difference between the two methods for each subject against the mean of the two methods for that subject. The mean difference, the standard deviation of the differences and the CI around the mean difference between the two methods are also calculated (Bland and Altman 1986; Lee et al. 1989).

Provided the differences in individual measures lie within the mean difference ± two standard deviations of the differences and are not clinically important, the two measurement methods could be used interchangeably (Bland and Altman 1986). Willett (1990, 121), however, suggests that the Bland-Altman technique is cumbersome when evaluating methods that measure many nutrients and thus is not appropriate for large-scale multi-nutrient epidemiological studies.
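The quantities behind a Bland-Altman plot can be sketched as follows. The data and names are hypothetical, and the limits are taken as the mean difference ± two standard deviations as described above; plotting is omitted, but the returned points are the (mean, difference) pairs that would be plotted.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return plot points, bias and limits of agreement for paired data."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2.0 for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)      # systematic difference between methods
    sd = statistics.stdev(diffs)       # spread of individual differences
    return {
        "points": list(zip(means, diffs)),
        "bias": bias,
        "lower_limit": bias - 2.0 * sd,
        "upper_limit": bias + 2.0 * sd,
    }


# Hypothetical fat intakes (g/d) for the same eight subjects by two methods.
new_method = [70.0, 82.0, 65.0, 91.0, 77.0, 60.0, 88.0, 73.0]
reference = [68.0, 85.0, 61.0, 95.0, 75.0, 63.0, 84.0, 74.0]

result = bland_altman(new_method, reference)
all_within_limits = all(
    result["lower_limit"] <= d <= result["upper_limit"]
    for _, d in result["points"]
)
print(round(result["bias"], 3), all_within_limits)
```

Whether limits of this width are acceptable is a clinical judgement rather than a statistical one, which is the point made in the paragraph above.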

2.5.1.7 Classification techniques

Percent agreement between the variables in each method

Another way to compare measures between two methods is to categorise or rank the responses of subjects in the two methods and compare the differences in ranking. The variation between the ranked categories can then be compared and the percent of subjects who were not similarly categorised by the methods can be determined. Using this approach in dietary validation studies, subjects are grouped into quintiles (or quartiles, tertiles) on the basis of the new method and compared with a similar categorisation on the basis of the criterion method that represents the 'true' intake. As Willett (1990, 121) suggests, the main advantage of percent agreement is that it defines the actual, quantitative differences in diet that correspond to the relative categories defined by a food frequency questionnaire (FFQ). These classifications are also associated with the measurement error of the questionnaire. There are no definitive values in the nutrition literature that describe the interpretation of these agreement measures between dietary survey methods. Percentage agreement measures are used in addition to other agreement measures to describe the level of agreement.
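A minimal sketch of quartile cross-classification is given below. The data and function names are hypothetical, quartiles are assigned by rank within each method, and tied values are not given any special treatment, which is a simplification.

```python
def quartiles(values):
    """Assign each value a quartile (0 = lowest, 3 = highest) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank * 4 // len(values)
    return result


def cross_classify(new_method, criterion):
    """Percent of subjects in the same, adjacent and opposite quartiles."""
    pairs = list(zip(quartiles(new_method), quartiles(criterion)))
    n = len(pairs)
    return {
        "same": 100.0 * sum(a == b for a, b in pairs) / n,
        "adjacent": 100.0 * sum(abs(a - b) == 1 for a, b in pairs) / n,
        "opposite": 100.0 * sum(abs(a - b) == 3 for a, b in pairs) / n,
    }


# Hypothetical fat intakes (g/d) for the same eight subjects by two methods.
new_method = [70.0, 82.0, 65.0, 91.0, 77.0, 60.0, 88.0, 73.0]
criterion = [68.0, 85.0, 61.0, 95.0, 75.0, 63.0, 84.0, 74.0]

agreement = cross_classify(new_method, criterion)
print(agreement)
```

Subjects classified two quartiles apart are counted in none of the three categories here; a fuller table would report them separately.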

Kappa statistic

The kappa statistic, conceived by Cohen in 1960, is increasingly seen in studies of the reproducibility and validity of dietary assessment methods (Pietinen et al. 1988a, Horwath and Worsley 1990; Wheeler et al. 1995). It was designed as an improvement on the simpler percent agreement described above because it discounts the proportion of agreement that is expected by chance alone (Maclure and Willett 1987; Fleiss 1981, 147).

The kappa statistic is used where there are more than two categories for comparison (for example, tertiles or quartiles). The kappa statistic and the intraclass correlation are related. Similar to the correlation coefficient, the kappa statistic has a maximum of +1 when agreement is perfect and a value of zero when agreement is no better than that expected by chance. However, the kappa statistic does not account for the degree of disagreement, as all disagreement is treated equally. The weighted kappa, in contrast, allows different weights to be assigned to the frequencies in the cells of the table that are in disagreement (ie those off the diagonal line of exact agreement). Those cells adjacent to the diagonal line get a weight of 1, whereas those cells further away from the diagonal line have a higher weighting (Altman 1991, 407). Fleiss (1981) recommends that values of kappa exceeding 0.7 represent excellent agreement, values between 0.4 and 0.7, fair to good agreement, and values less than 0.4, poor agreement. In contrast, Altman (1991, 404) considers kappa values between 0.21 and 0.40 'fair' agreement.

The interpretation of the kappa statistic is confounded by the varying values for agreement proposed by different researchers (Willett 1990, 121). Clearly, there is no accepted standard for the interpretation of this statistic in the nutrition or epidemiological literature. Although routinely reported in clinical studies, it is frequently misused and has been criticised when assessing agreement (Maclure and Willett 1987).
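The calculation of kappa and weighted kappa from a cross-classification table can be sketched as follows. The table is hypothetical, and the weighted version uses linear disagreement weights (1 for adjacent cells, 2 for cells two categories apart), which is one weighting scheme consistent with the description above rather than the only possible choice.

```python
def kappa(table, weighted=False):
    """Cohen's kappa, or weighted kappa with linear disagreement weights.

    table[i][j] is the number of subjects placed in category i by one
    method and category j by the other.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighted:
            return abs(i - j)          # further from the diagonal, heavier
        return 0 if i == j else 1      # all disagreement treated equally

    observed = sum(
        weight(i, j) * table[i][j] for i in range(k) for j in range(k)
    ) / n
    expected = sum(
        weight(i, j) * row_totals[i] * col_totals[j]
        for i in range(k) for j in range(k)
    ) / n ** 2
    return 1.0 - observed / expected


# Hypothetical classification of 30 subjects into tertiles by two methods.
tertiles = [
    [8, 2, 0],
    [2, 6, 2],
    [0, 2, 8],
]

unweighted = kappa(tertiles)
linear_weighted = kappa(tertiles, weighted=True)
print(round(unweighted, 2), round(linear_weighted, 2))
```

In this table every disagreement is between adjacent tertiles, so the weighted kappa is higher than the unweighted value; both would fall in the 'fair to good' band proposed by Fleiss (1981).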

2.5.1.8 Specificity, sensitivity and predictive values

The measurement error of a new method of measurement can also be described in terms of its specificity, sensitivity and predictive values (see Glossary p. xviii).

Sensitivity is measured in the group of subjects who test truly positive by the reference method (the method considered the most accurate measurement of the selected variable) and is defined as the proportion of those subjects who also test positive using the new (screening) method. An operational definition of sensitivity relevant to a new dietary intake method to detect high fat intakes, for example, is how accurately it identifies 'true high fat' consumers. A high error in sensitivity (1 - sensitivity) would result in the new method failing to identify a number of individuals who consume high fat diets.

Specificity is measured in the group of subjects who test negative by the reference method and is defined as the proportion of those subjects who also have negative screening results by the new method. An operational definition of specificity relevant to a new dietary method for detecting high fat intakes is how well the new method identifies individuals who are 'true low fat' consumers (ie those below the cut-off for the higher fat consumers). A high error in specificity (1 - specificity) would mean that individuals with fat intakes below the cut-off level for high fat intake by the reference method would be incorrectly categorised as high fat consumers by the new method.

Values for specificity and sensitivity range from 0 to 1. A value of one for sensitivity indicates that the new method performs perfectly or in exact agreement with the reference method (Taylor et al. 1998).

Predictive values (PVs) provide a direct assessment of the usefulness of a new method in practice (Altman 1991, 411). These values describe the proportion of subjects with positive test results who are correctly classified by the new method (positive predictive value (PPV)), and the proportion of subjects with negative test results who are correctly classified as negative by the new method (negative predictive value (NPV)) (Altman 1991, 411).
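The four measures can be calculated from a two-by-two table of the new method against the reference method, as in the sketch below. The counts and names are hypothetical; 'positive' is taken to mean a fat intake at or above the chosen cut-off.

```python
def screening_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and predictive values from a 2 x 2 table.

    tp: positive by both methods          fp: positive by new method only
    fn: positive by reference only        tn: negative by both methods
    """
    return {
        "sensitivity": tp / (tp + fn),   # of true positives, share detected
        "specificity": tn / (tn + fp),   # of true negatives, share cleared
        "ppv": tp / (tp + fp),           # of new-method positives, share correct
        "npv": tn / (tn + fn),           # of new-method negatives, share correct
    }


# Hypothetical counts for 100 subjects.
measures = screening_measures(tp=30, fp=5, fn=10, tn=55)
for name, value in measures.items():
    print(name, round(value, 3))
```

Sensitivity and specificity are calculated within the groups defined by the reference method, whereas the predictive values are calculated within the groups defined by the new method, which is why the predictive values depend on how common high intakes are in the study group.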

Sensitivity, specificity and PVs are routinely calculated for medical diagnostic tests but have rarely, and only recently, been applied to dietary validity studies. They have been used for determining the accuracy of: the Body Mass Index (BMI) as a screening tool for excess body fat (Lazarus et al. 1996; Taylor et al. 1998); a FFQ for screening low calcium and fibre intakes in 183 US adults (Ritenbaugh et al. 1998); a FFQ for screening of low calcium intakes in 66 NZ children aged 3-7 years (Taylor and Goulding 1998); and a FFQ to assess dietary calcium intakes in 58 NZ women (Wilson and Horwath 1996). Clearly, high positive and negative PVs are crucial for the correct diagnosis of a disease or correct detection of the variable measured. However, to date, there are no known values for these measures that would be considered acceptable for a new method or FFQ that measures nutrient intakes. Generally, researchers present these measures in a comparative manner.

2.5.1.9 Determining cut-offs for nutrients that are used for calculating sensitivity and specificity of a new dietary assessment method

In medical diagnostic tests where the test is used to verify a diagnosis, the choice of the best cut-off (ie the point where the sum of sensitivity and specificity is maximised) is often determined by receiver operating characteristic (ROC) curves (Altman 1991, 418). A ROC curve is generated by plotting the sensitivity versus 1 - specificity for each possible cut-off and then joining the points. These plots determine the best possible cut-off based on the numerical information entered. The area under the ROC curve (AUC) is the probability that a test will correctly identify the disease or the true measure of the nutrient intake. An AUC of 0.5 means that the new measure is no better than chance (Lazarus et al. 1996).
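The construction of ROC points, the choice of cut-off and the AUC can be sketched as follows. The intakes, the reference classification and the function names are hypothetical. The AUC is calculated here as the proportion of positive-negative pairs in which the positive subject has the higher value (ties counting one half), which equals the area under the empirical ROC curve.

```python
def roc_points(values, is_positive):
    """Return (cut-off, sensitivity, 1 - specificity) for each candidate cut-off.

    A subject is called positive by the new method when value >= cut-off.
    """
    points = []
    for cut in sorted(set(values)):
        tp = sum(v >= cut and p for v, p in zip(values, is_positive))
        fn = sum(v < cut and p for v, p in zip(values, is_positive))
        fp = sum(v >= cut and not p for v, p in zip(values, is_positive))
        tn = sum(v < cut and not p for v, p in zip(values, is_positive))
        points.append((cut, tp / (tp + fn), 1.0 - tn / (tn + fp)))
    return points


def best_cutoff(points):
    """Cut-off maximising sensitivity + specificity."""
    return max(points, key=lambda p: p[1] + (1.0 - p[2]))[0]


def auc(values, is_positive):
    """Area under the ROC curve by pairwise comparison."""
    pos = [v for v, p in zip(values, is_positive) if p]
    neg = [v for v, p in zip(values, is_positive) if not p]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical fat intakes (g/d) from a new method, with the reference
# method's classification of each subject as a high fat consumer or not.
intakes = [45.0, 55.0, 60.0, 66.0, 72.0, 75.0, 80.0, 88.0, 95.0]
high_fat = [False, False, False, True, False, True, True, True, True]

points = roc_points(intakes, high_fat)
cutoff = best_cutoff(points)
area = auc(intakes, high_fat)
print(cutoff, round(area, 2))
```

Where two cut-offs give the same sum of sensitivity and specificity, this sketch simply returns the lower one; in practice the choice would depend on whether missed cases or false alarms matter more.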

In dietary surveys, the RDIs for nutrients are often used to determine the reference value for determining sensitivity and specificity of a new dietary survey method. Cut-off values are then determined for the new method by plotting ROC curves either above or below the reference value. The ROC identifies the cut-off value (as distinct from the reference value) in the new method that is closest to that of the reference value.

Identifying a cut-off level for fat intake, for example, to determine the specificity and sensitivity of a FFQ designed to measure fat intake is problematic in absolute terms, as no RDI, goals or targets exist to define minimum, moderate or high levels of fat intake (Cashel and Greenfield 1997). In epidemiological research where diet and disease relationships are investigated, numerous researchers have noted the need to control for energy when measuring nutrient intake and have reported fat intakes adjusted for energy. These energy-adjusted fat values have been expressed as nutrient density (g fat/1000 kcal or kJ), energy partition (fat versus non-fat energy), and residual fat intake in a fat/energy regression model (Willett 1990, 62; Beaton et al. 1997). These values, however, do not define cut-off levels for minimum or high fat intakes.

Although Australian dietary goals and targets suggest a cut-off for fat at the population level of <30% of total energy from fat (NHMRC 1992b), this calculation is not feasible in studies where there is no baseline measure of 'usual' energy intake in the study groups selected or where assessment of energy intakes are likely to be biased.
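Three of the energy-related expressions of fat intake mentioned above can be sketched as follows: nutrient density, percent of energy from fat, and residual fat intake from a regression of fat on energy. The data and function names are hypothetical, and the energy factor of 37 kJ per gram of fat (about 9 kcal/g) is the conventional value assumed here.

```python
FAT_KJ_PER_G = 37.0  # conventional energy factor for fat (assumed)


def nutrient_density(fat_g, energy_kj):
    """Fat intake expressed as g fat per 1000 kJ."""
    return fat_g / energy_kj * 1000.0


def percent_energy_from_fat(fat_g, energy_kj):
    """Percent of total energy contributed by fat."""
    return fat_g * FAT_KJ_PER_G / energy_kj * 100.0


def residual_fat(fat_g, energy_kj):
    """Residuals from a least squares regression of fat on energy."""
    n = len(fat_g)
    mean_e = sum(energy_kj) / n
    mean_f = sum(fat_g) / n
    slope = sum(
        (e - mean_e) * (f - mean_f) for e, f in zip(energy_kj, fat_g)
    ) / sum((e - mean_e) ** 2 for e in energy_kj)
    intercept = mean_f - slope * mean_e
    return [f - (intercept + slope * e) for f, e in zip(fat_g, energy_kj)]


# Hypothetical daily intakes for three subjects.
fat = [60.0, 90.0, 90.0]              # g/d
energy = [8000.0, 9000.0, 10000.0]    # kJ/d

print(round(nutrient_density(fat[0], energy[0]), 2))
print(round(percent_energy_from_fat(fat[0], energy[0]), 2))
print([round(r, 2) for r in residual_fat(fat, energy)])
```

The second and third subjects report the same fat intake, but the residual for the second is positive and for the third negative, because the third subject's fat intake is low relative to what would be expected at that energy intake.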

In the absence of definitive fat cut-off levels, estimated fat values based on % contribution to total energy have been used to validate FFQs that measure fat. These values have been derived from population RDIs for energy (Dobson et al. 1993), from population energy intakes (Block and Hartman 1989) and from estimates of energy intake using a number of food records (Martin et al. 1997). These population values may not be suitable for a selected study group that is not representative of the general population, so their use as reference values needs to be interpreted with caution.

2.6 Validity

2.6.1 Definition of terms

Validation is the process of establishing validity, which is 'the degree to which a measurement measures what it purports to measure' (Last 1988, 132). Definitions of the specific terms reproducibility, content validity and criterion validity from an epidemiological perspective are found in the glossary on page xvi. The interpretation of these classical definitions is modified in different areas of research. In clinical studies of a new method of measurement, reproducibility is the degree of agreement of a variable/instrument/test with itself with repeated tests, whereas validity is the association of the variable/instrument or test with the 'true' (or most accurate) method of measurement. An assumption of validity testing in clinical studies is that there may be error in the new test or instrument but no error in the 'true' value.

Validation of a dietary method that measures food or nutrients uses a similar comparative methods approach to that previously described in Section 2.5, p. 25-33 and is based on agreement measures. Validation of a new dietary measurement method in epidemiological and clinical studies usually includes tests of:

• reproducibility;

• content validity; and

• criterion validity.

Table 2.6 summarises the specific approaches used to validate a new FFQ.

2.6.1.1 Reproducibility

Reproducibility or repeatability in dietary research, often called reliability, is defined as the ability of a dietary method to produce the same quantitative estimate of nutrient intake on two or more different occasions from the same individuals (ie a different concept from validity) (Block and Hartman 1989; Lee and Nieman 1993, 66). Validity tests of dietary intake methods often use reproducibility of dietary data as a criterion for acceptability (Baghurst and Baghurst 1981; Larkin et al. 1989; Horwath 1990) and do not necessarily include assessment of validity of the method (Baghurst and Baghurst 1981).

A test-retest approach is the most commonly reported method for measuring reproducibility of FFQs. The time period between tests varies considerably depending on the objective of the questionnaire (eg 'usual' diet, diet over the last month) and the nutrients of interest. The nutrients or food intakes are compared between tests for repeatability or consistency of reporting.

Table 2.6 Approaches commonly used to validate a FFQ

REPRODUCIBILITY

Test-retest:
• Repeated administration of FFQ

Proxy:
• Use of another person to respond for the subject

Inter-rater:
• Use of two or more raters to test subject

VALIDITY

Content (internal) validity:
• Proportion of total nutrient intake accounted for by foods listed on the FFQ.

Criterion (external) validity:
• Comparison of means and differences between means of dietary data (eg nutrient intakes) from the FFQ with data from another dietary method (eg diet history, diet record).
• Correlation of individual dietary data from FFQ with data from other dietary methods (eg diet history, diet records, duplicate diets, direct observation).
• Comparison of classification of individuals into tertiles, quartiles or quintiles of dietary intake between the FFQ and the criterion method, percent agreement.
• Correlation of nutrient data with biomarkers.

Source: Adapted from Horwath (1990)

Proxy-reproducibility studies involve a comparison of dietary interviews between the subject and another person who is able to observe the subject. This method is often used to validate FFQs in children or elderly people (Lee-Han et al. 1989).

The inter-rater approach usually involves two or more trained personnel who rate, observe or examine the same responses to a test instrument by the same individual, and is used frequently in clinical studies. This approach in dietary studies is indirect and subjective and, although used to validate other types of dietary intake measures, has not yet been reported for FFQs. It has been used in rating three-day food records, where two nutritionists assigned a rating to compare several three-day food records in the MRFIT study (Remmel and Benfari 1980).

Reproducibility tests are part of the validation procedure of a new instrument but are not substitutes for validity testing. A high degree of reproducibility does not ensure validity (Horwath 1990) and can suggest repetition of the same error (Willett 1990, 97), and/or enhanced consciousness or a training effect (Feunekes et al. 1993). Important nutrients or foods can be missed in a FFQ, thereby decreasing the accuracy (or validity) of the questionnaire while still maintaining a high reproducibility. Factors that can influence the reproducibility of a FFQ include the occurrence of real dietary change between administrations, poor memory or erratic eating patterns (Block and Hartman 1989).

2.6.1.2 Validity

The validity of any dietary method to measure food consumption is defined as its ability to measure the intakes of foods and nutrients with 'accuracy' (Hankin 1988). Other researchers define validity as the ability of a dietary method to measure 'what it is intended to measure' (Block 1982; Block 1989; Burema et al. 1988, 171; Block and Hartman 1989). An operational objective of validity testing of a new measurement method is to determine if the measurement variable(s) of interest 'agree' well enough with a more accurate measure of the same variable(s) (Altman 1991, 397). The new measurement method can replace the other method, or both methods can be interchangeable, provided the differences and measurement errors are clinically acceptable (Bland and Altman 1986) as suggested previously in Section 2.5.1.6, p. 30.
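The notion of agreement between a new method and a reference method can be sketched with the mean difference (bias) and 95% limits of agreement in the style of Bland and Altman (1986). This is a minimal illustration only: the function name `bland_altman` and the intake values are hypothetical, and the limits assume the differences are approximately normally distributed.

```python
from statistics import mean, stdev


def bland_altman(new_method, reference):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (new minus reference);
    the limits of agreement are bias +/- 1.96 sample standard deviations.
    """
    if len(new_method) != len(reference) or len(new_method) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(new_method, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical fat intakes (g/day) from a new method and a reference record.
new_values = [70.0, 82.0, 65.0, 90.0, 58.0]
reference_values = [68.0, 85.0, 60.0, 95.0, 57.0]
bias, lower, upper = bland_altman(new_values, reference_values)
print(f"bias = {bias:.2f} g/day, limits of agreement = {lower:.2f} to {upper:.2f}")
```

Whether limits of this width are acceptable is a clinical judgement rather than a statistical one, which is the point made by Bland and Altman (1986).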

Content validity

Content validity of a dietary method is the ability of the method to reflect the types and quantities of foods or meals consumed within a specified study population. This term is often referred to as 'internal' validity, 'face' validity or 'demonstrated' validity of the method (Block and Hartman 1989).

Criterion validity

Criterion validity of dietary methods, often called external validity in this context, refers to the comparison of a new method with another dietary method considered more 'accurate', or with an external measure such as a biomarker (Block and Hartman 1989; Lee and Nieman 1993, 63). Because there is no dietary method that measures accurately what free-living people eat, true validation against other dietary methods is not possible. Validation studies, therefore, assess only relative or criterion validity of any newly developed dietary assessment method or instrument (Burema et al. 1988, 171).

2.7 Validity of current and retrospective methods used for measuring food consumption

2.7.1 Validations of current methods

Weighed food records have been validated against external measures (ie biomarkers), particularly the double-labelled water technique for assessing energy expenditure, and have shown good correlation in individuals (Bingham et al. 1988, 62; Feunekes et al. 1993). Validation of weighed food records against duplicate diet collections has been lacking (Block and Hartman 1989).

2.7.2 Validations of retrospective methods

The 24-hour recall has been validated against direct observation, weighed and estimated food records and is considered to be a valid method for obtaining information about food and nutrient intakes of groups. The longer-term retrospective methods, the diet history and FFQ, are more difficult to validate because they are dependent on the ability of subjects to give correct information on frequencies of food intakes and estimates of serve sizes. Memory and recall bias distort these responses. The diet history has been validated against repeated diet records (weighed and estimated), seven-day diet records, and the double-labelled water technique. It is considered valid for obtaining information about food intakes of individuals but is limited to ranking of dietary data for groups (Block and Hartman 1989; Jain 1989). The process of collecting data for a diet history (eg cost, time, need for highly trained dietitians to administer) limits its use for many studies as a criterion method for validating other dietary methods.

The criterion dietary assessment methods against which FFQs have been validated include: a 24-hour recall (Bingham et al. 1994); a detailed diet history interview (Balogh et al. 1968; Jain et al. 1982); weighed or estimated diet records covering periods from three days (Krall and Dwyer 1987; Bergman et al. 1990; Kemppainen et al. 1993), to seven days (Nelson et al. 1988; van Assema et al. 1992; Bingham et al. 1994), to continuous recording of food intake for one year (Willett et al. 1987); direct observation (Horwath and Worsley 1990; Mullen et al. 1984); and other FFQs (Bingham et al. 1994). Early validation studies of FFQs favoured diet histories as the criterion method whereas studies in the late 80s and early 90s favoured estimated or weighed diet records, although some researchers have used previously validated or partially validated versions of FFQs to validate modifications of the same FFQ (Eck et al. 1991; Dobson et al. 1993). This latter practice, however, is not recommended because of the introduction of the same sources of bias (see Section 2.7.3, p. 38).

Validation of FFQs for both short- and long-term retrospective intake against food records has shown varying validity when comparing group means. FFQs that estimate 'usual' food intake in the short-term (eg past month) generally under-estimate nutrient intakes (Krall and Dwyer 1987) while long-term FFQs (eg past year) tend to over-estimate nutrient intakes (Sorenson et al. 1985; Larkin et al. 1989) (see Table 2.2, p. 8).

2.7.3 Choice of another dietary measure as the criterion method for validation

Often a combination of more than one criterion or external measure (ie several dietary methods and a biomarker) has been recommended and used to measure validity of dietary measures (Feunekes et al. 1993).

The criterion method is conducted on the same subject, relates to the same period as the method under investigation and has been previously validated against another type of dietary assessment method (Horwath 1990; Burema et al. 1988, 173). When designing a validation study, errors in criterion validity can be minimised if the new method and the criterion method (or methods) are administered around the same time frame (Burema et al. 1988, 173; Last 1988, 132). The order of administration of each method, especially short-term retrospective recall, is associated with errors (Burema et al. 1988, 173; Horwath 1990) since the responses to the first measurement could influence the subsequent measurement. A cross-over design can overcome this and has been used in some studies (Feunekes et al. 1993) but the logistics and cost of validation studies preclude this method for most researchers.

The criterion dietary measure chosen should be practical, feasible and have a good demonstrated content validity (Block and Hartman 1989). Direct observation or duplicate diets are the definitive choice but are expensive and have limited application to free-living people. The few studies undertaken using direct observation have shown good validity of FFQs (Mullen et al. 1984; Horwath and Worsley 1990) but the cost and practical difficulties in observing sufficient numbers of subjects for an adequate time period without distorting eating habits, limit its use.

Weighed food records are considered the most accurate method of dietary assessment of current diet in free-living people (Block 1989) although diet histories and estimated dietary records have also demonstrated good accuracy (Pekkarinen 1970; Marr 1971; Stuff et al. 1983). The seven-day weighed record for assessing current diet is still considered the 'gold' standard by some researchers (Bingham 1987) although loss of compliance after four days may reduce its accuracy (Gersovitz et al. 1978; Daniels 1984). Repeated three- or four-day diet records using estimated or household measures are more applicable for validating FFQs that measure 'usual' diet according to several authors (Willett et al. 1985; Krall and Dwyer 1987; Willett et al. 1988; Larkin et al. 1989). For validating short-term retrospective methods (ie 24-hour recall) that measure single nutrients such as fat, three- to four-day diet records are also favoured (Block and Hartman 1989; Kemppainen et al. 1993; Angus et al. 1989) although diet histories (Feunekes et al. 1993) or FFQs measuring 'usual' diets have been used (Dobson et al. 1993).

The criterion dietary measure should be a different type of dietary measure to the instrument being validated to avoid the same sources of error (Burema et al. 1988, 175; Horwath 1990). As an example, a FFQ (a recall method) should not be validated against another recall method such as a diet history or another FFQ. A record is more suitable for FFQ validation because the biases are different, although not independent.

2.7.4 Use of biomarkers as the criterion method

Biochemical markers, also called biomarkers, have been used as indicators of nutrient intakes, nutritional status, metabolic effects of exposure to substances and susceptibility to, or presence of, diet-related diseases (Kohlmeier 1991, 15-16). Biomarkers have a place in validation studies since the measurement errors are independent and external and free from the bias associated with classical dietary methods (such as memory distortion, estimating serve sizes, interpretation of questions and compliance).

Although numerous tissue, blood or urine biomarkers are available, most do not directly reflect actual consumption of specific nutrients or foods. Many are subject to the influences of metabolic, hormonal and individual variation. Those biomarkers that have shown good correlations with dietary measures include selenium in whole blood and serum (Swanson et al. 1990); urinary nitrogen as a measure of protein intake (Bingham and Cummings 1985; Baghurst and Baghurst 1981); linoleic acid in erythrocyte membranes (Feunekes et al. 1993); and eicosapentaenoic acid in plasma phospholipids as a measure of fish intake (Drevon et al. 1991, 93). The topic of biochemical indicators of dietary intake is reviewed in depth elsewhere (Hunter 1990) and covered in more detail in relation to fat intake and validation studies of FFQs in Section 2.8.3.2, p. 46.

2.7.5 Constraints to validity testing

If using diet records as the criterion measure of 'usual' diet, validity testing requires a substantial number of records as a base against which data from a new method of measuring dietary intake can be compared (Mullen et al. 1984). Ideally, to generalise the results to a wider population, the subjects should be randomly selected from the target population in which the new method will be used. Large random samples representative of the population are cumbersome for the types of data collection protocol required, especially where food records are used and usual diet is of interest. Validity studies in nutritional epidemiology are interested in 'usual' diet so a longitudinal component in any study design means a large commitment in cost, time and trained staff for implementation and management.

Additional constraints to validity testing in both clinic and population studies include access to the study population and compliance of subjects with data collection protocols. For these reasons, most validity studies use small sub-samples of the target population, most of whom are volunteers. Therefore, a new method needs to be applied cautiously to any population outside the study population.

Because of these constraints to large scale validity testing, many of the published validity studies of dietary intake methods use modifications of methods previously validated in other target populations (Bingham et al. 1994; Horwath and Worsley 1990; Eck et al. 1991; Munger et al. 1992; Eck et al. 1996; Greeley et al. 1992) rather than develop and validate completely new methods. In clinic situations where dietary intakes of individuals are of interest, validity testing of dietary methods is rarely undertaken although development and evaluation of dietary intervention and other health promotion programs is recommended. Presumably, because dietitians constitute small numbers in hospitals and community services, they do not have the time and resources to undertake such tasks in the workplace in addition to their other duties.

2.8 Validation studies of FFQs

2.8.1 Reproducibility studies of FFQs

Reproducibility of FFQs is often the first component of validity testing and is usually conducted on a different sample than that used for criterion validity testing (Horwath 1990). This avoids subject bias. Reproducibility studies of FFQs have been extensively reviewed (Lee-Han et al. 1989; Block and Hartman 1989; Willett 1990, 104; Horwath 1990). Reproducibility studies of FFQs published between 1984 and 1994 using a test-retest protocol are summarised in Table 2.7, p. 43. This table includes a description of the population studied, the FFQ design, format, protocol and statistical tests used and, if reported, correlations for total fat. Many validity studies, however, do not undertake reproducibility testing (eg Angus et al. 1989; Horwath 1990; Fogelholm and Lahti-Koski 1991; Feunekes et al. 1993; Kemppainen et al. 1993).

2.8.1.1 Use and interpretation of statistical tests for measuring reproducibility

In dietary studies, correlation coefficients have typically been used to evaluate the comparison of nutrient intake estimates obtained from two dietary assessment methods (validity) or by the same method at two different times (reproducibility) as seen in Tables 2.3, 2.4 and 2.7. Correlations are also often reported in combination with other methods, including Cronbach's alpha for measuring internal consistency of quantitative measures between tests (Kristal et al. 1990b) and classification measures (Rohan et al. 1987; Thompson et al. 1990; Van Assema et al. 1992; Wheeler et al. 1994) (see Table 2.7, p. 43). Intra-class correlations (ICC) are used where more than one test of reproducibility is used (Pietinen et al. 1988a, 1988b; Wheeler et al. 1994).
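For repeated administrations, an intra-class correlation can be sketched from a one-way analysis of variance. The version below is one common form (one-way random effects, single measurement) and is an assumption for illustration; the studies cited do not necessarily use this exact form. The function name `icc_oneway` and the data are hypothetical.

```python
def icc_oneway(scores):
    """One-way random-effects ICC for a subjects x repeats layout.

    scores is a list with one entry per subject, each entry being a list
    of k repeated measurements (k must be the same for every subject).
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects with the same k >= 2 repeats")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (value - m) ** 2 for row, m in zip(scores, subject_means) for value in row
    )
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical fat intakes (g/day) for four subjects at three administrations.
repeats = [
    [60.0, 62.0, 61.0],
    [80.0, 79.0, 83.0],
    [70.0, 72.0, 69.0],
    [95.0, 90.0, 93.0],
]
icc = icc_oneway(repeats)
print(f"ICC = {icc:.2f}")
```

Unlike a Pearson coefficient, this statistic falls when one administration is systematically higher than another, because such shifts add to the within-subject variance.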

Until 1994, no reproducibility study for dietary methods reported the use of Bland-Altman plots. However, Wheeler et al. (1994) reported percent differences between means for each test replication and Pietinen et al. (1988a) had earlier used analysis of variance for differences between means rather than hypothesis tests for comparisons of means. Both these studies assessed most measurable nutrients (ie where food composition data are available) in the 'usual' diet so Bland-Altman plots would have been cumbersome and inappropriate for this type of design (Willett 1990, 121). Confidence intervals around the correlation coefficient between the methods or around the means for each method were rarely reported in reproducibility studies at this time.

The range of correlations for nutrients in a test-retest protocol in the studies reported in Table 2.7 is variable but generally equal to or greater than 0.5 for most nutrients. A correlation of 0.5 indicates a moderate strength of association (Kuzma 1992, 202). The higher correlation values and hence reproducibility for some substances (ie alcohol; 0.99 (Munger et al. 1992), 0.88 (Pietinen et al. 1988a) and 0.85 (Rohan et al. 1987)) suggest either consistency of consumption between tests or perhaps systematic bias.

Correlation coefficients for fat showed a good strength of association, eg 0.71 (Kristal et al. 1990a; van Assema et al. 1992) to 0.78 (Pietinen et al. 1988b), in FFQs designed specifically to measure fat only or fat plus a few other nutrients. These higher values may be linked to the design of the FFQs and/or specificity of the foods selected for inclusion in these FFQs and are not necessarily representative of good agreement.

2.8.1.2 Comparison of time interval between tests of reproducibility

The time interval between tests of reproducibility of FFQs designed to measure nutrients in the 'usual' diet has ranged from three months (Pietinen et al. 1988a) to 15 years (Thompson et al. 1990). Retests measuring the 'usual' diet may be undertaken a number of times during this period although most usually at intervals of one year. In contrast, for measures of single or specific nutrients, shorter time intervals between tests are reported (ie usually eight weeks to three months) (Nelson et al. 1988; Pietinen et al. 1988b). Depending on the objective of the survey, Block and Hartman (1989) suggest that FFQs should be administered within a fairly short time (eg four to six weeks) to estimate 'usual' intakes. Very short time intervals between two administrations (ie days) are associated with an overall overestimate of agreement measures in reproducibility studies because respondents are likely to remember their earlier responses (Horwath 1990). Alternatively, too long a time interval may be associated with real dietary change (Block and Hartman 1989) and consequently there is a reduced reproducibility coefficient (Horwath 1990). Correlations for repeated administrations of the same questionnaire are higher when the interval between tests is short (Nelson et al. 1988; Strohmeyer et al. 1984). Thompson et al. (1990) suggest that the ability to show a good correlation in test-retest protocols is strongly related to the stability of the diet and the level of education of the respondents. Clearly, there is no definitive time interval recommended for reproducibility studies of FFQs that measure 'usual' diet, or a few nutrients or total fat.

Table 2.7 Characteristics of reproducibility studies of FFQs between 1984-1994 using test-retest technique (including range of correlations for nutrients tested, and correlation for total fat)

Strohmeyer et al. 1984
Population: Dutch college students (n=40) (S)
Frequency design: Dietary intake form based on food groups (semi-quantitative)
Nutrients/foods targeted by the FFQ: food groups in total diet
Reference period of FFQ: usual diet per week
Interval between tests: 2 weeks
Statistical tests for reporting reproducibility: correlation (interclass, cross classification (quintiles))
Correlation coefficient for fat: na

Willett et al. 1985
Population: US registered nurses, 34-59 years (n=173)
Frequency design: 61 items, 9 response categories, standard portions, mailed, focus on nutrients related to cancer, most nutrients measured
Nutrients/foods targeted by the FFQ: total diet
Reference period of FFQ: usual diet
Interval between tests: 1 year
Statistical tests for reporting reproducibility: correlation (interclass, Pearsons)
Correlation coefficient for fat: 0.54

Rohan et al. 1987
Population: Australian men and women, 24-79 years (n=100)
Frequency design: 179 items, standard portions, mailed, interviews in original study. Most nutrients measured (semi-quantitative)
Nutrients/foods targeted by the FFQ: total diet
Reference period of FFQ: usual diet
Interval between tests: 1 year, range = 9-17 m
Statistical tests for reporting reproducibility: comparison of means (t test, P values, differences between means), correlation (interclass, Spearman), cross classification (tertiles)
Correlation coefficient for fat: 0.62

Nelson et al. 1988
Population: UK men and women, 72-90 years (n=28)
Frequency design: Numbers of food items unavailable in reference, quantitative and qualitative design, calcium and iron measured only
Nutrients/foods targeted by the FFQ: calcium
Reference period of FFQ: usual diet
Interval between tests: 8 weeks
Statistical tests for reporting reproducibility: correlation (interclass, Pearsons)
Correlation coefficient for fat: not measured

Pietinen et al. 1988a
Population: Finnish men, 55-69 years (n=121)
Frequency design: 276 food items, frequency only, portion size picture booklet
Nutrients/foods targeted by the FFQ: total diet
Reference period of FFQ: usual diet
Interval between tests: 3 tests at 3 month intervals
Statistical tests for reporting reproducibility: correlation (ICC, Pearsons), analysis of variance for differences between means, CI for means
Correlation coefficient for fat: 0.64

Pietinen et al. 1988b
Population: Finnish men, 55-69 years (n=107)
Frequency design: 44 foods, frequency only, standard or average portion sizes, interviewed by dietitian (qualitative)
Nutrients/foods targeted by the FFQ: total fat, saturated fat, vitamin A, C, B, selenium and dietary fibre
Reference period of FFQ: 1 year
Interval between tests: 3 tests at 3 month intervals
Statistical tests for reporting reproducibility: correlation (ICC, Pearsons)
Correlation coefficient for fat: 0.78

Thompson et al. 1990
Population: Tecumseh Heart Study, men and women (US), 45-64 years (n=1184)
Frequency design: 83 foods, 9 subsets, interview, portion size displayed by cards (qualitative)
Nutrients/foods targeted by the FFQ: Food groups (for vitamin C, A, sodium, cholesterol, low, medium and high fat foods)
Reference period of FFQ: 1 year
Interval between tests: 15 years
Statistical tests for reporting reproducibility: kappa statistic for each respondent (item by item agreement, differences in mean kappa scores (by ANOVA, P-values), multiple regression)
Correlation coefficient for fat: 0.62 high-fat foods; 0.54 medium-fat foods; 0.51 lower-fat foods; 0.31 fried foods

Kristal et al. 1990b
Population: US women, 45-59 years (n=99)
Frequency design: 28 questions related to dietary behaviour, 5 scales, followed by a mailed questionnaire (qualitative)
Nutrients/foods targeted by the FFQ: Behaviour relating to low fat food selection
Reference period of FFQ: usual habits
Interval between tests: na
Statistical tests for reporting reproducibility: correlation (interclass), Cronbach's alpha (for internal consistency between T1 and T2)
Correlation coefficient for fat: 0.67 (substitute low-fat food for high-fat counterpart); 0.9 (avoiding fat as seasoning)

Munger et al. 1992
Population: US women (n=44)
Frequency design: 126 items: modified form of Willett et al. (1987) questionnaire, administered by telephone
Nutrients/foods targeted by the FFQ: total diet
Reference period of FFQ: usual diet
Interval between tests: 3 tests, baseline (T1), then 2 years (T2) then again at 6 months (T3)
Statistical tests for reporting reproducibility: Correlation (interclass) (Pearsons) for each test, crude and energy adjusted
Correlation coefficient for fat: 0.51 (T1 and T2); 0.60 (T1 and T3); 0.71 (T2 and T3)

Van Assema et al. 1992
Population: Dutch adults (R), >18 years (n=639, 333 women, 306 men)
Frequency design: 25-item (qualitative), by telephone, 12 food categories, response converted to a fat score
Nutrients/foods targeted by the FFQ: fat
Reference period of FFQ: usual diet
Interval between tests: 1 year
Statistical tests for reporting reproducibility: correlation (interclass, Pearsons), gross misclassification
Correlation coefficient for fat: 0.71

Wheeler et al. 1994
Population: Supermarket shoppers (Aust). Part 1: (n=144 women, 9 men). Part 2: (n=98 women, 2 men from subjects who completed part 1)
Frequency design: 300 foods, mailed
Nutrients/foods targeted by the FFQ: total diet energy
Interval between tests: 3 tests; baseline (T1), then 4.6 weeks (T2), then again at 3 months (T3)
Statistical tests for reporting reproducibility: mean (SEM), correlation (ICC), cross classification (tertiles), % differences between T1 and T2 (for nutrients and food groups), kappa statistic (either T1 or T2)

ICC = intraclass correlation coefficient, *Spearman rank correlations, **kappa scores, others are either Pearson's product moment correlations or not stated, na = not available or not reported

• for a variety of situations (eg pregnancy, weight loss, smoking cessation), where there is a need to gather dietary information over a short period of time (Greeley et al. 1992; Eck et al. 1996); and

• to generate inexpensive and rapid behavioural feedback to the individual on dietary intake (Kristal et al. 1990c).

Some of these uses are also applicable in community health promotion activities or for identifying risk groups for implementing dietary intervention. For example, Kristal et al. (1994) developed a short-term FFQ to measure fat and fibre intake in order to evaluate a community intervention program aimed at modifying these nutrients. Wilson and Horwath (1996) developed a FFQ to identify people in the community with low calcium intake for targeting individual dietary intervention.

Several examples of FFQs for short-term use have been described for clinical practice but little is known about the statistical characteristics of these measures when used as program evaluation tools (Kristal et al. 1994). Further work is needed on the use and validity of short-term FFQs used to measure current diet.

2.4.3 Limitations of FFQs

For epidemiological research on the relationship between diet and disease, long-term FFQs are considered the method of choice for measurement of 'usual' diet (Block 1989). An overview of the strengths and limitations of FFQs was previously described (see Section 2.2, p. 5). Where energy is measured by the FFQ (Pietinen et al. 1988b; Eck et al. 1991; Fogelholm and Lahti-Koski 1991), some researchers have adjusted for energy using the method described by Willett et al. (1985), but have not addressed errors at the extremes of energy intake. Further work is needed in this area. A major criticism of FFQs designed to measure one or a few nutrients is that error related to energy intake cannot be assessed.

In addition, FFQs that omit some foods may result in underestimations of nutrient intake (Lee and Nieman 1993, 57). Further inaccuracies may be introduced when foods are combined in the food list, although the effect of this on validity has been poorly studied (see Section 2.4.4.4, p. 24).

2.4.4 Issues in design of FFQs

This section reviews issues and problems associated with the design of FFQs including processes involved in compiling a food list (eg number of foods, order of foods, the way foods are grouped, frequency options, and portion sizes).

2.4.4.1 Food items for inclusion

Three approaches have been used to compile a food list, often termed a food checklist, for inclusion in a FFQ. Food composition data, food record data from population studies or the application of statistical procedures that predict associations between foods and nutrient contribution have been used. The ultimate list of foods selected is likely to comprise major sources or contributors to nutrient intakes (Block et al. 1986) or can be derived using regression models (Hankin et al. 1983; Willett et al. 1985). Ideally, the list should reflect the dietary behaviour of the target population and the nature of its food supply. Depending on the nature and purpose of the questionnaire, allowances are often made for differences in food consumption patterns of respondents according to age, gender, cultural influences, type of diet (eg vegetarian, omnivorous) and socioeconomic factors. Additional qualitative questions concerning food preparation, use of modified foods and supplementation practices may also be included.
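The idea of building a food list from the major contributors to nutrient intake can be sketched as follows. This is an illustrative assumption rather than the procedure of Block et al. (1986): foods are ranked by their contribution to total population intake of a nutrient and added until a chosen share of intake is covered. The function name `major_contributors`, the 80% target and the intake figures are all hypothetical.

```python
def major_contributors(total_intake_by_food, target_share=0.9):
    """Return the highest-contributing foods that together reach target_share.

    total_intake_by_food maps each food to the total amount of the nutrient
    it supplied across all subjects in a (hypothetical) consumption survey.
    """
    total = sum(total_intake_by_food.values())
    if total <= 0:
        raise ValueError("total nutrient intake must be positive")
    ranked = sorted(total_intake_by_food.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    cumulative = 0.0
    for food, amount in ranked:
        # Stop once the foods already selected cover the target share.
        if cumulative / total >= target_share:
            break
        selected.append(food)
        cumulative += amount
    return selected


# Hypothetical total fat (g) supplied by each food across a survey sample.
fat_by_food = {
    "margarine": 300.0,
    "cheese": 250.0,
    "beef": 200.0,
    "milk": 150.0,
    "biscuits": 60.0,
    "liver": 40.0,
}
food_list = major_contributors(fat_by_food, target_share=0.8)
print(food_list)
```

A list built this way favours foods eaten often by many people, so it can drop foods that are eaten rarely but differ greatly between individuals.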

Designing a FFQ that accounts for all these factors is difficult and drafts need to be pilot-tested in a representative sample of the target population before the actual study commences. Ideally, this sample should not be the study group itself (Horwath 1990). Such an exercise requires time and considerable financial support. To overcome this expense of prior assessment, most studies use data from national dietary surveys or previously published surveys of similar population groups to establish the food list.

a. FFQs derived from food composition tables

The first approach to selecting foods for inclusion in a FFQ can be based on foods that have a high content of a particular nutrient or nutrients. This approach is often applied where the intake of only one or two nutrients is of interest.

The use of food composition tables alone to determine the major food sources contributing to the nutrient or nutrients of interest has several limitations. One limitation is that foods with a high nutrient content that are infrequently consumed can be included. For instance, liver has a relatively high vitamin A content but is infrequently consumed by the Australian population (Cashel et al. 1986, 87). Another limitation when measuring 'usual' intakes of several nutrients is that large lists of food items need to be included, which increases respondent burden and reduces accuracy and compliance. Alternatively, reducing this list by deleting items infrequently consumed can ignore foods with high between-person variation in their consumption. Infrequently consumed foods, such as liver, however, are considered more informative for detecting individual variation than foods that are, on average, consumed more frequently by most people (Willett 1990, 72).

2.8.2 Content validity studies of FFQs

Content validity measures the ability of a FFQ to describe food or meal patterns of subjects and is also a measure of the proportion of total nutrient intake accounted for by the foods listed on the FFQ. This aspect is infrequently reported or analysed in studies of small samples but has been addressed in some well-accepted studies where the original FFQ has been modified and reused for different population groups (Block et al. 1989). Willett and colleagues have shown that a short FFQ (61 items) accounted for 70-80% of intake of most nutrients in the 'usual' diet, and that a 116 item FFQ accounted for 90-95% of intake of 54 US adults (Willett 1990, 104). Even the 44-item FFQ of Pietinen et al. (1988b) captured nearly 100% of the intakes of vitamin A and C, about 80% of vitamin E intake, 75% of selenium intake and two-thirds of total energy intake of 187 Finnish men.
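The proportion of nutrient intake accounted for by a food list can be sketched as a simple ratio. The function name `coverage`, the foods and the intake values below are hypothetical and serve only to show the calculation.

```python
def coverage(intake_by_food, listed_foods):
    """Proportion of total nutrient intake supplied by foods on the list.

    intake_by_food maps each food reported in a food record to the amount
    of the nutrient it supplied; listed_foods is the set of FFQ foods.
    """
    total = sum(intake_by_food.values())
    if total <= 0:
        raise ValueError("total nutrient intake must be positive")
    listed = sum(
        amount for food, amount in intake_by_food.items() if food in listed_foods
    )
    return listed / total


# Hypothetical fat intake (g/day) by food from one subject's food record.
record_fat = {"butter": 12.0, "cheese": 9.0, "beef": 15.0, "avocado": 4.0}
ffq_foods = {"butter", "cheese", "beef"}
share = coverage(record_fat, ffq_foods)
print(f"FFQ foods account for {share:.0%} of recorded fat intake")
```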

2.8.3 Criterion validity studies of FFQs

2.8.3.1 Use and interpretation of statistical tests for measuring criterion validity of FFQs

Table 2.3 (p. 19-21) summarises the design, experimental protocol and statistical tests used for criterion validity studies of FFQs designed to estimate nutrients in the 'usual' diet that employed other dietary measures as the reference criteria. Table 2.4 (p. 22) examines the characteristics of criterion validity studies of FFQs published between 1988-1995 that measure one or a few nutrients over short-term retrospective periods (between one day and one month).

Many dietary validation studies in the 1980s and early 1990s reported P values and compared statistical values (eg means and correlations) with other validity studies (Stuff et al. 1983; Millar and Beard 1988; Larkin et al. 1989; Curtis et al. 1992). This, in isolation, is a misuse of significance testing because of differences in study design (Delcourt et al. 1994).

Comparisons of means in many of the early validity studies of FFQs, as shown in Table 2.3 and Table 2.4, used t tests for measuring significant differences of means (Stuff et al. 1983; Nelson et al. 1988; Millar and Beard 1988; Eck et al. 1991; Curtis et al. 1992). This practice is likely to be a reflection of the nature and editorial guidelines of the journals in which they were published at that time (eg Journal of the American Dietetic Association, American Journal of Clinical Nutrition). In contrast, few dietary validity studies in the 1980s and early 90s in the epidemiological and public health journals compared methods using t tests. They reported absolute means and standard deviation or CI around the mean instead (Pietinen et al. 1988a; Willett et al. 1988; Munger et al. 1992; Welten et al. 1995).

Almost all studies continue to report and compare correlation coefficients as seen in Table 2.3 and Table 2.4 using the conventional null hypothesis approach, despite the limitations of this approach in measuring agreement between methods as discussed earlier (see Section 2.5.1.4, p. 28). Although comparison of means and correlation is evident in many of the early dietary validity studies (see Table 2.3, p. 19-21), hypothesis testing needs to be accompanied by SD, percent agreement and confidence intervals, as seen in the later studies listed in Table 2.5 (p. 26). Correlation coefficients are usually reported in combination with other measures of agreement and have apparently been acceptable in dietary validation papers to the present time.

Comparing correlation values in this way demonstrates trends in strength of agreement for a particular nutrient between each study but correlations cannot be compared directly because of differences in study design, food supply, typical food choices and characteristics of the population in the country of origin.

The use of Cis in dietary validity studies was rare in the early 1990s but is seen in many, although not all such papers published since 1993. This is a reflection of the changes in publishing guidelines in medical and health-related journals. Only a few studies in nutrition journals have reported estimation measures. These estimations and studies include:

• CIs around each mean (Fogelholm and Lahti-Koski 1991; Thompson and Margetts 1993);

• differences between the methods (ie differences between scores for each method) (Kristal et al. 1990b); and

• differences between means (Welten et al. 1995).

Only one study by Dobson et al. (1993) reported CI around the regression coefficient. Two studies, Nes et al. (1993) and Wheeler et al. (1995) compared median differences between the methods of measurement and reported CI around the median, presumably because nutrient intake data were skewed. This practice has been recommended by statisticians (Gardner and Altman 1989, p. 72) but is infrequently used in either clinical or dietary studies.
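As an illustration of the estimation approach described above, the following minimal Python sketch computes the mean of the paired differences between two dietary methods and an approximate 95% CI around it. The function name and the fat intake values (g/d) are hypothetical and are not taken from any study cited here. The sketch assumes a normal approximation for the multiplier; for the small samples typical of validation studies, a multiplier from the t distribution would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_difference_ci(new_method, reference, level=0.95):
    """Return (mean difference, lower, upper) for paired measurements.

    Assumption: a normal approximation is used for the CI multiplier.
    """
    if len(new_method) != len(reference):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(new_method, reference)]
    centre = mean(diffs)
    standard_error = stdev(diffs) / sqrt(len(diffs))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return centre, centre - z * standard_error, centre + z * standard_error


# Hypothetical fat intakes (g/d) for eight subjects; illustration only.
ffq_fat = [70, 82, 64, 91, 58, 77, 69, 85]
record_fat = [66, 80, 70, 88, 60, 71, 65, 84]

centre, lower, upper = mean_difference_ci(ffq_fat, record_fat)
print(f"mean difference = {centre:.1f} g/d, 95% CI {lower:.1f} to {upper:.1f}")
```

An interval that includes zero, as in this hypothetical example, indicates no evidence of a systematic difference between the methods at the group level, but it says nothing about how far individual differences may spread.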

Cross-classification techniques in dietary validation studies of FFQs, although routinely reported by researchers in epidemiological journals in the 1980s (eg by Willett et al. 1985; Pietinen et al. 1988a; Willett et al. 1988), were not similarly reported in nutrition journals at the time. Table 2.3 (p. 19-21) and Table 2.4 (p. 22), however, clearly show that most papers published in the mid-1990s reported agreement using cross-classification techniques. Since the 1995 recommendation by Garrow, the editor of the European Journal of Clinical Nutrition, to report CIs, cross-classification techniques and other statistical estimates (SD and CI, for example) will be required for acceptance by these journals of papers validating dietary methods.

The kappa statistic has often been reported, particularly in epidemiological journals (Pietinen et al. 1988a; Horwath and Worsley 1990; Thompson et al. 1990; Wheeler et al. 1995) and in behavioural studies of dietary intake (Kristal et al. 1990a), despite criticisms previously described in Section 2.5.1.7, p. 31. The specificity and sensitivity of FFQs do not appear to have been reported before 1996.

Few dietary validation studies have adopted Bland-Altman plots or techniques to measure the differences between means (or medians) of two measurement methods. Originally published in 1986, and used extensively in clinical/medical studies, this technique was first used in a modified form for a dietary study by Larkin and colleagues in 1989 (see Table 2.3, p. 21). Presentation of mean differences between two methods of measurement was rarely seen in dietary validity studies until 1993 and again mainly in epidemiological journals (Nes et al. 1993; Thompson and Margetts 1993; Kemppainen et al. 1993). It will be interesting to see in the future if nutrition journals expect similar statistical tests for agreement between dietary measures. Bland-Altman techniques and the use of scatterplots between each method for a particular nutrient are best suited for FFQs designed to measure only one or a few nutrients (see Section 2.5.1.6, p. 30).
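The Bland-Altman calculation referred to above can be sketched as follows. This minimal Python example computes the mean difference (bias) between two methods and the 95% limits of agreement, taken as the bias plus or minus 1.96 times the SD of the differences. The function name and the fat intake values (g/d) are hypothetical and serve only as an illustration; they are not data from any study cited here.

```python
from statistics import mean, stdev


def bland_altman(new_method, reference):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(new_method) != len(reference):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(new_method, reference)]
    bias = mean(diffs)   # mean difference between the methods
    sd = stdev(diffs)    # sample SD of the individual differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical fat intakes (g/d) for six subjects; illustration only.
ffq_fat = [62.0, 85.0, 71.0, 90.0, 55.0, 78.0]
record_fat = [60.0, 80.0, 75.0, 95.0, 50.0, 74.0]

bias, lower, upper = bland_altman(ffq_fat, record_fat)
print(f"bias = {bias:.1f} g/d, limits of agreement {lower:.1f} to {upper:.1f} g/d")
```

A small bias combined with wide limits of agreement is the pattern that distinguishes acceptable group-level agreement from poor agreement for individuals.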

2.8.3.2 Validity studies of FFQs that measure nutrients in the 'usual' diet (long-term), including fat

FFQs designed to measure many nutrients are not usually used to calculate absolute nutrient intakes for individuals. They are used to rank people into groups according to food or nutrient intake for epidemiological purposes, although Block (1989) suggests that they can be used for individual assessment.

Most validity studies, irrespective of the reference period of the FFQ (ie short- or long-term), use food records as the criterion measure (see Table 2.3 and Table 2.4). For measuring 'usual' diet, a number of diet records are usually used. Measures of agreement, particularly for fat, between FFQs and food records are highly variable, as expected with such a diversity of study designs and population samples (see Table 2.3, p. 19). Correlation coefficients for fat intake tend to show a moderate strength of association in most studies, with the exception of the low correlation of 0.04 reported by Stuff and colleagues in 1983, who compared nutrients from an extensive FFQ completed by 40 lactating women with a seven-day estimated food record. The poor correlation of 0.04 for total fat, and indeed the overall poor agreement of other measures between methods for most nutrients, was attributed by these authors to large variations in individual intakes over a one-week food record (Stuff et al. 1983). It is also likely that seven days of diet record-keeping provided a poor estimate of individual long-term intake.

The strong association in the Balogh study for fat intakes from a FFQ that measured 'usual' diet and a seven-day food record (r=0.94) conducted on 14 free-living Israeli men was, however, unexplained by the authors, although very high correlations were also reported for other nutrients tested (Balogh et al. 1968). When this FFQ was used again in another population of 40 Israeli men, the correlations were more modest (Epstein et al. 1970) although the authors made no comment about the reasons for this. No other measures of agreement were undertaken except correlation and regression between the methods, which was typical of these early studies.

By comparison, the validity studies of FFQs on randomised populations by Willett et al. (1985, 1987, 1988) and Pietinen et al. (1988a, 1988b), using multiple diet records as the criterion measures, have employed numerous measures for testing validity (see Table 2.3, p. 19-21). Although correlations for fat were of a similar order in each study, the cross-classification measures indicated that individuals frequently misreported food intake and hence fat intake. Cross-classification of the FFQ into the same or adjacent quintile as the food record showed 70% (Pietinen et al. 1988a) and 69% (Willett et al. 1988) correct classification for fat intake. Both research groups suggest that these values are acceptable and that the FFQs provide useful information about individual nutrient intakes over a one-year period.
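The cross-classification measures described above can be illustrated with a minimal Python sketch. Subjects are ranked separately by each method and assigned to quantile groups; the proportions classified into the same group, the same or an adjacent group, and opposite extreme groups (gross misclassification) are then calculated. The function names, the rank-based grouping rule and the intake values are assumptions made for illustration only; the cited studies may have defined quantile boundaries differently.

```python
def quantile_groups(values, n_groups):
    """Assign each value to a rank-based group numbered 0 to n_groups - 1.

    Assumption: ties are broken by position in the list.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, index in enumerate(order):
        groups[index] = rank * n_groups // len(values)
    return groups


def cross_classify(new_method, reference, n_groups=5):
    """Return proportions of subjects by agreement of quantile group."""
    groups_new = quantile_groups(new_method, n_groups)
    groups_ref = quantile_groups(reference, n_groups)
    gaps = [abs(a - b) for a, b in zip(groups_new, groups_ref)]
    n = len(gaps)
    return {
        "same": sum(g == 0 for g in gaps) / n,
        "same_or_adjacent": sum(g <= 1 for g in gaps) / n,
        "gross": sum(g == n_groups - 1 for g in gaps) / n,
    }


# Hypothetical fat intakes (g/d) for ten subjects; illustration only.
ffq_fat = [40, 55, 60, 65, 70, 75, 80, 90, 100, 110]
record_fat = [45, 50, 70, 60, 95, 72, 85, 88, 105, 66]

print(cross_classify(ffq_fat, record_fat, n_groups=5))
```

Setting `n_groups=3` or `n_groups=4` gives the tertile and quartile versions; with fewer groups, exact agreement is easier to achieve, so values from studies using different groupings are not directly comparable.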

Duplicate diets are rarely used as the criterion method as they are intrusive and limited as a measure of 'usual' intake. However, a correlation of 0.76 for calcium was reported between a FFQ designed to measure calcium intake over the past 20 years and current calcium intake from a five-day duplicate diet in an elderly population of 28 UK women aged 72-90 years (Nelson et al. 1988) (see Table 2.3, p. 21). This, and the cross-classification techniques employed in the study, showed excellent agreement between the two methods, which may have been a reflection of the stability of the diet in this age group. Validity studies of FFQs designed to measure fat intake do not appear to have used duplicate diets as the criterion method.

Few validation studies compare FFQs with biomarkers, as previously discussed in Section 2.7.4, p. 39. There is no specific biomarker for total fat intake in the short- or long-term. Specific biomarkers, however, have been suggested for estimating short- or long-term fatty acid intakes. Long-term intake (ie two to three years) of linoleic acid is reflected in adipose tissue using fat aspirate samples, whereas short-term intakes (six to eight weeks) are better reflected by changes in erythrocyte membranes (Kohlmeier 1991, 15-16; Feunekes et al. 1993). Studies of fatty acid profiles in plasma have failed to show strong positive correlations with FFQs (Hunter 1990, 193). A significant correlation (r = 0.5) between a FFQ and a subcutaneous fat aspirate was reported for intakes of polyunsaturated fatty acid (PUFA) in 119 Boston men, aged 40-75 years (Hunter et al. 1992). In this study, analysis of subcutaneous fat aspirate as a biomarker for monounsaturated (MUFA) and saturated fatty acid (SFA) intakes showed a poor association with intake. Correlations of 0.22 and 0.16 were reported for MUFA and SFA, respectively. Subcutaneous fat aspirate, although invasive, may be a useful marker of changes in consumption of PUFA (but not MUFA or SFA) over two to three years, for measuring compliance with a long-term dietary intervention program.

a) Australian studies of FFQs that measure 'usual' diet

The first comprehensive Australian FFQ was developed by the CSIRO Division of Human Nutrition in Adelaide and measured most nutrients in the 'usual' diet over 12 months (Rohan et al. 1987) (see Table 2.7, p. 43). This questionnaire listed 179 foods using standard portion sizes. Although reproducibility of the FFQ showed a moderate test-retest association based on cross-classification and correlation (Rohan et al. 1987), its criterion validity was not assessed. The original CSIRO FFQ containing 179 food items was used to measure changes in energy, fat and fibre intakes in 683 hospital staff with elevated cholesterol in Sydney, NSW in a work site cholesterol screening and dietary intervention program (Barratt et al. 1994). Subsequent modified versions have been used to measure 'usual' diet in an elderly population (Horwath and Worsley 1990) and in a random population of 207 Victorian adults (Wheeler et al. 1995), and an adapted version has been used to measure fat intake behaviour over a week in a survey of 124 selected adults in Newcastle, NSW (Dobson et al. 1993). Horwath and Worsley (1990) showed good agreement for individual food intakes (ie kappa statistic for all foods greater than that expected due to chance) from the modified CSIRO FFQ in 200 elderly South Australians (70+ years) against 4 weeks of direct observation of domestic food stores. This method measures the types of foods purchased rather than individual foods consumed and is rarely used as a criterion method.

Wheeler et al. (1995) modified the original CSIRO FFQ to produce a revised food list containing 300 food items and a meal-based questionnaire containing 179 food items. This was then validated by comparison with the median intake of nutrients and differences in the median intake obtained from four x three-day weighed diet records. Preliminary analysis indicated there was no difference in the medians between the two measures, which suggested a reasonably consistent dietary pattern. Correlations were not reported for fat intake between the methods.

Classification and comparison of nutrients from each measure in the Wheeler et al. (1995) study indicated that respondents frequently misreported food intake. This was the case for both the food list FFQ and the meal-based FFQ. Overall the proportion of individuals similarly classified was 50%. Fat showed poor agreement with 15% of respondents dissimilarly classified, which was the lowest value for all nutrients. This type of classification compared differences in data ranking and did not isolate the foods or meals which were difficult for respondents to quantify. Although options were provided for changing portion sizes, Wheeler et al. (1995) did not report on the number of respondents who modified the standard portion serve. The outcome of the study suggested that the meal-based FFQ, which was thought to improve the accuracy of recall, was no different to the list FFQ, and the use of either FFQ showed poor to moderate agreement at the individual level with the reference method, the food record. At the group level, however, both FFQs showed a moderate to good association with the reference measure, consistent with other studies (see Table 2.4, p. 22).

In summary, FFQs designed for and showing good validation for measuring usual intake in the long term are unlikely to be suitable for use in studies of short-term retrospective intakes (ie for use in clinics) without major modification in format and presentation. As they often contain a fairly long list of foods, analysis at the time of consultation would be tedious for patients and time-consuming for dietitians to analyse and evaluate.

2.8.3.3 Validity studies of long-term FFQs designed to measure fat intakes in the 'usual' diet

Only five FFQs directed at fat intake (and a few other nutrients) in the 'usual' diet had been published internationally by 1993 (Block et al. 1989; Pietinen et al. 1988b; van Assema et al. 1992; Kemppainen et al. 1993; Feunekes et al. 1993) (see Table 2.3, p. 20). These studies used diet records as a criterion measure for validation but only the Block and the Pietinen studies used repeated diet records.

Block et al. (1989) found a correlation of 0.58 between a 13-item FFQ designed as a rapid screening tool for identifying individuals with a high (or low) fat intake and the mean fat intakes of three repeated four-day diet records in 101 US women, 45 years and older. This correlation, however, may be artificially high because subjects chosen for this validation were already involved in another study in which a similar FFQ of 100 items was filled out during the same study period. This apparent training effect may have introduced considerable subject bias in their responses to the 13-item questionnaire because of similarities in questions. Other measures of agreement were not reported in this study.

In another study, the intake of fat and six other nutrients was measured using a FFQ of 44 food items in a qualitative format administered to 187 randomly selected elderly Finnish men (Pietinen et al. 1988b) (see Table 2.3, p. 21). The foods listed on the FFQ were derived from national consumption patterns of the nutrients of interest in the survey. Twelve food records (the criterion measure) were collected at two-day intervals for six months. In summary, the Pietinen FFQ of 44 items showed moderate association with the criterion method based on a number of agreement methods (see Table 2.3, p. 21). The correlation coefficients for total fat between the methods were of the same order as those reported in other studies. Cross-classification for fat into the same and adjacent quintiles showed 68% agreement between the methods, a reasonable agreement. Quintile analysis, in combination with the correlation coefficients, demonstrated that on the individual level, the FFQ was acceptable for measuring fat intake but had a limited ability to measure vitamin A and selenium. It may not be possible to produce a short FFQ that measures so many nutrients.

Validation studies by van Assema et al. (1992) and Kemppainen et al. (1993) were performed on FFQs designed to identify consumption of foods that were major sources of fat in the Netherlands and Finland, respectively. van Assema et al. (1992) reported a Pearson correlation of 0.59 for fat intake between a seven-day food record (using household measures) and a 25-item semi-quantitative FFQ for fat consumption. The FFQ was administered by telephone to 52 randomly selected respondents (29 male and 23 female). A correlation of 0.71 was calculated for the test-retest reproducibility of the same questionnaire on 639 adults (333 male and 306 female) when repeated on a similar day and time to the first measurement (see Table 2.7, p. 43). Some questions related to foods consumed in the previous six months, which was the reference period, but most questions were based on weekly or daily consumption patterns of frequently consumed fatty foods. The criterion method (a seven-day FR) was administered within one week to one month after administration of the FFQ. The methods used in collecting the food records were rigorous, despite the use of household measures. In this way, errors in measurement and interpretation of portion serves were minimised. Although a gross misclassification of 15.4% was reported for fat (ie eight out of 52 people were classified in the opposite tertiles between each method), no information was provided about exact agreement. The authors suggested that this FFQ was a useful instrument for measuring absolute fat intake in individuals and in assessing the percentage of energy as fat. As this FFQ did not measure energy directly, the authors' recommendation to extrapolate percent energy from fat from the Dutch National Food Consumption Survey data should be viewed with caution.

Another validation study of a 21-food item FFQ (short questionnaire) and a qualitative fat index (using four foods as an index measure) to estimate fat and fatty acid intakes in 82 Finnish adults suggested a strong agreement with the criterion method, a three-day food record (Kemppainen et al. 1993) (see Table 2.3, p. 21). Subjects in this study were selected based on a high serum cholesterol (>8 mmol/L), from a larger randomised survey of 872 subjects, and had a mean age of 42 years (age range = 16-71 years). Exact classification of responses between the two dietary methods for fat intake based on tertile classification was 53%, which may be a reflection of good agreement between the methods, consistency of fat intake, or the less discriminating classification (ie tertiles rather than quartiles or quintiles). Other studies using comparisons based on quartile classifications (Willett et al. 1985) or quintile classifications (Nes et al. 1993) have shown lower values than this for exact classification of total fat intakes. Nes and colleagues, for example, found that only 32% of subjects were exactly classified for total fat but this may be a reflection of memory bias as the age range of the subjects was 67-79 years, a much older age group than in the Kemppainen study.

The Kemppainen study of 1993 was one of the few studies that examined the mean differences and SD of differences between methods. Although the differences between means were small, the SD of the differences was large. For this reason, the authors suggested that the fat use questionnaire should be used for classifying individuals rather than for absolute measures of intake.

Feunekes and colleagues (1993) derived their semi-quantitative FFQ of 104 food items, which measured total fat, fatty acids and cholesterol, by multiple regression analysis of the Dutch National Food Consumption Survey data 1987-1988. The FFQ was then extended with nine more food items to account for 90% of the energy intake for the particular survey sample in the study. This meant that fat intake could be assessed relative to energy intake, an important determinant recommended in epidemiological studies (Willett 1990, 245-271). This FFQ is one of the few that used a fairly large randomised population (n = 191) and was validated against a number of criterion methods. Validation criteria included a diet history and two biomarkers for linoleic acid. Fatty acid composition of adipose tissue, one of the biomarkers used, however, is related to fat intake in the past two or three years and may be inappropriate as a criterion measure for a study estimating eight weeks of dietary intake. The FFQ was administered twice, before and after the collection of the dietary history, to check the effects of bias from an order effect. Overestimation on the FFQ was apparent for most nutrients when the diet history was administered before the FFQ. At a group level the authors concluded that agreement between nutrient values from the FFQ and the two criterion methods (the biomarkers, despite showing low correlation, and the diet history) was satisfactory to classify individuals according to fat intake. At the individual level, only 1.6% of subjects were grossly misclassified for total fat intake between the dietary history and FFQs using quintile classification. Mean differences in the intakes between the methods were low and gross misclassification by the FFQ for other fatty acids was seen. The authors suggested that the FFQ assessed the relative magnitude of the intake of individuals acceptably and that it could be used by dietitians to form the basis of dietary advice or to check dietary compliance. The major fault in the design of this study, however, is that both a diet history and a FFQ are reliant on memory, thereby introducing similar errors or bias. This could artificially inflate the measures of agreement. Therefore, use of this FFQ as an absolute measure of fat intake in individuals could be unreliable.

Usually higher measures of agreement are seen for FFQs that measure calcium rather than fat, which is suggestive of the consistency of calcium intake and the lower number of foods containing calcium than fat in the 'usual' diet (see Table 2.3). Correlations between methods for calcium ranged from 0.76 (Nelson et al. 1988) to 0.81 (Angus et al. 1989) (see Table 2.3, p. 22). The consumption of dairy foods, the major supplier of calcium in the diet, is considered fairly consistent (Strohmeyer et al. 1984), further reinforcing the assumption that a low variability of intakes of a nutrient is associated with good correlations between dietary methods.

In summary, based on the afore-mentioned studies, FFQs that are designed to measure fat (and/or other lipids) appear better suited for measuring 'usual' intakes of fat than FFQs designed to measure many nutrients. Comparative measures based on estimations generally showed good agreement for individual fat intake between the methods. However, these FFQs would not be appropriate or necessarily applicable for use in Australia and would need to be modified for Australian foods and revalidated in Australian subjects.

2.8.3.4 Validity studies of short-term FFQs

a) FFQs that measure most nutrients

Two FFQs, designed to measure most nutrients in the 'usual' diet, have been used for measuring short-term intakes (Krall and Dwyer 1987; Eck et al. 1991).

Krall and Dwyer (1987) used their FFQ for another purpose. Their major objective was to determine sources of error in short-term recall of foods eaten although comparative measures of agreement between the methods were undertaken for a number of nutrients. They compared their FFQ to a six-day diet record for 19 patients in a metabolic ward where all foods and food wastes were precisely weighed. The results showed that their FFQ underestimated nutrient intakes mainly because of omission of a number of food items on a FFQ that was designed for measuring many nutrients.

Eck and colleagues (1991) adapted Willett's one-year FFQ (Willett et al. 1987) to measure intake of most nutrients in the previous week in 41 college students. Additional foods were added on an ad hoc basis based on the authors' perception of foods consumed by this population. The criterion measure was three 24-hour recalls, which unfortunately introduce the same sources of error as the FFQ. Correlation analysis and cross-classification measures showed moderate to good agreement for most nutrients except protein. Since this research design can artificially increase the agreement between methods, especially on such a short-term basis, these results are open to doubt.

It is difficult to draw conclusions about the direct use of these FFQs for short-term retrospective assessment. Clearly, because of the wide variability in nutrient intake, short-term assessment of intakes of a wide variety of nutrients should be interpreted cautiously.

b) FFQs that measure one or a few nutrients

Most FFQs used for short-term assessment of dietary intake are designed to measure only one or a few nutrients rather than a wide range of nutrients, or have been adapted from previously validated FFQs (see Table 2.4, p. 22). Millar and Beard (1988) developed a sodium checklist for the Canberra Blood Pressure Trial in 1983 to measure sodium intake retrospectively over three days. The design, format and implementation of the sodium checklist formed the basis of the development of the fat checklist for this thesis, as discussed previously in Section 1.2.3, p. 2. The mean 'score' for sodium from the checklist was validated against a biomarker, 24-hour urinary excretion of sodium, collected on the day after the dietary data collection. Based on a correlation of 0.7 between the measures, the researchers concluded that the FFQ was a valid measure of dietary behaviour that was designed to measure sodium in salty foods and other sources (eg pharmaceuticals and beverages). This correlation is unusually high for a biomarker but may be biased for a number of reasons. It may be linked, as suggested by the authors, to manipulation or modification of the diet by a number of subjects just prior to the urine collection. A repeat study published in 1992 by the authors did not confirm the earlier findings (Beard et al. 1992).

c) FFQs that measure fat

Few validity studies have been conducted on FFQs that measure mainly fat intake in the short-term (Kristal et al. 1990a, 1990b, 1990c; Curtis et al. 1992; Dobson et al. 1993; Feunekes et al. 1993) (see Table 2.4, p. 22). Measures of agreement between such FFQs and criterion methods appear no better than those designed to measure 'usual' diet over short periods (< 4 weeks). These results are unexpected as a list of more specific foods or behaviours relating to fat modification on FFQs should elicit more detailed or accurate responses, especially in short-term FFQs. This appears not to be the case and could be due to many confounding factors such as subject bias, poor quantification of foods, poor recall and real dietary change. Each study has a different design objective and sample size and can only be interpreted relative to these characteristics.

d) Characteristics of short-term FFQs designed to measure fat intake

Kristal et al. (1990a) validated a 46-item FFQ designed to measure the previous week's intake of fat in a selected sample of 96 US women aged 45-59 years. They compared the mean fat values derived from the FFQ with the mean values from three criterion measures, which were:

• a modified version of Block's FFQ (Block et al. 1986);

• the mean of two x four-day diet records collected at week 1 and week 13; and

• average values of the above combined.

The rationale for combining measures of dietary intakes was based on classical measurement theory, which states that several measures are more reliable than any measure used singly (Kristal et al. 1990a). Although correlations of the FFQ with each criterion method taken separately were similar, this does not provide a justification for taking an average of two dissimilar methods. None of the methods depicts the actual diet eaten accurately, so combining two dissimilar methods may decrease accuracy rather than increase it as the authors suggest, and no estimation of bias was reported. Therefore, this study is difficult to interpret and firm conclusions cannot be drawn. The short, 46-item FFQ developed by Kristal et al. (1990a) was administered at six weeks, ie around the mid-point between the two four-day diet records. This may have biased reporting of the second four-day diet record or been biased by the first four-day record. Comparisons of the two diet records were not reported.

In another study using the same population of 96 women, Kristal et al. (1990b) designed a different 28-item FFQ to measure commonly consumed fat-modified foods (eg reduced-fat, low-fat) with the objective of measuring fat intake behaviour over a week, using two four-day diet records as the criterion method. Significant correlations for each of the five scales of fat behaviour (0.35-0.68) with the percent fat from the criterion method suggested acceptable strength of association between the methods.

Around the same time, Kristal et al. (1990c) developed another new dietary assessment instrument, the Food Behaviour Checklist, to assess whether people were consuming low fat and high fibre diets in response to a community intervention program. The 19-item qualitative checklist was validated against a 24-hour recall (see Table 2.4, p. 22) in the same 96 subjects who were used in the previous studies. Numerous measures of agreement were used including the 95% CI for differences in responses. Observed correct agreement was under 85% for three out of 18 items, while the kappa statistic for each item was greater than 0.6. Based on these ranking measures, the authors concluded that the previous day's food intake behaviour was in good agreement with the Food Behaviour Checklist.
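The relationship between observed agreement and the kappa statistic for a single checklist item can be sketched in Python as follows. Kappa is the observed agreement corrected for the agreement expected by chance, ie (observed - expected) / (1 - expected). The function name and the yes/no responses (coded 1 and 0) are hypothetical, and the sketch assumes each item is scored as a simple category by both methods; it is not a reconstruction of the analysis in the cited study.

```python
def cohen_kappa(method_a, method_b):
    """Return (observed agreement, kappa) for paired categorical responses."""
    if len(method_a) != len(method_b):
        raise ValueError("responses must be paired")
    n = len(method_a)
    observed = sum(a == b for a, b in zip(method_a, method_b)) / n
    # Chance agreement: product of the marginal proportions per category.
    categories = set(method_a) | set(method_b)
    expected = sum(
        (method_a.count(c) / n) * (method_b.count(c) / n) for c in categories
    )
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical yes/no (1/0) responses for one item in ten subjects.
checklist = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
recall = [1, 1, 0, 0, 0, 1, 0, 1, 1, 0]

observed, kappa = cohen_kappa(checklist, recall)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

In this hypothetical example, half of the agreement would be expected by chance alone, so the kappa value is noticeably lower than the observed agreement. This is why the two measures are usually reported together.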

Although subjects in the Kristal studies were involved in three studies at the same time and received no information about the outcome of their responses until after all data were collected, there may have been a learning effect on responses. Given the characteristics of the subjects and underlying bias, there are limits to generalising the results of all three studies (Kristal et al. 1990a, 1990b, 1990c) to more representative population samples. Further investigation of the validity of their method is needed in different populations.

Curtis et al. (1992) developed a FFQ to assess fat, saturated fat, cholesterol and energy consumed in the previous week and validated it against a four-day diet record. It is unclear how energy was calculated in a questionnaire that appeared to be derived from food composition tables and included only foods categorised as containing a specified amount of fat, cholesterol and saturated fat. The statistical tests used were limited to hypothesis testing so no conclusions about the degree of agreement between the FFQ and criterion method can be made.

The only published Australian short-term FFQ for fat is that of Dobson et al. (1993). They developed a 17-item questionnaire relating to attitude, knowledge and fat intake behaviour over the past week and found a correlation of 0.55 (for percent total fat) in 124 Australian adults when validated against another well-established but only partially validated FFQ (Rohan et al. 1987). The Dobson et al. (1993) questionnaire was a modification of a 10-item fat habits questionnaire developed earlier by Kinlay and colleagues (1991), which assessed 'usual' diet based on food choices that accounted for 50% of the average fat intake of Australian adults. The fat habits questionnaire was validated against the same CSIRO FFQ used by Dobson et al. (1993). Agreement measures based on a correlation of around 0.4 for the two groups measured (105 children and 202 adults) in a cross-sectional validation study were poor. However, the score calculated from the fat habits questionnaire showed similar relative changes in percent fat intake to the CSIRO FFQ in a sub-set of 39 people when used to measure changes in fat intake over 12 months.

Validation of one FFQ against another FFQ again introduces the same sort of error bias (Horwath 1990; Burema et al. 1988, 174), so the use of a FFQ as a criterion measure should be treated with caution. Because of this limitation, the Dobson and Kinlay questionnaires are suited for use as an assessment instrument to rank Australians broadly according to fat intake behaviour or as an education or monitoring instrument in a clinical setting. Their use in the evaluation of intervention programs has not yet been determined.

2.9 Summary of the literature

Analysis of the literature demonstrated that a reduction in total fat intake in populations and individuals is a major intervention strategy to help decrease the incidence and prevalence of many diet-related diseases in western countries. This is reflected in national nutrition policies in Australia and overseas. However, dietary assessment tools are needed to implement and monitor change in fat intake in health promotion and intervention programs for counselling individuals in clinics. Most validated FFQs have focused on ranking nutrients for epidemiological purposes and are not necessarily suitable for, or applicable to, individuals for determining compliance with intervention programs, or the outcomes of intervention programs.

Traditional methods of measuring dietary intake usually include a diet history, often accompanied by a food record. These procedures are time-consuming and provide limited feedback or education to the client at the time of counselling. New methods are needed to replace these time-consuming methods for assessing specific foods or nutrients in the diet (eg iron, calcium, fat, vegetable intakes). FFQs, although infrequently validated for use in clinics, are becoming increasingly popular as adjuncts to educating, documenting and modifying dietary behaviour of individuals. FFQs have advantages compared with traditional methods because they are quick to administer, provide rapid feedback to the client and allow an opportunity for self-monitoring and self-help.

However, collection of reliable and accurate food consumption of individuals and groups is always difficult because of the bias inherent in measuring the dietary behaviour of free-living subjects and FFQs are no exception.

All measurement methods require validation. A consideration of the nutritional literature reveals that dietary methods have been validated only patchily, if at all. The techniques used for validating FFQs against established measures have been dominated, particularly in the epidemiological studies of the 1980s and early 1990s, by traditional hypothesis testing (eg comparison of correlations, presentation of P values) and some cross-classification methods. At that time, few dietary validation studies were using estimation techniques used in the clinical/medical literature for measuring agreement between methods. These included differences between means for each method and determination of measurement error of the new method. These types of measures are particularly applicable to dietary validation studies where one or a few nutrients are assessed and especially for assessment of dietary intakes of individuals. Several nutrition journals from 1993 onwards have recommended inclusion of cross-classification techniques and more emphasis on reporting measurement bias in dietary work.

Few validity studies of FFQs published have assessed measurement bias although this is crucial where FFQs are used for individuals in a clinic. However, it is not so important in studies where group intakes are assessed, although some researchers have suggested that misreporting serve size on FFQs is associated with systematic bias.

It is clear from the recent literature that more attention is being paid to factors affecting measurement bias, especially those that reduce respondent errors. This is evident from, for example, the use of food models, food photographs, options for changing serve sizes, protocols for grouping foods and attention to the order of administration of the FFQ (new method) and the criterion method. Despite the controversy about the effect of including serve sizes on validity (see Section 2.4.4.3, p. 17-18, 23), many foods that are major suppliers of fat in western diets (eg butter/margarine, oil, dairy products and meats) are difficult to quantify. These foods warrant testing in the design of any new FFQ measuring fat intakes. Most researchers have identified respondent problems in recall in retrospective methods but little work has been undertaken to determine whether the types of food misrecalled are similar in an Australian population. It would be important to identify sources of error from these foods so as to improve the design and ultimately the accuracy of a FFQ in Australian populations.

FFQs measuring short-term fat intake have performed well when compared with the more 'accurate' criterion methods (see Table 2.4, p. 22). Compared with other studies measuring single or a combination of a few nutrients, there appears to be some consistency and similarity in measures of agreement for fat intake, particularly in the recent studies, despite the differences in research design.

Several western countries have developed quantitative and qualitative FFQs specifically for measuring fat intake that are specific to the food supply and eating habits of the population (Van Assema et al. 1992; Kemppainen et al. 1993; Kristal et al. 1990a, 1990b). These are inappropriate for use in Australia because they do not reflect the food supply or the food choices of Australians. A well-recognised, semi-quantitative FFQ published in Australia that measures 'usual' diet (Baghurst and Record 1984) has only been partially validated (Rohan et al. 1987). A modification of this FFQ has since been published that measures 'usual' diet (Wheeler et al. 1995). The only published short-term FFQ that measures fat in Australian adults was validated against the FFQ developed by Baghurst and Record in 1984 (Dobson et al. 1993). However, validation of a FFQ against another FFQ introduces the same sort of error bias which can inflate the agreement measures.

At the time of commencement of this study in 1992, no simple method or instrument had been reported for use in an Australian clinic or intervention program for measuring short-term fat intake using a quantitative method. Due to a need for educational and monitoring resources in a local clinic, it was therefore decided to develop a fat checklist to meet these needs that could also be extended to assessing fat intake in subjects outside this clinic environment, eg to measure fat intake in other populations such as athletes. Given the poor Australian published record on dietary validation, it was decided to design and develop a fat checklist for local use and validate it in a range of subjects.


58 Chapter 3:Methods

CHAPTER 3

METHODS

3.1 The development of the fat checklist

A fat checklist (FC) was developed in 1992 to estimate fat intake over three days retrospectively. The design, format, number and selection of foods, serve size and frequency options, and the instructions for use of the FC are described in this section.

3.1.1 Design of the FC

The FC was designed to assess food intakes retrospectively over the previous 72 hours (three days) in the format of a food frequency questionnaire. The format was similar to a sodium intake checklist then in use at the Cardiovascular Health Risk Management Clinic (CHRMC) in Canberra (Millar and Beard 1988). The FC was intended for clients at the CHRMC who participated in a weight reduction or cholesterol-lowering program where health risk factors were tested and monitored (ie weekly to fortnightly, then monthly for a total of four months). It was designed:

• to provide semi-quantitative estimates of fat intake (g fat/d);

• to identify foods that are major sources of fat;

• to measure the frequency of consumption of foods that are major sources of fat;

• to provide information and feedback about food consumption to clients; and

• to monitor the effectiveness of the weight-reduction and cholesterol-lowering programs.

3.1.2 Selection of foods for inclusion in the FC

Three sources of information were used to select the foods for inclusion in the list of foods on the FC. These were:

• food and fat consumption data from Australian population studies of adults (Cashel et al. 1986, 33; English et al. 1987; Baghurst et al. 1988);

• food composition data from the Australian food composition tables (Cashel et al. 1989); and

• knowledge of eating habits of adolescents, young adults and patients at the CHRMC acquired over the previous five years.

This approach was taken for three reasons. Firstly, it included a large range of foods that contributed to fat intakes as reported in different age groups in the Australian population. Secondly, foods with a relatively high fat content that were infrequently consumed were also included to enhance accuracy. Thirdly, it ensured that the food habits of the clinic patients would be adequately represented.

Food consumption data used as the basis for selection of foods for the FC were derived from the National Dietary Survey of 6,255 Australian adults, 25-64 years, living in capital cities, conducted in 1983 (Cashel et al. 1986; English et al. 1987) and the Victorian Nutrition Survey (Baghurst et al. 1988; 1989) conducted in 1985 on around 2000 adults aged 18 years and over and living in the state of Victoria. Both studies used populations that were randomly selected. These surveys reported the relative contributions of food types and food groups to mean fat intake of the populations tested. The National Dietary Survey data did not specify the individual food items (eg meat cut, type of cheese) that contributed to fat intakes. Information about individual foods that contributed to fat intake was obtained directly from the researchers of the Victorian Nutrition Survey (K. Baghurst, 1992, personal communication). These data formed the foundation for the food list.

Information about fat levels of the foods was obtained from Australian food composition data (Cashel et al. 1989; Lewis & Holt 1992). Individual or composite foods with a relatively low fat content making a substantial contribution to total fat intake in the Australian population (eg milk) were included.

Known food choices and quantities of foods consumed by patients attending the clinic were prioritised for inclusion on the FC. Patients attending the CHRMC were predominantly middle-aged to elderly with an existing heart condition and/or overweight. Many had already reduced energy or fat intake before undertaking the intervention program. Frequency options and serve sizes were designed specifically to accommodate serve sizes for this group. A wider range of foods and frequency options were also included to accommodate the known food choices of adolescents and younger adults.

3.1.3 Number and order of food items on the FC

The pilot FC as shown in Table 3.1 (p. 61-62) contained 43 questions, 39 of which were semi-quantitative while the rest were qualitative and related to the type of fat used as a fat spread or in cooking. The food list was ordered in food groups. Meats were listed first followed by milk and milk products, convenience and takeaway foods, then snack foods. The final question was open-ended for subjects to report other food items they perceived to contain fat that were not already listed on the FC.

3.1.4 Food grouping

Within the FC, foods were combined in food groups similar to the groups used in the 1983 National Dietary Survey on the basis of similar fat content (Cashel et al. 1986). Some foods of similar type and fat content were grouped into single questions to reduce respondent burden and increase compliance.

Only a few different food types were grouped because of potential loss of accuracy. These included some lean and fatty meat cuts, and takeaways. It was anticipated that a longer list of foods would be more desirable as suggested by Willett et al. (1987, 1988) and that an assessment of these groupings would be included as part of the later validation testing.

Table 3.1 Pilot fat checklist

FAT INTAKE CHECKLIST (pilot study)

Office use only: ID Number: Date Completed: Form Number:

Instructions

1. Please circle the number of times you have eaten the following foods over the PREVIOUS THREE DAYS, not including today.
2. Trimmed means the obvious fat or chicken skin has been cut off. Untrimmed means the fat and chicken skin have been eaten.
3. It is important that you fill in the questionnaire as accurately as possible, indicating every time you have eaten any of the foods mentioned. Make sure you answer every question.
4. Weights of food are based on cooked weight, (weight after cooking), as purchased.
5. The column on the far right indicates the grams of fat for one standard amount. Do not write in this column.
6. You are free to comment on your food type and standard amount if they differ substantially from the description given (eg. you may have had ½ standard amount). Just write a brief note next to the item.

EXAMPLE ONLY

FOOD | STANDARD AMOUNT | NUMBER OF TIMES EATEN (circle the total number) | FAT PER SERVE (Office use only)
Untrimmed steak, pork, lamb (not chops); untrimmed chicken (roasted, grilled, incl. BBQ) | 1 small serve (120g); 2 drumsticks or 1 wing plus 1 thigh or 3-4 slices roast meat | 0 1 2 3 (4) 5 or more | 22

Over the three days this person had eaten one large steak, 2 fried drumsticks and 3-4 slices of roast pork so the total is 4 (ie. counting 2 services for the steak, 1 for the chicken and 1 for the pork) and the fat intake is 22 x 4 = 88 grams of fat.

START THE QUESTIONNAIRE HERE

FOOD | STANDARD AMOUNT | NUMBER OF TIMES EATEN (circle the total number) | FAT PER SERVE (Office use only)
1. Liver, kidney, brains, tripe, tongue | Breakfast serve (100g) | 0 1 2 3 4 5 or more | 8
2. Untrimmed steak, pork, lamb (not chops); untrimmed chicken (roasted, grilled, incl. BBQ) | 1 small (120g); 2 drumsticks or 1 wing plus 1 thigh or 3-4 slices roast meat | 0 1 2 3 4 5 or more | 22
3. Trimmed steak, pork, lamb (not chops); trimmed chicken | 1 small (120g); 2 drumsticks or 1 thigh or 3-4 slices roast meat | 0 1 2 3 4 5 or more | 12
4. Untrimmed pork chops, lamb chops | 2 chops (120g) or 1 large pork chop | 0 1 2 3 4 5 or more | 33
5. Trimmed pork chops, lamb chops | 2 chops (140g) or 1 large pork chop | 0 1 2 3 4 5 or more | 17
6. Mincemeat in the form of hamburger, rissole, bolognaise, lasagne, etc | 1 pattie (60g) or ½ cup mince | 0 1 2 3 4 5 or more | 8
7. Meat (trimmed) casserole or meat stew, eg. chinese meat dishes/curry/goulash (no veg.) | 1 cup | 0 1 2 3 4 5 or more | 13
8. Sliced meat: ham, beef, lamb, chicken (trimmed) - remember sandwiches | 2 slices | 0 1 2 3 4 5 or more | 6
9. Gravy made from meat dripping (not made from Gravox) | 1 tablespoon | 0 1 2 3 4 5 or more | 3
10. Sliced meat: fatty cuts, eg. bacon, salami, luncheon meats | 2 rashers / 2 slices | 0 1 2 3 4 5 or more | 25
11. Crumbed fried chicken (eg. Kentucky Fried) | 2 drumsticks or 1 wing plus 1 thigh | 0 1 2 3 4 5 or more | 29
12. Sausages, frankfurts | 2 thin or 1 thick | 0 1 2 3 4 5 or more | 22
13. Meat pie, pastie, sausage roll | 1 pie or 2 small rolls | 0 1 2 3 4 5 or more | 24
14. Fish cakes, fish fingers; crumbed, battered, oven fried | 1 medium piece (150g) or 3 small fish cakes | 0 1 2 3 4 5 or more | 27
15. Undrained canned fish in oil, eg. tuna, sardines | 2 tablespoons or 3-4 sardines | 0 1 2 3 4 5 or more | 14
16. Savoury pie with pastry made with cream, eg. quiche | 1 individual pie or 1 large slice (180g) | 0 1 2 3 4 5 or more | 50
17. Savoury pie with pastry made without cream, eg. spinach pie, quiche | 1 individual pie or 1 large slice (180g) | 0 1 2 3 4 5 or more | 37
18. Ordinary and thickened cream, sour cream (remember soups, sauces, mornays) | 1 tablespoon | 0 1 2 3 4 5 or more | 7.5
19. Light cream, light sour cream (remember soups, sauces, mornays) | 1 tablespoon | 0 1 2 3 4 5 or more | 4
20. Full cream yoghurt, plain or flavoured | 1 small carton (200g) | 0 1 2 3 4 5 or more | 7
21. Ice cream (not special diet variety) | 2 scoops (120g) | 0 1 2 3 4 5 or more | 9
22. Full cream milk, flavoured milk, milkshake | 1 medium glass (200ml) | 0 1 2 3 4 5 or more | 7.5
23. Low fat milk, Hi-Lo (not Shape or skim) | 1 medium glass (200ml) | 0 1 2 3 4 5 or more | 4
24. Cheese: cheddar, edam, etc. | 1 cube matchbox size (30g) | 0 1 2 3 4 5 or more | 10
25. Egg | 1 egg | 0 1 2 3 4 5 or more | 5
26. Nuts: any type including peanut butter | 9-10 nuts (approx 20g), 1 tablespoon | 0 1 2 3 4 5 or more | 11
27. Rich cake, cheesecake, black forest cake | 1 average shop serve (120g) | 0 1 2 3 4 5 or more | 40
28. Light cake: plain sponge, scone, pikelet, pancake | 1 small piece, 2 scones, 3 pikelets, 1 pancake | 0 1 2 3 4 5 or more | 9
29. Pastry, croissants, apple strudel, sweet pie | 2 croissants, 1 average piece (150g) | 0 1 2 3 4 5 or more | 30
30. Biscuits: sweet/chocolate coated/shortbread/cream filled/cheese crackers | 2 biscuits, 4-5 crackers | 0 1 2 3 4 5 or more | 8
31. Toasted muesli, muesli flakes | 2 tablespoons | 0 1 2 3 4 5 or more | 6
32. Fried [food name illegible] | ½ cup | 0 1 2 3 4 5 or more | 8
33. [food name illegible] | 1/3 of medium size | 0 1 2 3 4 5 or more | 17
34. Hot chips | 1 medium carton | 0 1 2 3 4 5 or more | 30
35. Chicko roll, dim sim, spring roll | 2 small or 1 large | 0 1 2 3 4 5 or more | 15
36. Potato chips, Cheezels, corn chips, health food bar | 25g bag, 40g bar | 0 1 2 3 4 5 or more | 8
37. Chocolate bar (Mars, Chokito, etc), doughnut | 50g, 2 doughnuts | 0 1 2 3 4 5 or more | 16

ADDED FAT

Answer the following questions in the space provided.

38. On average, how many teaspoons of butter or margarine do you have per day? (remember sandwiches, toast, crackers and cooking) ___ teaspoons [4 x 3]

39. How much mayonnaise, salad dressing or oil have you consumed over the previous three days? Do not include low fat oil or no oil dressing. ___ teaspoons [4]

40. When meat or vegetables are fried or roasted, what are they cooked in most often? (circle the answer that best fits)

• Butter/dripping [beef or lamb]
• Lard/copha
• Cooking or table margarine
• Polyunsaturated margarine
• Polyunsaturated vegetable oils
• Not polyunsaturated oils
• Cooked in own juices
• I never fry or roast meat
• Other (please specify)

41. How much fat have you used over the previous three days to fry or roast vegetables? (circle the answer that best fits)

• Less than 1 teaspoon
• 1 teaspoon
• 2 teaspoons
• 1 dessertspoon
• 1 tablespoon
• More than one tablespoon (how many?)

42. Do you add butter or margarine to vegetables or meat after cooking (circle one answer only)

always/sometimes/often/rarely/never

43. Please list below whether you have eaten any other foods over the previous three days that may contain fat and were not mentioned in this questionnaire. Use household measures (eg. cup, teaspoon, small serve) to describe the amount of food eaten. For example, coconut, coconut cream, pate, duck, goose, ghee, puddings, cream substitute, non-dairy coffee whitener.

Your approximate total fat intake in grams is ___

Thank you for your co-operation

Vicki Deakin
Consultant Nutritionist

Table 3.2 provides a description of a group of foods of similar origin and fat content selected for Question 3 (pilot FC) where 'trimmed' steak, pork, lamb (excluding chops) and chicken were combined into a single question. The fat content ranged from 7.6 g fat/120 g cooked edible food to 17.3 g fat/120 g cooked edible food, and the mean fat content of these foods was around 12 g per 120 g of cooked food. The term 'trimmed' used in the FC is a contrived term for this questionnaire and designates meat cuts that range from lean to 50% fat-trimmed. This criterion was arbitrarily chosen. As the purpose of this FC was only to estimate fat intake semi-quantitatively, this range of fat-trimming was considered to be broad enough to account for the majority of consumer habits and perceptions regarding lean meat.

Table 3.2 Foods used for grouping 'trimmed' meat for Q3 on the pilot FC (steak, pork, lamb [not chops] and chicken) (g fat/120g cooked portion)

Food description | Code number on SODA* | Fat content (g fat/120g food)
Beef, boneless, unspecified cut, cooked, lean | 10013 | 7.6
Pork, medallion steak, cooked, lean | 12742 | 7.6
Lamb, boneless, cooked, lean | 11615 | 8.5
Pork, butterfly steak, cooked, 50% trimmed | 12721 | 14.0
Beef, boneless, unspecified cut, cooked, 50% trimmed | 10015 | 12.1
Lamb, boneless, unspecified cut, cooked, 50% trimmed | 11635 | 17.3
Chicken, boneless, baked, lean | 11445 | 9.1
Chicken, leg quarter, lean and skin, commercial** | 11451 | 16.0/130g AP chicken
Q3 (pilot study) = mean fat content (rounded up) | | 12g/serve

Source: Lewis and Holt, NUTTAB, 1991-92. *SODA = Systems-On-Line Dietary Analysis. ** g fat/130g chicken leg plus thigh; AP = as purchased, edible portion (66% or 86g chicken).
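The rounded value for Q3 can be checked with a short calculation. The sketch below assumes that the Q3 value is a simple unweighted mean of the eight tabulated fat contents and that the chicken leg quarter value (quoted per 130 g as purchased) is used as printed; the short food labels are abbreviations introduced here for illustration.

```python
import math

# Fat content (g fat per serve) of the eight foods listed in Table 3.2.
# Assumption: an unweighted mean is taken, with the chicken leg quarter
# value (per 130 g as purchased) used as printed in the table.
fat_per_serve = {
    "beef, lean": 7.6,
    "pork medallion, lean": 7.6,
    "lamb, lean": 8.5,
    "pork butterfly, 50% trimmed": 14.0,
    "beef, 50% trimmed": 12.1,
    "lamb, 50% trimmed": 17.3,
    "chicken, baked, lean": 9.1,
    "chicken leg quarter, lean and skin": 16.0,
}

mean_fat = sum(fat_per_serve.values()) / len(fat_per_serve)
q3_value = math.ceil(mean_fat)  # rounded up to the next whole gram

print(f"mean = {mean_fat:.1f} g fat/serve, rounded up = {q3_value} g fat/serve")
```

Under these assumptions the mean is about 11.5 g, which rounds up to the 12 g per serve printed on the FC.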

Similar procedures were used for grouping foods in other food categories. In some food combinations (ie Question 37 in the pilot FC), two dissimilar foods were grouped (ie chocolate bars and doughnuts) based on a similar fat content. As this FC was designed to estimate fat intake over the previous three days, the ability to discriminate each food separately was not expected to be a major issue.

3.1.5 Options for serve size information

Serve (or portion) sizes were included to assist subjects to quantify individual food items in the grouped food list. Serve sizes on the FC were presented in household measures or in specified amounts that were commonly recognised (eg scoops of ice cream, chicken thigh, medium (regular size) carton of chips). For example, Q2, untrimmed chicken (roasted, grilled or BBQ) included meat selections such as two chicken drumsticks or one wing and one thigh, foods that are easy to identify. The weights of foods that are difficult to quantify such as meat and fish were based on the average quantity of food consumed per eating occasion by 19-34 year old US adults which was around 120g of cooked meat, poultry or fish (Pao et al. 1975, 229). This age range and the corresponding serve sizes were chosen, instead of an older age group who may consume small food quantities (similar to that observed in the CHRMC patients), to broaden the application of the FC to reflect serve sizes for populations consuming larger food quantities. Australian data on serve sizes of foods commonly consumed by adults in the 1983 National Dietary Survey could not be used as they were not published at the time of the study.

3.1.6 Inclusion of fat content of foods

The mean fat content of each food item was given in a box to the right of the FC. These fat values were included on the FC because its main purpose was to assist subjects, involved in a longitudinal intervention program, to decrease fat intake. The option to self-monitor food and fat intake was therefore an important component of the use of the FC for patients in the program, and for staff counselling them.

3.1.7 Frequency response options

Subjects were required to recall the number of times they had consumed the foods over the previous three days. A multiple choice frequency response format provided for six options ranging from zero to five or more times that a standard portion was eaten in the previous three days excluding the day of data collection. This number of options for a frequency scale is considered optimal for dietary research (see Chapter 2, Section 2.4.4.5 p. 25).
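The scoring implied by the FC (number of standard serves multiplied by the fat per serve, summed over items) can be sketched as follows. This is an illustration only: the function name and data layout are hypothetical, only a handful of items are included, and scoring "5 or more" as exactly 5 is an assumption rather than a rule stated for the FC.

```python
# Subset of FC items: question number -> g fat per standard serve,
# using values printed on the pilot FC (Table 3.1).
FAT_PER_SERVE = {2: 22, 3: 12, 22: 7.5, 24: 10, 34: 30}


def fc_fat_intake(responses, days=3):
    """Return (total g fat over the recall period, mean g fat per day).

    responses maps question number -> number of standard serves circled.
    Assumption: the option "5 or more" is scored as exactly 5.
    """
    total = 0.0
    for question, times in responses.items():
        if question not in FAT_PER_SERVE:
            raise KeyError(f"no fat value for question {question}")
        if not 0 <= times <= 5:
            raise ValueError("frequency must be between 0 and 5")
        total += FAT_PER_SERVE[question] * times
    return total, total / days


# Worked example from the FC instructions: 4 serves of Q2 at 22 g fat each.
example_total, example_daily = fc_fat_intake({2: 4})
print(example_total)  # 88.0
```

Dividing the three-day total by three gives the g fat/d figure that the FC was designed to estimate.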

3.2 Study design and subjects

3.2.1 Study design

A pilot study of the pilot FC was conducted with 20 subjects in January 1992 to check for clarity of directions, understanding of the format, ease of use, and adequacy of the frequency options and serve sizes listed. The outcomes of the pilot study were used to develop a revised final FC.

The reproducibility and validity of the final FC were tested between April and June 1992. Reproducibility testing was conducted on university students (Group 1) three times during this period. Two different groups were used for the criterion and content validity study. These were a different group of university students (Group 2) and patients enrolled in the CHRMC in Canberra (Group 3). Subjects in the validity study completed a three-day food record (FR) (the criterion method) over three consecutive days, Sunday, Monday and Tuesday, then completed the FC on the next day, Wednesday. Food intake information was collected from individual volunteers in Group 3 only over a ten to twelve week period. Content validity of the FCs was measured in Group 2 and Group 3. The study design is shown in Figure 3.1.

Study | Date or interval
Pilot study on the FC (n = 20) | January
Reproducibility study on the final FC (n* = 150, Group 1) | FC1 in April, FC2 at 3 weeks, then FC3 at 2 months in June
Criterion validity study on the final FC (n* = 70, Group 2) | April: 1 x three-day weighed food record (Sunday, Monday and Tuesday), then FC1 (Wednesday, consecutive)
Criterion validity study on the final FC (n* = 70, Group 3) | April to mid-June: 1 x three-day estimated food record (Sunday, Monday and Tuesday), then FC1 (Wednesday, consecutive)

(n = number of subjects, n* = number targeted for recruitment, FC = fat checklist, FC1 = fat checklist, Test 1, FC2 = fat checklist, Test 2, FC3 = fat checklist, Test 3)

Figure 3.1 Study design of the pilot study, the reproducibility and validity studies of the fat checklist

This study design was a contrived situation and based on similar methodology to that used for the sodium checklist then used at the CHRMC. Completion of the FC immediately after collection of the FR introduces memory bias and is likely to inflate the agreement measures between the methods. The FC was nevertheless administered at that point so that both methods covered the same period of measurement. Methods to detect memory bias were incorporated into the analysis to address the effect on the agreement measures.

3.2.1.1 Selection of subjects for study of the pilot FC

Twenty subjects were recruited for the pilot study. Ten subjects were patients attending the CHRMC; the other ten subjects were second year university students who participated in regular hard exercise (ie football, soccer, and basketball). The protocol for recruitment of subjects is found in Section 3.3.1, p. 73.

3.2.1.2 Selection of subjects for the reproducibility and validity study of the final version of the FC

Three groups of subjects, based on the selection criteria in Table 3.3, were recruited for evaluating the performance of the pilot and final version of the FC. The selection of subjects was opportunistic.

Table 3.3 Criteria for selection of population groups for evaluation of the FC

Group | Number of subjects targeted for recruitment | Criteria for selection
Pilot study | 20 | 10 second year university students, 10 patients from the CHRMC
1 | 150 (sex distribution = na*) | University students studying a first year subject, Human Physiology. Variable fat intake was assumed from the extremes of low to high fat intakes.
2 | 70 (35 male and 35 female) | University students studying a third year subject in nutrition at university. Variable fat intake was assumed but overall fat intake assumed to be less than Group 1.
3 | 70 (35 male and 35 female) | Patients referred to a weight loss and cholesterol-lowering program at the CHRMC clinic in Canberra. Low fat intakes assumed even before counselling.

* na = not available, not collected

Group 1 was used for the reproducibility study and was not involved in the validity study. Group 2 and Group 3 were used for the criterion and content validity study.

Because of the opportunistic nature of the data collection procedure in Group 3, it was not possible to draw a random sample. One assumption was that the main group for the design of the FC, Group 3 (clinic patients), would comprise an older study group and may already have been consuming a low fat diet. This assumption was made because the criterion for entry into the intervention programs at CHRMC was a history of elevated serum cholesterol, cardiovascular disease or overweight. Group 3 subjects were recruited on arrival at the clinic and had no previous exposure to any other lifestyle intervention program at the CHRMC.

In the absence of matched controls to the clinic patients, university students were selected opportunistically to represent a population with presumed higher fat intake than the clinic patients. This allowed an opportunity to check the validity of the FC and its potential for use in a population with a wider range of energy and hence fat intake. University students are mostly young adults, predominantly aged between 20 to 30 years, and are likely to eat a greater quantity of food and fat than older adults. It was also assumed that Group 2 (nutrition students) would collect food records more accurately than Group 3.

In summary, because of limitations in access and resources, it was not possible to obtain and test a random sample for the study groups. Controls could not reasonably be found for Group 3 because of the ad hoc nature of recruitment. It was anticipated that the groups studied, although selected in an opportunistic manner, would be adequate to detect the upper and lower range of food frequency options and food choices listed on the FC.

Anthropometric data were collected for Group 2 and 3 only. Because of concern from the university ethics committee about using the last four digits of the student IDs without consent forms, as a means of matching and potentially identifying subjects in the reproducibility study, permission to also collect demographic or anthropometric information for Group 1 was not granted.

3.2.1.3 Determination of sample size

Cohen's sample size tables (Cohen 1988, 54-55) were used to determine the sample size needed to apply statistical tests (ie correlation and regression) used for assessing association and agreement between the FC and FR. Estimation of sample size relates to effect size, power and significance testing.

Effect size

Effect size is the difference or the strength of the association between two variables (Munro 1997, 78) and relates to the smallest effect that is of clinical significance or importance (Borestein et al. 1997, 5). The effect size statistic was calculated from the standard deviation of the population, confidence intervals, frequency differences or correlation coefficients from similar studies.

Power

A power of 0.8 was used in this study to determine sample size. The power of a test is the probability that a test of a specified sample size would detect, as statistically significant, a real difference of a given magnitude (Altman 1991, 455). The test also had to meet the requirement that the difference observed (ie fat intake) between the tests was clinically meaningful. By convention, in behavioural studies, for example, where food choice is measured, a power of 0.8 is used (Munro 1997, 78; Cohen 1988, 56). A power between 0.8 and 0.9 is also common in clinical studies (Altman 1991, 456).

Because no dietary studies comparing differences in fat intake using a FFQ and a FR have yet been published using the Bland and Altman technique (Bland and Altman 1986), it was not feasible to determine an expected SD for differences in fat intake to substitute in Altman's nomogram (Altman 1991, 456). Appendix 1 shows the application of Altman's nomogram to determining power and sample size for this study. Predicting an expected standard deviation of the differences in fat intake between the FC and FRs (a component of Altman's nomogram and equations) would have been subjective and speculative. For this reason, Cohen's tables for determining sample size were preferred for the agreement tests (Cohen 1988, 103). Altman's nomogram, however, was used after the completion of the study to determine the effect of the final sample size on the power of tests used to measure agreement between the FR and FC.

Significance

A difference between dietary methods was taken as statistically significant if the 95% confidence interval of the difference did not overlap zero, or if the P value for the effect or association was less than 0.05.
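The significance criterion above can be illustrated with a short sketch that computes the mean difference between paired FC and FR estimates, an approximate 95% confidence interval for that difference, and Bland and Altman style limits of agreement. The function name and the data are hypothetical, and the interval uses a normal approximation rather than the t distribution, so it will be somewhat too narrow for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def agreement_summary(fc, fr, level=0.95):
    """Summarise agreement between paired fat intakes (g/d) from two methods."""
    if len(fc) != len(fr) or len(fc) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [a - b for a, b in zip(fc, fr)]    # FC minus FR for each subject
    bias = mean(diffs)                         # mean difference between methods
    sd = stdev(diffs)                          # SD of the individual differences
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sd / sqrt(len(diffs))     # normal approximation (assumption)
    ci = (bias - half_width, bias + half_width)
    limits = (bias - z * sd, bias + z * sd)    # limits of agreement
    return {
        "bias": bias,
        "ci": ci,
        "limits_of_agreement": limits,
        "differs_from_zero": not (ci[0] <= 0 <= ci[1]),
    }


# Hypothetical paired intakes (g fat/d), invented for illustration only.
fc_values = [62, 75, 48, 90, 55, 70]
fr_values = [60, 80, 50, 85, 58, 66]
summary = agreement_summary(fc_values, fr_values)
print(summary)
```

In this invented example the interval for the mean difference spans zero, so the two methods would not be judged to differ on average, even though the limits of agreement show that individual differences can be several grams in either direction.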

3.2.1.4 Sample size estimates for correlation analysis

In correlation analysis, effect size is represented by the correlation coefficient (r). In relation to behavioural studies, Cohen (1988, 79-80) defines a small effect as r equal to 0.1, a medium effect as r equal to 0.3, and a large effect as r equal to 0.5. In validation studies of short-term fat intake, correlation coefficients between FFQs and the reference methods range from 0.53 (Curtis et al. 1992) to 0.57 (Kristal et al. 1990a) (see Table 2.4, p. 22). For reproducibility studies of FFQs that measure fat intake, correlations range from 0.34 (Thompson and Margetts 1993) to 0.79 (Kemppainen et al. 1993) (see Table 2.7, p. 43).

Appendix 1 shows estimated sample sizes (ie the number of paired observations needed) for several power levels using varying correlation coefficients.

For the reproducibility study, an effect size of 0.3 was selected because the smallest effect size reported in previously published reproducibility studies of FFQs measuring fat was 0.34 (Thompson and Margetts 1993). A power value of 0.8 is conventionally used when there is no defined basis for setting the desired power level (Cohen 1988, 56) as was the case in this study.

Therefore, for a level of probability that was significant (P < 0.05), a power of 0.8 and an effect size of 0.3, the number of subjects required for correlation analysis in the reproducibility study was 85. One hundred and fifty subjects were therefore targeted for recruitment for the test-retest evaluation of the reproducibility of the FC. This number was selected to account for poor compliance or a low response rate, a problem often reported in dietary surveys.

For the criterion validity study, an effect size of r = 0.5 was selected because the smallest correlation (effect size) reported in previously published validity studies of FFQs measuring fat intake in the short-term was around 0.5 (see Table 2.4, p. 22). For a level of probability that was significant, a power of 0.8 and an effect size of 0.5, the number of subjects required for correlation analysis in the criterion validity study was 28 (for P < 0.05) and 41 (for P < 0.01). The final sample size selected for recruitment in the validity study was 70 subjects in each group. The criterion validity study has much higher demands on subjects so a sample size of 70 subjects in Group 2 and Group 3 was considered large enough to allow for attrition, poor compliance with the experimental methods and expected response rate. A 75 percent response rate or better was expected in Group 2 and Group 3.
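As a rough cross-check on the tabulated sample sizes, the number of paired observations needed to detect a correlation can be approximated with the Fisher z transformation. This is not the method behind Cohen's tables, only a common approximation, so its results may differ from the tabulated values by one or two subjects; the function names are hypothetical and a two-tailed test is assumed.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.8):
    """Approximate n needed to detect correlation r (two-tailed test).

    Uses the Fisher z approximation, not Cohen's exact tables.
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = normal.inv_cdf(power)           # value corresponding to the power
    return ceil(((z_alpha + z_beta) / atanh(r)) ** 2 + 3)


def recruits_needed(n_required, response_rate):
    """Number to recruit so that n_required remain at the given response rate."""
    return ceil(n_required / response_rate)


n_reproducibility = n_for_correlation(0.3)  # effect size for reproducibility
n_validity = n_for_correlation(0.5)         # effect size for criterion validity
recruit_validity = recruits_needed(41, 0.75)
print(n_reproducibility, n_validity, recruit_validity)
```

For r = 0.3 the approximation reproduces the 85 subjects quoted above, while for r = 0.5 it gives a value slightly above the tabulated 28. At a 75 percent response rate, the 41 subjects needed for P < 0.01 would require recruiting 55, which is within the 70 targeted for each validity group.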

3.2.1.5 Sample size estimates for linear (least squares) regression

The minimum effect size reported in dietary validation studies of FFQs measuring short-term fat intake, represented by a correlation of 0.53 (Curtis et al. 1992), was used to estimate the sample size for regression analysis in this study. For a significance level of 0.05, a power of 0.8 and an effect size of 0.53, the number of subjects required for regression analysis was 22, as shown in Appendix 1. A sample size of 70 subjects in each of Group 2 and Group 3 was used to allow for attrition or poor compliance and to be adequate to detect a smaller effect size of R² = 0.12, for which a sample of 59 subjects is needed (Table 2, Appendix 1).

3.2.1.6 Exclusion and withdrawal criteria for subjects

Subjects were excluded on the basis of personal reasons (eg did not want to participate, poor memory, language or writing difficulties) or health reasons (eg senile dementia) which might have influenced their responses to the tests.

3.2.2 Consent

Subjects in all groups were invited to participate in the survey and informed verbally at recruitment about the background of the study and assured of confidentiality and anonymity of use of the information collected. Consent forms were completed by volunteers in the pilot study (see Appendix 2) and by those in Group 2 and Group 3 (see Appendix 5) in the criterion validity study. No consent forms were required by the Ethics Committee for Group 1 for the reproducibility study.

3.2.3 Ethics approval

Ethics approval for undertaking this study was granted by the University of Canberra Human Experimentation Ethics Committee (see Appendix 6).

3.3 Data collection

3.3.1 Pilot testing of the FC

Subjects were recruited for the pilot study on an ad hoc voluntary basis from the CHRMC. A second year group of university students in a sports coaching unit were approached during a lecture and invited to participate in the pilot study. At recruitment, subjects received verbal information on the purpose of the pilot study. The investigator conducted the investigation on an individual basis by appointment.

3.3.1.1 Protocol for pilot testing of the FC

The protocol followed by 20 volunteers was:

• Subjects received written information about the purpose of the pilot study and completed a consent form (Appendix 2);

• Subjects were provided with the Pilot FC (Table 3.1, p. 61-2) and a form entitled 'Evaluation of the FC' (Appendix 3) incorporating written instructions on how to complete each form;

• Subjects were informed verbally that the FC list was not a comprehensive list of all foods but represented a selection of foods that people might consume. No additional verbal instructions were provided on how to complete the FC;

• Subjects completed the Pilot FC alone without seeking help, using the written instructions provided; and

• Subjects then completed the short evaluation form after completing the Pilot FC (Appendix 3).


3.3.2 Reproducibility of the final version of the FC

A test-retest protocol after an interval of three weeks and again after two months was used to assess reproducibility in a discrete population of university students (Group 1). A three-week interval was used for the second administration as periods less than this are associated with an overestimation of reproducibility because respondents often remember their earlier responses (Block and Hartman 1989). A two-month interval was used for the third administration. The study group was unlikely to change fat intakes over this time because no intervention or educational influences on dietary habits were expected.

3.3.2.1 Data collection for the reproducibility study

The investigator invited students from a first year physiology unit (Group 1) to participate in the reproducibility study at the end of a lecture in April, 1992. The group was informed orally that the objective of the survey was to complete a 'dietary survey questionnaire' and repeat this survey on two other occasions. They were also informed of the approximate time involved in completing the 'questionnaire'. No verbal reference was made to the nutrients the 'dietary survey questionnaire' was actually measuring. Subjects who volunteered remained behind to complete the FC in a group situation. Subjects volunteered to participate and were not identified so no consent forms were required.

The following protocol was then used in the subjects who volunteered to participate:

• The final FC (final version) was distributed;

• Subjects were asked to enter the last four digits of their student ID on the front cover;

• Subjects were verbally informed that the list of foods on the final FC was not a comprehensive list of all foods; and

• Subjects were requested to complete the final FC without seeking help.

The same protocol was used at each retest, inviting all students present at the time of the lecture to participate, irrespective of their involvement at the previous test. The last four digits of the student ID were used to match 'repeat' FCs.

3.3.3 Criterion validity of the final FC

Criterion validity was tested using a similar approach to that outlined by Horwath and Worsley (1990) (see Chapter 2, Table 2.6, p. 35), by assessing agreement between the FC and a 'more accurate' dietary method, the three-day food record (FR). Subjects in Group 2 (the nutrition students) and Group 3 (the clinic patients) were used for measuring criterion validity. Group 2 used a weighed FR method whereas Group 3 used an estimated FR using standardised household measures.

3.3.3.1 Protocol for data collection of the criterion validity study

The protocol for data collection is outlined in Table 3.4.

Table 3.4 Timetable and protocol for data collection for the criterion validity study of the FC for Group 2 and Group 3

Week | Group 2 (n*=70) and Group 3 (n*=70)
Interview 1. Group 2 (Week 1-10, ad hoc on referral to the clinic); Group 3 (Week 1) | • Subjects were invited to participate by the investigator or a trained data collector in a group situation (Group 2) or individually (Group 3) and provided with background information on the purpose of the study (see Appendix 4).
Interview 2. Group 2 (Week 2); Group 3 (by appointment) | • Subjects who volunteered attended an interview and were provided with consent forms (Appendix 5) and written and verbal instructions on the days designated for completing the criterion method (FR) (Appendix 8 and 9). An appointment for Interview 3 was arranged with the data collectors to complete the FC.
Interview 3. Group 2 and Group 3 (Week 3-Week 10, by appointment) | • Subjects completed the FC in isolation the day after completion of the reference method (ie Wednesday) using a similar protocol** to that used in the reproducibility study (see Section 3.3.2.1, p. 71). After completion of the FCs, the investigator (or another data collector) cross-checked the estimated food records with subjects for completion, correctness, legibility and appropriate timing with the FCs. Food models were used to check the amounts of foods consumed that were difficult to quantify (see Figure 3.2, p. 73). • No feedback about subject responses to the food records or the FC was provided at Interview 3.

FC = fat checklist, n* = number of subjects targeted for recruitment, ** Data for the FCs were collected individually and in isolation rather than in a group situation.


Figure 3.2 Food models used for cross-checking the estimated food records


3.3.4 Content validity of the final FC

Content validity tested whether the FC reflected the quantities of food consumed by the groups tested. Responses to the FC and FR for the clinic group (Group 3) and Group 2 were used. University students (Group 2) were used in addition to the clinic patients to test whether the FC had the capacity to reflect the food choices of another population that was presumably consuming larger quantities of food, and perhaps fat, than patients in a clinic advising patients to limit fatty foods. The outcome measures of the content validity tests were important for detecting problems in the design and format of the FC and its potential for use in a broader population group rather than restriction to a local intervention clinic.

3.4 Dietary instruments

3.4.1 The final FC

The final version of the FC (see p. 76-7) contained 43 food items only slightly modified from the FC used in the pilot study. The pilot study found that subjects read and generally understood the written instructions on the FC. Despite the verbal instructions that the purpose of the pilot study was to evaluate their understanding of the format and written instructions of the pilot FC, subjects in both groups tended to focus more on their own fat intake derived from the fat values in the FC. The clinic patients perceived that all foods listed contained fat and should be eliminated or reduced in the diet. This meant that subjects tended also to avoid foods containing some fat but nevertheless recommended in a healthy diet (eg low fat milk and lean meats). The serve sizes and frequency options on the food list were adequate to meet the quantities consumed for all subjects except for liver (Q1), fish - sardines and tuna (Q11), savoury pie made with and without cream (Q8 and Q9), and crumbed fried chicken (Q3), which were not eaten. Although most subjects clearly understood the directions, few completed the option for changing the serve size. The time taken for completion of the pilot FC was between 5 and 10 minutes. These results were subsequently used to modify the final version of the FC.

The final version of the FC did not change the original food list, food frequency options or serve sizes, only the order of foods originally listed. Some foods categorised by the investigator as 'containing fat which can be used in moderation' were placed towards the end of the FC, thus separating the FC into two sections. The first section contained the 'fattier' foods such as 'untrimmed' meat and full cream milk and dairy products. The second section contained 'trimmed' or lean meat and low-fat milk and other foods 'containing fat which can be used in moderation'. Table 3.5, p. 76-7, the final version of the FC, illustrates these changes.


The final FC was self-administered and required no oral instructions. Written instructions for subjects were listed on the front page of the FC relating to the format and filling out of the questionnaire. Written instructions included options to alter the serve size (standard amount) if it differed substantially from that consumed. Subjects were instructed to do this in an open-ended manner by either adjusting the frequency option in the Number of Times Eaten column or indicating the change by a 'brief note' next to the altered food item. For example, where subjects consumed one doughnut rather than the serve size of two doughnuts, they could record half in the Number of Times Eaten (frequency) column.

3.4.1.1 Scoring of data from the FC

Individual responses to each food item on the FC were scored and collated in both qualitative and quantitative forms ready for statistical analysis. The frequency of consumption of each food was tabulated as the number of times a food was eaten over three days (from 0 to 5 or more times/three days) while fat intake for each food item was calculated as g fat/d.

Where the serve sizes were altered, they were translated into estimated grams of fat intake per reported serve size for tabulation and analysis. An example of a completed FC where subjects made changes is in Appendix 7. All scores and values were calculated by the investigator and crosschecked for accuracy by a trained assistant.
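The scoring arithmetic described above can be illustrated with a minimal sketch. It covers only the frequency items (not the added-fat questions), uses fat-per-serve values for three items taken from Table 3.5, and treats a half serve as a frequency of 0.5, as in the doughnut example. The function and variable names are hypothetical and this is not the investigator's actual scoring procedure.

```python
# Fat per standard serve (g) for a few checklist items, taken from Table 3.5.
FAT_PER_SERVE = {
    2: 22.0,   # untrimmed steak, pork, lamb; untrimmed chicken
    23: 30.0,  # hot chips
    34: 5.0,   # egg
}
RECORDING_DAYS = 3  # the FC covers the previous three days


def score_checklist(times_eaten):
    """Return (total g fat over three days, g fat/d).

    times_eaten maps item number -> number of standard serves circled;
    a value of 0.5 represents half a standard serve.
    """
    total = sum(FAT_PER_SERVE[item] * times for item, times in times_eaten.items())
    return total, total / RECORDING_DAYS


if __name__ == "__main__":
    # Worked example printed on the FC: item 2 circled four times.
    total, per_day = score_checklist({2: 4})
    print(total, round(per_day, 1))  # 88.0 29.3
```

The example reproduces the worked example on the checklist itself (22 x 4 = 88 g of fat over three days), which corresponds to about 29 g fat/d.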

Table 3.5 The final fat checklist

FAT INTAKE CHECKLIST

ID Number (last 4 digits):    Date Completed:    Form Number (Office use only):

Instructions:

1. Please circle the number of times you have eaten the following foods over the PREVIOUS THREE DAYS, not including today.
2. Trimmed means the obvious fat or chicken skin has been cut off. Untrimmed means the fat and chicken skin have been eaten.
3. It is important that you fill in the questionnaire as accurately as possible, indicating every time you have eaten any of the foods mentioned. Make sure you answer every question.
4. Weights of food are based on cooked weight, (weight after cooking), as purchased.
5. The column on the far right indicates the grams of fat for one standard serve. Do not write in this column.
6. You are free to comment on your food type and standard amount if it differed substantially from the description given (eg. you may have had ½ standard amount). Just write a brief note next to the item.

NUMBER OF TIMES EATEN (circle the total number): 0 1 2 3 4 5 or more (the same options are offered for every food item)

EXAMPLE ONLY

FOOD | STANDARD AMOUNT | NUMBER OF TIMES EATEN | FAT PER SERVE (Office use only)
Untrimmed steak, pork, lamb (not chops); untrimmed chicken (roasted, grilled, incl BBQ) | 1 small serve (120g); 2 drumsticks or 1 wing plus 1 thigh or 3-4 slices roast meat | 4 (circled) | 22

Over the three days this person had eaten one large steak, 2 fried drumsticks and 3-4 slices of roast pork so the total is 4 (ie. counting 2 serves for the steak, 1 for the chicken and 1 for the pork) and the fat intake is 22 x 4 = 88 grams of fat.

START THE QUESTIONNAIRE HERE

FOOD | STANDARD AMOUNT | FAT PER SERVE (Office use only)
1. Liver, kidney, brains, tripe, tongue | Breakfast serve (100g) | 8
2. Untrimmed steak, pork, lamb (not chops); untrimmed chicken (roasted, grilled, incl. BBQ) | 1 small (120g); 2 drumsticks or 1 wing plus 1 thigh or 3-4 slices roast meat | 22
3. Crumbed fried chicken (eg. Kentucky Fried) | 2 drumsticks or 1 wing plus 1 thigh | 29
4. Untrimmed pork chops, lamb chops | 2 chops (120g) or 1 large pork chop | 33
5. Sliced meat: fatty cuts, eg. bacon, salami, luncheon meats | 2 rashers; 2 slices | 25
6. Sausages, frankfurts | 2 thin or 1 thick | 22
7. Meat pie, pastie, sausage roll | 1 pie or 2 small rolls | 24
8. Savoury pie with pastry made with cream, eg. quiche | 1 individual pie or 1 large slice (180g) | 50
9. Savoury pie with pastry made without cream, eg. spinach pie, quiche | 1 individual pie or 1 large slice (180g) | 37
10. Fish cakes, fish fingers; crumbed, battered, oven fried | 1 medium piece (150g) or 3 small fish cakes | 27
11. Undrained canned fish in oil, eg. tuna, sardines | 2 tablespoons; 3-4 sardines | 14
12. Ordinary and thickened cream, sour cream (remember soups, sauces, mornays) | 1 tablespoon | 7.5
13. Light cream, light sour cream (remember soups, sauces, mornays) | 1 tablespoon | 4
14. Full cream yoghurt, plain or flavoured | 1 small carton (200g) | 7
15. Ice cream (not special diet variety) | 2 scoops (120g) | 9
16. Full cream milk, flavoured milk, milkshake | 1 medium glass (200ml) | 7.5
17. Cheese: cheddar, edam, etc. | 1 cube matchbox size (30g) | 10
18. Rich cake, cheesecake, black forest cake | 1 average shop serve (120g) | 40
19. Pastry, croissants, apple strudel, sweet pie | 2 croissants = 1 average piece (150g) | 30
20. Biscuits: sweet/chocolate coated/shortbread/cream filled/cheese crackers | 2 biscuits; 4-5 crackers | 8
21. Fried rice | Y.t cup | 8
22. Pizza | 1/3 of medium size | 17
23. Hot chips | 1 medium carton | 30
24. Chicko roll, dim sim, spring roll | 2 small or 1 large | 15
25. Potato chips, Twisties, Cheezels, corn chips, health food bar | 25g bag; 40g bar | 8
26. Chocolate bar (mars, chokito, etc), doughnut | 50g; 2 doughnuts | 16
27. Toasted muesli, muesli flakes | 2 tablespoons | 6
28. Gravy made from meat dripping (not made from Gravox) | 1 tablespoon | 3

FOODS CONTAINING FAT BUT WHICH CAN BE USED REGULARLY IN MODERATION

FOOD | STANDARD AMOUNT | FAT PER SERVE (Office use only)
29. Trimmed steak, pork, lamb (not chops); trimmed chicken (no skin) | 1 small (120g); 2 drumsticks or 1 thigh or 3-4 slices roast meat | 12
30. Trimmed pork chops, lamb chops | 2 chops (140g) or 1 large pork chop | 17
31. Mincemeat in the form of hamburger, rissole, bolognaise, lasagne, etc | 1 pattie (60g) or ½ cup mince | 8
32. Meat (trimmed) casserole or meat stew, eg. chinese meat dishes/curry/goulash (no veg.) | 1 cup | 13
33. Sliced meat: ham, beef, lamb, chicken (trimmed) - remember sandwiches | 2 slices | 6
34. Egg | 1 egg | 5
35. Low fat milk, Hi-Lo (not Shape or skim) | 1 medium glass (200ml) | 4
36. Nuts: any type including peanut butter | 9-10 nuts (approx 20g); 1 tablespoon | 11
37. Light cake: plain sponge, scone, pikelet, pancake | 1 small piece, 2 scones, 3 pikelets, 1 pancake | 9

Answer the following questions in the space provided.

ADDED FAT

38. On average, how many teaspoons of butter or margarine do you have per day? (remember sandwiches, toast, crackers and cooking) ___ teaspoons [4 x 3]

39. How much mayonnaise, salad dressing or oil have you consumed over the previous three days? Do not include low fat oil or no oil dressing. ___ teaspoons [4]

40. When meat or vegetables are fried or roasted, what are they cooked in most often? (circle the answer that best fits)
Butter/dripping [beef or lamb]
Lard/copha
Cooking or table margarine
Polyunsaturated margarine
Polyunsaturated vegetable oils
Not polyunsaturated oils
Cooked in own juices
I never fry or roast meat
Other (please specify)

41. How much fat have you used over the previous three days to fry or roast vegetables? (circle the answer that best fits)
Less than 1 teaspoon
1 teaspoon
2 teaspoons
1 dessertspoon
1 tablespoon
More than one tablespoon (how many? )

42. Do you add butter or margarine to vegetables or meat after cooking (circle one answer only)
always/sometimes/often/rarely/never

43. Please list below whether you have eaten any other foods over the previous three days that may contain fat and were not mentioned in this questionnaire. Use household measures (eg. cup, teaspoon, small serve) to describe the amount of food eaten. For example, coconut, coconut cream, pate, duck, goose, ghee, puddings, cream substitute, non-dairy coffee whitener.

Your approximate total fat intake in grams is ___

Thank you for your co-operation
Vicki Deakin
Consultant Nutritionist

3.4.2 Three-day food record (criterion method)

The criterion method was a three-day food record, using weighed measures in Group 2 and estimated measures in Group 3. The days selected for data collection were consecutive: Sunday, Monday and Tuesday. These three days of food records were used to match the measurement period of the FC, therefore allowing a direct comparison of food intake over the same period.

Instructions for recording dietary intakes for the weighed and estimated methods were given at Interview 2 and are found in Appendix 8. These instructions incorporated sheets for recording each day's food intake and asked subjects to self-report sex, age, height and weight at the time of recording. Standardised measuring implements were supplied to both groups including spoon and cup measures (Group 2 and Group 3) and for Group 2 only, a calibrated set of food scales (Weight Watchers™). The use of household measures as substitutes for weighed amounts was accepted in Group 2 in a few special situations (eg eating out). Food records were checked with subjects in Group 2 at Interview 3, where needed, after completion of the FC (see Table 3.4, p. 72). The description and amounts for foods and beverages that were difficult to weigh or were estimated were checked. Subjects in Group 3 who consumed foods difficult to quantify by household measure (eg meat) were interviewed individually after the FCs were completed for cross-checking of the food records. Plastic food models (see Figure 3.2, p. 73) were used for cross-checking estimated measures. The investigator and data collectors then estimated weights for these foods.

3.4.2.1 Nutrient analysis

All food records were coded for analysis by the investigator and checked by a trained assistant. Fat intakes were calculated from the Australian food composition tables using the database NUTTAB 1991-2 (Lewis and Holt 1992) and the dietary software SODA, Version 5.0b (Systems On-Line Dietary Analysis) (Computer Models, GPO Box 423, Cottesloe, WA, 6011). Foods that were eaten as mixed meals (eg a casserole, stew, stir fry) were entered as individual ingredients either by estimation of weight from recipes or actual weight, if reported. Where a food or ingredient was not on the database, the most similar food from the database was substituted. The choice of a substitute food was based on the professional judgement of the investigator. For example, where cheese was reported on the food records with no reference to type or commercial brand name, cheddar cheese (Code J0300) was used. Similarly, where any low-fat cheese was reported, if unspecified, the reduced fat cheddar cheese (Code J0330) was substituted.


Household measures were converted to weights using standard weights and measures, where applicable, or estimated using suggested serve size weights based on commonly consumed food weights specific to age and sex from the data published by Pao et al. (1975). Where food models were used for cross-checking, the investigator or a trained assistant estimated weights.

3.5 Data analysis and statistical methods

All data were coded, scored, and collated by the investigator and cross-checked by a trained assistant. Fat intake data were converted to an amount consumed per day (ie g fat/d) using SPSS. Data were then entered on a database in an ASCII or unformatted file by the investigator. All data were double-checked for accuracy of transfer.

Data were statistically analysed using SPSS for Windows™, Version 7.0 (Statistical Package for the Social Sciences). A probability of P ≤ 0.05 was used throughout to denote statistical significance, and 95% CIs were used to show the likely range of the 'true' values of fat intake. Data that did not follow a normal distribution were log-transformed (natural logarithm), and non-parametric tests were used if parametric assumptions were not satisfied. Normality was checked using the Shapiro-Wilk statistic.

As an external validation of energy intake for the FR and to determine the magnitude of error in reporting on the FC, the ratio of estimated energy intake (EEI) to predicted basal metabolic rate (PBMR) was calculated for each individual in Group 2 and Group 3 and compared with reference cut-offs derived from Goldberg et al. (1991). The EEI:PBMR ratio determines whether reported energy intakes from a food record method are consistent with the energy intake required for a person to live a normal (not bed-bound) lifestyle (McLennan and Podger 1998, 140). Cut-off 2 is the lowest value for EEI:PBMR that could, within defined bounds of statistical confidence, reflect a plausible measure of actual energy intake in a group of people over a given measurement period (such as the three days used in this study) (Goldberg et al. 1991). Cut-off 1 is the lowest value for energy intake for an individual subject. In the absence of values for a three-day recording period in the Goldberg paper, the next closest time period of four days was used. PBMR was calculated using the equations by Schofield et al. (1985). Means, standard deviations and 95% CIs for the means were calculated for demographic characteristics, energy (kJ/d), protein (g/d), fat (g/d), carbohydrate (g/d) and alcohol (g/d). Macronutrients were also expressed as a percentage of total energy intake (%kJ/d) and fat intake as g/1000 kJ/d.
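The EEI:PBMR calculation can be sketched as follows. This is an illustration only: it assumes the commonly quoted weight-only form of the Schofield equations for adults aged 18 to 60 (in MJ/d), which may not be the exact form used in this study, and the coefficients should be verified against the original source. The cut-off is left as a caller-supplied parameter because the Goldberg values are not reproduced here; all function names are hypothetical.

```python
# Assumed weight-only Schofield coefficients (slope, intercept) in MJ/d,
# keyed by sex and age band. Verify against Schofield et al. (1985).
SCHOFIELD_MJ = {
    ("M", "18-30"): (0.063, 2.896),
    ("M", "30-60"): (0.048, 3.653),
    ("F", "18-30"): (0.062, 2.036),
    ("F", "30-60"): (0.034, 3.538),
}


def predicted_bmr_kj(sex, age, weight_kg):
    """Predicted basal metabolic rate (kJ/d) for an adult aged 18 to 60."""
    if not 18 <= age <= 60:
        raise ValueError("sketch covers ages 18 to 60 only")
    band = "18-30" if age < 30 else "30-60"
    slope, intercept = SCHOFIELD_MJ[(sex, band)]
    return (slope * weight_kg + intercept) * 1000.0  # MJ/d -> kJ/d


def eei_pbmr_ratio(energy_kj, sex, age, weight_kg):
    """Ratio of reported energy intake (kJ/d) to predicted BMR (kJ/d)."""
    return energy_kj / predicted_bmr_kj(sex, age, weight_kg)


def is_plausible(ratio, cutoff):
    """True if the ratio is at or above the supplied reference cut-off."""
    return ratio >= cutoff


if __name__ == "__main__":
    # Hypothetical subject: 25-year-old man, 70 kg, reporting 10 000 kJ/d.
    print(round(eei_pbmr_ratio(10000, "M", 25, 70), 2))
```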


Differences between groups for demographic characteristics and macronutrients were assessed by an independent sample t test if data were normally distributed, otherwise a Mann-Whitney U-test was used.

3.5.1 Reproducibility of the final FC

Reproducibility of the final FC was tested using intra-class correlation for more than one repeated test. The intra-class correlation is calculated as a single correlation for more than two tests. It is the average of the correlations between all pairs of tests (Bland and Altman 1990). Since correlation is an indication of the strength of the association between the tests, not the agreement, the Bland and Altman (1986) approach was used to assess agreement between the tests. This approach included: a scatterplot of fat intake of individuals between the tests compared with the line of equality; mean differences in fat intake between the tests; the 95% confidence interval (CI) for the mean differences; the standard deviation of the differences; and the repeatability coefficient.

Agreement between the tests, based on the Bland and Altman (1986) approach, was confirmed if:

• the mean differences between the original and repeat tests were not significantly different;

• repeat measures of the FC were within the 95% CI for the mean of the original tests; and

• the repeatability coefficient between the original and repeat tests was less than two standard deviations of the mean differences (British Standards Institute 1979).

Unmatched subjects (ie those who only completed the first FC of the reproducibility study) were compared with the matched subjects to test for differences and potential bias.
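This section describes the intra-class correlation as the average of the correlations between all pairs of tests. A minimal sketch of that averaging for three administrations is given below; the function names are hypothetical and the data in the usage example are made up. Note that an intra-class correlation computed from variance components, as statistical packages usually do, will generally not give an identical value.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_r(administrations):
    """Average correlation over every pair of administrations.

    administrations is a list of lists of fat intake (g fat/d), one list
    per administration, with subjects in the same order in each list.
    """
    pairs = list(combinations(administrations, 2))
    return sum(pearson_r(x, y) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up fat intakes (g fat/d) for five subjects on three occasions.
    test_1 = [52.0, 61.0, 75.0, 88.0, 95.0]
    test_2 = [55.0, 58.0, 80.0, 85.0, 99.0]
    test_3 = [50.0, 66.0, 71.0, 92.0, 90.0]
    print(round(mean_pairwise_r([test_1, test_2, test_3]), 2))
```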

3.5.2 Criterion validity of the final FC

The statistical procedures used to analyse criterion validity using data from Group 2 and Group 3 are outlined in Table 3.6.

a) Descriptive statistics of fat intake from the FC and FR in Group 2 and Group 3

Univariate statistics (eg means and CI, standard deviations, medians and ranges) were calculated for fat intakes (g fat/d) obtained from the FR and the FC for each group separately. These analyses describe the distribution of the data and whether parametric or non-parametric tests could be applied.


Table 3.6 Statistical procedures used to assess the criterion validity of the final FC

Approach | Statistical tests
a) Fat intake (g fat/d) between the FC and FR in Group 2 and Group 3 (using discrete data) | Univariate, descriptive statistics
b) Association and agreement between the FC and FR for Group 2 and Group 3 (using discrete data) | Interclass correlation; Bland-Altman approach (1986); least squares linear regression; sensitivity, specificity and predictive value
c) Comparison of classification of individuals into quartiles of fat intake between the FC and the FR in Group 2 and Group 3 | Quartile ranking of data between FC and FR, % agreement between quartiles, Wilcoxon signed ranks, kappa statistic
d) Comparison of quantification of portion size on the FC and FR in a sub-sample of Group 2 | Manual scoring of number of respondents who over-reported and under-reported serve sizes on the FC, frequency distribution of responses
e) Identification of food items erroneously recalled or misclassified on the FC in a sub-sample in Group 2 | Manual scoring of the types of foods and percentage of times foods were misreported, misrecalled or misclassified. Frequency distributions of response errors on the FC

FC = fat checklist, FR = three-day food record

b) Association and agreement between the FC and FR in Group 2 and Group 3

Differences in absolute values and the variability of these differences were used to assess agreement between the FR and FC as described by Bland and Altman (1986). Correlation analysis and a least squares regression model were also used to test the relationship between the FC and FR.

The first step to gauge the strength of the agreement was to plot the absolute or individual fat intake for each method and compare with the line of equality. If the methods agreed exactly, the points would all lie along the line of equality. The inter-class correlation was then used to measure the strength of the relationship between the FC and FR. Either Pearson's Product Moment correlation (for parametric data) or Spearman's Rho correlation coefficient (for non-parametric data) was used.

Determination of measurement error of the FC: The next step in the Bland and Altman (1986) method was to plot the differences between the FR and FC against the average of the two measurements, which was taken to be the best estimate of the unknown 'true' value, in this case 'true' fat intake (Altman 1991, 397). These plots showed the size of the difference rather than the size of the absolute measurement and were useful for detecting bias. A one-sample t test of the differences against zero was used to see if the mean of the difference was significantly different from zero (ie the hypothesised difference) (Altman 1991, 191). The t test can be used where the assumption of normality is reasonable and the variance between the two methods is not large (Altman 1991, 197).

Determination of measurement bias of the FC: Statistical bias associated with the differences in fat intake between the FC and FR was calculated by estimating the mean difference and the standard deviation of the differences. The mean difference estimates the bias, and the range given by the mean difference ± 2 standard deviations is termed the 'limits of agreement' (Bland and Altman 1986). If a systematic bias was exposed by the Bland-Altman plots, it was adjusted for by subtracting the mean difference in fat intake from, or adding it to, the FC values (Bland and Altman 1986).

Precision of the limits of agreement: To measure the precision of the estimated limits of agreement, the standard error (SE) and confidence intervals (CI) of the upper and lower limits of the differences (mean ± 2SD) were calculated. If most of the differences were within the upper and lower limits of agreement, the two methods (ie the FR and FC) were regarded as interchangeable (Bland and Altman 1986).
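The quantities described in the preceding paragraphs (mean difference, standard deviation of the differences, limits of agreement and their standard errors) can be sketched as below. The sketch assumes the approximation given by Bland and Altman (1986) that the standard error of each limit is about the square root of 3s²/n; the function name and the data in the usage example are hypothetical. A confidence interval for each limit would then be the limit ± t × SE, with t taken from tables for n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def limits_of_agreement(fr, fc):
    """Bland-Altman summary for paired fat intakes (g fat/d), FR minus FC."""
    diffs = [a - b for a, b in zip(fr, fc)]
    n = len(diffs)
    mean_diff = mean(diffs)          # estimate of the bias
    sd_diff = stdev(diffs)           # sample SD of the differences
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "lower_limit": mean_diff - 2 * sd_diff,
        "upper_limit": mean_diff + 2 * sd_diff,
        "se_mean_diff": sd_diff / sqrt(n),
        "se_limit": sqrt(3 * sd_diff ** 2 / n),  # approximate SE of each limit
    }


if __name__ == "__main__":
    # Made-up paired intakes for four subjects.
    food_record = [60.0, 70.0, 80.0, 90.0]
    checklist = [58.0, 74.0, 79.0, 85.0]
    for name, value in limits_of_agreement(food_record, checklist).items():
        print(name, round(value, 2))
```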

Least squares regression analysis: Using a least squares linear regression model, fat intake for individuals as estimated from the FR was regressed on fat intake as estimated by the FC. To calculate the regression equation for predicting fat intake from the FC, the dependent variable (y) was the FR and the independent variable (x) was the FC. The equation for the regression line was:

y = a + bx

where:
a = the regression constant (intercept)
b = the regression coefficient (slope)
x = fat intake (g fat/d) on the FC
y = predicted or 'true' fat intake

If the FC and FR agreed exactly, the slope of the regression line (or regression coefficient) would have been one. The assumptions that underlay the use of regression analysis were:

• linearity of the fat intake data between the FR and FC;

• the fat intake values of the FR had a normal distribution for each value of the fat intake data from the FC (the predictor variable);

• the variability of the FR was the same for each value of the predictor variable; and

• the values for the FR and FC were independent (Altman 1991, 303).
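The regression described above, with the FR as the dependent variable (y) and the FC as the independent variable (x), can be sketched with the standard least squares formulas. The function names and the data in the usage example are hypothetical; the study itself used SPSS.

```python
def least_squares(x, y):
    """Return (a, b) for the fitted line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx            # regression coefficient (slope)
    a = mean_y - b * mean_x  # regression constant (intercept)
    return a, b


def residuals(x, y, a, b):
    """Observed FR value minus the value predicted from the FC."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Made-up fat intakes (g fat/d): x from the FC, y from the FR.
    fc = [40.0, 60.0, 80.0, 100.0]
    fr = [50.0, 65.0, 80.0, 95.0]
    a, b = least_squares(fc, fr)
    print(a, b)  # 20.0 0.75
```

A slope close to one and an intercept close to zero would indicate that the two methods give similar absolute values; the residuals are the quantities inspected when checking the assumptions listed above.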


These assumptions were tested in Group 2 and Group 3. A scatterplot of fat intake (g fat/d) from the FC against that from the FR was inspected for linearity in Group 2 and Group 3. Using the residuals from the regression model, the standardised residuals were plotted against the standardised predicted values to determine normality. In addition, the normality of the standardised residuals was measured using the Shapiro-Wilk statistic. If the plot of the residuals against the predicted values (or the FC values) showed an even scatter, the residuals were considered to be normally distributed (with a mean of zero) (Altman 1991, 303). If unevenly scattered, log transformation was warranted. To assess whether the variance of the distribution of the FR was the same for all values of the FC, the studentized residuals were plotted against the regression standardised predicted values. If the variance was the same, there was taken to be random scatter around zero.

Sensitivity, specificity and predictive values of the FC: As the FC was designed as a resource to detect high fat intakes, the ability of the FC to identify fat intakes at a moderate to higher level accurately (sensitivity) was considered most important. The cut-off level to determine the sensitivity of the FC to identify the higher fat consumers was set at 70 g fat/d, which was the 50th percentile of the actual fat intake for all women in the 1983 National Dietary Survey (English et al. 1987). Where data are skewed, the median value rather than the mean was used for comparison (Gardner and Altman 1989, 24). Although the median value for fat intake in women is much less than that of 105 g fat/d for men of all ages in the 1985 Dietary Survey, a value of 70 g fat/d should still be sufficient to detect upper levels of fat intake in either sex.

Specificity was defined as the proportion of individuals who had fat intakes below the cut-off level of 70 g fat/d on the FR who also were below this cut-off on the FC. Sensitivity of the FC was the proportion of subjects who consumed ≥ 70 g fat/d on the FR, and who also reported similar intakes on the FC. Sensitivity reflects the ability of the FC to detect 'true' high fat consumers.
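The definitions above, together with the predictive values defined in the next paragraph, reduce to counts in a two-by-two table of FC against FR at the 70 g fat/d cut-off. The sketch below assumes that layout; the paired intakes are hypothetical, and `diagnostic_measures` is a name introduced for illustration, not a procedure taken from Altman (1991).

```python
CUTOFF = 70.0  # g fat/d, the cut-off stated in the text

def diagnostic_measures(fc, fr, cutoff=CUTOFF):
    """Treat the FR as the criterion and the FC as the test being evaluated."""
    pairs = list(zip(fc, fr))
    tp = sum(1 for c, r in pairs if c >= cutoff and r >= cutoff)  # true positives
    fp = sum(1 for c, r in pairs if c >= cutoff and r < cutoff)   # false positives
    fn = sum(1 for c, r in pairs if c < cutoff and r >= cutoff)   # false negatives
    tn = sum(1 for c, r in pairs if c < cutoff and r < cutoff)    # true negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical paired intakes (g fat/d), not study data
fc = [82, 75, 64, 90, 71, 55, 60, 48]
fr = [88, 72, 74, 95, 62, 50, 58, 45]

measures = diagnostic_measures(fc, fr)
print(measures)
```

With these invented values each cell of the table holds three, one, one and three subjects respectively, so all four measures come out at 0.75.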

Positive Predictive Value (PPV) was the proportion of subjects who were 'true' high fat consumers or true positives (scored ≥ 70 g fat on the FC and FR) to the total number of subjects who scored ≥ 70 g fat on the FC. Negative Predictive Value (NPV) was the proportion of subjects who were true low fat consumers or true negatives (scored < 70 g fat on the FC and FR) to the total number of subjects who scored < 70 g fat on the FC. Sensitivity, specificity and predictive values (PVs) were calculated using the equations outlined in Altman (1991, 414-15). A Receiver Operator Curve (ROC) was constructed to determine the best cut-off based on numerical values and compared with the pre-determined clinical cut-off. This was constructed from the known true positive cases (sensitivity) plotted on the x-axis (ie fat intake scores on the FC of subjects with FR scores ≥ 70 g fat/d) against 1 minus the specificity (ie fat intake scores on the FC for subjects who reported ≤ 70 g fat/d but ≥ 70 g fat/d on the FR).

c) Comparison of classification of individuals into quartiles of fat intake between the FCs and FRs

Mean daily fat intakes derived from the FC and FR in Group 2 and Group 3 were ranked into quartiles. Quartiles for each method were compared for the percentage of exact agreement and disagreement (gross misclassification) of the ranking and for differences between the mean fat ranking in each quartile. A Wilcoxon signed ranks test was used initially, followed by a kappa statistic to test the significance of the association between the FC and FR based on comparison of each quartile.

d) Comparison of serve size between the FC and FR

This test of the ability of subjects to recall or determine serve sizes on the FC accurately was a measure of respondent error. Subjects were asked to modify serve sizes on the FC, where appropriate. Serve sizes of individual food items in subjects in Group 2 and in Group 3 were manually compared for disparities between the FC and FR, using the FR as the 'accurate' method.

e) Identification of food items erroneously recalled or misclassified on the FC

This identified specific food items that were erroneously recalled on the FC using the FR as the 'criterion method'. This is another test for respondent error or accuracy of recall of food items on the FC. The FCs were manually compared with the FRs in the same subjects from Group 2 and Group 3 used in d) for the types of foods that were erroneously recalled.

The categories used to define food items erroneously recalled were food items omitted, added, and misclassified (ie listed in the wrong food group or food item on the FC). Numbers and types of food items omitted, added or misclassified on the FC were tabulated for each subject.

3.5.3 Content validity of the final FC

3.5.3.1 Statistical procedures used for measuring content validity of the final FC

The responses on the FCs completed by Group 2 and Group 3 were tested to measure whether the FC captured the serve sizes and frequency of foods consumed. Major contributors of 'true' fat intake from the FC were also determined. An overview of the methods used to measure content validity of the FCs is shown in Table 3.7.


Table 3.7 Methods used to analyse content validity of the final FC in Groups 2 and 3

| Methods of measuring content validity | Statistical procedures |
| --- | --- |
| a) Frequency of consumption of foods listed on the FCs in each group | Frequency distribution of responses |
| b) Frequency of alterations to serve sizes in each group | Manual scoring from FC |
| c) Identification of food items on the FC that were major contributors to 'true' daily fat intake in each group | Foods from the FC that contribute to 70% of the 'true' intake from the FR |

FC = fat checklist, FR = three-day food record, 'true' intake = mean daily fat intake calculated from the FR

a) Frequency of consumption of individual foods on the FC

This tested whether the food items, serve sizes and frequency options listed on the FC were suitable for the foods consumed by individuals in Group 2 and Group 3. Descriptive statistics were used to determine the frequency of consumption of individual food items on the FCs.

b) Frequency of alterations to serve sizes in each group

Changes to serve sizes were identified visually by checking each FC and corresponding frequency score where the value differed from the standard serve on the FC.

c) Identification of FC food items that were major contributors to 'true' daily fat intake

Food items that were major suppliers of fat and contributed to a ranked cumulative fat intake of 70% of total fat intake from the more accurate FR, the criterion measure, were identified in Group 2 and Group 3 using frequency analysis. This 70% cut-off value was arbitrarily determined.
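One way to read the ranked cumulative 70% rule is to sort food items by their fat contribution and keep adding items until the running total reaches 70% of total fat. The sketch below follows that reading, which is an assumption about the exact procedure; the food names and gram values are invented, and `major_contributors` is a hypothetical helper.

```python
def major_contributors(fat_by_food, threshold=0.70):
    """Return the top-ranked foods whose cumulative fat first reaches the threshold share."""
    total = sum(fat_by_food.values())
    ranked = sorted(fat_by_food.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for food, grams in ranked:
        if running / total >= threshold:
            break  # cumulative share already reached; stop adding foods
        selected.append(food)
        running += grams
    return selected

# Hypothetical fat contributions (g fat/d) summed over a group, not study data
fat_by_food = {
    "margarine": 22.0,
    "cheese": 15.0,
    "milk": 8.0,
    "meat pie": 30.0,
    "biscuits": 5.0,
    "chips": 20.0,
}

top_foods = major_contributors(fat_by_food)
print(top_foods)
```

Here the three highest-ranked items supply 72 of the 100 g of fat, so they are the only foods returned.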

85 Chapter 4: Results

CHAPTER 4

RESULTS OF VALIDITY TESTING OF THE FINAL FAT CHECKLIST

4.1 Final number of participants

Group 1 (University students studying first year physiology)

Of the 150 subjects in Group 1 initially targeted for recruitment, only 88 completed the initial reproducibility test (see Table 4.1). The final number matched for analysis between Test 1 and Test 2 in the reproducibility study was 49 subjects; and for Test 1, Test 2 and Test 3, 14 subjects. The major reason for this attrition was not any exclusion or withdrawal criterion but that subjects were not present at the lecture when the data were collected. Attendance at lectures is not compulsory, and lack of attendance is unlikely to be associated with being in a study.

Group 2 (University students studying third year human nutrition)

Of the 70 students originally targeted for inclusion in the criterion validity study, only 42 provided satisfactory data for both the fat checklist (FC) and three-day food record (FR). The rest (n=28) failed to complete the reference instrument (FR) satisfactorily for either of the following reasons:

• did not adhere to the overall protocol of the study (eg did not attend the follow up interview with the investigator; and/or the FC did not match the timing of the FR) (25 subjects); and

• unsatisfactory rigour in collection of the FR despite being provided with a specific training protocol and written instructions (3 subjects).

Group 3 (Clinic patients)

Of the 70 people targeted for inclusion into the study only 53 were actually recruited. Recruitment was dependent on the number of people choosing to attend the clinic specifically for either weight control or for the cholesterol-lowering program. On completion of the study only 19 out of 53 had satisfactory data for both the FC and FR that could be used for analysis. The reasons for rejection of the unsatisfactory records in this group were:

• incomplete information on the FR (ie inadequate detail in quantifying or reporting food intake) (11 subjects);

• non-continuous collection of the FR with the FC (12 subjects); and

• attrition from the clinic program (11 subjects).

4.1.1 Profile of study participants

Group 1 was used for the reproducibility study. Demographic data (age, height, weight) were not available on this population because of ethical considerations as described earlier (see Section 3.2.4, p. 70). Group 2 and Group 3 were discrete populations and used for the criterion and content validity study. Characteristics of all three groups are summarised in Table 4.1.

Group 2 and Group 3 were significantly different for age (t = -10.5, df = 59, P=0.001) as expected. Twenty percent of subjects in Group 2 were less than 20 years and 80% were below 23 years, so data were markedly skewed towards a younger population (mean age 22.3 ± 5.1 years, range 19-42 years, kurtosis = +1.135). In Group 3, all subjects were greater than 38 years of age (mean age 49.1 ± 10.6 years, range 38-68 years). These data were normally distributed with the exception of one outlier (z-score, 3.6). For the matched data in Group 2, there were twice as many females as males (12 males, 30 females) (see Table 4.1, p. 88). Group 3 contained nine males and 10 females.

Significantly higher BMIs were seen in females in Group 3 compared to Group 2 (t = -3.551, P<0.001, n=39) but not in males. Excluding sex differences, Group 2 and Group 3 were significantly different in BMI (t = -4.9, df = 58, P = 0.001). In Group 2, all subjects were either in the healthy weight range (BMI 20-25) or slightly underweight (BMI <20), except for one subject who was categorised as obese. In contrast, 84% of subjects in Group 3 were overweight or obese (BMI > 25) while the rest were in the healthy weight range. An inspection of z-scores for BMI identified two subjects as outliers, one from each study group.


Table 4.1 Participation and demographic characteristics of only the full participants in each group

| Characteristics of population | Group 1 (First year university students) | Group 2 (Third year university students studying nutrition) | Group 3 (Clinic patients) |
| --- | --- | --- | --- |
| Number targeted for recruitment (sex distribution) | 150 (na) | 70 (35 male, 35 female) | 70 (35 male, 35 female) |
| Number actually recruited who completed FC at Test 1 | 88 | 70 (22 males, 48 females) | 53 (22 males, 31 females) |
| Full participants (matched** data) | 49 (Test 1 and 2); 14 (Test 1, 2 and 3) | 42 (12 male, 30 female) | 19 (9 male, 10 female) |
| Age, mean ± SD | na | 22.3* ± 5.1 | 49.1* ± 10.6 |
| Age, 95% CI | na | 20.8, 23.9 | 43.9, 54.2 |
| Age range | na | 19-42 | 30-68 |
| Median age | na | 20 | 49 |
| BMI, mean ± SD | na | 22.6* ± 3.1 | 27.0* ± 3.4 |
| BMI, 95% CI | na | 21.9, 23.7 | 24.9, 28.2 |
| BMI range | na | 17.5-36.1 | 21.3-34.6 |
| Median BMI | | 22.6 | 26.4 |

*P

Fat intake, although slightly skewed as seen in Table 4.2, followed a normal distribution (P≥0.05) in all three samples. The high standard deviation reported in Table 4.2 suggests high individual variability. However, the mean fat intake of subjects in Test 2 of 75.0 g fat/d fell within the 95% CI for the mean fat intake for subjects in Test 1. The mean fat intake for unmatched subjects of 93.0 g fat/d fell within the 95% CI for the mean fat intake for subjects in Test 1 but outside that for Test 2. Unmatched subjects consumed significantly higher mean fat intakes than those subjects in Test 2 (t = -2.337, P = 0.022, 2-tailed, 95% CI for mean differences = -34.8, -2.8 g). No significant differences were seen in the mean fat intake between Test 1 and the unmatched responses (t = -1.027, P = 0.307, 2-tailed, 95% CI for mean differences = -26.2, +8.3).

Table 4.2 Comparison of descriptive statistics for fat intake (g fat/d) from the FC at Test 1 and Test 2 for Group 1 (n=49, matched subjects), including unmatched subjects

| Descriptive statistic | Test 1 (n=49) | Test 2 (n=49) | Unmatched subjects (n=39) |
| --- | --- | --- | --- |
| Mean (SD) | 84.9 (40.7) | 75.0 (34.2) | 93.8 (46.6) |
| 95% CI for mean | 73.2, 96.6 | 65.2, 84.8 | 80.9, 106.7 |
| Median | 80.0 | 75.3 | 93.0 |
| Variance | 1659 | 1167 | 2172 |
| Range | 9.3-199.8 | 11.7-166.3 | 18.7-220.3 |
| Pearson's skewness coefficient¹ | +0.120 | -0.009 | -0.017 |
| Standard error | 5.8 | 4.9 | 6.4 |
| Normality² | Yes (P=0.16) | Yes (P=0.11) | Yes (P=0.20) |

FC = fat checklist; SD = standard deviation; CI = confidence interval; ¹Pearson's skewness coefficient = (mean - median)/SD (values above 0.3 or below -0.3 indicate severe skewness); ²values of P>0.05 follow a normal distribution
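As a worked check of the skewness formula in the table footnote, the coefficients for Test 1 and Test 2 can be recomputed from the mean, median and SD reported in Table 4.2. The function name `pearson_skewness` is introduced here for illustration only.

```python
def pearson_skewness(mean, median, sd):
    """Pearson's skewness coefficient as defined in the footnote: (mean - median) / SD."""
    return (mean - median) / sd

# (mean, median, SD) in g fat/d, taken from Table 4.2
table_4_2 = {
    "Test 1": (84.9, 80.0, 40.7),
    "Test 2": (75.0, 75.3, 34.2),
}

skew = {test: round(pearson_skewness(*vals), 3) for test, vals in table_4_2.items()}
for test, value in skew.items():
    severe = abs(value) > 0.3  # the footnote's threshold for severe skewness
    print(test, value, "severe" if severe else "not severe")
```

Both recomputed values (0.120 and -0.009) match the table and sit well inside the ±0.3 band.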

4.2.1 Agreement assessment between Test 1 and Test 2

The mean differences and standard deviation of these differences (SDdiff) in mean fat intake and the measurement error defined by the limits of agreement are summarised in Table 4.3. Calculations for these values are found in Appendix 10.

Table 4.3 Statistical parameters for the FC used for assessing agreement in fat intake between Test 1 and Test 2 (n=49)

| Statistical parameter | g fat/d |
| --- | --- |
| Mean difference (d) in fat intake (Test 1 minus Test 2) (SDdiff) | +9.9 (33.4) |
| 95% CI for mean differences | +0.3 to +19.5 |
| Limits of agreement (mean difference (d) ± 2 SDdiff) | -56.9 to +76.7 |
| Repeatability coefficient | 66.8 |

SD = standard deviation, SDdiff = SD of the differences, CI = confidence interval

The mean difference (d) of +9.9 g fat between Test 1 and Test 2 was significantly different from zero (t=2.07, df=48, P=0.04), meaning that the FC was not reproducible. The 95% CI for the mean difference excludes zero, which corroborates the significance of the difference at the α=0.05 level. The high SDdiff reflects the large individual variability between responses. The limits of agreement showed that Test 1 can be 55.7 g fat below or 75.3 g fat above Test 2. Although the repeatability coefficient of 66.8 g fat/d falls within 2SD of the mean differences (ie acceptable statistically), the magnitude of the limits of agreement in absolute fat values confirms poor agreement between the tests.
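The entries of Table 4.3 follow from the mean difference, SDdiff and sample size alone. The sketch below recomputes them; the critical value `t_crit` is an assumption taken from standard t tables (two-sided 5% level, 48 degrees of freedom), and the repeatability coefficient is taken as 2 × SDdiff, which reproduces the tabulated 66.8.

```python
import math

n = 49            # matched subjects
mean_diff = 9.9   # g fat/d, Test 1 minus Test 2 (Table 4.3)
sd_diff = 33.4    # g fat/d, SD of the differences (Table 4.3)
t_crit = 2.0106   # assumption: tabulated two-sided 5% t value for 48 df

se = sd_diff / math.sqrt(n)                # standard error of the mean difference
t_stat = mean_diff / se                    # paired t statistic against zero
ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
limits = (mean_diff - 2 * sd_diff, mean_diff + 2 * sd_diff)
repeatability = 2 * sd_diff

print(f"t = {t_stat:.2f}")
print(f"95% CI for mean difference: {ci[0]:+.1f} to {ci[1]:+.1f}")
print(f"limits of agreement: {limits[0]:+.1f} to {limits[1]:+.1f}")
print(f"repeatability coefficient: {repeatability:.1f}")
```

These reproduce t = 2.07, a confidence interval of +0.3 to +19.5 and limits of agreement of -56.9 to +76.7 g fat/d.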

Figure 4.1 depicts a comparison of fat intakes (g fat/d) of individuals between Test 1 and Test 2 in the same subjects in Group 1. The data are scattered evenly, close to, and on both sides of the line of equity. The scatter appears wider at the higher levels of fat intake (>120 g fat/d) but does not represent a systematic bias because wide differences are also present at the lower end of fat intake. For the data in Figure 4.1, Pearson's correlation was 0.62 (P<0.01, n=49, 2-tailed). This value suggests a moderate strength of association between Test 1 and Test 2 but poor agreement because of the wide scatter of individual differences in means.
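The distinction drawn here between association and agreement can be illustrated with a deliberately extreme, invented example: two tests that differ by a constant 20 g fat/d are perfectly correlated yet never agree. The data and the helper `pearson_r` are hypothetical.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical repeated measures (g fat/d), not study data
test1 = [60.0, 80.0, 100.0, 120.0]
test2 = [value - 20.0 for value in test1]  # constant 20 g/d shift

r = pearson_r(test1, test2)
bias = mean([a - b for a, b in zip(test1, test2)])
print(f"r = {r:.2f}, mean difference = {bias:.1f} g fat/d")
```

The correlation is 1.0 even though every subject differs by 20 g fat/d between tests, which is why the differences themselves are examined as well.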

[Scatterplot. x-axis: mean daily fat intake (g fat/d) from FC for Test 2.]

Figure 4.1 Repeated measures of fat intake using FC in the same subjects in Group 1, with the line of equity (n=49)

Figure 4.2 shows these differences in fat intake plotted against the mean daily fat intake for Test 1 and Test 2.

The distribution of the differences in fat intake between Test 1 and Test 2 did not follow a normal distribution (Shapiro-Wilk statistic 0.919, df=49, P=0.01) and the data were therefore logged for further analysis. Figure 4.3 shows the logₑ-transformed data of Figure 4.2.


[Plot of differences against means. x-axis: mean daily fat intake (g fat/d) of Test 1 plus Test 2.]

Figure 4.2 Agreement assessment between difference in daily fat intake (g fat/d) of Test 1 minus Test 2 plotted against the mean fat intake in both tests (g fat/d) for Group 1 (n=49). The centre line represents the mean differences between the two tests and the other two lines represent 2SDs from the mean.

[Plot of differences against means after transformation. x-axis: mean daily fat intake (ln g fat/d) for Test 1 plus Test 2.]

Figure 4.3 The logₑ-transformed data of Figure 4.2


The scatter of the differences from the mean now appears evenly distributed in Figure 4.3. Therefore, no systematic bias was apparent.

4.2.2 Test-retest results (individual responses to each food item)

Differences in fat intake (g fat/d) of individual food items between Test 1 and Test 2 on the FC were analysed in the same 49 subjects in Group 1. No significant differences from zero were found for any individual food item except liver, which was not consumed by any subject in Test 2. Appendix 10 summarises the mean differences (g fat/d), standard error of the mean differences, 95% CI for the differences, significance of the differences and the number of subjects consuming each food item. The variability in direction of differences in fat intake for individual foods and the wide range of confidence intervals of the differences suggest a wide variation in consumption of individual foods at each test.

4.2.3 Summary of reproducibility results

The mean difference of 9.9 g fat/d in fat intake between Test 1 and Test 2 in 49 subjects in Group 1 for the reproducibility study was small, although the scatter of individual differences and the limits of agreement around the mean differences were wide (see Figure 4.3, p. 91 and Table 4.3, p. 89). Ninety-four percent of individuals in Group 1 fell within the limits of agreement (mean difference ± 2SD), and the repeatability coefficient of around 67 g fat/d also fell within these limits. Although the repeatability coefficient of the FC was therefore statistically acceptable, the large scatter of the mean differences suggests poor reproducibility in fat intakes for individuals between Test 1 and Test 2.

4.3 Criterion validity of the final FC

4.3.1 Standards used for measuring criterion validity

Forty-two subjects from Group 2 (60% of recruited sample) and 19 subjects from Group 3 (36% of recruited sample) were used for this analysis. These were the final numbers that were matched for analysis (see Table 4.1, p. 88).

In all comparisons, the FR was the 'true' or criterion measure for fat intake. The FC would be interchangeable with the FR if the results showed that most of the following criteria were satisfied.

• Ninety-five percent of the differences in fat intake in individual subjects between the FC and FR lay within the mean difference ± two standard deviations (SD) of the differences and were not clinically important using Bland-Altman plots (Bland and Altman 1986);


• Correlation of fat intake between the FC and FR showed at least fair to moderate strength (r >0.3) and were significantly associated, although there are recent criticisms in the interpretation of correlation (see Section 2.5.1.4, p 28);

• Cross-classification measures of fat intake between the FC and FR (ie Wilcoxon signed ranks, kappa statistic) showed good agreement and were significantly associated; and

• Qualitative and quantitative differences in fat intake between the FC and FR were small by inspection.
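The cross-classification criterion in the list above can be sketched as follows: each method's intakes are ranked into quartiles, and agreement is summarised as exact agreement, same-or-adjacent agreement and Cohen's kappa. The data are hypothetical, the rank-based quartile rule is one simple assumption among several possible conventions, and both function names are introduced for illustration.

```python
def quartile_ranks(values):
    """Assign each value a quartile index 0-3 according to its rank in the sample."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    quartiles = [0] * len(values)
    for rank, i in enumerate(order):
        quartiles[i] = rank * 4 // len(values)
    return quartiles

def cohen_kappa(a, b, categories=4):
    """Unweighted Cohen's kappa for two classifications of the same subjects."""
    n = len(a)
    p_obs = sum(1 for x, y in zip(a, b) if x == y) / n
    p_exp = sum((a.count(c) / n) * (b.count(c) / n) for c in range(categories))
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical paired intakes (g fat/d), not study data
fr = [30, 45, 52, 60, 71, 83, 95, 120]
fc = [35, 55, 40, 65, 70, 100, 90, 130]

q_fr, q_fc = quartile_ranks(fr), quartile_ranks(fc)
exact = sum(1 for x, y in zip(q_fr, q_fc) if x == y) / len(fr)
adjacent = sum(1 for x, y in zip(q_fr, q_fc) if abs(x - y) <= 1) / len(fr)
kappa = cohen_kappa(q_fr, q_fc)
print(f"exact = {exact:.2f}, same or adjacent = {adjacent:.2f}, kappa = {kappa:.2f}")
```

In this invented sample half the subjects fall in the same quartile and all fall in the same or an adjacent quartile, yet kappa is only about 0.33, since a quarter of exact agreement is expected by chance.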

4.3.2 Characteristics of the dietary intake in Group 2 and Group 3

4.3.2.1 The daily energy and nutrient content of the diet derived from the FR in Group 2 and Group 3

Table 4.4, p. 94 shows the summary statistics of selected nutrients and energy calculated from the FRs, including the nutrient density of fat (g fat/1000 kJ) and the ratio of estimated energy intake (EEI) to predicted basal metabolic rate (PBMR). To limit the variability caused by differences in energy intake between the groups, fat intake was energy-adjusted and expressed as g fat/1000 kJ.
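The two energy-related expressions of fat intake can be written out for a single hypothetical subject. The energy factor of 37 kJ per gram of fat is a conventional value assumed here, not one stated in this section, and the function names and intake figures are illustrative.

```python
FAT_KJ_PER_G = 37.0  # assumption: conventional energy factor for fat

def fat_density(fat_g, energy_kj):
    """Energy-adjusted fat intake in g fat per 1000 kJ."""
    return fat_g / (energy_kj / 1000.0)

def percent_energy_from_fat(fat_g, energy_kj):
    """Share of total energy intake supplied by fat, in percent."""
    return 100.0 * fat_g * FAT_KJ_PER_G / energy_kj

# Hypothetical subject, not study data
fat_g, energy_kj = 60.0, 7400.0
density = fat_density(fat_g, energy_kj)
pct_fat = percent_energy_from_fat(fat_g, energy_kj)
print(f"{density:.2f} g fat/1000 kJ, {pct_fat:.1f}% energy from fat")
```

Group values in Table 4.4 need not equal the ratio of the group means, since a mean of individual ratios generally differs from a ratio of means.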

Energy-adjusted fat intake (g fat/1000 kJ) was similar between and within each group and between sexes. The percentage contribution of fat to total energy intake was close to, or slightly exceeded, the recommended Australian population target of <30% for each group and sex. These values, however, are relative to total energy intake and are distorted at high and low levels of energy intake and do not provide a reliable assessment of fat intake. The absolute fat intake (g fat/d) in males in Group 3 was low compared to Group 2 males. There were no significant differences in energy from fat intake or any other energy-yielding nutrients in females between Group 2 and Group 3 (see Table 4.4).

Significant differences (P≤0.05) in absolute intakes of energy, EEI:PBMR, fat, protein, and carbohydrate, however, were found in males between Group 2 and Group 3. All other variables for males were not significantly different between Group 2 and Group 3. The difference in energy intake in males between groups was 6046 kJ/d (95% CI = 3197, 8896).


Table 4.4 Summary statistics of macronutrient and energy contribution in Group 2 and Group 3 calculated from the FR, including estimated Basal Metabolic Rate

| | Group 2 (n=42) females (n=30), mean (SD) | 95% CI | Group 2 males (n=12), mean (SD) | 95% CI | Group 3 (n=19) females (n=10), mean (SD) | 95% CI | Group 3 males (n=9), mean (SD) | 95% CI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BMI (kg/m²) | 22.0 (3.3) | 20.8, 23.2 | 24.5 (2.4) | 22.9, 26.1 | 26.9 (3.3) | 24.3, 29.4 | 27.2 (3.7) | 24.4, 30.0 |
| BMR (kJ/d) | 5818 (657) | 6064, 6649 | 7613 (588) | 7239, 7989 | 6020 (523) | 5646, 6394 | 7558 (533) | 7148, 7968 |
| Energy intake, estimated (EEI) (kJ/d) | 6649 (2206) | 5825, 7473 | 12864* (543) | 10612, 15115 | 6220 (1855) | 4872, 7568 | 6817* (2328) | 5010, 8590 |
| EEI/PBMR (actual values) | 1.16 | 1.00, 1.31 | 1.68* | 1.43, 1.93 | 1.02 | 0.84, 1.21 | 0.91* | 0.65, 1.17 |
| EEI/PBMR¹ (group cut-off two adjusted for n and gender) | 1.33 | | 1.38 | | 1.30 | | 1.38 | |
| Protein (g/d) | 69.6 (21.5) | 61.6, 77.7 | 138.3* (57.8) | 101.5, 175.0 | 65.2 (14.1) | 55.1, 75.2 | 71.5* (22.6) | 54.2, 88.9 |
| Energy from protein (%) | 18.3 (4.4) | 17, 20 | 17.8 (4.3) | 15, 21 | 18.3 (3.1) | 16, 21 | 18.3 (3.7) | 16, 21 |
| Fat (g/d) | 55.1 (20.8) | 48, 63 | 108.2* (34.3) | 89, 13 | 54.0 (22.2) | 68, 70 | 58.3* (30.5) | 35, 37 |
| Energy from fat (%) | 32.0 (6.8) | 29.5, 34.5 | 31.2 (4.7) | 28.2, 34.1 | 32.2 (5.2) | 28.5, 35.9 | 30.1 (9.1) | 23.1, 37.1 |
| Fat (adjusted for energy) (g fat/1000 kJ) | 8.6 (1.8) | 7.9, 8.2 | 8.4 (1.3) | 7.6, 9.2 | 8.5 (1.4) | 7.4, 9.6 | 7.9 (2.5) | 6.2, 9.7 |
| CHO (g/d) | 196 (72.6) | 169, 223 | 372 (91.4) | 314, 430 | 164 (32.4) | 141, 169 | 182 (63.4) | 133, 231 |
| Energy from CHO (%) | 47.6 (6.8) | 45, 50 | 47.2 (7.2) | 43, 52 | 43.6 (8.0) | 38, 49 | 43.4 (6.4) | 39, 48 |
| Alcohol (g/d) | 4.9 (12.3) | na | 15.2 (20.3) | na | 15.4 (22.2) | na | 18.8 (19.1) | na |
| Energy from alcohol (%) | 1.8 (4.2) | | 3.2 (3.3) | | 5.7 (7.5) | | 8.1 (8.8) | |

BMI = Body Mass Index, SD = standard deviation, BMR = basal metabolic rate, kJ = kilojoules, CHO = carbohydrate, CI = 95% confidence interval for the mean, *P<0.05, na = not available, ¹cut-off two values for EEI:PBMR from Goldberg et al. 1991

The EEI:PBMR was calculated at the group and individual levels. At the group level, the mean EEI:PBMR was well below physiological minima in both males and females in Group 3, and females in Group 2. Twenty-two females (out of 30 females in Group 2) and eight (out of 10 females in Group 3) had an EEI:PBMR ratio less than the Goldberg et al. (1991) group cut-offs shown in Table 4.4.

At the individual level, Goldberg et al. (1991) reported that an EEI:PBMR ratio of 1.06 for males and 0.88 for females represents the lower 95% CI for a plausible ratio, when derived from four-day food records for a single individual. Distribution of these ratios for individuals in Group 2 and Group 3 is shown in Figures 4.4 and 4.5. Five females in Group 2, two females in Group 3 and seven males in Group 3 were below these cut-off levels. No male subjects in Group 2 fell below this level. These figures show clearly that there was wide variability in the distribution of energy intake between and within groups and between sexes, and that males in Group 3 were consuming very low energy intakes relative to their estimated BMR. This suggests that the majority of subjects under-estimated their energy intakes.
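The individual-level screen described above amounts to dividing each subject's estimated energy intake by the predicted BMR and comparing the ratio with the sex-specific cut-off quoted in the text (1.06 for males, 0.88 for females). The subjects below are hypothetical and `below_cutoff` is a name introduced for illustration.

```python
# Individual-level cut-offs for EEI:PBMR as quoted in the text
CUTOFFS = {"male": 1.06, "female": 0.88}

def below_cutoff(eei_kj, pbmr_kj, sex):
    """True when the EEI:PBMR ratio falls below the cut-off for that sex."""
    return eei_kj / pbmr_kj < CUTOFFS[sex]

# Hypothetical subjects: (sex, EEI in kJ/d, PBMR in kJ/d), not study data
subjects = [
    ("female", 5000.0, 6000.0),
    ("male", 12000.0, 7500.0),
    ("male", 6800.0, 7500.0),
]

flags = [below_cutoff(eei, pbmr, sex) for sex, eei, pbmr in subjects]
for (sex, eei, pbmr), flagged in zip(subjects, flags):
    print(sex, round(eei / pbmr, 2), "below cut-off" if flagged else "plausible")
```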

[Bar chart. Legend: females (open bars), males (filled bars). x-axis: EEI:PBMR.]

Figure 4.4 Distribution of individual estimated energy intake (EEI):predicted basal metabolic rate (PBMR) calculated from the FR in Group 2. Each bar represents one subject.


[Bar chart. Legend: females (open bars), males (filled bars). x-axis: EEI:PBMR.]

Figure 4.5 Distribution of individual estimated energy intake (EEI):predicted basal metabolic rate (PBMR) calculated from the FR in Group 3. Each bar represents one subject.

4.3.2.2 Comparison of mean fat intake (g fat/d) between the FRs and FCs in Group 2 and Group 3

Table 4.5 compares the fat intake (g fat/d) and descriptive statistics from the FCs and FRs for Group 2 and Group 3. Inspection of the distribution of these data was necessary to apply the appropriate statistical tests of association with confidence (eg correlation and linear regression). The distribution of fat intake (g fat/d) in both the FR and FC in Group 2 did not follow a normal distribution and was positively skewed. Based on visual examination of the probability plot, the deviation from normality was not gross. The distribution of fat intake from the FRs for Group 3, although skewed, still followed a normal distribution, in contrast to the non-normal distribution of the FC data.

For both Group 2 and Group 3, the mean fat intake (g fat/d) from the FC fell within the 95% CI for the mean fat intake from the FR. Although the variances between the FR and FC appear markedly different in both groups, the variance ratio and F distribution were considered statistically equal in Group 2 (F=1.64, 0.1>P>0.05). The variance ratio for Group 3 of 1.92 also confirmed that the data in the FR and FC came from the same population. The wide 95% CI, especially for Group 3, reflected the small sample size. Similarly, the large standard deviation in both groups suggested a wide variability in individual fat intake.


Table 4.5 Mean daily fat intake (g fat/d) and descriptive statistics for the FR and FC in Group 2 and Group 3

| | Group 2 (n=42) FR | Group 2 (n=42) FC | Group 3 (n=19) FR | Group 3 (n=19) FC |
| --- | --- | --- | --- | --- |
| Mean (SD) | 72.0 (35.8) | 73.8 (28.0) | 56.0 (25.7) | 59.9 (37.4) |
| 95% CI for mean | 60.8, 83.1 | 64.0, 81.5 | 43.6, 68.4 | 41.6, 78.5 |
| Median | 61.0 | 64.5 | 49.8 | 46.5 |
| Variance | 1279 | 791 | 664 | 1397 |
| Range | 20.0-175.0 | 25.4-141.0 | 15.1-101.5 | 14.7-138.0 |
| Pearson's skewness coefficient¹ | +0.307 | +0.332 | +0.241 | +0.358 |
| SE | 5.51 | 4.38 | 5.91 | 8.78 |
| Normality² | No (P=0.01) | No (P=0.01) | Yes (P=0.22) | No (P=0.01) |
| Test for normality (Shapiro-Wilk statistic) | 0.918 | 0.907 | 0.928 | 0.869 |

FC = fat checklist, FR = three-day food record, na = not applicable, SD = standard deviation, SE = standard error, n = number of subjects, ¹Pearson's skewness coefficient = (mean - median)/SD (values above 0.3 or below -0.3 indicate severe skewness), ²Normality: values of P>0.05 indicate that the data followed a normal distribution

In summary, the majority of subjects in Group 2 consumed between 50 and 90 g fat/d compared with Group 3, who consumed between 40 and 60 g fat/d based on food record measures (visual examination of the plots). Some subjects in both groups, however, recalled that they consumed more than 100 g fat/d based on FC measures. Two outliers were identified in the FCs for Group 3 but not in the FRs. One outlier was identified in the FRs for Group 2 but not in the FCs. These outliers were not excluded from the analysis because they represented actual reported or recalled intakes. They did, however, contribute to an inflation of the variance.

4.3.3 Tests of association and agreement between the methods

Differences in fat intake of individuals and the magnitude of these differences, rather than absolute comparisons of means, were used to assess association and agreement between the FR and FC.


4.3.3.1 Comparison of individual fat intake (g fat/d) between the FR and FC

Inspection of the data is required to satisfy the assumptions for applying correlation and regression analysis and to determine the magnitude and direction of differences in fat intake between the FC and FR in individuals for the Bland-Altman plots.

For Group 2

Figure 4.6 represents the association of the fat intake data (g fat/d) between the two methods, for Group 2.

[Scatterplot. x-axis: mean daily fat intake (g fat/d) from FC.]

Figure 4.6 Mean daily fat intake (g fat/d) for Group 2 measured by the FR and FC with the line of equity and 95% prediction intervals (n=42)

Fat intake from the FC and FR did not follow a normal distribution (see Table 4.5) so data were logₑ-transformed. Figure 4.7 shows the logₑ-transformed data from Figure 4.6 and indicates some improvement in association between the methods. No consistent bias or trend in under-reporting or over-reporting on the FC was apparent.


[Scatterplot of transformed data. x-axis: mean daily fat intake (ln g fat/d) from FC.]

Figure 4.7 Data of Figure 4.6 after logarithmic transformation (n=42)

Using the traditional correlation approach, the data in Figure 4.7 appear to show a good correlation suggesting a strong association between the methods (r=0.68 (Pearson's correlation), P

For Group 3

Figure 4.8 shows the association of the fat intake data (g fat/d) between the FR and FC for Group 3. Fat intakes calculated from the FC for this group were skewed so data were logₑ-transformed. Figure 4.9 shows the logₑ-transformed data from Figure 4.8. Some improvement in association was seen after this transformation, especially at higher intakes of fat. Subjects tended to over-report fat intake at the higher intakes of fat (>90 g fat/d) and under-report at intakes less than this (ie 30-90 g fat/d). However, around 40% (8/19) of subjects recalled fat intakes accurately (ie represented by data falling along the line of equity) in Group 3 compared to only 24% (10/42) in Group 2.


[Scatterplot. x-axis: mean daily fat intake (g fat/d) from FC.]

Figure 4.8 Mean daily fat intake (g fat/d) for Group 3 measured by the FR and the FC, with the line of equity (n=19)

[Scatterplot of transformed data. x-axis: mean daily fat intake (ln g fat/d) from FC.]

Figure 4.9 Data of Figure 4.8 after logarithmic transformation

Using the traditional correlation approach, the data in Figure 4.9 show a high correlation, suggesting that there was greater strength in association between the FC and FR in Group 3 than Group 2 (r = 0.84 (Pearson's correlation), P < 0.01 (2-tailed)), but the wide scatter of 30-40 g fat difference implies poor agreement for some individuals.

4.3.3.2 Correlation coefficients between the FR and FC

Table 4.6 summarises the interclass correlations, significance of those correlations and the confidence intervals around the correlation between the FR and FC for Group 2 and Group 3.

Table 4.6 Comparison of the interclass correlations for fat intake (ln g fat/d) between the FR and FC for Group 2 and Group 3

| Group | Pearson's correlation coefficients (r) | CI for r |
| --- | --- | --- |
| Group 2 (nutrition students) (n=42) | 0.68* | 0.47, 0.81 |
| Group 3 (clinic patients) (n=19) | 0.84* | 0.62, 0.97 |

CI = confidence interval, n = number of subjects, *P<0.01, 2-tailed, FC = fat checklists, FR = food records

Both correlation coefficients were positively and significantly associated (P<0.01, 2-tailed). These correlations, however, conceal the apparent scatter of individual data and poor agreement shown in Figures 4.6 to Figure 4.9.
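A confidence interval for a correlation coefficient is often obtained with Fisher's z-transformation. The sketch below applies that approach to the Group 2 values (r = 0.68, n = 42). It is an assumption that a comparable method lies behind the intervals reported here; the source does not state how they were derived, so the result need not match Table 4.6 exactly. `fisher_ci` is a hypothetical helper.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via Fisher's z-transformation."""
    z = math.atanh(r)              # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper

# Group 2 correlation and sample size as reported in the text
lower, upper = fisher_ci(0.68, 42)
print(f"approximate 95% CI for r: {lower:.2f}, {upper:.2f}")
```

The width of the interval depends strongly on n, which is why the smaller Group 3 sample carries more uncertainty around its higher correlation.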

4.3.3.3 Measurement error of the FC for Group 2 and Group 3

The relationship between the mean differences in fat intake (FC minus FR) and the mean fat intake ((FC+FR)/2), including the limits of agreement and the precision of these limits, determines the measurement error of the FC according to the Bland-Altman technique. Figure 4.10 and Figure 4.11 illustrate the differences in fat intake (g fat/d) between the two methods plotted against their mean for Group 2 and Group 3, respectively.

For Group 2, the scatter of the differences from the combined mean tends to widen as fat intake increases, but not consistently, as several differences at higher fat intakes are small. Therefore, no systematic bias was apparent. The mean differences followed a normal distribution (Shapiro-Wilk statistic = 0.947, df = 41, P=0.08).

[Figure 4.10: Bland-Altman plot. x-axis: mean daily fat intake (g fat/d) for FC + FR; y-axis: difference in fat intake (FC minus FR)]

Figure 4.10 Agreement assessment between the two methods using Bland-Altman plots. Differences in fat intake (g fat/d) between the two methods are plotted against the means for Group 2 (n = 42). The centre line represents the mean difference between the two methods and the other two lines represent 2SDs from the mean difference

For Group 3, the mean differences also followed a normal distribution (Shapiro-Wilk statistic = 0.913, df = 19, P=0.08). Subjects in Group 3 tended to over-report on the FC at high levels of fat intake and under-report on the FC at lower levels of fat intake, as seen in Figure 4.11.

[Figure 4.11: Bland-Altman plot. x-axis: mean daily fat intake for FC + FR (0 to 120); y-axis: difference in fat intake (FC minus FR) (-50 to 50)]

Figure 4.11 Agreement assessment between the two methods using Bland-Altman plots. Differences in fat intake (g fat/d) between the two methods are plotted against the means for Group 3 (n = 19). The centre line represents the mean difference between the two methods and the other two lines represent 2SDs from the mean difference (95% CI).


The mean difference (FC minus FR), the standard deviation of this difference and the measurement error defined by the limits of agreement for each group are summarised in Table 4.7. Calculations for these values are found in Appendix 10. The 95% CI for the mean differences in fat intake between the FC and FR included zero (no difference) for both groups. Therefore, mean fat intake estimated by the FC did not differ significantly from that estimated by the FR for either Group 2 or Group 3.

However, the measurement error (defined by limits of agreement (mean ± 2SD) as seen in Table 4.7) showed that the FC can be as high as 52g fat above or as low as 51g fat below the FR for Group 2 and 37 g fat above and 30 g fat below the FR for Group 3. These values for both groups are large in absolute terms and are not acceptable for measuring fat intake of individuals in either group.

Table 4.7 Measurements used for assessing agreement between the FR and FC, including the measurement error of the FC

Measurement                                             Group 2                Group 3
                                                        (Nutrition students)   (Clinic patients)
                                                        (n=42)                 (n=19)
Mean difference in fat intake (FC minus FR) (g fat/d)   +0.8                   +3.9
% mean difference                                       1.1                    7.0
SDdiff (g fat/d)                                        25.7                   16.7
95% CI around mean differences                          -7.2, +8.8             -4.2, +11.9
Limits of agreement (measurement error)
  Mean + 2SDdiff                                        +52.1                  +37.3
  Mean - 2SDdiff                                        -50.6                  -29.6
95% CI around the limits of agreement
  Upper limits                                          +38.3, +66.0           +23.2, +51.4
  Lower limits                                          -64.5, -36.7           -43.6, -15.5

FR = food record, FC = fat checklist, % mean difference = (mean fat intake from FC minus mean fat intake from FR)/mean fat intake from FR, SD = standard deviation, CI = confidence interval, SDdiff = SD of the mean differences

The 95% CI around the limits of agreement were used to measure the precision of these limits. For Group 2, the 95% CI for the lower limit of agreement was -65 to -37 g fat/d and for the upper limit of agreement, +38 to +66 g fat/d. For Group 3, the 95% CI for the lower limit of agreement was -44 to -16 g fat/d and for the upper limit of agreement, +23 to +51 g fat/d. These intervals are large and clearly show the discrepancies in fat intake between the FC and FR for both Group 2 and Group 3.
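The agreement measures in Table 4.7 can be approximately reproduced from the summary statistics alone. The sketch below is illustrative only: it assumes the usual Bland-Altman approximation for the standard error of a limit of agreement, sqrt(3 x SDdiff^2 / n), and takes the t critical value as an input (2.10 for 18 degrees of freedom). The function name and its arguments are hypothetical, and the calculations in Appendix 10 may differ in detail.

```python
import math

def bland_altman_summary(mean_diff, sd_diff, n, t_crit):
    """Limits of agreement and approximate 95% CIs from summary statistics.

    Assumes limits of mean +/- 2 SD and the approximation
    SE(limit) = sqrt(3 * sd_diff**2 / n); t_crit is supplied by the caller.
    """
    se_mean = sd_diff / math.sqrt(n)           # standard error of the mean difference
    se_limit = math.sqrt(3 * sd_diff**2 / n)   # approximate standard error of each limit
    upper = mean_diff + 2 * sd_diff
    lower = mean_diff - 2 * sd_diff
    return {
        "ci_mean": (mean_diff - t_crit * se_mean, mean_diff + t_crit * se_mean),
        "limits": (lower, upper),
        "ci_lower_limit": (lower - t_crit * se_limit, lower + t_crit * se_limit),
        "ci_upper_limit": (upper - t_crit * se_limit, upper + t_crit * se_limit),
    }

# Group 3 (clinic patients): mean difference +3.9, SDdiff 16.7, n = 19
group3 = bland_altman_summary(mean_diff=3.9, sd_diff=16.7, n=19, t_crit=2.10)
for name, (low, high) in group3.items():
    print(f"{name}: {low:+.1f}, {high:+.1f}")
```

Under these assumptions the Group 3 results fall within a few tenths of a gram of the tabulated values; small differences are expected because the inputs are rounded.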


These results suggested that the agreement between the FC and FR was poor and thus unacceptable as a clinical measure of fat intake in individuals.

4.3.3.4 Least squares (linear) regression analysis

Fat intake data (g fat/d) calculated from the FC and FR for Group 2 (n=42) and Group 3 (n=19) were pooled and used for regression analysis. Figures 4.12 and 4.13 show the least squares regression lines for Group 2 and Group 3, including the 95% prediction intervals. The fitted regression line represents the proportion of variability in the food record (the dependent variable, Y) that is explained by the FC, and the scatter of the data points around it indicates the amount of unexplained variability in predicting 'true' fat intake using the FC. The prediction interval sets the 95% limits for individual cases around the regression line.

[Figure 4.12: scatter plot with fitted regression line and 95% prediction interval. x-axis: mean daily fat intake (g fat/d) from FC; y-axis: mean daily fat intake (g fat/d) from FR]

Figure 4.12 The fitted regression line for individual subjects for Group 2, including 95% prediction intervals

For Group 2, two individuals (comprising 5% of the subjects) fell outside the 95% prediction interval and the data points were more widely scattered than those for Group 3. All individual responses in Group 3 fell within the 95% prediction interval.


[Figure 4.13: scatter plot with fitted regression line and 95% prediction interval, Rsq = 0.8551. x-axis: mean daily fat intake (g fat/d) from FC; y-axis: mean daily fat intake (g fat/d) from FR]

Figure 4.13 The fitted regression line for individual subjects for Group 3, including 95% prediction intervals for individuals

Normality, linearity and variability of the regression model were checked to assess whether the assumptions for using regression analysis were met (see Chapter 3, p. 82). Figures 4.14 and 4.15 show the regression residuals plotted against the predicted values from the FC for Group 2 and Group 3, respectively. These plots are used to check linearity and variability of the model.

[Figure 4.14: scatter plot. x-axis: unstandardised predicted value from FC for Group 2; y-axis: studentised residual]

Figure 4.14 Studentised residuals from the regression line against the FC values (or unstandardised predicted values) for Group 2


[Figure 4.15: scatter plot. x-axis: unstandardised predicted value from FC for Group 3; y-axis: studentised residual]

Figure 4.15 Studentised residuals from the regression line against the FC values (or unstandardised predicted values) for Group 3

Figures 4.16 and 4.17 show normal plots of the residuals from Figures 4.14 and 4.15. These plots, in combination with the Shapiro-Wilk statistic, determine normality of the model.

[Figure 4.16: normal plot of residuals. x-axis: observed value (-60 to 100)]

Figure 4.16 Normal plot of residuals from Figure 4.14 for Group 2


[Figure 4.17: normal plot of residuals. x-axis: observed value (-20 to 30)]

Figure 4.17 Normal plot of residuals from Figure 4.15 for Group 3

Despite the presence of two data points in Group 2 outside the 95% prediction interval (see Figure 4.12, p. 104), Figures 4.16 and 4.17 indicated that the assumptions for using linear regression were satisfied for Group 2 and Group 3. The scatter around the regression line in Figure 4.12 and Figure 4.13 was fairly even and symmetric for both groups; therefore linear relationships were accepted. The residual variances of the FR, the dependent variable (Y), as assessed by a plot of the studentised residuals against the predicted values (see Figures 4.14 and 4.15), showed random scatter around zero, indicating that the variance of the FR is similar for each value of the FC. The residuals have a distribution that is close to normal, as seen in Figure 4.16 and Figure 4.17.

Table 4.8 shows the analysis of variance table of the regression of the FC on the FR for Group 2 and Group 3. The residuals in Table 4.8 indicate the amount of unexplained variability. For Group 2, the proportion of variation explained is the sum of squares of the regression model expressed as a proportion of the total sum of squares. From Table 4.8, this value is 25789/52450 = 0.492 or around 49%. This is R² or the coefficient of determination (see also Figure 4.12 and 4.13). Despite the significant slope (P = 0.0001), the variability of fat intake predicted from the FC explains only 49% of the variance of fat intake from the FR for Group 2. For Group 3, R² = 0.855 or 86%. Close to 86% of the variability of 'true' fat intake is explained by the variability of the fat checklist. The remaining 14% of the variability in fat intake in the model for Group 3 is unexplained.


Table 4.8 Analysis of variance corresponding to the regression of the FC on the FR

Group  Source of variation  Sum of squares  Degrees of freedom  Mean squares  F       P
2      Regression           25789           1                   25789         38.69   <0.001
       Residual             26661           40                  667
       Total                52450           41
3      Regression           10225           1                   10225         100.30  <0.001
       Residual             1733            17                  101.9
       Total                11959           18
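The R² values and F ratios quoted for Table 4.8 follow directly from the sums of squares. The short sketch below recomputes them as a check; it assumes one regression degree of freedom (a single predictor, the FC), and the function name is hypothetical.

```python
def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Return (R squared, F ratio) for a simple linear regression ANOVA table."""
    ss_total = ss_regression + ss_residual
    r_squared = ss_regression / ss_total          # proportion of variation explained
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual       # unexplained variance per degree of freedom
    f_ratio = ms_regression / ms_residual
    return r_squared, f_ratio

# Sums of squares and residual degrees of freedom from Table 4.8;
# df_regression = 1 is an assumption (one predictor).
r2_group2, f_group2 = anova_summary(25789, 26661, 1, 40)
r2_group3, f_group3 = anova_summary(10225, 1733, 1, 17)

print(f"Group 2: R2 = {r2_group2:.3f}, F = {f_group2:.2f}")
print(f"Group 3: R2 = {r2_group3:.3f}, F = {f_group3:.2f}")
```

Because the tabulated sums of squares are rounded, the recomputed values agree with the table only to the precision shown there.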

Table 4.9 shows the regression coefficients for the model for Group 2 and Group 3.

Table 4.9 Variables in the regression equation used for determining 'true' fat intake from the FC in Group 2 and Group 3

Group  b      SE b   Constant (a)  t       95% CI for b  Mean predicted values (g fat/d)
2      0.891  0.143  7.1           6.22*   0.60, 1.18    72.0
3      0.638  0.064  17.8          10.02*  0.50, 0.77    56.0

*P

Y = a + bX

Where:
Y = the predicted daily fat intake ('true' value)
a = the regression constant (the intercept)
b = the regression coefficient (the slope)
X = the predictor value (the total amount of fat (g fat/d) from all items on the fat checklist)

By substituting the values in Table 4.9 into the regression equation, a predicted 'true' fat intake can be calculated for individuals in either group. For Group 3, this model can only be used if the fat intake from the FC does not exceed 102 g fat/d, which is the upper level of fat intake reported by Group 3. For example, for a mean daily fat intake of 59.9 g fat from the FC in clinic patients (see Table 4.5, p. 97), the predicted 'true' fat intake would be:

Predicted 'true' fat intake (g fat/d) = 17.82 + (0.638 x 59.9) = 56.0 g fat/d
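The worked example above can be expressed as a small function. This is a minimal sketch using the Group 3 coefficients quoted in the text (a = 17.82, b = 0.638) and the stated upper bound of 102 g fat/d; the dictionary and function names are hypothetical, and treating the bound as a hard error is one possible design choice rather than something specified in the study.

```python
# Group 3 coefficients as quoted in the worked example; max_fc is the upper
# level of FC fat intake reported by Group 3.
GROUP3_MODEL = {"a": 17.82, "b": 0.638, "max_fc": 102.0}

def predict_true_fat(fc_fat, model):
    """Predict 'true' fat intake (g fat/d) from an FC fat intake using Y = a + bX."""
    if fc_fat > model["max_fc"]:
        # The model is not extrapolated beyond the range of the observed FC data.
        raise ValueError(
            f"FC intake {fc_fat} g fat/d exceeds the model range ({model['max_fc']} g fat/d)"
        )
    return model["a"] + model["b"] * fc_fat

predicted = predict_true_fat(59.9, GROUP3_MODEL)
print(f"Predicted 'true' fat intake: {predicted:.1f} g fat/d")
```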


4.3.3.5 Comparison between methods using cross-classification techniques

Fat intake data from Group 2 and Group 3 were pooled for this comparison. Although there were significant differences in the demographics and absolute daily fat intakes between subjects in Group 2 and Group 3, the energy-adjusted fat contribution to total energy was similar (see Table 4.1, p. 88 and Table 4.5, p. 97). Individuals in both groups showed a wide scatter of fat intake from the FR, therefore pooling fat intake data from Group 2 and Group 3 allowed agreement assessment of a wider range of fat intake. Since the previous agreement measures in Section 4.3.3 (p. 101-104) showed poor agreement at the individual level, pooling Group 2 and Group 3 identified whether the FC determined 'true' fat intake from fat intake data ranked at a range of levels set by the distribution of the FR fat intake data. This assessment is useful for assessing the validity of the FC at the group rather than individual level.

The 25th, 50th and 75th percentile for fat intake for Group 2 and Group 3, the cut-off values for quartiles from the FR and FC, are shown in Table 4.10 and represented graphically in Figures 4.18 and 4.19.

Table 4.10 The 25th, 50th and 75th percentile of intake from the FR and FC for Group 2 and Group 3, pooled (g fat/d) (n=61)

      25th percentile  50th percentile  75th percentile
FR    41.5             57.0             93.0
FC    46.5             62.4             87.5

FR = food record, FC = fat checklist

One outlier was identified in the FR data as seen in Figure 4.18. This was retained for the cross-comparison analysis. The median (50th percentile) was 57.0 g fat/d for the FR and 62.4 g fat/d for the FC. In contrast to the data from Group 3 alone, combined data from Group 2 and Group 3 changed the distribution of fat and showed that subjects over-reported fat intake on the FC at the low to middle levels and under-reported at higher levels.


[Figure 4.18: box and whiskers plot of fat intake from the FR, N = 61, with one outlier (labelled 28) above the upper whisker]

Figure 4.18 A box and whiskers plot of fat intake of the FR quartiles for Group 2 and 3 (pooled), showing 2.5, 25, 50, 75, and 97.5 cumulative relative frequencies (percentiles) (n=61)

[Figure 4.19: box and whiskers plot of fat intake from the FC, N = 61]

Figure 4.19 A box and whiskers plot of fat intake for the FC quartiles for Group 2 and 3 (pooled), showing 2.5, 25, 50, 75, and 97.5 cumulative relative frequencies (percentiles) (n=61)


Table 4.11 shows a cross-comparison of fat intakes ranked in quartiles assessed by the FR and FC for 61 subjects in Group 2 and Group 3. Ninety-two percent of subjects (56/61), when fat intakes were classified by the FR, fell into the same or adjacent quartile when classified by the FC. Fifty-four percent (33/61) were classified in the same quartile by both methods. No subjects were grossly misclassified (ie at the extreme quartiles). Overall, twenty-one percent (13/61) under-reported and 23% (14/61) over-reported on the FC. A kappa (k) statistic of 0.39 calculated from Table 4.11, although significant (t=5.255, P<0.01), suggests that the overall agreement between the ranked quartiles of the FCs and FRs was 'poor' according to Fleiss (1981) but 'fair' according to Altman (1991, 404) (see Section 2.5.1.7, p. 31).

Table 4.11 Cross-classification of ranked fat intake (based on the mean daily fat intake, g fat/d) between the FC and FR in Group 2 and Group 3, pooled (n=61)

Food record        Fat checklist quartiles          Total number
quartiles          1.00   2.00   3.00   4.00        of subjects
1.00 (low)         9      4      2      -           15
2.00               6      6      4      1           17
3.00               1      4      7      3           15
4.00 (high)        -      1      2      11          14
Total              16     15     15     15          61

Numbers in table = numbers of subjects
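The agreement figures quoted for Table 4.11 can be recomputed from the cell counts. The sketch below assumes an unweighted Cohen's kappa and treats the blank cells as zero, which is consistent with the printed row and column totals; the thesis does not state which kappa variant was used, and the names in the code are hypothetical.

```python
# Counts from Table 4.11 (rows: FR quartiles 1-4, columns: FC quartiles 1-4).
# Blank cells are taken as zero, an assumption that reproduces the printed totals.
TABLE_4_11 = [
    [9, 4, 2, 0],
    [6, 6, 4, 1],
    [1, 4, 7, 3],
    [0, 1, 2, 11],
]

def agreement_stats(table):
    """Exact agreement, exact-or-adjacent agreement and unweighted Cohen's kappa."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    exact = sum(table[i][i] for i in range(k))
    # Cells on the diagonal or one quartile away from it
    exact_or_adjacent = sum(
        table[i][j] for i in range(k) for j in range(k) if abs(i - j) <= 1
    )
    p_observed = exact / n
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return {"n": n, "exact": exact, "exact_or_adjacent": exact_or_adjacent, "kappa": kappa}

stats = agreement_stats(TABLE_4_11)
print(f"same quartile: {stats['exact']}/{stats['n']}")
print(f"same or adjacent quartile: {stats['exact_or_adjacent']}/{stats['n']}")
print(f"kappa: {stats['kappa']:.2f}")
```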

Mean daily fat intakes (g fat/d) and the 95% CI for each quartile are presented in Table 4.12. Differences in mean fat intake for the first and third (P=0.005) and first and fourth quartile (P=0.03) were significant. Differences in mean fat intake between the first and second, second and third, and third and fourth quartiles were not significant. This suggests that the FC could only distinguish extremes of fat intake in the pooled data, ie <47 g fat/d (upper level cut-off of the 25th percentile) and >88 g fat/d (upper level cut-off from the 75th percentile). The FC could not distinguish differences in the low to middle levels of fat intake within this range.


Table 4.12 Mean fat intake (g fat/d) in FR quartiles and FC quartiles for subjects in Group 2 and Group 3, pooled (n=61)

Fat intake (g fat/d)          Quartile 1    Quartile 2    Quartile 3    Quartile 4
FC
  Mean (SD)                   39.5 (11.2)   55.0 (4.9)    71.4 (7.7)    114.5 (19.9)
  95% CI around mean          30.3, 42.2    52.2, 57.7    67.1, 75.7    103.5, 125.5
  Median for each quartile    42.7          54.3          69.5          118.0
FR
  Mean (SD)                   32.8 (8.0)    50.5 (5.2)    74.3 (11.7)   115.9 (24.7)
  95% CI around mean          28.3, 37.2    47.8, 53.2    67.8, 80.8    101.7, 130.2
  Median for each quartile    34.0          51.7          75.0          103.5

CI = confidence interval, SD = standard deviation, FC = fat checklist, FR = food record

4.3.3.6 Sensitivity, specificity and predictive values of the FC

Table 4.13 summarises the number of subjects categorised as 'lower fat' and 'higher fat' consumers from the FC and FR in Group 2 and Group 3 (pooled). A cut-off of ≥70 g fat/d was used to determine the sensitivity and specificity of the FC to a 'higher fat' intake (see Chapter 3, Section 3.5.2, p. 88).

Table 4.13 Cross-classification of 'lower' and 'higher' fat consumers between the FC and FR for Group 2 and Group 3, pooled (n=61)

FC               FR                            Total
                 ≥70 g fat/d    <70 g fat/d
≥70 g fat/d      17             5              22
<70 g fat/d      6              33             39
Total            23             38             61

FC = fat checklist, FR = food record

Of the 23 subjects whose mean fat intake from the FR was ≥70 g fat/d, 17 also had FC estimates at or above this level. Thus, the FC classified subjects with 74% sensitivity. Of the 38 subjects who had mean fat intakes <70 g fat/d according to the FR, 33 subjects also had FC estimates of fat intake below this level. The specificity was therefore 87%.
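The sensitivity and specificity above, along with the predictive values, can be derived from the four cells of Table 4.13. The following sketch uses the standard definitions with the FR as the reference method; the function name is hypothetical.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2 x 2 cross-classification.

    tp: FC >= cut-off and FR >= cut-off   fp: FC >= cut-off but FR < cut-off
    fn: FC < cut-off but FR >= cut-off    tn: FC < cut-off and FR < cut-off
    """
    return {
        "sensitivity": tp / (tp + fn),  # higher-fat consumers correctly identified
        "specificity": tn / (tn + fp),  # lower-fat consumers correctly identified
        "ppv": tp / (tp + fp),          # FC 'higher fat' results confirmed by the FR
        "npv": tn / (tn + fn),          # FC 'lower fat' results confirmed by the FR
    }

# Cell counts from Table 4.13 (cut-off of >= 70 g fat/d)
indices = diagnostic_indices(tp=17, fp=5, fn=6, tn=33)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```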

Because the mean EEI:PBMR ratio was below physiological levels for many individuals and fat intakes were correspondingly low or under-reported, further analyses were conducted to determine how well the FC detected individuals at lower levels of fat intake. Table 4.14 provides an indication of the performance of the FC in the same population at varying fat cut-off levels. A value of 62.5 g fat per day was based on a cut-off of 30% of energy from fat, using the RDI for energy (7,700 kJ) for a reference female (NHMRC 1991, 32). The lowest cut-off of 40 g fat/d was arbitrarily chosen. Appendix 12 depicts the calculations used for determining these values.
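As a rough check on the 62.5 g cut-off, 30% of a 7,700 kJ energy intake can be converted to grams of fat. The sketch assumes an energy value of 37 kJ per gram of fat, which is a commonly used conversion factor but is not stated in this section; the names are hypothetical and the exact calculation in Appendix 12 may differ.

```python
FAT_ENERGY_KJ_PER_G = 37.0  # assumed energy value of fat; not given in this section

def fat_cutoff_g(energy_kj, fat_fraction, kj_per_g=FAT_ENERGY_KJ_PER_G):
    """Grams of fat per day supplying the given fraction of daily energy."""
    return energy_kj * fat_fraction / kj_per_g

cutoff = fat_cutoff_g(energy_kj=7700, fat_fraction=0.30)
print(f"{cutoff:.1f} g fat/d")
```

With this assumed factor the result is about 62.4 g fat/d, close to the 62.5 g fat/d quoted above.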

Table 4.14 Indicators of performance of the FC in 61 subjects

FC cut-off level for fat (g fat/d)   Sensitivity  Specificity  PPV   NPV
≥70                                  0.74         0.84         0.77  0.85
≥62.5                                0.75         0.79         0.75  0.79
≤40                                  0.55         0.96         0.74  0.91

62.5 g fat/d = 30% of energy from fat intake in a reference female consuming 7,700 kJ/d, FC = fat checklist, PPV = positive predictive value, NPV = negative predictive value

The FC was more sensitive to higher fat intakes than lower fat intakes in the population tested, although only 38% of the population consumed more than 70 g fat/d. The PPV was 77%. Using a ≤40 g fat cut-off, the FC had a poor ability to correctly identify consumers with a fat intake below this cut-off. This group comprised 11 subjects, which was only 18% of the total population.

The specificity, or the proportion of negatives correctly identified by the FC (ie those subjects correctly identified below the cut-off of 70 g fat), was satisfactory for the higher fat cut-offs and excellent for the 40 g cut-off. For the ≥70 g fat cut-off, 5 people (13% of the people eating less than 70 g fat) would receive a fat-lowering intervention that they may not need.

The area under the Receiver Operating Characteristic (ROC) curve (a plot of the sensitivity against 1 - specificity of the FC) for a cut-off of ≥70 g fat was 0.683 or 68%. The optimum cut-off for fat determined automatically by the ROC curve, where the sensitivity and specificity were maximised, was 88 g fat.

4.3.4 Sources of measurement bias/error of the FC

Another factor affecting the validity of the FC was the ability of subjects to recall and quantify food intake accurately on the FC. This was reflected by the way in which subjects modified serve sizes, misclassified foods, omitted, added or erroneously recalled foods on the FC. This identifies the specific foods where errors were made and relates to the ability of subjects to:

• classify foods correctly; and

• recall foods correctly.

These alterations are presented in Table 4.15.

Table 4.15 Site and frequency of errors in recalling foods accurately on the FC by subjects in Group 2 (n=30) and Group 3 (n=19)

Columns: Food item; Item; Serve sizes overestimated (Group 2, Group 3); Serve sizes underestimated (Group 2, Group 3); Foods omitted (on FR but not on FC) (Group 2, Group 3); Foods added (on FC but not on FR) (Group 2, Group 3); Foods misclassified (Group 2, Group 3)

Fatty meat, all cuts (items 1-6): 2 2 4 1 1 1 5 2 2
Pastry (items 7-9): 1 1 1
Fatty fish (items 10-11): 2 1 2
Cream (items 12-13): 4 2 1
Full-cream yoghurt (item 14): 2 1
Ice-cream (item 15): 5 1 1 9
Full-cream milk, milkshakes, flavoured milk (item 16): 1 2 3 1 1
Cheese (item 17): 6 3 3 2 1
Cakes and biscuits (items 18, 20): 8 1 1 1 4 1 1
Fried takeaways (items 21-24): 1 2 2 1
Snack foods (items 25, 26): 2 4 4 3
Trimmed meats - all cuts (items 29-33): 2 1 3 5 4 1 1 3
Egg (item 34): 1 1 1
Nuts, peanut butter (item 35): 1 1
Low-fat milk (item 36): 1
Light cakes, scones, pikelets (item 37): 1
Butter, margarine (item 38): 3 2 5 4
Mayonnaise, salad dressing (item 39): 1 3 3

Total: serve sizes overestimated 16 (Group 2), 8 (Group 3); serve sizes underestimated 36 (Group 2), 13 (Group 3); foods omitted 25 (Group 2), 15 (Group 3); foods added 35 (Group 2), 10 (Group 3); foods misclassified 5 (Group 2), 3 (Group 3)

Note: Those food items not included had no errors

4.3.4.1 Comparison of the misreporting of serve sizes of the FCs with FRs in Group 2 and Group 3

Table 4.15 illustrates the number of times that subjects in Group 2 and Group 3 made mistakes over-estimating or under-estimating serve sizes on the FC. Food items were pooled into selected categories as indicated. For example, questions 1 to 6 on the FC were pooled to represent all fatty meat including untrimmed meat, chicken with skin, crumbed meat, and many delicatessen meats that were fatty (eg bacon and salami). The FRs again were used as the 'true' measure of food intake against which the responses on the FC were compared.

Food items were under-estimated more than twice as frequently as over-estimated on the FCs by both groups. Meat was poorly quantified for both fatty and lean cuts. Most dairy products (full-cream milk, ice cream, cheese) were under-estimated in quantity on the FC. Food items eaten as single foods or with an easily described serve size were mostly correctly classified. These included fried take-away foods (spring rolls, pizza, carton chips), pastry (croissants, meat pie, sausage roll), light cakes (pikelets) and yoghurt identified by the 200g carton.

4.3.4.2 Identification of specific food items erroneously recalled on the FCs in Group 2 and Group 3

Table 4.15 also indicates foods on the FC that were omitted, added and misclassified. This is another measure of the accuracy of recall of foods as well as reflecting the ability of respondents to correctly classify individual foods on the FC.

A high frequency of errors was apparent for meat (both trimmed and fatty cuts), snack foods, ice-cream, cakes, biscuits and cheese. More errors were made by adding foods (n=35) than omitting foods (n=27) on the FCs in Group 2. The reverse was seen for Group 3. Some foods, including salad dressing and mayonnaise, cheese, fried takeaways and snack foods, were equally misrepresented (ie added or omitted). Lean meats were frequently omitted from the FCs (ie included on the FR but not recalled on the FC) while fatty meats were more frequently added, especially by Group 2. Few food items were misclassified or put in the wrong place (eg fatty meat in the lean meat category). Clearly, problems recalling and quantifying meat on the FC were apparent and need to be addressed in future modifications of the FC.

4.3.5 Fat score from the FC of the 'non-responders'

A large number of subjects in Group 2 and Group 3 were excluded from the validity study for several reasons (see Section 4.1.1, p. 87). The exclusion of the clinic patients, which was to a large extent circumstantial (ie not keeping appointments), was of key concern because exclusion may have affected their continued adherence to the four-month fat-lowering program. These subjects were compared with the included or 'matched' subjects, as shown in Table 4.16, for differences in fat intake derived from the FCs.

Table 4.16 Comparison of descriptive statistics of fat intake (g fat/d) from the FC between the matched and unmatched subjects in Group 2 and Group 3

Group 2 (Nutrition students)
Fat intake from     Females                          Males
the FC (g fat/d)    m (n=30)        um (n=18)        m (n=12)        um (n=10)
Mean (SD)           60.0 (15.6)     69.3 (27.9)      104.6 (27.5)    122.2 (52.6)
95% CI              51.1, 65.8      55.4, 83.2       87.1, 122.1     84.5, 159.8
Median              58.1            65.8             101.9           108.7
Variance            244             780              757             2769
Range               25.4-89.2       32.7-149.0       60.0-141.0      61.3-233.1
Normality1          Yes (P=0.40)    Yes (P=0.07)     Yes (P=0.54)    Yes (P=0.26)

Group 3 (Clinic patients)
Fat intake from     Females                          Males
the FC (g fat/d)    m (n=10)        um (n=11)        m (n=9)         um (n=12)
Mean (SD)           61.6 (37.0)     67.8 (23.4)      58.6* (40.0)    125.4* (27.0)
95% CI              34.6, 87.6      57.2, 78.5       27.8, 89.3      109.1, 141.7
Median              45.5            72.0             58.0            130.3
Variance            1371            550              1597            730
Range               23.6-134.4      20.0-128.0       14.7-134.5      80.0-175.0
Normality1          No (P=0.02)     Yes (P=0.40)     Yes (P=0.41)    Yes (P=0.96)

m = matched subjects (subjects paired with the FRs), um = unmatched subjects (subjects excluded from the criterion study), CI = confidence interval, FC = fat checklist, FR = three-day food record, SD = standard deviation, * = significant difference between unmatched and matched data (P=0.002), 1 Values of P>0.05 follow a normal distribution using the Shapiro-Wilk statistic

No significant differences in fat intake (g fat/d) between matched and unmatched subjects were found, except for Group 3 males, using a Mann-Whitney U test (U statistic = 12, P=0.002). Males who provided matched data with the FR consumed around 50% less fat than those who were unmatched and were excluded for the reasons outlined on page 87. The reasons for this were not determined. However, this may be an indication that the clinic males, who were committed to undertaking the clinic program (ie a four-month fat-lowering diet), had already made those changes before commencing the study and/or were biased towards reporting low-fat intakes. Female clinic patients in Group 3 reported consuming a relatively low-fat intake but no differences were seen between the matched and unmatched subjects. Matched and unmatched subjects in Group 2, irrespective of sex, did not differ in their fat intake using the FC.


4.3.6 Summary of the criterion validity studies of the FC

For the criterion validity study, mean differences in fat intake between the FC and FR were small: +0.8 g fat/d for Group 2 and +3.9 g fat/d for Group 3. A correlation of 0.68 in Group 2 and 0.84 in Group 3 would seem to suggest a good to strong association between the methods. At the individual level, however, subjects in both groups showed a wide scatter in agreement between the FC and FR (see Figure 4.8 and Figure 4.9, p. 100). Only 24% of subjects in Group 2 and around 40% of subjects in Group 3 showed very close agreement. Group 3 subjects tended to over-report fat intake on the FC at the higher levels of fat intake. No bias in over- or under-reporting was evident at any level of fat intake in Group 2.

Ninety-five percent of individuals for both groups fell within the limits of agreement (see Figure 4.10 and Figure 4.11, p. 102), but the measurement error was large for both groups. For Group 2, the FC was as high as 52g fat/d above the FR and as low as 51 g fat/d below it. For Group 3, these values were not as high, at 37 g fat/d above and 30g fat/d below the FR. Although FC intakes for individuals fell within the limits of agreement, and were therefore statistically acceptable for interchanging the FC for the FR, they were too large in absolute fat intake to be clinically acceptable as a measurement of actual fat intake.

Male clinic patients (Group 3) who provided matched FC and FR data reported 50% less fat intake on the FCs than the unmatched subjects. These matched male clinic patients also had an energy intake to BMR ratio around 45% lower than that for Group 2 matched males. Matched females in Group 2 and Group 3 involved in the validity study had similar but low energy intake to BMR ratios.

Measurement error of the FC was associated with omission, addition or misclassification of foods as well as poor quantification of serve sizes. Meat was poorly quantified and often misrecalled in both groups. Underestimates of serve sizes occurred more often on the FC than overestimates, especially for dairy foods including milk, cheese and ice-cream in Group 2 and for trimmed meat in Group 3. The reasons for these errors could not be determined.

The association of fat intake between the FR and FC satisfied the assumptions for applying a least squares regression analysis to the fat intake data for Group 2 and Group 3. An equation for predicting 'true' fat intake from the FC was calculated for each group based on the variables listed in Table 4.9, p. 108.

Cross-classification checks on the differences between qualitative ranking of the FCs and FRs in the pooled data from Group 2 and Group 3 showed 'fair' agreement, with 54% of subjects classified into the same quartiles by both methods. The FC identified 'higher' fat consumers (≥70 g fat/d) with a sensitivity of 74% and a PPV of 77%.

4.4 Content validity of the final FC

4.4.1 Data used for measuring content validity

Data used to measure content validity of the FC in Group 2 and Group 3 were from the same 61 subjects as those used for the criterion validity study.

4.4.2 Frequency of consumption of food items listed on the FC in Group 2 and Group 3

The ability of the FC to reflect the types and quantities of food items consumed was evaluated by scoring the frequency of responses to each food item on the FC for Group 2 and Group 3.

Responses to each food item on the FC, the frequency of consumption of each food, and the number of FCs, which showed alterations in serve size, are shown in Table 4.17, p. 119. An asterisk marks the site of alteration on the FC. Where a serve size was altered, the number closest to the frequency option in the Number of Times Eaten column was used.

Most foods on the FC were consumed over three days except for crumbed fried chicken (Q3). Some foods were consumed by only one or two subjects (ie liver, kidney, brains, quiche, gravy made from dripping, untrimmed pork and lamb chops) while others (ie margarine and fat spreads) were consumed by 94% of subjects. Eighteen out of 39 food items were consumed by less than 20% of subjects in both groups.

Because both groups may have been biased in their food choices (Group 2 because of their background in nutrition at university, and Group 3 because they were being referred to a clinic aimed at lowering fat intake), the frequency responses of Group 1 (physiology students) were included for comparison. Appendix 13 lists these responses. All foods listed on the FC were consumed by subjects in Group 1. Fifteen out of 39 food items were consumed by less than 20% of subjects in this group.


Table 4.17 Frequency distribution of responses (%) to fat checklists for Group 2 (n=42 subjects) and Group 3 (n=19 subjects)

For each food or food group listed, subjects indicated how often they have eaten the amount specified over the previous three days

Q = question number; % = % of subjects consuming food item (Group 2 and 3 combined, n=61); Alterations = alterations to food serve sizes (G2 = Group 2, G3 = Group 3); columns 0 to 5+ = frequency options 0, 1, 2, 3, 4 and 5 or >5

Q   %    Alterations Group 2 (n=42)                 Group 3 (n=19)                 Food item
         G2    G3    0    1    2    3    4    5+    0    1    2    3    4    5+
1   -    -     -     41   0    0    0    0    0     18   1    0    0    0    0     Liver, kidney, brains (n=41)
2   27   -     -     30   10   0    1    1    0     14   2    2    0    1    0     Untrimmed steak, pork, chicken, lamb (not chops)
3   0    -     -     42   0    0    0    0    0     19   0    0    0    0    0     Crumbed fried chicken
4   11   -     -     40   2    0    0    0    0     14   2    3    0    0    0     Untrimmed pork chops, lamb chops
5   42   -     -     22   12   8    0    0    0     13   2    4    0    0    0     Sliced meat: fatty cuts eg bacon, salami, luncheon meat
6   19   -     1     34   4    3    1    0    0     13   4*   2    0    0    0     Sausages, frankfurts
7   21   -     -     35   4    2    0    1    0     13   4    2    0    0    0     Meat pie, sausage roll
8   8    -     -     41   1    0    0    0    0     15   4    0    0    0    0     Savoury pie with pastry made with cream (eg quiche)
9   <2   -     -     41   0    1    0    0    0     19   0    0    0    0    0     Savoury pie with pastry made without cream (eg quiche)
10  8    1     -     38   3    1*   0    0    0     18   1    0    0    0    0     Fish or fish cakes or fingers, crumbed, battered, oven fried
11  13   1     3     38   3*   0    0    0    1     16   3*   0    0    0    0     Undrained canned fish (eg tuna, sardines in oil)
12  19   -     -     32   4    4    2    0    0     18   1    0    0    0    0     Ordinary and thickened cream, sour cream
13  18   -     1     37   5    0    0    0    0     14   1*   0    0    0    0     Light cream, light sour cream
14  13   2     1     36   4*   0    1    1    0     18   1*   0    0    0    0     Full-cream yoghurt, plain or flavoured
15  31   1     -     27   11   2*   2    0    0     16   2    0    0    0    1     Ice-cream
16  60   3     1     12   5    9    7    3    6*    13   3    2*   1    0    0     Full-cream milk, flavoured milk, milkshake
17  61   -     -     14   5    11   7    4    1     10   1    7    1    0    0     Cheese: cheddar, edam
18  10   -     -     38   4    0    0    0    0     18   1    0    0    0    0     Rich cake, cheesecake, black forest cake
19  19   -     1     35   7    0    0    0    0     15   2*   2    0    0    0     Pastry: croissant, apple strudel, sweet pie
20  40   2     -     26   8    3    0    2    3*    11   3    4    1    0    0     Biscuits: sweet/chocolate-coated, shortbread, cream-filled, cheese crackers
21  21   -     1     35   3    2    2    0    0     14   5*   0    0    0    0     Fried rice
22  11   -     -     36   5    0    1    0    0     19   0    0    0    0    0     Pizza
23  21   -     1     34   7    1    0    0    0     15   3*   1    0    0    0     Hot chips
24  10   -     -     39   3    0    0    0    0     17   1    1    0    0    0     Chiko roll™, dim sim, spring roll
25  27   -     -     32   6    3    1    0    0     13   6    0    0    0    0     Potato crisps, Twisties™, corn chips
26  39   1     1     24   16*  1    0    1    0     14   5*   0    0    0    0     Chocolate bar, doughnut
27  9    1     -     39   1*   1    0    0    1*    17   1    2    0    0    0     Toasted muesli
28  5    -     -     42   0    0    0    0    0     17   2    0    0    0    0     Gravy made from meat dripping
29  61   1     1     16   15*  6    3    1    1     8    7    1    3    0    0     Trimmed steak, pork, lamb (not chops), trimmed chicken
30  21   1     -     34   4*   4    0    0    0     15   3    1    0    0    0     Trimmed pork chops, lamb chops
31  40   1     -     28   10*  4    0    0    0     9    7    3    0    0    0     Mince meat in the form of hamburger, rissoles, bolognaise, lasagne
32  19   1     -     33   7*   2    0    0    0     17   2    0    0    0    0     Meat (trimmed) casserole or meat stew: eg Chinese meat dishes, curry, goulash (no veg)
33  53   1     1     22   11*  8    0    1    0     7    8*   2    2    0    0     Sliced meat: ham, beef, lamb, chicken
34  37   2     -     26   6    5    1    1    3*    13   6    0    0    0    0     Egg
35  32   1     -     28   3    4    2    0    5*    14   1    0    3    0    0     Low fat milks, Hi lo™ (not Shape™ or skimmed)
36  23   1     -     34   2*   2    3    0    1     14   2    2    1    0    0     Nuts: any type including peanut butter
37  32   -     -     28   8    3    2    1    0     14   5    0    0    0    0     Light cake: plain sponge, scone, pikelet, pancake, muffin
38  94   5     3     2    8*   12*  11*  6    3     2    8    6    1    0    2*    Added margarine, butter
39  37   1     -     27   4*   6    0    1    2*    12   3    3    0    0    0     Added oil, salad dressing, mayonnaise

Total number (n) of alterations made to serve sizes: Group 2 = 27, Group 3 = 16

missing data, * sites where alterations in serve sizes were made


4.4.2.1 Food items that were major contributors to daily fat intake based on frequency of consumption of foods

Food items on the FC that contributed to around 70% of the daily 'true' fat intake from the FR in Group 2 and Group 3 are listed in Table 4.18 below. This cut-off was arbitrarily determined. The mean daily fat intakes from individual foods are found in Appendix 14.

Table 4.18 Food items on the FC which contribute around 70% of 'true' daily fat intake from the FR in Group 2 and Group 3

Group 2 (n=42)

Q on FC | Food item | % of daily fat | *% of subj
38 | Added margarine, butter | 13.5 | 95
16 | Full-cream milk, flavoured milk, milkshakes | 7.7 | 71
5 | Sliced meat - fatty cuts, bacon, salami | 7.7 | 48
17 | Cheese | 7.6 | 67
29 | Trimmed beef, lamb, pork, chicken | 5.8 | 62
6 | Sausages, frankfurts* | 5.1 | 19
2 | Untrimmed beef, pork, lamb or chicken (with skin) | 4.1 | 29
26 | Chocolate, doughnut | 3.8 | 43
20 | Biscuits | 3.6 | 38
7 | Meat pie, sausage roll | 3.2 | 17
23 | Hot potato chips | 2.9 | 19
37 | Light cake, pancakes, pikelets | 2.4 | 33
36 | Nuts, peanut butter | 2.4 | 19

Group 3 (n=19)

Q on FC | Food item | % of daily fat | *% of subj
38 | Added margarine, butter | 19.0 | 89
4 | Untrimmed chops, lamb, pork | 8.0 | 26
2 | Untrimmed beef, pork, lamb or chicken (with skin) | 7.0 | 26
29 | Trimmed beef, lamb, pork, chicken | 6.6 | 58
5 | Sliced meat - fatty cuts, bacon, salami | 5.9 | 32
17 | Cheese | 5.7 | 47
6 | Sausages, frankfurts | 5.5 | 32
19 | Pastry | 5.2 | 21
23 | Hot potato chips | 3.9 | 21
20 | Biscuits | 3.6 | 42

FC = fat checklist, FR = food record, *% of subjects consuming food

Thirteen food items in Group 2 and 10 food items in Group 3 out of a total of 39 food items listed accounted for around 70% of the 'true' fat intake in each group. Fat spreads (ie butter and margarine) accounted for nearly 14% of total daily fat intake in Group 2 compared to 19% in Group 3. From this table, meat intake was a major contributor to fat intake in Group 3 (33% of total fat intake), whereas Group 2 derived considerable fat (ie 27% of total fat) from full-cream milk, cheese, takeaway and snack foods.

4.4.2.2 Suitability of the frequency format used

The number of frequency response options was adequate to describe the quantity of foods consumed by Group 2 and Group 3 (Table 4.17) and also Group 1 (Appendix 13) for most foods except milk (both full-cream and low-fat milk), biscuits, muesli and eggs. For these foods, a small number of subjects in Group 1 and Group 2 consumed much more than the upper limit given on the frequency options of >5 times in three days. This was detected by the changes that were recorded on the FCs to the frequency option at the upper end. Some subjects calculated the fat score for each food, although this was not part of the instructions and was outside the research protocol. These changes were used, as well, to identify alterations. A FC showing how these alterations were made is found in Appendix 7.

As described above, the frequency format and standard amount or serve size provided (1 medium glass) for low-fat milk (Q35) and full-cream milk, flavoured milks and milkshake (Q16) were insufficient to cover the upper limits of the amount of milk consumed in three days by some subjects. The range of intake for milk was from two to twelve glasses in the three-day period, the upper limit being well above the specified amount. Similarly, the amounts of biscuit (Q20), muesli (Q27) and added fats and oils (Q38 and Q39) consumed were more than the upper frequency option for some individuals in Group 2. Subjects in Group 3 only exceeded the upper limit of frequency option for butter and margarine (Q38).

Some subjects altered the frequency format at the lower end of the frequency options. These changes usually comprised half the quantity of the standard amounts and would have had a minor effect on the overall mean daily fat intake of the FC compared with the changes made at the upper end of the frequency options.

4.4.2.3 Alterations to standard amount (serve size)

Alterations to the standard amount written on the FC were made by a small number of subjects in Group 2 and Group 3 as indicated in Table 4.17, p. 119-20. These changes are represented by the asterisk in Table 4.17, p. 119 and clearly show that they occurred more often at the lower end of the range of options. Foods most frequently altered were full-cream milk, flavoured milk and milkshakes (Q16) and fat spreads (Q38). Nutrition students altered serve sizes almost twice as often as clinic patients, particularly at the upper end of the frequency options. Conversely, the clinic patients only made alterations at the lower end of the frequency options. These alterations to the standard amounts made on the FC provide some indication of the attention to detail and understanding of the instructions by the subjects involved. They do not measure whether the changes were an accurate description of the quantity of foods actually consumed.

4.4.3 Comparison of the food choices between Group 2 and Group 3

4.4.3.1 Comparison of the food choices from selected food groups

Mean fat intakes shown in Table 4.5 (p. 97) provide no information about whether Group 2 and Group 3 subjects differed in their selection of foods that contributed to fat intake. To detect any differences, mean fat intakes of several specified food items were categorised into similar food groups (see Appendix 15), as indicated in Table 4.19 (p. 124), and compared using an independent samples t test.

Table 4.19 illustrates the mean fat intake for each specified food group and the differences and significance of the differences in fat intake between Group 2 and Group 3. Fat intakes from the selected food groups in Table 4.19 were normally distributed (variances equal) for most foods except for the consumption of fatty fish (Q10 and Q11). The mean differences in fat intake were not significantly different from zero between Group 2 and Group 3 for most food groups except for fatty fish and full-cream milk. However, Group 2 showed consistently, although not significantly, higher fat intake for most selected food groups listed than Group 3.


Table 4.19 Differences in mean fat intake (g fat/d) from selected food groups for the FC between Group 2 and Group 3 (n=61 FCs in 61 subjects)

Food groups | Item number on fat checklist | Group 2 (Nutr students) (n=42), g fat/d (SD) | Group 3 (Clinic patients) (n=19), g fat/d (SD) | Mean differences in fat intake between Gp 2 and Gp 3, g fat/d | CI for mean differences in fat intake | P-values
Fatty meat cuts, untrimmed meat | Q1 to Q6 | 12.7 (14.3) | 15.0 (17.0) | -2.3 | -10.7 to +6.1 | ns
Meat pie, sausage roll | Q7 | 2.3 (6.2) | 1.7 (3.4) | +0.6 | -2.4 to +3.6 | ns
Quiches (with or without cream) | Q8 and Q9 | 1.0 (43.5) | 0.6 (2.9) | +0.4 | -1.9 to +2.6 | ns
Fatty fish | Q10 and Q11 | 1.8 (4.6) | 0.4 (1.1) | +1.4* | -0.3 to +3.0 | 0.05
Full cream dairy products | Q12 to Q17 | 14.4 (5.9) | 8.1 (6.6) | +6.6** | +4.2 to +12.7 | <0.01
Cakes and biscuits | Q18 to Q20 | 5.5 (7.5) | 5.6 (7.7) | +0.1 | -4.2 to +4.1 | ns
Takeaways and snack foods | Q21 to Q26 | 8.1 (8.6) | 5.8 (8.1) | +2.3 | -2.4 to +7.0 | ns
Lean meats | Q29 to Q33 | 9.4 (5.6) | 9.3 (3.8) | +0.1 | -2.7 to +2.9 | ns
Low-fat milk | Q35 | 1.4 (2.6) | 0.9 (1.7) | +0.5 | -0.8 to +1.8 | ns
Added fat | Q38 and Q39 | 10.8 (6.3) | 11.3 (7.8) | -0.5 | -4.2 to +3.3 | ns

*P≤0.05, **P<0.01
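The interval reported for a row of Table 4.19 can be checked from the summary statistics alone. The following is a minimal sketch, assuming a pooled-variance (equal variances) independent samples t test and a 95% confidence level; the function name and the hard-coded critical value are illustrative assumptions and are not taken from the thesis.

```python
import math


def pooled_diff_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Return (difference, lower, upper) for mean1 - mean2, assuming equal variances."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff, diff - t_crit * se, diff + t_crit * se


# Fatty meat cuts row of Table 4.19:
# Group 2 = 12.7 (14.3), n=42; Group 3 = 15.0 (17.0), n=19.
# Assumption: 2.001 approximates the two-sided 5% critical value of
# Student's t with 59 degrees of freedom; it is hard-coded so that the
# sketch needs only the standard library.
T_CRIT_DF59 = 2.001
diff, lower, upper = pooled_diff_ci(12.7, 14.3, 42, 15.0, 17.0, 19, T_CRIT_DF59)
print(f"difference = {diff:+.1f} g fat/d, CI {lower:+.1f} to {upper:+.1f}")
```

Under these assumptions the interval agrees, after rounding, with the -10.7 to +6.1 g fat/d shown for fatty meat cuts. Because the interval includes zero, the difference is reported as not significant.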

4.4.3.2 Comparison of the food choices from individual food items

A comparison of fat intake from individual food items listed on the FC further narrows the differences in food choices between Group 2 and Group 3.

Table 4.20 shows those individual food items that accounted for differences in fat intakes between Group 2 and Group 3 that were significantly different from zero difference. Group 2 derived more fat from obvious fatty foods than Group 3. Full-cream milk drinks and untrimmed lamb accounted for the major differences in fat intake between Group 2 and Group 3 which confirmed the differences in food choices reported in Table 4.19.


Table 4.20 Differences in mean fat intake (g fat/d) from individual foods that were significantly different between Group 2 and Group 3

Food item | Item number on fat checklist | Group 2 (Nutr students) (42 FC in 42 subjects), g fat/d (SD) | Group 3 (Clin patients) (19 FC in 19 subjects), g fat/d (SD) | Mean differences between Gp 2 and Gp 3, g fat/d | CI for mean differences

Foods consumed in smaller amounts by Group 2 than Group 3
Untrimmed pork chops, lamb chops | 4 | 0.5 (2.4) | 4.6 (8.5) | -4.1* | -8.2 to -0.2

Foods consumed in greater amounts by Group 2 than Group 3
Ordinary and thickened cream | 12 | 2.0 (2.2) | 0.03 (0.09) | +1.07** | +0.4 to +1.8
Full-cream milk | 16 | 5.6 (5.3) | 1.4 (2.3) | +4.2** | +2.3 to +6.2
Pizza | 22 | 1.1 (3.1) | 0 | +1.1* | +0.1 to +2.1
Egg | 34 | 1.7 (2.6) | 0.5 (0.7) | +1.2* | +0.2 to +2.2

*0.05>P>0.01, **P<0.01

Full-cream milk drinks and untrimmed lamb and pork chops accounted for the largest differences in particular foods contributing to fat intake between Group 2 and Group 3. This confirms the differences in food choice in Table 4.18, p. 121.

4.4.4 Summary of content validity results

Most foods on the FC were consumed by Group 2 and Group 3 except crumbed fried chicken (Q3). Foods that were major contributors to fat intake from the FC (ie 70% of total fat intake of the FR) differed markedly between Group 2 and Group 3 (see Table 4.18, p. 121). The frequency options on the FC met the quantities of most foods reported consumed by both groups. The amounts of milk (both full-cream and low-fat), biscuits and muesli substantially exceeded the upper limits of frequency options in Group 2 only. Group 3 tended to alter serve sizes at the lower end of the frequency options. However, alterations to serve size were infrequent for both groups.

125 Chapter 5: Discussion of results

CHAPTER 5

THE VALIDITY AND RELIABILITY OF THE FINAL FAT CHECKLIST

5.1 Introduction

In this study, a fat checklist (FC) that measured the previous three days' food intakes was developed and then evaluated by comparing its results with those calculated from a three-day food record (FR).

The level of understanding of the instructions, and the ability to fill out the FC according to the study protocol, were evaluated in a sub-sample of university students and clinic patients in the pilot study.

The results of the pilot study demonstrated that subjects, irrespective of their background (ie clinic patients or university students), generally understood the format and directions but their responses were influenced by their focus on calculating fat intake. The option to include reference to the fat content of foods on the FC was a major source of concern and debate, among clinic staff, because such an inclusion can inhibit food reporting and increase respondent bias. The final version of the FC used for validity testing retained these references because the FC was specifically designed for patient self-assessment of fat intake in the clinic population. Therefore, the amount of fat in each food needed to be included for that purpose.

The original order of grouping foods into food group categories (see Chapter 3, Section 3.1.4, p. 60), irrespective of fat content (eg all lean and fatty meats together), was changed in the final version based on the results of the pilot study. This was changed because the clinic patients who elected to follow a low-fat diet perceived that all foods on the FC were to be avoided rather than limited in their diet and were confused. This message of fat avoidance was clearly contrary to nutrition education principles. The final FC retained the same list of foods as that in the pilot version; only the order of foods was changed (see Table 3.5, p. 76-7).

Subjects in the pilot study did not change the standard serve size despite the option to do so in the instructions. Baghurst (1993) reported that few subjects alter serve size and certainly, the pilot study confirmed this. Changes to serve sizes, if large, will affect the validity of the FC. However, the sample size of the pilot study was too small to draw any conclusions.

5.2 Reproducibility of the final FC

The FC had unacceptable reproducibility. The correlation (r=0.62) and scatterplot for Test 1 and Test 2 (see Figure 4.1, p. 90) showed that there was a good strength of association between the FCs completed by the same subjects after a three-week interval. However, the mean difference of 9.9 g fat/d on the FCs between Test 1 and Test 2 was significantly different from zero (see Table 4.3, p. 89).

Inspection of the 95% CI around the differences in mean fat intake between Test 1 and Test 2 confirms the existence of a real difference (Curran-Everatt et al. 1998) since zero (or no) difference was excluded (see Table 4.3, p. 89). These results also demonstrate a slightly higher fat intake in Test 1 than Test 2 which may be associated with:

• knowledge of the first measurement affecting the second measurement;

• a training effect; and /or

• an actual change in food choice, which was part of the intent of the FC for clinic use.

Other FFQ validation studies have shown similar lower fat intakes on the retest measure with intervals of one month (Engle et al. 1990), three months (Wheeler et al. 1994) and one year (Rohan et al. 1987; Pietinen et al. 1988b) between each test. Loss of compliance and/or real dietary change, rather than seasonal influence, were given as the reasons for lower intakes in these studies. Seasonal influence was not a factor affecting the reproducibility of the FC in this study because of the short time of three weeks between tests. The differences between the tests were more likely to be associated with the nature of variation in food intake and/or the erratic eating habits observed in an age group comprised mainly of first year university students. Other studies have shown that this age group (Abrahams et al. 1988) and university students eat erratically (Wilkins et al. 1991).

The mean fat intake from the 'unmatched' subjects in Group 1 also fell within the 95% CI for Test 1 but outside that for Test 2 (see Table 4.2, p. 89) which supports the suggestion that fat intake data in Test 2 (from subjects previously exposed to the FC) was different.

The lower mean fat intake for Test 2 was not attributed to any single food. Except for liver, no single food item on the FC showed large or significant differences in fat intake between Test 1 and Test 2. Only subjects in Test 1, not Test 2, consumed liver - hence the reason for this difference.


Although the repeatability coefficient for fat intake between Test 1 and Test 2 was acceptable statistically, the differences in absolute fat intake of individuals, defined by the limits of agreement using the Bland-Altman plots, were too wide and indicative of poor agreement (see Figure 4.3, p. 91 and Table 4.3, p. 89). Poor agreement can be attributed to the wide variability in individual fat intake or relatively small sample size of the population tested. Further tests using a similar population are needed to confirm or refute this conclusion.
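The limits of agreement referred to above follow the Bland-Altman approach of summarising the paired differences between two measurements. A minimal sketch of that calculation is given below. The intake values are hypothetical and are not data from this study, and the multiplier of 1.96 is the conventional choice, assumed here rather than taken from the thesis.

```python
import statistics


def limits_of_agreement(first, second, k=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)   # mean difference between the two tests
    sd = statistics.stdev(diffs)    # sample SD of the paired differences
    return bias, bias - k * sd, bias + k * sd


# Hypothetical fat intakes (g fat/d) for six subjects on two occasions.
test1 = [62.0, 85.0, 47.0, 110.0, 73.0, 95.0]
test2 = [55.0, 70.0, 50.0, 92.0, 75.0, 81.0]

bias, lower, upper = limits_of_agreement(test1, test2)
print(f"bias = {bias:.1f} g fat/d, limits of agreement {lower:.1f} to {upper:.1f}")
```

The width of the interval between the two limits, rather than the bias alone, is what determines whether agreement is acceptable for individuals: a small mean difference can coexist with limits that are too wide for clinical use.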

5.3 Criterion validity of the final FC

Overall, the FC generally showed acceptable criterion validity for measuring fat intake in groups of individuals (see Section 4.3.6, p. 117). This implies that the FC could be used interchangeably with a three-day FR for analysis of fat intake in groups. However, the FC was not considered clinically acceptable for measuring fat intakes of individuals in the lipid clinic. A high measurement error based on large discrepancies in agreement of fat intake between the FR and FC for individual fat intake was apparent. Measurement error of the FC was associated with errors in recall, misreporting of serve size and other limitations inherent in the study design.

5.3.1 Comparison of measures of agreement between the FRs and FCs in Group 2 and Group 3

5.3.1.1 Comparison of mean fat intake

Because the 95% CI for the mean fat intake for these two groups overlap, it could be concluded that the FC captures the 'true value' of the group mean. The difference in mean fat intakes between the FC and FR was less than 1.1% in Group 2 and 7.0% in Group 3 and was not significantly different from zero for either group. The 95% CI around the mean also included zero for both study groups, which suggested that the FC performed better in estimating group mean fat intake than individual intakes.

No differences in mean fat intake suggested close agreement between the FR and FC; however, the large SD and wide 95% CI around the mean suggest a wide variability in individual responses. Many dietary validity studies continue to report mean nutrient values and compare these with other studies (see Section 2.8.3, p. 44). Such comparisons are only appropriate in randomised or large population studies to compare and contrast nutrient intakes of interest, where the results can be generalised to a wider population. Comparison of mean fat intakes from the FC in this study with other validity studies of FFQs measuring fat intakes is inappropriate as the study groups were not randomly selected and therefore not necessarily representative of clinic patients or university students.

5.3.1.2 Comparison of individual differences in fat intake between FR and FC

a. Scatterplots of the FR and FC

A linear association was apparent between the FRs and FCs for fat intake in Group 2 and Group 3. No bias was apparent towards under- or over-reporting on the FC for all subjects in Group 2. In contrast, Group 3 tended to under-report at higher levels of fat intake and over-report at lower levels of fat intake. Forty percent of subjects in Group 3 recalled foods accurately on the FC, which was almost twice as many as those in Group 2. Although there was a high drop-out of subjects in Group 3, the volunteers who completed this study may have been more conscious and motivated about defining what they ate than those in Group 2. Other studies have also shown good agreement of dietary measures with highly motivated volunteers (Beaton et al. 1997).

b. Correlation coefficients of the FR and FC

Although rejected by Bland and Altman (1986) and others (Hebert and Miller 1991) as inappropriate, correlation coefficients have been the main method of comparison of nutrient intakes in previous published dietary validation studies (see Section 2.8.3.1, p. 45).

The interclass correlations between the FC and FR of 0.68 (Group 2) and 0.84 (Group 3) exceed the critical values of 0.49 and 0.69 for r at the 0.001 level of significance, adjusted for the reduced sample size. The observed correlations are of a similar order as those reported in other published fat FFQ studies (Van Assema et al. 1992; Kemppainen et al. 1993; Feunekes et al. 1993; Block et al. 1989). A dietary assessment method is usually considered good if the correlation coefficient between two measures of nutrient intakes is at least 0.5 to 0.6 (Willett et al. 1985). Comparison of correlation coefficients of the FC with other studies, however, is of limited value in its interpretation because of differences in sample selection, sample size and study design.

The moderate to high correlations seen in this study confirmed the strength of the linear association of fat intake between the FC and FR for both groups. However, these correlations may be inflated because of: the order of administration of the methods; a training effect; the effect of being part of a study; and motivation of subjects.

However, although the correlations suggested a strong relationship between the two methods, the scatterplots actually conceal the poor agreement of individual data between the two methods, especially for Group 2. These findings thus confirm the weakness of correlation coefficients as a measure of agreement.
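The distinction between association and agreement can be illustrated with a small numerical sketch. The values below are hypothetical (they are not study data) and were chosen so that the FC is perfectly linearly related to the FR while individual differences depend on the level of intake. The helper function is written out by hand so that only the standard library is needed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient computed from first principles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical fat intakes (g fat/d): the FC over-reports at low intakes
# and under-reports at high intakes, yet tracks the FR perfectly.
fr = [40.0, 60.0, 80.0, 100.0, 120.0]
fc = [60.0, 70.0, 80.0, 90.0, 100.0]

r = pearson_r(fr, fc)
diffs = [c - f for c, f in zip(fc, fr)]
mean_diff = sum(diffs) / len(diffs)
largest_diff = max(abs(d) for d in diffs)
print(f"r = {r:.2f}, mean difference = {mean_diff:.1f}, largest = {largest_diff:.1f} g fat/d")
```

In this constructed example the correlation is perfect and the mean difference is zero, yet individual estimates differ by up to 20 g fat/d, which is why correlation alone cannot establish that two methods agree.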

c. Regression

The variability of the regression model provides a more meaningful understanding of the association between the FC and FR than correlation coefficients. Based on a least squares regression model, close to 86% of the variability of the model for 'true' fat intake was explained by the variability of the FC for Group 3, and 51% for Group 2. Based on these results, the regression model was considered acceptable for calculating the mean fat intake from the FC for group data in Group 3 (clinic patients) but unacceptable for Group 2. However, the model cannot be used for predicting 'true' fat intake of individuals in either group. The regression model can be used for ranking fat intake in clinic patients, provided that the fat intake calculated from the FC does not exceed the upper limit of 138 g fat/d. A predicted 'true' fat intake for a group of clinic patients can be calculated by substituting the variables for Group 3 in Table 4.9 (p. 108) in the regression equation, Y = a + bx.

d. Differences in fat intake of individuals between the FR and FC

For Group 2 (university students)

For individuals in Group 2, differences in mean fat intake between the FC and FR were equally scattered around zero difference using Bland-Altman plots. This pattern of scatter indicated that no consistent bias towards over-or under-reporting foods on the FC occurred.

As the 95% CI for the differences in fat intake between the FC and FR included zero, we can say with 95% confidence that there was no difference between means for the FC and FR for this group. Almost 95% of the differences in fat intake also fell within the limits of agreement (mean ± SD). Therefore, the measurement error of the FC, although large, is satisfactory in statistical terms for using the FC to measure fat intakes of individuals (see Figure 4.10, p. 102). The Bland-Altman measures of agreement confirm that the FC is interchangeable with the FR for individuals. However, the amount of absolute fat intake, defined by the limits of agreement, was subject to too much variability for use as a direct measure of fat intake for clinical use.

For Group 3 (clinic patients)

For Group 3, the FC showed bias at the upper and lower end of fat intake using Bland-Altman plots (see Figure 4.8, p. 100 and Figure 4.11, p. 102). The FC tended to under-estimate fat intake in 'lower-fat' consumers (ie 30-70g fat/d) and over-estimate fat in 'higher-fat' consumers. Although there were only four subjects consuming 'higher fat' intakes, all over-reported on the FC.

The measurement error of the FC in this group was also statistically acceptable and the 95% CI between the mean differences included zero which implied that the FC can be interchanged with a three-day FR for use in individuals. Again, however, the measurement error of the FC defined by the limits of agreement was too large and therefore not acceptable for use in a clinic situation as a direct measure of fat intake in individual patients.

5.3.1.3 Cross classification comparisons

Although there may be restrictions for use of the FC to measure fat intake of individuals directly, exact agreement is not necessary for ranking individuals into groups. Some comparative studies of short-term fat intake have included an assessment of the ability of the FFQs to rank individuals into broad categories of fat intake (Kristal et al. 1990a, 1990c; Dobson et al. 1993). In the FC study, cross-classification according to quartiles showed 54% exact agreement between the FC and FR for the pooled data from Group 2 and Group 3. The kappa statistic showed only 'poor' agreement (kappa = 0.33) but a significant association between the ranked quartiles. Correct cross-classification (number of subjects in the same or adjacent quartiles) was 92%, which was higher than the expected agreement of 66% due to chance alone. This level of agreement suggests that the FC can rank subjects in a similar order to that on the FRs.

The unweighted kappa statistic for the FC was of the same order as that found by Van Assema et al. (1992) in their fat FFQ study. The kappa statistic was undertaken and reported only as a supplement to other measures of agreement. Maclure and Willett (1987) suggest that the kappa statistic (both unweighted and weighted) is virtually meaningless for continuous data grouped into categories and is influenced by the number of categories used. The unweighted kappa does not account for differences in degree of disagreement. Although there was no gross misclassification of the FC in this study, the difference between exact and correct classification suggested a large differential in fat intake between the two methods.
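The three summary measures discussed above (exact agreement, agreement within the same or an adjacent quartile, and the unweighted kappa statistic) can all be derived from a single quartile cross-classification table. The sketch below uses a hypothetical 4 x 4 table of 61 subjects. The counts are illustrative only and are not the cross-classification observed in this study.

```python
def cross_classification(table):
    """Summarise a square table of counts (rows: FR quartile, columns: FC quartile)."""
    k = len(table)
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Proportion classified into exactly the same quartile by both methods.
    exact = sum(table[i][i] for i in range(k)) / total
    # Proportion classified into the same or an adjacent quartile.
    adjacent = sum(
        table[i][j] for i in range(k) for j in range(k) if abs(i - j) <= 1
    ) / total
    # Exact agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (exact - expected) / (1 - expected)
    return exact, adjacent, kappa


# Hypothetical counts for 61 subjects (not the study's data).
quartile_table = [
    [9, 4, 2, 0],
    [4, 7, 3, 1],
    [2, 3, 8, 3],
    [0, 1, 3, 11],
]

exact, adjacent, kappa = cross_classification(quartile_table)
print(f"exact = {exact:.0%}, same or adjacent = {adjacent:.0%}, kappa = {kappa:.2f}")
```

Because the unweighted kappa uses only the diagonal of the table, a subject misclassified by one quartile counts the same as one misclassified by three, which is the limitation noted by Maclure and Willett (1987).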

The sensitivity and specificity of the FC are more meaningful measures of validity than the kappa statistic (Maclure and Willett 1987). One in four people in the pooled group with a 'higher fat' intake were not detected by the FC and would lose the possibility of dietary intervention. In contrast, the specificity of the FC of 87% would mean that only 13% of subjects with a 'lower fat' intake would be receiving unnecessary fat-lowering intervention - this equated to five subjects. Chuang Ling et al. (1998) in a validity study of a short FFQ to assess consumption of cereals, fruits and vegetables in 77 Singapore adults reported a 12% error in specificity which they interpreted as low and acceptable. The PPV of the FC of 75% was similar to results from other dietary validity studies that have used the same method of analysis (Wilson and Horwath 1996; Taylor and Goulding 1998). Taylor and Goulding (1998) considered a PPV of 79% to be a high value in a FFQ used to measure the calcium intake of NZ children aged 4 to 7 years. The lower PPV in the Wilson and Horwath study of 64% was also considered acceptable.
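Sensitivity, specificity and positive predictive value (PPV) are all ratios taken from a two-by-two table in which the FR classification is treated as the reference. A minimal sketch of the calculation follows. The four counts are illustrative assumptions for a pooled group of 61 subjects and are not taken from the results tables of this study.

```python
def screening_measures(tp, fn, fp, tn):
    """Return (sensitivity, specificity, PPV) from two-by-two counts.

    tp: 'higher fat' on the FR and on the FC
    fn: 'higher fat' on the FR but 'lower fat' on the FC
    fp: 'lower fat' on the FR but 'higher fat' on the FC
    tn: 'lower fat' on the FR and on the FC
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    return sensitivity, specificity, ppv


# Hypothetical counts (illustrative only).
sens, spec, ppv = screening_measures(tp=17, fn=6, fp=5, tn=33)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}, PPV = {ppv:.0%}")
```

Sensitivity and specificity are each calculated within one row of the reference classification, whereas the PPV also depends on how many subjects in the sample truly have a 'higher fat' intake, so PPVs from studies with different proportions of high consumers are not directly comparable.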

Sensitivity of the FC had a 45% error in accurately identifying fat intakes at <40g fat (see Table 4.14, p. 113). The exact reasons for this error rate were unclear but were probably related to error in quantification of food such as meat and fish, inappropriate serve sizes on the FC itself, or to subjects failing to change serve size on the FC.

At the group level, however, the overall effect of mis-reporting serve size or inaccurately recalling foods on agreement was minor because there were no real trends in either direction (ie 21% of subjects under-reported while 23% over-reported fat intake on the FC). At the individual level for both groups, however, the differences were quite large as seen in the scatterplots and confirmed in the Bland-Altman calculations.

5.3.2 External validity of energy intake

Comparison of EEI:PBMR ratios of the FR with reference cut-offs was useful to determine whether the FR provided a realistic or plausible measure of energy intake over three days. Overall, twenty-three percent of individuals in both Group 2 and Group 3 were consuming energy intakes below recommended EEI:PBMR cut-offs for individuals.

Group or mean cut-off levels, which are higher than individual cut-offs and adjusted for sex, duration of study and sample size, showed that only the males in Group 2 were meeting plausible levels of energy intake. At a group level, the EEI:PBMR ratios for females in both Group 2 and Group 3 and males in Group 3 were too low to represent plausible or 'usual' intake. If the very low energy intakes were habitually consumed and estimated EEI:PBMR ratios were accurate, large weight losses would be expected. Measuring weight loss over three days, however, was not undertaken in this study because of its poor reliability in measuring true weight changes over short time periods.

While a three-day FR cannot be considered a true indication of 'usual' intake, these low EEI:PBMR ratios indicate unusually low consumption, restrictive dieting, and under-reporting of food consumption. Females and males in Group 3 had relatively low energy intakes and higher BMIs than Group 2. Other studies have confirmed that the EEI:BMR ratio is inversely related to high BMI (obesity) and under-estimating energy intake (Voss et al. 1998).


A few subjects with a high BMI and very low energy intakes affected this ratio. One male subject in Group 3 with a BMI of 37 reported consuming less than 3000 kJ/d. The high BMIs in males in Group 3 resulted in a higher PBMR, which further lowered the ratio of EEI:PBMR in this group (see Table 4.4, p. 94). Moreover, the equations for estimating BMR are crude values and do not consider body composition (Beaton et al. 1997) and may further affect the accuracy of the ratio.

Subject bias because of under-reporting of food/energy intake in this study was evident in most subjects, although the reasons for this are unclear. Subjects may have decreased food intake in response to being involved in a study, being referred to a clinic for assessment or may have been already consuming low energy intakes. Under-reporting of energy intake is the most commonly reported problem in all types of dietary surveys but particularly in weighed FRs (Stockley 1985; Buzzard and Sievert 1994; Ashton et al. 1996) and in obese people who want to lose weight (Black et al. 1991). Ashton et al. (1996) reported 35% under-reporting on weighed FRs based on these EEI:PBMR ratios. Other studies have shown similar or even greater levels of under-reporting using FRs (Pryer et al. 1997; Voss et al. 1998).

The doubly labelled water method, although expensive to conduct, is now considered the 'gold' standard for measuring energy expenditure although it does not appear to have been applied to FFQs (Beaton 1997). Several studies using the doubly labelled water method have confirmed under-reporting of energy in dietary surveys using FRs and 24-hour recalls (Forbes-Ewan et al. 1989; Black et al. 1993). In a review of a number of studies using this method in randomly selected adults using weighed records, under-reporting was reported to be around 20% for both sexes and even greater (27%-36%) in obese women (Black et al. 1993). Lean women who were volunteers in the limited studies published generally showed similar energy intakes to energy expenditures (Black et al. 1993). For the FC in this study, energy intakes in most women including those with an acceptable BMI were well below the estimated energy expenditure needed for weight maintenance.

The fat density (fat adjusted for energy) of the FR appeared similar in each group and sex. Other studies have reported a lower density of fat and a higher proportion of carbohydrate and protein in the diet of people consuming low energy intakes compared to high energy consumers (Voss et al. 1998). Energy adjustment of fat provides no indication of its contribution to under-reporting of total energy intakes. It appears that, for the subjects in this study, all macronutrient sources were under-estimated proportionally in the FRs. Of interest was that alcohol intake contributed a relatively large amount of energy to the total energy intake in clinic males (see Table 4.4, p. 94). This result uncovered the need for incorporating education strategies in relation to alcohol for those patients involved in the weight-lowering program in the CHRMC clinic.
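The point that energy adjustment cannot reveal proportional under-reporting can be illustrated with a short calculation. The sketch below is illustrative only: the conversion factor of 37 kJ per gram of fat is an assumed value (it is not stated in this section), and the two sets of intakes are hypothetical, not data from this study.

```python
FAT_KJ_PER_G = 37.0  # assumed energy factor for fat (about 9 kcal/g)


def fat_density_percent(fat_g: float, energy_kj: float) -> float:
    """Return the percentage of total energy supplied by fat."""
    if energy_kj <= 0:
        raise ValueError("energy_kj must be positive")
    return 100.0 * fat_g * FAT_KJ_PER_G / energy_kj


# Hypothetical subject who records everything eaten...
full_report = fat_density_percent(fat_g=80.0, energy_kj=9000.0)
# ...and the same subject recording only half of every food.
half_report = fat_density_percent(fat_g=40.0, energy_kj=4500.0)

print(f"full record: {full_report:.1f}% energy from fat")
print(f"half record: {half_report:.1f}% energy from fat")
```

Both records give the same fat density, which is why a similar fat density across groups is compatible with all macronutrients being under-reported in proportion.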

5.4 Content validity of the final fat checklist

The number of foods listed, serve sizes, grouping of foods and frequency options generally matched the food choices of subjects in Group 2 and Group 3, with few exceptions. The number of respondents who consumed foods outside the serve size and frequency options was small, although errors made in recalling foods accurately were quite numerous. The serve size was not appropriate for some foods for subjects who consumed either very low or very high fat intakes. This may have contributed to the decrease in sensitivity of the FC at lower levels of fat intake as the majority of respondents consumed relatively low fat intakes (see Table 4.13, p. 112 and Table 4.16, p. 116).

5.4.1 Range of daily fat intake in Group 2 and Group 3, pooled

The daily fat intake (g fat/d) on the FC in Group 2 and Group 3 (pooled), including the unmatched subjects, ranged from around 20 to 175g fat/d and thus covered a wide range of fat intake. Although this range is not representative of 'usual' eating habits, comparison with the range of fat intakes of adults of similar age (25-64 years) in the 1983 National Dietary Survey of Australians showed that the maximum of 175g fat/d in this study was fairly close to the 90th percentile for males of 172g fat/d (English et al. 1987).

5.4.2 Design of the final FC

5.4.2.1 Suitability of the FC to reflect the food choices in Group 2 and Group 3

Subjects in Group 2 and Group 3 ate most foods listed on the FC over the three days of the study period except crumbed fried chicken. Avoidance of this food is likely to be a chance event and not a justification to exclude it from the food list as the physiology students (Group 1) consumed all foods listed (see Appendix 13). Food choices differed between Group 2 and Group 3 as expected. Group 3 did not eat numerous takeaway foods on the list. Consumption of food is affected by the days surveyed and lifestyle of the subjects. As two days of data collection were weekdays, the university students (Group 1 and 2) had ready access to takeaway foods including Chiko rolls™, spring rolls and pizza at the university cafeteria. For this group, the inclusion of these foods was important on the FC.

Numerous foods were eaten infrequently by only one or two subjects. The inclusion on the FC of such foods, which are high in fat but infrequently consumed, is important for measurement of fat intake of individuals. However, inclusion of these foods may not be important if the purpose of the FC is to rank the fat intake of groups or to identify those most in need of clinical intervention.

Frequency analysis provides an indication of the contribution of fat from individual food items to total fat intake. At the upper end of the frequency distribution, just 13 food items accounted for 70% of the 'true' daily fat intake from the FR in Group 2 and 11 foods in Group 3 (see Table 4.18, p. 121). Butter, margarine, cheese and full-cream milk were the major contributors to fat intake accounting for 28.8% of daily fat intake in Group 2 (mean age 22.3 ± 5.1 years). By comparison, these same foods contributed 30.6% of fat to the 'average' diet of 6,255 adult Australians, aged 25-65 years (English et al. 1987). In contrast, Group 3 (mean age 49.1 ± 10.6 years) derived more fat from butter and margarine (19%) and around 27.5% of total daily fat intake from untrimmed meats. These food choices are not atypical for these age groups (Bingham et al. 1994).
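The ranking behind a statement such as "13 food items accounted for 70% of daily fat intake" can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name is an assumption and the gram values are hypothetical, chosen only to show the calculation.

```python
def foods_covering(fat_by_food: dict[str, float], share: float = 0.70) -> list[str]:
    """Return the top-ranked foods whose combined fat reaches `share` of the total."""
    total = sum(fat_by_food.values())
    selected: list[str] = []
    running = 0.0
    # Rank foods from the largest to the smallest contributor of fat.
    for food, grams in sorted(fat_by_food.items(), key=lambda kv: kv[1], reverse=True):
        if running >= share * total:
            break
        selected.append(food)
        running += grams
    return selected


# Hypothetical group totals (g fat), not data from the study.
example = {
    "butter and margarine": 14.0,
    "untrimmed meat": 11.0,
    "cheese": 9.0,
    "full-cream milk": 8.0,
    "hot chips": 5.0,
    "cakes and biscuits": 4.0,
    "eggs": 3.0,
    "ice-cream": 2.0,
}

top = foods_covering(example)
print(len(top), "foods supply at least 70% of the fat:", top)
```

With these made-up figures, four of the eight foods are enough to reach the 70% threshold, which mirrors the concentration of fat intake in a small number of items described above.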

The major food items contributing to 70% of fat intake in these populations may not be the same for other populations despite the similarities to other studies. The university students and clinic patients were not representative of the general population, so any similarity to the general population or the 'national' diet may be a chance event.

5.4.2.2 Suitability of the frequency format for Group 2 and Group 3

An important feature of the design of the FC was whether the number of frequency options reflected the upper and lower limits of the quantities of foods consumed over three days. For the university students in Group 1 and Group 2, the amounts of milk consumed in all forms (low-fat, full-cream milk, flavoured milk and milkshakes) from Q35 and Q16 on the FC exceeded the upper frequency options, in some cases, considerably. Bingham et al. (1994) identified similar problems in defining serve sizes and quantifying milk in the Oxford and Cambridge FFQs. Muesli, biscuits and added fat and fat spreads also slightly exceeded the upper frequency option. Because the final frequency option was > 5 or more times, these responses were accommodated and did not raise any questions from subjects during data collection.

The number of respondents from Group 2 and Group 3 who altered serve sizes is indicated in Table 4.17, p. 119-20. Nutrition students altered serve sizes more frequently than clinic patients, probably because of their educational background and training in data collection procedures. These alterations, however, were not necessarily accurate, as discussed later. Alterations in serve size were made mostly at the lower end of the frequency options as shown in Table 4.17, p. 119-20. These changes would have some effects, although minor, on the agreement measures between the FC and FR compared with changes at the upper end of the frequency options.

In summary, the frequency format was not suitable for a small number of foods for the university students. Either the serve size of milk or the number of categories for milk or frequency options available for the existing serve size needs to be adjusted if the FC is to be used again in a population group of active young adults who are high energy consumers (eg athletes). The serve sizes for a small number of foods were also too large for a few clinic subjects and may contribute to inaccurate fat intakes at lower levels of fat intake.

5.4.2.3 Grouping foods: The effect of grouping meat on the FC

The grouping of foods appeared satisfactory for most foods and for a three-day period, except for meats where separation into two sections was associated with some problems. 'Trimmed' and 'untrimmed' meats were misclassified and misrecalled most frequently. Trimmed meats were often categorised in the untrimmed section, which came first on the FC.

It was difficult to interpret the reasons for this. It may have been related to the non-consecutive order in which meats were presented or the way meats were grouped into single questions. Willett (1990, 76) suggests a loss of accuracy when a number of foods are combined into a single food group but studies of the effect of this are few and have been equivocal (Serdula et al. 1992; Willett et al. 1985, 1987). One of the limitations of the FC is that the meat groups do not reflect the specific types of meat selected because lamb, beef, chicken and pork are pooled into the one group. However, the objective of the FC was not to measure different meat sources, only total fat, so pooling of meat sources was not an issue.

Another limitation was that the list of meats did not represent all forms of meat available and perhaps this should have been made clearer to subjects in the FC. The existing format of the FC also provides opportunities for respondents to repeat and misrepresent responses. Serves of the meat groups could easily be confused, particularly if subjects did not read or fully understand the instructions or the meaning of the words 'trimmed' and 'untrimmed'. Instructions that are more explicit may help to avoid these problems for future use of the FC.

The way these foods are grouped and/or ordered has obviously caused some confusion but was not associated with large differences in mean fat intake between the FC and FR. These foods, and errors in recalling foods accurately as shown in Table 4.15 (p. 114), are contributing to the wide range of individual differences in fat intake between the FC and FR. Although Serdula et al. (1992) reported a lower consumption of high fat food in a grouped food questionnaire of 14 food items compared with the same FFQ of 29 separated food items, this response was not observed for the FC in this study.

5.5 Factors affecting the interpretation of validity tests of the FC in Group 2 and Group 3

5.5.1 Constraints of the study design

Undertaking an unbiased validity study on any dietary method compared with another dietary method is difficult, especially for a method that measures short-term intake. Because of the protocol for data collection and intended use of the FC, subjects were informed of the purpose of the study. It was not possible to have a blind research design as the subjects recruited in Group 3 volunteered to participate in the study because they were involved in a program to reduce fat intake. The design, format and title of the FC supported this dietary fat reduction. The same protocol was used in all other groups to maintain consistency in experimental procedures.

An awareness of the purpose of the study and the effect of participating in a study may bias subject responses and encourage distortion of food choice (Stockley 1985). In this study, under-reporting of fat and energy intakes (reflected in the low EEI:PBMR ratios) was expected and evident. This distortion of food choice was also apparent in the pilot study where subjects were more focused on avoiding and calculating fat intakes than commenting on the pilot evaluation sheet. The fat intakes reported in the criterion validity study were also likely to be underestimates of 'true' fat intake for the same reasons.

The problem of respondent bias because of heightened awareness was recognised but difficult to overcome given the format of the FC and because decreases in fat intake were expected in the clinic patients. Ideally, a FR should have been administered after completion of the FC or a number of times during the clinic program to test this effect. Administration of an additional FR would have introduced yet another burden and task for the recruited subjects who were going to be subjected to repeated blood and physical tests concurrently. In retrospect, given the high drop-out rate of 50% of patients from the study, this decision to conduct only one three-day food record, although not desirable for the research design, was probably justified.

Because the design of the validity study of the FC was not ideal, given the circumstances of the subject selection and data collection procedures, identification and reduction of other sources of bias was important.


5.5.2 Subject selection and sampling bias

Subject selection for the validity study of the FC was opportunistic. Clinic subjects, who were newly referred to the CHRMC, were invited to participate in the study. Recruiting a population matched to the clinic patients but not undergoing dietary intervention would have been an alternative option for improving the research design but was also not feasible because of the ad hoc nature of patients referred to the clinic and the time needed for data collection. The slow rate of recruitment of clinic patients in the time allocated resulted in a lower than expected final sample size.

University students were also used to test the validity of the FC to extend its usefulness and because they:

• allowed validation of the FC to a broader population, presumed to be consuming a wider range of fat intake than clinic patients did;

• were accessible in large groups; and

• were less likely than clinic patients to distort food choice during the data collection procedures.

No adjustments were made for age or sex because the final sample size matched for analysis, and the numbers within each age and sex category, were too small. Guthrie (1984) and Pao et al. (1975), however, reported that large differences exist in the standard serve sizes for men and women, especially between 18-30 years. Conversely, Willett et al. (1987) found that adjustment for age and sex had little influence on correlation coefficients for individual micronutrients in FFQs.

5.5.3 Implication of the decrease in sample size on validity

5.5.3.1 For the reproducibility study

The implication of the low final numbers of matched subjects completing the FC at test 3 (3 months from the first test) was that any effects of a longer time frame and seasonal changes on fat intake could not be measured.

5.5.3.2 For the criterion validity study

Only 75% of the original target of 70 clinic patients were recruited in the time available for the criterion validity study. Of these, only 36% of subjects (n=19) provided data that were matched for analysis. In contrast, 60% of the numbers recruited in Group 2 (university nutrition students) (n=42) were 'matched' for analysis (see Table 4.1, p. 88).


The main reason for the lower than expected sample size in both groups was non-consecutive collection of the FR and FC rather than exclusion or withdrawal. This mismatch of timing of the FC with the FR was a major problem and was a reflection of the rigidity of the study protocol and design of the FC. This decrease in subject numbers may be indicative of a loss of subject compliance or an inability of subjects to understand, and adhere to, the study protocol. Poor subject compliance is an inherent problem in many dietary surveys, especially those involving FRs (Willett 1990, 55). The reasons for this are poorly documented as most subjects are lost to follow up and cannot be contacted for evaluation.

5.5.3.3 Effect of the decreased sample size on the power of the measures used to detect differences between the FC and FR in the criterion validity study

Despite the losses described above, the final sample size of 42 subjects in Group 2 met the target sample size of 28 needed to satisfy the specified power (0.8) and predicted effect size (r=0.5) for correlation analysis described in the methods (see Section 3.2.1.3, p. 67). The final power of the study based on an effect size of r=0.64 for 42 subjects was close to 0.9 at a level of significance of P=0.01. Although the final sample size of 19 subjects in Group 3 was below that needed for the predicted effect size of 0.5, the actual effect size corresponding to a correlation coefficient of r=0.84 increased the power of the correlation and regression analysis tests to over 0.95 at P=0.01 (Cohen 1988, 103). Therefore, the regression and correlation results satisfy statistical inference.
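The power figures for correlation in this study were taken from Cohen's tables. As a rough cross-check, the same quantities can be approximated with the Fisher z transformation. The sketch below is an approximation under that assumption, not the tabled method; the function names are illustrative, and the values it returns can differ slightly from Cohen's tabled figures.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def n_for_correlation(r: float, power: float = 0.8, alpha: float = 0.05) -> int:
    """Approximate sample size needed to detect correlation r (two-sided test)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    return ceil(((z_alpha + z_beta) / atanh(r)) ** 2 + 3)


def power_for_correlation(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power to detect correlation r with n subjects (two-sided test)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(atanh(r) * sqrt(n - 3) - z_alpha)


# Planning stage: predicted effect size r = 0.5, power 0.8, alpha 0.05.
print("approximate n required:", n_for_correlation(0.5))

# Final samples, using the correlations and sample sizes quoted in the text.
print("Group 2 power:", round(power_for_correlation(0.64, 42, alpha=0.01), 2))
print("Group 3 power:", round(power_for_correlation(0.84, 19, alpha=0.01), 2))
```

Under this approximation the required sample size is of the same order as the target of 28, and both final samples exceed the conventional power of 0.8 at P=0.01.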

Although the outcomes of regression and correlation analysis in this criterion validity study are statistically acceptable, using correlation to determine a real effect size is misleading. An effect size defined by the differences in fat intake between the FC and FR that would be clinically acceptable is more meaningful when interpreting agreement. Determining an acceptable difference in fat intake is difficult and depends on the current diet consumed and the purpose of using the FC. In patients consuming a low fat diet, an average of 10g fat/d difference between the FC and FR would probably be clinically acceptable, as this difference is relatively small. If the FC was used for screening for high fat intakes, a 20g difference may also be acceptable.

The final power of the Bland-Altman tests to detect a difference in fat intake of either 10g or 20g between the FC and FR was recalculated based on the results in Table 4.7, p. 103 and substituted in the equation and nomogram (Altman 1991, 456) previously described (see Section 3.2.1.3 and Appendix 1). These re-calculations are summarised in Table 5.1. The final sample size of 42 subjects in Group 2 and 19 subjects in Group 3 resulted in a power of only 0.25 and 0.27, respectively, to detect a difference of 10g between the FC and FR that was not due to chance. If a difference of 20g of fat between the FC and FR was considered clinically acceptable, the recalculated power values of 0.73 and 0.75 for Group 2 and 3, respectively, were close to 0.8 for the final sample size tested. Although 20g of fat intake is probably appropriate for subjects who consume relatively high fat intakes, it is too large a difference for patients on low fat diets.
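The agreement statistics referred to above rest on the individual differences between the two methods. A minimal sketch of the usual Bland-Altman summary (mean difference and 95% limits of agreement) is given below. The function name and the five pairs of intakes are hypothetical and are included only to show how the SD of the differences enters the calculation.

```python
from statistics import mean, stdev


def bland_altman(fc: list[float], fr: list[float]) -> tuple[float, float, float]:
    """Return (mean difference, lower limit, upper limit) for paired FC and FR intakes."""
    if len(fc) != len(fr):
        raise ValueError("fc and fr must contain the same number of subjects")
    diffs = [c - r for c, r in zip(fc, fr)]
    bias = mean(diffs)        # mean difference, FC minus FR
    sd_diff = stdev(diffs)    # SD of the individual differences
    return bias, bias - 1.96 * sd_diff, bias + 1.96 * sd_diff


# Hypothetical fat intakes (g fat/d) for five subjects, not data from the study.
fc_intake = [60.0, 85.0, 40.0, 110.0, 72.0]
fr_intake = [55.0, 95.0, 52.0, 100.0, 70.0]

bias, lower, upper = bland_altman(fc_intake, fr_intake)
print(f"mean difference {bias:.1f} g, limits of agreement {lower:.1f} to {upper:.1f} g")
```

In this made-up example the mean difference is close to zero while the limits of agreement span almost 40g, the same pattern of good group agreement but wide individual differences discussed in this chapter.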

Table 5.1 Power calculations for the final subject numbers used for testing the agreement between the FC and FR

Statistical parameters                                   Group 2                   Group 3
                                                         (final sample size=42)    (final sample size=19)

SD difference between the FC and FR (g fat/d) (S)        25.7                      16.7

Clinically relevant difference between the FC and FR     a. 10                     a. 10
(g fat/d) (δ)                                            b. 20                     b. 20

Power (1-β)                                              0.8                       0.8

Significance level (α)                                   0.05                      0.05

Standardised difference (δ/S)                            a. 0.39 (δ=10g fat)       a. 0.60 (δ=10g fat)
                                                         b. 0.78 (δ=20g fat)       b. 1.20 (δ=20g fat)

Final power of the agreement measure                     a. 0.25 (δ=10g fat)       a. 0.27 (δ=10g fat)
                                                         b. 0.73 (δ=20g fat)       b. 0.75 (δ=20g fat)

FC = fat checklist, FR = food record, SD = standard deviation
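The standardised differences in Table 5.1 are simply δ/S, and the final power values can be approximated without the nomogram. The sketch below uses a normal approximation and assumes that the nomogram treats the sample size as the total for a two-group comparison. That assumption, and the function name, are not taken from the thesis, so the output should be read as a cross-check only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def approx_power(delta: float, sd_diff: float, n_total: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power to detect a difference `delta`.

    Assumption: n_total is the total sample size of a two-group comparison,
    as in a standardised-difference nomogram.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    std_diff = delta / sd_diff
    return STD_NORMAL.cdf(std_diff * sqrt(n_total) / 2 - z_alpha)


# SD of the FC-FR differences and final sample sizes, taken from Table 5.1.
groups = {"Group 2": (25.7, 42), "Group 3": (16.7, 19)}

results = {}
for name, (sd_diff, n) in groups.items():
    for delta in (10, 20):
        results[(name, delta)] = (round(delta / sd_diff, 2),
                                  approx_power(delta, sd_diff, n))

for (name, delta), (std_diff, power) in results.items():
    print(f"{name}, delta={delta}g: standardised difference {std_diff}, power {power:.2f}")
```

The standardised differences match the table, and the approximate power values fall within about 0.02 of the tabled figures, which is within the precision expected when a nomogram is read by eye.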

For future studies of the validity of the FC in other population groups, the SD of the differences calculated in this study could be used as a benchmark for determining ideal sample size rather than the cruder correlations, which were actually used for this study. If a 10g difference in fat intake between the FR and FC was clinically acceptable for clinic patients, a sample size of 84 subjects would be needed for this validation test to reach a power of 0.8 (Altman 1991, 456). This is similar to the sample size estimates predicted by Cohen (1988, 103) and predicted for the smallest effect size of 0.3 (see Appendix 1).
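A sample size based on the SD of the differences can also be approximated by formula. The sketch below assumes the standard normal-approximation formula for a two-group comparison with the sample size expressed as a total. This is an assumption about what the nomogram encodes, and the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def total_sample_size(delta: float, sd_diff: float,
                      power: float = 0.8, alpha: float = 0.05) -> int:
    """Approximate total sample size to detect `delta` given the SD of differences."""
    z_sum = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)
    return ceil(4 * (z_sum * sd_diff / delta) ** 2)


# Clinic patients: SD of differences 16.7 g fat/d, target difference 10g.
n_clinic_10g = total_sample_size(delta=10, sd_diff=16.7)
print("approximate subjects needed:", n_clinic_10g)
```

For a 10g difference in clinic patients the formula returns a figure in the high 80s, of the same order as the 84 subjects read from the nomogram.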

For example, for a 20g difference in fat intake between the FC and FR for Group 3, using the data in Table 5.1 in the Altman nomogram (1991, 457) for predicting patient numbers (ie 1.20 standardised difference, power 0.8, significance 0.05), 33 subjects would be needed to repeat the Bland-Altman tests. To repeat this validation study in Group 2 (0.78 standardised difference, power 0.8, significance (two-tailed) 0.05), 51 subjects would be needed.

5.5.4 Effects of investigator bias and training effects on validity measures

To address the possible effects of investigator bias, two independent persons administered the training, collection and checking of the FRs at the follow-up interview for the criterion validity study. Numerous data collectors can reduce bias but do not negate it. The data collectors were likely to have more of an influence on subjects' responses in relation to the FRs, rather than responses to the FCs as these were self-administered. The data for the FCs were collected in isolation with no input from data collectors. Coding and translation errors were minimised because scoring of the FCs was cross-checked by another trained person. The FRs were verified with the subjects for cross-checking food quantities, where needed (eg foods eaten away from home).

Because of the necessity for intervention and instructions provided to subjects to collect FRs, the effects of bias in recording food intake from either an investigator effect or training effect cannot be discounted. Other researchers have observed that subjects in a similar situation record only those foods that they want the investigator to see and frequently change their usual eating patterns (Marr 1971; Stockley 1985).

5.5.5 Effects of a three-day time frame on agreement measures

A FFQ that measures three days of food intake retrospectively is an unusual method but not unique. Subjects tend to lose compliance after four days of recording (Gersovitz et al. 1978; Stuff et al. 1983) so the short sampling period of three days used for the FC was perceived to be an advantage for improving the accuracy of the FR data although it was likely to contribute to an inflation of the agreement measures. A short sampling period such as three days of recording and recall is associated with a large standard deviation (Willett 1990, 35) which was seen in the FC study.

5.5.6 Effects of using consecutive days of testing on agreement measures

The three days used for testing the FC, Sunday, Monday and Tuesday, raise the issue proposed by Willett (1990, 41) that misleading estimates of within-person variation in fat intake can be made when measuring food intake over consecutive days. People do not eat the same way every day. Over-eating on one day is likely to be followed by under-eating the next day (El Lozy 1981). Food intake on one day also adds to the prediction of intake for the next day (Morgan et al. 1987). Hence, there is a recommendation to sample at random intervals and for longer time periods for measuring 'usual' fat intake to avoid the effects of consecutive days of sampling (Willett 1990, 41; Marr and Heady 1986). The consecutive day sampling was not an issue in this study as measuring 'usual' intake was not the purpose of the FC. However, such a protocol does have implications if the FC (or modifications of it) was used repeatedly for measuring fat intakes in the 'usual' diet.


5.5.7 Influence of FRs on agreement measures

5.5.7.1 Reference method for validation of the FC

A three-day FR was used as the reference method because, unlike a FFQ, it is minimally dependent on memory and allowed more accurate quantification of the foods eaten. An estimated FR was chosen in preference to a weighed FR for the clinic patients for a number of reasons. Insufficient scales were available for distribution. Weighing and recording dietary intake is a well-known behavioural technique for weight loss (Black et al. 1991). Weighing all food and beverages rather than estimating them was more likely to distort food choices in patients attending a clinic for dietary intervention, particularly those referred for the weight-loss program. In contrast, the university students were not involved in any formal dietary intervention program and used weighed measures to enhance accuracy in quantifying foods consumed.

5.5.7.2 Sources of inaccuracy and bias in recording food intakes

There is a high degree of potential interference with a subject's normal eating patterns when recording food intakes. Subjects may also report actual food intake inaccurately, either intentionally or non-intentionally. It has been widely reported that recording of food intake actually changes a subject's diet (Pekkarinen 1970; Dennis and Shifflet 1985; Stockley 1985). It is believed to discourage snacking, inhibit food selection and encourage simplification of the diet. This is often related to the time-consuming nature of the method and the desire to impress the investigator, thus hiding bad dietary habits (Pekkarinen 1970). Therefore, although the FR was considered the most 'accurate' method as a criterion measure for this study, it was itself biased. The under-reporting by so many subjects confirmed that subjects were not reporting 'true' intakes. Presumably this is also consistent for the FC.

Additional problems using a weighed FR include the incorrect weighing of foodstuffs with resulting under- or over-estimates of actual intakes. Although subjects were provided with a set of scales to measure food intake in this study, it was possible that these were incorrectly zeroed or that the actual weights of foods were incorrectly recorded, despite calibration at distribution. Another confounding factor that may have decreased accuracy was the ability of the subjects to make reliable estimates of ingredients in composite meals and meals eaten away from home. To reduce errors in estimating ingredients, respondents were questioned about these foods at interview.


5.5.7.3 Computation of fat intakes

Fat intakes calculated from the FRs using NUTTAB 91/2 (Lewis and Holt 1992) are estimates only and not direct measures (see Chapter 2, Section 2.3.2.1, p. 10). Where a food reported as consumed in the records was unavailable in the database, a substitute was used. Where numerous substitutions are made, inaccurate information can result. In this study, few substitutions were made as subjects generally consumed foods that were commonly consumed and easy to measure. This apparent response to recording food intake supports the observation from other studies that subjects simplify food choice using FRs. The FRs, because of their comprehensive nature, contained more foods than those listed on the FC so were expected to be more accurate than the FC. Two independent people cross-checked coding and analysis of FRs to avoid or limit errors in data entry. Where discrepancies between measures were detected, these were double-checked and re-analysed to decrease error.

Despite the attention to reducing coding and translation errors in this study, the accuracy of conversion of reported foods into nutrients from food composition tables is always questionable (Kohlmeier 1992) and has been addressed earlier (p. 9-11). Although the amount of food waste was measured in the weighed FRs in Group 2, it was poorly addressed in the estimated records in Group 3 and in the FCs, and may account for some loss of accuracy. The issue of appropriate substitutes for the large range of commercial foods modified for fat (ie reduced or low-fat) such as cheese and yoghurt that were not on the database may also have contributed to a decrease in accuracy of the measures of fat intakes.

5.5.7.4 Exclusion of food records from the study

FRs in this study that were inadequate in detail, in either amounts, or types of food consumed, were excluded from the study (see Section 4.1, p. 86-7). The exclusion of these FRs in 21% of subjects in Group 3 and around 11% in Group 2 suggests that the FRs used for analysis came from the more motivated or compliant subjects which may have inflated the agreement measures between the FR and FC.

5.5.8 Influence of the FCs on the agreement measures

5.5.8.1 Timing of administration of the FC

The FC was intended to measure short-term current intake over the previous three days and to be used as a monitoring and education resource at weekly appointments during the four-month clinic program at the CHRMC. The timing of the administration of the FC therefore coincided with the recording period of the FR. For this reason, the FR was completed prior to the FC. This order would have increased awareness of food consumed by subjects, which may have biased the results. As the clinic subjects were referred to the clinic for lifestyle and dietary intervention programs, they were likely to have an increased awareness of their food intake anyway. This may have introduced additional bias and further inflated the agreement measures.

However, the fat intake data from the FC of subjects who were 'matched' with data from the FR and the FC responses of 'unmatched' subjects were no different, except for clinic males (see Table 4.16, p. 116). This suggests that those subjects who provided 'matched' data were either not biased in their responses or that both 'unmatched' and 'matched' subjects were similarly biased. It could be concluded that being in a study already influenced those males who had provided 'matched' data. The factors affecting bias, or the extent of bias contributed by the unmatched subjects, are impossible to document, given the small sample size of the clinic males.

5.5.8.2 Format and design of the FC

The suitability of foods listed and frequency format of the FC may have inflated the agreement as described earlier in Section 5.4.2.1, p. 134-7. A major limitation in the design of the FC was that it did not measure the whole diet, only selected foods, so energy and other nutrients could not be calculated. This has limitations in application to any epidemiological study investigating the association between diet and disease and does not account for the interactive effects of other nutrients.

5.5.8.3 Influence of memory or recall bias

The short time frame and the design of the FC aimed to maximise memory and improve recall. Recording food intakes raises awareness of foods consumed (Marr 1971; Lee and Neiman 1993, 49). Therefore, a diet recall of the same three days of recording was unlikely to have a large effect on memory loss. Also the recall-recognition technique of the FC elicits more information than simply remembering food intake by interview and is incorporated into the traditional diet history for this purpose (Bingham 1985; Willett 1990, 58). Despite the perceived advantages of being sensitised to recall food intake more accurately because of a concurrent weighing of food consumed, subjects in both Group 2 and Group 3 still made errors recalling foods (both adding and omitting foods) on the FC. Willett (1990, 54-55) in a review of sources of respondent error in short-term recall methods supports this observation and concludes that errors in recalling foods accurately are common even when subjects recall foods consumed in a 24-hour dietary recall (Willett 1990, 54).


Because the amount of under-reporting and over-reporting in the pooled Group 2 and Group 3 data was relatively equal (Table 4.11, p. 111), the overall effect of any recall bias was negated at the group level.

5.5.8.4 Errors in reporting serve size

The errors in estimating serve sizes on the FCs were variable, but serve sizes were under-estimated more than twice as often as over-estimated. Foods presented in 'discrete' units or eaten as single foods such as a 200g carton of yoghurt, a carton of hot chips and other take-aways, for example, were correctly described. Subjects had difficulty quantifying meat, which was not presented in 'discrete' units, as explained previously in Section 5.4.2.3, p. 136. Other studies have supported this result and show that people consistently have problems quantifying food like a piece of steak or fish (Guthrie 1984; Willett 1990, 80; Fogelholm and Lahti-Koski 1991; Faggiano et al. 1992; Tjonneland et al. 1992). Faggiano et al. (1992) reported over-estimates as high as 50% for meat quantities in 103 respondents to a recall of the previous evening's meal using portion photographs as cues. Despite using the weighed FRs in Group 2 to quantify foods consumed, errors were still evident in this group. The reasons were not determined in either the current study or Faggiano's study but could be attributed to one or a combination of the following reasons:

• poor compliance in altering serve size on the FC;

• difficulty in remembering quantity consumed; and

• difficulty translating quantities consumed into serve sizes on the FC.

The collection of serve size information in FFQs has been controversial (Guthrie 1984; Willett 1990, 79; Block and Subar 1992; Wheeler et al. 1994) but is not critical given the dominance of frequency of consumption in the estimation of average intakes (Samet et al. 1984; Tjonneland et al. 1992). Guthrie (1984) reported that in a population of university students and university employees, aged 18-30, subjects poorly quantified serve sizes of a meal immediately after consumption without the aid of any cues, photographs or food models. Poor recall of food quantities was also evident on the FC. Wheeler et al. (1994), however, found that reproducibility of fat intake and most other nutrients was higher in a FFQ that included portions and options to change portions by household measure, than a qualitative version of the same FFQ. Conversely, Bingham et al. (1994) found that the accuracy of a food use checklist was worsened when subjects were given the option to assess their own portion sizes. The FC in this study did not measure the effect on the agreement measures with or without portion sizes but showed that in both clinic patients and university students, errors in estimating portion sizes were evident.

Although misclassification of foods was not high, subjects in both groups tended to add foods, particularly snack foods (eg ice-cream (Q150)), on the FC and omit them from the FR. Worsley et al. (1984) and others (Mertz et al. 1991) suggested that these foods are perceived to be socially undesirable and are most likely to be misreported. Despite the confidentiality of responses in both the FR and the FC, the responses on the FRs were not anonymous because of the checking procedure outlined in the study protocol (see Section 3.3.3.1, p. 74). Therefore, the effect on subjects of participating in a study, and investigator bias, could not be discounted. The use of data collectors may have reduced this investigator bias but would not have eliminated it.

Food items recalled on the FCs that were under-estimated compared with the FRs were added butter and margarine, cheese (Group 2 only) and cream (see Table 4.15, p. 114). These foods frequently occur as ingredients in recipes and may be easily forgotten in a recall situation. Prompts for these foods when used as ingredients were provided in the FR instructions but not on the FC. Group 3 was notable for over-estimating serve sizes for cakes and biscuits, which is probably attributable to subjects either not reading or not changing the serve sizes on the FC.

146 Chapter 6: Conclusions and suggestions for future work

CHAPTER 6

CONCLUSIONS AND SUGGESTIONS FOR FUTURE WORK

6.1 Conclusions

The fat checklist (FC) developed in this study produced acceptable agreement measures for fat intake with the more 'accurate' food records (FRs) although reproducibility was unacceptable. The FC produced statistically comparable measures of mean daily fat intake with the FRs; was able to classify subjects similarly into quartiles of fat intake with the FRs; and had an acceptable sensitivity and low error rate in identifying subjects with a mean intake over three days of ≥ 70 g fat/d. The FC can be self-administered, self-scored and takes most people less than ten minutes to complete and score. However, although the agreement measures were acceptable using these broad measures of cross-comparison, the FC cannot replace a three-day FR in measuring fat intake in individual subjects in either Group 2 (the university students) or Group 3 (the clinic patients). The measurement error of the FC was too large and unacceptable for measuring absolute fat intake in individuals.

Numerous confounding factors were identified that could have biased the interpretation of the reproducibility and validity results and contributed to the measurement error of the FC. Overall these factors were not large enough to substantially affect the agreement measures in groups of individuals, but contributed to the unacceptable validity of the FC for measuring fat intake in individuals. The nature and magnitude of errors in recalling foods accurately or misreporting serve sizes on the FC contributed to the poor agreement of fat intake between the FC and FR in individuals. Minor modifications to the FC are needed to address these issues.

In summary, the FC cannot be used interchangeably with a three-day FR if estimates of individual fat intake are required. However, this does not mean that the FC is without useful applications (see Section 6.3, p. 152). Exact agreement between the FR and FC is not critical to the ability of the FC to rank individuals by levels of fat intake or monitor relative changes in fat intake in an intervention program at the group level in a statistically adequate sample. The FC is not suitable for individuals unless the difference in fat intake is too large to be attributable to poor reproducibility. Other measures such as weight change, serum cholesterol or the individual's perception of change in fat intake should be used in combination with the FC in an intervention program for evaluating change.

The FC was found to be applicable to the Australian food supply and the food choices of a young adult population and possibly an older adult population. Further modification and testing of the FC would be needed to justify its use in other populations.

6.1.1 Detailed conclusions

Reproducibility

• Reproducibility of the FC was unacceptable. Respondents consumed less fat at the retest but this was likely to be biased by a training effect associated with the short-term interval between tests as well as by other factors.
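
The test-retest comparison summarised above can be illustrated with a minimal sketch. The fat scores below are hypothetical (they are not the study data) and the function name `pearson_r` is an assumption made for illustration. The sketch shows how a high test-retest correlation can coexist with a systematic fall in reported fat intake at retest, which is why a correlation coefficient alone does not establish reproducibility.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical FC fat scores (g fat/d) for the same five respondents.
test = [62.0, 85.0, 47.0, 110.0, 73.0]
retest = [58.0, 80.0, 50.0, 95.0, 70.0]

r = pearson_r(test, retest)
# Mean change at retest; a negative value means less fat was reported.
mean_change = sum(b - a for a, b in zip(test, retest)) / len(test)
print(f"r = {r:.2f}, mean change = {mean_change:.1f} g/d")
```

Reporting the mean change alongside the correlation makes a training effect of the kind described above visible.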

Criterion validity

• Differences in mean fat intake between the FR and FC for each group were small (less than 1.1% for Group 2 and 7.0% for Group 3), but showed very wide differences among individuals. The FC performed better in estimating group mean intakes than measuring individual fat intakes.

• Although the FC appeared statistically acceptable for measuring fat intake of individuals in both groups, the differences in individual fat intake between the FC and FR, and measurement errors of the FC were too large for 'accurate' measurement of fat intake in individual clinic patients (Group 3) who were referred for reducing fat intake.

• The regression equation calculated from the data in Group 3 can be used for measuring 'true' fat intake in groups of clinic subjects, provided the fat score from the FC does not exceed 138 g fat/d. The regression equation derived for Group 2 was not acceptable for calculating 'true' fat intake in Group 2.

• The level of correct classification of the FC with the FR (ie 92%) based on quartile ranking of all subjects in Group 2 and Group 3 (pooled) was at least as good as levels reported from other validation studies targeting fat intakes. The FC can replace a three-day FR for measuring fat intakes in groups and could be used for ranking fat intake or discriminating between high (>88g fat/d) and low fat (<47g fat/d) consumers.

• The sensitivity of the FC to mean fat intakes of ≥ 70 g fat/d and its high specificity in detecting fat intakes below this level suggested that the FC was acceptable as a screening measure for higher fat intake. This observation supports the cross-classification results.

• Food items erroneously recalled (ie misclassified or quantified incorrectly) or forgotten on the FC were mainly meat, dairy foods and snack foods. The reasons for these errors were related, to some extent, to the design of the FC and respondent bias and will be discussed further in Section 6.1.3, p. 150.
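
The quartile cross-classification referred to under criterion validity can be sketched as follows. This is a minimal illustration under stated assumptions: the paired intakes are hypothetical, quartiles are assigned by simple rank position, and the function names `quartile_ranks` and `cross_classify` are not taken from the study.

```python
def quartile_ranks(values):
    """Assign each value a quartile (0-3) by rank; ties keep input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, index in enumerate(order):
        ranks[index] = position * 4 // n
    return ranks


def cross_classify(reference, test):
    """Return (share in same quartile, share in same or adjacent quartile)."""
    q_ref = quartile_ranks(reference)
    q_test = quartile_ranks(test)
    gaps = [abs(a - b) for a, b in zip(q_ref, q_test)]
    same = sum(g == 0 for g in gaps) / len(gaps)
    same_or_adjacent = sum(g <= 1 for g in gaps) / len(gaps)
    return same, same_or_adjacent


# Hypothetical paired fat intakes (g fat/d): FR as reference, FC as test.
fr = [45, 52, 60, 68, 75, 83, 91, 120]
fc = [50, 48, 72, 65, 70, 95, 88, 110]

same, same_or_adjacent = cross_classify(fr, fc)
print(f"same quartile: {same:.0%}, same or adjacent: {same_or_adjacent:.0%}")
```

In this made-up example only half of the subjects fall in the same quartile although all fall in the same or an adjacent quartile, which mirrors the gap between the two measures noted in the text.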

Content validity

• Few problems were detected in the design, format and administration of the FC, indicating good content validity. All food items listed on the FC were consumed by the study groups. However, a change in the order of meats listed is warranted (see Section 6.2.1, p. 151).

• The frequency response options and portion sizes on the FC generally reflected the food choices of the clinic patients and university students, with some exceptions; minor modifications to the milk and biscuit questions are therefore recommended.

6.1.2 Problems with the study design

The nature of the study design would be associated with an inflation of the agreement measures between the FC and FR, although efforts were made to minimise subject and investigator biases. Close agreement in means and cross-classification measures, and the moderate to high correlation coefficients, may have reflected the study design and the order of administration of the FC. Despite the apparent close agreement of the results, there was still unexpectedly wide variability in agreement measures of fat intake in individuals. The following factors may have affected the interpretation of the agreement measures as well as the accuracy of the reference method. The magnitude of the effects of these factors, however, is unclear.

• The administration of the FC after the FR may have sensitised subjects to foods consumed and thus would have inflated the agreement measures.

• The use of a three-day FR as the criterion method was associated with under-reporting of energy in the majority of subjects in both groups.

• The study groups were not representative of the general population: they were selected populations (university students; clinic patients) and were biased in terms of educational background, sex ratio, age and motivation. Accordingly, the agreement measures between the FC and FR may have been falsely high, especially for the clinic patients, who were willing to participate in an intervention program and were likely to be more motivated to achieve a health outcome than the university students. The findings for the FC cannot be generalised to other population groups.

• The final sample sizes matched for analysis, 42 subjects in Group 2 and 19 subjects in Group 3, were small, largely because of a mismatch of timing of the FC with the FR, especially in Group 3, the clinic patients. These low numbers in Group 3 decreased the power of the statistical tests used to detect a difference in fat intake between the FC and FR. The number of subjects in Group 2 met the specified power of 0.8 to detect a difference in fat intake between the two methods that was statistically significant at P ≤ 0.05.

• No allowance was made on the FC for potential differences in food selection between men and women. The changes in serve sizes observed at the upper end of the frequency options may have been a reflection of these differences. This was not tested in this study because of the low numbers of males participating, and the bias introduced by under-reporting particularly by males in Group 3.

6.1.3 Problems with respondent bias

Respondent (subject) bias associated with misreporting or poor memory recall of foods reported consumed was identified as a source of error or variation between the FCs and FRs. However, comparisons of mean fat intakes, correlation coefficients and cross-classification measures between the FC and FR indicated good agreement, despite these effects. The problems with memory bias and accuracy in recalling retrospective intake on the FC, even in the short term, identified in this and other studies are difficult to isolate. Measurement of memory or recall bias and subsequent relative validity assumed that the three-day FR was the more 'accurate' measure or 'gold standard'. Because a FR is not a direct or truly 'accurate' measure of food intake, it is impossible to determine whether differences between the two methods were because of under-reporting, over-reporting, misclassification or erroneous recording in either method.

Subject bias was related to the following:

• The design of the FC was determined by its main intended use (ie as a resource for reducing fat intake in a longitudinal dietary intervention program in a Canberra clinic) and this design was maintained for the validity testing. It was apparent to subjects, both in the pilot study and actual study, that fat intake was being assessed. This was a source of subject bias that would have been systematic for both FRs and FCs.

• The order in which foods were listed on the final FC (eg 'trimmed' and 'untrimmed' meat, listed in two separate sections) was associated with incorrect responses in a number of subjects.

• Subjects had difficulty estimating quantities of mixed meals accurately when filling out FRs, which may have decreased accuracy of computation of fat intake. This difficulty was addressed in the administration of the study by the use of food models and standard household measures to cross check amounts consumed but cannot be discounted as a source of error affecting validity.

6.2 Suggested modifications to the FC

Several improvements and modifications to the design of the FC are recommended based on the outcomes of the validity testing. These changes depend on the objectives for use, the study design (if used for monitoring) and the targeted respondents.

The inclusion of cues on the FC is warranted for ingredient prompting. Foods including cheese, eggs and added fat and fat spreads, which are important fat sources in the diet and used as ingredients in many foods, were frequently omitted. Prompting was included on the estimated FRs, as part of the example in the instructions, but not in enough detail on the FCs. Prompting would be best incorporated in the FC where these foods are listed. For example, in Q17 (cheese), include a prompt (eg remember cheese used as an ingredient or added to pasta, rice or sandwiches).

The inclusion of reduced- and low-fat dairy foods, especially low-fat cheese is also warranted considering the plethora of these foods now available in 2000 and the likelihood that clinic patients restricting fat would consume these foods.

6.2.1 Summary of suggested modifications for clinic use

• Retain the title of Fat Intake Checklist if the purpose of its use is to modify fat or monitor fat intake in the diet of individuals.

• Retain the amount of fat (in grams in the Fat per Serve column) so that patients as well as the dietitian or health professional can use the checklist as a self-monitoring resource.

• Retain the same format (ie frequency options and three-day recall instructions) but re-order the food list into similar food groups. Both 'untrimmed' and 'trimmed' meat could be placed side-by-side. The pilot study revealed that subjects using the checklist for the purposes of reducing fat intake, tended to avoid or limit most of the foods listed. Placing food groups together may avoid confusion, which should improve accuracy of recall and also provide an educational message, and opportunities to assist patients to improve food selection.

• Split the milk group into separate items and include reduced- and low-fat dairy foods (ie cheese, yoghurt, Fruche™).

• Clarify the meaning of the words 'trimmed' and 'untrimmed' by reference to the example in the instructions below:

Over the past three days, this person had eaten one small untrimmed steak (including most of the fat), 2 fried drumsticks (including the skin) and 3-4 slices of untrimmed roast pork (including the fat) so the total is 3 (ie counting 1 serve for the steak, 1 for the chicken and 1 for the pork) and the fat intake is 22 × 3 = 66 grams of fat.

• Include food models (or photographs) of cooked meats and other foods difficult to quantify to match the standard amounts (or serve sizes) provided on the FC.

• Include household measuring devices (or photographs) (ie standard cups and spoon etc) as a demonstration to help patients quantify food intake.

• Use Q43 (ie ..... foods eaten over the previous three days that contain fat but not mentioned in the questionnaire ..... ), the final open-ended question, to elicit additional information by prompting.
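
The scoring arithmetic in the 'trimmed'/'untrimmed' example above (serves multiplied by grams of fat per serve) can be sketched as below. The item key `untrimmed_meat` and the function name `fat_score` are hypothetical, and dividing the three-day total by three to obtain a mean daily figure is an assumption made for illustration; only the values 3 serves and 22 g fat per serve come from the example.

```python
def fat_score(serves_by_item, fat_per_serve):
    """Total fat (g) over the recall period: serves x grams of fat per serve."""
    total = 0.0
    for item, serves in serves_by_item.items():
        total += serves * fat_per_serve[item]
    return total


# Values from the worked example: 3 serves at 22 g fat per serve.
# The item name is a hypothetical label, not an identifier from the FC.
fat_per_serve = {"untrimmed_meat": 22.0}
serves = {"untrimmed_meat": 3}

three_day_total = fat_score(serves, fat_per_serve)
# Assumption: mean daily intake is the three-day total divided by 3.
mean_daily = three_day_total / 3
print(three_day_total, mean_daily)
```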

6.3 Recommendations for use of the FC

Monitoring of fat intake of individual clinic patients

The FC can be used for documenting and monitoring of relative changes in fat intake in individual clinic patients in a counselling or clinical intervention situation. Bias in reporting may not be important if subjects are surveyed more than once and act as their own controls. The FC may be useful for documenting adherence to a clinical study where consistency of fat or food intake needs to be maintained. Bias only becomes important if absolute levels of fat intake are needed. The option to differentiate types of fatty acids could also be determined.

Evaluation of fat intake in groups of individuals in the clinic program

The FC can be used to evaluate the efficacy of a dietary intervention program in the clinic patients and to rank subjects into groups on the basis of fat intake. For example, fat intakes from the FC measured at baseline (pre-intervention) can be compared with intakes at the end of the program (post-intervention). Kemppainen et al. (1993) concluded that three days were adequate for ranking of subjects according to fat intake in 82 Finnish adults. Some studies have also supported three days of measurement for reliably estimating fat intake in groups (Jeor et al. 1983; Marr and Heady 1986; Brekke et al. 1992) while other studies have suggested six to seven days (Basiotis et al. 1987; Nelson et al. 1989).

Fat intake data can be used directly from the FC to check relative differences, or it can be derived from the regression equation to determine differences in 'true' fat intake.
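
Deriving an adjusted intake from a regression equation, as described above, can be sketched as follows. The coefficients of the study's equation are not reproduced here, so the sketch fits a line to hypothetical paired data; the direction of the regression (FR intake regressed on FC score) and the names `fit_line` and `calibrated_fat` are assumptions. The upper limit of 138 g fat/d is the one stated in the detailed conclusions for Group 3.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


MAX_FC_SCORE = 138.0  # upper limit of FC fat score stated for Group 3 (g fat/d)


def calibrated_fat(fc_score, intercept, slope, max_score=MAX_FC_SCORE):
    """Apply the fitted line, refusing scores above the stated upper limit."""
    if fc_score > max_score:
        raise ValueError("FC score exceeds the range in which the equation applies")
    return intercept + slope * fc_score


# Hypothetical paired data (g fat/d); not the study data.
fc_scores = [40.0, 60.0, 80.0, 100.0, 120.0]
fr_intakes = [50.0, 65.0, 80.0, 95.0, 110.0]

intercept, slope = fit_line(fc_scores, fr_intakes)
print(calibrated_fat(100.0, intercept, slope))
```

Rejecting scores above the limit, rather than extrapolating, reflects the condition attached to the Group 3 equation.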

Screening tool for groups of individuals

The FC has an acceptable sensitivity for measuring ≥ 70 g fat/d and has a good capacity to discriminate between high and low fat consumers in a community or clinical situation. It would be useful for identifying high fat consumers and ranking consumers into categories. However, its sensitivity, using the existing serve sizes and format, is poor for very low fat intakes of ≤ 40 g fat/d.
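
As an illustration of how sensitivity and specificity at the 70 g fat/d cut-off can be computed from paired FC and FR results, the following is a minimal sketch. The data are hypothetical, the FR is treated as the reference method, and the function name `screening_performance` is an assumption rather than part of the study's analysis.

```python
def screening_performance(reference, test, cutoff=70.0):
    """Sensitivity and specificity of `test` against `reference` at `cutoff`."""
    tp = fp = tn = fn = 0
    for ref, obs in zip(reference, test):
        ref_high = ref >= cutoff
        obs_high = obs >= cutoff
        if ref_high and obs_high:
            tp += 1
        elif ref_high:
            fn += 1
        elif obs_high:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain subjects on both sides of the cutoff")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical mean daily fat intakes (g fat/d): FR as reference, FC as test.
fr = [45.0, 52.0, 60.0, 68.0, 75.0, 83.0, 91.0, 120.0]
fc = [50.0, 48.0, 72.0, 65.0, 66.0, 95.0, 88.0, 110.0]

sensitivity, specificity = screening_performance(fr, fc)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```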

Education of patients / community groups about fat sources in food

The FC provides an education resource which can be used in clinical or community settings to raise awareness of food sources of fat in clients/consumers and for measuring short-term eating behaviour; and for assisting in self-help intervention.

6.4 Recommendations for future testing

The suitability of the FC (or modifications of it) for other population groups of different ages, socio-economic and cultural backgrounds than the local population studied here would need to be tested before use. Culturally or regionally unique foods such as the types of low-fat milk available in Victoria are not listed on the FC. This could be accommodated in the last question at the end of the FC by using prompts or foods typically eaten by the population of interest. The order of foods may need to be changed for different cultures because the Australian food groups are based on similar nutrient composition (ie Core Food Groups). Other cultures classify foods differently. For example, numerous Asian cultures classify foods by their digestibility or medicinal qualities. Development of a culture-specific food list, based on food consumption patterns of the target population, and meeting with representatives from the target population is warranted.

The FC was, to a large extent, based on the food intake of Australians surveyed in the National Dietary Survey of 1983 (Cashel et al. 1986; English et al. 1987). As the food supply includes many new foods and more low- and reduced-fat food in 2000, the FC would need to be revised to accommodate these changes and retested prior to use.

Serve sizes in the current FC were based on US data (Pao et al. 1975). Data on typical serve sizes of Australian adults in the 1983 National Dietary Survey were not available at the time of the study. These have since been published (Australian Institute of Health and Welfare, 1994). However, data on typical serve sizes for the 1995 National Nutrition Survey are not yet available. A modification in serve size in the FC may be warranted if the typical serve sizes of the foods people usually consume have changed substantially.

Many dietary validation studies have used, and continue to use statistical comparative tests (ie comparison of means of nutrient values or foods, correlation between nutrients), which are dictated by the conventional wisdom of the time to assess the performance of a new method of measurement. This study has demonstrated that applying these conventional methods may be misleading and erroneous in determining 'true' agreement. Researchers are now more interested in defining the source and magnitude of measurement error and biases inherent in measuring food intake and reporting the variability in agreement measures. The cross comparison techniques, manual scoring of errors and Bland Altman plots used in this study are recommended for determining the measurement errors of any new dietary method.
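
The contrast drawn above between comparing means and examining agreement can be illustrated with a minimal Bland-Altman sketch. The paired intakes are hypothetical, the function name `bland_altman` is an assumption, and the limits of agreement are taken as the mean difference plus or minus 1.96 standard deviations of the differences.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of differences
    return bias, bias - spread, bias + spread


# Hypothetical mean daily fat intakes (g fat/d); not the study data.
fc = [50.0, 48.0, 72.0, 65.0, 66.0, 95.0, 88.0, 110.0]
fr = [45.0, 52.0, 60.0, 68.0, 75.0, 83.0, 91.0, 120.0]

bias, lower, upper = bland_altman(fc, fr)
print(f"bias = {bias:.1f}, limits of agreement = {lower:.1f} to {upper:.1f} g/d")
```

In this made-up example the mean difference is zero, so a comparison of means would suggest perfect agreement, yet the limits of agreement span roughly 17 g fat/d either side for an individual.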

Finally, the proposition that the FC developed in this thesis could replace a three-day diet record was not confirmed in either the clinic patients or university students tested. However, any conclusion that the FC is a truly accurate measure of fat intake in the three days of measurement would be misleading because, in reality, its validation was based on a comparison made with an imperfect criterion or reference method, a FR. There is presently no truly 'accurate' method of measuring food, and hence fat intake, in free-living people. The final FC is, however, not without useful applications in a clinical as well as a community setting.

154 Bibliography

BIBLIOGRAPHY

Abrahams, S.F., Mira, M., Beumont, P.J.V., Sowerbutts, T.D. & Llewellyn-Jones, D. 1988, 'Eating behaviours among young women', Med J Aust, vol. 2, pp. 225-28.

Altman, D.G. 1991, Practical Statistics for Medical Research, Chapman and Hall, London, pp. 163-174, 191, 197, 284-307, 401-418, 437-460.

Altman, D.G. & Gardner, M.J. 1992, 'Confidence intervals for research findings', Br J Obstet Gynaecol, vol. 99, no. 2, pp. 90-1.

Angus, R.M., Sambrook, P.N.M., Pocock, N.A. & Eisman, J.A. 1989, 'A simple method for assessing calcium intake in Caucasian women', J Am Diet Assoc, vol. 89, no. 2, pp. 209-14.

Armstrong, B. & Doll, R. 1975, 'Environmental factors and cancer incidence and mortality in different countries with special reference to dietary practices', Int J Cancer, vol. 15, pp. 617-631.

Ashton, B.A., Marks, G.C., Battistutta, D., Green, A.C. & The Nambour Study Group 1996, 'Under-reporting of energy intake in two methods of dietary assessment in the Nambour Trial', Aust J Nutr Diet, vol. 53, pp. 53-60.

Australian Bureau of Statistics (ABS) / Commonwealth Department of Health and Family Services (CDHFS) 1997, National Nutrition Survey; Selected Highlights Australia, 1995, Australian Bureau of Statistics, Cat No. 4802, Canberra.

Australian Institute of Health and Welfare 1994, 'Foods Commonly Consumed in the 1983 National Dietary Survey of Adults: Amounts per Eating Occasion', Australian Institute of Health and Welfare, Canberra.

Baghurst, K.I. & Baghurst, P.A. 1981, 'The measurement of usual dietary intake in individuals and groups', Trans Menz Found, vol. 3, pp. 139-60.

Baghurst, K.I., Crawford, D., Worsley, A., Syrette, J.A., Record, S.J. & Baghurst, P.A. 1988, 'The Victorian Nutrition Survey: A profile of the energy, macronutrient and sodium intake of the population', Comm Health Stud, vol. XII, no. 1, pp. 42-54.

Baghurst, K.I., Crawford, D.A., Worsley, A. & Record, S.J. 1989, 'The Victorian Nutrition Survey - intakes and sources of dietary fats and cholesterol in the Victorian population', Med J Aust, vol. 149, pp. 12-19.

Baghurst, K.I. 1992, Personal communication, CSIRO Division of Human Nutrition, Adelaide.

Baghurst, K.I. 1993, 'The food frequency technique and its relevance to population surveys in Australia - a commentary', Aust J Nutr Diet, vol. 49, no. 3, pp. l 014.

Balogh, M., Kahn, H.A. & Medalie, J.H. 1971, 'Random repeat 24-hour dietary recalls', Am J Clin Nutr, vol. 24, pp. 304-10.

Balogh, M., Medalie, J.H., Smith, H. & Groen, J.J. 1968, 'The development of a dietary questionnaire for an ischemic heart disease survey', Isr J Med Sci, vol. 4, pp. 195-203.

Barratt, A., Reznick, R., Irwig, L., Simpson, J.M., Oldenburg, B., Horvath, J. & Sullivan, D. 1994, 'Work-site cholesterol screening and dietary intervention: the Staff Healthy Heart Project', Am J Public Health, vol. 84, pp. 779-82.

Basiotis, P.P., Welsh, S.O., Cronin, F.J., Kelsay, J.L. & Mertz, W. 1987, 'Number of days of food intake records required to estimate individual and group nutrient intakes with defined confidence', J Nutr, vol. 117, pp. 1638-41.

Beard, T.C., Eickhoff, R., Mejglo, Z.A., Jones, M., Bennett, S.A. & Dwyer, T. 1992, 'Population-based survey of human sodium and potassium excretion', Clin Exp Pharmacol Physiol, vol. 19, no. 5, pp. 327-30.

Beaton, G.H., Burema, J. & Ritenbaugh, C. 1997, 'Errors in interpretation of dietary assessments', Am J Clin Nutr, vol. 65, (4 Suppl), pp. 1100S-1107S.

Beaton, G.H., Milner, J., McGuire, V., Feather, T.E. & Little, J.A. 1983, 'Source of variance in 24 hour dietary recall: Implication for nutrition study design and interpretation; carbohydrate sources, vitamins and minerals', Am J Clin Nutr, vol. 37, no. 6, pp. 986-95.

Bellach, N. 1993, 'Remarks on the use of Pearson's correlation coefficient and other association measures in assessing validity and reliability of dietary assessment', Eur J Clin Nutr, vol. 47, (Suppl 2), pp. S42-5.

Berenstein, M., Rothstein, H. & Cohen, J. 1997, Power and Precision: A computer program for statistical power analysis and confidence intervals, Biostat, NJ, pp. 22.

Bergman, B.A., Boyings, J.C. & Erickson, M.L. 1990, 'Comparison of a food frequency questionnaire and a 3-day diet record', J Am Diet Assoc, vol. 90, no. 10, pp. 1431-3.

Bingham, S. 1985, 'Aspects of dietary survey methodology', British Nutrition Foundation, Nutrition Bulletin 44, vol. 10, no. 2, pp. 90-103.

Bingham, S.A. & Cummings, J.H. 1985, 'Urine nitrogen as an independent validatory measure of dietary intake: a study of nitrogen balance in individuals consuming their normal diet', Am J Clin Nutr, vol. 42, pp. 1276-1289.

Bingham, S.A. 1987, 'The dietary assessment of individuals; methods, accuracy, new techniques and recommendations', Nutr Abst Rev, vol. 57, pp. 705-43.

Bingham, S.A., Nelson, M., Paul, A.A., Haraldsdottir, J., Bjorge-Loken, E. & van Staveren, W.A. 1988, 'Methods for data collection at an individual level', in Manual on Methodology of Food Consumption Studies, eds. M.E. Cameron & W.A. Van Staveren, Oxford University Press, NY, pp. 53-106.

Bingham, S.A., Gill, C., Welch, A., Day, K., Cassidy, A., Khaw, K.T., Sneyd, M.J., Key, T.J.A., Roe, L. & Day, N.E. 1994, 'Comparison of dietary assessment methods in nutritional epidemiology: weighed records v. 24-h recalls, food-frequency questionnaires and estimated diet records', Br J Nutr, vol. 72, pp. 619-43.

Bjorntorp, P. 1986, 'Fat patterning and disease: a review', in Human Body Composition and Fat Distribution: Report of an EC Workshop, London, December, 1985, ed. E.G. Norgan, Euro-Nut, Wageningen.

Black, A.B., Goldberg, G.R., Jebb, S., Livingstone, M.B.E., Cole, T.J. & Prentice, A.M. 1991, 'Critical evaluation of energy intake data using fundamental principles of energy physiology: 2. Evaluating the results of published surveys', Eur J Clin Nutr, vol. 45, pp. 583-599.

Black, A.B., Prentice, A.M., Goldberg, G.R., Jebb, S.A., Bingham, S.A., Livingstone, M.B. & Coward, W.A. 1993, 'Measurements of total energy provide insights into the validity of dietary measurements of energy intake', J Am Diet Assoc, vol. 75, pp. 572-9.

Bland, J.M. & Altman, D.G. 1986, 'Statistical methods for assessing agreement between two methods of clinical measurement', The Lancet, Feb 8, pp. 307-8.

Bland, J.M. & Altman, D.J. 1990, 'A note on the use of the intra-class correlation in the evaluation of agreement between two methods of measurement', Comput Biol Med, vol. 20, no. 5, pp. 337-340.

Block, G. 1982, 'A review of validations of dietary assessment methods', Am J Epidemiol, vol. 115, pp. 492-505.

Block, G., Hartman, A.M., Dresser, C.M., Carroll, M.D., Gannon, J. & Gardner, L. 1986, 'A data based approach to diet questionnaire design and testing', Am J Epidemiol, vol. 1 I, no. 3, pp. 453-69.

Block, G. 1989, 'Human dietary assessment: methods and issues', Prev Med, vol. 18, pp. 653-60.

Block, G., Hartman, A.M., Dresser, C.M., Carroll, M.D., Gannon, J. & Gardner, L. 1986, 'A data-based approach to diet questionnaire design and testing', Am J Epidemiol, vol. 124, no. 3, pp. 1453-69.

Block, G. & Hartman, A.M. 1989, 'Issues in reproducibility and validity of dietary studies', Am J Clin Nutr, vol. 50, pp. 1133-8.

Block, G., Clifford, C., Naughton, M.D., Henderson, M. & McAdams, M. 1989, 'A brief dietary screen for high fat intake', J Nutr Educ, vol. 21, pp. 199-207.

Block, G. & Subar, A. 1992, 'Estimates of nutritional intake from a food frequency questionnaire: The 1987 National Health Interview', J Am Diet Assoc, vol. 92, pp. 969-77.

Block, G., Thompson, F.E., Hartman, A.M., Larkin, F.A. & Guire, K.E. 1992, 'Comparison of two dietary questionnaires validated against multiple diet records during a 1-year period', J Am Diet Assoc, vol. 92, no, pp. 696-93.

Borenstein, M., Rothstein, H. & Cohen, J. 1997, Power and Precision: A Computer Program for Statistical Power Analysis and Confidence Intervals, Biostats, NJ, pp. 5, 22.

Bradburn, N.M., Rips, L.J. & Shevell, S.K. 1987, 'Answering autobiographical questions: The impact of memory and inference on surveys', Science, vol. 236, pp. 157-61.

Brekke, M.J., Brekke, M., Otter, E.S., Peters, J.R., Mullis, R.M. & Hunninghake, D.B. 1992, 'Appropriate days for measuring intake of fat and cholesterol', Ann Nutr Metab, vol. 36, pp. 318-27.

British Standards Institute 1979, Precision of Test Methods I: Guide for the Determination and Reproducibility of a Standard Test Method (BS 5497, Part 1), British Standards Institute, London, BSI.

Burema, J., van Staveren, W.A. & van der Brandt, P.A. 1988, 'Validity and reproducibility', in Manual on Methodology of Food Consumption Studies, eds. M.E. Cameron & W.A. Van Staveren, Oxford University Press, NY, pp. 68-92, 171-82.

Burke, B.S. 1947, 'The dietary history as a tool in research', J Am Diet Assoc, vol. 23, pp. 1041-6.

Buzzard, I.M., Asp, E.H., Chlebowski, R.T., Boyar, A.P., Jeffrey, R.W., Blackburn, G.L., Jochimsen, P.R., Scanlon, E.F. & Insull, W. 1990, 'Diet intervention methods to reduce fat intake: nutrient and food groups composition of self-selected low-fat diets', J Am Diet Assoc, vol. 90, no. 1, pp. 52-50, 53.

Buzzard, I.M. & Sievert, Y.A. 1994, 'Research priorities and recommendations for dietary assessment methodology', Am J Clin Nutr, vol. 59 (Suppl), pp. 275-80.

Byers, T., Marshall, J., Fiedler, R., Zielezny, M. & Graham, S. 1985, 'Assessing nutrient intake with an abbreviated dietary interview', Am J Epidemiol, vol. 122, no. 1, pp. 41-50.

Cade, J.E. 1988, 'Are diet records using household measures comparable to weighed intakes', J Hum Nutr Diet, vol. 1, no. 3, pp. 171-8.

Cade, J.E. & Margetts, B.M. 1988, 'Nutrient sources in the English diet: quantitative data from three English towns', Int J Epidemiol, vol. 17, no. 4, pp. 844-48.

Callmer, E., Riboli, E., Saracci, R., Akesson, B. & Lindgardes, F. 1993, 'Dietary assessment methods evaluated in the Malmo food study', Int J Intern Med, vol. 233, pp. 53-7.

157 Bibliography

Cameron, M.E. & van Staveren, W.A. eds. 1988, Manual on Methodology of Food Consumption Studies, Oxford University Press, NY, pp. 68-92, 171-82.
Carter, R.L., Sharbough, C.O. & Stapell, C.A. 1981, 'Reliability and validity of the 24 hour recall', J Am Diet Assoc, vol. 79, pp. 5472-7.
Cashel, K., English, R., Bennett, S., Berzins, J., Brown, G. & Magnus, P. 1986, National Dietary Survey of Adults: 1983, No. 1, Foods Consumed, Commonwealth Department of Health, Australian Government Publishing Service, Canberra.
Cashel, K. & Lester, I. 1987, NUTTAB, Commonwealth Department of Community Services and Health, Canberra.
Cashel, K., English, R. & Lewis, J. 1989, Composition of Foods, Australia, AGPS, Canberra.
Cashel, K. 1990, 'Compilation and scrutiny of food composition data', in Uses and Abuses of Food Composition Data. Supp to Food Aust, ed. H. Greenfield, vol. 42, no. 8, pp. S21-4.
Cashel, K.M. & Greenfield, H. 1997, 'Population nutrition goals and targets for Australia: Influences of new Australian Food Composition Data', J Food Comp and Anal, vol. 10, pp. 176-189.
Cohen, J. 1988, Statistical Power Analysis for the Behavioral Sciences, Lawrence Erlbaum Assoc, New Jersey, pp. 4, 54-56, 79-80, 103, 416, 420, 428.
Cummings, S.R., Block, G., McHenry, K. & Baron, R.B. 1987, 'Evaluation of two food frequency methods of measuring dietary calcium intake', Am J Epidemiol, vol. 126, pp. 796-802.
Cunningham, J.H. 1990, 'Sampling of foods for nutrient composition studies', in Uses and Abuses of Food Composition Data. Supp to Food Aust, ed. H. Greenfield, vol. 42, no. 8, pp. S16-17, 28.
Curran-Everett, D., Taylor, S. & Kafadar, K. 1998, 'Fundamental concepts in statistics: elucidation and illustration', J Neurophysiol, vol. 85, no. 3, pp. 775-86.
Curtis, A.B., Musgrave, K.O. & Klimis-Tavantis, D. 1992, 'A food frequency questionnaire that rapidly assesses intake of fat, saturated fat, cholesterol and energy', J Am Diet Assoc, vol. 92, no. 12, pp. 1517-1519.
Danforth, E. 1985, 'Diet and obesity', Am J Clin Nutr, vol. 41, pp. 1132-45.
Daniels, L. 1984, 'Collection of dietary data from children with cystic fibrosis: Some problems and practicalities', Human Nutr: Appl Nutr, vol. 38A, pp. 110-18.
Delcourt, C., Cubeau, J., Balkau, B. & Papoz, L. 1994, 'Limitations in the correlation coefficient in the validation of diet assessment methods', Epidemiol, vol. 5, no. 5, pp. 518-24.
Dennis, B. & Shifflett, P.A. 1985, 'A conceptual and methodological model for studying dietary habits in the community', Ecol Food Nutr, vol. 17, pp. 253-62.
Department of National Health and Welfare 1990, Action towards Healthy Eating: Canada's Guidelines for Healthy Eating and Recommended Strategies for Implementation, cat. no. H39-166/1990E, Minister of National Health and Welfare, Canada.
Diekhoff, G. 1992, Statistics for the Social and Behavioral Sciences: Univariate, Bivariate and Multivariate, Wm. C. Brown Publ., Dubuque, IA, pp. 220, 275.
Dobson, A.J., Bilijlevens, R., Alexander, H.M., Croce, N., Heller, R.F., Higginbotham, N., Pike, G., Plotnikoff, R., Russell, A. & Walker, R. 1993, 'Short fat questionnaire: A self-administered measure of fat-intake behaviour', Aust J Public Health, vol. 17, no. 2, pp. 144-9.
Drevon, C.A., Solvoll, K.M., Lund-Larsen, K., Sandstad, B., Tande, T., Baksaas, I. & Soyland, E. 1991, 'Plasma fatty acid pattern as a biological marker for dietary intake of long chain n-3 fatty acids', in Biomarkers of Dietary Exposure. Proceedings of the 3rd Meeting on Nutritional Epidemiology, eds. F.J. Kok & P. van't Veer, Smith Gordon, London, p. 93.
Dwyer, J.T. & Krall, E.A. 1988, 'The problem of memory in nutritional epidemiological research', J Am Diet Assoc, vol. 88, pp. 1250-7.
Eck, L.H., Klesges, R.C., Hanson, C.L., Slawson, D., Portis, L. & Lavasque, M.E. 1991, 'Measuring short-term dietary intake: development and testing of a 1-week food frequency questionnaire', J Am Diet Assoc, vol. 91, no. 8, pp. 940-5.
Eck, L.H., Klesges, L.M. & Klesges, R.C. 1996, 'Precision and estimated accuracy of two short-term food frequency questionnaires compared with recalls and records', J Clin Epidemiol, vol. 49, no. 10, pp. 1196-1200.
El Lazy, M. 1981, 'Dietary variability and its impact on nutritional epidemiology', J Chron Dis, vol. 36, pp. 237-49.
Engle, A., Lynn, L.L., Koury, K. & Boyar, A.P. 1990, 'Reproducibility and comparability of a computerised self-administered food frequency questionnaire', Nutr Cancer, vol. 13, pp. 282-92.
English, R., Cashel, K., Bennett, S., Berzins, J. & Magnus, P. 1987, National Dietary Survey of Adults. 1983 No. 1. Nutrient Intakes, AGPS, Canberra, Australia.
English, R. & Lewis, J. 1992, Nutritional Values of Australian Foods, Department of Community Services and Health, AGPS, Canberra.
Epstein, L.M., Reshef, A. & Abramson, J.F. 1970, 'Validation of a short dietary questionnaire', Isr J Med Sci, vol. 6, pp. 589-96.
Faggiano, F., Vineis, P., Cravanzola, D., Pisani, P., Xompero, G., Riboli, E. & Kaaks, R. 1992, 'Validation of a method for the estimation of food portion size', Epidemiol, vol. 3, no. 4, pp. 379-83.
Feskanich, D., Rimm, E.B., Giovannucci, E.I., Colditz, G.A., Stampfer, M.J., Litin, L.B. & Willett, W.C. 1993, 'Reproducibility and validity of food intake measurements from a semi-quantitative food frequency questionnaire', J Am Diet Assoc, vol. 93, no. 7, pp. 790-6.
Feskanich, D., Marshall, J., Rimm, E.B., Litin, L.B. & Willett, W.C. 1994, 'Simulated validation of a brief food frequency questionnaire', Ann Epidemiol, vol. 4, no. 3, pp. 257-8.
Feunekes, G.I., Van Staveren, W.A., De Vries, J.H., Burema, J. & Hautvast, J.G. 1993, 'Relative and biomarker-based validity of a food-frequency questionnaire estimating intake of fats and cholesterol', Am J Clin Nutr, vol. 58, no. 4, pp. 489-96.
Fleiss, J.L. 1981, Statistical methods for rates and proportions, Wiley and Sons, NY, pp. 143-55.
Fogelholm, M. & Lahti-Koski, M. 1991, 'The validity of a food use questionnaire in assessing the nutrient intake of physically active young men', Eur J Clin Nutr, vol. 45, pp. 267-72.
Forbes-Ewan, Ch., Morrissey, B.L., Gregg, G.C. & Walters, R.R. 1989, 'Use of doubly labeled water technique in soldiers training for jungle warfare', J Appl Physiol, vol. 67, no. 1, pp. 14-18.
Gardner, M.J. & Altman, D.G. eds. 1989, Statistics with Confidence: Confidence Intervals and Statistical Guidelines, British Medical Journal, London, pp. 4, 34, 77, 93-4.
Garrow, J.S. 1995, 'Validations of methods for estimating habitual diet; proposed guidelines [editorial]', Eur J Clin Nutr, vol. 49, no. 4, pp. 231-2.
Gersovitz, M., Madden, J.P. & Smicklas-Wright, H. 1978, 'Validity of the 24-hour dietary recall and a 7-day record for group comparisons', J Am Diet Assoc, vol. 73, pp. 48-55.
Goldberg, G.B., Black, A.E., Jebb, S.A., Cole, T.J., Murgatroyd, P.R., Coward, W.A. & Prentice, A.M. 1991, 'Critical evaluation of energy intake data using fundamental principles of energy physiology; 1. Derivation of cut-off limits to identify under-recording', Eur J Clin Nutr, vol. 45, pp. 569-581.
Goldbohm, R.A., van den Brandt, P.A., Brants, H.A., van't Veer, P., Al, M., Sturmans, F. & Hermus, R.J. 1994, 'Validation of a dietary questionnaire used in a large-scale prospective cohort study on diet and cancer', Eur J Clin Nutr, vol. 48, no. 4, pp. 253-65.
Goodwin, P.J. & Boyd, N.F. 1987, 'Critical appraisal of the evidence that dietary fat intake is related to breast cancer risk in humans', J Nat Cancer Inst, vol. 79, pp. 473-85.
Graham, S., Marshall, J., Haughey, B., Mittelman, A., Swanson, M., Zielezny, M., Byers, T., Wilkinson, G. & West, D. 1988, 'Dietary epidemiology of cancer of the colon in western New York', Am J Clin Epidemiol, vol. 128, pp. 490-503.
Greeley, S., Storbakken, L. & Magel, R. 1992, 'Use of a modified food frequency questionnaire during pregnancy', J Am Coll Nutr, vol. 11, no. 6, pp. 728-34.
Greenfield, H. & Southgate, D.A.T. 1992, 'Food composition data, production, management', Experiences in food composition studies at the national and international level', Proc Nutr Soc Aust, vol. 16, pp. 96-103.
Guthrie, H.A. 1984, 'Selection and quantification of typical food portions by young adults', J Am Diet Assoc, vol. 84, no. 12, pp. 1440-4.
Hankin, J.H., Stallones, R.A. & Messinger, H.B. 1968, 'A short dietary method for epidemiologic studies. III. Development of questionnaire', Am J Epidemiol, vol. 87, no. 2, pp. 285-98.
Hankin, J.H., Nomura, A.M., Lee, J., Hirohata, T. & Kolonel, L.N. 1983, 'Reproducibility of a diet history questionnaire in a case-control study of breast cancer', Am J Clin Nutr, vol. 37, pp. 981-5.
Hankin, J.H. 1988, 'Validation problems', in Manual on Methodology of Food Consumption Studies, eds. M.E. Cameron & W.A. Van Staveren, Oxford University Press, NY, pp. 183-89.
Hebert, J.R. & Miller, D.R. 1991, 'The inappropriateness of conventional use of the correlation coefficient in assessing validity and reliability of dietary assessment methods', Eur J Epidemiol, vol. 7, no. 4, pp. 339-43.
Hertzler, A.A. & McAnge, T.R. 1986, 'Development of an iron checklist to guide food intake', J Am Diet Assoc, vol. 86, no. 6, pp. 782-6.
Hjermann, I., Velve-Byre, K., Holme, I. & Leren, P. 1981, 'Effect of diet and smoking on the incidence of coronary heart disease. Report on the Oslo Study Group of a randomised trial in healthy men', Lancet, vol. ii, pp. 1303-10.
Horwath, C.C. 1990, 'Food frequency questionnaires: A review', Aust J Nutr Diet, vol. 47, no. 3, pp. 71-6.
Horwath, C.C. & Worsley, A. 1990, 'Assessment of the validity of a food frequency questionnaire as a measure of food use by comparison with direct observation of domestic food stores', Am J Epidemiol, vol. 131, no. 6, pp. 1059-67.
Horwath, C.C. 1993, 'Validity of a short food frequency questionnaire for estimating nutrient intakes in elderly people', Br J Nutr, vol. 70, pp. 3-14.
Hunter, D.J., Sampson, L., Stampfer, M.J., Colditz, G.A., Rosner, B. & Willett, W.C. 1988, 'Variability in portion sizes of commonly consumed foods among a population of women in the United States', Am J Epidemiol, vol. 127, pp. 1240-9.
Hunter, D. 1990, 'Biochemical indicators of dietary intake', in Nutritional Epidemiology, ed. W.C. Willett, Oxford University Press, NY, pp. 143-216.
Hunter, D.J., Rimm, E.B., Sacks, F.M., Stampfer, M.J., Colditz, G.A., Litin, L.B. & Willett, W.C. 1992, 'Comparisons of measures of fatty acid intake by subcutaneous fat aspirate, food frequency questionnaire and diet records in a free-living population of US men', Am J Epidemiol, vol. 135, pp. 418-27.
Irwig, L.M. & Simpson, J.M. 1989, 'Assessing agreement', Med J Aust, vol. 151, pp. 235-36.
Jain, M.G., Harrison, G.L., Howe, G.R. & Miller, A.B. 1982, 'Evaluation of a self-administered dietary questionnaire for use in a cohort study', Am J Clin Nutr, vol. 36, pp. 931-5.
Jain, M. 1989, 'Diet history: Questionnaire interview techniques used in some retrospective studies of cancer', J Am Diet Assoc, vol. 89, pp. 1647-52.
Karras, D.J. 1997, 'Statistical methodology: II. Reliability and variability assessment in study design, Part A', Acad Emerg Med, vol. 4, no. 1, pp. 64-71.
Kemppainen, T., Rosendahl, A., Nuutinen, O., Ebeling, T., Pietinen, P. & Uusitupa, M. 1993, 'Validation of a short dietary questionnaire and a qualitative fat index for the assessment of fat intake', Eur J Clin Nutr, vol. 47, no. 11, pp. 765-75.
Keys, A. 1980, Seven Countries: A Multivariate Analysis of Death and Coronary Heart Disease, Harvard University Press, Cambridge, pp. 252.
Kinlay, S., Heller, R.F. & Halliday, J.A. 1991, 'A simple score and questionnaire to measure group change in dietary fat intake', Prev Med, vol. 20, no. 3, pp. 378-88.
Kohlmeier, L. 1991, 'What you should know about your marker', in Biomarkers of Dietary Exposure, Proceedings of the 3rd Meeting on Nutritional Epidemiology, eds. F.J. Kok & P. van't Veer, Smith Gordon, London, pp. 15-16.
Kohlmeier, L. 1992, 'Problems and pitfalls of food-to-nutrient conversions', in Food and Health Data: Their Use in Nutrition Policy-Making. WHO Regional Publications, European series, No. 34. Copenhagen, WHO, pp. 73-84.
Krall, E.A. & Dwyer, J.T. 1987, 'Validity of a food frequency questionnaire and a food diary in a short-term recall situation', J Am Diet Assoc, vol. 87, pp. 1374-7.
Kristal, A.R., Shattuck, A.L., Henry, H.J. & Fowler, A.S. 1990a, 'Rapid assessment of dietary intake of fat, fiber and saturated fat: validity of an instrument suitable for community intervention research and nutritional surveillance', Am J Health Prom, vol. 4, pp. 288-95.
Kristal, A.R., Shattuck, A.L. & Henry, H.J. 1990b, 'Patterns of dietary behaviour associated with selecting diets low in fat: Reliability and validity of a behavioral approach to dietary assessment', J Am Diet Assoc, vol. 90, pp. 214-20.
Kristal, A.R., Abrams, B.F., Thornquist, M.D., Dosogra, L., Croyle, R.T., Shattuck, A.L. & Henry, H.J. 1990c, 'Development and validation of a food use checklist for evaluation of community nutrition intervention', Am J Publ Health, vol. 80, pp. 318-22.
Kristal, A.R., Beresford, S.A. & Lazovich, D. 1994, 'Assessing change in diet-intervention research', Am J Clin Nutr, vol. 59 (1 suppl), pp. 185S-9S.
Kuzma, J.W. 1992, Basic Statistics for the Health Sciences, 2nd edn, Mayfield Publishing Co, CA, pp. 200-202.
Larkin, F.A., Metzner, H.L., Thompson, F.E., Flegal, K.M. & Guire, K.E. 1989, 'Comparison of estimated intakes by food frequency and dietary records in adults', J Am Diet Assoc, vol. 89, pp. 215-23.
Laschinger, H.K. 1992, 'Intraclass correlations as estimates of interrater reliability in nursing research', West J Nurs Res, vol. 14, no. 2, pp. 246-52.
Last, J. 1988, A Dictionary of Epidemiology, 2nd edn, Oxford University Press, NY, pp. 132.
Lazarus, R., Baur, L., Webb, K. & Blyth, F. 1996, 'Body mass index in screening for adiposity in children and adolescents; systematic evaluation using receiver operator characteristic curves', Am J Clin Nutr, vol. 63, no. 4, pp. 500-6.

Lee, J., Kolonel, L.M. & Hankin, J.H. 1983, 'On establishing the interchangeability of different dietary-intake assessment methods used in studies of diet and cancer', Nutr Cancer, vol. 5, pp. 215-18.
Lee, J., Koh, D. & Ong, C. 1989, 'Statistical evaluation of agreement between two methods for measuring a quantitative variable', Comput Biol Med, vol. 19, pp. 61-70.
Lee, R.D. & Nieman, D.C. 1993, 'Measurement of diet', in Nutritional Assessment, WC Brown Comm Inc, Iowa, pp. 49-76, 103-20.
Lee-Han, H., McGuire, V. & Boyd, N.F. 1989, 'A review of the methods used by studies of dietary measurement', J Clin Epidemiol, vol. 42, no. 3, pp. 269-79.
Lewis, J. & Holt, R. 1992, NUTAB91-92 Nutrient Data Table for Use in Australia, National Food Authority, AGPS, Canberra.
Lewis, J., Milligan, G. & Hunt, A. 1995, NUTAB95 Nutrient Data Table for Use in Australia, National Food Authority, AGPS, Canberra.
Ling, A.M., Horwath, C. & Parnell, W. 1998, 'Validation of a short food frequency questionnaire to assess consumption of cereal foods, fruit and vegetables in Chinese Singaporeans', Eur J Clin Nutr, vol. 52, no. 8, pp. 557-64.
Lipid Research Clinics Program 1984, 'The Lipid Research Clinics Coronary Primary Prevention Trial Results. I. Reduction in the incidence of coronary heart disease', JAMA, vol. 251, no. 3, pp. 351-363.
Maclure, M. & Willett, W.C. 1987, 'Misinterpretation and misuse of the kappa statistic', Amer J Epidemiol, vol. 126, no. 2, pp. 161-69.
Marr, J.W. & Heady, J.A. 1986, 'Within and between-person variation in dietary surveys: Number of days needed to classify individuals', Hum Nutr: Appl Nutr, vol. 40A, pp. 347-64.
Marr, J.W. 1971, 'Individual dietary surveys: Purposes and methods', Wld Rev Nutr Diet, vol. 13, pp. 105-64.
Martin, M.J., Halley, S.B. & Browner, W.S. 1986, 'Serum cholesterol, blood pressure and mortality: implications from a cohort of 361,662 men', Lancet, vol. 2, pp. 933-6.
Martin, L.J., Lockwood, G.A., Math, M., Kristal, A.R., Kriukov, V., Greenberg, C., Shattuck, A.L. & Boyd, N.F. 1997, 'Assessment of a food frequency questionnaire as a screening tool for low fat intakes', Cont Clin Trials, vol. 18, pp. 241-250.
McMichael, A.J. 1988, 'Diet and large bowel cancer: prospects for prevention', in Preventing Cancer, ed. M. Tattersall, Australian Professional Publications, Mosman, NSW, pp. 129-40.
Mertz, W., Tsui, J.C., Judd, J.T., Reiser, S., Hallfrisch, J., Morris, E.R., Steele, P.O. & Lashley, E. 1991, 'What are people really eating? The relation between energy intake derived from estimated diet records and intake determined to maintain body weight', Am J Clin Nutr, vol. 54, pp. 291-5.
McLennan, W. & Podger, A. 1993, Apparent Consumption of Foodstuffs and Nutrients: Australia 1990-91, Australian Bureau of Statistics, Canberra.
McLennan, W. & Podger, A. eds. 1998, National Nutrition Survey: Nutrient Intakes and Physical Measurements Australia 1995, Catalogue No. 4805.0, Australian Bureau of Statistics and Department of Health and Family Services, Canberra.
Millar, B.D. & Beard, T.C. 1988, 'Avoidance of dietary sodium - a simple questionnaire', Med J Aust, vol. 149, pp. 190-2.
Morgan, K.J., Johnson, S.R. & Goungetas, B. 1987, 'Variability of food intakes. An analysis of a 12-day data series using persistence measures', Am J Clin Nutr, vol. 35, pp. 1258-68.


Mullen, B.J., Kranzler, N.J., Grivetti, H.G., Shutz, H.G. & Meiselman, H.L. 1984, 'Validity of a food frequency questionnaire for the determination of individual food intake', Am J Clin Nutr, vol. 39, pp. 136-43.
Munger, R.G., Folsom, A.R., Kushi, L.H., Kaye, S.A. & Sellers, T.A. 1992, 'Dietary assessment of older Iowa women with a food frequency questionnaire: nutrient intake, reproducibility and comparison with 24-hour recall interviews', Am J Epidemiol, vol. 136, no. 2, pp. 192-200.
Munro, B.H. 1997, Statistical Methods for Health Care Research, 3rd edn, Lippincott, Philadelphia, pp. 78, 248, 455.
Munro, S., Birze, I. & Samman, S. 1995, 'Evaluation of dietary assessment methods used in Australian lipid clinics: a pilot study', Aust J Nutr Diet, vol. 52, pp. 25-28.
Murphy, R.S. & Sempos, C. 1989, 'Design, consideration and dimensions of diet in the Third National Health and Nutrition Examination Survey (NHANES III)', in Epidemiology, Nutrition and Health, eds. L. Kohlmeier & E. Reising, Smith-Gordon, London, vol. 149, pp. 12-20.
Musgrave, K.O., Giambalvo, L., LeClerc, H.L. & Cook, R.A. 1989, 'Validation of a quantitative food frequency questionnaire for rapid assessment of dietary calcium intake', Am J Clin Nutr, vol. 89, pp. 1484-8.
National Health and Medical Research Council 1991, Recommended Dietary Intakes for Use in Australia, AGPS, Canberra, pp. 32.
National Health and Medical Research Council 1992a, Dietary Guidelines for Australians, AGPS, Canberra.
National Health and Medical Research Council 1992b, The Role of Polyunsaturated Fats in the Australian Diet, AGPS, Canberra.
Nelson, M., Hague, G.F., Cooper, C. & Bunker, V.W. 1988, 'Calcium intake in the elderly: Validation of a dietary questionnaire', J Hum Nutr Diet, vol. 1, pp. 115-27.
Nelson, M., Black, A.E., Morris, J.A. & Cole, T.J. 1989, 'Between- and within-subject variation in nutrient intake from infancy to old age: estimating the number of days required to rank dietary intakes with desired precision', Am J Clin Nutr, vol. 50, pp. 155-67.
Nes, M., Andersen, L.F., Solvoll, K., Sandstad, B., Hustvedt, B.E., Lovo, A. & Drevon, C.A. 1992, 'Accuracy of a quantitative food frequency questionnaire applied to elderly Norwegian women', Eur J Clin Nutr, vol. 46, no. 11, pp. 809-21.
Nunnally, J.C. 1978, Psychometric Theory, McGraw-Hill, NY.
Nutbeam, D., Wise, M., Bauman, A., Harris, E. & Leeder, S. 1993, Goals and Targets for Australia's Health in the Year 2000 and Beyond, Department of Public Health, University of Sydney, Sydney.
Pao, E.M., Fleming, K.H., Guenther, P.M. & Mickle, S.J. 1975, Foods Commonly Eaten by Individuals: Amount per Day and per Eating Occasion, Home Economic Research Report, No. 44, Consumer Nutrition Centre, Human Nutrition Information Service, Hyattsville, Maryland.
Paul, A. & Southgate, D.A.T. 1988, 'Conversion into nutrients', in Manual on Methodology of Food Consumption Surveys, eds. M.E. Cameron & W.A. van Staveren, Oxford University Press, London, pp. 121-143.
Paul, A. & Southgate, D.A.T. 1991, McCance and Widdowson's The Composition of Foods, Holland Biomedical Press, London.
Pekkarinen, M. 1970, 'Methodology in the collection of food consumption data', Wrld Rev Nutr Diet, vol. 12, pp. 145-71.
Pietinen, P., Hartman, A.M., Haapa, E., Rasanen, L., Haapakoski, J., Palmgren, J., Albanes, D., Virtamo, J. & Huttunen, J.K. 1988a, 'Reproducibility and validity of dietary assessment instruments. I. A self-administered food use questionnaire with a portion size picture booklet', Am J Epidemiol, vol. 128, pp. 655-66.
Pietinen, P., Hartman, A.M., Haapa, E., Rasanen, L., Haapakoski, J., Palmgren, J., Albanes, D., Virtamo, J. & Huttunen, J.K. 1988b, 'Reproducibility and validity of dietary assessment instruments: II: A qualitative food frequency questionnaire', Am J Epidemiol, vol. 128, no. 3, pp. 667-76.
Prentice, A.M., Black, A.E., Coward, W.A. & Cole, T.J. 1996, 'Energy intake in overweight and obese adults in affluent societies: an analysis of 319 doubly labeled water measurements', Eur J Clin Nutr, vol. 50, pp. 93-97.
Pryer, J.A., Vrijheid, M., Nicholls, R., Kiggins, M. & Elliot, P. 1997, 'Who are the 'low energy reporters' in the Dietary and Nutritional Survey of British adults?', Int J Epidemiol, vol. 26, pp. 146-154.
Remmel, P.S. & Benfari, R.C. 1980, 'Assessing dietary adherence in the Multiple Risk Factor Intervention Trial (MRFIT). II. Food record rating as an indicator of compliance', J Am Diet Assoc, vol. 76, pp. 357-60.
Richard, L. & Roberge, A.G. 1986, 'Validity of a short method based on food frequency and multiple regression to evaluate the nutrient intakes of French Canadian women', Nutr Res, vol. 6, pp. 17-27.
Rohan, T.E., Record, S.J. & Cook, M.D. 1987, 'Repeatability of estimates of nutrient and energy intake: The quantitative food frequency approach', Nutr Res, vol. 7, pp. 125-37.
Rutishauser, I. 1988, 'Making measurements: diet', Menzies Technical Report no. 3, pp. 89-120.
Sabry, J.H. 1988, 'Purposes of food consumption studies', in Manual on Methodology of Food Consumption Surveys, eds. M.E. Cameron & W.A. van Staveren, Oxford University Press, London.
Samet, J.M., Humble, C.G. & Skipper, B.E. 1984, 'Alternatives in the collection and analysis of food frequency interview data', Am J Epidemiol, vol. 120, no. 4, pp. 572-81.
Schatzkin, A., Greenwald, P., Byer, D.P. & Clifford, C.K. 1989, 'The dietary fat-breast cancer hypothesis is alive', JAMA, vol. 261, pp. 3284-7.
Schofield, W.N., Schofield, C. & James, W.P.T. 1985, 'Basal metabolic rate - Review and prediction, together with an annotated bibliography of source material', Hum Nutr: Clin Nutr, vol. 39C (Suppl), pp. 1-96.
Sempos, C.T., Johnson, N.E., Smith, E.L. & Gilligan, C. 1985, 'Effects of intraindividual and interindividual variation in repeated dietary records', Am J Epidemiol, vol. 121, no. 1, pp. 120-30.
Sempos, C.T., Briefel, R.R., Flegal, K.M., Johnson, C.L., Murphy, R.S. & Woteki, C.E. 1992, 'Factors involved in selecting a dietary survey methodology for national nutrition surveys', Aust J Nutr Diet, vol. 49, no. 3, pp. 96-100.
Serdula, M., Byers, T., Coates, R., Mokdad, A., Simoes, E.J. & Eldridge, L. 1992, 'Assessing consumption of high fat foods into single questions', Epidemiol, vol. 3, no. 6, pp. 503-8.
Shekelle, R.B., MacMillan-Shryock, A., Oglesby, P., Lepper, M., Stamler, J., Shugley-Liu, M.S. & Raynor, W.J. 1981, 'Diet, serum cholesterol, and death from coronary heart disease', New Eng J Med, vol. 304, no. 2, pp. 65-70.
Slinker, B.K. & Glantz, S.A. 1988, 'Multiple linear regression is a useful alternative to traditional analyses of variance', Am J Physiol, vol. 255, no. 3, pt 2, pp. 353-67.

Smith, W., Mitchell, P., Reay, E.M., Webb, K. & Harvey, P.W.J. 1998, 'Validity and reproducibility of a self-administered food frequency questionnaire in older people', Aust NZ J Public Health, vol. 22, no. 4, pp. 456-63.
Sobell, J., Block, G., Kowslowe, P., Tobin, J. & Andres, R. 1989, 'Validation of a retrospective questionnaire assessing diet 10-15 years ago', Am J Epidemiol, vol. 130, no. 1, pp. 173-87.
Sorenson, A.W., Caulkins, B.M., Connolly, M.A. & Diamond, E. 1985, 'Comparison of nutrient intake determined by four dietary instruments', J Nutr Educ, vol. 17, pp. 92-8.
Stockley, L. 1985, 'Changes in habitual food intake during weighed inventory surveys and duplicate diet collections. A short review', Ecol Food Nutr, vol. 17, pp. 263-269.
Strohmeyer, S.L., Massey, L.K. & Davison, M.A. 1989, 'A rapid dietary screening device for clinics', J Am Diet Assoc, vol. 84, pp. 428-32.
Stuff, J.E., Garza, C., O'Brian-Smith, E., Nichols, B.L. & Montandon, C. 1983, 'A comparison of dietary methods in nutritional studies', Am J Clin Nutr, vol. 37, pp. 300-6.
Stunkard, A.J. & Waxman, M. 1981, 'Accuracy of self-reports of food intake', J Am Diet Assoc, vol. 79, pp. 547-551.
Taylor, R.W., Keil, D., Gold, E.J., Williams, S.M. & Goulding, A. 1998, 'Body mass index, waist girth, and waist-to-hip ratio as indexes of total and regional adiposity in women. Evaluation using receiver operator characteristics curves', Am J Clin Nutr, vol. 67, pp. 44-9.
Taylor, R.W. & Goulding, A. 1998, 'Validation of a short food frequency questionnaire to assess calcium intake in children aged 3 to 6 years', Eur J Clin Nutr, vol. 52, no. 6, pp. 465-5.
Thompson, F.E., Metzner, L., Lamphiear, D.E. & Hawthorne, V.M. 1990, 'Characteristics of individuals and long term reproducibility of dietary reports: The Tecumseh Diet Methodology Study', J Clin Epidemiol, vol. 43, no. 11, pp. 1169-78.
Thompson, R.I. & Margetts, B.M. 1993, 'Comparison of a food frequency questionnaire with a 10-day weighed record in cigarette smokers', Int J Epidemiol, vol. 22, no. 5, pp. 824-33.
Tjonneland, A.T., Haraldsdottir, J., Overvad, K., Stripp, C., Ewertz, M. & Moller-Jensen, O. 1992, 'Influence of individually estimated portion size data on the validity of a semiquantitative food frequency questionnaire', Int J Epidemiol, vol. 21, no. 4, pp. 770-7.
US Department of Health and Human Services 1990, Healthy People 2000. Conference Ed, USDHHHS (PHS), Washington DC.
Vailas, L.I., Blankenhorn, D.H., Selzer, R.H. & Johnson, R.L. 1987, 'A computerized quantitative food frequency analysis for the clinical setting: use in documentation and counselling', J Am Diet Assoc, vol. 87, no. 11, pp. 1539-43.
van Assema, P., Brug, J., Kok, G. & Brants, H. 1992, 'The reliability and validity of a Dutch questionnaire on fat consumption as a means to rank subjects according to individual fat intake', Eur J Canc Prev, vol. 1, pp. 375-80.
Voss, S., Kroke, A., Klipstein-Grobusch, H. & Boeing, H. 1998, 'Is macronutrient composition of dietary intake data affected by underreporting? Results from the EPIC-Potsdam study', Eur J Clin Nutr, vol. 52, pp. 119-126.
Welten, D.C., Kempner, H.C., Post, G.B. & Van Staveren, W.A. 1995, 'Comparison of a quantitative dairy questionnaire with a dietary history in young adults', Int J Epidemiol, vol. 24, no. 4, pp. 764-70.
Wheeler, C., Rutishauser, I., Conn, J. & O'Dea, K. 1994, 'Reproducibility of a meal-based food frequency questionnaire. The influence of format and time interval between questionnaires', Eur J Clin Nutr, vol. 48, pp. 795-809.


Wheeler, C.E., Rutishauser, I.H.E. & O'Dea, K. 1995, 'Comparison of nutrient intake data from two food frequency questionnaires and weighed records', Aust J Nutr Diet, vol. 52, no. 3, pp. 140-8.
WHO Study Group 1990, 'Diet, nutrition and the prevention of chronic diseases', WHO Technical Report Series no. 797, WHO, Geneva.
Wilkins, J.A., Boland, F.J. & Albinson, J. 1991, 'A comparison of male and female university athletes and non athletes on eating disorder indices: Are athletes protected', Int J Sport Behav, vol. 14, no. 2, pp. 129-43.
Willett, W.C. 1990, Nutritional Epidemiology, ed. W.C. Willett, Oxford University Press, NY, pp. 41, 52-68, 69-91, 92-126, 245-271.
Willett, W.C., Stampfer, M.J., Underwood, B.A., Speizer, F.E., Rosner, B. & Hennekens, C.H. 1983, 'Validation of a dietary questionnaire with plasma carotenoid and alpha tocopherol levels', Am J Clin Nutr, vol. 38, pp. 631-9.
Willett, W.C., Sampson, L., Stampfer, M.J., Rosner, B., Bain, C., Witschi, J., Hennekens, C.H. & Speizer, F.E. 1985, 'Reproducibility and validity of a semiquantitative food frequency questionnaire', Am J Epidemiol, vol. 122, pp. 51-65.
Willett, W.C., Reynolds, R.D., Cottrell-Hoehner, M.S., Sampson, L. & Browne, M.S. 1987, 'Validation of a semi-quantitative food frequency questionnaire: Comparison with a 1-year diet record', J Am Diet Assoc, vol. 87, pp. 43-7.
Willett, W.C., Sampson, L., Browne, M.L., Stampfer, M.J., Rosner, B., Hennekens, C.H. & Speizer, F.E. 1988, 'The use of a self-administered questionnaire to assess diet four years in the past', Am J Epidemiol, vol. 127, no. 1, pp. 188-99.
Wilson, P. & Horwath, C. 1996, 'Validation of a short food frequency questionnaire for assessment of dietary calcium intake in women', Eur J Clin Nutr, vol. 50, pp. 220-228.
Worsley, A., Baghurst, K.I. & Leitch, D.R. 1984, 'Social desirability response bias and dietary inventory responses', Hum Nutr: Appl Nutr, vol. 38A, pp. 29-35.
Worsley, T. 1981, 'Psychometric aspects of language dependent techniques in dietary assessment', Trans Menzies Found, vol. 3, pp. 161-92.
Woteki, C.E., Hitchcock, D.C., Briefel, R.R. & Winn, D.M. 1988, 'National Health and Nutrition Examination Survey - NHANES', Nutr Today, vol. 23, no. 1, pp. 25-27.
Zimmet, P.Z. 1988, 'Primary prevention of diabetes mellitus', Diabetes Care, vol. 11, pp. 258-62.
Zimmet, P.Z., King, H.O.M. & Bjorntorp, S.P.A. 1986, 'Obesity, hypertension, carbohydrate disorders and the risk of chronic diseases', Med J Aust, vol. 145, pp. 256-62.

Appendices

Appendix 1: Determining sample size and power

For correlation analysis

Table 1.1 shows estimated sample sizes (i.e. the number of paired observations) needed for several power levels using varying correlation coefficients, which reflect effect size. Several power levels are shown because those used in clinical studies involving quantitative data may differ from those conventionally used in qualitative behavioural studies.

Table 1.1 An example of a sample size planning table for undertaking correlation analysis using a two-tailed test (figures in the table represent n to detect r)

Power        0.7               0.8*              0.9
ES = r       0.3   0.5   0.7   0.3   0.5   0.7   0.3   0.5   0.7
a2 = .01     103    34    15   125    41    18   158    51    22
a2 = .05      67    23    10    85    28    12   113    37    16

Derived from Cohen 1988, 103: ES = effect size; a2 = significance level, two-tailed; numbers in the table represent required sample size; * 0.80 power values used by convention in behavioural studies.

The number of subjects required changes according to the magnitude of the effect size, as seen in Table 3.4. Because the direction of any difference in the reproducibility and validity study could not be assumed in advance, a two-tailed test (a2) was chosen.
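As a rough cross-check on a planning table of this kind, the number of paired observations needed to detect a correlation r can be approximated with the Fisher z transformation. The sketch below is illustrative only: it uses a standard large-sample approximation rather than the method behind Cohen's tables, so its results can differ from tabulated values by a few subjects, and the function name is an assumption introduced here.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate number of paired observations to detect correlation r.

    Assumption: two-tailed test, Fisher z large-sample approximation
    (not the procedure used to build Cohen's tables).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = NormalDist().inv_cdf(power)          # normal quantile for the power
    fisher_z = math.atanh(r)                       # Fisher z transform of r
    return ((z_alpha + z_power) / fisher_z) ** 2 + 3


if __name__ == "__main__":
    for r in (0.3, 0.5, 0.7):
        print(r, math.ceil(n_for_correlation(r)))
```

Rounding the result up gives a conservative whole number of subjects; larger correlations and lower power levels both reduce the required n, which is the pattern the table shows.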

Determining power

A power between 80% and 90% is also common in clinical studies (Altman 1991, p. 456). This means that the power of the study to detect the smallest effect is at least 80% to 90%. A power of 80% (0.8) equates to a 20% Type II error rate, or one chance in five of failing to reject the null hypothesis when it is, in fact, not true. The Type II error rate may be as low as 1% (power = 0.99). The higher the power, the larger the sample size needed (Altman 1991, p. 456).

Altman (1991, p. 456) proposed a graphical method using a nomogram for calculating the sample size or power of a test. The advantage of this nomogram is that it can be used to determine sample size before a study, as well as at the end of a study, to assess the effect of the final sample size on the power of the test used. For continuous data from paired or within-person studies, the variables needed to use the nomogram (Altman 1991, pp. 456-7) to determine the sample size needed for a given power include:

• the standard deviation (SD) of the changes or differences between the methods (SDdiff between the FC and FR)
• a clinically relevant difference (δ)
• the significance level (α, two-sided)
• the power (1 − β)

To use the nomogram, a variable called the standardised difference needs to be calculated:

Standardised difference = δ / SDdiff
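The relationship between the standardised difference (the clinically relevant difference divided by the SD of the paired differences), sample size and power can be sketched numerically. This is not Altman's nomogram: it is a normal approximation for a paired comparison, offered under that assumption, and the function names and the example figures (a 10 g/day difference with an SD of differences of 20 g/day) are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist


def standardised_difference(delta, sd_diff):
    """Clinically relevant difference divided by the SD of the differences."""
    return delta / sd_diff


def pairs_needed(delta, sd_diff, alpha=0.05, power=0.80):
    """Approximate number of pairs for a two-sided paired comparison.

    Assumption: normal approximation, not a reading from the nomogram.
    """
    d = standardised_difference(delta, sd_diff)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / d) ** 2)


def approx_power(delta, sd_diff, n_pairs, alpha=0.05):
    """Approximate power achieved with a given number of pairs."""
    d = standardised_difference(delta, sd_diff)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(d * math.sqrt(n_pairs) - z_alpha)


if __name__ == "__main__":
    # Hypothetical inputs: delta = 10 g fat/day, SD of differences = 20 g/day
    print(pairs_needed(10, 20))
    print(round(approx_power(10, 20, 19), 2))
```

The second function mirrors the end-of-study use described above: given the final number of pairs, it estimates the power the comparison actually had.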

Sample size estimates for linear (least squares) regression

Cohen (1988, 444) provides a formula for determining the sample size needed for linear regression analysis, given an effect size index derived from L (or λ) tables. The effect size is related to the number of independent variables at a given level of significance. For regression analysis, Cohen (1988, 413-14) defines a small effect as an R² of 0.02, a moderate effect as an R² of 0.13 and a large effect as an R² of 0.30. The formula for estimating sample size is:

N = λ(1 − R²)/R² + μ + 1

where:
N = total sample
R² = squared multiple correlation
λ = effect size index
μ = number of independent variables

Source: Munro 1997, 248

Table 2 shows estimated sample sizes (i.e. the number of paired observations) needed for regression analysis using this formula for a power level of 0.8 and varying R² values, which reflect effect size. Where there is only one independent variable, as in simple linear regression of the FC against the FR, R² is equivalent to r² (Cohen 1988, 428). As R² values do not appear to have been reported in dietary validation studies of FFQs that measure fat intake, the squared r values from the correlations in Table 2.4, p. 24 were used to estimate sample size for regression analysis in this study.
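The sample size formula for regression given in this section can be sketched as a small function. The λ values used in the example (7.8 for a two-tailed significance level of .05 and 11.7 for .01, with one independent variable and a power of 0.80) are assumptions taken as rounded large-sample values; they should be checked against Cohen's λ tables before being relied on. The function name is illustrative.

```python
def n_for_regression(r_squared, lam, n_predictors=1):
    """Total sample size from N = lam * (1 - R^2) / R^2 + u + 1.

    lam          -- value read from L (lambda) tables (supplied by the caller)
    n_predictors -- number of independent variables (u)
    """
    return lam * (1 - r_squared) / r_squared + n_predictors + 1


if __name__ == "__main__":
    # Assumed lambda values for one predictor at power 0.80 (see lead-in).
    for label, lam in (("a2 = .01", 11.7), ("a2 = .05", 7.8)):
        sizes = [round(n_for_regression(r2, lam)) for r2 in (0.02, 0.12, 0.49)]
        print(label, sizes)
```

Because λ is passed in rather than computed, the same function can be reused for other power levels or numbers of independent variables once the corresponding table value is known.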

Table 2 An example of estimated sample sizes for regression analysis using a power of 0.8 (numbers in table represent estimated sample size)

R² = effect size index (ES)

             R² = 0.02    R² = 0.12    R² = 0.26     R² = 0.30    R² = 0.49
             (r = 0.14)   (r = 0.35)   (r = 0.53)*   (r = 0.54)   (r = 0.70)*
a2 = .01        575           88           32            31           14
a2 = .05        384           59           22            20           10

Derived from Co/zen 1988, 416, 420: ES= effect size, a2:Significance level, two-tailed; R2 = multiple squared correlation coefficient, r = correlation coefficient, number in table represents required sample size, * = correlation previous reported in dietary validation studies ofFFQs measuring fat in the short term. Appendix 2: Background information to subjects involved in the pilot study "" Consent forms

An evaluation of a new questionnaire about food intake Background information Some Australians eat a lot of fat. High fat intakes are linked with many diet-related diseases including overweight and obesity, diabetes, certain types of cancer and heart disease. The development and evaluation of new resources to measure and monitor fat intake is important for helping people identify and modify their fat intake and help prevent and reduce their risk of these diet-related problems. A new questionnaire has been developed called a Fat Intake Checklist to help Australians who need to reduce fat in their diet for weight loss or for other reasons. You are invited to participate in a pilot study to evaluate the presentation of the Fat Intake Checklist. This will require you to spend approximately 5-10 minutes filling out the checklist and also completing an evaluation form. We are interested in your views about the clarity of instructions and your understanding of the instructions, presentation and format. The completed checklists and evaluation forms will be confidential and anonymous and securely stored at the University of Canberra in the Faculty of Applied Science. It is a requirement for any study undertaken at the university that you sign a consent form. If you are willing to participate, please sign in the space provided below.

Consent
I have read and understood the information provided. I am willing to participate in this survey.
Signature                    Date:
THANK YOU FOR YOUR COOPERATION
Vicki Deakin, Chief Investigator
Faculty of Applied Science
PO Box 1 BELCONNEN ACT 2616
University of Canberra
Phone: 201 2567 (BH)

Appendix 3: Evaluation form for the pilot fat checklist

Evaluation of the fat intake checklist
Your comments on the Fat Intake Checklist would be greatly appreciated. Please complete the questions below. Circle your responses or write directly on the paper where indicated. This will take about 5 minutes to complete.

QUESTION | YOUR RESPONSE

1. Did you read the instructions as directed before commencing the questionnaire? | Yes (now go to Q2) / No (now go to Q4)

2. Did you generally understand the instructions? | Yes (now go to Q3) / No (now go to Q4)

3. How well do you think you understood the instructions? (Circle one response only) | excellent / good / fair / poor / not sure

4. Did you check the standard amount column when answering each question? (Tick one box only) | all the time / most of the time / some of the time / rarely / never

5. Did you change the standard amount at any time? | Yes (now go to Q6) / No (now go to Q7)

6. List any foods that you changed in the space below

7. Did you have any problems recalling what you ate in the previous 3 days? (circle the response that best fits) | most of the time / some of the time / not really / never / don't know

8. Were there any foods on the list that you had difficulty in determining the actual amount consumed? (circle the response that best fits) | a lot of foods / several foods / a few foods / no foods / don't know

9. Can you list these foods in the space below.

10. If requested, which of the 2 methods listed below would you prefer to record your food intake over a 3 day period? | a. weighing individual food items or b. estimating the amount consumed using cups, spoons

Thank you for your time

Vicki Deakin

Consultant Dietitian

Cardiovascular Health Risk Management Clinic

Appendix 4: Background information

An evaluation of a new questionnaire about food intake

Background information
Some Australians eat a lot of fat. High fat intakes are linked with many diet-related diseases including overweight and obesity, diabetes, certain types of cancer and heart disease. The development and evaluation of new resources to measure and monitor fat intake is important for helping people identify and modify their fat intake and help prevent and reduce their risk of these diet-related problems. A new questionnaire has been developed called a Fat Intake Checklist to help Australians who need to reduce fat in their diet for weight loss or for other reasons. You are invited to participate in the evaluation of this Fat Intake Checklist. This will require you to spend approximately 5-10 minutes filling out the checklist and also completing a record of your food intake for three days. You will be required to attend two brief interviews (around 5-10 minutes) with one of the data collectors for instructions and cross-checking of information. The completed checklists and food records will be confidential and anonymous and securely stored at the University of Canberra in the Faculty of Applied Science.

Vicki Deakin, Chief Investigator
Faculty of Applied Science
PO Box 1 BELCONNEN ACT 2616
University of Canberra

Appendix 5: Consent forms

An evaluation of a new questionnaire about food intake
The development and evaluation of new resources to measure food intake is important for helping people identify and modify their food choices and to help prevent and reduce their risk of diet-related problems. A new questionnaire has been developed called a Fat Intake Checklist to help people who need to reduce fat in their diet for weight loss or for other reasons. You are invited to participate in the evaluation of this Fat Intake Checklist. This will require you to spend approximately 5-10 minutes filling out the checklist and also completing a record of your food intake for three days. The completed checklists and food records will be confidential and anonymous and securely stored at the University of Canberra in the Faculty of Applied Science. It is a requirement for any study undertaken at the university that you sign a consent form. If you are willing to participate, please sign in the space provided below.

Consent
I have read and understood the information provided. I am willing to participate in this survey.
Signature                    Date:
THANK YOU FOR YOUR COOPERATION
Vicki Deakin, Chief Investigator
Faculty of Applied Science
PO Box 1 BELCONNEN ACT 2616
University of Canberra
Phone: 201 2567 (BH)

Appendix 6: Ethics approval

UNIVERSITY OF CANBERRA

Project No: 91132

7 November 199(· ·

Ms Vicki Deakin Faculty of Applied Science

Dear Ms Deakin, The Human Ethics Committee has approved your application to experiment using human subjects for a project entitled: The validity and reliability of a food frequency questionnaire.

Yours faithfully,

Therese Stubbs Secretary, Human Experimentation Ethics Committee

Kirinari Street Bruce ACT, PO Box 1 Belconnen ACT 2616 Australia
Telephone +61 6 (06) 201 5111 Facsimile +61 6 (06) 201 5999

Appendix 7: Example of a completed fat checklist (final version)

FAT INTAKE CHECKLIST

I.D. NUMBER ............ DATE COMPLETED ............ FORM NUMBER ............

INSTRUCTIONS

1. Please circle the number of times you have eaten the following foods over the PREVIOUS THREE DAYS, not including today.
2. Trimmed means the obvious fat or chicken skin has been cut off. Untrimmed means the fat and chicken skin have been eaten.
3. It is important that you fill in the questionnaire as accurately as possible, indicating every time you have eaten any of the foods mentioned. Make sure you answer every question.
4. Weights of foods are based on cooked weight (weight after cooking).
5. The column on the far right indicates the grams of fat for one standard serve. Do not write in this column.
6. You are free to comment on your food type and standard amount if they differ substantially from the description given (e.g. you may have had 1/2 standard serve). Just write a brief note next to the item.

FOOD | STANDARD AMOUNT | NUMBER OF TIMES EATEN (circle the total number) | FAT PER SERVE

EXAMPLE ONLY

untrimmed steak, pork, lamb (not chops) | 1 small serve (120 g) | 0 1 2 3 4 5 or more | [22]
   untrimmed chicken (roasted, grilled, fried) | 2 drumsticks or 1 wing plus 1 thigh or 3-4 slices roast meat

Over the 3 days this person had eaten 1 large steak, 2 fried drumsticks and 3-4 slices of roast pork so the total is 4 (ie. counting 2 serves for the steak, 1 for the chicken and 1 for the pork) and the fat intake is 22 x 4 = 88 grams of fat.
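The scoring rule in this example (serves eaten multiplied by grams of fat per serve, summed over items) can be sketched as follows. The function name and data layout are hypothetical, and dividing the three-day total by 3 to obtain a daily figure is an assumption based on the checklist covering three days.

```python
def checklist_fat(responses):
    """Sum fat (g) over checklist items.

    responses: iterable of (fat_per_serve_g, times_eaten) pairs.
    """
    return sum(fat_per_serve * times for fat_per_serve, times in responses)


# The worked example above: 4 serves of an item worth 22 g fat per serve.
total_three_days = checklist_fat([(22, 4)])
print(total_three_days)        # total grams of fat over the three days
print(total_three_days / 3)    # assumed conversion to g fat/d
```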

START THE QUESTIONNAIRE HERE

1. Liver, kidney, brains, tripe, tongue | Breakfast serve (100 g) | 0 1 2 3 4 5 or more | [8]

2. untrimmed steak, pork, lamb (not chops) | 1 small (120 g) | 0 1 2 3 4 5 or more | [22]
   untrimmed chicken (incl. BBQ and roasted) | 2 drumsticks or 1 wing plus 1 thigh or 3-4 slices roast meat

3. Crumbed fried chicken (eg. Kentucky Fried) | 2 drumsticks or 1 wing plus 1 thigh | 0 1 2 3 4 5 or more | [29]

4. untrimmed pork chops, lamb chops | 2 chops (120 g) or 1 large pork chop | 0 1 2 3 4 5 or more | [33]


5. Sliced meat: fatty cuts eg. bacon | 2 rashers | 0 1 2 3 4 5 or more | [25]
   salami, luncheon meats | 2 slices

6. Sausages, frankfurts | 2 thin or 1 thick | 0 1 2 3 4 5 or more | [22]

7. Meat pie, pastie, sausage roll | 1 pie or 2 small rolls | 0 (1) 2 3 4 5 or more | [illegible]

8. Savoury pie with pastry made with cream eg. quiche | 1 individual pie or 1 large slice (180 g) | 0 1 2 3 4 5 or more | [50]

9. Savoury pie with pastry made without cream eg. spinach pie, quiche | 1 individual pie or 1 large slice (180 g) | 0 1 2 3 4 5 or more | [37]

10. Fish or fish cakes or fingers: crumbed, battered, oven fried | 1 medium piece (150 g) or 3 small cakes | 0 (1) 2 3 4 5 or more | [27]

11. undrained canned fish, eg. tuna | 2 tablespoons | 0 1 2 3 4 5 or more | [illegible]
    or sardines in oil | 3-4 sardines

12. Ordinary and thickened cream, sour cream (Remember soups/sauces/mornays) | 1 tablespoon | 0 1 2 3 4 5 or more | [7.5]

13. Light cream, light sour cream (Remember soups/sauces/mornays) | 1 tablespoon | 0 1 2 3 4 5 or more | [illegible]

14. full cream yoghurt, plain or flavoured | 1 small carton (200 g) | 0 1 2 3 4 5 or more | [7]

15. Ice cream (not special diet variety) | 2 scoops (120 g) | 0 1 2 3 4 5 or more | [9]

16. full cream milk, flavoured milk, milkshake | 1 medium glass (200 mL) | 0 1 2 (3) 4 5 or more | [7.5]

17. Cheese, cheddar, edam etc. | 1 cube matchbox size (30 g) | 0 1 2 3 4 5 or more | [10]

18. Rich cake, cheesecake, black forest | 1 av. shop serve (120 g) | 0 1 2 3 4 5 or more | [illegible]

19. Pastry: croissant | 2 croissants | 0 1 2 3 4 5 or more | [30]
    apple strudel, sweet pie | 1 average piece (150 g)

20. Biscuits: sweet/chocolate coated/shortbread/cream filled | 2 biscuits | 0 1 (2) 3 4 5 or more | [8]
    cheese crackers | 4-5 crackers

21. Fried rice | 1/2 cup | 0 1 2 3 4 5 or more | [8]

22. Pizza | 1/3 of medium size | 0 1 2 3 4 5 or more | [17]

23. Hot chips | 1 medium carton | 0 (1) 2 3 4 5 or more | [30]


24. Chiko roll, dim sim, spring roll | 2 small or 1 large | 0 1 2 3 4 5 or more | [15]

25. Potato chips, twisties, cheezels, corn chips | 25 g bag | 0 1 2 3 4 5 or more | [11]
    health food bar | 40 g bar

26. Chocolate bar (Mars, Chokito etc.) | 50 g | 0 1 2 3 4 5 or more | [16]
    doughnut | 2 doughnuts

27. Toasted muesli, muesli flakes | 2 tablespoons | 0 1 2 (3) 4 5 or more | [6]

28. Gravy made from meat dripping (not made from Gravox) | 1 tablespoon | 0 1 2 3 4 5 or more | [3]

FOODS CONTAINING FAT BUT WHICH CAN BE USED REGULARLY IN MODERATION

29. trimmed steak, pork, lamb (not chops) | 1 small (120 g) | 0 1 2 3 4 5 or more | [12]
    trimmed chicken | 2 drumsticks or 1 wing plus 1 thigh or 3-4 slices roast meat

30. trimmed pork chops, lamb chops | 2 chops (140 g) or 1 large pork chop | 0 1 (2) 3 4 5 or more | [17]

31. Mince meat in the form of hamburger, rissole, bolognese, lasagne etc. | 1 pattie (60 g) or 1/2 cup mince | 0 1 2 3 4 5 or more | [11]

32. Meat (trimmed) casserole or meat stew, eg. chinese meat dishes/curry/goulash (no veg.) | 1 cup | 0 (1) 2 3 4 5 or more | [13]

33. Sliced meat: ham, beef, lamb, chicken (trimmed) - remember sandwiches | 2 slices | 0 (1) 2 3 4 5 or more | [6]

34. Egg | 1 egg | 0 1 2 3 4 5 or more | [5]

35. low fat milk, Hi-Lo (not Shape or Skim) | 1 medium glass (200 mL) | 0 1 2 3 4 5 or more | [illegible]

36. Nuts: any type including | 9-10 nuts (approx 20 g) | 0 1 2 3 4 5 or more | [11]
    peanut butter | 1 tablespoon

37. Light cake: plain sponge | 1 small piece | 0 1 2 3 4 5 or more | [9]
    scone | 2 scones
    pikelet, pancake | 3 pikelets, 1 pancake


Answer the following questions in the space provided.

ADDED FAT

38. On average, how many teaspoons of butter or margarine do you have per day? (remember sandwiches, toast, crackers and cooking) ............ teaspoons [4]

39. How much mayonnaise, salad dressing, or oil have you used over the previous three days? Do not include low fat or no oil dressing ............ teaspoons [4]

40. When meat or vegetables are fried or roasted, what are they cooked in most often? (circle the answer which best fits)

Butter / dripping [beef or lamb]
Lard/copha
Cooking or table margarine
Polyunsaturated margarine
Polyunsaturated vegetable oils
Not polyunsaturated oils
Cooked in own juices / never fry or roast meat
other
If other, please specify ............

41. How much fat have you used over the previous three days to fry or roast meat or vegetables? (circle the answer which best fits)

less than 1 teaspoon / 1 teaspoon / 2 teaspoons / 1 dessertspoon / 1 tablespoon / more than 1 tablespoon (How many? ......) [4]

42. Do you add butter or margarine to vegetables or meat after cooking? (circle one answer only)

always I e§cti~l often I rarely I never

43. Please list below whether you have eaten any other foods over the previous three days that may contain fat and were not mentioned in this questionnaire. Use household measures (eg. cup, teaspoon, small serve) to describe the amount of food eaten. For example, coconut, coconut cream, pâté, duck, goose, ghee, puddings, cream substitute, non-dairy coffee whitener ......

...... ~· ...... - ...

Your approximate total fat intake in grams is

Thanks for your cooperation

Vicki Deakin Consultant Nutritionist

Appendix 8: Instructions for recording food records by weighed methods

1. Weighed food records (used in Group 2)

INSTRUCTIONS FOR A THREE-DAY FOOD RECORD

Read these instructions carefully before completing the food records on FORM A.

Background information
Obtain a record of your food and beverage consumption on any consecutive Sunday, Monday and Tuesday by weighing all foods and beverages consumed on those days. Ensure that your appointment with Vicki or Maria is on the next day (ie Wednesday) to complete the fat checklist. Record all foods and beverages on the attached forms (FORM A). Detailed instructions are outlined in the method below and on the next page.

Method
Weigh individual foods and beverages consumed (using the scales provided) and record their exact description and weight (in grams) on Form A. It makes no difference for analysis whether the food is raw or cooked but please tell us if, and how, it was cooked (eg grilled, fried etc.). Weigh each item separately using the kitchen scales provided and, where weighing is inappropriate or impossible, such as when eating out, estimate the amounts in household measures (eg. tablespoons, cups and household measures provided). Do not simply record a hamburger on Form A but enter each ingredient using both weights and estimated household amounts as shown in the table below.

Food                        Household amount   Weight (grams)
bread roll                  1 average          60
butter                      1 teaspoon         5 (estimated)
hamburger pattie, grilled   1 average          85
tomato sauce                1 tablespoon       22
pickles, sweet              1 tablespoon       28

Standard measures
Standard household measures are listed in the box to assist you. If you measure foods by volume, the serving of food can be converted to grams using these rough equivalents.
1 teaspoon (5 g)
1 dessertspoon (10 g)
1 tablespoon (20 g)
1 ounce (30 g)

Thank you for your assistance,
Vicki Deakin, University of Canberra

FORM A: 3-DAY FOOD INTAKE RECORD

Date:                          Age: ______ yrs

Code number (office use): ___  Sex:

Height: ______ m               Weight: ______ kg

Sunday

Meal or beverage | Household amount | Weight (g) | Waste (g) | Amount eaten (g)

Appendix 9: Instructions for recording food records by estimated methods

2. Estimated food records

INSTRUCTIONS FOR A THREE-DAY FOOD RECORD

Read these instructions carefully before completing the food records on FORM A.

Background information
Obtain a record of your food and beverage consumption on any consecutive Sunday, Monday and Tuesday by estimating, in household measures (eg. tablespoons, cups and household measures provided), all foods and beverages consumed on those days. Ensure that your next appointment at the clinic is on the next day (ie Wednesday) to complete the fat checklist. Record all foods and beverages on the attached forms (FORM A). Detailed instructions are outlined in the method below and an example of FORM A is provided on the next page.

Method
Estimate the amounts of all foods and beverages consumed and record their exact description on Form A. It makes no difference for analysis whether the food is raw or cooked but please tell us if, and how, it was cooked (eg grilled, fried etc.). Estimate the amount of each item consumed separately in household measures (eg. tablespoons, cups and household measures provided). Do not simply record a hamburger on Form A but enter each ingredient using estimated household amounts as shown in the table below.

Food                        Household amount
bread roll                  1 average
butter                      1 teaspoon
hamburger pattie, grilled   1 average
tomato sauce                1 tablespoon
pickles, sweet              1 tablespoon

Standard measures
Standard household measures are listed in the box to assist you. If you measure foods by volume, the serving of food can be converted to grams using these rough equivalents.
1 teaspoon (5 g)
1 dessertspoon (10 g)
1 tablespoon (20 g)
1 ounce (30 g)

INSTRUCTIONS FOR FILLING OUT A FOOD RECORD Read these instructions carefully before completing the food record • The food record is to be filled in for three consecutive days: Sunday, Monday and Tuesday.

• Carry the recording sheet with you so you can fill in all the foods and beverages consumed at the time of consumption.

• Do not trust your memory by trying to complete the record at the end of the day.

• Include all food, drink, medication and supplements.

• Provide details of the quantity in household measures (eg cups, spoons) of each item of foods and beverages consumed. If known, report the actual weight of food consumed

Bread, wholemeal 2 sandwich slices

Milk Shape™ average glass about 200 mL

Cornflakes Kelloggs™ cornflakes 1 cup and , 1 teaspoon

Milk on cornflakes ½ cup

• Be specific about the description of the foods eaten by providing as many details as possible. For example,

Orange juice Sweetened, 1 average glass

Toasted muesli no added fat 1 cup

Yoghurt, fruit Ski™ light 1 small carton

• When a dish contains a number of different foods combined (eg omelette, stew, salad, and casserole) list each ingredient and its quantity. For example:

Omelette:

Eggs 2 medium size

Matured cheddar cheese ½ cup

Tomato 1/3 medium size

Celery 1/3 stick

• If the food is cooked, include the method of cooking. For example:

fried egg 1 medium

grilled fish 1 small fillet

baked (in oil) potato 1 medium size

boiled pumpkin 1/2 cup or 1 scoop

• Now check the example on the next page before recording food intake.


FOOD RECORD: EXAMPLE ONLY (This example is not a recommended diet)
DATE ... write the date on each sheet

MEAL | FOOD AND DRINK | DESCRIPTION | QUANTITY

BREAKFAST
orange juice | sweetened | 2 glasses
muesli | toasted | 1 cup
milk | skimmed | ½ cup
toast | wholemeal | 2 slices
butter | | 1 teaspoon
jam | strawberry | 2 teaspoons
coffee | white, no sugar | 1 cup

SNACKS IN-BETWEEN
banana | average size | 1
yoghurt, fruit | Ski lite | 1 small carton

LUNCH
salad sandwich | white bread | 2 slices
butter | | 1 teaspoon
lettuce | | 1 large leaf
tomato | | 2 slices
cheddar cheese | | 1 slice
corn chips | | 50 g packet
flavoured milk | coffee Moove | 250 mL

SNACKS IN-BETWEEN
snack pack | sultanas, peanuts | 25 g packet

DINNER
chicken | baked with skin | 1 drumstick + ½ breast
gravy | home-made with flour | ½ cup
corn | tinned | 2 tablespoons
peas | boiled | 3 tablespoons
potato | mashed with margarine | 1 scoop
orange juice | sweetened | 2 small glasses
apple pie and cream | | 1 small piece + 2 tablespoons cream

SNACK
Milo | made with Shape milk | 1 cup

MEDICATION & SUPPLEMENTS
Vitamin C tablet | | 1 x 250 mg


FORM A: THREE DAY FOOD RECORD

Date:                           Age: ______ yrs

Code number (office use): ____  Sex:

Height: ______ m                Weight: ______ kg

SUNDAY

Meals | Food or beverage | Description | Quantity consumed

Snacks or drinks before breakfast

Breakfast

Snacks in-between

Lunch

Snacks in between

Dinner

Snacks

Appendix 10: Measures of agreement using Bland-Altman techniques
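The quantities reported in this appendix (mean difference, SD of the differences, standard errors, limits of agreement and their confidence intervals) follow Bland and Altman (1986). A minimal sketch of those calculations is given here under stated assumptions: the function name is hypothetical, the SD is the usual sample SD (n − 1 denominator), and the t value is passed in by the caller from t tables rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(differences, t_value):
    """Summarise paired differences in the style of Bland and Altman (1986)."""
    n = len(differences)
    d_bar = mean(differences)          # mean difference
    sd = stdev(differences)            # SD of the differences (n - 1 denominator)
    se_mean = sqrt(sd ** 2 / n)        # SE of the mean difference
    se_limit = sqrt(3 * sd ** 2 / n)   # approximate SE of each limit of agreement
    lower, upper = d_bar - 2 * sd, d_bar + 2 * sd
    return {
        "mean_difference": d_bar,
        "sd_diff": sd,
        "ci_mean": (d_bar - t_value * se_mean, d_bar + t_value * se_mean),
        "limits_of_agreement": (lower, upper),
        "ci_lower_limit": (lower - t_value * se_limit, lower + t_value * se_limit),
        "ci_upper_limit": (upper - t_value * se_limit, upper + t_value * se_limit),
    }


# Hypothetical differences (g fat/d) for five subjects; t for df = 4, two-tailed 0.05.
summary = bland_altman([1, 2, 3, 4, 5], t_value=2.776)
print(summary["mean_difference"], summary["limits_of_agreement"])
```

Passing t as an argument keeps the sketch free of third-party dependencies; a statistics library could supply the critical value instead.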

Calculation of mean differences, SD of the differences, limits of agreement and precision of limits of agreement for fat intake (g fat/d) between test 1 and test 2 using the technique described by Bland and Altman (1986)

Measure | Reproducibility: Group 1 (n=49, matched pairs) | Criterion validity: Group 2 (n=42) | Criterion validity: Group 3 (n=19)
Mean difference (d̄), g fat/d | +9.8844 (test 1 minus test 2) | +1.7667 (FR minus FC) | +3.87 (FR minus FC)
SD of the differences (SDdiff) | 33.4002 | 25.6827 | 16.7205
SE of the mean difference, √(SD²/n) | 4.7715 | 3.9629 | 3.8320
t value for 95% CI | t=2.10, df=48, 2-tailed (0.05) | t=2.021, df=41, 2-tailed (0.05) | t=2.120, df=18, 2-tailed (0.05)
95% CI for the mean difference, d̄ − (t × SE of d̄) to d̄ + (t × SE of d̄) | +0.2907 (lower bound) to +19.4780 (upper bound) | −7.2366, +8.7700 | −4.1801, +11.938
Limits of agreement, lower limit = d̄ − 2SDdiff, upper limit = d̄ + 2SDdiff | −56.916, +76.685 g fat/d | −50.6073, +52.1321 | −29.5621, +37.3299

Precision of the estimated limits of agreement (CIs for the upper and lower limits are calculated)
SE of d̄ ± 2SD = √(3SD²/n) | (not given) | 6.8639 | 6.64405
CI for lower limit, (d̄ − 2SDdiff) ± (t × SE of d̄ − 2SD) | −72.4 to −38.9 g fat/d | −64.47, −36.728 | −43.6475, −15.4767
CI for upper limit, (d̄ + 2SDdiff) ± (t × SE of d̄ + 2SD) | +58.7 to +92.0 g fat/d | +38.260, +66.000 | +23.2445, +51.41526

Coefficient of repeatability (95% of the differences should be less than 2SD from the mean difference) | SD = 33.4002, 2SD = 66.8 | na | na
SD = square the mean differences, add them up, divide by n, and take the square root.

SD = standard deviation, CI = confidence interval, SDdiff = SD of the mean differences, SE = standard error of the mean differences, t = t tables, df = degrees of freedom, na = not applicable

Appendix 11: Differences in fat intake (g fat/d) of individual food items on the fat checklist between Test 1 and Test 2

Independent samples t-test values for each food item between Test 1 and Test 2 of the fat checklist, based on responses of the 49 subjects in Group 1 (n=49, df=96)

Question representing food item on fat checklist | Number of subjects consuming food items in Test 1 and Test 2 | Mean difference, Test 1 minus Test 2 (g fat/d) | 95% CI of the differences | Standard error of the difference | t | p
Liver, kidney, brains | 4 | +0.7 | +0.001, +1.3 | 0.3 | 2.1 | 0.04
Untrimmed steak, pork, chicken, lamb (not chops) | 24 | +5.8 | -0.6, +12.3 | 3.1 | 1.8 | 0.08
Crumbed fried chicken | 5 | +1.8 | -0.8, +4.4 | 1.3 | 1.4 | 0.17
Untrimmed pork chops, lamb chops | 7 | -0.7 | -5.4, +4.0 | 2.4 | -0.3 | 0.77
Sliced meat: fatty cuts eg bacon, salami, luncheon meat | 38 | +2.6 | -6.2, +11.4 | 4.4 | 0.6 | 0.57
Sausages, frankfurts | 19 | -2.7 | -7.4, +2.1 | 2.4 | -1.1 | 0.26
Meat pie, sausage roll | 24 | -3.2 | -8.5, +2.2 | 2.7 | 0.6 | 0.24
Savoury pie with pastry made with cream (eg quiche) | 7 | +2.0 | -4.2, +8.3 | 3.2 | 0.8 | 0.52
Savoury pie with pastry made without cream (eg quiche) | 10 | +2.6 | -3.5, +8.8 | 3.1 | -0.5 | 0.40
Fish or fish cakes or fingers, crumbed, battered, oven fried | 11 | -1.1 | -5.8, +3.6 | 2.4 | -0.6 | 0.64
Undrained canned fish (eg tuna, sardines in oil) | 7 | -1.1 | -4.1, +2.3 | 1.7 | -0.6 | 0.51
Ordinary and thickened cream, sour cream | 27 | -0.6 | -2.6, +1.4 | 1.0 | 1.9 | 0.60
Light cream, light sour cream | 14 | +0.7 | -.004, +1.5 | 0.4 | -0.1 | 0.06
Full-cream yoghurt, plain or flavoured | 24 | -0.1 | -2.5, +2.3 | 1.2 | 0.3 | 0.91
Ice-cream | 32 | +0.4 | -2.3, +3.1 | 1.4 | 0.6 | 0.79
Full-cream milk, flavoured milk, milkshake | 63 | +2.1 | -5.0, +9.3 | 3.6 | 0.7 | 0.56
Cheese: cheddar, edam | 74 | +1.8 | -3.1, +6.7 | 2.5 | 1.7 | 0.47
Rich cake, cheesecake, black forest cake | 15 | +7.9 | -1.3, +17.2 | 4.7 | 0.5 | 0.09
Pastry: croissant, apple strudel, sweet pie | 23 | +2.0 | -5.7, +9.8 | 3.9 | 0.2 | 0.60
Biscuits: sweet/chocolate-coated, shortbread, cream-filled, cheese crackers | 36 | +0.7 | -4.6, +5.9 | 2.6 | -0.4 | 0.81
Fried rice | 14 | -0.3 | -2.2, +1.5 | 0.9 | 0.2 | 0.70
Pizza | 12 | +1.3 | -2.7, +3.4 | 1.5 | 0.3 | 0.82
Hot chips | 25 | +1.0 | -6.2, +8.3 | 3.6 | -0.6 | 0.78
Chiko roll™, dim sim, spring roll | 11 | -0.9 | -4.2, +2.3 | 1.6 | 0.4 | 0.58
Potato crisps, Twisties™, corn chips | 47 | +0.7 | -2.5, +3.8 | 1.6 | 0.7 | 0.68
Chocolate bar, doughnut | 33 | +2.5 | -4.7, +9.7 | 3.6 | 0.4 | 0.49
Toasted muesli | 26 | +0.7 | -2.2, +3.6 | 1.4 | 0 | 0.64

Appendix 12: Calculations for sensitivity, specificity and predictive values of the FC

                    FR ≥70 g fat/d   FR <70 g fat/d   Total
FC ≥70 g fat/d      17 (a)           5 (b)            22 (a+b)
FC <70 g fat/d      6 (c)            33 (d)           39 (c+d)
Total               23 (a+c)         38 (b+d)         61

Sensitivity = a/(a+c) = 17/23 = 0.739 (true positives)
Specificity = d/(b+d) = 33/38 = 0.868 (true negatives)
PPV = a/(a+b) = 17/22 = 0.773
NPV = d/(d+c) = 33/39 = 0.846
Likelihood ratio of a positive test for detecting greater than 70 g fat from the FC = a/(a+c) divided by b/(b+d) = 5.6
Likelihood ratio of a negative test for detecting greater than 70 g fat from the FC = c/(a+c) divided by d/(b+d) = 0.3
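These calculations can be checked with a short sketch using the cell counts from the 2 × 2 table (a = 17, b = 5, c = 6, d = 33). The function name and the dictionary keys are hypothetical; the formulas are the standard ones stated above.

```python
def diagnostic_measures(a, b, c, d):
    """2 x 2 table: a = true positives, b = false positives,
    c = false negatives, d = true negatives (FC judged against FR)."""
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "prevalence": (a + c) / (a + b + c + d),
    }


results = diagnostic_measures(a=17, b=5, c=6, d=33)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```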

Frequency of greater than 70 g fat intake in this group = 23/61 = 37.7%

Appendix 13: Frequency distribution of responses in Group 1

Frequency distribution of responses (%) to fat checklists for Group 1 (n=49 subjects) in Test 1 and Test 2 (Figures in the table represent the number of subjects selecting each frequency option)

Columns: Food items | Q | Number of all subjects consuming food item (n=98), Test 1 and 2 combined | Alterations to food serve sizes (Test 1, Test 2) | Test 1 (n=49): 0, 1, 2, 3, 4, 5 or >5 | Test 2 (n=49): 0, 1, 2, 3, 4, 5 or >5

Liver, kidney, brains (n=41) 1 4 45 4 49

Untrimmed steak, pork, chicken, lamb (not chops) 2 24 32 10 6 1 42 3 3 1

Crumbed fried chicken 3 5 45 4 48 I

Untrimmed pork chops, lamb chops 4 7 46 2 1 45 3 1

Sliced meat: fatty cuts eg bacon, salami, luncheon meat 5 38 29 14 4 2 31 13 3 2

Sausages, frankfurts 6 19 41 7 1 38 7

Meat pie, sausage roll 7 24 39 9 1 35 10 4 • Savoury pie with pastry made with cream (eg quiche) 8 7 45 3 1 46 3

Savoury pk with pastry made without cream (eg quiche) 9 10 43 4 2 45 3 1*

Fish or fish cakes or fingers, crumbed, battered, oven fried 10 11 45 2 2 42 6 1

Undrained canned fish (eg tuna, sardines in oil) 11 7 46 2 I 45 2 I I

Ordinary and thickened cream, sour cream 12 27 36 10 3 35 8 6

Light cream, Iight sour cream 13 14 39 6 4 44 4

Full-cream yoghun, plain or flavoured 14 24 36 9 2 I I 37 6 3 3

Ice-cream 15 32 33 11 3 2 33 12 3 1

Full-cream milk. flavoured milk, milkshake 16 63 4 3 20 6* 7 6 5 4* 18 8 . IS 7 2 3" /,I

Cheese: cheddar, edam 17 74 1 10 15• 10 II 3 14 12 13 7 3

Rich-cake, cheesecake, black forest cake 18 15 3 2 339 5* 4 I 40 4 1

Pastry: croissant, apple strudel, sweet pie 19 23 36 10* 2 I 39 6 4

Biscuits: sweetlchocolate-coated, s~ortbread, cream-filled, 20 36 14 9 13 6 3 4* 13 10 12 8 3 3 cheese crackers

Fried rice 21 14 42 5 2 42 4 2 1

Pizza 22 12 42 6 I 44 3 2

Hot chips 23 25 37 9 2 I 36 I* 10 2

Chiko roJlT"', dim sim, spring roll 24 11 44 3* 2 43 3 2 I

Potato crisps, Twisties™, com chips 25 47 1 23 18* 5 2 I 28 10 8 3 • Chocolate bar, doughnut 26 33 I 17 3 5 24 12 10 2 I

Toasted muesli 27 26 I 34 6 2 7* 37 6 2 2 I

Gravy made form meat dripping 28 12 43 5 I 43 5 I

Trimmed steak, pork, lamb (not chops), trimmed chicken 29 52 23 122 13 I 23 20 5 I

Trimmed pork chops, lamb chops 30 13 43 5 1 42 6 1

Mince meat in the form of hamburger, rissoles, bolognaise, 31 38 29 16 4 31 15 3 lasagne

Meat (trimmed) casserole or meat stew: eg Chinese meat 32 21 41 7 I 36 II 2 dishes, curry, goulash (no veg)

Sliced meat: ham, beef, lamb, chicken 33 36 27 13 8 1 35 8 4 2

Egg 34 36 28 121 5 4 33 9 5 I 2 Low fat milks, Hi loT"' (not Shape™ or skimmed) 35 40 26 44 6 10 2 I 31 7 4 3 I 2

Nuts: any type including peanut butter 36 44 I 26 101 6 6 I 28 II 5 3 2 0 • Light cake: plain sponge, scone, pikelet, pancake, muffin 37 22 40 6 2 1 37 6 4 2

Added margarine, butter 38 81 4 2 4 9* 13 14 6 3* 3 13.' 15 7 8 3* • 1.1

Added oil, salad dressing, mayonnaise 39 35 1 35 7 5 1 28 6 10 2 2 1*9

Total number (n) of alterations made to serve sizes 19 11

missing data; * = sites where alterations in serve sizes were made

Appendix 14: Mean fat intake (g fat/d) from individual food items on the fat checklist for Group 2 and Group 3

Group2 Group3 (n=42) (n=l9) Item Mean daily fat intake Mean daily fat intake Food item (g fat/d) (SD) (g fat/d) (SD) Ql 0 0.1 (0.6) Liver, kidney, offal ~

Q2 2.9 (6.1) 3.9 (7.9) Untrimmed beef, pork, lamb or chicken (with skin)

Q3 0 0 Crumbed fried chicken Q4 0.5 (2.4) 4.6 (8.5) Untrimmed chops, lamb, pork

Q5 5.6 (6.6) 3.3 (5.7) Sliced meat - fatty cuts, bacon, salami
Q6 3.7 (9.1) 3.1 (5.1) Sausages, frankfurts
Q7 2.3 (6.2) 1.7 (3.4) Meat pie

Q8 0.4 (2.6) 0 Savoury pie made with cream 0
Q9 0.6 (3.8) -0.13 Savoury pie (quiche) without cream 0.6 (2.8)
Q10 0.9 (2.9) 0.27 Crumbed fish 0.1 (0.3)
Q11 0.9 (3.8) 0.30 Fish canned in oil 0.2 (1.1)
Q12 1.1 (2.2) 0.01 Cream 0.07 (0.3)
Q13 0.2 (0.4) 0.08 Light cream, sour cream 0.2 (0.5)
Q14 0.6 (1.9) 0.22 Yoghurt (full-cream) 0.07 (0.3)
Q15 1.5 (2.4) 0.15 Ice-cream 1.1 (3.5)
Q16 5.6 (5.3) 0.32 Full-cream milks 1.4 (2.3)
Q17 5.5 (4.9) 0.13 Cheese -3.2 (3.6)
Q18 1.3 (4.0) 0.22 Rich cake, cheesecake 0.7 (3.1)
Q19 1.7 (3.8) 0.35 Pastry 2.9 (6.5)
Q20 2.6 (4.9) 0.27 Biscuits 2.0 (6.5)
Q21 0.8 (2.1) 0.61** Fried rice 0.6 (1.1)
Q22 1.1 (3.1) 0 Pizza 0
Q23 2.1 (4.7) 0.4 Hot potato chips 2.2 (5.3)
Q24 0.4 (1.3) 0.09 Chiko roll, dim sim, spring roll etc 0.8 (2.5)
Q25 0.9 (1.9) 0.8 (1.3) Potato crisps, Twisties™, corn chips

Q26 02.7 (4.1) 1.3 (2.2) Chocolate

Q27 0.5 (2.1) 0.3 (1.0) Muesli toasted

Q28 0 0.1 (0.3) Gravy

Q29 4.2 (4.8) 3.7 (4.1) Trimmed beef, lamb, pork, chicken ~

Q30 1.5 (3.5) 1.5 (3.2) Trimmed chops

Q31 1.1 (1.8) 1.8 (2.0) Mincemeat

Q32 1.1 (2.3) 0.5 (1.4) Meat trimmed for casserole

Q33 1.4 (1.9) 1.8 (2.0) Slice lean meat

Q34 1.7 (3.0) 0.5 (1.4) Eggs

Q35 1.4 (2.6) 0.9 (1.7) Milk (low fat)

Q36 1.7 (4.1) 1.7 (3.3) Nuts, peanut butter

Q37 1.7 (3.0) 0.8 (1.4) Light cake, pancakes, pikelets

Q38 9.7 (5.4) 10.7 (7.7) Butter, margarine

Q39 1.1 (2.2) 0.7 (1.0) Salad dressing, mayonnaise

*P 0.05, ** P

Food item (grouped) | Grouped food items | Individual food items | Food included in 'food items'

Fatty meat cuts, untrimmed meat | Q1 to Q6
   Q1 Liver and 'offal'
   Q2 Untrimmed steak, pork, lamb (not chops)
   Q3 Crumbed fried chicken
   Q4 Untrimmed pork chops, lamb chops
   Q5 Sliced meat: fatty cuts
   Q6 Sausages, frankfurts
Meat pie, sausage roll | Q7
Quiches (with or without cream) | Q8 and Q9
   Q8 Savoury pie with pastry made with cream
   Q9 Savoury pie with pastry made without cream
Fatty fish | Q10 and Q11
   Q10 Fish cakes, fish fingers, crumbed battered fish
   Q11 Undrained canned fish in oil
Full cream dairy products | Q12 to Q17
   Q12 Ordinary and thickened cream
   Q13 Light cream, light sour cream
   Q14 Full-cream yoghurt
   Q15 Ice-cream
   Q16 Full-cream milk, flavoured milk, milkshake
   Q17 Cheese
Cakes and biscuits | Q18 to Q20
   Q18 Rich cake, cheesecake, Black Forest cake
   Q19 Pastry: croissants, apple strudel, apple pie
   Q20 Biscuits
Takeaways and snack foods | Q21 to Q26
   Q21 Fried rice
   Q22 Pizza
   Q23 Hot chips
   Q24 Chiko roll, dim sim, spring roll
   Q25 Potato crisps, Twisties, corn chips
   Q26 Chocolate bar, doughnut
Lean meats | Q29 to Q33
   Q29 Trimmed steak, pork, lamb (not chops), chicken (with skin)
   Q30 Trimmed pork chops, lamb chops
   Q31 Mincemeat
   Q32 Meat (trimmed) for casserole, meat stew
   Q33 Sliced meat: ham, beef
Low-fat milk | Q35
Added fat | Q38 and Q39
   Q38 Added butter, margarine
   Q39 Added oil, salad dressing, mayonnaise