UNCLASSIFIED//FOUO

Introduction to Statistical Principles Part I LTC Renee Cole, PhD, RDN, LDN

Purpose

The goal of this training session is to gain (or refresh) basic knowledge of statistical principles.

Why do we care?

• Identify statistical tests needed to:
  – Effectively answer your research question
  – Assess data collected
  – Improve ability to critically appraise the quality of evidence-based literature

Note: These sessions present commonly used statistical methods. They are not meant to be all-inclusive.

[email protected]; 508‐233‐5808 UNCLASSIFIED//FOUO 2 of 37 April 2016

References

References that may be helpful:
1. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, NJ: Pearson/Prentice Hall; 2009.
2. Field AP. Discovering Statistics Using SPSS: (and Sex, Drugs and Rock “N” Roll). 3rd ed. Los Angeles: SAGE Publications; 2009.
3. Hinton PR, ed. SPSS Explained. London; New York: Routledge; 2004.
4. Dawson GF. Easy Interpretation of Biostatistics: The Vital Link to Applying Evidence in Medical Decisions. 1st ed. Philadelphia, PA: Saunders/Elsevier; 2008.

Acknowledgements

• Dr. Steven Allison, PT, retired Air Force Colonel
• Dr. John Childs, PT, Air Force Lt Col
• Dr. Shane Koppenhaver, PT, Army LTC

Some slide materials were extracted from their presentations for use in this 3-Part series

Objectives

• Describe variable scales of measurement
• Describe the principles involved in & hypothesis testing
• Determine the appropriate statistical tests:
  – Descriptive analysis
  – Correlation analysis
  – T-test analysis
  – ANOVA & Repeated Measures models

1st and Foremost…

• Methods & statistical tests must be able to address the research question
• The type of statistical methods will depend upon:
  • Research design
  • Variables (independent vs. dependent)
  • Scales of measurement
  • Normality & homogeneity of data
• Always compute descriptive statistics to paint the picture of your sample population / setting

Classifications of Statistics

• Descriptive – What are the characteristics of?
• Comparative – What’s the difference between?
• Correlation – What’s the relationship (association) between?
• Regression – What predicts?
• Reliability – How reproducible are scores, a technique or device?

• Parametric (inferential) – With underlying assumptions
• Nonparametric – Does not meet assumptions

• Univariate – Using a single dependent variable
• Multivariate – Simultaneous use of multiple dependent variables

Making Choices

• If you already know all the statistical tests and how they are used, it’s “easy” …
• For the rest of us:
  – Broad categories help to clarify our thoughts
  – Algorithms help to pick the right test (see attachments)
  – Remember you will get better with practice
  – Statistical programs do most of the work

Variable Scales of Measurement

• Nominal ("naming") variables

• Ordinal ("ordered") variables

• Interval (equal intervals) variables

• Ratio (equal intervals, true zero) variables

Variable Scale of Measurement

Why Bother Identifying the Variable Scale of Measurement?

• Required to choose the right statistical test
• Helps to determine whether the parametric or nonparametric version of a statistical test is appropriate

Four Parametric Assumptions

1. The sample is randomly drawn from the target population
2. Data are normally distributed
3. Homogeneity of variance – SD of group 1 ≈ SD of group 2
4. Data are on interval/ratio scales (continuous)

• May be justifiable to violate assumptions
  – Random sampling of a population is complex and costly (random sampling ≠ random group assignment)
  – Accounted for by the “robust” nature of inferential tests
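As a rough first look at assumption 3, the SDs of two groups can be compared directly. The sketch below is illustrative only: the group scores are made up, the helper name sd_ratio is hypothetical, and a simple ratio is not a formal test of homogeneity of variance (a statistical package offers dedicated tests for that).

```python
from statistics import stdev

def sd_ratio(group1, group2):
    """Return the ratio of the larger sample SD to the smaller one."""
    sd1, sd2 = stdev(group1), stdev(group2)
    return max(sd1, sd2) / min(sd1, sd2)

# Hypothetical scores for two groups (made up for illustration).
group1 = [21, 24, 26, 23, 25, 22]
group2 = [20, 27, 23, 25, 22, 24]

ratio = sd_ratio(group1, group2)
# A ratio near 1 is consistent with "SD of group 1 ≈ SD of group 2".
print(f"SD ratio (larger / smaller): {ratio:.2f}")
```

A ratio far from 1 is a cue to look more closely at the data before relying on a parametric test.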

Nominal Variables

• Dichotomous or categorical
• Discrete and mutually exclusive categories
• To analyze, must assign a # to each category / response for identification only (dummy coding)
• Examples:
  – Sex (1=Male, 2=Female) or Deployed (1=Yes, 2=No)
  – Race (1=Caucasian, 2=Black/AA, 3=Asian, etc.)
  – Medical Condition (1=DM2, 2=HTN, 3=CVD, etc.)
• Math: descriptive counts only (frequencies or percent)
• No quantitative value
  – A computed value will not make sense (e.g. Sex mean value = 1.3)
  – Report frequency (n=200 men) or percent (70% men)
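The point about nominal codes can be sketched in a few lines of Python. The ten coded responses below are made up for illustration; only the 1=Male, 2=Female coding comes from the slide.

```python
from collections import Counter

# Hypothetical nominal data using the slide's coding: 1 = Male, 2 = Female.
sex_codes = [1, 1, 2, 1, 2, 1, 1, 2, 1, 1]
labels = {1: "Male", 2: "Female"}

counts = Counter(sex_codes)
n_total = len(sex_codes)

# Frequencies and percentages are the meaningful summaries for nominal data.
summary = {
    labels[code]: (n, 100 * n / n_total)
    for code, n in sorted(counts.items())
}
for label, (n, pct) in summary.items():
    print(f"{label}: n={n} ({pct:.0f}%)")

# The mean of the codes can be computed, but it has no interpretation.
meaningless_mean = sum(sex_codes) / n_total
print(f"Mean of codes = {meaningless_mean} (not interpretable)")
```

The codes only identify categories, so swapping them (1=Female, 2=Male) would change the "mean" without changing anything about the sample.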

Ordinal Variables

• Rank-ordered categories
• Relationship for adjacent categories (> or <)
  – Zero < trace < poor < fair < good < normal
  – “Unequal intervals”
• Dummy codes for variables required for analysis
• Examples:
  – Rank (1=E1-E4, 2=E5-E9, 3=WO1-CW5, 4=O1-O3, 5=O4-O6, etc.)
  – Satisfaction (1=Unsatisfied, 2=Neither Unsat nor Sat, 3=Satisfied)
  – BMI Category (1=Underweight, 2=Normal weight, 3=Overweight, 4=Obese)
• Math: descriptive counts only (frequencies)
• Note: Survey items are typically ordinal, but when a set of survey items is totaled and treated as a score, the scale is often considered continuous

Interval Variables

• Rank-ordered
• Equal intervals
• No true zero, but considered ‘continuous’
• Examples:
  – Years (B.C. or A.D.), Temp (°F or °C), or Time
• Math: add or subtract (no ratios)
  – No dummy codes necessary!
• Often displayed as Mean ± Std Deviation (SD)

Ratio Variables

• Interval scale with true zero
• Examples:
  – Steps/day, Calories, Age, or Height
• Math: all operations, including ratios
  – No dummy codes necessary!
• Often displayed as Mean ± Std Deviation (SD)

In SPSS, Interval & Ratio variables are treated the same = continuous

Let’s Practice…

How can you manipulate these variables to change the scale?

• Education
• Calories
• Passed a Test
• Behavioral Risk Score
• Military Rank
• Weight Status
• Age

Education Possibilities
• Continuous
  – Total # of years of education
• Ordinal
  – 1= High School Diploma
  – 2= Associates Degree
  – 3= Bachelor’s Degree
  – 4= Masters / Grad Degree
• Nominal
  – 1= Received BS diploma
  – 2= Did Not receive BS diploma

Can convert from continuous to nominal or ordinal, but cannot go in the opposite direction
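The one-way nature of this conversion can be sketched with BMI. The category codes (1=Underweight to 4=Obese) and the normal (18-24.9) and overweight (≥25) ranges appear elsewhere in these slides; the obese cutoff of 30 and the sample values are assumptions added for illustration.

```python
def bmi_to_ordinal(bmi):
    """Continuous BMI -> ordinal code (1=Underweight, 2=Normal, 3=Overweight, 4=Obese)."""
    if bmi < 18.0:
        return 1
    if bmi < 25.0:
        return 2
    if bmi < 30.0:  # assumed cutoff, not stated in the slides
        return 3
    return 4

def bmi_to_dichotomous(bmi):
    """Continuous BMI -> two-group code: 1 = Normal (18-24.9), 2 = Overweight (>=25)."""
    if bmi < 18.0:
        return None  # outside the two groups used in the slides
    return 1 if bmi < 25.0 else 2

# Hypothetical BMI values (kg/m^2).
bmi_values = [17.2, 22.7, 24.9, 25.0, 29.3, 31.6]

ordinal_codes = [bmi_to_ordinal(b) for b in bmi_values]
dichotomous_codes = [bmi_to_dichotomous(b) for b in bmi_values]
print(ordinal_codes)
print(dichotomous_codes)
# Going backwards is impossible: code 2 could have been 22.7, 24.9, or any
# other value in the normal range.
```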

How to answer your Research Question?

Determine if:
– Descriptive study only
– Correlation / Relationship (no causation)
  • Pearson’s r, Spearman’s rho, Kendall’s tau, Phi coefficient
– Statistical Inference (infer to larger population)
  • T-Test, Chi-Square, ANOVA
  • Linear regression, Multivariate regression

Describe Population | Find Relationships | Cause & Effect

Descriptive Analysis

• Describes your research sample and/or situation
• Typically many independent variables (IV)
  – Not manipulating – just observing
• What do we typically want to know?
  – Demographics of Population (Sex, Age, Education, etc.)
  – Trying to paint a picture of the situation
    • Amputee weight status through healing process
    • Bariatric patients’ compliance with their meal plan
    • Dieting habits of Soldiers trying to meet ABCP standards

Descriptive Analysis

How are outcomes displayed?
• Nominal / Ordinal
  – Frequencies (%)
  – Mode (most frequent #) or Median (mid-pt of data)
  – Presented in tables (%), or pie charts
• Interval / Ratio (Continuous data)
  – Mean ± Standard Deviation (SD)
  – Presented in tables, figures, graphs
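These summaries can be sketched with Python's standard library. The data below are made up for illustration; the satisfaction coding follows the ordinal example used earlier in these slides.

```python
from statistics import mean, median, mode, stdev

# Hypothetical ordinal data: satisfaction (1=Unsatisfied, 2=Neither, 3=Satisfied).
satisfaction = [3, 2, 3, 1, 2, 2, 3, 1, 3]
# Hypothetical continuous data: age in years.
ages = [24, 31, 27, 35, 29, 40, 24]

# Ordinal: report the most frequent code and the middle value.
sat_mode = mode(satisfaction)
sat_median = median(satisfaction)

# Continuous: report mean ± sample standard deviation.
age_mean = mean(ages)
age_sd = stdev(ages)

print(f"Satisfaction: mode={sat_mode}, median={sat_median}")
print(f"Age: {age_mean:.1f} ± {age_sd:.1f} years")
```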

Examples…

Table 1. Participant Descriptive Demographics Dichotomized by Body Mass Index

                                Overall (n=295)   Normal BMI (n=105)   Overweight BMI (n=190)
Continuous Data                 Mean    SD        Mean    SD           Mean    SD
  Age*                          30.1    8.6       27.4    8            31.6    8.6
  BMI                           30.0    4.2       22.7    1.6          29.3    3.3

Nominal / Ordinal Data          n       %         n       %            n       %
  Normal BMI status             168     56.9
  Sex (Male)*                   209     70.8      53      50.5         156     82.1
  Education
    High School                 50      16.9      24      22.9         26      13.7
    College Credits up to BS    208     35.3      68      64.7         140     73.7
    Graduate Credits            37      12.5      13      12.4         24      12.6
  Rank
    Enlisted                    211     35.8      69      65.7         142     74.7
    Officer                     84      9.5       36      34.3         48      25.3
  Branch (Army)                 263     90.7      89      84.7         174     91.6
  Ethnicity/Race
    Black/AA                    42      14.3      9       8.6          33      17.6
    White/Caucasian             166     56.7      72      68.6         94      50
    Hispanic                    49      16.7      8       7.6          41      21.9
    Asian                       16      5.5       9       8.6          7       3.7
    Other                       20      6.8       7       6.7          13      6.9

* p<0.05 between BMI weight categories
BMI = Body Mass Index; SD = Standard Deviation; BS = Bachelors Degree; AA = African American

Examples…

[Figure: Age of Subjects with Back Pain (n=77); age groups 12‐13 yrs, 14‐15 yrs, and 16‐18 yrs, with segments labeled 37%, 29%, and 34%]

[Figure: bar chart comparing Normal, Overweight, Male, and Female groups across Env/Social, Emotional, and Physical categories]

Descriptives

• Remember…
  • You always want to describe your population
  • Plan to run descriptive analysis regardless of the research design
  • Even for an RCT study, you plan for…
    – Sample demographic descriptive analysis PLUS
    – T-test inferential statistics and/or correlations
  • Descriptive analysis should be included in the ‘Methods Section’ of protocol or manuscript

How to answer your Research Question?

Determine if:
– Descriptive study only
– Correlation / Relationship (no causation)
  • Pearson’s r, Spearman’s rho, Kendall’s tau, Phi coefficient
– Statistical Inference (infer to larger population)
  • T-Test, Chi-Square, ANOVA
  • Linear regression, Multivariate regression

Describe Population | Find Relationships | Cause & Effect

Correlation Coefficients

• Assessing a relationship between variables
  • Do they vary together?
  • How strongly are they associated?
  • What is the nature of the relationship?
• Correlation vs. Causation
  • Not the same
  • Both may be caused by another factor
  • Causation statements come from inferential statistics (t-test, chi-square, ANOVA, etc.)

Correlation Coefficients

• Vary between zero and ± 1.00
• Closer to |1.00|, higher strength of relationship
• Sign indicates direction
  • + both increase together
  • – as one increases, the other decreases
• Tighter grouping = higher coefficient
• Visualize with scatterplots

Portney & Watkins (2009) p. 524
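To make the sign and size of the coefficient concrete, here is a minimal sketch of Pearson's r computed from its definition. The function name pearson_r and the five data points are illustrative assumptions; in practice a statistical package reports this value.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y_up tends to rise with x, y_down tends to fall.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]
y_down = [5, 4, 5, 4, 2]

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(f"r (increase together) = {r_up:.2f}")
print(f"r (one up, other down) = {r_down:.2f}")
```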

Is it a good correlation?

Look at “r”
• r = 0.00 to 0.25 = little or no relationship
• r = 0.26 to 0.50 = fair relationship
• r = 0.51 to 0.75 = moderate to good
• r = 0.76 to 1.00 = good to excellent

Always create a scatterplot to observe values
• One outlier can influence the results
• Scatter points might not be uniform
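The interpretation bands above can be written as a small helper. This is a sketch: the function name describe_strength is hypothetical, and applying the bands to the absolute value of r (so that negative coefficients are graded the same way) is an assumption consistent with "closer to |1.00|".

```python
def describe_strength(r):
    """Map a correlation coefficient to the slide's descriptive bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)  # strength ignores the sign; the sign gives direction
    if size <= 0.25:
        return "little or no relationship"
    if size <= 0.50:
        return "fair relationship"
    if size <= 0.75:
        return "moderate to good"
    return "good to excellent"

for r in (-0.555, -0.241, 0.222, 0.90):
    direction = "negative" if r < 0 else "positive"
    print(f"r = {r:+.3f}: {direction}, {describe_strength(r)}")
```

The labels describe strength only; a scatterplot is still needed to spot outliers or uneven scatter.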

Scatterplots

[Figure: two scatterplots. Left: HHD Force (newtons) vs. Body Mass (kg). Right: Current vs. Time, plotting Current (milliamps) against Time (milliseconds) with fitted line Current = 0.0032 * t + 0.44. Coefficients shown on the plots: r = 0.52 and r = 0.31.]

Correlation Coefficient

• Covariance: as one changes, the other also changes
• Does NOT assess differences
  • May not have statistically significant difference, but excellent correlation coefficient
  • May have significant difference but weak correlation coefficient
• Does not assess agreement (ICCs do)

Significance of coefficient (p-value)

• Null hypothesis: the correlation coefficient between variable x and variable y is not significantly different from zero.

– H0: r = 0

• Very sensitive to sample size – trivial coefficients (r = 0.1 to 0.2) are often statistically significant if sample is large enough
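The sample-size effect can be sketched numerically. The code below uses the Fisher z transformation with a normal approximation, which is an assumption chosen so the example needs only the standard library; statistical packages typically report a t-based p-value that differs slightly, especially for small samples. The function name approx_p_value is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-tailed p-value for H0: r = 0 (Fisher z approximation)."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = atanh(r) * sqrt(n - 3)  # approximately standard normal under H0
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same trivial coefficient, r = 0.1, at increasing sample sizes.
p_small = approx_p_value(0.1, 30)
p_large = approx_p_value(0.1, 1000)

print(f"r = 0.1, n = 30:   p = {p_small:.3f}")
print(f"r = 0.1, n = 1000: p = {p_large:.4f}")
```

The coefficient is identical in both cases; only the sample size changes whether it crosses the 0.05 threshold.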

Cross-Sectional Correlation Study

Design: Cross-sectional assessment of BMI vs. Intuitive Eating Scale (IES) Score
Purpose: Determine if there is a correlation between BMI status and intuitive eating scores.
Question: Do normal-BMI Soldiers exhibit more intuitive eating habits than overweight-BMI Soldiers?

Primary outcome (dependent variable) is IES Score:
• 21 items on a 5‐point Likert scale
• Points range from 21‐105; the higher the score, the more intuitive

Grouping variable (independent variable): BMI – dichotomized – normal (18‐24.9 kg/m²) & overweight (≥25 kg/m²)

What type of demographic data should I collect? Do you think there will be other associations with weight status and/or Intuitive Eating Score?

Correlation matrix

                                                     BMI        IES Score  Gender     Trying to  Meet 5 days/wk
                                                                                      Lose Wt    for 30 min
BMI                        Correlation Coefficient   1.000      -0.555     -0.241**   -0.395**   0.031
                           Sig. (2-tailed)                      0.000      0.000      0.000      0.670
IES Score                  Correlation Coefficient   -          1.000      -0.030     0.222**    0.045
                           Sig. (2-tailed)                                 0.612      0.000      0.464
Gender                     Correlation Coefficient   -          -          1.000      -0.190**   -0.011
                           Sig. (2-tailed)                                            0.001      0.864
Trying to Lose Wt          Correlation Coefficient   -          -          -          1.000      -0.018
                           Sig. (2-tailed)                                                       0.776
Meet 5 days/wk for 30 min  Correlation Coefficient   -          -          -          -          1.000
                           Sig. (2-tailed)

* = Correlation is significant at the 0.05 level (2-tailed).
** = Correlation is significant at the 0.01 level (2-tailed).
IES = Intuitive Eating Score (ranges from 21-105 pts); Gender = 1 Male, 2 Female; Trying to Lose Weight = 1 Yes, 2 No; Meet 5 d/wk (30 min) physical activity = 1 No, 2 Yes

What does this mean? Let’s draw out simple line graphs.

Sometimes it helps to visualize…

BMI vs. IES Score: r = -0.555 (p < 0.001)
• Negative association…
• BMI increases as IES Score decreases
• Normal BMI tends to be more intuitive in eating characteristics
• Moderate to good relationship

BMI vs. Gender (M, F): r = -0.241 (p < 0.001)
• Negative association…
• BMI decreases as dummy code for Gender increases
• How is Gender scored?
  • 1=M and 2=F
• Men are more likely to have a higher BMI than women
• Little to no relationship

Trying to Lose Wt vs. IES Score: r = 0.222 (p < 0.001)
• Positive association…
• As dummy code of Lose Wt increases, so does IES Score
• How is Trying to Lose Wt scored?
  • 1=Yes, 2=No
• Participants trying to lose weight tend to be less intuitive in eating characteristics
• Little to no relationship

Is it a good correlation? Look at “r²”

r² = “coefficient of determination”
• The square of the correlation coefficient is called the “coefficient of determination”
• Value is conceptually helpful
• Interpreted as: “the percent of variance in x that is explained (or accounted for) by y”
• So, consider correlation of BMI & IES:
  – r = -0.555 means r² = 0.308
  – ~31% of the variance in IES is explained by BMI
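The arithmetic can be checked in a few lines. The three coefficients are taken from the correlation matrix in these slides; the dictionary layout and variable names are just one illustrative way to organize them.

```python
# Correlation coefficients reported in the correlation matrix.
coefficients = {
    "BMI vs. IES Score": -0.555,
    "BMI vs. Gender": -0.241,
    "IES Score vs. Trying to Lose Wt": 0.222,
}

# Squaring r gives the coefficient of determination (the sign drops out).
r_squared = {pair: r ** 2 for pair, r in coefficients.items()}

for pair, r2 in r_squared.items():
    print(f"{pair}: r^2 = {r2:.3f} (about {100 * r2:.0f}% of variance explained)")
```

Squaring makes weak correlations look weaker still: the two "little to no relationship" coefficients each account for only about 5-6% of the variance, even though both were statistically significant.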

Correlation Coefficients

• If correlation analysis is used, it should be noted in the ‘methods section’ of protocol or manuscript
• Typically note the type of correlational analysis, which is based upon scale of measurement
• Statistical programs do not always use the same theoretical nomenclature as a textbook

Which Test Do I Use?

Type of Data (rows vs. columns; Ratio and Interval are Continuous Data)

            Ratio       Interval    Ordinal                         Nominal
Ratio       Pearson r   Pearson r   Spearman rho                    Point biserial
Interval                Pearson r   Spearman rho                    Point biserial
Ordinal                             Spearman rho or Kendall’s tau   Rank biserial
Nominal                                                             phi coefficient
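The matrix above can be sketched as a small lookup function. The function name choose_correlation and its string inputs are hypothetical; the returned names follow the matrix, and the sketch does not cover the extra considerations (normality, sample size) that also influence the choice.

```python
CONTINUOUS = {"ratio", "interval"}
VALID_SCALES = CONTINUOUS | {"ordinal", "nominal"}

def choose_correlation(scale_a, scale_b):
    """Suggest a correlation coefficient from two variables' scales of measurement."""
    scales = {scale_a.lower(), scale_b.lower()}
    if not scales <= VALID_SCALES:
        raise ValueError(f"scales must be one of {sorted(VALID_SCALES)}")
    if scales <= CONTINUOUS:
        return "Pearson r"
    if scales == {"nominal"}:
        return "phi coefficient"
    if "nominal" in scales:
        # Nominal paired with ordinal or with continuous.
        return "rank biserial" if "ordinal" in scales else "point biserial"
    if scales == {"ordinal"}:
        return "Spearman rho or Kendall's tau"
    return "Spearman rho"  # ordinal paired with continuous

print(choose_correlation("ratio", "interval"))
print(choose_correlation("ordinal", "ratio"))
print(choose_correlation("nominal", "ordinal"))
```

Because the scales are collected into a set, the order of the two arguments does not matter, which mirrors the symmetric layout of the matrix.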

• Pearson r – Both variables are continuous (ratio or interval) (same in SPSS)
• Spearman rho – One variable is ordinal & the other is continuous or ordinal (same in SPSS)
  – Also used for continuous data instead of Pearson r if the data are not normally distributed
  – Kendall’s tau assesses small discrepancies; more accurate w/ small sample
• Point biserial – One variable is nominal & the other is continuous (Pearson in SPSS)
• Rank biserial – One variable is nominal & the other is ordinal (Kendall’s tau in SPSS)
• Phi coefficient – Both variables are nominal (if discrete and dichotomous; 2x2) (Crosstab in SPSS)

How to answer your Research Question?

Determine if:
– Descriptive study only
– Correlation / Relationship (no causation)
  • Pearson’s r, Spearman’s rho, Kendall’s tau, Phi coefficient
– Statistical Inference (infer to larger population)
  • T-Test, Chi-Square, ANOVA
  • Prediction, Linear regression, Multivariate regression

Describe Population | Find Relationships | Cause & Effect

To Be Continued…

Continue on with Part 2 Inferential Statistics
