Bivariate Analysis Correlation
Choice of test by type of variable:

                                  Variable 1
Variable 2     2 levels                >2 levels               Continuous
2 levels       χ² (chi-square test)    χ² (chi-square test)    t-test
>2 levels      χ² (chi-square test)    χ² (chi-square test)    ANOVA (F-test)
Continuous     t-test                  ANOVA (F-test)          - Correlation
                                                               - Simple linear regression

Correlation

Used when you measure two continuous variables.

Examples:
- Association between weight & height.
- Association between age & blood pressure.

Pearson's Correlation Coefficient

Correlation is measured by Pearson's correlation coefficient: a measure of the linear association between two variables that have been measured on a continuous scale.

Pearson's correlation coefficient is denoted by r. A correlation coefficient is a number that ranges between -1 and +1.

Weight (Kg)   Height (cm)
55            170
93            180
90            168
60            156
112           178
45            161
85            181
104           192
68            176
87            186

[Figure: scatter plot of Height (cm) against Weight (Kg) for the ten pairs above]

If r = 1 → perfect positive linear relationship between the two variables.
If r = -1 → perfect negative linear relationship between the two variables.
If r = 0 → no linear relationship between the two variables.

[Figure: three scatter plots illustrating r = +1, r = -1 and r = 0]

[Figure: four scatter plots with r = -0.9, 0.8, 0.2 and -0.5; source: http://noppa5.pc.helsinki.fi/koe/corr/cor7.html]

[Figure: scale of r from -1 to +1 in steps of 0.2, with bands labelled, from left to right, Strong, Moderate, Weak, Moderate, Strong]

Example 1:
Research question: Is there a linear relationship between the weight and height of students?
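The slides do not spell out how r is calculated. As a hedged illustration, the sketch below applies the standard definition, r = sum((x - mean_x)(y - mean_y)) / sqrt(sum((x - mean_x)^2) * sum((y - mean_y)^2)), to the ten weight and height pairs listed in the small table above. The pairing of the values is read off the flattened slide text and is therefore an assumption, and the names pearson_r, weights_kg and heights_cm are illustrative only.

```python
import math

# Ten (weight, height) pairs as read from the slide's small data table.
# Assumption: the pairing is reconstructed from the flattened slide text.
weights_kg = [55, 93, 90, 60, 112, 45, 85, 104, 68, 87]
heights_cm = [170, 180, 168, 156, 178, 161, 181, 192, 176, 186]


def pearson_r(x, y):
    """Pearson's correlation coefficient computed from its definition."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(weights_kg, heights_cm)
print(f"r = {r:.3f}")
```

A statistics package would normally supply this value through a built-in routine; the hand-written version is only meant to show what the coefficient measures.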
H0: there is no linear relationship between weight & height of students in the population (ρ = 0)
Ha: there is a linear relationship between weight & height of students in the population (ρ ≠ 0)
Statistical test: Pearson correlation coefficient (r)

Example 1: SPSS Output

Correlations
                                weight     height
weight   Pearson Correlation    1          .651**
         Sig. (2-tailed)                   .000
         N                      1975       1954
height   Pearson Correlation    .651**     1
         Sig. (2-tailed)        .000
         N                      1954       1971
**. Correlation is significant at the 0.01 level (2-tailed).

Value of statistical test (r coefficient): 0.651
P-value: 0.000 (printed by SPSS as .000, i.e. below 0.001)

Conclusion: At a significance level of 0.05, we reject the null hypothesis and conclude that in the population there is a significant linear relationship between the weight and height of students.

Example 2:
Research question: Is there a linear relationship between the age and weight of students?

Example 2: SPSS Output

Correlations
                                weight     age
weight   Pearson Correlation    1          .155**
         Sig. (2-tailed)                   .000
         N                      1975       1814
age      Pearson Correlation    .155**     1
         Sig. (2-tailed)        .000
         N                      1814       1846
**. Correlation is significant at the 0.01 level (2-tailed).
H0: ρ = 0; no linear relationship between weight & age in the population
Ha: ρ ≠ 0; there is a linear relationship between weight & age in the population

Value of statistical test: 0.155
P-value: 0.000 (printed by SPSS as .000, i.e. below 0.001)

Conclusion: At a significance level of 0.05, we reject the null hypothesis and conclude that in the population there is a significant linear relationship between the weight and age of students.

Example 3:
Research question: Is there a linear relationship between the age and height of students?

Example 3: SPSS Output

Correlations
                                age        height
age      Pearson Correlation    1          .084**
         Sig. (2-tailed)                   .000
         N                      1846       1812
height   Pearson Correlation    .084**     1
         Sig. (2-tailed)        .000
         N                      1812       1971
**. Correlation is significant at the 0.01 level (2-tailed).
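Examples 2 and 3 report small coefficients (0.155 and 0.084) that are nevertheless flagged as significant. One way to see why is the usual t statistic for testing ρ = 0, t = r * sqrt(n - 2) / sqrt(1 - r^2), which grows with the sample size. The sketch below is an illustration and not part of the slides: the function name correlation_t_test is hypothetical, and the p-value uses a normal approximation to the t distribution, an assumption that is reasonable here only because n is close to 2000 in these outputs.

```python
import math


def correlation_t_test(r, n):
    """Return (t, approximate two-sided p-value) for H0: rho = 0.

    Assumption: the p-value uses the normal approximation to the
    t distribution, which is adequate only for large n.
    """
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    # Two-sided tail area under the standard normal curve
    p_approx = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p_approx


# r and pairwise N as printed in the three SPSS correlation tables
examples = {
    "weight vs height": (0.651, 1954),
    "weight vs age": (0.155, 1814),
    "age vs height": (0.084, 1812),
}

results = {}
for label, (r, n) in examples.items():
    results[label] = correlation_t_test(r, n)
    t, p = results[label]
    print(f"{label}: r = {r}, n = {n}, t = {t:.2f}, approx. p = {p:.2g}")
```

All three approximate p-values fall below 0.001, which is consistent with SPSS printing .000. A small p-value says only that r differs from zero in the population; it does not say that the relationship is strong.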
H0: ρ = 0; no linear relationship between height & age in the population
Ha: ρ ≠ 0; there is a linear relationship between height & age in the population

Value of statistical test: 0.084
P-value: 0.000 (printed by SPSS as .000, i.e. below 0.001)

Conclusion: At a significance level of 0.05, we reject the null hypothesis and conclude that in the population there is a significant linear relationship between the height and age of students.

SPSS command for r

Example 1:
Analyze > Correlate > Bivariate
Select height and weight and put them in the "Variables" box.

In-class questions

T (True) or F (False):
In studying whether there is an association between gender and weight, the investigator found out that r = 0.90 and p-value < 0.001 and concludes that there is a strong significant correlation between gender and weight.

T (True) or F (False):
The correlation between obesity and number of cigarettes smoked was r = 0.012 and the p-value = 0.856. Based on these results we conclude that there isn't any association between obesity and number of cigarettes smoked.

Simple Linear Regression

Used to explain observed variation in the data.

In order to explain why the BP of individual patients differs, we try to associate the differences in BP with differences in other relevant patient characteristics (variables).

Example: Can variation in blood pressure be explained by age?

For example, we measure blood pressure in a sample of patients and observe:
I = Pt#    1     2     3     4     5     6     7
Y = BP     85    105   90    85    110   70    115

Questions:
1) What is the most appropriate mathematical model to use? A straight line, parabola, etc.
2) Given a specific model, how do we determine the best fitting model?

Mathematical properties of a straight line

Y = B0 + B1X
Y = dependent variable
X = independent variable
B0 = Y intercept
B1 = slope

The intercept B0 is the value of Y when X = 0.
The slope B1 is the amount of change in Y for each 1-unit change in X.

Estimation of a simple linear regression model

Optimal regression line: Y = B0 + B1X

Example 1:
Research question: Does height help to predict weight using a straight line model? Is there a linear relationship between weight and height? Does height explain a significant portion of the variation in the values of weight observed?

Y = B0 + B1X
Weight = B0 + B1 Height

SPSS output: Example 1

Variables Entered/Removed (b)
Model   Variables Entered   Variables Removed   Method
1       height (a)          .                   Enter
a. All requested variables entered.
b. Dependent Variable: weight

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .651 (a)   .424       .423                10.878
a. Predictors: (Constant), height

ANOVA (b)
Model           Sum of Squares   df     Mean Square   F          Sig.
1  Regression   169820.3         1      169820.297    1435.130   .000 (a)
   Residual     230982.0         1952   118.331
   Total        400802.3         1953
a. Predictors: (Constant), height
b. Dependent Variable: weight

Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t         Sig.
1  (Constant)    -95.246    4.226                                          -22.539   .000
   height        .940       .025               .651                        37.883    .000
a. Dependent Variable: weight
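The figures in the SPSS regression output for Example 1 are linked by simple arithmetic that can be checked by hand. The sketch below is an illustration under stated assumptions: the variable names and the predict_weight helper are hypothetical, the inputs are the rounded values printed in the output, and the height of 170 cm is an arbitrary example value. It recomputes R Square as regression sum of squares over total sum of squares, F as regression mean square over residual mean square, the standard error of the estimate as the square root of the residual mean square, and then evaluates the fitted line.

```python
import math

# Values copied from the SPSS ANOVA and Coefficients tables (Example 1)
ss_regression = 169820.3
ss_total = 400802.3
ms_regression = 169820.297
ms_residual = 118.331
b0 = -95.246  # intercept (Constant)
b1 = 0.940    # slope for height

# R Square: share of the variation in weight explained by height
r_square = ss_regression / ss_total

# F statistic: regression mean square over residual mean square
f_stat = ms_regression / ms_residual

# Standard error of the estimate: square root of the residual mean square
std_error_estimate = math.sqrt(ms_residual)


def predict_weight(height_cm):
    """Predicted weight (Kg) from the fitted line Weight = B0 + B1 * Height."""
    return b0 + b1 * height_cm


print(f"R Square = {r_square:.3f}")
print(f"F = {f_stat:.2f}")
print(f"Std. error of the estimate = {std_error_estimate:.3f}")
print(f"Predicted weight at 170 cm = {predict_weight(170):.2f} Kg")
```

The recomputed R Square, F and standard error agree with the printed .424, 1435.130 and 10.878 up to rounding.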
Reading the output for Example 1:

Weight = B0 + B1 Height

R Square = 0.424: height explains 42.4% of the variation seen in weight.

B0 = -95.246 and B1 = 0.940, so:
Weight = -95.246 + 0.94 Height

Increasing height by 1 unit (1 cm) increases the predicted weight by 0.94 Kg.

Simple Linear Regression In-class questions Question 1: Coefficientsa Unstandardized Standardized Coefficients Coefficients In a simple linear regression model the predicted straight line Model B Std.