MultivariateMultivariate StatisticsStatistics
PsyPsy 524524 AndrewAndrew AinsworthAinsworth StatStat ReviewReview 11 IVIV vs.vs. DVDV
IndependentIndependent VariableVariable (IV)(IV) ––ControlledControlled byby thethe experimenterexperimenter
––and/orand/or hypothesizedhypothesized influenceinfluence
––and/orand/or representrepresent differentdifferent groupsgroups IVIV vs.vs. DVDV
DependentDependent variablesvariables
––thethe responseresponse oror outcomeoutcome variablevariable
IVIV andand DVDV -- “input/output”,“input/output”, “stimulus/response”,“stimulus/response”, etc.etc. IVIV vs.vs. DVDV UsuallyUsually representrepresent sidessides ofof anan equationequation xy→ xyz→→ β xy+ α = xy= β + α ExtraneousExtraneous vs.vs. ConfoundingConfounding VariablesVariables
ExtraneousExtraneous ––leftleft outout (intentionally(intentionally oror forgotten)forgotten) ––ImportantImportant (e.g.(e.g. regression)regression)
ConfoundingConfounding –– ––ExtraneousExtraneous variablesvariables thatthat offeroffer alternativealternative explanationexplanation ––AnotherAnother variablevariable thatthat changeschanges alongalong withwith IVIV UnivariateUnivariate,, BivariateBivariate,, MultivariateMultivariate
UnivariateUnivariate ––onlyonly oneone DV,DV, cancan havehave multiplemultiple IVsIVs
BivariateBivariate ––twotwo variablesvariables nono specificationspecification asas toto IVIV oror DVDV (r(r oror χχ2)2)
MultivariateMultivariate ––multiplemultiple DVsDVs,, regardlessregardless ofof numbernumber ofof IVsIVs ExperimentalExperimental vs.vs. NonNon--ExperimentalExperimental
ExperimentalExperimental – high level of researcher control, direct manipulation of IV, true IV to DV causal flow
NonNon--experimentalexperimental – low or no researcher control, pre-existing groups (gender, etc.), IV and DV ambiguous
ExperimentsExperiments == internalinternal validityvalidity
NonNon--experimentsexperiments == externalexternal validityvalidity WhyWhy multivariatemultivariate statistics?statistics? WhyWhy multivariatemultivariate statistics?statistics?
RealityReality
––UnivariateUnivariate statsstats onlyonly gogo soso farfar whenwhen applicableapplicable
––“Real”“Real” datadata usuallyusually containscontains moremore thanthan oneone DVDV
––MultivariateMultivariate analysesanalyses areare muchmuch moremore realisticrealistic andand feasiblefeasible WhyWhy multivariate?multivariate?
““Minimal”Minimal” IncreaseIncrease inin ComplexityComplexity
MoreMore controlcontrol andand lessless restrictiverestrictive assumptionsassumptions
UsingUsing thethe rightright tooltool atat thethe rightright timetime
RememberRemember – Fancy stats do not make up for poor planning
– Design is more important than analysis WhenWhen isis MVMV analysisanalysis notnot usefuluseful
HypothesisHypothesis isis univariateunivariate useuse aa univariateunivariate statisticstatistic
––TestTest individualindividual hypotheseshypotheses univariatelyunivariately firstfirst andand useuse MVMV statsstats toto exploreexplore
––TheThe SimplerSimpler thethe analysesanalyses thethe moremore powerfulpowerful StatStat ReviewReview 22 Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata
ContinuousContinuous datadata
––smoothsmooth transitiontransition nono stepssteps
––anyany valuevalue inin aa givengiven rangerange
––thethe numbernumber ofof givengiven valuesvalues restrictedrestricted onlyonly byby instrumentinstrument precisionprecision Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata
DiscreteDiscrete ––CategoricalCategorical ––LimitedLimited amountamount ofof valuesvalues andand alwaysalways wholewhole valuesvalues
DichotomousDichotomous ––discretediscrete variablevariable withwith onlyonly twotwo categoriescategories ––BinomialBinomial distributiondistribution Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata ContinuousContinuous toto discretediscrete
– Dichotomizing, Trichotomizing, etc.
– ANOVA obsession or limited to one analyses
– Power reduction and limited interpretation
– Reinforce use of the appropriate stat at the right time Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata x1 x2 x1di x2di 11 9 1 0 10 7 1 0 11 10 1 1 14 12 1 1 14 11 1 1 10 8 1 0 12 10 1 1 10 9 1 0 11 8 1 0 10 11 1 1 …… ……
X1 dichotomized at median >=11 and x2 at median >=10 Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata
CorrelationCorrelation ofof X1X1 andand X2X2 == .922.922
CorrelationCorrelation ofof X1diX1di andand X2diX2di == .570.570 Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata
DiscreteDiscrete toto continuouscontinuous
––cannotcannot bebe donedone literallyliterally (not(not enoughenough infoinfo inin discretediscrete variables)variables)
––oftenoften dichotomousdichotomous datadata treatedtreated asas havinghaving underlyingunderlying continuouscontinuous scalescale 0.35 NormalNormal ProbabilityProbability FunctionFunction
0.3
0.25
0.2
0.15
0.1
0.05
0 -5 -4 -3 -2 -1 0 1 2 3 4 5 Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata
CorrelationCorrelation ofof X1X1 andand X2X2 whenwhen continuouscontinuous scalescale assumedassumed == .895.895
(called(called TetrachoricTetrachoric correlation)correlation)
NotNot perfect,perfect, butbut closercloser toto realreal correlationcorrelation Continuous,Continuous, DiscreteDiscrete andand DichotomousDichotomous datadata
LevelsLevels ofof MeasurementMeasurement –– NominalNominal –– CategoricalCategorical –– OrdinalOrdinal –– rankrank orderorder –– IntervalInterval –– orderedordered andand evenlyevenly spacedspaced –– RatioRatio –– hashas absoluteabsolute 00 OrthogonalityOrthogonality
CompleteComplete NonNon--relationshiprelationship
OppositeOpposite ofof correlationcorrelation
AttractiveAttractive propertyproperty whenwhen dealingdealing withwith MVMV statsstats (really(really anyany stats)stats) OrthogonalityOrthogonality
PredictPredict yy withwith twotwo Xs;Xs; bothboth XsXs relatedrelated toto y;y; orthogonalorthogonal toto eacheach other;other; eacheach xx
predictspredicts additivelyadditively (sum(sum ofof xxi/y/y correlationscorrelations equalequal multiplemultiple correlation)correlation)
X2 Y X1 OrthogonalityOrthogonality
DesignsDesigns areare orthogonalorthogonal alsoalso
WithWith multiplemultiple DV’sDV’s orthogonalityorthogonality isis alsoalso advantagesadvantages StandardStandard vs.vs. SequentialSequential AnalysesAnalyses ChoiceChoice dependsdepends onon handlinghandling commoncommon predictorpredictor variancevariance
X Y 1
X2 StandardStandard vs.vs. SequentialSequential AnalysesAnalyses StandardStandard analysisanalysis –– neitherneither IVIV getsgets creditcredit
y c Ag en t e o p Im
Health StandardStandard vs.vs. SequentialSequential AnalysesAnalyses SequentialSequential –– IVIV enteredentered firstfirst getsgets creditcredit forfor sharedshared variancevariance
y c Ag en t e o
Imp
Health MatricesMatrices
DataData MatrixMatrix GRE GPA GENDER
500 3.2 1 420 2.5 2 650 3.9 1 550 3.5 2 480 3.3 1 600 3.25 2 For gender women are coded 1 MatricesMatrices
CorrelationCorrelation oror RR matrixmatrix GRE GPA GENDER GRE 1.00 0.85 -0.13 GPA 0.85 1.00 -0.46 GENDER -0.13 -0.46 1.00 MatricesMatrices
Variance/CovarianceVariance/Covariance oror SigmaSigma matrixmatrix
GRE GPA GENDER GRE 7026.67 32.80 -6.00 GPA 32.80 0.21 -0.12 GENDER -6.00 -0.12 0.30 MatricesMatrices
SumsSums ofof SquaresSquares andand CrossCross--productsproducts matrixmatrix (SSCP)(SSCP) oror SS matrixmatrix
GRE GPA GENDER GRE 35133.33 164.00 -30.00 GPA 164.00 1.05 -0.58 GENDER -30.00 -0.58 1.50 MatricesMatrices
SumsSums ofof SquaresSquares andand CrossCross--productsproducts matrixmatrix (SSCP)(SSCP) oror SS matrixmatrix N 2 SS() Xiijj=−∑ ( X X ) i=1 N SP()( Xj Xki=−∑ Xjj X )() Xik − X k i=1