Pthe Key to Multivariate Statistics Is Understanding
Total Page:16
File Type:pdf, Size:1020Kb
Multivariate Statistics Summary and Comparison of Techniques P The key to multivariate statistics is understanding conceptually the relationship among techniques with regards to: < The kinds of problems each technique is suited for < The objective(s) of each technique < The data structure required for each technique < Sampling considerations for each technique < Underlying mathematical model, or lack thereof, of each technique < Potential for complementary use of techniques 1 Multivariate Techniques Model Techniques y1 + y2 + ... + yi Unconstrained Ordination Cluster Analysis y1 + y2 + ... + yi = x Multivariate ANOVA Multi-Response Permutation Analysis of Similarities Mantel Test Discriminant Analysis Logistic Regression Classification Trees Indicator Species Analysis y1 + y2 + ... + yi = x1 + x2 + ... + xj Constrained Ordination Canonical Correlation Multivariate Regression Trees 2 Multivariate Techniques Technique Objective Unconstrained Ordination Extract gradients of maximum (PCA, MDS, CA, DCA, NMDS) variation Cluster Analysis Establish groups of similar (Family of techinques) entities Discrimination Test for & describe differences (MANOVA, MRPP, ANOSIM, among groups of entities or Mantel, DA, LR, CART, ISA) predict group membership Constrained Ordination Extract gradients of variation in (RDA, CCA, CAP) dependent variables explainable by independent variables 3 Multivariate Techniques Technique Variance Emphasis Unconstrained Ordination Emphasizes variation among (PCA, MDS, CA, DCA, NMDS) individual sampling entities Cluster Analysis by defining gradients of (Family of techinques) maximum total sample variance; describes the inter- Discrimination entity variance structure. (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) 4 Multivariate Techniques Technique Variance Emphasis Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) Cluster Analysis Emphasizes both differences (Family of techinques) and similarities among individual sampling entities Discrimination by clustering entities based (MANOVA, MRPP, ANOSIM, on inter-entity resemblance. Mantel, DA, LR, CART, ISA) Constrained Ordination (RDA, CCA, CAP) 5 Multivariate Techniques Technique Variance Emphasis Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) Cluster Analysis (Family of techinques) Discrimination Emphasizes variation among (MANOVA, MRPP, ANOSIM, groups of sampling entities; Mantel, DA, LR, CART, ISA) describes the inter-group variance structure. Constrained Ordination (RDA, CCA, CAP) 6 Multivariate Techniques Technique Variance Emphasis Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) Cluster Analysis (Family of techinques) Discrimination (MANOVA, MRPP, ANOSIM, Emphasizes variation among Mantel, DA, LR, CART, ISA) individual sampling entities by defining gradients of Constrained Ordination maximum total sample (RDA, CCA, CAP) variance explainable by environmental variables 7 Multivariate Techniques Technique Dependence Type Unconstrained Ordination Interdependence (PCA, MDS, CA, DCA, NMDS) Cluster Analysis Interdependence (Family of techinques) Discrimination Dependence (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) Constrained Ordination Dependence (RDA, CCA, CAP) 8 Multivariate Techniques Technique Data Structure Unconstrained Ordination One set; >>2 variables (PCA, MDS, CA, DCA, NMDS) Cluster Analysis One set; >>2 varibles (Family of techinques) Discrimination Two sets; 1 grouping variable, (MANOVA, MRPP, ANOSIM, >>2 discriminating variables Mantel, DA, LR, CART, ISA) Constrained Ordination Two sets; >>2 response (RDA, CCA, CAP) variables, >>2 explanatory variables 9 Multivariate Techniques Obs Group X-set Y-set 1Aa11 a12 a13 ... a1p b11 b12 b13 ... b1m 2Aa21 a22 a23 ... a2p b21 b22 b23 ... b2m 3Aa31 a32 a33 ... a3p b31 b32 b33 ... b3m ......... ... ......... ... nAan1 an2 an3 ... anp bn1 bn2 bn3 ... bnm n+1 C c11 c12 c13 ... c1p n+2 C c21 c22 c23 ... c2p Unconstrained Ordination n+3 C c31 c32 c33 ... c3p (PCA, PCO, CA, DCA, ......... ......... NMDS) NC c c c ... c Cluster Analysis n1 n2 n3 np (Family of techinques) 10 Multivariate Techniques Obs Group X-set Y-set 1Aa11 a12 a13 ... a1p b11 b12 b13 ... b1m 2Aa21 a22 a23 ... a2p b21 b22 b23 ... b2m 3Aa31 a32 a33 ... a3p b31 b32 b33 ... b3m ......... ... ......... ... nAan1 an2 an3 ... anp bn1 bn2 bn3 ... bnm n+1 C c11 c12 c13 ... c1p n+2 C c21 c22 c23 ... c2p n+3 C c31 c32 c33 ... c3p Discrimination Techniques ......... (MANOVA, MRPP, ......... ANOSIM, Mantel; DA, NC cn1 cn2 cn3 ... cnp LR, CART, ISA) 11 Multivariate Techniques Obs Group X-set Y-set 1Aa11 a12 a13 ... a1p b11 b12 b13 ... b1m 2Aa21 a22 a23 ... a2p b21 b22 b23 ... b2m 3Aa31 a32 a33 ... a3p b31 b32 b33 ... b3m ......... ....... ......... ....... nAan1 an2 an3 ... anp bn1 bn2 bn3 ... bnm n+1 C c11 c12 c13 ... c1p n+2 C c21 c22 c23 ... c2p n+3 C c31 c32 c33 ... c3p Constrained Ordination ......... (RDA, CCA, CAP, COR) ......... NC cn1 cn2 cn3 ... cnp 12 Multivariate Techniques Technique Sample Characteristics Unconstrained Ordination N (from known or unknown (PCA, MDS, CA, DCA, NMDS) # pop's) Cluster Analysis N (from known or unknown (Family of techinques) # pop's) Discrimination N (from known # pop's) or (MANOVA, MRPP, ANOSIM, N1, N2, ... (from separate Mantel, DA, LR, CART, ISA) pop's) Constrained Ordination N (from one pop) (RDA, CCA, CAP) 13 Multivariate Techniques If the research objective is to: P Describe the major ecological gradients of variation among individual sampling entities, and/or to portray sampling entities along "continuous" gradients of Unconstrained maximum sample variation, then use... Ordination P Assume linear relationship to ecological gradients... PCA, PCO(MDS) P Assume unimodal relationship to ecological gradients... CA(RA), DCA P Assume no particular relationship; only monotonic relationship between input and output dissimilarities... NMDS 14 Multivariate Techniques If the research objective is to: P Establish artificial classes or groups of similar entities where pre-specified, well- defined groups do not already exist, and/or to portray sampling entities in "discrete" groups, then use... Cluster Analysis P Assign entities to a specified number of groups to maximize within-group similarity or form composite clusters... Non-hierarchical Cluster Analysis P Assign entities to groups and display relationships among groups as they form... Hierarchical Cluster Analysis 15 Multivariate Techniques If the research objective is to: P Establish artificial classes or groups of entities with similar species composition and abundance where pre-specified, well-defined groups do not already exist, based on measured environmental variables, and/or to portray sampling entities in "discrete" groups representing species assemblages with distinct environmental affinities, then use... Constrained Cluster Analysis (MRT) 16 Multivariate Techniques If the research objective is to: P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Test for “significant” differences among groups... < Parametric test... MANOVA / DA < Nonparametric tests... MRPP, ANOSIM, Mantel 17 Multivariate Techniques If the research objective is to: P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Describe the major ecological differences among groups... < Assume a linear discrimination function... DA < Assume a logistic discrimination function... LR (MLR) < Do not assume any particular function... CART (UCT) < Identify “indicators” for each group... ISA 18 Multivariate Techniques If the research objective is to: P Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to: P Predict group membership of future observations... < Linear classification function... DA (LDF) < Logistic classification function... LR (MLR) < Decision tree classifier... CART (UCT) < Other nonparametric classifiers... Kernel K nearest-neighbor 19 Multivariate Techniques If the research objective is to: P Explain the variation in a continuous dependent variable using two or more continuous independent variables, and/or to develop a model for predicting the value of the dependent variable from the values of the independent variables, then use... Multiple Linear Regression Alternatives: CART (URT) 20 Multivariate Techniques If the research objective is to: P Explain the variation in a dichotomous dependent (grouping) variable using two or more continuous and/or categorical independent variables, and/or to develop a model for predicting the group membership of a sampling entity from the values of the independent variables, then use... Multiple Logistic Regression Alternatives: DA CART (UCT) 21 Multivariate Techniques If the research objective is to: P Describe the major ecological patterns in one set of (response) variables explainable by another set of (explanatory) variables, then use... Constrained P Assume linear response function of Ordination or response variables (species) along linear MRT gradients defined by the explanatory variables (environment)... RDA, CAP P Assume unimodal response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... CCA, DCCA P Do not assume any response function... MRT 22 Multivariate Techniques If the research objective is to: P Describe the major ecological relationships between two sets of variables expressed as distance matrices; i.e., dissimilarities between samples, then use... Mantel Test P Describe the major ecological