
bioRxiv preprint doi: https://doi.org/10.1101/2020.01.26.920439; this version posted January 27, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

A Multivariate View of Parallel Evolution

Stephen P. De Lisle*
Daniel I. Bolnick

Department of Ecology & Evolutionary Biology
University of Connecticut
Storrs, CT 06269

* email: [email protected]

Running title: Parallelism revealed

Keywords: Gasterosteus aculeatus, phenotypic vector analysis, random matrix theory


Abstract

A growing number of empirical studies have quantified the degree to which evolution is geometrically parallel, by estimating and interpreting pairwise angles between evolutionary change vectors in multiple replicate lineages. Similar comparisons, of distance in trait space, are used to assess the degree of convergence. These approaches amount to element-by-element interpretation of distance matrices, and can fail to capture the true extent of multivariate parallelism when evolution involves multiple traits sampled across multiple lineages. We suggest an alternative set of approaches, co-opted from evolutionary quantitative genetics, involving eigen analysis and comparison of among-lineage covariance matrices. Such approaches not only allow the full extent of multivariate parallelism to be revealed and interpreted, but also allow for the definition of biologically tenable null hypotheses against which empirical patterns can be tested. Reanalysis of a dataset of multivariate evolution across a replicated lake/stream gradient in threespine stickleback reveals that most of the variation in the direction of evolutionary change can be captured in just a few dimensions, indicating a greater extent of parallelism than previously appreciated. We suggest that applying such multivariate approaches may often be necessary to fully understand the extent and form of parallel and convergent evolution.


Introduction

How repeatable is the evolutionary process? Highpoints in the history of life include taxa that appear to have independently evolved similar phenotypes in response to similar environmental conditions. Such examples represent some of the most striking evidence for adaptive evolution, suggesting not only that evolution can sometimes repeat itself, but that we can also identify the broad environmental factors governing such repeated evolution (Nosil et al. 2002, Langerhans and DeWitt 2004, Losos 2011, Bolnick et al. 2018, Stuart 2019). Thus, this classical question cuts to the core of ongoing debates over the role of chance and determinism in evolution at all timescales.

The question of repeatability in evolution has been reframed in light of contemporary approaches to studying evolution and natural selection in the wild. This new work has distinguished the outcome of evolution, convergent (divergent), from the path of evolutionary change, parallel (nonparallel) (Bolnick et al. 2018). Purpose-built statistical tests have been invented to test hypotheses of parallelism and convergence at both micro (Collyer and Adams 2007, Adams and Collyer 2009, Collyer et al. 2015) and macro scales (Mahler et al. 2013). This work has led to some important advances in our empirical understanding of how repeatable evolution can be (Oke et al. 2017, Stuart et al. 2017), yet has also highlighted some fundamental challenges (Bolnick et al. 2018) of linking pattern and process in empirical tests of the repeatability of evolution.

Here we emphasize an explicitly multivariate set of approaches to the study of parallel and convergent evolution. These multivariate approaches surmount three key difficulties: 1) univariate analyses, often employed in studies of parallel and convergent evolution, can be difficult to interpret when the data are multivariate, and can lead to 2) underestimation of the importance of shared dimensions of evolutionary change, which have been difficult to identify due to the challenge of 3) construction of a statistical null expectation that is also biologically appropriate, against which empirical patterns can be pitted. We show how these challenges can be resolved with some basics of linear algebra that permit analysis of entire matrices of similarity of evolutionary change. We illustrate our points by revisiting a published dataset of parallel evolution in a fish species, where a truly multivariate approach reveals a far greater extent of parallelism than can be concluded from univariate analysis. Our arguments largely ‘parallel’ similar issues raised in a closely aligned subfield, evolutionary quantitative genetics, where it has long been recognized (Lande and Arnold 1983, Phillips and Arnold 1989, Blows and Brooks 2003, Blows 2007a, b, Kirkpatrick 2009, Wyman et al. 2013) that understanding selection on and expression of complex traits requires approaches that are explicitly multivariate.

Below, we first define parallelism and convergence as separate but related patterns. Next, we outline a set of multivariate geometric techniques to explore the degree and form of parallelism, applying these techniques to a reanalysis of a published dataset of parallelism in a fish species. We then review approaches to assessing multivariate convergent/divergent evolution, independent of parallelism, and highlight a specific published case study leveraging such an approach. We discuss advantages and limitations of these multivariate techniques.

Defining parallelism and convergence as unique phenomena

Parallel (nonparallel) and convergent (divergent) evolution can be seen as separate but related patterns (Figure 1; see also Bolnick et al. 2018). This separation is potentially important because unique evolutionary processes could lead to patterns of evolutionary parallelism without convergence, or vice versa. But parallelism and convergence also need not be viewed as mutually exclusive phenomena. For example, conserved directional selection across lineages could result in parallelism without convergence (Figure 1A). Alternatively, evolution towards a shared optimum by lineages with unique evolutionary histories, and thus unique ancestral positions in trait space, could result in convergence without parallelism (Figure 1B). Yet parallel evolutionary processes can lead to divergence if, for example, one lineage evolves faster along a shared trajectory. Thus, separating parallelism and convergence may often be necessary to link evolutionary pattern and process.

Geometry of parallel evolution

An intuitive way to define parallelism is via analysis of evolutionary change vectors (Collyer and Adams 2007, Adams and Collyer 2009), where the vector of multivariate evolutionary change across an environmental gradient or between time points a and b (e.g., an ancestral versus a descendant population) for a given lineage is

$$\Delta\bar{\mathbf{z}} = \bar{\mathbf{z}}_{b} - \bar{\mathbf{z}}_{a} \tag{1}$$

(Lande 1979), perhaps with traits standardized to make units comparable across traits. Note that, beyond morphology, such a vector can be defined using breeding values, sequence variation (Stuart et al. 2017), or gene expression profiles, and that environments can be defined using external abiotic factors or any other feature of interest that defines subpopulations (e.g., sex; De Lisle and Rowe 2017). $\Delta\bar{\mathbf{z}}$ is a vector of evolutionary change, or evolutionary response to selection, and a growing number of studies have employed an approach (Collyer and Adams 2007, Oke et al. 2017, Stuart et al. 2017) where the angle between $\Delta\bar{\mathbf{z}}$ vectors is estimated for each pairwise combination of lineages i and j studied, as

$$\theta = \cos^{-1}\left(\Delta\bar{\mathbf{z}}_{i}^{T}\,\Delta\bar{\mathbf{z}}_{j}\right) \tag{2}$$

where each vector has been normalized to unit length. Statistical significance of the angle is often assessed via permutation of the data, comparing the observed value to a null of perfect parallelism across each pairwise combination of lineage replicates (Collyer and Adams 2007, Adams and Collyer 2009). While appropriate for assessing significance of a specific pair of vectors, for example when only two lineages are sampled or are of particular a priori interest, this approach is problematic for two reasons when multiple lineages are sampled.
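The pairwise-angle calculation of equations 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `change_vectors` and `pairwise_angles` and the toy trait means are hypothetical, and the change vector is taken as the mean in environment b minus the mean in environment a.

```python
import numpy as np

def change_vectors(z_a, z_b):
    """Unit-length change vectors; rows are lineages, columns are traits.

    Assumes the change is (mean in b) - (mean in a), as in equation 1.
    """
    dz = np.asarray(z_b, dtype=float) - np.asarray(z_a, dtype=float)
    return dz / np.linalg.norm(dz, axis=1, keepdims=True)

def pairwise_angles(X):
    """Angles in degrees between all pairs of unit-length row vectors."""
    cosines = np.clip(X @ X.T, -1.0, 1.0)  # guard against rounding beyond [-1, 1]
    return np.degrees(np.arccos(cosines))

# Hypothetical trait means for 3 lineages and 2 traits (illustration only)
z_a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
z_b = np.array([[1.0, 0.0], [1.0, 2.0], [3.0, 1.0]])

X = change_vectors(z_a, z_b)
theta = pairwise_angles(X)
print(np.round(theta, 1))
```

Each off-diagonal entry of `theta` is one of the pairwise angles that the element-by-element approach interprets individually.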

First, how do we interpret values of θ when more than two lineages are sampled? Lineages would be expected to differ, potentially substantially, in the direction of evolutionary change even when most are evolving in a similar direction. The problem is illustrated in Figure 2A, where three lineages are diverging in multivariate trait space at angles of θ = 45.5 degrees from each other. Each pairwise angle interpreted alone suggests evolution is largely nonparallel. Yet this element-by-element interpretation misses the fact that there is a shared common axis of divergence in multivariate trait space that all three lineages load strongly onto and that accounts for a large amount (80%) of the among-lineage variation in the direction of evolution. Moreover, multiple dimensions of parallelism and (anti)parallelism, although a likely feature of multivariate parallel evolution (Figure 2B), cannot be inferred in any intuitive way from interpreting individual angles alone. The challenge of interpreting these pairwise angles individually only becomes greater as the dimensionality of trait space and the number of lineages sampled increase, and is analogous to the within-lineage challenge of using single elements of a genetic covariance matrix to understand the distribution of genetic variation across traits (Kirkpatrick 2009), or the challenge of using single elements of a matrix of nonlinear selection gradients to infer the shape of nonlinear selection (Blows and Brooks 2003, Blows 2007a, b). Put simply, lineages may share a common axis (or axes) of directional change even when most of the individual angular distances between lineages are significantly non-zero.
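The three-lineage scenario can be checked numerically. The sketch below assumes only what the text states, namely three unit-length change vectors separated by a common angle of 45.5 degrees; the matrix of vector correlations is built directly from that angle rather than from data, and the variable names are illustrative.

```python
import numpy as np

angle_deg = 45.5                      # common pairwise angle from the text
r = np.cos(np.radians(angle_deg))     # corresponding vector correlation
m = 3                                 # number of lineages

# Matrix of vector correlations: ones on the diagonal, r elsewhere
C = np.full((m, m), r)
np.fill_diagonal(C, 1.0)

eigvals, eigvecs = np.linalg.eigh(C)  # eigh returns eigenvalues in ascending order
share = eigvals[-1] / eigvals.sum()   # proportion captured by the leading axis

q1 = eigvecs[:, -1]
q1 = q1 * np.sign(q1.sum())           # eigenvector sign is arbitrary; orient positively

print(round(share, 3))                # about 0.80, the 80% quoted in the text
print(np.round(q1, 3))                # all three lineages load equally and positively
```

Although each pairwise angle is far from zero, the leading eigenvalue is 1 + 2r, so a single shared axis accounts for roughly four fifths of the among-lineage variation in direction.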

A second and related difficulty with interpreting angular distances individually is the problem of what the null is for the angle between any given pair of lineages (Bolnick et al. 2018). Many investigators may be interested in comparing observed patterns to a null of independent directions of evolution across lineages. This null results in an expected value of 90 degrees for any pair of lineages. Yet intuitively, the null distribution should depend on both the number of traits sampled and the total number of lineages (i.e., beyond any given pair in question), as sampling many lineages evolving independently in low-dimensional trait space is expected to often result in some highly parallel pairwise combinations through chance alone. Moreover, investigators may be interested in testing more sophisticated null hypotheses, for example comparing patterns of parallelism to what may be expected due to biases from genetic correlations when estimate(s) of G are available (Schluter 1996), and it will often be difficult to define such a corresponding null angle that can be tested using the permutation approach typically used in studies of parallelism.
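The dependence of the null on the number of traits and lineages can be illustrated by simulation. The following sketch is an assumption-laden toy, not the authors' procedure: it draws independent random directions (normalized Gaussian vectors, which are uniform on the unit sphere) and records the smallest pairwise angle in each simulated set of lineages. Function names and simulation settings are hypothetical.

```python
import numpy as np

def random_unit_vectors(m, n, rng):
    """m independent random directions in n-dimensional trait space."""
    v = rng.standard_normal((m, n))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def smallest_pairwise_angle(X):
    """Smallest angle (degrees) among all pairs of unit-length rows of X."""
    cosines = np.clip(X @ X.T, -1.0, 1.0)
    upper = np.triu_indices(X.shape[0], k=1)   # each pair counted once
    return np.degrees(np.arccos(cosines[upper])).min()

def null_min_angles(m, n, n_sims=2000, seed=1):
    """Null distribution of the smallest pairwise angle under independence."""
    rng = np.random.default_rng(seed)
    return np.array([smallest_pairwise_angle(random_unit_vectors(m, n, rng))
                     for _ in range(n_sims)])

few_traits = null_min_angles(m=16, n=3)    # many lineages, low-dimensional space
many_traits = null_min_angles(m=16, n=84)  # same lineages, high-dimensional space

print(np.median(few_traits), np.median(many_traits))
```

With 16 lineages in only three dimensions, the most similar pair is typically close to parallel by chance alone, whereas in 84 dimensions even the most similar pair stays much nearer to 90 degrees.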

These challenges emerge from what is, at its essence, an element-by-element approach to analysis of angular distance matrices, and they can be circumvented by adopting approaches that consider explicitly the entire matrix of similarity among lineages' evolutionary change vectors. We can define a matrix X of $\Delta\bar{\mathbf{z}}$ vectors for n traits from m lineages as

$$\mathbf{X} = \begin{bmatrix} \Delta\bar{z}_{1,1} & \cdots & \Delta\bar{z}_{1,n} \\ \vdots & \ddots & \vdots \\ \Delta\bar{z}_{m,1} & \cdots & \Delta\bar{z}_{m,n} \end{bmatrix} \tag{3}$$

with m rows as replicate lineages and n columns as traits. From this data matrix, where each element is the evolutionary change value for a single trait from a single independently evolving lineage, and each row thus represents a $\Delta\bar{\mathbf{z}}$ vector from a given lineage, we can calculate two potentially relevant covariance matrices. One is the n × n matrix (where n is the number of measured traits) of among-lineage variances and covariances in the traits' evolutionary changes,

$$\mathbf{R} = \mathbf{X}^{T}\mathbf{X} \tag{4}$$

This captures the variances and covariances of evolutionary change across replicate lineages for each trait. Spectral decomposition (eigen analysis) of R would be relevant for understanding parallel evolution, since parallel evolution would lead to dimensions of R with very low variance. However, it is a challenge to test hypotheses about null or nearly null dimensions (Kirkpatrick 2009, Blows and McGuigan 2015). So, for the specific question of "How parallel is evolutionary change?", we may be better off focusing on a different product of X,

$$\mathbf{C} = \mathbf{X}\mathbf{X}^{T} \tag{5}$$

which is the matrix of vector correlations (assuming the rows of X have been normalized) of evolutionary change vectors for each pair of independent lineages in X. Thus C is an m × m correlation matrix, where m is the number of replicate lineages. This matrix contains ones on the diagonal, and each off-diagonal element describes the correlation between multivariate evolutionary change vectors across a pair of lineages. By describing the relationship between lineage vectors as a correlation instead of an angle (i.e., a distance), we can use the spectral decomposition (eigen decomposition) of C,

$$\mathbf{C} = \mathbf{Q}\mathbf{V}\mathbf{Q}^{-1} \tag{6}$$

where Q is the matrix of eigenvectors and V is the diagonal matrix of eigenvalues, to assess two major questions on the degree and form of multivariate parallelism that cannot be tested using univariate approaches:

1) How extensive is parallelism? Parallel (and anti-parallel) evolution will be captured by the leading eigenvector(s) of C. In principle, at the extreme where all evolutionary change is perfectly parallel, C would be of unit rank, with all variance (= m) captured in a single dimension, q1, of shared evolutionary change. At the other extreme, if evolutionary change is completely independent in direction across all lineages sampled, C would be of rank equal to m or n (whichever is lower), with variance distributed uniformly across non-null (if more lineages than traits are sampled) eigenvectors. Thus, the strength of parallelism captured by the leading eigenvector (q1) of C can be expressed as $\lambda_{1} / \sum_{i} \lambda_{i}$, where $\lambda_{1}$ is the leading eigenvalue of C. True parallelism is reflected when all lineages load positively on the corresponding eigenvector q1, while anti-parallelism would be reflected by a mixture of positive and negative loadings on q1.

2) How many dimensions of shared evolutionary change exist? (Anti)parallel evolution in multiple orthogonal directions would be reflected in multiple significant eigenvalues. When multiple dimensions are identified as significant (see below), these dimensions reflect orthogonal axes of parallel and anti-parallel evolution. Multiple dimensions suggest more than one alternative solution to a particular adaptive challenge, or evolution towards more than one phenotypic optimum, and hence reflect parallel evolution among certain comparisons between replicate descendant populations but non-parallel evolution for other such comparisons. Figure 2B illustrates such a scenario.

Thus, the eigenvectors and eigenvalues of C provide information on both the extent of multivariate parallelism and the contributions of specific lineages to parallel evolution or lack thereof. Eigenvector scores could, in principle, be related to other variables of interest, such as among-lineage environmental variation, via canonical correlation or other approaches. We can also relate the eigenvectors of C back to trait space via

$$\mathbf{A} = \mathbf{X}^{T}\mathbf{Q} \tag{7}$$

where A is a matrix with m column vectors of length n relating each eigenvector of C back to trait space (Figure 2). We can then see how traits load on these vectors to establish which traits are contributing most to parallelism.
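As a minimal sketch of the decomposition described above, the following Python function takes an m × n matrix of change vectors, forms the matrix of vector correlations as the product of the row-normalized matrix with its transpose, decomposes it, and projects the eigenvectors back into trait space. The function name, the sign convention for eigenvectors, and the simulated data are assumptions made for illustration; the original analysis was carried out in SAS/IML.

```python
import numpy as np

def parallelism_decomposition(X):
    """Eigen analysis of the lineage-by-lineage matrix of vector correlations.

    X : m x n array, rows are lineages' change vectors, columns are traits.
    Returns C (m x m), Q (eigenvectors as columns), eigenvalues (descending),
    A (n x m trait-space projection), and each eigenvalue's share of the total.
    """
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    C = X @ X.T                                        # vector correlations
    eigvals, Q = np.linalg.eigh(C)                     # symmetric matrix: use eigh
    order = np.argsort(eigvals)[::-1]                  # largest eigenvalue first
    eigvals, Q = eigvals[order], Q[:, order]
    # Eigenvector signs are arbitrary; orient each so its loadings sum to >= 0,
    # which makes "all lineages load positively" easy to read off.
    signs = np.sign(Q.sum(axis=0))
    signs[signs == 0] = 1.0
    Q = Q * signs
    A = X.T @ Q                                        # back to trait space
    share = eigvals / eigvals.sum()
    return C, Q, eigvals, A, share

# Hypothetical data: 4 lineages x 5 traits sharing one direction plus noise
rng = np.random.default_rng(0)
shared = np.array([1.0, 0.5, 0.0, -0.5, 0.0])
X = shared + 0.3 * rng.standard_normal((4, 5))

C, Q, eigvals, A, share = parallelism_decomposition(X)
print(np.round(share, 3))      # proportion of variance per dimension
print(np.round(Q[:, 0], 3))    # lineage loadings on the leading axis
print(np.round(A[:, 0], 3))    # trait loadings on the leading axis
```

Because the diagonal of the correlation matrix is all ones, the eigenvalues sum to the number of lineages, so `share[0]` is the leading eigenvalue divided by m.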

Of course, sampling error will create variance in all or most dimensions. This same sampling error will also lead to a skewed distribution of eigenvalues even in the absence of actual parallel evolution (Johnstone 2001). This leads to the problem of constructing a null expectation for evolutionary parallelism against which empirical patterns can be tested. Specifically, what is the null expectation for the distribution of eigenvalues of C? This is a problem in random matrix theory, where the distribution of eigenvalues of random covariance matrices has been of great interest (Tracy and Widom 1996, Johnstone 2001, Tracy and Widom 2009, Blows and McGuigan 2015, Sztepanacz and Blows 2017). Generally, the leading eigenvalue of a sample covariance matrix is expected to follow a Tracy-Widom distribution (Tracy and Widom 1996, 2009, Blows and McGuigan 2015, Sztepanacz and Blows 2017); however, this distribution is sensitive to centering and scaling constants that may make hypothesis testing difficult (Blows et al. 2015), and we may also be interested in assessing significance of more than one dimension (as suggested above). A null distribution of all m eigenvalues can be obtained by sampling the m-dimensional Wishart distribution with n df, where the hypothesis being tested depends on the covariance structure of the sampled Wishart distribution (Johnstone 2001). If we are interested in testing a null hypothesis of independent evolution across lineages, this would correspond to zero off-diagonals (an identity matrix) in the Wishart covariance structure. An example of such a null distribution under varying n is illustrated in Figure 3. For the case of more lineages than traits, we can simulate random vectors to establish the null empirically, although the strongly skewed distribution of eigenvalues in such a scenario indicates that exceptionally strong parallelism is required to reject the null (Figure 3). By comparing the eigenvalues of C to the distribution of eigenvalues expected under the null, we can establish whether evolution is significantly more parallel in multivariate trait space than expected by chance. And we can test whether this is true in more than one dimension, which would be reflected in multiple eigenvalues greater than the null expectation.
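One way to build such a null empirically is sketched below. It rests on several assumptions that go beyond the text: random directions are simulated as normalized standard-normal vectors (the unnormalized product of an m × n standard-normal matrix with its transpose is a draw from an m-dimensional Wishart distribution with n df and identity scale), observed eigenvalues are compared rank by rank against a 95% null quantile, and the data are simulated. The function names are hypothetical.

```python
import numpy as np

def sorted_eigenvalues(X):
    """Eigenvalues of the vector-correlation matrix, largest first."""
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    return np.linalg.eigvalsh(X @ X.T)[::-1]

def null_eigenvalues(m, n, n_sims=1000, seed=1):
    """Eigenvalues for m lineages evolving in independent random directions."""
    rng = np.random.default_rng(seed)
    return np.array([sorted_eigenvalues(rng.standard_normal((m, n)))
                     for _ in range(n_sims)])

def count_significant(observed, null, quantile=0.95):
    """Count leading eigenvalues that exceed the rank-wise null quantile.

    The rank-wise comparison and the 0.95 cutoff are illustrative choices.
    """
    threshold = np.quantile(null, quantile, axis=0)
    k = 0
    while k < len(observed) and observed[k] > threshold[k]:
        k += 1
    return k, threshold

# Hypothetical data: 16 lineages x 84 traits sharing one direction plus noise
rng = np.random.default_rng(2)
X = rng.standard_normal(84) + rng.standard_normal((16, 84))

observed = sorted_eigenvalues(X)
null = null_eigenvalues(m=16, n=84)
k, threshold = count_significant(observed, null)
print(f"leading eigenvalue {observed[0]:.2f}, "
      f"null 95% quantile {threshold[0]:.2f}, significant dimensions {k}")
```

Changing `n` in `null_eigenvalues` shows how the null eigenvalue distribution becomes more skewed as the number of traits falls relative to the number of lineages.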

Although a null hypothesis of independent directions of evolution among lineages is likely to be the most appropriate null for many tests of the presence and importance of parallelism, other null hypotheses may be of interest. For example, one may wish to test empirical patterns against a null of perfect parallelism, when researchers seek to identify population-specific (i.e., non-parallel) aspects of evolution. In such cases a null distribution of unit rank matrices would be appropriate. Alternatively, we may also be interested in the degree to which genetic covariances and selection shape parallelism. We can relate C to G, for the case of m < n, by postmultiplying the right-hand side of equation 5 by the m-dimensional identity matrix in the form of $\mathbf{X}\mathbf{X}^{T}(\mathbf{X}\mathbf{X}^{T})^{-1}$ and realizing that under directional selection and constant G, $\mathbf{R} = \mathbf{X}^{T}\mathbf{X} = \mathbf{G}\mathbf{B}\mathbf{G}$ (Zeng 1988), where B is the covariance matrix of directional selection gradients across lineages, so that $\mathbf{C} = \mathbf{X}\mathbf{G}\mathbf{B}\mathbf{G}\mathbf{X}^{T}(\mathbf{X}\mathbf{X}^{T})^{-1}$. However, because both G and R share the same dimensionality (number of traits, as opposed to lineages as with C), it may be far more straightforward to test hypotheses on the influence of G on evolutionary change via analysis of R instead of C (e.g., see Chenoweth et al. 2010). It is likely that constructing more sophisticated null expectations for contrasts in orientation may require development of theoretical models that focus specifically on the geometry of divergence in complex traits (e.g., see Thompson et al. 2019, whose model is constructed specifically to bear relevance to angular contrasts).

Example: Parallel evolution across a lake-stream gradient in threespine stickleback

Threespine stickleback in postglacial inland waters of the Northern Hemisphere represent one of the most compelling models for testing the repeatability of evolution. Repeated colonization of freshwater from marine environments has resulted in replicated divergence from a marine ancestral phenotype into multiple freshwater forms (McPhail 1993, Bell and Foster 1994, Colosimo et al. 2005). In a comprehensive study of parallel evolution in this system, Stuart et al. (2017) sampled morphology, genetics, and environmental variables from lake and stream subpopulations from 16 lineages (watersheds) on Vancouver Island, Canada. They found only limited evidence of parallelism in morphological divergence in 84-dimensional trait space between lake and stream subpopulations across the 16 replicate lineages, as inferred in part from a relatively even distribution of angular distances across all 120 pairwise lineage combinations (their figure 3b) with no clear signal of a skew towards values near zero. By correlating these pairwise angular distances in morphological change vectors with estimates of angular distance in genetic and environmental change vectors, Stuart et al. showed that deviations from parallelism among lineages are due to non-parallelism in environmental factors and levels of gene flow across subpopulations. This study is unique in leveraging additional data to explain variation in morphological parallelism, yet the high-dimensional nature of the Stuart et al. data exemplifies the challenges of inferring the overall degree of parallelism from angular distance alone.

Although element-by-element interpretation of angular distances provides little evidence of strong morphological parallelism in the Stuart et al. data, there are a few traits that were strongly parallel. In a reanalysis of their morphological dataset, we find that spectral decomposition of the matrix of vector correlations in morphological change vectors, C, reveals a strongly skewed distribution of eigenvalues (Figure 4). Comparison of this distribution to that expected under a null hypothesis of random (in direction) evolutionary change across lineages reveals strong statistical support for three dimensions of parallel evolution (Figure 4). The leading eigenvalue is greater than 7, capturing nearly 50% of the variance among lineages in the direction of evolution, and together the three significant dimensions of parallelism explain 77% of the among-lineage variation. Examination of the lineage loadings on the corresponding eigenvectors reveals that the sign of these loadings is shared by most lineages, indicating these shared axes of divergence are largely (although not completely) axes of parallelism, as opposed to antiparallelism (Figure 5). Projecting these vectors back into trait space (via equation 7) indicates that traits classified as those related to defense, swimming, and trophic interactions load most strongly on these three axes, compared to traits that are generally unclassified (Figure 6). This analysis (SAS/IML script as well as estimates of C, Q, V, and A, provided in the supplemental material) suggests an important role for morphological parallelism across lineages, and suggests traits known to play a role in performance, defense, and resource acquisition to be important in parallel adaptation across a lake/stream gradient.

By revealing statistical support for three dimensions of (imperfect) parallelism, our revisit of the Stuart et al. dataset in some ways recapitulates their findings: we do not find evidence of a single dimension of complete parallelism, which would be reflected in only a single significant eigenvalue with an eigenvector that all lineages load positively onto. Thus, patterns of evolutionary change in this system are apparently complex, and influenced by more than the simple environmental classification of ‘lake’ versus ‘stream’, and the analyses in Stuart et al. (2017) show how such deviations from parallelism can be associated with potential causal sources of variation. However, the multivariate approach we take here suggests a far more important role for morphological parallelism than can be inferred from interpretation of individual angular distances: most of the variation in evolutionary change in 84-dimensional trait space across 16 independent lineages can be captured in just three dimensions, indicating overwhelming statistical support (cf. lead eigenvalue in Figure 4 to null) to reject a null hypothesis that evolution has proceeded in random directions in trait space across these lineages. Moreover, although trait-by-trait univariate linear models suggest that the traits with clearly defined ecological functions rarely diverge in parallel in univariate space (Stuart et al.'s Figure 1A), our multivariate analysis reveals that these traits are in fact the most important contributors to shared axes of parallel divergence in multivariate trait space (Figure 6). This is consistent with biological intuition that traits related to swimming performance, defense from predators, and feeding may play an important role in adaptation to lake versus stream environments. Thus, our reanalysis of the Stuart et al. dataset illustrates the utility of taking a multivariate approach to assessing the overall degree of parallelism, an approach which complements their original analyses relating deviations from perfect parallelism to genetic and environmental variation.

313

314 Convergent and in multivariate trait space

315 We can think of convergent versus divergent evolution as a distinct and non-exclusive

316 phenomenon from parallelism (Figure 1; Bolnick et al. 2018). Past workers have proposed

317 separate (from analysis of parallelism) tests of convergence via estimation of the distribution of

318 pair-wise differences in the lengths of evolutionary change vectors

∆ || ∆ || || ∆ || (8)

319 calculated for each pairwise combination of lineages, where non-zero values imply some degree

320 of convergence or divergence (Collyer and Adams 2007, Stuart et al. 2017, Bolnick et al. 2018).

321 Alternatively, estimation of the change in Euclidean distance between lineage pairs across

322 environments/timepoints a and b

,,, ,,, (9)

323 can be calculated, where negative values would indicate convergence between a given lineage

324 pair and positive values divergence (Bolnick et al. 2018), and this approach could perhaps be

325 modified by instead calculating Mahalanobis distance standardized by a pooled covariance

326 matrix. The former approach (equation 8) is problematic because it is difficult to interpret how

327 values of ΔL correspond to convergence vs divergence (Bolnick et al. 2018). The latter

328 approach is essentially an indirect comparison of the variance among lineages in one

329 environment versus the other, which captures multivariate convergence versus divergence in an


330 intuitive, albeit somewhat indirect (when more than two lineages are sampled) way. Both

331 approaches are problematic in that they do not allow assessment of which traits contribute most

332 to convergence versus divergence, nor do they account for more complex scenarios where the

333 form of convergent/divergent evolution differs across traits or trait combinations (Figure 7).
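The two pairwise statistics in equations 8 and 9 can be computed directly from matrices of lineage trait means. The following is a minimal NumPy sketch rather than the implementation used in the cited studies. The function name `pairwise_convergence_stats`, the toy data, and the convention that each evolutionary change vector is the trait mean in b minus the trait mean in a are assumptions made for illustration.

```python
import numpy as np
from itertools import combinations


def pairwise_convergence_stats(z_a, z_b):
    """Return (i, j, delta_L, delta_D) for every pair of lineages.

    z_a, z_b: (lineages x traits) arrays of trait means in
    environments/timepoints a and b (hypothetical input layout).
    """
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    delta = z_b - z_a                        # assumed change vectors
    lengths = np.linalg.norm(delta, axis=1)  # vector lengths
    results = []
    for i, j in combinations(range(delta.shape[0]), 2):
        delta_L = lengths[i] - lengths[j]            # difference in lengths
        dist_a = np.linalg.norm(z_a[i] - z_a[j])     # Euclidean distance in a
        dist_b = np.linalg.norm(z_b[i] - z_b[j])     # Euclidean distance in b
        results.append((i, j, delta_L, dist_b - dist_a))
    return results


# Hypothetical toy data: two lineages, two traits.
z_a = np.array([[0.0, 0.0], [4.0, 0.0]])
z_b = np.array([[1.0, 3.0], [3.0, 3.0]])
stats = pairwise_convergence_stats(z_a, z_b)
```

In this toy case both lineages change by the same amount, so the difference in vector lengths is zero even though the lineages end up closer together (the change in distance is negative). This illustrates why the length-based statistic is hard to read as convergence or divergence.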

334 A more direct approach is to compare the among-lineage covariance matrices of trait

335 mean values, D (Lande 1979, Lande 1980), estimated separately for the two environments or

336 timepoints. Given we have vectors of trait means z̄_a and z̄_b corresponding to

337 timepoint/environment a and b for each lineage, we can calculate their separate among-lineage

338 covariance matrices D_a and D_b. Multivariate convergent evolution would result in a reduction in

339 among-lineage variance, and a reduction in the size of the covariance matrix D. Thus a

340 comparison of the traces of D_a and D_b provides a test of whether evolution has been net-

341 convergent or net-divergent, similar to the analysis of pairwise Euclidean distances as in

342 equation 9.
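As a rough illustration of the trace comparison, the sketch below estimates an among-lineage covariance matrix of trait means in each environment and compares their traces. It is a point-estimate sketch only, with no significance test; the helper name `among_lineage_D` and the toy trait means are hypothetical.

```python
import numpy as np


def among_lineage_D(z):
    """Among-lineage covariance matrix of trait means.

    z: (lineages x traits) array of trait means in one environment.
    """
    return np.cov(np.asarray(z, dtype=float), rowvar=False)


# Hypothetical trait means for four lineages and two traits.
z_a = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
z_b = np.array([[1.0, 1.0], [3.0, 1.0], [1.0, 3.0], [3.0, 3.0]])

D_a = among_lineage_D(z_a)
D_b = among_lineage_D(z_b)

# A ratio below 1 suggests net convergence, above 1 net divergence.
trace_ratio = np.trace(D_b) / np.trace(D_a)
```

Here the lineages are half as dispersed in every trait in environment b, so the total among-lineage variance falls to one quarter of its value in environment a.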

343 The comparison of the among-lineage covariance matrices provides potential for

344 additional insight when patterns of selection or genetics produce convergence in some

345 combinations of traits, and divergence in others (Figure 7). Although the formal comparison of

346 covariance matrices is a complex statistical problem, the challenge is reduced when only a single

347 pair of matrices is to be compared (as is the case when only two environments or time points are

348 studied) and when one is interested in retaining and contrasting all principal components (Blows

349 et al. 2004). In this case, a Common Principal Components Analysis (CPCA) approach allows for

350 a straightforward test of how D_a and D_b differ that is also easily interpretable (Flury 1988,

351 Phillips and Arnold 1999). In the simplest case, a nested multivariate mixed effects model can


352 be used to test the null of D_a = D_b, rejection of which indicates convergence (or divergence) in

353 some combination of traits.

354 Finally, we note that trait scaling represents a particular challenge in analysis of

355 convergence regardless of analytic approach. What do we do when traits are expressed in

356 different units, or differ substantially in value, or evolvability? In some cases, analysis of the

357 raw trait values (population means) may be appropriate. Alternatively, one could center the

358 population means and scale them by a pooled estimate of evolvability, such as the pooled within-

359 population phenotypic or genetic covariance matrix P or G, if available. Regardless, we

360 emphasize that the issue of trait scale is a complex and critical topic (see Houle et al. 2011), that

361 should be carefully considered in studies of multivariate evolutionary change.
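One way to carry out the centering and scaling described above is to transform the population means by the inverse square root of the pooled within-population matrix. This is a sketch under stated assumptions: the function name `scale_by_pooled_P` and the example values are hypothetical, and P is assumed to be symmetric and positive definite.

```python
import numpy as np


def scale_by_pooled_P(means, P):
    """Center population means and scale them by P^(-1/2).

    means: (populations x traits) array of trait means.
    P: (traits x traits) pooled within-population covariance matrix,
       assumed symmetric and positive definite.
    """
    means = np.asarray(means, dtype=float)
    centered = means - means.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.asarray(P, dtype=float))
    P_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return centered @ P_inv_sqrt


# Hypothetical example: trait 1 has four times the variance of trait 2.
P = np.array([[4.0, 0.0], [0.0, 1.0]])
means = np.array([[0.0, 0.0], [2.0, 1.0]])
scaled = scale_by_pooled_P(means, P)
```

After this transformation, squared Euclidean distances between the scaled means equal squared Mahalanobis distances between the original means with respect to P, so traits measured in different units contribute on a common scale.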

362

363 Example: Multivariate divergent evolution during adaptation to a novel environment

364 Here we provide no novel analyses, but rather highlight a published study employing the

365 approaches outlined above to test hypotheses related to multivariate convergent evolution.

366 Although a number of studies have analyzed eigenstructure of D matrices to test hypotheses on

367 the form of divergence across lineages (Schluter 1996, Blows and Higgie 2003, McGuigan et al.

368 2005, Hohenlohe and Arnold 2008, Kolbe et al. 2011, Punzalan and Rowe 2016, De Lisle and

369 Rowe 2017), the study of Schoustra et al. (2012) provides an example of how formal comparison

370 of D matrices across environments provides a straightforward approach to testing hypotheses

371 related to multivariate convergence. In this study the authors used a laboratory experimental

372 evolution approach to understand how population size influences convergent/divergent

373 evolution. To do so they adapted replicate lineages of filamentous fungus (Aspergillus nidulans),

374 originating from a common ancestor, to a novel laboratory environment under two population


375 size treatments, low and high. They then measured four traits related to colony formation at the

376 conclusion of 800 generations of adaptation. They used a nested multivariate mixed effects

377 model and found no evidence to reject the null of equal D matrices across treatments, indicating no evidence of

378 multivariate convergence/divergence across population size treatments. They confirmed this

379 result using CPCA, finding no evidence to reject a null of matrix equality in favor of

380 proportionality. Finally, Schoustra et al. (2012) used a factor analytic mixed modelling approach

381 to estimate the rank of the pooled matrix, finding statistical support for a full rank divergence

382 matrix. Together, these analyses indicate divergent evolution in all trait dimensions following

383 adaptation from a common ancestor, and that this divergence was not influenced by population

384 size environments. Importantly, this study illustrates how analysis of D matrices in a study of

385 convergent evolution allows for biological hypotheses related to multivariate convergence to be

386 defined in terms of specific statistical contrasts of D across environments of interest.

387

388 Discussion and Conclusions

389 Here we advocate for an explicit multivariate approach in studies of parallel and convergent

390 evolution. Spectral decomposition and comparisons of two categories of covariance matrices,

391 the matrix of vector correlations between lineage evolutionary change vectors (C) and the

392 among-lineage covariance matrices in trait mean values (D), allow biologists to test hypotheses

393 concerning the extent and form of multivariate parallel or convergent evolution (respectively),

394 and to identify the specific lineages and traits underlying these patterns. Whenever multiple

395 traits from multiple lineages are sampled, such an approach provides a potentially more complete

396 picture of the degree and extent of evolutionary change than does element-by-element analysis

397 (e.g., pairwise angles), as is often employed in such studies.


398 The general approaches we advocate are certainly not new to evolutionary biology,

399 having been developed, described, and applied extensively in the evolutionary quantitative

400 genetics literature. Moreover, such approaches are likely already conceptually grasped by

401 workers familiar with the mechanics of a principal component analysis, and are readily

402 implemented in existing software capable of performing simple matrix manipulations (e.g., R,

403 SAS/IML, Matlab) and perhaps fitting of mixed effects models. Relevant purpose-built

404 statistical packages are also available (e.g. Melo et al. 2015). The point of our paper, then, is

405 simply to highlight one growing (Bolnick et al. 2018, Stuart 2019) subfield of evolutionary

406 biology – quantitative studies of evolutionary parallelism – where these multivariate approaches

407 may be particularly useful but are often not yet employed. Indeed, in many cases ascertaining the

408 true degree and form of parallel evolution, in particular, may be otherwise impossible.

409 We have suggested that spectral decomposition of among-lineage covariance matrices

410 may often indicate a greater extent of parallelism than is implied through element-by-element

411 interpretation of angular distances. Our reanalysis of a high dimensional dataset of

412 morphological evolution in stickleback lineages supports this suggestion, and indicates a greater

413 role for parallelism than can be concluded from individual interpretation of angular distances

414 (Stuart et al. 2017). Of course, whether this suggestion is broadly true is an open empirical

415 question requiring additional studies from a variety of taxa. Moreover, our work suggests that in

416 some cases, especially when trait dimensionality is low, comparison of the eigenvalues of C to

417 an appropriate null may indicate little support for parallelism where an element-by-element

418 approach would suggest otherwise. This is because large correlations between evolutionary

419 change vectors from independent, randomly-diverging lineages are expected when many

420 lineages are sampled relative to the number of traits measured. Although high trait


421 dimensionality often presents a challenge in analysis of multivariate datasets, analysis of vector

422 correlations transposes the dataframe (cf. equations 4 and 5) in a way that flips the burden

423 of statistical power to favor sampling of traits. However, an important caveat is that any test of

424 convergent versus divergent evolution via comparison of D matrices, as also suggested here and

425 illustrated in past studies (e.g. Schoustra et al. 2012), will favor sampling of lineages over traits.

426 This suggests that careful consideration of the question of interest (tests of parallelism or tests of

427 convergence) may be critical in the early stages of study design if the approaches advocated here

428 are to be employed. Alternatively, in some cases it could be desirable to identify the traits

429 contributing most to parallelism via eigen analysis of C, and then focus on contrasts of lower

430 dimensional D matrices estimated using only a subset of key traits. Ultimately, the decision of

431 number and type of traits included in any analysis is a biological, rather than statistical or

432 geometric, problem; although high trait dimensionality results in null expectations favorable to

433 uncovering parallelism, including traits arbitrarily may weaken the effects observed through

434 parallel divergence in the traits that matter. Conversely, choosing traits non-randomly can

435 amount to cherry-picking that can overemphasize parallelism.

436 Analysis of the orientation of evolutionary change, either through angular distance or

437 vector correlation, has intuitive appeal for the quantification of parallelism and repeatability in

438 evolution. When evolutionary change involves multiple traits sampled across multiple lineages,

439 the question of how repeatable is the evolutionary process? is one that will often transcend both

440 individual traits and individual lineages, and so will be most clearly addressed with the

441 multivariate approaches presented here.

442

443


444

445 Acknowledgements

446 We are grateful to Yoel Stuart and David Punzalan for discussion and their helpful comments on

447 the manuscript. Funding was provided by the University of Connecticut to DIB.

448

449

450

451

452

453

454

455

456

457

458


[Figure 1, panels A and B: Trait 2 (y-axis) plotted against Trait 1 (x-axis).]

459

460 Figure 1. Parallel and convergent evolution as separate but related patterns. Parallel

461 evolution can be defined as a shared orientation in the direction of evolutionary change across

462 two or more lineages, regardless of their starting (ancestral) position, illustrated for four lineages

463 in panel A showing perfect parallelism. Convergent evolution, however, describes a change

464 (reduction for convergence, increase for divergence) of the dispersion of lineages in trait space,

465 illustrated in panel B. Note that evolution can be any combination of parallel or convergent,

466 depending on the ancestral state of the lineages in question and the evolutionary processes at

467 play.

468

469


471 Figure 2. Parallel evolution in multivariate trait space. In panel A, vectors of evolutionary

472 change for three traits sampled from three lineages are plotted as blue arrows. Each pairwise

473 combination of lineage vectors differs in orientation by 45.5 degrees in three-dimensional trait

474 space. Analysis of these angles alone would indicate evolution is only weakly, if at all, parallel.

475 Yet all three lineages load strongly on a single major axis of evolutionary change that accounts

476 for 80% of the among-lineage variation in the orientation of evolution. The red arrow plots this

477 vector of shared multivariate parallel evolution back into three dimensional trait space (see eqn

478 3-5). Panel B illustrates a similar scenario but sampling four lineages diverging in two-

479 dimensional trait space. Red arrows plot the two orthogonal vectors of shared multivariate

480 parallel evolution, indicating one major axis of parallel evolution upon which all four lineages

481 load positively, in addition to a second axis upon which some lineage evolutionary change

482 vectors load positively, and some negatively, indicating antiparallel evolution in this direction

483 across the sample of lineages.
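The numbers in panel A can be checked with a short calculation. The sketch below assumes, as the caption states, three lineages whose change vectors all differ pairwise by 45.5 degrees, builds the corresponding matrix of vector correlations (cosines of the angles), and extracts its leading eigenvalue. Variable names are illustrative.

```python
import numpy as np

angle_deg = 45.5
r = np.cos(np.radians(angle_deg))  # vector correlation for each lineage pair

# 3 x 3 matrix of vector correlations with unit diagonal.
C = np.full((3, 3), r)
np.fill_diagonal(C, 1.0)

# eigh returns eigenvalues in ascending order; reverse to put the largest first.
evals, evecs = np.linalg.eigh(C)
evals = evals[::-1]
evecs = evecs[:, ::-1]

prop_leading = evals[0] / evals.sum()  # share of variation on the leading axis
leading_loadings = evecs[:, 0]         # loading of each lineage on that axis
```

The leading eigenvalue is 1 + 2r, roughly 2.4 of a total of 3, so about 80% of the variation falls on one axis, and all three lineages load on it with the same sign. Pairwise angles of 45.5 degrees therefore coexist with a single dominant shared axis.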


485 Figure 3. Null distributions of eigenvalues for a 16-dimensional (16 lineage) C matrix.

486 Increasing trait dimensionality relative to lineage replication leads to null distributions favorable

487 to uncovering parallelism. Under the null hypothesis of no relationship between vectors of

488 evolutionary change, which would be reflected in a perfectly uniform distribution of eigenvalues,

489 correlations between lineages (manifest as eigenvalues exceeding unity) are still expected to be

490 observed through sampling error. The magnitude of these expected leading eigenvalues increases

491 with decreasing trait dimensionality simply due to random sampling. For example, when the

492 number of traits is less than the number of lineages (vectors) sampled, strong correlations are

493 inevitable even when the null hypothesis of random evolutionary divergence across lineages is

494 true. Thus, sampling many traits relative to lineages flattens the null distribution of eigenvalues,

495 increasing power to distinguish true parallel evolution from what is expected purely from

496 sampling error under the null hypothesis. It is important to note, however, that high trait


497 dimensionality simply creates null distributions favorable to uncovering parallelism; choice of

498 traits to include in any analysis is primarily a biological problem. Distributions were obtained by

499 sampling the corresponding Wishart distribution, with the exception of the 8-trait distribution, which was

500 constructed empirically by placing random vectors in trait space. Dimensionality was chosen here for

501 continuity with the empirical example from Stuart et al. 2017.
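The dependence of the null on trait dimensionality can be explored by simulation. The sketch below uses only the empirical construction mentioned in the caption (random vectors placed in trait space), not Wishart sampling, and it is an illustration rather than the code behind the figure. The function name `null_leading_eigenvalue`, the seed, the number of simulations, and the trait counts compared are assumptions.

```python
import numpy as np


def null_leading_eigenvalue(n_lineages, n_traits, n_sims=1000, seed=0):
    """Simulate the leading eigenvalue of C for randomly oriented vectors.

    Each simulated lineage is a random direction in trait space, drawn
    as a standard normal vector normalized to unit length.
    """
    rng = np.random.default_rng(seed)
    leading = np.empty(n_sims)
    for s in range(n_sims):
        X = rng.standard_normal((n_lineages, n_traits))
        X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-length rows
        C = X @ X.T                                    # vector correlations
        leading[s] = np.linalg.eigvalsh(C)[-1]         # largest eigenvalue
    return leading


# 16 lineages, with few versus many traits (hypothetical trait counts).
few_traits = null_leading_eigenvalue(16, 8, n_sims=200)
many_traits = null_leading_eigenvalue(16, 80, n_sims=200)
```

With 8 traits and 16 lineages, C has at most 8 nonzero eigenvalues that must sum to 16, so the leading eigenvalue is at least 2 even for random vectors. With many traits the simulated leading eigenvalues sit much closer to 1, which is the flattening of the null described in the caption.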

502

503 504 505 506 507 508 509 510 511 512

513

514

515

516

517

518

519

520


[Figure 4 plot: eigenvalue (y-axis) against eigenvalue rank, 1 to 16 (x-axis); legend: null distribution (empirical), null distribution (Wishart), observed distribution.]

522 Figure 4. Distribution of eigenvalues from matrix C from Stuart et al. 2017 data, showing

523 statistical support for three orthogonal dimensions of parallel change. Yellow and green

524 boxplots indicate the expected distribution of eigenvalues under a null hypothesis of a random

525 direction of evolutionary change across lineages (zero off-diagonals in the matrix of vector

526 correlations), calculated from sampling the corresponding Wishart distribution or empirically by

527 placing random vectors in trait space (each 1,000 X). Blue indicates values observed from

528 diagonalization of the estimated matrix of vector correlations, C. C was calculated from

529 Supplementary data Table 1 in Stuart et al. (2017), with each row of X first normalized to unit

530 length.

531


532

533

[Figure 5 plots: loadings of each population on PC1, PC2, and PC3 of C (y-axes) against population (x-axes).]

539 Fig 5. Loadings for each lineage on the first three eigenvectors (principal components) of C.

540 The observation that most lineages load positively on PC1 and PC2 indicates that those are

541 dimensions of (mostly) true parallel evolution, as opposed to antiparallel evolution, which

542 would be represented as a mixture of positive and negative loadings across lineages, for example

543 PC3.

544

545

546


547

549 Fig 6. Each of the three significant dimensions of parallelism, estimated from reanalysis of

550 Stuart et al. 2017, related back to trait space. Trait classifications are taken from Stuart et al.

551 2017.

552

553


[Figure 7, panels A to D: Trait 2 (y-axis) plotted against Trait 1 (x-axis).]

556 Figure 7. Convergent and divergent evolution in multivariate trait space. Comparison of

557 among-lineage variance-covariance matrices D of mean trait values estimated at timepoint (or

558 environment) one (dashed ellipses) and two (solid ellipses) indicates the form of convergent or

559 divergent evolution. Arrows in panel A illustrate hypothetical evolutionary change vectors,

560 which are left off of subsequent panels for concision. In panel A, evolution is net-convergent and

561 all traits contribute to this convergence; matrices are proportional. In panel B, evolution is

562 convergent in only a single trait dimension. In panel C, evolution is both divergent and

563 convergent, in different combinations of traits, leading to changes in the orientation of the

564 among-lineage covariance matrix. In panel D, evolution is neither convergent nor divergent in

565 any combination of traits, despite strong parallel change across all lineages, leading to no net


566 change in the among-lineage covariance. All cases illustrate largely parallel evolutionary change

567 in overall size.

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588


589 References

Adams, D. C., and M. L. Collyer. 2009. A general framework for the analysis of phenotypic trajectories in evolutionary studies. Evolution 65:1143-1154.
Bell, M. A., and S. A. Foster. 1994. The Evolutionary Biology of the Threespine Stickleback. Oxford University Press, Oxford.
Blows, M. W. 2007a. Complexity for complexity's sake? Journal of Evolutionary Biology 20:39-44.
Blows, M. W. 2007b. A tale of two matrices: multivariate approaches in evolutionary biology. Journal of Evolutionary Biology 20:1-8.
Blows, M. W., S. L. Allen, J. M. Collet, S. F. Chenoweth, and K. A. McGuigan. 2015. The Phenome-Wide Distribution of Genetic Variance. The American Naturalist 186:15-30.
Blows, M. W., and R. C. Brooks. 2003. Measuring nonlinear selection. The American Naturalist 162:815-820.
Blows, M. W., S. F. Chenoweth, and E. Hine. 2004. Orientation of the Genetic Variance-Covariance Matrix and the Fitness Surface for Multiple Male Sexually Selected Traits. The American Naturalist 163:329-340.
Blows, M. W., and M. Higgie. 2003. Genetic constraints on the evolution of mate recognition under natural selection. The American Naturalist 161:240-253.
Blows, M. W., and K. A. McGuigan. 2015. The distribution of genetic variance across phenotype space and the response to selection. Molecular Ecology 24:2056-2072.
Bolnick, D. I., R. D. H. Barrett, K. B. Oke, D. J. Rennison, and Y. E. Stuart. 2018. (Non) Parallel Evolution. Annual Review of Ecology, Evolution, and Systematics 49:303-330.
Chenoweth, S. F., H. D. Rundle, and M. W. Blows. 2010. The contribution of selection and genetic constraints to phenotypic divergence. The American Naturalist 175:186-196.
Collyer, M. L., and D. C. Adams. 2007. Analysis of two-state multivariate phenotypic change in ecological studies. Ecology 88:683-692.
Collyer, M. L., D. J. Sekora, and D. C. Adams. 2015. A method for analysis of phenotypic change for phenotypes described by high-dimensional data. Heredity 115:357-365.
Colosimo, P. F., K. E. Hosemann, S. Balabhadra, G. J. Villarreal, M. Dickson, J. Grimwood, J. Schmutz, R. M. Myers, D. Schluter, and D. M. Kingsley. 2005. Widespread parallel evolution in sticklebacks by repeated fixation of ectodysplasin alleles. Science 307:1928-1933.
De Lisle, S. P., and L. Rowe. 2017. Disruptive natural selection predicts divergence between the sexes during adaptive radiation. Ecology and Evolution 2017:1-12.
Flury, B. D. 1988. Common Principal Components and Related Multivariate Models. Wiley, New York.
Hohenlohe, P. A., and S. J. Arnold. 2008. MIPoD: A hypothesis-testing framework for microevolutionary inference from patterns of divergence. The American Naturalist 171:366-385.
Houle, D., C. Pélabon, G. P. Wagner, and T. F. Hansen. 2011. Measurement and meaning in biology. The Quarterly Review of Biology 86:3-34.
Johnstone, I. M. 2001. On the distribution of the largest eigenvalue in principal components analysis. Annals of Statistics 29:295-327.


Kirkpatrick, M. 2009. Patterns of quantitative genetic variation in multiple dimensions. Genetica 136:271-284.
Kolbe, J. J., L. J. Revell, B. Szekely, E. D. I. Brodie, and J. B. Losos. 2011. Convergent evolution of phenotypic integration and its alignment with morphological diversification in Caribbean Anolis ecomorphs. Evolution 65:3608-3624.
Lande, R. 1979. Quantitative genetic analysis of multivariate evolution, applied to brain: body size allometry. Evolution 33:402-416.
Lande, R. 1980. Genetic variation and phenotypic evolution during allopatric speciation. The American Naturalist 116:463-479.
Lande, R., and S. J. Arnold. 1983. The measurement of selection on correlated characters. Evolution 37:1210-1226.
Langerhans, R. B., and T. J. DeWitt. 2004. Shared and unique features of evolutionary diversification. The American Naturalist 164:335-349.
Losos, J. B. 2011. Convergence, adaptation, and constraint. Evolution 65:1827-1840.
Mahler, D. L., T. Ingram, L. J. Revell, and J. B. Losos. 2013. Exceptional convergence on the macroevolutionary landscape in island lizard radiations. Science 341:292-295.
McGuigan, K. A., S. F. Chenoweth, and M. W. Blows. 2005. Phenotypic divergence along lines of genetic variance. The American Naturalist 165:32-43.
McPhail, J. D. 1993. Ecology and evolution of sympatric sticklebacks (Gasterosteus): origin of the species pairs. Canadian Journal of Zoology 71:515-523.
Melo, D., G. Garcia, A. Hubbe, A. P. Assis, and G. Marroig. 2015. EvolQG - An R package for evolutionary quantitative genetics. F1000Research 4:925.
Nosil, P., B. Crespi, and C. P. Sandoval. 2002. Host-plant adaptation drives the parallel evolution of reproductive isolation. Nature 417:440-443.
Oke, K. B., G. Rolshausen, C. LeBlond, and A. P. Hendry. 2017. How parallel is parallel evolution? A comparative analysis in fishes. The American Naturalist 190:1-16.
Phillips, P. C., and S. J. Arnold. 1989. Visualizing multivariate selection. Evolution 43:1209-1222.
Phillips, P. C., and S. J. Arnold. 1999. Hierarchical comparison of genetic variance-covariance matrices. I. Using the Flury hierarchy. Evolution 53:1506-1515.
Punzalan, D., and L. Rowe. 2016. Concordance between stabilizing sexual selection, intraspecific variation, and interspecific divergence in Phymata. Ecology and Evolution 6:7997-8009.
Schluter, D. 1996. Adaptive evolution along genetic lines of least resistance. Evolution 50:1766-1774.
Schoustra, S. E., D. Punzalan, D. Rola, H. D. Rundle, and R. Kassen. 2012. Multivariate phenotypic divergence due to the fixation of beneficial mutations in experimentally evolved lineages of a filamentous fungus. PLoS ONE 7:e5035.
Stuart, Y. E. 2019. Divergent uses of “parallel evolution” during the history of the American Naturalist. The American Naturalist 193:11-19.
Stuart, Y. E., T. Veen, J. N. Weber, D. Hanson, M. Ravinet, B. K. Lohman, C. J. Thompson, T. Tasneem, A. Doggett, R. Izen, N. Ahmed, R. D. H. Barrett, A. P. Hendry, C. L. Peichel, and D. I. Bolnick. 2017. Contrasting effects of environment and genetics generate a continuum of parallel evolution. Nature Ecology and Evolution 1:0158.
Sztepanacz, J. L., and M. W. Blows. 2017. Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory. Genetics 206:1271-1284.


Thompson, K. A., M. M. Osmond, and D. Schluter. 2019. Parallel genetic evolution and speciation from standing variation. Evolution Letters 3:129-141.
Tracy, C. A., and H. Widom. 1996. On orthogonal and symplectic matrix ensembles. Communications in Mathematical Physics 177:727-754.
Tracy, C. A., and H. Widom. 2009. The distributions of random matrix theory and their application. In S. V., editor. New Trends in Mathematical Physics. Springer, Dordrecht.
Wyman, M., J. R. Stinchcombe, and L. Rowe. 2013. A multivariate view of the evolution of sexual dimorphism. Journal of Evolutionary Biology 26:2070-2080.
Zeng, Z.-B. 1988. Long-term correlated response, interpopulation covariation, and interspecific allometry. Evolution 42:363-374.


Supplemental Material

Tables

Table S1. Matrix C of vector correlations between lineage evolutionary change vectors from Stuart et al. 2017 data

             Beaver    Boot      Comida    Frederick Joe       Kennedy   Misty     Moore     Muchalat  Northy    Pachena   Pye       Roberts   Swan      Thiemer   Village Bay
Beaver       1         -         -         -         -         -         -         -         -         -         -         -         -         -         -         -
Boot         -0.215    1         -         -         -         -         -         -         -         -         -         -         -         -         -         -
Comida       0.611     0.1764    1         -         -         -         -         -         -         -         -         -         -         -         -         -
Frederick    0.806     -0.0927   0.6859    1         -         -         -         -         -         -         -         -         -         -         -         -
Joe          0.092     0.4975    0.5374    0.2702    1         -         -         -         -         -         -         -         -         -         -         -
Kennedy      -0.607    0.5037    -0.4296   -0.6763   0.0232    1         -         -         -         -         -         -         -         -         -         -
Misty        -0.115    0.6279    0.1237    -0.0905   0.2473    0.5395    1         -         -         -         -         -         -         -         -         -
Moore        0.298     0.0589    0.4003    0.3499    0.1781    -0.0741   0.4100    1         -         -         -         -         -         -         -         -
Muchalat     0.544     -0.4092   0.5165    0.6601    0.0204    -0.7952   -0.5312   0.1722    1         -         -         -         -         -         -         -
Northy       -0.335    0.2551    -0.4651   -0.3115   -0.2904   0.5977    0.4289    -0.1105   -0.597    1         -         -         -         -         -         -
Pachena      0.753     -0.3524   0.5767    0.8734    0.1251    -0.7509   -0.3357   0.3215    0.844     -0.465    1         -         -         -         -         -
Pye          0.690     -0.2233   0.5544    0.8042    0.2033    -0.6206   -0.2439   0.3474    0.713     -0.450    0.8711    1         -         -         -         -
Roberts      0.460     -0.2462   0.6201    0.4360    0.4897    -0.3416   -0.1805   0.3868    0.423     -0.622    0.5117    0.5155    1         -         -         -
Swan         0.247     -0.1401   0.5223    0.2682    0.5166    -0.2108   -0.2532   0.1145    0.297     -0.580    0.2811    0.2563    0.8601    1         -         -
Thiemer      -0.595    0.4242    -0.5560   -0.5848   -0.2125   0.8036    0.5697    0.0630    -0.680    0.730     -0.6423   -0.5329   -0.6453   -0.648    1         -
Village Bay  0.450     0.3726    0.5420    0.5722    0.4027    -0.1337   0.2773    0.3688    0.247     -0.102    0.4878    0.5747    0.2645    0.051     -0.077    1
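
As an illustration of how the entries of a matrix such as C in Table S1 can be obtained, the following minimal sketch computes pairwise vector correlations, assuming (as is standard in phenotypic vector analysis) that the vector correlation is the cosine of the angle between two evolutionary change vectors. The function names, lineage labels, and toy vectors are hypothetical and are not taken from the Stuart et al. 2017 data.

```python
import math


def vector_correlation(u, v):
    """Cosine of the angle between two evolutionary change vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def correlation_matrix(vectors):
    """Nested dict of pairwise vector correlations, keyed by lineage name."""
    names = list(vectors)
    return {a: {b: vector_correlation(vectors[a], vectors[b]) for b in names}
            for a in names}


# Hypothetical change vectors (three made-up lineages, three traits).
toy_vectors = {
    "lineage_A": [1.0, 0.0, 0.0],
    "lineage_B": [1.0, 1.0, 0.0],   # 45 degrees from lineage_A
    "lineage_C": [-1.0, 0.0, 0.0],  # anti-parallel to lineage_A
}

C = correlation_matrix(toy_vectors)
for name, row in C.items():
    print(name, [round(value, 3) for value in row.values()])
```

In this toy case, perfectly parallel change gives a correlation of 1 and anti-parallel change gives -1, which is the scale on which the entries of Table S1 are reported.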


Table S2. Spectral decomposition of C from Stuart et al. 2017

Eigenvalues  Eigenvectors, Q
V      lineage      q1     q2     q3     q4     q5     q6     q7     q8     q9     q10    q11    q12    q13    q14    q15    q16
7.405  Beaver       0.29   0.06   0.22   0.03   0.53   -0.08  -0.36  -0.12  0.41   -0.14  0.42   0.11   0.09   0.19   0.03   0.09
3.010  Boot         -0.12  0.45   -0.06  -0.42  -0.14  -0.19  -0.02  0.16   0.45   -0.28  -0.27  -0.03  0.30   -0.24  0.13   0.10
1.983  Comida       0.27   0.27   -0.09  -0.07  0.13   -0.34  0.02   0.49   -0.18  0.19   0.23   -0.45  -0.34  -0.13  -0.03  0.07
0.998  Frederick    0.31   0.13   0.24   -0.11  0.24   -0.01  0.29   -0.11  0.05   -0.09  -0.40  0.19   -0.44  0.03   0.29   -0.44
0.598  Joe          0.11   0.36   -0.37  -0.27  -0.11  0.08   0.41   -0.48  -0.04  0.02   0.43   0.04   -0.04  0.13   -0.12  -0.03
0.543  Kennedy      -0.29  0.22   -0.14  0.12   0.05   0.37   -0.10  0.31   0.37   0.18   0.11   0.16   -0.23  -0.08  -0.38  -0.42
0.347  Misty        -0.14  0.46   0.09   0.19   0.16   -0.32  -0.07  -0.11  -0.20  0.56   -0.20  0.31   0.26   0.11   -0.03  0.04
0.292  Moore        0.12   0.30   0.13   0.70   -0.29  -0.20  0.13   -0.08  0.08   -0.43  -0.02  -0.12  -0.01  0.04   -0.22  -0.04
0.199  Muchalat     0.31   -0.15  0.13   -0.06  -0.34  -0.10  0.27   0.46   0.09   0.11   0.23   0.30   0.35   0.32   0.08   -0.24
0.179  Northy       -0.26  0.11   0.29   0.03   0.47   0.30   0.52   0.19   -0.18  -0.16  0.06   -0.21  0.33   0.02   -0.05  0.06
0.138  Pachena      0.33   -0.02  0.23   0.01   -0.06  0.17   0.19   0.03   0.04   0.11   0.03   0.40   -0.09  -0.57  -0.27  0.44
0.122  Pye          0.31   0.06   0.22   -0.01  -0.18  0.35   -0.01  -0.17  0.31   0.43   -0.24  -0.53  0.12   0.18   -0.08  0.07
0.068  Roberts      0.27   0.09   -0.36  0.33   0.09   0.27   -0.09  0.00   -0.07  0.06   0.09   -0.04  0.32   -0.42  0.49   -0.23
0.053  Swan         0.21   0.03   -0.53  0.13   0.20   0.20   0.08   0.25   0.05   -0.11  -0.35  0.18   -0.07  0.41   -0.04  0.42
0.039  Thiemer      -0.31  0.18   0.20   0.15   -0.23  0.18   0.07   0.08   0.15   0.11   0.24   0.08   -0.33  0.13   0.60   0.35
0.027  Village Bay  0.16   0.38   0.22   -0.21  -0.18  0.40   -0.44  0.10   -0.50  -0.25  0.01   0.10   0.01   0.15   -0.05  -0.03
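
A spectral decomposition of the kind reported in Table S2 can be sketched as follows. This is only an illustration under stated assumptions: it uses a dependency-free cyclic Jacobi rotation routine (a swapped-in technique chosen for self-containment, not necessarily the routine used to produce Table S2), the function name `jacobi_eigen` is hypothetical, and it is applied to the 3 x 3 block of Table S1 for the Beaver, Boot, and Comida lineages, so its eigenvalues are not expected to match those of the full 16 x 16 matrix.

```python
import math


def jacobi_eigen(matrix, tol=1e-20, max_sweeps=50):
    """Eigen-decompose a symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, Q): eigenvalues in descending order, and Q as a
    list of rows whose k-th column is the eigenvector of the k-th eigenvalue.
    """
    n = len(matrix)
    a = [list(row) for row in matrix]  # working copy, driven toward diagonal
    q = [[float(i == j) for j in range(n)] for i in range(n)]  # rotations so far
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for r in range(p + 1, n):
                if a[p][r] == 0.0:
                    continue
                diff = a[r][r] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, a[p][r])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][r] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and r of A and of Q
                    a[k][p], a[k][r] = (c * a[k][p] - s * a[k][r],
                                        s * a[k][p] + c * a[k][r])
                    q[k][p], q[k][r] = (c * q[k][p] - s * q[k][r],
                                        s * q[k][p] + c * q[k][r])
                for k in range(n):  # rotate rows p and r of A
                    a[p][k], a[r][k] = (c * a[p][k] - s * a[r][k],
                                        s * a[p][k] + c * a[r][k])
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    eigenvalues = [a[i][i] for i in order]
    Q = [[q[k][i] for i in order] for k in range(n)]
    return eigenvalues, Q


# 3 x 3 block of Table S1 (Beaver, Boot, Comida), used only as a small example.
C_block = [
    [1.0, -0.215, 0.611],
    [-0.215, 1.0, 0.1764],
    [0.611, 0.1764, 1.0],
]

eigenvalues, Q = jacobi_eigen(C_block)
print("eigenvalues:", [round(v, 3) for v in eigenvalues])
print("trace check:", round(sum(eigenvalues), 6))
```

Because C has ones on its diagonal, its eigenvalues sum to the number of lineages. The same check applies to Table S2, where the sixteen eigenvalues sum to approximately 16.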


Table S3. Matrix A relating eigenvectors of C to trait space

a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15
Standard length ------ 0.384 0.218 0.044 0.045 0.018 0.021 0.006 0.032 0.002 0.016 0.029 0.015 0.021 0.019 0.014
First dorsal spine length ------ 0.366 0.169 0.159 0.044 0.062 0.159 0.075 0.098 0.015 0.018 0.025 0.012 0.030 0.013 0.000
Second dorsal spine length ------ 0.297 0.206 0.191 0.118 0.062 0.215 0.078 0.185 0.003 0.001 0.032 0.019 0.010 0.022 0.001
Dorsal fin length ------ 0.232 0.070 0.000 0.029 0.062 0.008 0.040 0.026 0.037 0.027 0.024 0.016 0.014 0.000 0.007
Caudal peduncle depth ------ 0.494 0.011 0.102 0.111 0.002 0.064 0.007 0.040 0.038 0.017 0.012 0.018 0.003 0.008 0.008
Anal fin length ------ 0.214 0.182 0.028 0.116 0.101 0.012 0.006 0.017 0.014 0.015 0.007 0.003 0.024 0.015 0.010
Pectoral fin insertion length ------ 0.269 0.153 0.026 0.055 0.054 0.087 0.022 0.065 0.015 0.028 0.004 0.056 0.036 0.009 0.005
Body depth ------ 0.598 0.225 0.151 0.018 0.090 0.018 0.049 0.019 0.049 0.042 0.070 0.020 0.029 0.041 0.009
Mouth Length - - - - 0.218 0.152 0.025 0.033 0.008 0.054 0.013 0.003 0.024 0.006 0.010 0.006 0.045 0.038 0.046
Snout length ------ 0.386 0.242 0.047 0.025 0.134 0.060 0.079 0.004 0.033 0.012 0.016 0.037 0.028 0.012 0.011
Eye length - - - 0.236 0.294 0.079 0.037 0.036 0.144 0.016 0.050 0.016 0.049 0.008 0.067 0.010 0.004 0.012
Head length ------ 0.364 0.142 0.009 0.052 0.115 0.065 0.020 0.054 0.002 0.031 0.012 0.008 0.013 0.003 0.012
Pectoral fin width ------ 0.287 0.050 0.051 0.087 0.046 0.034 0.056 0.107 0.022 0.132 0.003 0.014 0.019 0.002 0.021
Pectoral fin length - - - - 0.267 0.160 0.043 0.067 0.051 0.028 0.126 0.024 0.121 0.041 0.010 0.008 0.025 0.003 0.004
Pectoral fin perimeter ------ 0.260 0.036 0.093 0.083 0.026 0.095 0.015 0.098 0.094 0.123 0.016 0.005 0.025 0.007 0.044
Pectoral fin area - - - - - 0.273 0.272 0.079 0.041 0.089 0.085 0.019 0.100 0.098 0.084 0.007 0.027 0.008 0.037 0.030
Buccal cavity length - - - - 0.380 0.090 0.014 0.122 0.139 0.022 0.037 0.035 0.043 0.021 0.071 0.049 0.021 0.022 0.026
Gape width - - - - - 0.441 0.003 0.026 0.098 0.016 0.020 0.023 0.074 0.016 0.038 0.004 0.042 0.019 0.047 0.004
Body width point 1 ------ 0.368 0.016 0.065 0.153 0.012 0.003 0.061 0.021 0.005 0.038 0.013 0.059 0.027 0.039 0.037
Body width point 2 ------ 0.439 0.052 0.076 0.020 0.120 0.029 0.120 0.067 0.015 0.028 0.011 0.028 0.006 0.014 0.015
Pelvic girdle width ------ 0.574 0.106 0.104 0.049 0.163 0.036 0.070 0.004 0.014 0.014 0.062 0.043 0.016 0.005 0.017
Pelvic girdle diamond width ------ 0.563 0.103 0.060 0.045 0.091 0.027 0.043 0.011 0.061 0.035 0.062 0.057 0.054 0.016 0.024
Pelvic girdle length ------ 0.344 0.270 0.088 0.131 0.105 0.018 0.030 0.064 0.068 0.009 0.064 0.045 0.029 0.006 0.004
Pelvic girdle diamond length ------ 0.363 0.217 0.015 0.079 0.056 0.012 0.022 0.039 0.018 0.026 0.014 0.008 0.047 0.001 0.007
Body width point 3 ------ 0.370 0.066 0.058 0.005 0.055 0.015 0.131 0.060 0.019 0.067 0.013 0.011 0.080 0.019 0.016
Body width point 4 ------ 0.373 0.080 0.011 0.047 0.048 0.020 0.066 0.046 0.028 0.067 0.013 0.033 0.052 0.054 0.010
Left side plate number ------ 0.064 0.046 0.106 0.293 0.168 0.137 0.023 0.040 0.026 0.064 0.073 0.054 0.022 0.034 0.012
Right side plate number ------ 0.055 0.092 0.083 0.321 0.131 0.167 0.020 0.067 0.020 0.088 0.069 0.040 0.002 0.020 0.006
Right side gill raker number - - - - 0.227 0.479 0.166 0.022 0.093 0.029 0.005 0.032 0.149 0.109 0.056 0.059 0.078 0.010 0.007
First longest raker length ------ 0.256 0.253 0.020 0.004 0.001 0.152 0.067 0.090 0.040 0.008 0.033 0.059 0.048 0.026 0.016
Second longest raker length ------ 0.262 0.228 0.029 0.001 0.009 0.138 0.102 0.070 0.053 0.005 0.017 0.081 0.034 0.020 0.006
Third longest raker length ------ 0.235 0.227 0.022 0.022 0.004 0.142 0.053 0.080 0.041 0.022 0.023 0.055 0.039 0.029 0.005
Raker density ------ 0.392 0.065 0.007 0.136 0.016 0.101 0.026 0.050 0.017 0.036 0.057 0.001 0.023 0.001 0.007


Mean pelvic spine length ------ 0.386 0.284 0.256 0.079 0.066 0.217 0.119 0.006 0.022 0.003 0.069 0.100 0.041 0.015 0.003
Jaw opening inlever length ------ 0.346 0.150 0.073 0.011 0.024 0.013 0.005 0.064 0.101 0.003 0.077 0.046 0.015 0.046 0.024
Jaw closing inlever length - - - - - 0.285 0.188 0.001 0.016 0.001 0.026 0.010 0.007 0.036 0.017 0.055 0.006 0.018 0.010 0.025
Jaw outlever length ------ 0.289 0.188 0.016 0.010 0.018 0.046 0.005 0.046 0.043 0.029 0.058 0.022 0.013 0.048 0.001
Epaxial muscle cross-sectional area ------ 0.786 0.283 1.206 0.303 0.030 0.105 0.053 0.014 0.003 0.006 0.002 0.021 0.016 0.008 0.002
Neurocranium length ------ 0.298 0.106 0.083 0.032 0.048 0.017 0.020 0.021 0.026 0.049 0.012 0.025 0.016 0.021 0.038
Epaxial muscle height ------ 0.302 0.027 0.092 0.011 0.005 0.021 0.081 0.025 0.003 0.038 0.110 0.019 0.000 0.022 0.005
Opercular 4-bar fixed length ------ 0.285 0.159 0.051 0.013 0.061 0.012 0.057 0.035 0.055 0.011 0.020 0.042 0.012 0.015 0.015
Opercular 4-bar coupler length ------ 0.265 0.124 0.032 0.032 0.088 0.034 0.073 0.046 0.033 0.007 0.009 0.028 0.011 0.026 0.024
Opercular 4-bar input length ------ 0.308 0.086 0.060 0.009 0.009 0.022 0.047 0.058 0.053 0.053 0.074 0.062 0.021 0.012 0.022
Opercular 4-bar output length ------ 0.244 0.123 0.037 0.002 0.011 0.008 0.035 0.040 0.056 0.008 0.026 0.050 0.013 0.025 0.021
Opercular 4-bar diagonal length ------ 0.265 0.129 0.022 0.035 0.096 0.028 0.046 0.015 0.032 0.017 0.030 0.021 0.006 0.020 0.020
Centroid size ------ 0.387 0.221 0.029 0.035 0.028 0.032 0.005 0.034 0.010 0.016 0.035 0.007 0.029 0.021 0.005
Landmark 1 GPA coordinate X ------ 0.134 0.097 0.076 0.095 0.192 0.112 0.002 0.107 0.004 0.052 0.033 0.018 0.015 0.010 0.009
Landmark 1 GPA coordinate Y - - - - 0.090 0.027 0.038 0.090 0.077 0.072 0.089 0.020 0.022 0.028 0.054 0.007 0.017 0.002 0.039
Landmark 2 GPA coordinate X ------ 0.177 0.256 0.060 0.105 0.054 0.153 0.129 0.010 0.010 0.045 0.033 0.095 0.014 0.027 0.011
Landmark 2 GPA coordinate Y ------ 0.155 0.072 0.034 0.077 0.140 0.045 0.069 0.016 0.014 0.038 0.030 0.003 0.028 0.017 0.034
Landmark 3 GPA coordinate X - - - - 0.065 0.192 0.059 0.012 0.125 0.071 0.094 0.154 0.101 0.063 0.009 0.038 0.013 0.050 0.048
Landmark 3 GPA coordinate Y ------ 0.381 0.319 0.065 0.056 0.070 0.028 0.024 0.124 0.023 0.050 0.010 0.010 0.024 0.030 0.005
Landmark 4 GPA coordinate X ------ 0.030 0.144 0.111 0.301 0.031 0.024 0.071 0.029 0.038 0.047 0.018 0.013 0.030 0.009 0.031
Landmark 4 GPA coordinate Y ------ 0.453 0.460 0.195 0.112 0.080 0.010 0.005 0.006 0.003 0.001 0.034 0.027 0.024 0.004 0.014
Landmark 5 GPA coordinate X ------ 0.135 0.039 0.031 0.142 0.064 0.059 0.130 0.018 0.015 0.044 0.066 0.023 0.033 0.020 0.033
Landmark 5 GPA coordinate Y ------ 0.340 0.405 0.165 0.170 0.062 0.010 0.031 0.031 0.004 0.006 0.038 0.019 0.032 0.013 0.006
Landmark 6 GPA coordinate X ------ 0.081 0.132 0.046 0.181 0.132 0.073 0.104 0.051 0.008 0.109 0.004 0.001 0.029 0.034 0.002
Landmark 6 GPA coordinate Y ------ 0.244 0.322 0.095 0.141 0.037 0.081 0.025 0.095 0.044 0.024 0.020 0.001 0.017 0.013 0.024
Landmark 7 GPA coordinate X ------ 0.165 0.128 0.034 0.054 0.103 0.017 0.011 0.038 0.045 0.060 0.062 0.014 0.005 0.030 0.023
Landmark 7 GPA coordinate Y ------ 0.005 0.265 0.014 0.020 0.007 0.053 0.062 0.001 0.003 0.030 0.069 0.019 0.040 0.034 0.017
Landmark 8 GPA coordinate X - - - - - 0.425 0.047 0.020 0.129 0.018 0.036 0.042 0.042 0.028 0.041 0.066 0.024 0.026 0.005 0.037
Landmark 8 GPA coordinate Y - - - - - 0.014 0.015 0.007 0.111 0.076 0.057 0.035 0.001 0.037 0.017 0.019 0.002 0.027 0.006 0.004
Landmark 9 GPA coordinate X ------ 0.323 0.120 0.014 0.050 0.044 0.061 0.029 0.056 0.048 0.062 0.017 0.024 0.019 0.022 0.010
Landmark 9 GPA coordinate Y ------ 0.004 0.056 0.026 0.114 0.097 0.036 0.038 0.008 0.062 0.046 0.009 0.021 0.003 0.006 0.003
Landmark 10 GPA coordinate X ------ 0.423 0.115 0.014 0.077 0.042 0.055 0.017 0.021 0.090 0.120 0.030 0.002 0.007 0.025 0.035
Landmark 10 GPA coordinate Y - - - - - 0.013 0.101 0.079 0.126 0.099 0.050 0.012 0.009 0.046 0.025 0.018 0.002 0.008 0.017 0.001
Landmark 11 GPA coordinate X ------ 0.070 0.102 0.009 0.052 0.074 0.069 0.031 0.040 0.034 0.020 0.016 0.007 0.067 0.003 0.044
Landmark 11 GPA coordinate Y - - - - - 0.165 0.147 0.124 0.051 0.005 0.009 0.069 0.099 0.057 0.053 0.050 0.019 0.015 0.014 0.026


Landmark 12 GPA coordinate X ------ 0.066 0.109 0.082 0.188 0.154 0.052 0.045 0.067 0.019 0.000 0.014 0.026 0.001 0.003 0.000
Landmark 12 GPA coordinate Y - - - - - 0.273 0.235 0.063 0.038 0.029 0.114 0.030 0.100 0.136 0.081 0.019 0.003 0.006 0.019 0.010
Landmark 13 GPA coordinate X ------ 0.240 0.049 0.184 0.297 0.112 0.121 0.102 0.011 0.017 0.050 0.000 0.012 0.058 0.007 0.033
Landmark 13 GPA coordinate Y ------ 0.104 0.181 0.003 0.088 0.151 0.084 0.077 0.032 0.011 0.014 0.003 0.005 0.023 0.008 0.017
Landmark 14 GPA coordinate X ------ 0.069 0.095 0.045 0.058 0.049 0.138 0.070 0.040 0.011 0.023 0.005 0.036 0.025 0.053 0.016
Landmark 14 GPA coordinate Y - - - - 0.242 0.311 0.034 0.036 0.159 0.104 0.030 0.020 0.003 0.011 0.036 0.084 0.005 0.025 0.016
Landmark 15 GPA coordinate X - - - - - 0.041 0.226 0.048 0.117 0.063 0.013 0.004 0.075 0.116 0.002 0.056 0.038 0.018 0.090 0.026
Landmark 15 GPA coordinate Y ------ 0.238 0.340 0.052 0.000 0.115 0.117 0.008 0.024 0.007 0.020 0.004 0.088 0.008 0.025 0.004
Landmark 16 GPA coordinate X ------ 0.163 0.316 0.059 0.037 0.159 0.046 0.050 0.065 0.028 0.040 0.040 0.062 0.046 0.041 0.013
Landmark 16 GPA coordinate Y ------ 0.212 0.211 0.055 0.036 0.093 0.034 0.156 0.017 0.013 0.005 0.079 0.023 0.010 0.021 0.012
Landmark 17 GPA coordinate X ------ 0.017 0.167 0.088 0.172 0.009 0.117 0.104 0.062 0.022 0.002 0.049 0.062 0.038 0.008 0.033
Landmark 17 GPA coordinate Y ------ 0.165 0.070 0.024 0.034 0.124 0.028 0.070 0.010 0.018 0.036 0.068 0.022 0.022 0.007 0.039
Landmark 18 GPA coordinate X ------ 0.213 0.048 0.076 0.054 0.126 0.056 0.138 0.047 0.090 0.014 0.049 0.021 0.013 0.031 0.019
Landmark 18 GPA coordinate Y ------ 0.058 0.044 0.073 0.084 0.004 0.080 0.117 0.081 0.058 0.023 0.022 0.049 0.010 0.012 0.018
Landmark 19 GPA coordinate X ------ 0.022 0.238 0.060 0.040 0.073 0.074 0.086 0.074 0.057 0.072 0.007 0.042 0.020 0.019 0.019
Landmark 19 GPA coordinate Y ------ 0.040 0.060 0.077 0.144 0.079 0.126 0.016 0.051 0.020 0.022 0.022 0.043 0.005 0.016 0.056
