bioRxiv preprint doi: https://doi.org/10.1101/306787; this version posted April 23, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Cooperate-and-radiate co-evolution between ants and plants

Katrina M. Kaur1*, Pierre-Jean G. Malé1, Erik Spence2, Crisanto Gomez3, Megan E. Frederickson1

1Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks Street, Toronto, Ontario, Canada, M5S 3B2
2SciNet Consortium, University of Toronto, 661 University Avenue, Toronto, Ontario, Canada, M5G 1M1
3Dept Ciències Ambientals, Universitat de Girona, C/Maria Aurèlia Capmany, 69, 17003, Girona, Spain

* Corresponding author: [email protected]


Abstract

Mutualisms may be “key innovations” that spur diversification in one partner lineage, but no study has evaluated whether mutualism accelerates diversification in both interacting lineages. Recent research suggests that plants that attract ant mutualists for defense or seed dispersal have higher diversification rates than non-ant associated plant lineages. We ask whether the reciprocal is true: does the ecological interaction between ants and plants also accelerate diversification in ants? In other words, do ants and plants cooperate-and-radiate? We used a novel text-mining approach to determine which ant species associate with plants in defensive or seed dispersal mutualisms. We investigated patterns of trait evolution and lineage diversification using phylogenetic comparative methods on a large, recent species-level ant phylogeny. We found that ants that associate mutualistically with plants have elevated diversification rates compared to non-mutualistic ants, suggesting that ants and plants cooperate-and-radiate.


Introduction

How have species interactions contributed to the diversification of life on Earth? Ehrlich and Raven1 famously proposed escape-and-radiate co-evolution as an engine of plant and insect diversification. Here we present evidence that “cooperate-and-radiate” co-evolution is also an important diversifying force. Recent studies have linked mutualism evolution to accelerated lineage diversification2–4, suggesting that mutualism either buffers lineages against extinction, promotes speciation as lineages enter new adaptive zones, or both5. Building on previous research showing that partnering with ants enhances plant diversification3,4, we ask if interacting mutualistically with plants enhances ant diversification.

Mutualism theory generally predicts the opposite; mutualism is expected to hinder diversification6,7 because the interdependence of partners, conflicts of interest between them, or the invasion of selfish “cheaters” should make mutualistic lineages vulnerable to extinction8–10. However, the available phylogenetic evidence strongly suggests that mutualisms often persist over long periods of evolutionary time, and may help lineages flourish8. Furthermore, recent studies show that mutualism can expand a lineage’s realized niche11,12, potentially creating ecological opportunity.

Ant-plant interactions are classic examples of mutualism that have evolved numerous times in both partner lineages, but a relationship between ant-plant mutualisms and enhanced diversification has been tested only in plants3,4,13. Ant “bodyguards” visit extrafloral nectaries (EFNs) on plants or nest in specialized plant cavities (domatia) and protect plants against herbivores or other enemies14. Ants also disperse seeds that bear lipid-rich elaiosomes3. The evolution of elaiosomes3 and EFNs4, but not domatia13, enhances plant diversification, providing one possible explanation for the rapid radiation of angiosperms famously referred to as “Darwin’s abominable mystery”15. Ants can reduce the negative effects of seed predators or herbivores on plant populations, potentially reducing extinction risk4,16, or they may help plants colonize new sites, promoting speciation3,16, although it is worth emphasizing that ants move seeds only short distances17. Since ant and angiosperm radiations are broadly contemporaneous, having diversified during the Late Cretaceous18–20, they may have ‘cooperated-and-radiated’; on the ant side, the evolution of ant-plant interactions may have made new niches available in the form of plant-derived food or nest sites, resulting in expanded ranges or decreased extinction risk.

To investigate cooperate-and-radiate co-evolution, we automated the compilation of trait data from the primary literature. Inspired by recent studies that have leveraged advances in bioinformatic pipelines21 and machine reasoning22 to characterize, for example, protein-protein interaction networks23, we took an innovative text-mining approach to compile ant-plant interaction data from the abstracts and titles of over 89,000 ant-related publications. We ground-truthed the results by manually checking a subset of abstracts for false positives and evaluated how the number of unique plant-ant species identified by our text-mining algorithm changed as the algorithm sampled more abstracts. We analyzed our trait data in conjunction with a species-level ant phylogeny using both Binary State Speciation and Extinction (BiSSE)24 and Bayesian Analysis of Macroevolutionary Mixtures (BAMM)25 models to assess the effect of mutualism on ant diversification. Thus, we evaluate for the first time whether the evolution of mutualism leads to reciprocal diversification in ants and plants through cooperate-and-radiate co-evolution.

Results

Text-mining generated a wealth of trait data (Figure 1, Figure S1). The method yielded trait data for 3342 ant species in 265 ant genera, representing 23% and 73%, respectively, of all currently recognized ant species and genera, globally. The text-mining extracted these data from approximately 15,000 abstracts, with the number of unique species increasing as the number of abstracts sampled increased (Figure 2). The species accumulation curves suggest that our lists of ant species that nest in domatia or consume plant food bodies are relatively complete, but that text-mining an even larger sample of abstracts from the primary literature would identify many more ant species that disperse seeds or visit EFNs, as well as many more non-mutualistic ant species (Figure 2).

We paired this approach with a more traditional method of manually assembling a list of seed-dispersing ants by reading the primary literature (183 articles). Our automated approach performed relatively well in comparison; 85 seed-dispersing ant species in 28 genera occurred in both our text-mining results and this hand-compiled dataset. In total, the text-mining identified 129 ant species in 39 genera as seed dispersers, compared to 268 ant species in 60 genera in the hand-compiled dataset, meaning that 44 ant species were in the text-mining results only, and 183 species were in the hand-compiled dataset only. Of the 183 seed-dispersing ant species in the hand-compiled dataset but not in our text-mining output, 149 species were described in 92 papers that were not included among our 89,000 abstracts, suggesting that the largest improvements to our text-mined trait dataset would come from having access to more abstracts, rather than from a better text-mining algorithm. The text-mining method was not immune to error, but the number of false positives declined rapidly as the number of abstracts in which ant names and trait terms co-occurred increased. We found no false positives when an ant species name co-occurred with one or more trait terms in 4 or 5 abstracts (Figure 3).
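The overlap arithmetic in the comparison above can be checked with simple set operations. The following is a minimal sketch, not the authors' script: the function name is hypothetical, and the species identifiers are placeholders sized only to reproduce the reported totals (129 text-mined, 268 hand-compiled, 85 shared).

```python
def compare_species_lists(text_mined, hand_compiled):
    """Count species shared by, and unique to, two species lists."""
    text_mined, hand_compiled = set(text_mined), set(hand_compiled)
    return {
        "both": len(text_mined & hand_compiled),
        "text_mined_only": len(text_mined - hand_compiled),
        "hand_compiled_only": len(hand_compiled - text_mined),
    }


# Placeholder identifiers (not real species names), sized to match the
# totals reported in the text.
text_mined = {f"sp{i}" for i in range(129)}               # 129 species
hand_compiled = {f"sp{i}" for i in range(44, 44 + 268)}   # 268 species

overlap = compare_species_lists(text_mined, hand_compiled)
print(overlap)
```

With these sizes the sketch recovers the 44 text-mining-only and 183 hand-compiled-only species stated above.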

Seed-dispersing and EFN-visiting ants were most commonly identified by text-mining, followed by domatia-nesting ants; relatively few ant species that consume food bodies were found by text-mining. Specifically, we identified 309 ant species in 77 genera that disperse seeds (including taxa in the hand-compiled dataset) and 3030 ant species in 261 ant genera that do not; 122 ant species in 37 genera that visit EFNs and 3182 ant species in 261 ant genera that do not; 58 ant species in 22 genera that nest in domatia and 3246 ant species in 262 genera that do not; 16 ant species in 8 genera that consume plant food bodies (other than elaiosomes) and 3288 species in 265 genera that do not. In all, we identified 432 ant species in 84 genera that are plant mutualists, and 2910 ant species in 256 genera that are not; subsequent analyses were conducted on this combined category.

After pruning the phylogeny to match the trait data, the tree comprised 795 species in 199 genera (Figure 1, Figure S1). Unlike other studies of trait-dependent diversification, we did not include species for which we had no trait information (i.e., ant species that did not appear in our text-mined abstracts). Thus, the 795 species used in the BiSSE and BAMM analyses were species for which the text-mining determined the ant species is likely a plant mutualist (432 species), as well as species for which the text-mining determined the ant species is likely not a plant mutualist (363 species); the latter ant species names appeared in at least one of our ~89,000 abstracts but were never found together with any plant mutualist terms (Table S2). Both BiSSE and BAMM models found that mutualistic ant lineages diversify faster than non-mutualistic ant lineages (Table 1, Figure 4). In the BiSSE model, there was no overlap between plant mutualists and non-mutualists in their 95% credible intervals for speciation, extinction, and transition rates, meaning these differences are statistically significant (Table S3). Both BiSSE and BAMM analyses reported higher speciation and higher extinction rates when ants evolved mutualisms with plants; however, overall diversification rates were higher in mutualistic ants, despite elevated extinction rates, because of the even greater difference in speciation rates between mutualistic and non-mutualistic ants.

Discussion

Our results build on previous research21,22 showing how automated methods can reliably and efficiently assemble large trait databases for answering macroevolutionary questions. Text-mining generated trait data for almost twice the number of ant species as in the most comprehensive phylogeny available (Nelsen & Moreau, unpublished data), and we had overlapping trait data and phylogenetic information for 795 ant species. We used these data to test whether plant-ant lineages have elevated diversification rates, a hypothesis that was supported by both BiSSE and BAMM models. Previous research has similarly found that partnering with ants for defense or seed dispersal accelerates plant diversification3,4, suggesting that ant and plant lineages are either responding to the same external factor(s) affecting diversification (e.g., biogeography), or that ants and plants cooperate-and-radiate. Thus, in addition to having direct effects on both ant and plant partners, mutualism evolution may have long-term implications for lineage diversification, such that mutualism results in reciprocal diversification in interacting lineages.

Text-mining was a successful and efficient method for assembling a large trait database, and was primarily limited by the availability of abstracts. The species accumulation curves (Figure 2) suggest that with an even larger corpus, we could acquire trait data for many more ant taxa, and especially that we would identify many more EFN-visiting and seed-dispersing ant species. Our text-mining extracted trait information similar to what is normally assembled manually, and more laboriously, from the primary literature; an automated approach could also vastly improve datasets for meta-analyses, ecological network analysis, etc. Although our text-mining algorithm returned a small number of false positives because we took any co-occurrence of a trait term and an ant species name in an abstract as evidence of an association, the number of false positives declined rapidly as trait terms and ant names were found together in more abstracts (Figure 3), again suggesting that a larger corpus would help to reduce noise in the text-mining output. We also text-mined only abstracts; assuming abstracts of papers on ant-plant mutualisms are less likely than main texts to mention non-mutualistic ants in passing, our method should be more conservative regarding false positives, but of course full-text articles contain more trait information. Our text-mining method could be improved by considering the proximity between words or the frequency of words, or by adopting a more restrictive rule for how often an ant name and trait term need to co-occur; any of these could further reduce the frequency of false positives. However, a more restrictive rule may not be necessary with a sufficiently large corpus, as our comparison between the text-mined and hand-compiled data sets shows that most of the seed-dispersing ants missed by the text-mining were described in papers not in our corpus.

Automated downloading of large numbers of abstracts proved difficult as most are behind paywalls, and even with institutional subscriptions, it is challenging to download publications en masse26. Thus, more open access publications could increase the efficacy and benefits of automated data collection methods such as text- or data-mining. Nonetheless, using text-mining to extract data from published papers could be applied to a variety of research questions, given that data is readily available in word combinations.

Overall, the evolution of ant-plant mutualisms was associated with higher ant lineage diversification; we found a positive effect of mutualism evolution on ant diversification in both analyses, although the effect size was smaller in the BAMM than in the BiSSE analysis (Table 1, Figure 4). We report both speciation and extinction values from the BiSSE model but focus our discussion on diversification rate estimates. In the BiSSE analyses, although the traits evolve independently multiple times, mutualism is very frequently lost (Table 1). One possibility is that mutualism evolves, the ant lineage radiates, and then mutualism is secondarily lost multiple times. This would explain the effect of mutualistic traits on diversification, despite the high mutualism loss rate. Perhaps a more likely explanation is that our trait dataset may include a high number of false negatives that BiSSE models reconstruct as secondary losses. If an ant name and trait term co-occur in enough abstracts, there is little doubt that the ant is a plant mutualist, but it is more challenging to be sure that ants named in abstracts that do not also contain trait terms are truly non-mutualistic.

Ant-plant interactions are often diffuse, generalized mutualisms27 and theory predicts that mutualistic lineages are prone to extinction6,7. For example, in angiosperms, domatia are thought to be an evolutionary dead end; they appear to go extinct easily in unfavourable conditions, or are as easily lost as they are gained13. However, although ants appear to revert easily back to a non-mutualistic state, we still found that mutualism positively affects ant diversification (see Figure S2 for ancestral state reconstruction). Our results also indicate that there are elevated extinction rates in mutualistic ants but the effect of this on diversification is swamped by the positive effect of mutualism on ant speciation. Compared to seed-dispersing ants, ants that nest in domatia or consume extrafloral nectar or food bodies confer very different benefits to plants (dispersal versus defense, respectively), but most of these ants nonetheless receive similar rewards: mainly food, but also housing for domatia-nesting ants. These rewards could expand the geographic range or realized niche of an ant lineage. Similar arguments have been made to explain how the evolution of EFNs or elaiosomes has accelerated plant diversification3,4. In combination with previous work on angiosperm diversification, our finding that ant-plant mutualisms also spur ant diversification suggests ants and plants have diversified in tandem via “cooperate-and-radiate” co-evolution.

Methods

Sources of phylogenetic information. We used a phylogenetic tree that includes 1,731 ant species provided to us by Drs. Matthew Nelsen and Corrie Moreau, applying the drop.tip function in the R28 package ape29 to include only tips for which we had trait data. The tree is a compilation of previously published ant trees and provides the most up-to-date inference of evolutionary relationships among ants. However, since there are 14,416 recognized valid ant species names (B. Fisher, personal communication), even a tree with 1,731 ant species is woefully undersampled. The recent phylogenetic comparative methods we used account for incomplete taxon sampling, but these methods are not without error25. We did not account for phylogenetic uncertainty, as such analyses are computationally intensive for many phylogenetic comparative methods.

Sources of trait data. We text-mined 89,495 titles and abstracts, after removing duplicates, from two sources: 1) 62,988 abstracts from FORMIS: A Master Bibliography of Ant Literature, containing all known ant literature through to 1996 (ref. 30), and 2) 52,885 ant-related publications available through Springer’s Application Programming Interface (API); Springer publishes numerous journals with substantial ant-related content, including Oecologia, Insectes Sociaux, Arthropod-Plant Interactions, and others. We used Springer’s API to search titles and abstracts for ant species names and plant traits that facilitate ant-plant mutualisms (Table S1). We downloaded the abstracts of the Springer articles that mentioned the ant species or trait terms on our list. One of us (CGL) provided a hand-compiled list of seed-dispersing ants from reading 180 journal articles; we supplemented the text-mining results with data from this hand-compiled dataset.
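The text does not say how duplicate records were detected when the two sources were merged. As an illustration only, one plausible approach is to compare records after normalizing case, punctuation, and whitespace, so that formatting differences between sources do not hide duplicates. The function names, record fields, and example records below are all hypothetical.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalized (title, abstract) pair."""
    seen, unique = set(), []
    for rec in records:
        key = (normalize(rec["title"]), normalize(rec["abstract"]))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Invented records: the first two differ only in formatting.
records = [
    {"source": "FORMIS", "title": "Ants and seeds.", "abstract": "Ants disperse seeds."},
    {"source": "Springer", "title": "Ants  and Seeds", "abstract": "ants disperse seeds"},
    {"source": "Springer", "title": "Ant nests", "abstract": "Nest architecture."},
]
unique_records = deduplicate(records)
print(len(records), "->", len(unique_records))
```

Exact matching on normalized text is deliberately conservative; records that differ by more than formatting (for example, a truncated abstract) would still need manual review.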

Text-mining approach. Because a significant amount of information can be discerned from word combinations alone31, text-mining can be an effective tool to extract information from the published scientific literature. We took a text-mining approach to compile trait data on ant-plant associations; specifically, to determine which ant lineages consume food bodies, nest in domatia, visit EFNs, and disperse seeds. We created a term-document matrix to hold the trait data using the Pandas and NumPy packages in Python 2.7.12 (refs. 32, 33). Our Python script used an n-gram approach that allowed us to identify short sequences of words, such as ant binomial names or trait terms31. Our n-grams were: 1) all 14,416 currently valid ant species names in a global list of ants, excluding extinct taxa, and 2) a list of traits related to ant-plant mutualisms (Table S2). The resulting term-document matrix identifies whether 1) each ant name appeared without a trait or 2) each ant name appeared with a trait, in each publication’s title or abstract. We text-mined trait terms in four broad categories (domatia, extrafloral nectar, food bodies, and seed dispersal) to capture many different types of ant-plant mutualisms. We used the term-document matrix to score each mutualism category as a discrete binary trait for each ant species, and then further combined all the data into a single binary “plant mutualist” category.
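The co-occurrence scoring described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' script: it uses plain dictionaries rather than Pandas and NumPy, the function name is hypothetical, and the species list, trait terms, and abstracts are invented stand-ins for the full name list and Table S2.

```python
import re

# Hypothetical subsets; the study used all 14,416 valid names and Table S2.
ANT_NAMES = ["Formica rufa", "Lasius niger", "Atta cephalotes"]
TRAIT_TERMS = {
    "domatia": ["domatia", "domatium"],
    "extrafloral nectar": ["extrafloral nectar"],
    "food bodies": ["food bod"],
    "seed dispersal": ["elaiosome", "myrmecochor", "seed dispersal"],
}


def score_documents(documents, ant_names, trait_terms):
    """Score each named species 0/1 per trait category by co-occurrence.

    A species gets a row only if its name appears in at least one document;
    a category is scored 1 if any of its terms appears in the same document.
    """
    scores = {}
    for doc in documents:
        text = doc.lower()
        present = [cat for cat, terms in trait_terms.items()
                   if any(re.search(r"\b" + re.escape(t), text) for t in terms)]
        for name in ant_names:
            if re.search(r"\b" + re.escape(name.lower()) + r"\b", text):
                row = scores.setdefault(name, {cat: 0 for cat in trait_terms})
                for cat in present:
                    row[cat] = 1
    # Combine the four categories into one binary "plant mutualist" trait.
    for row in scores.values():
        row["plant mutualist"] = int(any(row.values()))
    return scores


documents = [  # invented abstracts
    "Formica rufa workers carried seeds bearing an elaiosome to the nest.",
    "Colony founding in Lasius niger under laboratory conditions.",
    "Lasius niger and Formica rufa compete for honeydew.",
]
scores = score_documents(documents, ANT_NAMES, TRAIT_TERMS)
```

Here the first species is scored as a seed disperser (and hence a plant mutualist), the second as a non-mutualist, and the third receives no score because it is never named, mirroring the distinction between absences and missing data drawn in the text.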

Text-mining validation. If an ant species name never occurred in the same abstract as a trait term (Table S2), but appeared in at least one abstract without a trait term, it was scored as an absence (0). If an ant species name co-occurred with a trait term in at least one abstract, it was scored as a presence (1). Both false positives and false negatives are a concern with text-mined trait data, although it is worth noting that other methods of gathering trait data can also be error-prone at such a large scale. Nonetheless, we validated our text-mined trait data in several ways. First, we scrutinized cases in which ant species names co-occurred with trait terms in 5 or fewer abstracts. We read each of these abstracts and manually scored the ant species in question as a true (1) or false (0) association with the trait term. We compared these manual scores to the text-mining results to determine how the number of false positives declined as the number of abstracts containing ant names and trait terms increased. The trait data for the manually checked abstracts were corrected if required for all subsequent analyses. Although we removed most duplicate abstracts from our corpus during initial processing, because not all abstracts were identically formatted, we discovered a few additional duplicates while reading abstracts and adjusted the dataset accordingly. We also calculated species accumulation curves, in which we determined how the number of unique plant-ant and non-plant-ant species increased as a function of the number of abstracts that were text-mined. Finally, we also compared the text-mined data to our hand-compiled data set.
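A species accumulation curve of the kind described above can be computed by tracking the running count of unique species as abstracts are added. The sketch below is an assumption about the general procedure, not the authors' code; in particular, the text does not state the order in which abstracts were sampled, so the optional shuffle is illustrative, as are the function name and the toy input.

```python
import random


def accumulation_curve(abstract_species, seed=None):
    """Cumulative number of unique species after each abstract.

    abstract_species: one set of species names per abstract.
    seed: if given, abstracts are visited in a reproducible random order.
    """
    order = list(abstract_species)
    if seed is not None:
        random.Random(seed).shuffle(order)
    seen, curve = set(), []
    for species in order:
        seen.update(species)
        curve.append(len(seen))
    return curve


# Toy input: five abstracts naming species A to D.
abstract_species = [{"A", "B"}, {"B"}, {"C"}, set(), {"A", "C", "D"}]
curve = accumulation_curve(abstract_species)
print(curve)  # [2, 2, 3, 3, 4]
```

A curve that flattens as more abstracts are added suggests a nearly complete species list, whereas one that is still rising suggests that more species remain to be found.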

Lineage diversification analyses. Rabosky and Goldberg34 showed that low transition rates and rare shifts can cause high type I error in SSE models, but this is not a problem in our analyses because ant-plant mutualisms have evolved many times independently across the ant phylogeny (Figure 1). However, Rabosky and Goldberg34 also highlighted an SSE model inadequacy in which neutral traits spuriously influence diversification rates. For this reason, we assessed the influence of being a plant-mutualist on lineage diversification in two ways, using: 1) Binary State Speciation and Extinction24 (BiSSE) models implemented using the diversitree35 package in R28 and 2) Bayesian Analysis of Macroevolutionary Mixtures (BAMM) models using bamm 2.5.0 and BAMMtools36 in R28.

BiSSE estimates the transition rates between states (q01 and q10) as well as state-specific extinction and speciation rates (mu0 and mu1, and lambda0 and lambda1, respectively). We ran a BiSSE model using a global proportion of missing data, specifically the number of species in the tree divided by the total number of currently recognized ant species. To test whether mutualist and non-mutualist lineages have different diversification rates, we ran a Markov Chain Monte Carlo (MCMC) BiSSE analysis with an exponential prior with rate 1/(2r), where r is the character-independent diversification rate. An initial MCMC was run with a tuning parameter of 0.1 for 1,000 generations. The revised tuning was calculated from the width of the middle 90% of the posterior samples from these initial runs. The MCMC analysis was subsequently run for 10,000 generations. We estimated net diversification (speciation – extinction) rates in mutualistic and non-mutualistic ant lineages. We calculated 95% credible intervals of the posterior samples for each parameter from the BiSSE run.
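The posterior summaries described above reduce to simple arithmetic on the MCMC samples. The sketch below is illustrative rather than the authors' R code: the function names are hypothetical, the posterior draws are placeholders and not output from the study, the intervals are equal-tailed percentile intervals (the text does not say which kind was used), and the sampling fraction assumes the pruned 795-species tree as the numerator.

```python
def net_diversification(speciation, extinction):
    """Per-sample net diversification rate (speciation minus extinction)."""
    return [lam - mu for lam, mu in zip(speciation, extinction)]


def credible_interval(samples, mass=0.95):
    """Equal-tailed interval from posterior samples (linear interpolation)."""
    xs = sorted(samples)

    def quantile(q):
        pos = q * (len(xs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    tail = (1 - mass) / 2
    return quantile(tail), quantile(1 - tail)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Assumption: species in the pruned tree / recognized ant species.
sampling_fraction = 795 / 14416

# Placeholder posterior draws (state 0 = non-mutualist, 1 = mutualist).
lambda0, mu0 = [0.10, 0.11, 0.12, 0.10], [0.05, 0.06, 0.05, 0.06]
lambda1, mu1 = [0.30, 0.32, 0.31, 0.33], [0.10, 0.12, 0.11, 0.12]

ci0 = credible_interval(net_diversification(lambda0, mu0))
ci1 = credible_interval(net_diversification(lambda1, mu1))
print(intervals_overlap(ci0, ci1))
```

Non-overlapping intervals for the two states correspond to the criterion used in the Results to call the rate differences significant.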

We also ran BAMM analyses three times on the same tree. Each BAMM analysis was run for 10 million MCMC generations, sampling the parameters every 100,000 generations. To account for incomplete taxon sampling, we supplied a global sampling fraction: the number of ant species in the tree divided by the total number of identified ant species (see supplementary file). Rate priors were calculated for each tree using BAMMtools36 and convergence of the BAMM runs was also tested using the R28 package coda37. To assess if diversification rates differed in mutualistic and non-mutualistic ant lineages, we used the subtreeBAMM and getCladeRates functions in BAMMtools36 to assess whether there were areas of the tree in the mutualist state that had higher diversification rates than the non-mutualist state. We used the BAMM output to assess the diversification rate of mutualistic and non-mutualistic ant lineages.

Data availability. Trait and species information used in the text-mining will be available in DataDryad upon acceptance.

Code availability. Text-mining code (Python script) and lineage diversification code (R scripts for BiSSE and BAMM analyses) will be available in DataDryad upon acceptance. Scripts are also available on GitHub (repository name: Cooperate-and-Radiate).

Acknowledgements

We thank Matthew Nelsen, Corrie Moreau, and Brian Fisher for providing us with phylogenetic and taxonomic information. We also thank Luke Mahler, James Thomson, Santiago Claramunt, Asher Cutter, Marjorie Weber, Rebecca Batstone, Emily Dutton, Jason Laurich, Anna O’Brien, Shannon-Meadley Dunphy, Mitchel Trychta, and Tia Harrison for their help and feedback over the course of this project. We acknowledge the financial support of a Natural Sciences and Engineering Research Council (NSERC) Discovery Grant and Discovery Accelerator Supplement to MEF.

Author contributions

MEF, PJM, and KMK conceived the study and approach. PJM and KMK obtained abstracts; KMK and ES wrote and edited the text-mining script. CGL generated the hand-compiled list of papers. KMK ran phylogenetic analyses and led the writing of the manuscript, with contributions from MEF.

Additional information

Supplementary information is available for this paper.
Correspondence and requests for materials should be addressed to KMK.

321



Tables

Table 1. Parameter estimate summary from MCMC for the BiSSE analysis and post-burn-in MCMC result summary for the BAMM analysis. Means are reported with standard deviations in parentheses.

Parameter           BiSSE              BAMM
Speciation0         0.11 (0.017)       0.13 (0.0073)
Speciation1         0.69 (0.13)        0.15 (0.010)
Extinction0         0.10 (0.019)       0.045 (0.0082)
Extinction1         0.50 (0.16)        0.064 (0.0011)
Q01                 0.017 (0.0039)     -
Q10                 0.20 (0.028)       -
Diversification0    0.0052 (0.0078)    0.080 (0.0022)
Diversification1    0.19 (0.028)       0.089 (0.0032)
Overall Rate1-0     0.19 (0.033)       0.0098 (0.0023)

Figures

[Figure 1 image: ant genus names as tip labels. Legend: Plant Mutualist, Seed Dispersal, EFN, Food Body, Domatia.]

Figure 1. Visualization of trait data, showing which genera (N = 199) contain species that nest in domatia (pink), consume food bodies (yellow), visit EFNs (purple), disperse seeds (orange), or engage in any plant mutualism (a combination of all traits; green). Black bars show the number of species in each genus. The full species-level phylogeny and trait data used in analyses are in the Supplementary Information.


[Figure 2 plots: x-axis, Number of Abstracts (0 to 15000); y-axis, Number of Species.]

Figure 2. Species accumulation curves showing how the number of unique ant species increased as more abstracts were sampled, for ants that do not interact mutualistically with plants (top panel) and for ants that disperse seeds (orange), visit EFNs (purple), nest in domatia (pink), or consume food bodies (yellow) (bottom panel). Nearly 15,000 abstracts contained ant names. Grey lines are means, and black or colored regions are standard deviations, calculated by sub-sampling abstracts 100 times at each x-axis value.
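The sub-sampling behind an accumulation curve of this kind can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `accumulation_curve`, the input layout (one set of species names per abstract), sampling without replacement, and the use of the population standard deviation are all assumptions made for the example, and the toy data are invented.

```python
import random
import statistics


def accumulation_curve(abstract_species, sample_sizes, n_reps=100, seed=0):
    """Return {n: (mean, sd)} of unique-species counts over random subsamples of n abstracts."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    curve = {}
    for n in sample_sizes:
        counts = []
        for _ in range(n_reps):
            # Assumption: abstracts are drawn without replacement.
            sample = rng.sample(abstract_species, n)
            counts.append(len(set().union(*sample)))
        # Assumption: population standard deviation across the replicates.
        curve[n] = (statistics.mean(counts), statistics.pstdev(counts))
    return curve


# Invented toy input: each set holds the species names found in one abstract.
toy_abstracts = [
    {"sp_A"},
    {"sp_A", "sp_B"},
    {"sp_C"},
    {"sp_B"},
    {"sp_D", "sp_C"},
]
toy_curve = accumulation_curve(toy_abstracts, sample_sizes=[1, 3, 5])
for n, (mean, sd) in toy_curve.items():
    print(n, round(mean, 2), round(sd, 2))
```

When every abstract is sampled, all replicates see the same species, so the mean equals the total number of unique species and the standard deviation is zero. A curve that has flattened before that point suggests that additional abstracts would add few new species.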


[Figure 3 plot: x-axis, Number of Occurrences (1 to 5); y-axis, Frequency (Count); legend: False Positive, True Positive.]

Figure 3. When an ant species name appeared with a trait term in five or fewer abstracts, the abstracts were manually scored to check for false positives. No false positives were detected when an ant species name and a trait term co-occurred in at least four abstracts.
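The co-occurrence counting that feeds a check like this one could look roughly like the sketch below. It is an illustration under stated assumptions, not the published text-mining script: the helper names `cooccurrence_counts` and `flag_for_manual_check`, the case-insensitive substring matching, and the toy abstracts and species names are all hypothetical.

```python
from collections import Counter


def cooccurrence_counts(abstracts, species_names, trait_terms):
    """Count, per species, the abstracts that mention both the species and any trait term."""
    counts = Counter()
    for text in abstracts:
        lowered = text.lower()
        if not any(term.lower() in lowered for term in trait_terms):
            continue  # this abstract contains no trait term
        for name in species_names:
            # Assumption: plain case-insensitive substring match on the full name.
            if name.lower() in lowered:
                counts[name] += 1
    return counts


def flag_for_manual_check(counts, max_count=5):
    """Return species whose co-occurrence count is at or below max_count."""
    return sorted(name for name, n in counts.items() if n <= max_count)


# Invented abstracts and hypothetical species names, for illustration only.
toy_abstracts = [
    "Species alpha workers nest in domatia of the host plant.",
    "We recorded species alpha foraging on leaf litter.",
    "Domatia were occupied by species beta and species alpha.",
]
toy_counts = cooccurrence_counts(
    toy_abstracts, ["species alpha", "species beta"], ["domatia"]
)
toy_flagged = flag_for_manual_check(toy_counts)
print(dict(toy_counts), toy_flagged)
```

Species with low counts are the ones most likely to be incidental mentions, which is why a cut-off such as five abstracts is a natural place to switch from automated to manual scoring.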


Figure 4. Diversification rates from (A) BiSSE and (B) BAMM analyses for ant lineages that are plant mutualists (green) and ant lineages that are not plant mutualists (grey).