Survival and selection biases in early animal evolution and a source of systematic overestimation in molecular clocks
Graham E. Budd1 and Richard P. Mann2,3
1 Department of Earth Sciences, Palaeobiology, Uppsala University, Villavägen 16, Uppsala, Sweden, SE 752 36. [email protected]
2 Department of Statistics, School of Mathematics, University of Leeds, Leeds LS2 9JT, United Kingdom. [email protected]
3 The Alan Turing Institute, London NW1 2DB, United Kingdom
Abstract
Important evolutionary events such as the Cambrian explosion have inspired many attempts at explanation: why do they happen when they do? What shapes them, and why do they eventually come to an end? However, much less attention has been paid to the idea of a “null hypothesis” – that certain features of such diversifications arise simply through their statistical structure. Looking back from our own perspective to the origins of large groups such as the arthropods, or even the animals themselves, will expose features that look causal but are in fact inevitable. Here we review these features with particular regard to the Cambrian explosion. We conclude that the fossil record of the late Ediacaran to Cambrian is very likely to be recording a true evolutionary radiation at this time; and show how the unusually rapid nature of this event leads to characteristic over-estimation of its time of origin by molecular clock methods – an artefact that is likely to apply to other unusually fast radiations too.
Keywords: Crown groups, diversification rates, the push of the past, molecular clocks, Cambrian explosion.
Introduction
The early evolution of the animals remains a remarkably contentious topic, with a lack of agreement even on fundamental issues such as when it took place. The majority view in the field is probably that animals evolved considerably before their undoubted fossil record commenced (e.g. [1][2][3][4][5]). Such a view is based on several lines of evidence: (1) molecular dates, which have consistently placed the timing of (for example) bilaterian origins tens to hundreds of millions of years before the Cambrian [2][4]; (2) biomarkers that have been used to place the origin of sponges in the Cryogenian [6]; (3) a general view that a necessary period of evolution is required before the fossil record (and its biogeography) can be generated [7][8]; and (4) a view that the fossil record is in general patchy and unlikely to be reliable in any useful way for documenting the precise timing of animal origins (see discussion in [5]).
Standing against this sort of view is what is probably the minority one: that the fossil record of early animals can be read more or less “as is”, despite its obvious imperfections [9][10][11][12]. In particular, the origin of bilaterian-like trace fossils from later than 560 Ma in the late Ediacaran [13] is seen as an important marker that provides a backstop for the latest time of origin of crown-group bilaterians, and which probably indicates the time of entry of stem-group bilaterians into the fossil record. The rapid, but resolvable, appearance of body and trace fossil taxa in the succeeding Terreneuvian Series is thus seen as the unfolding of the crown-group bilaterian radiation.
It seems remarkable, on the face of it, that so well documented a phenomenon as the Cambrian explosion can be open to such divergent interpretations. One reason why this might be the case is that the basic “ground rules” are different in each sort of view. In particular, there is an ancient tradition of viewing the animal phyla as being distinct entities that cannot be easily compared to each other. Thus, it has become quite commonplace to argue that their morphological origins are essentially independent from each other. Whilst such a view has been articulated in various ways by many people, one classical image is the striking one of Bergström 1989 (Fig. 1; [14]) that shows a series of phyla emerging from a small, slug-like organism (a “procoelomate”) during a “formative interval” (cf. Valentine’s “roundish flatworm” [15] and the “small, thin” ancestors of Sperling and Stockey [4]). A more extreme version of such ideas is that the earliest bilaterians closely resembled the planktonic larvae of the extant clades (e.g. [16], drawing on the tradition that can be ultimately traced to Haeckel [17]). Buttressed by various geological and developmental arguments, this hypothesis posits that such tiny, and by their very nature hard-to-preserve, animals persisted up many, if not all, of the stem groups that lead to the modern-day crown-group phyla. Hence, the appearance of undoubted bilaterian fossils represents in each case the transition from the unfossilisable proto-phylum member to a full-blown fossilisable taxon. Such a transition could be triggered either by an environmental stimulus (typically rising oxygen levels) or presumably by some necessary level of ecological complexity being reached.
Figure 1: A classical image of the parallel emergence of the phyla around the time of the early Cambrian from relatively simple (and thus unfossilisable) ancestors. Reprinted with permission from Bergström (1989) [14].
It is surprisingly difficult to unpack the entire set of assumptions and traditions that lie behind the first of these views, but some features stand out and can be critically examined. The first is the assumption that for all or at least many bilaterian clades, the ancestral state was a small body size which would be difficult to record in the fossil record [18]. Small organisms often lack key features such as complex musculature, body cavities and appendages, and it seems that the critical body length for these purposes is around 1 mm (see discussion in [11]). Many living protostomes are indeed tiny, and some phylogenetic reconstructions have suggested this is ancestral for the clade as a whole [19]. Without a reliable and fully-resolved protostome phylogeny, such a view is difficult to assess. Nevertheless, in recent years some relatively large Cambrian members of clades that today consist of meiofaunal organisms have been discovered. Of course, simply finding a single large member of a clade is not equivalent to demonstrating that large body size is the ancestral state, but it does (considerably) weaken the assumption that small body size must be ancestral, especially given evidence that old fossils preserve more plesiomorphic character states than recent taxa do (having had less time to shed them; [20]). Another point of dispute has been the placement of the “Xenacoelomorpha”, a clade that consists of Xenoturbella and acoel and nemertodermatid flatworms; different analyses have placed them either as the sister group to all other bilaterians [21][22], or as sister group to the Ambulacraria (echinoderms and hemichordates) [23]. It should be stressed, however, that even if such a clade is the sister group to all other bilaterians, it does not immediately follow that its simple morphology must represent the ancestral condition for bilaterians as a whole (as in [21]). Determining this would require detailed character reconstruction at the base of the rest of the bilaterians, which is currently a matter of considerable dispute, and reference to the cnidarian outgroup. In order to present crisper hypotheses, we shall briefly examine the current evidence for various clades:
Annelids, phoronids, sipunculans, brachiopods, molluscs
This, the classical “Lophotrochozoa” grouping, may also include entoprocts, bryozoans and nemerteans (for a recent discussion of all of the lophotrochozoans in their broadest sense, see [24]). Most of the included clades have tiny members (e.g. the so-called ‘archiannelids’ [25], Gwynia capsula [26] etc.) but a secondary derivation of these seems most likely (see e.g. [27][28] for annelids). In general, the lively debate around the origins of this group includes the distinct possibility that fossilisable morphological features such as sclerites and setae are plesiomorphic within them [10][29][30][31].
Other spiralians
Other protostomes (excluding the ecdysozoans – see below) include the Gnathifera (gnathostomulids, rotifers, micrognathozoans, and, it seems, chaetognaths [32]), and the “Rouphozoa” [33] consisting of platyhelminths and gastrotrichs; these two clades together may be paraphyletic and give rise to the lophotrochozoans (see e.g. the summary of recent progress in [34]). Such a reconstruction has been considered to suggest a primitive small body size for the protostomes as a whole [19][33]. In particular, it has been argued [33] that the coelomate condition in the spiralians is confined to the lophotrochozoans (i.e. annelids, brachiopods, molluscs etc.) and arose from within a clade of acoelomate, flatworm-like animals. Once again though, some recent molecular and fossil discoveries have suggested the situation is not so simple. The inclusion of the relatively large, and potentially fossilisable and coelomate, chaetognaths within or as sister group of the Gnathifera throws into question the primitive state in this clade ([35]; cf. [32][36]); and even in the Rouphozoa, recent suggestions of large (> 10 cm) gastrotrichs in the Cambrian [37] must bring the primitive character state in this clade under question as well.
Ecdysozoans
The last common ancestor of onychophorans and euarthropods (at least) was a large organism [38][39]. The placement of tardigrades remains uncertain [40][41], but once again a single clade of tiny organisms cannot be automatically assumed to reveal ancestral states. The plesiomorphic state in the cycloneuralian worms, whether or not they are monophyletic, remains uncertain. Of the living clades, it is reasonable to assume that the priapulans are primitively large, and that the loriciferans and kinorhynchs are primitively small. However, large Cambrian representatives of both the latter groups have recently been described ([42][43], although the status of the kinorhynch remains to be confirmed), with the suggestion that the ancestral scalidophorans were also of large size. Even with the assumption of primitive small body size in the nematoideans (nematodes plus nematomorphs), then, a large ancestral body size for the Ecdysozoa as a whole is, on the basis of present evidence, probably most likely. In addition, the presence of sclerotized cuticle within the entire clade makes its appearance in the small carbonaceous fossil record entirely plausible.
Deuterostomes
Despite the recent description of a putative meiofaunal deuterostome from the early Cambrian [44], and the possible presence of the Xenacoelomorpha in the clade [23], an ancestral large body size in at least echinoderms, hemichordates and chordates seems most reasonable. Thus, overall an ancestrally large body size seems most likely for the deuterostomes.
It should be clear from the above that an ancestral small body size for large clades of bilaterians cannot straightforwardly be inferred from present-day body sizes, and such an inference becomes even less secure once the fossil record is considered. For example, such a view seems to be directly contradicted by the reconstruction of the stem group in the arthropods, where we know that, far from being tiny, many stem-group arthropods were in fact very large [45][46][47].
Body size is by no means the only marker for preservability, and perhaps not even a very good one. However, the trace fossil record is one that is tied to body size, and it is indeed difficult to imagine a high abundance of relatively large bilaterians existing without any sort of perceivable trace fossil record. Even if (as has been argued) ancestral deuterostomes, for example, were sessile or semi-sessile [48], it does not follow that no trace of them would be left in the record. Furthermore, even if an ancestral small organism (or one that could not leave trace fossils) was inferred, it would be required to stay in this state without any ecological innovation for many millions of years. Trace fossils from the terminal Ediacaran are not always particularly small [49], and – if we accept that they represent stem-group, or even early crown-group bilaterians – it follows that some of these were of large size too.
Even if we accept that ancestral bilaterians were small, the implication for the fossil record would be that as the stem-group members of many phyla crossed into the Cambrian, they would each simultaneously have to develop into large animals, in a highly unparsimonious way. Finally, it should be noted that at least putative trace fossils of meiofaunal organisms have been reported from Brazil [50], suggesting (if correctly assigned) that even tiny organisms might leave traces.
The problem of suggesting multiple attainments of large and fossilisable body size as a solution to the problem of the lack of Ediacaran crown-group bilaterians is partly concealed by the low sampling of present-day diversity in summary cladograms, because phyla are often (and perforce) represented by a single lineage (see e.g. [5]). However, if the crown group of any particular clade of large animals is thought to have emerged before the Cambrian, then all crown-group lineages of that phylum crossing the boundary would themselves have to independently develop large (and complex) organisation. This itself is striking: why would it be the case that, of all the implied crown-group bilaterian diversity in the Precambrian, no lineage ever hit upon a way of attaining large body size? Such arguments have recently gained new potency because of the growing and widespread recognition that the large Ediacaran organisms indeed lie within stem groups to various animal groups (cnidarians, ctenophores, bilaterians, etc. [51][52][53][54]). If, then, the presence of large stem-group animals (complete with their at least putative trace fossils, e.g. [55][56]) is acknowledged during the Ediacaran, why are there no crown-group bilaterians? We believe that the combination of these factors makes the presence of bilaterians deep in the Ediacaran or even earlier very unlikely indeed.
Such a view receives substantial support from birth-death modelling [57][58], which suggests that under most conditions, crown groups emerge rapidly and swamp their stem groups in short order. In such a view, if the crown-group bilaterians emerged during the early Ediacaran (cf. [2]), then modelling would suggest that by the time of the opening of the Cambrian, there should be many thousands of species, diversified into many different crown-group lineages. Yet, somehow, not a single body fossil of an indisputable crown-group bilaterian has ever been found before the Cambrian. Just as strikingly, when animals do appear in the fossil record in the Cambrian, the fossil record develops in a way that makes it look as if a real diversification is going on (see e.g. [59][60][61][62][63]; a pattern that is consonant across the small skeletal, carbonaceous and trace fossil records [63]). Such a pattern has also been noted in the unfolding of the early angiosperm fossil record [64], despite molecular clock evidence for a deeper origin [65][66]. No satisfactory explanation of such patterns has been offered. In the rest of this contribution, then, we take as our view that crown-group bilaterians emerged late (probably just before the beginning of the Cambrian). We wish to examine two features of this emergence, both through birth-death modelling: i) why do body plan features appear to emerge so rapidly in the Cambrian, and ii) why might molecular clocks be overestimating the timing of the origins of clades?
Birth-death models and their biases
Patterns of diversification have long been studied through birth-death models, with important early papers being [67] and [68], and with very significant early contributions by Raup, whose 1983 paper [69] on the early origins of major groups is an underappreciated classic (for an update of this work, and a general discussion of the fossil record and birth-death models, see [58]; recent important developments in this general area include e.g. [70][71]). Simple homogeneous birth-death models assign constant rates of extinction and speciation to a group; an end member is the so-called Yule process that has only speciation and no extinction [72]. However, despite the presence of these constants in the models, a wide range of stochastic outcomes can emerge from them (compare: even if one has a constant 1 in 6 chance of rolling a 6 with an unbiased die, the number of sixes actually rolled in a trial set of 20 rolls will vary greatly from one trial to the next). In particular, if there is any sort of significant extinction rate, then clades are logically unlikely to survive for long. Clades that do survive to the present from a distant time in the past (as Cambrian clades would represent) thus represent a small subset of all possible clades. Such survival can be compared to a classical “gambler’s ruin” process (cf. page 45 ff. of ref. [73]), with the critical difference being that in an exponential process (as opposed to the simple random walk of Raup), it is possible (rarely) to escape from the absorbing boundary that extinction represents (Fig. 2).
Figure 2: Two “gambler’s ruin” simulations with an exponential birth-death process. A: starting with 1 species; B: starting with 10 species. Red lines are successful clades that manage to escape from the absorbing boundary of extinction; black ones, clades that go extinct. In the case of initially small clades (A), most rapidly go extinct without reaching a substantial size. Initially larger clades (B) more often survive; note that those that go extinct tend to behave like the survivors until they rather rapidly collapse to extinction.
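The contrast between starting with 1 and with 10 species (Fig. 2) can be explored numerically. The following is a minimal sketch only, not the simulation used for the figure: the rates (0.5 speciations and 0.45 extinctions per species per Myr), the 100 Myr duration, the cap of 100 species and all function names are illustrative assumptions. It estimates the fraction of clades that escape extinction and compares it with the standard analytic survival probability of a linear birth-death process.

```python
import math
import random

def simulate_clade(birth, death, n0, t_max, rng, cap=100):
    """Event-by-event simulation of a homogeneous birth-death process.

    Returns the number of species when the run stops (0 = extinct).
    The run stops at t_max, at extinction, or when `cap` species are
    reached (an assumed shortcut: such a clade has effectively escaped
    the absorbing boundary at zero).
    """
    n, t = n0, 0.0
    while 0 < n < cap:
        # Waiting time to the next event; total event rate is n*(birth+death)
        t += rng.expovariate(n * (birth + death))
        if t > t_max:
            break
        # The event is a speciation with probability birth/(birth+death)
        if rng.random() < birth / (birth + death):
            n += 1
        else:
            n -= 1
    return n

def survival_fraction(birth, death, n0, t_max, trials, seed=1):
    """Fraction of simulated clades that are not extinct when the run stops."""
    rng = random.Random(seed)
    alive = sum(simulate_clade(birth, death, n0, t_max, rng) > 0
                for _ in range(trials))
    return alive / trials

def analytic_survival(birth, death, n0, t_max):
    """Survival probability to t_max for n0 independent founders (birth != death)."""
    r = birth - death
    decay = math.exp(-r * t_max)
    p_extinct_one = death * (1 - decay) / (birth - death * decay)
    return 1 - p_extinct_one ** n0

if __name__ == "__main__":
    for n0 in (1, 10):  # illustrative parameter values, not taken from the paper
        sim = survival_fraction(0.5, 0.45, n0, 100.0, trials=1000)
        exact = analytic_survival(0.5, 0.45, n0, 100.0)
        print(f"start with {n0:2d} species: simulated {sim:.3f}, analytic {exact:.3f}")
```

With extinction set close to speciation, as assumed here, most single-species clades are lost, whereas clades that start larger usually survive, which is the qualitative behaviour described for the two panels.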
Such survivors thus represent a very biased subset of all possible outcomes of the birth-death process, and the most important component of their peculiar characteristics was identified by Nee et al. (1994) [67] as the “push of the past”. Essentially, the push of the past is the higher-than-normal rate of diversification that, by chance, is present in the early species of a successful clade. As discussed by Budd and Mann (2018) [58], this early burst of diversification can be seen by considering the observed rate in species as a whole (i.e. including both extinct species and lineages leading to modern groups).
The push of the past has been almost entirely ignored because most phylogenetic work involving birth-death models has involved the modelling of the so-called reconstructed evolutionary process. This essentially involves the modelling of the “lineages” that give rise to modern-day diversity, and which can be reconstructed backwards in time through the coalescence process (e.g. [74]). Such lineages are, in short, those represented by the terminal and internal branches of a phylogeny. Rather surprisingly these lineages, although speciating twice as fast as the background rate [74][58], show little variation of rate until the present is nearly reached, and thus show no “push of the past” effect, at least as defined by [67][58], even though early diversification rate changes along lineages are sometimes referred to as such (e.g. [75]). Early high rates of lineage creation, then, must rely on a different feature. Various explanations have been posited for observed instances of such early bursts of lineage diversification [76][77][78][79], and the general pattern has been referred to as the “large clade effect” (LCE [58]). Whatever their underlying causes, early bursts can again be seen as features of the general stochasticity of birth-death processes ([58]; cf. [80]). Nevertheless, this is not a survivorship bias, because all clades that have lineages must tautologically have survived to the present day. Survivorship could occur by only one lineage of a particular clade living in the present day, or by many thousands, even with the same diversification parameters. If one examines only the large examples out of these many possibilities, it will be seen that they will generally have created a larger-than-expected number of lineages early on in their history, giving a characteristic “bulge” in the lineage-through-time (LTT) plot of the clade (Fig. 3; cf. fig. 7 of Budd and Mann 2018). This, then, is a selection bias, not a survivorship bias, of the process (cf. [81], who discusses the perils of treating large clades as representative of the evolutionary process as a whole).
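The selection bias described above can be illustrated with a small simulation. This is a hedged sketch rather than the authors' procedure: it assumes a pure-birth (Yule) process started from a single lineage, borrows the birth rate of 0.053 per Myr and the 100 Myr duration from the caption of Fig. 3, and uses hypothetical function names and an arbitrary threshold (more than twice the mean size) for what counts as a "large" clade. All trees share the same rate; the only difference between the groups is which trees are selected afterwards.

```python
import random

def yule_branch_times(birth, t_max, rng):
    """Times at which new lineages arise in a pure-birth process that
    starts from one lineage at time 0; entry i is when lineage i+2 appears."""
    times, n, t = [], 1, 0.0
    while True:
        t += rng.expovariate(n * birth)  # total rate is n * birth
        if t > t_max:
            return times
        times.append(t)
        n += 1

def mean_time_to_reach(trees, k):
    """Mean time at which the k-th lineage appears, over trees that reach k."""
    reached = [tr[k - 2] for tr in trees if len(tr) >= k - 1]
    return sum(reached) / len(reached)

def split_by_size(trees):
    """Split trees into 'typical' (half to twice the mean final size)
    and 'large' (more than twice the mean final size)."""
    sizes = [len(tr) + 1 for tr in trees]
    mean_size = sum(sizes) / len(sizes)
    typical = [tr for tr, s in zip(trees, sizes)
               if mean_size / 2 <= s <= 2 * mean_size]
    large = [tr for tr, s in zip(trees, sizes) if s > 2 * mean_size]
    return mean_size, typical, large

rng = random.Random(0)
trees = [yule_branch_times(0.053, 100.0, rng) for _ in range(2000)]
mean_size, typical, large = split_by_size(trees)
print("mean clade size:", round(mean_size))
print("Myr to reach 10 lineages, typical clades:",
      round(mean_time_to_reach(typical, 10), 1))
print("Myr to reach 10 lineages, large clades:",
      round(mean_time_to_reach(large, 10), 1))
```

Under these assumptions the clades picked out for being large reach their first ten lineages sooner than typical ones, which is the early "bulge" in the LTT plot arising from selection alone.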
Figure 3: The exponential probability distribution of Yule-process generated crown group sizes for crown groups of age 100 Ma, mean size of 200 and birth rate of 0.0530. The distribution is broadly divided into small (left), medium (middle) and large (right) clades. The middle section runs from clades that are half the mean size to clades that are twice the mean size, and contains half of all the trees. Representative lineage-through-time (LTT) plots are provided for each sector (compare Fig. 6) with retrospective time in Myrs on the x axis and the number of lineages on the y axis.
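The numbers quoted in the caption of Fig. 3 can be checked with a short calculation. This is a sketch under one stated assumption, namely that the Yule process starts from a single lineage, so that the clade size N(T) after time T is geometrically distributed with mean e^{λT}; the symbols N, T, λ and N̄ (the mean size) are introduced here for illustration only. Reading the middle sector as running from half to twice the mean size, it should then hold close to half of all trees.

```latex
\begin{align}
  \mathbb{E}[N(T)] &= e^{\lambda T}
    \quad\Longrightarrow\quad
    \lambda = \frac{\ln 200}{100\ \text{Myr}} \approx 0.0530\ \text{Myr}^{-1}, \\
  P\bigl(N(T) > n\bigr) &= \bigl(1 - e^{-\lambda T}\bigr)^{n}
    \approx e^{-n/\bar{N}},
    \qquad \bar{N} = e^{\lambda T} = 200, \\
  P\bigl(\tfrac{1}{2}\bar{N} < N(T) \le 2\bar{N}\bigr)
    &\approx e^{-1/2} - e^{-2} \approx 0.607 - 0.135 \approx 0.47 .
\end{align}
```

The first line recovers the stated birth rate from the stated age and mean size, and the last line gives roughly 47% of trees in the middle sector, consistent with the caption.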
A few papers have explicitly examined these early diversification bursts considered as features of unusually large clades (e.g. [58][80][82]). However, they are normally subsumed under a different sort of category, i.e. “diversity-dependent” diversification (DDD). Such models make the assumption that as a characteristic carrying capacity of a particular clade is reached, the rate of diversification will asymptotically slow down, until a steady state is reached. Although we are rather skeptical about the biological reality of the ecological underpinnings of such models, we note that in any case the typical patterns of the LCE and DDD closely resemble each other, and it is not clear that the two can be easily distinguished ([58][57]). Two sorts of bias can thus be distinguished in birth-death modelling of the evolutionary process: the push of the past survivorship bias of the process as a whole, and the large clade effect selection bias that further emerges by only examining unusually large living clades. We now wish to explore some of the features that these biases introduce into patterns of evolution, in particular the rapid formation of body plans and the concept of “key innovations”; and how they might affect molecular clocks.
Body plans and key innovations
In order to introduce the rates of morphological (and, indeed, molecular) evolution into discussions about trees based on birth-death models, it is probably necessary to make a further assumption, i.e. that there is some sort of relationship between rates of speciation and rates of molecular and/or morphological change. Such a relationship seems intuitive, because eventually all species must be distinguished by certain morphological and molecular synapomorphies, but the empirical evidence for this has been mixed (see e.g. [83]). One potential reason for this is that molecular phylogenies are perforce based on the reconstructed evolutionary process, which cannot include extinctions within it (lineages, by definition, cannot include extinction). Any examination of rates of molecular or phenotypic change in such trees can thus only examine the relationship between such rates and the rate of lineage production; and the latter will bear little resemblance to the rate of true speciation that includes all extinct taxa too. Only if extinction rates are very low will there be a reasonably close relationship between the two, because then any speciation event in the past will have a good chance of giving rise to a lineage.
Early bursts of molecular and morphological evolution: the case of the arthropods
We explore some of the problems discussed above by examining the important study of rates of evolution of arthropods by Lee et al. 2013 [84]. They showed that if the time of origin of the arthropods was fixed by the fossil record at around 555 Ma, then arthropods evolved very rapidly during the Cambrian, with initially high rates of both molecular and morphological change across the early lineages. Arthropods, above all other clades, are likely to be a “large clade” in the sense of [58], given that they are much larger than any other clade of a comparable age that does not include them. This view is supported by the LTT plot that can be reconstructed from the data of Lee et al. 2013 [84] (Fig. 4), showing a large early bulge. In addition, because of the inevitably very low sampling rate (c. 50 taxa across the probable millions of extant species), the LTT shows a rapid flattening out as the present is approached (i.e. given true present-day diversity, one would expect many more lineages actually to have been created as the present day is approached).
A notable feature of the data provided by Lee et al. (2013) [84] is that when one compares the rate of lineage creation to the rates of molecular/morphological evolution, they counter-balance each other, so that the mean number of molecular (or morphological) changes per lineage creation remains constant through time. Thus, the curve showing the rate of decline of lineage creation (Fig. 4B) is very similar to their curves showing the rate of decline of morphological/molecular change (their fig. 3). Furthermore, they show that the size of the initial burst of evolution is dependent on where the root of the tree is placed; if deeper, the initial rates are lower. These patterns imply that lineage creation rate (as opposed to speciation rate) and evolution rates are not independent (cf. [85][86]). Why might the creation of a lineage be accompanied by a fixed amount of molecular and morphological change?
Figure 4: A: The “lineage-through-time” (LTT) plot reconstructed from the data of Lee et al. 2013 [84]. Note, owing to (inevitable) undersampling, that the LTT plot flattens out towards the recent, but that it starts with a pronounced bulge characteristic of large clades. B: The reconstructed rate of lineage production through time. Each point is the inverse of the total branch length between successive lineage creations; a 50 Myr smoothed average line is also included. Compare this with the slow-down in rates of molecular and morphological change in Lee et al. 2013, fig. 3.
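The rate measure used in Fig. 4B (the inverse of the total branch length between successive lineage creations) can be sketched as follows. This is one possible reading of the caption, not the authors' code: it assumes that "total branch length" means the interval between two successive nodes multiplied by the number of lineages present during that interval, and the function names and node ages in the example are hypothetical.

```python
def lineage_creation_rates(node_ages):
    """Per-lineage rate of lineage creation between successive nodes.

    node_ages: ages (Ma) of the internal nodes of a reconstructed
    binary tree, root included. Returns (interval midpoint, rate) pairs,
    where rate = 1 / (number of lineages * interval length).
    """
    ages = sorted(node_ages, reverse=True)  # oldest (root) first
    points = []
    for i in range(len(ages) - 1):
        n_lineages = i + 2                  # lineages present after split i
        interval = ages[i] - ages[i + 1]
        if interval <= 0:
            continue                        # simultaneous splits: skip
        branch_length = n_lineages * interval   # summed over all lineages
        midpoint = (ages[i] + ages[i + 1]) / 2
        points.append((midpoint, 1.0 / branch_length))
    return points

def smoothed(points, centre, window=50.0):
    """Mean rate of the points whose midpoints fall in a window (Myr)
    centred on `centre`; None if the window is empty."""
    inside = [rate for age, rate in points if abs(age - centre) <= window / 2]
    return sum(inside) / len(inside) if inside else None

# Hypothetical node ages (Ma), chosen only to show an early burst
example_ages = [500.0, 490.0, 480.0, 400.0]
points = lineage_creation_rates(example_ages)
for age, rate in points:
    print(f"{age:6.1f} Ma: {rate:.4f} new lineages per lineage per Myr")
```

In this made-up example the first two intervals are short and give high rates, while the long third interval gives a much lower one, the same kind of decline that the smoothed line in the figure summarises.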
If one makes the assumption that rates of morphological/molecular change correlate in some sense with rates of speciation (see above and discussion in [58]), it follows that the large clade effect would not predict such a slowing down of rates along the lineages, which only have a slightly elevated initial speciation rate in large clades compared to normal ones [58]. Several other possibilities present themselves, however. The first is that the root of the arthropods is really much deeper than the fossil record suggests (but see below), and that artificially compressing their early evolution has the side effect of increasing early rates of evolution too. The second is that there really is a large clade effect (i.e. early arthropod lineages do appear much faster than expected), and that the algorithms used for distributing character change across the tree do not take this into account (and thus distribute character change to lineages as if they all had the same mean length). Finally, there is the possibility that there is an as yet poorly-understood lower-level process (involving ecology, competition etc.) that really does connect rate of lineage creation with that of underlying evolutionary change. Although we have not managed to penetrate this question adequately, we suspect that the second of these possibilities is likely to be correct, and that, along with other artefacts that we demonstrate below, standard treatment of large clades will generate artefactually high rates of evolutionary change along their early branches. Despite the difficulties of understanding what happens along the lineages of a clade, it nevertheless remains likely that successful clades will show high rates of evolutionary change in their early history when averaged across both plesions and lineages, because of the overall push of the past effect [58], but concentrated into the early lineage(s). Such rapid early evolution will, however, be a necessary feature of the emergence of a successful clade, and not its cause (as it would be under the ‘key innovation’ concept; see discussion in e.g. [38]). One further selection bias may enter here. Phyla are commonly thought of as (often) large clades that are morphologically distinct from each other. Unusually large clades should have short stem groups [58][57], and all things being equal this should imply a relatively short period of time for evolutionary change to accumulate; yet the distinctness criterion for phyla suggests the opposite, since the accumulation of distinctiveness occurs by extinction in the stem group [59][87] and therefore selecting phyla through distinctiveness selects in favour of longer stem groups. Although the overall effect of such a bias is uncertain, it should probably be considered as an important factor when trying to understand the disparity differences between phyla when considered as distinct evolutionary units (e.g. [87]).
Molecular clocks and the large clade effect
Because almost all molecular data are from living organisms, it follows that they bear largely upon the reconstructed evolutionary process. A further consequence is thus that molecular data are in principle not affected by changes in rates of speciation, which are largely constant along the lineages, even in large clades (at least until the recent is approached [58]). However, we have previously suggested that molecular reconstruction might nevertheless be more subtly affected by the large clade effect [58]. Whilst this influence is potentially wide-ranging, here we focus on one part of it, which is the influence of the LCE on molecular clock estimates.
Modern-day molecular clocks [88] are typically (but not universally – e.g. [89]) constructed using approaches that simultaneously estimate a phylogenetic reconstruction, branching model and rate of substitution along the lineages, often employing so-called “relaxed clock” methods (e.g. [90][91]). It is rather noticeable that in many cases, molecular clocks considerably over-estimate times of clade origins when measured against the fossil record, and the origin of the bilaterians has been one of several classical instances of this pattern (for the case of angiosperms, see e.g. [64] and the particularly trenchant discussion by [92]). Often such mismatches arise through rather imprecise or inadequate calibration of clocks [93][94], and indeed better attention to calibration has certainly narrowed the fossil-molecular clock gap in many instances (see e.g. [95]). Even so, despite rather close interrogation, molecular clock estimates of bilaterian origins, although exhibiting a considerable range of possible dates, still firmly place them deep in the Ediacaran or even earlier, even when the error bars on the estimates are taken into account [5][2].
347 Many possible reasons for problems with molecular clocks have been discussed, includ-
348 ing those concerning heterogeneous molecular evolution [96], date calibration [93][97] and
349 various issues of sampling and tree priors [98][99][100][101]. Without denying any of
350 these issues, many of which can and have been accounted for, we wish here to take
a somewhat broader perspective of the relationship between tree model and molecular
clock results. Given our reasoning above, we are led to ask the question: “given an
apparently reliable fossil record of animals, why might molecular clocks be systematically
inaccurate here?” We appreciate that this is the opposite of the question normally asked,
which is, rather, how the fossil record might be misleading. Here, then,
we wish to focus on one topic in particular: the role of unusual clade size in molecular
clock bias.
Lineage-through-time plots and times of origins: general considerations
360 By reconstructing a tree, and with some dating constraint on it (by estimating some
361 combination of the rate of molecular change, ages of tips or of nodes), the lineage-through-
362 time (LTT) plot should be expected to yield an estimate of the base of the clade (Fig.
363 5). For normally-sized clades, the LTT plot when not close to the present is a straight
364 line on a semi-log plot, with slope λ − µ (where λ is the unit rate of speciation and µ the
365 unit rate of extinction). However, in an unusually large clade, the early part of the LTT
366 shows a much higher slope than that of the central part (which remains of slope λ − µ),
367 implying the lineages accumulate faster than one would expect. Extrapolation from the
central part of the plot might thus be expected to yield an erroneously deeper origin of
the clade (Fig. 5). In the case of the Yule process, this is a straightforward calculation:
the expected time to produce a clade of N species is approximately ln(N)/λ and to produce
a clade of αN species approximately ln(αN)/λ; the expected time difference is thus ln(α)/λ.
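The Yule calculation above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are illustrative assumptions, and the expected time to reach N species is approximated by ln(N)/λ as in the text. It uses the values quoted for Fig. 5 (a rate of 0.0407 per Myr and 200 species after 100 Myr), for which the naive extrapolation lands near 130 Myr.

```python
import math

def yule_time_to_size(n_species, rate):
    """Approximate expected time for a Yule process to grow from 1 to n_species lineages."""
    return math.log(n_species) / rate

def extrapolation_overshoot(alpha, rate):
    """Extra age implied when a clade is alpha times larger than expected: ln(alpha)/rate."""
    return math.log(alpha) / rate

rate = 0.0407        # diversification rate per Myr (value quoted for Fig. 5)
true_age = 100.0     # true age of the clade in Myr
observed = 200       # species actually present today

expected = math.exp(rate * true_age)   # size expected for a normal clade, c. 60
alpha = observed / expected            # how many times too large the clade is
naive_age = true_age + extrapolation_overshoot(alpha, rate)

print(f"expected size: {expected:.1f}")
print(f"relative size: {alpha:.2f}")
print(f"naive origin:  {naive_age:.1f} Myr")
```

Because ln(α)/λ grows only logarithmically with α, even a clade several times too large shifts the naive origin by a bounded amount, but that amount scales inversely with the diversification rate.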
Figure 5: The possibilities of poor inference from unusually large clades. The solid line represents the true LTT plot for an unusually large clade that would normally have been expected to produce c. 60 species in the present after diversifying for 100 Ma (diversification rate 0.0407), but in fact produced 200. Extrapolating the asymptotic slope of the curve at the recent backwards, assuming the clade was of normal size (dashed line ending in blue dot), would imply an origin at 130 Ma. For simplicity, we have chosen a Yule process, but a similar principle would hold for a birth-death model too. The dotted line might be inferred if the root was known, but not the diversification rate.

Bayesian phylogenetic inference packages such as BEAST and BEAST2 can employ a
birth-death (or Yule) model for their clade inference (it has recently been shown, however,
that initial model selection has little effect on the results obtained from BEAST2 at
least [102]). Such inference discovers the most (a posteriori) likely trees for any given set
of input data. A priori, this will favour trees with an early straight-line LTT, because this
is the central outcome of the birth-death (or Yule) process. If, however, the true process led
to an unusually large (or small) clade, then the molecular data may be in tension with this
central expectation. In other words, the shape of the tree, and the rate of accumulation
of change upon it, will be telling different stories. In such a scenario (we predict), the
inferred time of origin of the clade will emerge as a compromise between the two, with
the exact balance being determined by the relative weight given to each and the relative
flexibility of each prior. Even if the time of origin of a clade and its number of species
in the present are both well known, the true extent of the LCE might not be inferred,
because the generating process can equally be reconstructed with a higher background
diversification rate.
387 BEAST2 recovery of simulated data
We have chosen to examine the influence of the LCE on molecular clocks by a simulation
approach. Using the TreeSim package in R [103] we first simulated eight sets of five trees
(Table 1), each with 200 terminal taxa. Each tree was fixed to a height of 100 Myrs (to
the most recent common ancestor). The diversification rate in each set was chosen so
that the actual 200 taxa in each of the sets of five trees represented 20, 10, 5, 2, 1, 0.5, 0.2
and 0.1 times the number that would be expected given the diversification parameters
chosen. For further simplicity, a Yule process (i.e. with no extinction) was chosen for the
tree generation. After generating these trees we further simulated a process of molecular
evolution along the branches from the most recent common ancestor to the terminal taxa.
For each tree we simulated 1000 nucleotide sites, with a homogeneous Poisson process
for nucleotide substitutions, utilising a substitution rate of 0.03 per site per million years
(i.e. 0.01 for each of the three possible substitutions at a site), using the phytools R package [104].
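The substitution step can be illustrated with a short sketch. This is not the phytools implementation: it assumes a Jukes-Cantor-like model in which each site changes at a total rate of 0.03 per Myr and every change picks one of the three other nucleotides with equal probability (0.01 each). The function names are hypothetical, and 20,000 sites are used instead of 1000 only so that the comparison with the analytical expectation is stable.

```python
import math
import random

NUCLEOTIDES = "ACGT"

def evolve_site(base, branch_length, rate, rng):
    """Evolve one site along a branch; substitution events form a homogeneous Poisson process."""
    t = rng.expovariate(rate)              # time of the first substitution event
    while t < branch_length:
        # Each event replaces the base with one of the three alternatives, chosen uniformly.
        base = rng.choice([b for b in NUCLEOTIDES if b != base])
        t += rng.expovariate(rate)         # waiting time to the next event
    return base

def evolve_sequence(seq, branch_length, rate, rng):
    """Evolve every site of a sequence independently along one branch."""
    return "".join(evolve_site(b, branch_length, rate, rng) for b in seq)

rng = random.Random(1)
rate = 0.03                                # substitutions per site per Myr
branch_length = 10.0                       # Myr
ancestor = "".join(rng.choice(NUCLEOTIDES) for _ in range(20000))
descendant = evolve_sequence(ancestor, branch_length, rate, rng)

observed_diff = sum(a != d for a, d in zip(ancestor, descendant)) / len(ancestor)
# Expected proportion of differing sites under this equal-rates model.
expected_diff = 0.75 * (1.0 - math.exp(-4.0 * rate * branch_length / 3.0))
print(f"observed {observed_diff:.3f}, expected {expected_diff:.3f}")
```

Applying `evolve_sequence` recursively from the root to each tip of a tree would give an alignment of the kind described above; the observed differences saturate towards 0.75 on long branches, which is one reason why deep divergences are harder to date.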
For the BEAST2 recovery of the simulated data, we fixed a strict clock rate at the
true value of 0.03, used a Yule process tree prior, and ran an MCMC chain of 10,000,000
samples, thinned to 10,000 trees for further analysis with Tracer [105].
Set nr.  Nr. of taxa  Tree height  Div. rate (d)  Expected nr. of taxa given d
1        200          100          0.0220         10 (clade 20× too big)
2        200          100          0.0294         20 (10×)
3        200          100          0.0366         40 (5×)
4        200          100          0.0459         100 (2×)
5        200          100          0.0530         200 (1×)
6        200          100          0.0599         400 (0.5×)
7        200          100          0.0691         1000 (0.2×)
8        200          100          0.0760         2000 (0.1×)
Table 1: The parameters of each set of five simulated Yule-process trees (giving forty simulations in total).

For illustrative purposes, we show an exemplar simulated clade of each set, together
with its LTT plot (Fig. 6; panel D shows a clade of the expected size for the given
diversification rate). Note that in the smaller-than-expected clades (Fig. 6 A-C), there is
a delayed lineage diversification, leading to a depressed early LTT; whereas in the
larger-than-expected clades (Fig. 6 E-H), the opposite pertains: more lineages emerge early
on, and there is an increasingly pronounced bulge in the early LTT plot.
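The contrast between depressed and bulging early LTT plots can be reproduced with a simple simulation. The sketch below is a simplified stand-in, not the TreeSim procedure used above: it grows unconditioned Yule trees from a single lineage, then sorts them after the fact into smaller-than-expected and larger-than-expected clades. The rate, time points, thresholds and function names are illustrative assumptions.

```python
import bisect
import math
import random

def yule_speciation_times(rate, t_max, rng):
    """Speciation times of a Yule process that starts with one lineage at time 0."""
    times, t, n = [], 0.0, 1
    while True:
        t += rng.expovariate(n * rate)     # waiting time to the next split among n lineages
        if t >= t_max:
            return times
        times.append(t)
        n += 1

def lineages_at(times, t):
    """Number of lineages alive at time t (the founder plus all splits up to t)."""
    return 1 + bisect.bisect_right(times, t)

def mean_early(selected, t_early):
    """Average lineage count at t_early over a collection of trees."""
    return sum(lineages_at(tr, t_early) for tr in selected) / len(selected)

rng = random.Random(42)
rate, t_max, t_early = 0.03, 100.0, 25.0
expected_final = math.exp(rate * t_max)    # mean final size, about 20 species

trees = [yule_speciation_times(rate, t_max, rng) for _ in range(5000)]
large = [tr for tr in trees if len(tr) + 1 >= 2.0 * expected_final]
small = [tr for tr in trees if len(tr) + 1 <= 0.5 * expected_final]

print("mean lineages at t = 25 (small, all, large):",
      mean_early(small, t_early), mean_early(trees, t_early), mean_early(large, t_early))
```

Clades that end up large tend to have accumulated lineages unusually fast early on, and clades that end up small unusually slowly, which is the pattern of the early LTT bulge and depression respectively.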
409 Results
We show the results of inference of tree height on all 40 simulated data sets in Fig. 7,
grouped by (true) relative clade size. For each set of five, we show the 95% Highest Posterior
Density (HPD) intervals and the mean of the five means. In addition, we plot the naïve
expected value of the tree height based on Fig. 5. All the outputs reached a satisfactory
value (i.e. >200) of effective sample size (ESS) for the tree height parameter. All the
other parameters for all runs had satisfactory ESS except for two of the twenty-times
too large clades. The most notable result of the plot is that in the large clade cases, the
tree height is consistently overestimated, and in the small clade cases, underestimated,
with a clear trend of effect between them. In each case, as predicted, the mean of the
means falls between the true value of 100 Myrs and the height calculated from a naïve
extrapolation of the recent diversification rate as shown in Fig. 5. We note two significant
sources of stochasticity: that of the individual HPD intervals, and variation between
the five simulations of each parameter set that results from the inherent stochasticity of
the Yule process. Despite these sources of variability, though, the overall trend from
larger-than-expected to smaller-than-expected clades is very clear.

Figure 6: Representative trees and LTT plots for clades that are (A) 0.1 (B) 0.2 (C) 0.5 (D) 1 (E) 2 (F) 5 (G) 10 and (H) 20 times too large for their diversification parameters (see Table 1). Note how the LTT plots move from concavity to convexity.
425 Discussion
As predicted, choosing to examine trees with known unusual features exerted pressure
on the reconstruction process (cf. the closing comments of [96]), and the clear outcome
was that larger-than-expected clades had their heights overestimated by BEAST2, with
those of smaller-than-expected clades being underestimated. Although the values we chose
for our simulations were illustrative, the scale factors are absolute, and the time scale is
arbitrary, in the sense that one time unit can represent any amount of real time. It is thus
possible to translate our results directly into a timescale that would be comparable to that
of the Cambrian explosion. For example, for a clade that was five times too big and had its
real origin at c. 540 Ma, our results suggest that an origin at c. 650 Ma would be inferred
(the extrapolated value from Fig. 5 would be c. 800 Ma, which represents an approximate
maximum expected molecular clock overshoot).
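The rescaling argument can be made concrete with a few lines of arithmetic. This is an illustrative sketch only: it assumes that the Fig. 5 extrapolation for the five-times-too-large clade is 100 + ln(5)/d with the Table 1 rate d = 0.0366, and it back-calculates the simulated tree height implied by the c. 650 Ma figure quoted above instead of reading it from the BEAST2 output.

```python
import math

def rescale_age(age, from_true_age, to_true_age):
    """Map an age between two timescales that differ only in the length of a time unit."""
    return age * to_true_age / from_true_age

sim_true_age = 100.0     # true tree height in the simulations (arbitrary time units)
real_true_age = 540.0    # assumed true origin on the Cambrian timescale (Ma)
rate_5x = 0.0366         # Table 1 rate for the clade that is 5 times too big

# Naive extrapolation in simulation units, then rescaled to the real timescale.
naive_sim_age = sim_true_age + math.log(5.0) / rate_5x
naive_real_age = rescale_age(naive_sim_age, sim_true_age, real_true_age)

# Simulated tree height implied by the inferred c. 650 Ma origin quoted in the text.
implied_sim_age = rescale_age(650.0, real_true_age, sim_true_age)

print(f"naive extrapolation: {naive_real_age:.0f} Ma")
print(f"implied simulated height for 650 Ma: {implied_sim_age:.1f}")
```

Under these assumptions the naive extrapolation comes out a little under 780 Ma, in line with the rounded c. 800 Ma upper bound, and an inferred origin of c. 650 Ma corresponds to a simulated tree height of roughly 120 time units.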
For any given birth-death or Yule diversification process, the sizes of surviving clades
follow an exponential distribution (Fig. 3). Approximately half of all these
trees lie between 0.5 and 2 times the expected clade size. Larger or smaller clades rapidly
become less likely: for example, the 20× example is, in our model, exceptionally unlikely.
However, other clades are much more likely: approx. 14% are at least 2 times too large;
and 5% are at least 3 times too large. On the small side, almost 40% of clades are less
than half the mean size, and 28% are less than one third the mean size.
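These proportions follow directly from the exponential distribution. As a minimal check, assuming that relative clade size (size divided by the mean) is exponentially distributed with mean 1, the fractions quoted above can be recovered as follows; the function names are illustrative.

```python
import math

def fraction_at_least(k):
    """Fraction of surviving clades at least k times the mean size."""
    return math.exp(-k)

def fraction_below(k):
    """Fraction of surviving clades smaller than k times the mean size."""
    return 1.0 - math.exp(-k)

def fraction_between(lo, hi):
    """Fraction of surviving clades between lo and hi times the mean size."""
    return math.exp(-lo) - math.exp(-hi)

print(f"between 0.5x and 2x: {fraction_between(0.5, 2.0):.3f}")   # roughly half
print(f"at least 2x:         {fraction_at_least(2.0):.3f}")       # approx. 14%
print(f"at least 3x:         {fraction_at_least(3.0):.3f}")       # approx. 5%
print(f"below 1/2:           {fraction_below(0.5):.3f}")          # almost 40%
print(f"below 1/3:           {fraction_below(1.0 / 3.0):.3f}")    # approx. 28%
print(f"at least 20x:        {fraction_at_least(20.0):.1e}")      # exceptionally unlikely
```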
Figure 7: Inference errors of tree height in BEAST2. Eight groups of five inferences corresponding to the diversifications in Table 1 are shown as a plot of inferred tree height (i.e. crown group age) against relative clade size. The true tree height in each case is 100 Ma (dashed line). The red dots represent the mean of the means of the five runs in each set, and the blue dots represent the extrapolated inferred tree height from Fig. 5. Note that in each case (except the normal-sized clade, where all three are close), the red dots lie between the real value and the blue dot. The vertical black lines represent the 95% HPD intervals in each run.

The relative commonness of small clades means that the deviation from the true tree
height value is less than that for the equivalent large clade (in our examples, the 10 times
too small clade is approximately 10 Myrs underestimated, whereas the 10 times too large
clade is c. 50 Myrs overestimated). This might lead one to the expectation that molecular
clocks should consistently underestimate clade origin times, rather than overestimate them
as is commonly supposed. However, three important features suggest this would be unlikely.
The first is that the smaller effect of small clades is less likely to be noticeable; the second
is that because large clades have so many species in them, a randomly chosen species is
likely to fall within one (indeed, the expected size of the clade from which a randomly
chosen species is drawn is twice the average [58]). The third, and most important,
reason is that large and charismatic clades are consistently studied at the expense of
the smaller, less interesting ones. Whilst it is understandably sometimes stated that
these sorts of clades are representative of the diversification process as a whole (e.g. [84]),
our results show clearly that they are not. Whilst we have not carried out a survey of
empirical clade sizes versus molecular clock mismatches of their origins, it is notable that
many clades that exemplify this phenomenon are much larger than their sister groups.
For example: animals have millions of species, but choanoflagellates c. 125; birds have c.
10,000 and crocodylians c. 23; angiosperms c. 300,000 and gymnosperms c. 1,000. This
selection bias means that large, charismatic clades are precisely those that will show the
most striking pathologies in their molecular clock behaviours. Indeed, molecular clock
dating for conifers [65] and crocodylians does not exhibit the same sort of mismatch (for
crocodylians, the molecular clock might even underestimate [106] the fossil record, as
might be predicted).
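The second point in the paragraph above, that a randomly chosen species tends to sit in a clade about twice the average size, is a general property of size-biased sampling from an exponential distribution. The sketch below illustrates it by simulation; it treats clade sizes as continuous and exponentially distributed, which is an approximation, and the mean size of 50 and the function name are arbitrary assumptions.

```python
import random

def size_biased_mean(sizes):
    """Mean size of the clade containing a species picked uniformly at random.

    Each clade is picked with probability proportional to its size, so the
    result is sum(size^2) / sum(size).
    """
    return sum(s * s for s in sizes) / sum(sizes)

rng = random.Random(7)
mean_size = 50.0
sizes = [rng.expovariate(1.0 / mean_size) for _ in range(100000)]

plain_mean = sum(sizes) / len(sizes)       # average over clades
biased_mean = size_biased_mean(sizes)      # average over species
ratio = biased_mean / plain_mean
print(f"clade average {plain_mean:.1f}, species-weighted average {biased_mean:.1f}, ratio {ratio:.2f}")
```

The ratio comes out close to 2, so sampling by species (or by the attention a clade attracts) already over-represents large clades before any deliberate choice of charismatic groups is made.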
In the light of our results, the mismatch between a fossil-based estimate for crown
group bilaterians of perhaps c. 545 Ma and a cautious molecular clock estimate
of 630 Ma [2] could be explained by bilaterians being 3-5 times too large given their
diversification parameters. However, even this estimate is likely to be generous to
molecular clock estimates, because our simulations used a strict clock, whose true rate we
also supplied to our BEAST2 inferences. In reality, such BEAST2 analyses would not know
the true rate, nor would the clock be likely to be strict.
474 We note that, although the direction of this effect is clear from our results, its precise
475 value will depend on a set of features that we have not explored. For example, the degree
476 of influence of the clock prior versus the tree prior will be affected by the amount of useful
molecular data included in the analysis. In addition, most molecular clock analyses are
calibrated by tip or node dating (or some combination) rather than a direct estimate of
the clock rate, which will typically be relaxed. In general, a relaxed clock will considerably
increase the LCE overestimate of clade heights, because the more flexible and less
well-constrained molecular clock rate will do a poorer job of compensating for the rapid
appearance of early lineages. How far such overestimation is compensated for by accurate
fossil-based dating of early nodes remains to be investigated.
484 Large clades of more than 3 times the expected size represent some 5% of all surviving
485 clades in our simulations. However, the Yule process, or even homogeneous birth-death
486 process, may not necessarily reflect the true dynamics of clade diversification. For exam-
487 ple, if the popular diversity-dependent diversification model [77] is an intrinsic part of the
488 diversification process, then, rather than outliers of the diversification process (albeit not
489 particularly rare ones) such clades with their distinctive early LTT bulge would be normal
490 clades, and thus, we would expect these sorts of pathologies to be very widespread. In
other words, present day methods assume a central expectation of an exponential process
with a constant rate of lineage accumulation, but any source of over-dispersion relative to
this prior expectation (whether statistical or ecological/evolutionary in origin) is likely to
produce systematic errors of the sort we have outlined here.
495 Summary
Despite its imperfections, the fossil record of early animal evolution is highly unlikely to
be as grossly in error as many molecular clock estimates indicate. In particular, neither
small size nor rarity is at all likely to account for the purported non-appearance of crown
group bilaterians in the Ediacaran or earlier. In addition, the apparent orderly appearance
of taxa in the Cambrian is strongly suggestive of a real-time evolutionary event being
recorded. Here we have outlined a variety of survival and selection biases that can affect
our understanding of major evolutionary radiations such as the Cambrian explosion,
including those bearing on the disparity of the phyla. Most importantly, such biases include
clade size for charismatic taxa such as the bilaterians and indeed animals as a whole; our
analysis here suggests that without taking into account their unusual features, molecular
clock estimates will tend to be biased in a way that can potentially explain the mismatch
between fossil and molecular clock origins of large clades. In other words, the problems with
molecular clocks might not just revolve around adequate calibration [95] from the fossil
record, but also include oversight of fundamental aspects of the statistics of phylogenetic
inference that may be harder to account for [96].
511 Acknowledgements
512 We are very grateful to Tanja Stadler, Rachel Warnock and Tim Vaughan for discussions
513 about inference and BEAST/BEAST2, and in particular for their suggestions about how
514 to test large clade biases. This work was funded by the Swedish Research Council (VR:
515 grant 621-2011-4703 to G.E.B.).
516 References
517 [1] Douglas H Erwin, Marc Laflamme, Sarah M Tweedt, Erik A Sperling, Davide
518 Pisani, and Kevin J Peterson. The Cambrian conundrum: early divergence and
519 later ecological success in the early history of animals. Science, 334(6059):1091–
520 1097, 2011.
521 [2] Mario Dos Reis, Yuttapong Thawornwattana, Konstantinos Angelis, Maximilian J
522 Telford, Philip CJ Donoghue, and Ziheng Yang. Uncertainty in the timing of origin
523 of animals and the limits of precision in molecular timescales. Current Biology,
524 25(22):2939–2950, 2015.
[3] Erik A Sperling, Christina A Frieder, Akkur V Raman, Peter R Girguis, Lisa A
Levin, and Andrew H Knoll. Oxygen, ecology, and the Cambrian radiation of
animals. Proceedings of the National Academy of Sciences, 110(33):13446–13451,
2013.
529 [4] Erik A Sperling and Richard G Stockey. The temporal and environmental context of
530 early animal evolution: considering all the ingredients of an “Explosion”. Integrative
531 and comparative biology, 58(4):605–622, 2018.
532 [5] John A Cunningham, Alexander G Liu, Stefan Bengtson, and Philip CJ Donoghue.
533 The origin of animals: can molecular clocks and the fossil record be reconciled?
534 BioEssays, 39(1):1–12, 2017.
535 [6] David A Gold, Jonathan Grabenstatter, Alex De Mendoza, Ana Riesgo, Iñaki
536 Ruiz-Trillo, and Roger E Summons. Sterol and genomic analyses validate the
537 sponge biomarker hypothesis. Proceedings of the National Academy of Sciences,
538 113(10):2684–2689, 2016.
539 [7] David J Siveter, Mark Williams, and Dieter Waloszek. A phosphatocopid crus-
540 tacean with appendages from the Lower Cambrian. Science, 293(5529):479–481,
541 2001.
542 [8] Alan Cooper and Richard Fortey. Evolutionary explosions and the phylogenetic
543 fuse. Trends in Ecology & Evolution, 13(4):151–156, 1998.
544 [9] Graham E Budd. The Cambrian fossil record and the origin of the phyla. Integrative
545 and Comparative Biology, 43(1):157–165, 2003.
546 [10] Graham E Budd and Illiam SC Jackson. Ecological innovations in the Cambrian and
547 the origins of the crown group phyla. Phil. Trans. R. Soc. B, 371(1685):20150287,
548 2016.
549 [11] Graham E Budd and Sören Jensen. A critical reappraisal of the fossil record of the
550 bilaterian phyla. Biological Reviews, 75(2):253–295, 2000.
[12] Graham E Budd. At the origin of animals: the revolutionary Cambrian fossil record.
Current genomics, 14(6):344–354, 2013.
553 [13] MW Martin, DV Grazhdankin, SA Bowring, D Ad D Evans, MA Fedonkin, and
554 JL Kirschvink. Age of Neoproterozoic bilatarian [sic] body and trace fossils, White
555 Sea, Russia: Implications for metazoan evolution. Science, 288(5467):841–845,
556 2000.
557 [14] Jan Bergström. The origin of animal phyla and the new phylum Procoelomata.
558 Lethaia, 22(3):259–269, 1989.
559 [15] James W Valentine. Late Precambrian bilaterians: grades and clades. Proceedings
560 of the National Academy of Sciences, 91(15):6751–6757, 1994.
561 [16] Kevin J Peterson, R Andrew Cameron, and Eric H Davidson. Set-aside cells in max-
562 imal indirect development: evolutionary and developmental significance. BioEssays,
563 19(7):623–631, 1997.
564 [17] Ernst Haeckel. Die Gastraea-theorie, die phylogenetische classification des thierre-
565 ichs und die homologie der keimblatter. Jenaische Z f Naturwiss, 8:1–55, 1874.
566 [18] PJS Boaden. Meiofauna and the origins of the Metazoa. Zoological Journal of the
567 Linnean Society, 96(3):217–227, 1989.
568 [19] Christopher E Laumer, Nicolas Bekkouche, Alexandra Kerbl, Freya Goetz, Ri-
569 cardo C Neves, Martin V Sørensen, Reinhardt M Kristensen, Andreas Hejnol,
570 Casey W Dunn, Gonzalo Giribet, et al. Spiralian phylogeny informs the evolu-
571 tion of microscopic lineages. Current Biology, 25(15):2000–2006, 2015.
572 [20] Jacques Gauthier, Arnold G Kluge, and Timothy Rowe. Amniote phylogeny and
573 the importance of fossils. Cladistics, 4(2):105–209, 1988.
[21] Johanna Taylor Cannon, Bruno Cossermelli Vellutini, Julian Smith, Fredrik Ronquist,
Ulf Jondelius, and Andreas Hejnol. Xenacoelomorpha is the sister group to
Nephrozoa. Nature, 530(7588):89, 2016.
577 [22] Greg W Rouse, Nerida G Wilson, Jose I Carvajal, and Robert C Vrijenhoek. New
578 deep-sea species of Xenoturbella and the position of Xenacoelomorpha. Nature,
579 530(7588):94, 2016.
580 [23] Hervé Philippe, Albert J Poustka, Marta Chiodin, Katharina J Hoff, Christophe
581 Dessimoz, Bartlomiej Tomiczek, Philipp H Schiffer, Steven Müller, Daryl Domman,
582 Matthias Horn, et al. Mitigating anticipated effects of systematic errors supports
583 sister-group relationship between Xenacoelomorpha and Ambulacraria. Current
584 Biology, 29(11):1818–1826, 2019.
585 [24] Christoph Bleidorn. Recent progress in reconstructing lophotrochozoan (spiralian)
586 phylogeny. Organisms Diversity & Evolution, pages 1–10, 2019.
587 [25] Colin O Hermans. The systematic position of the Archiannelida. Systematic Biol-
588 ogy, 18(1):85–102, 1969.
589 [26] Bertil Swedmark. Gwynia capsula (Jeffreys), an articulate brachiopod with brood
590 protection. Nature, 213(5081):1151, 1967.
591 [27] Torsten Hugo Struck, Anja Golombek, Anne Weigert, Franziska Anni Franke, Wil-
592 fried Westheide, Günter Purschke, Christoph Bleidorn, and Kenneth Michael Ha-
593 lanych. The evolution of annelids reveals two adaptive routes to the interstitial
594 realm. Current Biology, 25(15):1993–1999, 2015.
595 [28] Luke A Parry, Gregory D Edgecombe, Danny Eibye-Jacobsen, and Jakob Vinther.
596 The impact of fossil data on annelid phylogeny inferred from discrete mor-
597 phological characters. Proceedings of the Royal Society B: Biological Sciences,
598 283(1837):20161378, 2016.
[29] Joseph Moysiuk, Martin R Smith, and Jean-Bernard Caron. Hyoliths are Palaeozoic
lophophorates. Nature, 541(7637):394, 2017.
601 [30] Fangchen Zhao, Martin R Smith, Zongjun Yin, Han Zeng, Guoxiang Li, and
602 Maoyan Zhu. Orthrozanclus elongata n. sp. and the significance of sclerite-covered
603 taxa for early trochozoan evolution. Scientific reports, 7(1):16232, 2017.
604 [31] Jian Han, Simon Conway Morris, Jennifer F Hoyal Cuthill, and Degan Shu. Sclerite-
605 bearing annelids from the lower Cambrian of South China. Scientific reports,
606 9(1):4955, 2019.
607 [32] Ferdinand Marlétaz, Katja TCA Peijnenburg, Taichiro Goto, Noriyuki Satoh, and
608 Daniel S Rokhsar. A new spiralian phylogeny places the enigmatic arrow worms
609 among gnathiferans. Current Biology, 29(2):312–318, 2019.
610 [33] Torsten H Struck, Alexandra R Wey-Fabrizius, Anja Golombek, Lars Hering, Anne
611 Weigert, Christoph Bleidorn, Sabrina Klebow, Nataliia Iakovenko, Bernhard Haus-
612 dorf, Malte Petersen, et al. Platyzoan paraphyly based on phylogenomic data
613 supports a noncoelomate ancestry of Spiralia. Molecular biology and evolution,
614 31(7):1833–1849, 2014.
615 [34] Gonzalo Giribet and Gregory D Edgecombe. “Perspectives in Animal Phylogeny
616 and Evolution”: A decade later. In Perspectives on Evolutionary and Developmental
617 Biology, pages 167–178. University of Padova Press, 2019.
618 [35] Jean-Bernard Caron and Brittany Cheung. Amiskwia is a large Cambrian
619 gnathiferan with complex gnathostomulid-like jaws. Communications biology,
620 2(1):164, 2019.
621 [36] Jakob Vinther and Luke A Parry. Bilateral jaw elements in Amiskwia sagittiformis
622 bridge the morphological gap between gnathiferans and chaetognaths. Current
623 Biology, 29(5):881–888, 2019.
[37] Chen Ailin, Luke A. Parry, Fan Wei, Jakob Vinther, and Peiyun Cong. Giant stem
group gastrotrichs from the early Cambrian. Palaeontological Association Annual
Meeting Programme and Abstracts, 62:29, 2018.
627 [38] Graham E. Budd. Arthropod body-plan evolution in the Cambrian with an example
628 from anomalocaridid muscle. Lethaia, 31(3):197–210, 1998.
629 [39] Graham E Budd. Tardigrades as ‘stem-group arthropods’: the evidence from
630 the Cambrian fauna. Zoologischer Anzeiger-A Journal of Comparative Zoology,
631 240(3):265–279, 2001.
632 [40] Omar Rota-Stabelli, Ehsan Kayal, Dianne Gleeson, Jennifer Daub, Jeffrey L Boore,
633 Maximilian J Telford, Davide Pisani, Mark Blaxter, and Dennis V Lavrov. Ecdyso-
634 zoan mitogenomics: evidence for a common origin of the legged invertebrates, the
635 Panarthropoda. Genome biology and evolution, 2:425–440, 2010.
636 [41] Martin R Smith and Javier Ortega-Hernández. Hallucigenia’s onychophoran-like
637 claws and the case for Tactopoda. Nature, 514(7522):363, 2014.
638 [42] John S Peel. A corset-like fossil from the Cambrian Sirius Passet Lagerstatte of
639 North Greenland and its implications for cycloneuralian evolution. Journal of Pa-
640 leontology, 84(2):332–340, 2010.
641 [43] Dongjing Fu, Guanghui Tong, Tao Dai, Wei Liu, Yuning Yang, Yuan Zhang, Lin-
642 hao Cui, Luoyang Li, Hao Yun, Yu Wu, et al. The Qingjiang biota—A Burgess
643 Shale–type fossil Lagerstätte from the early Cambrian of South China. Science,
644 363(6433):1338–1342, 2019.
645 [44] Jian Han, Simon Conway Morris, Qiang Ou, Degan Shu, and Hai Huang. Meio-
646 faunal deuterostomes from the basal Cambrian of Shaanxi (China). Nature,
647 542(7640):228, 2017.
[45] Graham E Budd. Stem group arthropods from the Lower Cambrian Sirius Passet
fauna of north Greenland. In Arthropod relationships, pages 125–138. Springer,
1998.
651 [46] Allison C Daley, Graham E Budd, Jean-Bernard Caron, Gregory D Edgecombe, and
652 Desmond Collins. The Burgess Shale anomalocaridid Hurdia and its significance
653 for early euarthropod evolution. Science, 323(5921):1597–1600, 2009.
654 [47] Harry Blackmore Whittington and Derek Ernest Gilmor Briggs. The largest Cam-
655 brian animal, Anomalocaris, Burgess Shale, British-Columbia. Philosophical Trans-
656 actions of the Royal Society of London. B, Biological Sciences, 309(1141):569–609,
657 1985.
658 [48] Thomas Stach. 18 Deuterostome phylogeny–a morphological perspective. In Deep
659 Metazoan Phylogeny: The Backbone of the Tree of Life, pages 425–457. University
660 of Padova Press, 2014.
661 [49] Zhe Chen, Chuanming Zhou, Mike Meyer, Ke Xiang, James D Schiffbauer, Xunlai
662 Yuan, and Shuhai Xiao. Trace fossil evidence for Ediacaran bilaterian animals with
663 complex behaviors. Precambrian Research, 224:690–701, 2013.
664 [50] Luke A Parry, Paulo C Boggiani, Daniel J Condon, Russell J Garwood, Juliana
665 de M Leme, Duncan McIlroy, Martin D Brasier, Ricardo Trindade, Ginaldo AC
666 Campanha, Mírian LAF Pacheco, et al. Ichnological evidence for meiofaunal bila-
667 terians from the terminal Ediacaran and earliest Cambrian of Brazil. Nature ecology
668 & evolution, 1(10):1455, 2017.
669 [51] Frances S Dunn, Alexander G Liu, and Philip CJ Donoghue. Ediacaran develop-
670 mental biology. Biological Reviews, 93(2):914–932, 2018.
671 [52] Graham E Budd and Sören Jensen. The origin of the animals and a ‘Savannah’
672 hypothesis for early bilaterian evolution. Biological Reviews, 92(1):446–473, 2017.
[53] Zhang Xingliang and Joachim Reitner. A fresh look at Dickinsonia: removing it
from Vendobionta. Acta Geologica Sinica-English Edition, 80(5):636–642, 2006.
675 [54] Feng Tang, Stefan Bengtson, Yue Wang, Xun-lian Wang, and Chong-yu Yin. Eoan-
676 dromeda and the origin of Ctenophora. Evolution & development, 13(5):408–414,
677 2011.
678 [55] Andrey Ivantsov, Aleksey Nagovitsyn, and Maria Zakrevskaya. Traces of Locomo-
679 tion of Ediacaran Macroorganisms. Geosciences, 9(9):395, 2019.
680 [56] Zhe Chen, Chuanming Zhou, Xunlai Yuan, and Shuhai Xiao. Death march of a
681 segmented and trilobate bilaterian elucidates early animal evolution. Nature, pages
682 1–4, 2019.
683 [57] Graham E Budd and Richard P Mann. The dynamics of stem and crown groups.
684 bioRxiv, page 633008, 2019.
685 [58] Graham E Budd and Richard P Mann. History is written by the victors: The effect
686 of the push of the past on the fossil record. Evolution, 72(11):2276–2291, 2018.
687 [59] Graham E Budd. The Cambrian fossil record and the origin of the phyla. Integrative
688 and comparative biology, 43(1):157–165, 2003.
689 [60] Adam C Maloof, Susannah M Porter, John L Moore, Frank Ö Dudás, Samuel A
690 Bowring, John A Higgins, David A Fike, and Michael P Eddy. The earliest Cam-
691 brian record of animals and ocean geochemical change. Bulletin, 122(11-12):1731–
692 1774, 2010.
693 [61] Artem Kouchinsky, Stefan Bengtson, Bruce Runnegar, Christian Skovsted, Michael
694 Steiner, and Michael Vendrasco. Chronology of early Cambrian biomineralization.
695 Geological Magazine, 149(2):221–251, 2012.
[62] Sören Jensen. The Proterozoic and earliest Cambrian trace fossil record; patterns,
problems and perspectives. Integrative and Comparative Biology, 43(1):219–228,
2003.
699 [63] Ben J Slater, Thomas HP Harvey, and Nicholas J Butterfield. Small carbonaceous
700 fossils (SCFs) from the Terreneuvian (lower Cambrian) of Baltica. Palaeontology,
701 61(3):417–439, 2018.
702 [64] Mario Coiro, James A Doyle, and Jason Hilton. How deep is the conflict between
703 molecular and fossil evidence on the age of angiosperms? New Phytologist, 2019.
704 [65] Jose Barba-Montoya, Mario dos Reis, Harald Schneider, Philip CJ Donoghue, and
705 Ziheng Yang. Constraining uncertainty in the timescale of angiosperm evolution and
706 the veracity of a Cretaceous Terrestrial Revolution. New Phytologist, 218(2):819–
707 834, 2018.
708 [66] John T Clarke, Rachel CM Warnock, and Philip CJ Donoghue. Establishing a
709 time-scale for plant evolution. New Phytologist, 192(1):266–301, 2011.
710 [67] Sean Nee, Robert Mccredie May, and Paul H Harvey. The reconstructed evolution-
711 ary process. Philosophical Transactions of the Royal Society of London. Series B:
712 Biological Sciences, 344(1309):305–311, 1994.
713 [68] Paul H Harvey, Robert M May, and Sean Nee. Phylogenies without fossils. Evolu-
714 tion, 48(3):523–529, 1994.
715 [69] David M Raup. On the early origins of major biologic groups. Paleobiology,
716 9(2):107–115, 1983.
717 [70] Daniele Silvestro, Rachel CM Warnock, Alexandra Gavryushkina, and Tanja
718 Stadler. Closing the gap between palaeontological and neontological speciation
719 and extinction rate estimates. Nature communications, 9(1):5237, 2018.
[71] Chi Zhang, Tanja Stadler, Seraina Klopfstein, Tracy A Heath, and Fredrik Ronquist.
Total-evidence dating under the fossilized birth–death process. Systematic
Biology, 65(2):228–249, 2015.
723 [72] G.U. Yule. A mathematical theory of evolution, based on the conclusions of Dr.
724 JC Willis, FRS. Philosophical Transactions of the Royal Society of London B,
725 1924:21–87, 1924.
726 [73] David M Raup. Extinction: bad luck or bad genes, 1991.
727 [74] Amaury Lambert and Tanja Stadler. Birth–death models and coalescent point
728 processes: The shape and probability of reconstructed phylogenies. Theoretical
729 population biology, 90:113–128, 2013.
[75] Niklas Wahlberg, Julien Leneveu, Ullasa Kodandaramaiah, Carlos Peña, Sören
Nylin, André VL Freitas, and Andrew VZ Brower. Nymphalid butterflies diversify
following near demise at the Cretaceous/Tertiary boundary. Proceedings of the
Royal Society of London B: Biological Sciences, 276(1677):4295–4302, 2009.
[76] Daniel Moen and Hélène Morlon. Why does diversification slow down? Trends in
Ecology & Evolution, 29(4):190–197, 2014.
[77] Daniel L Rabosky and Irby J Lovette. Density-dependent diversification in North
American wood warblers. Proceedings of the Royal Society of London B: Biological
Sciences, 275(1649):2363–2371, 2008.
[78] Graham J Slater. Not-so-early bursts and the dynamic nature of morphological
diversification. Proceedings of the National Academy of Sciences,
112(12):3595–3596, 2015.
[79] James A Fordyce. Interpreting the γ statistic in phylogenetic diversification rate
studies: a rate decrease does not necessarily indicate an early burst. PLoS ONE,
5(7):e11781, 2010.
[80] Matthew W Pennell, Brice AJ Sarver, and Luke J Harmon. Trees of unusual
size: biased inference of early bursts from large molecular phylogenies. PLoS ONE,
7(9):e43348, 2012.
[81] Robert E Ricklefs. Estimating diversification rates from phylogenetic information.
Trends in Ecology & Evolution, 22(11):601–610, 2007.
[82] Albert B Phillimore and Trevor D Price. Density-dependent cladogenesis in birds.
PLoS Biology, 6(3):e71, 2008.
[83] Lindell Bromham. Molecular clocks and explosive radiations. Journal of Molecular
Evolution, 57(1):S13–S20, 2003.
[84] Michael SY Lee, Julien Soubrier, and Gregory D Edgecombe. Rates of phenotypic
and genomic evolution during the Cambrian explosion. Current Biology,
23(19):1889–1895, 2013.
[85] John R Paterson, Gregory D Edgecombe, and Michael SY Lee. Trilobite
evolutionary rates constrain the duration of the Cambrian explosion. Proceedings
of the National Academy of Sciences, 116(10):4394–4399, 2019.
[86] Robin MD Beck and Michael SY Lee. Ancient dates or accelerated rates?
Morphological clocks and the antiquity of placental mammals. Proceedings of the
Royal Society B: Biological Sciences, 281(1793):20141278, 2014.
[87] Bradley Deline, Jennifer M Greenwood, James W Clark, Mark N Puttick, Kevin J
Peterson, and Philip CJ Donoghue. Evolution of metazoan morphological disparity.
Proceedings of the National Academy of Sciences, 115(38):E8909–E8918, 2018.
[88] Michael SY Lee and Simon YW Ho. Molecular clocks. Current Biology,
26(10):R399–R402, 2016.
[89] Jennifer L Morris, Mark N Puttick, James W Clark, Dianne Edwards, Paul Kenrick,
Silvia Pressel, Charles H Wellman, Ziheng Yang, Harald Schneider, and Philip CJ
Donoghue. The timescale of early land plant evolution. Proceedings of the National
Academy of Sciences, 115(10):E2274–E2283, 2018.
[90] Thomas Lepage, David Bryant, Hervé Philippe, and Nicolas Lartillot. A general
comparison of relaxed molecular clock models. Molecular Biology and Evolution,
24(12):2669–2680, 2007.
[91] Charles D Bell. Between a rock and a hard place: Applications of the “molecular
clock” in systematic biology. Systematic Botany, 40(1):6–13, 2015.
[92] Richard M Bateman. Hunting the Snark: the flawed search for mythical Jurassic
angiosperms. Journal of Experimental Botany, 2019.
[93] Rachel CM Warnock, Ziheng Yang, and Philip CJ Donoghue. Exploring uncertainty
in the calibration of the molecular clock. Biology Letters, 8(1):156–159, 2011.
[94] James F Parham, Philip CJ Donoghue, Christopher J Bell, Tyler D Calway, Jason J
Head, Patricia A Holroyd, Jun G Inoue, Randall B Irmis, Walter G Joyce, Daniel T
Ksepka, et al. Best practices for justifying fossil calibrations. Systematic Biology,
61(2):346–359, 2011.
[95] Philip Donoghue. Evolution: The Flowering of Land Plant Evolution. Current
Biology, 29(15):R753–R756, 2019.
[96] Jeremy M Beaulieu, Brian C O’Meara, Peter Crane, and Michael J Donoghue.
Heterogeneous rates of molecular evolution and diversification could explain the
Triassic age estimate for angiosperms. Systematic Biology, 64(5):869–878, 2015.
[97] Joseph E O’Reilly and Philip CJ Donoghue. The effect of fossil sampling on the
estimation of divergence times with the fossilized birth–death process. Systematic
Biology, 2019.
[98] Sebastian Höhna, Tanja Stadler, Fredrik Ronquist, and Tom Britton. Inferring
speciation and extinction rates under different sampling schemes. Molecular Biology
and Evolution, 28(9):2577–2589, 2011.
[99] Simon Möller, Louis du Plessis, and Tanja Stadler. Impact of the tree prior on
estimating clock rates during epidemic outbreaks. Proceedings of the National
Academy of Sciences, 115(16):4200–4205, 2018.
[100] David Duchêne, Sebastian Duchêne, and Simon YW Ho. Tree imbalance causes
a bias in phylogenetic estimation of evolutionary timescales using heterochronous
sequences. Molecular Ecology Resources, 15(4):785–794, 2015.
[101] Eric Fontanillas, John J Welch, Jessica A Thomas, and Lindell Bromham. The
influence of body size and net diversification rate on molecular evolution during
the radiation of animal phyla. BMC Evolutionary Biology, 7(1):95, 2007.
[102] Brice AJ Sarver, Matthew W Pennell, Joseph W Brown, Sara Keeble, Kayla M
Hardwick, Jack Sullivan, and Luke J Harmon. The choice of tree prior and molecular
clock does not substantially affect phylogenetic inferences of diversification rates.
PeerJ, 7:e6334, 2019.
[103] Tanja Stadler. TreeSim: Simulating phylogenetic trees. R package, 2.4, 2019.
[104] Liam J Revell. phytools: an R package for phylogenetic comparative biology (and
other things). Methods in Ecology and Evolution, 3(2):217–223, 2012.
[105] Andrew Rambaut, Alexei J Drummond, Dong Xie, Guy Baele, and Marc A Suchard.
Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Systematic
Biology, 67(5):901–904, 2018.
[106] Michael SY Lee and Adam M Yates. Tip-dating and homoplasy: reconciling the
shallow molecular divergences of modern gharials with their long fossil record.
Proceedings of the Royal Society B: Biological Sciences, 285(1881):20181071, 2018.