Anatomy of a Cladistic Analysis

Nico M. Franz

School of Life Sciences, Arizona State University

XXXI Willi Hennig Meeting – UC Riverside

June 26, 2012; Riverside, CA

From the abstract:

• "The monophyly of the Neotropical entimine weevil genus Exophthalmus Schoenherr, 1823 (Curculionidae: Entiminae: Eustylini Lacordaire) is reassessed."

• […] "The present study scrutinizes these traditional perspectives, based on a cladistic analysis of 143 adult morphological characters and 90 species, representing 30 genera and seven tribes of Neotropical entimine weevils."

• "The character matrix yielded eight most-parsimonious cladograms (length = 239 steps; consistency index = 66; retention index = 91), with mixed clade support that remains particularly wanting for some of the deeper in-group divergences."

Figure panels: (A) E. agrestis (Boheman); (B) E. consobrinus (Marshall); (C) E. hieroglyphicus Chevrolat; (D) E. impressus (Fabricius); (E) E. nicaraguensis Bovie; (F) E. quadrivittatus (Olivier); (G) E. quinquedecimpunctatus (Olivier); (H) E. roseipes (Chevrolat); (I) E. sulcicrus Champion; (J) E. triangulifer Champion; (K) E. verecundus (Chevrolat); (L) E. vittatus (Linnaeus)

Fig. 2. Preferred cladogram & character state optimizations

Dissecting the process – motivating themes

"Cladists may use the congruence test to iteratively refine assessments of homology, and thereby increase the odds of reliable phylogenetic inference under parsimony. This explanation challenges alternative views which tend to ignore the effects of parsimony on the process of character individuation in systematics."

Franz. 2005. Outline of an explanatory account of cladistic practice. Biol. Phil. 20: 489–515.

"The identification of the valid scope for character statements cannot be a matter of mere ostension, or rigid designation, but must be a matter of scientific theory construction. Scope expansion of character statements can result in a situation where purportedly similar structures, apparently denoted by the same name (proper name or kind name), are in fact not the same. The nonhomology of such characters may be revealed through morphological complexity at the comparative level, by tree topology at the analytical level, or both."

Rieppel. 2007. The performance of morphological characters in broad-scale phylogenetic analyses. Biol. J. Linn. Soc. 92: 297–308.

Sneak preview: 60 matrices, 8 stages – tracking the analysis

Legacy character assembly

Stage 1: legacy character assembly

→ Lacordaire (1863) – typically a 1-3 character system for tribes/genera.

Source: Lacordaire. 1863. Histoire naturelle des insectes, Vol. 6. Paris: Roret.

→ Champion (1911) – winged vs. wingless entimine groups.

Source: Champion. 1911. Otiorhynchinae Alatae. In: Biologia Centrali-Americana, Vol. 4, Part 3. London.

→ Character has a length of 10 steps in a narrowly scoped analysis (Exophthalmus and closely related genera).

→ van Emden (1944) – state-of-the-art for entimine tribes/genera.

Source: van Emden. 1944. A key to the genera of Brachyderinae of the world. Annals and Magazine of Natural History 11: 503–532, 559–586.

→ Anderson (2002) – identifies subfamilies, genera; not tribes.

Source: Anderson. 2002. Family 131. Curculionidae. In: Arnett et al. (Eds.): American Beetles, Vol. 2. Boca Raton, CRC Press; pp. 722–815.

Initial characters and states had a reputable pedigree

Lacordaire. 1863. Histoire naturelle des insectes, Vol. 6. Paris, Roret.

Champion. 1911. Otiorhynchinae Alatae. Biologia Centrali-Americana, Vol. 4, Part 3. London; pp. 178–317.

van Emden. 1944. A key to the genera of Brachyderinae of the world. Ann. Mag. Nat. Hist. 11: 503–532, 559–586.

Anderson. 2002. Family 131. Curculionidae. In: American Beetles, Vol. 2. Boca Raton, CRC Press; pp. 722–815.

Franz. 2010. Redescriptions of critical type species in the Eustylini Lacordaire (Coleoptera: Curculionidae: Entiminae). J. Nat. Hist. 44: 41–80.

Early cladistic outcomes (Stage 1)

• 52 taxa

• 100 characters

• 547 steps! (~ 5.5 / char.)

Matrix 10
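
The "~ 5.5 steps per character" figure is tree length divided by the number of characters. A minimal sketch, assuming the standard ensemble consistency index (CI = minimum possible steps / observed steps, reported here on a 0-100 scale) and using the CI of 28 that the talk reports for matrix 10 further on, also backs out a rough implied minimum number of steps. The function names are illustrative, and the implied minimum is only approximate: the reported CI is rounded, and the talk does not say whether uninformative characters were included.

```python
def steps_per_character(tree_length: int, n_characters: int) -> float:
    """Average number of steps each character takes on the tree."""
    return tree_length / n_characters


def implied_minimum_steps(tree_length: int, ci_percent: float) -> float:
    """Back out the minimum possible steps M from CI = 100 * M / L.

    Assumes the ensemble consistency index; the result is approximate
    because the reported CI is rounded to a whole number.
    """
    return ci_percent / 100 * tree_length


# Values reported in the talk: (characters, tree length, CI).
matrices = {
    "Matrix 10": (100, 547, 28),
    "Matrix 60": (143, 239, 66),
}

for name, (chars, length, ci) in matrices.items():
    avg = steps_per_character(length, chars)
    minimum = implied_minimum_steps(length, ci)
    print(f"{name}: {avg:.2f} steps/char, implied minimum ~{minimum:.0f} steps")
```

Under these assumptions the implied minimum changes little between the first and final matrices, while the observed length drops sharply; that is the pattern one would expect if rescoping removes homoplasy rather than information.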

Select legacy characters

Consistency index and retention index

rapid decline

Character coding – binary, multi-state

~ 40% multi-state characters

Character provenance – external, internal

initially all external

Number of MPTs and nodes collapsed in consensus

3 trees, 3 collapsed

→ Return to initial homology assessments (chars./states).

→ Not publication worthy.

Rescoping starts…

Examples of poorly performing characters (→ recode / eliminate)

Matrix 18: Examples of deactivated characters

Matrices 11-17: rescoping phase I – some examples

Intermediate stages 2-5 – rescoping, taxon/character addition

• addressing poorest characters

• evidence of large gaps in sampling

• addition of out-/ingroup taxa (52 → 90)

• new taxa → new (unscoped) characters

• rescoping all in expanded context

Consistency index and retention index

steady increase

Character coding – binary, multi-state

> 90% binary characters; "exploratory" reductive coding

Character provenance – external, internal

> 20% internal characters

Number of MPTs and nodes collapsed in consensus

> 1200 trees, > 35 collapsed

→ Focus on detail homology, perform "aggressive scope reduction".

→ Still not ready for publication.

And so…

Contingent rescoping – tricarinate rostrum of Diaprepes spp.

17. "Rostrum tricarinate, …

→ Are these rostra also tricarinate (in homology to Diaprepes)? Compared taxa: Exophthalmus, Otiorhynchus, Pachnaeus, Phaops, Rhinospathe.

→ No, not if intermittent phylogenetic insights are transparently included.

• Diaprepes (7 species): (1) present

• Exophthalmus: (0) absent

• Otiorhynchus: (–) inapplicable

• Pachnaeus: (0) absent

• Phaops: (–) inapplicable

• Rhinospathe: (–) inapplicable
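
The effect of scoping on step counts can be sketched with a Fitch down-pass. This is a minimal illustration and not the analysis behind the talk: the tree topology and the "broad" coding are hypothetical, the "narrow" coding follows the slide as read here (Diaprepes present; Exophthalmus and Pachnaeus absent; Otiorhynchus, Phaops and Rhinospathe inapplicable), and inapplicable cells are assumed to behave like missing data.

```python
from typing import Dict, FrozenSet, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]  # leaf name or (left, right)

ALL_STATES = frozenset({"0", "1"})


def state_set(code: str) -> FrozenSet[str]:
    """Map a coded cell to its Fitch state set.

    Assumption: inapplicable ('-') and missing ('?') cells are treated
    alike, i.e. as compatible with any state.
    """
    return ALL_STATES if code in {"-", "?"} else frozenset({code})


def fitch_steps(tree: Tree, coding: Dict[str, str]) -> int:
    """Count the minimum number of state changes (Fitch down-pass)."""
    steps = 0

    def down_pass(node: Tree) -> FrozenSet[str]:
        nonlocal steps
        if isinstance(node, str):
            return state_set(coding[node])
        left, right = (down_pass(child) for child in node)
        shared = left & right
        if shared:
            return shared
        steps += 1  # no shared state: one change is required here
        return left | right

    down_pass(tree)
    return steps


# Hypothetical topology, for illustration only.
tree = ("Otiorhynchus",
        (("Phaops", "Rhinospathe"),
         ("Pachnaeus", ("Exophthalmus", "Diaprepes"))))

# Narrow scope: only the Diaprepes-type tricarinate rostrum counts as present.
narrow = {"Diaprepes": "1", "Exophthalmus": "0", "Otiorhynchus": "-",
          "Pachnaeus": "0", "Phaops": "-", "Rhinospathe": "-"}

# Hypothetical broad scope: any superficially tricarinate rostrum coded present.
broad = {"Diaprepes": "1", "Exophthalmus": "0", "Otiorhynchus": "1",
         "Pachnaeus": "0", "Phaops": "1", "Rhinospathe": "0"}

print("narrow:", fitch_steps(tree, narrow), "step(s)")
print("broad:", fitch_steps(tree, broad), "step(s)")
```

On this made-up tree the narrowly scoped character changes once, whereas the broadly scoped version requires three changes. Treating inapplicables as missing data is a known simplification, so the sketch shows the direction of the effect and not a recommended coding practice.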

17. "Rostrum tricarinate, with a characteristic combination of one median carina and two (dorso-) lateral, apically slightly diverging carinae, each carina narrow, moderately sharp."

Narrowly scoped, deep-level homologies

Terminal stages 6-8 – added resolution, robustness

narrowly rescoped characters, more stable tree length

Consistency index and retention index

approaching "maximum" levels

Character coding – binary, multi-state

> 20% characters with inapplicables

Character provenance – external, internal

~ 40% internal characters

Number of MPTs and nodes collapsed in consensus

8 trees, 3 collapsed

→ Ready for publication.

Review…from matrix 10 to 60

Topology, Optimization, Language

Overview of significant topology changes – matrices 1-60

→ Transition of matrix 30-31 yielded earliest "reliable" results.

Legend (all topology slides): Exophthalmus spp.; nested within Exoph.

Matrix 10 [topology]

• 52 taxa • 100 characters • 4 MPTs • L = 547 steps • CI = 28 • RI = 66 • 3 nodes collapsed • Bremer support

Matrix 20 [topology]

• 52 taxa (=) • 69 characters (– 31) • 3 MPTs (– 1) • L = 159 steps (– 388) • CI = 48 (+ 20) • RI = 83 (+ 17) • 3 nodes collapsed (=) • Bremer support

Matrix 30 [topology]

• 90 taxa (+ 38) • 91 characters (+ 22) • 2192 MPTs (+ 2189) • L = 205 steps (+ 46) • CI = 45 (– 3) • RI = 83 (=) • 38 nodes collapsed (+ 35) • Bremer support

Matrix 60 [topology]

• 90 taxa (=) • 143 characters (+ 52) • 8 MPTs (– 2184) • L = 239 steps (+ 34) • CI = 66 (+ 21) • RI = 91 (+ 8) • 3 nodes collapsed (– 35) • Bremer support

Matrix 10 [optimization]

• 52 taxa • 100 characters • 4 MPTs • L = 547 steps • CI = 28 • RI = 66 • 3 nodes collapsed • Diagnoses unwieldy • Synapomorphies rare
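
The bracketed changes between successive matrices are plain differences, so they can be recomputed from the reported totals as a guard against transcription slips. A minimal sketch, with a hypothetical `MatrixStats` container and the totals for matrices 10, 20, 30 and 60 as listed on the topology slides:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MatrixStats:
    taxa: int
    characters: int
    mpts: int
    length: int
    ci: int
    ri: int
    collapsed: int


def deltas(before: MatrixStats, after: MatrixStats) -> dict:
    """Change in every statistic from one matrix version to the next."""
    return {
        f.name: getattr(after, f.name) - getattr(before, f.name)
        for f in fields(MatrixStats)
    }


# Totals as reported on the topology slides, keyed by matrix number.
series = {
    10: MatrixStats(52, 100, 4, 547, 28, 66, 3),
    20: MatrixStats(52, 69, 3, 159, 48, 83, 3),
    30: MatrixStats(90, 91, 2192, 205, 45, 83, 38),
    60: MatrixStats(90, 143, 8, 239, 66, 91, 3),
}

versions = sorted(series)
for prev, curr in zip(versions, versions[1:]):
    print(f"{prev} -> {curr}: {deltas(series[prev], series[curr])}")

# Overall change from the first to the final matrix.
print(f"10 -> 60: {deltas(series[10], series[60])}")
```

Keeping the totals as the single source and deriving the differences means a comparison between any two matrices, adjacent or not, stays consistent with the reported numbers.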

Select legacy characters

Matrix 60 [optimization]

• 90 taxa (+ 38) • 143 characters (+ 43) • 8 MPTs (+ 4) • L = 239 steps (– 308) • CI = 66 (+ 38) • RI = 91 (+ 25) • 3 nodes collapsed (=) • Diagnoses concise • Synapomorphies common

Matrix 10 [language]

Lacordaire (1863); Champion (1911); van Emden (1944); Anderson (2002); Franz (2010).

Matrix 60 [language]

Related issue…

Which characters should constitute phenotype anatomy ontologies?

Related issue – the value of parsimony-contingent homology

→ Special emphasis on constructing phenotypic anatomy ontologies.

"We have taken an integrative approach in the building of Uberon, and in doing so embrace multiple axes of classification. […] This homology-neutrality of Uberon is a deliberate design feature of the ontology. We believe that specifying homology relationships and descent from common ancestral structures is of obvious high value, but that this need not be tightly coupled to the development of an upper anatomical ontology."

Mungall et al. 2012. Uberon, an integrative multi-species anatomy ontology. Genome Biol. 13: R5.

"Explanatory homology hypotheses should not be mistaken and blended with morphological descriptions, which in their turn are by nature descriptive and not explanatory. […] Instead, we differentiate phylogenetic investigations into the step of producing data and the step of phylogenetic reasoning."

Vogt et al. 2010. The linguistic problem of morphology: structure versus homology and the standardization of morphological data. Cladistics 26: 301–325.

→ So then is this what we have in mind for classes in phenotype ontologies?

Source: Davis. 2011. Delimiting baridine weevil evolution (Coleoptera: Curculionidae: Baridinae). Zool. J. Linn. Soc. 161: 88–156.

Conclusions

• Analytical phylogenetic methods not only organize character information, but may furthermore have the purpose of shaping character individuation.

• In the case of the Exophthalmus analysis, I would have been hard pressed to arrive at the final descriptions of characters and states without benefitting from intermittent parsimony-driven inferences that led to the reweighting and rescoping of earlier homology assessments. It is not always conducive to a researcher's reputation to expose these practices, but they do and must occur frequently.

• Under the cladistic paradigm, the most precise inferences of homology are often parsimony-influenced and parsimony-contingent, and the two notions are inextricably linked and entrenched in our maturing observational terminology.

• By integrating expressions of structural equivalence at increasingly greater scales, phenotype ontologies also run the risk of 'dialing down' the most precise and phylogenetically scoped assessments of homology that systematics can produce.

Acknowledgments

WHS XXXI Organizers

Juliana Cardona Duque, Jennifer Girón, Anyimilehidi Mazo Vargas, Quentin Wheeler

NSF-DEB 1155984: "Systematics of eustyline and geonemine weevils: Connecting and contrasting Caribbean and Neotropical mainland radiations"