Kutty, S N; Bernasconi, M V; Sifner, F; Meier, R (2007). Sensitivity analysis, molecular systematics and natural history evolution of (Diptera: Cyclorrhapha: Calyptratae). Cladistics, 23(1):64-83. Postprint available at: http://www.zora.uzh.ch University of Zurich Posted at the Zurich Open Repository and Archive, University of Zurich. Zurich Open Repository and Archive http://www.zora.uzh.ch Originally published at: Cladistics 2007, 23(1):64-83.

Winterthurerstr. 190 CH-8057 Zurich http://www.zora.uzh.ch

Year: 2007

Sensitivity analysis, molecular systematics and natural history evolution of Scathophagidae (Diptera: Cyclorrhapha: Calyptratae)

Kutty, S N; Bernasconi, M V; Sifner, F; Meier, R

Kutty, S N; Bernasconi, M V; Sifner, F; Meier, R (2007). Sensitivity analysis, molecular systematics and natural history evolution of Scathophagidae (Diptera: Cyclorrhapha: Calyptratae). Cladistics, 23(1):64-83. Postprint available at: http://www.zora.uzh.ch

Posted at the Zurich Open Repository and Archive, University of Zurich. http://www.zora.uzh.ch

Originally published at: Cladistics 2007, 23(1):64-83. Sensitivity analysis, molecular systematics and natural history evolution of Scathophagidae (Diptera: Cyclorrhapha: Calyptratae)

Abstract

The 60 000 described species of Cyclorrhapha are characterized by an unusual diversity in larval life-history traits, which range from saprophagy over phytophagy to parasitism and predation. However, the direction of evolutionary change between the different modes remains unclear. Here, we use the Scathophagidae (Diptera) for reconstructing the direction of change in this relatively small family ( 250 spp.) whose larval habits mirror the diversity in natural history found in Cyclorrhapha. We subjected a molecular data set for 63 species (22 genera) and DNA sequences from seven genes (12S, 16S, Cytb, COI, 28S, Ef1-alfa, Pol II) to an extensive sensitivity analysis and compare the performance of three different alignment strategies (manual, Clustal, POY). We find that the default Clustal alignment performs worst as judged by character incongruence, topological congruence and branch support. For this alignment, scoring indels as a fifth character state worsens character incongruence and topological congruence. However, manual alignment and direct optimization perform similarly well and yield near-identical trees, although branch support is lower for the direct-optimization trees. All three alignment techniques favor the upweighting of transversion. We furthermore confirm the independence of the concepts ‘‘node support'' and ‘‘node stability'' by documenting several cases of poorly supported nodes being very stable and cases of well supported nodes being unstable. We confirm the monophyly of the Scathophagidae, its two constituent subfamilies, and most genera. We demonstrate that phytophagy in the form of leaf mining is the ancestral larval feeding habit for Scathophagidae. From phytophagy, two shifts to saprophagy and one shift to predation has occurred while a second origin of predation is from a saprophagous ancestor. Cladistics

Cladistics 23 (2007) 64–83 10.1111/j.1096-0031.2006.00131.x

Sensitivity analysis, molecular systematics and natural history evolution of Scathophagidae (Diptera: Cyclorrhapha: Calyptratae)

Sujatha Narayanan Kutty1, Marco Valerio Bernasconi2, Frantisˇ ek Sˇ ifner and Rudolf Meier1*

1Department of Biological Sciences, National University of Singapore, Block S2, #02-01, 14 Science Drive 4, Singapore 117543, 2Zoological Museum, University of Zurich Irchel, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland, 3Department of Biology and Environmental Education, Faculty of Education, Charles University, M. D. Rettigove´ 4, CZ-116 39 Prague 1, Czech Republic Accepted 27 July 2006

Abstract

The 60 000 described species of Cyclorrhapha are characterized by an unusual diversity in larval life-history traits, which range from saprophagy over phytophagy to parasitism and predation. However, the direction of evolutionary change between the different modes remains unclear. Here, we use the Scathophagidae (Diptera) for reconstructing the direction of change in this relatively small family ( 250 spp.) whose larval habits mirror the diversity in natural history found in Cyclorrhapha. We subjected a molecular data set for 63 species (22 genera) and DNA sequences from seven genes (12S, 16S, Cytb, COI, 28S, Ef1-alfa, Pol II) to an extensive sensitivity analysis and compare the performance of three different alignment strategies (manual, Clustal, POY). We find that the default Clustal alignment performs worst as judged by character incongruence, topological congruence and branch support. For this alignment, scoring indels as a fifth character state worsens character incongruence and topological congruence. However, manual alignment and direct optimization perform similarly well and yield near-identical trees, although branch support is lower for the direct-optimization trees. All three alignment techniques favor the upweighting of transversion. We furthermore confirm the independence of the concepts ‘‘node support’’ and ‘‘node stability’’ by documenting several cases of poorly supported nodes being very stable and cases of well supported nodes being unstable. We confirm the monophyly of the Scathophagidae, its two constituent subfamilies, and most genera. We demonstrate that phytophagy in the form of leaf mining is the ancestral larval feeding habit for Scathophagidae. From phytophagy, two shifts to saprophagy and one shift to predation has occurred while a second origin of predation is from a saprophagous ancestor. The Willi Hennig Society 2006.

The Cyclorrhapha contain about 60 000 described family of 250 described species is the extreme diversity in species and exhibit a remarkable diversity in breeding biology. Scathophagids breed in different types of dung habits ranging from saprophagy over phytophagy to or other decaying organic matter such as rotting parasitism and predation. In order to understand how seaweed (Gorodkov, 1986; Vockeroth, 1989), mine in this diversity has evolved, studying the life-history leaves, bore in culms and feed on immature flowering evolution across Cyclorrhapha is important. However, heads, or on seed capsules and ovules. Larvae of a few given the size of the Cyclorrhapha, it may be advisable species are also predators of small invertebrates or to first study the life-history evolution of a small caddis fly egg masses; i.e., although commonly known subclade with an unusual diversity in larval habits. A as ‘‘dung flies’’, only a few species in the Scatho- good choice for such a subclade is the Scathophagidae. phaga actually breed in dung. In contrast to the larvae, One of the most striking features of this relatively small adult scathophagids appear all to be predators of other invertebrates. *Corresponding author: Scathophagid systematics is in state of chaos at E-mail address: [email protected] multiple levels. The Scathophagidae along with the

The Willi Hennig Society 2007 S. N. Kutty et al. / Cladistics 23 (2007) 64–83 65 families Fanniidae, and Anthomyiidae com- For example, a recent study investigating the coevolu- prise the superfamily within the Calyptratae tion of male and female reproductive characters utilized (McAlpine and Wood, 1989). This superfamily has been the tree based on COI, although many branches have regarded by some authors as a group of convenience little support (Minder et al., 2005). (Michelsen, 1991; Bernasconi et al., 2000b) while others The feeding modes of the larvae of the Cyclorrhapha considered it monophyletic (McAlpine and Wood, and Scathophagidae can broadly be divided into phy- 1989). The position of the Scathophagidae within tophagy, saprophagy, parasitism and predation. In Muscoidea (if monophyletic) is similarly contentious. general, decaying organic matter is considered the It is sometimes regarded as the sister group to the primitive larval medium for Cyclorrhapha from which Anthomyiidae (Bernasconi et al., 2000b) or at other more specialized forms of feeding such as phytophagy times as the sister group of all remaining Muscoidea and predation have evolved (Ferrar, 1987). The shift to (McAlpine and Wood, 1989). Yet other authors treat phytophagy is seen as an evolutionary hurdle that, once the Scathophagidae as a subclade of subfamily rank of overcome, can lead to rapid radiation and diversification either the Muscidae or Anthomyiidae thus assuming a (Mitter et al., 1988). Here, we identify the ancestral close relationship of Scathophagidae to either of these feeding habit of the Scathophagidae and reconstruct the families (Hackman, 1956; Vockeroth, 1956, 1965, 1989). origins of saprophagy, predation and phytophagy. Such To make matters worse, there is not even convincing tracing requires a robust and well-supported phylo- evidence for the monophyly of the Scathophagidae and genetic tree for the Scathophagidae that we propose here the same applies for its two currently recognized based data from multiple genes. constituent subfamilies (Scathophaginae, Delininae: Gorodkov, 1986; Vockeroth, 1989). Morphological Comparing alignment techniques through a sensitivity characters have been successfully used for developing analysis identification keys and creating an overall satisfactory genus-level . However, as is evident from our Character matrices consisting of molecular data can account on scathophagid phylogenetics, morphological be analyzed using many different alignment strategies characters have been less successful for determining the and weighting regimes. Depending on the strategy and position of Scathophagidae within Calyptratae or weighting regimes that are used for the analysis, the reconstructing the relationships within the family reconstructed phylogeny will differ (Wheeler, 1995; (Bernasconi et al., 2000a,b, 2001). Wheeler and Hayashi, 1998). Here, we use a sensitivity In 2000, Bernasconi et al. (2000a) addressed these analysis for identifying optimal analysis conditions for phylogenetic problems by carrying out analyses based our data set. Our study follows Laamanen et al. (2005) on COI sequences for 61 species representing 22 genera in that we use the same transition, transversion and gap of the Scathophagidae. The most parsimonious tree costs matrices and use multiple criteria for choosing provided had moderate support for many genera, but optimal analysis conditions (topological congruence, the higher-level clades only had weak or no bootstrap character incongruence, node support). Laamanen et al. support. Here, we present the results of a cladistic (2005) had found that character incongruence, tree analysis of the Scathophagidae with additional DNA support and topological congruence favored similar sequences from the mitochondrial genes 12S rRNA, 16S analysis conditions. rRNA, cytochrome oxidase subunit I, cytochrome b and However, in the current analysis we go beyond the nuclear genes 28S rRNA, Elongation factor 1-alpha Laamanen et al. (2005) and other studies comparing (Ef1a) and RNA polymerase II (Pol II). In this analysis, alignment algorithms (Terry and Whiting, 2005) in that we have also added two scathophagid species and we compare three different alignment strategies: Clu- included 11 outgroups representing the remaining mus- stalX (default parameters), Direct Optimization (POY), coid families Fanniidae, Muscidae and Anthomyiidae in and ‘‘manual’’ alignment. We believe that such com- order to be able to rigorously test the monophyly of parison of different alignment strategies is particularly Scathophagidae and to identify its closest relative. important because it is well established that they Phylogenetic research on Scathophagidae is partic- profoundly influence the outcome of cladistic analyses ularly timely because the scathophagids are widely used (e.g., Morrison and Ellis, 1997). Yet, currently most in behavioral ecology. For many years, this only applied systematists are either firm practitioners of manual to (Scathophaginae), which was alignment or strong believers in the superiority of frequently used as a model organism in behavioral optimization alignment; i.e., few analyses systematically studies (e.g., Parker, 1970; Hosken, 1999). However, compare the results obtained under the two most recently scathophagid research has become more com- popular techniques, although such comparative data parative and now includes species from across the entire may help in choosing between the available methods family, although prior work on the phylogenetic rela- and provide critical information for interpreting trees tionships only yielded weakly supported hypotheses. based on data obtained using different alignment 66 S. N. Kutty et al. / Cladistics 23 (2007) 64–83 strategies. Lastly, here we test Grant and Kluge’s (2005) The ribosomal gene sequences for 12S, 16S and 28S contention that ‘‘node stability’’ sensu Giribet (2003) were aligned using three different techniques. First, an and node support are largely synonymous concepts by alignment was obtained using ClustalX 1.8 (Higgins and studying the correlation between both measures for the Sharp, 1988) with the gap opening and extension cost set nodes of our trees. at the default (15 : 6.66). Second, this Clustal alignment was manually re-aligned in MacClade (Maddison and Maddison, 2000). This adjustment was taxon-blind; i.e., Materials and methods the taxa names were not used during the manual optimization of the alignment. Third, direct optimiza- Taxa and DNA extractions tion (Wheeler et al., 2003) was used as implemented in POY (Wheeler, 1996; Wheeler et al., 2003; documenta- Sixty-three species of Scathophagidae are included in tion by De Laet and Wheeler, 2003) The indel cost was the analysis. Whole flies were frozen and pulverized in varied from 1 to 10 (see Fig. 2). For two reasons we liquid nitrogen before the DNA extraction was per- refrained from differentiating between gap opening and formed using the QIAamp tissue kit (Qiagen, Valencia, gap extension costs (‘‘affine gap costs’’: Watermann CA) according to manufacturer’s instructions. DNA et al., 1976; Gotoh, 1982), although recent analyses was eluted in 400 lL of distilled water or buffer. suggest such treatment may improve congruence Compared with Bernasconi et al. (2000a), we have between gene partitions (Aagesen, 2005). First, we were added 13 taxa (Table 1) representing the other muscoid concerned about potential violations of metricity in families Fanniidae, Muscidae and Anthomyiidae. DNA costs sets consisting an opening, extension, and base extractions for the outgroup species and two scatho- change cost. Secondly, comparing the results of affine phagid species were performed using phenol–chloro- gap costs in POY with the other alignment techniques form extraction. The specimens were lyzed in CTAB would be difficult given that different gap opening and buffer, 20 lL Proteinase K was added and the samples extension costs cannot be easily implemented in cladistic were incubated overnight at 55 C. DNA was extracted analyses based on fixed alignments. using phenol ⁄chloroform ⁄isoamyl alcohol (25 : 24 : 1) mix and precipitated with 100% ethanol. After washing Tree search strategies the DNA pellets in 70% ethanol, they were dissolved in 50–100 lL of water. The data set including fixed alignments was analyzed using parsimony as implemented in PAUP* 4.0 (Swof- DNA amplification and sequencing ford, 2002; using batch files for carrying out the various procedures needed for the sensitivity analysis; see Standard PCR amplifications were carried out with Supplementary material). For each weighting regime, Bioline Taq and Takara Ex-Taq using approximately 1– we carried out a heuristic search using 100 random 5 lL of the DNA extractions to amplify the five sequence additions replicates and TBR branch swap- different gene regions of interest. The genes sequenced ping. Node support was assessed using Jackknife values for the Scathophagidae taxa are 12S, 16S, 28S, Ef1a and (250 replicates, 100 random addition sequences each) Pol II. Amplification of COI sequences for species obtained at 36.80% deletion as recommended in Farris lacking information in Bernasconi et al. (2000a) was et al. (1996). The data were analyzed using five cost also carried out. The primers used in this study are given matrices that define different weighting regimes for state in Table 2. All genes were sequenced for the outgroups. transformations. Transitions were downweighted by The PCR cycles consisted of an initial denaturation step assigning higher weights to transversions (2, 4, 6, 8 at 95 C for 7 min, followed by 95 C for 1.5 min, and 10). Third positions of protein encoding genes were annealing at temperatures ranging from 44 to 50 C and downweighted by applying the same higher weights to extension at 72 C for 1.5 min. A final extension at the first and second positions and the sequences for the 72 C for about 5 min was also added. The amplified ribosomal genes. The analysis was carried out with gaps gene products were purified using Bioline Quick-Clean coded as missing data as well as using the gaps as a fifth solution following the manufacturer’s protocol. Cycle character state. When coded as a fifth character state, sequencing reactions was performed on the purified gaps were given at most half the weights of the products using BigDye Terminator v3.1. The products transversions to avoid violations of triangular inequal- were then prepared for direct sequencing by removing ities (Wheeler, 1993). the dye-terminator using 5 lL of Agencourt CleanSEQ The data set was also analyzed using parsimony and solution. The sequences were edited and assembled in direct optimization as implemented in POY (Wheeler Sequencher 4.0. The protein encoding genes COI, Cytb, et al., 2003; documentation by De Laet and Wheeler, Ef1a and Pol II were aligned in the same program and 2003). The protein-encoding genes were entered as yielded gap-free alignments. prealigned data. The sequences for 28S rDNA were Table. 1 Taxa used in study and known larval breeding habits

GenBank accession number

Taxa 12S 16S Cytb COI 28S Ef1a polI Larval biology Fannidae Fannia armata (Meigen, 1826) DQ656883 DQ648646 DQ657050 AF104623 DQ656960 Breeds in dung, rotting wood and fungi Fannia canicularis (Linnaeus, DQ656884 DQ648647 DQ657051 DQ657037 DQ656961 Breeds in decaying matter, dung, 1761) fungi, etc. Fannia manicata (Meigen, DQ656885 DQ657052 DQ657038 DQ656962 Breeds in decaying matter, dung, 1826) fungi, etc. Muscidae Mesembrina mystacea (Lin- DQ656895 DQ657063 DQ657046 DQ656973 DQ657147 Breeds in dung naeus, 1758) Musca domestica Linnaeus DQ656896 DQ648650 DQ657064 AF104622 DQ656974 DQ657113 Breeds in dung, garbage, rotting

(1758) organic matter 64–83 (2007) 23 Cladistics / al. et Kutty N. S. Stomoxys calcitrans (Lin- DQ656886 DQ657053 DQ657039 DQ656963 Breeds in dung and rotting or- naeus, 1758) ganic matter Anthomyiidae Chirosia flavipennis (Falle´n, DQ656887 DQ657054 DQ657040 DQ656964 1823) Zaphne caudata (Zetterstedt, DQ656888 DQ657055 DQ657041 DQ656965 1855) Delia platura (Meigen, 1826) DQ656894 DQ657062 DQ657045 DQ656972 Reared from stems and roots of crop plants, onions and dung Botanophila fugax (Meigen, DQ656890 DQ657057 DQ657042 DQ656967 1826) Hydrophoria lancifer (Harris, DQ656891 DQ657058 DQ657043 DQ656968 1780) Hylemyza partita (Panzer, DQ657059 DQ656969 1798) Lasiomma latipenne (Zetter- DQ656892 DQ657060 DQ657044 DQ656970 stedt, 1838) Lasiomma seminitidum DQ656893 DQ648649 DQ657061 AF104624 DQ656971 DQ657112 DQ657146 (Zetterstedt, 1845)

Scathophagidae Acanthocnema glaucescens DQ656897 DQ648651 DQ657065 AF181023 DQ656975 DQ657114 Breeds in egg masses of caddis (Loew, 1864) flies in swift streams Acerocnema macrocera (Mei- DQ656898 DQ648652 DQ657066 AF181025 DQ656976 gen, 1826) Americina adusta (Loew, DQ656899 DQ648653 DQ657067 AF181030 DQ656977 DQ657148 1863) Ceratinostoma ostiorum (Cur- DQ656914 DQ648668 AF180986 AF180792 DQ656992 DQ657123 DQ657162 Breeds in thick moist seaweed tis, 1832) washed up on beaches (Becklund, 1945) 67 68 Table. 1 Continued

GenBank accession number

Taxa 12S 16S Cytb COI 28S Ef1a polI Larval biology Chaetosa (Opsiomyia) palpalis DQ656915 DQ648669 DQ657082 AF181017 DQ656993 DQ657124 DQ657163 (Coquillett) Chaetosa punctipes (Meigen, DQ656916 DQ648670 DQ657083 AF181016 DQ656994 Aquatic and predacious 1826) Chylizosoma vittatum (Meigen, DQ656900 DQ648654 DQ657068 AF181031 DQ656978 DQ657149 Mines in leaves of 1826) orchids Cleigastra apicalis (Meigen, DQ656901 DQ648655 DQ657069 AF181024 DQ656979 DQ657115 DQ657150 Predacious in the galls of 1826) Lipara sp. (Chloropidae, Diptera) in Phragmites australis Cordilura (Achaetella) varipes DQ656913 DQ648667 DQ657081 AF180996 DQ656991 DQ657161 Walker, 1849 64–83 (2007) 23 Cladistics / al. et Kutty N. S. Cordilura atrata (Zetterstedt, DQ656903 DQ648657 DQ657071 DQ657047 DQ656981 DQ657116 DQ657152 Mines in culm of Carex 1846) rostrata Cordilura (Cordilura) carbona- DQ656904 DQ648658 DQ657072 AF180988 DQ656982 DQ657117 DQ657153 ria Walker 1849 Cordilura (Cordilura) ciliata DQ656905 DQ648659 DQ657073 AF180989 DQ656983 DQ657154 (Meigen, 1826) Cordilura (Cordilura) ontario DQ656908 DQ648662 DQ657076 AF180992 DQ656986 DQ657120 DQ657157 Curran, 1929 Cordilura (Cordilura) pubera DQ656910 DQ648664 DQ657078 AF180987 DQ656988 DQ657122 DQ657159 Mines in culm of Carex (Linnaeus, 1758) hirta Cordilura (Cordilura) pudica DQ656911 DQ648665 DQ657079 AF180991 DQ656989 DQ657160 (Meigen, 1826) Cordilura (Cordilura) umbrosa DQ656912 DQ648666 DQ657080 AF180990 DQ656990 Mines in culm of Carex Loew, 1873 atrata and Carex umbrosa Cordilura (Cordilurina) albipes DQ656902 DQ648656 DQ657070 AF180995 DQ656980 DQ657151 (Falle´n, 1819) Cordilura (Parallelomma) di- DQ656906 DQ648660 DQ657074 AF180993 DQ656984 DQ657118 DQ657155 midiata Cresson, 1918 Cordilura (Parallelomma) DQ656909 DQ648663 DQ657077 AF180994 DQ656987 DQ657121 DQ657158 pleuritica Loew, 1863 Cordilura latifrons Loew, 1869 DQ656907 DQ648661 DQ657075 AF180997 DQ656985 DQ657119 DQ657156 Delina nigrita (Falle´n, 1819) DQ656889 DQ648648 DQ657056 AF181029 DQ656966 Mines leaves, stem and roots of various orchids Gimnomera cerea Coquillett, DQ656917 DQ648671 DQ657084 AF181009 DQ656995 DQ657125 DQ657164 Feeds on ovules and capsule 1908 of Pedicularis canadensis (Neff, 1968) Gimnomera dorsata (Zetter- DQ656919 DQ648673 DQ657086 AF181008 DQ656997 Feeds on seeds of Bartsia stedt, 1838) alpina Gimnomera cuneiventris DQ656918 DQ648672 DQ657085 AF181011 DQ656996 (Zetterstedt, 1846) Table. 1 Continued

GenBank accession number

Taxa 12S 16S Cytb COI 28S Ef1a polI Larval biology Gimnomera tarsea (Falle´n, DQ656920 DQ648674 DQ657087 AF181010 DQ656998 DQ657165 Breeds in seed capsules of 1819) Pedicularis palustris Hydromyza confluens Loew, DQ656921 DQ648675 DQ657088 AF181015 DQ657126 DQ657166 Tunnels in submerged petioles 1863 of Nymphaea Hydromyza livens (Fabricius, DQ656922 DQ648676 DQ657089 AF181014 DQ656999 Mines in leaves and in sub- 1794) merged petioloes of Nuphar species (Brock & van der Velde, 1983) Microprosopa pallidicauda DQ656923 DQ648677 DQ657090 AF181021 DQ657000 (Zetterstedt, 1838) Nanna articulata (Becker, 1894) DQ656924 DQ648678 DQ657091 DQ657048 DQ657001 Nanna brunneicosta Johnson, DQ656925 DQ648679 DQ657092 AF181006 DQ657002 DQ657127 DQ657167 1927 64–83 (2007) 23 Cladistics / al. et Kutty N. S. Nanna fasciata (Meigen, 1826) DQ656926 DQ648680 DQ657093 AF181003 DQ657003 DQ657168 Nanna flavipes (Falle´n, 1819) DQ656927 DQ648681 DQ657094 AF181005 DQ657004 Damages immature flowering heads of Phleum (Raatakainen & Vasarainen, 1972, 1975) Nanna inermis (Becker, 1894) DQ656928 DQ648682 DQ657095 AF181004 DQ657005 DQ657169 Breeds in meadows of islands where Calamagrostis sp. grows Nanna tibiella (Zetterstedt, DQ656929 DQ648683 DQ657096 AF181007 DQ657006 DQ657128 DQ657170 1838) Neorthacheta dissimilis Mal- DQ656930 DQ648684 DQ657097 AF181027 DQ657007 DQ657129 DQ657171 Feeds on young leaf shoots of loch, 1924 Iris Norellia (Norellia) spinipes DQ656936 DQ648690 DQ657103 AF180998 DQ657013 DQ657133 DQ657175 Mines probably in leaves of (Meigen, 1826) Leucojum species (Ciampolini, 1957) Norellia (Norelliosoma) flavi- DQ656931 DQ648685 DQ657098 DQ657049 DQ657008 DQ657130 Mine in petioles of Filipendula corne Meigen (1826) ulmaria (Nelson & Sˇ ifner, 2000) Norellia (Norellisoma) liturata DQ656932 DQ648686 DQ657099 AF181001 DQ657009 DQ657131 DQ657172 (Wiedemann in Meigen, 1826) Norellia (Norellisoma) mirusae DQ656933 DQ648687 DQ657100 AF181002 DQ657010 DQ657132 Sˇ ifner, 1974 Norellia (Norellisoma) spini- DQ656934 DQ648688 DQ657101 AF180999 DQ657011 DQ657173 Mines in Rumex aquaticus mana (Falle´n, 1819) stems Norellia (Norellisoma) striolata DQ656935 DQ648689 DQ657102 AF181000 DQ657012 DQ657174 (Meigen, 1826) Okeniella caudata (Zetterstedt, DQ656937 DQ648691 DQ657104 AF181020 DQ657014 1838) Orthacheta cornuta Loew, 1863 DQ656938 DQ648692 DQ657105 AF181026 DQ657015 DQ657134 DQ657176 Phrosia albilabris (Fabricius, DQ656939 DQ648693 DQ657106 AF181028 DQ657016 Mines in leaves of Liliace- 1805) aea Pogonota (Lasioscelus) sahl- DQ656940 DQ648694 DQ657107 AF181019 DQ657017 DQ657177 bergi (Becker, 1900) 69 Table. 1 70 Continued

GenBank accession number

Taxa 12S 16S Cytb COI 28S Ef1a polI Larval biology Pogonota (Pogonota) barbata DQ656941 DQ648695 DQ657108 AF181018 DQ657018 DQ657178 (Zetterstedt, 1838) Scathophaga analis (Meigen, DQ656942 DQ648696 AF180977 AF180783 DQ657019 DQ657135 DQ657179 1826) Scathophaga calida Curtis, 1832 DQ656943 DQ648697 AF180981 AF181003 DQ657020 DQ657136 DQ657180 Breeds in thick moist seaweed washed up on beaches (Beck- lund, 1945) Scathophaga cineraria (Meigen, DQ656944 DQ648698 AF180978 AF180784 DQ657021 DQ657137 DQ657181 Breeds on cattle excrements 1826) (Say, DQ656945 DQ648699 AF180974 AF180777 DQ657022 DQ657138 DQ657182 Breeds on cattle excrements 1823) Scathophaga incola (Becker, DQ656946 DQ648700 AF180980 AF180786 DQ657023 1900) 64–83 (2007) 23 Cladistics / al. et Kutty N. S. (Meigen, DQ656947 DQ648701 AF180976 AF180781 DQ657024 DQ657183 1826) (Falle´n, DQ656948 DQ648702 AF180983 AF180789 DQ657025 DQ657139 Breeds in thick moist seaweed 1819) washed up on beaches (Beck- lund, 1945) Scathophaga lutaria (Fabricius, DQ656949 DQ648703 AF180975 AF180779 DQ657026 DQ657140 DQ657184 Breed in dung 1794) Scathophaga obscura (Falle´n, DQ656950 DQ648704 AF180984 AF180790 DQ657027 Feeds on egg masses of caddis- 1819) flies Scathophaga pictipennis (Old- DQ656951 DQ648705 AF180979 AF180785 DQ657028 DQ657185 Breeds in dung enberg, 1923) Scathophaga stercoraria (Lin- DQ656952 DQ648706 AF180971 AF180759 DQ657029 DQ657141 DQ657186 Breeds in dung naeus, 1758) (Fabricius, DQ656953 DQ648707 AF180973 AF180773 DQ657030 DQ657142 DQ657187 Breeds in dung 1794) Scathophaga taeniopa Rondani, DQ656954 DQ648708 AF180972 AF180768 DQ657031 DQ657188 1866 Scathophaga tinctinervis (Beck- DQ656955 DQ648709 AF180985 AF180079 DQ657032 DQ657189 Feeds in dung (Nelson, er, 1894) 2000) Scathophaga tropicalis Malloch DQ656956 DQ648710 AF180982 AF180788 DQ657033 DQ657143 DQ657190 Breeds in mammalian dung Spaziphora cincta (Loew, 1863) DQ656957 DQ648711 DQ657109 AF181012 DQ657034 DQ657191 Free-living, aquatic and preda- cious Spaziphora hydromyzina (Fall- DQ656958 DQ648712 DQ657110 AF181013 DQ657035 DQ657192 Predacious and feed on algal e´n, 1819) and fungal growth insewage beds Trichopalpus fraterna (Meigen, DQ656959 DQ648713 DQ657111 AF181022 DQ657036 DQ657193 Aquatic and predacious 1826) (Nelson, 1995) S. N. Kutty et al. / Cladistics 23 (2007) 64–83 71

Table 2 Primers used in study.

Gene Primer name Sequence 12S 12Sai 5¢-AAA CTA GGA TTA GAT ACC CTA TTA 12Sr_cal 5¢-CCC CTA GGA TTA GAT ACC CTA TTA 16S 16Sf.dip 5¢-CGC CTG TTT AAC AAA AAC AT 16Sr.dip 5¢-TGA ACT CAG ATC ATG TAA GAA A COI mtD8 5¢-CCA CAT TTA TTT TGA TTT TTT GG mtD12 5¢-TCC AAT GCA CTA ATC TGC CAT ATTA Cytb CB-J-10933 5¢-TAT GTT TTA CCT TGA GGA CAA ATA TC TSI-N-11683 5¢-AAA TTC TAT CTT ATG TTT TCA AAA C 28S rc28A 5¢-AGC GGA GGA AAA GAA AC 28C 5¢-GCT ATC CTG AGG GAA ACT TCG G Ef1a M-441 5¢-CAG GAA ACA GCT ATG ACC GCT GAG CGY GAR CGT GGT ATC AC rcM4 5¢-ACA GCV ACK GTY TGY CTC ATR TC Pol II 5F 5¢-CCI CAY TTY ATH AAR GAY GA 17R 5¢-TTY TGN GCR TTC CAD ATC AT divided into three fragments using hyperconservative correlated (Laamanen et al., 2005) and MRI remains stem regions as break points while the relatively short unpublished (but see Aagesen et al., 2005). Topological 12S and 16S fragments remained whole. The sequences congruence was determined for each partition by were analyzed using POY in a parallel computing mode counting the number of nodes on the strict and on two clusters at Singapore’s National Grid (‘‘Melon’’: semistrict consensus trees (Laamanen et al., 2005). For four nodes with four Xeon CPUs; ‘‘Hydra3’’: 10 nodes the partitioning scheme using protein encoding genes with four Itanium2 CPUs) using the following string of versus ribosomal genes, the symmetric tree distance commands: -parallel -jobspernode 2 -onan -onannum metric (Penny and Hendy, 1985) and the corrected 2 -dpm -norandomizeoutgroup -maxtrees 10 -hold- symmetric tree distance metric were computed (Laama- maxtrees 100 -fitchtrees -seed -1 -slop 5 -checkslop nen et al., 2005). 10 -dpm -multirandom -replicates 10 -treefuse -fuselimit 10 -fusemingroup 5 -fusemaxtrees 100. For the preferred Natural history analysis condition, we increased the number of repli- cates to 100 (same tree length was found as in the 10 Data for the breeding and feeding habits of larvae for replicate analysis). The analysis was carried out using the different species of the Scathophagidae (Table 1) the same cost matrices that were used for the fixed were compiled from the literature and personal obser- alignments. Node support was assessed using the Jack- vations of Frantisˇ ek Sˇ ifner. Natural history information boot option in POY (-parallel -jackboot -jobspernode was only available for a subset of the taxa used in this 4 -onan -onannum 4 -dpm -norandomize outgroup study and the tree was reduced by pruning all outgroups -maxtrees 10 -holdmaxtrees 1000 -fitchtrees -seed -1 and terminals lacking natural history information. -slop 5 -checkslop 10 -dpm -multirandom -replicates Larval breeding habits were mapped on to the most 100). The number of replicates was increased to 250 for parsimonious tree (Fig. 4) using two different character the preferred analysis condition. definitions and the ‘‘trace character’’ option in Mac- Clade. In the case of multiple optimizations, we inspec- Sensitivity analysis ted all equally parsimonious mappings. We first coded ‘‘phytophagy’’ as one character state in a multistate Separate sensitivity analyses (Wheeler, 1995) were ‘‘natural history evolution’’ character (states: phyto- carried out for the three alignment strategies and the phagy, predation, saprophagy). Alternatively, we divi- following two ways to partition the data: (1) protein- ded ‘‘phytophagy’’ into ‘‘monocot phytophagy’’ and encoding genes versus ribosomal genes; (2) all genes ‘‘dicot phytophagy.’’ This coding does not assume versus each other. Branch support was assessed by homology between feeding on monocots and dicots. summing the jackknife support values above 50 (Laa- manen et al., 2005). Character incongruence for the partitions was assessed using the ILD (incongruence Results length difference) as implemented in Laamanen et al. (2005) (Wheeler and Hayashi, 1998). We opted against Sequence data for the 78 taxa were compiled and using RILD (rescaled incongruence length difference) concatenated in MacClade 4.0 (Maddison and Maddi- and MRI (Meta-Retention Index) because we had son, 2000). The ClustalX aligned data set had 5060 previously found that RILD and ILD were highly characters, whereas in the manually aligned data set the 72 S. N. Kutty et al. / Cladistics 23 (2007) 64–83 number of characters is reduced to 5052. The jackknife sample, the Anthomyiidae are a paraphyletic group. support, ILD and symmetric tree distances for different However, the monophyly of the two subfamilies of partitions of the data set from the various) analyses Scathophagidae is well supported and our phylogenetic using different weighting regimes for different alignment hypothesis is thus consistent with the proposed subfamily strategies and treatment of gaps as missing or informa- classification of the Scathophagidae into the Scathopha- tion are summarized in Table 3. ginae and Delininae (Gorodkov, 1986; Vockeroth, 1989) based on morphology. The data also confirm the Alignment techniques, indel treatment, character monophyly of most genera including Cordilura, Nanna, transformation weighting Norellia, Gimnomera, Hydromyza and Spaziphora and resolves the intergeneric relationships. The ancestral The default Clustal alignment performs worst with feeding habit in the Scathophagidae is phytophagy from regard to two optimality criteria: it yields the highest which saprophagy and predation have evolved. In gen- character incongruence and lowest topological congru- eral, we see that two shifts to saprophagy and one shift to ence regardless of whether indels are scored as missing predation has occurred. A second origin of predation is values or fifth character states. With regard to the third found within the otherwise saprophagous Scathophaga. parameter, branch support, Clustal alignments outper- form POY, but not the manual alignment (however, it is unclear whether jackknife values based on fixed align- Discussion ments can be directly compared with those from optimization alignment analyses). The manual align- Sensitivity analyses and choice of alignment ment and direct optimization perform at similar levels. For the ‘‘protein versus ribosomal genes’’ analysis, the Proposing a phylogenetic hypothesis for the Scathop- lowest character incongruence is observed for POY hagidae based on the results of our sensitivity analysis is while the manual alignment outperforms POY in the at the same time difficult and relatively straightforward ‘‘all genes versus each other’’ sensitivity analysis. A as the following discussion will reveal. We here use a similar pattern emerges with regard to topological number of different indicators to choose optimal ana- congruence with both techniques yielding optimal values lysis conditions and we find slight differences with depending on which measure of topological congruence regard to which weighting regime or alignment tech- is used and how the data set is partitioned (Table 3). nique should be used. However, fortunately the tree(s) Downweighting of third positions for protein-enco- that are favored by all these indicators are either ding genes increases character incongruence and lowers identical or very similar (Figs 1 and 2). Furthermore, jackknife support as well as topological congruence for we find an orderly pattern in the recommendations from all three ways of aligning and two ways of partitioning the sensitivity analyses in that mostly similar weighting the data. However, downweighting transitions improved regimes yield optimal values (see also Terry and tree support and lowered ILD values for all alignment Whiting, 2005). This finding mirrors the results of a techniques. The highest jackknife support was obtained similar studies (e.g., Aagesen et al., 2005; Aagesen, 2005; for the weighting regime tv ¼ 4, when the analysis was Laamanen et al., 2005) that indicated that sensitivity carried out using gaps as information and the manually analyses can be based on several criteria (e.g., character aligned data set. The ILD values and symmetric tree incongruence, topological congruence, branch support) distance between the protein and ribosomal partitions without obtaining wildly conflicting results. are lowest at tv ¼ 2 and tv ¼ 6, respectively, but as the We find that for the scathophagid data set, assigning tree topologies for tv ¼ 2, 4 and 6 weightings are higher weights to transversion than transitions is the identical, we prefer the strict consensus tree of the two preferred treatment. Such transversion weighting most parsimonious trees obtained at tv ¼ 4 (Figs 1 and improves node support, character congruence, and 2). This tree is used for our phylogenetic discussion and topological congruence for all three different ways to for tracing the natural history evolution. The tree align the ribosomal genes (ClustalX, Manual, POY). topology of the favored POY treatment only differs For our laboratory this is now the fourth sensitivity with regard to outgroup arrangement; i.e., it is also very analysis in a row for which we find that transversion similar to the trees obtained based on manual alignment weighting is favored (Meier and Wiegmann, 2002; with transversion weighting. However, the equal weight- Damgaard et al., 2005; Laamanen et al., 2005). We ing tree differs in several regards from the preferred tree thus believe that this strategy may be of considerable (compare Figs 1, 2 and 3). importance for many data sets. For the scathophagid All analyses of the data set confirm the monophyly of data set, we find that several indicators favor a the family Scathophagidae with the Anthomyiidae being transversion weight from 1 to 10 with most preferring its next closest relatives when the tree is rooted to Fannia a weight of 2 or 4. Fortunately, the systematic conclu- armata. Based on our rooting and our very limited taxon sions are not affected as the tree topologies are identical Table 3 Jackknife support, ILD and symmetric tree distances for different partitions of the data set using different weighting regimes for different alignment strategies and treatments of gaps. The numbers in bold represent optimal values in the criterion column within each of the alignment techniques and underlined bold numbers represent the global optimum across the alignment techniques.

Direct optimization Protein versus ribosomal genes All genes versus Shared nodes each other Jackknife Character Corrected Character support congruence SymDif congruence of MPT ILD Metric Strict Semistrict ILD Equal 4263 7.26 0.50 24 38 41.88 .N ut ta./Caitc 3(07 64–83 (2007) 23 Cladistics / al. et Kutty N. S. Tv ¼ 2 4321 7.33 0.49 35 38 39.59 Tv ¼ 4 4298 6.94 0.45 39 41 45.13 Tv ¼ 6 4683 7.61 0.46 38 40 48.87 Tv ¼ 8 4457 8.16 0.42 42 42 57.98 Tv ¼ 10 4333 8.57 0.44 39 41 53.33 1&2¼ 2 2857 8.45 0.73 29 40 46.31 1&2¼ 4 2080 10.11 0.55 28 51 55.15 1&2¼ 6 2085 11.09 0.59 27 46 60.58 1&2¼ 8 2143 11.80 0.56 27 48 64.22 1&2¼ 10 2088 11.99 0.60 28 47 66.97

Manual alignment: All genes Manual alignment: All genes Indel ¼ 5th state Protein versus ribosomal genes versus Indel ¼ missing Protein versus ribosomal genes versus each other each other Shared nodes Shared nodes Jackknife Character Corrected Character Jackknife Character Corrected Character support congruence SymDif Strict Semistrict congruence support congruence SymDif Strict Semistrict congruence of MPT ILD Metric ILD of MPT ILD Metric ILD Equal 4842.06 7.84 0.53 23 42 36.85 Equal 4975.02 7.83 0.52 27 43 37.96 Tv ¼ 2 5203.81 7.18 0.52 33 40 37.87 Tv ¼ 2 5377.4 6.80 0.51 31 45 37.47 Tv ¼ 4 5597.11 7.00 0.54 32 34 43.12 Tv ¼ 4 5689.53 7.28 0.49 36 37 42.56 Tv ¼ 6 5402.59 7.53 0.47 35 41 47.32 Tv ¼ 6 5566.52 8.22 0.47 35 46 46.87 Tv ¼ 8 5501.52 7.88 0.52 32 37 60.16 Tv ¼ 8 5503.94 8.89 0.51 35 40 56.21 Tv ¼ 10 5396.88 8.19 0.54 26 53 51.82 Tv ¼ 10 5580.26 9.32 0.50 36 42 51.67 1&2¼ 2 5038.3 7.92 0.50 26 40 41.78 1 & 2 ¼ 2 4952.31 8.07 0.52 28 41 43.19 1&2¼ 4 4864.9 9.76 0.58 25 42 48.50 1 & 2 ¼ 4 4974.01 9.62 0.59 27 35 49.56 1&2¼ 6 4618.81 10.72 0.58 25 41 53.04 1 & 2 ¼ 6 4811.72 10.70 0.59 27 35 53.99 1&2¼ 8 4557.59 11.35 0.61 23 39 56.73 1 & 2 ¼ 8 4532.1 11.38 0.59 26 33 57.47 1&2¼ 10 4422.52 11.65 0.62 22 35 59.54 1 & 2 ¼ 10 4774.48 11.72 0.65 23 31 60.08 73 74 S. N. Kutty et al. / Cladistics 23 (2007) 64–83

for both weighting regimes. Unambiguous across all of our recent sensitivity analyses (Meier and Wiegmann, 2002; Damgaard et al., 2005; Laamanen et al., 2005) is the futility of downweighting third positions. Several authors had argued for such a treatment because third positions evolve at faster rates when compared with first All genes versus each other Character congruence ILD 38.21 39.76 and second positions (see References in Laamanen et al., 2005). In the present study, the downweighting of third positions never leads to any improvement in branch support, character congruence, or topological congruence and this conclusion holds for any of our three ways to align the data. This result adds to the mounting evidence that the downweighting of third Shared nodes position is generally not a useful technique (see Refer- ences in Laamanen et al., 2005) and that third positions often contain important phylogenetic structure (e.g., Ka¨llersjo¨et al., 1999). Corrected SymDif Metric More controversial are our findings with regard to indel treatment and the preferred alignment technique. Giribet and Wheeler (1999) had strongly argued based on theoretical arguments that (1) indels should always 7.57 0.54 30 36 9.81 0.58 28 33 50.67 Character congruence ILD be coded as fifth character state, and that (2) numerical alignments are always preferable over ‘‘manual’’ or ‘‘by eye’’ alignments. With regard first issue, we find that for our Clustal Jackknife support of MPT 5564.6

missing Protein versus ribosomal genes alignment, coding indels as missing values decreases 24 4991.186 4914.468 4733.1 8.8810 11.49 4623.89 4377.63 12.93 13.83 14.21 0.53 0.56 0.57 0.58 25 0.58 27 27 39 26 40 26 40 39 41 43.54 51.09 56.07 59.95 62.76 ¼ character incongruence and increases topological con- 2 5265.15 810 5234.14 5235.03 10.80 11.28 0.60 0.58 28 28 30 33 60.61 55.95 46 5409.28 8.55 0.57 28 35 46.26 ¼ ¼ ¼ ¼ ¼ ¼ ¼ ¼ ¼ ¼ gruence and branch support; i.e., all optimality criteria Tv Clustal alignment: Indel indicate that indels should be treated as missing data. For the manual alignment, the results are more ambi- guous. Depending on what measure is used and how the data are partitioned, character incongruence and topo- 50.89 62.22 Tv All genes versus each other Character congruence ILD logical congruence either improve or worsen while jackknife support is highest for the analysis coding indels as missing data. One could argue that these results may be due to problems with implementing the same indel costs during alignment and cladistic analysis, but unfortunately this cannot be tested because the default

Strict Semistrictalignment parameters Strict Semistrict of Clustal use affine gap costs and

Shared nodes manual alignments do not utilize a fixed cost matrix. What is more important from the empirical point of view is that our result indicates that at least for some data sets and alignments treating gaps as missing data Corrected SymDif Metric 0.700.66 21 22 23 27 may be appropriate. In any case, we find that for our Clustal-aligned data set one can either honor the call for using indels as character states or one can optimize character incongruence and topological congruence. Character congruence ILD 11.85 14.54 Both goals are not simultaneously attainable and we would argue that then the choice of the indel-coding technique becomes a matter of value judgment. It appears to us that overall character incongruence and Jackknife support of MPT 5299.15 topological congruence are more important than the 5th state Protein versus ribosomal genes theoretical arguments in favor of coding indels as fifth 24 4540.876 3876.758 16.90 3702.3510 19.91 3496.28 3407.07 21.27 21.86 0.71 22.23 0.72 0.72 0.72 17 0.74 18 18 19 20 17 18 18 19 17 68.72 79.89 85.89 89.87 1&2 92.81 1&2 1&2 1&2 1&2 ¼ 810 5063.53 4750.7 17.05 17.76 0.69 0.74 18 18 41 18 78.75 75.12 Tv Tv 24 5155.34 6 4933.11 16.03 0.67 21 26 68.56 Tv ¼ ¼ ¼ ¼ ¼ character states; i.e., if one were to use an unmodified ¼ ¼ ¼ ¼ ¼ Tv Tv 1&2 1&2 1&2 1&2 1&2 EqualTv 4792.72Tv Tv 15.40 0.71 16 21 57.98 Equal 4804.13 8.57 0.55 24 37 Clustal alignment: Indel Clustal alignment, one should code the gaps as missing

Table 3 Continued values. S. N. Kutty et al. / Cladistics 23 (2007) 64–83 75

Fig. 1. Most parsimonious tree for all manual alignments and indel treatments under optimal analysis conditions (tv ¼ 4). Support values are indicated on branches as Jackknife support. The set of numbers toward the bottom of the squares represent the [transition, transversion, gap] costs used for the particular analysis. Black fields in the sensitivity plot signify agreement, white disagreement, and gray neither (polytomy). 76 S. N. Kutty et al. / Cladistics 23 (2007) 64–83

Fig. 2. Most parsimonious tree for direct optimization (in POY) under optimal analysis conditions (tv ¼ 4). Support values are indicated on branches as jackknife support. The set of numbers to the bottom of each of the squares represent the [transition, transversion, gap] costs used for the particular analysis. Black fields in the sensitivity plot signify agreement, white disagreement, and gray neither (polytomy). S. N. Kutty et al. / Cladistics 23 (2007) 64–83 77

Fig. 3. Most parsimonious tree under equal weighting using a manual alignment.

With regard to the second issue, the preference of Clustal is a viable alternative to manual alignment for numerical alignments over manual ones, we have to those users who would like to use a fixed alignment for disagree with Giribet and Wheeler (1999), because our their cladistic analysis (see also Laamanen et al., 2005). sensitivity analyses clearly reveal that the manual Our results thus lend some support to the widespread alignment is outperforming the Clustal alignment with practice of using manual alignments. Note also, that the regard to character incongruence, topological congru- results based on such alignments are similar to those ence, and branch support. We thus do not see any based on morphological character matrices in that they evidence that the ‘‘numerical’’ alignment produced by are perfectly repeatable at the cladistic analysis stage. 78 S. N. Kutty et al. / Cladistics 23 (2007) 64–83

For this reason alone, we find it difficult to see why a the equally weighted tree moves to the base of the technique that is acceptable for morphological data Scathophaginae on the preferred tree. It is in discussing should be rejected for DNA sequence data. Of course, this phenomenon that Giribet’s (2003) concept of node we agree with Wheeler (2003) that it would in principle stability proves its utility. Many nodes on our scatho- be preferable to use a technique that automates align- phagid tree are extremely stable (Fig. 2a, b) in that they ment and tree search. But this does not answer the are supported by most or all weighting matrices utilized question of what one should do if automatization in our sensitivity analyses and are furthermore insensit- sacrifices phylogenetic accuracy as measured by parti- ive to alignment techniques. Node stability sensu Giribet tion congruence. (2003) has been criticized by Grant and Kluge (2005) But is there an automated technique that performs as who argued that it is epistemologically unsound and well as manual alignments? We find that the results questioned that node stability and node support are based on our manual alignment and direct optimization fundamentally different concepts. However, as pointed are indeed very similar with regard to character incon- out by Giribet (2003), it is not uncommon to find nodes gruence, topological congruence and preferred tree. The that are very stable across many analysis conditions, but main difference lies in branch support where the manual have relatively low support as measured by jackknifing alignment greatly outperforms direct optimization. or bootstrapping. Our tree furnishes additional exam- Unfortunately, it remains unclear whether jackknife ples such as the Gimnomera + Scathophaga clade that is values across the different techniques can be directly recovered under almost all analysis conditions, although compared because the jackknifing of a fixed alignment the jackknife support is usually less than 50%. Overall, employs different sampling techniques than jackknifing we find that a plot of nodal support versus nodal during optimization alignment. We thus urge further stability across the entire tree (Fig. 5) reveals that the study in this area because more information on this issue two measures are only mediocre predictors of each will be important when comparing support values from other. Interestingly, node stability is higher for direct different studies. Could it be that the jackknife values optimization while we pointed out earlier that node from POY are generally lower than those from analyses support is higher for a fixed alignment. using fixed values? Only additional studies exploring different alignment techniques can resolve this issue. Systematic conclusions Fortunately, in the case of the Scathophagidae, the trees favored by POY and the manual alignment are topolo- We had earlier remarked that scathophagid system- gically identical with regard to ingroup relationships; atics is in a state of chaos at multiple levels. Some of this i.e., the choice of analysis technique does not influence chaos can be removed based on the results of our study. our systematic conclusions. The only difference concerns Commenting on the monophyly of the Muscoidea and the Muscidae, which are not monophyletic on the POY the position of Scathophagidae would require sampling trees, although muscid monophyly is well supported non-muscoid calyptrate and acalyptrate outgroups while based on morphological characters; i.e., the tree based our study only included muscoid taxa. However, on the manual alignments is here favored. This is similar regardless of which branch is used for rooting, certain to Meier and Wiegmann’s (2002) analysis of coelopid hypotheses are never supported. For example, a clade relationships (Diptera). Here, only the analysis using a consisting of Muscidae + Scathophagidae is non-mon- manual alignment of 16S rDNA recovered the monop- ophyletic under all possible roots while other proposals hyly of Coelopidae and the authors favored the trees such as a sister group relationship between Scathop- obtained from the manual alignments given the strength hagidae and the remaining Muscoidea and a clade of the morphological evidence for coelopid monophyly. composed of Anthomyiidae + Scathophagidae are viable options. The latter was favored in a preliminary Node support and node stability analysis in which we used the acalyptrate Curtonotum helvum as a non-muscoid outgroup. Unfortunately, the One remarkable phenomenon displayed by our data data for Curtonotum was too incomplete to allow for a set is that the tree topology supported by equal formal inclusion in this study, but we consider it likely weighting (Fig. 3) differs in several important regards that Anthomyiidae + Scathophagidae will form a from the topology of our favored tree while the latter is clade. very stable with regard to different transversion and More definite answers can be provided with regard to third position weights. For example, the sister group the monophyly of Scathophagidae and its two constitu- relationship between Cordilura and Nanna clades is not ent subfamilies. All three taxa are well-supported as supported on the equally weighted trees and the clade monophyletic. Equally well supported are a number of consisting of the genera Hydromyza, Spaziphora, Chae- important genera such as Cordilura, Nanna, Norellia, tosa, Pogonota, Trichopalpus, Acanthocnema and Okeni- Gimnomera, Hydromyza and Spaziphora. Some of the ella, which is a sister group to the Scathophaga clade on remaining genera are more problematic, but in most S. N. Kutty et al. / Cladistics 23 (2007) 64–83 79 cases the issues can be resolved by eliminating small or previously been treated as a separate genus (Bernasconi monotypic genera. For example, based on our data, et al., 2000a, 2001) and was thought to be closely related Pogonota (seven spp. in genus) is paraphyletic with to Scathophaga (Hackman, 1956). A third group within respect to Okeniella (three spp. in genus), which should Scathophaga consist of S. calida and S. litorea and probably be synonymized with the former. Chaetosa Ceratinostoma. The larvae of the latter all develop in (two spp.) is similarly paraphyletic with respect to decaying brown algae. Trichopalpus (five spp.; Hackman, 1956; Vockeroth, pers. comm; Bernasconi et al., 2000a) and the former Natural history evolution should probably be synonymized with the latter. How- ever, formal synonymization should await a study of all The Cyclorrhapha exploit a large diversity of habitats species. and breeding substrates with the ancestral larval breed- The relationships within the relatively speciose genus ing habit widely being thought to be saprophagy Cordilura are not well supported and relatively labile to (Ferrar, 1987). From here, other breeding habits such variation in analysis conditions, but its monophyly is as phytophagy, parasitism and predation are thought to supported as long as the monotypic Phrosia is synon- have evolved. However, our tree for Scathophagidae ymized with Cordilura. The position of Phrosia has been suggests a very different scenario for this family. The controversial at the subfamily level. Gorodkov (1986) ancestral feeding mode is phytophagy. The larvae of the had placed it in the Delininae, a position that had been closest relative the Anthomyiidae, breed in different questioned by other authors (Pu¨chel, pers. comm) who substrates including decaying organic matter, dung, considered it part of the Scathophaginae based on flowering plants, ferns and fungi (Ferrar, 1987). Deter- morphological similarities of the adults to Cordilura. mining the ground plan condition for this family is The latter position found some support in Bernasconi unfortunately not possible given that the phylogenetic et al. (2000a), but is now very well supported by our new relationships within Anthomyiidae are poorly under- data set (jackknife ¼ 100). Both Cordilura and Phrosia stood and we only include few species. It thus remains have phytophagous larvae living in plants growing in unclear whether the phytophagy in Scathophagidae is wet habitats with the former apparently being restricted homologous to the phytophagy known from many to Carex while Phrosia is known to feed on Liliaceae. species of Anthomyiidae. However, contrary to scenar- The sister group of Cordilura is a clade containing ios proposed in the literature, saprophagy in the form of Nanna, Cleigastra apicalis, Neorthacheta dissimilis and breeding in dung or other decaying organic matter is not Orthacheta cornuta. Again, there has been some taxo- the ancestral mode for Scathophagidae and has instead nomic disagreement, especially with regard to the evolved twice from phytophagy. Predation has evolved placement of Cleigastra apicalis in the scathophagid two times independently, once from phytophagy and subfamilies (Collin, 1958; Kloet and Hincks, 1976; once from saprophagy (Fig. 4). Gorodkov, 1986; Nelson, 1988). Bernasconi et al. The first shift to saprophagy involves the genus (2000a) favored a placement in the Scathophaginae Scathophaga. Most species are coprophagous while (Bernasconi et al., 2000a) and our analysis again others have shifted to other decaying organic matter. suggests the same. Norellia and Gimnomera are both Most notably, S. calida, S. litorea and Ceratinostoma monophyletic with high jackknife support. Based on the have become specialists for decaying brown algae stran- number of bristle rows on the tibia of the first leg, the ded on beaches, a breeding substrate that is popular with genus Norellia has been divided into two subgenera many cyclorrhaphous flies that are otherwise predomin- (Sack, 1937; Gorodkov, 1986; Sˇ ifner, 1995): Norellia, antly known from dung (Moeller, 1965, e.g., Sepsidae: Norelliosoma. The only species in the subgenus Norelli- Orygma; Sphaeroceridae: Thoracochaeta). The second soma (N. tipularia) is placed as the sister species of all shift to saprophagy from phytophagy involves Cleigastra Norellia so that this subdivision is consistent with our apicalis and can illustrate how saprophagy can evolve phylogenetic hypothesis. from phytophagy. The larvae of Cleigastra apicalis are The monophyly of Scathophaga has been the subject found in galls made by Lipara (Choropidae) or tunnels to some speculation. We find that monophyly can only made by lepidopterous larvae feeding on Rumex and be restored if the monotypic Ceratinostoma is synony- Phragmites. However, the Cleigastra larvae are not mized with Scathophaga while the proposed subgenus phytophagous or predacious as previously thought (see classification of Scathophaga is consistent with our tree Chandler and Stubbs, 1969). Instead they feed on the (Cuny, 1983; Vockeroth, pers. comm). The subgenus frass of caterpillars in these tunnels (Groth, 1969); i.e., Scathophaga is monophyletic and consists on our tree of the saprophagous Cleigastra larvae are still intimately S. analis, S. inquinata, S. lutaria, S. cineraria, S. suilla, associated with other phytophagous , although S. taeniopa, S. incola, S. furcata, S. tropicalis, and they are no longer directly feeding on plant tissue. S. stercoraria. Also monophyletic is the subgenus A few species of scathophagids are not only predatory Coniosternum (S. obscura, S. tinctinervis) which had as adults, but also have predatory larvae. Scathophaga 80 S. N. Kutty et al. / Cladistics 23 (2007) 64–83

Fig. 4. Tracing of natural history evolution in the Scathophagidae. One of the optimizations when phytophagy is coded separately as dicot and monocot (represented by symbols) is shown here. In this case, the Scathophagid ancestor was feeding on monocot plants. Taxa for which there is either species-specific information or at least statements about the genus have been included. obscura larvae feeds on the egg masses of caddis flies opalpus. This predatory clade has larvae living on lake (Berte´and Wallace, 1987) and according to our phylo- shores and in sewage where they feed on small inverte- genetic hypotheses evolved this predatory behavior from brates (Graham, 1939; Irwin, 1978). The larvae of the saprophagous ancestors. The same unusual substrate is sister group, Hydromyza, mines in leaves or tunnels in used by Acanthocnema larvae, which are found feeding submerged petioles of Nymphaceae (Nuphar and on egg masses of caddis flies in swift streams (Hilton, Nymphaea), which suggests that predation evolved from 1981; Suwa, 1986; Anderson, 1997). Given the inde- a phytophagous ancestor with aquatic or semiaquatic pendent evolutionary origin, it is not surprising that larvae. Nelson (1992) found that the mode of feeding and Most phytophagous species of Scathophagidae are mouth hook morphology differs significantly between only known from one host species, but these very narrow Acanthocnema and Scathophaga. Acanthocnema belongs host ranges could be due to the generally sparse to the second predatory clade in the Scathophagidae, information on the breeding habits of scathophagids. which also comprises Spaziphora, Chaetosa, and Trich- Unfortunately, the optimization of life-history characters S. N. Kutty et al. / Cladistics 23 (2007) 64–83 81

Fig. 5. Number of times a node is recovered out of the 10 different analysis conditions (nodal stability) under manual alignment (d) and direct optimization alignments (r) versus Jackknife support at the node. does not unambiguously resolve whether the scathopha- Conclusions gid ancestor was feeding on monocots or dicots and whether it deposited its eggs on or into the plant host. The Our study manages to clarify many important issues former is known from the Delininae, which feed on surrounding the systematics of Scathophagidae. It also Orchidaceae, Liliaceae and Commelinaceae while the provides a first insight into the relationships within the Scathophaginae insert their eggs into the plant material Muscoidea and is a start for more extensive work on and utilize a range of monocot and dicot hosts. Calyptratae involving the remaining families in this Depending on character coding, phytophagy itself only group. We also demonstrate that simplistic ideas about evolved once or several times. When phytophagy is a the evolutionary relationships between saprophagy, priori considered homologous and thus treated as a phytophagy, and predation will have to be revised. In single character state, it is found at the base of the Scathophagidae, phytophagy is an important launching scathophagid tree and has been lost three times. platform for becoming saprophagous and predacious. However, a more complex picture emerges when a We are confident that the ongoing work on the non-additive phytophagy presence ⁄absence character is Assembling the Tree of Life for Diptera will soon used and different character states are assigned to provide new data for testing long-standing hypotheses phytophagy on monocots and phytophagy on dicots. about natural history evolution in Diptera and other Fourteen equally parsimonious optimizations suggest a insects. wide range of scenarios ranging from three origins of phytophagy with one host shift between monocots and dicots to a single origin of phytophagy and three host Acknowledgments shifts. However, the relationship between host and the phytophagous scathophagid species is not as chaotic as This research was funded by the Academic Research this may suggest. Some genera are entirely restricted to Fund grant R-154-000-207-112 from the Ministry of particular families of plants; i.e., it appears that over Education in Singapore and the NSF-ATOL grant short time periods scathophagids rarely undertake EF-0334948. We thank Dr Thomas Pape for sending us dramatic host shifts. For example, Hydromyza are only specimens that were used as part of this study and we known from water lilies (Nymphaeaceae), Cordilura are particularly grateful to Gaurav Girish Vaidya and only known from Carex (Cyperaceae), Nanna only the National Grid Singapore (http://www.ngp.org.sg) known from Gramineae, and Gimnomera feeds on the for help and support with the computational work and seeds and ovules of Scrophulariaceae. An exception is all other members of the Evolutionary Biology Labor- Norellisoma, which appears to utilize a wide range of atory at the National University of Singapore. Lastly, hosts (Liliaceae, Rosaceae, Polygonaceae) and Norellia we would like to acknowledge the valuable comments (Amaryllidaceae: Narcissus L. and Leucojum L.). from two anonymous reviewers. 82 S. N. Kutty et al. / Cladistics 23 (2007) 64–83

References Groth, U., 1969. Zur Entwicklung and Biologie von Cnemopogon apicalis Wied (Diptera: Cordyluridae). Wiss. Z. Ernst-Moritz. Aagesen, L., 2005. Direct optimization, affine gap costs, and node Arndt-University Greifswald, Math. Nat. Reihe 18, 85–92. stability. Mol. Phylogenet Evol. 36, 641–653. Hackman, W., 1956. The Scatophagidae (Dipt.) of eastern Fenno- Aagesen, L., Petersen, G., Seberg, O., 2005. Sequence length variation, scandia. Tilgmann, Helsingsforsiae Fauna Fennica, pp. 1–67. indel costs, and congruence in sensitivity analysis. Cladistics, 21, Higgins, D.G., Sharp, P.M., 1988. CLUSTAL: a package for 15–30. performing multiple sequence alignment on a microcomputer. Anderson, H., 1997. Diptera Scathophagidae, Dung flies. In: Nilson, Gene, 73, 237–244. A. (Ed.), The Aquatic Insects of North Europe, Vol. 2 Apollo Hilton, H.E., 1981. Biology of Eggs. Pergamon Press, Oxford. Books Aps, Strenstrup, Denmark, pp. 401–410. Hosken, D.J., 1999. Sperm displacement in yellow dung flies: a role for Becklund, H.O., 1945. Wrack fauna of Sweden and Finland, ecology females. Trends Ecol. Evol., 14, 251–252. and chorology. Opusc. Entomol., 5 (Suppl.), 237. Irwin, A.G., 1978. Some micro-habitats. Mud. In: Stubbs, A., Bernasconi, M.V., Pawlowski, J., Valsangiacomo, C., Piffaretti, J.-C., Chandler, P. (Eds.), A Dipterist’s Handbook. Amat. Entomol. Ward, P.I., 2000a. Phylogeny of the Scathophagidae (Diptera, 15, 83–86. Calyptratae) based on mitochondrial DNA sequences. Mol. Ka¨llersjo¨, M., Albert, V.A., Farris, J.S., 1999. Homoplasy increases Phylogenet. Evol. 16, 308–315. phylogenetic structure. Cladistics 15, 91–93. Bernasconi, M.V., Valsangiacomo, C., Piffaretti, J.-C., Ward, P.I., Kloet, G.S., Hincks, W.D., 1976. A Check List of British Insects. Part 2000b. Phylogenetic relationships among Muscoidea (Diptera, 5: Diptera and Siphonaptera, 2nd edn. Handbooks for the Calyptratae) based on mitochondrial DNA sequences. Insect Identification of British Insects, 11(5). Royal Entomological Mol. Biol. 9, 67–74. Society, London, pp. 1–139. Bernasconi, M.V., Pawlowski, J., Valsangiacomo, C., Piffaretti, J.-C., Laamanen, T., Meier, R., Miller, M.A., Hille, Axel, Wiegmann, B.M., Ward, P.I., 2001. Phylogeny of the genus Scathophaga (Diptera: 2005. Phylogenetic analysis of Thimera (Sepsidae: Diptera): sensi- Scathophagidae) inferred from mitochondrial DNA sequences. tivity analysis, alignment, and indel treatment in a multigene study. Can. J. Zool. 79, 517–524. Cladistics 21, 258–271. Berte´, S.B., Wallace, I.D., 1987. Larvae of Coniosternum minuta Maddison, D.R., Maddison, W.P., 2000. MacClade 4: Interactive (Malloch) and C. obscura Falle´n (Dipt. Scathophagidae) feeding on Analysis of Phylogeny and Character Evolution, Version 4.01. eggs of Trichoptera in Canada and Ireland. Entomol. Month. Mag. Sinauer Associates, Sunderland, MA. 123, 181–184. McAlpine, J.F., Wood, D.M., 1989. Manual of Nearctic Diptera. Brock, T.C.M., van der Velde, G., 1983. An autecological study on Agriculture Canada, Research Branch, Ottawa. Hydromyza livens (Fabricius) (Diptera Scatomyzidae) a fly associ- Meier, R., Wiegmann, B.M., 2002. A phylogenetic analysis of ated with Nympheid vegetation dominated by Nuphar. Tijdschr. Coelopidae (Diptera) based on morphological and DNA sequence Entomol. 126, 59–90. data. Mol. Phylogenet. Evol. 25, 393–407. Chandler, P.J., Stubbs, A.E., 1969. A species of Norellia (Dipt., Michelsen, V., 1991. Revision of the aberrant New World genus Scatophagidae) new to Britain. Proc. Br. Entomol. Nat. Hist. Soc. Coenosopsia (Diptera: Anthomyiidae), with a discussion of antho- 120–124. myiid relationships. Syst. Entomol. 16, 85–104. Ciampolini, M., 1957. Reperti sulla Norellia spinipes Meig (Diptera, Minder, A.M., Hosken, D.J., Ward, P.I., 2005. Co-evolution of male Cordyluridae). Redia, 42, 259–272. and female reproductive characters across the Scathophagidae. Collin, J.E., 1958. A short synopsis of the British Scatophagidae J. Evol. Biol. 18, 60–69. (Diptera). Trans. Soc. Br. Entomol. 13, 37–56. Mitter, C., Farrell, B., Wiegmann, B., 1988. The phytophagy study of Cuny, R., 1983. Las Moscas Escatoı` fagas (Scathophagidae, Diptera) adaptive zones: Has phytophagy promoted insect diversification? de Latinoameı` rica. Rev. Peruana Entomol. 26, 9–19. Am. Naturalist 132, 107–128. ¨ Damgaard, J., Andersen, N.M., Meier, R., 2005. Combining molecular Moeller, J., 1965. Okologische Untersuchungen u¨ber die terrestrische ¨ and morphological analyses of water strider phylogeny (Hemip- Arthropodenfauna im Anwurf mariner Algen. Z. Morph. Okol. tera-Heteroptera, Gerromorpha): effects of alignment and taxon Tiere 55, 530–586. sampling. Syst. Entomol. 30, 289–309. Morrison, D.A., Ellis, J.T., 1997. Effects of nucleotide sequence De Laet, J., Wheeler, W.C., 2003. Poy, version 3.0.11. Published by alignment on phylogeny estimation: a case study of 18S rDNAs of Wheeler, Gladstein and De Laet, 6 May 2003. Command line Apicomplexa. Mol. Biol. Evol. 14, 428–441. documentation. Neff, S.N., 1968. Observation on the immature stages of Gimnomera Farris, J.S., Albert, V.A., Ka¨llersjo¨, M., Lipscomb, D., Kluge, A.G., cerea and G. insurata (Diptera: Anthomyiidae, Scathophaginae). 1996. Parsimony jackknifing outperforms neighbor-joining. Can. Entomol. 100, 74–83. Cladistics 12, 99–124. Nelson, M., 1988. Observations on the prestomal teeth of British Ferrar, P., 1987. A Guide to the Breeding Habits and Immature Stages dung-flies (Dipt., Scathophagidae). Entomol. Mon. Mag. 124, of Diptera Cyclorrhapha. Scandinavian Science Press, Leiden. 157–160. Giribet, G., 2003. Stability in phylogenetic formulations and its Nelson, M., 1990. Observation on the biology and status of dung flies relationship to nodal support. Syst. Biol. 52, 554–564. of the genus Parallelomma Becker (Dipt. Scathphagidae). Entomol. Giribet, G., Wheeler, W.C., 1999. On gaps. Mol. Phylogenet Evol. 13, Mon. Mag. 126, 187–189. 32–143. Nelson, M., 1992. Biology and early stages of the dung fly Acanthoc- Gorodkov, K.B., 1986. Scathophagidae. In: Sooı` s, A., Papp, L. (Eds.), nema glaucescens (Loew) (Dipt., Scathophagidae). Entomol. Mon. Catalogue of Palaearctic Diptera, Scathophagidae-Hypodermati- Mag, 128, 71–73. dae. Elsevier, Amsterdam, pp. 11–41. Nelson, M., 1995. Dung flies Diptera: Scathophagidae) in bird’s nests, Gotoh, O., 1982. An improved algorithm for matching biological with particular reference to Trichopalpus fraternus (Meige). Ento- sequences. J. Mol. Biol. 162, 705–708. mol. Gaz. 46, 285–287. Graham, J.F., 1939. The external features of the early stages of Nelson, M., 2000. The life history of immature stages of Scathophaga Spathiophora hydromyzina (Fall.) (Dipt., Cordyluridae). Proc. R. (Coniosternum) tinctinervis (Becker) (Dipt., Scathophagidae). Entomol. Soc. Lond. (B) 8, 157–162. Entomol. Mon. Mag, 136, 161–164. ˇ Grant, T., Kluge, A.G., 2005. Stability, sensitivity, science and Nelson, M., Sifner, F., 2000. A redescription of Norellisoma flavicorne heurism. Cladistics 21, 597–604. (Meigen, 1826) (Dipt., Scathophagidae), with an account of its S. N. Kutty et al. / Cladistics 23 (2007) 64–83 83

biology and notes on other members of the genus. Entomol. Mon. Vockeroth, J.R., 1956. Distribution patterns of the Scatomyzinae Mag, 136, 31–35. (Diptera, Muscidae). Proc. 10th Int. Cong. Entomol. 1, 619–626. Parker, G.A., 1970. Sperm competition and its evolutionary effect on Vockeroth, J.R., 1965. Scatophaginae. In: Stone, A., Sabrosky, C.W., copula duration in the fly Scatophaga stercoraria L. J. Insect Wirth, W.W., Foote, R.H., Coulson, J.R. (Eds.), A Catalog of the Physiol. 16, 1301–1328. Diptera of America North of Mexico. Agriculture Handbook. U.S. Penny, D., Hendy, M.D., 1985. The use of comparison metrics. Syst. Forest Service, Washington, DC, pp. iv, 1696. Zool. 34, 75–82. Vockeroth, J.R., 1989. Scathophagidae. In: McAlpine, J.F., Wood, Raatakainen, M., Vasarainen, A., 1972. Ecology and control of D.M. (Eds.), Manual of Nearctic Diptera, Vol. 2. Agriculture timothy grass flies (Amaurosoma spp, Diptera, Scathophagidae) Canada, Research Branch, Ottawa, pp. 1085–1097. and the effects of chemical control on the fauna of field stratum. Watermann, M.S., Smith, T.F., Beyer, W.A., 1976. Some biological Ann. Agric. Fenniae 11, 57–73. sequence metrics. Adv. Math. 20, 367–387. Raatakainen, M., Vasarainen, A., 1975. Damage caused by timothy Wheeler, W.C., 1993. The triangle inequality and character analysis. grass flies (Amaurosoma spp) in Finland. Biol. Res. Rep. Univ. Mol. Biol. Evol. 10, 707–712. Jyva¨skyla¨1, 3–8. Wheeler, W.C., 1995. Sequence alignment, parameter sensitivity, and Sack, P., 1937. 62a Cordyluridae. In: Lindner, E. (Ed.), Die Fliegen der the phylogenetic analysis of molecular data. Syst. Biol. 44, 321–331. Palaearktis-chen Region, pp. 1–103. Wheeler, W.C., 1996. Optimization alignment: the end of multiple Sˇ ifner, F., 1995. Species of the family Scathophagidae (Diptera) of the sequence alignment in phylogenetics? Cladistics 12, 1–9. Prague protected localities, of the Czech republic, and species of the Wheeler, W.C., 2003. Implied alignment: a synapomorphy-based genus Norellisoma Wohlgren of the palaearctic region. Bohemia multiple-sequence alignment method and its use in cladogram Centralis 24, 89–128. search. Cladistics 19, 261–268. Suwa, M., 1986. The genus Acanthocnema in Asia and Europe with Wheeler, W.C., Hayashi, C.Y., 1998. The phylogeny of the extant descriptions of three new species from Japan and Nepal (Diptera: chelicerate orders. Cladistics 14, 173–192. Scathophagidae). Insect Matsum 34, 1–33. Wheeler, W.C., Gladstein, D., De Laet, J., 2003. POY, Version 3.0.11. Swofford, D.L., 2002. *PAUP*: Phylogenetic Analysis Using Parsi- Program and documentation available at ftp.amnh.org ⁄ pub ⁄ mony (and Other Methods), V. 4.0b10. Sinauer Associates, Inc., molecular. American Museum of Natural History, New York. Sunderland, MA. Terry, M.D., Whiting, M.F., 2005. Comparison of two alignment techniques within a single complex data set: POY Versus Clustal. Cladistics 21, 272–281.