
ON THE EVOLUTION OF KANGAROOS AND THEIR KIN (FAMILY MACROPODIDAE) USING RETROTRANSPOSONS, NUCLEAR GENES AND WHOLE MITOCHONDRIAL GENOMES

William George Dodt

B.Sc. (Biochemistry), B.Sc. Hons (Molecular Biology)

Principal Supervisor: Dr Matthew J Phillips (EEBS, QUT)
Associate Supervisor: Dr Peter Prentis (EEBS, QUT)
External Supervisor: Dr Maria Nilsson-Janke (Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main)

Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy
Science and Engineering Faculty
Queensland University of Technology
2018

Keywords

Adaptive radiation, ancestral state reconstruction, Australasia, Bayesian inference, endogenous retrovirus, evolution, hybridization, incomplete lineage sorting, incongruence, introgression, Macropodidae, maximum likelihood, maximum parsimony, molecular dating, retrotransposon, speciation, systematics, transposable element


Abstract

The family Macropodidae contains the kangaroos, wallabies, and several closely related taxa that occupy a wide variety of habitats in Australia, New Guinea and surrounding islands. This group of marsupials is the most species-rich family within the marsupial order Diprotodontia. Despite significant investigation in previous studies, much of the evolutionary history of macropodids (including their origin within Diprotodontia) has remained unclear, in part due to an incomplete early fossil record. I have utilized several forms of molecular sequence data to shed light on the phylogeny and the timescale of kangaroo evolution. This was carried out at the family level (Macropodidae), by investigating the genera Macropus, Wallabia, Lagorchestes, Onychogalea, Setonix, Lagostrophus, Dorcopsis, Dorcopsulus, Thylogale, Dendrolagus and Petrogale, and at a finer taxonomic level, by investigating the relationships among the species of Wallabia and Macropus (which contain many of the most iconic species, including the largest living marsupials), and in turn, the relationships among the three sub-genera of Macropus: M. (Macropus), M. (Osphranter) and M. (Notamacropus). This study expands traditional molecular data matrices for mitochondrial genomes and nuclear genes, and I present the first retrotransposon-based phylogeny of kangaroos, utilizing a dataset of 29 phylogenetically informative retrotransposon markers that shed light on several contentious relationships among kangaroos. Maximum parsimony retrotransposon analysis shows that the enigmatic swamp wallaby (Wallabia bicolor) has a close relationship with the wallabies of the Macropus subgenus Notamacropus, thus necessitating taxonomic revision of the genera Macropus and Wallabia. The black-gloved wallaby (M. irma) groups with the wallabies of Macropus (Notamacropus), which conflicts with previous mitochondrial analyses that place it within Macropus (Osphranter). I find moderate support from retrotransposons for grouping the nail-tail wallabies as the sister group to Macropus/Wallabia, a finding that is not supported by any other molecular analysis but has been suggested based on morphology.


In addition, I have addressed an ascertainment bias that arises in retrotransposon studies that utilize only a single reference genome, and I present a new statistical framework that addresses this bias.

Furthermore, the detection of polymorphic retrotransposon insertions in M. eugenii suggests that there has been very recent activity of a particular retrotransposon family (KERV) in the kangaroo genome, while another retrotransposon (LINE1) appears to have become silenced, hinting at the possibility of competition between these elements in the genome. This phenomenon has been observed in only a small number of other taxa and has implications for understanding genome evolution in macropods.

Next, the taxonomic depth of phylogenetic inference was increased for a dataset that includes mitochondrial genomes and five nuclear genes, both through new sequencing and through previously unpublished data from within the lab group. This includes the first phylogenetically informative DNA sequences from the elusive black wallaroo (M. bernardus) and the first DNA sequence from the extinct toolache wallaby (M. greyi). This is the first molecular study of kangaroos to provide complete sampling, covering all living members (and one recently extinct member) of the genera Macropus and Wallabia. I analysed the mitochondrial and nuclear genes separately, as well as in a combined ‘supermatrix’ dataset consisting of 21,278 bp of concatenated sequence data. Notably, I find that M. bernardus is the deepest diverging of the wallaroos within M. (Osphranter), while the extinct toolache wallaby (M. greyi) groups within M. (Notamacropus) as sister to M. irma.

The evolutionary history of kangaroos was explored using molecular dating, a lineage through time analysis and ancestral state reconstructions of key phenotypic traits: habitat preference, mob size, and sexual dimorphism for size and colour. Ancestral reconstructions inferred that multiple transitions from closed/wet forest environments into more arid zones are coincident with the cooling/drying of Australia since the mid-Miocene climatic optimum, 15 – 16 Ma. I show that large mob size and sexual dimorphism tend to be more pronounced in species with ranges that have expanded into more arid grasslands, and that sexual dimorphism appears to be primarily male-driven.


Finally, looking deeper in the macropodid tree, I investigated the relationships and timing among the genera within the Macropodidae in order to shed light on the current six-way polytomy among Macropus/Wallabia, Lagorchestes, Onychogalea, Setonix, Dorcopsis/Dorcopsulus and Dendrolagus/Petrogale/Thylogale. I sequenced and analysed six novel nuclear genes and combined these with five nuclear genes from a previous study and also with complete and partial mitochondrial genomes. I recovered a strong affinity between Lagorchestes and Macropus/Wallabia, with Setonix and Onychogalea sitting consecutively further out. Sister to this clade was a clade containing the Dendrolagini (Dendrolagus and Petrogale) and Thylogale, with the Dorcopsini (Dorcopsis and Dorcopsulus) as the deepest diverging lineage in this clade. The timing of the major divergences appears to have taken place after the mid-Miocene climatic optimum as the climate continued to become cooler and more arid.


Table of contents

Keywords ...... ii

Abstract...... iii

Table of contents ...... vi

List of Figures ...... x

List of Tables ...... xiii

List of abbreviations ...... xv

Statement of original authorship ...... xvi

Acknowledgments ...... xvii

CHAPTER 1: INTRODUCTION...... 1

1.1 Purpose ...... 1

1.2 Background and Context - Literature Review ...... 2

1.2.1 Marsupials ...... 2
1.2.2 Kangaroos and their kin ...... 6
1.2.3 Phylogenetic Reconstruction Using Molecular Sequence Data ...... 13
1.2.4 Molecular Phylogenies Using Transposable Elements ...... 17
1.2.5 Adaptive Radiations ...... 28
1.2.6 Reconstructing Evolutionary History Through Molecular Dating ...... 30
1.2.7 Ancestral State Reconstruction ...... 32

1.3 Objectives and thesis outline ...... 32

1.3.1 Chapter 1. Introduction ...... 32
1.3.2 Chapter 2. Examining the phylogeny of the genus Macropus and Wallabia using retrotransposon insertions ...... 32
1.3.3 Chapter 3. Examining the phylogeny and timing of Macropus and Wallabia, utilizing nuclear genes and mitochondrial genomes ...... 33
1.3.4 Chapter 4. The evolutionary history of kangaroos (Macropus and Wallabia) ...... 34
1.3.5 Chapter 5. Phylogeny and timing of deep relationships among the Macropodidae ...... 34
1.3.6 Chapter 6. Conclusion ...... 34
1.3.7 Supplementary objective 1 ...... 35
1.3.8 Supplementary objective 2 ...... 35

1.4 Significance of research ...... 36


CHAPTER 2: RETROTRANSPOSON PHYLOGENY, UPDATED STATISTICS AND ACTIVITY OF TE LINEAGES IN THE GENUS MACROPUS ...... 37

2.1 Abstract ...... 38

2.2 Introduction ...... 38

2.3 Materials and Methods ...... 42

2.3.1 Taxon Sampling and species verification ...... 42
2.3.2 Extraction of genomic regions containing retrotransposons ...... 43
2.3.3 Primer design ...... 43
2.3.4 Experimental verification ...... 47
2.3.5 Scoring of presence and absence of insertions across the phylogeny ...... 47
2.3.6 Parsimony reconstruction with retrotransposon markers ...... 50
2.3.7 Calculation of retrotransposon phylogenetic support values ...... 50
2.3.8 Derivation of arguments for overcoming the single reference genome ascertainment…54
2.3.9 Conservatism of the insertion ratio test for H2-Introgression ...... 59
2.3.10 Phylogenetic analysis of KERV sub-families ...... 61
2.3.11 Recent integrations of KERV in the genome ...... 61
2.3.12 Investigation of LINE1 activity in the Macropus genome ...... 62

2.4 Results ...... 63

2.4.1 Activity of an endogenous retrovirus during the evolution of Macropus ...... 63
2.4.2 Wallabia bicolor is nested within the paraphyletic genus Macropus ...... 68
2.4.3 Macropus irma groups with the M. (Notamacropus) wallabies ...... 70
2.4.4 Deeper Macropodine phylogeny ...... 71
2.4.5 Macropus sub-genera ...... 72
2.4.6 Recent integrations of KERV in the tammar wallaby genome ...... 72
2.4.7 TE lineage activity in Macropus ...... 75

2.5 Discussion ...... 75

2.5.1 Conclusion ...... 81

CHAPTER 3: SUPERMATRIX PHYLOGENY AND MOLECULAR DATING OF THE GENUS MACROPUS ...... 82

3.1 Introduction ...... 82

3.1.1 Phylogenetic position of the Macropus sub-genera ...... 84
3.1.2 Phylogenetic position of the extinct toolache wallaby ...... 84
3.1.3 Phylogenetic position of the black-striped wallaby ...... 85
3.1.4 Phylogenetic position of the black wallaroo ...... 85
3.1.5 Phylogenetic position of the swamp wallaby and black gloved wallaby ...... 86
3.1.6 Supermatrix approach ...... 87


3.2 Materials and Methods ...... 89

3.2.1 DNA preparation ...... 89
3.2.2 Phylogenetic reconstruction ...... 89
3.2.3 Molecular Dating ...... 93

3.3 Results ...... 94

3.3.1 Macropus outgroups and Macropus sub-genera ...... 94
3.3.2 Wallabia bicolor ...... 95
3.3.3 Macropus irma ...... 96
3.3.4 Macropus dorsalis ...... 96
3.3.5 Macropus bernardus ...... 96
3.3.6 Macropus greyi ...... 97
3.3.7 Molecular Dating and lineage through time analysis ...... 97

3.4 Discussion ...... 106

3.4.1 Phylogeny of kangaroos ...... 106
3.4.2 Molecular Dating and coincidence with climatic change ...... 112
3.4.3 Conclusion ...... 115

CHAPTER 4: RECONSTRUCTING THE ANCESTRAL EVOLUTIONARY HISTORY OF MACROPUS ...... 116

4.1 Introduction ...... 116

4.2 Materials and Methods ...... 122

4.2.1 Molecular dating and ancestral habitat reconstruction in MrBayes ...... 122
4.2.2 Ancestral State Reconstructions in Bayestraits ...... 124

4.3 Results ...... 126

4.3.1 Adaptive Radiation of Kangaroos ...... 126
4.3.2 Ancestral size sexual dimorphism of kangaroos ...... 130

4.4 Discussion ...... 136

CHAPTER 5: DEEP PHYLOGENY AND DATING OF THE MACROPODIDAE USING RETROTRANSPOSONS, MITOCHONDRIAL AND NUCLEAR GENES ...... 143

5.1 Introduction ...... 143

5.2 Materials and Methods ...... 149

5.2.1 Phylogenetic reconstruction of concatenated dataset in Mr Bayes ...... 152
5.2.2 Molecular Dating in BEAST ...... 153
5.2.3 Molecular Dating in BEAST (Species Tree) ...... 154

5.3 Results ...... 155

5.3.1 Deep macropodid phylogeny ...... 155
5.3.2 Molecular Dating ...... 158

5.4 Discussion ...... 161

5.4.1 Phylogeny and evolutionary timescale of the Macropodidae ...... 161
5.4.2 Species trees vs concatenated trees ...... 164

5.5 Conclusion ...... 165

CHAPTER 6: DISCUSSION AND CONCLUSIONS ...... 167

BIBLIOGRAPHY ...... 173

APPENDICES ...... 203

Retrotransposon nexus matrix ...... 203


List of Figures

Figure 1. Time calibrated phylogeny of the seven extant marsupial orders based on 101 mitochondrial genomes and sequence from 26 nuclear loci ...... 3

Figure 2. Phylogenetic relationships of the seven marsupial orders based on 33 retrotransposon markers that have been plotted on a Bayesian tree inferred from sequence data ...... 4

Figure 3. Retrotransposon insertions favouring each of the three possible topologies among three Australidelphian marsupial clades...... 5

Figure 4. Relationships among the major marsupial clades, represented as a consensus network based on 20 Bayesian inference gene trees ...... 6

Figure 5. Time-calibrated phylogeny of kangaroos based on five nuclear genes...... 10

Figure 6. Maximum likelihood phylogenies of kangaroos illustrating the discordance between (A) mitochondrial and (B) nuclear concatenated datasets ...... 11

Figure 7. Illustration of Incomplete Lineage Sorting. Taken from (Avise, 2000)...... 16

Figure 8. Structure of the major types of transposable elements taken from (Goodier and Kazazian Jr, 2008) ...... 18

Figure 9. A summary of the process of retrotransposition, taken from (De Parseval and Heidmann, 2005)...... 19

Figure 10. Example of a DNA alignment of retrotransposons in various bird species...... 26

Figure 11. Retrotransposon ascertainment bias...... 41

Figure 12. Agarose gel electrophoresis photograph of a typical PCR screening for presence/absence of transposable elements, in various macropod species...... 48

Figure 13. ILS symmetry argument...... 55

Figure 14. Insertion ratio argument...... 57

Figure 15. Conservatism of the insertion ratio test...... 60


Figure 16. Kangaroo and wallaby maximum parsimony phylogeny inferred from retrotransposon data...... 64

Figure 17. Bayesian inference phylogeny, for the different sub-families of the endogenous retrovirus, MERV...... 73

Figure 18. Bayesian Phylogeny (MrBayes) of kangaroos based on the mitochondrial genome dataset with RY coding of third codon positions of coding regions ...... 98

Figure 19. Bayesian phylogeny (MrBayes) of kangaroos based on the five-nuclear gene dataset...... 99

Figure 20. Bayesian tree (MrBayes) of kangaroos, based on the combined ‘supermatrix’ dataset (mitochondrial genomes and five nuclear genes)...... 100

Figure 21. Maximum Likelihood Phylogeny (RAxML) of kangaroos based on the mitochondrial genome dataset ...... 101

Figure 22. Maximum Likelihood Phylogeny (RAxML) of kangaroos based on the five-nuclear gene dataset...... 102

Figure 23. Maximum Likelihood Phylogeny (RAxML) of kangaroos based on the combined ‘supermatrix’ dataset (mitochondrial genomes and five nuclear genes)...... 103

Figure 24. Time-Calibrated Bayesian phylogeny (BEAST v1.8.1) of kangaroos based on the ‘supermatrix’ dataset (mitochondrial genomes and five nuclear genes) ...... 104

Figure 25. Lineage Through Time Plot of Macropus/Wallabia based on the time calibrated BEAST phylogeny, showing the rate of diversification of kangaroo lineages through time ...... 105

Figure 26. An evolutionary History of Browsing and Grazing, taken from (Janis, 2008) ...... 113

Figure 27. Summary of the palaeoclimate for the Australian continent since 65 Ma. Figure taken from Byrne et al. (2011) ...... 118

Figure 28. Sexual dimorphism in kangaroos for coat colour and body mass in two members of the sub-genus M. (Osphranter) ...... 121

Figure 29. Dated Bayesian phylogeny (MrBayes) of the Macropodidae ...... 127

Figure 30. Maximum Likelihood ancestral state reconstruction performed (Bayestraits v2) for habitat of kangaroos plotted on a time calibrated Bayesian phylogeny (BEAST) ...... 128

Figure 31. Bar graph illustrating ancestral state reconstruction of macropod body mass ...... 131

Figure 32. Bar graph illustrating ancestral state reconstruction of macropod body mass ratio (values are natural log of the mass in grams)...... 133

Figure 33. Maximum Likelihood ancestral state reconstruction (Bayestraits) for kangaroo mobsize plotted on a time calibrated Bayesian phylogeny (BEAST)...... 134

Figure 34. Maximum Likelihood ancestral state reconstruction (Bayestraits) for coat colour sexual dimorphism of kangaroos ...... 135

Figure 35. Bayesian Phylogeny (Mr Bayes) of the concatenated nuclear and mtDNA dataset for macropods...... 156

Figure 36. Bayesian reconstruction (BEAST) of the concatenated dataset for macropods ...... 157

Figure 37. Bayesian inference species tree reconstructed in BEAST for macropods ...... 159

Figure S1. Time-Calibrated Bayesian phylogeny (BEAST v1.8.1) of kangaroos based on the ‘supermatrix’ dataset ...... 211

Figure S2. Super Network generated in SplitsTree4 (Huson and Bryant, 2006) showing the conflict observed between Figures 18 – 23...... 212

Figure S3. Super Network generated in SplitsTree4 (Huson and Bryant, 2006) showing the lack of conflict observed between Figures 35 and 36...... 213


List of Tables

Table 1. Previous molecular dating estimates for the major divergences among ...... 31

Table 2. Taxon sampling for the 16 macropod species employed in the retrotransposon study ...... 42

Table 3. Primer list for phylogenetically informative markers, as well as for Macropus eugenii specific markers used in the heterozygous test...... 45

Table 4. Presence/absence of phylogenetically informative ERVs...... 51

Table 5. Cumulative P-values for testing prior tree hypothesis T1 on retrotransposon counts (Kuritzin et al., 2016) amended from (Waddell et al., 2001)...... 53

Table 6. Retrotransposons and target site duplications (TSDs) ...... 65

Table 7. Trifurcation results for each of the major nodes investigated in this study ...... 67

Table 8. Heterozygous test for retrotransposon insertions specific to Macropus eugenii, across multiple individuals ...... 74

Table 9. Partitioning scheme according to Partitionfinder and Phillips et al. (2013) ...... 91

Table 10. Partitioning scheme and models of evolution used based on Partitionfinder and Phillips et al. (2013). See Table 11 for expanded partition names ...... 91

Table 11. Composition of formal kangaroo clades...... 92

Table 12. Coding for ancestral state reconstruction in BayesTraits v2...... 125

Table 13. Ancestral state reconstructions for body mass and body mass ratio ...... 132

Table 14. Orders of mammals exhibiting sexual dimorphism in size...... 139


Table 15. Standard deviation of adult body mass between males and females ...... 141

Table 16. Primer List for the six newly sequenced nuclear genes ...... 150

Table 17. Six newly sequenced nuclear genes ...... 151

Table 18. Partitioning scheme and models of molecular evolution for the concatenated dataset consisting of four mitochondrial genes and 11 nuclear genes...... 152

Table S1. Provenance of macropod DNA samples (Chapter 2)...... 204

Table S2. Source data for sequences obtained from Genbank for the supermatrix analysis ...... 205

Table S3. Coding for ancestral habitat reconstruction ...... 209

Table S4. Coding for ancestral habitat reconstruction ...... 210


List of abbreviations

ASR Ancestral State Reconstruction

BP Bootstrap support

BPP Bayesian Posterior Probability

ERV Endogenous Retrovirus

ILS Incomplete Lineage Sorting

KERV Kangaroo Endogenous Retrovirus

LINE Long Interspersed Nuclear Element

LTR Long Terminal Repeat

mt Mitochondrial

Non-LTR Non-Long Terminal Repeat

NT Nucleotide(s)

nuc Nuclear

SINE Short Interspersed Nuclear Element

TE Transposable Element


Statement of original authorship

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signature: QUT Verified Signature

Date: February 2018

Acknowledgments

I need to thank my parents and family, without whom none of this would be possible (Bill, Stella and Mina). Only they know how much they have helped me through this challenging endeavour. I would like to acknowledge and thank my primary supervisor, Dr Matthew Phillips, without whom this PhD would not be possible, for having the patience to help me through this journey. I need to thank my external supervisor, Dr Maria Nilsson-Janke, who provided invaluable help and support during my time in Germany and long after I returned to Australia. My co-supervisor, Dr Peter Prentis, provided valuable laboratory and experimental advice at QUT. I would like to acknowledge Dr Mike Bunce and Dalal Haouchar for kindly providing DNA samples for several macropods as well as sequence data for extinct species. I need to thank Manuela Cascini for valuable assistance in laboratory work/data collection and Hannah Maloney for collecting the black wallaroo sample. Finally, I would like to thank my colleagues in Australia, Germany and elsewhere who helped to keep me sane throughout this incredible experience.

Chapter 1: Introduction

This chapter provides the overall purpose of my research (Section 1.1), as well as the background and context in the form of a literature review (Section 1.2), the overarching objectives and thesis outline (Section 1.3), and the significance of the research (Section 1.4).

1.1 PURPOSE

In a broad sense, the purpose of my research has been to shed light on the evolutionary history of kangaroos (and related taxa) from within the family Macropodidae. This group of mammals provides an opportunity to study the processes of evolution in a unique group of marsupials that have been evolving in isolation since the separation of Australia from Antarctica during the Eocene, about 40 million years ago (Ma) (Wilson and Reeder, 2005). Despite considerable investigation, many of the phylogenetic relationships and the timing of key events in kangaroo evolution have remained uncertain. Therefore, I have endeavoured to approach this problem from multiple angles, using a range of markers (mitochondrial genomes, nuclear genes and retrotransposons). Retrotransposons are repetitive genetic elements that have been shown to be powerful, near homoplasy-free markers of shared ancestry. I have performed the first retrotransposon analysis of kangaroos to shed light on their phylogenetic relationships and have also addressed an ascertainment bias that arises in retrotransposon studies that utilize only a single reference genome. I present a statistical framework that will be valuable to the majority of retrotransposon analyses for the foreseeable future.
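To illustrate how presence/absence scoring of retrotransposon insertions can translate into evidence of shared ancestry, the following minimal Python sketch identifies markers that are present in at least two taxa and confirmed absent in at least one other. The taxon labels, marker names and scores are hypothetical placeholders rather than data from this thesis, and the two functions are illustrative assumptions, not the analysis pipeline used here.

```python
from collections import Counter

# Hypothetical presence/absence matrix: "1" = insertion present,
# "0" = insertion absent, "?" = not scored. All names are made up.
example_matrix = {
    "marker_1": {"taxon_A": "1", "taxon_B": "1", "taxon_C": "0", "taxon_D": "0"},
    "marker_2": {"taxon_A": "1", "taxon_B": "1", "taxon_C": "?", "taxon_D": "0"},
    "marker_3": {"taxon_A": "1", "taxon_B": "0", "taxon_C": "0", "taxon_D": "0"},
    "marker_4": {"taxon_A": "1", "taxon_B": "1", "taxon_C": "1", "taxon_D": "1"},
}


def informative_markers(matrix):
    """Return {marker: frozenset of taxa sharing the insertion} for markers
    present in at least two taxa and confirmed absent in at least one."""
    groups = {}
    for marker, scores in matrix.items():
        present = frozenset(t for t, s in scores.items() if s == "1")
        has_absence = any(s == "0" for s in scores.values())
        if len(present) >= 2 and has_absence:
            groups[marker] = present
    return groups


def support_per_group(groups):
    """Count how many informative markers support each group of taxa."""
    return Counter(groups.values())


if __name__ == "__main__":
    groups = informative_markers(example_matrix)
    for taxa, n in support_per_group(groups).items():
        print(sorted(taxa), "supported by", n, "marker(s)")
```

In this toy matrix, marker_3 is excluded because the insertion is found in a single taxon, and marker_4 is excluded because it is present in every taxon; neither pattern distinguishes between alternative groupings.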

In addition, I have examined the rate of diversification among kangaroos and the evolution of key phenotypic traits through time, to investigate the evolutionary history of these iconic Australian marsupials.


Finally, I have explored the activity of two retrotransposons in the kangaroo genome (KERV and LINE1), suggesting that competition of retrotransposons may have occurred in the kangaroo genome.

1.2 BACKGROUND AND CONTEXT - LITERATURE REVIEW

1.2.1 Marsupials

Marsupials evolved in the Northern Hemisphere after diverging from stem eutherians at least 130 Ma (Luo et al., 2003) and possibly as early as ~160 Ma (Luo et al., 2011, Graves and Renfree, 2013). After a migration to South America approximately 70 – 65 Ma (Phillips et al., 2009), they arrived in Australia at least 55 Ma and rapidly diversified (Beck et al., 2008), as evidenced by specimens found at the Murgon fossil site of south-east Queensland (Godthelp et al., 1999). There are seven extant orders within the class Marsupialia. These include the Australian Dasyuromorphia (numbat, marsupial cats/mice), Diprotodontia (koala, wombats, possums, kangaroos), Peramelemorphia (bandicoots & bilbies) and Notoryctemorphia (marsupial moles), as well as their monotypic South American sister taxon Microbiotheria (Monito del Monte), which together comprise the superorder Australidelphia. The remaining two orders, Didelphimorphia and Paucituberculata, make up the potentially paraphyletic superorder Ameridelphia (Aplin and Archer, 1987, Marshall et al., 1990, Szalay, 2006, MacKenna et al., 1997). At present, there remain some conflicts in the relationships among the seven orders, with traditional sequence-based phylogenetic reconstruction favouring a sister relationship between the Paucituberculata and all other orders of marsupials (Figure 1), while retrotransposons suggest that Didelphimorphia are the sister group to all other orders (Gallus et al., 2015b), see Figure 2.


Figure 1. Time calibrated phylogeny of the seven extant marsupial orders based on 101 mitochondrial genomes and sequence from 26 nuclear loci. Families are noted at the tips of the tree, while orders and higher level taxa are noted on the internal branches. Figure taken from Mitchell et al. (2014).


Figure 2. Phylogenetic relationships of the seven marsupial orders based on 33 retrotransposon markers that have been plotted on a Bayesian tree inferred from sequence data (from 28 nuclear gene fragments). Retrotransposon support is represented by orange circles. Grey circles indicate retrotransposon support taken from Nilsson et al. (2010). Figure from Gallus et al. (2015b).

Relationships at the ordinal level among the Australian marsupials have also been uncertain. Previous phylogenetic reconstructions have favoured all three possible topologies among the Notoryctemorphia, Peramelemorphia and Dasyuromorphia (Gallus et al., 2015b), see Figure 3, with similar support values for all three groupings. This suggests that more data are required for resolution, or that this particular trifurcation may indeed be unresolvable and best represented as a network (Gallus et al., 2015b), see Figure 4.
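One simple way to ask whether three competing topologies are equally supported is a chi-square goodness-of-fit test against equal expected counts. The sketch below is an illustration under stated assumptions only: the marker counts are hypothetical, the chi-square approximation is poor when counts are small, and it is not the statistical framework of Kuritzin et al. (2016) or Waddell et al. (2001) that is applied to retrotransposon counts in this thesis.

```python
import math


def equal_support_test(counts):
    """Chi-square goodness-of-fit test of equal support for three topologies.

    counts: number of markers supporting each of the three topologies.
    Returns (chi-square statistic, p-value with 2 degrees of freedom).
    """
    if len(counts) != 3:
        raise ValueError("expected counts for exactly three topologies")
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one marker is required")
    expected = total / 3
    stat = sum((c - expected) ** 2 / expected for c in counts)
    # For 2 degrees of freedom the chi-square survival function
    # has the closed form exp(-x / 2), so no external library is needed.
    p_value = math.exp(-stat / 2)
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical marker counts, not taken from any published study.
    for counts in ([4, 4, 4], [12, 0, 0]):
        stat, p = equal_support_test(counts)
        print(counts, "chi-square =", round(stat, 2), "p =", p)
```

A large p-value is compatible with an unresolved trifurcation, whereas a small p-value indicates that one topology is favoured; with the small marker counts typical of such studies, an exact test would be a safer choice than this approximation.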

Figure 3. Retrotransposon insertions favouring each of the three possible topologies among three Australidelphian marsupial clades. Orange circles represent phylogenetically informative retrotransposon insertions supporting each topology. Figure taken from Gallus et al. (2015b).


Figure 4. Relationships among the major marsupial clades, represented as a consensus network based on 20 Bayesian inference gene trees, to illustrate the conflict observed between the major marsupial orders. The red box represents the trifurcation between the Dasyuromorphia, Notoryctemorphia and Peramelemorphia. Figure taken from (Gallus et al., 2015b).

1.2.2 Kangaroos and their kin

The kangaroo sub-order Macropodiformes includes ~70 of the ~125 species within the marsupial order Diprotodontia. Macropodiformes is one of the most extensively studied marsupial groups from a molecular phylogenetic perspective; however, many of the relationships within this clade remain uncertain. This enigmatic group has been studied using a number of methods including DNA hybridization (Kirsch et al., 1995), microcomplement fixation (Baverstock et al., 1989), morphology (Tate et al., 1948, Dawson and Flannery, 1985, Prideaux and Warburton, 2010) and DNA sequence analysis (Westerman et al., 2002, Meredith et al., 2008, Phillips et al., 2013). More recently, the entire nuclear genome of the species Macropus eugenii was sequenced at 2-fold coverage using a combination of Sanger sequencing and Next Generation sequencing approaches (Renfree et al., 2011). This represents a significant milestone in the study of macropodid systematics. However, despite extensive research, many of the relationships among kangaroos remain poorly resolved and different studies have drawn conflicting conclusions at a number of taxonomic levels (Cardillo et al., 2004, Meredith et al., 2008b, Prideaux and Warburton, 2010, Phillips et al., 2013).

Macropodiformes have long been classified into three distinct lineages: Hypsiprymnodontidae, which is represented by a single extant species, the omnivorous Hypsiprymnodon moschatus (musky rat kangaroo); Potoroidae, which consists of the mixed fungivorous/folivorous/omnivorous potoroos and bettongs; and Macropodidae, which is the most species rich group and consists of the larger grazing/browsing kangaroos, wallaroos, wallabies and tree kangaroos (Wilson and Reeder, 2005). The earliest macropodiformes known from the fossil record appear in the Etadunna formation in South Australia, and date to about 26 Ma (Prideaux and Warburton, 2010). It remains unclear whether these are more closely related to macropodines or potoroines. It appears that some specimens are closer to Hypsiprymnodontids, while others group with the Macropodoidea (potoroids and macropodids), although exact placements are controversial (Kear et al., 2007). It is likely that at least some are more closely associated with . Further fossil record evidence has revealed that each of these groups became more diverse during the Middle to Late Miocene (Prideaux and Warburton, 2010).


Early morphological and ecological studies produced conflicting phylogenies for the placement of Hypsiprymnodon. Some suggested a basal position for Hypsiprymnodontidae relative to all other Macropodidae (Raven and Gregory, 1946), while others favoured a closer association of Hypsiprymnodontids with potoroids, to the exclusion of the Macropodidae (Tate et al., 1948, Flannery, 1984). More recent molecular work has confirmed that the musky rat kangaroo is basal to all other macropodiformes (Burk et al., 1998, Burk and Springer, 2000, Meredith et al., 2008b).

Furthermore, inter-generic relationships within the Macropodidae remain poorly resolved. This family contains 11 extant genera made up of over 60 species of kangaroos, wallaroos and wallabies. Within this group, the placement of the genus Lagostrophus relative to the extinct short-faced kangaroo (Sthenurinae) clade has been a controversial issue (Westerman et al., 2002). Early work, such as Flannery (1983), suggested that Lagostrophus was the only surviving member of Sthenurinae, based on morphology/fossil evidence; however, Prideaux (2004) concluded that Lagostrophus was not part of the sthenurine group. More recently, a whole mitochondrial genome study showed that Lagostrophus is sister to all other Macropodidae, but statistical analysis could not rule out a sister relationship to Potoroidae, and the study noted that the position of Lagostrophus relative to the sthenurine clade would remain unresolved until adequate ancient DNA sequence data could be obtained (Nilsson, 2006). More recently still, ancient DNA analyses that include ~1 kb of mtDNA have suggested that Lagostrophus belongs to a distinct lineage separate from the sthenurines; however, statistical support was relatively low, leading to relatively poor resolution (Llamas et al., 2015). Additional studies among extant species only have increased support for the placement of Lagostrophus as sister to all other macropodids, and support for this placement appears to be robust (Meredith et al., 2008b, Burk and Springer, 2000, Westerman et al., 2002). However, the phylogenetic relationships among the remaining genera remain uncertain (Meredith et al., 2008b). At present, there is essentially a six-way polytomy of the major branches among Macropodidae: Onychogalea, Macropus/Wallabia, Lagorchestes, Setonix, Dorcopsini (the New Guinean Dorcopsis and Dorcopsulus) and Thylogale/Petrogale/Dendrolagus. These relationships require further investigation by examining additional genetic loci (additional nuclear and mitochondrial genes) and/or utilizing retrotransposon insertions as a means to resolve this phylogeny.

Additional uncertainty is present within the genus Macropus, which consists of three sub-genera: M. (Macropus), the grazing eastern grey and western grey kangaroos; M. (Osphranter), the grazing kangaroos and wallaroos; and M. (Notamacropus), which includes grazing/browsing wallabies (Phillips et al., 2013). Each of these subgenera is currently taxonomically contentious. I have utilized the traditional status of subgenus in concordance with most other recent molecular and palaeontological phylogenetic studies, such as May-Collado et al. (2015) and Butler et al. (2016). However, more recent studies have favoured elevating these subgenera to the level of genus, e.g. (Zachos, 2015, Vernes, 2016, Eldridge et al., 2017). Based on a study using five nuclear genes, Meredith et al. (2008b) suggested that the monotypic swamp wallaby (Wallabia bicolor) should be nested within a paraphyletic genus Macropus, possibly as a sister group to M. (Notamacropus) (see Figure 5). However, Phillips et al. (2013) found this conclusion to be incongruent with mitochondrial DNA and showed that this could also be due to incomplete lineage sorting among the nuclear genes (Figure 6). Another phylogenetic conflict within Macropus is the placement of Macropus irma. An investigation by Meredith et al. (2008b) using nuclear genes placed M. irma as sister to other members of M. (Notamacropus) (see Figure 5); however, Phillips et al. (2013) showed that mitochondrial analyses place M. irma with M. (Osphranter) (Figure 6) and postulated that this was most likely due to mitochondrial introgression from a wallaroo. Interestingly, this would be the deepest introgression event identified in marsupials (Phillips et al., 2013) and appears to originate from a position in the tree near where the enigmatic black wallaroo (Macropus bernardus) is expected to have diverged. Inclusion of the black wallaroo mitochondrial genome sequence will help to clarify this expectation.

Figure 5. Time-calibrated phylogeny of kangaroos based on five nuclear genes. Timeline in millions of years before the present for kangaroo diversification. Grey bars denote 95% credibility intervals; fossil constrained nodes are indicated with open circles. Plio. = Pliocene. Figure modified from (Meredith et al., 2008b).

Figure 6. Maximum likelihood phylogenies of kangaroos illustrating the discordance between (A) mitochondrial and (B) nuclear concatenated datasets, with RAxML bootstrap support values above branches and MrBayes Bayesian posterior probabilities below branches. Asterisks indicate full support. Clades including members of Macropus are shaded. Figure taken from (Phillips et al., 2013).

The black wallaroo is perhaps the least well studied among all kangaroos. Very few studies have investigated this species due to its elusive behaviour, isolated range within Arnhem Land and habitat preference for rugged terrain (Telfer, 2008). To date, the only molecular phylogenetic analysis that has included the black wallaroo utilized only a short sequence of control region DNA from the mitochondrion, which proved phylogenetically uninformative regarding its placement with the other wallaroos, Macropus robustus and Macropus antilopinus (Eldridge et al., 2014).

Similarly, the phylogenetic position of the now extinct toolache wallaby, thought to be close to the black-gloved wallaby (Cardillo et al., 2004), has not been confirmed, because no molecular phylogenetic analysis has yet been performed. The toolache wallaby likely became extinct in the 1930s. As such, the only available specimens require the use of ancient DNA techniques to extract useable quantities of DNA for analysis.

Thus, despite being one of the most intensively studied marsupial groups, the phylogeny of macropods remains poorly resolved, and further investigation is required to resolve the phylogenetic relationships of this iconic group of Australian marsupials.


1.2.3 Phylogenetic Reconstruction Using Molecular Sequence Data

Phylogenetic reconstruction has traditionally been performed using morphological characters; however, sequence-based reconstruction using molecular data (primarily DNA, but also including RNA and amino acids) has emerged in recent decades. Reconstructing phylogenies from molecular sequence data has been a key development in evolutionary biology for inferring the relationships among taxa (Zuckerkandl and Pauling, 1965). As sequencing technologies have improved, molecular phylogenetic reconstruction has expanded to describe not only species relationships, but also paralogues within a gene family, population histories and the dynamics of viruses and other pathogens, among others (Sanderson and Doyle, 1992, Wright, 1968, Rambaut et al., 2001).

A number of computational techniques are routinely used to analyse DNA and to reconstruct phylogenetic trees to provide a representation of inter-relatedness between species and/or between individuals within the same species at the population level (Stiller et al., 2009, Horner et al., 2010). These methods include the Maximum Parsimony approach (Hennig, 1965, Barnabas et al., 1972, Baba et al., 1981, White and Holland, 2011), Maximum Likelihood (Hutchinson, 1929, Felsenstein, 1981, Rohlf and Wooten, 1988, Felsenstein, 1988, Yang, 1997) and Bayesian Inference (Li et al., 2000, Rannala, 2002, Shoemaker et al., 1999). All seek to statistically describe the relationships of molecular sequences and represent them as a phylogenetic tree, which formally is a directed, acyclic graph (Steel, 2016).

The maximum parsimony (MP) approach is based on Occam’s razor, and seeks to identify the phylogenetic tree that requires the minimum number of state changes to explain an observed set of data. These data may be anything from a particular phenotypic trait to an amino acid sequence or a DNA sequence from homologous genomic regions of different organisms. The method attempts to explain the data using a minimal number of changes, following an a priori assumption that change is improbable (Felsenstein, 1981). This method is suitable for analysing DNA sequences with a constant rate of molecular evolution across all lineages and relatively few changes overall across the phylogenetic tree (Kimura, 1983). However, when different lineages/sequences have different rates of evolution, the parsimony method may yield statistically inconsistent results (converging on the incorrect tree with increasing certainty as more data are added). This tendency to converge on an incorrect tree increases as longer sequences are considered for the same set of species (Felsenstein, 1981). Even when different lineages have the same rate of evolution, the method can be statistically inconsistent if branch lengths differ across the tree (Hendy and Penny, 1989).
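The parsimony criterion described above can be illustrated with a minimal sketch of Fitch's small-parsimony counting procedure for a single character. This is an illustration only, not the software used in this thesis; the four taxon labels, the character states and the two candidate trees are hypothetical.

```python
def fitch_score(tree, states):
    """Return the minimum number of state changes for one character.

    tree   -- rooted binary tree written as nested 2-tuples of leaf names
    states -- dict mapping each leaf name to its observed state
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):      # leaf: its state set is the observation
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                     # children agree: no change needed here
            return shared
        changes += 1                   # children disagree: count one change
        return left | right

    visit(tree)
    return changes


# Hypothetical four-taxon example with one binary character
states = {"A": "1", "B": "1", "C": "0", "D": "0"}
tree_1 = (("A", "B"), ("C", "D"))      # groups the taxa that share a state
tree_2 = (("A", "C"), ("B", "D"))      # separates them

for name, tree in (("tree_1", tree_1), ("tree_2", tree_2)):
    print(name, fitch_score(tree, states))
```

Under the parsimony criterion, the tree with the lower score is preferred for this character; a full analysis would sum the scores over all characters and search across candidate topologies.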

While the MP approach may be justifiably used in cases where the number of state changes over evolutionary time is small and when the mutation rate across sequences and lineages is constant, many data sets do not adhere to these characteristics. In such cases it may become pertinent to utilize other statistical methods of tree reconstruction. When dealing with large data sets that possess unequal rates of mutation across lineages and sequences, the maximum likelihood (ML) approach becomes a useful tool for tree reconstruction (Felsenstein, 1981).

Maximum likelihood principally differs from parsimony in that it explicitly models the evolutionary process, while parsimony does not (or at best, can be considered a “no common mechanism model”) (Tuffley and Steel, 1998). ML essentially seeks to find the evolutionary tree and sequence evolution model that has the highest probability of evolving the observed data (Felsenstein, 1981). Note that this is not the probability that the tree is correct, but rather the probability of the data, given the hypothesis, which is a phylogenetic tree and an evolutionary process. This probability or ‘likelihood’ is calculated as a function of the hypothesis (the tree and model), rather than a function of the data (Felsenstein, 1981).
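To make the idea of likelihood as a function of the hypothesis concrete, the following sketch evaluates the likelihood of a single parameter, the evolutionary distance between two aligned sequences, under the Jukes-Cantor (JC69) substitution model. JC69 is chosen here purely for simplicity and is an assumption of this example, as are the alignment length and the number of differences; real analyses optimise a full tree topology, all branch lengths and a richer model.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of distance d (substitutions per site) under JC69."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * decay)   # joint probability of an identical site
    p_diff = 0.25 * (0.25 - 0.25 * decay)   # joint probability of one specific mismatch
    return (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)


# Hypothetical pairwise alignment: 1000 sites, 100 of which differ
n_sites, n_diff = 1000, 100

# The data are held fixed while the hypothesis (the distance d) is varied
grid = [i / 1000 for i in range(1, 1000)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(d, n_sites, n_diff))

# Closed-form JC69 estimate, for comparison with the grid search
analytic_d = -0.75 * math.log(1 - 4 * (n_diff / n_sites) / 3)
print(best_d, analytic_d)
```

The grid search and the closed-form estimate should agree to within the grid spacing, which illustrates that the maximum likelihood estimate is simply the hypothesis under which the observed data are most probable.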

The usefulness of the ML approach is in its tendency to yield more consistent results as more characters are included in the study. In fact, when large numbers of characters are included in an analysis, the ML approach will in most cases deviate less from the true tree than other methods (Rohlf and Wooten, 1988). Conversely, this approach may yield biased results when the number of characters in a study is small and can occasionally be susceptible to “long branch repulsion” (Rohlf and Wooten, 1988). Both MP and ML have an upper limit to the number of species that can realistically be analysed at any one time, since the number of possible trees increases faster than exponentially as more species are added. Consequently, an exhaustive evaluation of every possible tree topology is limited by available computing power (White and Holland, 2011); however, this is partly solved by heuristic searches.
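The growth in the number of candidate trees can be checked directly. The sketch below uses the standard result that n taxa can be arranged into (2n - 5)!! distinct unrooted, fully bifurcating trees; the function name is introduced here for illustration only.

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted, fully bifurcating trees: (2n - 5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):   # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Even a modest number of taxa therefore rules out scoring every topology, which is why heuristic tree searches are used in practice.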

Finally, the Bayesian approach to phylogenetic reconstruction (Huelsenbeck et al., 2001b) is a statistical method that makes use of an algorithm known as Bayesian Markov chain Monte Carlo (MCMC) (Li et al., 2000). Bayesian inference is essentially a likelihood method, with the primary difference being that the posterior probability takes into account prior knowledge about a given system (Huelsenbeck et al., 2001a, Glenner et al., 2004, Drummond and Rambaut, 2007). In addition to performing phylogenetic reconstruction, the reconstruction of demographic histories can also be easily performed with Bayesian inference, by the utilization of coalescent theory (Kingman, 1982). This statistical method traces genealogies back through time to estimate the time to a most recent common ancestor (MRCA) and various population genetic parameters. Examples include the variable population size coalescent model introduced by Griffiths and Tavare (1994), the Bayesian Skyline plot (Drummond et al., 2005) and the Bayesian Skyride plot (Minin et al., 2008). For phylogenetic reconstruction at the species level and above, Bayesian inference is a powerful tree building method that can take prior information into account; however, ML bootstrap values are typically more faithful than Bayesian posterior probability (BPP) values, which tend to be overconfident (Suzuki et al., 2002, Gontcharov et al., 2004, Phillips and Pratt, 2008) unless the model is a very close match to the evolutionary process that has taken place. Bayesian inference programs include MrBayes (Huelsenbeck and Ronquist, 2001) and BEAST (Drummond et al., 2012), which focus on within and between species phylogenetic inference, and BATWING (Wilson et al., 2003), which has a focus on population genetics of microsatellites, using a coalescent approach.
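The relationship between likelihood, prior and posterior probability can be sketched for a toy case with three candidate trees. This is a simplified illustration under stated assumptions: the tree labels, log-likelihood values and priors are hypothetical, and the posterior is computed by direct normalisation, whereas programs such as MrBayes and BEAST approximate it by MCMC sampling over trees and model parameters.

```python
import math


def posterior_probabilities(log_likelihoods, priors):
    """Posterior probability of each hypothesis via Bayes' theorem."""
    shift = max(log_likelihoods.values())   # subtract the maximum to avoid underflow
    weights = {
        name: math.exp(value - shift) * priors[name]
        for name, value in log_likelihoods.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical log-likelihoods for three candidate topologies
log_likelihoods = {"T1": -1000.0, "T2": -1002.0, "T3": -1005.0}

flat_prior = {name: 1 / 3 for name in log_likelihoods}
informed_prior = {"T1": 0.1, "T2": 0.8, "T3": 0.1}   # assumed prior favouring T2

flat_posterior = posterior_probabilities(log_likelihoods, flat_prior)
informed_posterior = posterior_probabilities(log_likelihoods, informed_prior)
print(flat_posterior)
print(informed_posterior)
```

With a flat prior the ranking of trees follows the likelihoods alone; an informative prior shifts posterior support towards the tree it favours.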

A major problem associated with phylogenetic reconstruction is that different genetic loci can yield conflicting signals that support alternative tree topologies. More generally, this problem can be described as a discordance between any given gene tree and the true species tree (Doyle, 1997). This conflict of phylogenetic signal is a major problem in evolutionary biology and can arise for a number of reasons: i) homoplasy, in which character states are shared between species not through common ancestry, but through evolutionary convergence, parallelism or character reversal; ii) incomplete lineage sorting (ILS) (Figure 7), often referred to as hemiplasy (Robinson et al., 2008), a phenomenon that occurs in cases of rapid speciation, in which unfixed genetic loci may be differentially allocated to lineages in a tree, such that the resulting gene tree topologies may not conform with the true species tree; and iii) introgression, the transfer of genetic information between lineages as a result of hybridization and repeated backcrossing (Anderson and Hubricht, 1938, Anderson, 1949).

Figure 7. Illustration of Incomplete Lineage Sorting. Taken from (Avise, 2000). Gene genealogies within a three-species phylogeny are shown. Two possible gene trees are represented by thin dark lines, with the gene tree in (b) showing a qualitative topological discordance with the species tree. In contrast, gene tree (a) is in agreement with the species tree.


ILS and introgression are especially prevalent in cases where species diversify over a short period of time (i.e. when branch lengths in a phylogenetic tree are short) (Felsenstein, 2004). Cases of difficult-to-resolve phylogenies due to lineage sorting effects have been described across a wide range of taxa (Avise et al., 1983, Pamilo and Nei, 1988, Takahata, 1989, Doyle, 1992, Maddison, 1997, Rosenberg, 2002, Rosenberg, 2003). In the case of kangaroos, specifically in the Macropus/Wallabia clade, previous time-calibrated phylogenies have resulted in very short branch lengths, indicating that the major divergences (between each of the Macropus subgenera, as well as Wallabia) took place over a period of 1–2 million years (Meredith et al., 2008b, Phillips et al., 2013), which makes this clade particularly prone to the effects of ILS. Similarly, the tendency for introgression to confound phylogenetic reconstruction has been described in fishes (Smith, 1992), plants (Ellstrand et al., 1999) and reptiles (Leaché and McGuire, 2006). As such, it is useful to utilize a variety of markers for phylogenetic reconstruction. A novel marker system that has emerged with the advent of whole genome sequences is the retrotransposon insertion.
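The link between short internal branches and ILS noted above can be made concrete with the standard three-species coalescent result, in which the probability that a gene tree disagrees with the species tree is (2/3)·exp(-T), where T is the internal branch length in coalescent units (generations divided by twice the effective population size for a diploid). The sketch below is illustrative only: the generation time and effective population size are assumed values, not estimates for kangaroos.

```python
import math


def discordance_probability(internal_branch_years, generation_years, effective_size):
    """P(gene tree differs from species tree) for three species, diploid coalescent."""
    generations = internal_branch_years / generation_years
    coalescent_units = generations / (2 * effective_size)
    return (2 / 3) * math.exp(-coalescent_units)


# Assumed values for illustration: 5-year generations, effective size of 100,000
for branch_years in (100_000, 500_000, 1_000_000, 5_000_000):
    p = discordance_probability(branch_years, generation_years=5, effective_size=100_000)
    print(f"{branch_years:>9} years: P(discordant) = {p:.3f}")
```

Under these assumptions, shortening the internal branch raises the expected proportion of discordant loci towards its upper bound of two thirds, which is why rapid successive divergences are difficult to resolve with a small number of loci.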

1.2.4 Molecular Phylogenies Using Transposable Elements

The discovery of transposable elements (TEs) by Barbara McClintock in the 1950s revolutionized the field of genetics and earned her the 1983 Nobel Prize in Physiology or Medicine. The phenomenon that she described (termed ‘transposition’) is the tendency of certain mobile genetic elements to relocate within the genome (McClintock, 1956). Several types of TEs are now known to exist, and can be found in the genomes of virtually all eukaryotes (Shedlock and Okada, 2000).

The classification of transposable elements is based on their structural features (see Figure 8) and the mechanisms that they employ to mobilize and propagate within the genome. Two major classes of transposable elements have been described (Finnegan, 1989, Wicker et al., 2007b).


Figure 8. Structure of the major types of transposable elements. Taken from (Goodier and Kazazian Jr, 2008).

1.2.4.1 Class II: DNA Transposons

Historically referred to as ‘jumping genes,’ DNA transposons are mobile elements that operate via a ‘cut and paste’ mechanism, in which the DNA sequence is excised and re-inserted into a different genomic location by the action of a transposase (Clark and Kidwell, 1997, Hartl et al., 1997). DNA transposons make up less than 2% of the human genome (Kazazian and Moran, 1998).


1.2.4.2 Class I: Retrotransposons

These transposable elements operate via a ‘copy and paste’ mechanism, in which the DNA sequence is transcribed into an RNA intermediate, which is then reverse transcribed back into DNA (by the action of reverse transcriptase) and integrated into a new genomic location (Figure 9) (Weiner et al., 1986, Schmid, 1996, Shedlock and Okada, 2000). This process results in a new copy of the original DNA sequence and therefore causes the genome to increase in size. As a result of this process, retrotransposons are, by far, the most abundant class of transposable element and make up a significant fraction of eukaryotic genomes: up to 45% in humans (Lander et al., 2001), in excess of 50% in marsupials (Gentles et al., 2007) and as high as 90% in certain plant species (SanMiguel et al., 1996).

Figure 9. A summary of the process of retrotransposition, taken from (De Parseval and Heidmann, 2005). Retroelements are transcribed into an RNA intermediate followed by reverse transcription back into DNA and integration of the new copy into a virtually random location within the genome.

Retrotransposons can be further divided into two categories based on their structure and mode of propagation. The first category consists of long terminal repeat (LTR) retrotransposons, in which the element is flanked by direct repeat sequences (Wicker et al., 2007a, McCarthy and McDonald, 2004). The second category lacks long terminal repeats (non-LTR retrotransposons) and consists of short interspersed nuclear elements (SINEs) and long interspersed nuclear elements (LINEs) (Okada et al., 2004). Both LTR and non-LTR retrotransposons encode the enzyme reverse transcriptase, with the exception of SINEs. As such, SINEs are classified as non-autonomous retrotransposons and rely on the replicative machinery of autonomous elements (such as LINEs) in order to propagate within the genome (Dewannieux et al., 2003).

1.2.4.3 LTR retrotransposons - endogenous retroviruses

The most abundant and widespread LTR retrotransposons are endogenous retroviruses (ERVs), which make up ~8% of the human genome (Belshaw et al., 2004, Lander et al., 2001). ERVs have been identified in almost all examined genomes and are thought to arise when a free-living retrovirus integrates into a germ cell of a host organism, resulting in vertical transmission of the retrovirus along with the genome of the host (Löwer et al., 1996, Gifford and Tristem, 2003). ERVs share many structural and functional similarities with exogenous retroviruses (Belshaw et al., 2004, Gifford and Tristem, 2003) and ERV structure is largely indistinguishable from the proviruses that result from the integration of exogenous retroviruses into somatic cells (De Parseval and Heidmann, 2005).

This proviral form is characterized by two long terminal repeats (LTRs) of approximately 300–1200 nucleotides in length that flank a coding region of approximately 5–10 kb. The retroviral gag, pol, prt and env genes make up this coding region and contain the machinery necessary for propagation (see Fig. 2) (Stoye, 2012, De Parseval and Heidmann, 2005).

The vast majority of such insertion events tend to be either i) deleterious, such that they are quickly selected against, or ii) have little to no effect on the biology of the host (Malik, 2012, Patel et al., 2011). In rare cases, the insertion of a retrovirus can benefit the host organism. A notable example is the acquisition of the gene encoding the protein syncytin by eutherian mammals. This gene, which plays a crucial role in development of the placenta and embryogenesis, is a domesticated retroviral envelope (env) gene (Lavialle et al., 2013, Cornelis et al., 2012). The derived envelope allows extended uterine development compared to marsupials, by providing a barrier between the mother and the embryo, which prevents the mother’s immune system from attacking the embryo, which includes foreign (paternal) DNA.

Recently integrated ERVs will likely retain the ability to replicate for some time after the insertion event, and may continue to introduce additional germline insertions either by retrotransposition or re-infection (Herniou et al., 1998, Coffin et al., 1997). In the context of populations, ERVs may increase in frequency in the gene pool of the host population as a result of genetic drift or ‘hitchhiking’ effects (Smith and Haigh, 1974). ERV insertions can remain in the genome for millions of years, while gradually acquiring mutations at the neutral rate of the host. Through this process, a nascent ERV will eventually lose the machinery necessary for extra-cellular transmission (horizontal transfer) and will propagate exclusively by vertical transmission along with the genome of the host (Gifford and Tristem, 2003).

The koala retrovirus (KoRV) is a rare example of a retrovirus that appears to be in the process of transition between exogenous and endogenous states (Tarlinton et al., 2006). This virus has characteristics of ERVs in that it has been shown to be present in koala germ cells, and that it is passed on to offspring in a vertical fashion. However, an examination of koala somatic cells (blood samples) from geographically distinct populations revealed differences in the prevalence of KoRV across these populations. KoRV was completely absent from an isolated island koala population (), showed mixed prevalence in a southern mainland population (), and was found to be uniformly present in a northern mainland population (Queensland) (Tarlinton et al., 2006). Thus, it appears that the KoRV proviral sequence has only recently become endogenized within the koala population, while retaining replication competence, resulting in ongoing activity by repeated re-infection events. Furthermore, this retrovirus has been associated with neoplastic disease in the koala population across mainland Australia (Tarlinton et al., 2006, Tarlinton et al., 2005). This rare example of a recently integrated ERV demonstrates the impact that ERV insertions can have on their host genome and the populations in which they occur.

The majority of ERVs currently under investigation originated from much more ancient integration events. The kangaroo endogenous retrovirus (KERV) was first identified in an interspecific hybrid of two closely related wallaby species (O'Neill et al., 1998). It has been suggested that KERV is present in all extant marsupials (Ferreri et al., 2005, Ferreri et al., 2004), which implies an insertion event >80 Mya. Given that ERVs typically undergo short periods of activity, followed by accumulation of mutations or deletion, this claim warrants further investigation.

1.2.4.4 ERVs in genome evolution and host interactions

ERVs have the capacity to shape the genomes and impact the survival of the host in a number of ways. ERVs can influence gene regulation, function and expression via their LTRs, which contain various regulatory elements, such as polyA signals, enhancers and promoters. These regulatory signals can dramatically affect RNA expression of both retroviral and non-retroviral sequences in the host (Leib-Mosch et al., 2005).

Expression of ERVs in certain tissues can also have a marked influence on the biology of the host. For example, the KoRV retrovirus has been associated with neoplastic disease (Tarlinton et al., 2005), while certain human endogenous retrovirus (HERV) families have been associated with other disease states, such as breast cancer (Frank et al., 2008) and Hodgkin's lymphoma (Kewitz and Staege, 2013).


Furthermore, ERVs can shape the genomic landscape of their hosts by increasing the size of the genome or by causing chromosomal rearrangements and/or deletions (Ferreri et al., 2011). One mechanism by which this can occur is through a process of non-allelic homologous recombination between the LTR regions of ERVs (Ferreri et al., 2011). Since LTRs are identical at the time of integration into the host genome, if multiple copies are present in the genome, then homologous recombination between these LTRs can result in chromosomal rearrangements and/or deletions (De Parseval and Heidmann, 2005). This process often causes deletion of the intervening proviral sequence as well, resulting in large numbers of solo LTRs within the genome (Benachenhou et al., 2009). In the majority of species examined to date, the overall number of solo LTRs in the genome is approximately 10-fold greater than the number of proviral sequences present (Benachenhou et al., 2009).

Thus, investigation of LTR sequences within a genome can provide insight into the activity/amplification of a particular ERV (by examining overall copy number) and the timing of this activity (by examining sequence divergence between LTRs) (Ferreri et al., 2011, De Parseval and Heidmann, 2005). ERVs therefore appear to have significantly influenced the genomic landscape of their hosts, and further investigations into these processes are required. However, the interaction of ERVs with other transposable elements may also exert an influence on genome evolution.
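Because the two LTRs of a provirus are identical at the time of integration, their divergence can be converted into an approximate insertion age. The sketch below assumes the commonly used relationship T = K / (2r), where K is the Jukes-Cantor corrected divergence between the 5’ and 3’ LTRs and r is a neutral substitution rate per site per year. The sequences and the rate shown are hypothetical placeholders, not values from this study.

```python
import math


def ltr_insertion_age(ltr_5, ltr_3, rate_per_site_per_year):
    """Estimate years since integration from two aligned LTRs of one provirus."""
    if len(ltr_5) != len(ltr_3):
        raise ValueError("LTR sequences must be aligned to the same length")
    # Compare only columns where neither sequence has an alignment gap
    pairs = [(a, b) for a, b in zip(ltr_5.upper(), ltr_3.upper()) if "-" not in (a, b)]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    k = -0.75 * math.log(1 - 4 * p / 3)       # corrected substitutions per site
    return k / (2 * rate_per_site_per_year)   # both LTRs accumulate changes


# Hypothetical 20 bp alignment with two mismatches, and an assumed neutral rate
five_prime = "ACGTACGTACGTACGTACGT"
three_prime = "ACGTACGAACGTACGTACTT"
age = ltr_insertion_age(five_prime, three_prime, rate_per_site_per_year=2e-9)
print(f"{age:.3g} years")
```

The estimate scales inversely with the assumed rate, so any age obtained this way is only as reliable as the rate calibration, and real LTRs are far longer than this toy alignment.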

There is some evidence to suggest that competition between different retrotransposons may take place within a genome. A study carried out by Cantrell et al. (2005) showed that expansion of an ERV family (MysTR) coincided with LINE1 depletion in a South American rodent (Oryzomys), which hints at the possibility of competition between these two retrotransposons. In contrast, Ferreri et al. (2011) looked at the interaction between an ERV (KERV) and LINE1 in macropods (specifically M. eugenii, M. rufogriseus and Petrogale rothschildi) and found no such correlation. However, that study used only a limited number of species. Therefore, this phenomenon warrants a more in-depth analysis across a broader range of species.

1.2.4.5 Non-LTR retrotransposons (SINEs and LINEs)

The other major class of retrotransposons is the non-LTR retrotransposons. These elements differ from LTR retrotransposons in a number of ways, most notably by the absence of long terminal repeat sequences and the mechanism that they employ to propagate within the genome. By using a mechanism known as target primed reverse transcription (TPRT), the RNA of non-LTR retrotransposons is reverse transcribed back into DNA at the site of integration, unlike LTR retrotransposons, which typically reverse-transcribe in the cytoplasm (Luan et al., 1993).

LINEs are a class of non-LTR retrotransposon that have successfully populated the genomes of a wide range of eukaryotic lineages. These elements are made up of a 5’ untranslated region (UTR), two open reading frames (ORFs) separated by a spacer region, followed by a 3’ UTR (Swergold, 1990). ORF1 encodes an RNA binding protein, while ORF2 encodes reverse transcriptase and an endonuclease. The 3’ UTR contains the polyadenylation signal, while the 5’ UTR often contains a promoter site (Swergold, 1990).

SINEs are another successful group of non-LTR retrotransposons that are often present at more than 10^4 copies per genome, and are therefore the most abundant mobile genetic element in the eukaryotic genome (Shedlock and Okada, 2000). The majority of SINEs are derived from tRNAs (Okada, 1991, Ohshima et al., 1996), except for the primate Alu and rodent B1 families, which are derived from the 7SL RNA (Ohshima et al., 1996, Ullu and Tschudi, 1984, Krayev et al., 1980). In addition, 5S rRNA-derived SINEs have been found in fishes (Nishihara et al., 2006) and mammals. Because SINEs do not encode reverse transcriptase, they are dependent on other mobile elements for propagation. Typically, any given SINE will be dependent on a corresponding LINE for its propagation (Kajikawa and Okada, 2002). This occurs because many SINEs and LINEs share the same 3’ end sequences, in which the recognition site for reverse transcriptase is located. This structural relationship between SINEs and LINEs has led to the hypothesis that tRNA-derived SINEs originated from a recombination event between a tRNA primer and the 3’ ends of LINEs (Ohshima et al., 1996).

There are two major models that describe the propagation of SINEs. The first is the Master Gene Model (Shen et al., 1991, Shedlock and Okada, 2000). This model is based on the assumption that a master gene (or genes) gives rise to non-propagating copies. Thus, the rate of amplification of new SINEs is dependent on the activity of the master gene. Since mutations acquired in the master gene will affect its activity, this model predicts that the master gene will be under some kind of selective constraint in order to protect it from accumulated mutations that might affect its activity. The second model is known as the Multiple Source Gene Model of SINE propagation (Matera et al., 1990, Schmid and Maraia, 1992, Shedlock and Okada, 2000). This model predicts that multiple SINE offspring have the capacity to propagate themselves in the same way as the parent copy (Schmid and Maraia, 1992).

1.2.4.6 Retrotransposons as phylogenetic markers

The utilization of retrotransposons as phylogenetic tools has played an important role in resolving difficult phylogenetic questions.

Retrotransposons have been shown to be virtually homoplasy-free markers of shared ancestry, because of their near-random nature of insertion in the genome (Figure 10) (Shedlock and Okada, 2000). Homoplasious characters (i.e. traits that arise through convergence/parallelism/reversal, rather than shared ancestry) can confound phylogenetic signal by incorrectly grouping distantly related species together. DNA sequence-based phylogenetic analyses are particularly susceptible to homoplasies, because DNA has only four possible character states (A, C, G and T). Thus, the likelihood of independent lineages acquiring identical characters at a given site (convergence) is high. This problem is compounded when dealing with species that have undergone rapid radiations, where short branches between divergences leave little phylogenetic signal to compete with hemiplasy or with homoplasy accumulated along potentially longer terminal branches.

Figure 10. Example of a DNA alignment of retrotransposons in various bird species. Retrotransposons are in grey boxes. Target site duplications (direct repeats) are in black boxes. The presence of a retrotransposon at the same locus is a reliable marker of shared ancestry. Figure taken from (Suh et al., 2011).

Retrotransposons propagate in a unidirectional fashion; that is, newly inserted copies are rarely subjected to specific removal, but rather remain in the genome and gradually accumulate mutations over time. This unidirectional mode of propagation practically precludes “back mutations” to the ancestral state, one of the main problems associated with sequence-based methods. Thus, the presence of a retrotransposon at the same locus in two or more species is a reliable marker of shared ancestry. In some cases one retrotransposon can integrate into another one (SanMiguel et al., 1996). This phenomenon of 'nested retrotransposons' can be used to determine the relative ages of the elements involved, since younger elements can jump into older ones, but not vice versa.
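The logic of using insertions as presence/absence characters can be sketched as follows. Each locus is recorded as the set of taxa carrying the insertion, with absence treated as the ancestral state; loci shared by at least two but not all taxa support a grouping, and groupings that overlap without nesting are in conflict. The taxon labels and marker table are hypothetical and are not data from this study.

```python
def group_support(markers, taxa):
    """Count how many insertion loci support each group of taxa."""
    support = {}
    for present_in in markers.values():
        group = frozenset(present_in)
        # Insertions found in one taxon only, or in every taxon, carry no grouping signal
        if 1 < len(group) < len(taxa):
            support[group] = support.get(group, 0) + 1
    return support


def in_conflict(group_a, group_b):
    """Two groups conflict if they overlap without one containing the other."""
    return bool(group_a & group_b) and not (group_a <= group_b or group_b <= group_a)


# Hypothetical presence data: locus name -> taxa in which the insertion is present
taxa = {"A", "B", "C", "D"}
markers = {
    "locus1": {"A", "B"},
    "locus2": {"A", "B"},
    "locus3": {"A", "B", "C"},
    "locus4": {"B", "C"},
    "locus5": {"D"},
    "locus6": {"A", "B", "C", "D"},
}

support = group_support(markers, taxa)
for group, count in sorted(support.items(), key=lambda item: -item[1]):
    print(sorted(group), count)
```

In this toy table the grouping of A with B is supported by two loci and nests within the grouping of A, B and C, whereas the single locus grouping B with C conflicts with it; a minority of conflicting markers of this kind is the pattern expected under incomplete lineage sorting or hybridization.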

Retrotransposons can also provide insight into the processes of introgression (the movement of genetic information from one species into another by hybridization) and incomplete lineage sorting (the differential allocation of genetic alleles from an ancestral population into newly diverged daughter species). Potential examples of both of these phenomena can be found within the kangaroo genus, Macropus (Phillips et al., 2013). Mitochondrial introgression from a wallaroo ancestor into Macropus irma was suggested, based on the apparently close relationship of Macropus irma with wallaroos, in an analysis of a 6kb mtDNA fragment. Introgression can cause significant incongruence when comparing phylogenies based on mtDNA to those based on nuclear data. Similarly, incomplete lineage sorting can confound phylogenetic analyses because of the random allocation of alleles to rapidly diverging lineages that may not reflect the true species tree. An example may be the differing placements of the genus Wallabia relative to Macropus which varies depending on the genes investigated (Phillips et al., 2013).

Retrotransposons have been successfully utilized as phylogenetic markers in groups for which traditional sequence-based methods were inadequate. They have been used to examine the phylogenies of several mammalian species (Shedlock and Okada, 2000, Okada et al., 2004, Shedlock et al., 2004, Nishihara et al., 2005, Kriegs et al., 2006, Nilsson et al., 2010a, Kramerov and Vassetzky, 2011). Notable examples include the placement of cetaceans (whales, dolphins and porpoises) as sister to the Hippopotamidae within Artiodactyla (Shimamura et al., 1999) as well as relationships among the orders of marsupials (Nilsson et al., 2010a, Munemasa et al., 2008).


The retrotransposon approach is not completely free of problems: some retro-elements have been shown to insert preferentially within specific motifs, and specific deletions of retro-elements, although rare, can lead to inaccurate phylogenies (van de Lagemaat et al., 2005). However, despite these issues, retrotransposons have been shown to be effective, nearly homoplasy-free markers for phylogenetic analysis due to the near-random nature of insertion.

In rare cases in which retrotransposons have been unable to resolve specific relationships, they have been used to show that divergence between species occurred rapidly, with conflict likely due to incomplete lineage sorting (Nishihara et al., 2009). In addition, molecular dating can be performed by examination of acquired mutations of SINE flanking regions (van de Lagemaat et al., 2005, Kramerov and Vassetzky, 2011) or ERV LTR sequences (Feschotte and Gilbert, 2012).

1.2.5 Adaptive Radiations

Adaptive radiations are characterized by a rapid diversification of organisms from a single lineage into an array of different forms or niches. This burst in speciation and phenotypic variation occurs when a change in the environment or community assembly makes new resources and ecological niche space available and results in species with morphological, physiological and biochemical traits that allow them to exploit these new environments (Simpson, 1944, Simpson, 1955, Schluter, 2000). Here I consider niches as a joint property of individuals and the community/environment, rather than a fixed ecospace. A signature of adaptive radiations is a surge in both taxonomic and ecological diversity within a clade, with a strong correlation between the phenotypes of the resulting species and their environmental niches (Schluter, 2000, Simpson, 1953). This phenomenon has been identified in a number of clades. Some well-known examples include finches (Tebbich et al., 2010), Anolis lizards (Johnson et al., 2010) and cichlid fishes (Muschick et al., 2012).


Perhaps one of the most well-known examples of an adaptive radiation is that which occurred in mammals following the mass extinction event at the Cretaceous-Palaeogene (KPg) boundary, which led to the demise of the non-avian dinosaurs (Renne et al., 2013, Alvarez et al., 1980). This mass extinction, which has been associated with a bolide impact approximately 66 million years ago (Alvarez et al., 1980, Kuiper et al., 2008, Renne et al., 2015), marks the transition from the end of the Mesozoic to the beginning of the Cenozoic era. Evidence from the fossil record suggests that mammalian diversity greatly increased in the early Cenozoic, presumably to fill the ecological niches vacated by the dinosaurs (Carroll, 1997, Alroy, 1999, Phillips, 2015b). However, the conclusions that have been reached using this paleontological approach differ considerably from those inferred using DNA sequences from modern taxa. The discrepancy between divergence dates obtained from the fossil record and those obtained from molecular methods has led to a number of competing models that describe the origination and diversification of the extant orders of mammals, such as the Long Fuse Model (Penny and Phillips, 2004, Hunter and Janis, 2006, Foote et al., 1999, Archibald and Deutschman, 2001) and Short Fuse Model (Archibald and Deutschman, 2001, Kumar and Hedges, 1998, Cooper and Penny, 1997).

Another example is the much more recent adaptive radiation of kangaroos. Within a few million years the kangaroos and wallabies radiated into a wide array of browsers and grazers occupying almost all terrestrial habitats in Australia and New Guinea, even moving back into the trees in the case of the tree kangaroos (Dendrolagus). This represents one of the most remarkable rapid diversifications among marsupials. Ecological and morphological aspects of this diversity are well known (Prideaux and Warburton, 2010, Flannery, 1984, Flannery, 1989), but molecular analyses can help to place this diversity into a temporal and phylogenetic framework from which to trace the adaptive radiation.


1.2.6 Reconstructing Evolutionary History Through Molecular Dating

Molecular dating is the process by which the time since the evolutionary divergence between sequences is deduced from the degree of genetic difference between them and an estimate of the rate at which the sequences change. This is done by taking into account the mutation rate of the sequences and inferring the amount of time that would be required to explain the observed differences. The observed differences between DNA sequences can be estimated using a relative rate model for each substitution type. DNA or other sequence data can only provide relative rates of evolution. These need to be calibrated with prior rate estimates for the relevant genes or with information from the geological record, such as fossil dates, in order to convert molecular change into time. Thus, in the first instance, molecular dating requires one to select an appropriate model of molecular evolution in order to estimate the time since the divergence of sequences. These models have been described in detail by Yang (2006). In short, the models may be simple, giving equal probability to each type of substitution, such as the Jukes-Cantor model (Jukes and Cantor, 1969). Alternatively, the model may be more complex, accounting for the unique attributes of specific datasets, such as the general time reversible (GTR) model (Yang, 1994).
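As a minimal illustration of the steps described above, the following Python sketch corrects an observed proportion of differing sites with the Jukes-Cantor model and converts the resulting distance into a divergence time using a single, externally supplied rate. The function names and parameter names are assumptions made for illustration only and are not taken from the analyses in this thesis; a strict clock with one calibrated rate is assumed for simplicity.

```python
import math


def jc69_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance (expected substitutions per site)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only alignment columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)  # observed proportion of differences
    if p >= 0.75:
        raise ValueError("sequences are saturated under the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def divergence_time(distance: float, rate_per_site_per_myr: float) -> float:
    """Time since divergence (Myr) under a strict clock, where d = 2 * rate * t."""
    if rate_per_site_per_myr <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_site_per_myr)
```

The factor of two reflects that substitutions accumulate along both lineages after they split. The rate itself must come from a calibration (a fossil date or a prior rate estimate), which is exactly the step at which published kangaroo studies differ.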

One method of using sequence differences to infer divergence times is to utilize ‘clock-like’ data, in which the amount of genetic difference is assumed to accumulate at the same rate across all lineages/sequences/branches in a tree. However, this approach has long been challenged as biologically unrealistic. Methods for detecting variation in rates between sequences have been described (Langley and Fitch, 1973, Takezaki et al., 1995); however, even when a dataset is expected to have a relatively uniform rate of sequence evolution, such clocklike behaviour generally cannot be attributed to a particular gene, lineage or taxonomic level (Duchene and Bromham, 2015). One means of accounting for rate variation is to use “local clocks”, which incorporate different rate estimates in different parts of the phylogenetic tree within the same analysis (Rambaut and Bromham, 1998, Yoder and Yang, 2000, Drummond and Suchard, 2010). Alternatively, one can account for rate variation by utilizing a relaxed molecular clock that allows rates of molecular evolution to vary across different branches of a phylogenetic tree (Sanderson et al., 1998, Drummond et al., 2006). Molecular dating studies of kangaroos are substantially inconsistent in their estimates of divergence times (Table 1), indicating the need for further investigation with expanded datasets and greater consistency in model selection and rate assumptions.

One of the major differences between studies has been the acceptance of different calibrations, both in terms of the clades that are calibrated and the minimum and maximum bounds attributed to these calibrations. For example, even the same authors have varied more than two-fold in the minimum bound given to the divergence of the macropodid and potoroid families, with Meredith et al. (2008) assigning 12 Ma, and Meredith et al. (2011) assigning 24.7 Ma as minimum ages for that divergence. These differences come down to alternative interpretations of the fossil record.

In the former case the authors were uncertain about the affinities of and Early Miocene kangaroos, and so relied on the presence of approx. 12 Ma sthenurine teeth. In contrast, Meredith et al. (2011) accepted that Oligocene Bulungamaya is more closely related to macropodids than potoroids, and therefore must post-date the divergence of those clades.

Table 1. Previous molecular dating estimates for the major divergences among Macropodiformes. Modified from the supplementary information of Meredith et al. (2008b) to also include molecular dates from Phillips et al. (2013).

Studies: Burk et al. 1998; Burk and Springer 2000; Kirsch et al. 1997; Westerman et al. 2002; Meredith et al. 2008; Meredith et al. 2008b; Phillips et al. 2013

Node
Macropodiformes                   45 34 - 48 30.5 - 33.5 26
Macropodidae + Potoroidae         30 24.0 - 24.2 14 - 17 15
All macropodids but Lagostrophus  40 10.6 - 11.2
Macropodidae                      19.4 - 20.2 12.2 - 20.6
Potoroidae                        22.1 - 22.3 21.3 - 22.2 10.8 - 23.1


1.2.7 Ancestral State Reconstruction

Ancestral state reconstruction is a method used in evolutionary biology to understand how the traits observed in modern species evolved from a common ancestral state (Gascuel and Steel, 2014). Utilizing a phylogenetic tree as a starting point allows us to estimate what state was at the root of the tree or at the internal nodes. This is done using character information from modern taxa that occupy the tips of the tree and implementing methods of reconstruction, such as maximum parsimony, maximum likelihood or Bayesian inference. Ancestral state reconstructions can be based on sequences or on ecological, phenotypic, behavioural or biogeographic characters (Voordeckers et al., 2012, Finarelli and Flynn, 2006, Meredith et al., 2008b). In the case of kangaroos, Meredith et al. (2008b) reconstructed grades of dental organization to investigate browsing vs grazing and showed that grazing evolved independently on two occasions within the Macropodidae.
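One simple way to see how tip characters inform internal nodes is the first (tip-to-root) pass of Fitch parsimony for a single discrete character. The sketch below is illustrative only: the tree shape, tip labels and character states in the usage note are hypothetical, the function name is an assumption, and this is not the reconstruction procedure used by Meredith et al. (2008b) or in this thesis.

```python
def fitch_downpass(tree, tip_states):
    """Return (candidate_states, n_changes) at the root of a rooted binary tree.

    tree: a tip name (str) or a 2-tuple of subtrees.
    tip_states: dict mapping each tip name to its observed character state.
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_downpass(left, tip_states)
    right_set, right_changes = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:
        # The two descendants agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # No state in common: keep both options and count one state change.
    return left_set | right_set, left_changes + right_changes + 1
```

For a hypothetical tree `(("tip1", "tip2"), ("tip3", "tip4"))` in which only `tip3` is scored as a browser and the other tips as grazers, the root set contains only the grazing state and one change is counted. A full reconstruction would add a second, root-to-tip pass to assign states to every internal node.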

1.3 OBJECTIVES AND THESIS OUTLINE

1.3.1 Chapter 1. Introduction

Chapter one is an introduction that presents the context of my research in the form of a literature review and outlines the gaps in knowledge, significance and purpose of the research.

1.3.2 Chapter 2. Examining the phylogeny of the genus Macropus and Wallabia using retrotransposon insertions.

Chapter two provides an examination of the phylogeny of Macropus and Wallabia using retrotransposons as phylogenetic markers. This chapter is an article currently under review at the journal Scientific Reports. This analysis provides a dataset of 29 phylogenetically informative retrotransposon insertions that shed light on the phylogenetic relationships among kangaroos, including contentious species such as the swamp wallaby and the black-gloved wallaby. Also presented is a new statistical framework for dealing with the ascertainment bias associated with retrotransposon studies that utilize only a single reference genome. In addition, I have examined whether transitions of certain kangaroo species into more arid zones coincided with past environmental changes in climate and vegetation. I also examine the activity and lineage history of a prevalent retrotransposon, the kangaroo endogenous retrovirus (KERV) in the kangaroo genome and show that another retrotransposon (LINE1) has likely gone extinct in the kangaroo genome, hinting at the possibility of competition between these two elements.

1.3.3 Chapter 3. Examining the phylogeny and timing of Macropus and Wallabia, utilizing nuclear genes and mitochondrial genomes

Chapter three examines the phylogeny and timescale of Macropus and Wallabia utilizing mitochondrial genomes and nuclear genes as markers, in a concatenated ‘supermatrix’ dataset, and provides more complete taxon coverage than chapter two. This chapter presents the first study covering all living members of Macropus. This includes examining the phylogenetic placements of enigmatic species such as the black wallaroo, for which no species-level molecular sequence information was previously available, as well as the extinct toolache wallaby, by utilizing the first available ancient DNA sequence information from this species. The phylogeny obtained in this chapter is directly comparable to the phylogeny of chapter two, allowing us to compare three independent marker systems (retrotransposons, mitochondrial and nuclear markers) for the phylogenetic reconstruction of kangaroos. Molecular dating provides a temporal context for this phylogeny and also provides a starting point for the ancestral state reconstructions of chapter four. Finally, a lineage through time analysis based on the dated phylogeny examines the rate of diversification of kangaroos through time.


1.3.4 Chapter 4. The evolutionary history of kangaroos (Macropus and Wallabia).

Chapter four consists of an analysis of the rate of diversification among kangaroos by a. In addition, ancestral state reconstructions for Macropus/Wallabia were performed to shed light on the evolutionary history of kangaroos and to examine how phenotypic traits in kangaroos have evolved through time. These traits include habitat, group/mob size and sexual dimorphism for size and colour in both modern species (that occupy the tips of the phylogenetic tree) as well as ancestral species (that occupy the root and internal nodes of the tree). I then compare these characters with climate and vegetation changes on the Australian continent over the past several million years.

1.3.5 Chapter 5. Phylogeny and timing of deep relationships among the Macropodidae

Chapter five addresses the phylogeny and evolutionary timescale of macropods at a higher taxonomic level than previous chapters, addressing deeper divergences within the Macropodidae, and utilizing an expanded dataset consisting of six newly sequenced genes across 11 macropod genera, as well as five previously sequenced nuclear genes and complete or near-complete mitochondrial genomes. While the earlier chapters have focussed on the genera Macropus and Wallabia, chapter five takes a broader view by addressing all genera within the family Macropodidae and exploring the six-way polytomy among the clades that previous studies have been unable to resolve: Lagorchestes, Onychogalea, Macropus/Wallabia, Thylogale/Petrogale/Dendrolagus, Setonix and Dorcopsis/Dorcopsulus.

1.3.6 Chapter 6. Conclusion

Chapter six is a discussion chapter that summarizes the findings and outcomes of each of the previous research chapters and provides final conclusions and suggestions for future research.


Together, these chapters will investigate the evolutionary history of kangaroos at various taxonomic levels, by reconstructing their phylogeny, performing molecular dating and reconstructing ancestral states of key phenotypic traits. These chapters cover the relationships among the genera Macropus and Wallabia, as well as deeper macropodid relationships. I have utilized a combination of powerful molecular methods including retrotransposon insertions, nuclear genes and mitochondrial genomes in a comprehensive molecular investigation. Furthermore, I provide novel statistical methods for overcoming an ascertainment bias that occurs when only a single reference genome is available for retrotransposon studies. In addition, I have explored the dynamics of a specific type of retrotransposon, the kangaroo endogenous retrovirus (KERV), by examining heterozygous KERV insertions, and by examining the lineage history of KERV sub-families. Finally, I find evidence that another retrotransposon type (LINE1) may have become extinct in the kangaroo genome.

Supplementary objectives

In addition to the primary objectives above, I have also authored or co-authored two publications related to my research.

1.3.7 Supplementary objective 1.

(First author publication).

Dodt, W.G., McComish, B. J., Nilsson, M.A., Gibb, G.C., Penny, D. & Phillips, M. J. 2016. The complete mitochondrial genome of the eastern grey kangaroo (Macropus giganteus). Mitochondrial DNA A DNA Mapp Seq Anal, 27, 1366-7.

1.3.8 Supplementary objective 2.

(Co-author publication).

Gallus, S., Hallstrom, B.M., Kumar, V., Dodt, W.G., Janke, A., Schumann, G. G. & Nilsson, M. A. 2015. Evolutionary histories of transposable elements in the genome of the largest living marsupial carnivore, the Tasmanian devil. Molecular Biology and Evolution, 32, 1268-1283.


1.4 SIGNIFICANCE OF RESEARCH

This research addresses the topic of kangaroo evolution at various taxonomic levels and utilizes a variety of molecular markers, as well as phenotypic traits to determine the phylogeny, timescale and evolutionary history of this iconic group of Australian mammals. I have aimed to overcome the problems associated with phylogenetic reconstruction of rapidly evolving taxa (ILS, introgression) by utilizing different types of molecular markers, including mitochondrial genomes, nuclear genes and retrotransposon insertions, and utilizing both concatenated datasets as well as species tree methods (which can account for ILS). To date, no other study of kangaroos has utilized all three of these marker systems within a single study, nor has any study of kangaroos covered all modern members of Macropus – the largest sized and most recognizable of all the kangaroo and wallaby genera. In addition, I have provided an examination of the timing of the divergences among kangaroos through molecular dating and explored their evolutionary history by reconstructing their ancestral states for various phenotypic and behavioural traits. My finding that the swamp wallaby is nested within a paraphyletic Macropus, encourages taxonomic revision of Macropus and Wallabia. Furthermore, I have provided a statistical framework that deals with a common ascertainment bias that arises in retrotransposon studies that have access to only a single reference genome. These statistical tests allow us to confirm or reject the observed tree topology, or alternatively indicate that additional markers or genomes are required for resolution. A lack of reference genomes for retrotransposon studies, below the ordinal level, will continue to be an issue for the foreseeable future, particularly among marsupials for which only four reference genomes currently exist.

Chapter 2: Retrotransposon phylogeny, updated statistics and activity of TE lineages in the genus Macropus

*This chapter has been published in Scientific Reports: Dodt, W.G., Gallus, S., Phillips, M.J. and Nilsson, M.A., 2017. Resolving kangaroo phylogeny and overcoming retrotransposon ascertainment bias. Scientific reports, 7(1), p.16811.

2.1 ABSTRACT

Reconstructing a phylogeny from retrotransposon insertions is often limited by access to only a single reference genome, whereby support for clades that do not include the reference taxon cannot be directly observed. Here I have developed a new statistical framework that accounts for this ascertainment bias, allowing us to employ phylogenetically powerful retrotransposon markers to explore the radiation of the largest living marsupials, the kangaroos and wallabies of the genera Macropus and Wallabia. An exhaustive in silico screening of the tammar wallaby (Macropus eugenii) reference genome followed by experimental screening revealed 29 phylogenetically informative retrotransposon markers belonging to a family of endogenous retroviruses. I identified robust support for the enigmatic swamp wallaby (Wallabia bicolor) falling within a paraphyletic genus, Macropus.

My statistical approach provides a means to test for incomplete lineage sorting and introgression/hybridization in the presence of the ascertainment bias. Using retrotransposons as “molecular ”, I reveal one of the most complex patterns of hemiplasy yet identified, during the rapid diversification of kangaroos and wallabies. Ancestral state reconstruction incorporating the new retrotransposon phylogenetic information reveals multiple independent ecological shifts among kangaroos into more open habitats, coinciding with the Pliocene onset of increased aridification in Australia from ~3.6 million years ago.


2.2 INTRODUCTION

The genus Macropus includes kangaroos, wallaroos and wallabies, which are herbivorous and occupy a wide range of terrestrial habitats throughout Australia, parts of New Guinea and several surrounding islands (Nowak, 1999). The 13 species are currently grouped into three subgenera - the predominantly mesic members of M. (Macropus) and M. (Notamacropus), as well as the more arid adapted members of M. (Osphranter) (Dawson, 2012). The evolutionary relationships among these subgenera and their relationship to the swamp wallaby (Wallabia bicolor) have remained contentious (Meredith et al., 2008b, Phillips et al., 2013, Butler et al., 2016). Studies based on maternally inherited mitochondrial DNA (mtDNA) have favoured a sister relationship between Wallabia and the genus Macropus (Burk and Springer, 2000, Phillips et al., 2013). This is broadly in agreement with morphological studies, which have placed Wallabia outside of Macropus (Flannery, 1989, Prideaux and Warburton, 2010). However, analysis of five concatenated nuclear genes provided moderate support for Wallabia bicolor being nested inside Macropus (Meredith et al., 2008b). Conversely, the traditional placement of the black-gloved wallaby (Macropus irma) within M. (Notamacropus) is supported by nuclear DNA (Meredith et al., 2008b), whereas analysis of mtDNA instead placed Macropus irma within M. (Osphranter) (Phillips et al., 2013). The sister group to Macropus and Wallabia also remains unclear. Morphological characters arguably favour the nail tail wallabies (Onychogalea) as the sister group to Macropus and Wallabia (Butler et al., 2016, Flannery, 1989, Prideaux and Warburton, 2010), while five concatenated nuclear loci weakly favour the hare wallabies (Lagorchestes) (Meredith et al., 2008b), and mtDNA analyses remain uncertain (Westerman et al., 2002).

Macropus and Wallabia stem from within a broader adaptive radiation of macropodid genera (also including Lagorchestes, Onychogalea, and Setonix) that took place during the Late Miocene, a period of gradual cooling, drying and opening of across Australia (Martin, 2006, Meredith et al., 2008b, Prideaux and Warburton, 2010, Black et al., 2012). According to previous molecular dating estimates, the divergence of all three Macropus subgenera and Wallabia took place over a period of 1-2 million years (Meredith et al., 2008b, Phillips et al., 2013). This rapid radiation is consistently associated with low statistical support among both nuclear and mitochondrial DNA for relationships among the three subgenera of Macropus. This uncertainty precludes confident inference of whether habitat expansion into semi-arid grasslands and grazing specializations (see Sanson, 1989) evolved early in macropodids or later, independently in the Macropus subgenera, M. (Osphranter) and M. (Macropus), and also in Onychogalea.

Next generation DNA sequencing and the increasing availability of complete nuclear genomes have allowed phylogenetic relationships to be investigated using novel methods that utilize genome level characters (Ray et al., 2006). Here I employ a genome-wide retrotransposon presence/absence analysis, in an attempt to resolve the evolutionary history of the genera Macropus and Wallabia. Retrotransposons have a number of advantages over traditional sequence based phylogenetic reconstruction. Most notably, retrotransposons are a virtually homoplasy-free marker system (Shimamura et al., 1997, Ray et al., 2006, Nilsson et al., 2010b), due to near-random insertion across the genome providing an almost unlimited size of character space. Traditional DNA sequence-based methods, which are more prone to homoplasy within loci, can obscure gene tree affinities. Furthermore, retrotransposon analyses utilize a relatively simple and unambiguous parsimony approach (Ray et al., 2006), unlike sequence-based methods that require complex models of molecular evolution.
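The presence/absence logic can be sketched in a few lines of Python. The sketch assumes that each marker is scored per taxon as present (1), absent (0) or missing (None), and that a shared insertion is the derived state; the taxon labels and the function name are hypothetical and are not part of the analysis pipeline used here.

```python
def classify_marker(calls):
    """Summarise one retrotransposon marker scored across taxa.

    calls: dict mapping taxon -> 1 (insertion present), 0 (absent) or None (no data).
    Returns (carriers, informative), where carriers is the set of taxa sharing
    the insertion and informative is True when the marker supports a grouping.
    """
    scored = {taxon: call for taxon, call in calls.items() if call is not None}
    carriers = frozenset(taxon for taxon, call in scored.items() if call == 1)
    # An insertion carried by a single taxon, or by every scored taxon,
    # says nothing about relationships among the scored taxa.
    informative = 2 <= len(carriers) < len(scored)
    return carriers, informative
```

Counting how many informative markers share the same carrier set gives the raw support for each candidate clade, which is the quantity that the hypothesis tests discussed in this chapter operate on.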

Hypothesis testing with retrotransposons has typically assumed equal prior probability of identifying markers supporting each of the three bifurcating topologies that could be resolved from a phylogenetic trichotomy (Waddell et al., 2001). However, few reference genomes of closely related taxa are available for screening retrotransposon insertions, which results in an ascertainment bias (Nikaido et al., 2007). Specifically, markers that support any grouping that does not include a reference taxon are unlikely to be observed, such that the retrotransposon method is “blind” to some trees (Figure 11) (Nikaido et al., 2007, Kuritzin et al., 2016). This is a particularly critical problem for retrotransposon studies based on a single reference genome (e.g. Meyer et al., 2012, Zemann et al., 2013, Gallus et al., 2015a).

Avoiding an ascertainment bias requires n-1 reference genomes for each set of n taxa within the phylogeny of interest. Thus, with only four distantly related genomes published so far (opossum, tammar wallaby, Tasmanian devil and koala) (Mikkelsen et al., 2007, Renfree et al., 2011, Murchison et al., 2012, Johnson et al., 2014), marsupials (and most other taxa) will continue to be susceptible to the retrotransposon ascertainment bias for the foreseeable future. This highlights the importance of developing analytical methods to overcome genome scarcity, which will, in turn, allow researchers to confidently infer phylogenies from retrotransposon markers.

Kuritzin et al. (2016) provided the first step in accommodating the single reference genome ascertainment bias by amending the previously published P-value calculations of Waddell et al. (2001) for hypothesis testing with retrotransposons (Materials and Methods). To meet a significance level P = 0.05, a minimum of three unopposed markers (P = 0.0370) are sufficient when the ascertainment bias does not need to be considered. However, five unopposed binary markers are required (P = 0.0370) with the ascertainment bias. Kuritzin et al. (2016) suggested that when only a single reference genome is available, the ascertainment bias renders phylogenetic resolution impossible, due to the “blind” tree.
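The value quoted above for the case without ascertainment bias can be reproduced with a very small calculation. The sketch below assumes that, when three topologies are equally likely a priori, the probability of k unopposed markers all supporting one pre-specified topology is (1/3)^k; this form is an assumption chosen because it returns the quoted P = 0.0370 for three markers. The amended calculation of Kuritzin et al. (2016) for a single reference genome is not reproduced here, and the function names are hypothetical.

```python
def p_unopposed(k: int, n_topologies: int = 3) -> float:
    """Probability that k markers all support one pre-specified topology by chance,
    assuming every topology is equally likely to be supported by each marker."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if n_topologies < 2:
        raise ValueError("at least two competing topologies are required")
    return (1.0 / n_topologies) ** k


def min_unopposed_markers(alpha: float = 0.05, n_topologies: int = 3) -> int:
    """Smallest number of unopposed markers whose chance probability is <= alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    k = 0
    while p_unopposed(k, n_topologies) > alpha:
        k += 1
    return k
```

Under these assumptions `min_unopposed_markers()` returns three markers, matching the threshold stated in the text for the unbiased case.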

I present three arguments that can be used to reject the “blind” tree and confirm the phylogeny or alternatively, to show that additional reference genomes will be required.

These arguments include consideration of a priori evidence, discussed on a case by case basis, and two tests for whether the observed markers could be hemiplasic instead of reflecting species relationships. I define hemiplasy inclusively, to include its original usage for incomplete lineage sorting (ILS) (Robinson et al., 2008), as well as introgression, which often cannot be distinguished from ILS for individual loci (Phillips et al., 2013). The statistical tests are independent of any a priori evidence and allow us to infer whether the observed markers could derive from (i) ILS, which is expected to distribute markers symmetrically between the two non-species trees or (ii) introgression/hybridisation, which influences the ratio of observed markers on successive branches (Materials and Methods). I employ this statistical framework to account for the ascertainment bias that arises when relying on only the Macropus eugenii reference genome (Ferreri et al., 2011) to extract phylogenetically informative retrotransposons. I present the first retrotransposon-based phylogeny of kangaroos and wallabies (Family Macropodidae), and trace their adaptive diversification over the past 10 million years.

Figure 11. Retrotransposon ascertainment bias. Hypothetical example illustrating the “blind” tree scenario that occurs when only a single reference genome is available. Initial in silico screening of the reference genome (R) will have identified the markers for experimental screening across taxa, such that insertions will only be observable in clades that include taxon R. In T1 and T2, the reference genome (species R) groups with species B and C respectively. However T3 is referred to as the “blind” tree since any retrotransposon insertion supporting species B and C grouping together will be unobservable. Black circles represent observable retrotransposon markers supporting topologies (T1 and T2); the grey circle represents potential retrotransposon markers that are unobservable in this scenario.
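The symmetry expectation under ILS described above can be illustrated with a generic exact binomial check. This is a sketch under stated assumptions, not the test set out in the Materials and Methods: it assumes that markers supporting both conflicting topologies can be observed (which a single reference genome does not guarantee), and the function name and the two-sided convention of doubling the smaller tail are choices made here for illustration.

```python
from math import comb


def symmetry_p_value(n_conflict_1: int, n_conflict_2: int) -> float:
    """Two-sided exact binomial P-value for the null hypothesis that conflicting
    markers fall on the two alternative topologies with equal probability."""
    if n_conflict_1 < 0 or n_conflict_2 < 0:
        raise ValueError("marker counts must be non-negative")
    n = n_conflict_1 + n_conflict_2
    if n == 0:
        return 1.0  # no conflicting markers, nothing to test
    k = min(n_conflict_1, n_conflict_2)
    # Probability of a split at least as lopsided as the observed one, in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

A small P-value indicates a split more asymmetric than ILS alone would be expected to produce, which is the kind of pattern that would point towards introgression/hybridisation instead.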


2.3 MATERIALS AND METHODS

2.3.1 Taxon Sampling and species verification

Samples were obtained from sanctuaries, zoos, museums and from road kill specimens. In cases where road kill specimens were collected, species identification was performed by experts in the field. In addition, a number of samples utilized in this study were already available at the Queensland University of Technology and the Senckenberg Biodiversity and Climate Research Centre (Table 2). All recent KERV integrations were screened in a panel of six Macropus eugenii individuals to investigate the insertions at the population level. To verify species identifications, ~500 nt of the mitochondrial control region was amplified and sequenced using published mt primers (Fumagalli et al., 1997) and either compared to a bank of macropod mt DNA sequences available in the lab (Pers. Comm. Matthew Phillips) or BLASTed in Genbank.

Table 2. Taxon sampling for the 16 macropod species employed in the retrotransposon study.

Abbreviation  Species name                 Sub-genus     Common name
Meu           Macropus eugenii             Notamacropus  tammar wallaby
Mag           Macropus agilis              Notamacropus
Mpa           Macropus parma               Notamacropus
Mrufo         Macropus rufogriseus         Notamacropus  red-necked wallaby
Mir           Macropus irma                Notamacropus  black-gloved wallaby
Mru           Macropus rufus               Osphranter
Mro           Macropus robustus            Osphranter
Mgi           Macropus giganteus           Macropus      eastern-grey kangaroo
Mfu           Macropus fuliginosus         Macropus      western-grey kangaroo
Wbi           Wallabia bicolor                           swamp wallaby
Lco           Lagorchestes conspicillatus                spectacled hare-wallaby
Lhi           Lagorchestes hirsutus                      rufous hare-wallaby
Oun           Onychogalea unguifera                      northern nail-tail wallaby
Tth           Thylogale thetis                           red-necked pademelon
Lfa           Lagostrophus fasciatus                     banded hare-wallaby
Ptr           Potorous tridactylus                       long-nosed potoroo


2.3.2 Extraction of genomic regions containing retrotransposons

The Macropus eugenii genome assembly was used to extract single-copy introns containing retrotransposons of lengths 400 - 3,200 bp, which were masked using RepeatMasker, http://www.repeatmasker.org (Smit et al., 1996), with the Mammalia Repbase library. Different types of retrotransposons were selected for an initial screen. Based on a previous analysis of the Macropus eugenii genome, the youngest kangaroo SINE was found to be WALLSI2 (Nilsson et al., 2010b). Therefore WALLSI2 was selected along with LINE1, as well as an endogenous retrovirus (KERV). The in-silico screen identified few KERV sequences in intronic regions, most likely due to the highly fragmented nature of the Macropus eugenii reference genome (Renfree et al., 2011). The screen was therefore extended to intergenic regions, as has been done in previous studies of similar evolutionary depth (Meyer et al., 2012, Hormozdiari et al., 2013). The KERV consensus sequence (MERVK1_LTR) and associated sub-family sequences (MERVK1B_LTR, MERVK1C_LTR, MERVK1D_LTR) were taken from Repbase and queried against the Macropus eugenii genome using BLAT (Kent, 2002). Only hits with an identity score of 95-99% and length >500 nt were retained, which yielded 47,526 intergenic regions. Alignments of putative KERV LTR loci were created by extracting a region of 4 kb sequence flanking the KERV LTRs from the Macropus eugenii genome using samtools faidx (Li et al., 2009).

The resulting sequence alignments were repeat masked using either RepeatMasker (Smit et al., 1996) or CENSOR, http://www.girinst.org/censor/index.php (Kohany et al., 2006) to identify the position of repeat elements within the sequence and facilitate the design of primers.

2.3.3 Primer design

Primers were designed in the regions flanking the repetitive elements, approximately 250-400 nt on each side of the retrotransposon insertion (Table 3). Each primer pair was tested with the in-silico PCR program available on the UCSC genome browser for the Macropus eugenii genome to ensure that each amplified region is single copy, occurring only once in the genome. After the initial experimental screening using species from the three Macropus sub-genera and Wallabia, KERV was the only element to exhibit activity relevant to the phylogenetic question. The other screened retrotransposon types were not phylogenetically informative, as the insertions were present in all of the investigated species.

In addition, primers for 33 introns lacking repeat elements (‘empty’ introns) were obtained from a previous study (Gallus et al., 2015a) and screened in-silico using BLAT (Kent, 2002).

Introns that lacked retrotransposon elements in the Macropus eugenii genome were subsequently screened experimentally to detect novel insertions in the non-reference species.


Table 3. Primer list for phylogenetically informative markers, as well as for Macropus eugenii specific markers used in the heterozygous test.

Locus  Forward (5' - 3')  Reverse (5' - 3')

Phylogenetically informative markers
01K    GAAGACCTGGTAGTCTTAAGACAGTCAGG  GATGACACAGAAATTCAGATGGAGC
02K    GTGCACTTTAAAGTGCAGCAACAAC  GAGAAGTCATTGAAACTGGGTTAGTGG
03K    CTCCTCCAGCCAGCAGTCCTGAC  CAGTAAGAAATGGGACCCACTGATC
11K    CAACTGTGGTTATTTTGCATTCTTG  GATGGAGCTCAAACTCCTCCAGAGGGC
70K    CCAGTTCGGCATTTCCATAG  GGTCCCAGGAAAATCATTTG
71K    GTAATTCCAAAGGCTAGTCCTTC  CCTAGTCTGAGGCAGTCATG
76K    CTCTACCAGGATTGTCCCATG  CTATGGCTAGATCTTCCAAATATGC
78K    CTGTCTATTCTTGCAGATCCAATC  GATTGATCAGTTTCCAGTGTCTATTC
79K    GACAAGGAGAATGACTGTAGAACTG  CACTGTTCCCTGCTCAGTC
97K    GTTCACCTGGATGCCATG  GTTGCTTCATCATGTTTGTCTC
99K    CAGAATCCCTTACCACCTGTGAC  GGTCAGATTGAATTAATGGTAACAATG
100K   CATCTTTTGATGAAGTGTCTCAG  CTCTGGAATATAACAGTATGAAACTAG
101K   GGAAGACCTTGTAGAGTATGGTAAC  CACACTGATGTTACTCTGTTCC
102K   GGCTTTAATTACTAGGACATTC  GTAATCCTAGCTCAGGGTAAC
104K   GTTAACTCAATGACCAAAGGAATAAC  CCTAACTTTCACCATGAGGGTC
106K   GTATCACACAAGGTTAATAGAGATC  CCATGACATTATACTCTCACAGTC
107K   GCTTTCCAAAAAGTCGACCAG  GACAAGCCAGAATAGATTAATTTGA
122K   CTTGTCATTATCCTTACTCTTTCCTTC  GTGAGAAGGTCTTAAATCATGATC
123K   CAATGTGGTGGCAAGATAGTTG  CATTCCACATCCAGTTCTCTATC
124K   CCAATGAGTTCTGTCTTCTTATTATG  GTATCAGATTAATTCAATTCATCAAGAG
126K   GTATGGATATCTCTAAGTGTTCATAAATG  CCTGATTGATTAAACATATGTGCTG
127K   CAGATAGCAAGAAGCCTGG  GAGAAAGAATACATACAAATAGTGAACTCC
130K   GATACTGCCCTGGTTTGGTAAG  GATCCTCAGTCCAGGAGTCC
131K   CCATTATTTGGATTCTTCTGAGATTG  GTTCCAACCCACCTATTATAGCAG
132K   CAATTCCATACAATTATAGTTAACTTTAAG  GTAGAGATGATATCTTATTAGGTCCTTG
133K   GTAACAGATGTCAAGGCTACAGAGTTAG  CAATGCTTGTTATCTCATCCATGAG
135K   GATCCATATGAATTACTGAATTC  GGTAAAGGTTAATCAAATTCTG
136K   CAGTTCCTTCTTCCACTGTTAG  GTAAATTGGGATCAGATTGTG
137K   CCTTTGGACTGAGTCCAGTGTC  CTGCCAGGAGCACAATCATC

M. eugenii specific markers (heterozygous test)
04K    GTAGACTCCTTTGCTGAGCAGAGAATGA  GTAAGGAAATTTGGCACTTGCCTTTG
07K    CTCGAAATAATTAAAGTAATACGTGCAGG  GGAGACTAAGTTGGTTAGAGCAGGAGGC
08K    CCTATCTAGTGGCTGATGATTTCATTC  CAAGCTCTGAGATTTGGCAGAGTCCACC
68K    CATATAGTTCCTTATGAACTTTGCTTC  GCACAAAGTACATGTATGTGCAC
74K    GAACAAGTCCAATCCCTTTTCTG  CAAGACTAAAAGACAGTTTACCTGAGG
75K    CTTAATTTTCTGTCCTTGCCACTAC  CTTAGCATAGATCAGGTACAATGAAGC
77K    CAGAACTGATTAGGAATGGAATCC  GGAAAACAGGATGTTACACATAAGAG
84K    CTTGCTGTTATAGCTAATATTTCTGG  GCAATGAAGTAGTCCTACCAAAC
86K    CTATCAAAGCCATCCCTTCAG  CTGAAATCAGAACCATAAGACAGAG
88K    CTATCAGCTCTACATTGGTTGTCC  CATTACAGGTTTGCAGAAAGATC
89K    GTAGCATCATCAGACTTGTACTTTAGG  GCACAATGGATAAAACAGGAGTC
90K    GATCTATAGTAACTAACTGTGAGGACTG  GATATTCTACATGGAAAGTGGTCTATAC
92K    GCTTAGAAGCAAGTGCATTTC  CTGATACTGATGTGCTTTGGAG
93K    CCAACTGGACCTCTCCTTTG  GTAGTTTGCCACCAGAAATGG
94K    GTTTCTTCTCAATTAACAGACCTTG  CAGTTGAATGTAAGCTCCTTAAGG
95K    GTTGTGATATGGTTAGGCCATC  CCAATGCTGATTAATCACTCC
96K    CACAAAGGTTGATATCCTCTAATC  CATTGGAGCTTTGATAGAAGAT
103K   GATTGTCAGAGGGAAAGACAAG  CAGTGTTTCCTCTGGGATC
108K   GAATTAACAGGCCTCAGGAAGAC  GAAAACAGAAATCTCCAATCAGTG
109K   GACAGTCACTATTAATAGTTTTATTC  GTACCCAGAATGAGAATGAAC
110K   GTACTCATAAATGACAAGGAGATTAGC  CCTAAATACTTCTTTGGCAACTTTC
111K   GTACATTTTACTAAATATCTAGTAGCAG  GAAGTTAATGAGATGAATAGAATTC
113K   CTCCTAACTCATAGAATTACTCAAAG  GAGAATATTGTCTGGATGATATG
114K   GGAATATATTGCAGAGGTCG  GGAATGAATGTTGAGAATTG
115K   CTGTCAGTGATTTATGAGGACTG  CATAGAGCTACTACAATAAGAGAAGTAC
116K   CACTTTGAAGTCATTCACAATGAAG  GAAGGCAAAGGATCAGACATG
124K   CCAATGAGTTCTGTCTTCTTATTATG  GTATCAGATTAATTCAATTCATCAAGAG
140K   CTGAGAATGGCCAAACAG  GCCCTAAATCATTCCAGAAG
141K   GTGCTCTTTATGAAGAGTTGG  CGCACAGTTTGTAATCCTC


2.3.4 Experimental verification

DNA extractions of kangaroo tissue (for taxon sampling see Table 2) were performed using the phenol chloroform extraction method (Sambrook and Russell, 1989) or using a gDNA extraction kit (Promega) following the manufacturer’s instructions. Next, PCR screening was performed on the DNA extracts, including both a negative control (water) and a positive control (Macropus eugenii DNA). All PCRs were carried out in 12.5 µL reactions containing ~10 ng of template DNA and VWR master-mix following the manufacturer’s instructions, using touch-down PCR, decreasing the annealing temperature by 1 °C over the initial ten cycles, followed by 24 cycles at the lowest annealing temperature. PCR products were visualized on a 1% agarose gel. The PCR size pattern was used to determine if a particular locus was phylogenetically informative (Figure 12). All promising loci were verified by Sanger sequencing for the presence of KERV insertions. For problematic cases, the PCR products were ligated into the pDrive plasmid (Qiagen) and transformed into chemically competent Escherichia coli cells prior to sequencing. The resulting clones were PCR screened and positive colonies were sequenced. All resulting sequences were visually inspected and aligned in Se-Al 2.0 (Rambaut, 2002).

2.3.5 Scoring of presence and absence of insertions across the phylogeny

I followed common practices for establishing the presence and absence of KERV insertions among species to use them as phylogenetic markers (e.g. Suh et al. 2011; Meyer et al. 2012).

The selected markers were initially amplified in a smaller set of taxa, with one representative from each of the four lineages, to establish the general location of the insertion in the phylogenetic tree. Following this initial information, additional species were amplified until the branch on which the marker inserted was identified. Amplicons from ‘filled sites’ (insertion present) are generally ~400 nt larger than those from ‘empty sites’ (insertion absent), and are thus easy to distinguish using agarose gel electrophoresis (Figure 12).
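The size-based distinction between filled and empty sites can be illustrated with a short sketch. It is a simplification for illustration only: the function name, the ±100 nt tolerance and the example band sizes are hypothetical, and in this study scoring was done by eye from gels and confirmed by Sanger sequencing rather than by any automated rule.

```python
def score_band(observed_nt, empty_site_nt, insertion_nt=400, tolerance_nt=100):
    """Return '1' (filled site), '0' (empty site) or '?' (no band / ambiguous).

    insertion_nt reflects the ~400 nt size shift noted in the text;
    tolerance_nt is an illustrative assumption.
    """
    if observed_nt is None:  # no amplification product
        return "?"
    if abs(observed_nt - empty_site_nt) <= tolerance_nt:
        return "0"
    if abs(observed_nt - (empty_site_nt + insertion_nt)) <= tolerance_nt:
        return "1"
    return "?"


# Invented band sizes (nt) for one locus with an expected empty site of 600 nt.
lanes = {"Meu": 1000, "Wbi": 1020, "Mgi": 590, "Mro": None}
scores = {taxon: score_band(size, empty_site_nt=600) for taxon, size in lanes.items()}
print(scores)
```

An ambiguous band size is deliberately scored as '?' rather than forced into either state, mirroring the practice of resolving unclear cases by sequencing.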


[Figure 12: agarose gel photograph. Panel a lanes: Meu, Mag, Mpa, Mrufo, Mir, Wbi1, Wbi2, Mro, Mru, Mgi, Mfu. Panel b lanes: Meu, Mag, Mpa, Mrufo, Mir, Wbi1, Wbi2, Mru, Mro, Mgi, Mfu, Lco. Size marker: 1kb ladder with bands labelled 500, 1,000 and 1,500. Additional lane labels: loci 94K, 95K, 96K and 97K, each with lanes Meu, Wbi, Mgi and Mro.]

Figure 12. Agarose gel electrophoresis photograph of a typical PCR screening for presence/absence of transposable elements in various macropod species. a = Marker 97; b = Marker 130. Equal amounts of PCR product were loaded and run on a 1% agarose gel. Bands with higher molecular weight (higher on photo) indicate presence of the retrotransposon, while lower molecular weight bands indicate absence. For species abbreviations, see Table 2. Numbers followed by a ‘K’ indicate the locus being screened. The 3-letter abbreviations indicate the species: Meu = Macropus eugenii; Wbi = Wallabia bicolor; Mgi = Macropus giganteus; Mro = Macropus robustus. The band shifts, at each locus, indicate that Meu contains a repeat element that is absent from the other species. Subsequent confirmation is then carried out with Sanger sequencing followed by careful sequence analysis to verify the retrotransposon insertion.


Sanger sequencing was used to verify the KERV insertion sequence as well as target site duplications in the taxa. The marker was amplified in the relevant taxa, including species lacking the insertion (e.g. Suh et al. 2011), and the presence-absence information was scored as 1 (presence) or 0 (absence), combining the information from sequencing and agarose gel electrophoresis (e.g. Meyer et al. 2012, Mclain et al. 2012). For each analysed marker, sequences were produced, aligned and carefully analysed. For all markers, representatives from multiple taxa carrying the insertion and one or more taxa without the insertion were Sanger sequenced to verify the PCR patterns.

Using a single representative from each shallow monophyletic group established by independent molecular studies (e.g. M. giganteus or M. fuliginosus, and M. robustus or M. rufus) is common in phylogenetic studies based on retrotransposons (e.g. Nishihara et al. 2009, Churakov et al. 2009, Suh et al. 2011). It is standard practice that markers relevant to deep questions do not require full sampling among shallow clades (e.g. Suh et al. 2011; Mclain et al. 2012; Meyer et al. 2012; Platt et al. 2015). For example, within M. (Notamacropus), multiple sampling among these shallow taxa found no hemiplasy that extends back deeper than Wallabia. Similarly, for markers relevant to shallow questions, e.g. within M. (Notamacropus) and Wallabia, no hemiplasy was revealed by multiple sampling outside Macropus.


2.3.6 Parsimony reconstruction with retrotransposon markers

A maximum parsimony strict consensus tree on the full retrotransposon data was inferred in PAUP* 4.0b10 (Swofford 2002). Because support for any grouping that excludes the reference taxon, M. eugenii, is unobservable, unconstrained parsimony cannot resolve the tree. However, revealing the signal among the retrotransposons becomes possible by constraining undoubted groupings that do not include the reference taxon, and by placing a Dollo constraint on character state transitions (Le Quesne, 1974, Farris, 1977, Platt et al., 2015). The constrained clades were the two M. (Macropus), the two M. (Osphranter) and the two Lagorchestes. The reference taxon is at the tip of the tree and always has state “1”, which is the reverse of the standard usage of up-Dollo parsimony, where the outgroup is assumed to be “0”. As such, we use down-Dollo parsimony.

2.3.7 Calculation of retrotransposon phylogenetic support values

The sequence alignments and PCR gel-electrophoresis patterns were used to catalogue the presence/absence of phylogenetically informative markers (Table 4), which are plotted on the tree in Figure 16. My strategy was to establish the branch on which each insertion occurred, focusing on sampling Macropus (and Wallabia), and also to show that the marker is absent from at least two (ideally successively) deeper lineages. The only exceptions to this criterion are K136, which was therefore not employed for hypothesis testing (but could be included for parsimony analysis), and K107, which was absent only in the deepest macropod. Several markers were not tested in M. parma owing to limited DNA availability, but multiple other members of M. (Notamacropus) were tested. Wallabia and at least one of the closely related members of M. (Osphranter) and M. (Macropus) were sampled for all markers pertaining to relationships within Macropodinae.


Table 4. Presence/absence of phylogenetically informative ERVs. Informative markers are listed on the left using the convention of KXX for the primary retrotransposon markers and CX for the conflicting markers. Retrotransposon presence = 1; absence = 0; not amplified or sequenced = ?; * = amplification failure or sequencing problem. Deletion events are represented by ‘DEL’ for each species. See Table 2 for species abbreviations. Dark green shading indicates verification by sequencing; light green shading indicates PCR verification.

Column groups: M. (Notamacropus): Meu, Mag, Mpa, Mrufo, Mir; Wbi; M. (Osphranter): Mru, Mro; M. (Macropus): Mgi, Mfu; Outgroups: Lco, Lhi, Oun, Tth, Lfa, Ptr

Marker   Meu   Mag   Mpa   Mrufo Mir   Wbi   Mru   Mro   Mgi   Mfu   Lco   Lhi   Oun   Tth   Lfa   Ptr
K01      1     1     1     1     1     1     0     0     0     0     ?     0     ?     ?     ?     ?
K02      1     1     ?     *     1     1     *     1     0     0     ?     0     ?     0     0     0
K03      1     1     1     1     1     1     0     0     0     0     ?     0     ?     ?     ?     ?
K11      1     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     1     1     1     0     0
K70      1     ?     ?     ?     ?     1     ?     ?     ?     ?     ?     1     1     1     0     0
K71 C6   1     0     0     *     0     1     0     0     0     0     ?     0     ?     ?     ?     ?
K76      1     ?     ?     ?     ?     1     ?     1     1     ?     ?     0     0     *     *     *
K78 C2   1     1     ?     1     1     0     *     1     1     1     ?     0     0     ?     ?     ?
K79      1     ?     ?     ?     ?     1     ?     1     1     1     ?     0     0     0     *     0
K97      1     1     1     0     0     0     0     0     0     0     ?     ?     ?     ?     ?     ?
K99      1     1     ?     1     1     1     DEL   1     0     0     ?     *     0     ?     ?     ?
K100     1     ?     ?     ?     ?     1     ?     1     1     ?     ?     0     1     0     0     0
K101     1     1     ?     ?     ?     1     1     1     *     0     ?     0     ?     ?     ?     ?
K102     1     1     1     1     1     0     ?     0     0     ?     ?     0     ?     ?     ?     ?
K104     1     ?     ?     ?     ?     1     ?     1     1     ?     ?     1     1     1     0     0
K106 C1  1     1     1     1     0     1     ?     0     0     0     ?     ?     ?     ?     ?     ?
K107     1     ?     ?     ?     ?     1     ?     1     1     ?     ?     1     1     1     0     *
K122     1     1     1     1     1     1     ?     0     0     ?     ?     0     ?     ?     ?     ?
K123     1     1     ?     1     1     1     0     *     0     ?     ?     *     ?     ?     ?     ?
K124 C7  1     0     1     0     0     1     ?     0     0     ?     ?     0     ?     ?     ?     ?
K126 C4  1     1     1     1     1     0     0     0     1     1     ?     ?     0     0     ?     ?
K127 C8  1     0     0     0     1     1     ?     0     0     ?     ?     *     ?     ?     ?     ?
K130 C5  1     1     1     1     1     1     0     0     1     1     0     0     ?     0     ?     ?
K131     1     1     1     1     1     1     ?     0     0     ?     ?     0     ?     ?     ?     ?
K132     1     1     1     1     1     1     ?     0     0     ?     ?     ?     ?     ?     ?     ?
K133     1     1     ?     ?     ?     1     ?     1     1     1     ?     0     0     0     0     ?
K135     1     1     ?     1     1     0     0     0     0     ?     ?     ?     ?     ?     ?     ?
K136     1     1     ?     ?     1     1     ?     1     1     ?     ?     0     1     1     ?     ?
K137 C3  1     *     ?     1     1     0     ?     1     0     ?     ?     0     ?     ?     ?     ?


P-values for each branch were calculated for the retrotransposon data using binomial probability, following a statistical approach similar to that described by Kuritzin et al. (2016), which updates the statistics of Waddell et al. (2001) (Table 5). The P-values are based on the probability of random allocation of markers to the three alternative bifurcating tree hypotheses that can be resolved from a trichotomy. Consider Figure 11, where 2, 1 and 0 markers respectively support the three topologies: T1 ((R, B), C), T2 ((R, C), B), and T3 ((B, C), R), where taxon R is the reference genome. This gives a retrotransposon count of [2 1 0]. The exact cumulative binomial probability (PB) for T1 being supported by at least two of three markers, each with probability 1/3, is PB = 0.2593.

The ascertainment bias resulting from using a single reference genome requires further amendment, because markers supporting the clade that does not include the reference genome are not observable. Hence, we refer to T3 ((B, C), R) as the “blind” tree. The retrotransposon count reduces to [2 1 X], with X indicating unknown status. Hypothesis testing becomes binary, and [2 1 X] is among 2^3 = 8 permutations for three markers ([3 0 X], [0 3 X], and three permutations each for [2 1 X] and [1 2 X]). Half of these permutations ([3 0 X] and the three permutations for [2 1 X]) are at least as favourable for T1 as is the observed count [2 1 X]. Hence, when acknowledging that markers for the “blind” T3 are unobservable, the exact cumulative binomial probability for tree 1 being supported by at least two of three markers, each with probability ½, increases to PB = 0.5.
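The two calculations above can be reproduced with a few lines of Python. This is a minimal sketch of the cumulative binomial probability as described in the text, not the code behind the KKSC web tool; the function names are my own, and passing `t3=None` is simply a convention chosen here to mark the “blind” tree.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def kksc_p(t1, t2, t3=None):
    """Cumulative binomial P for T1 given marker counts [t1 t2 t3].

    t3=None marks the "blind" tree: its markers are unobservable, so the
    test is binary (p = 1/2) rather than three-way (p = 1/3).
    """
    if t3 is None:
        return binomial_tail(t1, t1 + t2, 1 / 2)
    return binomial_tail(t1, t1 + t2 + t3, 1 / 3)


print(round(kksc_p(2, 1, 0), 4))  # multi-directional count [2 1 0]
print(round(kksc_p(2, 1), 4))     # one-directional count [2 1 X]
```

The same function reproduces other counts, for example [5 1 X] or [6 2 0], by changing the arguments.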

The binomial probability calculations with all markers being observable are identical to Kuritzin et al.’s (2016) “multi-directional KKSC test” and binomial probability calculations with one clade being “blind” to insertions are identical to the “one-directional KKSC test”. Given this equivalence with cumulative binomial probability, we will refer to these binomial probability tests as KKSC (PB) tests.


Table 5. Cumulative P-values for testing prior tree hypothesis T1 on retrotransposon counts (Kuritzin et al., 2016), amended from Waddell et al. (2001). Additional P-values can be calculated at http://retrogenomics.uni-muenster.de:3838/KKSC_significance_test/

a For cases when all markers are observable (P).
b Revised values to accommodate the single reference genome bias (PB), excluding markers supporting T3.

Multi-directional searches: markers are equally identifiable for the three trees (columns 1-4)
One-directional searches: markers for the third tree are unidentifiable (columns 5-8)

Insertion  P        Insertion  P        Insertion  PB       Insertion  PB
count      value a  count      value a  count      value b  count      value b
[1 0 0]    0.3333   [5 1 0]    0.0178   [1 0 X]    0.5000   [5 1 X]    0.1094
[1 1 0]    0.5556   [5 2 0]    0.0453   [1 1 X]    0.7500   [5 2 X]    0.2266
[2 0 0]    0.1111   [6 0 0]    0.0014   [2 0 X]    0.2500   [6 0 X]    0.0156
[2 1 0]    0.2593   [6 1 0]    0.0069   [2 1 X]    0.5000   [6 1 X]    0.0625
[2 2 0]    0.4074   [6 2 0]    0.0197   [2 2 X]    0.6875   [6 2 X]    0.1445
[3 0 0]    0.0370   [7 0 0]    0.0005   [3 0 X]    0.1250   [7 0 X]    0.0078
[3 1 0]    0.1111   [7 1 0]    0.0026   [3 1 X]    0.3125   [7 1 X]    0.0352
[3 2 0]    0.2099   [7 2 0]    0.0083   [3 2 X]    0.5000   [7 2 X]    0.0898
[4 0 0]    0.0123   [8 0 0]    0.0002   [4 0 X]    0.0625   [8 0 X]    0.0039
[4 1 0]    0.0453   [8 1 0]    0.0010   [4 1 X]    0.1875   [8 1 X]    0.0195
[4 2 0]    0.1001   [8 2 0]    0.0034   [4 2 X]    0.3438   [8 2 X]    0.0547
[5 0 0]    0.0041   [9 0 0]    0.00005  [5 0 X]    0.0313   [9 0 X]    0.0020


2.3.8 Derivation of arguments for overcoming the single reference genome ascertainment bias

A significant KKSC (PB) test rejects the null hypothesis that there is no difference in support for the two observable trees. To reject the “blind” tree, two further hypotheses need to be rejected. These are, HILS: that the “blind” tree is the species tree and markers supporting the observable trees result from ILS, and HIntrogression: that the “blind” tree is the species tree and markers supporting the observable trees result from introgression/hybridization.

I employ an ILS symmetry argument (see Figure 13) to test HILS. Theory (Green et al., 2010, Durand et al., 2011, Kuritzin et al., 2016) and observed patterns (Scally et al., 2012, Doronina et al., 2015) show that ILS will distribute markers that conflict with the species tree roughly symmetrically among the two non-species tree alternatives (the two observable trees, if the “blind” tree is the species tree). The multi-directional KKSC-hybridization test of Kuritzin et al. (2016) (http://retrogenomics.uni-muenster.de:3838/KKSC_significance_test) tests whether ILS alone can explain the difference in the number of markers supporting the two clade hypotheses with the fewest insertions. My ILS test is a special case of the KKSC-hybridization test, in which the hypothesis specifies that the “blind” tree X is the species tree. Therefore X can take any value ≥ the number of markers supporting the favoured observable clades. Conveniently, under this condition the multi-directional KKSC-hybridization test (and ILS test) is independent of X.

In Figure 13 the “blind” tree, T3 (iii) is ((B, C), R), where R is the reference genome. Significant disparity in the number of markers supporting trees T1 (i) and T2 (ii) allows us to reject the hypothesis that the observed markers resulted from ILS, and hence, reject HILS. The ILS (and KKSC-hybridization) test is the same binomial probability test as for KKSC (PB), except PILS is two-tailed, because the disparity could be T1>T2 or T1<T2.
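A two-tailed version of the binary binomial test can be sketched as below. The function name is hypothetical, and the two-tailed value is obtained here by doubling the upper tail for the larger count (capped at 1), which is one standard choice for a symmetric binomial with p = 1/2; the text does not spell out the exact two-tailed convention used, so treat this as an assumption.

```python
from math import comb


def ils_symmetry_p(t1, t2):
    """Two-tailed binomial P for asymmetry between two observable trees.

    Under ILS alone, each conflicting marker is assumed equally likely to
    support T1 or T2 (p = 1/2).
    """
    n = t1 + t2
    k = max(t1, t2)  # the disparity may favour either tree
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * upper_tail)


print(ils_symmetry_p(6, 0))  # strong asymmetry
print(ils_symmetry_p(2, 1))  # little asymmetry
```

Because the test is symmetric, swapping the two arguments gives the same value.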


Figure 13. ILS symmetry argument. Hypothetical scenario illustrating the ILS symmetry test for accepting or rejecting the “blind” tree, under the assumption of ILS. There are three possible topologies (T1-T3) for clades R, B and C. Different numbers of retrotransposon markers are observed (black dots) or principally unobservable (grey dot), when genome data is only available for the reference genome, R. Generally, the species tree is expected to have the highest number of markers, while ILS is expected to distribute insertions approximately evenly between the two alternative (non-species tree) groupings. Under ILS, in (a) if T1 (i) has numerous markers, and T2 (ii) has few markers, then the “blind” T3 (iii) is also expected to have few markers (maintaining ILS symmetry), and T1 can be inferred to be the species tree. If few insertions occurred along the (R, B) stem lineage of the species tree, that topology may not be supported by significantly more insertions than ILS distributes to the non-species tree alternatives. Hence, in (b) if T1 (iv) and T2 (v) are supported by a similar number of markers, symmetry in the number of deep coalescences between the two non-species trees could arise in two ways. T3 (vi) could be the species tree and potentially supported by a larger number of unobserved markers (ILS symmetry between T1 and T2) or either T1 or T2 could be the species tree with a similar number of markers supporting (C, B) and ILS symmetry maintained between T2 and T3 or T1 and T3, respectively.


Rejecting HILS still leaves the possibility that the “blind” tree is the species tree, if the observable markers result from introgression/hybridization (HIntrogression). I have developed a test for hybridization/introgression that considers the number of markers identified on successive branches along the stem lineage of the reference taxon (see Figure 14). This “insertion ratio test” exploits the biological expectation that the proportion of insertions shared through introgression is governed by the proportion of the genome shared. If the “blind” tree is the species tree, these introgressed markers will instead appear to support a non-species tree that includes the reference taxon.

A hypothetical example of introgression is illustrated in Figure 14. I start by assuming the “blind” tree (i) is the species tree (grouping taxa B+C). The parameters α and β are the respective numbers of markers that inserted after and before a proportion (γ) of the reference genome (R) was shared and remains with taxon B. This genetic sharing favours the observed tree (ii), grouping the reference (R) with taxon B. The expected number of introgressed markers supporting this “incorrect” tree is n = βγ. The number of markers expected along the stem lineage of R is m = α + β(1-γ), where the term β(1-γ) is the number of markers along the lineage leading to R that inserted before the introgression event, but are in the portion of the genome not shared with taxon B.

The insertion ratio test rejects the hypothesis that the observed tree (Figure 14, iii) derives from introgression/hybridization if the number of markers (d) supporting (R+B) is significantly greater than introgression is expected to contribute, denoted n in (ii).

Unfortunately, we cannot know the true value of n (recall that n = βγ), because the numbers of markers (β and α) in (i) and the proportion of the genome shared (γ) are all unknown. However, we note that n is maximized when α=0 and γ is high, e.g. 0.5 would be an extreme value for γ. Then, knowing only the observed tree (iii), the maximum value of n (which we denote N) is the proportion of the genome shared (γ), multiplied by the number of markers potentially available in stem-R for sharing (d+e = 8). In this scenario N = γ(d+e) = 4, which, being a maximum value, provides a conservative insertion ratio test.


Figure 14. Insertion ratio argument. Hypothetical scenario illustrating the ‘insertion ratio test' for accepting or rejecting the “blind” tree under the assumption of introgression/hybridization. (i) Hypothetical “blind” tree with α and β representing insertions respectively occurring after and before an introgression event between the stem lineages of species R (reference genome) and species B. The proportion of the genome shared and retained between species R and B is γ. (ii) The expected tree under this hypothetical introgression scenario, with γ= 0.5. The expected number of insertions from introgression supporting (R+B) is n = β γ and the expected number of unshared insertions (and so, unique to R) is m = α + β(1-γ). (iii) The experimentally identified “observed tree” with d markers supporting R+B and e markers unique to the branch leading to the reference genome, R. If d is significantly higher than n under binomial probability, we reject the hypothesis that introgression/hybridization can explain the level of support for the observed tree. A maximum value for n can be estimated as N = γ(d+e).

The insertion ratio test is denoted PR50 when γ=0.5, and is expressed in the form (d,e) (see Figure 14, iii) as a standard cumulative binomial probability test, for which the number of “trials” (markers) is d+e, the number of “successes” (markers shared) is d, and the “expected probability of success” for each trial reduces to γ. Returning to the observable tree in Figure 14 (iii), the clade (R+B) is supported by d=6 markers and the next (shallower) clade that includes the reference taxon is supported by e=2 markers. With this observed insertion ratio count (6,2) in Figure 14 (iii), PR50=0.1445. Essentially, the observed support (d=6) is not significantly higher than expected under 50% introgression (N=4). Hence, we cannot reject HIntrogression: “blind” species tree with 50% introgression/hybridization contributing the markers supporting (R+B).

Setting the shared and retained proportion of the genome (γ) to 0.5 is an extreme scenario, such as for homoploid hybrid species derived purely from F1 hybrid ancestors. Thus, PR50 is very conservative. On both morphological and population genetic evidence there is only support for lower level introgression, even among closely related kangaroos (Neaves et al., 2010a, Phillips et al., 2013), and so we also present a more realistic scenario for the insertion ratio test, with γ=0.2 (PR20). In the Figure 14 (iii) example, this reduces N to 0.2×(6+2) = 1.6, and for (6,2), PR20=0.0012. If the ILS test and the insertion ratio test respectively reject HILS and HIntrogression, then at least some markers supporting the observed tree derive from shared species tree ancestry, and we can reject the “blind” tree.
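The insertion ratio test for the (6,2) example can be checked numerically with the sketch below. It implements the cumulative binomial test exactly as stated in the text (d+e trials, at least d successes, success probability γ); the function names are my own and are not taken from any published tool.

```python
from math import comb


def insertion_ratio_p(d, e, gamma):
    """P(X >= d) for X ~ Binomial(d + e, gamma): the insertion ratio test."""
    n = d + e
    return sum(
        comb(n, i) * gamma**i * (1 - gamma) ** (n - i) for i in range(d, n + 1)
    )


def max_introgressed(d, e, gamma):
    """N = gamma * (d + e): the maximum expected number of introgressed markers."""
    return gamma * (d + e)


d, e = 6, 2  # observed insertion ratio count from Figure 14 (iii)
print(max_introgressed(d, e, 0.5), round(insertion_ratio_p(d, e, 0.5), 4))  # PR50
print(max_introgressed(d, e, 0.2), round(insertion_ratio_p(d, e, 0.2), 4))  # PR20
```

Lowering γ from 0.5 to 0.2 shrinks the number of markers that introgression could plausibly contribute, which is why the same (6,2) count becomes significant under PR20.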

Ethics statement: All specimens were already deceased at the time of sampling (ie. , animal shelters, wildlife sanctuaries and zoos). Tissue use is covered by QUT ethics approval confirmation number 1400000559. No further experimental ethics requirements are necessary, given the nature of this research.


2.3.9 Conservatism of the insertion ratio test for H2-Introgression

The insertion ratio test considers the hypothesis that the “blind” tree is the species tree, and markers supporting the observable trees result from introgression/hybridization. In Figure 15 this is illustrated using a value of α=2, as shown. However, if we consider a case in which the value of α was 0, then the proportion of markers, n/(n+m)=0.5, will be exactly the same as the value of γ (the proportion of the genome shared between the reference taxon R and the non-reference taxon B). If instead α=2, as depicted, then n/(n+m)=0.375, which is below the true value of γ, and thus makes for a conservative test for introgression explaining the apparent support for the R+B grouping. Notably, in (b) the direction of sharing is opposite, being B to R. In this extreme case, in which the rate of insertion along B is twice as fast as the rate of insertion along R, n/(n+m)=0.5, the true value of γ. Even in this case the duration of R2R. Thus, for the insertion ratio test to be overconfident, i.e. where n/(n+m) > γ, introgression would have to be from B to R, with retrotransposition at more than twice the rate in B as in R. Such a situation may become more likely as retrotransposition patterns diverge over longer divergences between taxa, although the probability of introgression will also decrease with divergence.
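The arithmetic for scenario (a) can be sketched using the definitions n = βγ and m = α + β(1-γ) given earlier. The value β = 6 is an assumption made here for illustration: it is the value that, together with γ = 0.5 and α = 2, reproduces the quoted proportion of 0.375; the figure itself is not reproduced in this text. The function name is hypothetical.

```python
def introgressed_fraction(alpha, beta, gamma):
    """Return n / (n + m), with n = beta*gamma and m = alpha + beta*(1 - gamma)."""
    n = beta * gamma                # markers shared through introgression
    m = alpha + beta * (1 - gamma)  # markers remaining unique to lineage R
    return n / (n + m)


gamma = 0.5
beta = 6  # assumed value, see the note above
print(introgressed_fraction(0, beta, gamma))  # alpha = 0: equals gamma
print(introgressed_fraction(2, beta, gamma))  # alpha = 2: falls below gamma
```

For any β > 0, setting α = 0 returns γ exactly, and any α > 0 pulls the proportion below γ, which is the sense in which the test is conservative.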

As well as the potential for insertion rates to vary between sister lineages, they may also vary between successive branches along the stem lineage leading from the reference taxon. In theory it would be possible for a difference in insertion rates along a lineage to lead the insertion ratio test to falsely reject the hypothesis that the observed support derives from introgression, i.e. falsely rejecting the “blind” tree. The first relevant point here is that this null hypothesis involves insertions noted on successive branches (d and e) on the observed tree, e.g. Figure 14 (iii) that are in fact from the same branch, the R lineage on Figure 14 (i).

Thus, the appearance of these R lineage insertions being on two branches of the observed tree is an artefact of some being shared and some not.


Figure 15. Conservatism of the insertion ratio test. Hypothetical scenario illustrating the conservative nature of the ‘insertion ratio argument’ for accepting or rejecting the “blind” tree under the assumption of introgression/hybridization. α = insertions present after the introgression event that are specific to lineage R; β = insertions present before the introgression event in lineage R; γ = the proportion of the genome shared between the two lineages after introgression. (a) Represents introgression from the reference genome, R, into lineage B. (b) Represents the reverse scenario in which introgression occurs from lineage B into the reference genome, R, when the insertion rate in B is twice that of R, and no insertions occurred after the introgression event (i.e. α=0). For the insertion ratio test to be overconfident, n/(n+m) must be greater than γ. However, even in the extreme scenario of (b), the test yields a conservative outcome (i.e. n/(n+m) is equal to or less than γ).


Nevertheless, there is the possibility of rate variation along the R lineage itself, before and after the introgression event (insertions labelled β and α on Figure 14 (i)). However, the insertion ratio test makes the conservative assumption that all of the markers d and e in Figure 14 (iii) were present at the time of the introgression event. This is equivalent to assuming that the insertion rate is zero along lineage R following the introgression event. Allowing for any positive value of that rate will only make the test more conservative. Therefore, insertion rate differences along lineage R before and after the introgression may reduce the power of the test, but will not promote false rejection of the “blind” tree.

2.3.10 Phylogenetic analysis of KERV sub-families

The nucleotide sequences from the 83 experimentally screened KERV solo-LTRs were extracted and used to build an alignment. The four KERV solo-LTR consensus sequences obtained from Repbase (MERVK1_LTR, MERVK1B_LTR, MERVK1C_LTR and MERVK1D_LTR) were included in the alignment in order to identify the clades to which the experimentally screened loci belonged. The data set was aligned in MUSCLE (Edgar, 2004), and poorly aligned regions were subsequently removed using Gblocks (Castresana, 2000). The resulting alignment was used to construct a Bayesian inference tree using MrBayes 3.3.6 (Ronquist and Huelsenbeck, 2003) under a HKY-GAMMA model, as favoured by AIC within ModelTest 3.7 (Posada and Crandall, 1998). The presence/absence information from the experimental screen was combined with the phylogenetic ERV tree to gain a deeper understanding of when the different LTRs were active.

2.3.11 Recent integrations of KERV in the tammar wallaby genome

All autapomorphic insertions were screened in a panel of six Macropus eugenii individuals (five of which were obtained from CSIRO (Canberra)) to investigate the insertions at the population level.


Heterozygous insertions occur when one chromosome in an individual contains the insertion while the other lacks it. I therefore performed PCR amplification followed by agarose gel electrophoresis to detect heterozygous insertions, which appear as double bands when visualized on an agarose gel.

2.3.12 Investigation of LINE1 activity in the Macropus genome

The genome of Macropus robustus was experimentally screened for intact LINE1 ORF2 reading frames. In mammalian LINE1 sequences, ORF2 encodes the endonuclease and reverse transcriptase necessary to perform retrotransposition. A published protocol (Cantrell et al., 2000) was modified (Gallus et al., 2016) to identify intact LINE1 ORF2 sequences. Primer sequences consisting of forward: CTCTTTGCAGATGATATGATG and reverse: ACCTARTMTATTCCACTGATG, located at position 4,060 – 4,674 of L1-1_ME (the youngest LINE1 in the kangaroo genome), were used to amplify this region from genomic DNA. The resulting 614 nt PCR product was purified, cloned into a Topo-TA vector (Invitrogen) and transformed into TOP10 cells. One hundred random colonies were picked and used for colony PCRs using the primers M13F and M13R, and the resulting clones with inserts were sequenced. The sequences were screened in Geneious (Biomatters) to identify clones for which the entire sequence could be translated without stop codons or indels disrupting the reading frame.
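The intactness criterion used in this screen (no stop codons and no frame-disrupting indels) can be illustrated with a minimal Python sketch. This is not the Geneious procedure itself. The function names, the `frame` parameter and the use of a reference length to detect frameshifting indels are assumptions made for illustration, and the reading frame of the amplicon relative to ORF2 is not specified here and would need to be supplied.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate a nucleotide sequence; unknown codons become 'X'."""
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def is_intact(seq, reference_length, frame=0):
    """Return True if the clone keeps the frame and has no stop codon.

    reference_length is the expected amplicon length after primer
    removal (a hypothetical input); a length difference that is not a
    multiple of three is treated as a frameshifting indel.
    """
    if (len(seq) - reference_length) % 3 != 0:
        return False
    return "*" not in translate(seq, frame)


if __name__ == "__main__":
    print(translate("ATGGCCTGA"))
    print(is_intact("ATGGCCAAA", reference_length=9))
```

Comparing clone length against a reference only detects a net frameshift; an alignment-based check would be needed to find compensating indels within a clone.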

The within-group mean nucleotide distance was estimated after primer removal with MEGA6 (Tamura et al., 2013). Data from two species with known LINE1 retrotranspositional activity, the opossum (Monodelphis domestica) and human (Homo sapiens), were generated using the same protocol and used as controls (Gallus et al., 2016).
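As an illustration of what a within-group mean distance measures, the following Python sketch averages the pairwise proportion of differing sites over all pairs of aligned sequences. The substitution model and gap treatment used in MEGA6 are not stated here, so the uncorrected p-distance with pairwise deletion of gaps and ambiguous sites is an assumption, and the function names are illustrative.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites among columns where both
    aligned sequences carry an unambiguous base (A, C, G or T)."""
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_within_group_distance(seqs):
    """Mean p-distance over all unordered pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACCA"]  # toy data, not thesis data
    print(mean_within_group_distance(toy_alignment))
```

A high mean distance among clones is expected when most copies are old and have diverged independently, whereas recently active elements yield many near-identical copies.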


2.4 RESULTS

2.4.1 Activity of an endogenous retrovirus during the evolution of Macropus

An exhaustive in silico screen of the reference genome, Macropus eugenii, identified three prevalent retrotransposon types: LINE (LINE1), SINE (WALLSI2) and ERV (KERV-1, deposited in Repbase as MERVK1) elements. Thirty-eight LINE1 and seven WALLSI2 loci were screened experimentally, but these elements showed no phylogenetic activity for the investigated branches, as the markers were present in all tested species. I experimentally screened 83 KERV-1 (MERVK1) loci across 16 macropodiform species, covering the major clades over ~25 million years of evolution, and identified 29 phylogenetically informative solo-LTRs (Figure 16 and Table 6). Each KERV-1 insertion is flanked by 5-6 nt long target site duplications without a common motif (Table 6). Many of the examined ERVs appear to have diagnostic sequence changes (e.g. 40 nt deletions) separating them from previously published KERV-1 elements, suggesting that the number of currently described KERV-1 sub-families has been underestimated. Additional screening was carried out in six Macropus eugenii individuals to investigate polymorphic retrotransposon markers, which would suggest recent or ongoing retrotransposition. To reduce the single reference genome ascertainment bias, 33 introns that lacked retrotransposon insertions in Macropus eugenii were also experimentally screened in additional species, but yielded no novel markers. The 29 phylogenetically informative KERV-1 markers are shown in Figure 16. Eight of these markers (~28%) phylogenetically conflict with the majority, and are designated as C1-C8 (Figure 16). For each relevant node I note the insertion pattern count (Table 7).


Figure 16. Kangaroo and wallaby maximum parsimony phylogeny inferred from retrotransposon data. Dark red circles represent retrotransposon markers. Well-established clades for which there is no retrotransposon information (shown here without red dots) were then obtained from phylogenetic analyses of both nuclear and mitochondrial DNA sequences (Meredith et al., 2008b, Phillips et al., 2013, Mitchell et al., 2014). ML support values shown on branches that also contain retrotransposon data are obtained from an independent analysis (Meredith et al., 2008b) that utilized nuclear genes, for direct comparison with the retrotransposon markers. Shaded rectangles represent the genus Wallabia (light blue) and the Macropus subgenera (orange). Coloured vertical bars (C1 – C8) each represent a retrotransposon marker that conflicts with the majority, and indicate alternative groupings of taxa (e.g. conflict bar C1 supports a grouping of M. eugenii + M. agilis + M. parma + M. rufogriseus + W. bicolor to the exclusion of M. irma and other macropods). Identified presences and absences are respectively denoted (+) and (–). Outgroup species are Onychogalea unguifera, Lagorchestes hirsutus, Thylogale thetis and Lagostrophus fasciatus. Marker indexes refer to Table 7, and are numbered i) to viii), with iii) and iv) both summing markers along two branches. The location of the markers is based on Sanger sequencing and scoring filled and empty sites from PCR amplification patterns. Kangaroo images by Jon Baldur Hlidberg.


Table 6. Retrotransposons and target site duplications (TSDs). TSD sequences for each of the phylogenetically informative KERVs are listed, as well as the sub-family to which each informative KERV belongs. KERVs that have integrated into other TEs have been indicated. Other types of TEs present in the sequence are listed.

Marker | Conflict marker | Sub-family of informative KERV | TSD sequence | Other TE types present in amplicon | KERV integrated in other TE
K01 | - | MERVK1C_LTR | GAAATC | CHARLIE4 (DNA transposon) - present in all species | YES (integrated in CHARLIE4)
K02 | - | MERVK1B_LTR | CAGGAA | MIR3, ERV17_MD - present in all species | NO
K03 | - | MERVK1B_LTR | GAAACT | MIR - present in all species | YES (integrated in MIR)
K11 | - | MERVK1C_LTR | CACAGA | none | NO
K70 | - | MERVK1C_LTR | TATTTC | MERVK1_I - present in all species | NO
K71 | C6 | MERVK1B_LTR | CTTAAG | L3 - present in all species | YES (integrated in L3)
K76 | - | MERVK1C_LTR | GAACCC | none | NO
K78 | C2 | MERVK1B_LTR | ATCCTC | none | NO
K79 | - | MERVK1B_LTR | ATAGCC | L2B_ME - present in all species | NO
K97 | - | MERVK1B_LTR | GGTAAG | L3, MAR1_MD - present in all species | YES (integrated in L3)
K99 | - | MERVK1C_LTR | GAAACT | MIR3 - present in all species | NO
K100 | - | MERVK1C_LTR | GAAATC | WALLSI4 - present in all species | NO
K101 | - | MERVK1C_LTR | GGGTAG | L2B_ME - present in all species | NO
K102 | - | MERVK1C_LTR | GGAAAG | MIR3A_MarsA, L3_ME - present in all species | NO
K104 | - | MERVK1C_LTR | GGGGC | CHARLIE4 (DNA transposon) - present in all species | NO
K106 | C1 | MERVK1C_LTR | TATGTC | L2-2 - present in all species | YES (integrated in L2)
K107 | - | MERVK1C_LTR | TATCAG | L1-3_ME - present in Lagostrophus only | NO
K122 | - | MERVK1B_LTR | AAGACT | L2-2, SINE-2_MD - present in all species | NO
K123 | - | MERVK1B_LTR | ATTATC | MIRc - present in all species | NO
K124 | C7 | MERVK1B_LTR | CAACAC | BovB_Ma, RTE-3_ME | YES (integrated in RTE-3_ME)
K126 | C4 | MERVK1B_LTR | GATTCC | none | NO
K127 | C8 | MERVK1B_LTR | GGCAGG | MIR - present in all species | YES (integrated in MIR)
K130 | C5 | MERVK1B_LTR | CTGACC | none | NO
K131 | - | MERVK1B_LTR | TCTAT | RTE0_Mars - present in all species | NO
K132 | - | MERVK1B_LTR | TATCAG | LTR4_ME - present in all species | NO
K133 | - | MERVK1B_LTR | AGTATG | WRETRO - present in all species | NO
K135 | - | MERVK1B_LTR | CAACAC | L1_Mars1b_3end - present in all species | YES (integrated in L1)
K136 | - | MERVK1C_LTR | TTTTGC | L2-2_ME - present in all species | NO
K137 | C3 | MERVK1B_LTR | GTTATT | MIR, MIR3, MIR3_MarsA - present in all species | NO


Table 7. Trifurcation results for each of the major nodes investigated in this study, with P-values reported for the binomial probability KKSC (PB) test, as well as for the insertion ratio and ILS symmetry tests (when PB≤0.1). Wall= Wallabia; Nota= M. (Notamacropus); Osph= M. (Osphranter); Mac= M. (Macropus); Ony= Onychogalea; Lag= Lagorchestes; Lagost= Lagostrophus; Thy= Thylogale; M= Macropus; c-Nota= core members of M. (Notamacropus), which excludes M. irma.

Trifurcation | Insertion pattern | Topology | PB (KKSC test) | PILS (ILS test) | Ratio pattern | PR50 | PR20
i) M.irma, c-Nota, Wall | 2 | (M.irma, Nota), Wall | 0.5 | - | - | - | -
 | 1 (C1) | (Nota, Wall), M.irma | | | | |
 | X (blind) | (M.irma, Wall), Nota | | | | |
ii) Wall, Nota, Osph | 6 | (Wall, Nota), Osph | 0.0625 | 0.125 | (6,2) | 0.1445 | 0.0012
 | 1 (C3) | (Osph, Nota), Wall | | | | |
 | X (blind) | (Wall, Osph), Nota | | | | |
iii) M.irma, Nota, Osph (regardless of Wall) | 8 (i) + (ii) | (M.irma, Nota), Osph | 0.0039 | 0.0078 | (8,0) | 0.0039 | <0.0001
 | 0 | (Nota, Osph), M.irma | | | | |
 | X (blind) | (M.irma, Osph), Nota | | | | |
iv) Wall, Nota, Mac/Osph | 9 (ii) + (v) | Macropus paraphyly | 0.0107 | 0.0215 | (9,2) | 0.0327 | <0.0001
 | 1 (C2) | Macropus monophyly | | | | |
v) (Nota+Wall), Osph, Mac | 3 | ((Nota+Wall), Osph), Mac | 0.3125 | - | - | - | -
 | 1 (C5) | ((Nota+Wall), Mac), Osph | | | | |
 | X (blind) | (Mac, Osph), (Nota+Wall) | | | | |
vi) Nota+Osph+Wall, Mac, Ony | 3 | ((Nota+Osph+Wall)+Mac), Ony | 0.125 | - | - | - | -
 | 0 | ((Nota+Osph+Wall)+Ony), Mac | | | | |
 | X (blind) | (Mac+Ony), (Nota+Osph+Wall) | | | | |
vii) Ony, Lag, (Wall+M) | 1 | ((Wall+M), Ony), Lag | 0.5 | - | - | - | -
 | 0 | ((Wall+M), Lag), Ony | | | | |
 | X (blind) | (Ony, Lag), (Wall+M) | | | | |
viii) Lagostrophus, Thy, (Ony+Lag+Wall+M) | 4 | ((Ony+Lag+Wall+M), Thy), Lagost | 0.0625 | 0.1250 | (4,0) | 0.0625 | 0.0016
 | 0 | ((Ony+Lag+Wall+M), Lagost), Thy | | | | |
 | X (blind) | (Thy, Lagost), (Ony+Lag+Wall+M) | | | | |

2.4.2 Wallabia bicolor is nested within the paraphyletic genus Macropus

Grouping Wallabia and the subgenus M. (Notamacropus), to the exclusion of M. (Osphranter), is supported by six shared retrotransposon markers (Figure 16, Table 7, ii). A single conflicting marker, C3 (K137), places Wallabia outside of M. (Notamacropus)/M. (Osphranter). Overall, these retrotransposon markers provide moderately strong support for Wallabia grouping with M. (Notamacropus), [6 1 X]. My statistical testing provides PB=0.0625 and PILS=0.125. The insertion ratio test for this Wallabia/M. (Notamacropus) clade (6,2) gives PR50=0.1445, PR20=0.0012 (Table 7, ii).
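The PB and PILS values quoted in this section can be reproduced with a short calculation. The Python sketch below assumes that PB is the one-sided binomial probability of observing at least n supporting markers among n + m unambiguous markers when the two observable trees are equally likely, and that PILS is the corresponding two-sided probability. These assumptions match the values in Table 7 but are not taken from the KKSC software itself, and the function names are illustrative.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def p_b(n_support, n_conflict):
    """One-sided probability of a split at least as skewed as
    [n_support n_conflict X] when both observable trees are
    equally likely (assumed form of the PB statistic)."""
    return binomial_tail(n_support, n_support + n_conflict, 0.5)


def p_ils(n_support, n_conflict):
    """Two-sided version, used here as the ILS symmetry probability
    (assumed to be twice PB, capped at 1)."""
    return min(1.0, 2 * p_b(n_support, n_conflict))


if __name__ == "__main__":
    for n, m in [(6, 1), (8, 0), (9, 1), (11, 2)]:
        print(f"[{n} {m} X]  PB={p_b(n, m):.4f}  PILS={p_ils(n, m):.4f}")
```

Under these assumptions the [6 1 X] pattern gives PB=0.0625 and PILS=0.125, as reported in Table 7, ii.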

Strong rejection of the 20% introgression/hybridisation hypothesis can be explained as follows. If 20% of the genome is shared and retained, the probability of any one marker in stem-Notamacropus being shared with Wallabia is 0.2. So among the maximum of 8 markers (6 and 2, respectively from clades ii and i in Figure 16) that are shared by all members of M. (Notamacropus), we would expect on average N = 0.2 × 8 = 1.6 to be shared by introgression with stem-Wallabia, and appear as support for grouping Wallabia with M. (Notamacropus). The remaining 6.4 markers would be expected to appear as support exclusively for M. (Notamacropus). The observed support is the reverse, with six markers supporting Wallabia/M. (Notamacropus) and only two markers supporting M. (Notamacropus).
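The expectation N and the PR values for the (6,2) pattern can be checked numerically. The Python sketch below assumes that each of the n + m markers is shared through introgression independently with probability γ (the shared genome fraction), so that PR is the binomial probability of observing n or more shared markers. This reading reproduces the PR50 and PR20 values in Table 7; the function names are hypothetical.

```python
from math import comb


def expected_shared(gamma, total_markers):
    """Expected number of markers shared through introgression."""
    return gamma * total_markers


def insertion_ratio_p(n_shared, n_exclusive, gamma):
    """P(at least n_shared of the n_shared + n_exclusive markers are
    shared) when each marker is shared with probability gamma."""
    total = n_shared + n_exclusive
    return sum(
        comb(total, i) * gamma**i * (1 - gamma) ** (total - i)
        for i in range(n_shared, total + 1)
    )


if __name__ == "__main__":
    print(expected_shared(0.2, 8))            # N for 20% sharing
    print(insertion_ratio_p(6, 2, 0.5))       # PR50 for (6,2)
    print(insertion_ratio_p(6, 2, 0.2))       # PR20 for (6,2)
```

With γ = 0.2, observing six or more shared markers out of eight is very improbable, which is why the 20% introgression hypothesis is rejected while the 50% hypothesis is not.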

Note that the value of N for gene flow from the clade including the reference genome (R) is robust to variation in retrotransposition rates among lineages. For gene flow in the opposite direction, to R, the calculation of N assumes equal rates of retrotransposition before the gene sharing event, between the lineage leading to R (e.g. M. (Notamacropus)) and the lineage leading to its hypothesised sister taxon (e.g. Wallabia). The continuity of the KERV genetic divergence profile for Macropus eugenii (Ferreri et al., 2011) suggests that retrotransposition rates remain similar over the short timeframes that would cover the critical periods since species diverged, in which large-scale hybridisation/introgression remains likely. However, the rate variation of retrotransposition for KERV-1 is not well described. Nevertheless, PR values would only be overconfident in the scenario that gene flow was from Wallabia to M. (Notamacropus) and retrotransposition was more than twice as fast in stem-Wallabia as in stem-Notamacropus (as described hypothetically in Figure 15). Theoretically, rate differences could become more of a concern as retrotransposition patterns diverge among taxa that are more divergent, although the probability of introgression will also decrease with divergence. Otherwise, PR values will typically be conservative for the given gene flow percentage. The effect of insertion rate variation is further discussed in the section “Conservatism of the insertion ratio test”.

For assigning markers to the statistical tests I follow the usual practice of only including unambiguous patterns (Kuritzin et al., 2016), that is, patterns that fit one of the three trees within the trifurcation of interest without necessitating hemiplasy or homoplasy. Ambiguous insertion patterns (or multilevel conflicts) (Kuritzin et al., 2016) are prevalent around the base of M. (Notamacropus), most likely due to short time intervals between divergences allowing complex patterns of ILS and introgression. Hemiplasy (or homoplasy) across multiple internal branches is required to explain ambiguous patterns, such as C1 (K106), which could place Wallabia with M. (Notamacropus), but excludes Macropus irma.


Including ambiguous patterns would add five further markers (C1, C5-8) in support of grouping Wallabia bicolor with members of M. (Notamacropus). One additional insertion, C2 (K78), instead excludes Wallabia from M. (Notamacropus)/M. (Osphranter), but is also present in M. (Macropus). These ambiguous markers increase statistical support for placing Wallabia with M. (Notamacropus) [11 2 X], PB=0.0112. However, as Kuritzin et al. (2016) point out, hemiplasy across multiple internal branches contravenes the assumptions of current statistical tests for retrotransposons.

Looking deeper at the affinities of Wallabia, three additional markers are shared with both M. (Notamacropus) and M. (Osphranter), [3 1 X], PB=0.3125 (Table 7, v), such that, cumulatively, nine markers unambiguously support Wallabia bicolor falling within a paraphyletic Macropus [9 1 X] PB=0.0107, PILS=0.0215, (9,2) PR50=0.0327, PR20<0.0001 (Table 7, iv). One additional marker (C5) also favours placing Wallabia bicolor within a paraphyletic Macropus, but is ambiguous on the tree, because it is M. (Osphranter) rather than M. (Macropus) that lacks the marker and is therefore excluded. Overall, the findings provide strong evidence that the monotypic swamp wallaby (Wallabia bicolor) is a member of Macropus, and not that clade’s sister taxon.

2.4.3 Macropus irma groups with the M. (Notamacropus) wallabies

Placement of Macropus irma with the other members of M. (Notamacropus) to the exclusion of Wallabia is supported by two markers, with one conflicting retrotransposon marker, C1 (K106) (Figure 16), that excludes Macropus irma from M. (Notamacropus)/Wallabia, [2 1 X], PB=0.5 (Table 7, i). Thus, retrotransposon insertion markers alone provide only weak support for the monophyly of M. (Notamacropus). Setting aside the question of Wallabia, however, and focusing only on the position of Macropus irma among the three Macropus subgenera, a total of eight (Figure 16) unopposed markers clarify the placement of Macropus irma with the core members of M. (Notamacropus), [8 0 X], PB=0.0039. The ILS symmetry and insertion ratio tests strongly reject ILS and introgression/hybridization contributing the eight markers that support Macropus irma grouping with the core members of M. (Notamacropus), PILS=0.0078 and (8,0) PR50=0.0039, PR20<0.0001 (Table 7, iii). Here, the ILS symmetry test tells us that if the “blind” grouping of Macropus irma with M. (Osphranter) was the true species relationship, the 8 versus 0 asymmetry for markers supporting the two alternative, observable tree patterns is highly unlikely to result from ILS. The eight markers that support Macropus irma grouping with the core members of M. (Notamacropus) are also unlikely to be derived from hybridization/introgression between Macropus irma and M. (Notamacropus). This is because we would expect a similar or greater number of markers shared by just the core members of M. (Notamacropus), derived from the portion of the genome not shared with Macropus irma; however, there are none (Table 7, iii).

2.4.4 Deeper Macropodine phylogeny

I identified conflicting phylogenetic signal at the base of Macropus. Three retrotransposon markers (Figure 16, Table 7, v) support grouping M. (Notamacropus), Wallabia and M. (Osphranter) to the exclusion of M. (Macropus). However, one marker (C5) supports grouping M. (Macropus) with M. (Notamacropus) and Wallabia ([3 1 X] PB=0.3125). Three markers support the monophyly of Macropus plus Wallabia [3 0 X] PB=0.125 (Table 7, vi). A further two retrotransposon markers are shared between Macropus, Wallabia and Onychogalea, to the exclusion of Lagorchestes, although one of these is ambiguous, also being shared with the deeper Thylogale, hence [1 0 X], PB=0.5 (Table 7, vii). Finally, four markers group macropodines to the exclusion of Lagostrophus fasciatus (the banded hare wallaby) [4 0 X] PB=0.0625, PILS=0.1250, (4,0) PR50=0.0625, PR20=0.0016 (Table 7, viii).


2.4.5 Macropus sub-genera

I identified conflicting phylogenetic signal among the three Macropus subgenera. Three retrotransposon insertions support the grouping of M. (Notamacropus), Wallabia and M. (Osphranter) to the exclusion of M. (Macropus). However, one marker (C5) supports grouping M. (Macropus) with M. (Notamacropus) and Wallabia ([3 1 X] PB=0.3125).

2.4.6 Recent integrations of KERV in the tammar wallaby genome

From the 83 KERVs that were screened, 29 were found to be autapomorphic insertions in the Macropus eugenii genome. After screening the 29 loci in a set of six Macropus eugenii individuals, the KERV insertions were found to be present in all individuals, which in turn provided a glimpse into the generation of retroposon hemiplasy within species. Four loci were found to be heterozygous across all individuals for the presence of the KERV, 24 loci were homozygous for presence, and one locus was heterozygous or homozygous, depending on the individual (Table 8). The majority of the heterozygous markers belonged to a particular clade of MERVK_LTR (Figure 17).


Figure 17. Bayesian inference phylogeny, for the different sub-families of the endogenous retrovirus, MERV, based on long terminal repeats (LTRs). The blue shaded region contains the majority of the heterozygous insertions and thus represents a potentially young clade that may have arisen through a recent KERV expansion in the tammar wallaby genome. Labels coloured red indicate KERV consensus sequences obtained from Repbase.


Table 8. Heterozygosity test for retrotransposon insertions specific to Macropus eugenii, across multiple individuals. Individuals with +/- are heterozygous for the retrotransposon insertion (observed as two distinct bands on an agarose gel).

Marker Individual 1 Individual 2 Individual 3 Individual 4 Individual 5 Individual 6

K4 +/- +/- +/- +/- +/- +/-

K7 +/- +/- +/- +/- +/- +/-

K8 +/- +/- +/- +/- +/- +/-

K68 +/+ +/+ +/+ +/+ +/+ +/+

K74 +/+ +/+ +/+ +/+ +/+ +/+

K75 +/+ +/+ +/+ +/+ +/+ +/+

K77 +/+ +/+ +/+ +/+ +/+ +/+

K84 +/- +/- +/+ +/+ +/+

K86 +/+ +/+ +/+ +/+ +/+ +/+

K88 +/- +/- +/- +/- +/- +/-

K89 +/+ +/+ +/+ +/+ +/+ +/+

K90 +/+ +/+ +/+

K92 +/+ +/+ +/+ +/+

K93 +/+ +/+ +/+ +/+ +/+ +/+

K94 +/+ +/+ +/+ +/+ +/+ +/+

K95 +/+ +/+ +/+ +/+ +/+ +/+

K96 +/+ +/+ +/+ +/+ +/+ +/+

K103 +/+ +/+ +/+ +/+ +/+ +/+

K108 +/+

K109 +/+ +/+ +/+ +/+ +/+ +/+

K110 +/+ +/+ +/+ +/+ +/+ +/+

K111 +/+ +/+ +/+ +/+ +/+ +/+

K113 +/+ +/+ +/+

K114 +/+ +/+ +/+ +/+ +/+ +/+

K115 +/+ +/+ +/+ +/+ +/+ +/+

K116 +/+ +/+ +/+ +/+ +/+ +/+

K124 +/+ +/+ +/+ +/+ +/+ +/+

K140 +/+ +/+ +/+ +/+ +/+ +/+

K141 +/+ +/+ +/+ +/+ +/+ +/+
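The genotype classes summarised from Table 8 can be tallied with a few lines of Python. The sketch below encodes the calls listed in the table (for markers with fewer than six calls, only the calls shown are included) and classifies each locus as heterozygous in all scored individuals, homozygous in all scored individuals, or mixed. The class labels and variable names are illustrative.

```python
from collections import Counter

# Markers scored +/- in all six individuals.
SIX_HET = ["K4", "K7", "K8", "K88"]
# Markers scored +/+ in all six individuals.
SIX_HOM = [
    "K68", "K74", "K75", "K77", "K86", "K89", "K93", "K94", "K95", "K96",
    "K103", "K109", "K110", "K111", "K114", "K115", "K116", "K124",
    "K140", "K141",
]

CALLS = {marker: ["+/-"] * 6 for marker in SIX_HET}
CALLS.update({marker: ["+/+"] * 6 for marker in SIX_HOM})
# Markers with fewer than six calls listed, or with mixed calls.
CALLS.update({
    "K84": ["+/-", "+/-", "+/+", "+/+", "+/+"],
    "K90": ["+/+"] * 3,
    "K92": ["+/+"] * 4,
    "K108": ["+/+"],
    "K113": ["+/+"] * 3,
})


def classify(calls):
    """Classify a locus from its per-individual genotype calls."""
    kinds = set(calls)
    if kinds == {"+/-"}:
        return "heterozygous"
    if kinds == {"+/+"}:
        return "homozygous"
    return "mixed"


summary = Counter(classify(calls) for calls in CALLS.values())

if __name__ == "__main__":
    print(len(CALLS), dict(summary))
```

The tally agrees with the counts given in section 2.4.6: four loci heterozygous in every scored individual, 24 homozygous for presence, and one locus (K84) that differs among individuals.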


2.4.7 TE lineage activity in Macropus

I experimentally screened a region of the LINE1 ORF2 in Macropus robustus to further investigate the finding that each LINE1 and SINE marker was present in all tested Macropus and Wallabia species, which suggests that these elements may have been silenced. From a total of 100 randomly selected Sanger-sequenced clones, 99 contained deletions and/or stop codons within the analysed partial ORF2 fragment. Thus, only 1% of the clones in my data set contained an ORF2 fragment that could be translated into amino acids. The mean within-group nucleotide distance among the 100 sequences is 0.178.

2.5 DISCUSSION

The first and primary statistical test for retrotransposon markers (Waddell et al., 2001) does not account for a critical ascertainment bias, whereby markers cannot be identified in support of clades that do not include a reference genome. In addition, Kuritzin et al. (2016) show that this ascertainment bias results in a loss of statistical power. Complete genome sampling for most groups remains sparse. Indeed, many retrotransposon studies employ a single reference genome (Meyer et al., 2012, Zemann et al., 2013, Gallus et al., 2015a) and may be overstating statistical confidence and prematurely confirming or overturning DNA sequence based phylogenetic inferences.

In my study of kangaroos and wallabies, the ascertainment bias for detecting retrotransposon markers is clear: all of the identified markers fall on branches ancestral to the single reference genome, Macropus eugenii. This includes phylogenetically conflicting markers (hemiplasy) that support alternative groupings (Figure 16, C1-8). However, insertion patterns for clades that exclude Macropus eugenii remain unobservable. Experimental approaches were employed to address this ascertainment bias. Screening of 33 introns “devoid” of retrotransposon insertions in Macropus eugenii did not yield any novel markers, and the approach will require many-fold more loci for effective, novel insertion discovery among kangaroos and wallabies.

In the absence of additional experimental or in silico evidence, three lines of reasoning were used to circumvent the single reference genome ascertainment bias. These arguments include consideration of (i) a priori evidence, (ii) the expectation that ILS will distribute markers that conflict with the species tree roughly symmetrically between the two alternative trees, and (iii) an insertion ratio test for whether introgression/hybridization could contribute the markers supporting the favoured “observed” tree. The latter two arguments (Figure 13 and Figure 14) provide a basis for statistical tests that support the observed placements of Macropus irma and Wallabia bicolor with core-Notamacropus, and reject “blind” trees in which either Macropus irma or Wallabia would share a closer relationship with other Macropus subgenera (Table 7).

Overall, my KKSC (PB), ILS and insertion ratio test results provide good agreement with Meredith et al.’s (2008b) five nuclear gene phylogeny, with the retrotransposons often lending statistically stronger support. In the one case of disagreement (retrotransposons favour Onychogalea instead of Lagorchestes as sister to Macropus/Wallabia), the nuclear sequence result was poorly resolved (57% ML bootstrap support), and the retrotransposon grouping agrees with morphological studies (Butler et al., 2016, Flannery, 1989, Prideaux and Warburton, 2010). Retrotransposon support for M. eugenii grouping with M. parma, instead of with M. agilis as found by Meredith et al. (2008b), is not incongruence, but results from the latter study mislabeling M. eugenii and M. agilis.

Retrotransposon markers strongly support placing Wallabia bicolor within Macropus [9 1 X] (Figure 16, Table 7, iv) in agreement with nuclear genes (Meredith et al., 2008b), and overturning Macropus monophyly, which has generally been favoured by morphological studies (Flannery, 1989, Prideaux and Warburton, 2010) and by mitochondrial DNA (Westerman et al., 2002, Mitchell et al., 2014). For the more precise placement of Wallabia, my data suggest this genus is sister to the subgenus M. (Notamacropus), an arrangement that Meredith et al. (2008b) weakly favoured, based on a five nuclear gene concatenation. My retrotransposon markers provide stronger support for this Wallabia/M. (Notamacropus) clade (Figure 16 and Table 7, ii) and my ILS symmetry and insertion ratio tests reject the “blind” Wallabia/M. (Osphranter) clade. A priori evidence strengthens the argument, because no previous molecular or morphological phylogenetic investigations favour the “blind” tree, leaving the most relevant comparison as the strong binary preference for Wallabia/M. (Notamacropus) over M. (Notamacropus)/M. (Osphranter).

Conflicting retrotransposon markers have been shown to be common when lineages diverge in rapid succession (Scally et al., 2012, Doronina et al., 2015). Parallels can be drawn with other retrotransposon studies that have revealed hemiplasy arising from rapid radiations. African cichlids have long been known to have undergone a rapid adaptive radiation resulting in considerable incomplete lineage sorting (Takahashi et al., 2001), while the root of the Neoaves is perhaps one of the most striking examples of a rapid radiation and exhibits considerable incomplete lineage sorting arising from the super radiation that occurred among birds following the K-Pg boundary (Suh et al., 2015). The present study on kangaroo and wallaby relationships is remarkable, however, in the complexity of the hemiplasy. For example, four alternative insertion patterns place Wallabia bicolor with different groupings within Macropus that do not appear on the species tree (Figure 16, C1, C5, C6/7, and C8). These provide additional support for Macropus paraphyly, although for my statistical analyses I only include unambiguous markers (those without multilevel conflicts). This diversity of conflict is consistent with rapid successive divergences among the Macropus subgenera and within M. (Notamacropus), allowing ILS and perhaps introgression to span several internal branches on the species tree. Interestingly, the one conflict pattern that excludes Wallabia from Macropus (C2, Figure 16) has the same phylogenetic placement as coalescent simulations based on mtDNA, and was inferred to have arisen not by ILS, but by introgression into Wallabia from an extinct taxon outside of Macropus (Phillips et al., 2013). Phillips et al.’s (2013) coalescent simulation study also showed potential for shallower ILS for nuclear loci between Wallabia bicolor and the Macropus subgenera. In all cases the conflicting markers have the same diagnostic ERV mutations and are therefore unlikely to result from independent insertion events of different ERVs. I cannot exclude the possibility of exact deletions of ERVs; however, exact deletions are very rare in other retrotransposon studies, and comprise <0.5% in primate genomes (van de Lagemaat et al., 2005).

The placement of Wallabia bicolor within the genus Macropus, as sister to M. (Notamacropus), presents a taxonomic anomaly. Meredith et al. (2008b) suggested subsuming Wallabia bicolor into the genus Macropus, with the creation of a new subgenus, M. (Wallabia). Another possibility is maintaining Wallabia, and instead elevating the three Macropus subgenera (Osphranter, Macropus and Notamacropus) to independent genera (Jackson and Groves, 2015). Short internal branches separating the subgenera (~0.8 million years) and the potential for hybridization, even if offspring are typically sterile (Van Gelder, 1977, Close and Lowry, 1990), may favour subsuming Wallabia bicolor into Macropus. Conversely, substantial behavioural and ecological differences between each of the Macropus subgenera and Wallabia argue for elevating each to the genus level. Morphological considerations are also required to guide this taxonomic decision, while resolving the affinities of Macropus fossils should allow more confident temporal and ecological inferences of the group’s diversification.

There are two clear a priori hypotheses for the placement of Macropus irma. The first, a close affinity with M. (Notamacropus), is based on morphology (Dawson and Flannery, 1985) and five nuclear genes (Meredith et al., 2008b). The alternative, which places Macropus irma with M. (Osphranter), based on mtDNA (Phillips et al., 2013), is the “blind” grouping for this retrotransposon study. Thus, the a priori evidence argument cannot be used to reduce the emphasis on the “blind” tree for Macropus irma affinities. However, support from retrotransposons alone is sufficiently strong to confidently group Macropus irma (with or without Wallabia) with the core members of M. (Notamacropus) and reject the “blind” tree ([8 0 X] Table 7, iii). Combining the ILS symmetry, insertion ratio, and a priori evidence arguments has substantially overcome the limitations of a single reference genome being available, and lends confidence to placing Macropus irma and Wallabia as consecutive sister taxa to the core members of M. (Notamacropus).

The relationship among the three Macropus subgenera remains unclear. Three markers group together M. (Notamacropus)/Wallabia and M. (Osphranter) in agreement with numerous molecular studies, including early serological studies (Kirsch, 1977), DNA hybridization (Kirsch, 1977, Kirsch et al., 1995, Kirsch et al., 1997) and nuclear genes (Meredith et al., 2008b). However, two conflicting markers were found that group M. (Macropus) and M. (Notamacropus) together, to the exclusion of M. (Osphranter). One of these markers includes Wallabia bicolor (Figure 16, C5) and the other does not (C4). This hemiplasy across short internal branches over successive divergences is consistent with speciation events early in Macropus occurring more rapidly than the rate of allele fixation. Greater resolution from additional markers will be required; alternatively, the basal Macropus trichotomy may be unresolvable, as has been suggested for the deep divergences within placental mammals (Nishihara et al., 2009) and among avian orders (Suh et al., 2015). An additional reference genome will allow assessment of the “blind” alternative among the three subgenera, specifically the grouping of M. (Macropus) with M. (Osphranter), which is generally favoured by morphology (Prideaux and Warburton, 2010) and mtDNA (Mitchell et al., 2014).

One shared retrotransposon marker provides the first molecular evidence for a close relationship between Onychogalea and the Macropus/Wallabia clade (Figure 16). This grouping has often been weakly supported by morphology, particularly dental traits that appear to have evolved for grazing (Flannery and Hann, 1984, Prideaux and Warburton, 2010). In contrast, recent molecular analyses (Meredith et al., 2008b, Mitchell et al., 2014) tend to favour a deeper affinity for Onychogalea, outside the clade containing Lagorchestes and Macropus/Wallabia, though with very weak statistical support. Confirming the placement of Onychogalea with Macropus/Wallabia will require additional markers, the ability to rule out the “blind” Onychogalea/Lagorchestes relationship, and resolution of the phylogenetic position of the quokka (Setonix).

The composition of transposable elements in the genome can vary dramatically between taxonomic groups (Chalopin et al., 2015). My transposable element screen of Macropus and Wallabia revealed little or no LINE1 activity over the last 10 million years. Instead, only ERV markers were found, which are generally widespread in kangaroo and other mammalian genomes (O'Neill et al., 1998, Zhuo and Feschotte, 2015). My phylogenetic analysis of the ERV solo LTRs found in the Macropus eugenii genome shows a clustering of different clades, with the majority of young LTRs occurring within a single clade with some heterozygous markers, characteristic of recent insertion events that have not reached fixation (Figure 17). The phylogenetic screen coupled with the LINE1 ORF2 screen suggests that LINE1 either has very low retrotranspositional activity, or may have become entirely inactivated in the kangaroo genome. Cases for LINE extinction among mammals have been proposed for megabats (Cantrell et al., 2008), sigmodontine rodents (Casavant et al., 2000, Grahn et al., 2005, Rinehart et al., 2005), Tasmanian devil (Sarcophilus harrisii) (Gallus et al., 2015a), thirteen-lined squirrel (Ictidomys tridecemlineatus) (Platt II and Ray, 2012), and the spider monkey (Ateles paniscus) (Boissinot et al., 2004). It is possible that ERV activity in the kangaroo genome may have increased due to the absence of competition from LINE1 activity, and indeed parallels have been observed in sigmodontine rodents (Erickson et al., 2011). However, I caution that my results, although suggestive, cannot rule out the possibility of intact LINE1 copies, given that mammalian genomes typically contain as many as ~500,000 LINE1 copies. Therefore, further screening of high-quality genome assemblies will make it possible to explore the evolutionary interplay between LINE1 and ERVs in the kangaroo genome.

2.5.1 Conclusion

In this study, I provide a statistical framework for accommodating the retrotransposon ascertainment bias that arises when only a single reference genome is utilized. This has implications for significance testing in studies performing retrotransposon-based phylogenetic reconstruction. For the first time, I am able to identify clades that are strongly supported and robust to the ascertainment bias caused by genome availability and I identify other clades that need to be tested with additional genome data. Retrotransposon support, for both previous and future single reference genome retrotransposon studies, should be assessed in a similar fashion to verify their conclusions. In addition, I have demonstrated experimentally that LINE1 silencing likely occurred in kangaroos. ERV insertions provide highly significant phylogenetic support among kangaroos, including for placing the swamp wallaby, Wallabia bicolor, as sister to the open forest wallabies of M. (Notamacropus).

Chapter 3: Supermatrix phylogeny and molecular dating of the genus Macropus

3.1 INTRODUCTION

The Miocene epoch (23.03-5.33 Ma) is characterized by large vegetation changes on a global scale (e.g., Willis and McElwain, 2014), which occurred concomitantly with changing climatic conditions (Byrne et al., 2011). An understanding of the influence of climate on the evolution of species is crucial for gaining insight into future changes in diversity (Jansson and Dynesius, 2002, Dirzo and Raven, 2003). The earliest Miocene of Australia (and elsewhere) was characterized by an early icehouse period, followed by a gradual warming which peaked at the mid-Miocene climatic optimum, 15 – 17 Ma (Martin, 2006, Prideaux and Warburton, 2010, Byrne et al., 2011). Following this, the climate began to cool once again, leading to increasing icehouse conditions (McGowran et al., 1997, Macphail, 1997). In the northern hemisphere, the Miocene saw a spread of more open/arid conditions and an expansion of C4 grasslands (Bernor, 1983, Cerling et al., 1997, Utescher et al., 2000, Mosbrugger et al., 2005, Retallack, 2001). This also occurred in Australia, but marked expansion of grasslands is not apparent until the middle Pliocene (Martin, 1994, Martin, 1998, White, 1997, Byrne et al., 2011) and appears to coincide with the diversification of kangaroos, as they acquired adaptations to increasingly arid environments (Prideaux and Warburton, 2010).

Determining the phylogenetic relationships and timing of key events in the evolutionary history of the kangaroos and wallabies of the genus Macropus has been contentious (Burk and Springer, 2000, Kear and Pledge, 2008, Meredith et al., 2008b, Phillips et al., 2013, Westerman et al., 2002). Macropus appears to originate from a rapid adaptive radiation that occurred in Australia during the late Miocene to early Pliocene approximately 5 – 8 million years ago (Mya) (Martin, 2006, Prideaux and Warburton, 2010).

Previously, phylogenetic analyses of kangaroos have utilized morphology and (to a lesser extent) ecological features (Aplin and Archer, 1987), while molecular studies have utilized serological affinity (Kirsch, 1977), DNA hybridization (Kirsch, 1977, Kirsch et al., 1995, Kirsch et al., 1997), mitochondrial (MT) genes (Westerman et al., 2002), nuclear (NUC) genes (Meredith et al., 2008b) and combined analyses, such as the combined MT cytochrome b and nuclear selenocysteine tRNA analysis of Bulazel et al. (2007) and the combined MT and NUC analysis of Phillips et al. (2013). More recently, kangaroos have been investigated using whole genome characters (retrotransposons) (Dodt et al., 2017).

Despite extensive investigation, no molecular analysis has been performed that covers all modern members of the genus Macropus (and Wallabia). As such, a number of phylogenetic questions have remained unresolved. These, in turn, could have a bearing on the relative positions of the Macropus sub-genera, the swamp wallaby (Wallabia bicolor), the recently extinct toolache wallaby (Macropus greyi), the black striped wallaby (Macropus dorsalis), the elusive black wallaroo (Macropus bernardus) and the proposed mitochondrial introgression of the black gloved wallaby (Macropus irma). I also address a potential sampling issue with Macropus eugenii in a previously published phylogeny (Meredith et al., 2008b). The phylogeny of Meredith et al. (2008b) suggests a close affinity between Macropus agilis and Macropus eugenii based on a five nuclear gene analysis; however, closer examination of the sequences (by performing a BLAST search against the M. eugenii genome) has revealed that most M. agilis and M. eugenii sequences from the Meredith et al. (2008b) study were identical, and likely resulted from sequencing M. agilis twice, thereby calling into question the position of M. eugenii within the diversity of M. (Notamacropus).
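The kind of duplicate-sequence check described above can be illustrated with a simple pairwise identity calculation. This is only a minimal sketch: the check reported here was made with a BLAST search against the M. eugenii genome, whereas the hypothetical function `pairwise_identity` below assumes two sequences that have already been aligned to the same length, and the example sequences are invented for illustration.

```python
def pairwise_identity(seq_a, seq_b, missing="-?N"):
    """Proportion of identical sites between two aligned sequences.

    Sites where either sequence has a gap or missing-data symbol are skipped.
    The function name and the set of 'missing' symbols are assumptions
    made for this illustration.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # ignore sites that cannot be compared
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else float("nan")


# Invented example sequences (not real data)
print(pairwise_identity("ACGT-ACGT", "ACGTTACGT"))  # identical at all compared sites
print(pairwise_identity("ACGT", "ACGA"))            # three of four sites match
```

An identity of 1.0 across most loci for two nominally different species would flag the pair for closer inspection, which is the reasoning applied to the M. agilis and M. eugenii sequences.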


3.1.1 Phylogenetic position of the Macropus sub-genera

The relationships among the three sub-genera of Macropus have long been a point of contention. Previous analyses based on morphology have tended to group the sub-genera M. (Osphranter) and M. (Macropus) as sister taxa to the exclusion of M. (Notamacropus) (Bensley, 1903, Raven and Gregory, 1946, Flannery, 1989, Prideaux and Warburton, 2010). It is possible that this grouping may be an artefact of allometric correlations, because these are the two large body-sized sub-genera. This would explain the discrepancy with molecular studies, such as early serological studies (Kirsch, 1977), DNA hybridization (Kirsch, 1977, Kirsch et al., 1995, Kirsch et al., 1997) and nuclear genes (Meredith et al., 2008b), which often group M. (Notamacropus) and M. (Osphranter) together, to the exclusion of M. (Macropus). More recently, the retrotransposon analysis of Dodt et al. (2017) suggested that M. (Notamacropus) and M. (Osphranter) group together to the exclusion of M. (Macropus), based on three retrotransposon insertion markers; however, two conflicting markers were also detected.

3.1.2 Phylogenetic position of the extinct toolache wallaby

The phylogenetic position of the extinct toolache wallaby has been unclear due to a lack of available DNA sequence data. The toolache wallaby became extinct in the early twentieth century as a result of human-induced habitat destruction, hunting for its attractive pelt and predation by introduced fauna, such as the red fox (Vulpes vulpes), with documented reports of small remnant populations possibly persisting until the early 1970s (Van Dyck and Strahan, 2008). As such, the only available specimens require the use of ancient DNA techniques to extract useable quantities of DNA for analysis.


3.1.3 Phylogenetic position of the black-striped wallaby

The phylogenetic position of the black-striped wallaby (Macropus dorsalis) has been unclear. This nocturnal grazer tends to inhabit thick scrub, and is relatively poorly understood compared to more conspicuous wallaby species. Phillips et al. (2013) provided the first molecular analysis that included this species, although with only ~3.5 kb of mitochondrial sequence. The inclusion of this taxon in the present analysis is crucial for clarifying the phylogenetic position of other living members of the sub-genus M. (Notamacropus). Relationships among M. (Notamacropus) have been uncertain in part due to a lack of available molecular sequence data. Past molecular studies have largely agreed on a sister relationship between M. eugenii and M. agilis (Cardillo et al., 2004, Meredith et al., 2008b, Phillips et al., 2013), which in turn group with M. parma and M. parryi, which also place as sister taxa to each other (Meredith et al., 2008b). Phillips et al. (2013) showed that M. dorsalis groups with the M. eugenii + M. agilis clade with moderate support. M. rufogriseus places outside of this clade, followed sequentially by M. irma (Meredith et al., 2008b); however, given the lack of taxon coverage in previous studies, no single study has ever included all members of M. (Notamacropus) in a single analysis.

3.1.4 Phylogenetic position of the black wallaroo

The phylogenetic position of the enigmatic black wallaroo (M. bernardus) is uncertain. Sample collection has likely been difficult, due to its very limited and geographically isolated population. The black wallaroo is restricted to the sandstone plateau and outliers of western Arnhem Land, a region characterized by steep rocky escarpments that make observation and sample collection difficult (Van Dyck and Strahan, 2008). This species has been listed as Near Threatened due to its limited range and habitat (Woinarski, 2016). To date, the only molecular phylogenetic analysis that included the black wallaroo utilized a short sequence of control region DNA from the mitochondrion. That sequence proved phylogenetically uninformative regarding the placement of M. bernardus with the other wallaroos, in a study that focussed on population variation within M. antilopinus (Eldridge et al., 2014).

3.1.5 Phylogenetic position of the swamp wallaby and black gloved wallaby

There is a long history of conflicting placements for the black gloved wallaby and the swamp wallaby. While morphological studies have favoured placing the black-gloved wallaby as the sister taxon to the toolache wallaby, M. greyi (Cardillo et al., 2004), within the diversity of M. (Notamacropus), there has been a lack of molecular studies exploring this relationship due to the unavailability of DNA sequence of M. greyi. Here I include the first molecular data for the toolache wallaby, in the form of 12S rRNA and cytochrome b mitochondrial sequences, to explore this putative affinity between M. irma and M. greyi, within M. (Notamacropus).

Both the wallabies of M. (Notamacropus) and Wallabia bicolor were once included in the genus Protemnodon (Tate et al., 1948); however, Ride (1957) argued that parallel evolution and plesiomorphy were sufficient to explain the morphological similarities between Wallabia and the wallabies of M. (Notamacropus), and both have long been moved to the genera Wallabia and Macropus, respectively, on morphological grounds (Archer, 1984, Flannery, 1989). The genus Protemnodon now includes only extinct members. Partial mt genomes imply an introgression event between a now extinct close relative of Macropus and the swamp wallaby; however, complete mt genomes will be required to verify this hypothesis. I have explored the phylogenetic positions of both the swamp wallaby and black gloved wallaby in detail in Chapter 2 using retrotransposons, but here I expand this investigation using a more robust supermatrix dataset.


3.1.6 Supermatrix approach

I have utilized a supermatrix approach to reconstruct the phylogeny and timescale of Macropus and Wallabia, combining five nuclear genes (ApoB, BRCA1, IRBP, Rag1 and vWF) taken from Meredith et al. (2008b) and complete (or near-complete) mitochondrial genomes, in the first molecular analysis that covers all modern members of the genus Macropus.

As the number of sequenced genomes continues to increase and the field of phylogenetics continues to grow, there is an increasing need for more extensive phylogenetic hypotheses encompassing a wider range of species and sequences. In the past, the majority of phylogenetic studies were limited (by practical and computational limitations) to only a few taxa (Sanderson et al., 1998) – a consequence of the exponential increase of possible tree topologies that needed to be searched (‘tree-space’) as additional taxa are added to an analysis (Felsenstein, 1978).
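The growth of tree-space with taxon number can be made concrete with a short calculation. The sketch below uses the standard result that the number of distinct unrooted, fully bifurcating topologies for n labelled taxa is (2n-5)!!, the product of the odd integers from 3 to 2n-5. The function name is an assumption made for this illustration.

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted bifurcating topologies for n labelled taxa.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5),
    which is defined for three or more taxa.
    """
    if n_taxa < 3:
        raise ValueError("at least three taxa are required")
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):  # odd numbers 3 .. 2n-5
        count *= odd
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Ten taxa already give just over two million possible unrooted topologies, which is why exhaustive searches quickly become impractical as taxa are added.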

The stitching together of a number of smaller phylogenetic trees into a larger ‘supertree’ has proven to be a powerful method for overcoming this problem (Sanderson et al., 1998). When certain conditions are met, supertrees may retain all or most of the phylogenetic information from the source trees, allowing this method to infer relationships of taxa that do not co-occur on any one source tree (Sanderson et al., 1998). The source trees from which a supertree is constructed may be congruent, that is, the phylogenetic signal between source trees may be in agreement. Conversely, source trees may be incongruent, which occurs when the phylogenetic signals of the source trees conflict with each other (Gordon, 1986).

Methodological advances in search strategies and algorithmic shortcuts, such as matrix representation with parsimony (MRP) (Baum, 1992, Ragan, 1992) and others, have resulted in increasingly larger phylogenetic questions being addressed (Bininda-Emonds et al., 2002, Sanderson and Shaffer, 2002). Supertree methods have been used to produce phylogenies from a wide array of taxa including flowering plants (Davies et al., 2004), eukaryotes (Pisani et al., 2007) and mammals (Bininda-Emonds et al., 2007).
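As an illustration of the matrix representation step mentioned above, the following minimal sketch encodes a set of rooted source trees as a binary matrix in the usual Baum-Ragan manner: each clade in each source tree becomes one character, scored 1 for taxa inside the clade, 0 for other taxa present in that tree, and ? for taxa absent from that tree. The nested-tuple tree format, the function names and the taxa A to E are assumptions made for this example. The subsequent parsimony search on the matrix is not shown.

```python
def leaves(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """List the leaf sets of all internal nodes except the root."""
    found = []

    def walk(node, is_root):
        if isinstance(node, str):
            return  # a single taxon is not an informative clade
        if not is_root:
            found.append(frozenset(leaves(node)))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return found


def mrp_matrix(source_trees):
    """Build a Baum-Ragan style matrix: one column per source-tree clade."""
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    matrix = {taxon: "" for taxon in all_taxa}
    for tree in source_trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    matrix[taxon] += "?"  # taxon not sampled in this tree
                elif taxon in clade:
                    matrix[taxon] += "1"
                else:
                    matrix[taxon] += "0"
    return matrix


# Two hypothetical source trees with partially overlapping taxa
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("B", "C"), "E")
for taxon, row in mrp_matrix([tree_1, tree_2]).items():
    print(taxon, row)
```

Because taxon E appears only in the second tree, it is scored as missing for the characters derived from the first tree, which is how supertree matrices relate taxa that never co-occur in a single source tree.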

However, unlike the supertree approach, which involves separate analyses of data partitions followed by integration of the resulting trees, the supermatrix approach analyses all character data simultaneously (de Queiroz and Gatesy, 2007). The ‘supermatrix’ approach combines diverse types of data into a single phylogenetic matrix prior to phylogenetic reconstruction, and was first proposed by Kluge (1989). The supermatrix approach has been key in resolving a variety of evolutionary questions (de Queiroz and Gatesy, 2007), such as whether snakes have their origin in a marine or terrestrial habitat (Lee, 2005), whether major divergences among placental mammals are correlated with plate-tectonics (Asher et al., 2003) and how whales became obligately aquatic (Geisler and Uhen, 2005). An advantage of the supermatrix approach over the supertree approach is that character evidence is more fully utilized in estimating the tree (Geisler and Uhen, 2005). In supertree analyses, some character information is lost when sets of characters are summarized as trees (Kluge, 1989, de Queiroz et al., 1995). This advantage of the supermatrix approach cannot be overstated.

The direct and simultaneous use of data can result in the emergence of phylogenetic signal that may not be apparent in methods that analyse the data separately, potentially because the combined approach allows the phylogenetic signal to overwhelm noise and provide resolution for clades that would otherwise be poorly resolved (Barrett et al., 1991). Even in cases where supermatrix and supertree approaches converge on the same tree, the supermatrix approach can reveal support for relationships in the final tree that would not be supported if the data partitions were analysed separately (Gatesy et al., 1999, Olmstead and Sweere, 1994).

In this study, I have utilized the supermatrix method, including the first ancient DNA sequence from the extinct toolache wallaby (Macropus greyi) and a complete mitochondrial genome from the elusive black wallaroo (Macropus bernardus). Additional mt genomes that were sequenced at BGI and QUT and added for the first time include Macropus rufogriseus, Macropus dorsalis, Macropus agilis, Macropus rufus, Macropus robustus, Onychogalea fraenata and Thylogale thetis.

3.2 MATERIALS AND METHODS

3.2.1 DNA preparation

Macropod DNA was extracted at Queensland University of Technology, Curtin University and the Senckenberg Biodiversity and Climate Research Centre in Frankfurt. Complete mt genome sequencing for Macropus and Wallabia was performed at BGI (Beijing Genomics Institute), utilizing paired-end Illumina sequencing, following sample preparation at the Senckenberg Biodiversity and Climate Research Centre. Mt genomes were assembled for each species from the paired-end Illumina reads using MITObim 1.6 (Hahn et al., 2013). Where available, other sequences were obtained from GenBank (see Supplementary Table S2). Mitochondrial DNA 12S rRNA and cytochrome b sequences from the extinct toolache wallaby (M. greyi) were kindly donated by Michael Bunce and Dalal Haouchar at Curtin University. DNA sequences were aligned in AliView (Larsson, 2014), inspected manually and used for downstream analysis. I performed analyses on three datasets: a mitochondrial genome dataset, a nuclear gene dataset and a combined concatenated dataset that included both mitochondrial and nuclear data, which I refer to as the ‘supermatrix’ analysis. The impact of missing data is minimized because, for any taxa that lack a particular data type (mt or nuc), well-established closely related taxa are included for which that data type is available.
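The concatenation of data types into a single matrix, with missing-data symbols for taxa that lack one data type, can be sketched as follows. This is an illustrative outline only and not the procedure used to build the datasets analysed here. The function name, the use of ? for missing data and the toy sequences are assumptions.

```python
def concatenate(partitions, missing="?"):
    """Concatenate aligned partitions into one supermatrix.

    Each partition is a dict mapping taxon name to an aligned sequence.
    Taxa absent from a partition are padded with the missing-data symbol.
    """
    taxa = sorted(set().union(*(p.keys() for p in partitions)))
    supermatrix = {taxon: "" for taxon in taxa}
    for part in partitions:
        lengths = {len(seq) for seq in part.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a partition must be aligned")
        length = lengths.pop()
        for taxon in taxa:
            supermatrix[taxon] += part.get(taxon, missing * length)
    return supermatrix


# Toy alignments (invented): taxon_2 lacks nuclear data, taxon_3 lacks mt data
mt = {"taxon_1": "ACGT", "taxon_2": "ACGA"}
nuc = {"taxon_1": "GG", "taxon_3": "GA"}
for taxon, seq in concatenate([mt, nuc]).items():
    print(taxon, seq)
```

In this toy case every row has the same total length, and the ? blocks mark exactly where a taxon contributes no information, which is the situation the inclusion of closely related, fully sampled taxa is intended to mitigate.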

3.2.2 Phylogenetic reconstruction

Kangaroo phylogenetic reconstruction was carried out using maximum likelihood and Bayesian inference for the mitochondrial and nuclear datasets independently and also for the combined supermatrix dataset. Whole and partial mitochondrial genomes were utilized where available, producing a dataset of up to 15,287 bp. The nuclear dataset consisted of five nuclear genes (BRCA1, IRBP, ApoB, Rag1, vWF) totalling 5991 bp. Thus my combined ‘supermatrix’ dataset consisted of 21,278 bp. Sequence data were partitioned and appropriate models of molecular evolution were selected using the program PartitionFinder (Lanfear et al., 2016), resulting in a total of 11 partitions (Table 9). The sequences were partitioned across mitochondrial and nuclear protein coding positions and stem and loop sites for RNA data. RY-coding was used for rapidly evolving third codon positions for the mt protein coding genes, following Phillips et al. (2013). RY-coding converts the four DNA bases into purine (A+G) and pyrimidine (C+T) to improve phylogenetic signal at third codon positions and reduce saturation effects. For the nuclear dataset, the model specifications followed Phillips et al. (2013).
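The RY-coding of third codon positions described above can be sketched in a few lines. This is a minimal illustration, not the script used for this study. It assumes an in-frame protein-coding sequence that starts at codon position 1, and the function name is hypothetical. Gaps and ambiguity codes are left unchanged.

```python
# Purines (A, G) become R; pyrimidines (C, T) become Y
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})


def ry_code_third_positions(sequence):
    """RY-code every third codon position of an in-frame coding sequence."""
    chars = list(sequence.upper())
    for index in range(2, len(chars), 3):  # zero-based indices 2, 5, 8, ...
        chars[index] = chars[index].translate(RY_TABLE)
    return "".join(chars)


# Invented nine-base example: third positions G, A and C are recoded
print(ry_code_third_positions("ATGCCATTC"))  # ATRCCRTTY
```

Only transversions (purine to pyrimidine changes, or the reverse) remain visible at the recoded sites, which is how the recoding reduces the influence of saturated transitions.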


Table 9. Partitioning scheme according to PartitionFinder (mt DNA) and Phillips et al. (2013) (nuc DNA).

Partition name    Partition description
stems             Mitochondrial tRNA stem regions
loops             Mitochondrial tRNA loop regions
cbm1              Cytochrome b (codon position 1)
cbm2              Cytochrome b (codon position 2)
cbm3              Cytochrome b (codon position 3)
m1                All remaining mt genes except Cytb (codon position 1)
m2                All remaining mt genes except Cytb (codon position 2)
m3                All remaining mt genes except Cytb (codon position 3)
n1                All nuclear genes (codon position 1)
n2                All nuclear genes (codon position 2)
n3                All nuclear genes (codon position 3)

Table 10. Partitioning scheme and models of evolution used based on PartitionFinder and Phillips et al. (2013). See Table 9 for expanded partition names.

Partition number    Partition name    Model
1                   stems             TrN+I+Γ
2                   loops             TrN+I+Γ
3                   cbm1              GTR+I+Γ
4                   cbm2              GTR+I+Γ
5                   cbm3              GTR+I+Γ
6                   m1                GTR+I+Γ
7                   m2                GTR+I+Γ
8                   m3                GTR+I+Γ
9                   n1                TIM+Γ
10                  n2                TVM+Γ
11                  n3                TVMef+Γ


Table 11. Composition of formal kangaroo clades.

Clade name         Clade definition
Macropodiformes    Hypsiprymnodontidae + Potoroidae + Macropodidae
Potoroidae         Potoroos and bettongs
Macropodoidea      Macropodidae + Potoroidae
Macropodidae       All macropodids including Lagostrophus
Macropodinae       All macropodids except Lagostrophus
Dendrolagini       Petrogale + Dendrolagus
Dorcopsini         Dorcopsis + Dorcopsulus

Bayesian phylogenetic inference was conducted on the mitochondrial genome (mtg), nuclear and supermatrix datasets in MrBayes 3.2 (Huelsenbeck & Ronquist, 2001). For the mitochondrial dataset the MCMC chain was run for 1,000,000 generations and sampled every 5000 generations. The nuclear dataset and the combined supermatrix dataset were run for 2,000,000 MCMC generations, and sampled every 5000 generations. A burnin of 20% was used in all cases and the appropriate model of molecular evolution was utilized for each partition (Table 10). Branch lengths were proportional across the mitochondrial (1-8) and nuclear (9-11) partitions. Maximum likelihood phylogenetic reconstruction was performed using the program RAxML v. 7.6.3 (Stamatakis, 2006). All RY-coded data were recoded to binary (1,0) and a total of 500 bootstrap iterations (using the -f a option for rapid bootstrapping) were performed on all three datasets, using the general time reversible (GTR) substitution model with gamma distribution and invariant sites, utilizing the same data partitions that were employed for the Bayesian analysis (Table 10), with branch lengths kept proportional across all partitions. Phylogenetic conflict/congruence among the three MrBayes analyses and three RAxML analyses was examined using a supernetwork approach in SplitsTree4 (Huson and Bryant, 2006).
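The sampling settings given above determine how many posterior samples remain after burnin, which can be checked with simple arithmetic. The sketch below is illustrative only: the function name is an assumption, and the count ignores the initial (generation 0) state that some programs also record.

```python
def retained_samples(generations, sample_every, burnin_percent):
    """Number of MCMC samples kept after discarding the burnin fraction."""
    total = generations // sample_every
    discarded = total * burnin_percent // 100  # integer arithmetic avoids rounding issues
    return total - discarded


# Settings stated in the text: sampling every 5000 generations, 20% burnin
print(retained_samples(1_000_000, 5000, 20))  # mitochondrial dataset
print(retained_samples(2_000_000, 5000, 20))  # nuclear and supermatrix datasets
```

Under these assumptions the mitochondrial run retains 160 of 200 samples, and the nuclear and supermatrix runs retain 320 of 400 samples each.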


3.2.3 Molecular Dating and Lineage Through Time Analysis

For molecular dating of the supermatrix dataset, I utilized the program BEAST v1.8.1 (Drummond et al., 2012). Partitioning and models of molecular evolution followed the scheme described in Table 10 (by inclusion of a nexus “data sets” block, allowing for the full partitioning scheme to be utilized in BEAST), or utilized the next more general available model according to the software. The relaxed molecular clock model was partitioned for nuclear and mitochondrial data, with branch rates distributed according to a lognormal distribution (Drummond et al., 2006). The lognormal distribution provides greater flexibility than the exponential distribution option (Drummond et al., 2006). A birth-death process was set as the tree prior. The BEAST analysis was run for 40,000,000 MCMC generations as the sum of total runs and sampled every 5000 generations, with convergence checked in Tracer 1.6 (Rambaut and Drummond, 2007).

For fossil calibrations, taxon sets for the well-established clades Macropodidae and Potoroidae were set as monophyletic. Four fossil-based priors were used for calibration. These include uniform bound fossil constraints placed on (1) Dendrolagini, the divergence of Petrogale and Dendrolagus (3.60-14.22 Ma), with the minimum bound based on Mt. Etna Dendrolagus material (Hocknull, 2005), (2) Macropus (3.60-14.22 Ma), with the minimum bound based on the age of Macropus pavana (Meredith et al., 2008), and (3) the divergence between Lagorchestes and Macropus (4.46-14.22 Ma), based on Macropus fossils in the Hamilton fauna (Flannery, Rich, Turnbull & Lundelius, 1992). In each case the maximum bound acknowledges the absence of any macropodine fossils from well-sampled early-middle Miocene sites at Riversleigh or elsewhere (Phillips et al., 2013). One further calibration (4) is for the root of the tree (macropodids versus potoroids), with hard bounds from 17.79 Ma to 54.65 Ma. The minimum is based on the age of the earliest known well-documented macropodid genus, from Riversleigh Faunal Zone B sites dated to at least 17.79 Ma (Woodhead et al., 2014), and the maximum is based on the absence of any members of modern crown marsupial orders among the Tingamarra Fauna. A flat prior is inappropriate here, because it is generally accepted that several fossils from close to the Oligocene-Miocene boundary were close to the macropodid/potoroid divergence, and the upper bound is far too conservative, although necessary due to the absence of any intervening fossil record. As such, I employed a lognormal distribution for the prior, with a mean of 23.03 Ma (Oligo-Miocene boundary) and standard deviation of 2.587 Ma, placing an upper 97.5% soft bound at 54.65 Ma. The soft bound helps to define the shape of the prior, but is redundant to the hard upper bound at the same time.

Next, a lineage-through-time plot was constructed based on the dated phylogeny, in which the number of lineages was plotted (on a natural log scale) against the time since the most recent common ancestor (MRCA), to gain insight into the rate of diversification through time.
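The coordinates of a lineage-through-time plot can be derived directly from the divergence times of a dated, fully bifurcating tree. The following is a minimal sketch rather than the software used here. The function name is an assumption, and the node ages in the example are hypothetical values, not estimates from this study.

```python
import math


def lineages_through_time(node_ages):
    """Return (time since MRCA, ln(number of lineages)) for each divergence.

    node_ages are the ages (in Ma before present) of all internal nodes of
    a fully bifurcating dated tree; the oldest age is taken as the MRCA.
    """
    ages = sorted(node_ages, reverse=True)  # oldest divergence first
    root_age = ages[0]
    points = []
    for index, age in enumerate(ages):
        n_lineages = index + 2  # the root split yields 2 lineages; each later split adds 1
        points.append((root_age - age, math.log(n_lineages)))
    return points


# Hypothetical node ages in Ma (illustration only)
for elapsed, log_lineages in lineages_through_time([10.0, 6.0, 4.0, 1.0]):
    print(f"{elapsed:5.1f} Ma after MRCA: ln(N) = {log_lineages:.3f}")
```

On this scale a constant rate of lineage accumulation appears as a straight line, so departures from linearity indicate changes in the net diversification rate.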

3.3 RESULTS

3.3.1 Macropus outgroups and Macropus sub-genera

My Bayesian and maximum likelihood mitochondrial analyses both suggest a monophyletic status for Macropus, with strong statistical support from Bayesian posterior probability (BPP=1.00) and maximum likelihood bootstrap support (BP=100%) (Figure 18 and Figure 21). Both analyses place the hare-wallabies of Lagorchestes as the sister group to the Macropus/Wallabia clade (BPP=1.00, BP=99%), followed by Setonix (BPP=1.00, BP=97%).

The relationships among the sub-genera of Macropus remain less clear. Both Bayesian and ML mitochondrial analyses favour the monophyly of the grey kangaroos of M. (Macropus) (BPP=1.00, BP=100%) (Figure 18 and Figure 21). Curiously, however, M. (Notamacropus) was paraphyletic, with the red-necked wallaby (Macropus rufogriseus) placing as the sister taxon to the grey kangaroos in the Bayesian analysis, with reasonably strong Bayesian support (BPP=0.9913), while the likelihood analysis placed M. rufogriseus as sister to a clade encompassing M. (Macropus)+M. (Osphranter), with weak bootstrap support (BP=19%). The Bayesian analysis results in M. (Osphranter) grouping with the M. (Macropus) + Macropus rufogriseus clade (BPP=0.9893), with the remaining wallabies of M. (Notamacropus) falling outside (BPP=1.00) (Figure 18). However, the likelihood analysis groups M. (Macropus) with M. (Osphranter) (BP=27%), followed consecutively by M. rufogriseus (BP=19%) and M. rufus (BP=48%), effectively rendering these branches of the likelihood analysis a polytomy. In turn, this is followed by the consecutive placement of the remaining wallabies of M. (Notamacropus) (BP=100%) as a clade that falls sister to all other Macropus members (Figure 21).

When the nuclear dataset is analysed, Lagorchestes+Setonix form a clade (BPP=0.8695, BP=45%) and group with Macropus+Wallabia (BPP=1, BP=78%). The supermatrix analysis inferred that M. (Osphranter) and M. (Notamacropus) grouped together with weak support (BP=36%), and with the grey kangaroos of M. (Macropus) (BP=97%). Lagorchestes placed with Macropus+Wallabia (BP=92%), followed consecutively by Setonix (BP=89%), Thylogale/Dendrolagini (BP=100%) and Lagostrophus (BP=100%) (Figure 23). Some of this uncertainty can be overcome by utilizing a Wallabia+M. (Notamacropus) constraint based on very strong support from retrotransposons (Dodt et al., 2017) and whole nuclear genome sequences (Maria Nilsson-Janke, pers. comm.). With these constraints in place, the nuclear Bayesian analysis grouped M. (Notamacropus)+Wallabia with M. (Osphranter) (BPP=0.994), with M. (Macropus) falling just outside (BPP=1) (Figure 19).

3.3.2 Wallabia bicolor

The mitochondrial-genome analyses placed the swamp wallaby (Wallabia bicolor) as sister to the genus Macropus, with robust support (BPP=1, BP=94%) (Figure 18 and Figure 21). Conversely, the nuclear data analyses place Wallabia bicolor within the genus Macropus, rendering Macropus paraphyletic (Figure 19 and Figure 22) with weak support (BPP=0.9121, BP=42%). The nuclear data groups Wallabia bicolor with M. (Notamacropus) and M. (Osphranter). The unconstrained maximum likelihood supermatrix analysis (Figure 23) agrees with the nuclear-gene phylogeny (Figure 22), finding that the swamp wallaby groups as the sister taxon to the M. (Notamacropus) + M. (Osphranter) clade, with weak support (BP=56%).

3.3.3 Macropus irma

The mitochondrial genome Bayesian and maximum likelihood analyses placed the black gloved wallaby (Macropus irma) with the wallabies of M. (Notamacropus) (BPP=1, BP=57%), as the deepest diverging of these wallabies (Figure 18 and Figure 21). The nuclear gene analyses place Macropus irma with Macropus eugenii (BPP=0.992, BP=88%), and in turn place the M. irma+M. eugenii clade with all other M. (Notamacropus) wallabies (BPP=1, BP=87%) (Figure 19 and Figure 22). Both the Bayesian and maximum likelihood supermatrix reconstructions placed Macropus irma with M. (Notamacropus) (BPP=0.992, BP=98%), and these grouped, in turn, with core-Notamacropus (defined here as: M. eugenii, M. agilis, M. parma, M. dorsalis, M. parryi, M. rufogriseus) (BPP=0.9874, BP=61%) (Figure 20 and Figure 23).

3.3.4 Macropus dorsalis

With the inclusion of the Cytochrome b sequence to expand on the analysis of Phillips et al. (2013), the combined supermatrix analyses resolved the position of Macropus dorsalis within M. (Notamacropus), specifically as sister to Macropus parma, with strong support (BPP=1.00, BP=100%).

3.3.5 Macropus bernardus

Bayesian and maximum likelihood reconstructions of the mitochondrial-genome dataset provide the first confident molecular placement of the black wallaroo (M. bernardus), and show that it is nested within M. (Osphranter), grouping with Macropus robustus (BPP=1.00, BP=100%) (Figure 18 and Figure 21). The supermatrix analyses (which also included Macropus antilopinus) showed robust support for placement of Macropus bernardus with Macropus robustus+Macropus antilopinus (BPP=1.00, BP=100%), while support for Macropus robustus+Macropus antilopinus was also robust (BPP=0.9993, BP=100%), indicating that Macropus bernardus is the deepest diverging of the wallaroos (Figure 20 and Figure 23). This was followed in turn by the red kangaroo (Macropus rufus), which is the deepest diverging member of M. (Osphranter) (BPP=1.00, BP=100%) (Figure 20 and Figure 23).

3.3.6 Macropus greyi

Both Bayesian and maximum likelihood reconstructions of the supermatrix dataset show that the toolache wallaby (M. greyi) groups with the black-gloved wallaby (Macropus irma) (BPP=0.992, BP=98%), providing the first molecular confirmation that this extinct macropod is the closest relative of M. irma (Figure 20 and Figure 23).

3.3.7 Molecular Dating and Lineage Through Time Analysis

My time-calibrated phylogeny (Figure 24), inferred in BEAST v1.8.1 (Drummond et al., 2012), estimates that the sub-genus M. (Macropus), containing the grey kangaroos, diverged from all other members of Macropus (including Wallabia) near the Miocene-Pliocene boundary, 6.27 Ma (95% HPD 4.30 – 8.44). M. (Notamacropus)+Wallabia diverged from M. (Osphranter) approximately 5.83 Ma (95% HPD 4.03 – 7.73), while Wallabia diverged from M. (Notamacropus) slightly later, at 5.64 Ma (95% HPD 3.96 – 7.48). Within the sub-genus M. (Osphranter), the red kangaroo (Macropus rufus) split from the wallaroos at 4.58 Ma (95% HPD 3.07 – 6.33), and Macropus bernardus subsequently split from the other wallaroos (Macropus robustus+Macropus antilopinus) at 2.92 Ma (95% HPD 1.59 – 4.34). Among the wallabies of M. (Notamacropus), the Macropus irma+Macropus greyi clade was found to have diverged from core-Notamacropus at 5.33 Ma (95% HPD 3.82 – 7.06), and the split between Macropus irma and Macropus greyi occurred later, at 3.57 Ma (95% HPD 2.03 – 5.39). Macropus rufogriseus diverged from other wallabies at 4.78 Ma (95% HPD 3.37 – 6.32), with the mainland sub-species, Macropus rufogriseus banksianus, diverging from the Tasmanian sub-species, Macropus rufogriseus rufogriseus, 0.46 Ma (95% HPD 0.12 – 1.06). Macropus dorsalis split from Macropus parma at 1.73 Ma (95% HPD 0.75 – 2.90). Finally, the lineage through time analysis shows an increase in the rate of diversification (Figure 25), roughly coincident with the expansion of grasslands in Australia during the mid-Pliocene, from ~3-4 Ma.

Figure 18. Bayesian Phylogeny (MrBayes) of kangaroos based on the mitochondrial genome dataset with RY coding of third codon positions of protein coding regions. Bayesian posterior probabilities are shown at each node. Dendrolagus and Thylogale are represented by multiple species and branches have been collapsed. See Table S2 for specimen list. A supernetwork showing phylogenetic conflict among Figures 18 – 23 (Splitstree4) is shown in the Appendix – Figure S2.
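The RY coding mentioned in the caption of Figure 18 can be illustrated with a minimal Python sketch. It assumes an in-frame protein-coding sequence and recodes only the third codon position, mapping purines (A, G) to R and pyrimidines (C, T) to Y. The function name and the example sequence are hypothetical, and the sketch is not the pipeline used to prepare the dataset.

```python
# Purine/pyrimidine (RY) recoding of third codon positions.
# Assumption: the input is an in-frame protein-coding sequence.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def ry_code_third_positions(seq: str) -> str:
    """Recode every third base (codon position 3) as R or Y.

    Gaps and ambiguity codes are passed through unchanged.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if i % 3 == 2:  # third position of each codon
            out.append(RY_MAP.get(base, base))
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical nine-base example (three codons).
    print(ry_code_third_positions("ATGCCATTC"))
```

Recoding in this way discards transitions at the fastest-evolving sites and retains only transversions, which is the usual motivation for applying it to saturated third codon positions.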


Figure 19. Bayesian phylogeny (MrBayes) of kangaroos based on the five-nuclear gene dataset. Bayesian posterior probabilities are shown at each node. Dendrolagus and Thylogale are represented by multiple species and branches have been collapsed. See Table S2 for specimen list. A supernetwork showing phylogenetic conflict among Figures 18 – 23 (Splitstree4) is shown in the Appendix – Figure S2.


Figure 20. Bayesian tree (MrBayes) of kangaroos, based on the combined ‘supermatrix’ dataset (mitochondrial genomes and five nuclear genes). Bayesian posterior probabilities are shown at each node. A supernetwork showing phylogenetic conflict among Figures 18 – 23 (Splitstree4) is shown in the Appendix – Figure S2.


Figure 21. Maximum Likelihood Phylogeny (RAxML) of kangaroos based on the mitochondrial genome dataset. Bootstrap support values are shown at each node. A supernetwork showing phylogenetic conflict among Figures 18 – 23 (Splitstree4) is shown in the Appendix – Figure S2.


Figure 22. Maximum Likelihood Phylogeny (RAxML) of kangaroos based on the five-nuclear gene dataset. Bootstrap support values are shown at each node. A supernetwork showing phylogenetic conflict among Figures 18 – 23 (Splitstree4) is shown in the Appendix – Figure S2.


Figure 23. Maximum Likelihood Phylogeny (RAxML) of kangaroos based on the combined ‘supermatrix’ dataset (mitochondrial genomes and five nuclear genes). Bootstrap support values are shown at each node. A supernetwork showing phylogenetic conflict among Figures 18 – 23 (Splitstree4) is shown in the Appendix – Figure S2.


Figure 24. Time-Calibrated Bayesian phylogeny (BEAST v1.8.1) of kangaroos based on the ‘supermatrix’ dataset (mitochondrial genomes and five nuclear genes). Bayesian posterior probabilities are shown at each node. See Figure S1 (Appendix A) for confidence intervals of the molecular dating.


[Figure 25 plot. Y-axis: No. of lineages (natural log), 0 to 3. X-axis: Time since MRCA (Ma), 7 to 0.]

Figure 25. Lineage Through Time (LTT) Plot of Macropus/Wallabia based on the time-calibrated BEAST phylogeny, showing the rate of diversification of kangaroo lineages through time. X-axis is the time to the MRCA of Macropus/Wallabia in millions of years before present. Y-axis is the number of lineages represented as natural log values.
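The construction behind a lineage-through-time plot such as Figure 25 can be sketched in a few lines of Python. The sketch assumes a fully bifurcating dated tree, so that the root split leaves two lineages and every later split adds one. The function name is hypothetical, and the ages listed are only the subset of node ages quoted in Section 3.3.7, not the full set of nodes in the BEAST tree, so the output will not reproduce Figure 25 exactly.

```python
import math


def ltt_points(split_ages_ma):
    """Return (age, ln(lineages)) pairs for a lineage-through-time plot.

    split_ages_ma: ages (Ma) of the bifurcations in a dated tree, root included.
    Each bifurcation adds one lineage; the root split leaves two lineages.
    """
    ages = sorted(split_ages_ma, reverse=True)  # oldest split first
    return [(age, math.log(i + 2)) for i, age in enumerate(ages)]


# Illustrative subset of node ages (Ma) quoted in Section 3.3.7.
example_ages = [6.27, 5.83, 5.64, 5.33, 4.78, 4.58, 3.57, 2.92, 1.73]

if __name__ == "__main__":
    for age, log_n in ltt_points(example_ages):
        print(f"{age:5.2f} Ma  ln(N) = {log_n:.3f}")
```

Under a constant-rate pure-birth process the points fall approximately on a straight line, so a steepening of the curve is read as an increase in the rate of diversification.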


3.4 DISCUSSION

3.4.1 Phylogeny of kangaroos

3.4.1.1 Macropus Sub-genera

The relative positions of the sub-genera of Macropus have long been a contentious issue. Both my nuclear-gene and supermatrix analyses favour the placement of the subgenus M. (Macropus) at the root of Macropus, as sister to M. (Osphranter) and M. (Notamacropus)+Wallabia. This is supported by previous studies based on serology (Kirsch, 1977), DNA hybridization (Kirsch, 1977, Kirsch et al., 1995, Kirsch et al., 1997), nuclear genes (Meredith et al., 2008b) and retrotransposons (Dodt et al., 2017), all of which favoured placing M. (Macropus) as the deepest diverging lineage of Macropus. Conversely, studies that have utilized mitochondrial data tend to place M. (Notamacropus) at the base of Macropus (Westerman et al., 2002), and indeed my mitochondrial-genome analysis conforms with this finding (see Figure 18 and Figure 21). My data support the combined analysis of Phillips et al. (2013), which clearly showed discordance between the mitochondrial and nuclear datasets with regard to the position of the sub-genera. Morphological analyses have favoured placing the wallabies of M. (Notamacropus) as the deepest diverging lineage within Macropus, in agreement with mitochondrial analyses (Bensley, 1903, Raven and Gregory, 1946, Flannery, 1989, Prideaux and Warburton, 2010). More recently, Dodt et al. (2017) utilized retrotransposon insertions to shed light on the relationship of the sub-genera and found some support (with some conflict) for M. (Macropus) being placed at the root of Macropus, providing further support for the placement found in the nuclear-gene and supermatrix analyses. The conflict observed in the analysis of Dodt et al. (2017) suggests the importance of incomplete lineage sorting (ILS) in the Macropus radiation, which may be an explanation for the observed mitochondrial topology.


In light of this conflict, it appears that more data will be required to resolve this phylogenetic question, or it may be that the trifurcation at the root of Macropus is indeed unresolvable. Such difficult-to-resolve trichotomies are not uncommon when divergences occur very rapidly (i.e. when branch lengths between divergences are short), and parallels may be drawn with other mammals. So-called ‘hard polytomies’ that result from compressed cladogenesis have been suggested at the root of placental mammals, for which all three possible topologies among the major lineages have been proposed at various times (Kriegs et al., 2006, Churakov et al., 2009, Hallström and Janke, 2010, Meredith et al., 2011, McCormack et al., 2012, Tarver et al., 2016, Esselstyn et al., 2017). A similar polytomy has been suggested for the Australasian marsupials (Dasyuromorphia, Peramelemorphia and Notoryctemorphia) (Gallus et al., 2015b).
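The link between short internal branches and unresolvable trichotomies can be made concrete with the standard three-taxon coalescent result: for a species tree ((A,B),C) with internal branch length T in coalescent units, a gene tree matches the species tree with probability 1 - (2/3)e^(-T), and each of the two discordant topologies occurs with probability (1/3)e^(-T). The Python sketch below simply evaluates that formula. The function name and the branch lengths are illustrative assumptions, not estimates for Macropus.

```python
import math


def gene_tree_probs(t_coal: float):
    """Topology probabilities for a rooted three-taxon species tree ((A,B),C).

    t_coal: internal branch length in coalescent units (generations / 2Ne).
    Returns (P[concordant gene tree], P[each discordant topology]).
    """
    if t_coal < 0:
        raise ValueError("branch length must be non-negative")
    discordant_each = math.exp(-t_coal) / 3.0
    return 1.0 - 2.0 * discordant_each, discordant_each


if __name__ == "__main__":
    # Illustrative branch lengths only.
    for t in (0.0, 0.1, 0.5, 1.0, 3.0):
        conc, disc = gene_tree_probs(t)
        print(f"T={t:>4}: concordant={conc:.3f}, each discordant={disc:.3f}")
```

As T approaches zero the three topologies become equally probable (one third each), which is the situation described as a hard polytomy: adding loci does not favour any one resolution.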

3.4.1.2 Position of Wallabia bicolor

Both Bayesian and likelihood analyses based on the mitochondrial dataset place Wallabia as the sister taxon to Macropus. This is in agreement with previous studies that have utilized mitochondrial datasets (Westerman et al., 2002), but incongruent with nuclear gene studies which place Wallabia within a paraphyletic Macropus (Meredith et al., 2008b). This sister relationship with Macropus is also supported by phenotypic characters, such as morphological, reproductive, karyotypic and behavioural traits (Van Dyck and Strahan, 2008). The swamp wallaby is a browser, distinguishing it from the other wallabies of M. (Notamacropus), which are grazers or intermediate browsers-grazers. This major dietary difference is reflected in the of Wallabia which differs substantially from that of other wallabies (Prideaux and Warburton, 2010), and in a low gait that is well suited to negotiating dense vegetation (Van Dyck and Strahan, 2008).

In addition to ecological factors, karyotype studies have shown that the swamp wallaby has a chromosomal number of 11 in the male and 10 in the female, unlike the wallabies and kangaroos of Notamacropus, which have 16 (Van Dyck and Strahan, 2008). However, these phenotypic and karyotypic characters may be autapomorphic, and therefore not reliable for placing the swamp wallaby in its true phylogenetic context.

Returning to the issue of incongruence, the combined molecular analysis of Phillips et al. (2013) illustrates the conflict between mitochondrial and nuclear datasets with regard to the swamp wallaby. Incomplete lineage sorting among nuclear genes has been suggested as a possible explanation for this incongruence (Phillips et al., 2013). Under this scenario, the mitochondrial placement of Wallabia as sister to Macropus would be the true species relationship, while incomplete lineage sorting among the nuclear genes may be causing the conflicting signal that groups the swamp wallaby within a paraphyletic Macropus. Simulations performed by Phillips et al. (2013) suggest that analysis of as many as 30 nuclear genes may be necessary to overcome the putative ILS signal arising from the current standard nuclear dataset (five nuclear genes), which was first utilized by Meredith et al. (2008b). However, my study of retrotransposon insertions (Chapter 2) found strong evidence that the swamp wallaby should be placed within a paraphyletic Macropus, in concordance with previous nuclear gene studies (Meredith et al., 2008b).

In light of the retrotransposon evidence, coupled with the present nuclear gene phylogeny, I suggest that the true species relationship of the swamp wallaby is indeed nested within the diversity of Macropus, as sister to the wallabies of M. (Notamacropus). Under this scenario, the conflicting mitochondrial signal is most likely due to an introgression event from a now extinct sister lineage of Macropus, since coalescent simulations in Phillips et al. (2013) confidently exclude mtDNA ILS as an explanation. Mitochondrial introgression into the stem lineage of Wallabia would explain the tendency for mitochondrial studies to pull the swamp wallaby outside of the genus Macropus as a sister lineage, and indeed several now extinct macropods are known to have existed at that time based on the fossil record (Prideaux and Warburton, 2010). The close affinity of Wallabia bicolor with M. (Notamacropus) is also supported by the fact that Wallabia bicolor has long been known to hybridize with members of M. (Notamacropus). Hybrids have been observed between Wallabia bicolor and Macropus agilis (Van Gelder, 1977), Macropus rufogriseus (Smith et al., 1979) and Macropus eugenii (O'Neill et al., 1998). My nuclear-gene reconstructions found that Wallabia sits slightly deeper within Macropus, as sister to both M. (Notamacropus) and M. (Osphranter), with apparently strong Bayesian support but considerably weaker bootstrap support (BPP=0.9121, BP=42%), a curious placement that has been suggested in only a single reconstruction by Westerman et al. (2002), based on a combined dataset of mitochondrial genes and a single nuclear gene (Protamine P1), and with weak maximum likelihood bootstrap support (BP=52%). Thus, while taxonomic revision may indeed be warranted in this case, I suggest that further investigation, such as whole genome studies, should be conducted before serious discussion regarding the taxonomic re-classification of Wallabia bicolor can take place. Further, whole genome studies may shed light on putative introgression by examining large linkage blocks, and methods for this have been proposed previously (Sousa and Hey, 2013). The case of Wallabia demonstrates the conflict that can occur between different genetic loci when successive speciation events occur very rapidly.

3.4.1.3 Position of Macropus irma

All of my analyses (mitochondrial, nuclear and supermatrix) suggest that Macropus irma groups with the wallabies of M. (Notamacropus), a finding that is congruent with morphology (Dawson and Flannery, 1985, Wann and Bell, 1997, Van Dyck and Strahan, 2008) and nuclear gene studies (Meredith et al., 2008b) but contradicts earlier mitochondrial studies, which place Macropus irma within the diversity of the wallaroos of M. (Osphranter) (Phillips et al., 2013). Based on simulations, Phillips et al. (2013) suggested that a mitochondrial introgression event from an ancestral wallaroo into Macropus irma may be the most likely explanation for the mitochondrial placement of Macropus irma with the wallaroos of M. (Osphranter). Post-speciation introgression has been observed in other kangaroo clades, such as between populations of rock wallaby (Petrogale) (Briscoe et al., 1982, Eldridge and Close, 1992) and between the grey kangaroos (Macropus giganteus and Macropus fuliginosus) (Neaves et al., 2010b). However, Phillips et al. (2013) employed only five mitochondrial genes. With the inclusion of a complete mitochondrial genome for Macropus irma in my study, the mitochondrial placement of Macropus irma shifts to M. (Notamacropus). It is likely that model misspecification, leading to underappreciation of stochastic variation, explains the previous grouping of Macropus irma with M. (Osphranter). The mitochondrial placement of Macropus irma inside M. (Osphranter) by Phillips et al. (2013) was supported robustly by Bayesian posterior probability (BPP=1.00), but with more moderate support from maximum likelihood (BP=88%), further demonstrating the potential over-confidence of Bayesian posterior probabilities in phylogenetics. Bootstrap values have previously been shown to be far more faithful than typically overconfident BPP values (Suzuki et al., 2002, Gontcharov et al., 2004). My finding that Macropus irma groups with the wallabies of M. (Notamacropus), based on both the mitochondrial genome and nuclear sequences, further supports the retrotransposon study (Chapter 2).

3.4.1.4 Macropus bernardus

The membership of M. (Osphranter) is well defined morphologically (e.g. Dawson and Flannery, 1985), and all four species are arid-adapted (M. rufus and M. robustus) or at least adapted to seasonally dry conditions in the case of the northern members (M. antilopinus and M. bernardus). However, the latter two wallaroos have had scant consideration with molecular data, and the placement of the black wallaroo (M. bernardus) among the other members of the sub-genus is especially unclear.

For the first time, this study includes all four species together for analysis of DNA sequences. With the complete mitochondrial genome of the black wallaroo, my dated supermatrix analysis (Figure 24) shows that M. bernardus is sister to Macropus antilopinus and Macropus robustus, and diverged from these approximately 2.9 Ma (95% HPD 1.59 – 4.34), in the late Pliocene. This confirms that the black wallaroo is the deepest diverging of the wallaroos. The deep placement of the black wallaroo relative to the other wallaroos is consistent with previous phylogenetic estimates based on morphological and ecological traits (Cardillo et al., 2004, Eldridge et al., 2014). However, despite being the deepest diverging of the wallaroos, the black wallaroo is not the deepest diverging member within M. (Osphranter); this position is held by the largest living marsupial, the red kangaroo (Macropus rufus). This earlier divergence from the wallaroos was estimated at 4.58 Ma (95% HPD 3.07 – 6.33). Given the phylogenetic placement of the black wallaroo among the other members of M. (Osphranter), it appears that the smaller size, solitary behaviour and ecological distinctiveness of the black wallaroo from other wallaroos (Macropus robustus and Macropus antilopinus) and from the red kangaroo (Macropus rufus) are likely derived traits arising from adaptation to unique ecological conditions, rather than an indicator of the ancestral state of M. (Osphranter). However, this will be formally tested in Chapter 4.

3.4.1.5 Relationships among the wallabies of M. (Notamacropus)

Combining the mtDNA and nuclear genes in the supermatrix analysis for all living Macropus species clarifies relationships among the wallabies of M. (Notamacropus). In addition, this study provides the first inference of the affinities of the extinct toolache wallaby (Macropus greyi), which is strongly supported as sister to Macropus irma, confirming morphological assessments (Dawson and Flannery, 1985, Robinson and Young, 1983).

Understanding the relationship of Macropus dorsalis has also been limited by a paucity of studies that have included this species. According to morphology, Macropus dorsalis is robustly grouped with the wallabies of M. (Notamacropus) (Dawson and Flannery, 1985). However, only Phillips et al. (2013) have included DNA data for this species, and only for approximately 3.5 kb of the mitochondrial genome. I have expanded upon this dataset by including the mitochondrial Cytochrome b sequence. Full taxonomic coverage of M. (Notamacropus) allows M. dorsalis to be placed confidently among the other wallabies of M. (Notamacropus). My supermatrix analysis agrees with Phillips et al. (2013) by placing M. dorsalis close to the clade containing M. eugenii+M. agilis (Figure 20 and Figure 23). However, with the inclusion of additional taxa, my analysis infers that M. dorsalis is the sister species to M. parma, with M. parryi falling outside, as sister to the M. dorsalis+M. parma clade. Meredith et al. (2008b) also found a close association between M. parma and M. parryi with robust support; however, that study did not include M. dorsalis. Thus, I confirm that M. dorsalis is the closest living relative of M. parma, having diverged approximately 1.73 Ma (Figure 24). The supermatrix analysis also conforms with Phillips et al. (2013) by grouping M. eugenii and M. agilis together with reasonably strong support.

3.4.2 Molecular Dating and coincidence with climatic change

The Cenozoic saw a period of climate change from the warm-humid greenhouse conditions of the Paleogene to the icehouse phase of the Quaternary (Retallack, 2001, Zachos et al., 2001). Global timing of the responses of mammals during that time has been varied. For kangaroos, my molecular dating suggests that major cladogenesis took place from the origin of Macropus and Wallabia at ~7.62 Ma (95% HPD: 5.44-10.13 Ma) through to the basal divergences within M. (Notamacropus) at ~5.33 Ma (95% HPD: 3.82-7.05 Ma) (Figure 24), covering the period from the late Miocene to the early Pliocene and coinciding with substantial cooling and drying that took place in Australia at that time (Byrne et al., 2011). This climatic trend ultimately resulted in a global expansion of C4 grasses that took place from the late Miocene to the present (Cerling et al., 1997). The timing of the grassland expansion was largely simultaneous on a global scale (e.g. in Australia, Africa, Eurasia, North and South America), with evidence for some slight variations in timing across different geographical regions (e.g. Australia and Africa appear to have had major grassland/savannah expansions in the Pliocene, while Eurasia, North and South America saw grassland expansion in the Miocene) (see Figure 26) (Cerling et al., 1997, Strömberg, 2006).

In Australia, it appears that the major divergences within Macropus (and Wallabia) all occurred close to the Miocene-Pliocene boundary with diversification rates slowing in the Pliocene/Pleistocene (Figure 26). This expansion is coincident with substantial cooling/aridification, and represents a period of faunal turnover during which the major splits leading to the modern grazing kangaroos took place (Prideaux and Warburton, 2010).

Figure 26. An evolutionary history of browsing and grazing mammals, taken from Janis (2008). Bars indicate times of widespread grasslands: prairie (also equivalent to steppe or pampas) = treeless grassland; savanna = treed grassland. Closed circles record the first appearance of various events. Grassy habitats = habitats containing some grasses, probably woodland savanna or brushland. Grazers = mammals with craniodental adaptations (apart from hypsodonty) indicative of specialist grazing (i.e., > 90% of grass in the diet on a year-round basis). Palaeobotanical information adapted from data in Jacobs et al. (1999) and Strömberg (2004). Palaeotemperature curve is from global deep sea oxygen isotopes (adapted from Zachos et al. 2001); as used here it represents relative temperatures only (for general comparison, mid latitude mean temperatures during the early Eocene climatic optimum were probably around 30° C). Modified from Janis et al. 2004.

The role of climate in driving evolution over large geographical regions and for timescales that are shorter than typical mammalian species duration is unclear (Cerling et al., 1997); however, there is compelling evidence that faunal turnover coinciding with climate changes over several million years has occurred in many taxa that are associated with grassland expansion.


Early Miocene faunas in east Africa transitioned from primarily small tropical forest-dwelling browsers into more savanna-adapted faunas, with grazers emerging such as grazing antelopes and hippos, which replaced chevrotains and anthracotheres as the dominant artiodactyls (Cerling et al., 1997). The end of the late Miocene saw an expansion of seasonal open forests and wooded grasslands in Africa and a decline of woodland/closed forest habitats, with the Lothagam fauna representing transitional forms between the earlier Miocene and Plio-Pleistocene faunas (Leakey et al., 1996). Other Miocene examples include the Equidae of North America (MacFadden, 2000) and rodents in Spain (van Dam, 2006).

Fossil information indicates that a high rate of faunal turnover and adaptation has been associated with climate/environmental change for several placental clades, comparable to the faunal turnover and adaptations observed in kangaroos. The locomotory adaptations of limb evolution in , which transitioned from small, forest-dwelling fauna to open-adapted grazers corresponding to the opening of habitats in the mid-Miocene (Janis, 2008), seem to mirror the locomotory adaptations (bounding) of kangaroos to increasingly arid environments (Windsor and Dagg, 1971, Dawson, 1995). The decline of woodland-adapted placental mammals, followed by an increase in open-habitat representatives between 7-8 Ma, recorded in the fluvial Neogene Siwalik formations of northern Pakistan (Barry et al., 1985), is another example of comparable placental mammal faunal turnover. Hypsodont artiodactyls replaced tragulids, while hippopotamids and true giraffes appear in post-7.5 Ma fossil assemblages (Barry, 1995). Faunal turnover of small placental mammals included the regional extinction of dormice, coincident with the appearance of species with more open/arid adaptations such as rhizomyids and hares (Lindsay et al., 2013). The timing of these events seems to foreshadow a similar transition in kangaroos, and indeed, my molecular dating analysis (see Figure 24) places the origin of Macropus and Wallabia approximately at this time, in the late Miocene. However, there is a considerable lack of vertebrate fossil record in Australia corresponding to the late Miocene, precluding direct comparison (Prideaux and Warburton, 2010). The only well-known late-Miocene fossil assemblages of are the Alcoota (7-8 Ma) and Ongeva (6-7 Ma) assemblages from the southern Northern Territory of Australia (Archer et al., 1998). No arid-adapted specimens are known from that time, with all fossil at these sites being browsers rather than grazers (Murray, 1997); however, balbarids are known to have become extinct during this interval, followed by the emergence of macropodines and sthenurines (Murray, 1991, Prideaux, 2004), indicating some level of faunal turnover.

In North America, equid diversity increased to its maximum by the middle Miocene, followed by a decline in the late Miocene to early Pliocene (~7 – 4.5 Ma), along with declines among camelids, antilocaprids, palaeomerycids and gomphotheres during this interval, in favour of the hypsodont lineages which took over in the Pliocene (Vrba, 1995, MacFadden, 1994). This placental mammal transition parallels my finding that a period of major cladogenesis occurred in Macropus during this interval, including the major divergences that gave rise to the three sub-genera of Macropus and Wallabia (Figure 24).

3.4.3 Conclusion

I have provided the first molecular phylogenetic study that includes all living members of the genus Macropus as well as the recently extinct toolache wallaby. This study sheds light on the positions of key taxa within Macropus and Wallabia, and provides a time-calibrated phylogeny showing the coincidence of the diversification of Macropus and Wallabia with the known cooling and drying trend that characterized the Miocene-Pliocene epochs. The expansion of grasslands appears to coincide with a burst of diversification in kangaroos during the Pliocene, during which kangaroos evolved adaptations to more open environments. Comparison with the placental grazing herbivores (Hunt and Robert, 2004, MacFadden, 2000) suggests that faunal turnover and adaptation to more arid environments was a global phenomenon during the Miocene and Pliocene; however, the timing of these events between Australia and other geographical locations may not have been synchronous.

Chapter 4: Reconstructing the ancestral evolutionary history of Macropus

*Parts of this chapter have been published in Scientific Reports: Dodt, W.G., Gallus, S., Phillips, M.J. and Nilsson, M.A., 2017. Resolving kangaroo phylogeny and overcoming retrotransposon ascertainment bias. Scientific reports, 7(1), p.16811.

4.1 INTRODUCTION

In this chapter I use phylogenetic ancestral state reconstruction to trace aspects of kangaroo ecology and life history over the past 10 million years, and consider the impacts of climate and vegetation change. Ancestral state reconstruction is used in evolutionary biology to infer the ancestral path of heritable traits from common ancestors through to the modern taxa (Gascuel and Steel, 2014). By utilizing character state information from modern (and/or extinct) taxa, a phylogenetic tree with temporal or branch-length information and an evolutionary model of character change, it is possible to reconstruct the most likely character states at ancestral nodes (including the root) of a phylogenetic tree.
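The three ingredients listed above (tip states, a tree with branch lengths, and a model of character change) can be combined as in the following minimal Python sketch. It assumes a binary character evolving under a symmetric two-state Markov (Mk) model with equal prior probabilities at the root, and uses Felsenstein's pruning algorithm to obtain marginal probabilities for the root state. The tree, tip states, rate and function names are all hypothetical, and this is not necessarily the model or software used in this chapter.

```python
import math


def p_matrix(rate: float, t: float):
    """Transition probabilities for a symmetric two-state Mk model."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[same, 1.0 - same], [1.0 - same, same]]


def conditional_likelihoods(node, rate):
    """Pruning step: L[s] = P(tip data below node | node is in state s).

    A tip is an int state (0 or 1); an internal node is a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, int):
        return [1.0 if s == node else 0.0 for s in (0, 1)]
    like = [1.0, 1.0]
    for child, t in node:
        child_like = conditional_likelihoods(child, rate)
        p = p_matrix(rate, t)
        for s in (0, 1):
            like[s] *= sum(p[s][k] * child_like[k] for k in (0, 1))
    return like


def root_state_probs(tree, rate):
    """Marginal probabilities of each root state under equal root priors."""
    like = conditional_likelihoods(tree, rate)
    total = sum(like)
    return [x / total for x in like]


# Hypothetical tree: ((tip1:1.0, tip2:1.0):1.0, tip3:2.0), with made-up
# states (e.g. 0 = browser, 1 = grazer) and an arbitrary rate.
toy_tree = [([(1, 1.0), (1, 1.0)], 1.0), (0, 2.0)]

if __name__ == "__main__":
    print(root_state_probs(toy_tree, rate=0.2))
```

In practice the rate would be estimated from the data (for example by maximizing the likelihood summed over root states) rather than fixed, and the same pruning pass can be repeated with each internal node treated as the root to obtain marginal reconstructions across the tree.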

In the previous chapter I aimed to explore the phylogenetic history of kangaroos, but in the present chapter I have expanded this investigation to shed light on the ancestral states of key phenotypic characters among kangaroos, and I assess whether transitions of character states coincided with environmental change. Gaining an understanding of how species have responded to environmental changes in the past is critical for understanding how species will respond to modern climate change trends (Edwards et al., 2005), and to inform conservation programs.


The kangaroos and wallabies of the family Macropodidae have their origins in the Late Oligocene/Early Miocene (Raven and Gregory, 1946, Westerman et al., 2002, Kear et al., 2007, Prideaux and Warburton, 2010) and the expansion of the modern lineages appears to have coincided with cooling and aridification of Australia since that period (Prideaux and Warburton, 2010). At present, up to 70% of the Australian continent is classified as arid or semi-arid (Byrne et al., 2011); however, this was not always so. Throughout the Cenozoic, Australian environments have undergone considerable change. Prior to the Eocene, Australia was relatively warm and wet and was dominated by closed/wet/rainforest habitats (Byrne et al., 2011), but underwent a cooling/drying trend from the early Eocene which persisted well into the Oligocene, resulting in contraction of mesic environments and an expansion of arid zones (Byrne et al., 2011) (see Figure 27). Interpretations of Oligocene-Miocene climates have been less clear and somewhat controversial (Archer et al., 1989, Archer et al., 1997, Martin, 2006, McGowran and Li, 1994, Megirian, 1992, Megirian et al., 2004). However, studies examining marine biota, mammalian fossil assemblages (examining body mass distributions, and apparent ecological venue), changes in sea level and rainfall have been used to infer a relatively cool/dry icehouse environment in the late Oligocene (McGowran, 1986, McGowran and Li, 1994, Archer et al., 1997, Travouillon et al., 2009). By the end of the Late Oligocene, this cooling trend was interrupted by a warming period that persisted until the mid-Miocene climatic optimum, and which resulted in an upsurge in forest-adapted marsupial species in fossil records of that time (Travouillon et al., 2009). From the late Miocene to the present, the climate once again began to cool, resulting in an increase in more arid conditions, with the Pliocene aridification being exacerbated by mountain building in New Guinea casting a rain shadow (McGowran, 1986, McGowran and Li, 1994, Archer et al., 1997, Travouillon et al., 2009). The period encompassing the Late Miocene and Pliocene has been described as the beginning of the major drying of Australia that has led to the current climate (Byrne et al., 2011). There is no evidence of grasslands emerging until the Pliocene (Archer et al., 1994, Martin, 1994).


Figure 27. Summary of the palaeoclimate for the Australian continent since 65 Ma. Figure taken from Byrne et al. (2011).

Miocene-Pliocene cooling/drying led to the opening of many closed forest habitats, with increased distribution of more open sclerophyll forests (Martin, 2006, Black et al., 2012). Over this period two diverse kangaroo clades emerged, the short-faced kangaroos (Sthenurinae), which became extinct in the Late Pleistocene (Prideaux, 2004), and Macropodinae. However, sparse Late Miocene fossil assemblages have limited the potential for tracing these radiations against climatic/vegetation shifts. My molecular dating estimates in Chapter 3 provide an opportunity to more closely trace the evolution of macropodines.

The most iconic and species-rich group within the Macropodinae comprises the kangaroos, wallaroos and wallabies of the genera Macropus and Wallabia. These genera, as well as various closely related clades, have evolved to occupy a wide variety of habitats across the Australian continent and on various surrounding islands (Van Dyck and Strahan, 2008). The diversity of these habitats has led to macropods acquiring a wide array of adaptations for varying ecological conditions.


Body mass, size, colouration, behaviour and habitat preference vary substantially across this diverse group of marsupials (Van Dyck and Strahan, 2008). Members of Macropus are particularly well adapted to grazing (Sanson, 1989). However, most molecular dates place the origin of this genus several million years before the expansion of grasslands in Australia (Meredith et al., 2008; Mitchell et al., 2014). Resolving the timing of the initially rapid diversification of Macropus/Wallabia is important for inferring whether this event occurred before, or coincident with, the major spread of grasslands during the mid-Pliocene, from ~3.6 Ma (Byrne et al., 2011).

The wallabies of the sub-genus M. (Notamacropus) are generally the smallest members of Macropus, with average adult body masses (averaged across females and males) of 4–16 kg. The red kangaroo and the wallaroos of the sub-genus M. (Osphranter), as well as the grey kangaroos of M. (Macropus), tend to be larger. The average adult body mass of members of M. (Osphranter) ranges from 17–46 kg, while the mass of members of M. (Macropus) falls within the range of 26–33 kg (Van Dyck and Strahan, 2008). Foraging behaviour among the larger bodied kangaroos of M. (Osphranter) and M. (Macropus) tends to be primarily grazing, with a preference for more open habitats, and in several cases their distributions expand into more arid habitats. The smaller bodied wallabies of M. (Notamacropus) more often exhibit a combination of browsing/grazing foraging within or near open forest or woodland habitats (Van Dyck and Strahan, 2008).

The formation of social groups known as ‘mobs’ is common among kangaroos and wallabies. While some species exhibit solitary behaviour, others are highly gregarious and form mobs ranging in size from a few to dozens of individuals, often centred on a dominant alpha male with several females (Jarman, 1987). Some kangaroo species form transient feeding aggregates, but these lack the social structure of permanent mobs and tend to disperse after feeding (Van Dyck and Strahan, 2008).


In addition to inhabiting a diverse array of habitats and exhibiting a wide range of behavioural traits between species, kangaroos and wallabies also show a high degree of variation between sexes of the same species. Sexual dimorphism is a phenomenon in which males and females of the same species present significant differences for any trait other than those associated with the sex organs. These differences may include body mass/size, colour, markings and/or behaviour (Darwin, 1883, Kottler, 1980, Wallace, 2007). Sexual dimorphism is very common and highly distinctive among many animal groups (striking examples include peafowl, butterflies and angelfish) (Manning, 1989, Oliver and Monteiro, 2011). Extreme sexual dimorphism is less common in mammals than in some other vertebrate taxa such as birds; however, notable examples within mammals include the red kangaroo (Frith and Calaby, 1969, Van Dyck and Strahan, 2008), the sperm whale (Physeter catodon) (Bryden, 1972), and weasels such as Mustela erminea (Hall, 1951).

Among marsupials, however, sexual dimorphism is usually subtle, most often involving moderate body size differences (Van Dyck and Strahan, 2008). The most extreme sexual dimorphism among marsupials occurs in macropods (Warburton et al., 2013, Richards et al., 2015). Among kangaroos and wallabies, sexual dimorphism includes: i) body mass, in which males, on average, tend to be larger than females in the majority of cases. The most striking examples of size sexual dimorphism are found in the largest kangaroo species, the red kangaroo (Macropus rufus) and the grey kangaroos of M. (Macropus), with males more than twice the body mass of females (Warburton et al., 2013). Size sexual dimorphism is also observed in several other kangaroo and wallaby species to varying degrees. Among living macropodids, only in Lagorchestes hirsutus are females typically (slightly) larger than males (mean adult body mass is 1740 grams for females and 1580 grams for males) (Van Dyck and Strahan, 2008); ii) coat colour sexual dimorphism, which is generally regarded as a sexually selected trait and, within kangaroos, is only substantial among the wallaroos and red kangaroo of the sub-genus M. (Osphranter), as shown in Figure 28.


Figure 28. Sexual dimorphism in kangaroos for coat colour and body mass in two members of the sub-genus M. (Osphranter). (A) male black wallaroo; (B) female black wallaroo; (C) female red kangaroo; (D) male red kangaroo.

*Kangaroo photos taken from:

(https://animalcaseprofile.wordpress.com/2014/05/15/red-kangaroo-Macropus-rufus/),

(http://www.travelling-australia.info/Infsheets/Redkangaroo.html) and

(http://mayh-dja-kundulk.bininjgunwok.org.au/plant_or_animals/black-wallaroo-female)

In this chapter I present ancestral state reconstructions built upon a dated phylogeny to trace the evolution of life history traits among kangaroos, with a focus on the genera Macropus and Wallabia. Utilizing the first taxonomically complete molecular phylogeny of Macropus/Wallabia, I present a set of Bayesian inference ancestral state reconstructions for habitat, size sexual dimorphism, coat colour sexual dimorphism and mob size in order to trace the evolutionary history of this iconic group of Australian marsupials. These characters cover key factors that can influence the evolution of species, including sexual selection, camouflage and gregariousness. I also explore the putative influence of environmental change on these traits.

4.2 MATERIALS AND METHODS

4.2.1 Molecular dating and ancestral habitat reconstruction in MrBayes

Phylogenetic reconstruction, molecular dating and ancestral habitat reconstruction were carried out with MrBayes 3.2.6, and employed the five nuclear gene (Rag1, BRCA1, vWF, IRBP, ApoB) data matrix of Meredith et al. (2008) for 32 macropods, six outgroup diprotodontians and Dromiciops, with key constraints based on the retrotransposon analysis (Chapter 2), in order to assess the implications of the retrotransposon findings. I initially ran a non-clock phylogenetic analysis with the groupings of Wallabia/M. (Osphranter)/M. (Notamacropus) and Onychogalea/“Macropus” held consistent with the independently derived retrotransposon information (Chapter 2), but without restricting the placement of taxa not included in the retrotransposon study. Molecular dates and habitat ancestry were then co-inferred on the resulting topology. In all analyses evolutionary models were partitioned, with stationary base frequencies, the GTR substitution matrix, gamma shape, invariant sites, and relative branch lengths separately estimated. Four million MCMC generations (sampled every 5000th) for two independent runs, each with one cold and two heated chains, were sufficient to provide model ESS values, in Tracer 1.6, well over 200 (indicating convergence of the chains) and split frequency standard deviations <0.01.
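The ESS threshold mentioned above can be illustrated with a minimal sketch. The estimator below is a generic autocorrelation-based one, truncated at the first non-positive lag; it is an assumption for illustration and is not claimed to be the algorithm implemented in Tracer. The two simulated traces are hypothetical and are not output from the analyses in this chapter.

```python
import random

def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of lagged autocorrelations).

    The sum stops at the first non-positive autocorrelation, a simple
    truncation rule; Tracer's own estimator may differ in detail.
    """
    n = len(trace)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]
    variance = sum(c * c for c in centred) / n
    if variance == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = acov / variance
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

# Sampling scheme from the text: 4 million generations, sampled every 5000th.
generations, sample_every = 4_000_000, 5_000
samples_per_run = generations // sample_every  # sampled states per run, before burn-in

# Hypothetical traces: one well mixed, one strongly autocorrelated ("sticky").
rng = random.Random(1)
independent = [rng.gauss(0, 1) for _ in range(samples_per_run)]
sticky = [0.0]
for _ in range(samples_per_run - 1):
    sticky.append(0.9 * sticky[-1] + rng.gauss(0, 1))

ess_independent = effective_sample_size(independent)
ess_sticky = effective_sample_size(sticky)
print(f"samples per run: {samples_per_run}")
print(f"ESS (well mixed): {ess_independent:.0f}; ESS (sticky): {ess_sticky:.0f}")
```

The same number of sampled states can carry very different amounts of independent information, which is why ESS rather than chain length is used as the convergence criterion.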

For molecular dating, the igr relaxed clock model was employed in MrBayes, with priors igrvarpr=exp(10) and clockratepr=lognorm(-6.0,0.6). The following fossil calibrations were used: (1) Diprotodontia and (2) , both with uniform bounds, 25.5-54.65 Ma. The lower bound is based on the minimum age of the oldest crown vombatiformes (and diprotodontians) among the Etadunna faunas (Meredith et al., 2011). The maximum bound is the maximum age of the Tingamarra fauna (Beck et al., 2008), which includes an assemblage of plesiomorphic marsupials and no putative crown diprotodontians. (3) Petauridae-, with uniform bounds, 25.5-54.65 Ma (Meredith et al., 2008a). (4) Macropodiformes, with uniform bounds, 24.7-54.65 Ma. The maximum bound follows calibrations 1-3; however, the minimum bound is based on the earliest well established macropodoids, such as Bulungamaya (Butler et al., 2016) from Etadunna Faunal Zone C. (5) Macropodoidea, with truncated lognormal bounds, 17.79-28.5 Ma. This calibration is based on Phillips (2015a) but with the minimum age updated in view of new radiometric dates for Ganguroo at the Neville’s Garden site at Riversleigh (Woodhead et al., 2016). The distribution mean (23.03 Ma) is placed at the Oligo-Miocene boundary, recognizing close macropodoid crown/stem transitional forms from around this time (e.g. Cooke et al., 2015). The maximum bound is “soft”, allowing for the possibility of origins pre-dating the base of the Late Oligocene. (6) Thylogale-Dendrolagini, with uniform bounds, 4.36-14.22 Ma. The minimum is based on Thylogale ignis fossils from the Hamilton fauna (Flannery, 1992) and the maximum bound is the maximum age of Ringtail site at Riversleigh and recognises the absence of Macropodinae from Riversleigh Faunal Zone C sites or contemporaneous sites elsewhere.
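To give a sense of what the relaxed-clock priors quoted at the start of this paragraph imply, the following sketch summarises them numerically. It assumes that lognorm(-6.0,0.6) is parameterised by the mean and standard deviation of the log rate and that exp(10) is an exponential distribution with rate 10; it also assumes that, with calibrations expressed in Ma, the clock rate is in substitutions per site per million years. These parameterisations should be checked against the MrBayes manual before being relied upon.

```python
import math

# Assumed parameterisation: mean and standard deviation on the log scale.
mu, sigma = -6.0, 0.6   # clockratepr=lognorm(-6.0,0.6)
lam = 10.0              # igrvarpr=exp(10), assumed to be a rate parameter

median_rate = math.exp(mu)                  # median of a lognormal
mean_rate = math.exp(mu + sigma ** 2 / 2)   # mean of a lognormal
lower_95 = math.exp(mu - 1.96 * sigma)      # central 95% prior interval
upper_95 = math.exp(mu + 1.96 * sigma)
mean_igr_variance = 1 / lam                 # mean of an exponential

print(f"prior median clock rate: {median_rate:.5f}")
print(f"prior mean clock rate:   {mean_rate:.5f}")
print(f"central 95% interval:    {lower_95:.5f} to {upper_95:.5f}")
print(f"prior mean of the igr variance parameter: {mean_igr_variance}")
```

Under these assumptions the clock-rate prior is centred on a few substitutions per thousand sites per million years and spans roughly an order of magnitude.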

For ancestral state reconstruction, primary habitat was coded (0) closed/wet/rainforest, (1) open canopy forest, (2) arid zone, based on known habitat preferences (Strahan, 1995, Tyndale-Biscoe, 2005). Ancestral state reconstructions were performed for habitat based on maximum parsimony and Bayesian inference in MrBayes 3.2.6. Most species of kangaroos and wallabies will be found in open canopy forest/woodland at least some of the time; however, the rainforest state is assigned here only when the majority of populations of a species/clade predominantly occupy rainforest or other mesic closed canopy habitats.

The grassland state (state 2) is distinguished by distributions extending into sparsely treed, more arid grasslands or other open habitats. These grassland specialized macropods are typically dentally specialized grazers, including M. (Osphranter), M. (Macropus) and Onychogalea (see Sanson, 1989). In terms of ecology, geography and adaptation, state 1 is intermediate between states 0 and 2, and therefore this character was treated as ordered, and variable.

Table S3 (Appendix) shows habitat coding for each species.

4.2.2 Ancestral State Reconstructions in BayesTraits

In addition to the MrBayes time-calibrated phylogeny and ancestral habitat reconstruction above (section 4.2.1), I performed additional Bayesian ancestral state reconstructions in the program BayesTraits v2 (available at www.evolution.rdg.ac.uk), based on the time-calibrated phylogeny obtained from the more complete supermatrix dataset (Chapter 3; see section 3.2.3 for parameters). BayesTraits provides advantages over MrBayes in being able to use (and test) complex, but biologically reasonable, restrictions on transformations between particular states, and in testing evolutionary correlations between traits.

BayesTraits v2 was used to estimate ancestral states for (1) habitat; (2) mob size; (3) coat colour sexual dimorphism; (4) body mass; and (5) body mass ratio (see Table 12). Discrete coding was used for coat colour and habitat. Mob size was also discretized, because approximations and ranges rather than counts are typically given in the literature. Continuous coding was used for size sexual dimorphism (Table 12).


                                                    Coat colour   Mean female body   Mean male body     Size ratio
                                                    sexual        mass (natural log  mass (natural log  (natural
Species                          Habitat  Mobsize   dimorphism    of mass in grams)  of mass in grams)  log)
Potorous_tridactylus             1        0         0             6.93               7.07               0.45
Aepyprymnus_rufescens            1        0         0             7.82               7.87               0.06
Lagostrophus_fasciatus           1        0         0             7.57               7.57               0
Thylogale                        0        0         0             8.29               8.71               0.4
Dendrolagus                      0        0         0             8.99               9.14               0.15
Petrogale_xanthopus              -        1         0             8.35               8.64               0.29
Setonix_brachyurus               1        1         0             7.97               8.19               0.22
Lagorchestes_conspicillatus      1        0         0             7.93               7.97               0.05
Lagorchestes_hirsutus            2        0         0             7.34               7.29               -0.03
Wallabia_bicolor                 1        0         0             9.47               9.74               0.27
Macropus_giganteus               2        1         0             9.99               10.71              0.72
Macropus_fuliginosus             2        1         0             9.76               10.44              0.66
Macropus_rufus                   2        1         1             10.18              11.1               0.91
Macropus_robustus                2        1         0             9.81               10.43              0.62
Macropus_bernardus               1        0         1             9.47               9.95               0.48
Macropus_eugenii_genome          1        1         0             8.61               8.92               0.31
Macropus_agilis                  1        1         0             9.31               9.85               0.55
Macropus_parma                   1        0         0             8.48               8.68               0.21
Macropus_irma                    1        0         0             8.99               8.99               0
MB1611_Toolache_wall_SA_mland    1        0         0             8.85               8.85               0
Macropus_dorsalis                1        1         0             8.78               9.68               0.9
Macropus_parryi                  1        1         0             9.31               9.68               0.37
Macropus_antilopinus             2        1         1             9.77               10.52              0.75
Macropus_rufogriseus_KJ868122    1        1         0             9.55               9.89               0.34
Macropus_rufogriseus_banksianus  1        1         0             9.53               9.83               0.3

Table 12. Coding for ancestral state reconstruction in BayesTraits v2. Habitat coding is 0=wet/closed forest; 1=open forest; 2=extending into the arid zone. Mobsize coding is 0=does not form mobs; 1=does form mobs (not just feeding aggregations). Coat colour sexual dimorphism coding is 0=not sexually dimorphic; 1=sexually dimorphic. Body mass values are presented as the natural log of mean mass (in grams). Size ratio values are the natural log of the male to female ratio. Thylogale represents Thylogale stigmatica and Thylogale billardierii; Dendrolagus represents Dendrolagus lumholtzi and Dendrolagus goodfellowi.
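Because the size ratio is expressed as a natural log, it can be read as the difference between the two log body masses, and exponentiating it recovers the plain male:female ratio. The sketch below illustrates this for three rows of Table 12 chosen for illustration; the function and variable names are hypothetical.

```python
import math

# (ln mean female mass, ln mean male mass, ln size ratio) as listed in Table 12.
table_12_rows = {
    "Macropus_giganteus": (9.99, 10.71, 0.72),
    "Macropus_dorsalis": (8.78, 9.68, 0.90),
    "Macropus_parryi": (9.31, 9.68, 0.37),
}

def ln_size_ratio(ln_female, ln_male):
    """ln(male/female) equals the difference of the two log masses."""
    return ln_male - ln_female

for species, (ln_f, ln_m, listed) in table_12_rows.items():
    computed = ln_size_ratio(ln_f, ln_m)
    raw_ratio = math.exp(computed)  # back-transform to a plain male:female ratio
    print(f"{species}: ln ratio {computed:.2f} (listed {listed:.2f}), "
          f"male:female {raw_ratio:.2f}")
```

Working on the log scale makes a doubling and a halving of body mass symmetric about zero, which suits a random-walk model of trait change.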

For size sexual dimorphism, all values were based on average body mass of female and male macropods taken from Van Dyck and Strahan (2008) and Myers et al. (2006). A ‘continuous random walk’ model was employed for continuous trait MCMC analysis, with an exponential prior, and a mean of 10 as a default parameter specified by the BayesTraits manual V2 (available at www.evolution.rdg.ac.uk). For mob size, all modern taxa were coded as either ‘mob forming’ or ‘non-mob forming’ based on data from Van Dyck and Strahan (2008) and Myers et al. (2006). For discrete characters I utilized the ‘multistate’ model within BayesTraits, with Maximum Likelihood analysis performed. For coat colour sexual dimorphism, all modern taxa were coded as coat colour sexual dimorphism being present or absent. Some taxa, such as M. robustus and M. rufogriseus banksianus, have minor coat colour sexual dimorphism at least in some populations; however, taxa were only considered to have coat colour sexual dimorphism present if that dimorphism is more pronounced than geographical variation in coat colour. In other words, coat colour dimorphism was coded as present if the variation between males and females within a population is greater than the variation between populations. Coat colour dimorphism was determined based on the following sources: http://animaldiversity.org, Strahan (1995), Vernes (2016) and extensive field knowledge from my supervisory team. For habitat, all modern taxa were coded as either closed forest, open forest or arid. I define the habitats as 1) closed/wet/rainforest; 2) open forest; 3) distribution extends substantially into arid zones.

4.3 RESULTS

4.3.1 Adaptive Radiation of Kangaroos

In this chapter, I inferred the timescale and ancestral habitats (Figure 29) for the diversification of kangaroos in the program MrBayes 3.2.6, based on the phylogenetic findings from the five nuclear gene (Rag1, BRCA1, vWF, IRBP, ApoB) data matrix of Meredith et al. (2008) for 32 macropods and utilizing constraints from my retrotransposon findings in Chapter 2. This was performed in order to test the implications of the retrotransposon findings. Next I carried out ancestral habitat reconstruction in BayesTraits (Figure 30) based on the phylogeny from the more complete ‘supermatrix’ data of Chapter 3. As such, Figure 29 and Figure 30 are both ancestral reconstructions of habitat, but are derived from phylogenies based on different datasets. I then expanded my ancestral state reconstructions in BayesTraits to include mob size, coat colour sexual dimorphism and size sexual dimorphism, in addition to habitat preference.


Figure 29. Dated Bayesian phylogeny (MrBayes) of the Macropodidae with ancestral states for habitat preference displayed as coloured branches. Black = closed wet forest; green = mesic/intermediate/open; brown = arid zones. The red shaded window represents the mid-Miocene climatic optimum of approximately 15-16 Ma; the yellow shaded window represents the major aridification of Australia and the coincident expansion of grasslands from ~4 Ma. Boxes coloured brown/green/black indicate that the Petrogale clades represented by the included taxa are polymorphic with regard to habitat preference, with some species favouring grasslands over mesic habitats. The anomaly zone (Degnan and Rosenberg, 2006) (blue dotted box) outlines the portion of the tree that contains the greatest degree of hemiplasy.


Figure 30. Maximum Likelihood ancestral state reconstruction performed in BayesTraits v2 for habitat of kangaroos plotted on a time calibrated (BEAST) Bayesian phylogeny (Figure 24). Bayesian posterior probability support values for this tree are given in Figure 24. Blue colouring indicates closed/wet forest, orange colouring indicates open forest and red colouring indicates arid zone habitat preference. Black indicates uncertainty; relatively high rates of habitat preference evolution leave low signal for nodes ancestral to long unbroken branches. Thylogale represents T. stigmatica, T. thetis and T. billardierii; Dendrolagus represents D. lumholtzi and D. bennettianus.


4.3.1.1 Molecular dating and ancestral habitat reconstruction in MrBayes

Relaxed clock molecular dating analyses of five nuclear genes, using six fossil calibrations (Figure 29) and incorporating constraints based on the retrotransposon findings, found that the four “Macropus” clades, M. (Macropus), M. (Osphranter), M. (Notamacropus) and Wallabia, diverged successively from each other over a period covering about 1.7 Ma, beginning 6.71 Ma (95% HPD: 5.07-8.22 Ma). The last of these divergences, between Wallabia and M. (Notamacropus), was inferred at 4.98 Ma (95% HPD: 3.78-6.53 Ma). An earlier phase of diversification from about 7-9 Ma covers almost all intergeneric divergences among both potoroids and macropodines.

Parsimony ancestral state reconstruction favours a transition from open canopy forest to grassland prior to the divergence of Onychogalea and Macropus, well before Pliocene aridification and major grassland expansion, during which Wallabia and M. (Notamacropus) would require a reversal to habitats dominated by open canopy forest. Bayesian inference modelling of habitat evolution in MrBayes instead favours a scenario in which all transitions from open canopy forests to grassland can be dated to the Pliocene grassland expansion, and no reversals are required (Figure 29). However, this more palaeobotanically congruous Bayesian inference analysis does not strongly reject earlier transitions.

4.3.1.2 Ancestral habitat preference of kangaroos in BayesTraits

Ancestral reconstructions for habitat in BayesTraits, with the expanded taxon sampling, showed multiple ecological transitions from open forest to arid zones occurring independently in the lineages leading to Lagorchestes, Macropus agilis, Macropus rufus, Macropus robustus and the grey kangaroos of M. (Macropus) (Figure 30). My reconstructions find no reversals from more arid zones to open or closed forest habitats, which is biologically/climatically plausible given the gradual aridification of Australia since the mid-Miocene.


4.3.2 Ancestral size sexual dimorphism of kangaroos

Ancestral state reconstruction inferred substantial size sexual dimorphism among ancestral macropodines, with males being larger than females in the majority of cases (Figures 31 and 32).

Little to no size sexual dimorphism was found at the root of the Macropodoidea; however, this may reflect reduced model/statistical power along the long early branches, for which the relatively high rate of size evolution results in means being drawn close to central values.

The signal is clearer at the root of the Macropodinae, with a mean inferred ratio of adult male body mass relative to females of ~1.3. Similar degrees of size sexual dimorphism were inferred for the ancestor of Macropus+Wallabia+Lagorchestes+Setonix, the ancestor of Thylogale+Dendrolagini, the ancestor of the Dendrolagini, and the ancestor of Macropus+Wallabia bicolor+Lagorchestes (see Table 11 for clade name explanations).

Increased size sexual dimorphism (male:female adult body mass ratio of ~1.5) was inferred for the most recent common ancestor (MRCA) of the Macropus/Wallabia clade, and for most daughter nodes within. The most notable exception is the MRCA of the M. irma+M. greyi clade (male:female adult body mass ratio, 1.21). The greatest magnitude of size sexual dimorphism occurs within M. (Macropus), and their MRCA is inferred to have a male to female body mass ratio of ~1.9. Size sexual dimorphism estimates were similarly high among M. (Osphranter), with male:female size ratio estimates for all-inclusive nodes ranging from ~1.6 to ~1.9 (Table 13). Several clades showed substantially reduced size sexual dimorphism. This was estimated in the ancestor to Lagorchestes conspicillatus + Lagorchestes hirsutus, with a ratio of 1.18. The size sexual dimorphism ratio for the ancestor of M. irma + M. greyi (1.21) is also reduced substantially from 1.46 at the origin of M. (Notamacropus) (Table 13).


Figure 31. Bar graph illustrating ancestral state reconstruction of macropod body mass (grams) displayed for males (red) and females (blue), indicating the degree of size sexual dimorphism across different clades.


Table 13. Ancestral state reconstructions for body mass and body mass ratio (male: female), based on the dated (BEAST) phylogeny (Figure 24). Values are presented in absolute numbers, and also as natural log (ln) values to reduce multiplicative error.


Figure 32. Bar graph illustrating ancestral state reconstruction of macropod body mass ratio (values are natural log of the mass in grams).

4.3.2.1 Ancestral mobsize of kangaroos

Ancestral reconstructions suggest that mobbing had evolved by the root of Macropus/Wallabia, approximately 6.2 Ma, in the Late Miocene, though an even deeper origin of mobbing is also possible (Figure 33). Subsequent loss of mobbing behaviour was inferred for the lineage leading to the MRCA of Macropus irma and Macropus greyi, therefore from at least ~3.5 Ma, during the Pliocene. Mobbing behaviour was also inferred to have been lost in the lineages that respectively gave rise to Wallabia bicolor, Macropus bernardus and Macropus parma. All other members of Macropus appear to have retained mobbing behaviour (Figure 33).

Figure 33. Maximum Likelihood ancestral state reconstruction (BayesTraits) for kangaroo mob size plotted on a time calibrated Bayesian phylogeny (BEAST). Green colouring indicates mobbing behaviour, while red colouring indicates absence of mobbing. Black branches indicate uncertain ancestral states.


4.3.2.2 Ancestral coat colour sexual dimorphism

Ancestral reconstructions of coat colour sexual dimorphism showed robust support for only a single origin of substantive sexual dimorphism in coat colour within macropods, in the ancestral lineage of M. (Osphranter). All modern members of M. (Osphranter) appear to have retained this characteristic, with the exception of M. robustus (Figure 34).

Figure 34. Maximum Likelihood ancestral state reconstruction (BayesTraits) for coat colour sexual dimorphism of kangaroos plotted on a time calibrated Bayesian phylogeny (BEAST). Green colouring indicates clades that are sexually dimorphic for coat colour, while red colouring indicates absence of substantive coat colour sexual dimorphism.


4.4 DISCUSSION

The cooling and aridification trend that characterized the Neogene resulted in extinctions among rainforest communities, and range contraction of the surviving lineages into refugia (Byrne et al., 2011). Aridification of the Australian continent necessitated the adaptation of many species to more arid and open conditions and is epitomized in the adaptive radiation of kangaroos (Prideaux and Warburton, 2010). Molecular timescales for the evolution of Macropus and Wallabia have remained uncertain, partly due to conflicting phylogenetic placements and the use of speculative fossil calibrations (see Phillips, 2015a). Here, with several phylogenetic placements clarified from the retrotransposon analysis (Chapter 2) and sequence based analysis (Chapter 3), I have attempted to reconstruct the ancestral life history evolution of kangaroos on a dated phylogeny, estimating ancestral states for habitat, mob size, size sexual dimorphism and coat colour sexual dimorphism.

Ancestral habitat reconstruction reveals multiple independent ecological shifts among kangaroos with distribution expansions into more arid, open habitats (particularly grasslands), coinciding with the Pliocene increase in aridification across Australia over the past ~4 million years (Martin, 2006, Black et al., 2012). Using robust fossil calibrations, my estimate for the crown origin of Macropus+Wallabia of 6.71 (5.07-8.22) Ma is slightly younger and more precise than most earlier estimates (e.g. Meredith et al. 2008b, Phillips et al. 2013). Deeper in the tree, the crown origins of both major macropod families, Macropodidae and Potoroidae, approximately coincide with the mid-Miocene climatic optimum, about 15-16 Ma (Figure 29), when rainforest dominated much of Australia. It is interesting that my habitat reconstruction places the ancestors of both potoroids and macropodids in open canopy forest, potentially advantaging both groups of taxa as the forests opened later in the Miocene. Open forests already existed during the Oligocene (Byrne et al., 2011), when the initial transition from rainforest to open-canopy forests is likely to have occurred among macropods (Figure 29).


Transitions or expansions from open canopy forest habitats to more open and widespread grasslands are inferred to have occurred independently in the lineages leading to Onychogalea, M. (Macropus), M. (Osphranter), Lagorchestes hirsutus, and within Petrogale. Each of these transitions falls on branches that temporally match the development of Australia’s grasslands, which became widespread by the Late Pliocene (3.6-2.6 Ma) (Martin, 2006, Black et al., 2012). These inferences are also consistent with forest-dwelling being retained earlier, in the oldest known (~5-4.5 Ma; Pledge, 1992, Flannery et al., 1992) putative members of both Macropus and their close relative, Protemnodon (Close and Lowry, 1990).

Ancestral reconstructions of sexual dimorphism showed considerable heterogeneity among kangaroos for body mass and, to a lesser extent, for coat colour (see Figure 31 and Figure 34). Overall, the greatest degree of body mass sexual dimorphism among modern kangaroo taxa is seen among the more arid adapted members of M. (Osphranter) and the grey kangaroos of M. (Macropus) (Van Dyck and Strahan, 2008). These findings among the extant taxa are reflected by my inferences among ancestral members of these lineages, with regard to both size sexual dimorphism (Figure 31) and ancestral habitat reconstruction (Figure 29 and Figure 30), as well as coat colour sexual dimorphism (Figure 34) in the case of M. (Osphranter). The drivers of sexual dimorphism among these kangaroos and wallabies are unclear, and more generally there have been comparatively few studies on the drivers of sexual dimorphism in mammals. The majority of studies of sexual dimorphism among terrestrial vertebrates have focussed on passerine birds (Verner and Willson, 1966, Orians, 1969), although some examples of mammalian studies include pinnipeds (Bartholomew, 1970), primates (Crook, 1972), porpoises (Jefferson, 1990) and bears (Derocher et al., 2005).

Sexual dimorphism among mammals is generally quite varied, such that some species favour males being larger than females - often with additional secondary traits that are absent in females

(Andersson, 1994, Eisenberg and Eisenberg, 1981) - while others favour females being larger

137

Chapter 4: Reconstructing the ancestral evolutionary history of Macropus than males (Ralls, 1976, Ralls, 1977). Overall body size can often be an indicator of the degree of size sexual dimorphism among mammals, and it has been suggested that this is the general rule within the animal kingdom (Rensch, 1950, Rensch and Rensch, 1959). Indeed, modern kangaroo taxa, specifically the larger bodied grey kangaroos of M. (Macropus) and the red kangaroo of M.

(Osphranter) conform to this general rule (Van Dyck and Strahan, 2008), as do my inferred ancestral states for these groups (Figure 31 and Figure 32). This possible allometric link agrees with trends observed in other mammals, for which the most sexually dimorphic species tend to be those groups in which the modal body size is large, such as catarrhine primates, proboscideans, artiodactyls and larger members of Chiroptera (Ralls, 1977). Further, Ralls

(1977) described extreme dimorphism as a male to female body mass ratio of 1.6 or greater (see Table 14 for examples of extreme size sexual dimorphism among mammalian orders). By this criterion, my reconstructions infer that extreme size sexual dimorphism among kangaroos occurred in the ancestors of M. (Osphranter), M. (Macropus) and the ancestor to M. parma+M. dorsalis (Table 13), indicating that extreme size sexual dimorphism evolved in multiple kangaroo lineages during the Pliocene and Pleistocene, roughly coinciding with a significant expansion of grasslands across the Australian continent (Martin, 2006, Black et al., 2012). In each of the above-noted groups, the most sexually size dimorphic taxa are those distributed in the most arid (or open) habitats.
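As a rough illustration of how the Ralls (1977) criterion can be applied, the following Python sketch computes male to female body mass ratios for a handful of extant species, using the mean adult masses listed in Table 15. The function name and the category labels other than "extreme" are illustrative assumptions and are not terms taken from Ralls (1977); only the 1.6 threshold comes from the text above.

```python
# Mean adult body mass in grams as (male, female), copied from Table 15.
MASSES = {
    "Macropus rufus": (66000, 26500),
    "Macropus giganteus": (45000, 21800),
    "Macropus dorsalis": (16000, 6500),
    "Macropus parma": (5900, 4800),
    "Lagorchestes hirsutus": (1470, 1540),
    "Macropus irma": (8000, 8000),
}

EXTREME_RATIO = 1.6  # threshold for extreme dimorphism (Ralls, 1977)


def classify_dimorphism(male_g, female_g, threshold=EXTREME_RATIO):
    """Return (ratio, label). Labels other than 'extreme' are illustrative."""
    ratio = male_g / female_g
    if ratio >= threshold:
        label = "extreme male-biased"
    elif ratio > 1.0:
        label = "male-biased"
    elif ratio < 1.0:
        label = "female-biased"
    else:
        label = "monomorphic"
    return ratio, label


if __name__ == "__main__":
    for species, (male_g, female_g) in MASSES.items():
        ratio, label = classify_dimorphism(male_g, female_g)
        print(f"{species:<24} {ratio:5.2f}  {label}")
```

With these values, M. rufus, M. giganteus and M. dorsalis exceed the 1.6 threshold, whereas Lagorchestes hirsutus is the only species in this subset in which females are heavier than males.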

My ancestral reconstructions for body mass recovered the ancestor of the Macropodoidea as well as the root of the tree (Macropodiformes) as slightly sexually dimorphic in favour of larger females (see Figure 32). These putative ancestral lineages are inferred to have been solitary, rainforest dwelling species with females similar in size or potentially being slightly larger than males – a phenomenon only observed in Lagorchestes hirsutus among extant Macropods (Van

Dyck and Strahan, 2008) and examined in other mammalian clades by Ralls (1976). These results for the root and early nodes of my tree are surprising given that only a single modern taxon (Lagorchestes) exhibits sexual dimorphism in which females are larger than males. As such, I caution that this finding may be an artefact of the ancestral state reconstruction. It has

been suggested that the ancestral state of the root can be more difficult to infer than that of other nodes under some scenarios, and indeed the root is the most ancient node in the tree and therefore the furthest from the data (extant taxa) that are observed today (Gascuel and Steel, 2014).

Table 14. Orders of mammals exhibiting sexual dimorphism in size, taken from Ralls (1977).

A tendency for mobbing and sexual size dimorphism is prevalent among the larger grazing kangaroos of M. (Macropus) and M. (Osphranter) and their close ancestors, which suggests a possible link between larger mob size and sexual dimorphism for size. This assumption is biologically congruous and in agreement with findings among placental mammals that sexual dimorphism typically increases with the ratio of females to males (McPherson and Chenoweth,

2012). Trivers (1972) emphasised the role of parental investment in driving sexual dimorphism for size, pointing out that the sex which devotes the least time and energy to parental investment will tend to compete for mates and be subject to sexual selection. This appears to be the case for kangaroos, in which competition between large males for access to females is evident in the more sexually dimorphic clades (the grey kangaroos and the red kangaroo), and indeed males show a greater degree of variation in size than females when compared across multiple kangaroo taxa (see Table 15 for variation of body size among male macropods compared to females).

It can be inferred from my ancestral state reconstructions that sexual dimorphism of coat colour (dichromatism) evolved along the M. (Osphranter) stem lineage and has been retained in extant taxa of this sub-genus, with the exception of M. robustus (minor differences between male and female colouration occur in some M. robustus populations). The underlying causes of colour sexual dimorphism among mammals are poorly understood, particularly among marsupials, in part because of a lack of studies among these taxa. A possible explanation is that more open, arid environments allow greater visibility and therefore phenotypic traits, such as fur colouration, are likely to be influenced, either through sexual selection by females or by the need for camouflage, an idea that has been suggested in some cases among passerine birds (Savalli, 1995, Badyaev and Hill, 2003, Owens, 2006), but mammalian studies addressing this question have been sparse (Dixson et al., 2005, Setchell and Jean Wickings, 2005). In contrast to birds, relatively few mammalian lineages exhibit sexual dimorphism for colour. Some ungulates, such as eland, have been observed to turn dark blue as adult harem holders when one or a few males control a comparatively large number of females (Caro, 2005), a finding that parallels the red kangaroo, in which males tend to exhibit distinctly red/brown colouration, while females tend to be blueish/grey and a dominant male tends to control a harem of females (mean mob size of ~10 individuals) (Van Dyck and Strahan, 2008). In the case of lions, black mane colouration is often sexually selected by females as an indicator of higher food intake, testosterone levels and age, indicating a strong role for sexual selection in driving dichromatism in this case (West and Packer, 2002).


Table 15. Standard deviation of adult body mass between male and female macropods.

Species                            Mean adult male size (grams)   Mean adult female size (grams)

Potorous tridactylus               1180                           1020
Aepyprymnus rufescens              2630                           2480
Lagostrophus fasciatus             1936                           1936
Thylogale                          6050                           4000
Dendrolagus                        9300                           8000
Petrogale xanthopus                5640                           4230
Setonix brachyurus                 3600                           2900
Lagorchestes conspicillatus        2896                           2770
Lagorchestes hirsutus              1470                           1540
Wallabia bicolor                   17000                          13000
Macropus giganteus                 45000                          21800
Macropus fuliginosus               34100                          17600
Macropus rufus                     66000                          26500
Macropus robustus                  33800                          18200
Macropus bernardus                 21000                          13000
Macropus eugenii                   7500                           5500
Macropus agilis                    19000                          11000
Macropus parma                     5900                           4800
Macropus irma                      8000                           8000
Macropus greyi                     7000                           7000
Macropus dorsalis                  16000                          6500
Macropus parryi                    16000                          11000
Macropus antilopinus               37000                          17500
Macropus rufogriseus               19700                          14000
Macropus rufogriseus banksianus    18600                          13800

Standard deviation                 16051.44672                    6951.615594
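The standard deviations reported in the final row of Table 15 can be checked with a short Python sketch. It assumes the sample (n - 1) standard deviation, which is the form that reproduces the tabulated values; the variable and function names are illustrative only.

```python
import statistics

# (species, mean adult male mass in g, mean adult female mass in g), from Table 15.
TABLE_15 = [
    ("Potorous tridactylus", 1180, 1020),
    ("Aepyprymnus rufescens", 2630, 2480),
    ("Lagostrophus fasciatus", 1936, 1936),
    ("Thylogale", 6050, 4000),
    ("Dendrolagus", 9300, 8000),
    ("Petrogale xanthopus", 5640, 4230),
    ("Setonix brachyurus", 3600, 2900),
    ("Lagorchestes conspicillatus", 2896, 2770),
    ("Lagorchestes hirsutus", 1470, 1540),
    ("Wallabia bicolor", 17000, 13000),
    ("Macropus giganteus", 45000, 21800),
    ("Macropus fuliginosus", 34100, 17600),
    ("Macropus rufus", 66000, 26500),
    ("Macropus robustus", 33800, 18200),
    ("Macropus bernardus", 21000, 13000),
    ("Macropus eugenii", 7500, 5500),
    ("Macropus agilis", 19000, 11000),
    ("Macropus parma", 5900, 4800),
    ("Macropus irma", 8000, 8000),
    ("Macropus greyi", 7000, 7000),
    ("Macropus dorsalis", 16000, 6500),
    ("Macropus parryi", 16000, 11000),
    ("Macropus antilopinus", 37000, 17500),
    ("Macropus rufogriseus", 19700, 14000),
    ("Macropus rufogriseus banksianus", 18600, 13800),
]


def mass_std_devs(rows):
    """Return the sample standard deviations of (male, female) body mass."""
    males = [male for _, male, _ in rows]
    females = [female for _, _, female in rows]
    return statistics.stdev(males), statistics.stdev(females)


if __name__ == "__main__":
    male_sd, female_sd = mass_std_devs(TABLE_15)
    print(f"male std dev:   {male_sd:.2f} g")
    print(f"female std dev: {female_sd:.2f} g")
```

Because body mass spans more than an order of magnitude across these taxa, a raw standard deviation is dominated by the largest species; a reader wishing to compare the sexes on a relative scale could apply the same function to log-transformed masses.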

Interestingly, male lions tend to live in small coalitions with other males, while females live in prides, lending further credibility to the idea that group size and group dynamics play a role in sexual dichromatism among mammals. In addition, West and Packer (2002) found an association between dark mane colour and cooler environments, implying a possible link between climate and mammalian dichromatism. This is relevant to kangaroos because the most arid adapted members tend to have the greatest degree of sexual dimorphism in both size (Figure 31 and

Figure 32) and coat colour (Figure 34), and also tend to be mob forming (Figure 33) (Van Dyck and Strahan, 2008). It also appears that the most variation in coat colour (and size) is among males. For example, the red kangaroo (Macropus rufus) inhabits much of the arid interior of the

Australian continent (Van Dyck and Strahan, 2008), with males exhibiting red coat colouration much the same as the red iron-oxide rich soil that blankets the interior of the continent, while females tend to be more blue-grey (Van Dyck and Strahan, 2008). This may be due to

behavioural differences between males and females. Until males “take over” a mob they tend to remain solitary and therefore may rely more on camouflage, while females tend to be more gregarious and therefore retain a degree of group protection (pers. comm. MP), perhaps lessening the need for camouflage. However, the colouring of males may also be due to sexual selection favouring distinct coat colour relative to females, or to a combination of these factors.

In contrast, Dawson and Brown (1970) suggested that the colouration of fur among the more arid adapted kangaroos (the red kangaroo and the euro or hill wallaroo) is likely associated with the reflective properties of certain colours and their influence on body heat regulation, rather than with camouflage, and that the dichromatism between males and females in these arid adapted species is largely driven by sexual selection. Curiously, the black wallaroo (Macropus bernardus) tends to be solitary in both sexes, yet represents a striking example of sexual dichromatism, with males being black, while females tend to be much lighter in colour (Van Dyck and Strahan, 2008). This lessens the emphasis on group dynamics, such as mobbing, and suggests that sexual selection alone may be the driver of sexual dichromatism in this taxon, although I cannot exclude the possibility that camouflage within the more forested escarpments it occupies may also play a role. Overall, it appears that sexual dimorphism manifests most often in males among Macropus species, with males having the greatest variation in size and fur colouration, while females exhibit much less variation in size and tend to be largely grey or grey-brown (Van Dyck and

Strahan, 2008). However, given the heterogeneity observed among different kangaroo clades, it is likely that no single factor is responsible for driving sexual dimorphism among macropods, but rather a combination of abiotic factors (such as climate) and biotic factors (such as sexual selection and group dynamics) likely work synergistically to give rise to the sexual dimorphism observed within some kangaroo clades.

Chapter 5: Deep phylogeny and dating of the Macropodidae using retrotransposons, mitochondrial and nuclear genes

5.1 INTRODUCTION

The Macropodiformes are the most species rich sub-order within the order Diprotodontia and present a striking radiation into a diverse array of environments across Australasia (Eisenberg,

1981, Flannery, 1984, Flannery, 1989). Within the Macropodiformes, the most species rich family is the Macropodidae, which has mid-Miocene or earlier origins (Prideaux and Warburton,

2010) and includes kangaroos, tree-kangaroos, rock wallabies, and several genera of small to mid-sized wallabies (Van Dyck and Strahan, 2008). The basal divergence between Lagostrophus and the macropodines is well supported by previous molecular studies (Baverstock et al., 1989,

Westerman et al., 2002, Mitchell et al., 2014, Llamas et al., 2015) and morphology (Prideaux,

2004, Prideaux and Warburton, 2010, Prideaux and Tedford, 2012). However, despite a long history of investigation using both morphology and molecular datasets, the basal divergences among the Macropodinae remain unresolved as a six-clade polytomy: Macropus+Wallabia (true kangaroos and wallabies), Lagorchestes (hare-wallabies), Onychogalea (nail-tail wallabies), Setonix (quokka), Dendrolagini+Thylogale (tree kangaroos, rock wallabies and pademelons) and the Dorcopsini (New Guinean Forest Wallabies).

The Macropus/Wallabia clade (described in detail in previous chapters) has been suggested to be sister to the nail-tail wallabies

(Onychogalea) based on dental characters that appear to have evolved for grazing (Flannery and

Hann, 1984, Prideaux and Warburton, 2010), although with weak statistical support. Conversely, molecular studies have largely preferred a more distant relationship between Macropus/Wallabia and Onychogalea, instead favouring Lagorchestes as the sister group to Macropus/Wallabia

(Meredith et al., 2008b, Mitchell et al., 2014, Meredith et al., 2009). The molecular analysis of

Mitchell et al. (2014) found a sister relationship between Macropus/Wallabia and Lagorchestes, with Onychogalea sitting outside and grouping with Setonix, albeit with relatively weak statistical support. The nuclear gene analysis of Meredith et al. (2008b) found a sister relationship between Macropus/Wallabia and Lagorchestes and placed Onychogalea deeper in the tree, although neither grouping was statistically well supported.

The tree kangaroos of the genus Dendrolagus inhabit the tropical rainforests of far northeastern

Queensland, New Guinea and some surrounding islands, with only two species occurring in

Australia (D. bennettianus and D. lumholtzi) (Flannery, 1995, Van Dyck and Strahan, 2008).

Another genus of tree kangaroos, Bohra, inhabited less mesic Australian forests and woodlands

(e.g. Dawson, 2004) until its Late Pleistocene extinction. Interestingly, it appears that tree kangaroos are secondarily arboreal having evolved from terrestrial kangaroos in the Late

Miocene (Flannery, 1989). This is supported by ancestral state reconstructions by Meredith et al.

(2009), which inferred that the ancestor to all members of the Macropodidae was terrestrial. Tree kangaroos represent a return to a canopy dwelling lifestyle that appears to be unique among macropods (Procter-Gray and Ganslosser, 1986) and are the only species of macropod possessing the ability to bipedally hop as well as move their legs alternately (Flannery et al., 1996).

The phylogenetic position of the tree kangaroos has historically been contentious, with early studies suggesting a close relationship with the New Guinean Forest Wallabies (Dorcopsis and

Dorcopsulus) close to the base of the Macropodinae (Bensley, 1903, Raven and Gregory, 1946,

Kirsch, 1977, Archer, 1984, Flannery, 1989). It was suggested by Bensley (1903) that

Dendrolagus gave rise to the Dorcopsini by secondary reversion to a terrestrial lifestyle, however

Raven and Gregory (1946) suggested that the reverse scenario is far more likely.

Counter to this grouping, Windsor and Dagg (1971) suggested grouping Dendrolagus and

Petrogale together based on similar locomotory styles, and this was also supported by the osteological characters analysed by Prideaux and Warburton (2008, 2010). The serological analysis of Baverstock et al. (1989) was the first to show molecular evidence for a close association between Dendrolagus and Petrogale, using microcomplement fixation of albumin. However, they stressed that this phylogenetic placement may be an artefact of slow rates of albumin evolution in these two genera. A similar grouping was found in the DNA hybridisation analyses of Kirsch et al. (1995), which also grouped

Dendrolagus and Petrogale together, with Thylogale sitting just outside. A nuclear gene analysis by Meredith et al. (2009) showed a similar result to the hybridisation analysis.

Mitochondrial estimates also placed Dendrolagus with Petrogale, but Thylogale grouped more distantly, in a basal position to a sister clade made up of Lagorchestes, Setonix,

Onychogalea, and Macropus (Westerman et al., 2002). A mtDNA association between

Dendrolagus and Petrogale has also been suggested by Potter et al. (2017).

The rock wallabies of the genus Petrogale inhabit steep rocky terrain, escarpments, cliffs, rocky outcrops and boulder piles (Sharman and Maynes, 1983) across mainland Australia and several offshore islands, but not in Tasmania or New Guinea (Sharman et al., 1989, Eldridge et al.,

1991). Rock wallabies inhabit a diverse range of habitats and climatic zones ranging from tropical rainforests to deserts (Van Dyck and Strahan, 2008). Traditionally, Petrogale has been grouped with Thylogale and/or Macropus (Bensley, 1903, Raven and Gregory, 1946, Kirsch,

1977, Archer, 1984, Flannery, 1989).

Given that rock wallabies occupy a wide range of habitats, while tree-kangaroos have long been considered to occupy a very narrow range of environments, namely tropical rainforests, a popular view is that Dendrolagus originated in the rainforest from an especially agile rock wallaby (Flannery et al., 1996). Both groups share an affinity for climbing and navigating a three-dimensional environment and indeed some rock wallaby species have the ability to climb trees (Martin, 2005). Prideaux and Warburton (2010) claim that the osteological anatomy of rock wallabies precludes direct ancestry to tree kangaroos, but is consistent with the idea that they share a common ancestor. The divergence between Petrogale and Dendrolagus has been suggested to have occurred 7.5 Ma, based on the DNA hybridisation study of Campeau-Péloquin et al. (2001); however, Prideaux and Warburton (2010) caution that this may be too recent, given the stark morphological differences between the two genera. The association of Plio-Pleistocene tree kangaroos of the extinct genus Bohra with more arid woodland environments of Southern Australia and the interior of the continent suggests that their range was more diverse in the past than what is represented by extant taxa

(Tedford et al., 1992, Prideaux et al., 2007, Prideaux and Warburton, 2008, Prideaux and

Warburton, 2009). This is consistent with the idea that the common ancestor of Petrogale and Dendrolagus inhabited the arid sclerophyll woodland habitats of the interior of

Australia, with Dendrolagus populations becoming established in eastern Australia (and subsequently, New Guinea) at the Miocene-Pliocene boundary, while Petrogale became established in eastern Australia in the Pliocene (Petrogale persephone lineage) and

Pleistocene (Petrogale penicillata complex) (Briscoe et al., 1982, Prideaux and Warburton,

2010).

An association between Petrogale and the former genus, Peradorcas, was supported by

Meredith et al. (2009), who grouped Petrogale+Peradorcas together, with Dendrolagus and then Thylogale as successive sister groups. The microcomplement fixation (MC’F) study of Baverstock et al. (1990) largely agreed, but placed Dendrolagus outside of Petrogale/Peradorcas/Thylogale. More recently, Potter et al. (2012) suggested dissolving the monotypic genus Peradorcas based on continuously evolving dental characters, and as a result these genera have now been incorporated into a single genus, Petrogale. With the molecular analyses of Mitchell et al. (2014) and May-Collado et al. (2015), a more solid consensus is developing around Thylogale being sister to

Dendrolagini (Dendrolagus and Petrogale).

The quokka (Setonix brachyurus) is the only living member of the genus Setonix, and is currently distributed on islands off the coast of Western Australia (Rottnest Island and Bald Island) and in several scattered remnant populations on the Western Australian mainland (Van Dyck and Strahan, 2008). The phylogenetic placement of Setonix has been contentious over the past century, with studies suggesting an affinity with practically every macropodine genus (Prideaux and Warburton, 2010). Placement of Setonix at the base of non-Dorcopsini Macropodines has been suggested by Prideaux and Warburton (2010) and agrees with the (weakly supported) placement found by a DNA hybridisation study by Kirsch et al. (1995). Bensley (1903) placed Setonix with Dendrolagus, based on dental characters, while other notable studies

(Thomas, 1888, Tate et al., 1948, Flannery, 1989) claimed an affinity between Setonix and

Macropus, and Raven and Gregory (1946) grouped Setonix with Thylogale, a grouping that agrees with the chromosome number of these genera (2n=22) (Hayman, 1989). The MC’F study of Baverstock et al. (1989) instead showed an affinity between Setonix and Macropus. Thus, there is a long history of disagreement regarding the position of Setonix, and given the apparent lack of close relatives, clarifying its position requires further investigation.

Early studies of New Guinean Forest Wallabies (Dorcopsis and Dorcopsulus) concluded that they were derived from tree kangaroos by secondary reversion to a terrestrial habitat (Bensley,

1903). An affinity between Dorcopsis and Dorcopsulus has long been accepted. DNA hybridisation studies (Springer and Kirsch, 1991, Kirsch and Palma, 1995), mitochondrial analyses (Burk et al., 1998, Burk and Springer, 2000, Westerman et al., 2002) and morphology

(Woodburne, 1967, Flannery, 1984, Prideaux and Warburton, 2010) all agreed that the

Dorcopsini grouped together and are the most basal sister lineage to all other Macropods with the exception of Lagostrophus. A nuclear gene analysis by Meredith et al. (2009) agrees that

Dorcopsis and Dorcopsulus are close relatives, but placed them as sister to the

Dendrolagini+Thylogale clade, although with weak statistical support, leaving the question of the relative position of the Dorcopsini open to further investigation.

There has long been a debate over which is the best estimator of the true species tree in phylogenetic reconstruction – sequences from multiple genes, analysed as a single concatenate or sequences from multiple genes, analysed separately and summarized as gene trees. It has been

claimed that the concatenated (or combined-data) approach is the most reasonable method to reconstruct the species tree (Kluge and Wolf, 1993, Siddall, 1997). This stems in part from the belief that natural data partitions do not exist and that concatenating sequences from all available genes, in addition to other characters (such as morphological features), is the best approach, because it reduces subjectivity (Kluge, 1989, Kluge and Wolf, 1993, Nixon and Carpenter, 1996).

However, this method ignores genes as the basic functional unit of the genome (Liu and Pearl,

2007), instead viewing nucleotides as the best estimators of the species tree. Slowinski and Page

(1999) were critical of this approach because it assumes that nucleotides are independent estimators of species relationships, and that the longer the sequence, the more closely the inferred tree will match the species tree. The alternative approach, using gene trees, still requires analysis of nucleotides within genes, but allows heterogeneity in phylogenetic signal among genes to be considered when estimating the species tree (Page, 1998, Pamilo and Nei, 1988). It is now generally accepted that any given gene tree may not match the species tree regardless of the length of the sequence (Liu and Pearl, 2007). This is supported by work which showed that incongruent gene trees are more likely than congruent gene trees during speciation under certain combinations of branch lengths, in particular when the time between divergences is short (Degnan and Salter, 2005, Degnan and Rosenberg, 2006).

Concatenation methods can gain substantially in precision by overcoming stochastic error with large numbers of variable sites (Hillis et al., 1994), and concatenation can be the more reliable approach when individual gene trees have little phylogenetic signal or when there is little incomplete lineage sorting (ILS). However, problems can arise when gene trees do not share the same topology. This has led to the concatenation method being challenged, and to the proposal of methods that reconstruct phylogeny while taking into account incongruence between loci (Carstens and Knowles, 2007, Kubatko and Degnan, 2007, Degnan and Rosenberg,

2009, Liu et al., 2009).
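To make the concatenation approach concrete, the following Python sketch joins per-gene alignments into a single supermatrix and records the column range of each gene, so that partitions can still be defined afterwards. It is a minimal illustration only: the function name, the toy sequences and the use of "?" for taxa that lack a gene are assumptions made for the example, and it is not the procedure used to build the dataset analysed in this chapter.

```python
def concatenate_alignments(gene_alignments, missing="?"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Returns (supermatrix, partitions); partitions is a list of
    (gene, start, end) column ranges, 1-based and inclusive.
    """
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa not sequenced for this gene are padded with missing data.
            pieces[taxon].append(aln.get(taxon, missing * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy data, invented purely for illustration.
TOY_GENES = {
    "geneA": {"Macropus": "ACGT", "Wallabia": "ACGA"},
    "geneB": {"Macropus": "TTG", "Setonix": "TCG"},
}

if __name__ == "__main__":
    matrix, parts = concatenate_alignments(TOY_GENES)
    for taxon, seq in matrix.items():
        print(f"{taxon:<10} {seq}")
    print(parts)
```

Keeping the partition boundaries alongside the supermatrix is what allows a concatenated analysis to assign separate substitution models to different genes or codon positions, whereas a gene-tree approach would instead analyse each entry of the input separately.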


The primary intention with species tree approaches has been to overcome discordance among gene trees, including the over-influence of ‘outlier’ genes that may be associated with incomplete lineage sorting or introgression, as well as horizontal gene transfer, gene duplication and gene deletion. Such methods for ‘summarizing’ multiple gene trees into a single tree have gained widespread use in phylogenetics in recent years (Maddison, 1997,

Degnan and Rosenberg, 2009, Nakhleh, 2013). The most important recent advances with species trees have been phylogenetic approaches that model the coalescent process by which gene trees evolve and can be embedded within species divergence (Kubatko et al., 2009,

Maddison and Maddison, 2010). It is now possible to simultaneously estimate species trees and their constituent gene trees (Liu, 2008, Heled and Drummond, 2009). Each of these approaches can account for population-level processes such as incomplete lineage sorting, that can cause gene tree discordance.

Thus, despite extensive investigation among many of these Macropodine genera, a well-supported consensus for the relative phylogenetic positions of the key clades (Macropus+Wallabia, Lagorchestes, Onychogalea, Setonix, Dendrolagini+Thylogale and the Dorcopsini) remains elusive. To shed light on this polytomy, I performed phylogenetic inference on a concatenated nuclear and mtDNA dataset. I have expanded upon previous datasets by adding six newly sequenced nuclear genes and utilizing complete and partial mitochondrial genomes. Having partially clarified macropodid relationships, I then used concatenation and species-tree methods to estimate the timescale of kangaroo evolution.

5.2 MATERIALS AND METHODS

Macropod DNA was extracted at Queensland University of Technology, Curtin University,

University of Adelaide and the Senckenberg Biodiversity and Research Centre in Frankfurt.

All sequencing (Sanger method) was carried out at Queensland University of Technology.

PCR primers were designed (Table 16) for protein coding regions of six nuclear genes that

have not previously been sampled widely for macropods (Table 17), but were included in

Meredith et al.’s (2011) family-level mammal study. These genes were selected based on having >5% parsimony informative sites relative to the total sequence length, calculated in PAUP 4.0b10 (Swofford, 2002) from a preliminary analysis of available marsupial sequences for these loci from GenBank (Table S4). Each PCR reaction contained 14.75 μL of distilled H2O, 5 μL of Bioline 5x MyTaq reaction buffer, 2 μL of 50 mM MgCl2 solution, forward and reverse primers (10 pmol/μL) at 1 μL each, 0.25 μL of Taq polymerase and 1 μL of DNA template, for a total volume of 25 μL. The PCR protocol involved an initial denaturation (3 minutes at 94°C); 35 cycles of denaturation (30 seconds at 94°C), annealing (30 seconds at the appropriate temperature for each amplicon) and extension (90 seconds at 68°C); and a final extension (10 minutes at 72°C). PCR reactions were performed in an Eppendorf Cycler (Queensland University of Technology).
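The >5% selection criterion above rests on counting parsimony informative sites, that is, alignment columns in which at least two different character states are each present in at least two taxa. The Python sketch below shows that calculation on a toy alignment. It is an illustration under stated assumptions, not a reimplementation of PAUP: gaps, "?" and "N" are simply ignored, other ambiguity codes are treated as ordinary states, and the function name and example sequences are invented.

```python
from collections import Counter


def parsimony_informative_fraction(alignment, ignore="-?N"):
    """Return the fraction of alignment columns that are parsimony informative.

    A column counts as informative when at least two different states
    each occur in at least two sequences.
    """
    seqs = [seq.upper() for seq in alignment]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must be aligned to the same length")
    informative = 0
    for column in zip(*seqs):
        counts = Counter(base for base in column if base not in ignore)
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
    return informative / length


# Toy alignment, invented for illustration (four taxa, five sites).
TOY_ALIGNMENT = ["ACGTA", "ACGTC", "ATGAG", "ATGAT"]

if __name__ == "__main__":
    fraction = parsimony_informative_fraction(TOY_ALIGNMENT)
    print(f"{fraction:.0%} of sites are parsimony informative")
```

In the toy alignment only the second and fourth columns qualify: the first and third are invariant, and the fifth has four states that each occur once, so it is variable but uninformative under parsimony.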

Table 16. Primer List for the six newly sequenced nuclear genes

Gene abbreviation   Primer Name       Primer Sequence (5'-3')

ADRA2 (A2AB)        K_ADRA2_F1        CCCTACTCTGTGCAGGCCAC
                    K_ADRA2_R1        CTGGTTGAAAATTGTATAGATGACAGG
TYR1                K_Tyr1_F1         CTGGARCCCCCAAGYAGTGTG
                    K_Tyr1_R1         CCAGGAGATCCRCCTTTCCAG
                    K_Tyr1_F2         CCAAGYAGTGTGTCYCGGG
                    K_Tyr1_R2         CTCTCCCTGTGGCCAGCT
ENAM                ENAM-MarsR4       GCTGGGACATAATTCTTTTGGTCCCATGAC
                    ENAM-MarsF4       CCAACCAGCCCTGGAGAAACTCACAAGGTTATG
                    ENAM-MarsR2       GGTTCTGAAACTATTCCTTTGGGAAAATTTG
                    ENAM-MarsF2       GCTATTTTGGATAYCAYGGATTTGGAGGGC
                    ENAM_WD_F1        GAGACCATTTCTGGCTCTC
                    ENAM_WD_R1        CCCATATTATTCAGAAGAGATGT
DMP1                DMP1-MarsR2       GCTAATAGCCATCTTGGCAGTCATTGTCATC
                    DMP1-MarsF1       GAATCAGAGGAAGACTTGGGYCTTAMTGATC
                    DMP1_WD_F1        CATCTTGGCAGTCATTGTCATC
                    DMP1_WD_R1        GAATCAGAGGAAGACTTGGG
BCHE                BCHE_WD_F1        CAATTTCATAACCATGCATGAC
                    BCHE_WD_R1        GGATCTGAGATGTGGAATCCA
                    BCHE-F            TCAGAGATGTGGAACCCAAA
                    BCHE-R            ATGCATGACTCCCATCCATT
ADORA3A             ADORA3A_WD_F1     ATGAGACAGCAGGATGCCC
                    ADORA3A_WD_R1     CCCCATGTTTGGCTGGAAT
                    ADORA3A-R         GATAGGGTTCATCATGGAGTT
                    ADORA3A-F_short   CCATGTTTGGCTGGAA


Table 17. Six newly sequenced nuclear genes. Amplicons comprised a portion of the protein coding region from each gene.

Gene abbreviation   Gene name                               Amplicon length after trimming (bp)

DMP1                dentin matrix acidic phosphoprotein 1   1299
ADRA                adrenoceptor alpha 2B                   1166
TYR1                tyrosinase                              375
BCHE                butyrylcholinesterase                   1045
ENAM                enamelin                                926
ADORA3A             adenosine A3 receptor                   277

The PCR products were checked by electrophoresis (90 V for 45 minutes) in a 1% TBE agarose gel that had been stained with GelRed (Biotium), alongside the Bioline HyperLadder™ molecular weight marker for comparison. PCR products were purified for sequencing using an UltraClean™ PCR clean up kit (Bioline), according to the manufacturer’s protocol. Purified PCR products were amplified in a sequencing reaction containing 1 μL of BigDye® Terminator v1.1, 3.5 μL of BigDye® v3.1 5x sequencing buffer

(Applied Biosystems, California, USA), 1 μL of forward primer (3.2 pmol/μL), 13.5 μL of distilled H2O and 1.0μL of PCR product, for a total volume of 20μL. The reaction was repeated using the reverse primer in place of the forward primer, to obtain the reverse sequence.

The sequencing reaction cycle protocol involved an initial denaturation at 94°C for 5 minutes, 30 cycles of 96°C for 10 seconds, 50°C for 5 seconds and 60°C for 4 minutes, and a final extension at 72°C for 10 minutes. The products of the sequencing reaction were precipitated using a standard ethanol/EDTA protocol (recommended by the Griffith University DNA Sequence Facility). Samples were sequenced at QUT on an Applied Biosystems 3500 Genetic Analyzer. All resulting sequences were visually inspected and aligned in Se-Al 2.0 (Rambaut, 2002).


5.2.1 Phylogenetic reconstruction of concatenated dataset in MrBayes

Bayesian phylogenetic inference was conducted on the concatenated dataset in MrBayes 3.2

(Huelsenbeck & Ronquist, 2001). The concatenated dataset consisted of complete and partial mitochondrial genomes, the six newly sequenced genes (Table 17) and five nuclear genes taken from Meredith et al. (2008b): ApoB, BRCA1, IRBP, Rag1 and vWF. Thus, a total of 11 nuclear genes plus mitochondrial genomes were combined to yield a dataset of 26,515 bp. RY-coding was utilized on third codon positions of the protein coding mitochondrial genes. RY-coding recodes the four DNA bases as purine (R: A or G) and pyrimidine (Y: C or T), which improves phylogenetic signal at third codon positions (which evolve more rapidly) and reduces saturation effects. The MCMC chain was run for 10,000,000 generations and sampled every

5000 generations, with a burn-in of 20% in all cases. The nuclear data were partitioned according to codon position to allow a separate partition for the more rapidly evolving third codon positions (Table 18). The mitochondrial data were partitioned into RNA-coding stems and loops, and the codon positions of protein coding regions. Appropriate models of molecular evolution were utilized for each partition based on PartitionFinder (Lanfear et al., 2016), or the next more general model available in the software was utilized (pers. comm. MP) (Table 18). Branch lengths were proportional across the six newly sequenced gene partitions, the five pre-existing genes (Meredith et al., 2011) and the mitochondrial partitions.
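The RY-coding step described above can be sketched in a few lines of Python. The example recodes only third codon positions of an in-frame protein coding sequence, leaving first and second positions, gaps and ambiguity codes untouched. The function name, the frame argument and the example sequence are illustrative assumptions and do not describe the script actually used to prepare the dataset.

```python
# Purines (A, G) become R; pyrimidines (C, T) become Y. Case is preserved.
RY_MAP = str.maketrans("AGCTagct", "RRYYrryy")


def ry_code_third_positions(cds, frame=0):
    """Recode every third codon position as R (A/G) or Y (C/T).

    cds is an aligned protein coding sequence; frame is the 0-based index
    of the first base of the first complete codon.
    """
    recoded = list(cds)
    for i in range(frame + 2, len(recoded), 3):
        recoded[i] = recoded[i].translate(RY_MAP)
    return "".join(recoded)


if __name__ == "__main__":
    # Codons ATG, CCA, TTC: only the bases G, A and C are recoded.
    print(ry_code_third_positions("ATGCCATTC"))
```

Recoding in this way discards transitions (A/G and C/T changes), which saturate most quickly at third positions, while retaining transversions.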

Table 18. Partitioning scheme and models of molecular evolution for the concatenated dataset consisting of four mitochondrial genes and 11 nuclear genes.

Partition                       Model
Mt stems                        GTR + I + Γ
Mt loops                        GTR + I + Γ
Mt codon position 1             GTR + I + Γ
Mt codon position 2             GTR + I + Γ
Mt codon position 3             F81 + I + Γ
NUC codon position 1,2          GTR + I + Γ
NUC codon position 3            GTR + Γ
NUC (new) codon position 1,2    GTR + I + Γ
NUC (new) codon position 3      GTR + Γ

5.2.2 Molecular Dating in BEAST

For molecular dating of the concatenated dataset, we utilized the program BEAST v1.8.1 (Drummond et al., 2012). Unlinked relaxed molecular clocks for nuclear and mitochondrial partitions were implemented (Drummond, Ho, Phillips & Rambaut, 2006), in which branch rates are distributed according to a lognormal distribution. A birth-death process was set as the tree prior. The BEAST analysis was run for 10,000,000 MCMC generations and sampled every 5000 generations, with convergence checked in Tracer 1.6 (Rambaut and Drummond, 2007). All partitions utilized the HKY substitution model, with empirical base frequencies and gamma plus invariant sites. This relatively simple model accounts for the most salient variation across sites and among substitution types, and allows for a more direct (and sufficiently converged) partition comparison with the BEAST species tree analysis (see below).
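As a quick check on the sampling scheme quoted above, the following Python sketch works out how many posterior samples a run of this length yields. The function name is hypothetical. Applying the 20% burn-in stated for the MrBayes runs in Section 5.2.1 to the BEAST run is an assumption made here for illustration, and the count ignores the initial state that some samplers also log.

```python
# Back-of-envelope sample counts for an MCMC run (hypothetical helper).

def retained_samples(generations: int, sample_every: int, burnin_percent: int):
    """Return (total, discarded, retained) sample counts for one chain."""
    total = generations // sample_every
    # Integer arithmetic avoids floating-point rounding in the burn-in count.
    discarded = total * burnin_percent // 100
    return total, discarded, total - discarded


if __name__ == "__main__":
    total, discarded, kept = retained_samples(10_000_000, 5000, 20)
    print(f"total={total}, burn-in={discarded}, retained={kept}")
```

Under these assumptions the chain yields 2000 logged samples, of which 400 are discarded and 1600 are retained.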

Five fossil-based priors were used to calibrate the BEAST analysis. These include uniform bound fossil constraints placed on (1) Thylogale-Dendrolagini, with normal bounds, 4.36-14.22 Ma. The minimum is based on Thylogale ignis fossils from the Hamilton fauna (Flannery, 1992) and the maximum bound is the maximum age of the Ringtail site at Riversleigh, recognising the absence of Macropodinae from Riversleigh Faunal Zone C sites or contemporaneous sites elsewhere. (2) Macropodoidea, with normal bounds, 17.79-28.5 Ma. This calibration is based on Phillips (2015a), but with the minimum age updated in light of new radiometric dates for Ganguroo at the Neville’s Garden site at Riversleigh (Woodhead et al., 2016). The distribution mean (23.03 Ma) is placed at the Oligo-Miocene boundary, recognizing close macropodoid crown/stem transitional forms from around this time (e.g. Cooke et al., 2015). (3) Macropodidae, 14-24.7 Ma. The maximum is based on an absence of derived macropodids in the Etadunna Faunal Zone C, while the minimum is a soft bound based on the Wanburoo hilarus fossils from the Riversleigh Faunal Zone C sites (Cooke, 1999). (4) The divergence between Lagorchestes and Macropus (4.46-14.22 Ma), based on Macropus fossils in the Hamilton fauna (Flannery, Rich, Turnbull & Lundelius, 1992). (5) One further calibration is for the root of the tree, 24.7-54.65 Ma. The minimum bound is based on the earliest well established macropodoids, such as Bulungamaya (Butler et al., 2016) from Etadunna Faunal Zone C. The maximum bound is the maximum age of the Tingamarra fauna (Beck et al., 2008), which includes an assemblage of plesiomorphic marsupials and no putative crown diprotodontians. A supernetwork showing the phylogenetic conflict/congruence between the concatenated MrBayes analysis and the concatenated BEAST analysis was constructed in SplitsTree4 (Huson and Bryant, 2006).
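For reference, the five calibration ranges listed above can be collected into a small data structure. The Python sketch below is illustrative only: the class and function names are hypothetical, the ranges are treated as simple minimum and maximum ages, and the prior shapes (uniform, normal or soft bounds) specified in BEAST are not modelled.

```python
# Fossil calibration ranges from Section 5.2.2, held as plain min/max ages (Ma).
# Hypothetical names; prior shapes (uniform, normal, soft) are not modelled.
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    node: str
    min_ma: float
    max_ma: float

    def contains(self, age_ma: float) -> bool:
        """True if an age falls inside the stated range."""
        return self.min_ma <= age_ma <= self.max_ma

    @property
    def width_ma(self) -> float:
        """Width of the range; wider ranges are less informative."""
        return self.max_ma - self.min_ma


CALIBRATIONS = [
    Calibration("Thylogale-Dendrolagini", 4.36, 14.22),
    Calibration("Macropodoidea", 17.79, 28.5),
    Calibration("Macropodidae", 14.0, 24.7),
    Calibration("Lagorchestes-Macropus", 4.46, 14.22),
    Calibration("Root", 24.7, 54.65),
]


def check_age(node: str, age_ma: float) -> bool:
    """Look up a calibration by node name and test an age against it."""
    for cal in CALIBRATIONS:
        if cal.node == node:
            return cal.contains(age_ma)
    raise KeyError(node)


if __name__ == "__main__":
    for cal in CALIBRATIONS:
        print(f"{cal.node}: {cal.min_ma}-{cal.max_ma} Ma (width {cal.width_ma:.2f} Ma)")
```

A helper such as `check_age` is only a sanity check on where a posterior mean falls relative to a stated range. Because soft and normal bounds permit some probability mass outside the range, an estimate outside the range would not by itself indicate an error.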

5.2.3 Molecular Dating in BEAST (Species Tree)

The species tree analysis was run in BEAST v1.8.1 (Drummond et al., 2012). The major difference from the concatenated BEAST analysis is that a multispecies coalescent model is used to infer the species tree that underlies the evolution of separate gene trees (which may differ due to ILS and stochastic variation). The data were partitioned by gene, with each nuclear gene receiving its own partition, while the mitochondrial data were treated as a single partition. Partitioning by gene allows the program to estimate the overall species tree from the estimated gene trees. The same parameters (number of generations, etc.) were employed as for the concatenated BEAST analysis, and the GTR model was selected with gamma and invariant sites (pers. comm. MP). Other differences to accommodate species tree estimation were a Yule species tree prior with a piecewise constant population model, and the assumption of autosomal ploidy for all gene partitions except the mtDNA, which was treated as haploid.


5.3 RESULTS

5.3.1 Deep macropodid phylogeny

In this chapter, I performed a Bayesian phylogenetic reconstruction (MrBayes) and two time-calibrated Bayesian phylogenetic reconstructions (BEAST), which consisted of a species tree analysis and a concatenated analysis. I note that some nodes remain unresolved, potentially due to missing data in my dataset. Overall, this chapter sheds light on deeper macropodid relationships, but additional loci may be required to resolve nodes with low support.

5.3.1.1 Bayesian reconstruction of the concatenated nuclear and mtDNA dataset in MrBayes and BEAST

Bayesian phylogenetic reconstruction in MrBayes (Figure 35) found a clade containing Macropus+Wallabia+Setonix+Lagorchestes (BPP=0.8985), with a more weakly supported affinity between Setonix and Lagorchestes (BPP=0.7521). The concatenated BEAST analysis (Figure 36) agreed with the affinity between Macropus+Wallabia+Lagorchestes+Setonix, with even greater support (BPP=0.951), but grouped Macropus+Wallabia+Lagorchestes (BPP=0.92) to the exclusion of Setonix.

The MrBayes analysis found robust support for grouping Thylogale+Dendrolagus+Petrogale together (BPP=1), with the Dorcopsini placing just outside to form a Thylogale+Dendrolagus+Petrogale+Dorcopsini clade (BPP=0.8829) (Figure 35). The BEAST run agreed with these groupings, but with slightly weaker statistical support (Figure 36). The position of Onychogalea differed between the MrBayes and BEAST analyses. The MrBayes analysis placed Onychogalea as sister to all other macropodines, which in turn were supported at BPP=1 (Figure 35). In contrast to this placement as the deepest diverging macropodine lineage, the BEAST analysis grouped Onychogalea with Macropus+Wallabia+Setonix+Lagorchestes (BPP=0.977), as the deepest diverging member of that clade (Figure 36).


Both analyses recovered Dorcopsini (Dorcopsis and Dorcopsulus): MrBayes (BPP=1); BEAST (BPP=0.9995), but curiously showed a greater affinity between Dorcopsulus vanheurni and Dorcopsis veterum, with Dorcopsis hageni falling outside in both cases: MrBayes (BPP=0.9513); BEAST (BPP=0.993).

[Figure 35 tree; clade labels: Dorcopsini, Dendrolagini/Thylogale, Setonix, Lagorchestes, Macropus+Wallabia, Onychogalea]

Figure 35. Bayesian phylogeny (MrBayes) of the concatenated nuclear and mtDNA dataset for macropods. Numbers at each node indicate Bayesian posterior probability values. Dendrolagus_inust_dori represents both D. inustus and D. dorianus. A supernetwork showing the lack of phylogenetic conflict with the concatenated BEAST analysis (Figure 36) was constructed in SplitsTree4; see Appendix, Figure S3.


Figure 36. Bayesian reconstruction (BEAST) of the concatenated dataset for macropods. Blue bars indicate 95% highest posterior density (HPD) intervals for the estimated divergence dates. Numbers at each node indicate Bayesian posterior probability values. Time scale is in millions of years before present. Dendrolagus_inust_dori represents both D. inustus and D. dorianus. A supernetwork showing the lack of phylogenetic conflict with the concatenated MrBayes analysis (Figure 35) was constructed in SplitsTree4; see Appendix, Figure S3.


An association between Dendrolagus and Petrogale was robustly supported in both analyses: MrBayes (BPP=1); BEAST (BPP=0.9995), confirming the monophyly of the Dendrolagini. Within Petrogale, both analyses found robust support for a clade containing P. concinna + P. brachyotis + P. burbidgei: MrBayes (BPP=1); BEAST (BPP=0.9995). The other Petrogale clade consisted of P. lateralis + P. penicillata + P. xanthopus: MrBayes (BPP=0.7178); BEAST (BPP=0.9945). Both analyses recovered these two clades as sister groups with strong support: MrBayes (BPP=1); BEAST (BPP=0.999). Finally, Lagostrophus was recovered as the deepest diverging lineage, sister to all other macropodids, in both analyses.

5.3.1.2 Bayesian Reconstruction of Species Tree in BEAST

The species tree reconstruction (Figure 37) recovered a topology with key differences from the concatenated analyses (Figure 35 and Figure 36), and with relatively weak support values. A clade containing Macropus+Wallabia+Dendrolagini (BPP=0.5902) was favoured, which in turn grouped with Lagorchestes (BPP=0.3168). Onychogalea grouped with the Dorcopsini with weak support (BPP=0.1259). Thylogale grouped with all other macropodines (BPP=0.6682) as the deepest diverging lineage, placing outside a clade containing all other macropodines (BPP=0.6722), with the exception of Thylogale billardierii, which curiously grouped with Setonix (BPP=0.3788).

Despite these stark differences from the concatenated analyses (and from other recent molecular studies), the species tree did recover several expected clades, including the position of Wallabia bicolor, again with relatively weak support.

5.3.2 Molecular Dating

Both the concatenated tree and the species tree reconstructions were used to infer the timing of divergences within the Macropodidae. These divergence dates, inferred in BEAST, are shown in Figure 36 and Figure 37. Overall, the species tree provided considerably younger dates than the concatenated tree.


Figure 37. Bayesian inference species tree reconstructed in BEAST for macropods. Blue bars indicate 95% highest posterior density (HPD) intervals for the estimated divergence dates. Numbers at each node indicate Bayesian posterior probability values. Time scale is in millions of years before present. Dendrolagus_inust_dori represents both D. inustus and D. dorianus.


The origin of the Macropodidae was inferred to be 22.15 Ma (95% HPD 18.66-25.82 Ma) in the concatenated tree, while the species tree recovered a date of 16.95 Ma (95% HPD 16.57-19.03 Ma), placing the origin closer to the mid-Miocene climatic optimum. The concatenated tree inferred the origin of Macropus+Wallabia to be in the late Miocene, splitting from the Lagorchestes lineage at 9.62 Ma (95% HPD 7.46-11.68 Ma). The species tree estimated the divergence of Macropus+Wallabia from the inferred sister clade (Dendrolagini) to be 6.54 Ma (95% HPD 5.91-7.31 Ma).

The stem origin of Setonix was recovered as late Miocene by the concatenated tree, splitting from the Macropus+Wallabia lineage at 10 Ma (95% HPD 7.93-12.22 Ma). The species tree also inferred the origin of Setonix as late Miocene, but surprisingly suggested that Setonix split from the lineage leading to Thylogale billardierii at 6.53 Ma (95% HPD 0.64-8.02 Ma). The concatenated tree, which recovered Thylogale monophyly, estimated that these forest wallabies split from Dendrolagini at 9.52 Ma (95% HPD 7.38-11.59 Ma).

Both concatenated and species trees recovered a late Miocene date for the split between Dendrolagus and Petrogale, with the concatenated tree inferring an age of 8.34 Ma (95% HPD 6.36-10.41 Ma), while the species tree recovered the age of the split to be 5.5 Ma (95% HPD 4.82-6.21 Ma), closer to the Pliocene onset of increased aridification in Australia. Similarly, the stem origin of Onychogalea was inferred to be 10.37 Ma (95% HPD 8.14-12.57 Ma) in the concatenated tree, while the species tree placed it at 7.18 Ma (95% HPD 6.42-7.48 Ma).

The split between Lagostrophus and macropodines was inferred to be 19.47 Ma (95% HPD 16.0-22.76 Ma) in the concatenated tree, while the species tree suggested a more recent date of 15.67 Ma (95% HPD 14.25-16.66 Ma).

Despite these differences in overall dates, both analyses suggest an increase in cladogenesis roughly coincident with the increased aridification of Australia since the late Miocene and Pliocene. In the case of the concatenated timetree, there is a preponderance of divergences in the latest Miocene, around 5-7 Ma. For the dated species tree these divergences shift forward to the early-mid Pliocene, around 3-5 Ma.
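To make the size of the offset between the two dated trees concrete, the Python sketch below tabulates the posterior mean ages reported above for four splits that are named in both analyses, and computes how much younger the species tree estimate is in each case. This is simple arithmetic on the reported values rather than a new analysis. The function and variable names are hypothetical, and because the inferred sister group of Onychogalea differs between the two trees, that comparison in particular should be read loosely.

```python
# Reported mean divergence dates (Ma): (concatenated tree, species tree).
# Values are those quoted in Section 5.3.2; names are illustrative only.
DATES_MA = {
    "Macropodidae origin": (22.15, 16.95),
    "Lagostrophus vs macropodines": (19.47, 15.67),
    "Onychogalea stem origin": (10.37, 7.18),
    "Dendrolagus vs Petrogale": (8.34, 5.5),
}


def date_shift(concatenated_ma: float, species_ma: float):
    """Return (difference in Ma, species/concatenated ratio), rounded to 2 d.p."""
    difference = concatenated_ma - species_ma
    ratio = species_ma / concatenated_ma
    return round(difference, 2), round(ratio, 2)


if __name__ == "__main__":
    for node, (concat, species) in DATES_MA.items():
        diff, ratio = date_shift(concat, species)
        print(f"{node}: species tree younger by {diff:.2f} Ma (ratio {ratio:.2f})")
```

For these four splits the species tree estimates are younger by roughly 3 to 5 Ma.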

5.4 DISCUSSION

5.4.1 Phylogeny and evolutionary timescale of the Macropodidae

This chapter provides a phylogeny and molecular dating for the Macropodidae using two major methods: concatenated tree reconstruction and a species tree approach. Although molecular dating was performed with both methods, I have favoured the topology and dates recovered by the concatenated analysis, which are statistically better resolved and more consistent with analyses of independent data. Although critics of concatenation methods assert that such methods artificially inflate statistical support values and may be confounded by coalescent processes such as incomplete lineage sorting (Kubatko and Degnan, 2007, Carstens and Knowles, 2007), others have found little to no difference in statistical accuracy between concatenation methods and species tree methods under simulation conditions (Tonini et al., 2015). I discuss the unusual results of the species tree in the context of the potential pitfalls of reconstructing species trees from small numbers of loci with low information content, and when missing data are prevalent in a multi-locus dataset.

The inclusion of sequences from six additional nuclear genes corroborates the position of Lagostrophus as the sister group to macropodines, placing it in a basal position among macropodids, in agreement with previous molecular studies (Baverstock et al., 1989, Westerman et al., 2002, Mitchell et al., 2014, Llamas et al., 2015) and morphology (Prideaux, 2004, Prideaux and Warburton, 2010, Prideaux and Tedford, 2012). My analyses show a close affinity between Lagorchestes and Macropus+Wallabia, in agreement with previous molecular studies (Meredith et al., 2008b, Mitchell et al., 2014, Meredith et al., 2009). This contradicts the sister grouping of Onychogalea to Macropus/Wallabia, which has been tentatively suggested based on a single unambiguous retrotransposon marker (Chapter 2) and morphology (Flannery and Hann, 1984, Prideaux and Warburton, 2010). The grazing-adapted dentition shared between Onychogalea and Macropus/Wallabia may indeed be a result of convergence, as suggested by Meredith et al. (2009); however, it is curious that retrotransposons (albeit limited data) and nuclear+mitochondrial genes (current chapter) conflict in their placement of Onychogalea relative to other macropodines. Additional molecular data may be required to clarify the phylogenetic position of Onychogalea. My time-calibrated concatenated analysis (Figure 36) suggests that Onychogalea diverged from Lagorchestes+Macropus+Wallabia+Setonix in the mid-late Miocene, a period in which the climate was cooling and drying following the mid-Miocene climatic optimum, but significantly earlier than the expansion of grasslands, which only became significant in the Pliocene (Martin, 2006, Black et al., 2012). This is consistent with the suggestion of Meredith et al. (2009) that adaptation to grazing occurred twice, independently, first in Onychogalea and then again in Macropus/Wallabia.

The position of Setonix has been particularly uncertain, with multiple placements proposed previously (Prideaux and Warburton, 2010). My new analyses corroborate a previous nuclear study favouring a Macropus/Wallabia/Lagorchestes/Setonix clade (Meredith et al., 2009). This affinity between Setonix and Macropus+Wallabia+Lagorchestes is also consistent with earlier studies that found an affinity between Setonix and Macropus based on morphological characters (Thomas, 1888, Tate et al., 1948, Flannery, 1989). However, I note that statistical support for the placement of Setonix with Macropus+Wallabia+Lagorchestes in my analysis (Figure 36) is relatively weak (BPP=0.95). Given that Bayesian posterior probabilities tend to be overconfident (Suzuki et al., 2002, Gontcharov et al., 2004, Phillips and Pratt, 2008), a value of 0.95 or less is considered weak support (pers. comm. Dr Matthew Phillips). This suggests that additional data may be required to resolve the position of Setonix, particularly given that it has no close living relatives.


My analyses support the monophyly of Dendrolagus+Petrogale+Thylogale, confirming earlier molecular studies based on serological affinity (Baverstock et al., 1989), DNA hybridisation (Kirsch et al., 1995) and nuclear genes (Meredith et al., 2009). Mitochondrial DNA agrees with the Dendrolagini grouping of Dendrolagus and Petrogale, but placed Thylogale more distantly in the analysis of Westerman et al. (2002). This Dendrolagini-Thylogale clade was, however, recovered with the use of additional mtDNA (Phillips et al., 2013). Interestingly, my inferred timing for the divergence between Petrogale and Dendrolagus places the split in the late Miocene, close to the Miocene-Pliocene boundary, in agreement with previous estimates (Campeau-Péloquin et al., 2001, Meredith et al., 2009). This period was characterized by significant expansion of drier, open habitats (Martin, 2006, Black et al., 2012). This lends credibility to the idea that the ancestor of Dendrolagus and Petrogale inhabited more open habitats (Meredith et al., 2009, Flannery, 1989), which is further supported by the association of the extinct tree kangaroo genus Bohra with more arid woodland environments (Tedford et al., 1992, Dawson, 2004, Prideaux et al., 2007, Prideaux and Warburton, 2008, Prideaux and Warburton, 2009).

I recovered an affinity between Dorcopsis and Dorcopsulus, a clade (Dorcopsini) that was previously suggested by DNA hybridisation studies (Springer and Kirsch, 1991, Kirsch and Palma, 1995), mitochondrial analyses (Burk et al., 1998, Burk and Springer, 2000, Westerman et al., 2002) and morphology (Woodburne, 1967, Flannery, 1984, Prideaux and Warburton, 2010). However, these earlier studies placed the Dorcopsini deep in the macropodid tree, as the basal sister lineage to all macropodids with the exception of Lagostrophus. My analyses placed the Dorcopsini in a shallower position in the tree, as the sister group to the Thylogale/Dendrolagini clade initially identified by Meredith et al. (2008a), and placed the split in the mid-late Miocene (Figure 36). All of my reconstructions inferred a close relationship between Dorcopsis veterum and Dorcopsulus vanheurni, with Dorcopsis hageni falling outside, rendering Dorcopsis paraphyletic. Little work has been done on the systematics and of the New Guinean forest wallabies. The present strong evidence that these two genera are not reciprocally monophyletic indicates a need to revisit dorcopsine phylogeny and evolution.

5.4.2 Species trees vs concatenated trees

ILS is a major confounding factor in species tree reconstruction (Edwards, 2009) and several methods have been developed that estimate species trees in the presence of ILS (Degnan and Rosenberg, 2009, Yang and Warnow, 2011, Nakhleh, 2013). Most of these methods, including StarBEAST (Heled and Drummond, 2009), focus on accommodating incomplete lineage sorting, although they have the potential to crudely accommodate introgression as well (Edwards et al., 2007). It is promising, then, that with eleven nuclear genes and a strongly discordant mtDNA signal for excluding Wallabia from Macropus, the StarBEAST analysis was able to recover the M. (Notamacropus) sister placement for Wallabia that was strongly favoured by the retroposon study (Chapter 2) and also supported by Meredith et al. (2008b). However, the StarBEAST species tree (Figure 37) presents many apparently spurious relationships in view of other recent molecular and morphological studies, particularly for Thylogale, which is resolved as polyphyletic, with one of the Thylogale clades placed at the base of Macropodinae. This contradicts all previous studies, many of which favour Thylogale grouping with the Dendrolagini (Meredith et al., 2009, Kirsch et al., 1995, Prideaux and Warburton, 2010).

A study utilizing both biological and simulated data by Mirarab et al. (2014) suggested that summary methods can produce highly accurate species trees when a large number of sufficiently accurate gene trees are involved, but showed that species tree methods are susceptible to gene tree estimation error, a phenomenon that has been observed in other studies (Leaché and Rannala, 2010, DeGiorgio and Degnan, 2013, Patel et al., 2013, Bayzid and Warnow, 2012). Furthermore, Mirarab et al. (2014) showed that when datasets with a small number of estimated gene trees are analyzed, the statistical advantages of coalescent-based species tree estimation are reduced compared to concatenation. This appears to be consistent with my species tree estimation of the Macropodidae (Figure 37), in which Bayesian posterior probabilities were far lower and several taxa were placed inconsistently with other recent studies (see Figure 35 and Figure 36), indicating the need for further species tree work here. Indeed, the low support values are likely indicative of low information content within the individual genes, effectively resulting in stochastic variation being amplified under the coalescent model. I also note that missing data in the matrix may be compromising the result. Instead, the true benefit of species tree methods may lie in datasets incorporating genome-wide data (Mirarab et al., 2014).

5.5 CONCLUSION

I have provided a phylogeny and timescale for the evolution of the major clades within the Macropodidae, addressing a six-way polytomy that exists within this family of marsupials. Interestingly, I find discordance between the concatenated phylogenetic reconstruction and the species tree, with the species tree inferring several unexpected relationships and, overall, less robust statistical support. However, this is likely an artefact of reconstructing species trees from multi-locus data with missing data and a relatively low number of loci, and perhaps of a poor fit of the coalescent model to real data. I have improved resolution of the grouping of key clades among the six-way polytomy at the base of the Macropodidae; however, the relative positions of Onychogalea and Lagorchestes remain uncertain. Furthermore, my concatenated analysis provides the strongest evidence yet for an association between the New Guinean forest wallabies (Dorcopsini) and pademelons, rock wallabies and tree kangaroos (Thylogale+Dendrolagini). However, my finding that Dorcopsis and Dorcopsulus are not reciprocally monophyletic necessitates further investigation. The enigmatic quokka (Setonix) was found to group with the Macropus+Wallabia+Lagorchestes clade; however, additional molecular data may be required to confidently resolve the position of Setonix relative to other macropodines. Whole genome studies may be required to confidently resolve the remaining uncertainties at the base of the Macropodidae.

Chapter 6: Discussion and Conclusions

The overall objective of my thesis was to shed light on the phylogenetic relationships among kangaroos and their kin (family Macropodidae), and to trace the timing and dynamics of their evolutionary history over the past ~20 million years.

The first empirical research chapter (Chapter 2) focussed on the phylogeny of the genera Macropus and Wallabia, utilizing retrotransposons as phylogenetic markers. This powerful method for reconstructing phylogenies resolved some long-contentious relationships among kangaroos, notably showing a clear affinity between Wallabia and M. (Notamacropus), which renders Macropus paraphyletic and thus encourages taxonomic revision of these genera. Similarly, another contentious species, Macropus irma, was also shown to group with the wallabies of M. (Notamacropus). Chapter 2 also addressed an ascertainment bias that arises in retrotransposon studies that utilize only a single reference genome, and I have provided a statistical framework that addresses this bias. The ascertainment bias is particularly problematic for phylogenetic reconstruction because markers that support any grouping that does not include a reference taxon will not be observable, such that the method is “blind” to some trees (Nikaido et al., 2007) (Figure 1). This highlights the importance of developing analytical and statistical methods to overcome genome scarcity, which will in turn allow researchers to confidently infer phylogenies from retrotransposon markers.

I have presented an “ILS symmetry test” to assess whether the observed retrotransposon markers may be derived from ILS, under the assumption that the “blind” tree is the species tree. This test is based on theoretical arguments (Green et al., 2010, Durand et al., 2011, Kuritzin et al., 2016) and observed patterns (Scally et al., 2012, Doronina et al., 2015), which show that ILS will distribute markers that conflict with the species tree roughly symmetrically between the two non-species-tree alternatives (the two observable trees, if the “blind” tree is the species tree).
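The symmetry expectation described above can be illustrated with a generic exact binomial test. The Python sketch below is not the test statistic developed in Chapter 2; it is a minimal stand-in that asks whether marker counts supporting the two observable trees depart from a 50:50 split. The function names are hypothetical, and any counts passed to them are for illustration only.

```python
# Generic symmetry check for marker counts on two alternative trees.
# A stand-in sketch: an exact two-sided binomial test with p = 0.5,
# not the ILS symmetry test as implemented in Chapter 2.
from math import comb


def binomial_two_sided_p(successes: int, trials: int) -> float:
    """Exact two-sided p-value under a fair (p = 0.5) binomial model."""
    probs = [comb(trials, i) / 2**trials for i in range(trials + 1)]
    observed = probs[successes]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    tail = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


def ils_symmetry_p(markers_tree_a: int, markers_tree_b: int) -> float:
    """p-value for the hypothesis that markers split symmetrically between two trees."""
    total = markers_tree_a + markers_tree_b
    if total == 0:
        raise ValueError("at least one marker is required")
    return binomial_two_sided_p(markers_tree_a, total)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(ils_symmetry_p(5, 5))
    print(ils_symmetry_p(10, 0))
```

A small p-value indicates an asymmetric distribution of markers that ILS alone would be unlikely to produce. With only a handful of markers such a test has little power, so failing to reject symmetry is weak evidence in favour of ILS.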

Rejecting ILS still leaves the possibility that introgression/hybridization has influenced the distribution of retrotransposon markers, and so I have developed a test for hybridization/introgression that considers the number of markers identified on successive branches along the stem lineage of the reference taxon. This “insertion ratio test” is based on the biological expectation that the proportion of insertions shared between taxa due to introgression is proportional to the amount of the genome shared. If the “blind” tree is the species tree, these introgressed markers will instead appear to support a non-species tree that includes the reference taxon.

In general, the n-1 reference genomes that are required to eliminate the ascertainment bias have not been available in past retrotransposon phylogenetic studies, and this will continue to be the case for many studies in the foreseeable future. This is particularly problematic in marsupials, in which only distantly related genomes have been published so far (opossum, Tammar wallaby, Tasmanian devil and koala) (Mikkelsen et al., 2007, Renfree et al., 2011, Murchison et al., 2012, Johnson et al., 2014). As such, my statistical tests will fulfil an important role in validating previous and future retrotransposon studies.

The future of retrotransposon research lies primarily in the burst of new genome sequencing that has been taking place in recent years, which has allowed researchers to examine many branches in a phylogenetic tree (Doronina et al., 2015). This will reduce and eventually eliminate the ascertainment bias that arises when only a single reference genome is available for phylogenetic reconstruction, allowing retrotransposon insertions to be detected supporting all three possible trees in a trichotomy. Furthermore, retrotransposon studies may eventually be performed entirely in silico. Software packages that detect structural variation between genomes are already available, such as VariationHunter (Hormozdiari et al., 2010) and RetroSeq (Keane et al., 2012). These packages detect the presence or absence of known retrotransposons in homologous regions between two or more genome assemblies by identifying structural variation (Hormozdiari et al., 2010, Keane et al., 2012). However, experimental verification will still be necessary for the foreseeable future to confirm that genome assembly artefacts are not the cause of the variation (Treangen and Salzberg, 2012). In addition, this burst in newly available genomes will provide insights into the evolution of transposable elements themselves, as suggested by my findings that the retrotransposon LINE1 appears to have become silenced in the kangaroo genome, while another family of retrotransposons, KERV, appears to have become abundant (Chapter 2). Thus, future directions of TE research will likely explore competition between TEs in the host genome (Chalopin et al., 2015). This will provide an understanding of the great diversity of TEs in eukaryotic genomes and how they arise and ultimately become inactive.

In Chapter 3 I further examined relationships among Macropus and Wallabia, but took a different approach, utilizing a supermatrix dataset consisting of mitochondrial genomes and nuclear genes. In this chapter I also estimated the timing of the adaptive radiation of kangaroos through molecular dating, and considered kangaroo evolution in the context of climatic changes from the Miocene through to the present. This was the first study to encompass complete taxon coverage of all living members of Macropus and Wallabia, and it also included a recently extinct species (Macropus greyi). The study utilized five nuclear genes combined with complete or near-complete mitochondrial genomes. This allowed me to confirm several incongruences between mitochondrial datasets and nuclear genes/transposable elements. I clarified the position of previously contentious taxa, such as M. dorsalis and the recently extinct M. greyi within the diversity of M. (Notamacropus), with M. greyi placing as the sister taxon to M. irma. I utilized the first mitochondrial genome of the enigmatic species M. bernardus to clarify its position as the deepest diverging wallaroo within M. (Osphranter). The time-calibrated phylogeny shows a clear trend in which major cladogenesis among kangaroos coincided with the Pliocene onset of aridification in Australia (Martin, 2006, Black et al., 2012).


Future work may include an expanded dataset and may benefit from the inclusion of complete nuclear genomes in order to shed light on some key phylogenetic relationships that lacked resolution in my analysis, such as the relative positions of the Macropus subgenera M. (Macropus), M. (Osphranter) and M. (Notamacropus). Furthermore, a limiting factor in current analyses is the uncertainty associated with kangaroo fossil assemblages, which results in relatively uninformative (wide-bound) fossil calibrations. Estimates of the timing of key events in kangaroo evolution may be made more precise by clarifying the affinities of key fossil kangaroo taxa. That would also allow better mapping of morphological variation among kangaroos to environmental change over time.

Chapter 4 involved the reconstruction of ancestral states of various phenotypic traits among kangaroos, including habitat, mob size, sexual dimorphism in size and sexual dimorphism of coat colour. I found multiple independent transitions from closed forests into more open habitats, coinciding with the gradual aridification of Australia. It was inferred that the major diversifications of both macropodines and potoroids began from close to the mid-Miocene climatic optimum, but that despite predominantly rainforest habitats across Australia at the time (Martin, 2006, Black et al., 2012), the most recent common ancestors of both clades were reconstructed as open forest dwellers. Subsequent radiation of these clades as habitats opened may thus have been advantaged by the contingency of their ecological history. Later expansion of grasslands during the Pliocene onset of aridification was coincident with independent expansions into these and other arid, open habitats of M. (Macropus), M. (Osphranter), Onychogalea, and Lagorchestes hirsutus. Inference from ancestral state reconstruction suggests that the greatest degree of sexual dimorphism occurred in kangaroo lineages leading to the more arid-adapted members of M. (Osphranter) and M. (Macropus). Allometry may explain some of the variation in body size sexual dimorphism, but it is nonetheless clear that sexual dimorphism is predominantly found in males, both where there is variation in size and in coat colour, and that female preference may still play a role in driving this phenomenon.


Understanding the genetic basis of coat colour sexual dimorphism in the more sexually dimorphic species (particularly M. (Osphranter)) is a potential avenue of future research as additional kangaroo genomes become available. Furthermore, comparing transitions into more open habitats across other marsupial clades may provide insight into the consistency of this trend across marsupials as a whole.

Chapter 5 took a phylogenetically deeper perspective on macropodid evolution by examining the relationships and timing of divergences among genera at the base of the Macropodinae, addressing a six-way polytomy. Overall, I identified an open habitat (or less mesic) clade containing Macropus/Wallabia/Lagorchestes/Setonix and an apparently ancestrally closed-forest clade including Dorcopsini/Dendrolagini/Thylogale. The affinities of Onychogalea remain uncertain, with different placements recovered in the concatenated MrBayes and BEAST analyses, and in the retrotransposon findings of Chapter 2.

I found significant discordance between my concatenated analysis and species tree analysis, with the species tree showing less resolution among the genera, and a number of groupings that are incongruous with all other recent molecular and morphological interpretations. The benefit of species tree methods lies in their ability to deal with gene tree discordance arising from processes such as incomplete lineage sorting (ILS) (Pamilo and Nei, 1988, Maddison, 1997, Degnan and Rosenberg, 2006, Heled and Drummond, 2009). However, some of my current species tree results may be artefacts of coalescent models with simple assumptions, such as no migration or population structuring, and/or the stochastic swamping of phylogenetic signal that can arise when a small number of loci are utilized.
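The sensitivity of species tree inference to the number of loci can be illustrated with a minimal sketch based on the standard three-taxon result under the multispecies coalescent (Pamilo and Nei, 1988): for an internal branch of length T coalescent units, a gene tree matches the species tree with probability 1 - (2/3)e^(-T), and each of the two discordant topologies is equally likely. The function names, the branch length and the locus counts below are illustrative assumptions only, not values estimated in this thesis, and the calculation ignores gene tree estimation error.

```python
import math
from math import comb


def p_concordant(t):
    """Probability that a gene tree matches a rooted three-taxon species
    tree whose internal branch is t coalescent units long."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def p_plurality_concordant(t, n_loci):
    """Probability that the concordant topology is strictly the most
    frequent gene tree among n_loci independent loci (ties count as failure)."""
    p = p_concordant(t)
    q = (1.0 - p) / 2.0  # each discordant topology is equally likely
    total = 0.0
    for c in range(n_loci + 1):              # loci with the concordant tree
        for d1 in range(n_loci - c + 1):     # loci with discordant tree 1
            d2 = n_loci - c - d1             # loci with discordant tree 2
            if c > d1 and c > d2:
                coef = comb(n_loci, c) * comb(n_loci - c, d1)
                total += coef * p**c * q**d1 * q**d2
    return total


if __name__ == "__main__":
    # Hypothetical short internal branch (0.1 coalescent units).
    for n in (11, 50, 200):
        print(n, round(p_plurality_concordant(0.1, n), 3))
```

Under these assumptions, a short internal branch leaves a substantial chance that a discordant topology is the most common gene tree when only around ten loci are sampled, whereas this chance shrinks as the number of loci grows.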

Future work may benefit from the inclusion of larger multi-locus datasets, expanding upon my analysis of 11 nuclear genes and mitochondrial genomes. Ideally, complete nuclear genomes will provide a more robust species tree. Mirarab et al. (2014) suggested that the true promise of coalescent-based species tree estimation lies in genome-scale datasets. Furthermore, genome-wide methods that take into account ILS and introgression are becoming increasingly common (Song et al., 2012, Mirarab et al., 2014, Mirarab and Warnow, 2015, Alexander et al., 2017). However, at least in the case of ILS-aware methods, Gatesy and Springer (2013, 2014) and Springer and Gatesy (2014, 2016) have criticized these approaches, pointing out that they are only justified if (i) ILS is a major source of the observed conflict, (ii) the examined genes correspond to genomic blocks with a unique coalescent history and (iii) the blocks are sufficiently large and phylogenetically informative to allow reliable gene trees to be reconstructed (Xi et al., 2015, Scornavacca and Galtier, 2017). The mammalian study of Scornavacca et al. (2017), based on 43 fully sequenced mammalian genomes covering placentals, marsupials and monotremes, showed that ILS-aware methods performed no better than previously published super-tree methods. Furthermore, recent comparative genomic analyses of mammals have shown that hybridisation can occur between closely related taxa (e.g., Li et al., 2016), which can resemble ILS when examining recently diverged lineages. However, this topic has been little explored in the mammalian phylogenomic literature (Scornavacca and Galtier, 2017).

The inclusion of fossil macropod taxa, combining molecular and morphological data, may provide a broader lens through which to examine the diversification of kangaroos over the past ~20 million years, and particularly the drivers of their radiation. While it is clear that habitat opening and subsequent grassland expansion coincide with the macropod adaptive radiation, we also need to better understand the possible role of competition, such as might be suggested by the decline in vombatiform diversity that roughly mirrors the macropod radiation from the Miocene onwards.


Bibliography

ALEXANDER, A. M., SU, Y. C., OLIVEROS, C. H., OLSON, K. V., TRAVERS, S. L. & BROWN, R. M. 2017. Genomic data reveals potential for hybridization, introgression, and incomplete lineage sorting to confound phylogenetic relationships in an adaptive radiation of narrow‐mouth frogs. Evolution, 71, 475-488.

APLIN, K. P. & ARCHER, M. 1987. Recent advances in marsupial systematics with a new syncretic classification. In: ARCHER, M. (ed.) Possums and opossums: studies in evolution. Chipping Norton, N.S.W.: Surrey Beatty.

ALROY, J. 1999. The fossil record of North American mammals: evidence for a Paleocene evolutionary radiation. Systematic Biology, 48, 107-118.

ALVAREZ, L. W., ALVAREZ, W., ASARO, F. & MICHEL, H. V. 1980. Extraterrestrial cause for the Cretaceous-Tertiary extinction. Science, 208, 1095-1108.

ANDERSON, E. 1949. Introgressive hybridization.

ANDERSON, E. & HUBRICHT, L. 1938. Hybridization in Tradescantia. III. The evidence for introgressive hybridization. American Journal of Botany, 396-402.

ANDERSSON, M. B. 1994. Sexual selection, Princeton University Press.

APLIN, K. & ARCHER, M. 1987. Recent advances in marsupial systematics with a new syncretic classification. Possums and opossums: studies in evolution, 1, 15-72.

ARCHER, M. 1984. Origins and early radiations of marsupials. Vertebrate zoogeography and evolution in Australasia, 585-625.

ARCHER, M., DODSON, J. & HEAD, L. 1998. From plesiosaurs to people: 100 million years of environmental history. Canberra: Environment Australia.

ARCHER, M., GODTHELP, H., HAND, S. & MEGIRIAN, D. 1989. Fossil mammals of Riversleigh, northwestern Queensland: preliminary overview of biostratigraphy, correlation and environmental change. Australian Zoologist, 25, 29-66.

ARCHER, M., HAND, S. & GODTHELP, H. 1994. Patterns in the ’s mammals and inferences about palaeohabitats. History of the Australian vegetation, 80-103.

ARCHER, M., HAND, S., GODTHELP, H. & CREASER, P. 1997. Correlation of the Cainozoic sediments of the Riversleigh World Heritage fossil property, Queensland, Australia. Mémoires et travaux de l'Institut de Montpellier, 131-152.

ARCHIBALD, J. D. & DEUTSCHMAN, D. 2001. Quantitative Analysis of the Timing of the Origin and Diversification of Extant Placental Orders. Journal of Mammalian Evolution, 8, 107-124.


ASHER, R. J., NOVACEK, M. J. & GEISLER, J. H. 2003. Relationships of endemic African mammals and their fossil relatives based on morphological and molecular evidence. Journal of Mammalian Evolution, 10, 131-194.

AVISE, J. C. 2000. Phylogeography: the history and formation of species, Harvard University Press.

AVISE, J. C., SHAPIRA, J., DANIEL, S. W., AQUADRO, C. F. & LANSMAN, R. A. 1983. Mitochondrial DNA differentiation during the speciation process in Peromyscus. Molecular Biology and Evolution, 1, 38-56.

BABA, M. L., DARGA, L. L. & GOODMAN, M. 1981. Maximum parsimony test of the clock model of molecular-change using amino-acid-sequence data. American Journal of Physical Anthropology, 54, 198-198.

BADYAEV, A. V. & HILL, G. E. 2003. Avian sexual dichromatism in relation to phylogeny and ecology. Annual Review of Ecology, Evolution, and Systematics, 34, 27-49.

BARNABAS, J., GOODMAN, M. & MOORE, G. W. 1972. Descent of mammalian alpha globin chain sequences investigated by maximum parsimony method. Journal of Molecular Biology, 69, 249-&.

BARRETT, M., DONOGHUE, M. J. & SOBER, E. 1991. Against consensus. Systematic Zoology, 40, 486-493.

BARRY, J. C. 1995. Paleoclimate and evolution, with emphasis on human origins, Yale University Press.

BARRY, J. C., JOHNSON, N. M., RAZA, S. M. & JACOBS, L. L. 1985. Neogene mammalian faunal change in southern Asia: correlations with climatic, tectonic, and eustatic events. Geology, 13, 637-640.

BARTHOLOMEW, G. A. 1970. A model for the evolution of pinniped polygyny. Evolution, 24, 546-559.

BAUM, B. R. 1992. Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees. Taxon, 3-10.

BAVERSTOCK, P., KRIEG, M. & BIRRELL, J. 1990. Evolutionary relationships of Australian marsupials as assessed by albumin immunology. Australian Journal of Zoology, 37, 273-287.

BAVERSTOCK, P. R., RICHARDSON, B. J., BIRRELL, J. & KRIEG, M. 1989. Albumin Immunologic Relationships of the Macropodidae (Marsupialia). Systematic Biology, 38, 38-50.

BAYZID, M. S. & WARNOW, T. 2012. Estimating optimal species trees from incomplete gene trees under deep coalescence. Journal of Computational Biology, 19, 591-605.

BECK, R. M., GODTHELP, H., WEISBECKER, V., ARCHER, M. & HAND, S. J. 2008. Australia's oldest marsupial fossils and their biogeographical implications. PLoS One, 3, e1858.

BELSHAW, R., PEREIRA, V., KATZOURAKIS, A., TALBOT, G., PAČES, J., BURT, A. & TRISTEM, M. 2004. Long-term reinfection of the human genome by endogenous retroviruses. Proceedings of the National Academy of Sciences of the United States of America, 101, 4894-4899.

BENACHENHOU, F., JERN, P., OJA, M., SPERBER, G., BLIKSTAD, V., SOMERVUO, P., KASKI, S. & BLOMBERG, J. 2009. Evolutionary conservation of orthoretroviral long terminal repeats (LTRs) and ab initio detection of single LTRs in genomic data. PloS one, 4, e5179.

BENSLEY, B. A. 1903. III. On the Evolution of the Australian Marsupialia; with Remarks on the Relationships of the Marsupials in general. Transactions of the Linnean Society of London. 2nd Series: Zoology, 9, 83-217.

BERNOR, R. L. 1983. Geochronology and zoogeographic relationships of Miocene Hominoidea. New interpretations of ape and human ancestry. Springer.

BININDA-EMONDS, O. R., CARDILLO, M., JONES, K. E., MACPHEE, R. D., BECK, R. M., GRENYER, R., PRICE, S. A., VOS, R. A., GITTLEMAN, J. L. & PURVIS, A. 2007. The delayed rise of present-day mammals. Nature, 446, 507-512.

BININDA-EMONDS, O. R., GITTLEMAN, J. L. & STEEL, M. A. 2002. The (super) tree of life: procedures, problems, and prospects. Annual Review of Ecology and Systematics, 33, 265-289.

BLACK, K. H., ARCHER, M., HAND, S. J. & GODTHELP, H. 2012. The rise of Australian marsupials: a synopsis of biostratigraphic, phylogenetic, palaeoecologic and palaeobiogeographic understanding. Earth and life. Springer.

BOISSINOT, S., ROOS, C. & FURANO, A. V. 2004. Different rates of LINE-1 (L1) retrotransposon amplification and evolution in New World monkeys. Journal of molecular evolution, 58, 122-130.

BRISCOE, D., CALABY, J., CLOSE, R., MAYNES, G., MURTAGH, C. & SHARMAN, G. 1982. Isolation, introgression and genetic variation in rock-wallabies. In: GROVES, R. H. & RIDE, W. D. L. (eds.) Species at Risk: Research in Australia, 73-87.

BRYDEN, M. 1972. Growth and development of marine mammals. Functional anatomy of marine mammals, 1, 1-79.

BULAZEL, K. V., FERRERI, G. C., ELDRIDGE, M. & O’NEILL, R. J. 2007. Species-specific shifts in centromere sequence composition are coincident with breakpoint reuse in karyotypically divergent lineages. Genome Biology, 8, R170.

BURK, A. & SPRINGER, M. 2000. Intergeneric Relationships Among Macropodoidea (: Diprotodontia) and The Chronicle of Kangaroo Evolution. Journal of Mammalian Evolution, 7, 213-237.

BURK, A., WESTERMAN, M. & SPRINGER, M. 1998. The phylogenetic position of the musky rat-kangaroo and the evolution of bipedal hopping in kangaroos (Macropodidae: Diprotodontia). Systematic Biology, 47, 457-474.

BUTLER, K., TRAVOUILLON, K. J., PRICE, G. J., ARCHER, M. & HAND, S. J. 2016. Cookeroo, a new genus of fossil kangaroo (Marsupialia, Macropodidae) from the Oligo-Miocene of Riversleigh, northwestern Queensland, Australia. Journal of Vertebrate Paleontology, e1083029.


BYRNE, M., STEANE, D. A., JOSEPH, L., YEATES, D. K., JORDAN, G. J., CRAYN, D., APLIN, K., CANTRILL, D. J., COOK, L. G. & CRISP, M. D. 2011. Decline of a biome: evolution, contraction, fragmentation, extinction and invasion of the Australian mesic zone biota. Journal of Biogeography, 38, 1635-1656.

CAMPEAU-PÉLOQUIN, A., KIRSCH, J. A., ELDRIDGE, M. D. & LAPOINTE, F.-J. 2001. Phylogeny of the rock-wallabies, Petrogale (Marsupialia: Macropodidae) based on DNA/DNA hybridisation. Australian Journal of Zoology, 49, 463-486.

CANTRELL, M. A., EDERER, M. M., ERICKSON, I. K., SWIER, V. J., BAKER, R. J. & WICHMAN, H. A. 2005. MysTR: an endogenous retrovirus family in mammals that is undergoing recent amplifications to unprecedented copy numbers. Journal of virology, 79, 14698-14707.

CANTRELL, M. A., GRAHN, R. A., SCOTT, L. & WICHMAN, H. A. 2000. Isolation of markers from recently transposed LINE-1 retrotransposons. Biotechniques, 29, 1310-1317.

CANTRELL, M. A., SCOTT, L., BROWN, C. J., MARTINEZ, A. R. & WICHMAN, H. A. 2008. Loss of LINE-1 activity in the megabats. Genetics, 178, 393-404.

CARDILLO, M., BININDA-EMONDS, R. P., BOAKES, E. & PURVIS, A. 2004. A species-level phylogenetic supertree of marsupials. Journal of Zoology, 264, 11-31.

CARO, T. 2005. The Adaptive Significance of Coloration in Mammals. BioScience, 55, 125-136.

CARROLL, R. L. 1997. Patterns and processes of vertebrate evolution, Cambridge University Press.

CARSTENS, B. C. & KNOWLES, L. L. 2007. Estimating species phylogeny from gene-tree probabilities despite incomplete lineage sorting: an example from Melanoplus grasshoppers. Systematic Biology, 56, 400-411.

CASAVANT, N. C., SCOTT, L., CANTRELL, M. A., WIGGINS, L. E., BAKER, R. J. & WICHMAN, H. A. 2000. The end of the LINE?: lack of recent L1 activity in a group of South American rodents. Genetics, 154, 1809-1817.

CASTRESANA, J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Molecular biology and evolution, 17, 540-552.

CERLING, T. E., HARRIS, J. M., MACFADDEN, B. J. & LEAKEY, M. G. 1997. Global vegetation change through the Miocene/Pliocene boundary. Nature, 389, 153.

CHALOPIN, D., NAVILLE, M., PLARD, F., GALIANA, D. & VOLFF, J.-N. 2015. Comparative analysis of transposable elements highlights mobilome diversity and evolution in vertebrates. Genome biology and evolution, 7, 567-580.

CHURAKOV, G., KRIEGS, J. O., BAERTSCH, R., ZEMANN, A., BROSIUS, J. & SCHMITZ, J. 2009. Mosaic retroposon insertion patterns in placental mammals. Genome research, 19, 868-875.

CLARK, J. B. & KIDWELL, M. G. 1997. A phylogenetic perspective on P transposable element evolution in Drosophila. Proceedings of the National Academy of Sciences, 94, 11428-11433.


CLOSE, R. & LOWRY, P. 1990. Hybrids in marsupial research. Australian Journal of Zoology, 37, 259-267.

COFFIN, J. M., HUGHES, S. H., VARMUS, H. E., BOEKE, J. & STOYE, J. 1997. Retrotransposons, endogenous retroviruses, and the evolution of retroelements.

COOKE, B. N. 1999. Wanburoo hilarus gen. et sp. nov., a lophodont bulungamayine kangaroo (Marsupialia: Macropodoidea: Bulungamayinae) from the Miocene deposits of Riversleigh, northwestern Queensland. Records of the Western Australian Museum Supplement, 57, 239-253.

COOKE, B. N., TRAVOUILLON, K. J., ARCHER, M. & HAND, S. J. 2015. Ganguroo robustiter, sp. nov. (Macropodoidea, Marsupialia), a middle to early late Miocene basal macropodid from Riversleigh World Heritage Area, Australia. Journal of Vertebrate Paleontology, 35, e956879.

COOPER, A. & PENNY, D. 1997. Mass Survival of Birds Across the Cretaceous-Tertiary Boundary: Molecular Evidence. Science, 275, 1109-1113.

CORNELIS, G., HEIDMANN, O., BERNARD-STOECKLIN, S., REYNAUD, K., VÉRON, G., MULOT, B., DUPRESSOIR, A. & HEIDMANN, T. 2012. Ancestral capture of syncytin-Car1, a fusogenic endogenous retroviral envelope gene involved in placentation and conserved in Carnivora. Proceedings of the National Academy of Sciences, 109, E432-E441.

CROOK, J. H. 1972. Sexual selection, dimorphism, and social organization in the primates. Sexual selection and the descent of man, 1871-1971.

DARWIN, C. 1883. The Descent of Man and Selection in Relation to Sex, Рипол Классик.

DAVIES, T. J., BARRACLOUGH, T. G., CHASE, M. W., SOLTIS, P. S., SOLTIS, D. E. & SAVOLAINEN, V. 2004. Darwin's abominable mystery: insights from a supertree of the angiosperms. Proceedings of the National Academy of Sciences, 101, 1904-1909.

DAWSON, L. 2004. A new Pliocene tree kangaroo species (Marsupialia, Macropodinae) from the Chinchilla Local Fauna, southeastern Queensland. Alcheringa, 28, 267-273.

DAWSON, L. & FLANNERY, T. 1985. Taxonomic and phylogenetic status of living and fossil kangaroos and wallabies of the genus Macropus Shaw (Macropodidae: Marsupialia), with a new subgeneric name for the larger wallabies. Australian Journal of Zoology, 33, 473-498.

DAWSON, T. 1995. Kangaroos: the biology of the largest marsupials. University of Press, Sydney. New South Wales, Australia.

DAWSON, T. J. 2012. Kangaroos, CSIRO PUBLISHING.

DAWSON, T. J. & BROWN, G. D. 1970. A comparison of the insulative and reflective properties of the fur of desert kangaroos. Comparative Biochemistry and Physiology, 37, 23-38.

DE PARSEVAL, N. & HEIDMANN, T. 2005. Human endogenous retroviruses: from infectious elements to human genes. Cytogenetic and genome research, 110, 318-332.

DE QUEIROZ, A., DONOGHUE, M. J. & KIM, J. 1995. Separate versus combined analysis of phylogenetic evidence. Annual Review of Ecology and Systematics, 26, 657-681.

DE QUEIROZ, A. & GATESY, J. 2007. The supermatrix approach to systematics. Trends in Ecology & Evolution, 22, 34-41.

DEGIORGIO, M. & DEGNAN, J. H. 2013. Robustness to divergence time underestimation when inferring species trees from estimated gene trees. Systematic biology, 63, 66-82.

DEGNAN, J. H. & ROSENBERG, N. A. 2006. Discordance of species trees with their most likely gene trees. PLoS genetics, 2, e68.

DEGNAN, J. H. & ROSENBERG, N. A. 2009. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends in ecology & evolution, 24, 332-340.

DEGNAN, J. H. & SALTER, L. A. 2005. Gene tree distributions under the coalescent process. Evolution, 59, 24-37.

DEROCHER, A. E., ANDERSEN, M. & WIIG, Ø. 2005. Sexual dimorphism of polar bears. Journal of Mammalogy, 86, 895-901.

DEWANNIEUX, M., ESNAULT, C. & HEIDMANN, T. 2003. LINE-mediated retrotransposition of marked Alu sequences. Nature genetics, 35, 41.

DIRZO, R. & RAVEN, P. H. 2003. Global state of biodiversity and loss. Annual Review of Environment and Resources, 28, 137-167.

DIXSON, A., DIXSON, B. & ANDERSON, M. 2005. Sexual selection and the evolution of visually conspicuous sexually dimorphic traits in male monkeys, apes, and human beings. Annual review of sex research, 16, 1-19.

DODT, W. G., GALLUS, S., PHILLIPS, M. J. & NILSSON, M. A. 2017. Resolving kangaroo phylogeny and overcoming retrotransposon ascertainment bias. Scientific reports, 7, 16811.

DORONINA, L., CHURAKOV, G., SHI, J., BROSIUS, J., BAERTSCH, R., CLAWSON, H. & SCHMITZ, J. 2015. Exploring massive incomplete lineage sorting in arctoids (, Carnivora). Molecular biology and evolution, msv188.

DOYLE, J. J. 1992. Gene trees and species trees: molecular systematics as one-character taxonomy. Systematic Botany, 144-163.

DOYLE, J. J. 1997. Trees within trees: genes and species, molecules and morphology. Systematic Biology, 46, 537-553.

DRUMMOND, A. J., HO, S. Y., PHILLIPS, M. J. & RAMBAUT, A. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biology, 4, e88.

DRUMMOND, A. J. & RAMBAUT, A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology, 7.

DRUMMOND, A. J., RAMBAUT, A., SHAPIRO, B. & PYBUS, O. G. 2005. Bayesian coalescent inference of past population dynamics from molecular sequences. Molecular Biology and Evolution, 22, 1185-1192.


DRUMMOND, A. J. & SUCHARD, M. A. 2010. Bayesian random local clocks, or one rate to rule them all. BMC biology, 8, 114.

DRUMMOND, A. J., SUCHARD, M. A., XIE, D. & RAMBAUT, A. 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular biology and evolution, 29, 1969-1973.

DUCHENE, D. & BROMHAM, L. 2015. Molecular Dating of Evolutionary Events. Encyclopedia of Scientific Dating Methods, 593-596.

DURAND, E. Y., PATTERSON, N., REICH, D. & SLATKIN, M. 2011. Testing for ancient admixture between closely related populations. Molecular biology and evolution, 28, 2239-2252.

EDGAR, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic acids research, 32, 1792-1797.

EDWARDS, M. E., BRUBAKER, L. B., LOZHKIN, A. V. & ANDERSON, P. M. 2005. Structurally novel biomes: A response to past warming in Beringia. Ecology, 86, 1696-1703.

EDWARDS, S. V. 2009. Is a new and general theory of molecular systematics emerging? Evolution, 63, 1-19.

EDWARDS, S. V., LIU, L. & PEARL, D. K. 2007. High-resolution species trees without concatenation. Proceedings of the National Academy of Sciences, 104, 5936-5941.

EISENBERG, J. F. 1981. Mammalian radiations, University of Chicago Press.

EISENBERG, J. F. 1981. The mammalian radiations: an analysis of trends in evolution, adaptation, and behaviour.

ELDRIDGE, M. & CLOSE, R. 1992. Taxonomy of rock wallabies, Petrogale (Marsupialia, Macropodidae). 1. A revision of the Eastern Petrogale with the description of 3 new species. Australian Journal of Zoology, 40, 605-625.

ELDRIDGE, M., JOHNSTON, P. & CLOSE, R. 1991. Chromosomal Rearrangements in Rock Wallabies, Petrogale (Marsupialia, Macropodidae). 5. Chromosomal Phylogeny of the Lateralis-Penicillata Group. Australian Journal of Zoology, 39, 629-641.

ELDRIDGE, M. D., MILLER, E. J., NEAVES, L. E., ZENGER, K. R. & HERBERT, C. A. 2017. Extensive genetic differentiation detected within a model marsupial, the tammar wallaby (Notamacropus eugenii). PloS one, 12, e0172777.

ELDRIDGE, M. D., POTTER, S., JOHNSON, C. N. & RITCHIE, E. G. 2014. Differing impact of a major biogeographic barrier on genetic structure in two large kangaroos from the monsoon tropics of . Ecology and evolution, 4, 554-567.

ELLSTRAND, N. C., PRENTICE, H. C. & HANCOCK, J. F. 1999. Gene flow and introgression from domesticated plants into their wild relatives. Annual review of Ecology and Systematics, 30, 539-563.

ERICKSON, I. K., CANTRELL, M. A., SCOTT, L. & WICHMAN, H. A. 2011. Retrofitting the genome: L1 extinction follows endogenous retroviral expansion in a group of muroid rodents. Journal of virology, 85, 12315-12323.


ESSELSTYN, J. A., OLIVEROS, C. H., SWANSON, M. T. & FAIRCLOTH, B. C. 2017. Investigating difficult nodes in the placental mammal tree with expanded taxon sampling and thousands of ultraconserved elements. Genome biology and evolution, 9, 2308-2321.

FARRIS, J. S. 1977. Phylogenetic analysis under Dollo's Law. Systematic Biology, 26, 77-88.

FELSENSTEIN, J. 1978. The number of evolutionary trees. Systematic Biology, 27, 27-33.

FELSENSTEIN, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368-376.

FELSENSTEIN, J. 1988. Phylogenies from molecular sequences: inference and reliability. Annual Review of Genetics, 22, 521-565.

FELSENSTEIN, J. 2004. Inferring phylogenies, Sinauer Associates, Sunderland.

FERRERI, G., LISCINSKY, D., MACK, J., ELDRIDGE, M. & O'NEILL, R. 2005. Retention of latent centromeres in the mammalian genome. Journal of Heredity, 96, 217-224.

FERRERI, G., MARZELLI, M., RENS, W. & O’NEILL, R. 2004. A centromere-specific retroviral element associated with breaks of synteny in macropodine marsupials. Cytogenetic and genome research, 107, 115-118.

FERRERI, G. C., BROWN, J. D., OBERGFELL, C., JUE, N., FINN, C. E., O'NEILL, M. J. & O'NEILL, R. J. 2011. Recent amplification of the kangaroo endogenous retrovirus, KERV, limited to the centromere. Journal of virology, 85, 4761-4771.

FESCHOTTE, C. & GILBERT, C. 2012. Endogenous viruses: insights into viral evolution and impact on host biology. Nature Reviews Genetics, 13, 283-296.

FINARELLI, J. A. & FLYNN, J. J. 2006. Ancestral state reconstruction of body size in the Caniformia (Carnivora, Mammalia): the effects of incorporating data from the fossil record. Systematic Biology, 55, 301-313.

FINNEGAN, D. J. 1989. Eukaryotic transposable elements and genome evolution. Trends in Genetics, 5, 103-107.

FLANNERY, T. 1983. Revision in the macropodid subfamily Sthenurinae (Marsupialia: Macropodoidea) and the relationships of the species of Troposodon and Lagostrophus. Australian Mammal Society, 6, 15.

FLANNERY, T. 1984. Kangaroos: 15 million years of Australian bounders. Vertebrate Zoogeography & Evolution in Australasia, 817-836.

FLANNERY, T. 1989. Phylogeny of the Macropodoidea; a study in convergence. Kangaroos, wallabies and rat-kangaroos, 1, 1-46.

FLANNERY, T. & HANN, L. 1984. A new macropodine genus and species (Marsupialia: Macropodidae) from the early Pleistocene of southwestern . Australian Mammalogy, 7, 193-204.


FLANNERY, T. F. 1992. The Macropodoidea (Marsupialia) of the early Pliocene Hamilton Local Fauna, Victoria, Australia, Field Museum of Natural History.

FLANNERY, T. F. 1995. Mammals of New Guinea, Reed.

FLANNERY, T. F., MARTIN, R. & SZALAY, A. 1996. Tree kangaroos: a curious natural history, Reed Books.

FLANNERY, T. F., RICH, T., TURNBULL, W. & LUNDELIUS JR, E. 1992. The Macropodoidea (Marsupialia) of the early Pliocene Hamilton Local Fauna, Victoria, Australia, Field Museum of Natural History.

FOOTE, M., HUNTER, J. P., JANIS, C. M. & SEPKOSKI, J. J. 1999. Evolutionary and Preservational Constraints on Origins of Biologic Groups: Divergence Times of Eutherian Mammals. Science, 283, 1310-1314.

FRANK, O., VERBEKE, C., SCHWARZ, N., MAYER, J., FABARIUS, A., HEHLMANN, R., LEIB-MÖSCH, C. & SEIFARTH, W. 2008. Variable transcriptional activity of endogenous retroviruses in human breast cancer. Journal of virology, 82, 1808-1818.

FRITH, H. J. & CALABY, J. H. 1969. Kangaroos, Hurst.

FUMAGALLI, L., POPE, L. C., TABERLET, P. & MORITZ, C. 1997. Versatile primers for the amplification of the mitochondrial DNA control region in marsupials. Molecular Ecology, 6, 1199-1201.

GALLUS, S., HALLSTRÖM, B. M., KUMAR, V., DODT, W. G., JANKE, A., SCHUMANN, G. G. & NILSSON, M. A. 2015a. Evolutionary histories of transposable elements in the genome of the largest living marsupial carnivore, the Tasmanian devil. Molecular biology and evolution, 32, 1268-1283.

GALLUS, S., JANKE, A., KUMAR, V. & NILSSON, M. A. 2015b. Disentangling the relationship of the Australian marsupial orders using retrotransposon and evolutionary network analyses. Genome biology and evolution, 7, 985-992.

GALLUS, S., LAMMERS, F. & NILSSON, M. 2016. When Genomics is not Enough: Experimental Evidence for a Decrease in LINE-1 Activity During the Evolution of Australian Marsupials. Genome Biology and Evolution, evw159.

GASCUEL, O. & STEEL, M. 2014. Predicting the ancestral character changes in a tree is typically easier than predicting the root state. Systematic biology, 63, 421-435.

GATESY, J., O'GRADY, P. & BAKER, R. H. 1999. Corroboration among data sets in simultaneous analysis: hidden support for phylogenetic relationships among higher level artiodactyl taxa. , 15, 271-313.

GEISLER, J. H. & UHEN, M. D. 2005. Phylogenetic relationships of extinct cetartiodactyls: results of simultaneous analyses of molecular, morphological, and stratigraphic data. Journal of Mammalian Evolution, 12, 145-160.

GENTLES, A. J., WAKEFIELD, M. J., KOHANY, O., GU, W., BATZER, M. A., POLLOCK, D. D. & JURKA, J. 2007. Evolutionary dynamics of transposable elements in the short-tailed opossum Monodelphis domestica. Genome research, 17, 992-1004.


GIFFORD, R. & TRISTEM, M. 2003. The evolution, distribution and diversity of endogenous retroviruses. Virus genes, 26, 291-315.

GLENNER, H., HANSEN, A. J., SORENSEN, M. V., RONQUIST, F., HUELSENBECK, J. P. & WILLERSLEV, E. 2004. Bayesian inference of the metazoan phylogeny; a combined molecular and morphological approach. Current Biology, 14, 1644-1649.

GODTHELP, H., WROE, S. & ARCHER, M. 1999. A new marsupial from the Early Eocene Tingamarra Local Fauna of Murgon, southeastern Queensland: a prototypical Australian marsupial? Journal of Mammalian Evolution, 6, 289-313.

GONTCHAROV, A. A., MARIN, B. & MELKONIAN, M. 2004. Are combined analyses better than single gene phylogenies? A case study using SSU rDNA and rbcL sequence comparisons in the Zygnematophyceae (Streptophyta). Molecular Biology and Evolution, 21, 612-624.

GORDON, A. D. 1986. Consensus supertrees: the synthesis of rooted trees containing overlapping sets of labeled leaves. Journal of classification, 3, 335-348.

GRAHN, R., RINEHART, T., CANTRELL, M. & WICHMAN, H. 2005. Extinction of LINE-1 activity coincident with a major mammalian radiation in rodents. Cytogenetic and genome research, 110, 407-415.

GRAVES, J. A. M. & RENFREE, M. B. 2013. Marsupials in the Age of Genomics. Annual review of genomics and human genetics, 14.

GREEN, R. E., KRAUSE, J., BRIGGS, A. W., MARICIC, T., STENZEL, U., KIRCHER, M., PATTERSON, N., LI, H., ZHAI, W. & FRITZ, M. H.-Y. 2010. A draft sequence of the Neandertal genome. Science, 328, 710-722.

GRIFFITHS, R. C. & TAVARE, S. 1994. Sampling theory for neutral alleles in a varying environment. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences, 344, 403-410.

HAHN, C., BACHMANN, L. & CHEVREUX, B. 2013. Reconstructing mitochondrial genomes directly from genomic next-generation sequencing reads—a baiting and iterative mapping approach. Nucleic acids research, 41, e129.

HALL, E. R. 1951. American weasels, University of Kansas.

HALLSTRÖM, B. M. & JANKE, A. 2010. Mammalian evolution may not be strictly bifurcating. Molecular Biology and Evolution, 27, 2804-2816.

HARTL, D. L., LOHE, A. R. & LOZOVSKAYA, E. R. 1997. Modern thoughts on an ancyent marinere: function, evolution, regulation. Annual review of genetics, 31, 337-358.

HAYMAN, D. 1989. Marsupial cytogenetics. Australian Journal of Zoology, 37, 331-349.

HELED, J. & DRUMMOND, A. J. 2009. Bayesian inference of species trees from multilocus data. Molecular biology and evolution, 27, 570-580.

HENDY, M. D. & PENNY, D. 1989. A framework for the quantitative study of evolutionary trees. Systematic Biology, 38, 297-309.

HENNIG, W. 1965. Phylogenetic systematics. Annual review of entomology, 10, 97-116.


HERNIOU, E., MARTIN, J., MILLER, K., COOK, J., WILKINSON, M. & TRISTEM, M. 1998. Retroviral diversity and distribution in vertebrates. Journal of virology, 72, 5955-5966.

HILL, R. S. 2004. Origins of the southeastern Australian vegetation. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 359, 1537- 1549.

HILLIS, D. M., HUELSENBECK, J. P. & CUNNINGHAM, C. W. 1994. Application and accuracy of molecular phylogenies. Science, 264, 671-676.

HOCKNULL, S. A. 2005. Ecological succession during the late Cainozoic of central eastern Queensland: extinction of a diverse rainforest community. Memoirs of the Queensland Museum, 51, 39-122.

HORMOZDIARI, F., HAJIRASOULIHA, I., DAO, P., HACH, F., YORUKOGLU, D., ALKAN, C., EICHLER, E. E. & SAHINALP, S. C. 2010. Next-generation VariationHunter: combinatorial algorithms for transposon insertion discovery. Bioinformatics, 26, i350-i357.

HORMOZDIARI, F., KONKEL, M. K., PRADO-MARTINEZ, J., CHIATANTE, G., HERRAEZ, I. H., WALKER, J. A., NELSON, B., ALKAN, C., SUDMANT, P. H. & HUDDLESTON, J. 2013. Rates and patterns of great ape retrotransposition. Proceedings of the National Academy of Sciences, 110, 13457-13462.

HORNER, D. S., PAVESI, G., CASTRIGNANO, T., DE MEO, P. D., LIUNI, S., SAMMETH, M., PICARDI, E. & PESOLE, G. 2010. Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing. Briefings in Bioinformatics, 11, 181-197.

HUELSENBECK, J. P. & RONQUIST, F. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics, 17, 754-755.

HUELSENBECK, J. P., RONQUIST, F., NIELSEN, R. & BOLLBACK, J. P. 2001a. Bayesian inference of phylogeny and its impact on evolutionary biology. Science, 294, 2310-4.

HUELSENBECK, J. P., RONQUIST, F., NIELSEN, R. & BOLLBACK, J. P. 2001b. Bayesian inference of phylogeny and its impact on evolutionary biology. Science, 294, 2310-2314.

HUNT, J. & ROBERT, M. 2004. Chapter 11: Global climate and the evolution of large mammalian during the later Cenozoic in North America. Bulletin of the American Museum of Natural History, 139-156.

HUNTER, J. P. & JANIS, C. M. 2006. "Garden of Eden" or "fool's paradise"? Phylogeny, dispersal, and the southern continent hypothesis of placental mammal origins. Paleobiology, 32, 339-344.

HUSON, D. H. & BRYANT, D. 2005. Application of phylogenetic networks in evolutionary studies. Molecular biology and evolution, 23, 254-267.

HUTCHINSON, J. B. 1929. The application of the "Method of Maximum Likelihood" to the estimation of linkage. Genetics, 14, 519-537.

JACKSON, S. & GROVES, C. 2015. Taxonomy of Australian mammals, CSIRO PUBLISHING.

JANIS, C. 2008. An evolutionary history of browsing and grazing ungulates. The ecology of browsing and grazing. Springer.

JANSSON, R. & DYNESIUS, M. 2002. The fate of clades in a world of recurrent climatic change: Milankovitch oscillations and evolution. Annual Review of Ecology and Systematics, 33, 741-777.

JARMAN, P. J. 1987. Group size and activity in eastern grey kangaroos. Animal Behaviour, 35, 1044-1050.

JEFFERSON, T. 1990. Sexual dimorphism and development of external features in Dall's porpoise (Phocoenoides dalli)(Porpoise).

JOHNSON, M. A., REVELL, L. J. & LOSOS, J. B. 2010. Behavioral convergence and adaptive radiation: effects of habitat use on territorial behavior in Anolis lizards. Evolution, 64, 1151-1159.

JOHNSON, R. N., HOBBS, M., ELDRIDGE, M. D., KING, A. G., COLGAN, D. J., WILKINS, M. R., CHEN, Z., PRENTIS, P. J., PAVASOVIC, A. & POLKINGHORNE, A. 2014. The koala genome consortium. Technical Reports of the Australian Museum, 24, 91-92.

JUKES, T. H. & CANTOR, C. R. 1969. Evolution of protein molecules. Mammalian protein metabolism, 3, 132.

JURKA, J., KAPITONOV, V. V., PAVLICEK, A., KLONOWSKI, P., KOHANY, O. & WALICHIEWICZ, J. 2005. Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic and genome research, 110, 462-467.

KAJIKAWA, M. & OKADA, N. 2002. LINEs mobilize SINEs in the eel through a shared 3′ sequence. Cell, 111, 433-444.

KAZAZIAN, H. H. & MORAN, J. V. 1998. The impact of L1 retrotransposons on the human genome. Nature genetics, 19, 19-24.

KEANE, T. M., WONG, K. & ADAMS, D. J. 2012. RetroSeq: transposable element discovery from next-generation sequencing data. Bioinformatics, 29, 389-390.

KEAR, B. P., COOKE, B. N., ARCHER, M. & FLANNERY, T. F. 2007. Implications of a new species of the Oligo-Miocene kangaroo (Marsupialia: Macropodoidea) Nambaroo, from the Riversleigh World Heritage Area, Queensland, Australia. Journal of Paleontology, 81, 1147-1167.

KEAR, B. P. & PLEDGE, N. S. 2008. A new fossil kangaroo from the Oligocene-Miocene Etadunna Formation of Ngama Quarry, Lake Palankarinna, South Australia. Australian Journal of Zoology, 55, 331-339.

KENT, W. J. 2002. BLAT—the BLAST-like alignment tool. Genome research, 12, 656-664.

KEWITZ, S. & STAEGE, M. S. 2013. Expression and regulation of the endogenous retrovirus 3 in Hodgkin’s lymphoma cells. Frontiers in oncology, 3.

KIMURA, M. 1983. The Neutral Theory of Molecular Evolution. Cambridge University Press.

KINGMAN, J. F. C. 1982. The coalescent. Stochastic Processes and their Applications, 13, 235-248.

KIRSCH, J. & PALMA, R. 1995. DNA/DNA hybridization studies of carnivorous marsupials. V. A further estimate of relationships among opossums (Marsupialia: Didelphidae). Mammalia, 59, 403-426.

KIRSCH, J. A. 1977. The comparative serology of Marsupialia, and a classification of marsupials. Australian Journal of Zoology, 25, 1-152.

KIRSCH, J. A., LAPOINTE, F. J. & FOESTE, A. 1995. Resolution of portions of the kangaroo phylogeny (Marsupialia: Macropodidae) using DNA hybridization. Biological Journal of the Linnean Society, 55, 309-328.

KIRSCH, J. A., SPRINGER, M. S. & LAPOINTE, F.-J. 1997. DNA-hybridisation studies of marsupials and their implications for metatherian classification. Australian Journal of Zoology, 45, 211-280.

KLUGE, A. G. 1989. A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Systematic Biology, 38, 7-25.

KLUGE, A. G. & WOLF, A. J. 1993. Cladistics: what's in a word? Cladistics, 9, 183-199.

KOHANY, O., GENTLES, A. J., HANKUS, L. & JURKA, J. 2006. Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor. BMC bioinformatics, 7, 474.

KOTTLER, M. J. 1980. Darwin, Wallace, and the origin of sexual dimorphism. Proceedings of the American Philosophical Society, 124, 203-226.

KRAMEROV, D. A. & VASSETZKY, N. S. 2011. Origin and evolution of SINEs in eukaryotic genomes. Heredity, 107, 487-95.

KRAYEV, A. S., KRAMEROV, D. A., SKRYABIN, K. G., RYSKOV, A. P., BAYEV, A. A. & GEORGIEV, G. P. 1980. The nucleotide sequence of the ubiquitous repetitive DNA sequence B1 complementary to the most abundant class of foldback RNA. Nucleic Acids Research, 8, 1201-1215.

KRIEGS, J. O., CHURAKOV, G., KIEFMANN, M., JORDAN, U., BROSIUS, J. & SCHMITZ, J. 2006. Retroposed Elements as Archives for the Evolutionary History of Placental Mammals. PLoS Biol, 4, e91.

KUBATKO, L. S., CARSTENS, B. C. & KNOWLES, L. L. 2009. STEM: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics, 25, 971-973.

KUBATKO, L. S. & DEGNAN, J. H. 2007. Inconsistency of phylogenetic estimates from concatenated data under coalescence. Systematic Biology, 56, 17-24.

KUIPER, K., DEINO, A., HILGEN, F., KRIJGSMAN, W., RENNE, P. & WIJBRANS, J. 2008. Synchronizing rock clocks of Earth history. Science, 320, 500-504.

KUMAR, S. & HEDGES, S. B. 1998. A molecular timescale for vertebrate evolution. Nature, 392, 917-920.

KURITZIN, A., KISCHKA, T., SCHMITZ, J. & CHURAKOV, G. 2016. Incomplete Lineage Sorting and Hybridization Statistics for Large-Scale Retroposon Insertion Data. PLOS Comput Biol, 12, e1004812.

LANDER, E. S., LINTON, L. M., BIRREN, B., NUSBAUM, C., ZODY, M. C., BALDWIN, J., DEVON, K., DEWAR, K., DOYLE, M. & FITZHUGH, W. 2001. Initial sequencing and analysis of the human genome. Nature, 409, 860-921.

LANFEAR, R., FRANDSEN, P. B., WRIGHT, A. M., SENFELD, T. & CALCOTT, B. 2016. PartitionFinder 2: new methods for selecting partitioned models of evolution for molecular and morphological phylogenetic analyses. Molecular Biology and Evolution, msw260.

LANGLEY, C. H. & FITCH, W. M. 1973. The constancy of evolution: a statistical analysis of a and b haemoglobins, cytochrome c, and fibrinopeptide A. Genetic structure of populations. University of Hawaii Press, Honolulu, 246-262.

LARSSON, A. 2014. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics, 30, 3276-3278.

LAVIALLE, C., CORNELIS, G., DUPRESSOIR, A., ESNAULT, C., HEIDMANN, O., VERNOCHET, C. & HEIDMANN, T. 2013. Paleovirology of ‘syncytins’, retroviral env genes exapted for a role in placentation. Philosophical Transactions of the Royal Society B: Biological Sciences, 368, 20120507.

LE QUESNE, W. J. 1974. The uniquely evolved character concept and its cladistic application. Systematic Biology, 23, 513-517.

LEACHÉ, A. D. & MCGUIRE, J. A. 2006. Phylogenetic relationships of horned lizards (Phrynosoma) based on nuclear and mitochondrial data: evidence for a misleading mitochondrial gene tree. Molecular phylogenetics and evolution, 39, 628-644.

LEACHÉ, A. D. & RANNALA, B. 2010. The accuracy of species tree estimation under simulation: a comparison of methods. Systematic biology, 60, 126-137.

LEAKEY, M. G., FEIBEL, C. S., BERNOR, R. L., HARRIS, J. M., CERLING, T. E., STEWART, K. M., STORRS, G. W., WALKER, A., WERDELIN, L. & WINKLER, A. J. 1996. Lothagam: a record of faunal change in the Late Miocene of East Africa. Journal of Vertebrate Paleontology, 16, 556-570.

LEE, M. S. 2005. Molecular evidence and marine snake origins. Biology Letters, 1, 227-230.

LEIB-MOSCH, C., SEIFARTH, W. & SCHON, U. 2005. Influence of human endogenous retroviruses on cellular gene expression. Retroviruses and primate genome evolution, 123-143.

LI, H., HANDSAKER, B., WYSOKER, A., FENNELL, T., RUAN, J., HOMER, N., MARTH, G., ABECASIS, G. & DURBIN, R. 2009. The sequence alignment/map format and SAMtools. Bioinformatics, 25, 2078-2079.

LI, G., DAVIS, B. W., EIZIRIK, E. & MURPHY, W. J. 2016. Phylogenomic evidence for ancient hybridization in the genomes of living cats (Felidae). Genome research, 26, 1-11.

LI, S. Y., PEARL, D. K. & DOSS, H. 2000. Phylogenetic tree construction using Markov chain Monte Carlo. Journal of the American Statistical Association, 95, 493-508.

LINDSAY, E. H., FAHLBUSCH, V. & MEIN, P. 2013. European Neogene mammal chronology, Springer Science & Business Media.

LIU, L. 2008. BEST: Bayesian estimation of species trees under the coalescent model. Bioinformatics, 24, 2542-2543.

LIU, L. & PEARL, D. K. 2007. Species trees from gene trees: reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions. Systematic biology, 56, 504-514.

LIU, L., YU, L., KUBATKO, L., PEARL, D. K. & EDWARDS, S. V. 2009. Coalescent methods for estimating phylogenetic trees. Molecular Phylogenetics and Evolution, 53, 320-328.

LLAMAS, B., BROTHERTON, P., MITCHELL, K. J., TEMPLETON, J. E., THOMSON, V. A., METCALF, J. L., ARMSTRONG, K. N., KASPER, M., RICHARDS, S. M. & CAMENS, A. B. 2015. Late Pleistocene Australian marsupial DNA clarifies the affinities of extinct megafaunal kangaroos and wallabies. Molecular biology and evolution, 32, 574-584.

LÖWER, R., LÖWER, J. & KURTH, R. 1996. The viruses in all of us: characteristics and biological significance of human endogenous retrovirus sequences. Proceedings of the National Academy of Sciences, 93, 5177-5184.

LUAN, D. D., KORMAN, M. H., JAKUBCZAK, J. L. & EICKBUSH, T. H. 1993. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell, 72, 595-605.

LUO, Z.-X., JI, Q., WIBLE, J. R. & YUAN, C.-X. 2003. An Early Cretaceous tribosphenic mammal and metatherian evolution. Science, 302, 1934-1940.

LUO, Z.-X., YUAN, C.-X., MENG, Q.-J. & JI, Q. 2011. A eutherian mammal and divergence of marsupials and placentals. Nature, 476, 442-445.

MACFADDEN, B. J. 1994. Fossil horses: systematics, paleobiology, and evolution of the family Equidae, Cambridge University Press.

MACFADDEN, B. J. 2000. Cenozoic mammalian herbivores from the Americas: reconstructing ancient diets and terrestrial communities. Annual Review of Ecology and Systematics, 31, 33-59.

MACKENNA, M. C., BELL, S. K. & SIMPSON, G. G. 1997. Classification of mammals: above the species level, Columbia University Press.

MACPHAIL, M. 1997. Late Neogene climates in Australia: fossil pollen-and spore-based estimates in retrospect and prospect. Australian Journal of Botany, 45, 425-464.

MADDISON, W. & MADDISON, D. 2010. Mesquite: a modular system for evolutionary analysis. 2011; Version 2.75. See mesquiteproject.org/mesquite/download/download.html.

MADDISON, W. P. 1997. Gene trees in species trees. Systematic biology, 46, 523-536.

MALIK, H. S. 2012. Retroviruses push the envelope for mammalian placentation. Proceedings of the National Academy of Sciences, 109, 2184-2185.

MANNING, J. 1989. Age‐advertisement and the evolution of the peacock's train. Journal of Evolutionary Biology, 2, 379-384.

MARSHALL, L., CASE, J. & WOODBURNE, M. 1990. Phylogenetic relationships of the families of marsupials. Current mammalogy, 433-505.

MARTIN, H. 1994. Australian Tertiary phytogeography: evidence from palynology. History of the Australian vegetation: Cretaceous to Recent, 104-142.

MARTIN, H. 1998. Tertiary climatic evolution and the development of aridity in Australia. Proceedings of the Linnean Society of New South Wales, 115-136.

MARTIN, H. 2006. Cenozoic climatic change and the development of the arid vegetation in Australia. Journal of Arid Environments, 66, 533-563.

MARTIN, R. 2005. Tree-kangaroos of Australia and New Guinea, CSIRO PUBLISHING.

MATERA, A. G., HELLMANN, U. & SCHMID, C. W. 1990. A transpositionally and transcriptionally competent Alu subfamily. Mol Cell Biol, 10, 5424-32.

MAY-COLLADO, L. J., KILPATRICK, C. W. & AGNARSSON, I. 2015. Mammals from ‘down under’: a multi-gene species-level phylogeny of marsupial mammals (Mammalia, Metatheria). PeerJ, 3, e805.

MCCARTHY, E. M. & MCDONALD, J. F. 2004. Long terminal repeat retrotransposons of Mus musculus. Genome biology, 5, R14-R14.

MCCLINTOCK, B. 1956. Controlling elements and the gene. Cold Spring Harbor Symposia on Quantitative Biology. Cold Spring Harbor Laboratory Press, 197-216.

MCCORMACK, J. E., FAIRCLOTH, B. C., CRAWFORD, N. G., GOWATY, P. A., BRUMFIELD, R. T. & GLENN, T. C. 2012. Ultraconserved elements are novel phylogenomic markers that resolve placental mammal phylogeny when combined with species-tree analysis. Genome research, 22, 746-754.

MCGOWRAN, B. 1986. Cainozoic oceanic and climatic events: The Indo-Pacific foraminiferal biostratigraphic record. Palaeogeography, Palaeoclimatology, Palaeoecology, 55, 247-265.

MCGOWRAN, B. & LI, Q. 1994. The Miocene oscillation in southern Australia. Records of the South Australian Museum, 27, 197-212.

MCGOWRAN, B., LI, Q., CANN, J., PADLEY, D., MCKIRDY, D. M. & SHAFIK, S. 1997. Biogeographic impact of the Leeuwin Current in southern Australia since the late middle Eocene. Palaeogeography, Palaeoclimatology, Palaeoecology, 136, 19-40.

MCLAIN, A. T., MEYER, T. J., FAULK, C., HERKE, S. W., OLDENBURG, J. M., BOURGEOIS, M. G., ABSHIRE, C. F., ROOS, C. & BATZER, M. A. 2012. An Alu-based phylogeny of lemurs (Infraorder: Lemuriformes). PLoS One, 7, e44035.

MCPHERSON, F. & CHENOWETH, P. 2012. Mammalian sexual dimorphism. Animal reproduction science, 131, 109-122.

MEGIRIAN, D. 1992. Interpretation of the Miocene Carl Creek Limestone, northwestern Queensland. The Beagle: Records of the Museums and Art Galleries of the Northern Territory, 9, 219.

MEGIRIAN, D., MURRAY, P., SCHWARTZ, L. & VON DER BORCH, C. 2004. Late Oligocene Kangaroo Well Local Fauna from the Ulta Limestone (new name), and climate of the Miocene oscillation across central Australia. Australian Journal of Earth Sciences, 51, 701-741.

MEREDITH, R. W., JANEČKA, J. E., GATESY, J., RYDER, O. A., FISHER, C. A., TEELING, E. C., GOODBLA, A., EIZIRIK, E., SIMÃO, T. L. L., STADLER, T., RABOSKY, D. L., HONEYCUTT, R. L., FLYNN, J. J., INGRAM, C. M., STEINER, C., WILLIAMS, T. L., ROBINSON, T. J., BURK-HERRICK, A., WESTERMAN, M., AYOUB, N. A., SPRINGER, M. S. & MURPHY, W. J. 2011. Impacts of the Cretaceous Terrestrial Revolution and KPg extinction on mammal diversification. Science, 334, 521-524.

MEREDITH, R. W., MENDOZA, M. A., ROBERTS, K. K., WESTERMAN, M. & SPRINGER, M. S. 2010. A phylogeny and timescale for the evolution of Pseudocheiridae (Marsupialia: Diprotodontia) in Australia and New Guinea. Journal of mammalian evolution, 17, 75-99.

MEREDITH, R. W., WESTERMAN, M., CASE, J. A. & SPRINGER, M. S. 2008a. A phylogeny and timescale for marsupial evolution based on sequences for five nuclear genes. Journal of Mammalian Evolution, 15, 1-36.

MEREDITH, R. W., WESTERMAN, M. & SPRINGER, M. S. 2008b. A phylogeny and timescale for the living genera of kangaroos and kin (Macropodiformes : Marsupialia) based on nuclear DNA sequences. Australian Journal of Zoology, 56, 395-410.

MEREDITH, R. W., WESTERMAN, M. & SPRINGER, M. S. 2009. A phylogeny of Diprotodontia (Marsupialia) based on sequences for five nuclear genes. Molecular Phylogenetics and Evolution, 51, 554-571.

MEYER, T. J., MCLAIN, A. T., OLDENBURG, J. M., FAULK, C., BOURGEOIS, M. G., CONLIN, E. M., MOOTNICK, A. R., DE JONG, P. J., ROOS, C. & CARBONE, L. 2012. An Alu-based phylogeny of gibbons (Hylobatidae). Molecular biology and evolution, 29, 3441-3450.

MIKKELSEN, T. S., WAKEFIELD, M. J., AKEN, B., AMEMIYA, C. T., CHANG, J. L., DUKE, S., GARBER, M., GENTLES, A. J., GOODSTADT, L. & HEGER, A. 2007. Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature, 447, 167-177.

MININ, V. N., BLOOMQUIST, E. W. & SUCHARD, M. A. 2008. Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics. Molecular Biology and Evolution, 25, 1459-1471.

MIRARAB, S., BAYZID, M. S. & WARNOW, T. 2014. Evaluating summary methods for multilocus species tree estimation in the presence of incomplete lineage sorting. Systematic Biology, 65, 366-380.

MIRARAB, S., REAZ, R., BAYZID, M. S., ZIMMERMANN, T., SWENSON, M. S. & WARNOW, T. 2014. ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics, 30, i541-i548.

MIRARAB, S. & WARNOW, T. 2015. ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics, 31, i44-i52.

MITCHELL, K. J., PRATT, R. C., WATSON, L. N., GIBB, G. C., LLAMAS, B., KASPER, M., EDSON, J., HOPWOOD, B., MALE, D. & ARMSTRONG, K. N. 2014. Molecular phylogeny, biogeography, and habitat preference evolution of marsupials. Molecular biology and evolution, msu176.

MOSBRUGGER, V., UTESCHER, T. & DILCHER, D. L. 2005. Cenozoic continental climatic evolution of Central Europe. Proceedings of the National Academy of Sciences of the United States of America, 102, 14964-14969.

MUNEMASA, M., NIKAIDO, M., NISHIHARA, H., DONNELLAN, S., AUSTIN, C. C. & OKADA, N. 2008. Newly discovered young CORE-SINEs in marsupial genomes. Gene, 407, 176-185.

MURCHISON, E. P., SCHULZ-TRIEGLAFF, O. B., NING, Z., ALEXANDROV, L. B., BAUER, M. J., FU, B., HIMS, M., DING, Z., IVAKHNO, S. & STEWART, C. 2012. Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer. Cell, 148, 780-791.

MURRAY, P. 1991. The sthenurine affinity of the late Miocene kangaroo, Hadronomas puckridgi Woodburne (Marsupialia, Macropodidae). Alcheringa, 15, 255-283.

MURRAY, P. 1997. Alcoota: a snapshot of the Australian late Miocene. Riversleigh Notes, 35, 2-7.

MUSCHICK, M., INDERMAUR, A. & SALZBURGER, W. 2012. Convergent evolution within an adaptive radiation of cichlid fishes. Curr Biol, 22, 2362-8.

MYERS, P., ESPINOSA, R., PARR, C., JONES, T., HAMMOND, G. & DEWEY, T. 2006. The animal diversity web. Accessed October, 12, 2.

NAKHLEH, L. 2013. Computational approaches to species phylogeny inference and gene tree reconciliation. Trends in ecology & evolution, 28, 719-728.

NEAVES, L., ZENGER, K., COOPER, D. & ELDRIDGE, M. 2010a. Molecular detection of hybridization between sympatric kangaroo species in south-eastern Australia. Heredity, 104, 502-512.

NEAVES, L. E., ZENGER, K., COOPER, D. W. & ELDRIDGE, M. 2010b. Molecular detection of hybridization between sympatric kangaroo species in south-eastern Australia. Heredity, 104, 502-512.

NIKAIDO, M., PISKUREK, O. & OKADA, N. 2007. Toothed whale monophyly reassessed by SINE insertion analysis: the absence of lineage sorting effects suggests a small population of a common ancestral species. Molecular phylogenetics and evolution, 43, 216-224.

NILSSON, M. A. 2006. Phylogenetic relationships of the Banded Hare wallaby (Lagostrophus fasciatus) and a map of the kangaroo mitochondrial control region. Zoologica Scripta, 35, 387-393.

NILSSON, M. A., CHURAKOV, G., SOMMER, M., VAN TRAN, N., ZEMANN, A., BROSIUS, J. & SCHMITZ, J. 2010a. Tracking Marsupial Evolution Using Archaic Genomic Retroposon Insertions. Plos Biology, 8.

NILSSON, M. A., CHURAKOV, G., SOMMER, M., VAN TRAN, N., ZEMANN, A., BROSIUS, J. & SCHMITZ, J. 2010b. Tracking marsupial evolution using archaic genomic retroposon insertions. PLoS Biol, 8, e1000436.

NISHIHARA, H., MARUYAMA, S. & OKADA, N. 2009. Retroposon analysis and recent geological data suggest near-simultaneous divergence of the three superorders of mammals. Proceedings of the National Academy of Sciences, 106, 5235-5240.

NISHIHARA, H., SATTA, Y., NIKAIDO, M., THEWISSEN, J. G., STANHOPE, M. J. & OKADA, N. 2005. A retroposon analysis of Afrotherian phylogeny. Mol Biol Evol, 22, 1823-33.

NISHIHARA, H., SMIT, A. F. A. & OKADA, N. 2006. Functional noncoding sequences derived from SINEs in the mammalian genome. Genome Research, 16, 864-874.

NIXON, K. C. & CARPENTER, J. M. 1996. On simultaneous analysis. Cladistics, 12, 221-241.

NOWAK, R. M. 1999. Walker's Mammals of the World, JHU Press.

O'LEARY, M. A., BLOCH, J. I., FLYNN, J. J., GAUDIN, T. J., GIALLOMBARDO, A., GIANNINI, N. P., GOLDBERG, S. L., KRAATZ, B. P., LUO, Z.-X. & MENG, J. 2013. The placental mammal ancestor and the post–K-Pg radiation of placentals. Science, 339, 662-667.

O'NEILL, R. J. W., O'NEILL, M. J. & GRAVES, J. A. M. 1998. Undermethylation associated with retroelement activation and chromosome remodelling in an interspecific mammalian hybrid. Nature, 393, 68-72.

OHSHIMA, K., HAMADA, M., TERAI, Y. & OKADA, N. 1996. The 3' ends of tRNA-derived short interspersed repetitive elements are derived from the 3' ends of long interspersed repetitive elements. Mol Cell Biol, 16, 3756-64.

OKADA, N. 1991. SINEs: Short interspersed repeated elements of the eukaryotic genome. Trends in Ecology and Evolution, 6, 358-361.

OKADA, N., SHEDLOCK, A. & NIKAIDO, M. 2004. Retroposon Mapping in Molecular Systematics. In: MILLER, W. & CAPY, P. (eds.) Mobile Genetic Elements. Humana Press.

OLIVER, J. C. & MONTEIRO, A. 2011. On the origins of sexual dimorphism in butterflies. Proceedings of the Royal Society of London B: Biological Sciences, 278, 1981-1988.

OLMSTEAD, R. G. & SWEERE, J. A. 1994. Combining data in phylogenetic systematics: an empirical approach using three molecular data sets in the Solanaceae. Systematic Biology, 43, 467-481.

ORIANS, G. H. 1969. On the evolution of mating systems in birds and mammals. The American Naturalist, 103, 589-603.

OWENS, I. 2006. Ecological explanations for interspecific variability in coloration. Bird coloration, 2, 380-416.

PAGE, R. 1998. GeneTree: comparing gene and species phylogenies using reconciled trees. Bioinformatics (Oxford, England), 14, 819-820.

PAMILO, P. & NEI, M. 1988. Relationships between gene trees and species trees. Molecular biology and evolution, 5, 568-583.

PATEL, M. R., EMERMAN, M. & MALIK, H. S. 2011. Paleovirology—ghosts and gifts of viruses past. Current opinion in virology, 1, 304-309.

PATEL, S., KIMBALL, R. T. & BRAUN, E. L. 2013. Error in phylogenetic estimation for bushes in the tree of life. Journal of Phylogenetics and Evolutionary Biology, 1, 110.

PENNY, D. & PHILLIPS, M. J. 2004. The rise of birds and mammals: are microevolutionary processes sufficient for macroevolution. Trends in Ecology & Evolution, 19, 516-522.

PHILLIPS, M. J. 2015a. Four mammal fossil calibrations: balancing competing palaeontological and molecular considerations. Palaeontologia Electronica, 18, 1-16.

PHILLIPS, M. J. 2015b. Geomolecular dating and the origin of placental mammals. Systematic biology, syv115.

PHILLIPS, M. J., BENNETT, T. H. & LEE, M. S. 2009. Molecules, morphology, and ecology indicate a recent, amphibious ancestry for echidnas. Proceedings of the National Academy of Sciences, 106, 17089-17094.

PHILLIPS, M. J., HAOUCHAR, D., PRATT, R. C., GIBB, G. C. & BUNCE, M. 2013. Inferring Kangaroo Phylogeny from Incongruent Nuclear and Mitochondrial Genes. Plos One, 8.

PHILLIPS, M. J. & PRATT, R. C. 2008. Family-level relationships among the Australasian marsupial “herbivores” (Diprotodontia: Koala, wombats, kangaroos and possums). Molecular phylogenetics and evolution, 46, 594-605.

PISANI, D., COTTON, J. A. & MCINERNEY, J. O. 2007. Supertrees disentangle the chimerical origin of eukaryotic genomes. Molecular Biology and Evolution, 24, 1752-1760.

PLATT II, R. N. & RAY, D. A. 2012. A non-LTR retroelement extinction in Spermophilus tridecemlineatus. Gene, 500, 47-53.

PLATT, R. N., ZHANG, Y., WITHERSPOON, D. J., XING, J., SUH, A., KEITH, M. S., JORDE, L. B., STEVENS, R. D. & RAY, D. A. 2015. Targeted capture of phylogenetically informative Ves SINE insertions in genus Myotis. Genome biology and evolution, 7, 1664-1675.

PLEDGE, N. S. 1992. The Curramulka local fauna: a new late Tertiary fossil assemblage from Yorke Peninsula, South Australia. The Beagle: Records of the Museums and Art Galleries of the Northern Territory, 9, 115.

POSADA, D. & CRANDALL, K. A. 1998. Modeltest: testing the model of DNA substitution. Bioinformatics, 14, 817-818.

POTTER, S., BRAGG, J. G., BLOM, M. P., DEAKIN, J. E., KIRKPATRICK, M., ELDRIDGE, M. D. & MORITZ, C. 2017. Chromosomal speciation in the genomics era: Disentangling phylogenetic evolution of rock-wallabies. Frontiers in genetics, 8, 10.

PRIDEAUX, G. J. 2004. Systematics and evolution of the sthenurine kangaroos, Univ of California Press.

PRIDEAUX, G. J., LONG, J. A., AYLIFFE, L. K., HELLSTROM, J. C., PILLANS, B., BOLES, W. E., HUTCHINSON, M. N., ROBERTS, R. G., CUPPER, M. L. & ARNOLD, L. J. 2007. An arid-adapted middle Pleistocene vertebrate fauna from south-central Australia. Nature, 445, 422.

PRIDEAUX, G. J. & TEDFORD, R. H. 2012. Tjukuru wellsi, gen. et sp. nov., a lagostrophine kangaroo (Diprotodontia, Macropodidae) from the Pliocene (Tirarian) of northern South Australia. Journal of Vertebrate Paleontology, 32, 717-721.

PRIDEAUX, G. J. & WARBURTON, N. 2009. Bohra nullarbora sp. nov., a second tree-kangaroo (Marsupialia: Macropodidae) from the Pleistocene of the Nullarbor Plain, Western Australia. Records of the Western Australian Museum, 25, 165-179.

PRIDEAUX, G. J. & WARBURTON, N. M. 2008. A new Pleistocene tree-kangaroo (Diprotodontia: Macropodidae) from the Nullarbor Plain of south-central Australia. Journal of Vertebrate Paleontology, 28, 463-478.

PRIDEAUX, G. J. & WARBURTON, N. M. 2010. An osteology-based appraisal of the phylogeny and evolution of kangaroos and wallabies (Macropodidae: Marsupialia). Zoological Journal of the Linnean Society, 159, 954-987.

PROCTER-GRAY, E. & GANSLOSSER, U. 1986. The individual behaviors of Lumholtz's tree-kangaroo: repertoire and taxonomic implications. Journal of Mammalogy, 67, 343-352.

RAGAN, M. A. 1992. Phylogenetic inference based on matrix representation of trees. Molecular phylogenetics and evolution, 1, 53-58.

RALLS, K. 1976. Mammals in which females are larger than males. The Quarterly Review of Biology, 51, 245-276.

RALLS, K. 1977. Sexual dimorphism in mammals: avian models and unanswered questions. The American Naturalist, 111, 917-938.

RAMBAUT, A. 2002. Se-Al v2.0a11: sequence alignment editor. Oxford, UK: University of Oxford, http://tree.bio.ed.ac.uk/software/seal.

RAMBAUT, A. & BROMHAM, L. 1998. Estimating divergence dates from molecular sequences. Molecular Biology and Evolution, 15, 442-448.

RAMBAUT, A., ROBERTSON, D. L., PYBUS, O. G., PEETERS, M. & HOLMES, E. C. 2001. Human immunodeficiency virus: phylogeny and the origin of HIV-1. Nature, 410, 1047-1048.

RANNALA, B. 2002. Identifiability of parameters in MCMC Bayesian inference of phylogeny. Syst Biol, 51, 754-60.

RAVEN, H. C. & GREGORY, W. K. 1946. Adaptive branching of the kangaroo family in relation to habitat, American Museum of Natural History.

RAY, D. A., XING, J., SALEM, A.-H. & BATZER, M. A. 2006. SINEs of a nearly perfect character. Systematic biology, 55, 928-935.

RENFREE, M. B., PAPENFUSS, A. T., DEAKIN, J. E., LINDSAY, J., HEIDER, T., BELOV, K., RENS, W., WATERS, P. D., PHARO, E. A. & SHAW, G. 2011. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of and development. Genome biology, 12, R81.

RENNE, P. R., DEINO, A. L., HILGEN, F. J., KUIPER, K. F., MARK, D. F., MITCHELL, W. S., MORGAN, L. E., MUNDIL, R. & SMIT, J. 2013. Time scales of critical events around the Cretaceous-Paleogene boundary. Science, 339, 684-687.

RENNE, P. R., SPRAIN, C. J., RICHARDS, M. A., SELF, S., VANDERKLUYSEN, L. & PANDE, K. 2015. State shift in Deccan volcanism at the Cretaceous-Paleogene boundary, possibly induced by impact. Science, 350, 76-78.

RENSCH, B. 1950. Die Abhängigkeit der relativen Sexualdifferenz von der Körpergrösse. Bonner Zoologische Beiträge, 1, 58-69.

RENSCH, B. & RENSCH, B. 1959. Evolution above the species level.

RETALLACK, G. J. 2001. Cenozoic expansion of grasslands and climatic cooling. The Journal of Geology, 109, 407-426.

RICHARDS, H., GRUETER, C. & MILNE, N. 2015. Strong arm tactics: sexual dimorphism in macropodid limb proportions. Journal of Zoology, 297, 123-131.

RIDE, W. 1957. Protemnodon parma (Waterhouse) and the classification of related wallabies (Protemnodon, Thylogale, and Setonix). Proceedings of the Zoological Society of London. Wiley Online Library, 327-346.

RINEHART, T., GRAHN, R. & WICHMAN, H. 2005. SINE extinction preceded LINE extinction in sigmodontine rodents: implications for retrotranspositional dynamics and mechanisms. Cytogenetic and genome research, 110, 416-425.

ROBINSON, A. C. & YOUNG, M. C. 1983. The toolache wallaby (Macropus greyi Waterhouse), National Parks and Wildlife Service, Dept. of Environment and Planning.

ROBINSON, T. J., RUIZ-HERRERA, A. & AVISE, J. C. 2008. Hemiplasy and homoplasy in the karyotypic phylogenies of mammals. Proceedings of the National Academy of Sciences, 105, 14477-14481.

ROHLF, F. J. & WOOTEN, M. C. 1988. Evaluation of the Restricted Maximum-Likelihood Method for Estimating Phylogenetic Trees Using Simulated Allele-Frequency Data. Evolution, 42, 581-595.

RONQUIST, F. & HUELSENBECK, J. P. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics, 19, 1572-1574.

ROSENBERG, N. A. 2002. The probability of topological concordance of gene trees and species trees. Theoretical population biology, 61, 225-247.

ROSENBERG, N. A. 2003. The shapes of neutral gene genealogies in two species: probabilities of monophyly, paraphyly, and polyphyly in a coalescent model. Evolution, 57, 1465-1477.

SAMBROOK, J. & RUSSELL, D. W. 1989. Molecular cloning: a laboratory manual. Vol. 3, Cold Spring Harbor Laboratory Press.

SANDERSON, M. J. & DOYLE, J. J. 1992. Reconstruction of organismal and gene phylogenies from data on multigene families: concerted evolution, homoplasy, and confidence. Systematic Biology, 41, 4-17.

SANDERSON, M. J., PURVIS, A. & HENZE, C. 1998. Phylogenetic supertrees: assembling the trees of life. Trends in Ecology & Evolution, 13, 105-109.

SANDERSON, M. J. & SHAFFER, H. B. 2002. Troubleshooting molecular phylogenetic analyses. Annual review of ecology and Systematics, 33, 49-72.

SANMIGUEL, P., TIKHONOV, A., JIN, Y.-K., MOTCHOULSKAIA, N., ZAKHAROV, D., MELAKE-BERHAN, A., SPRINGER, P. S., EDWARDS, K. J., LEE, M. & AVRAMOVA, Z. 1996. Nested retrotransposons in the intergenic regions of the maize genome. Science, 274, 765-768.

SANSON, G. 1989. Morphological adaptations of teeth to diets and feeding in the Macropodoidea. Kangaroos, wallabies and rat-kangaroos, 1, 151-168.

SAVALLI, U. M. 1995. The evolution of bird coloration and plumage elaboration. Current ornithology. Springer.

SCALLY, A., DUTHEIL, J. Y., HILLIER, L. W., JORDAN, G. E., GOODHEAD, I., HERRERO, J., HOBOLTH, A., LAPPALAINEN, T., MAILUND, T. & MARQUES-BONET, T. 2012. Insights into hominid evolution from the gorilla genome sequence. Nature, 483, 169-175.

SCHLUTER, D. 2000. The ecology of adaptive radiation, Oxford University Press.

SCHMID, C. & MARAIA, R. 1992. Transcriptional regulation and transpositional selection of active SINE sequences. Curr Opin Genet Dev, 2, 874-82.

SCHMID, C. W. 1996. Alu: structure, origin, evolution, significance, and function of one-tenth of human DNA. Progress in nucleic acid research and molecular biology, 53, 283-319.

SCORNAVACCA, C. & GALTIER, N. 2017. Incomplete lineage sorting in mammalian phylogenomics. Systematic biology, 66, 112-120.

SETCHELL, J. M. & JEAN WICKINGS, E. 2005. Dominance, status signals and coloration in male mandrills (Mandrillus sphinx). Ethology, 111, 25-50.

SHARMAN, G., CLOSE, R. & MAYNES, G. 1989. Chromosome evolution, phylogeny and speciation of rock wallabies (Petrogale, Macropodidae). Australian Journal of Zoology, 37, 351-363.

SHARMAN, G. & MAYNES, G. 1983. Rock-wallabies. In: STRAHAN, R. (ed.) Complete Book of Australian Mammals, pp. 207-212.

SHEDLOCK, A. M. & OKADA, N. 2000. SINE insertions: powerful tools for molecular systematics. Bioessays, 22, 148-160.

SHEDLOCK, A. M., TAKAHASHI, K. & OKADA, N. 2004. SINEs of speciation: tracking lineages with retroposons. Trends in Ecology & Evolution, 19, 545-553.

SHEN, M. R., BATZER, M. A. & DEININGER, P. L. 1991. Evolution of the master Alu gene(s). J Mol Evol, 33, 311-20.

SHIMAMURA, M., ABE, H., NIKAIDO, M., OHSHIMA, K. & OKADA, N. 1999. Genealogy of families of SINEs in cetaceans and artiodactyls: the presence of a huge superfamily of tRNA (Glu)-derived families of SINEs. Molecular biology and evolution, 16, 1046-1060.

SHIMAMURA, M., YASUE, H., OHSHIMA, K., ABE, H., KATO, H., KISHIRO, T., GOTO, M., MUNECHIKA, I. & OKADA, N. 1997. Molecular evidence from retroposons that whales form a clade within even-toed ungulates. Nature, 388, 666-670.

SHOEMAKER, J. S., PAINTER, I. S. & WEIR, B. S. 1999. Bayesian statistics in genetics - a guide for the uninitiated. Trends in Genetics, 15, 354-358.

SIDDALL, M. E. 1997. Prior agreement: arbitration or arbitrary? Systematic Biology, 46, 765-769.

SIMPSON, G. G. 1944. Tempo and mode in evolution, Columbia University Press.

SIMPSON, G. G. 1953. The major features of evolution, Columbia University Press: New York.

SIMPSON, G. G. 1955. Major features of Evolution, Columbia University Press: New York.

SMIT, A., HUBLEY, R. & GREEN, P. 1996. http://www.repeatmasker.org. RepeatMasker Open, 3, 1996-2004.

SMITH, G. R. 1992. Introgression in fishes: significance for paleontology, cladistics, and evolutionary rates. Systematic Biology, 41, 41-57.

SMITH, J. M. & HAIGH, J. 1974. The hitch-hiking effect of a favourable gene. Genetical research, 23, 23-35.

SMITH, M. J., HAYMAN, D. & HOPE, R. 1979. Observations on the chromosomes and reproductive systems of four macropodine interspecific hybrids (Marsupialia: Macropodidae). Australian Journal of Zoology, 27, 959-972.

SONG, S., LIU, L., EDWARDS, S. V. & WU, S. 2012. Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proceedings of the National Academy of Sciences, 109, 14942-14947.

SOUSA, V. & HEY, J. 2013. Understanding the origin of species with genome-scale data: modelling gene flow. Nature Reviews Genetics, 14, 404-414.

SPRINGER, M. S. & KIRSCH, J. A. 1991. DNA hybridization, the compression effect, and the radiation of diprotodontian marsupials. Systematic Biology, 40, 131-151.

STAMATAKIS, A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22, 2688-2690.

STEEL, M. 2016. Phylogeny: Discrete and random processes in evolution, SIAM.

STILLER, M., KNAPP, M., STENZEL, U., HOFREITER, M. & MEYER, M. 2009. Direct multiplex sequencing (DMPS)-a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA. Genome Research, 19, 1843-1848.

STOYE, J. P. 2012. Studies of endogenous retroviruses reveal a continuing evolutionary saga. Nature reviews Microbiology, 10, 395-406.

STRAHAN, R. 1995. , Smithsonian Inst Pr.

STRÖMBERG, C. A. 2006. Evolution of hypsodonty in equids: testing a hypothesis of adaptation. Paleobiology, 32, 236-258.

SUH, A., PAUS, M., KIEFMANN, M., CHURAKOV, G., FRANKE, F. A., BROSIUS, J., KRIEGS, J. O. & SCHMITZ, J. 2011. Mesozoic retroposons reveal parrots as the closest living relatives of passerine birds. Nature Communications, 2, 443.

SUH, A., SMEDS, L. & ELLEGREN, H. 2015. The dynamics of incomplete lineage sorting across the ancient adaptive radiation of neoavian birds. PLoS Biol, 13, e1002224.

SUZUKI, Y., GLAZKO, G. V. & NEI, M. 2002. Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proceedings of the National Academy of Sciences, 99, 16138-16143.

SWERGOLD, G. D. 1990. Identification, characterization, and cell specificity of a human LINE-1 promoter. Molecular and cellular biology, 10, 6718-6729.

SZALAY, F. S. 2006. Evolutionary history of the marsupials and an analysis of osteological characters, Cambridge University Press.

TAKAHASHI, K., TERAI, Y., NISHIDA, M. & OKADA, N. 2001. Phylogenetic Relationships and Ancient Incomplete Lineage Sorting Among Cichlid Fishes in Lake Tanganyika as Revealed by Analysis of the Insertion of Retroposons. Molecular Biology and Evolution, 18, 2057-2066.

TAKAHATA, N. 1989. Gene genealogy in three related populations: consistency probability between gene and population trees. Genetics, 122, 957-966.

TAKEZAKI, N., RZHETSKY, A. & NEI, M. 1995. Phylogenetic test of the molecular clock and linearized trees. Molecular Biology and Evolution, 12, 823-833.

TAMURA, K., STECHER, G., PETERSON, D., FILIPSKI, A. & KUMAR, S. 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Molecular biology and evolution, mst197.

TARLINTON, R., MEERS, J., HANGER, J. & YOUNG, P. 2005. Real-time reverse transcriptase PCR for the endogenous koala retrovirus reveals an association between plasma viral load and neoplastic disease in koalas. Journal of general virology, 86, 783-787.

TARLINTON, R. E., MEERS, J. & YOUNG, P. R. 2006. Retroviral invasion of the koala genome. Nature, 442, 79-81.

TARVER, J. E., DOS REIS, M., MIRARAB, S., MORAN, R. J., PARKER, S., O’REILLY, J. E., KING, B. L., O’CONNELL, M. J., ASHER, R. J. & WARNOW, T. 2016. The interrelationships of placental mammals and the limits of phylogenetic inference. Genome biology and evolution, 8, 330-344.

TATE, G. H. H., RAVEN, H. C. & NEUHÄUSER, G. 1948. Studies on the anatomy and phylogeny of the Macropodidae (Marsupialia), American Museum of Natural History.

TEBBICH, S., STERELNY, K. & TESCHKE, I. 2010. The tale of the finch: adaptive radiation and behavioural flexibility. Philosophical Transactions of the Royal Society B: Biological Sciences, 365, 1099-1109.

TEDFORD, R., WELLS, R. & BARGHOORN, S. 1992. Tirari Formation and contained faunas, Pliocene of the Lake Eyre Basin, South Australia. Beagle: Records of the Museums and Art Galleries of the Northern Territory, The, 9, 173.

TELFER, W. R. & CALABY, J. H. 2008. Black Wallaroo, Macropus bernardus. In: VAN DYCK, S. & STRAHAN, R. (eds.) The mammals of Australia. Third ed. Sydney, Australia: Reed New Holland.

THOMAS, O. 1888. Catalogue of the Marsupialia and Monotremata in the collection of the British Museum (Natural History), order of the Trustees.

TONINI, J., MOORE, A., STERN, D., SHCHEGLOVITOVA, M. & ORTÍ, G. 2015. Concatenation and species tree methods exhibit statistically indistinguishable accuracy under a range of simulated conditions. PLoS currents, 7.

TRAVOUILLON, K., LEGENDRE, S., ARCHER, M. & HAND, S. 2009. Palaeoecological analyses of Riversleigh's Oligo-Miocene sites: implications for Oligo-Miocene climate change in Australia. Palaeogeography, Palaeoclimatology, Palaeoecology, 276, 24-37.

TREANGEN, T. J. & SALZBERG, S. L. 2012. Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nature reviews. Genetics, 13, 36.

TRIVERS, R. 1972. Parental investment and sexual selection, Biological Laboratories, Harvard University Cambridge, MA.

TUFFLEY, C. & STEEL, M. 1998. Modeling the covarion hypothesis of nucleotide substitution. Mathematical biosciences, 147, 63-91.

TYNDALE-BISCOE, H. 2005. Life of marsupials.

ULLU, E. & TSCHUDI, C. 1984. Alu sequences are processed 7SL RNA genes. Nature, 312, 171-172.

UTESCHER, T., MOSBRUGGER, V. & ASHRAF, A. R. 2000. Terrestrial climate evolution in northwest Germany over the last 25 million years. Palaios, 15, 430-449.

VAN DAM, J. A. 2006. Geographic and temporal patterns in the late Neogene (12–3 Ma) aridification of Europe: the use of small mammals as paleoprecipitation proxies. Palaeogeography, Palaeoclimatology, Palaeoecology, 238, 190-218.

VAN DE LAGEMAAT, L. N., GAGNIER, L., MEDSTRAND, P. & MAGER, D. L. 2005. Genomic deletions and precise removal of transposable elements mediated by short identical DNA segments in primates. Genome Research, 15, 1243-1249.

VAN DYCK, S. & STRAHAN, R. 2008. The mammals of Australia, New Holland Pub Pty Limited.

VAN GELDER, R. G. 1977. Mammalian hybrids and generic limits. American Museum novitates; no. 2635.

VERNER, J. & WILLSON, M. F. 1966. The influence of habitats on mating systems of North American passerine birds. Ecology, 47, 143-147.

VERNES, K. 2016. Handbook of Mammals of the World, Vol. 5: Monotremes and Marsupials. Oxford University Press US

VOORDECKERS, K., BROWN, C. A., VANNESTE, K., VAN DER ZANDE, E., VOET, A., MAERE, S. & VERSTREPEN, K. J. 2012. Reconstruction of ancestral metabolic enzymes reveals molecular mechanisms underlying evolutionary innovation through gene duplication. PLoS Biol, 10, e1001446.

VRBA, E. S. 1995. Paleoclimate and evolution, with emphasis on human origins, Yale University Press.

WADDELL, P. J., KISHINO, H. & OTA, R. 2001. A phylogenetic foundation for comparative mammalian genomics. Genome Informatics, 12, 141-154.

WALLACE, A. R. 2007. Darwinism: an exposition of the theory of natural selection with some of its applications, Cosimo, Inc.

WANN, J. & BELL, D. 1997. Dietary preferences of the black-gloved wallaby (Macropus irma) and the western grey kangaroo (M. fuliginosus) in Whiteman Park, Perth, Western Australia. Journal of the Royal Society of Western Australia, 80, 55-62.

WARBURTON, N. M., BATEMAN, P. W. & FLEMING, P. A. 2013. Sexual selection on muscles of western grey kangaroos (Skippy was clearly a female). Biological Journal of the Linnean Society, 109, 923-931.

WEINER, A. M., DEININGER, P. L. & EFSTRATIADIS, A. 1986. Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information. Annual review of biochemistry, 55, 631-661.

WEST, P. M. & PACKER, C. 2002. Sexual selection, temperature, and the lion's mane. Science, 297, 1339-1343.

WESTERMAN, M., BURK, A., AMRINE-MADSEN, H. M., PRIDEAUX, G. J., CASE, J. A. & SPRINGER, M. S. 2002. Molecular evidence for the last survivor of an ancient kangaroo lineage. Journal of Mammalian Evolution, 9, 209-223.

WHITE, M. E. 1997. Listen, Our Land Is Crying: Australia's Environment: Problems And Solutions, Rosenberg Pub Pty Limited.

WHITE, W. T. J. & HOLLAND, B. R. 2011. Faster exact maximum parsimony search with XMP. Bioinformatics, 27, 1359-1367.

WICKER, T., SABOT, F., HUA-VAN, A., BENNETZEN, J. L., CAPY, P., CHALHOUB, B., FLAVELL, A., LEROY, P., MORGANTE, M. & PANAUD, O. 2007a. A unified classification system for eukaryotic transposable elements. Nature Reviews Genetics, 8, 973-982.

WICKER, T., SABOT, F., HUA-VAN, A., BENNETZEN, J. L., CAPY, P., CHALHOUB, B., FLAVELL, A., LEROY, P., MORGANTE, M., PANAUD, O., PAUX, E., SANMIGUEL, P. & SCHULMAN, A. H. 2007b. A unified classification system for eukaryotic transposable elements. Nat Rev Genet, 8, 973-982.

WILLIS, K. & MCELWAIN, J. 2014. The evolution of plants, Oxford University Press.

WILSON, D. E. & REEDER, D. M. 2005. Mammal species of the world : a taxonomic and geographic reference, Baltimore, Md. ; [London], Johns Hopkins University Press.

WILSON, I. J., WEALE, M. E. & BALDING, D. J. 2003. Inferences from DNA data: population histories, evolutionary processes and forensic match probabilities. Journal of the Royal Statistical Society Series a-Statistics in Society, 166, 155-188.

WINDSOR, D. & DAGG, A. 1971. The gaits of the Macropodinae (Marsupialia). Journal of Zoology, 163, 165-175.

WOINARSKI, J. 2016. Macropus bernardus. The IUCN Red List of Threatened Species 2016: e.T12620A21954187. http://dx.doi.org/10.2305/IUCN.UK.2016-2.RLTS.T12620A21954187.en. [Online]. [Accessed].

WOODBURNE, M. O. 1967. The Alcoota Fauna, central Australia: an integrated palaeontological and geological study, Bureau of Mineral Resources, Geology and Geophysics.

WOODBURNE, M. O., MACFADDEN, B. J., CASE, J. A., SPRINGER, M. S., PLEDGE, N. S., POWER, J. D., WOODBURNE, J. M. & SPRINGER, K. B. 1993. Land mammal biostratigraphy and magnetostratigraphy of the Etadunna Formation (late Oligocene) of South Australia. Journal of Vertebrate Paleontology, 13, 483-515.

WOODHEAD, J., HAND, S. J., ARCHER, M., GRAHAM, I., SNIDERMAN, K., ARENA, D. A., BLACK, K. H., GODTHELP, H., CREASER, P. & PRICE, E. 2016. Developing a radiometrically-dated chronologic sequence for Neogene biotic change in Australia, from the Riversleigh World Heritage Area of Queensland. Gondwana Research, 29, 153-167.

WRIGHT, S. 1968. Evolution and the genetics of populations. Vol. 1. Genetic and biometric foundations.

XI, Z., LIU, L. & DAVIS, C. C. 2015. Genes with minimal phylogenetic information are problematic for coalescent analyses when gene tree estimation is biased. Molecular phylogenetics and evolution, 92, 63-71.

YANG, J. & WARNOW, T. 2011. Fast and accurate methods for phylogenomic analyses. BMC bioinformatics, 12, S4.

YANG, Z. 2006. Computational molecular evolution, Oxford University Press.

YANG, Z. H. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Computer Applications in the Biosciences, 13, 555-556.

YODER, A. D. & YANG, Z. 2000. Estimation of primate speciation dates using local molecular clocks. Molecular Biology and Evolution, 17, 1081-1090.

ZACHOS, J., PAGANI, M., SLOAN, L., THOMAS, E. & BILLUPS, K. 2001. Trends, rhythms, and aberrations in global climate 65 Ma to present. Science, 292, 686-693.

ZACHOS, F. E. 2015. Wilson, DE; Mittermeier, RA (chief editors): Handbook of the Mammals of the World. Vol. 5. Monotremes and Marsupials, Lynx Edicions, Barcelona (2015). 800 pp., 44 colour plates, 717 colour photographs, 375 distribution maps, Hardback,€ 160, ISBN: 978-84-96553-99-6. Elsevier

ZEMANN, A., CHURAKOV, G., DONELLAN, S., GRÜTZNER, F., FANGQING, Z., BROSIUS, J. & SCHMITZ, J. 2013. Ancestry of the Australian termitivorous numbat. Molecular biology and evolution, mst032.

ZHUO, X. & FESCHOTTE, C. 2015. Cross-species transmission and differential fate of an endogenous retrovirus in three mammal lineages. PLoS Pathog, 11, e1005279.

ZUCKERKANDL, E. & PAULING, L. 1965. Molecules as documents of evolutionary history. Journal of theoretical biology, 8, 357-366.

Appendices

Retrotransposon nexus matrix (Chapter 2):

#NEXUS

Begin data;
Dimensions ntax=16 nchar=29;
Format datatype=standard symbols="01" missing=? gap=-;
Matrix
[              1          11         21        ]
M_eugenii      1111111111 1111111111 111111111
M_agilis       111??0?1?1 1?11?1?110 10111111-
M_Parma        1?1??0???1 ???1?1?1?1 10111????
M_rufog        1-1??-?1?0 1??1?1?110 10111?1?1
M_irma         111??0?1?0 1??1?0?110 11111?111
W_bicolor      111?111010 1110111111 011111010
M_rufus        0-0??0?-?0 -?1?????0? 0?0???0??
M_robustus     010??01110 11101010-0 000001011
M_giganteus    000??01110 01-0101000 101001010
M_fuliginousus 000??0?110 0?0??0???? 1?1??1???
Lagor_con      ?????????? ?????????? ??0??????
Lagor_hir      000110000? -0001?10-0 ?-00?0?00
Onycho_ung     ???11?000? 01??1?1??? 0????0?1?
Thylogale_the  ?0?11?-?0? ?0??1?1??? 0?0??0?1?
Lagostr_fasc   ?0?00?-?-? ?0??0?0??? ?????0???
Potorous_tri   ?0?00?-?0? ?0??0?-??? ?????????
;
end;

begin paup;
ctype dollo.dn: all;
out Potorous_tri;
constraints knownblind = (1,2,3,4,5,6,(7,8),(9,10),(11,12),13,14,15,16);
Hsearch addseq=rand nreps=20 constraints=knownblind enforce=yes;
contree;
end;
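The declared dimensions of the matrix above (ntax=16, nchar=29) and its symbol set can be checked before the file is passed to PAUP. The following Python sketch is one possible way to do this and is not part of the original analysis; the names `MATRIX`, `parse_matrix_rows` and `check_dimensions` are hypothetical, and the rows are copied from the matrix above.

```python
# Illustrative check only: names below are assumptions, not part of the thesis pipeline.
MATRIX = """
M_eugenii 1111111111 1111111111 111111111
M_agilis 111??0?1?1 1?11?1?110 10111111-
M_Parma 1?1??0???1 ???1?1?1?1 10111????
M_rufog 1-1??-?1?0 1??1?1?110 10111?1?1
M_irma 111??0?1?0 1??1?0?110 11111?111
W_bicolor 111?111010 1110111111 011111010
M_rufus 0-0??0?-?0 -?1?????0? 0?0???0??
M_robustus 010??01110 11101010-0 000001011
M_giganteus 000??01110 01-0101000 101001010
M_fuliginousus 000??0?110 0?0??0???? 1?1??1???
Lagor_con ?????????? ?????????? ??0??????
Lagor_hir 000110000? -0001?10-0 ?-00?0?00
Onycho_ung ???11?000? 01??1?1??? 0????0?1?
Thylogale_the ?0?11?-?0? ?0??1?1??? 0?0??0?1?
Lagostr_fasc ?0?00?-?-? ?0??0?0??? ?????0???
Potorous_tri ?0?00?-?0? ?0??0?-??? ?????????
"""


def parse_matrix_rows(text):
    """Map each taxon label to its character string, joining the 10-column blocks."""
    rows = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        rows[parts[0]] = "".join(parts[1:])
    return rows


def check_dimensions(rows, ntax, nchar, symbols="01?-"):
    """Return a list of human-readable problems; an empty list means the matrix is consistent."""
    problems = []
    if len(rows) != ntax:
        problems.append(f"expected {ntax} taxa, found {len(rows)}")
    for taxon, chars in rows.items():
        if len(chars) != nchar:
            problems.append(f"{taxon}: expected {nchar} characters, found {len(chars)}")
        unexpected = set(chars) - set(symbols)
        if unexpected:
            problems.append(f"{taxon}: unexpected symbols {sorted(unexpected)}")
    return problems


rows = parse_matrix_rows(MATRIX)
problems = check_dimensions(rows, ntax=16, nchar=29)
print(problems if problems else "matrix matches ntax=16, nchar=29")
```

An empty problem list indicates that every taxon has 29 characters drawn from the states 0 and 1, the missing symbol ? and the gap symbol -.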

Table S1: Provenance of macropod DNA samples (Chapter 2).

Sample groups: samples at BiK-F (Biodiversity and Climate Research Centre, Frankfurt); samples at Queensland University of Technology.

Organism: Wallabia bicolor (swamp wallaby) (×3); Macropus rufus (red kangaroo); Macropus fuliginosus (western grey kangaroo) (×2); Didelphis virginiana (Virginia opossum); Monodelphis domestica (grey short-tailed opossum) (×3); Sarcophilus harrisii (Tasmanian devil); Vombatus ursinus (common wombat); Perameles gunnii (eastern barred bandicoot); Pseudocheirus peregrinus (ringtail possum); Trichosurus vulpecula (brushtail possum); Macropus giganteus (eastern grey kangaroo); Macropus robustus (common wallaroo); Macropus rufus (red kangaroo) (×4); Macropus agilis (agile wallaby) (×2); Isoodon macrourus (northern brown bandicoot) - adult; Macropus rufogriseus (red-necked wallaby) - juvenile (×2); Wallabia bicolor (swamp wallaby) - adult (×2); Wallabia bicolor (swamp wallaby); Phascolarctos cinereus (koala); Thylogale thetis (red-necked pademelon) - juvenile

Sex: female; male; ?; male; male; male; male; ?; female; ?

Sample type: liver (479004); liver; blood; blood; blood (0.1); blood (1.0); whole animal; liver; thigh muscle; liver; thigh muscle; muscle; liver; whole animal

Cause of death: roadkill; hit by car then euthanased; hit by car then euthanased; roadkill; roadkill; roadkill; ?; ?

Provenance: Antwerp?; Hannover zoo (×2); Basel zoo Qualita'; Basel zoo Griffin'; Munich zoo; Fursden Road, Cannon Hill; Bidya Court, Mudgeeraba, QLD (×2); Somerset Drive, Mudgeeraba, QLD (×2); Morton Park Rd, NSW, -34° 43' 48.87", +150° 10'; Toogoolawah QLD; Currumbin Valley (Tomewin Mountain Road) QLD

Collection date: 2011; 22/12/2013 (×4); 19/12/2013 (×2); ?; 11/03/2011; 18/07/2011

Collected by: Matt Phillips; Wildcare Australia (×4); Wendy Forbes; Australia Zoo?; Wildcare Australia

Sample location: South Am. marsupial box, -80 freezer (BiK-F) (×4); Australian marsupial box, -80 freezer (BiK-F) (×19); Thermi -80 freezer; Sanyo -80 freezer (×7)

Table S2: Source data for sequences obtained from Genbank for the supermatrix analysis (Chapter 3).

Gene Species Genbank Accession

Complete mt Dendrolagus lum/ben (ave) KJ868111

Complete mt Setonix brachyurus KJ868156

Complete mt Petrogale (ave) KJ868141

Complete mt Lagorchestes conspicillatus KJ868117

Complete mt Lagorchestes hirsutus NC008136

Complete mt Wallabia bicolor KJ868164

Complete mt Macropus eugenii KJ868119

Complete mt Macropus fuliginosus KJ868120

Complete mt Macropus robustus Y10524

Complete mt Macropus rufogriseus KJ868121

Complete mt Macropus giganteus LK995454

ApoB Dendrolagus lum/ben (ave) FJ603146

ApoB Lagorchestes conspicillatus FJ603148

ApoB Lagorchestes hirsutus FJ603142

ApoB Macropus agilis FJ603150

ApoB Macropus antilopinus FJ603151

ApoB Macropus eugenii FJ603152

ApoB Macropus fuliginosus FJ603143

ApoB Macropus giganteus FJ603153

ApoB Macropus rufus FJ603117

ApoB Macropus irma FJ603154

ApoB Macropus parma FJ603155

ApoB Macropus parryi FJ603156

ApoB Macropus robustus FJ603157

ApoB Macropus rufogriseus FJ603158

ApoB Petrogale (ave) FJ603162

ApoB Setonix brachyurus FJ603165

ApoB Thylogale stig/thet/bil (ave) FJ603166

ApoB Wallabia bicolor FJ603167

BRCA1 Dendrolagus lum/ben (ave) FJ603170

BRCA1 Lagorchestes conspicillatus FJ603174

BRCA1 Lagorchestes hirsutus FJ603175

BRCA1 Macropus agilis FJ603177

BRCA1 Macropus antilopinus FJ603178

BRCA1 Macropus eugenii FJ603179

BRCA1 Macropus fuliginosus FJ603180

BRCA1 Macropus giganteus FJ603181

BRCA1 Macropus rufus FJ603121

BRCA1 Macropus irma FJ603182

BRCA1 Macropus parma FJ603183

BRCA1 Macropus parryi FJ603184

BRCA1 Macropus robustus FJ603185

BRCA1 Macropus rufogriseus FJ603186

BRCA1 Petrogale (ave) FJ603190

BRCA1 Setonix brachyurus FJ603193

BRCA1 Thylogale stig/thet/bil (ave) FJ603194

BRCA1 Wallabia bicolor FJ603195

IRBP Dendrolagus lum/ben (ave) FJ603198

IRBP Lagorchestes conspicillatus FJ603201

IRBP Lagorchestes hirsutus FJ603202

IRBP Macropus agilis FJ603204

IRBP Macropus antilopinus FJ603205

IRBP Macropus eugenii FJ603206

IRBP Macropus fuliginosus FJ603207

IRBP Macropus giganteus AJ429135

IRBP Macropus rufus FJ603127

IRBP Macropus irma FJ603208

IRBP Macropus parma FJ603209

IRBP Macropus parryi FJ603210

IRBP Macropus robustus FJ603211

IRBP Macropus rufogriseus FJ603212

IRBP Petrogale (ave) FJ603216

IRBP Setonix brachyurus FJ603219

IRBP Thylogale stig/thet/bil (ave) FJ603220

IRBP Wallabia bicolor FJ603221

Rag1 Dendrolagus lum/ben (ave) FJ603222

Rag1 Lagorchestes conspicillatus FJ603228

Rag1 Lagorchestes hirsutus FJ603229

Rag1 Macropus agilis FJ603231

Rag1 Macropus antilopinus FJ603232

Rag1 Macropus eugenii FJ603233

Rag1 Macropus fuliginosus FJ603234

Rag1 Macropus giganteus FJ603235

Rag1 Macropus rufus FJ607154

Rag1 Macropus irma FJ603236

Rag1 Macropus parma FJ603237

Rag1 Macropus parryi FJ603238

Rag1 Macropus robustus FJ603238

Rag1 Macropus rufogriseus FJ603239

Rag1 Petrogale (ave) FJ603244

Rag1 Setonix brachyurus FJ603247

Rag1 Thylogale stig/thet/bil (ave) FJ603248

Rag1 Wallabia bicolor FJ603249

vWF Dendrolagus lum/ben (ave) FJ603252

vWF Lagorchestes conspicillatus FJ603256

vWF Lagorchestes hirsutus FJ603257

vWF Macropus agilis FJ603259

vWF Macropus antilopinus FJ603260

vWF Macropus eugenii FJ603261

vWF Macropus fuliginosus FJ603262

vWF Macropus giganteus AJ224670

vWF Macropus rufus FJ603138

vWF Macropus irma FJ603263

vWF Macropus parma FJ603264

vWF Macropus parryi FJ603265

vWF Macropus robustus FJ603266

vWF Macropus rufogriseus FJ603267

vWF Petrogale (ave) FJ603271

vWF Setonix brachyurus FJ603274

vWF Thylogale stig/thet/bil (ave) FJ603275

vWF Wallabia bicolor FJ603276

TRSP Macropus agilis EF368051

TRSP Macropus antilopinus EF368042

TRSP Macropus eugenii EF368049

TRSP Macropus giganteus EF368045

TRSP Macropus rufus EF368041

TRSP Macropus parma EF368047

TRSP Macropus robustus EF368044

TRSP Macropus rufogriseus EF368048

TRSP Petrogale (ave) EF368037

TRSP Thylogale stig/thet/bil (ave) EF368036

TRSP Wallabia bicolor EF368039

Table S3. Coding for ancestral habitat reconstruction (Chapter 4) based on the five nuclear genes (Rag1, BRCA1, vWF, IRBP, ApoB) data matrix of Meredith et al. (2008) for 32 macropods, six outgroup diprotodontians and Dromiciops

Species                        Habitat coding
Dromiciops_gliroides           0
Hypsyprymnodon_moschatus       0
Aepyprymnus_rufescens          1
Bettongia_gaimardi             1
Bettongia_penicillata          1
Potorous_longipes              0
Potorous_tridactylus           1
Dendrolagus_dorianus           0
Dendrolagus_goodfellowi        0
Dorcopsis_veterum              0
Dorcopsulus_vanheurni          0
Lagorchestes_conspicillata     1
Lagorchestes_hirsutus          2
Lagostrophus_fasciatus         1
Macropus_agilis                1
Macropus_antilopinus           2
Macropus_eugenii               1
Macropus_fulginosis            2
Macropus_giganteus             2
Macropus_irma                  1
Macropus_parma                 1
Macropus_parryi                1
Macropus_robustus              2
Macropus_rufogriseus           1
Macropus_rufus                 2
Onychogalea_fraenata           2
Onychogalea_unguifera          2
Peradorcas_concinna            1,2
Petrogale_xanthropus           1,2
Setonix_brachyurus             1
Thylogale_stigmatica           0
Wallabia_bicolor               1
Cercartetus_nanus              1
Phalanger_orientalis           0
Petaurus_breviceps             0
Pseudocheirus_Pseudochirops    0
Phascolarctos_cinereus         1
Vombatus_ursinus               1,2

Table S4. Parsimony informative characters for each of the newly sequenced nuclear genes (Chapter 5).

Gene abbreviation   Gene name                               Total characters   Parsimony informative characters   % informative characters relative to total
DMP1                Dentin Matrix Acidic Phosphoprotein 1   1362               210                                15.42%
TYR1                Tyrosinase                              429                44                                 10.26%
BCHE                butyrylcholinesterase                   996                93                                 9.34%
ENAM                enamelin                                3900               406                                10.41%
A2AB                adrenoceptor alpha 2B                   849                49                                 5.77%
ADORA3A             adenosine A3 receptor                   333                44                                 13.21%
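The percentage column of Table S4 is the number of parsimony informative characters divided by the total number of characters for each gene. A minimal Python sketch of that arithmetic is given below, assuming rounding to two decimal places; the dictionary `GENES` and the function `percent_informative` are hypothetical names used only for illustration, and the counts are taken from the table.

```python
# Illustrative arithmetic only: names are assumptions; counts are copied from Table S4.
GENES = {
    # abbreviation: (total characters, parsimony informative characters)
    "DMP1": (1362, 210),
    "TYR1": (429, 44),
    "BCHE": (996, 93),
    "ENAM": (3900, 406),
    "A2AB": (849, 49),
    "ADORA3A": (333, 44),
}


def percent_informative(total, informative):
    """Percentage of characters that are parsimony informative, rounded to 2 decimals."""
    if total <= 0:
        raise ValueError("total characters must be positive")
    return round(100.0 * informative / total, 2)


for gene, (total, informative) in GENES.items():
    print(f"{gene}: {percent_informative(total, informative):.2f}%")
```

Recomputing the column in this way reproduces the six percentages listed in the table.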

Figure S1: Time-calibrated Bayesian phylogeny (BEAST v1.8.1) of kangaroos based on the ‘supermatrix’ dataset (mitochondrial genomes and five nuclear genes). Confidence intervals for the time calibration are shown at each node.

Figure S2: Super Network generated in SplitsTree4 (Huson and Bryant, 2006) showing the conflict observed between Figures 18 – 23. The majority of the network conflict coincides with the ‘anomaly zone’ (the base of Macropus to the base of M. (Notamacropus)) shown in Figure 29.

Figure S3: Super Network generated in SplitsTree4 (Huson and Bryant, 2006) showing the lack of conflict observed between Figures 35 and 36.
