
Species Identification in Wildlife Crime Investigations using Diprotodontia

By

Linzi M. A. Wilson-Wilde

BSc La Trobe University 1995

Grad Dip App Sc La Trobe University 1996

A thesis in fulfilment of the requirements for the degree of

Doctor of Philosophy (Applied Science)

Institute for Applied Ecology

University of Canberra

December 2010

Certificate of Authorship of Thesis

Except where clearly acknowledged in footnotes, quotations and the bibliography, I certify that I am the sole author of the thesis submitted today entitled – Species Identification in Wildlife Crime Investigations using Diprotodontia.

I further certify that to the best of my knowledge, the thesis contains no material previously published or written by another person except where due reference is made in the text of the thesis.

The material in the thesis has not been the basis of an award or any other degree or diploma except where due reference is made in the text of the thesis.

This thesis complies with University requirements for a thesis as set out in http://www.canberra.edu.au/secretariat/goldbook/forms/thesisrqmt.pdf.

Linzi Wilson-Wilde

Stephen Sarre December 2010

Copyright

Under Section 35 of the Copyright Act of 1968, the author of this thesis is the owner of any copyright subsisting in the work, even though it is unpublished.

Under Section 31(1)(a)(i), copyright includes the exclusive right to ‘reproduce the work in a material form’. Thus, copyright is infringed by a person who, not being the owner of the copyright, reproduces or authorises the reproduction of the work, or of more than a reasonable part of the work, in a material form, unless the reproduction is a ‘fair dealing’ with the work ‘for the purpose of research or study’ as further defined in Sections 40 and 41 of the Act.

This thesis entitled ‘Species Identification in Wildlife Crime Investigations using Diprotodontia’ must therefore be copied or used only under normal conditions of scholarly fair dealing for the purposes of research, criticism or review, as outlined in the provisions of the Copyright Act 1968. In particular, no results or conclusions should be extracted from it, nor should it be copied or closely paraphrased in whole or in part without consent of the author. Proper written acknowledgment should be made for any assistance obtained from this thesis.

Copies of the thesis may be made by a library on behalf of another person provided the officer in charge of the library is satisfied that the copy is being made for the purposes of research or study.

Linzi Wilson-Wilde December 2010

Statement of Contributions

As this thesis is written as a series of chapters prepared for publication in peer-reviewed journals, several people other than myself have contributed to the work, and they deserve acknowledgement.

• Janette Norman (Melbourne University, previously Museum), who provided guidance and supervision for all aspects of the PhD study and assisted in the preparation of manuscripts.

• James Robertson (Australian Federal Police), who provided guidance and supervision for all aspects of the PhD study and assisted in the preparation of manuscripts.

• Stephen Sarre (University of Canberra), who provided guidance and supervision for all aspects of the PhD study and assisted in the preparation of manuscripts.

• Arthur Georges (University of Canberra), who provided guidance and supervision for all aspects of the PhD study and assisted in the preparation of manuscripts.

• Rebecca Kogios (Victoria Police Forensic Services Department), who greatly assisted in the preparation of the revision (2010 version) of the manuscript presented in Chapter 2, particularly in regards to the legal rulings.

These people are included as authors in the following chapters as well as the associated publications, in order of their contribution to the work. However, despite the collaborative nature of this thesis, the work within is my own, and I received no assistance other than that which is stated above.

I as Primary Supervisor agree with the above stated proportions of work undertaken for each of the published (or prepared for submission) peer-reviewed manuscripts contributing to this thesis.

Prof. Stephen Sarre Date:

Acknowledgements

It has been a long journey. When I began this project I was a Team Leader of the Biology Team for the Australian Federal Police, an active position, but since I had no children completing a PhD seemed feasible. Reality is not always so simple. During the course of this project I have had three children, changed career positions twice and moved interstate. It is safe to say there are numerous people to thank, without whom completing this project would have been impossible.

Firstly, Janette Norman should be thanked. Her willingness to accept a student part way through a project, and to continue to assist beyond the period of requirement, illustrates her dedication and commitment to her students and research. Janette’s high standards drove and inspired me in directions I had not perceived at the beginning. Without her guidance I simply could not have completed this project.

I would like to thank James Robertson who continually made the time to encourage me and support me in the face of numerous obstacles. James, you provided great advice and were always willing to listen. I would also like to thank the generosity of the Australian Federal Police who, through James, funded this project and provided study leave and support for conference attendance.

My University of Canberra Supervisors Stephen Sarre and Arthur Georges must also be thanked. Steve, your ‘pep talks’ were invaluable; you listened to my ranting and provided good, level-headed advice. Arthur, your wonderful guidance in statistical analysis and encouragement to make the most out of my data was greatly appreciated. The Institute for Applied Ecology also provided a laboratory work space for the initial stages of the research and funding support for conference attendance. I would also like to thank my fellow students Jo Lee, Erika Alacs, Anna Macdonald (who also provided eugenii samples) and Niccy Aiken (who also provided Trichosurus vulpecula samples) for their support and encouragement.

I would like to thank Museum Victoria who provided a laboratory and work space where I conducted a majority of my research. Jo Sumner for her helpful advice, Stella Claudas for always ensuring I had the right paperwork, monitoring my budget and just encouraging me, Rory O’Brien for helping me sort through the collections to find the correct Diprotodontid specimens. Fellow students Paul Ferraro, Joshua Hale, Stephanie Chapple and Katie Smith for being wonderful to bounce ideas off, provide advice and friendship.

I would like to thank the Australia New Zealand Policing Advisory Agency (ANZPAA) and the National Institute of Forensic Science (NIFS) team who took me on and were willing to support me in completing my PhD by generously providing me with study leave, supporting conference attendance and mostly for believing in me. Of particular mention is Alastair Ross who constantly provided encouragement and support – it was greatly appreciated. Also at ANZPAA I would like to thank Bruce Kerr for IT assistance and Nancy Bakker for assistance with my posters for conferences. At NIFS I would like to thank Ann Gidley for proof reading and formatting assistance. I would also like to thank everyone at ANZPAA who constantly gave me encouragement and seemed interested when I talked about my research.

Thanks also go to Runa Daniel and Lorna Hendry for editing assistance even though they knew nothing about species identification or phylogenetics.

I would like to thank the Australian National Wildlife Collection for the very generous provision of a majority of the samples. I would also like to thank the Victoria Museum, Peter Spencer at Murdoch University, Victoria Police Forensic Services Department, Australia Museum and Collingwood Children’s Farm for provision of samples.

Last, but not least, I would like to sincerely thank my family for their love and support. Jim, Zachary, Sienna and Eloise, I know it was not easy putting up with me over the years; mummy’s laptop and paperwork all over the kitchen bench, the stress and time away from you all. Jim you really came through for me. I love you all and thank you.

Linzi Wilson-Wilde

vi Abstract This research investigated the issues surrounding species identification in a forensic wildlife crime context using Diprotodontia as a model group. Wildlife crime covers a broad range of offences where there is a deliberate and purposeful illegal activity involving and plants for which purposeful gain is the principle motive. Worldwide it is thought to cost between US$10 and US$20 billion dollars annually. Native Australian such as sugar gliders and are thought to be targeted for their unique appeal and are currently sold overseas as pets. Numerous marsupials are also the subject of regulated harvesting and international trade, some of which are listed on the Convention on International Trade in Endangered Species of Fauna and Flora Appendices. The application of molecular DNA techniques and population genetics theory, in the context of a broader understanding of genetic variation within and among taxa, can provide the basis for determining the provenance of animals or their parts that have been seized as a result of wildlife crime investigations or regulation of legal trade.

Current human and non-human analysis systems and equipment in forensic laboratories were reviewed. This information was used to inform what changes in equipment and training are required to implement methods for species identification and the investigation of wildlife crime. Current expertise and methods used for species identification were assessed. It was found that the Ouchterlony antigen/antibody method was being used under the (now confirmed) mistaken belief that it was a definitive or confirmatory test, as opposed to an indicative or opinion based test, as it should correctly be used. Since the Ouchterlony test is limited, an alternative method for species identification is recommended.

Informative nuclear markers have been shown many times to be useful in phylogenetic studies by resolving relationships between taxa. A survey of a broad range of nuclear markers for their phylogenetic utility was conducted in Diprotodonts. One hundred and two sets of primers for nuclear markers obtained from the literature were evaluated for phylogenetic performance using a representative phylogeny of Diprotodonts. Eighteen nuclear markers were identified as having potential for further work. Of these, nine were optimised and analysed using parsimony, likelihood and Bayesian approaches. Four new nuclear markers were developed to assist future genetic studies.

A robust phylogeny must underpin any species identification test. A phylogeny containing a large number of taxa generated from nuclear and mitochondrial data was constructed. Mitochondrial markers COI and ND2 were combined with nuclear markers ApoB, IRBP and GAPD to amplify target sequence from 27 genera of Diprotodontia. Two suborders were resolved, and Phalangeriformes. Phalangeriformes was subsequently split into two clades. The first clade contained the and Burramyidae. The second clade contained , grouping with Phalangeroidae. Of the markers tested, ApoB and ND2 provided the greatest number of diagnostic characters. ND2, owing to its location in the mitochondrial genome and therefore its ease of amplification in difficult samples, is recommended for use as a phylogenetic species identification tool to complement the COI barcoding marker for Diprotodonts.

The use of the Cytochrome Oxidase I (COI) barcoding marker has been suggested as an alternative species identification test but would require a dedicated laboratory space and additional equipment and training. Whilst the technique is very similar to current methods and it would therefore require minimal training to conduct the analysis, the interpretation of the results would require significant training. The COI barcoding marker was assessed in detail using Diprotodontia as a model group and compared to two other forensically relevant mitochondrial DNA markers, Cytochrome b and ND2. The ND2 marker provided the most informative results for the Diprotodontia. However, the COI marker offers an international database of relevant sequences for comparison and as such might still provide the best solution for forensic laboratories. An additional recommendation arising from this work is that other supplementary markers, such as ND2, for Diprotodonts also be implemented as an adjunct to the COI barcoding marker.

Degraded and difficult samples are often encountered in wildlife crime investigations. Shortened amplicons and primer modification techniques for improving the amplification efficiency of these types of samples were investigated. The COI 5’ 150 bp shortened amplicon segment identified by the Barcode of Life Initiative for species identification was targeted in a range of Diprotodont samples. Shortened amplicons were found to be effective in increasing amplification efficiency. However, consistent amplification was difficult when targeting within a gene across multiple species, owing to the lack of conserved sites for primer binding. The COI shortened amplicon was found to be insufficient for species identification within the Diprotodonts, although this segment may be useful in other species.


Primer modification techniques focused on the use of Locked Nucleic Acids (LNAs) for improving the amplification efficiency of degraded samples. LNAs were spiked into primers for the entire COI barcoding marker and the primers producing the COI 5’ shortened amplicons. The addition of LNAs to the primers improved amplification efficiency by up to an order of magnitude. The design of the primers, however, was found to be the critical factor in amplification success, even with LNA additions. Use of M13 primer additions was not recommended as these appear to interfere with the function of the LNA spiked primers.

The results of this research highlight the difficulties associated with species identification and the broad level of underpinning information for species delimitation required to interpret the results. The final recommendation must therefore be that if DNA-based species identification is introduced into Government forensic laboratories, it be implemented in a ‘centres of specialisation’ approach where one laboratory becomes the expert service provider for all others. This specialised laboratory could act as a repository for forensically relevant samples, the analysis laboratory for samples of interest for crime investigations and provide a role for coordination of appropriate research and method development.

Table of Contents

Certificate of Authorship of Thesis
Copyright
Statement of Contributions
Acknowledgements
Abstract
List of Figures
List of Tables
List of Co Authorship

Chapter 1 – Introduction
  Wildlife crime
  Regulation and enforcement
  Global hotspots for wildlife crime
  Species identification
  Species identification methods
  Diprotodontia as a model group
  Project aims and objectives

Chapter 2 – DNA Profiling in Criminal Investigations
  Introduction
  History of DNA analysis
  DNA typing in Australia
  Alternatives to nuclear DNA analysis
  Future directions

Chapter 3 – Current Issues in Species Identification for Forensic Science and the Validity of Using the Cytochrome Oxidase I (COI) Gene
  Abstract
  Introduction
  Materials and methods
  Results
  Discussion

Chapter 4 – Evaluating the Phylogenetic Utility of Nuclear Markers in Diprotodontia
  Abstract
  Introduction
  Materials and methods
  Results
  Discussion

Chapter 5 – Developing a Species Identification System in a Forensic Context for Diprotodontia
  Abstract
  Introduction
  Materials and methods
  Results
  Discussion

Chapter 6 – Diprotodont Species Identification Using Cytochrome Oxidase I Barcoding Analysis
  Abstract
  Introduction
  Materials and methods
  Results
  Discussion

Chapter 7 – Amplification Success Rates of Ancient and Degraded DNA Using Shortened Amplicons and Locked Nucleic Acids (LNAs)
  Abstract
  Introduction
  Materials and methods
  Results
  Discussion

Chapter 8 – Synopsis
  Wildlife crime species identification
  Evaluation of nuclear DNA markers for the identification of species and higher order
  Development of a database of COI sequences for forensic-based identification purposes
  Evaluation of the utility of COI compared to other widely used mtDNA genes
  Evaluation of options for the implementation of DNA-based species identification services in forensic laboratories for Diprotodontia
  Improving the analysis of degraded samples
  Future directions

References

List of Figures

Figure 2.1 Illustration of the structure of a human cell highlighting the cellular components.

Figure 2.2 Illustration of the structure of DNA showing the general structure containing the phosphate-sugar backbone in relation to the bases.

Figure 2.3 Illustration of the structure of nucleotide relationships within the DNA molecule.

Figure 2.4 Illustration of the different types of Short Tandem Repeats.

Figure 2.5 Image of the results of real-time PCR analysis for DNA quantitation.

Figure 2.6 Image of the comparison of alleles to a known control ladder in a Genotyper profile analysis.

Figure 2.7 Image of the result obtained from a complete Genotyper profile analysis.

Figure 2.8 Image of a profile obtained using a European loci DNA analysis kit.

Figure 2.9 Image of a partial DNA profile.

Figure 2.10 Image of a mixed DNA profile – displaying major and minor components.

Figure 2.11 Image of a mixed DNA profile – too complex to determine the origin.

Figure 2.12 Images comparing the analysis of a crime scene sample using Profiler Plus and the minifiler kit.

Figure 2.13 Image of the results of mitochondrial DNA analysis.

Figure 2.14 Image of the results of Y chromosome analysis.

Figure 2.15 Image of the results of mRNA analysis.

Figure 3.1 Amplification efficiency results using the newly developed COI primer pair.

Figure 3.2 Neighbour-joining tree of COI sequences obtained for forensically relevant species.

Figure 3.3 Maximum parsimony tree of COI sequences obtained for forensically relevant species.

Figure 4.1 Reconstructed phylogeny of representative Diprotodont species to be tested against nuclear markers.

Figure 4.2 Typical results obtained from the nuclear marker primer screen in Diprotodonts.

Figure 4.3 Phylogenetic analysis for the ApoB marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.4 Phylogenetic analysis for the EN2 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.5 Phylogenetic analysis for the GAPD marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.6 Phylogenetic analysis for the IRBP marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.7 Phylogenetic analysis for the ODC1 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.8 Phylogenetic analysis for the PENK1 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.9 Phylogenetic analysis for the RAG1 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.10 Phylogenetic analysis for the vWF marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches.

Figure 4.11 Illustration of the phylogenetic performance in Diprotodonts for each nuclear marker using a parsimony approach against the test phylogeny.

Figure 5.1 Phylogenetic analysis of the ApoB nuclear marker in Diprotodonts.

Figure 5.2 Phylogenetic analysis of the IRBP nuclear marker in Diprotodonts.

Figure 5.3 Phylogenetic analysis of the GAPD nuclear marker in Diprotodonts.

Figure 5.4 Phylogenetic analysis of the COI mitochondrial marker in Diprotodonts.

Figure 5.5 Phylogenetic analysis of the ND2 mitochondrial marker in Diprotodonts.

Figure 5.6 Phylogram of phylogenetic analysis of 3332 bps of the concatenated data sets for nuclear markers ApoB, IRBP and GAPD and mitochondrial markers COI and ND2 in Diprotodonts using maximum likelihood non-partitioned analysis performed in PHYML.

Figure 5.7 Phylogram of phylogenetic analysis of 3332 bps of the concatenated data sets for nuclear markers ApoB, IRBP and GAPD and mitochondrial markers COI and ND2 in Diprotodonts using maximum likelihood partitioned analysis for each gene performed in RAxML.

Figure 6.1 Illustration for the rates of similarity for the various genes on the mammalian mitochondrion.

Figure 6.2 Phylogenetic analysis of 699 bps of the COI marker in Diprotodont species using maximum parsimony analysis.

Figure 6.3 Phylogenetic analysis of 871 bps of the ND2 marker in Diprotodont species using maximum parsimony analysis.

Figure 6.4 Phylogenetic analysis of 407 bps of the cyt b marker in Diprotodont species using maximum parsimony analysis.

Figure 6.5 Frequency histograms, as a percentage, of COI K2P sequence divergences for each level of the Diprotodont taxonomic hierarchy.

Figure 7.1 Comparison of the structure of Deoxyribonucleic acid and Locked Nucleic Acid molecules.

Figure 7.2 Positioning of the primers to amplify the Cytochrome Oxidase I marker on the mitochondrial DNA molecule.

Figure 7.3 Comparison of individual forward (1a) and reverse (1b) COI segment one primers spiked with Locked Nucleic Acids amplified with a corresponding DNA primer.

Figure 7.4 Comparison of individual reverse COI barcoding primers spiked with Locked Nucleic Acids amplified with a corresponding DNA primer.

Figure 7.5 Results of amplification of contemporary samples using LNA primers.

Figure 7.6 Results of amplification of serial dilution of peregrinus comparing DNA and LNA primer pairs.

Figure 7.7 Phylogenetic analysis of segment 1 of the COI barcoding marker in the Diprotodonts using Maximum Parsimony and Likelihood analysis.

List of Tables

Table 1.1 Approximate prices of exotic wildlife on the black market and Internet.

Table 1.2 CITES I and II listed Diprotodont species as found at the CITES website.

Table 1.3 Currently accepted classification structure to family level within the Diprotodont order.

Table 2.1 Possible sources of DNA from biological material encountered at crime scenes.

Table 2.2 Approximate success rates of DNA analysis for various types of biological material as determined from crime scene samples.

Table 2.3 Details of the loci contained in the European marker set kits.

Table 3.1 Origin of samples used in this study to investigate the use of the COI marker for use in forensic laboratories.

Table 4.1 List of Diprotodont species available for trialling nuclear markers for species identification and their sample identification numbers.

Table 4.2 Primer information for the 102 nuclear markers screened in the study of phylogenetic performance in Diprotodonts.

Table 4.3 New primer sequences developed for the study of phylogenetic performance of nuclear markers in Diprotodonts.

Table 4.4 Amplification results for each of the primers tested for phylogenetic performance in Diprotodonts.

Table 4.5 Summary of results of nuclear marker analysis information.

Table 5.1 Recent Diprotodont classifications of Wilson and Reeder (2005) and Meredith et al. (2009).

Table 5.2 Sample information regarding the samples analysed in the study of species identification of Diprotodontia.

Table 6.1 Sample information regarding the Diprotodonts plus outgroup samples analysed in the study of species identification using the COI marker.

Table 6.2 Primer information for COI, ND2 and cyt b primers used in the study of Diprotodont species identification using the COI barcoding marker.

Table 7.1 Archival museum specimens used in the study of LNA additions and shortened amplicons.

Table 7.2 Contemporary museum specimens used in the study of LNA additions and shortened amplicons.

Table 7.3 Standard DNA primers developed to produce shortened amplicons for three overlapping segments of the 5’ end of the COI barcoding marker.

Table 7.4 LNA primers developed to produce shortened amplicons for the shortened segment of the 5’ end of the COI gene and also the COI barcoding marker.

Table 7.5 Amplification success rate for each primer pair amplifying the COI marker in three segments in the contemporary Diprotodont samples.

Table 7.6 Amplification success rate for each primer pair amplifying the COI marker in three segments in the archival Diprotodont samples.

List of Co Authorship

List of publications associated with this thesis:

Wilson-Wilde, L. (2005) DNA profiling in criminal investigations. In: Expert evidence. Editors: Freckelton I and Selby H, Lawbook Co. Australia.

Wilson-Wilde, L. (2010) Editorial: Combating wildlife crime. Forensic Science Medicine and Pathology 6:149-150.

Wilson-Wilde, L. (2010) Wildlife crime a global problem. Forensic Science Medicine and Pathology 6:221-222.

Wilson-Wilde L, Norman J, Robertson J, Sarre S, Georges A. (2010) Current issues in species identification for forensic science and the validity of using the cytochrome oxidase I (COI) gene. Forensic Science Medicine and Pathology 6:233-241.

Wilson-Wilde L and Kogios R. (2011) DNA profiling in criminal investigations. In: Expert Evidence. Editors: Freckelton I and Selby H, Lawbook Co. Australia.


“I was much struck how entirely vague and arbitrary is the distinction between species and varieties.” Darwin, 1859

“The concept of a species is a concession to our linguistic habits and neurological mechanisms.” Haldane, 1956

“The species problem is the long-standing failure of biologists to agree on how we should identify species and how we should define the word ‘species.’” Hey, 2001


Chapter 1 - Introduction

Photo: cinereus () species within the order Diprotodontia. Photo by Linzi Wilson-Wilde.


Sections of this chapter have been published as: Wilson-Wilde, L (2010) Editorial: Combating Wildlife Crime. Forensic Science Medicine and Pathology 6:149-150 and Wilson-Wilde, L (2010) Wildlife Crime a Global Problem. Forensic Science Medicine and Pathology 6:221-222.

This thesis details the results of research that was conducted to investigate the issues surrounding species identification in a wildlife crime context. Diprotodontia are used to investigate the development of species identification tools for forensic applications. The results are presented as a series of papers in chapters for publication. Details of literature cited from each chapter are amalgamated to form a single reference point at the end of this thesis to improve readability. This introductory chapter provides background information regarding wildlife crime offences in Australia and globally, regulation and enforcement of legal trade and potential illegal trade activity. The concept of species and species identification is discussed and current molecular techniques for species identification are highlighted. The use of Diprotodontia and a current classification system are presented as are the project objectives.

Wildlife Crime

Wildlife crime is a deliberate and purposeful illegal activity involving animals and plants, alive or dead (including their parts, products and derivatives), for which purposeful gain is the principal motive (McDowell 1997). Wildlife crime involves a broad range of offences that can result in a loss of biodiversity where removal of specimens impacts on population viability (particularly if removal is biased by gender) or through habitat destruction during the collection of specimens. It undermines legal trade, has the potential to increase the transmission of diseases, and can expose the Australian environment to the invasion of pest species (Keller and Lodge 2007, Lips et al. 2006, McDowell 1997, Cook et al. 2002). Wildlife crime is thought to cost between US$10 billion and US$20 billion worldwide annually (Brack 2004, Symondson 2002, Wyler and Sheikh 2008), which is equivalent to approximately 5% of the international drug trade (Lips et al. 2006, McDowell 1997, Cook et al. 2002). Despite this substantial trade, the law enforcement resources allocated to combating wildlife crime, as compared to drug crime, are not commensurate with the relative levels of impact (Brack 2004, Wyler and Sheikh 2008). Broader attention to this problem is beginning to occur (Neme 2010), but the application of forensic techniques is predominantly in the research and educational arenas. A number of international networks have been established, including the INTERPOL Wildlife Crime Working Group (http://www.interpol.int/public/EnvironmentalCrime/Wildlife, accessed 10 September 2009) and TRAFFIC (Wildlife Trade Monitoring Network) (www.traffic.org). These networks have not been significantly funded and rely predominantly on public donations and member contributions.

Regulation and Enforcement

At the international level there are currently 175 signatories to the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES 2010). CITES provides a system of control to inhibit the exploitation of animals and plants and prevent trade from threatening the extinction of endangered species. Fauna and flora of concern are listed in one of the three Appendices of the CITES document. For species in Appendix I, international trade is prohibited (with exceptions for non-commercial purposes such as scientific research). For species in Appendix II, trade is regulated in circumstances where it does not endanger the survival of the species. For species in Appendix III, trade is regulated at the request of a particular country (for example, Uruguay has listed the eleven banded armadillo). The ability to identify the species in question in any matter of compliance with CITES listing is paramount to the enforcement of CITES and the subsequent prosecution of offenders.

Australian compliance with the CITES agreement is mediated through the 1999 Commonwealth of Australia Environment Protection and Biodiversity Conservation Act (EPBC Act). The EPBC Act covers species protection, including threatened species and communities, migratory species and marine species, and also protected areas. Whilst the Act is very broad, Part 13 specifically covers the international movement of wildlife specimens and the CITES agreement (EPBC Act). The Act is enforced by the Department of Environment, Water, Heritage and the Arts, Border agencies (Customs and Quarantine), the Australian Federal Police and State and Territory law enforcement agencies (Anon 1998).

Evidence used in the investigation of any offence must also meet certain requirements for use in court proceedings. Evidence validity is often assessed according to Daubert v Merrell Dow Pharmaceuticals 509 US 579 (1993), a USA case which set the rules for the admission of expert evidence into court proceedings. Although this is an American case, the arguments presented have been used in Australia in numerous trials. More recently the US National Academy of Sciences (NAS) report was released (Anon 2009), which emphasised the need for all forensic evidence to have been produced in an accredited laboratory using validated methods by certified scientists. These recommendations have the potential to influence the presentation of forensic evidence in court in Australia.

Global hotspots for Wildlife Crime

A worldwide network of suppliers, distributors, middlemen and traffickers, with contact persons in habitat countries and the involvement of customs officers at end destinations, facilitates the wildlife crime trade (Lowther et al. 2002). Offenders are often associated with other trafficking, such as in drugs, guns and humans, and with flow-on offences such as forgery. CITES documents, export/import permits and other documents are forged to reflect fictitious breeding by registered breeders (Lowther et al. 2002) and items may be laundered through another country to mask the country of origin. Offenders are often part of large, organised groups that use jargon and nicknames to conceal communications and take advantage of communication technologies (e.g. Skype) and the internet (IFAW 2005). For example, it is estimated that in Brazil 40% of organised crime is associated with wildlife crime (Renctas 2001) and the figure is rising, owing in part to the global economic downturn and disparities between the penalties and financial gains when compared to other organised crime types (Leader-Williams and Milner-Gulland 1993, Li et al. 2000, Claridge et al. 2005, Alacs and Georges 2008).

It is difficult to ascertain specifically what drives demand in particular wildlife trade, but it is thought that fashion, species rarity, trends in alternative remedies and medicine, and criminal elements each play a part (Alacs and Georges 2008). Furs, which were once popular, have been replaced with wool such as ‘shatoosh’, a product made from the endangered Tibetan antelope, which must be killed in order to harvest its wool. Skin products, such as crocodile and lizard skins, and feathers are also popular. Endangered species are more highly prized, cost more, and can therefore be in higher demand by collectors due to higher profits relative to the risks and penalties incurred (Courchamp et al. 2006). Species are also chosen for the end product, such as shatoosh wool. Recently the trade in rhinoceros horn and ivory has increased markedly. Traditionally rhinoceros horn is used for Yemeni dagger handles, but there has been a significant increase in illegal poaching and trafficking of the horn en route to the Far East (Xianlin 2009). This is driven by the belief in the medicinal benefits of powdered horn in the treatment and prevention of cancer (Xianlin 2009). Simply placing a species on the CITES list, Appendix I (CITES I), can make a species more appealing. Table 1.1 lists approximate prices for some species being sold on the internet. A 1998 Senate report on the Commercial Utilisation of Australian Native Wildlife found that Australian animals most commonly targeted are birds, reptiles, insects, fish and large game (Anon 1998).

The trade in wildlife is also very popular in China, where a large number of CITES I and II listed species are offered for sale on the Chinese-language internet (in China, Hong Kong and Taiwan), with over 20,000 advertisements in China alone (Xianlin 2009). In an important move, the internet’s largest trading venue, eBay, prohibits the trade in live animals (except some fish in the United States) and ivory (http://help.ebay.com.au/Help/Policies/Banned/Restricted_items/Animals-and-wildlife, accessed 20 June 2010).

Birds, followed by reptiles, are the most common illegally exported Australian animals (Anon 1998). Native birds are smuggled alive for their appeal as pets for collectors in Europe and America but, more commonly, fertilised eggs are smuggled to incubate overseas. Stuffed birds, feathers, skin and meat from various animals are also desirable. Reptiles are easy to transport and can survive long journeys in suitcases or concealed in clothing worn by the transporter (Alacs et al. 2010). This makes reptiles appealing for low-risk high-profit smuggling. Native Australian marsupials, such as sugar gliders and wallabies, are also targeted for their unique appeal (refer Table 1.1). Plants, such as orchids and various species of wood, are also heavily targeted.

Animals and plants of Australia are readily available in the wild and, with some knowledge of the species and its habitat, trapping techniques and species husbandry, specimens can be captured at little cost. Mounting a case for prosecution is problematic because, once a specimen has been caught, it can be difficult to determine whether it was wild caught or captive bred. In such cases, DNA analysis and application of population genetics theory, in the context of a broader understanding of genetic variation within and among taxa, can provide the basis for determining the provenance of captive animals. Genetic analyses can determine if a particular animal is an offspring of individuals nominated by a suspect and, if not, from what population the animal was most likely derived.
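The parentage question raised above can be illustrated with a minimal sketch. It assumes codominant markers (for example microsatellites) scored as pairs of allele sizes, and it simply counts loci at which a claimed offspring could not have inherited one allele from each nominated parent. The locus names, allele sizes and function names are hypothetical, and casework would also need to allow for mutation, null alleles and genotyping error rather than rely on raw exclusion counts.

```python
from typing import Dict, Tuple

# Hypothetical representation: locus name -> pair of allele sizes (base pairs)
Genotype = Dict[str, Tuple[int, int]]


def compatible_at_locus(child: Tuple[int, int],
                        mother: Tuple[int, int],
                        father: Tuple[int, int]) -> bool:
    """True if one child allele can come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def count_exclusions(child: Genotype, mother: Genotype, father: Genotype) -> int:
    """Number of shared loci at which the nominated pair cannot be the parents."""
    shared = child.keys() & mother.keys() & father.keys()
    return sum(
        not compatible_at_locus(child[locus], mother[locus], father[locus])
        for locus in shared
    )


# Invented genotypes for illustration only
dam = {"L1": (120, 124), "L2": (200, 204), "L3": (150, 150)}
sire = {"L1": (122, 126), "L2": (204, 208), "L3": (152, 156)}
claimed_offspring = {"L1": (120, 126), "L2": (204, 204), "L3": (150, 156)}
wild_caught = {"L1": (118, 130), "L2": (200, 208), "L3": (148, 148)}

print(count_exclusions(claimed_offspring, dam, sire))
print(count_exclusions(wild_caught, dam, sire))
```

An animal showing exclusions at several independent loci is unlikely to be the offspring of the nominated pair, which is the point at which population assignment becomes the relevant question.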


Often there is little evidence remaining of the illegally captured species. Sometimes all that remains are a few hairs or some minor cellular material. Genetic analysis therefore becomes very important to the determination of the species of origin of small amounts of biological material. This biological material may be severely degraded and the DNA may exist in relatively low copy numbers. It is therefore important that robust and reliable techniques exist that can withstand the rigours of a criminal court proceeding.

Central to the establishment of a viable international and national legal trade is the ability to monitor compliance and identify illegal activity. Identification to species level is especially difficult when animal products or parts are involved or when similar but different species are sold under the same common name (such as in food substitution). DNA profiling may prove critical in controlling trade in animal products and determining lines of inheritance for monitoring the trade of live captive-bred versus wild-caught animals. The scale of legal wildlife trade on a global level is massive. It is estimated that over half a million shipments of wildlife, containing just under 1.5 billion live animals, have been imported into the USA since 2000 (Smith et al. 2009). Approximately 31% of the shipments were incorrectly labelled: they did not state the species-level names of the animals in the shipment, even though this is mandatory by law (Smith et al. 2009). These shipments were still permitted entry into the USA. This reflects the lack of a nationally coordinated compliance monitoring system, without which it is impossible to accurately determine the size of the illegal trade. Illegal shipments can be easily hidden within larger legal shipments and, without a monitoring system and identification method, the size of any concealed illegal activity cannot be determined.

Species Identification

The concept of species has long been debated (Darwin 1859, Haldane 1956, Cracraft 2000, Hey 2001), yet the identification of species is important for a broad range of applications in conservation biology, epidemiology and evolutionary biology (Yang and Rannala 2010). Speciation is a continuing process, so there is rarely a clear and definitive distinction between closely related species. The determination of a species (examining a group of individuals and delineating the individuals into species groups based on their genetic similarity within a group and differences between the groups) is underpinned by the disciplines of evolutionary biology, taxonomy (the classification of organisms into groups according to their morphological similarity), phylogeography (the evolutionary relatedness of organisms in space and time), and population genetics (distribution of genetic variation within and among populations) (Avise 2000). The latter two use molecular DNA techniques as the basic tool for defining taxonomic units. For a sample to be accurately placed into a species, the knowledge drawn from the above-mentioned disciplines needs to be applied at a level appropriate to the problem at hand. This is a key issue in wildlife forensics: how to define the biological (taxonomic) units of concern, whether they are species, populations or some other variant.

A species may not always be easily defined by genetic means, particularly if the baseline knowledge of genetic variation within and among species is not well characterised. The DNA analysis of two closely related populations may result in most of the specimens being grouped into one or the other of the populations, but there may be outliers in the ‘grey area’ between the two populations that cannot be readily assigned to one group or the other (Mallet 1995). Species delimitation is complicated owing to factors such as hybridisation (where two species interbreed), introgression (where genes from one species infiltrate into the gene pool of another as a result of interbreeding), homoplasy (where two species share traits which have not arisen from a common ancestor – such as in the case of convergent evolution) and cryptic species (where two species are morphologically difficult to diagnose).
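The ‘grey area’ between two closely related populations can be made concrete with a small assignment sketch. It assumes known allele frequencies for two reference populations, Hardy-Weinberg proportions and independent loci, and it leaves a specimen unassigned when the difference in log-likelihood falls below a threshold. The frequencies, the threshold of 2.0 (natural log units) and all names are assumptions chosen for illustration, not values drawn from this study.

```python
import math
from typing import Dict, Tuple

Genotype = Dict[str, Tuple[str, str]]        # locus -> two alleles
Frequencies = Dict[str, Dict[str, float]]    # locus -> allele -> frequency


def log_likelihood(genotype: Genotype, freqs: Frequencies, floor: float = 0.01) -> float:
    """Log-likelihood of a genotype under Hardy-Weinberg proportions, loci independent."""
    total = 0.0
    for locus, (a, b) in genotype.items():
        p = freqs[locus].get(a, floor)   # floor avoids log(0) for unseen alleles
        q = freqs[locus].get(b, floor)
        total += math.log(p * p if a == b else 2 * p * q)
    return total


def assign(genotype: Genotype, pop_a: Frequencies, pop_b: Frequencies,
           threshold: float = 2.0) -> str:
    """Assign to population 'A' or 'B', or report 'unassigned' inside the grey area."""
    diff = log_likelihood(genotype, pop_a) - log_likelihood(genotype, pop_b)
    if diff > threshold:
        return "A"
    if diff < -threshold:
        return "B"
    return "unassigned"


# Invented reference frequencies for two loci
POP_A = {"L1": {"x": 0.9, "y": 0.1}, "L2": {"x": 0.8, "y": 0.2}}
POP_B = {"L1": {"x": 0.1, "y": 0.9}, "L2": {"x": 0.2, "y": 0.8}}

print(assign({"L1": ("x", "x"), "L2": ("x", "x")}, POP_A, POP_B))
print(assign({"L1": ("x", "y"), "L2": ("x", "y")}, POP_A, POP_B))
```

Reporting an explicit "unassigned" outcome, rather than forcing every specimen into one group, mirrors the outliers described above that cannot be readily placed in either population.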

Species Identification Methods

Species identification tests more commonly used in broad-based forensic laboratories involve hair comparison analysis (Brunner and Coman 1974, Robertson 1999) and the Ouchterlony test (Ouchterlony 1968, Ouchterlony and Nilsson 1986). The former requires both a database of known hairs and a high level of expertise, the latter of which can be problematic as there are a limited number of experts in this field. The Ouchterlony test is a gel immunodiffusion antibody/antigen reaction which, because of the loss of biological activity of the proteins and narrow parameters for optimal detection (Ouchterlony 1968, Fugate and Penn 1971, Saferstein 1982), is limited in its application. In addition, the need to raise specific antibodies in laboratory animals means that the test is available for a very limited number of species (approximately 10), and control blood samples for each species for which a test is available also need to be sourced (Saferstein 1982). Although the Ouchterlony test is currently used as a confirmatory (definitive) test, it has been found to produce cross-reactions in non-target species (Wilson and Pringle 1956, Bird 2005), which could lead to confusing results or, worse, misidentifications. It is therefore recommended by the author that in a forensic environment this test should be considered only as a presumptive (indicative) test.


Use of molecular DNA techniques to identify species has long been postulated (Woese and Fox 1977), either as a stand-alone method or in conjunction with morphological methods (Hajibabaei et al. 2007). In laboratories outside Government agencies, the most commonly applied methods of analysis for species involve the use of mitochondrial DNA, particularly Cytochrome b (cyt b) and Cytochrome Oxidase I (COI) (Ogden et al. 2009). Cyt b has traditionally been used in forensic science for species identification (Parson et al. 2000, Verma and Singh 2002, Bellis et al. 2003, Hsieh et al. 2003). However, COI has more recently gained some favour, particularly in relation to the Barcode of Life Project, which aims to catalogue all living species on Earth. Data generated from this project are uploaded to the Barcode of Life Database (BOLD), which can then be searched and the results used in the identification of an unknown sample.
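The database search described above can be sketched in a simplified form. The example compares an unknown sequence against aligned reference sequences using the uncorrected p-distance (the proportion of differing sites) and reports the closest reference only if it falls within a distance threshold. The 20-base sequences, the placeholder species labels and the 2% threshold are assumptions made for illustration; this is not the search algorithm used by BOLD.

```python
from typing import Dict, Optional, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore sites with gaps or ambiguous bases in either sequence
    sites = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def best_match(query: str, references: Dict[str, str],
               max_distance: float = 0.02) -> Tuple[Optional[str], float]:
    """Return (closest reference, distance), or (None, distance) if beyond the threshold."""
    name, dist = min(
        ((label, p_distance(query, seq)) for label, seq in references.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= max_distance else (None, dist)


# Invented reference fragments; real COI barcodes are several hundred bases long
REFERENCES = {
    "species_A": "ATGGCATTCCTAGGAACTGC",
    "species_B": "ATGGCGTTTCTTGGTACAGC",
}

print(best_match("ATGGCATTCCTAGGAACTGC", REFERENCES))   # identical to species_A
print(best_match("ATGGCATTCCTAGGAACTGT", REFERENCES))   # one mismatch in 20 sites
```

Returning no match when the nearest reference is too distant matters in a forensic setting, because a reference database that lacks the true species would otherwise yield a confident but wrong identification.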

Other molecular techniques include microsatellites, single nucleotide polymorphisms (SNPs), amplified fragment length polymorphisms (AFLPs) and species specific priming (SSPs). Microsatellites are more applicable in parentage analysis (individualisation), single species testing and population assignment. SNPs are particularly useful in mixed samples (Tobe et al. 2010) and can be analysed using pyrosequencing. For a summary of the advantages and disadvantages of the various techniques for species identification, refer to Alacs et al. (2010) or Linacre et al. (2009).

A species discrimination test has applications in criminal investigations through the identification of blood or hairs on suspects and in the investigation of illegal poaching, exportation of Australian native species and importation of exotic species (McDowell 1997). Broader applications include confirmation of species and scat content analysis in ecological studies (Symondson 2002, Berry et al. 2007, Berry and Sarre 2007), food product analysis (Macedo-Silva et al. 2000, Martinez and Danielsdottir 2000, Wong et al. 2004), Chinese medicine content analysis (Peppin et al. 2008) and animal feed analysis for the identification of processed animal proteins (Fumière et al. 2009).

Diprotodontia as a model group

In 2003, the Australian Federal Police (AFP) Forensic and Data Centres section invested in preliminary studies into species identification as it applies to wildlife crime. Three doctoral programs were funded, targeting birds (focussing on glossy black cockatoos), reptiles (focussing on turtles) and mammals (focussing on marsupials of the Diprotodont order). The aim of this approach was to look at the concept of species and species identification, determine what expertise and knowledge was required to provide a species identification service and determine whether the AFP Forensic Services was best placed to provide this service.

This study covers the third of these projects, focussing on diprotodont marsupials within the kingdom. Diprotodontid Marsupials were chosen as they are a broad species group containing very closely related and more distantly related taxa. This was important to fully explore species identification in a forensic context. Some Diprotodontid species are CITES I and II listed (refer Table 1.2), involved in the pet trade (refer Table 1.1), used in commercial meat supply and trade, targeted for legal hunting and used in the production of commercial skin and souvenir products (Anon 1998). Fundamental to the regulation of trade and harvesting of these species is an appropriate identification method.

The order Diprotodontia is the largest of the marsupial orders. Under the current classification developed by Meredith et al. (2009) (refer Table 1.3), there are 11 extant families within the order, containing approximately 125 living species.

The order is split into two suborders: Vombatiformes and Phalangerida. Vombatiformes consists of two families: Phascolarctidae and Vombatidae. Phalangerida is divided into two superfamilies: Petauroidea, containing the Petauridae, Pseudocheiridae, Acrobatidae and Tarsipedidae, and Australoplagiaulacoida (coined by Meredith et al. 2009), containing the remaining five families in two major clades. Clade Phalangeroidea contains the Phalangeridae and Burramyidae, whilst clade Macropodiformes contains the Potoroidae, Hypsiprymnodontidae and Macropodidae.
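The classification described above (and set out in Table 1.3) can be held as a simple lookup structure, which is one way an identification system might record how closely two families are related. The sketch below is illustrative only: the dictionary layout and the function name are assumptions, while the family and group names are those of the Meredith et al. (2009) classification as given in this chapter.

```python
from typing import Dict, List, Tuple

# Family -> named groups from suborder downwards, following Table 1.3
CLASSIFICATION: Dict[str, Tuple[str, ...]] = {
    "Phascolarctidae": ("Vombatiformes",),
    "Vombatidae": ("Vombatiformes",),
    "Petauridae": ("Phalangerida", "Petauroidea"),
    "Pseudocheiridae": ("Phalangerida", "Petauroidea"),
    "Acrobatidae": ("Phalangerida", "Petauroidea"),
    "Tarsipedidae": ("Phalangerida", "Petauroidea"),
    "Phalangeridae": ("Phalangerida", "Australoplagiaulacoida", "Phalangeroidea"),
    "Burramyidae": ("Phalangerida", "Australoplagiaulacoida", "Phalangeroidea"),
    "Potoroidae": ("Phalangerida", "Australoplagiaulacoida", "Macropodiformes"),
    "Hypsiprymnodontidae": ("Phalangerida", "Australoplagiaulacoida", "Macropodiformes"),
    "Macropodidae": ("Phalangerida", "Australoplagiaulacoida", "Macropodiformes"),
}


def shared_groups(family_a: str, family_b: str) -> List[str]:
    """Named groups below the order that two families have in common."""
    shared = []
    for group_a, group_b in zip(CLASSIFICATION[family_a], CLASSIFICATION[family_b]):
        if group_a != group_b:
            break
        shared.append(group_a)
    return shared


print(len(CLASSIFICATION))                            # number of extant families
print(shared_groups("Macropodidae", "Potoroidae"))    # same clade
print(shared_groups("Vombatidae", "Macropodidae"))    # different suborders
```

Families that share more groups are more closely related under this classification, which is where discrimination between species is expected to be most demanding.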

Project Aims and Objectives

My aims are to examine the concepts of species and species identification in a forensic context as it applies to the investigation of wildlife crime, using Diprotodontid marsupials as a model group. Knowledge of the evolutionary relationships of species within and between groups is critical in underpinning all identification systems. This can be particularly important in wildlife crime investigation where one species may be strictly regulated and another closely related species may not. If delimitation between the species is not fully resolved, identification of the regulated species may not be possible. This can be particularly problematic in fragmented remains. Degraded and altered samples are often encountered in forensic analysis for the prosecution of wildlife crime offences and improvements to the analysis of these types of samples would be beneficial. The investigation of species identification tools for use in forensic laboratories has not been conducted to date and it is unclear how these types of methods could be incorporated into a standard forensic context.

Therefore the specific project objectives are as follows:
1. Develop a database of COI sequences for forensic-based species identification.
2. Evaluate the utility of the COI marker in Diprotodonts.
3. Investigate techniques for improving DNA-based species identification from degraded specimens and trace samples.
4. Evaluate nuclear DNA markers for identification of species and higher orders.
5. Evaluate options for the implementation of DNA-based species identification services in forensic laboratories.

This thesis is written as a series of papers. Four papers have been published to date. The thesis is written primarily from a perspective of species identification as it applies in forensic science, although application of some chapters is much broader. I begin the thesis by reviewing the issues surrounding wildlife crime (Chapter 1) and then review current molecular analysis techniques, including species identification, in prosecution-based forensic laboratories in Australia (Chapter 2). This information is then used to inform the development of subsequent chapters, looking at the application of COI barcoding techniques in forensic laboratories (Chapter 3), investigating nuclear markers for use in phylogenetics and species identification (Chapter 4), developing a robust phylogeny for Diprotodontia that can underpin a species identification method (Chapter 5), reviewing the barcoding marker for use in species identification for Diprotodonts (Chapter 6) and investigating alternative techniques for analysing degraded samples often encountered in wildlife crime investigation (Chapter 7). Chapter 8 provides a synopsis, drawing together the common threads of the individual chapters and the discussions to provide an overview of the contribution this thesis makes to the discipline and suggesting future research.

Table 1.1 Approximate prices of exotic wildlife on the black market and Internet. Species, price and references are provided.

Species                            Price               Reference
Parrots (e.g. Macaw)               US$20,000-40,000    CITES “Wild Fauna and Flora” brochure document no. 41.835(e)
Grey Parrot                        US$600-1,200        CITES “Wild Fauna and Flora” brochure document no. 41.835(e)
Crocodile skin                     US$1,000            CITES “Wild Fauna and Flora” brochure document no. 41.835(e)
Orchid (rare)                      US$2,000-10,000     CITES “Wild Fauna and Flora” brochure document no. 41.835(e)
Wallaby                            US$2,000-4,000      http://www.drzoolittle.net/wallaby.html (accessed 20 June 2010)
Sugar glider                       US$125-1,000        http://www.gliderpets.com/ (accessed 20 June 2010)
Samueli Red Tail Black Cockatoos   A$200               http://www.birdauction.com.au/categories.php?category=Australian-Birds&parent_id=1838 (accessed 20 June 2010)
Bear Bile (per pound)              US$200,000          http://www.havocscope.com/black-market-prices/wildlife-animal-prices/ (accessed 20 June 2010)
Orangutan                          US$45,000           http://www.havocscope.com/black-market-prices/wildlife-animal-prices/ (accessed 20 June 2010)
Ivory                              US$1,800            http://www.havocscope.com/black-market-prices/wildlife-animal-prices/ (accessed 20 June 2010)
Rhino Horn Dagger                  US$14,000           http://www.havocscope.com/black-market-prices/wildlife-animal-prices/ (accessed 20 June 2010)
Tiger (Live)                       US$50,000           http://www.havocscope.com/black-market-prices/wildlife-animal-prices/ (accessed 20 June 2010)
Iguanas                            US$10,600           http://www.havocscope.com/black-market-prices/wildlife-animal-prices/ (accessed 20 June 2010)

Table 1.2 CITES I and II listed Diprotodont species as found at the CITES website: http://www.cites.org/eng/disc/species.shtml (accessed 20 June 2010).

CITES I Listed Species        Common Name
Lagorchestes hirsutus         Rufous hare-wallaby, Western hare-wallaby, Wurrup
Lagorchestes fasciatus        Banded hare-wallaby
Onychogalea fraenata          Bridled nailtailed wallaby
Onychogalea lunata            Crescent nailtailed wallaby
Bettongia spp.                Bettong, rat-kangaroo, Lesueur’s Rat-kangaroo, Brush-tailed Rat-kangaroo, Woylie (the entire genus is now listed)
Caloprymnus campestris        Desert Rat-kangaroo (possibly extinct)
Lasiorhinus krefftii          Northern Hairy-nosed Wombat

CITES II Listed Species       Common Name
Dendrolagus inustus           Grizzled grey tree kangaroo
Dendrolagus ursinus           Vogelkop tree-kangaroo, Black tree-kangaroo
Phalanger intercastellanus    Eastern common cuscus
Phalanger mimicus             Cryptic cuscus
Phalanger orientalis          Gray cuscus
Spilocuscus kraemeri          Admiralty cuscus
Spilocuscus maculatus         Spotted cuscus
Spilocuscus papuensis         Waigeo cuscus

Table 1.3 Currently accepted classification structure to family level within the Diprotodont order. Classification developed by Meredith et al. 2009.

Order Diprotodontia
  Suborder Vombatiformes
    Family Phascolarctidae
    Family Vombatidae
  Suborder Phalangerida
    Superfamily Petauroidea
      Family Petauridae
      Family Pseudocheiridae
      Family Acrobatidae
      Family Tarsipedidae
    Superfamily Australoplagiaulacoida
      Clade Phalangeroidea
        Family Phalangeridae
        Family Burramyidae
      Clade Macropodiformes
        Family Potoroidae
        Family Hypsiprymnodontidae
        Family Macropodidae


Chapter 2 - DNA Profiling in Criminal Investigations

Photo: Macropus eugenii (Tammar Wallaby), a species within the order Diprotodontia. Photo by Linzi Wilson-Wilde.


This chapter was written as a review of DNA profiling in prosecution (police) based laboratories in Australia and the DNA analysis services they currently offer. The chapter was first published as: Wilson-Wilde, L. (2005) DNA profiling in criminal investigations. In: Expert evidence. Editors: Freckelton I and Selby H, Lawbook Co. Australia. The chapter has since been reviewed and updated by myself and a second author Rebecca Kogios as a revised version for the same book and published as: Wilson-Wilde L and Kogios R. (2011) DNA profiling in criminal investigations. In: Expert Evidence. Editors: Freckelton I and Selby H, Lawbook Co. Australia. The chapter is presented as published except for formatting changes.

Introduction

Deoxyribonucleic acid (DNA) evidence has been presented in Australian Courts of Law for more than two decades. Technological advances in the profiling of DNA during this time have been rapid and significant. Where previously large amounts of biological material from fluids such as blood were required (a stain approximately the size of a 20 cent piece), profiles are now routinely obtained from minute quantities of biological fluid and from samples where biological material other than fluid is present. The impact of these advances, not only on the justice system but in all areas of biological science, has brought DNA analysis into public awareness and has dramatically increased the range of legal cases where DNA evidence is available. Whereas once DNA evidence was available only in cases involving the transfer of biological fluid (e.g. homicide, sexual assault), it is now routinely used in cases where no fluid is present (e.g. armed robbery, where DNA could be recovered from the inside of a cap left behind at a crime scene). This chapter is an overview of DNA profiling in Australia, presented in a relatively easily digestible format; further scientific details can be obtained from standard textbooks. Unless otherwise noted in this chapter, reference to DNA is to nuclear DNA.

The basis of DNA variation

Since Watson and Crick discovered the structure of DNA in 1953 and casually noted, “This structure has novel features which are of considerable biological interest” in their Nature article of the same year, much research has been devoted to the molecule and its role in biological organisms. Much still remains unknown about the DNA molecule and its interaction in the cell; however, it is clear that it plays a fundamental part in the criminal justice system. The following is an introduction to DNA. For further, more in-depth information, readers should refer to Butler (2010), Goodwin et al. (2007) or any good basic textbook on biology or genetics.

The human body is made up of different types of tissue, such as muscle tissue, bone, blood and skin. These different types of tissue are made up of a collection of cells, the building blocks of human, animal and plant bodies. The cell is made up of a nucleus containing the DNA, surrounded by a liquid material called the cytoplasm, which is in turn encased in the cell membrane (see Figure 2.1, which illustrates the basic structure of a cell). The cytoplasm contains various structures such as the mitochondria, which supply the cell with energy to function. Tissue and cells deposited on a surface can be collected, the cells broken open and the DNA contained within analysed.

The DNA molecule is often referred to as the ‘blueprint for life’ as it carries all the genetic information of the organism. The molecule consists of two complementary strands which bind together and twist to form a double helix (Figure 2.2). Each strand consists of a sequence of four bases (adenine, thymine, guanine and cytosine) attached to a phosphate-sugar (deoxyribose) backbone. The phosphate makes the DNA acidic. The phosphate, sugar and base form a unit known as a nucleotide (Figure 2.3). The two strands bind together via hydrogen bonds. Adenine always bonds to thymine and guanine always bonds to cytosine. This unique and definite pairing provides a mechanism for the DNA to replicate; the pairing of each set of nucleotides is termed a base pair (Figure 2.3). There are approximately three billion nucleotide base pairs in the human genome. By knowing the sequence of one strand, the sequence of the other can be determined. The current method of DNA replication, which utilises this feature, is discussed below.
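The pairing rule described above (adenine with thymine, guanine with cytosine) is enough to derive one strand from the other, as the following minimal sketch shows. The function names are illustrative, and the reverse complement is included because the two strands run in opposite directions (Figure 2.3), so the partner strand is conventionally written reversed.

```python
# Base-pairing rule: A pairs with T, G pairs with C
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Base-by-base partner of a strand, read in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Partner strand written in its own 5' to 3' direction."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```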

The DNA double helix is wound around proteins called histones which are tightly packed into structures called chromosomes. Humans have 46 chromosomes: 22 pairs of autosomal chromosomes and two sex chromosomes. Each chromosome forms a storage mechanism for the deoxyribonucleic acid (DNA). An individual may generally possess either two X chromosomes (gender = female) or one X and one Y chromosome (gender = male). There are also some chromosomal abnormalities where an individual receives multiple X or Y chromosomes, or multiple autosomal chromosomes (e.g. trisomy 21 – three copies of chromosome 21 leading to Down syndrome) or partial chromosomes attached to other chromosomes, although these various mutations are rare. Chromosomes are located in the nucleus of the cell (Figure 2.1). However, not all cells in the human body contain a nucleus and therefore not all cells contain DNA; red blood cells, for instance, do not have a nucleus. White blood cells do, and it is from these cells that a DNA profile can be derived when analysing blood.

Some sections of the DNA provide information to the cell (where the DNA is ‘translated’ by reading the genetic code in order to retrieve the information) to produce proteins enabling the body to grow and function. These sections are known as coding regions (in that they have a functional product). Other sections of the DNA that are not directly translated to produce products such as proteins are referred to as non-coding regions. While these areas have no obvious translatable purpose, it is still thought that they are necessary for the functioning of the DNA and their direct relevance in the intricate workings of the molecule may become clearer with time.

The sequence of the genome is different for all organisms and, with the exception of identical twins, differs between individuals of the same type of organism. It is these differences in the sequence of the code that define who we are; these differences can be analysed to differentiate between us. Traditional forensic DNA analysis has focused on (but is not limited to) the non-coding areas of the DNA, as these regions of the DNA molecule exhibit greater variation between individuals than coding regions and generally offer no information regarding the individual, such as predisposition to diseases or physical appearance. Hence limited personal information regarding an individual can be obtained from traditional forensic DNA analysis.

Sources of DNA

DNA may be obtained from a variety of sources, as illustrated in Table 2.1. Locard's Principle (Locard 1930) states that any contact between two surfaces may result in an exchange of material between the surfaces. This exchange is also true for DNA. For example, a person who comes into contact with a surface or object, such as the handle of a weapon, may leave some of her or his DNA on that surface. Locard's Principle is a fundamental underlying principle in many forensic science disciplines, including DNA analysis.

Figure 2.1 Illustration of the structure of a human cell highlighting the cellular components. The location of the DNA inside the nucleus is shown. The mitochondria also contain DNA. Labelled components: mitochondria, cell membrane, chromosomes/DNA, cellular bodies and nuclear membrane.

Figure 2.2 Illustration of the structure of DNA showing the general structure containing the phosphate-sugar backbone in relation to the bases. Labelled components: bases and phosphate-sugar backbone.

Figure 2.3 Illustration of the structure of nucleotide relationships within the DNA molecule. The diagram illustrates a nucleotide and a base pair containing two nucleotides. In the diagram, P, S and B denote phosphate, sugar and base, and the strand ends are marked 5’ and 3’.

Table 2.1 Possible sources of DNA from biological material encountered at crime scenes.

Biological material    Source
Blood                  White blood cells.
Semen                  Spermatozoa and cells shed during ejaculation.
Vaginal secretions     Cells shed from the vaginal lining. Secretions may be found on the outside of condoms or penile swabs.
Saliva                 Cheek (buccal) cells from the inside of the mouth. Saliva may be found on drinking vessels, cigarette butts, masks, stamps, envelopes and food.
Hair                   Cells found attached to the root of plucked hair or adhering to the shaft of the hair. There are many more cells attached to the root of the hair than the shaft.
Sweat/skin             Skin cells which have sloughed off the body due to normal regeneration processes. Found on clothing, touched items and personal items. Also referred to as ‘trace DNA’ or ‘contact DNA’.
Bone                   Cells present in the marrow.
Tissue                 Cells present particularly in muscle tissue.
Teeth                  Cells present in the tooth pulp.
Urine/faeces/vomit     Cells present in these fluids, if any, are thought to originate from the lining of the orifice from which they came. Often difficult to analyse as bacteria degrade the DNA.

However, even if biological material is present, it does not mean a DNA profile will be obtained. The two main factors that affect DNA analysis are the quantity and quality of the DNA present. The greater the number of cells present, the greater the chance of obtaining a DNA profile. However, if the DNA is degraded, in that it has been broken into smaller pieces, it is of low quality and it may not be possible to generate a profile. Profiles can theoretically be generated from samples with a very small number of cells, as long as a sufficient number of cells containing DNA of a high quality are present. The DNA contained within these types of samples is referred to as Low Template DNA (LT DNA) but can also be referred to as ‘trace DNA’.

LT DNA is specifically obtained from an individual coming into contact with an object or surface (van Oorschot and Jones 1997). The amount of DNA deposited on the surface is dependent on the individual and whether that individual has the propensity to ‘shed’ her or his DNA. Individuals who have a greater propensity to shed their DNA are termed ‘shedders’ (Lowe et al. 2002). Friction between the individual and the surface can also aid in transferring a greater number of cells than would be transferred by touching the object alone. Consequently, objects which are handled more often, such as mobile phones or briefcase handles, can yield a DNA profile even if the individual is not a good shedder.

Factors that affect the quality of the DNA deposited on a surface or object include exposure to ultraviolet light (sun), temperature, humidity, bacteria/fungi/mould and chemicals, all of which can break down the DNA. The amount of DNA deposited, water/rain and interference with the sample (such as washing blood-stained clothing) can affect the quantity of cells present. The origin of the DNA may not be able to be determined in degraded or LT DNA samples where a profile cannot be fully resolved and therefore cannot be used for comparison. However, in high-quality DNA samples with just a few cells, a complete DNA profile can often be obtained. Table 2.2 contains the approximate success rates of DNA analysis for various types of biological material.

Table 2.2 Approximate success rates of DNA analysis for various types of biological material as determined from crime scene samples.

Biological material | Item | Approximate success rate
Blood | | 90%
Semen | | 90%
Saliva | Bottle – drink | 40%
Saliva | Bottle – alcohol | 25%
Saliva | Cigarette butt | 70%
Trace/Contact | Clothing | 30%
Trace/Contact | Objects – extended contact (e.g. mobile phone) | 15%
Trace/Contact | Objects – limited contact (e.g. handles) | 10%
Trace/Contact | Objects – single contact (e.g. smudge marks) | <10%
Hair | | 25%

History of DNA analysis

DNA analysis in Australia has advanced significantly over the past 20 years, resulting in previous forms of genetic testing, such as ABO blood grouping, enzyme typing (such as PGM), and serotyping of white blood cells (such as HLA-A, B, C, D) becoming redundant. Within DNA typing itself, progress has been such that first-generation DNA systems are now no longer in use in Australia and even early versions of PCR-based techniques have been superseded. The following provides a short history of this progression.

Up until the mid-1980s, genetic markers commonly used in forensic analysis consisted of blood and protein polymorphisms. These variations were commonly products of the coding regions of the DNA, such as the ABO blood grouping system (which codes for proteins on the red blood cells). Variations in the DNA sequence resulted in changes on cell surfaces, or in proteins within the body, which could be differentiated analytically.

Analysis of genetic markers relating to coding DNA was of limited value in forensic science. First, coding region variations may represent fatal mutations. Consequently, the amount of viable variation is limited (Holland et al. 1993) and differential typings, when obtained, were of relatively low statistical significance. Further, both blood and protein markers degrade easily when outside the body (Denault et al. 1980) and are not found in all body fluids (Nakamura et al. 1987). The relatively low variation, sample differences and instability in the environment led to a move towards analysing the DNA directly (Cooper et al. 1985). The first type of DNA analysis used Restriction Fragment Length Polymorphisms (RFLPs), resulting from the presence or absence of a restriction site. Enzymes were used to cut the DNA at a specific site (called the restriction site); where the restriction site was present in the sequence of DNA, the DNA would be severed to produce smaller fragments. Where the sequence of DNA did not include the restriction site, the DNA would remain intact. Differentiation was achieved on the basis of fragment size. However, large amounts of DNA were required (approximately 10ng) to obtain a result and again results were of low statistical significance.

Repetitive DNA

In 1980, using the advances made by RFLP analysis, Wyman and White found regions of non-coding DNA which contained repeat sequences. These sequences were highly polymorphic, recognised by multiple fragment lengths and became known as Variable Number of Tandem Repeats (VNTRs) (Nakamura et al. 1987). VNTRs consist of core units repeated in tandem; the number of times the core unit is repeated determines the length of the allele. VNTRs follow strict inheritance patterns (Jeffreys et al. 1985a, Jeffreys et al. 1985b) and show a high degree of variability between individuals, both of which are useful factors in forensic analysis as a means to differentiate one individual from another.

Non-coding regions of DNA mutate more readily than coding regions as there are generally no fatal consequences of mutations in non-coding regions. This lack of restriction against mutation allows the repetitive DNA to mutate, spread and diversify, increasing the number of polymorphisms throughout the genome (Ridley 1985, Cooper et al. 1985). This is further illustrated by the numerous areas of repeat regions scattered throughout the genome which total approximately 30% of the genome (Koblinsky 1992). RFLP-VNTRs were first analysed at a single locus and then at multiple loci (Jeffreys et al. 1985a); the latter resulted in a barcode-style image, which led to the potentially misleading phrase ‘DNA fingerprinting’. It is this DNA analysis method that was first used in a criminal investigation by Sir Alec Jeffreys in 1985 to assist in solving the rape and murder of two young girls in the United Kingdom (Napper 2000). Importantly, in this first application of DNA analysis to solve a crime, the results, whilst linking the two crimes, excluded the suspect who had confessed to one of the offences. The same technology was later used to inculpate the true offender.

Single-locus VNTR typing became relatively widespread due to its great discriminating power. However, there was little uniformity in Australasia, with at least three different enzymes being used in different laboratories, resulting in incomplete databases specific to particular jurisdictions (see for example Balazs et al. 1992).

The polymerase chain reaction

In addition to lack of uniformity, RFLP-VNTR analysis had numerous practical limitations such as having to analyse crime samples at the same time as reference samples for comparison, the length of time required for a typing, the requirement for large amounts of relatively intact DNA and highly complex interpretations. The advent of the Polymerase Chain Reaction (PCR) solved many of these problems. PCR is a method of amplifying/copying the DNA (Saiki et al. 1985). The double-stranded DNA is first heated to a specific temperature to separate the strands (denature), then cooled to a second specific temperature to allow the short sections of manufactured DNA known as primers to bind (anneal) to specific sequences on the DNA. For each section of DNA being analysed, two primers are required, one attaching on either side of the site of interest on the DNA molecule. At a third specific temperature an enzyme known as Taq polymerase facilitates the copying of the DNA by adding the components of the DNA to make each single strand a double strand (extend) using the G-C, A-T pairing code. It was the discovery of Taq polymerase that was the key factor in the development of PCR. Taq polymerase is an enzyme found in a bacterium (Thermus aquaticus) that functions at high temperatures and essentially drives the replication process. At the end of the extension step the copying process is complete and all DNA fragments are double stranded again.

By repeating the above heating and cooling process of denature, anneal and extend, the DNA can be replicated exponentially in vitro. In this way just a few copies of DNA can be made into millions or billions of exact copies. For example, starting with just one copy of the genome, after the first cycle there would be two copies, then 4, 8, 16, 32 and, after 28 cycles, around 268,000,000 copies (2^28). This represented an enormous advance in DNA analysis. PCR was originally applied to a coding region of the DNA, HLA-DQalpha (Amplitype User Guide 1990) but was soon applied to non-coding VNTRs. The method has many benefits over the RFLP-VNTR system, including the ability to obtain results from degraded samples, and the need for less DNA (less than 5ng in comparison to more than 50ng for RFLP analysis) (Urquhart et al. 1995). The process is also quicker and more cost effective (Edwards et al. 1992, Budowle et al. 1991).
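The doubling arithmetic can be checked with a short calculation. The following is a minimal sketch only; the function name and the optional efficiency parameter are illustrative assumptions and are not taken from any cited source. With perfect doubling, a single starting copy yields 2^n copies after n cycles.

```python
def copies_after_cycles(starting_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of copies after a given number of PCR cycles.

    efficiency = 1.0 (an assumption) means every template is copied in every
    cycle, i.e. perfect doubling; real reactions are less efficient.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return starting_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Print the growth from a single starting copy for a few cycle counts.
    for n in (1, 2, 3, 4, 5, 28):
        print(n, int(copies_after_cycles(1, n)))
```

Lowering the assumed efficiency below 1.0 shows how quickly the yield falls when amplification is inhibited.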

PCR also allows DNA typing of old samples such as ancient bone (Gill et al. 1993); highly degraded samples subjected to harsh environmental conditions such as burning and chemical assault (Whitaker et al. 1995, Martin et al. 1996); and more unusual sources such as cigarette butts and stamps (Hochmeister et al. 1991).

Short tandem repeats

VNTRs have large repeated sequences of non-coding DNA producing large fragments of amplified DNA. It was found that, if the repeated sequences were shorter, the amplified fragment would also be shorter and therefore less sensitive to DNA degradation in the sample. This type of DNA sequence was termed Short Tandem Repeats (STRs). STRs are repetitive DNA whose core sequence, between one and six nucleotides in length, is repeated in tandem (Oldroyd et al. 1995). They are analysed as length variations. Different people have different numbers of repeat core units and therefore will have a different length of DNA. STRs are hypervariable repeats that are prevalent in the genome and are easily amplified by PCR. As the repeat core sequences and total length of the amplified products are short, STRs are more reliable for analysing degraded samples (Gill and Evett 1995) and hence more useful in forensic casework. Reliable results can be obtained from less than 2ng of DNA and as low as approximately 100pg. Results have even been obtained from a single cell (Findlay et al. 1997).

The three main types of STRs (Urquhart et al. 1994) are illustrated in Figure 2.4 below. However, STRs can more generally be classed as either perfect, where the alleles differ in length only by the size of a repeat unit, or imperfect, where this is not the case (Urquhart et al. 1994). The nomenclature of STRs is determined by measuring the total length of the DNA fragment and dividing by the size of the repeat unit, e.g. four for tetranucleotide repeats (the most common type of STR used for human STR identification). Whilst the assumption for tetranucleotide repeats is that the allele consists only of four base pairs (bp) repeated in tandem, it does not always hold true for all alleles of all loci. For example, D21S11 is a complex STR that contains a 6bp insertion which is measured as 1.5 repeats and creates variant alleles, e.g. 32.2. However, the length of the allele is still calculated as 32 repeats and two additional base pairs even though it may be 31 repeats plus a 6bp insertion. This difference affects neither the end DNA profile result nor the subsequent comparison process.
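The naming convention for variant alleles can be illustrated with a short calculation. This is a sketch under a stated assumption: the input is the length of the repeat region only, with any flanking sequence already subtracted from the amplified fragment. The function name is hypothetical.

```python
def allele_designation(repeat_region_bp: int, repeat_unit_bp: int = 4) -> str:
    """Name an allele as 'complete repeats' or 'complete repeats.extra bases'.

    repeat_region_bp is assumed to exclude flanking sequence (an assumption
    made for this illustration).
    """
    if repeat_region_bp < 0 or repeat_unit_bp <= 0:
        raise ValueError("lengths must be positive")
    full_repeats, extra_bp = divmod(repeat_region_bp, repeat_unit_bp)
    return f"{full_repeats}.{extra_bp}" if extra_bp else str(full_repeats)


if __name__ == "__main__":
    # 31 tetranucleotide repeats plus a 6bp insertion is 130bp in total,
    # which is named as 32 repeats and two additional base pairs.
    print(allele_designation(31 * 4 + 6))
```

The example reproduces the D21S11 case described above: the designation depends only on total length, not on how that length arose.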

STR sites are given a designation to represent the site on the DNA molecule where they are located. This generally takes the form of D (for DNA), the number of the chromosome the site is found on, S (designating single, if the sequence only occurs once in the genome) and lastly how many sites had been found when the site was described. In the case of D21S11, the designator indicates that the locus is a DNA site on chromosome 21, only occurs once in the genome and is the 11th site described on chromosome 21.
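The structure of such a designation can be made explicit by splitting it into its parts. The sketch below handles only the single-copy ‘S’ form described above; the function and type names are hypothetical, and allowing X and Y as chromosome labels is an assumption.

```python
import re
from typing import NamedTuple


class LocusName(NamedTuple):
    chromosome: str      # chromosome on which the site is found
    serial_number: int   # order in which the site was described on that chromosome


# D, chromosome, S (single-copy sequence), then the serial number of the site.
_LOCUS_PATTERN = re.compile(r"^D(\d{1,2}|X|Y)S(\d+)$")


def parse_locus_name(name: str) -> LocusName:
    """Split a designation such as 'D21S11' into chromosome and site number."""
    match = _LOCUS_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a single-copy D-number designation: {name!r}")
    chromosome, serial = match.groups()
    return LocusName(chromosome, int(serial))


if __name__ == "__main__":
    print(parse_locus_name("D21S11"))
```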

The most commonly analysed STRs in forensic science are the tetranucleotide repeats whose core repeat sequence consists of four base pairs. Tetranucleotides are considered highly suitable as they are easier to distinguish on a gel (Edwards et al. 1991) than the shorter di- and trinucleotide repeats. Some pentanucleotide repeats have also been validated for forensic use.

STRs have one further and major advantage over longer VNTRs in that PCR analysis of multiple STR loci in one reaction vessel can be achieved by the use of a range of primers that function under the same conditions, thereby dramatically reducing the total analysing time and cost (Oldroyd et al. 1995). This type of reaction is termed a ‘multiplex’ PCR reaction. Previously, if ten STR sites were required for analysis, then ten separate reactions had to be conducted. All ten sites can now be analysed at once using multiplex PCR. The greater the number of loci, the greater the discriminating power of the system and the more powerful the system is at differentiating between individuals. STR multiplex PCR systems have been the DNA analysis method of choice in Australian forensic laboratories since the late 1990s.
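The effect of adding loci on discriminating power can be sketched numerically. The example below assumes that loci are inherited independently, so that the frequencies of the genotypes at each locus can be multiplied together; the frequencies used are hypothetical values chosen for illustration and are not drawn from any population database.

```python
from math import prod
from typing import Iterable


def combined_match_probability(per_locus_genotype_frequencies: Iterable[float]) -> float:
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    freqs = list(per_locus_genotype_frequencies)
    if not freqs:
        raise ValueError("at least one locus is required")
    for f in freqs:
        if not 0.0 < f <= 1.0:
            raise ValueError("frequencies must be in the interval (0, 1]")
    return prod(freqs)


if __name__ == "__main__":
    # Hypothetical genotype frequency of 0.1 at every locus.
    for n_loci in (1, 3, 10):
        print(n_loci, combined_match_probability([0.1] * n_loci))
```

Each additional locus lowers the chance that two unrelated individuals share the same profile, which is the sense in which more loci give greater discriminating power.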

Validation of DNA analysis methods

In Australia, the legal test for admissibility of scientific evidence varies between jurisdictions. However, in general terms it is expected that any scientific method be well established and have withstood scientific review prior to its application in a criminal case.

The requirement for a thorough validation process is also outlined in laboratory accreditation systems. Australian laboratories undergo accreditation via the National Association of Testing Authorities (NATA). NATA accreditation ensures laboratories adhere to a very high standard of practice; all NATA-accredited laboratories are regularly inspected to ensure compliance. For further information regarding NATA accreditation refer to their website:

http://www.nata.asn.au

Any new method or technology introduced into the laboratory must undergo a validation study prior to use in forensic casework according to NATA accreditation standards. The validation process must at least demonstrate that the method or technology operates as designed within the specific laboratory, but may include other studies depending on the method being introduced.

In terms of DNA testing kits and methodology, commercially available kits should have undergone an extensive validation process prior to release. Validation of STR DNA analysis methods includes studies of standard specimens, consistency, population studies, reproducibility, mixed specimen studies, environmental studies, matrix studies, non-probative evidence, non-human studies and minimum sample studies. These studies should be completed by the manufacturer prior to release of the PCR kit.

Before implementation in a laboratory, the STR PCR method should also undergo further ‘in-house’ verification to ensure the method operates as designed within the specific laboratory. Once all validation/verification procedures have been completed, a correctly used method of analysis can be relied upon in courts of law.


'simple', where core sequences are identical in sequence and length i.e. there is only one core sequence repeated X number of times

'compound', where there are two or more different core repeat units that are repeated X number of times

'complex', where the repeat units have a variable length and may be separated by intervening sequences

Figure 2.4 Illustration of the different types of Short Tandem Repeats (often called STRs).

The three types of STRs are displayed: simple, compound and complex.

DNA typing in Australia

The process of DNA typing generally involves the following steps: extraction, quantitation, amplification, electrophoresis and interpretation. Robotic technology is now commonly utilised at the various stages of the DNA typing processes in most Australian jurisdictions. Whilst robotic technology has allowed for higher sample throughput and a reduction in the risk of contamination, it is important to note that this technology allows for automation of only the liquid handling parts of the process; the remaining processing requires manual interaction. The following provides a brief overview of each of these steps and outlines the various methods used in Australian forensic laboratories for achieving each of the processes. It should be noted that this is not an exhaustive description of all the methods available for DNA analysis.

DNA extraction

As mentioned, nuclear DNA is present in the nuclei of cells contained in biological material (such as blood, semen and saliva). Inside the cells other cellular material such as mitochondria (see Figure 2.1) may be found. The DNA extraction process liberates the DNA from the cell and removes all other cellular material and fluid, leaving the purified DNA in solution.

There are a number of methods for extracting DNA which can generally be separated into organic and non-organic methods. The type of biological material and the surface on which it is deposited will determine the best technique for extraction. Most laboratories however favour (but are not limited to) a particular method for routine analysis and will modify this depending on the biological material or item on which it is deposited. For example, extraction of DNA from spermatozoa cells will first require steps to separate spermatozoa cells from non-spermatozoa cells. Likewise a cigarette butt with saliva deposited may require special attention due to chemicals in the cigarette which inhibit the PCR process. Whilst some of the methods described below are no longer used frequently in Australian jurisdictions, their inclusion is designed to be of use to readers dealing with historical cases.

Phenol-chloroform method

The basic organic extraction procedure uses a detergent (sodium dodecyl sulphate) to rupture the cells’ membranes and a protease which digests the proteins to release the DNA from the nucleus of the cell. The organic solvent phenol-chloroform then removes the proteins and other cellular material, leaving the soluble DNA in the aqueous portion of the extraction solution. This technique is particularly beneficial for removing organic inhibitors from degraded samples, such as (muscle) tissue from burnt remains, leaving good-quality, clean DNA. The procedure, however, uses hazardous chemicals (phenol-chloroform) and can be time-consuming.

Chelex method

The basic chelex method consists of boiling the sample in a 5% chelex solution. Cellular material, including proteins, is broken down during the boiling process and the DNA is released, denatured, into solution (Walsh et al. 1991). Chelex is a commercially available product made up of chelating resin beads (chelate is derived from the Greek word for ‘claw’ due to its ability to grasp (bind) metal ions). The beads have a high affinity for polyvalent metal ions (e.g. magnesium) that would otherwise degrade the DNA. The chelex method is simple and quick and is more cost-effective than organic extractions. There are also fewer steps required, which reduces the chance of contamination or transfer errors. However, the chelex resin beads must be completely removed from the sample for amplification as the beads inhibit the PCR process.

FTA paper analysis

FTA paper was developed by Leigh Burgoyne at Flinders University, South Australia. It is a cellulose-based paper which is impregnated with chemicals that lyse (break open) the cells on contact, bind the DNA to the paper and prevent the growth of mould and fungus which would otherwise degrade the DNA (Burgoyne 1996, Rogers and Burgoyne 1997). FTA paper is most useful in the collection of reference person samples and is used in most of the jurisdictions in Australia for the collection of reference samples (saliva or blood) in relation to forensic procedures legislation.

The FTA paper is particularly suited to Australian conditions as it provides a stable method for storing and transporting the DNA over long and often environmentally harsh distances. The paper also lends itself towards automation. Blood can be placed directly onto the card or saliva can be transferred via a circular foam swab. The paper is then washed to remove inhibitors and placed directly into the PCR reaction tube. There is no requirement to quantitate the sample due to the punching of a predetermined amount of card. An alternative to this method is to perform a chelex extraction on the card and this has been shown to be successful (Lorente et al. 1998).

Magnetic bead extraction

Commonly used in many Australian jurisdictions at the time of writing, magnetic beads can be used to bind DNA, in a similar manner to FTA paper. An initial extraction process, specifically tailored to the type of biological material, lyses the cells to release the DNA and so improves the yield. The magnetic bead resin is then added to the extraction solution, where it binds the DNA. Tubes containing the DNA in solution are placed next to a magnet to allow washing of the samples to remove any inhibitors and unwanted cellular material. With the use of chemicals, the purified DNA can then be released from the beads into solution ready for the PCR. The advantages of magnetic bead technology are that it can be used in automated systems; it provides a specific quantity of DNA, thereby eliminating the need for quantitation; it is highly efficient at binding small amounts of DNA; and it uses no harmful organic solvents.

DNA purification

After the initial extraction process has been completed, the DNA may be further purified and concentrated to increase the chance of obtaining a DNA profile. This is particularly useful in trace DNA samples where there may be few copies of the DNA in a large volume and where there may be an increased number of inhibitors not fully removed by the initial extraction process. The extraction solution containing the DNA is placed into a small column/tube containing a membrane. Buffer solution is added and the column is centrifuged. The membrane allows inhibitors to pass whilst retaining the DNA. In this way, the DNA can be cleaned and a set volume retained for further analysis.

Differential extraction

A modified version of the standard extraction process can be used for sexual assault cases where samples are thought to contain mixtures of seminal and other biological material. This process is designed to exploit the difference in structure between spermatozoa and other biological material in order to separate out the spermatozoa from the remainder of the biological material. The sample is subjected to incubation in buffer containing proteinase K, followed by centrifugation; as a result of this process spermatozoa cells, if present in the sample, form a pellet at the base of the tube with the remainder of the biological material remaining in solution. The spermatozoa-enriched pellet and spermatozoa-depleted solution are subsequently separated and the remaining steps in the DNA process are performed separately on both samples. The result of this analysis is that two separate DNA profile results are obtained, one from the spermatozoa-enriched fraction and the other from the spermatozoa-depleted fraction of the sample.

DNA quantitation

As the PCR process operates optimally given a defined concentration of DNA, it is useful (although not essential) to quantitate the DNA after it has been extracted. Most current PCR systems suggest approximately 1-2ng of DNA to ensure a complete DNA profile. However, much smaller amounts of DNA are routinely analysed. If too little DNA is added, no profile or a partial profile may be obtained. If too much DNA is added, the sample will be over-amplified, resulting in a loss of resolution due to artifacts; at the extreme, no result will be achieved due to an increased level of inhibitors. The main function of the quantitation process is to provide the scientist with some approximate information regarding the concentration of the DNA in the sample in order to maximise the chance of obtaining a DNA profile. There are numerous methods of DNA quantitation. The first of the methods outlined below is the predominant method in use by forensic laboratories in Australia at the time of writing; the other two methods have been used over the last decade and may be of relevance to historical, or so-called ‘cold’, cases.
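The way a quantitation result guides the amount of extract added to the PCR can be sketched as a simple calculation. The target of 1ng follows the range given above; the maximum volume of 10 microlitres and the function name are hypothetical values chosen for illustration, not parameters of any particular kit.

```python
def template_volume_ul(concentration_ng_per_ul: float,
                       target_ng: float = 1.0,
                       max_volume_ul: float = 10.0) -> float:
    """Volume of extract to add so that roughly target_ng of DNA enters the PCR.

    The volume is capped at max_volume_ul (a hypothetical reaction limit), so a
    dilute extract contributes less than the target amount.
    """
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return min(target_ng / concentration_ng_per_ul, max_volume_ul)


if __name__ == "__main__":
    # Hypothetical extract concentrations in ng per microlitre.
    for conc in (0.5, 0.05, 5.0):
        print(conc, template_volume_ul(conc))
```

A concentrated extract would in practice be diluted first, since very small volumes are difficult to pipette accurately.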

Real-time PCR quantitation

This method of DNA quantitation utilises the process of PCR and is called ‘real-time PCR quantitation’. Real-time PCR quantitation operates in the same manner as standard PCR, amplifying fragments of DNA using primers with fluorescent tags attached. The major difference is that the fluorescence is measured in real time whilst the amplification process is occurring, without the need to resolve the fragments using gel electrophoresis. The application of real-time PCR for the quantitation of DNA is based on the amount of fluorescence being directly proportional to the amount of DNA being amplified (Richard et al. 2003). Standards of known concentration are amplified and a standard curve created. The concentration of the DNA is then calculated by comparison to the standard curve. The main benefit is that the procedure is capable of accurately detecting samples even with very low levels of DNA (Alonso et al. 2003). Unlike previously used methods, samples displaying a negative result (no fluorescence after a particular number of cycles of heating and cooling) have no detectable DNA and would not proceed to STR PCR analysis. This saves valuable time and resources.
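The comparison to a standard curve can be sketched as a straight-line fit. The sketch assumes that each standard is summarised by the cycle number at which its fluorescence crosses a fixed threshold, and that this cycle number falls linearly with the logarithm of the starting concentration; both the summary measure and the standard values below are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(concentrations: Sequence[float],
                       threshold_cycles: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line: threshold cycle = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(threshold_cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, threshold_cycles))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(threshold_cycle: float, slope: float, intercept: float) -> float:
    """Read an unknown sample's concentration off the fitted curve."""
    return 10 ** ((threshold_cycle - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical standards (ng per microlitre) and their threshold cycles.
    standards = [10.0, 1.0, 0.1, 0.01]
    cycles = [24.0, 27.3, 30.6, 33.9]
    slope, intercept = fit_standard_curve(standards, cycles)
    print(estimate_concentration(29.0, slope, intercept))
```

A sample that never crosses the threshold has no value to read off the curve, which corresponds to the negative result described above.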

Slot blot quantitation

A membrane detection method commonly used for quantitating the DNA prior to the introduction of real-time PCR was the Quantiblot Human DNA Quantitation Kit (Applied Biosystems). The procedure can be summarised as follows. The DNA binds to a membrane, a labelled probe is added which binds to a human/higher primate specific site on the DNA molecule (D17Z1), and an enzyme is then added which binds to the probe. This structure of bound DNA, probe and enzyme can be visualised by either colorimetric or chemiluminescent detection methods. The chemiluminescent signal is captured on autoradiography film for visualisation, while the colorimetric detection forms a blue precipitate on the membrane after the addition of a reagent (chromogen TMB). The intensity of the light emission captured on the film or the intensity of the blue-coloured precipitate is directly proportional to the amount of DNA present (Walsh et al. 1992).

Bands produced are compared against bands produced from DNA of known amounts. The method is prone to subjective assessment of the comparison and provides approximate concentration only, particularly when the DNA is in a high concentration. Additionally, a negative result in this type of quantitation system does not mean that a DNA profile cannot be produced using PCR because the quantitation method cannot detect very low levels of DNA. However, the method does act as a guide to the scientist to approximate the concentration of the DNA obtained from the extraction process, allowing a subjective assessment of the amount of DNA to be added to the PCR reaction.

Gel quantitation

DNA can also be quantitated using gel electrophoresis (explained below), followed by various techniques to visualise the DNA so that the intensity of the DNA seen can be compared to known amounts in order to obtain an approximate concentration level of the DNA sample. One such method of visualisation is SYBR Green, a very sensitive dye, which, when mixed with DNA, fluoresces. The amount of fluorescence is directly proportional to the amount of DNA. The technique is simple and relatively inexpensive, but does not distinguish between human and non-human DNA. This method can detect very low concentrations of DNA and, unlike other methods, also provides information as to the extent of degradation of the DNA sample.

DNA amplification

Amplification of the DNA using the polymerase chain reaction (PCR) (Saiki et al. 1985) has revolutionised DNA analysis techniques. PCR is a method of copying the DNA by mimicking the biological processes in the cell using the structure of the double-stranded DNA as described previously. Using this copying process, results can theoretically be obtained from a single cell.

Current STR techniques ‘multiplex’ the PCR reaction. Multiplexing allows multiple primers to be placed into one reaction vessel, thereby copying numerous sites simultaneously. This requires less template DNA, an important advantage when analysing trace DNA samples. Primers can also be tagged with a fluorescent tag so that the segments of DNA can be detected using gel electrophoresis.


Figure 2.5 Image of the results of real-time PCR analysis for DNA quantitation. Replication cycle number of DNA is shown on the horizontal axis and amount of DNA detected (as a log) is shown on the vertical. The top image displays samples with good quality DNA, the curve illustrating an exponential increase in DNA. The bottom image displays samples with poor quality DNA, where no curve is produced as no significant increase in DNA is detected.

Gel electrophoresis

The PCR reaction produces multiple DNA fragments of varying sizes depending on the number of STR units in the allele. These fragments are then separated via a gel that sorts the fragments according to size. These fragments can then be compared against DNA fragments of known length to determine the allele length. This process is called gel electrophoresis and can be performed in a slab gel or capillary (a long small tube filled with gel) format. Instruments used to conduct this separation in forensic laboratories include the ABI Prism® 310, 377, 3100 and 3700 instruments (Applied Biosystems).

The DNA fragments move through the gel because of the negative charge on the phosphate groups of the DNA backbone. The DNA is repelled by the negative electrode (cathode) and is attracted to the positive electrode (anode). The smaller the DNA fragment, the more easily it moves through the gel. Longer fragments get caught up in the gel and move more slowly. Towards the end of the gel the DNA fragments pass a laser point. The laser excites the fluorescent tags attached to the primers on the amplified DNA. The tag emits light which is registered by a charge-coupled device (CCD) camera. The CCD camera converts the light emission to an electronic signal which is sent to a computer for analysis.

If multiple sites are amplified in the PCR (multiplex), the PCR products will contain the fragments of multiple loci, which may have overlapping allele sizes. These can be separated using different coloured dyes or fluorescent tags attached to the primers incorporated into each of the amplified fragments. The tags emit light of different wavelengths. The total light emitted by the different fluorescent tags is split into its respective wavelengths by the Prism. This allows the resolution of overlapping allele sizes which were amplified in one reaction.

Information relayed to the computer is collected by ABI Genescan® Collection software (Applied Biosystems) in the form of raw data. These data are then analysed by ABI Genescan® Analysis software (Applied Biosystems) using a time to distance travelled ratio and compared against known DNA fragment standards to allocate a size in base pairs for each fragment. Information is presented in a table and amplified fragments can be seen as peaks on a chart called an electropherogram. The overlapping loci appear as different colours defined by the different fluorescent tags (attached to the primers). The colours can be visualised together or separated.
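The sizing step described above can be illustrated with a minimal sketch. It assumes simple linear interpolation between the two standard peaks that flank an unknown peak; the sizing method actually used by the Genescan® software is not described in this chapter, and the standard sizes and migration times below are hypothetical values chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical internal size standard: (migration time in scans, size in bp).
SIZE_STANDARD = [(1000, 100), (1400, 150), (1780, 200), (2140, 250), (2480, 300)]


def estimate_size(migration_time, standard=SIZE_STANDARD):
    """Estimate fragment size (bp) by linear interpolation between the two
    standard peaks that flank the observed migration time."""
    times = [t for t, _ in standard]
    if not times[0] <= migration_time <= times[-1]:
        raise ValueError("peak lies outside the range of the size standard")
    # Index of the first standard peak that migrated later than the observation
    # (clamped so that a peak on the last standard still has an upper neighbour).
    upper = min(bisect_right(times, migration_time), len(standard) - 1)
    (t0, s0), (t1, s1) = standard[upper - 1], standard[upper]
    return s0 + (migration_time - t0) * (s1 - s0) / (t1 - t0)


# A peak detected halfway between the 100 bp and 150 bp standards.
print(estimate_size(1200))
```

Because the estimate depends only on the two flanking standards, a peak outside the range of the standard cannot be sized and is rejected rather than extrapolated.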

The fragments (now with known size in base pairs) can then be further compared against fragments of known size for each of the common alleles for each locus using ABI Prism™ Genotyper® or GeneMapper® ID-X software (Applied Biosystems) (Figure 2.6). The fragments of known length are called a ‘ladder’ and are supplied in the commercial kit (used for the amplification process). Each allele can be given its respective allelic designation by comparing to the known fragments. All allelic designations for each sample are collectively called the DNA profile (Figure 2.7). The benefits of this system of analysis are that it is an accurate and efficient method, a large number of samples can be analysed in a short amount of time and it allows resolution of alleles differing by just one base pair.
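Allelic designation against a ladder can be sketched as follows. This is an illustration only, not the logic of the Genotyper® or GeneMapper® ID-X software: the ladder sizes are hypothetical, and the ±0.5 bp matching window is an assumed value.

```python
# Hypothetical allelic ladder for one locus: allele designation -> size in bp.
LADDER = {"11": 150.0, "12": 154.0, "13": 158.0, "13.2": 160.0, "14": 162.0}


def call_allele(fragment_size, ladder=LADDER, window=0.5):
    """Return the designation of the ladder allele within `window` bp of the
    sized fragment, or None when the fragment matches no ladder allele."""
    best = min(ladder, key=lambda allele: abs(ladder[allele] - fragment_size))
    if abs(ladder[best] - fragment_size) <= window:
        return best
    return None  # an off-ladder peak that would need manual review


def call_profile(fragment_sizes, ladder=LADDER):
    """Designate every sized fragment observed at the locus."""
    return [call_allele(size, ladder) for size in fragment_sizes]


print(call_profile([154.3, 161.8]))
```

The hypothetical ladder includes an allele "13.2" lying 2 bp from its neighbours, which shows why sizing precision finer than one base pair is needed to tell such alleles apart.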

In 1999 the forensic laboratories of Australia (at the Biology Specialist Advisory Group (BSAG)) agreed to conform to ten specific sites for DNA analysis. This was the first step in the path to a national DNA database. The laboratories agreed initially to use the loci analysed in the Applied Biosystems AmpFlSTR Profiler Plus™ kit as the standard for prosecution-based analysis in casework. The agreement had several benefits as the buying power of the combined laboratories meant they could negotiate a reduced cost and profiles from each of the jurisdictions could be compared on a national DNA database.

The AmpFlSTR Profiler Plus™ PCR Amplification kit (Applied Biosystems cat # 4303326) is a multiplex PCR system comprising nine STR loci and the sex-indicating marker Amelogenin. The amplified products range in size from 107 bp to 341 bp. The STR loci are tetranucleotide repeats (repeat units of 4 bp in size), making alleles easy to distinguish from each other. However, some of the loci (vWA, FGA, D21S11 and D18S51) have insertions within the alleles and therefore the length of the alleles may differ by two base pairs (and occasionally one). Utilising the nine STR loci in the Profiler Plus™ kit provides a high combined level of discrimination between individuals.

At the time of writing, the Profiler Plus™ kit has been in use by Australian jurisdictions for approximately twelve years, allowing some 500,000 DNA ‘Profiler Plus’ profiles to be added to the National Criminal Investigation DNA Database (NCIDD) from across the country. However, the Profiler Plus™ kit is not without its limitations; at nine STR loci plus the sex-indicating marker, it has fewer markers than other kits that have been available for some years. Consequently, some Australian jurisdictions have introduced the use of kits such as Applied Biosystems’ Identifiler (16 sites) or Promega Corporation’s PowerPlex™16 (16 sites) into their arsenal of available tests. New Zealand is currently using the Second Generation Multiplex Plus (SGM Plus) Kit (11 sites). As many of the loci contained in the various kits are the same, some overseas laboratories interchange the kits depending on which kit is the most cost-effective at the time of purchase. Concordance studies have been conducted which have found little difference between results obtained from the kits (Budowle and Spreecher 2001, Budowle et al 2001a, Budowle et al 2001b).

A significant limitation of all of these kits, however, is that they use only four different coloured dyes to resolve overlapping alleles. Technological advances in recent years have resulted in the development of five-dye kits, allowing the inclusion of additional DNA markers and better separation between the markers. This in turn increases the discriminatory power of the DNA test.

In 2009 BSAG again met to discuss the possibility of abandoning the use of the Profiler Plus™ kit in favour of a more highly discriminating system. As was the case in 1999, consistency of approach was considered highly desirable due to a number of factors including buying power and the ability to compare profiles between Australian jurisdictions on a national DNA database. In addition, events of the last decade such as the 2004 Boxing Day tsunami and 2008 Philippines ferry disaster had highlighted the need to consider an internationally consistent approach so that international jurisdictional data exchange and matching might be scientifically possible (refer to the section on the application of DNA profiling).

The core set of DNA markers used by the US is referred to as the Combined DNA Index System (CODIS) marker set. CODIS is the FBI’s database, used to compare DNA profiles within and between US jurisdictions pursuant to the DNA Identification Act of 1994. The CODIS set consists of thirteen markers plus a sex-indicating site. All nine of the Profiler Plus™ loci are contained within the CODIS marker set.

In Europe, forensic laboratories are moving toward the use of a core set of sixteen markers (fifteen markers plus a sex-indicating site), following an agreement by the European Network of Forensic Science Institutes (ENFSI) (see Figure 2.8 for an example of the results obtained using the European marker set). This agreement arose as a result of an agenda set by ENFSI and the European DNA Profiling Group (EDNAP) to expand the core set of loci in use in Europe at the time in line with an agreed new set of core markers (see: Expansion of the European Standard Set of DNA Database Loci—the Current Situation, Peter M. Schneider, Profiles in DNA, 2009). The new European set of DNA markers has full overlap with the set of markers considered standard by Interpol (the ‘Interpol Standard Set of Loci’), in addition to a high degree of overlap with the US CODIS marker set (eight markers plus gender in common).

A number of new 5-dye DNA profiling kits have been developed in response to the ENFSI agreement, each allowing co-amplification and detection of the same core set of fifteen markers plus the sex-indicating site (Table 2.3). The NGM kit (Applied Biosystems) and PowerPlex ESX/ESI series of kits (Promega) are two such products. Both companies continue to develop these kits, with recent amendments involving the addition of the highly variable (discriminating), and therefore highly desirable locus SE33. In addition to the higher discriminating power, the kits are more sensitive and robust than their predecessors due to the inclusion of the miniSTRs, allowing for more DNA profile information to be gleaned from a smaller amount of starting material, and from material of lower quality of the type often encountered in forensic casework.

Whilst still the subject of discussion, it is anticipated that Australia will move unanimously to the introduction of the European core set of fifteen markers plus the sex-indicating site within the next couple of years. The highly desirable locus SE33 is likely to be viewed as an essential requirement for Australian jurisdictions. The availability of options in the market place will allow each Australian jurisdiction some flexibility in terms of which kit is chosen, without compromising the ability to share data for database purposes. This move would give Australian jurisdictions the advantage of a highly discriminating testing environment, access to the latest technology and the ability to share data internationally, whilst maintaining a degree of access to a competitive market. The transition from Profiler Plus™ to the new European set of markers would not be without challenge, as the two systems share only six markers plus the gender determining site. It is expected that the database matching of DNA profiles generated by the new kits with those generated via the use of Profiler Plus™ will show a degree of coincidental matching. Matches of this type may necessitate the reprocessing of the sample with the new kit in order to properly investigate the possibility of coincidental matching prior to reporting.
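The overlap between the two marker sets can be checked with a short sketch. The European loci are those listed in Table 2.3 (including SE33). The Profiler Plus™ locus list is not given in full in this chapter; the nine loci entered below are an assumption that should be confirmed against the manufacturer's documentation.

```python
# Autosomal STR loci only; Amelogenin (the sex marker) is common to both sets.
PROFILER_PLUS = {  # assumed locus list, see the note above
    "D3S1358", "vWA", "FGA", "D8S1179", "D21S11",
    "D18S51", "D5S818", "D13S317", "D7S820",
}
EUROPEAN_SET = {  # loci listed in Table 2.3
    "D1S1656", "D2S1338", "D2S441", "D3S1358", "D8S1179", "D10S1248",
    "D12S391", "D16S539", "D18S51", "D19S433", "D21S11", "D22S1045",
    "FGA", "SE33", "TH01", "vWA",
}


def shared_loci(kit_a, kit_b):
    """Return the loci typed by both kits, in sorted order."""
    return sorted(kit_a & kit_b)


common = shared_loci(PROFILER_PLUS, EUROPEAN_SET)
print(len(common), common)
```

Only the shared loci can contribute to a database comparison between a Profiler Plus™ profile and a profile generated with one of the new kits, which is why a match across the two systems carries less weight than a match within one system.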


Figure 2.6 Image of the comparison of alleles to a known control ladder in a Genotyper profile analysis. The ladder of common known alleles is on the top row. On the bottom row the first peak illustrates this is a female sample and the second peak shows that the individual has an allele 13 at that locus.


Figure 2.7 Image of the result obtained from a complete Genotyper profile analysis. The image displays the results for the nine loci and the sex indicating marker. One or two peaks can be seen at each locus. Boxes under the peak display the allele determined on top and the area of the peak underneath. The area is indicative of the amount of DNA present. The bottom row displays the internal standard of fragments of a known length, which are used to measure and determine the alleles.


Figure 2.8 Image of a profile obtained using a European loci DNA analysis kit. The image displays the results for the sixteen loci and the sex indicating marker. The box underneath the peak refers to the allele designation. Image courtesy of Dr Gavin Turbett, Forensic Biology Laboratory, Pathwest Laboratory Medicine WA.

Additionally, it has been reported that a linkage may exist between two loci of the new European set: D12S391 and vWA (Budowle et al. 2010). This means that the two loci may not be independently inherited and the results for one locus may indicate the results for the other. This linkage is particularly important where kinship is being determined, such as in missing persons or paternity cases. In these instances it is recommended that the results for one of the loci are not included in the interpretation and profile comparison.

Interpretation

Once DNA profiles are obtained from samples, they may be compared to one another. If differences are found, they are excluded as originating from the same source. If no differences are detected, they are said to have potentially come from the same source. A statistical weighting of this potential can then be calculated. The statistical analysis of DNA profiles is outside the scope of this Chapter.

Interpretation of DNA profiles is complicated by the fact that complete single-source DNA profiles are not always obtained. Often partial profiles (where results are not obtained at all sites) or mixtures (where alleles from more than one individual are found at one or more sites) are obtained. This is particularly prevalent in trace samples where there are low levels of DNA. A partial profile (see Figure 2.9) can still be compared to a DNA profile from a reference person sample using the loci for which reportable alleles are found and a statistical weighting given on the partial match. Reportable alleles are those which meet all criteria required by the laboratory for reporting the result in court. However, if very few alleles are reportable, the profile is of little evidentiary value as the statistical weighting is low. In some instances, a determination of male or female DNA may be all that can be stated and the DNA profile is said to be insufficient for comparison purposes.
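The comparison of a partial profile with a reference profile can be sketched as below. The sketch assumes that a locus is entered for the evidence sample only when complete, reportable results were obtained at that locus; the profiles shown are hypothetical, and the statistical weighting of a match is not attempted.

```python
def compare_profiles(evidence, reference):
    """Compare two profiles locus by locus.

    Each profile maps a locus name to a set of allele designations. Only loci
    with reportable alleles in the evidence profile are compared, so a partial
    profile is assessed on the loci it does contain.
    Returns a tuple (outcome, loci_compared).
    """
    compared = [locus for locus, alleles in evidence.items()
                if alleles and locus in reference]
    if not compared:
        return "insufficient for comparison", compared
    for locus in compared:
        if evidence[locus] != reference[locus]:
            return "excluded", compared
    return "not excluded", compared


# Hypothetical partial profile: no reportable result at FGA.
evidence = {"D3S1358": {"15", "16"}, "vWA": {"17"}, "FGA": set()}
reference = {"D3S1358": {"15", "16"}, "vWA": {"17"}, "FGA": {"21", "24"}}
print(compare_profiles(evidence, reference))
```

The list of loci compared is returned alongside the outcome because the weight of a partial match depends on how many loci contributed to it.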

If a mixed DNA profile is obtained (see Figure 2.10), it may be possible to separate the profile into major and minor components based on the intensity of the peak heights, which are approximately proportional to the amount of DNA present. Alternatively, statistical analysis can be performed taking into account each of the contributors to the mixture. If, however, the profile has multiple contributors identified by multiple alleles at each locus, statistical analysis may not be able to be conducted as most of the population would be unable to be excluded as having contributed to the profile. In these cases the profile is determined to be too complex to provide any identification information (see Figure 2.11) and has no evidentiary value.
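The reasoning applied to a mixed profile can be illustrated with a simple sketch. It relies on the fact that one person contributes at most two alleles at an autosomal locus, and it separates major from minor peaks using a height ratio. The ratio of 0.5 and the peak heights are hypothetical values for illustration and are not laboratory guidelines.

```python
from math import ceil


def min_contributors(profile):
    """Minimum number of contributors implied by the allele counts, given that
    one person carries at most two alleles at an autosomal locus."""
    most = max((len(peaks) for peaks in profile.values()), default=0)
    return ceil(most / 2)


def split_major_minor(peaks, ratio=0.5):
    """Split the peaks at one locus into major and minor components.

    `peaks` maps allele -> peak height. A peak is treated as minor when its
    height is below `ratio` times the tallest peak (an illustrative cut-off).
    """
    tallest = max(peaks.values())
    major = {allele for allele, height in peaks.items() if height >= ratio * tallest}
    minor = set(peaks) - major
    return major, minor


# Hypothetical mixed profile: allele -> peak height at each locus.
mixture = {
    "D3S1358": {"14": 1200, "15": 1100, "17": 200, "18": 180},
    "vWA": {"16": 1500, "18": 210, "19": 190},
}
print(min_contributors(mixture))
print(split_major_minor(mixture["D3S1358"]))
```

When peak heights at a locus are all of similar size, no such separation is possible, which corresponds to the complex profiles described in the text.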

Testing of sub-optimal DNA samples

Limitations associated with standard STR profiling of the type performed via the Profiler Plus™ kit include the need for DNA fragments of suitable integrity; in other words, DNA that is not highly fragmented or degraded. However, the biological material present on exhibits of the kind frequently examined by forensic laboratories is often highly compromised by exposure to less than ideal environmental conditions. The samples retrieved from such exhibits for DNA testing tend to contain degraded or poor quality DNA, thereby compromising attempts to amplify using conventional STR processes. Furthermore, as discussed previously, samples commonly encountered in forensic testing often contain inhibitors arising from substrate material; the presence of inhibitors further reduces the likelihood of successful amplification, making it almost impossible to produce DNA results of any evidential value.

One option used by several Australian jurisdictions to increase the chance of successful amplification of sub-optimal DNA samples is the analysis of miniSTRs. MiniSTRs are essentially smaller versions of STRs, where the size of the amplified fragment is considerably smaller than that produced via conventional STR techniques (~70-285 bp as opposed to ~100-350 bp). At the time of writing there are a number of commercially available multiplex kits containing miniSTRs, including Applied Biosystems’ AmpFℓSTR® MiniFiler™ amplification kit. This multiplex kit simultaneously amplifies 8 autosomal miniSTRs (D13S317, D7S820, D2S1338, D21S11, D16S539, D18S51, CSF1PO and FGA) and Amelogenin (see Figure 2.12). Some of these loci overlap with those in standard STR kits; as the same allelic designation is expected to be obtained regardless of which kit is used, data generated via the AmpFℓSTR® MiniFiler™ amplification kit are suitable for database comparisons. A number of Australian jurisdictions are using the AmpFℓSTR® MiniFiler™ amplification kit in conjunction with other STR kits in order to maximize the availability of DNA information obtained from compromised samples.

Table 2.3 Details of the loci contained in the European marker set kits.

Locus        Chromosome location   No of alleles   Mutation rate (%)
Amelogenin   X:p22.1-22.3          1               -
             Y:p11.2               1               -
D1S1656      1q42                  23              -
D2S1338      2q35                  23              0.12
D2S441       2p14                  12              -
D3S1358      3p21                  20              0.12
D8S1179      824                   13              0.14
D10S1248     10q26.3               13              -
D12S391      12                    20              -
D16S539      16q24.1               11              0.11
D18S51       18q21.33              43              0.22
D19S433      19q12                 16              0.11
D21S11       21q21                 70              0.19
D22S1045     22q22.3               17              -
FGA          4q28                  69              0.28
SE33*        6q14                  35              0.64
TH01         11p15.5               20              0.01
vWA          12p13.31              28              0.17


Figure 2.9 Image of a partial DNA profile. The image shows peaks appearing on the left hand side but getting smaller on the right hand side. Full results have been obtained for four of the loci plus the sex-indicating locus, partial results have been obtained for three loci and no results were obtained for two loci.


Figure 2.10 Image of a mixed DNA profile – displaying major and minor components. Major peaks are seen as tall and minor peaks are seen as short, small peaks.


Figure 2.11 Image of a mixed DNA profile – too complex to determine the origin. There are numerous peaks at each site which cannot be discriminated due to varying unrelated heights.


Figure 2.12 Images comparing the analysis of a crime scene sample using Profiler Plus™ (above) and the MiniFiler™ kit (below). The Profiler Plus™ results show virtually no information other than that the sample is from a male. The MiniFiler™ results show virtually a full DNA profile.

Alternatives to standard nuclear DNA analysis

Mitochondrial DNA analysis

Within the cell and outside of the nucleus lie the mitochondria (see Figure 2.2). These are subcellular units whose function is to provide the cell with energy. Within the mitochondria is a small circular DNA molecule, 16,569 base pairs in size. This DNA is termed ‘mitochondrial DNA’ (mtDNA). The mtDNA is inherited maternally via the female ovum alone, with none of the mtDNA coming from the paternal line; maternal relationships (kinship) can thus be determined by analysing the mtDNA.

There are over 1,000 copies of the mtDNA in each cell. The abundance or high copy number of the mtDNA is useful in small or degraded samples such as hair shafts or skeletal remains. Its main application in forensic analysis is in analysing samples where nuclear DNA testing has been, or is likely to be, unsuccessful (Isenberg 2002), although, depending on the status of the sample, results are not always obtained (Prieto et al. 2003). MtDNA analysis, however, is time-consuming and costly, largely due to the many measures required to minimise contamination (as there are over 1,000 copies per cell). To minimise the occurrence of contamination, samples are analysed one at a time with thorough cleaning procedures between samples; however, this makes the analytical process lengthy. In a number of jurisdictions in the world, mtDNA is analysed twice and only concordant results are reported.

The mutation rates of mtDNA are five to ten times higher than nuclear DNA. There is a large non-coding region of the mtDNA called the D-loop, within which are two hypervariable regions called hypervariable region 1 (HVI) and hypervariable region 2 (HVII). In forensic analysis, these regions are most often analysed by determining the DNA sequence or code (Wilson et al. 2002) (Figure 2.13). As with nuclear DNA, the results produced from reference samples and crime scene samples are compared. In mtDNA analysis (unlike nuclear DNA analysis), if three or more differences are located, the samples are deemed not to have come from the same source. The reason for this is that mtDNA displays heteroplasmy, where more than one allele may be seen at a site in the sequence and in some cases an individual may display a different allele at a particular site depending on the biological material tested. This can lead to problems in interpretation when dealing with mixed samples (Eustachio 2002).
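The sequence comparison can be sketched as a count of differing positions, applying the threshold of three or more differences stated above. The sketch assumes that the two sequences have already been aligned and are of equal length, and the short sequences shown are hypothetical; real HVI and HVII comparisons involve several hundred bases and must also account for insertions, deletions and heteroplasmy.

```python
def count_differences(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for base_a, base_b in zip(seq_a, seq_b) if base_a != base_b)


def interpret(seq_a, seq_b, exclusion_threshold=3):
    """Apply the exclusion threshold described in the text above."""
    if count_differences(seq_a, seq_b) >= exclusion_threshold:
        return "different sources"
    return "cannot be excluded"


reference = "ACGTACGTACGT"   # hypothetical reference sample sequence
crime_scene = "ACGTACATACGT"  # hypothetical crime scene sequence, one difference
print(count_differences(reference, crime_scene), interpret(reference, crime_scene))
```

Keeping the threshold as a parameter reflects the fact that a single difference may arise from heteroplasmy rather than from a different source.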

The statistical power of mtDNA is not as high as nuclear DNA. Many mtDNA profiles are relatively common in certain populations and, due to the direct inheritance of mtDNA along the maternal line, the value of even rare typings is limited by the size of the available databases.

MtDNA analysis has been validated for use in forensic casework (Wilson et al. 1995, Holland and Parsons 1999, Isenberg 2002), and has been introduced into evidence in the Australian criminal justice system¹. In R v Rye, evidence of comparative statistical evaluation was excluded, but evidence of a count of previously accumulated different profiles was admitted into evidence. However, it is likely that the predominant use of mtDNA analysis will be in the identification of remains for coronial purposes (such as in missing persons or disaster victim identification cases) rather than as a routine analysis method for criminal casework.

Y chromosome analysis

Just as mtDNA can be used to analyse the maternal lineage, Y chromosome markers can be used to analyse paternal lineage and, indeed, can be utilised for paternity testing. The Y chromosome is inherited from the paternal line only, with no influence from the maternal line, and therefore, like mtDNA, is not as variable as nuclear DNA. Recently, a number of STR markers have been developed and commercial kits made up of these markers are available (Butler et al. 2002, Redd et al. 2002, Sinha et al. 2003, Gusmao and Carracedo 2003, Sinha et al. 2003 and Kayser et al. 2004). The main forensic focus of Y chromosome analysis is to be able to discriminate male DNA in the presence of excessive female biological material. This is particularly useful in sexual assault cases or in cases where there are complex mixtures. Y chromosome STRs are analysed in the same way as nuclear STRs and numerous laboratories in Australia have validated and implemented one of two commercially available kits: PowerPlex Y™ from Promega or Yfiler™ from Applied Biosystems (Figure 2.14). The major advantages of using commercial kits are the same as those for nuclear STR kits: quality assurance of kit chemistry is maintained, results are reproducible between laboratories when the same kits are used, and standard reference results are available in the kit ladders. The main issue with Y STR results is that the power of discrimination in the weighting of a match is not as high as that obtained from nuclear STRs, as the level of variation is lower. Y chromosome STR analysis has been introduced into evidence in the Australian criminal justice system.

1 R v Loeber [2004] NSWSC 293 (14 April 2004) (where the mtDNA profiling was performed by the FBI); R v Rye [2006] VSC 5 (6 April 2006).


Figure 2.13 Image of the results of mitochondrial DNA analysis. The top image displays the sequence of the DNA with each peak representing a nucleotide. The bottom image is of a comparison of a number of translated sequences with the differences highlighted. Image courtesy of Dr Dadna Hartman, Victorian Institute of Forensic Medicine.


Figure 2.14 Image of the results of Y chromosome analysis. The image displays the peaks representing the STR allele(s) found at that site. Image courtesy of Dr Gavin Turbett, Forensic Biology Laboratory, Pathwest Laboratory Medicine WA.

Low Template DNA Analysis

Samples containing low levels of DNA are routinely found in casework and are termed low template DNA (LT DNA). These samples can be analysed via the normal PCR processes as discussed earlier in this chapter. However, an alternative that has been used successfully is to increase the number of times the DNA undergoes the heating and cooling processes of denature, anneal and extend, as described in the section on DNA amplification. As the amount of DNA increases exponentially with the number of PCR cycles, even one or two extra cycles can have a significant impact on the amount of DNA eventually copied. This may make the difference in obtaining a DNA profile. However, extending the number of cycles also increases the chance of co-amplifying contamination, and significant changes to laboratory work practices must be introduced in order to facilitate this type of non-standard protocol. LT DNA analysis is not currently offered by Australian laboratories due to the infrastructure requirements. NZ and the UK do offer a 34 cycle PCR analysis protocol; due to the strict laboratory conditions and methods required, this type of testing is expensive.
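The effect of additional cycles can be shown with a short calculation. It assumes ideal amplification, in which the amount of product doubles with every cycle, and it assumes a standard protocol of 28 cycles for comparison with the 34 cycle protocol; the 28 cycle figure is an assumption and is not stated in this chapter.

```python
def fold_increase(extra_cycles, efficiency=1.0):
    """Fold increase in PCR product gained from `extra_cycles` additional
    cycles. An efficiency of 1.0 means perfect doubling in every cycle."""
    return (1 + efficiency) ** extra_cycles


STANDARD_CYCLES = 28   # assumed standard protocol, for illustration only
EXTENDED_CYCLES = 34   # extended protocol mentioned in the text

print(fold_increase(2))                                  # two extra cycles
print(fold_increase(EXTENDED_CYCLES - STANDARD_CYCLES))  # six extra cycles
```

Under these assumptions two extra cycles give a four-fold increase and six extra cycles a 64-fold increase. Any contaminating DNA present is amplified by the same factor, which is why the extended protocol demands stricter laboratory conditions.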

DNA evidence obtained as a result of low copy number (LCN) analysis, as LT DNA analysis is also known, has been used in court in a number of jurisdictions worldwide. Notable high profile cases include the murder of Peter Falconio in the Australian outback² and the murder of the then Swedish foreign minister Anna Lindh. Use of the technique in the UK was halted following the Omagh bombing verdict,³ where the prosecution’s LCN evidence linking the suspect Sean Hoey to the bomb timers used in the attack was heavily criticized by the judge, who expressed concern about the scientific validity of the LCN technique. The UK’s Home Office Forensic Science Regulator subsequently commissioned Professor Brian Caddy to undertake a review of the science of LCN; released in April 2008, the review concluded that the technique used by the UK’s Forensic Science Service was “robust” and “fit for purpose”.⁴ Use of LCN technology in the UK has now resumed.

2 The Queen v Murdoch [2005] NTSC 76 (15 December 2005).

3 R v Hoey [2007] NICC 49 (20 December 2007).

4 Caddy B, Taylor GR, Linacre AMT (2008) A review of the science of low template DNA analysis. http://police.homeoffice.gov.uk/publications/operational-policing/Review_of_Low_Template_DNA_1.pdf

The application of DNA profiling

The application of DNA profiling has broadened to include not only the specific comparison of a suspect's DNA to DNA found at a crime scene, but also large-scale comparisons. DNA profiles from unsolved crimes are now placed onto DNA databases and compared to profiles from other unsolved crimes, suspects, convicted offenders and volunteers (Ayers et al. 2002). In order to conduct this type of matching, all jurisdictions in Australia have introduced ‘forensic procedure’ legislation which allows for DNA samples to be taken and for subsequent matching on a DNA database. CrimTrac, a Commonwealth Government initiative which was launched in June 2000, manages the National Criminal Investigation DNA Database (NCIDD). The NCIDD has numerous safeguards to protect the privacy of the individuals whose DNA profiles are contained on the database. Legislation in various Australian jurisdictions also includes heavy penalties for the misuse of the information held on the DNA database. Importantly, each jurisdiction can only see links from the database that are legally acceptable to both jurisdictions involved in the link.

DNA databases such as NCIDD have been used to provide valuable investigative information relating to a large number of unsolved crimes in many jurisdictions around the world. DNA database matching in the criminal context is also being performed on an international scale; in November 2006 Interpol announced that the 100th DNA hit had been recorded on its global DNA database.

In recent years the UK and USA have used DNA databases to conduct familial matching. Unlike conventional DNA matching whereby a hit is recorded as a result of an exact match between a person of interest and a crime stain profile, familial matching involves searching for a near match, the idea being that any person of interest identified through these means might be a close relative of the actual perpetrator. Yet to be performed in Australia, the use of databases in this manner is proving to be controversial with some arguing that it constitutes guilt by association.
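The difference between an exact match and a near match can be illustrated with a simple sketch. The search methods actually used by the UK and US databases are not described in this chapter; the criterion below, that a parent and child share at least one allele at every locus, is a simplifying assumption, and the profiles are hypothetical.

```python
def familial_screen(crime_profile, candidate):
    """Classify a database candidate against a crime stain profile.

    Each profile maps a locus name to a set of allele designations.
    'exact match': identical alleles at every locus typed in both profiles.
    'near match':  at least one shared allele at every such locus, as expected
                   for a parent and child (ignoring mutation).
    """
    loci = crime_profile.keys() & candidate.keys()
    if not loci:
        return "no comparison"
    if all(crime_profile[locus] == candidate[locus] for locus in loci):
        return "exact match"
    if all(crime_profile[locus] & candidate[locus] for locus in loci):
        return "near match"
    return "no match"


# Hypothetical two-locus profiles, for illustration only.
crime_stain = {"D3S1358": {"15", "16"}, "vWA": {"17", "18"}}
possible_relative = {"D3S1358": {"15", "17"}, "vWA": {"18", "19"}}
print(familial_screen(crime_stain, possible_relative))
```

With so few loci many unrelated people would also share an allele at each locus by chance, which illustrates why a near match can only ever be an investigative lead.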

DNA profiling has also been used effectively in DNA mass screens, where large numbers of volunteers are asked to provide a DNA sample for comparison to a specific unsolved crime. These samples are generally not placed on the database for matching against unsolved crime scene samples (most jurisdictions prohibit this from occurring). The suitability of an offence for a mass screen is determined by whether the profile obtained from the scene can be specifically attributed to the offender (such as in the case of a sexual assault) and whether a discrete portion of the population can be identified (such as the country town in the mass DNA screen at Wee Waa). The mass DNA screen in Wee Waa was Australia's most highly publicised mass screen. The sexual assault occurred on 1 January 1999 and on Saturday 8 April 2000, a mass screen was commenced. A total of 495 DNA samples were collected and there were eight refusals. The screen resulted in the arrest of the offender, who was later convicted of the offence.

DNA profiling has also been effective in Disaster Victim Identification (DVI) (Whitaker et al. 1995, Cash et al. 2003). These are incidents of multiple casualties (generally five or more) where the identification of the deceased is hampered, due to fire, dismemberment or other factors. DNA profiling was essential in the Bali bombing DVI incident in October 2002. Of the 202 persons deceased, 198 were positively identified. In addition, over 500 body parts were positively identified, most of which would not have been identified without the aid of DNA profiling. Other examples of the use of DNA profiling in DVI activities include the 2004 Boxing Day tsunami, the 2008 Philippines ferry disaster and, closer to home, the Black Saturday Victorian bushfires of 2009 where DNA analysis was instrumental in 114 cases for the purpose of victim identification.

DNA profiling has also been instrumental in the overturning of convictions through the work of the Innocence Project. The Project was initially established at the Benjamin N. Cardozo School of Law of Yeshiva University in New York City in 1992; there have been 252 post-conviction DNA exonerations in United States history across 34 US states.⁵ Innocence Projects have subsequently been established in a number of Australian jurisdictions.

Assumptions / Limitations

It is important to note a number of key assumptions or limitations when considering the results of forensic DNA analysis and interpretation. The first of these is highlighted by a recent Victorian case in which a conviction was overturned due to the possibility of DNA contamination at the point of sample collection.⁶ From a quality assurance point of view, it must be noted that the integrity of an exhibit depends largely upon the manner of its collection, packaging, storage and transport. The forensic laboratory is only able to account for the exhibit’s integrity for the duration of the laboratory’s holding of that exhibit.

5 http://www.innocenceproject.org/

6 See “Freed man Farah Jama angry over rape DNA bungle”, Milanda Rout, The Australian, 7th December 2009.

Further assumptions arise in relation to the manner of sampling of an exhibit. Where an exhibit is sampled in relation to trace DNA, it is important to note that the scientist will target areas of the exhibit deemed most likely to provide a DNA result of value; other surfaces of the exhibit, if sampled, might give rise to different DNA results. Further, re-sampling of the same area of an exhibit might yield results different from those obtained the first time; this is particularly likely where the exhibit housed very little DNA in the first instance, or where contributions of DNA from multiple sources are present. Some of the DNA collected in the first sampling event no longer remains on the exhibit; results will be dependent on the amount and nature of DNA present on the exhibit in the first place. It is also important to note that the examination strategy for a given exhibit or series of exhibits is dependent upon the contextual information provided to the laboratory. Additional or different information could have resulted in the selection of different sampling procedures, or indeed the examination of different exhibits.

A degree of variability also exists in relation to the DNA results obtained from a given sample if analysed several times. For example, one would expect a degree of variation in individual peak heights between analyses of the sample. It is important to note that differences of this nature would not affect the actual profiling results unless the sample consisted of low level contributions of DNA. Repeat analysis of low level DNA might fail to detect alleles in some samples; it is for this reason that laboratories utilize peak height threshold information to assist in the interpretation of DNA profiling results.
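The effect of a peak height threshold on repeat analyses of a low level sample can be sketched as follows. This is a minimal illustration only: the threshold value, the allele labels and the peak heights are hypothetical, and laboratories set their own thresholds through validation.

```python
# Illustrative sketch only: the threshold and peak heights are hypothetical values.
ANALYTICAL_THRESHOLD_RFU = 50  # assumed value; each laboratory validates its own


def call_alleles(peaks, threshold=ANALYTICAL_THRESHOLD_RFU):
    """Return the alleles whose peak height meets or exceeds the threshold."""
    return sorted(allele for allele, height in peaks.items() if height >= threshold)


# Two hypothetical replicate analyses of the same low level sample at one locus.
replicate_1 = {"14": 62, "16": 48}
replicate_2 = {"14": 71, "16": 55}

calls_1 = call_alleles(replicate_1)
calls_2 = call_alleles(replicate_2)

print(calls_1)  # ['14']
print(calls_2)  # ['14', '16']
```

In this hypothetical example the peak for allele 16 varies by only a few units between analyses, yet it is reported in one replicate and not in the other because it sits close to the threshold.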

Lastly, in general, it is not possible to determine the exact time at which biological material was deposited onto an exhibit. In addition, it is often not possible to determine the mechanism that led to the deposition of the biological material.

Legal challenges to DNA evidence

DNA evidence has been challenged on a number of grounds since its introduction to the Australian criminal justice system in 1989. Early challenges took the form of questions as to the reliability of the technology.⁷ Other challenges have been made regarding the manner in which a laboratory interprets or modifies manufacturer guidelines,⁸ or in relation to the failure by the manufacturer to publish details of validation studies, including primer sequences, in violation of accreditation requirements.⁹ Later, as DNA evidence became more prevalent in the criminal justice system, the challenge shifted to the area of statistical analysis.

DNA evidence has been challenged both with and without success on the grounds that prejudicial effect outweighed probative value.¹⁰ Hearsay challenges have been successfully argued in Victoria such that when this issue is raised, all persons involved with processing DNA samples are made available to the court.¹¹

Nowadays, in the experience of the authors, parties to legal proceedings generally accept DNA evidence without challenge to the technology used in generating the DNA profile, or to the appropriateness of the chosen statistical analysis; rather, doubt is cast over the manner of deposition of the DNA. DNA can be transferred legitimately as a result of events unrelated to a crime. Furthermore, biological material containing DNA is readily transferable between individuals and objects (direct transfer), and can be further transferred by the intermediary to another person or object (secondary transfer). A number of peer reviewed papers address this issue and provide a scientific framework in which assertions of these sorts can be assessed (for example, Goray 2010, Goray 2009).

7 Transcript of Proceedings, Police v Ricky Briggs (Children's Court (Criminal Jurisdiction), Magistrate O'Reilly, 19 November 2000); Transcript of Proceedings, R v Malcolm Andrew Prendergast and Kevin Roy Andrews (County Court, Judge Dee, 9 November 1998).

8 Hillier v R, ACT Court of Appeal, 2008; R v Farmer, NSW, 2008.

9 R v Gallagher [2002] NSWSC 462; R v Karger [2001] SASC 64 (29 March 2001).

10 R v Lucas [1992] 2 VR 109; R v Percerep [1993] 2 VR 109; R v Juric [2003] VSC 382 (13 August 2003).

11 R v Ryan [2002] VSCA 176 (1 November 2002).

Future directions

Whilst DNA analysis has progressed significantly over the last 20 years, research is ongoing with a view to increasing the efficiency and efficacy of DNA processing. Laboratories continue to look to improvements in the nuclear DNA process in order to be able to obtain results from degraded samples.

Laser microdissection systems

At the time of writing, a number of Australian laboratories are investigating the use of Laser Microdissection Systems in criminal casework; such systems allow efficient identification and precise isolation of desired cells within cell mixtures. The ability to separate out specific cell types from a background population of many cell types will increase the chances of obtaining single source DNA profiling results, thereby allowing analysts to interpret profiles with less difficulty and in less time. The technique is of particular use in sexual assault cases, where current methodology attempts to separate spermatozoa from complainant cells and often results in incomplete separation and mixed genotypes. Laser microdissection enables spermatozoa to be selectively excised from the background mixture of biological material, thereby reducing ambiguity in the DNA profiling results. It is particularly promising for sexual assault samples that contain minute quantities of seminal material; rather than the dominant cellular population swamping contributions from the minor contributor, selective excision and profiling of the spermatozoa should greatly improve the quality of genotypes obtained, resulting in more straightforward interpretation of DNA results.

Laser microdissection also has the potential to be applied to samples other than those containing a mixture of semen and other biological material. Where a mixture of biological material consists of either cells of the same type but from persons of different gender, or of cells of different types, the use of specific cell identification techniques such as histochemical stains and Fluorescence In-Situ Hybridisation (FISH) techniques should enable specific identification and excision of varying mixture components. As was the case in the sexual assault samples, this ability to separate out specific cell types will increase the chances of obtaining single source DNA profiling results, thereby allowing analysts to interpret profiles with less difficulty and in less time.

Body fluid identification using mRNA analysis

As discussed previously, deoxyribonucleic acid (DNA) is present in the nucleus of most cells. In order to translate the DNA code to produce proteins, the cell uses messenger ribonucleic acid (mRNA), which passes through the nuclear membrane and travels to the ribosome, where the information it contains is used to produce proteins. As each cell type has a specific function, the mRNA active in a particular cell type may be different. The differences in the mRNA can be used to identify the body fluid type from which the DNA is obtained, such as blood, urine, semen and saliva. The mRNA is extracted in the same way as DNA. The extracted sample can be split with a portion going to DNA testing and the other portion going to mRNA testing. In this way the type of cell from which the DNA profile was obtained can be determined. The mRNA aliquot must have the DNA removed prior to analysis to ensure the DNA does not interfere with the results. Once completed, the sample then undergoes a PCR process to target the variations in the mRNA code that would indicate the body fluid type. The New Zealand Government provider laboratory (Environmental Science Research) has developed a test called CellTyper 1 to identify saliva secretions, vaginal secretions, semen, menstrual blood and blood (Fleming and Harbison 2010a, Fleming and Harbison 2010b) (Figure 2.15). This test is expected to be released, after validation, at the end of 2010 for casework. Also in development is another kit, CellTyper 2, to identify faeces, brain and skin epithelial cells.

Single nucleotide polymorphisms

Single nucleotide polymorphisms (SNPs) are the most common type of variation in the DNA and can occur at a rate of approximately 1 in every 1,000 base pairs (Cooper et al. 1985). SNPs are single base pair changes at a particular point on the DNA, such as an adenine to a thymine, and are therefore biallelic markers (that is, they only have two alternatives, unlike STRs which have multiple alternatives). SNPs are commonly found in coding regions and therefore, when compared to other forms of genetic variation such as STRs (generally found in non-coding regions), have relatively low mutation rates. However, as SNPs do not currently provide any information regarding the individual, unlike other types of variations (for example STRs), their location in the genome (coding versus non-coding) is irrelevant. SNPs also follow stable inheritance patterns.

Due to their biallelic nature, SNPs individually have low discriminatory value, but when large numbers of SNPs are combined, the discriminatory value can equal that of STR multiplex systems. It is thought that approximately 50 SNP loci are required to reach a comparable discriminatory value to that of current STR multiplex kits (Chakraborty et al. 1999). Approximately 1.6 million SNPs have been identified to date.
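The way in which many weakly discriminating biallelic loci combine can be illustrated with a simple calculation. The sketch below is not the method of Chakraborty et al. (1999); it assumes Hardy-Weinberg proportions, independence between loci, and a hypothetical panel in which every SNP has an allele frequency of 0.5 (the most informative case). The function names are illustrative only.

```python
import math


def locus_match_probability(p):
    """Probability that two unrelated individuals share a genotype at one
    biallelic locus with allele frequency p (Hardy-Weinberg assumed)."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def panel_match_probability(allele_freqs):
    """Match probability across loci, assuming the loci are independent."""
    return math.prod(locus_match_probability(p) for p in allele_freqs)


single = locus_match_probability(0.5)         # one SNP on its own
panel = panel_match_probability([0.5] * 50)   # hypothetical 50-SNP panel

print(f"single locus: {single:.3f}")
print(f"50 loci:      {panel:.1e}")
```

Under these assumptions a single SNP gives a chance match in more than one pair in three, whereas the product over 50 such loci is vanishingly small. Real panels contain loci with less favourable allele frequencies, so the combined value would be less extreme than this idealised case.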

As the variation in SNPs is a single base pair, the target sequence for analysis is shorter than that for STRs. Therefore SNPs have the potential to increase the chance of a successful analysis of degraded DNA samples. The detection of SNPs is also relatively easy (Nicod and Largiader 2003) and can be combined with real-time PCR detection to produce fast and efficient analyses, without the need for gel electrophoresis. SNPs potentially offer a future routine DNA analysis system to replace or co-exist with STRs in forensic analysis.

Because of the dual advantages of being able to analyse even more degraded DNA, allied with fast (possibly real-time) analysis, there is considerable interest in SNPs. It seems inevitable that we will see systems emerge based on this technology. These may prove also to have the benefit of a much lower unit cost than the current technology. The timing of the emergence of such systems will be determined not by the science, but rather by commercial decisions; however, STR analysis will continue for some time as the value of SNPs is severely limited in mixtures arising from more than one individual. Hence there is a need to maintain STRs at some level.


Figure 2.15 Image of the results of mRNA analysis. The image displays positive results for the types of body secretions tested for using the CellTyper1 kit. In a typical crime scene sample only one or two of these peaks would be seen. Image courtesy of Rachel Fleming, Institute of Environmental Science Research, NZ.

Predictive DNA testing

Predictive DNA testing is a relatively new technique and is currently the subject of considerable research. The various methods under development use the information in the DNA code to predict the presence of certain physical features or ethnic background information of an individual. SNPs are usually targeted for testing in this type of research. Of the possible physical descriptors (also called phenotype or phenotyping), eye colour, hair colour (particularly red hair) and height have been targeted. It is important to note that the information generated as a result of this type of testing applies to a group (or class character) and not to an individual. Therefore the results produced indicate a percentage likelihood that the individual has a particular trait. Latest research indicates 7-8 genes have a strong association with eye (iris) colour, but this mainly relates to European populations as differences between the various shades of brown can be difficult to determine. An indicative result for hair colour can be obtained to separate hair colour into blond, brown, red and black to a predictive value of between 80-95%. Research looking at DNA code variations associated with height has identified 54 SNPs. However, these appear to account for only approximately 3.8% of the variation seen between individuals. Therefore the relevance of these results is still being assessed, as it appears that the markers are affected by environmental factors (Kayser and Schneider 2009). Emerging research is looking at skin pigmentation, the presence of freckles, baldness and facial dimensions (Decorte 2010).

Research has found that a predictive indication of the ethnic origin of an individual can also be achieved via SNP testing. Results to date indicate that information regarding a broad geographic region of origin can be determined, such as European, North African, African-American etc. (Ossorio 2006). This type of testing has been used in court in America and, significantly, in the Madrid bombing case, where use of predictive testing revealed unknown suspect remains to be of North African descent.¹²

The main issues with predictive DNA testing relate to the indicative nature of the results; consider, for example, the difficulties associated with predicting the ethnic origin of a perpetrator from a mixed ethnic background, and the ability of a perpetrator to change their appearance with ease, for instance via coloured contact lenses or hair dye. There is currently no legislative framework for predictive DNA testing in Australia; current legislation is silent on the topic. It has been suggested that public consultation and wider debate would be required prior to the development of legislative reform expressly providing for predictive DNA testing.¹³ At the time of writing, the only country to legislate for predictive DNA testing is the Netherlands, which amended the Dutch Code of Criminal Procedure to expressly provide for testing of this nature in 2003.¹⁴ Whilst there is no legislative framework expressly providing for the use of predictive DNA testing in North America and the UK, various forensic service providers in these jurisdictions do offer testing of this nature.¹⁵

12 Christopher Phillips, Lourdes Prieto, Manuel Fondevila, Antonio Salas, Antonio Gomez-Tato, Jose Alvarez-Dios, Antonio Alonso, Alejandro Blanco-Verea, Maria Brion, Marta Montesino, Angel Carracedo and Maria Victoria Lareu, ‘Ancestry Analysis in the 11-M Madrid Bomb Attack Investigation’ 4(8) PLoS One.

Familial matching

Familial searching involves the comparison of a crime scene profile to suspect and convicted offender profiles on a DNA database. The matching process looks for possible relatives of the source of the crime scene profile and therefore generally uses a different search algorithm to direct match searching. Results of the familial searching produce a list of candidates, in decreasing order of likelihood, who may be related to the source of the crime scene sample of interest. As each successive generation shares its DNA with the next, ancestry or kinship can be traced through the generations. The closer the relative, the more similar the DNA code; for example, a mother and child will share 50% of their DNA. Siblings, however, may share 0-100% of their DNA. This is due to the variation in the random combination of alleles that can be inherited from the mother and father.
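The difference between parent and child sharing and sibling sharing can be illustrated by enumerating the possible inheritance outcomes at a single locus. The sketch below is a simplified illustration, not a familial search algorithm: the four parental alleles are given hypothetical labels so that inherited copies can be tracked, and each outcome is assumed to be equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical labels for the four parental alleles at one locus.
MOTHER = ("m1", "m2")
FATHER = ("f1", "f2")


def child_genotypes():
    """All four equally likely genotypes a child can inherit at this locus."""
    return list(product(MOTHER, FATHER))


def shared_alleles(a, b):
    """Number of inherited allele copies two genotypes have in common (0, 1 or 2)."""
    return len(set(a) & set(b))


children = child_genotypes()

# A child always carries exactly one of the mother's two alleles.
mother_child = {shared_alleles(MOTHER, child) for child in children}

# Every pairing of two siblings' genotypes (16 equally likely combinations).
pairs = list(product(children, repeat=2))
counts = Counter(shared_alleles(a, b) for a, b in pairs)
sibling_sharing = {k: Fraction(v, len(pairs)) for k, v in sorted(counts.items())}

print(mother_child)
print(sibling_sharing)
```

At any one locus two siblings may therefore share neither, one or both inherited alleles (with probabilities 1/4, 1/2 and 1/4 under these assumptions), whereas a parent and child always share exactly one. Averaged over many loci, sibling sharing tends towards one half.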

Familial searching can, however, occur inadvertently where the result of a direct match search provides an ‘almost’ match, with perhaps 1-2 mismatches. This level of match stringency may indicate that the match has been made to a relative of the true source of the DNA profile. The main issue with familial matching is that the search process looks for an individual who is not associated with committing an offence (and not the actual perpetrator of the offence). Additionally, there are currently no accepted standards for familial searching or measurement of the weighting of a result. The method relies on the database being sufficiently large to contain possible relatives and, depending on the stringency of the matching process, a prohibitively large number of potential candidates may be generated. There are also some concerns that matches obtained as a result of this process may be unduly biased as they do not indicate an association with an offence, but may be used as such. Regardless, a number of international jurisdictions have implemented familial searching; whilst jurisdictional differences exist in terms of the underpinning legislative / policy framework, in general, the technique appears only to be used routinely in the most serious of crimes where conventional database techniques have failed to identify a person of interest. Use of familial matching has been instrumental in identifying offenders in a number of cases¹⁶ and has been used successfully in courts overseas, both to convict an offender and exonerate a previously convicted individual.¹⁷ Notably, familial matching is prohibited in Canada by the DNA Identification Act, which provides for exact matching as a prerequisite for the reporting of a putative offender’s identity.¹⁸

13 Tom Sherman AO, Report of Independent Review of Part 1D of the Crimes Act 1914 – Forensic Procedures [41.111].

14 Bert-Jaap Koops and Maurice Schellekens, ‘Forensic DNA Phenotyping: Regulatory Issues’ (2007) Tilburg University Legal Studies Working Paper No 005/2006, 11.

15 For example, the UK’s Forensic Science Service provides an ethnic inference test and a red hair test.

In Australia there is currently no legislative framework in any jurisdiction specifically relating to familial searching; the legislation is simply silent on the topic. Various reviews of jurisdictional legislation have indicated the need for legislative amendment prior to the introduction of familial searching and call for any change in this direction to be supported / informed by public consultation and debate.¹⁹ However, an alternative view has been mooted in WA, where it has been suggested that the current legislative framework supports searching of this kind.²⁰ Potential exists for the development of a familial searching framework through delegated legislation in both NSW and the Northern Territory, where relevant instruments give the executive arms of government the power to regulate expansion of DNA databases.²¹ As far as the authors are aware, at the time of writing, familial searching has not been judicially tested in Australia.

16 R v Reekers HC Auckland CRI 2008-029-658 [2010] NZHC 540 (20 April 2010); see also http://www.dnaforensics.com/FamilialSearches.aspx.

17 http://www.dnaforensics.com/FamilialSearches.aspx.

18 DNA Identification Act, S.C. 1998, c. 37, s. 6.

19 Victorian Parliament Law Reform Committee, Forensic Sampling and DNA Databases in Criminal Investigations, Report No 58 (2003 – 2004), 124; Australian Law Reform Commission, Essentially Yours: The Protection of Human Genetic Information in Australia, Report No 98 (2003) [39.60]; Tom Sherman AO, Report of Independent Review of Part 1D of the Crimes Act 1914 – Forensic Procedures [7.17].

20 Criminal Investigation (Identifying people) Act 2002 (WA) Statutory Review.

21 s92(2)(j) Crimes (Forensic Procedures) Act 2000 (NSW) and s 147C Police Administration Act (NT).

Expert DNA Interpretation Systems

A relatively recent development in the field of DNA is that of the expert interpretation system. Designed to streamline the process from data collection to assessment of the weight of evidence, these systems partly automate many steps in the process previously performed manually, thereby providing a high throughput service and elevating the degree of objectivity associated with the process of DNA interpretation. At the time of writing there are a number of products on the market; one such product is Cybergenetics’ TrueAllele® DNA Mixture Interpretation software, which is currently in use in several US jurisdictions and was used to process samples as part of the Disaster Victim Identification efforts associated with the World Trade Centre disaster in New York. Collectively, the Australian jurisdictions are currently considering their position regarding potential adoption of expert DNA interpretation systems into routine casework.

Non-human DNA analysis

Another area of development is in non-human DNA analysis. This type of testing focuses on determining the species of a sample (e.g. when an unknown hair is found at a scene), parentage analysis (e.g. comparison of the alleged offspring of a regulated species to the alleged parents to determine if the animal was captive bred or illegally wild caught) and animal individualisation (e.g. to determine whether hairs found on a suspect are from a specific animal). The methods used are performed in a similar manner to human DNA analysis as described previously. DNA analysis tests to determine the species have generally favoured, but are not limited to, sequence analysis of the mtDNA. This is due to a conserved region on the mtDNA that can be analysed in many different animal species. The sequence code is developed for the sample and then compared to known sequences so that a determination of the species can be made (Bellis et al. 2003). The most common sites currently used are Cytochrome b (cyt b) and Cytochrome oxidase I (COI) (Alacs et al. 2010, Ogden et al. 2009, Wilson-Wilde et al. 2010). The latter is the target sequence used by the Barcode of Life Initiative, whose aim is to sequence every living species and place it on the Barcode of Life Database (BOLD) (Stoeckle and Hebert 2008). BOLD can then be used by scientists to search sequences for identification purposes. There is some debate as to whether the COI marker is capable of this function (Moritz and Cicero 2004, Will and Rubinoff 2004).
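The comparison of a sample sequence to known reference sequences can be sketched as follows. This is a toy illustration only: the reference names and sequences are invented placeholders rather than real cyt b or COI data, the sequences are assumed to be already aligned and of equal length, and casework comparisons rely on validated reference databases and alignment tools rather than a simple site-by-site count.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical reference sequences (placeholders, not real barcode data).
REFERENCES = {
    "species_A": "ATGACCAACATTCGAAAA",
    "species_B": "ATGACTAACATCCGAAAG",
    "species_C": "ATGACAAATCTACGTAAA",
}


def best_match(query, references):
    """Return the reference name with the highest identity to the query, and its score."""
    scores = {name: percent_identity(query, seq) for name, seq in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


query = "ATGACCAACATTCGAAAG"  # hypothetical sequence from an unknown sample
name, score = best_match(query, REFERENCES)
print(name, round(score, 1))
```

A closest match is only meaningful if the reference set contains the true species and its close relatives, which is why the coverage and quality of the reference database matter as much as the comparison itself.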

Further research has looked at conducting DNA analysis to individualise animals of a specific species, such as dogs and cats, to determine from which dog a particular sample originated, in the same way we discriminate between humans (Fredholm and Wintero 1996). These types of tests can use STR analysis and operate in the same way as human STR analysis and are generally offered by private laboratories, universities or museums. STR animal DNA testing has been accepted into court (Menotti-Raymond et al. 1997, Clarke and Vandenberg 2010).

Other types of biological testing, such as single nucleotide polymorphisms, may also be used to determine animal individualisation within a species. These types of DNA tests are currently conducted in Australia by some universities and also by laboratories overseas; research to develop methods for use in prosecution-based laboratories is also occurring in Australia and can assist in resolving mixed samples (Tobe and Linacre 2010).

The ability to determine the animal of origin of a sample can be important in some criminal cases, such as where non-human hairs are found on a murder victim. Whilst hair analysis can be conducted to determine the animal, it would be beneficial to be able to compare via DNA analysis the DNA profile obtained from the hair to the DNA profile from a specific animal, say a dog belonging to the suspect. Non-human DNA analysis could also have implications in cases of illegal poaching and exportation of endangered species. One of the biggest issues is ensuring the tests used in court have undergone sufficient validation processes and results are backed by a robust database of known reference samples.

New Technologies

Technology is constantly improving. The future of DNA analysis lies, in part, in portable devices, in which lab-on-a-chip technology will bring the laboratory to the crime scene. Prototypes of these types of devices, where the chip is approximately the size of a palm, have already been displayed at conferences such as Pittcon 2003 and published data are also available (Heller 2002, Osafune and Baba 2004). A future scenario may be that a scientist locates a blood stain at a crime scene, analyses the sample at the scene in less than half an hour, uplinks the result to a remote DNA database, obtains a match after just a few seconds and provides the investigator with a possible identity of the perpetrator at the scene.

Matrix-assisted laser desorption-ionization time of flight mass spectrometry analysis (MALDI-TOF-MS)

Matrix-assisted laser desorption-ionization time of flight mass spectrometry (MALDI-TOF-MS) is a new technology available to analyse DNA. The technology can be applied to various types of DNA marker, including STRs. The system separates DNA ions in a vacuum environment according to their mass to charge ratio. Essentially the technology operates by mixing the DNA with chemicals, spotting the solution onto a plate and air drying the mixture to form a crystallised structure, which prevents the DNA from breaking down in later phases. The plate is then placed into a vacuum environment and pulsed with a laser to ionize the DNA crystal matrix. The DNA ions enter a tube and travel to a detector where the mass of the ion is measured. STR alleles can be distinguished because alleles with different numbers of repeat units have different masses. As the system measures mass it is extremely accurate and can detect base variations, as the individual DNA bases have different masses. The method takes a few seconds per sample and therefore has the potential to have a much higher throughput than current gel electrophoresis separation techniques. Another variation of this technology is electrospray ionization time of flight mass spectrometry (ESI-TOF-MS). The technology is in the development stage and at time of writing is yet to be validated for use in casework.
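The principle that both repeat number and base substitutions alter the mass of a DNA fragment can be shown with a short calculation. The sketch below is illustrative only: it uses approximate average masses for nucleotide residues in a single DNA strand, ignores end-group corrections and ionisation effects, and the repeat motif and flanking sequence are hypothetical rather than taken from any casework locus.

```python
import math

# Approximate average masses (in daltons) of nucleotide residues within a DNA
# strand; assumed textbook values, with end-group corrections ignored.
RESIDUE_MASS_DA = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def strand_mass(sequence):
    """Approximate mass of a single DNA strand as the sum of its residue masses."""
    return sum(RESIDUE_MASS_DA[base] for base in sequence)


REPEAT = "AGAT"  # hypothetical tetranucleotide repeat motif
FLANK = "GCTA"   # hypothetical flanking sequence

allele_7 = FLANK + REPEAT * 7
allele_8 = FLANK + REPEAT * 8

# Alleles differing by one repeat unit differ by the mass of that unit.
repeat_shift = strand_mass(allele_8) - strand_mass(allele_7)

# A single base substitution (here T to A) produces a much smaller mass shift.
snp_shift = strand_mass("GCAAT") - strand_mass("GCTAT")

print(round(repeat_shift, 2))
print(round(snp_shift, 2))
```

Under these assumptions, adding one repeat unit shifts the mass by well over a thousand daltons, whereas a single substitution shifts it by only a few daltons, which is why high mass accuracy is needed to detect base variations.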

Abbreviations

A       adenine
AB      Applied Biosystems
bp      base pair
BSAG    Biology Specialist Advisory Group
C       cytosine
CCD     charge-coupled device
DNA     deoxyribonucleic acid
DVI     disaster victim identification
G       guanine
kb      kilobase (1000 base pairs)
LCN     low copy number
mtDNA   mitochondrial DNA
NATA    National Association of Testing Authorities, Australia
NCIDD   National Criminal Intelligence DNA Database
ng      nanogram (one billionth of a gram)
NIFS    National Institute of Forensic Science
PCR     polymerase chain reaction
pg      picogram (one trillionth of a gram)
RFLP    restriction fragment length polymorphism
SNP     single nucleotide polymorphism
STR     short tandem repeat
T       thymine
VNTR    variable number tandem repeat

Glossary

allele                     alternative at a site on the DNA molecule; one alternative is inherited from each parent.
amino acid                 one of 20 or more different molecules which can combine together to make proteins.
amplification              a method of copying sections of the DNA exponentially. Can be targeted to specific sites on the DNA molecule.
autosome                   chromosome not involved in sex determination. Humans have 22 pairs of autosomes.
base pair                  a subunit of the double-stranded DNA structure, where two nucleotides are bonded together from each of the complementary DNA strands. There are over three billion base pairs in the human genome. Also used as a measurement of the length of the DNA fragment.
chelex                     commercially available resin bead used in the extraction of DNA from biological material.
chromosome                 a discrete unit of the genetic material carrying genes and arranged into structures which can be visualised during cell division.
CrimTrac                   Commonwealth Government agency responsible for the National Criminal Intelligence DNA Database (as well as other databases).
degraded DNA               DNA which has been broken into smaller pieces, potentially hampering the chances of obtaining a DNA profile. Degradation can be caused by environmental factors (heat, humidity, ultraviolet light etc), bacteria, fungi, mould or chemicals.
deletion                   a mutation where a portion of the DNA code has been removed from the sequence.
denature                   separation of double-stranded DNA to produce two strands.
DNA typing                 the process of DNA analysis to obtain a DNA profile.
double helix               the structure of the DNA. Two complementary strands of DNA bond together and twist to form a helix structure; this is called the double helix.
electrophoresis            a method of separating molecules based on size and charge. Often used to separate DNA fragments of varying length.
enzyme                     a protein which causes or increases a chemical reaction.
gene                       a site on the DNA molecule. A sequence of the code inherited for which there is a functional product.
genetic code               the sequence of nucleotides on the DNA molecule. A group of three nucleotides provides the information for the production of a specific amino acid.
genome                     the entire DNA molecule of an organism.
genotype                   characterisation of alleles at a particular site.
heterozygote               where the two alleles possessed by an individual at a particular site are different.
homozygote                 where the two alleles possessed by an individual at a particular site are identical.
hybridisation              the act of two complementary strands of DNA coming together to form a double-stranded molecule. Two complementary strands of DNA will hybridise together.
hypervariable              extremely variable.
insertion                  a mutation where a sequence of code has been inserted into another sequence or where a sequence has been altered by the addition of an extra nucleotide or nucleotides.
low template DNA           minute quantities of DNA in samples obtained from a surface or object. Can include saliva or skin cells on clothing, objects etc, extremely diluted blood or semen; see also trace DNA.
locus                      site on the DNA molecule (plural: loci – pronunciation: loci).
mitochondrial DNA          DNA obtained from the mitochondria of the cell.
mitochondrion              a subcellular unit within the cell which provides the cell with energy (plural: mitochondria).
molecule                   a chemical substance consisting of atoms bound together in a specific structure.
multiplexing               a method of amplifying multiple sites on the DNA molecule in one reaction vessel.
nomenclature               a system of naming.
nuclear DNA                DNA obtained from the nucleus of the cell.
nucleotide                 one subunit of one strand of the DNA.
nucleus                    subcellular unit within animal and plant cells containing the chromosomes.
polymerase chain reaction  see amplification.
polymorphism               DNA variations within a species or group (descriptive: polymorphic).
primer                     synthesised sequences of DNA that are complementary to a specific segment of the DNA on either side of the area of interest and are used in the PCR process.
profile                    one or more genotypes used for DNA comparison.
protein                    complex organic molecules essential for life. Made up of amino acids.
sex chromosomes            chromosomes involved in the determination of the sex of an individual. Females possess two X chromosomes and males possess one X and one Y chromosome.
stochastic effect          imbalance in the amplification of two alleles competing during PCR, where one allele is preferentially amplified over the other.
supernatant                liquid portion of a solution which can be separated after centrifugation or precipitation.
trace DNA                  see low template DNA.


Chapter 3

This chapter has been removed due to copyright restrictions.

This chapter is available online as:

Wilson‐Wilde L, Norman J, Robertson J, Sarre S, Georges A. (2010) Current issues in species identification for forensic science and the validity of using the cytochrome oxidase I (COI). Forensic Science Medicine and Pathology. 6:233‐241.

Links to this chapter:

Print http://webpac.canberra.edu.au/record=b1610336~S4
Online subscribed content (UC community)
Online general public http://www.springerlink.com/content/glt70p334544u638/
DOI 10.1007/s12024‐010‐9172‐y

Chapter 4 – Evaluating the phylogenetic utility of nuclear markers in Diprotodontia

Photo: Thylogale billardierii (Red-bellied Pademelon), a species within the order Diprotodontia. Photo by Linzi Wilson-Wilde.

Chapter 4 – Evaluating the phylogenetic utility of nuclear markers in Diprotodontia.

Abstract

Informative nuclear markers have been shown many times to be useful in phylogenetic studies by resolving relationships between taxa. The order Diprotodontia often displays unclear separation of taxa at the family level and the phylogenetic structure at species level, particularly in the macropods, is also largely unresolved. To date, a survey of a broad range of nuclear markers for their phylogenetic utility has not been conducted in Diprotodonts.

One hundred and two sets of primers for nuclear markers obtained from the literature were evaluated for phylogenetic performance using a representative phylogeny of Diprotodonts. Eighteen nuclear markers were identified as having potential for further work and of those, nine were optimised and analysed using parsimony, likelihood and Bayesian approaches. These analyses indicate that these markers will be useful in future studies.

Introduction

Accurate species identification is best underpinned by robust phylogenetic information on the species of concern and its close relatives. Traditionally, species identification and phylogenetic reconstruction are performed using morphological (traditional taxonomy) or mitochondrial DNA data. Whilst each source of data has its strengths, they can produce conflicting results (Hillis 1987, Shaffer et al. 1991). Morphological data integrates genomic differences but is subject to problems of moderation by the environment in which genes are expressed (phenotypic plasticity) and correlated evolution of functionally linked traits (Georges et al. 1998). Mitochondrial DNA (mtDNA) markers are from a single maternally inherited unit and as such may mislead through independent lineage sorting and undetected instances of hybridisation and introgression (Brower 2006). These factors may affect the ability of a mitochondrial marker to attribute a sample to the correct species.

Rapid advances in DNA technology have led to the characterisation of an astonishing number of potential nuclear DNA markers with the promise of new independent datasets to complement traditional approaches based on morphological and mtDNA data. The use of multiple marker sets is beneficial as limited gene sampling can lead to errors in phylogenetic reconstruction (Page and Charleston 1997). Research has suggested that the more independent markers used, the more accurate the resulting phylogeny may be (Rokas and Carroll 2005). Coupling nuclear and mitochondrial markers also provides a more robust approach to establishing phylogenies and in turn provides a more robust method for the identification of species (Brower et al. 1996, Maddison 1997, Page 2000, Sites et al. 1996). However, before appropriate nuclear markers can be identified, the phylogenetic utility of the markers must be thoroughly understood. Nuclear markers evolve more slowly than mitochondrial DNA and are less susceptible to base-composition bias, but amplification of the sequence information can be hampered by the presence of introns, duplicate copies of the sequence and a lack of conserved primer binding sites (Sanderson and Shaffer 2002). Potential markers must also be tested for independence and any incongruence between gene and species evolution (Page and Charleston 1997).

Identifying the most appropriate nuclear marker for use can be problematic; a scan of the literature produces hundreds of potential candidates. Lyons et al. (1997) developed 318 primers for conserved comparative anchor tagged sequences (CATS) and optimised these for amplification in domestic cats. They demonstrated that 79 of these CATS primers could also be used to amplify introns in numerous other mammals, including the marsupial Macropus eugenii (Tammar Wallaby). However, it is unknown whether the primers that were successful in the wallaby would also amplify in other marsupial species. A broad study investigating the phylogenetic utility of a wide range of nuclear markers, taking into account the ease of amplification and sequencing, to determine the variation rates of the markers and their suitability for discrimination in Diprotodontia has not been conducted to date. In this study I review the 79 CATS in a broad range of Diprotodont species and a further 26 nuclear markers, chosen from a literature search for their phylogenetic utility, identifying nuclear markers that may be phylogenetically informative in Diprotodonts.

I report on potential nuclear markers obtained from the literature and investigate the phylogenetic utility of a selected subset of these in the order Diprotodontia as a mechanism to inform further studies in a broad range of areas including ecological studies, phylogenetics and species identification.

Material and Methods

Samples were obtained from vouchered and non-vouchered specimens, usually tissue (heart, liver), from various collections (Table 4.1). Taxon sampling was aimed at the creation of a test phylogeny to evaluate the performance of the nuclear markers. The test phylogeny and identification of known clade information (taken from the most recent phylogenetic study of Meredith et al. 2009) can be found in Figure 4.1. All tissue samples were extracted using either standard phenol-chloroform procedures or the Qiagen DNeasy® Blood and Tissue Kit (Qiagen part # 69506), following the animal tissue bench protocol. Negative controls were used in all extractions.

A total of 102 primer pairs (Table 4.2) were applied to a representative species from each of eight Diprotodont families to select markers for further optimisation and application across a broader range of species. The species selected for initial investigation were Phascolarctos cinereus (Phascolarctidae), Vombatus ursinus (Vombatidae), Trichosurus vulpecula (Phalangeridae), Burramys parvus (Burramyidae), Pseudocheirus peregrinus (Pseudocheiridae), Petaurus breviceps (Petauridae), Bettongia tropica (Potoroidae) and Macropus eugenii (Macropodidae). Taxonomic designations follow Meredith et al. (2009).

PCRs were performed in a 25 µL reaction using a master mix made up of 12.5 µL Go Taq Green® Master Mix 2X (Promega cat # M712C), 0.75 µL primer (Sigma Aldrich, refer Table 4.2 for primer sequences) and 0.3 µL 25 mM magnesium chloride (Qiagen) per sample. 12.5 µL of master mix was added to a 96 well plate (cat # T323-96N), after which 12.5 µL of genomic DNA was added to the appropriate wells. Samples were amplified under the following reaction conditions: initial incubation of 94 °C for 3 min; followed by 40 cycles of 94 °C for 30 s, 50-57 °C for 30 s and 72 °C for 60 s; and a final extension step of 72 °C for 7 min. A PCR negative control was used in all reactions.
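The per-sample volumes and cycling steps above scale by simple arithmetic. The following is a minimal sketch of that arithmetic, not the procedure used in the study: the helper names `master_mix_volumes` and `programmed_hold_minutes` are hypothetical, and the 10% pipetting overage is an illustrative assumption that does not appear in the text.

```python
# Per-sample reagent volumes in microlitres, as listed in the text.
PER_SAMPLE_UL = {
    "Go Taq Green Master Mix 2X": 12.5,
    "primer": 0.75,
    "25 mM magnesium chloride": 0.3,
}

# Cycling program in seconds, as listed in the text.
INITIAL_S = 3 * 60          # 94 C initial incubation
CYCLES = 40
PER_CYCLE_S = 30 + 30 + 60  # denature, anneal, extend
FINAL_S = 7 * 60            # 72 C final extension


def master_mix_volumes(n_samples, overage=0.10):
    """Scale per-sample volumes to n_samples.

    The overage fraction is an assumption added for illustration only.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_SAMPLE_UL.items()}


def programmed_hold_minutes():
    """Total programmed hold time, ignoring ramping between steps."""
    return (INITIAL_S + CYCLES * PER_CYCLE_S + FINAL_S) / 60


if __name__ == "__main__":
    for name, vol in master_mix_volumes(96).items():
        print(f"{name}: {vol} uL")
    print(f"Programmed hold time: {programmed_hold_minutes()} min")
```

Ignoring ramp times, the stated program amounts to 90 minutes of programmed holds (3 + 40 x 2 + 7).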

PCR products were examined on a 1.2% agarose gel with Hyperladder II (Bioline) and ethidium bromide staining. Results were categorised as single band (S), multiple bands (M; where clear, the number of bands was recorded, e.g. 2B denotes two bands observed) or no result (N). Primer sets producing the best results were identified and analysed further, including optimisation of amplification conditions (annealing temperature and reaction component concentrations). Where clear bands were detected in the presence of secondary band(s), gel isolation was conducted and the isolated bands were re-amplified using the same reaction conditions.

The following markers were identified for further optimisation and analysis and application to a broader range of Diprotodonts (Table 4.1): ApoB, EN2, GAPD, MYH2, ODC1, PENK1, IRBP, vWF and RAG1. This was based on the ease of amplification and amplification efficiency across the eight families tested and the presence of clear bands.

In some cases, new primers were designed using GenBank nuclear sequences for Marsupialia for reference. Sequences were aligned using MEGA version 4 (Tamura et al. 2007) to identify conserved regions. The conserved sequences were imported into Primer 3 (Rozen and Skaletsky 2000) and primers were identified using standard criteria. Candidate primer sets were compared to the aligned sequences and modified manually, and forward and reverse primers were developed (Table 4.3). All primers used included M13 tails, forward: GTA AAA CGA CGG CCA GT, reverse: CAG GAA ACA GCT ATG AC.
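As an illustration of the tailing step, the sketch below joins the M13 tails quoted above to the EN2 primers from Table 4.3. The helper name `add_m13_tail` is hypothetical, and placing the tail at the 5' end of the primer is an assumption (the usual convention), since the text does not state the orientation.

```python
# M13 tail sequences as quoted in the text (spaces removed).
M13_FORWARD_TAIL = "GTAAAACGACGGCCAGT"
M13_REVERSE_TAIL = "CAGGAAACAGCTATGAC"


def add_m13_tail(primer, direction):
    """Return the primer with the matching M13 tail prepended.

    Assumption: the tail sits at the 5' end of the locus-specific primer.
    """
    primer = primer.replace(" ", "").upper()
    if direction == "forward":
        return M13_FORWARD_TAIL + primer
    if direction == "reverse":
        return M13_REVERSE_TAIL + primer
    raise ValueError("direction must be 'forward' or 'reverse'")


if __name__ == "__main__":
    # EN2 primers from Table 4.3.
    print(add_m13_tail("GTTTACTGCACCCGATATTC", "forward"))
    print(add_m13_tail("TGAACAGCTAGCGTATTCTTG", "reverse"))
```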

Samples (Table 4.1) were amplified using the newly designed primers (Table 4.3) and the following primer pairs: ApoB, IRBP (2), RAG1 and vWF (Table 4.2). PCR products were examined on a 1.2% agarose gel with Hyperladder II (Bioline) and ethidium bromide staining prior to sequencing. All sequencing was performed in both directions by the Australian Genome Research Facility. Sequences were edited using SEQUENCHER version 4.1.4 (GeneCodes Corporation 1991-2002). Consensus sequences were aligned using the CLUSTAL module in MEGA version 4 (Tamura et al. 2007) and confirmed by eye. Overall composite mean distances were determined using the Kimura 2-parameter (K2P) model. Sequences were imported into Geneious (Biomatters Ltd, www.geneious.com) and maximum parsimony (MP) analysis was performed using PAUP* 4.0b10 (Swofford 2002) with default settings, but with gaps coded as missing. MP analysis identifies the tree topology that requires the smallest number of nucleotide changes as the final tree. The analysis used the heuristic search option (beginning with a starting tree and rearranging it to find a better tree) and TBR (tree bisection-reconnection) branch swapping (a method that bisects the tree and reconnects the parts to rearrange it), and employed bootstrap analysis with 10,000 replications (where data are resampled and the results compared as an indicator of the confidence that can be placed in the result at each node) (Swofford 2002).
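For readers unfamiliar with the K2P distance, the sketch below computes it from the proportions of transitions (P) and transversions (Q) between two aligned sequences, using d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. It is an illustrative stand-in, not the MEGA implementation used in the study. The function names are hypothetical, and two choices are assumptions: sites with gaps or ambiguity codes are skipped pairwise, and the overall mean is taken as the plain average over all sequence pairs.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are saturated.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def mean_pairwise_k2p(sequences):
    """Average K2P distance over all pairs (assumed meaning of 'overall mean')."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(k2p_distance(a, b) for a, b in pairs) / len(pairs)
```

Identical sequences give a distance of zero, and a single transition across ten compared sites gives -0.5 ln(0.8), roughly 0.11.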

jModelTest (Posada 2008) was used to estimate the best-fitting model of sequence evolution, values for the gamma shape (the rate of substitution varies between sites according to a gamma distribution) and the proportion of invariant sites, with the final model selected using the corrected Akaike Information Criterion (AIC) (a measure of how well a sequence evolution model fits the observed data). These values were then used in the maximum likelihood (ML) analysis. The ML analysis was performed using PHYML (Guindon and Gascuel 2003) with 1000 bootstrap replications. ML analysis identifies the tree that gives the highest likelihood of producing the observed sequence data under a specific substitution model (the sequence evolution model selected by AIC).
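The model selection step can be sketched as follows. This is a minimal illustration of the standard AIC and corrected AIC formulas, not jModelTest itself. The function names and the candidate values are hypothetical, and using the alignment length as the sample size is an assumption.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def aicc(log_likelihood, n_params, n_sites):
    """AIC with the small-sample correction 2k(k + 1) / (n - k - 1).

    Assumption: n_sites (alignment length) is used as the sample size n.
    """
    denominator = n_sites - n_params - 1
    if denominator <= 0:
        raise ValueError("too few sites for this number of parameters")
    correction = (2 * n_params * (n_params + 1)) / denominator
    return aic(log_likelihood, n_params) + correction


def best_model(candidates, n_sites):
    """Return the name of the candidate with the lowest corrected AIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n_sites))


if __name__ == "__main__":
    # Invented values for illustration only; these are not results from the study.
    candidates = {"model_A": (-1250.0, 50), "model_B": (-1245.0, 53)}
    print(best_model(candidates, n_sites=400))
```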

Bayesian analysis of the nucleotide sequence data using Markov Chain Monte Carlo (MCMC) analysis was performed using MrBayes version 3.1.2 (Huelsenbeck and Ronquist 2001). Bayesian analysis is similar to maximum likelihood analysis in that it uses a likelihood function and a specific sequence evolution model. However, it also uses prior information; in this case all trees were rated as having equal prior probabilities. The MCMC analysis is a sampling and hypothesis testing method used to converge on the posterior distribution of trees. Searches here incorporated 4 chains with trees sampled every 100 generations. Bayesian analysis produces a posterior probability distribution of trees given the prior distribution, the evolution model and the data. The final tree is the most probable tree in that distribution. Support values are the posterior probabilities for the given clades and were converted to percentages for ease of comparison with the MP and ML methods. All phylograms were manually rooted where required and condensed to only those nodes with support values above 50%.
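The conversion of posterior clade probabilities to percentages, and the condensing of nodes at 50%, can be sketched as below. This is an illustration under stated assumptions rather than the MrBayes procedure: each sampled tree is represented simply as a set of clades, the function names are hypothetical, and the optional burn-in is an assumption, as the text does not mention one.

```python
from collections import Counter


def clade_support_percent(sampled_trees, burn_in=0):
    """Percentage of retained sampled trees that contain each clade.

    sampled_trees is a list of trees, each a set of frozensets of taxon names.
    Discarding the first burn_in samples is an assumption for illustration.
    """
    kept = sampled_trees[burn_in:]
    if not kept:
        raise ValueError("no trees left after burn-in")
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: 100.0 * n / len(kept) for clade, n in counts.items()}


def condense(support, threshold=50.0):
    """Keep only clades with support strictly above the threshold."""
    return {clade: pct for clade, pct in support.items() if pct > threshold}


if __name__ == "__main__":
    # Toy taxa A to D; not data from the study.
    ab, cd, ac = frozenset("AB"), frozenset("CD"), frozenset("AC")
    trees = [{ab, cd}, {ab, cd}, {ab}, {ac}]
    print(condense(clade_support_percent(trees)))
```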

The phylogenetic performance of markers was measured as the accuracy of recovering the test phylogeny (Figure 4.1). This method uses a parsimony framework by comparing the sum of the bootstrap percentage for each identified clade (as numbered in Figure 4.1) against the maximum possible sum of bootstrap percentages (Wild and Maddison 2008).
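A minimal sketch of this score follows, assuming that a clade which is not recovered contributes zero and that the maximum possible sum is 100% per clade. The function name and example values are hypothetical. The clade numbering follows Figure 4.1, with clades 1 to 5 deep and 6 to 20 shallow.

```python
DEEP_CLADES = range(1, 6)      # clades 1 to 5 in Figure 4.1
SHALLOW_CLADES = range(6, 21)  # clades 6 to 20 in Figure 4.1


def performance(bootstrap_by_clade, clades):
    """Sum of bootstrap percentages over the maximum possible sum.

    bootstrap_by_clade maps a clade number to its bootstrap percentage.
    Assumption: clades missing from the mapping were not recovered and score 0.
    """
    clades = list(clades)
    if not clades:
        raise ValueError("no clades to score")
    total = sum(bootstrap_by_clade.get(clade, 0.0) for clade in clades)
    return total / (100.0 * len(clades))


if __name__ == "__main__":
    # Invented support values for illustration only.
    example = {1: 100.0, 2: 50.0, 6: 90.0, 7: 80.0}
    print(performance(example, DEEP_CLADES))
    print(performance(example, SHALLOW_CLADES))
```

Scoring the deep and shallow clades separately gives the two levels of resolution compared in the Results.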

Table 4.1 List of Diprotodont species available for trialling nuclear markers for species identification, and their sample identification numbers.

| Genus | Species | Common Name | Sample No. |
|---|---|---|---|
| Phascolarctos | cinereus | Koala | M24360 |
| Vombatus | ursinus | | W5166 |
| Trichosurus | vulpecula | Common Brush-tailed Possum | M28611 |
| Trichosurus | caninus | Mountain Brush-Tailed Possum | M16324 |
| Spilocuscus | maculatus | Spotted Cuscus | GT149 |
| Phalanger | orientalis | Grey Cuscus/ | M19104 |
| Burramys | parvus | Mountain | M29305 |
| | nanus | | C31224 |
| Petauroides | volans | | M16313 |
| Pseudocheirus | peregrinus | | M16323 |
| Gymnobelideus | leadbeateri | Leadbeater's Possum | W5317 |
| Petaurus | breviceps | | M16819 |
| Petaurus | norfolcensis | | M16378 |
| Tarsipes | rostratus | | M55442 |
| Acrobates | pygmaeus | | M16432 |
| Bettongia | tropica | Northern | M28956 |
| Aepyprymnus | rufescens | Rufous Bettong | 48-01-596 |
| Petrogale | assimilis | Allied Rock Wallaby | M36007 |
| Petrogale | xanthopus | Yellow-Footed Rock Wallaby | M36009 |
| Petrogale | penicillata | Brush Tailed Rock Wallaby | W5316 |
| Macropus | rufus | | M36017 |
| Macropus | giganteus | | MKMG1 |
| Macropus | eugenii | Tammar Wallaby | 49857 |
| Dendrolagus | goodfellowi | Goodfellow's Tree Kangaroo | 53706 |
| Isoodon | macrourus | Northern Brown Bandicoot | M16380 |


Figure 4.1 Reconstructed phylogeny of representative Diprotodont species to be tested against nuclear markers. Phylogeny reconstruction is taken from Meredith et al. (2009). Known clades to be tested are numbered. Numbers 1 to 5 represent deeper clades and 6 to 20 represent shallow clades within the order.

Table 4.2 Primer information for the 102 nuclear markers screened in the study of phylogenetic performance in Diprotodonts. Primer information includes the primer name, marker acronym and the reference for the sequences for the forward and reverse primers.

| Marker Acronym | Primer Name | Primer Sequence - 5' to 3' | Mammal reference |
|---|---|---|---|
| ADH2 | ADH2for | CAGGTGATAAAGTCATCCCG | Lyons et al. (1997) |
| | ADH2rev | TGTACTGGGAGAAGGTGCTG | |
| ADRA2A | ADRA2Afor | GCACCTGTGCGCCATC | Lyons et al. (1997) |
| | ADRA2Arev | CTTCTCGATGGAGATGAGCG | |
| ADRB1 | ADRB1for | CCTCTTCATCATGTCCCTGG | Lyons et al. (1997) |
| | ADRB1rev | TGACACACAGGGTCTCGATG | |
| ADRBK2 | ADRBK2for | ACCTTGGGTCTCATTATGGG | Lyons et al. (1997) |
| | ADRBK2rev | CTCACCTTCATAGCTTTCAAACC | |
| ALDH2 | ALDH2for | TCCCGCTCCTGATGCAAG | Lyons et al. (1997) |
| | ALDH2rev | CCCCGAGACTTGGCCCG | |
| ALDOB | ALDOBfor | AACACTGAAGAGAACCGCC | Lyons et al. (1997) |
| | ALDOBrev | GCCACTTCCCAAAGTCAAC | |
| ANT1 | ANT1for | TCTTCAAGTCTGATGGCCTG | Lyons et al. (1997) |
| | ANT1rev | TCACACTCTGGGCAATCATC | |
| APC | APCfor | CTCTTTGGAGCAGGGTTTG | Lyons et al. (1997) |
| | APCrev | ATGAAGGCCTCCAGGTAGG | |
| AP5 | AP5120fwd | AATGCCCCATTCCACACAGC | DeBry (2001) |
| | AP5545rev | GATCATGGTTTCCAGCCAGCAC | |
| ApoB (1) | APOBF60 | GGAGAAGCCAAYCTGGATTTCYT | Amrine-Madsen et al. (2003) |
| | APOBR820 | GTYGTCCCATCTAACTTATACTG | |
| ApoB (2) | APOBF90 | CAATTCCTGAAATGACTCTGCC | Amrine-Madsen et al. (2003) |
| | APOBR881 | GACARSGCTGTGGCYARCTTCARNCCT | |
| APOE | APOEfor | GCCATGCTGGGCCAG | Lyons et al. (1997) |
| | APOErev | TGAGGCGGGCCTGG | |
| APRT | APRTfor | CCCAGGCGTGGTATTCAG | Lyons et al. (1997) |
| | APRTrev | TGCGATGTAGTCGATGCG | |
| APT1 | APT1for | TCTGGACCCTCCTACCTCTG | Lyons et al. (1997) |
| | APT1rev | TTGTCTGTGTACTCCTTCCCTTC | |
| A2AB | A2ABfor | ASCCCTACTCNGTGCAGGCNACNG | Stanhope et al. (1992) |
| | A2ABrev | CTGTTGCAGTAGCCDATCCARAARAARAAYTG | |
| AR | ARfor | GAGACTCAGAGGAAGCAAGGAAAG | Lyons et al. (1997) |
| | ARrev | AACTGGCGCCGGGAG | |
| ARAF1 | ARAF1for | GGTACCGGGACTCAGGC | Lyons et al. (1997) |
| | ARAF1rev | AAGCTGTAGGGGTTCGGG | |
| C5 | C5for | ATGGGAAATTCAAGGCATTG | Lyons et al. (1997) |
| | C5rev | GGAAGCATTGTGAATGTCACC | |
| CA1 | CA1for | GAGCTTCACTTAGTTCACTGGAATT | Lyons et al. (1997) |
| | CA1rev | GTCAAAATTTGTGAATGGGG | |
| CA2 | CA2for | GTGTGACCTGGATTGTGCTC | Lyons et al. (1997) |
| | CA2rev | GTTCTTCAGTGGCTGAGCTG | |
| CAMK4 | CAMK4for | GAAAATGGGATTGTCCATCG | Lyons et al. (1997) |
| | CAMK4rev | GGGGTTCCACATACTGTCTTC | |
| CD40 | CD40for | CTTATCACCCAGATGATTGGG | Lyons et al. (1997) |
| | CD40rev | AGTTCAGTAAGGATAAGGACCCC | |
| CGA | CGAfor | GGAAAGTGTCAACTTTCAGGATG | Lyons et al. (1997) |
| | CGArev | TCCCACCTGCATCTAAGTCC | |
| CHAT | CHATfor | CCATTGTGCAGCAGTTTGG | Lyons et al. (1997) |
| | CHATrev | TGGAGTTGACAGGCAGGG | |
| COL3A1 | COL3A1for | CAAAAGGTGATGCTGGCC | Lyons et al. (1997) |
| | COL3A1rev | CAGGGAATCCTCGATGTCC | |
| CP | CPfor | CTAGGTCCTGTCATTTGGGC | Lyons et al. (1997) |
| | CPrev | TCTTTGGGGACAGTCCATTC | |
| CYP1A1 | CYP1A1for | GTCCCCAAAGGCCTGAAG | Lyons et al. (1997) |
| | CYP1A1rev | CATATGGCACAGATGACATTGG | |
| CYP1A2 | CYP1A2for | TCTCATGGAGCCTCATGTACC | Lyons et al. (1997) |
| | CYP1A2rev | GGGGGATGGTGAAGGG | |
| CYP2A | CYP2Afor | TTCGACTGGGTCTTCAAAGG | Lyons et al. (1997) |
| | CYP2Arev | ACATCTCATAGAGCTGCCCC | |
| CYP2D | CYP2Dfor | ATGGTGACCACCTCGACC | Lyons et al. (1997) |
| | CYP2Drev | CAGCACCGATGACAGGTTG | |
| DHFR | DHFRfor | ATCGTCGCCGTGTCCC | Lyons et al. (1997) |
| | DHFRrev | TGAGGTTGTGGTCATTCTTTGG | |
| DMP1(1) | DEN12F | GATGAAGACGACAGTGGAGATGACACCTT | Jansa et al. (2005) |
| | DEN2R | ATCTTGGCAATCATTGTCATC | |
| DMP1(2) | DEN14F | GAGAACAGCAAAAGCAAATCCCAAG | Jansa et al. (2005) |
| | DEN13R | ATCRCTGTCCTGAGGTCTAGACTC | |
| EGF | EGFfor | CCTCAGATGGGAAAATGTGTC | Lyons et al. (1997) |
| | EGFrev | ACACACCACCATGGAGGC | |
| EN2 | EN2for | CCCGAAAACCAAAGAAGAAG | Lyons et al. (1997) |
| | EN2rev | GTTCTGGAACCAAATCTTGATC | |
| EPO | EPOfor | CTCTGGGCCTCCCAGTC | Lyons et al. (1997) |
| | EPOrev | TGTCTGGGACAGTGATATTCTCA | |
| EVI2A | EVI2Afor | GGAAGTCTGTGAGGAAAACAAC | Lyons et al. (1997) |
| | EVI2Arev | CGCTTGCCTACTTGTTTTG | |
| F9 | F9for | GGCCAAAGAGGTATAATTCAGG | Lyons et al. (1997) |
| | F9rev | CAACATACTGCTTCCAAAATTCAG | |
| FGB | FGBmamm-U | CACAACGGCATGTTCTTCAGCAC | Matocq et al. (2007) |
| | FGBmamm-L | ACCCCAGTAGTATCTGCCGTTTGGATT | |
| FIB | FIBfor | AGAHAAYTGCTGCATCTTAGATG | Gatesy (1997) |
| | FIBrev | TTCRTATTTCATAATTTCTTC | |
| FIB7 | FIB-B17L | TCCCCAGTAGTATCTGCCATTAGGGTT | Prychitko and Moore (1997) |
| | FIB-B17U | GGAGAAAACAGGACAATGACAATTCAC | |
| FLT1 | FLT1for | CCCCTCAAAGCCATACTGAC | Lyons et al. (1997) |
| | FLT1rev | GGCTCATGAATTTGAAAGCG | |
| FOS | FOSfor | AGACTACGAGGCGTCATCC | Lyons et al. (1997) |
| | FOSrev | GTGGGGATAAAGTTGGCAC | |
| GLOBw | DRST | GGAGAAACAGATCATTTTAGC | Wheeler et al. (2001) |
| | IR1 | AGGTTCTTCACTGCCTCACC | |
| GABRA1 | GABRA1for | TCCAAGACACCACCACAGAC | Lyons et al. (1997) |
| | GABRA1rev | TTTTCCAGTACAGCATGAGGTC | |
| GAPD | GAPDfor | CCAAGGTCATCCATGACAAC | Lyons et al. (1997) |
| | GAPDrev | GCTTCACCACCTTCTTGATG | |
| GHR(1) | GHR5for | GGCRTTCATGAYAACTACAAACCTGACYTC | Galewski et al. (2006) |
| | GHR6rev | GAGGAGAGGAACCTTCTTTTTWTCAGGC | |
| GHR(2) | GHR3for | GACTTTATGCYCARGTRAG | Galewski et al. (2006) |
| | GHR4rev | CTYACYTGRGCATAAAAGTC | |
| GNA12 | GNA12for | AGTCAGACAACGAGAACCGC | Lyons et al. (1997) |
| | GNA12rev | GGTCCTTCTTGTTGAGGAAGAG | |
| GSN | GSNfor | CACCCCGAATTCCTAAAGG | Lyons et al. (1997) |
| | GSNrev | CCAGCCATTAGTGGAGGTC | |
| GSTA2 | GSTA2for | CGGGGCAGAATGGAGTC | Lyons et al. (1997) |
| | GSTA2rev | TTGAGAATGGCTCTGGTCTG | |
| GUSB | GUSBfor | GATTCGAACAGTGGCTGTCAC | Lyons et al. (1997) |
| | GUSBrev | GAGTAGGGATAGTGGCTGGTACG | |
| HBA1 | HBA1for | ATTCTTACCCTGGACCCAGC | Lyons et al. (1997) |
| | HBA1rev | GGTCACCAGCAGGCATTTAC | |
| HLR1 | HLR1for | CATGGAAAAGCTACTATACTGATTGC | Lyons et al. (1997) |
| | HLR1rev | GACAAATTTCACATCTTCCACATC | |
| HOXD | HOXDfor | AAAAAGCGCTGTCCCTACAC | Lyons et al. (1997) |
| | HOXDrev | TCATCCTACGGTTCTGGAAC | |
| HXB | HXBfor | ACTGGCCTTGCTCCTGG | Lyons et al. (1997) |
| | HXBrev | TCAGGTTCCCGATGGAGTAC | |
| IGF1R | IGF1Rfor | TGCTGTTTGAACTGATGCG | Lyons et al. (1997) |
| | IGF1Rrev | TCATGTGCGCGTAAGGC | |
| IGF2 | IGF2for | CGTGGCATCGTGGAAGAG | Lyons et al. (1997) |
| | IGF2rev | TGGAAGAACTTGCCCACG | |
| IL5 | IL5RP1 | TGCACTGAAGAAATCTTTCAGGG | Hawken et al. (1999) |
| | IL5RP2 | ACACCAAGAAACTCTTGCAGGTA | |
| IL8 | IL8for | GGCTGTGGCTCTCTTGGC | Lyons et al. (1997) |
| | IL8rev | CCCACTTTTCCTTGGGGTC | |
| INHBA | INHBAfor | GCCCTTGCTTTGGCTGAG | Lyons et al. (1997) |
| | INHBArev | ACCACTGACAGGTCACTGCC | |
| IRBP (1) | IRBP | CTAGTGATCTCTTATGAGCCC | Krajewski et al. (2004) |
| | IRBP | AGGTGCTCCGTGTCCTGAA | |
| IRBP (2) | 119AF | ATGGCCAAGGTCCTCTTGGATAACTACTGCTT | Jansa and Voss (2000) |
| | 1313DIR | CATCATCAAACCGCAGATAGCCCA | |
| IRBP (3) | 746E1F | CTCCACTGCCCTCCCATGTCT | Jansa and Voss (2000) |
| | 878FR | AGCCTACATCCTCAAGAAGATGCG | |
| JUND | JUNDfor | GAGGAGCAGGAGGGCTTC | Lyons et al. (1997) |
| | JUNDrev | TATGAGCTCCCAGTCCCG | |
| KRAS1P | KRAS1Pfor | TGCCTTGACGATACAGCTAATTC | Lyons et al. (1997) |
| | KRAS1Prev | TCCTCTTGACCTGCTGTGTC | |
| LAMA1 | LAMA1for | GAGGGGTTTGCATCAACTG | Lyons et al. (1997) |
| | LAMA1rev | CCACAGGGTCACAGTTACAGG | |
| LCAT | U749 | GTGACAACCAGGGCATCC | Robinson-Rechavi et al. (2000) |
| | L1210 | TGTGTTATTGCTGAAGACCAT | |
| M6P/IGF2R | M6P/IGF2R311F | CTGTGCAGTTACACATGGGAAGC | Killian et al. (2001) |
| | M6P/IGF2R1078R | GGCATACTCAGTGATCCACTC | |
| MDH2 | MDH254fwd | GACCAGCTGGCCACACTCAC | DeBry (2001) |
| | MDH2510rev | GCGTCCACGAGGGAGAAGAC | |
| MOS | MOSfor | TGTTCCTGTGGCCATCAAG | Lyons et al. (1997) |
| | MOSrev | GGCCGTTAACSSCATCTAGGG | |
| MUC1 | MUC1for | CAACCAGAACACAGACCAGG | Lyons et al. (1997) |
| | MUC1rev | GTGAAGGCACAGTTCAGTCAG | |
| MYH2 | MYH2for | GAACACCAGCCTCATCAACC | Lyons et al. (1997) |
| | MYH2rev | TGGTGTCCTGCTCCTTCTTC | |
| ODC1 | ODC1for | TGGGGTGCCTCCAGAGAG | Lyons et al. (1997) |
| | ODC1rev | TGCTTTGGAATCATCAGTGGC | |
| Omegaglob | G314 | GGAATCATGGCAAGAAGGTG | Wheeler et al. (2001) |
| | G424 | CCGGAGGTGTTCAGTGGTATTTTC | |
| OTC | OTCfor | GAGGATCCTGTTAAACAATGCAG | Lyons et al. (1997) |
| | OTCrev | CATCCACTTCTTCTGGCTTTCT | |
| P4HB | P4HBfor | CCTGGAGGAGGAGATGACC | Lyons et al. (1997) |
| | P4HBrev | TGCTCCTGGCTCATCAGG | |
| PDGFA | PDGFAfor | GGGACGCGATGAGGACC | Lyons et al. (1997) |
| | PDGFArev | ACGGAGGAGAACAAAGACCG | |
| PENK | PENKfor | TGCACACTGGAATGTGAAGG | Lyons et al. (1997) |
| | PENKrev | AGCTCTTTTGCTTCGTCTTCC | |
| PGK-1 | PGK-1-P1345 | GTCTATGTCAATGATGCTTTTGG | Colgan (1999) |
| | PGK-1-P1772 | TGGCTTGGCCAGTCTTTGGCATTTTCA | |
| POLA | POLAfor | CCATGAGCAAACGCAAGG | Lyons et al. (1997) |
| | POLArev | GCTCACGTTCTTCTCAAAGTTTG | |
| PP | Ppfor | ACAAGGCGTTCACCAAGG | Lyons et al. (1997) |
| | Pprev | AACGAACCTCTTGCACATTTG | |
| PRKCA | PRKCAfor | GGAGGGGACCTCATGTACC | Lyons et al. (1997) |
| | PRKCArev | TCTCCCAGTCGATCCTCC | |
| PRM1 | PRM1for | ACCTGCTCACAGGTTGGCTG | Queralt et al. (1995) |
| | PRM1rev | CATTTATTGACAGGTGGCA | |
| PRNP | PRNPfor | TGGTGGCTGGGGACAG | Lyons et al. (1997) |
| | PRNPrev | GGCACTTCCCAGCATGTAG | |
| PYGL | PYGLfor | GTGAAAGACCTGAGCCAGC | Lyons et al. (1997) |
| | PYGLrev | TGTCCTTGGCACGAAGAAC | |
| RAG1 (1) | RAG1for | CAYTGYGAYATHGGNSSYGC | Baker et al. (2004) |
| | RAG1rev | TTRTGNGCRTTCATRAAYTTYTG | |
| RAG1 (2) | F2204 | GCTTCTGGCTCWGTCTACATYTGTAC | Amrine-Madsen et al. (2003) |
| | R2749 | AAACGCTGTGARTTGAACT | |
| RASA | RASAfor | ATGGACCAGAATACGAGGAGG | Lyons et al. (1997) |
| | RASArev | TGTAACCCACATCCATCCATC | |
| RHO | RHOfor | AGGGGAGGTCACTTTATAAGGG | Lyons et al. (1997) |
| | RHOrev | CCAGCATGGAGAACTGCC | |
| SI | SIfor | ACTTGGGGAATGTTCACAAGAG | Lyons et al. (1997) |
| | SIrev | GGTAAGTTAGAGCAGGAGTTGGC | |
| SST | SSTfor | CAGCACTCCGTCAGTTTCTGC | Lyons et al. (1997) |
| | SSTrev | AGCAGCTCTGCCAAGAAGTAC | |
| TCP1 | TCP1for | CTCTTGGTCCAGTTGGCTTG | Lyons et al. (1997) |
| | TCP1rev | AAGGGCTCCAGAGTGAACAG | |
| TGFB2 | TGFB2for | AGCGACGAGGAGTACTACGC | Lyons et al. (1997) |
| | TGFB2rev | TCACCAAATTGGAAGCATTC | |
| TGFB3 | TGFB3for | CTTGAGCAAGCTCAGGCTC | Lyons et al. (1997) |
| | TGFB3rev | CTGAGGACACATTGAATCGG | |
| TF | TFfor | CAGCTCCAAGGCAAGAAGTC | Lyons et al. (1997) |
| | TFrev | CCCCAGATCAAGTCCTCCTTG | |
| THY1 | THY1for | CCCATCCAGCATGAGTTCAG | Lyons et al. (1997) |
| | THY1rev | GAACCAGCAGGCTTATGCC | |
| TTR | TTRfor | TATTTGTGTCTGAGGCTGGC | Lyons et al. (1997) |
| | TTRrev | GACCATCAGAGGACACTTGG | |
| vWF | vWFMF119 | GACTTGGCYTTYCTSYTGGATGG | Amrine-Madsen et al. (2003) |
| | vWFMR1140 | TTGATCTCATCSGTRGCRGGATTGC | |
| WF | Wffor | TGTCAACCTCACCTGTGAAGCCTG | Huchon et al. (1999) |
| | Wfrev | TGCAGGACCAGGTCAGGAGCCTCTC | |
| ZFX | ZFXfor | TGGCCCAGATGGACATCC | Lyons et al. (1997) |
| | ZFXrev | AATGCTTTCCGGACTCATCG | |

Table 4.3 New primer sequences developed for the study of phylogenetic performance of nuclear markers in Diprotodonts. Primers were designed using aligned available sequences from GenBank in order to increase the amplification products of PENK1, GAPD, EN2 and ODC1.

| Marker Acronym | Primer Name | Primer Sequence |
|---|---|---|
| PENK1 | PENK1LWWF | TGCACACTAGAATGTGAAGG |
| | PENK1LWWRA | GAACTTCTTTGGAATAACTTTCTC |
| | PENK1LWWRB | GAACTTCTTTCGAGTAATTTTCGC |
| GAPD | GAPDLWWFA | TATGATAATTCCCTTAAGATTGTC |
| | GAPDLWWFB | TACGACAACTCACTCAAGATCGTC |
| | GAPDLWWR | TGTAACCATACTCATTGTCATACC |
| EN2 | EN2LWWF | GTTTACTGCACCCGATATTC |
| | EN2LWWR | TGAACAGCTAGCGTATTCTTG |
| ODC1 | ODC1LWWF | TTAAATTCCATCTCTTAATTTTCC |
| | ODC1LWWR | TAACATCAATATTCAGTTCCTTTG |

Results

Amplification efficiency was variable between the 102 primer pairs (Table 4.4). A total of eight primer pairs produced single bands in all eight species tested (Figure 4.2C), eight produced single bands in one to seven of the species tested, 18 produced no visible bands in any of the eight species tested (Figure 4.2A) and 66 produced multiple bands in one or more of the species tested (Figure 4.2B). Multiple band patterns tended to comprise fragments of variable length, as illustrated in Figure 4.2B. This was also true for some primer pairs where single bands were found in more than one species. Primers that produced an amplified product in Macropus eugenii in the study of Lyons et al. (1997) did not always produce an amplified product in any of the species tested in this study. Also, PCR success in Macropus eugenii was not often reproduced in other Diprotodont species.
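The four categories in this tally can be expressed as a simple rule over a row of Table 4.4. The sketch below is one possible reading of the categories, not the rule actually applied in the study: the function name is hypothetical, and ignoring untested (?) and unobtained (*) entries is an assumption.

```python
def classify(results):
    """Assign one screening category to a row of per-species band calls.

    results is a list such as ["S", "N", "M (2B)", "?"]. Band counts in
    brackets are dropped; "?" and "*" entries are ignored (an assumption).
    """
    calls = [r.split("(")[0].strip() for r in results if r not in {"?", "*"}]
    if not calls:
        return "not assessed"
    if all(call == "N" for call in calls):
        return "no visible bands"
    if any(call == "M" for call in calls):
        return "multiple bands in one or more species"
    if all(call == "S" for call in calls):
        return "single band in all species tested"
    return "single band in some species"


if __name__ == "__main__":
    # Rows taken from Table 4.4.
    rows = {
        "ADRA2A": ["S"] * 8,
        "ADRB1": ["S", "S", "S", "S", "N", "S", "S", "S"],
        "ADRBK2": ["N"] * 8,
        "APC": ["M", "N", "M", "M", "M", "M", "N", "N"],
    }
    for marker, row in rows.items():
        print(marker, "->", classify(row))
```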

Of the 102 primers initially tested, nine were deemed to be suitable for further optimisation work (ApoB, IRBP, GAPD, EN2, ODC1, MYH2, vWF, RAG1 and PENK1) under the criteria that they were easy to amplify under standard conditions, produced clear bands and produced results across a broad range of species. Seven markers were also found to be useful in a small range of species or in an individual species and would be useful for further optimisation outside of this study.

Initial sequence data sets of the eight Diprotodonts tested were obtained for the nine markers listed above. Four new primer sets for PENK1, ODC1, EN2 and GAPD were developed to increase the amplified fragment length and the amount of sequence information obtained, providing additional data for a phylogenetic study in Diprotodonts (refer Table 4.3). After a comparison of initial sequence data, only the second segment of the IRBP marker was amplified, to avoid an apparently ambiguous section of sequence contained in segment 1 (also found in Meredith et al. 2009). Whilst MYH2 was initially identified for further optimisation work, it was abandoned due to difficulties in isolating and amplifying a single fragment.

The final eight markers (ApoB, EN2, GAPD, ODC1, PENK1, IRBP, vWF and RAG1) amplified well in the representative phylogeny group (Figure 4.1), although the amplification efficiency of the primer pair for the marker PENK1 was generally limited to the possums. The product size ranged from 200 base pairs to 1285 base pairs. The average K2P distance, representing the approximate level of variation across the Diprotodonts for the sequences obtained after optimisation and analysis, ranged between 2.5% and 22.3% (Table 4.5). Values for the best fit model of sequence evolution and gamma shape distribution as determined by jModelTest can be found in Table 4.5. The sequences generated from the ODC1 and RAG1 primer sets were found to have invariable sites (Table 4.5). GenBank accession numbers can be found in the phylograms where data were obtained from GenBank. Phylograms from phylogenetic analyses using MP, ML and Bayesian analysis for each of the individual markers in the representative Diprotodont phylogeny can be found in Figures 4.3-4.10.

When the phylogenetic performance of the individual markers was calculated in a parsimony framework against the test phylogeny (Figure 4.1), all markers displayed greater resolution at the shallower nodes than at the deeper nodes (Figure 4.11). ApoB and IRBP displayed the greatest resolution at the deeper nodes and ApoB, vWF and PENK1 displayed the greatest resolution at the shallow nodes (Figure 4.11). ApoB showed the best overall performance against the test phylogeny.

Table 4.4 Amplification results for each of the primers tested for phylogenetic performance in Diprotodonts. Results are presented as single band (S), multiple bands (M; where clear, with the number of bands, e.g. 2B) and no result (N). It is important to note that for some primer sets, whilst the presence of a single band is noted, the band may not be the same size in all individuals. ? denotes where a sample was not tested and * denotes where a result could not be obtained.

| Marker Acronym | Phascolarctos cinereus | Vombatus ursinus | Trichosurus vulpecula | Burramys parvus | Pseudocheirus peregrinus | Petaurus breviceps | Bettongia tropica | Macropus eugenii |
|---|---|---|---|---|---|---|---|---|
| ADH2 | M | N | M | M | M (2B) | M | N | N |
| ADRA2A | S | S | S | S | S | S | S | S |
| ADRB1 | S | S | S | S | N | S | S | S |
| ADRBK2 | N | N | N | N | N | N | N | N |
| ALDH2 | M | N | M (2B) | S | M | S | N | N |
| ALDOB | S | N | S | M | M | S | N | N |
| ANT1 | S | N | S | S | S | S | N | N |
| APC | M | N | M | M | M | M | N | N |
| AP5 | S | N | M (2B) | N | M (2B) | N | S | N |
| ApoB (1) | N | M | M | ? | M | M | N | ? |
| ApoB (2) | S | S | S | ? | S | S | S | ? |
| APOE | N | N | N | N | N | N | N | N |
| APRT | M | S | S | S | M | M | N | N |
| APT1 | M | N | M | M | M | M | N | N |
| A2AB | M | S | * | M | M | M | S | S |
| AR | M | M | M | M | M | M | M | N |
| ARAF1 | M | N | M | M | M | M | M | N |
| C5 | N | N | M | M | M | M | N | N |
| CA1 | M | N | M | M | M | M | * | N |
| CA2 | M | S | M | M | M | S | S | N |
| CAMK4 | S | N | M | M | M | M | N | N |
| CD40 | N | N | N | N | N | N | N | N |
| CGA | N | N | N | N | N | N | N | N |
| CHAT | S | N | M | M | M | M (2B) | N | N |
| COL3A1 | S | N | M (2B) | M | M (2B) | M (2B) | N | N |
| CP | M | M | M | M | M | M | N | M |
| CYP1A1 | M | N | M | M | M | M | N | M |
| CYP1A2 | M | N | M | M | M | M | N | M |
| CYP2A | M | N | M | M | M | M | N | M |
| CYP2D | M | N | M | M | M | M | N | M |
| DHFR | N | N | N | N | N | N | N | N |
| DMP | S | S | S | S | S | S | S | S |
| DMP1 | N | N | S | ? | N | N | N | ? |
| EGF | M | M | M | M | M | M | M | M |
| EN2 | S | S | S | M (2B) | S | S | S | S |
| EPO | M | N | M | M | M | M | N | N |
| EVI2A | N | N | N | N | N | N | N | N |
| F9 | M (3B) | N | M | M | M | M | N | S |
| FGB | S | S | N | ? | N | S | N | ? |
| FIB | S | N | N | N | N | N | N | N |
| FIB7 | N | N | N | N | N | N | N | N |
| FLT1 | N | N | S | S | S | S | N | S |
| FOS | N | S | M | M | M | M | N | M |
| GLOBw | M | M | M (2B) | M (2B) | M (2B) | M (2B) | S | S |
| GABRA1 | N | N | N | N | N | N | N | N |
| GAPD | S | S | S | S | S | S | S | S |
| GHR (1) | N | N | N | N | N | N | N | N |
| GHR (2) | N | N | N | N | N | N | N | N |
| GNA12 | N | N | N | N | N | N | N | N |
| GSN | N | N | N | M | N | M | N | N |
| GSTA2 | S | N | M | M | S | M (2B) | S | S |
| GUSB | M | S | M | M | M | M | S | S |
| HBA1 | M | M | M | M | M | M | N | S |
| HLR1 | S | N | N | S | N | S | N | S |
| HOXD | N | N | S | N | S | N | N | N |
| HXB | M | N | M | M | M | M | N | S |
| IGF1R | S | N | M (2B) | S | S | S | N | N |
| IGF2 | M | N | M | M | M | M | N | S |
| IL5 | S | N | S | M | N | S | N | S |
| IL8 | M | N | M | M | M | M | N | S |
| INHBA | M | N | M | M | M | M | N | M |
| IRBP (1) | S | S | M | M | M | M | N | M |
| IRBP (2) | N | N | S | ? | N | N | N | ? |
| JUND | M | N | M | M | M | M | N | M |
| KRAS1P | N | N | N | N | N | N | N | N |
| LAMA1 | S | S | M | M | S | M | S | S |
| LCAT | N | N | N | S | N | S | S | S |
| M6P/IGF2R | N | N | N | M | M | M | N | S |
| MDH2 | S | N | M | M | M | M | N | M |
| MOS | S | N | M | M | S | M | N | S |
| MUC1 | M (2B) | S | M (2B) | M (2B) | M (2B) | S | N | M |
| MYH2 | M (2B) | S | M (2B) | M (2B) | M (2B) | M (2B) | S | M (2B) |
| ODC1 | M | S | M (2B) | M (2B) | M (2B) | M (2B) | S | S |
| Omega | S | M | M | ? | S | S | S | ? |
| OTC | N | N | N | N | N | N | N | N |
| P4HB | S | N | M | M | N | S | N | N |
| PDGFA | M | M | M | M | M | M | N | M |
| PENK | S | N | S | S | S | S | S | S |
| PGK-1 | S | N | S | S | S | S | S | S |
| POLA | N | N | N | N | N | N | N | N |
| PP | S | N | M | M | M | M | N | M |
| PRKCA | M | N | M | S | N | M | N | M |
| PRM1 | N | N | N | N | N | N | N | N |
| PRNP | S | N | M | M | S | M | N | M |
| PYGL | S | N | M | M | M | M | N | S |
| RAG1 (1) | N | N | N | N | N | N | N | N |
| RAG1 (2) | S | N | S | ? | S | S | N | ? |
| RASA | M | N | M | M | M | M | N | S |
| RHO | M | N | M | M | M | M | N | M |
| SI | S | S | M | M | M | M | N | M |
| SST | S | N | M | M | M | M | N | S |
| TCP1 | S | M | M | M | M | M | S | M |
| TGFB2 | N | N | N | N | N | N | N | N |
| TGFB3 | S | N | M | M | M | M | N | M |
| TF | N | N | N | N | N | N | N | N |
| THY1 | S | N | M | M | M | M | N | M |
| TTR | S | N | M | M | M | S | S | M |
| WF | M | S | M | N | M | M | S | M |
| ZFX | N | N | N | N | N | N | N | N |


Figure 4.2 Typical results obtained from the nuclear marker primer screen in Diprotodonts. (A) GNA12 primers, no results, (B) CYP1A1 primers, multiple bands and different length fragments, (C) ApoB primers, single bands in all samples tested, (D) MYH2 primers, clear double bands. Products were run with Bioline Hyperladder II (control fragments ranging from 50 to 2000 base pairs).

Table 4.5 Summary of the nuclear marker analysis in the Diprotodont test phylogeny. Listed for each marker are the product length in base pairs, the composite mean distance measured using K2P distances, and the sequence evolution model, gamma shape distribution and percent invariant as determined by jModelTest analysis.

Marker  Product    Composite      Sequence         Gamma shape   Percent
        size (bp)  mean distance  evolution model  distribution  invariant
ApoB    805        22.3%          TIM3+G           0.7040        0
EN2     200        3.2%           TPM3             equal         0
GAPD    400        2.5%           K80+G            0.1960        0
IRBP    1285       9.0%           HKY+G            0.3280        0
ODC1    200        3.2%           HKY+I            equal         0.4790
PENK1   450        4.8%           TIM3+G           0.3070        0
RAG1    543        5.0%           TIM1+I+G         0.8200        0.5200
vWF     1038       7.7%           TPM3+G           0.4970        0
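The composite mean distances in Table 4.5 are based on Kimura 2-parameter (K2P) distances. As an illustration only, the following minimal sketch shows how a single pairwise K2P distance can be computed from two aligned sequences, and how a mean over all pairs might be taken. The function names and the decision to skip sites containing gaps or ambiguous bases are assumptions made for this example; they are not taken from the software used in this study.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def mean_pairwise_k2p(sequences):
    """Mean K2P distance over all pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(k2p_distance(a, b) for a, b in pairs) / len(pairs)
```

For two ten-base sequences differing by a single transition, `k2p_distance` returns a value slightly above the raw 10% difference, because the model corrects for multiple substitutions at the same site.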

Figure 4.3 Phylogenetic analysis for the ApoB marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented.

Figure 4.4 Phylogenetic analysis for the EN2 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented; NA indicates that the node was not resolved by the relevant phylogenetic analysis method.

Figure 4.5 Phylogenetic analysis for the GAPD marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented.

Figure 4.6 Phylogenetic analysis for the IRBP marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented.

Figure 4.7 Phylogenetic analysis for the ODC1 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented; NA indicates that the node was not resolved by the relevant phylogenetic analysis method.

Figure 4.8 Phylogenetic analysis for the PENK1 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented; NA indicates that the node was not resolved by the relevant phylogenetic analysis method.

Figure 4.9 Phylogenetic analysis for the RAG1 marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented.

Figure 4.10 Phylogenetic analysis for the vWF marker in representative Diprotodont samples and an outgroup using Maximum Parsimony (above), Maximum Likelihood (left below) and Bayesian (right below) approaches. Support values for each of the clades tested are presented.

[Bar chart: % recovered (vertical axis, 0 to 100) for each marker (ApoB, EN2, GAPD, IRBP, ODC1, PENK1, RAG1, vWF); series shown are deep nodes, shallow nodes and total nodes.]

Figure 4.11 Illustration of the phylogenetic performance in Diprotodonts for each nuclear marker using a parsimony approach against the test phylogeny. Results are scored in a parsimony framework by comparing the sum of the bootstrap percentage for each identified clade against the maximum possible sum of bootstrap percentages. The percentage of total possible bootstrap values recovered for the test phylogeny for each marker is shown.
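The scoring described in the caption of Figure 4.11 can be sketched as follows. This is an illustrative reading of the caption, not the procedure actually scripted for the thesis: each clade in the test phylogeny can contribute at most 100 bootstrap points, a clade that was not recovered contributes zero, and the score is the recovered sum as a percentage of the maximum possible sum. The function name, clade names and support values below are hypothetical.

```python
def percent_bootstrap_recovered(expected_clades, observed_support):
    """Percentage of the maximum possible bootstrap sum that was recovered.

    expected_clades  -- clades present in the test phylogeny
    observed_support -- dict mapping clade name to bootstrap percentage;
                        clades missing from the dict count as not recovered
    """
    expected = list(expected_clades)
    if not expected:
        raise ValueError("at least one expected clade is required")
    recovered = sum(observed_support.get(clade, 0) for clade in expected)
    maximum = 100 * len(expected)
    return 100.0 * recovered / maximum


# Hypothetical example: four expected clades, one of which was not recovered.
example_clades = ["Vombatiformes", "Macropodiformes", "Phalangeridae", "Petauridae"]
example_support = {"Macropodiformes": 90, "Phalangeridae": 100, "Petauridae": 50}
example_score = percent_bootstrap_recovered(example_clades, example_support)
```

Applying the same function separately to the deep nodes, the shallow nodes and all nodes would give the three series plotted for each marker.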

Discussion

The primer screen provided clear differences in the level of amplification efficiency of the 102 primer pairs. Analysis of the nine optimised primer pairs provided good sequence data across the test phylogeny samples with which to test the phylogenetic utility of the markers.

This study illustrates the importance of testing a broad range of primers and not relying solely on the results of other research to determine the choice of a marker. Primers that amplify samples in one laboratory may, in another laboratory with different equipment and resources, prove more difficult than expected and produce variable results. However, whilst some markers analysed here did not amplify well across the Diprotodont order, they could have an application in a subset or single species where less optimisation may be required. The results provide information on the phylogenetic performance of eight markers in Diprotodontia, including four new markers, to inform future phylogenetic studies.

Evidence of non-repeatable results was found when comparing results to those reported in Lyons et al. (2005). Whilst Lyons et al. reported that numerous markers (79) had amplified in Macropus eugenii, the same results were not found here. There are numerous possible explanations for this discrepancy. When large numbers of samples and primers are processed, shorter gel runs may be used, reducing gel resolution so that an apparently single band is actually multiple bands of a similar size. The discrepancy could also be due to contamination, sample degradation or variation at the primer binding site between sub-populations. Screening a large number of markers provided valuable information about the amplification efficiency of these nuclear markers in Diprotodonts.

Whilst the primers identified for further analysis did not amplify in all species tested, it is conceivable that results could be obtained in all of the Diprotodont species with a small amount of further optimisation work (such as in the case of EN2) or the use of various primer combinations (such as with the primers for the DMP1 marker). These markers could also be useful in studying a single species or group of closely related species. However, sufficient data were collected in this study to characterise the markers in this order, and further work would not have added information to aid the conclusions of this study.

The decision to redesign four of the primer sets obtained from the study of Lyons et al. (1997) was aimed at improving their use in the Diprotodonts by increasing amplification success rates and, where possible, extending the amount of sequence recovered, particularly in the PENK1 marker. This was problematic for some markers, as there was little relevant existing sequence data to assist in targeting appropriate primer binding sites. With next generation sequencing and larger bioinformatics platforms, new data will be generated that could aid in the future development of these primers. The new primers resulted in varying success, probably due to primer mismatches found when amplifying across a broad range of species. The PENK1 primers amplified well but in a limited range of species, mainly the possums. This is consistent with the possible presence of a mutation at one of the primer binding sites in the macropods. In future, the kangaroo genome project will help to provide the sequence data needed to identify any change at the primer binding site in the macropods and to facilitate the redesign of the primers to amplify a broader range of Diprotodonts.

ApoB displayed the highest composite mean distance of all markers characterised (22.3%), indicating it was the most variable of the markers tested in the Diprotodonts. The two suborders were not resolved, although Vombatidae did group together with strong support. ApoB appeared to perform well at the family level with good resolution, indicating this marker may be useful for family or species level research. The exception to this was that Bettongia tropica did not group with Aepyprymnus rufescens as expected, but both species did group within the macropod clade. Of note is that Burramyidae grouped with the macropods but Phalangeridae did not, indicating that this relationship to the macropods may be at a deeper level beyond the ability of the marker to resolve. An alternative explanation is that Phalangeridae is more closely related to the other possums than to the macropods. A further study with additional samples is required to resolve this, and is presented in Chapter 5.

The new EN2 primers improved amplification success rates and, with further optimisation, sequences could be obtained from additional species. EN2 displayed some exceptions to the test phylogeny, with Vombatus ursinus grouping with Petaurus species and Phascolarctos cinereus grouping with Cercartetus nanus. The macropods grouped strongly together but formed a polytomy. The Trichosurus genus did not position as expected. EN2 appeared more useful for species level discrimination.

The unexpected positioning of the Trichosurus genus was also found in the analysis of sequence results for the GAPD marker; however, all other species grouped as expected for GAPD. GAPD was found to be a more conserved marker: when available sequences from a very broad range of mammals were aligned, they displayed considerable similarity at the two new primer binding areas targeted. As a result, all Diprotodonts amplified well and yielded good sequence data; however, the variation between the sequences was the lowest of the markers at 2.5%. This marker may therefore only be useful for discrimination at a higher order level.

IRBP resolved most families within the order but was not as informative at the deeper nodes as ApoB. Petaurus breviceps did not group as expected, instead sitting with the macropods. The two other Petauridae species did group together. Phalangeridae resolved well. The results indicated that IRBP would be useful for species and family level discrimination.

The new ODC1 primers were successful and sequences were obtained from all species tested. ODC1 displayed low resolution at the deeper nodes and the Diprotodonts presented as largely a polytomy. This could be due to the short length of sequence obtained. This may also be the reason that the groupings were not as expected against the test phylogeny. The primers for this marker may need to be redesigned to amplify a larger region in order to obtain more informative sequence data. This could be achieved once further sequence data is obtained from the Diprotodonts using next generation sequencing.

The new PENK1 primers amplified well in the possums but not in the macropods and would also benefit from further redesign work to adjust the primer binding site to a more conserved area. The marker would however be useful if investigating possum species or clades. Grouping was as expected compared to the test phylogeny.

RAG1 displayed less resolution than ApoB and IRBP. Whilst Phalangeridae and the macropods were resolved, not all grouping was in line with the test phylogeny. Petauroides volans and Pseudocheirus peregrinus were not resolved as a family as expected. Results indicate that RAG1 might be more informative at the species level.

vWF provided more information in regards to resolving the test phylogeny, even though the two suborders were not resolved (no marker was able to resolve the deeper nodes). Good resolution at the family level was obtained. In addition, the superfamily Petauroidea was resolved, as were the Macropodiformes. Phalangeroidea was also resolved but did not then group with Macropodiformes. These results indicate that vWF is a useful marker for resolving relationships within the Diprotodont order.

Whilst the phylograms developed by the varying analytical methods produced generally the same topologies, it is not unexpected that not all nodes were resolved by all methods. As the MP and ML methods are both discrete methods, albeit with slightly different processes for calculation, it is expected that these methods would perform similarly and produce highly concordant topologies, resolving the same nodes with similar bootstrap values. The Bayesian approach uses additional information and a different analytical technique. It is important to note that the absence of resolution of a particular node may not be significant, as the posterior probability is only true given the evolution model used, the prior information and the data presented. However, given the general support for the topologies found, some confidence can be placed in the relative phylogenetic performance of these markers in this test group.

Studies such as this one can inform the progression of other research by providing information on the level of usefulness and phylogenetic performance of various markers and the identification of primer sets that could be developed further. Importantly, it can also identify primer sets that do not amplify well and would require considerable development work. Results here provide invaluable information regarding the amplification efficiency of 102 primer sets and the detailed phylogenetic performance of eight markers in Diprotodonts, including four new markers to add to the tool set for researchers. It is hoped that this research will inform future studies in the various species of Diprotodontia.

Chapter 5 – Developing a species identification system in a forensic context for Diprotodontia

Photo: Macropus robustus (Eastern ) species within the order Diprotodontia. Photo by Linzi Wilson-Wilde.

Abstract

Species identification for forensic purposes must be underpinned by robust phylogenetic information. To date, a phylogeny containing a large number of taxa generated from nuclear and mitochondrial data has not been constructed. Here, mitochondrial markers (COI and ND2) and nuclear markers (ApoB, IRBP and GAPD) are combined to create a more robust phylogeny to underpin a species identification method for the marsupial order Diprotodontia. The mitochondrial and nuclear markers were amplified in species representing 27 genera of Diprotodontia. Data were analysed using parsimony and likelihood methods. These combined data sets resolved two suborders: Vombatiformes and Phalangeriformes. Phalangeriformes was subsequently split into two clades. The first clade contained the Macropodiformes and Burramyidae. The second clade contained Petauridae, grouping with Phalangeroidea. Phylogenies to date have generally resolved either three suborders, grouping all possums in one suborder, or two suborders, dividing the possums into two clades, with one clade grouping Burramyidae and Phalangeroidea with Macropodiformes and the other containing the remaining possums. Of the markers tested, ND2 provided the greatest level of diagnostic accuracy and could be used as a species identification tool for Diprotodonts.

Introduction

Numerous marsupials are critically endangered due to habitat destruction, introduced species and illegal poaching (Anon 1998). Australia has the largest number of marsupials, accounting for approximately 200 of approximately 270 species extant in the world (Graves and Westerman 2002). A key issue in the investigation and prosecution of wildlife crime is defining the biological (taxonomic) units of concern, whether they are species or populations or some other variant. A species may not always be easily defined by genetic means, particularly if the baseline knowledge of genetic variation within and among species is not well characterised. The DNA analysis of two closely-related populations may result in most of the specimens being grouped into one of the two populations; however, there will always be outliers in the ‘grey area’ between the two populations that cannot be readily assigned to one group or the other (Mallet 1995).

Species identification processes are underpinned by robust phylogenetic information regarding the relevant species. Traditionally, species identification and phylogenetic reconstruction are performed using morphological (traditional taxonomy) or mitochondrial DNA data. Whilst each source of data has its strengths, they can produce conflicting results (Hillis 1987, Shaffer et al. 1991). Morphological data integrate genomic differences but are subject to problems of moderation by the environment in which genes are expressed (phenotypic plasticity) and correlated evolution of functionally linked traits (Georges et al. 1998).

There continues to be much debate regarding the classification and phylogeny of marsupials and many species are known only from the fossil record. Marsupials are generally divided into four orders: Dasyuroidea, Perameloidea, Diprotodontia and Notoryctidae (Aplin and Archer 1987). Diprotodontia is by far the largest of these orders. Eleven extant families containing approximately 125 living species are recognised (Meredith et al. 2009) (refer Table 5.1). Diprotodonts derive their name from the two procumbent lower medial incisors, which are a synapomorphy of the order (Kirsch 1977, Aplin and Archer 1987). The order, whilst mostly herbivorous, also contains nectarivores (Tarsipes rostratus), folivores (Phascolarctos cinereus) and insect-omnivores (Aepyprymnus rufescens) (Aplin and Archer 1987).

A considerable amount of genetic research has been conducted to date on the species within the order Diprotodontia. Mitochondrial DNA research has focussed on the NADH dehydrogenase subunit 2 gene (ND2), as detailed in Osborne and Christidis (2002b); however, this work was mainly centred on Petauroidea, with only a few macropod species. A small amount of work has been conducted on sequencing cytochrome b (cyt b) and the control regions of some species (Aitken 1997, Fumagalli et al. 1997, Eldridge et al. 2001b). Currently, 17 species have complete mitochondrial sequences represented in GenBank (Phillips and Pratt 2008). The complete mitochondrial DNA studies of Nilsson et al. (2004) and Munemasa et al. (2006) found conflicting results. The Nilsson et al. (2004) study included six Diprotodont species and gave unclear resolution of their phylogenetic relationships, with the genus Potorous grouping with Macropus and this pairing then grouping with Vombatus. This group was then sister to a pairing of Trichosurus and Pseudocheirus. Munemasa et al. (2006) resolved two suborders, with Phalangeridae grouping with the macropods. The differences were probably due to limited representation of taxa. For example, no taxa from Burramyidae were contained in the Munemasa et al. (2006) study.

Mitochondrial DNA markers are from a single maternally inherited unit and as such may mislead through independent lineage sorting and undetected instances of hybridisation and introgression (Brower 2006). These factors may affect the ability of the mitochondrial marker to attribute the sample to the correct species. Therefore, combining mitochondrial data with nuclear data can lead to a more robust phylogeny. Earlier nuclear based research concentrated on microsatellites; however, many of the microsatellites produced were dinucleotide repeats, focussing on the population genetics of a single species, and were therefore not informative enough for broad phylogenetic reconstructions (Houlden et al. 1996a, Luikart et al. 1997, Taylor et al. 1997, Taylor and Cooper 1998a, Taylor and Cooper 1998b, Beheregaray et al. 2000, Millis 2000, Zenger and Cooper 2001, Kraaijeveld-Smit et al. 2002, Taylor et al. 2002, Hansen et al. 2003).

Kirsch et al. (1997), using DNA hybridisation studies, identified three suborders (Vombatiformes, Phalangeriformes and Macropodiformes) within the Diprotodontia. Vombatiformes combined Vombatidae and Phascolarctidae, Phalangeriformes grouped the possums, and Macropodiformes grouped Macropodidae and Potoroidae together and added Hypsiprymnodontidae. This classification is consistent with Wilson and Reader (2005), who also found three suborders, grouping the possums to the exclusion of the macropods.

Amrine-Madsen et al. (2003), Meredith et al. (2008) and Meredith et al. (2009) examined the phylogenetics of the Diprotodonts using five nuclear genes: ApoB, BRCA1, IRBP, Rag1 and vWF. These studies found a basal split into two suborders, Vombatiformes and Phalangerida, which is inconsistent with the previous major phylogenetic analysis of Diprotodonts by Wilson and Reader (2005), who found three monophyletic suborders (Vombatiformes, Phalangeriformes and Macropodiformes) (Table 5.1). Of note, Meredith et al. (2009), who examined all genera, found that Phalangeroidea grouped with Macropodiformes and that this group was sister to Petauroidea.

Phillips et al. (2006) contend that phylogenetic studies should combine mitochondrial and nuclear DNA data to obtain a true reflection of the relationships between the species. However, any analysis should be conducted using separately modelled process partitions. Phillips and Pratt (2008) found support for two suborders of Vombatiformes and Phalangerida and found that Macropodidae grouped with Burramyidae and Phalangeridae. Beck (2008) also combined mitochondrial and nuclear data to resolve two suborders Vombatiformes and Phalangerida. Possums were found to be paraphyletic with Burramyidae and Phalangeridae grouping with Macropodiformes. Cardillo et al. (2004) found contrary results using combined data to form supertrees, with a paraphyly of all possums to the exclusion of macropods.

In this study mitochondrial and nuclear marker information are combined to develop a robust phylogeny that is able to underpin a species identification method (Phillips et al. 2006). This study provides the largest investigation of Diprotodontia species using combined mitochondrial and nuclear data. By developing species-specific markers for the Diprotodontia, some comment can be made as to the relatedness of the species and a robust phylogeny to underpin a method of identification for the different species can be created.

Table 5.1 Recent Diprotodont Classifications of Wilson and Reader (2005) and Meredith et al. (2009).

Wilson and Reader (2005)              Meredith et al. (2009)
Order Diprotodontia                   Order Diprotodontia
Suborder Vombatiformes                Suborder Vombatiformes
Family Phascolarctidae                Family Phascolarctidae
Family Vombatidae                     Family Vombatidae
Suborder Phalangeriformes             Suborder Phalangerida
Superfamily Phalangeroidea            Superfamily Petauroidea
Family Phalangeridae                  Family Petauridae
Family Burramyidae                    Family Pseudocheiridae
Superfamily Petauroidea               Family Acrobatidae
Family Petauridae                     Family Tarsipedidae
Family Pseudocheiridae                Superfamily Australoplagiaulacoida
Family Acrobatidae                    Clade Phalangeroidea
Family Tarsipedidae                   Family Phalangeridae
Suborder Macropodiformes              Family Burramyidae
Family Potoroidae                     Clade Macropodiformes
Family Hypsiprymnodontidae            Family Potoroidae
Family Macropodidae                   Family Hypsiprymnodontidae
                                      Family Macropodidae

Materials and Methods

Samples were obtained from vouchered and non-vouchered specimens, usually tissue (heart or liver), from various collections (Table 5.2). In all, 55 samples were obtained representing ten families, 27 genera and 55 species of Diprotodontia, plus an additional species used as an outgroup. Taxonomic designations follow Meredith et al. (2009). All tissue samples were extracted using either standard phenol chloroform procedures or the Qiagen DNeasy® Blood and Tissue Kit (Qiagen part # 69506), following the animal tissue bench protocol. Negative controls were used in all extractions.

Mitochondrial marker primers used to amplify COI and ND2 were as described in Chapter 6. Nuclear marker primers for ApoB and IRBP are as reported in Amrine-Madsen et al. (2008) and also used in Chapter 4. GAPD primers are as described in Chapter 4.

PCRs were performed in a 25 µL reaction using a master mix made up of 12.5 µL Go Taq Green® Master Mix 2X (Promega cat # M712C), 0.75 µL primer (Sigma Aldrich, refer Table 2 for primer sequences) and 0.3 µL 25 mM magnesium chloride (Qiagen) per sample. Master mix (12.5 µL) was added to each well of a 96 well plate (cat # T323-96N), after which 12.5 µL of genomic DNA was added to the appropriate wells. Negative controls were used in all PCRs. Samples were amplified under the following reaction conditions: an initial incubation at 94 °C for 3 min, followed by 40 cycles of 94 °C for 30 s, 50-57 °C for 30 s and 72 °C for 60 s, and a final extension step at 72 °C for 7 min.
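As a planning aid, the reagent volumes and cycling conditions given above can be turned into a small calculation. The sketch below is illustrative only: the helper names are hypothetical, no pipetting overage is included, and the run time ignores ramping between temperatures, so a real thermocycler run will take longer.

```python
# Per-sample master mix components in microlitres, as listed in the text.
MASTER_MIX_UL = {
    "Go Taq Green Master Mix 2X": 12.5,
    "primer": 0.75,
    "25 mM magnesium chloride": 0.3,
}


def scale_master_mix(n_samples, per_sample=MASTER_MIX_UL):
    """Total volume of each component for n_samples reactions (no overage)."""
    return {name: volume * n_samples for name, volume in per_sample.items()}


def program_minutes(initial_s, cycle_steps_s, cycles, final_s):
    """Nominal hold time of a PCR program in minutes, ignoring ramp times."""
    total_seconds = initial_s + cycles * sum(cycle_steps_s) + final_s
    return total_seconds / 60


# 94 C for 3 min; 40 cycles of 30 s, 30 s and 60 s; 72 C for 7 min.
run_minutes = program_minutes(3 * 60, [30, 30, 60], 40, 7 * 60)
plate_volumes = scale_master_mix(96)
```

The annealing temperature (50-57 °C) varies by marker but does not change the nominal hold time, which is why it does not appear in the calculation.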

Before sequencing, PCR products were visualised on a 1.2% agarose gel with Hyperladder II (Bioline) and ethidium bromide staining to ensure sufficient amplified DNA was obtained. All sequencing was performed in both directions by the Australian Genome Research Facility using M13 primers. Run data were edited using the SEQUENCHER version 4.1.4 (GeneCodes Corporation 1991-2002) software package. Consensus sequences were aligned using the CLUSTAL module in MEGA version 4 (Tamura et al. 2007) with manual adjustments. Maximum parsimony analysis was performed using PAUP* 4.0b10 (Swofford 2002), with gaps coded as missing data. Parsimony analysis utilised the heuristic search option with TBR branch swapping and employed bootstrap analysis incorporating 1,000 replications. The partition homogeneity test, using 100 replicates, was performed in PAUP* 4.0b10 to test the appropriateness of combining the five marker sets into one concatenated data set (Farris et al. 1994, Swofford 2002).
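The bootstrap analysis mentioned above rests on resampling alignment columns with replacement. The following minimal sketch illustrates only that resampling step; it is not the PAUP* implementation, the function name and toy alignment are hypothetical, and the tree search that would be run on each pseudo-replicate is not shown.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    alignment -- dict mapping taxon name to an aligned sequence
    rng       -- a random.Random instance (passed in for reproducibility)

    Columns are sampled with replacement until the replicate has the same
    length as the original alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


# Toy alignment with hypothetical taxon labels.
toy_alignment = {"taxon_1": "ACGTAC", "taxon_2": "ACGTTC", "taxon_3": "TCGTAC"}
replicate = bootstrap_alignment(toy_alignment, random.Random(1))
```

In a full analysis this step would be repeated for each replicate (1,000 in this study), a tree estimated from every pseudo-replicate, and the support for a clade reported as the percentage of replicate trees that contain it.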

The maximum likelihood analysis of the concatenated data was performed using PHYML (Guindon and Gascuel 2003) on the Montpellier bioinformatics platform (http://www.atgc-montpellier.fr/phyml/). Partitioned maximum likelihood analysis of the concatenated data was performed using RAxML version 7.2.3 (Stamatakis et al. 2008).

Table 5.2 Sample information for the samples analysed in the study of species identification of Diprotodontia. Information includes the collection source of the sample and the relevant reference numbers. Codes indicating sample origins are: VM – Victoria Museum, ANWC – Australian National Wildlife Collection, WA M – Western Australia Museum, AM – Australia Museum, WA PS – Murdoch University.

Family           Genus          Species        Source  Museum Reference
Phascolarctidae  Phascolarctos  cinereus       ANWC    M24360
Vombatidae       Vombatus       ursinus        VM      W5166
Phalangeridae    Trichosurus    vulpecula      ANWC    M28611
                 Trichosurus    caninus        ANWC    M16324
                 Trichosurus    johnstoni      AM      M23000
                 Trichosurus    arnhemensis    AM      AL8
                 Wyulda         squamicaudata  AM      M21955
                 Phalanger      lullulae       AM      EBU9564
                 Phalanger      gymnotis       AM      U13
                 Phalanger      sericeus       AM      M12869
                 Phalanger      vestitus       AM      EBU26417
                 Phalanger      orientalis     AM      M19104
                 Spilocuscus    maculatus      AM      GT149
                 Spilocuscus    rufoniger      AM      M13520
Burramyidae      Burramys       parvus         ANWC    M29305
                 Cercartetus    lepidus        VM      W5157
                 Cercartetus    nanus          VM      C31224
                 Cercartetus    caudatus       AM      EBU24767
                 Cercartetus    concinnus      VM      M55432
Pseudocheiridae  Pseudocheirus  peregrinus     ANWC    M16323
                 Pseudocheirus  occidentalis   WA M    48849
                 Petauroides    volans         ANWC    M16313
Petauridae       Gymnobelideus  leadbeateri    VM      W5317
                 Petaurus       breviceps      ANWC    M16819
                 Petaurus       norfolcensis   ANWC    M16378
                 Petaurus       australis      VM      W5247
                                trivirgata     ANWC    M24957
Tarsipedidae     Tarsipes       rostratus      VM      M55442
Acrobatidae      Acrobates      pygmaeus       ANWC    M16432
Potoroidae       Potorous       longipes       VM      W5305
                 Bettongia      penicillata    WA M    47907
                 Bettongia      tropica        ANWC    M28956
                 Bettongia      lesueur        WA M    53724
                 Aepyprymnus    rufescens      WA PS   48-01-596
Macropodidae     Lagostrophus   fasciatus      WA M    46395
                 Lagorchestes   hirsutus       WA M    41403
                 Onychogalea    unguifera      WA M    53713
                 Onychogalea    fraenata       WA PS   55-01-031
                 Petrogale      penicillata    VM      W5316
                 Petrogale      rothchildi     WA M    51607
                 Petrogale      persephone     WA PS   51-01-637
                 Petrogale      xanthopus      ANWC    M36009
                 Petrogale      lateralis      WA M    53741
                 Macropus       agilis         ANWC    M16443
                 Macropus       fuliginosus    WA PS   15-05-071
                 Macropus       irma           WA M    53722
                 Macropus       eugenii        WA M    49857
                 Macropus       rufogriseus    AM      M35562
                 Macropus       bicolor        VM      W5304
                 Setonix        brachyurus     WA M    48455
                 Dendrolagus    lumholtzi      ANWC    M16996
                 Dendrolagus    bennettianus   WA PS   64-91-334
                 Dendrolagus    matschiei      WA M    53707
                 Dendrolagus    goodfellowi    WA M    53706
Outgroup         Isoodon        obesulus       ANWC    M29952
                 Isoodon        macrourus      ANWC    M16380

Results

After removing ambiguous data, the following sequences were obtained for each marker: ApoB, 59 taxa, 724 base pairs; IRBP, 56 taxa, 560 base pairs; GAPD, 52 taxa, 478 base pairs; COI, 61 taxa, 699 base pairs; ND2, 63 taxa, 871 base pairs. The concatenated data set contained 64 taxa and 3,332 base pairs.
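The marker lengths reported above sum to the 3,332 base pairs of the concatenated data set, and a partitioned analysis needs the coordinate range of each marker within that alignment. The sketch below derives those ranges and writes them in the `DNA, name = start-end` form used by RAxML partition files. The order in which the markers were concatenated is an assumption made for this example (it is not stated in the text), and the function names are hypothetical.

```python
# Alignment length of each marker in base pairs, as reported in the Results.
# The concatenation order is assumed for illustration.
MARKER_LENGTHS = [
    ("ApoB", 724),
    ("IRBP", 560),
    ("GAPD", 478),
    ("COI", 699),
    ("ND2", 871),
]


def partition_ranges(marker_lengths):
    """Return (name, start, end) tuples using 1-based inclusive coordinates."""
    ranges = []
    start = 1
    for name, length in marker_lengths:
        end = start + length - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges


def raxml_partition_lines(ranges):
    """Format ranges as lines for a RAxML-style partition file."""
    return [f"DNA, {name} = {start}-{end}" for name, start, end in ranges]


ranges = partition_ranges(MARKER_LENGTHS)
total_length = ranges[-1][2]
partition_lines = raxml_partition_lines(ranges)
```

Checking that the final coordinate equals the reported total is a quick guard against a marker being trimmed differently in the individual and concatenated alignments.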

Maximum parsimony analysis of individual markers displayed similar topologies. All trees were condensed to display only those nodes with more than 50% bootstrap support. All markers, except for COI, were able to resolve Macropodiformes; however, IRBP grouped Petaurus breviceps within this clade, separate from its sister taxon Petaurus norfolcensis. All markers were also able to resolve Phalangeridae with strong bootstrap support. ApoB performed well as a diagnostic marker and generally grouped all species as expected (Figure 5.1). ApoB displayed robust support for the two suborders. Contrary to expectations, Phalangerida was split into three clades, grouping Petauroidea as sister to Phalangeridae and then as sister to the Macropods and Burramyidae (which grouped together).
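Condensing a tree to nodes with more than 50% bootstrap support means collapsing every weakly supported internal node into its parent, which leaves a polytomy. The sketch below illustrates the idea on a toy nested structure; it is not the routine used by the phylogenetic software in this study. The tree representation (a leaf is a string, an internal node is a `(support, children)` tuple, and the root has support `None`) and the taxon labels are assumptions made for the example.

```python
def condense(node, threshold=50):
    """Collapse internal nodes whose support is not above the threshold.

    A leaf is a string; an internal node is a (support, children) tuple.
    Children of a collapsed node are attached directly to its parent.
    """
    if isinstance(node, str):
        return node
    support, children = node
    kept = []
    for child in children:
        child = condense(child, threshold)
        is_internal = not isinstance(child, str)
        if is_internal and child[0] is not None and child[0] <= threshold:
            kept.extend(child[1])  # promote grandchildren, forming a polytomy
        else:
            kept.append(child)
    return (support, kept)


# Toy tree: the node with 40% support is collapsed, the others are kept.
toy_tree = (None, [(90, ["A", "B"]), (40, ["C", (80, ["D", "E"])])])
condensed = condense(toy_tree)
```

After condensing, taxon C and the well supported D plus E clade attach directly to the root alongside the A plus B clade.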

IRBP displayed more variation than expected, with Burramys not grouping with Cercartetus. Both IRBP and GAPD showed lower resolution at the deeper nodes (Figures 5.2 and 5.3). GAPD displayed strong bootstrap values only for the groupings of Pseudocheirus occidentalis and Pseudocheirus peregrinus and Phalangeridae. The phylogenetic performance of COI was not as clear as the other markers. The macropods were split into two clades and the two suborders were not resolved. Also, Macropus rufus did not group with the other macropods (Figure 5.4). Whilst ND2 did not resolve the deeper nodes, failing to resolve the two suborders, it did group taxa at the shallow nodes, as might be expected for such a rapidly evolving gene (Figure 5.5).

Partition homogeneity analysis produced a probability of 0.01, below the 0.05 significance level; the datasets were nevertheless combined, a decision supported by the similar topologies of the individual marker trees. The non-partitioned maximum likelihood analysis of the concatenated data combining the nuclear and mitochondrial markers (Figure 5.6) supports a two suborder structure as per Meredith et al. (2009). Possums grouped together, with the exclusion of Burramyidae, which grouped with the macropods.

Partitioned maximum likelihood analysis also provides strong bootstrap support for a two suborder structure as per Meredith et al. (2009): Vombatiformes and a second clade (Figure 5.7). However, this second clade grouped the Macropodiformes with Burramyidae to the exclusion of Phalangeroidea, which grouped with Petauridae. This result is contrary to the phylogenetic reconstruction of Meredith et al. (2009), although these relationships received only moderate support. Bootstrap values were otherwise high, with most over 90.

Figure 5.1 Phylogenetic analysis of 724 bp of the ApoB nuclear marker in Diprotodonts. Figures represent node support for maximum parsimony analysis based on 1,000 replications.

Figure 5.2 Phylogenetic analysis of 560 bp of the IRBP nuclear marker in Diprotodonts. Figures represent node support for maximum parsimony analysis based on 1,000 replications.

Figure 5.3 Phylogenetic analysis of 478 bp of the GAPD nuclear marker in Diprotodonts. Figures represent node support for maximum parsimony analysis based on 1,000 replications.

Figure 5.4 Phylogenetic analysis of 699 bp of the COI mitochondrial marker in Diprotodonts. Figures represent node support for maximum parsimony analysis based on 1,000 replications.

Figure 5.5 Phylogenetic analysis of 871 bp of the ND2 mitochondrial marker in Diprotodonts. Figures represent node support for maximum parsimony analysis based on 1,000 replications.


Figure 5.6 Phylogram of phylogenetic analysis of 3332 bps of the concatenated data sets for nuclear markers ApoB, IRBP and GAPD and mitochondrial markers COI and ND2 in Diprotodonts using maximum likelihood non-partitioned analysis performed in PHYML.


Figure 5.7 Phylogram of phylogenetic analysis of 3332 bps of the concatenated data sets for nuclear markers ApoB, IRBP and GAPD and mitochondrial markers COI and ND2 in Diprotodonts using maximum likelihood partitioned analysis for each gene performed in RAxML. ML bootstrap values are shown.

Discussion

The concatenated data found strong support for the two suborders of Vombatiformes and Phalangerida as per Meredith et al. (2009). The results in this study do not support the three suborder phylogeny of Wilson and Reader (2005). Strong support for possum paraphyly was obtained, with three clades resolved. The split of Burramyidae and Phalangeridae, with Burramyidae grouping with Macropodiformes and Phalangeridae grouping with Petauroidea, is significant and contrary to the current published phylogeny of Meredith et al. (2009). Meredith et al. (2009) found that both Burramyidae and Phalangeridae grouped with the Macropodiformes to form the superfamily Australoplagiaulacoida.

The results here illustrated robust support for Vombatiformes, which is consistent with previous studies (Kirsh et al. 1997, Springer et al. 1997, Amrine-Madsen et al. 2003, Munemasa et al. 2006, Phillips and Pratt 2008, Meredith et al. 2009). Results also illustrated robust support for Phalangerida, which is consistent with the studies of Kirsh et al. (1997), Amrine-Madsen et al. (2003), Munemasa et al. (2006), Phillips and Pratt (2008) and Meredith et al. (2009), but not with Wilson and Reader (2005). The results are therefore strongly supportive of a two suborder phylogeny of Vombatiformes and Phalangerida.

Interestingly, the results support three distinct possum clades. The largest clade contains the superfamily Petauroidea (Petauridae, Pseudocheiridae, Acrobatidae and Tarsipedidae); this clade is sister to a second clade, Phalangeridae. The final clade, Burramyidae, grouped with the macropods. The grouping of Burramyidae with the macropods was found in the partitioned and non-partitioned concatenated data sets and also in the single marker results using the ApoB marker (although it was clearer in the partitioned analysis). The partitioned analysis provided moderate bootstrap support values for this relationship. Using nuclear data sets alone, Meredith et al. (2009) also found Burramyidae grouping with the macropods, but not to the exclusion of Phalangeridae.

This difference in phylogenetic reconstructions between this study and Meredith et al. (2009) may be due to the BRCA marker, which was used in the analysis of Meredith et al. (2009) but not here. The BRCA marker is over 2,000 base pairs long and therefore contributed a significant portion of the data set for the Meredith et al. (2009) study. The other four nuclear markers used in the Meredith et al. (2009) study have been investigated in either this chapter or Chapter 4 of this research. Individual maximum parsimony topologies did not concur with those of the Meredith et al. (2009) final tree reconstruction. The difference is therefore most likely due to the addition of the BRCA marker information. Of possible significance is that Meredith et al. (2009) found significant results for the partition homogeneity test, indicating that the individual data sets could not be combined. Non-significant results were obtained using the bootstrap compatibility test, but no comparison of tree topologies was discussed. It is therefore possible that the results obtained by Meredith et al. (2009) are representative of the evolution of the gene rather than of the species. This illustrates the benefit of combining nuclear and mitochondrial data.

Combined mitochondrial and nuclear data studies (Asher et al. 2004, Phillips and Pratt 2008) found strong support for a basal split between Vombatiformes and Phalangerida. However, Phillips and Pratt (2008) found strong support for a clade containing Macropodoidea and Phalangeridae and moderate support for Burramyidae as a sister group to this clade. These studies included only limited representation of Diprotodont species and therefore may not be representative of the true relationships within the order.

Phillips et al. (2006) and Beck (2008) suggest that mitochondrial protein coding sequences are subject to composition bias, in that the third codon position may mislead tree reconstruction using maximum likelihood analysis. Beck (2008) suggests that this may even affect partitioned studies, but that Bayesian analysis may be less affected by the composition bias. Phillips et al. (2006), however, contend that partitioned analysis should be sufficient. The differences found in this study in the placement of Phalangeridae, compared with previous studies, may be due to the composition bias of the two mitochondrial protein coding sequences affecting correct tree reconstruction. However, the topology of the concatenated data in the grouping of Burramyidae with the macropods to the exclusion of Phalangeridae is consistent with the topology of the individual phylograms obtained from maximum parsimony, maximum likelihood and Bayesian approaches for the ApoB nuclear marker, and is therefore supportive of this being the correct topology.


Below is the suggested phylogeny for Diprotodontia emerging from this study.

Order Diprotodontia
    Suborder Vombatiformes
        Family Phascolarctidae
        Family Vombatidae
    Suborder Phalangeriformes
        Infraorder Phalangeromorphia
            Superfamily Petauroidea
                Family Petauridae
                Family Pseudocheiridae
                Family Acrobatidae
                Family Tarsipedidae
            Superfamily Phalangeroidea
                Family Phalangeridae
        Infraorder Macropodomorphia
            Superfamily Burramyoidea
                Family Burramyidae
            Superfamily Macropodoidea
                Family Potoroidae
                Family Hypsiprymnodontidae
                Family Macropodidae

Partitioned maximum likelihood provided clearer resolution and is supportive of the contention by Phillips et al. (2006) that combined mitochondrial and nuclear analysis should be a separately modelled process. The five markers generally produced similar topologies with some minor variations. Of the markers, ApoB performed well, producing a robust phylogeny with strong support values. ND2 also performed well, with all groupings as expected. Therefore, ApoB and ND2 in particular would be useful for species identification within this order and, given the degraded nature of samples encountered in wildlife forensics, ND2 may be the best primary marker to use in the first instance.

This study illustrates the benefits of combining nuclear and mitochondrial data to produce a more robust phylogeny to underpin a species identification test. Of the markers tested, ND2 provided the greatest level of diagnostic accuracy combined with amplification ease for degraded samples and could be used as a species identification tool for Diprotodonts. Presented here is a robust phylogeny developed for the first time using an extensive data set of nuclear and mitochondrial information in Diprotodontia.


Chapter 6 – Diprotodont Species Identification Using Cytochrome Oxidase I Barcoding Analysis

Photo: Vombatus ursinus (common wombat), a species within the order Diprotodontia. Photo by Linzi Wilson-Wilde.


Abstract

The Barcode of Life Initiative (BOLI) aims to sequence a short diagnostic section of DNA (approximately 648 base pairs) from the mitochondrial Cytochrome Oxidase I gene for all eukaryote species. There are many applications for this methodology and associated databases including species identification for wildlife forensic analyses. Here, we use the Australian marsupial Diprotodont group as a model to test the ability of the COI marker to identify species in a forensic context. We also compare the phylogenetic performance of the marker against other relevant mitochondrial markers, Cytochrome b and NADH dehydrogenase subunit 2. COI and ND2 primers were designed to amplify marsupials and successfully amplified in all species tested. ND2 performed slightly better in Diprotodonts than COI, which performed significantly better than cyt b. However, given the extensive COI sequence data on a broad range of species we suggest that COI be used as a preliminary species identification marker. We also provide a comprehensive set of data that can be used to inform the development of a Diprotodontid species identification test.

Introduction

The use of molecular DNA techniques to identify species has long been postulated (Woese and Fox 1977) either as a stand-alone approach or in conjunction with morphological methods (Hajibabaei et al. 2007). There are many applications of species identification tests, the most significant being for use in criminal investigations and population genetics. Wildlife crime costs approximately 20 billion dollars annually worldwide and is the third largest illegal trade in the world behind drugs and human trafficking and ahead of the arms industry (INTERPOL 2007). Species identification is one of the most important pieces of information required in wildlife crime investigations. A genetic species identification test could be used to convict offenders of various offences including the importation/exportation of endangered species, importation of invasive species (leading to disease transmission) and illegal poaching.

DNA-based species identification of a limited range of targeted species, such as pets, farm animals or other well characterised species, is relatively straightforward because a known specific DNA marker can be used. However, a variety of DNA markers may be required to cover a broad range of species. If the sample is morphologically unidentifiable, such as a tissue sample, a broader test may be required, backed up by a reference database of known DNA results. Currently, no single test can amplify all species.

The Cytochrome Oxidase I (COI) mtDNA gene has been suggested as a possible solution to this problem by providing a means to ‘barcode’ every living species on Earth (Hebert et al. 2003a). To this end, the Barcode of Life Initiative (BOLI) was initiated to catalogue and database a section of DNA sequence for every living species (Stoeckle 2003). The database developed as part of that project could then be challenged with an unknown sequence, which would be compared to all existing sequences held on the database to enable identification of the species of origin of the questioned sequence. In the development of the barcode approach, mitochondrial DNA was favoured over nuclear DNA because of its higher copy number and slower rate of degradation, which is important for partial or degraded specimens that may include pelts, eggs, hair and tissue fragments. It requires only small amounts of sample material and therefore has little impact on the morphology of crime scene samples. Additionally, mtDNA has a relative lack of introns, undergoes limited recombination, is inherited directly (haploid) and contains useful levels of genetic variation at a number of taxonomic levels.
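The idea of challenging a reference database with an unknown sequence can be sketched in a few lines of Python. This is a minimal illustration and not the BOLD identification engine: it assumes the query and reference sequences are already aligned and of equal length, uses a simple uncorrected p-distance, and the species labels and 12 bp fragments are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites, ignoring positions where either
    sequence has a gap or an ambiguous base."""
    valid = "ACGT"
    compared = differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def best_match(query, reference):
    """Return (species, distance) for the closest reference sequence."""
    return min(
        ((name, p_distance(query, seq)) for name, seq in reference.items()),
        key=lambda item: item[1],
    )


# Hypothetical 12 bp reference fragments, for illustration only.
reference_db = {
    "species_A": "ATGGCCTTAGCA",
    "species_B": "ATGGTCTTGGCA",
    "species_C": "ACGGTCATGGCT",
}
species, distance = best_match("ATGGCCTTAGCT", reference_db)
```

In practice a match would only be reported if the distance fell below a validated threshold, and the reliability of the result depends on how well the reference database covers closely related species.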

The 5’ end of the COI gene was identified as the sequence of choice for the barcode analysis because it has a broad range of primer sites that are applicable across a wide range of taxa (Folmer et al. 1994) within a relatively short sequence (approximately 648 base pairs) and evolves more slowly than the Cytochrome b (cyt b) gene, which is the most viable alternative (Lynch and Jarrell 1993). However, the sequence variation of COI and cyt b has not been measured in Diprotodonts and so their usefulness for species identification of Diprotodonts is unknown. Studies have shown that the COI gene can be used to accurately place individuals not only into the correct higher taxonomic categories (phyla and order) but also into the correct species (Hebert et al. 2003a; Hebert et al. 2003b).

The benefit of targeting one gene is that a single technique can be applied to all species in a high throughput system that can provide a rapid diagnosis and can then allow for further testing if required. Importantly, for the first time, laboratories across the world could use the same molecular system for analysis, leading to increased collaborations and information exchange. The Barcode of Life Database (BOLD), which was established to act as a repository for COI sequences, is now available and provided free to all researchers (Ratnasingham and Hebert 2007). The BOLI aims to sequence 500,000 species, equating to 5 million samples, by 2014 (Stoeckle and Hebert 2008).

Much of the criticism regarding the use of the COI gene as the barcode centres on its ability to identify species. Sperling (2003) contends that studies of the COI gene may be able to discriminate between some species but would perform poorly for closely related species because similar haplotypes tend not to be placed on Genbank. This could lead to a bias on Genbank towards more distantly related species.

Numerous researchers feel that no single gene can provide the information required to discriminate among all species (Mallet and Willmott 2003; Moritz and Cicero 2004; Johnson and Cicero 2004; Will and Rubinoff 2004; Erpenbeck et al. 2005). Also, since the COI gene is in the mitochondria, the retention of the ancestral population would be favoured. Difficulties may also arise through limited diversity between closely related species, as selection operates on the mitochondria as a whole; analysis may be hampered by the presence of pseudogenes in the nucleus; and genes may introgress following hybridisation (Herrnstadt et al. 1999; Wirth et al. 1999; Williams and Knowlton 2001; Thalmann et al. 2004; Kress et al. 2005; Erpenbeck et al. 2006).

Pseudogenes are sequences of mitochondrial DNA copied into the nucleus and can co-amplify with the mitochondrial DNA versions. Mitochondrial DNA mutates more rapidly than nuclear DNA, and over time differences between the mitochondrial and nuclear versions arise (Lopez et al. 1994). This can cause problems when trying to measure variation using mitochondrial DNA, as the nuclear version may be preferentially amplified using a particular primer sequence or exist in higher copy numbers in blood samples (Sorenson and Quinn 1998). The resulting variation found between samples may not be reflective of the true variation, and studies have shown that the presence of pseudogenes in phylogenetic data sets can lead to an overestimation of the number of species (Thalmann et al. 2004, Song et al. 2008). There are a number of ways pseudogenes can be distinguished from the true mitochondrial sequence. Here we used those recommended by Sorenson and Quinn (1998). Sequences can be compared with complete mitochondrial sequences to determine differences; the presence of ambiguous sequences, stop codons, insertions or deletions indicates possible nuclear sequences; and blood samples (containing excess quantities of nuclear DNA) should be avoided. Nuclear copies are not always complete copies of the sequences and appear shorter in sequence length. Therefore, targeting larger sequences, where possible, is preferable. Owing to differences in the accumulation of mutations, species may group in an unexpected arrangement in phylogenetic reconstructions (Sorenson and Quinn 1998). Differential extraction processes, altering primers and increasing annealing temperatures are methods of isolating the mitochondrial versions of targeted sequences.
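Two of the screening criteria described above, in-frame stop codons and insertions or deletions, lend themselves to a simple automated check. The Python sketch below is illustrative only and is not the procedure used in this study. It assumes the sequence is a protein coding fragment whose reading frame is known, and it uses the stop codons of the vertebrate mitochondrial genetic code (TAA, TAG, AGA and AGG; TGA encodes tryptophan). The function names and example sequences are hypothetical.

```python
import re

# Stop codons under the vertebrate mitochondrial genetic code
# (NCBI translation table 2); TGA encodes tryptophan in this code.
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def in_frame_stops(seq, frame=0):
    """Return 0-based positions of in-frame stop codons in an ungapped read."""
    seq = seq.upper().replace("-", "")
    return [
        i for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in MITO_STOPS
    ]


def frameshift_gaps(aligned_seq):
    """Return (start, length) of internal alignment gaps whose length is not
    a multiple of three; such gaps would disrupt the reading frame."""
    core = aligned_seq.strip("-")
    offset = len(aligned_seq) - len(aligned_seq.lstrip("-"))
    return [
        (offset + m.start(), len(m.group()))
        for m in re.finditer(r"-+", core)
        if len(m.group()) % 3
    ]


# Hypothetical fragments for illustration.
clean_read = "ATGTGACTA"      # ATG TGA CTA: no stop codons in this code
suspect_read = "ATGTAACTA"    # ATG TAA CTA: stop codon at position 3
stops_clean = in_frame_stops(clean_read)
stops_suspect = in_frame_stops(suspect_read)
gaps = frameshift_gaps("--ATG-CTA---GGA--")
```

A sequence flagged by either check is only a candidate nuclear copy; confirmation would still rely on comparison with complete mitochondrial sequences, as described above.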

Some studies have found little sequence divergence in the mitochondria. Johnson and Cicero (2004) found an average of only 1.9% sequence divergence (a range of 0.0%-8.2%) amongst North American birds compared with 7-8% in the study of Hebert et al. (2003a). Of the birds studied, 74% would not have been recognised as separate species. However, as Lambert et al. (2005) pointed out, only eight out of the 39 species comparisons used COI data, rendering these conclusions invalid. Having a higher level of sequence divergence between species, particularly closely related ones, is important for species identification as it ensures there are enough diagnostic points to allow for clear identifications. Studies have illustrated the sequence variation of mitochondrial genes in mammals. Sequence similarity has been found to generally be above 65% (see Figure 6.1) (Linacre and Tobe 2009). The level of similarity at COI and cyt b was found to be approximately 75-80%. However, similarity at NADH dehydrogenase subunit 2 (ND2) was found to be 65-70%, indicating that ND2 might be a more variable marker in Diprotodonts. A comparison of COI and cyt b performance in reconstructing phylogenies in mammals found that cyt b was more useful for species resolution and was recommended above COI for this purpose (Tobe et al. 2010).

Additionally, it has been shown that the ‘universal’ Folmer primers (Folmer et al. 1994) are not as universal as anticipated and do not amplify a broad range of species successfully. This is illustrated in a number of studies that have needed to design specific primers in order to maximise amplification success rates (Ward et al. 2005; Lorenz et al. 2005; Ivanova et al. 2007). However, there has been a move to a more general acceptance of the use of COI barcoding in assisting taxonomy and the understanding that further development may be beneficial, including the inclusion of other markers to fully resolve species boundaries (Smith et al. 2007; Sevilla et al. 2007; Kress and Erikson 2007; Frézal and Leblois 2008).

Here, we use the Australian marsupial order Diprotodontia as a model system to test the ability of the COI barcoding analysis to identify and distinguish species when compared with other relevant mitochondrial DNA markers, cyt b and ND2. Diprotodonts include the kangaroos, possums and gliders and offer numerous species, both closely related and more distantly related. Diprotodonts have also been targets of wildlife crime, purportedly being illegally exported overseas as part of the pet trade or for collections, as well as illegally hunted for sport, meat or skins (Anon 1998). The sugar glider is a popular ‘pocket pet’ in the United States and Europe (Ruth’s Sugar Glider Page 2009 www.sugarglider.com/archives/ruth/) and illegal hunting of macropods and possums has been the subject of the Australian Senate Select Committee on the Commercial Utilisation of Native Australian Wildlife (Anon 1998).

There are clear and important applications of a species identification test for this order. The order has been characterised in a number of studies using morphological (Archer 1984; Aplin and Archer 1987; Marshall et al. 1990; Luckett 1994; Szalay 1994; Springer et al. 1997; Horovitz and Sanchez-Villagra 2003) and molecular DNA techniques (Kirsh et al. 1997; Springer et al. 1998; Wilson and Reader 2005; Meredith et al. 2009). There is debate about some aspects of the phylogeny of the order. However, the phylogeny developed in Chapter 5, using combined nuclear and mitochondrial data, can be used as a reference phylogeny to compare the ability of the COI, ND2 and cyt b markers to discriminate a broad range of Diprotodont species.


Figure 6.1 Illustration of the rates of similarity for the various genes on the mammalian mitochondrion. The lower the similarity, the higher the variation at that site in mammals. ND2, ND6 and ATP8 have the highest levels of variation in mammals. This image is reproduced with the kind permission of Prof Adrian Linacre, Flinders University from Linacre (2009) (note: figures were correct at time of publication).

Materials and Methods

Samples were obtained from vouchered and non-vouchered specimens, usually tissue (heart, liver), from various collections (Table 6.1). In all, 55 samples were obtained representing ten families, 27 genera and 55 species of Diprotodontia, plus an additional two species of Isoodon used as outgroups. Samples were chosen to measure interspecies variation; intraspecies variation was outside the scope of this study. Taxonomic designations followed Meredith et al. (2009). All tissue samples were extracted using either standard phenol chloroform procedures or the Qiagen DNeasy® Blood and Tissue Kit (Qiagen part # 69506), following the animal tissue bench protocol. Negative controls were used in all extractions.

Primers for COI and ND2 were designed by aligning all available complete mitochondrial sequences for Marsupialia held on Genbank. Seventeen sequences were aligned using MEGA version 4 (Tamura et al. 2007) to identify conserved regions. The Macropus robustus sequence was imported into Primer 3 (Rozen and Skaletsky 2000) and primers were identified using standard criteria to amplify approximately 699 base pairs of the COI gene and 871 base pairs of the ND2 gene. Various primer sets were compared back to the aligned sequences, manual modifications were made, and two forward and two reverse primers were developed (Table 6.2). The primers designed included M13 tails as follows: forward, GTA AAA CGA CGG CCA GT; reverse, CAG GAA ACA GCT ATG AC.
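The construction of an M13-tailed primer can be illustrated with the short Python sketch below. The M13 tail sequences are those given above; the gene-specific sequences are hypothetical placeholders and are not the primers listed in Table 6.2. The sketch assumes the tail is added to the 5' end of the gene-specific primer, which is the usual arrangement for tailed sequencing primers.

```python
# M13 tail sequences as given in the text (spaces removed).
M13_FORWARD = "GTA AAA CGA CGG CCA GT".replace(" ", "")
M13_REVERSE = "CAG GAA ACA GCT ATG AC".replace(" ", "")


def add_tail(tail, primer):
    """Prepend a sequencing tail to the 5' end of a gene-specific primer."""
    return tail + primer.upper().replace(" ", "")


# Hypothetical gene-specific sequences, used here only as placeholders.
example_forward = add_tail(M13_FORWARD, "ACT GAC TGA CTG ACT GAC")
example_reverse = add_tail(M13_REVERSE, "TGC ATG CAT GCA TGC ATG")
```

Because every amplicon then carries the same tails at its ends, a single pair of M13 primers can be used to sequence the products of all primer sets.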

PCRs were performed in a 25 µL reaction using a master mix made up of 12.5 µL Go Taq Green® Master Mix 2X (Promega cat # M712C), 0.75 µL primer (Sigma Aldrich; see Table 6.2 for primer sequences) and 0.3 µL 25 mM magnesium chloride (Qiagen) per sample. 12.5 µL of master mix was added to a 96 well plate (cat # T323-96N), after which 12.5 µL of genomic DNA was added to the appropriate wells. Negative controls were used in all PCRs. Samples were amplified under the following reaction conditions: initial incubation at 94 °C for 3 min; followed by 40 cycles of 94 °C for 30 s, 50-57 °C for 30 s and 72 °C for 60 s; and a final extension step of 72 °C for 7 min.
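The per-sample volumes and cycling conditions above can be scaled with a small amount of arithmetic, sketched here in Python. The reagent volumes and step times are taken from the text; the function names and the optional pipetting overage are assumptions added for illustration and are not part of the protocol described.

```python
# Per-sample master mix volumes (microlitres) as stated in the text.
PER_SAMPLE_UL = {
    "Go Taq Green Master Mix 2X": 12.5,
    "primer": 0.75,
    "25 mM MgCl2": 0.3,
}


def master_mix(n_samples, overage=0.0):
    """Scale per-sample volumes to n_samples.

    `overage` is a fractional pipetting allowance (e.g. 0.1 for 10%);
    it is an assumption here, not a value from the text.
    """
    factor = n_samples * (1 + overage)
    return {reagent: round(vol * factor, 2) for reagent, vol in PER_SAMPLE_UL.items()}


def cycling_minutes(cycles=40, initial_s=180, steps_s=(30, 30, 60), final_s=420):
    """Total programmed hold time in minutes, excluding ramping between steps."""
    return (initial_s + cycles * sum(steps_s) + final_s) / 60


plate_mix = master_mix(96)
run_time = cycling_minutes()
```

For a full 96 well plate this gives 1,200 µL of master mix reagent, 72 µL of primer and 28.8 µL of magnesium chloride, and a programmed run time of 90 minutes before ramping time is taken into account.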

PCR products were visualised before sequencing on a 1.2% agarose gel with Hyperladder II (Bioline) and ethidium bromide staining. All sequencing was performed in both directions by the Australian Genome Research Facility using the M13 primers. Run data were then edited using ProSeq version 2.9 (Filatov 2001) and Chromas version 1.45 (McCarthy 1997) or SEQUENCHER version 4.1.4 (GeneCodes Corporation 1991-2002) software packages. Consensus sequences were aligned using the CLUSTAL module in MEGA version 4 (Tamura et al. 2007) with manual adjustments. Cyt b sequences were obtained from Genbank and aligned in MEGA. Kimura 2-parameter (K2P) distances were determined in MEGA to calculate the overall composite mean distances for COI, ND2 and cyt b.
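For readers unfamiliar with the K2P measure, the following sketch computes it for a single pair of aligned sequences using the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. It is an illustration only and is not the MEGA implementation used in this study; the function name and the handling of gaps (sites are skipped) are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences of equal length.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

The logarithms are undefined for highly saturated pairs (for example when 2P + Q reaches 1), in which case `math.log` raises a `ValueError`; a fuller implementation would need to handle that case explicitly.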

Maximum parsimony analysis was performed using PAUP* 4.0b10 (Swofford 2002) incorporating a heuristic search, seed type random and 10,000 bootstrap replications. Resulting trees were condensed to retain only those nodes with a minimum of 50% bootstrap support.

Table 6.1 Sample information regarding the Diprotodonts plus outgroup samples analysed in the study of species identification using the COI marker. The collection source of the sample and the relevant reference numbers are provided. ANWC – Australian National Wildlife Collection, VM – Victoria Museum, AM – Australia Museum, WAM – Western Australia Museum, WAPS – Murdoch University, Western Australia

Family            Genus           Species         Source   Museum Sample   Reference No
Phascolarctidae   Phascolarctos   cinereus        ANWC     M24360          PKPC2
Vombatidae        Vombatus        ursinus         VM       W5166           VWVU3
Phalangeridae     Trichosurus     vulpecula       ANWC     M28611          PBTV1
Phalangeridae     Trichosurus     caninus         ANWC     M16324          PBTC1
Phalangeridae     Trichosurus     johnstoni       AM       M23000          PBTJ1
Phalangeridae     Trichosurus     arnhemensis     AM       AL8             PBTA1
Phalangeridae     Wyulda          squamicaudata   AM       M21955          PBWS1
Phalangeridae     Phalanger       lullulae        AM       EBU9564         PCPL1
Phalangeridae     Phalanger       gymnotis        AM       U13             PCPG1
Phalangeridae     Phalanger       sericeus        AM       M12869          PCPS1
Phalangeridae     Phalanger       vestitus        AM       EBU26417        PCPV1
Phalangeridae     Phalanger       orientalis      AM       M19104          PCPN1
Phalangeridae     Spilocuscus     maculatus       AM       GT149           PCSM1
Phalangeridae     Spilocuscus     rufoniger       AM       M13520          PCSR1
Burramyidae       Burramys        parvus          ANWC     M29305          BPBP2
Burramyidae       Cercartetus     lepidus         VM       W5157           BPCL1
Burramyidae       Cercartetus     nanus           VM       C31224          BPCN1
Burramyidae       Cercartetus     caudatus        AM       EBU24767        BPCC1
Burramyidae       Cercartetus     concinnus       VM       M55432          BPCW1
Pseudocheiridae   Pseudocheirus   peregrinus      ANWC     M16323          PRPP1
Pseudocheiridae   Pseudocheirus   occidentalis    WA M     48849           PRPD1
Pseudocheiridae   Petauroides     volans          ANWC     M16313          PRPV1
Petauridae        Gymnobelideus   leadbeateri     VM       W5317           PGGL1
Petauridae        Petaurus        breviceps       ANWC     M16819          PGPB2
Petauridae        Petaurus        norfolcensis    ANWC     M16378          PGPN1
Petauridae        Petaurus        australis       VM       W5247           PGPA1
Petauridae        Dactylopsila    trivirgata      ANWC     M24957          PGDT1
Tarsipedidae      Tarsipes        rostratus       VM       M55442          TPTR1
Acrobatidae       Acrobates       pygmaeus        ANWC     M16432          AGAP1
Potoroidae        Potorous        longipes        VM       W5305           PPPL1
Potoroidae        Bettongia       penicillata     WA M     47907           PPBP1
Potoroidae        Bettongia       tropica         ANWC     M28956          PPBT1
Potoroidae        Bettongia       lesueur         WA M     53724           PPBL1
Potoroidae        Aepyprymnus     rufescens       WA PS    48-01-596       PPAR1
Macropodidae      Lagostrophus    fasciatus       WA M     46395           MWLF1
Macropodidae      Lagorchestes    hirsutus        WA M     41403           MWLH1
Macropodidae      Onychogalea     unguifera       WA M     53713           MWOU1
Macropodidae      Onychogalea     fraenata        WA PS    55-01-031       MWOF1
Macropodidae      Petrogale       penicillata     VM       W5316           MWPP1
Macropodidae      Petrogale       rothschildi     WA M     51607           MWPR1
Macropodidae      Petrogale       persephone      WA PS    51-01-637       MWPS1
Macropodidae      Petrogale       xanthopus       ANWC     M36009          MWPX1
Macropodidae      Petrogale       lateralis       WA M     53741           MWPL1
Macropodidae      Macropus        agilis          ANWC     M16443          MWMA1
Macropodidae      Macropus        fuliginosus     WA PS    15-05-071       MKMF1
Macropodidae      Macropus        irma            WA M     53722           MWMI1
Macropodidae      Macropus        eugenii         WA M     49857           WMME1
Macropodidae      Macropus        rufogriseus     AM       M35562          MWMN1
Macropodidae      Setonix         brachyurus      WA M     48455           MWSB1
Macropodidae      Wallabia        bicolor         VM       W5304           MWWB1
Macropodidae      Dendrolagus     lumholtzi       ANWC     M16996          MKDL1
Macropodidae      Dendrolagus     bennettianus    WA PS    64-91-334       MKDB1
Macropodidae      Dendrolagus     matschiei       WA M     53707           MKDM1
Macropodidae      Dendrolagus     goodfellowi     WA M     53706           MKDG1
Outgroup          Isoodon         obesulus        ANWC     M29952          M29952
Outgroup          Isoodon         macrourus       ANWC     M16380          M16380

Table 6.2 Primer information for COI, ND2 and cyt b primers used in the study of Diprotodont species identification using the COI barcoding marker. Note that sequences exclude M13 tails.

Primer Name      Primer Sequence 5’ to 3’               Reference
COILWW26F        caa tgc tta cct cag cca ttt tac        This study
COILWW904R       gtg gct gat gta aag tat gct cg         This study
COILWW18         acc tca gcc att tta cct atg ttc        This study
COILWW674R       gtt gat ata gga ttg ggt ctc cac        This study
HC02198          ggt caa caa atc ata aag ata ttg g      Folmer et al. 1994
LC1490           taa act tca ggg tga cca aaa aat ca     Folmer et al. 1994
ND2LWW31For      gaa aat gtt ggt tta cat cct tcc        This study
ND2LWW1047Rev    gat gca aga taa agg ctt gta att c      This study
ND2LWW1067Rev    gtt cga ttt gcg ttc gga tga tgc        This study
H15298           ccc ctc aga atg ata ttt gtc ctc a      Kocher et al. 1989
L14990           cca tcc aac atc tca gca tga tga aa     Kocher et al. 1989

Results

Samples were successfully amplified using varying primer combinations (Table 6.2). A total of 59 unambiguous sequences were obtained, containing no sequence gaps. All sequences varied between species, with the most informative positions being at the third codon position. K2P distances for COI reflected the variability within the order, with the minimum K2P distance of 0.015 being between the sister taxa Trichosurus vulpecula and Trichosurus arnhemensis and the maximum distance of 0.349 being between the distantly related Lagorchestes conspicillatus and Trichosurus johnstoni. Lagorchestes conspicillatus displayed the greatest amount of divergence of all species within the order. The overall composite mean K2P distances were 0.221 for COI, 0.206 for ND2 and 0.177 for cyt b.

Maximum parsimony analysis of COI data (Figure 6.2) found some grouping as expected along accepted taxonomic lines at the shallower nodes, with macropods generally grouping together (but in two clades). Phalangeridae grouped together with Burramyidae; however, a small clade of Phalangerida grouped unexpectedly with Macropus rufus. There was limited resolution at the shallow nodes and a number of other species were unresolved.

Maximum parsimony analysis of ND2 (Figure 6.3) found better resolution but was still unable to resolve the two suborders of Meredith et al. (2009), which split the Diprotodonts into Macropodiformes grouping with Phalangeridae and Burramyidae, and the remaining possums of Phalangerida grouping with Vombatiformes. Vombatiformes grouped but not to the exclusion of Phalangerida. The macropods were well resolved with generally high bootstrap values, as were Phalangeridae and Burramyidae. The cyt b marker did not resolve as well as COI or ND2, with only limited resolution of some macropods, Vombatiformes (but again not to the exclusion of Phalangerida) and the genus Trichosurus (Figure 6.4).

Frequency histograms of COI K2P distances, displayed as percentages, illustrate the variation found within Diprotodont genera, families and the order (Figures 6.5A-C). As expected, most of the variation was found within the order, followed by family and then genera. The variation found within the genera was widely dispersed, ranging from 0.015 to 0.200 (Figure 6.5A). The range of variation decreased at the deeper nodes; however, the variation levels were higher at the deeper nodes (Figures 6.5B and 6.5C). As expected, most of the variation observed using the COI marker was at the species level (as the range of variation was higher).
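The binning behind histograms of this kind can be sketched as follows. The 0.02 bin width is read from the axis labels of Figure 6.5; the function name is hypothetical, and the sketch uses half-open bins (start inclusive, end exclusive), which differs slightly at the bin edges from the labelling used in the figure.

```python
from collections import Counter

BIN_WIDTH = 0.02  # bin width read from the axis labels of Figure 6.5


def distance_histogram(distances):
    """Return {bin_start: percentage of comparisons} for pairwise K2P distances."""
    if not distances:
        return {}
    counts = Counter()
    for d in distances:
        # Integer index of the 0.02-wide bin that contains d.
        counts[int(d / BIN_WIDTH)] += 1
    total = len(distances)
    return {
        round(index * BIN_WIDTH, 3): 100.0 * n / total
        for index, n in sorted(counts.items())
    }


# Made-up distances for illustration only; these are not data from this study.
example = distance_histogram([0.015, 0.030, 0.035, 0.190])
```

Running the same function separately on within-genus, within-family and within-order comparisons would give the three panels; the grouping of pairs into those levels is left out of the sketch.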

Figure 6.2 Phylogenetic analysis of 699 bps of the COI marker in Diprotodont species using maximum parsimony analysis. All bootstrap values above 50% are illustrated.

Figure 6.3 Phylogenetic analysis of 871 bps of the ND2 marker in Diprotodont species using maximum parsimony analysis. All bootstrap values above 50% are illustrated.

Figure 6.4 Phylogenetic analysis of 407 bps of the cyt b marker in Diprotodont species using maximum parsimony analysis. All bootstrap values above 50% are illustrated.

Figure 6.5 Frequency histograms, as a percentage, of COI K2P sequence divergences for each level of the Diprotodont taxonomic hierarchy. Figure 6.5A (Within Genus) displays the variation observed between species within a genus, 6.5B (Within Family) displays the variation observed between genera within a family and 6.5C (Within Order) displays the variation observed between families within the Diprotodont order. In each panel the horizontal axis gives the K2P distance in bins of 0.020 (0.000-0.020, 0.021-0.040 and so on) and the vertical axis gives the percentage of comparisons.

Discussion

The COI barcoding sequence is relatively simple to amplify and analyse, once appropriate primers are obtained. As no single set of primers will universally amplify all species well, more specific primers are required. However, primers can be designed to amplify a broad selection of species. Combinations of the three sets of primers were required to amplify all Diprotodont species, although most specimens were amplified using just one set. The primers also amplified various other mammal species tested, including pet, farm and zoo animals (Wilson-Wilde et al. 2010, Chapter 3). The addition of M13 tails allowed for ease of sequencing by an external provider, regardless of which primer pair was used, and would be suitable for high-throughput species identification testing. The addition of M13 tails did not appear to hamper or affect amplification success rates.

The appearance of possible pseudogenes in some samples has the potential to complicate the ability to distinguish species correctly. If pseudogenes are amplified the results may not be reflective of the true relationship between species and may lead to the incorrect determination of the species of a sample. However, in this study the pseudogenes were easily identified, as they were generally less than 200 bps, were present in a lower concentration than the mtDNA product and were removed when annealing temperatures were increased. Their presence was mainly as interference in the resolution of the 5’ end, requiring repeated amplification.

The presence of these pseudogenes was not entirely unexpected as they have been observed before when analysing mitochondrial DNA (Williams and Knowlton 2001). Additionally, any samples that did not group as expected were BLAST searched (www.ncbi.nlm.gov/Blast), where all results returned matches to the species or closely related species. Where available, sequences obtained from samples which had previously contained pseudogenes were compared to sequences from the same species obtained from Genbank. These comparisons showed limited to no variation. All sequences were checked for ambiguities in sequence, stop codons and insertions or deletions. No blood samples were used in this study. As samples generally grouped as expected and there were no anomalous results (apart from Macropus rufus), there is no evidence to support the contention that any of the resulting sequences originate from pseudogenes.
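One of the checks mentioned above, screening for stop codons, can be sketched as follows. The text does not state how the check was carried out, so this is an assumed approach: because a partial COI fragment lies inside the coding region, any in-frame stop codon is a warning sign of a pseudogene or sequencing error. The stop codons are those of the vertebrate mitochondrial genetic code; the function name and the reading-frame argument are hypothetical.

```python
# Stop codons in the vertebrate mitochondrial genetic code (NCBI translation table 2).
# TGA is not a stop in this code; it encodes tryptophan.
MITO_STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def internal_stop_codons(sequence: str, frame: int = 0):
    """Return 0-based codon indices of in-frame stop codons.

    `frame` (0, 1 or 2) is the offset of the first complete codon.
    A trailing incomplete codon is ignored.
    """
    seq = sequence.replace(" ", "").upper()
    hits = []
    for codon_index, start in enumerate(range(frame, len(seq) - 2, 3)):
        if seq[start:start + 3] in MITO_STOP_CODONS:
            hits.append(codon_index)
    return hits
```

An empty list for the correct reading frame is consistent with a functional mitochondrial copy, although it does not by itself rule out a recently transferred pseudogene.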

All specimens had a unique COI sequence and were unambiguous in the determination of the species. Species generally grouped into the correct clade and this would indicate that if the questioned species was not represented in the reference database, then potentially some information could be derived regarding the order, family and possibly genus. However, clear resolution of the phylogenetic relationships of the species was limited, particularly at the deeper nodes. It is important to note that the COI sequence was chosen as it was deemed the most suitable for species differentiation or identification, not for phylogenetic analysis, and this is illustrated in the low resolution at the deeper node branches. It is therefore not unexpected that the resolved tree does not conform entirely to anticipated topologies, based on previous phylogenetic analysis using more suitable markers.

This is particularly illustrated when results are compared to ND2, which displayed much greater resolution and stronger bootstrap support values. Of the markers analysed, ND2 appeared to be the most informative marker and would be useful for species identification in Diprotodonts. Species within the same family grouped together, many with high bootstrap values, as shown in Figure 6.3. Where the tree was not condensed to nodes with bootstrap values above 50%, the tree topology was consistent with that found in Chapter 5. No genus was split into more than one clade, indicating that if an unknown sample was analysed using ND2, the results would group the unknown sample within the correct clade. This was not the case with COI, which split the macropods into multiple clades and did not group Vombatidae and Phascolarctidae, which would be expected given the evolutionary distance between these two families and the others within the Diprotodonts.

The cyt b marker proved relatively uninformative and would not be recommended for use as a species identification tool for Diprotodonts. This is contrary to the results of Tobe et al. (2010); however, that study looked at a broad range of species and only a limited number of Diprotodonts. Investigating Diprotodonts in more detail with a greater range of species illustrates that cyt b displays very little resolution, only resolving three genera with above 50% bootstrap support.

As all sequences were unambiguous and grouping was generally as expected, the COI barcoding sequence and the ND2 marker would be highly useful for species identification in a forensic environment. The level of sequence variability was high, with numerous informative codon positions in both markers. The biggest issue in this type of research is obtaining samples. As BOLI methods recommend vouchered specimens, researchers must either collect samples from the wild and submit them to a museum, or use already archived museum specimens. As some species are endangered or limited in numbers or geographic distribution, it would be preferable to use museum specimens as much as possible. BOLD is highly developed and easy to use; with further sample input covering a larger range of species, particularly mammals and other forensically important species, the database could be an important tool for species identification.

Prior to implementation in a forensic setting, further work is required. Additional databases of forensically relevant species from known specimens (e.g. vouchered museum samples) need to be developed. The marker and method also need to undergo a validation/verification process within the forensic laboratory as outlined in Chapter 2, akin to that described for human DNA testing. This includes testing against known case work samples, blind trials and sensitivity studies, so that the operating environment, including error rates for the method, is known. This can be a lengthy process, as illustrated by the recent introduction of cannabis identification into the forensic testing environment (Howard et al. 2008, Howard et al. 2009). Work on that project included characterisation of the method, sensitivity studies, species specificity and detection conditions (Howard et al. 2008), in addition to the development of a reference database, including variation, diversity and accuracy information (Howard et al. 2009).

The use of the COI barcoding marker for forensic species identification is possible, but may be more useful as one marker in an arsenal of tests. In this study the marker was able to distinguish between distantly related and more closely related species of the Diprotodont order. Information generated here can be used in investigations of wildlife crime on marsupial species. With further work in cataloguing eukaryotic life currently being undertaken by various barcoding groups, the system and associated database will have real and significant applications in forensic species identification and as such could be more useful than ND2, although further work needs to be carried out to improve access to more universal primers.


Chapter 7 – Amplification success rates of Ancient and Degraded DNA using Shortened Amplicons and Locked Nucleic Acids (LNAs).

Photo: Trichosurus vulpecula (Common brush-tailed possum), a species within the order Diprotodontia. Photo by Linzi Wilson-Wilde.


Abstract

Ancient or degraded DNA samples are often difficult to amplify due to the lack of sufficient copies of template DNA within the sample. Successful amplification of these samples relies on primers persistently targeting specific segments of the DNA code. Here, primers were designed to amplify shorter amplicons of overlapping segments of the mitochondrial Cytochrome Oxidase I barcoding marker in Diprotodonts. Primers were also spiked with Locked Nucleic Acids (LNAs) which are conformationally restricted nucleic acid analogues. Comparisons between amplification rates using LNA spiked primers and DNA primers found that LNA primers increase amplification success rates by at least an order of magnitude. The success of the LNA spiked primers was shown to be dependent on a number of factors such as the type, number and position of the LNAs added and the adjacent unmodified nucleotide. However, the critical factor in the amplification efficiency of the LNA spiked primer was the design of the original DNA primer.

Introduction

Degraded or low level DNA samples are commonly found in various studies involving the analysis of biological material and can be extremely difficult to amplify because of damage to the DNA. Studies such as ancient DNA analysis of archival museum specimens (Roy et al. 1994, Glenn et al. 1999, Leonard 2008), wildlife and ecological studies (Berry et al. 2007), evolutionary studies (Macdonald et al. 2010) and wildlife crime studies often involve the analysis of trace DNA. Specimens may include hairs (Robertson 1999, Deedrick and Koch 2004, Peabody et al. 1983), deliberately altered samples such as smashed eggs, pelts, and processed samples (Martinez and Danielsdottir 2000) and partial samples such as tissue and carcass remnants, bones, gall bladders and bile (Peppin et al. 2008).

There are two main ways to approach the analysis of these types of samples. The first is to improve the extraction process by better removing substances that would otherwise break down the DNA, as well as inhibitors of the amplification process (Schmerer et al. 1999). The second approach is to improve the amplification process. Damage to DNA causes standard oligonucleotide primers to experience difficulty in persistently targeting the specific segments of genomic DNA because of sequence changes (Höss et al. 1996). It is therefore necessary to look at ways of improving the amplification efficiency in degraded or low level DNA.

Methods to increase the success rate of the amplification process include the use of specifically designed primers to produce shortened amplicons (Hajibabaei et al. 2006) and modifications to the primers themselves. Degraded DNA is usually fragmented into shorter pieces with the level of fragmentation increasing with the level of degradation. Shortening the amplicon length can provide more opportunities for the primers to bind to either end of a target segment of the DNA because shorter fragments of template DNA will be present in greater numbers in degraded samples. This approach has been used in human forensic analysis for genotyping (Butler et al. 2003, Chung et al. 2004, Parsons et al. 2007) and also in ancient DNA analysis (Alonso et al. 2003) including barcoding using the cytochrome oxidase I (COI) marker (Hajibabaei et al. 2006, Meusnier et al. 2008, Wandeler et al. 2007). In particular, barcoding techniques target ancient DNA or historical specimens that have often undergone damaging preservation techniques such as storage in formalin and tanning of skins or hides. The barcoding approach involves targeting a small section of approximately 150 base pairs at the 5’ end of the mitochondrial COI gene. Currently, there are no published studies to determine whether this segment of the COI gene is sufficient for species identification purposes in Diprotodontia. The design of overlapping shortened amplicons to amplify the entire barcoding segment could advance this method further by enabling amplification of genetic information that would be comparable to that obtained from good quality tissue; thereby providing sufficient information for species identification.

Primer modifications that can be made to improve amplification success rates include inosine and 7-deazaguanosine (Motz et al. 2000), phosphorothioate linkages (Iyer et al. 1990), propyne analogs (Wagner et al. 1993) and locked nucleic acids (Hertoghs et al. 2003). The use of Locked Nucleic Acids (LNAs) is one technique that has shown great promise and was found to be particularly suitable for improving DNA amplification efficiency (Koshkin et al. 1998). LNAs are conformationally restricted nucleic acid analogues where the sugar is ‘locked’ into the northern type structure (Nielsen et al. 1999, Obika et al. 1998). This ‘locked’ molecular structure is achieved by a methylene bridge between the 2’-O atom and the 4’-C atom of any of the common nucleotides (refer Figure 7.1). The LNA containing DNA sequence binds using standard base pairing rules and is greatly benefited by the N-type rigid molecular structure. LNAs provide enhanced hybridisation and thermostable properties and a slower dissociation rate (Christensen et al. 2001) and have been shown to have applications in gene expression (Tolstrup et al. 2003), microarray probes (Kauppinen et al. 2006), single nucleotide polymorphism analysis (Simeonov and Nikiforov 2002), real-time PCR (Mouritzen et al. 2004) and genotyping (Ballantyne et al. 2008).

The melting temperature of an LNA primer increases by between 3 and 5 °C for each LNA addition (Koshkin et al. 1998). Patterson et al. (2000) also found that LNA additions shifted the conformation of adjacent non-LNA nucleotides towards the northern form. Further, McTigue et al. (2004) found that if the adjacent bases were purines, then the influence over the conformational structure was even stronger.
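As a rough worked example of the per-addition increase, the sketch below converts a DNA primer melting temperature into an estimated range after LNA spiking. It assumes that the 3 to 5 °C gains simply add up across LNA positions, which is a simplification; the function name and the 55 °C starting value are hypothetical.

```python
LNA_TM_GAIN_RANGE_C = (3.0, 5.0)  # per-LNA increase reported by Koshkin et al. (1998)


def lna_tm_range(dna_tm_c: float, n_lna: int):
    """Return (low, high) estimated melting temperature in degrees C.

    Assumes each LNA addition contributes independently (simple additivity).
    """
    if n_lna < 0:
        raise ValueError("n_lna must be non-negative")
    low_gain, high_gain = LNA_TM_GAIN_RANGE_C
    return (dna_tm_c + n_lna * low_gain, dna_tm_c + n_lna * high_gain)


# A hypothetical 55 degree C DNA primer spiked with three LNAs.
estimated = lna_tm_range(55.0, 3)
```

Under this assumption the hypothetical primer would melt somewhere between 64 and 70 °C, which illustrates why annealing temperatures may need to be re-optimised for LNA spiked primers.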

Latorra et al. (2003) found that LNA primers showed higher maximal annealing temperatures and improved PCR product yield with shorter primers compared to DNA primers. Latorra et al. (2003) also found that primer design was both complex and important to PCR success rates and that 1-3 LNA additions were optimal, although further specific rules for primer design were not determined. Levin et al. (2006) expanded upon the work of Latorra et al. (2003) by demonstrating that incorporation of LNAs at the 5’ end of the primer provided the best results, but that primers containing LNA additions (either at the 3’ end or evenly spaced) never performed worse than their DNA primer counterparts.

In this study, the amplification efficiency of shorter amplicons with and without the incorporation of LNA additions is investigated for the amplification and sequencing of low level and degraded DNA samples by analysing historical and contemporary samples of Diprotodontia. Target sequences focused on the COI barcoding gene for its potential applications in forensic species identification. The usefulness of the 5’ 150 base pair fragment of the COI marker for species identification in Diprotodontia is also investigated.


Figure 7.1 Comparison of the structure of Deoxyribonucleic acid (left panel) and Locked Nucleic Acid (right panel) molecules. The locked nucleic acid illustrates the methylene linkage between the 2’-O atom and the 4’-C atom.

Materials and Methods

Sample Information

DNA was extracted from 14 archival museum specimens from a variety of Diprotodont (marsupial) species (refer Table 7.1). Samples consisted of skins, tanned hides and alcohol fixed specimens, in various states of preservation and some over 100 years old. Samples were extracted according to a modified DNeasy Blood and Tissue Kit (QIAGEN) method as outlined in Christidis et al. (2008). DNA was also extracted from 23 contemporary museum specimens as outlined in Table 7.2 using a standard phenol chloroform extraction (Sambrook et al. 1989). Additionally, a serial dilution of the contemporary Pseudocheirus peregrinus genomic DNA at the following concentrations: 1/100, 1/1000, 1/2,500, 1/4,500, 1/10,000, 1/100,000, 1/1,000,000 was generated to test the amplification efficiency of the various primer sets. Negative controls were used in all extractions.

Primer Design

DNA oligonucleotide primers used to amplify the entire COI barcoding marker in one fragment were as described in Wilson-Wilde et al. (2010). To test the amplification efficiency of the smaller amplicons, DNA oligonucleotide primers were designed to amplify the COI barcoding marker in three overlapping segments (refer Figure 7.2 for a diagrammatic representation of the overlapping segments and associated primers). Seventeen diprotodont sequences were aligned using MEGA version 4 (Tamura et al. 2007) and conserved regions in the barcoding sequence of the COI gene identified. These design sites were then identified in the mitochondrial sequence from the Pseudocheirus peregrinus in Primer 3 (Rozen and Skaletsky 2000) and primers designed using standard criteria. Various potential primer sets were identified using this process and these were compared with the aligned sequences and subsequently modified manually to form three sets of primers (Table 7.3). Primers were purchased from Sigma Aldrich.
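The idea of covering one target region with overlapping shorter amplicons can be sketched as follows. This is not the procedure used to place the primers in this study, which depended on conserved regions in the alignment; it only shows how three overlapping windows relate to the full region. The function name, the 50 base pair overlap and the use of 699 base pairs (the COI length quoted in Chapter 6) are illustrative assumptions.

```python
def tile_region(region_length: int, n_segments: int, overlap: int):
    """Split a region into equally sized segments that overlap their neighbours.

    Returns a list of (start, end) pairs, 0-based with the end exclusive.
    """
    if n_segments < 1 or overlap < 0:
        raise ValueError("need at least one segment and a non-negative overlap")
    # Covered length = n * size - (n - 1) * overlap, solved for size and rounded up.
    size = -(-(region_length + (n_segments - 1) * overlap) // n_segments)
    if overlap >= size:
        raise ValueError("overlap must be smaller than the segment size")
    step = size - overlap
    segments = []
    for i in range(n_segments):
        start = i * step
        end = min(start + size, region_length)
        segments.append((start, end))
    return segments


# Three overlapping windows across an illustrative 699 bp target.
mini_amplicons = tile_region(699, 3, 50)
```

Each window here is under 270 base pairs, which shows the intended benefit: a fragmented template is more likely to contain an intact copy of a short window than of the full region, while the overlaps allow the three sequences to be joined again.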

The first segment of the COI primer set (Figure 7.2) was spiked with LNAs according to the rules outlined in Latorra et al. (2003) and Levin et al. (2006) using NetPrimer software (Premier Biosoft International at http://www.premierbiosoft.com/netprimer/). Purines were targeted for LNA conversion, particularly when located next to another purine base. LNA additions targeted the 5’ end of the oligonucleotide and were evenly spaced through the primer (Latorra et al. 2003 and Levin et al. 2006). Various combinations of 1 to 3 LNA additions were spiked, producing multiple forward and reverse primers, and all primers were checked for potential secondary structures using the software OligoAnalyzer from Integrated DNA Technologies (http://www.idtdna.com/analyzer/Applications/OligoAnalyzer/). Additionally, LNAs were spiked into the reverse primer amplifying the full barcoding segment of the COI gene for comparison, using the same process as for the COI segment one forward and reverse primers. See Tables 7.4 and 7.5 for the various forward and reverse LNA primers developed. All LNA primers were purchased from Integrated DNA Technologies, USA (http://www.idtdna.com/Home/Home.aspx). All primers designed included M13 tails as follows: forward, GTA AAA CGA CGG CCA GT; reverse, CAG GAA ACA GCT ATG AC.

Amplification and Sequencing

PCRs were performed in a 25 µL reaction using a master mix made up of 12.5 µL Go Taq Green® Master Mix 2X (Promega cat # M712C), 0.75 µL primer (Sigma Aldrich, refer to Table 7.3 for primer sequences) and 0.3 µL 25 mM magnesium chloride (Qiagen) per sample. 12.5 µL of master mix was added to a 96 well plate (cat # T323-96N), after which 12.5 µL of genomic DNA was added to the appropriate wells. Negative controls were used in all PCRs. Samples were amplified under the following reaction conditions: initial incubation at 94 °C for 3 min; followed by 40 cycles of 94 °C for 30 s, 50-57 °C for 30 s and 72 °C for 60 s; and a final extension step of 72 °C for 7 min.
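The reaction set-up arithmetic can be sketched as follows. The per-sample volumes and the cycling programme are taken from the text; the function names and the optional `overage` allowance are assumptions added for illustration, and ramping time between temperatures is ignored.

```python
# Per-sample master mix volumes (µL) as listed in the text.
PER_SAMPLE_UL = {
    "Go Taq Green Master Mix 2X": 12.5,
    "primer": 0.75,
    "25 mM magnesium chloride": 0.3,
}


def master_mix_volumes(n_samples: int, overage: float = 0.0) -> dict:
    """Scale each component for n_samples. `overage` is a hypothetical
    pipetting allowance (0.1 = 10% extra) and is not from the thesis."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    scale = n_samples * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_SAMPLE_UL.items()}


def programmed_seconds(initial: int = 180, cycles: int = 40,
                       steps: tuple = (30, 30, 60), final: int = 420) -> int:
    """Total programmed hold time in seconds (ramping is ignored)."""
    return initial + cycles * sum(steps) + final


print(master_mix_volumes(24))
print(programmed_seconds() / 60, "minutes of programmed hold time")
```

For the programme described above, the holds alone add up to 5,400 seconds (90 minutes) per run.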

All DNA-only primers were tested on the contemporary Diprotodontid samples. The three sets of COI mini primers were then used to amplify the historical Diprotodontid samples, using two rounds of PCR under the same conditions, with the amplified product from the first round used as the template DNA for the second round.

Each LNA primer was paired with a DNA primer and used to amplify a serial dilution of genomic Pseudocheirus peregrinus DNA, in order to examine the effect of the LNA additions and determine the optimal LNA primer pair. Once suitable primer pairs were identified they were used to amplify the contemporary DNA samples. The amplification success rates of the COI mini DNA and LNA primers and the COI standard DNA and LNA primers were also compared on the serial dilution of Pseudocheirus peregrinus.

PCR products were visualised before sequencing on a 1.2% agarose gel with Hyperladder II (Bioline) and ethidium bromide staining. The COI segment one samples were sequenced in both directions by the Australian Genome Research Facility using M13 primers. Run data were then edited using the SEQUENCHER version 4.1.4 (GeneCodes Corporation 1991-2002) software package and consensus sequences were exported as text files. Consensus sequences were aligned using the CLUSTAL module in MEGA version 4 (Tamura et al. 2007) with manual adjustments. Maximum parsimony and maximum likelihood analyses were performed using PAUP* 4.0b10 (Swofford 2002) and PHYML (Guindon and Gascuel 2003), with bootstrap analysis incorporating 10,000 and 1,000 replications respectively.


Figure 7.2 Positioning of the primers to amplify the Cytochrome Oxidase I marker on the mitochondrial DNA molecule. Primer locations are illustrated for the barcoding marker above and the three overlapping shortened segments below.

Table 7.1 Archival museum specimens used in the study of LNA additions and shortened amplicons. Vouchered specimens listed below were approximately 50 to 100 years old and in various states of preservation including alcohol stored, dried skins and skins which had undergone a tanning process. Specimens catalogued with no collection date information are listed below as unknown (u/k) collection date.

Genus           Species       Common Name                                        Sample No.  Specimen type  Collection date
Lasiorhinus     latifrons     Southern Hairy-nosed Wombat                        C30495      Mount          u/k
Pseudochirops   archeri                                                          C3935       Skin           1911
Petropseudes    dahli         Rock Ringtail Possum                               C3945       Skin           1912
Pseudochirulus  herbertensis  Herbert River Ringtail Possum                      C3949       Skin           1911
Bettongia       penicillata   Brush-tailed Bettong / Woylie                      C2452       Mount          u/k
Petrogale       brachyotis    Short-eared Rock-wallaby                           DTC184      Skin           1935
Thylogale       billardierii  / Rufous-bellied Pademelon, Red-bellied Pademelon  C6431       Skin           u/k
Macropus        dorsalis      Black-striped Wallaby / Scrub Wallaby              C1373       Skin           u/k
Macropus        parma         / White-throated Wallaby                           C8156       Skin (tanned)  u/k
Pseudocheirus   peregrinus    Common Ringtail Possum                             C27247      Mount          u/k
Pseudocheirus   convolutor    Ringtail possum                                    C3939       Skin           u/k
Phalanger       orientalis    Grey Cuscus, Guadalcanal Is                        C26556      Mount          u/k
Phalanger       orientalis    Grey Cuscus, New Georgia                           C24792      Alcohol        1879
Dorcopsis       sps           Wallaby (PNG/)                                     C18599      Skin           1973

Table 7.2 Contemporary museum specimens used in the study of LNA additions and shortened amplicons. Vouchered specimens listed below were tissue samples stored at -80 °C and less than 10 years old.

Genus          Species       Common Name                           Sample No.
Phascolarctos  cinereus      Koala                                 M24360
Vombatus       ursinus       Common Wombat                         W5166
Trichosurus    vulpecula     Common Brush-tailed Possum            M28611
Trichosurus    caninus       Mountain Brush-tailed Possum          M16324
Spilocuscus    maculatus     Spotted Cuscus                        GT149
Phalanger      orientalis    Grey Cuscus / Northern common cuscus  M19104
Cercartetus    nanus         Eastern Pygmy Possum                  M29305
Petauroides    volans        Greater Glider                        C31224
Pseudocheirus  peregrinus    Common Ringtail Possum                M16313
Gymnobelideus  leadbeateri   Leadbeater's Possum                   M16323
Petaurus       breviceps     Sugar Glider                          W5317
Petaurus       norfolcensis  Squirrel Glider                       M16819
Tarsipes       rostratus     Honey Possum                          M16378
Acrobates      pygmaeus      Feathertail Glider                    M55442
Bettongia      tropica                                             M16432
Aepyprymnus    rufescens     Rufous Bettong                        M28956
Petrogale      assimilis     Allied Rock Wallaby                   48-01-596
Petrogale      xanthopus     Yellow-footed Rock Wallaby            M36007
Petrogale      penicillata   Brush-tailed Rock Wallaby             M36009
Macropus       rufus         Red Kangaroo                          W5316
Macropus       giganteus     Eastern Grey Kangaroo                 M36017
Dendrolagus    goodfellowi   Goodfellow's Tree Kangaroo            MKMG1
Isoodon        macrourus     Northern Brown Bandicoot              49857

Table 7.3 Standard DNA primers developed to produce shortened amplicons for three overlapping segments of the 5’ end of the COI barcoding marker. Primer name, sequence, product length and melting temperature are provided.

Primer Name    Primer Sequence 5’ to 3’  Tm (°C)

Segment 1 (product length 229 bp)
COIminilww1aF  TCTGCCACTCTACCTATGTTC     51.9
COIminilww1aR  GCCTGACTATGAAGAAGATTAT    50.5

Segment 2 (product length 235 bp)
COIminilww2F   AAGTAATCGTAACCGCACA       51.4
COIminilww2R   GTAGGCTAAACAGTCCATCC      51.3

Segment 3 (product length 226 bp)
COIminilww3F   TTCCTCCTTCTCCTCGCATC      59.1
COIminilww3aR  ACAGCGGTGTTTGGTATTG       53.9

Table 7.4 LNA primers developed to produce shortened amplicons for the shortened segment of the 5’ end of the COI gene and also the COI barcoding marker. Primer name, sequence, primer length and melting temperature are provided. Bold and underlined base indicates an LNA addition.

Forward COI segment one primers
Primer Name  Sequence 5’ to 3’        Length (bases)  Tm (°C)
COIm1bLFa    TCTGCCACTCTACTATGTTC     21              55.9
COIm1bLFb    TCTGCACTCTACCTATGTTC     21              56.4
COIm1bLFc    TCTGCCACTCTACCTATGTTC    21              58.9
COIm1bLFd    CTCTGCCACTCTACCTATGTTC   22              58.0
COIm1bLFe    TCTGCCACTCTACCTATGTTC    21              61.1
COIm1bLFf    CTCTGCCACTCTACCTATGTTC   22              60.9

Reverse COI segment one primers
Primer Name  Sequence 5’ to 3’        Length (bases)  Tm (°C)
COIm1bLRa    GCCTGACTATGAAGAAGATTAT   22              54.8
COIm1bLRb    GCCTGACTATGAAGAAGATTAT   22              54.0
COIm1bLRc    GCCTGACTATGAAGAAGATTAT   22              54.4
COIm1bLRd    GCCTGACTATGAAGAAGATTAT   22              56.9
COIm1bLRe    GCCTGACTATGAAGAAGATTAT   22              56.6

Reverse COI barcoding segment primers
Primer Name  Sequence 5’ to 3’        Length (bases)  Tm (°C)
COIFLRa      ACTTCAGGGTGTCCAAAAAATCA  23              57.7
COIFLRb      ACTTCAGGGTGTCCAAAAAATCA  23              60.0
COIFLRc      ACTTCAGGGTGTCCAAAAAATCA  23              57.6
COIFLRd      ACTTCAGGGTGTCCAAAAAATCA  23              59.7
COIFLRe      CTTCAGGGTGTCCAAAAAATCA   22              57.6

Results

COI segment one amplified well, with the primers positioned in the most conserved section of the COI gene, and displayed the greatest amplification efficiency of the three segments, amplifying 11 of the 23 contemporary samples. Segments two and three were not as successful, amplifying 9 and 8 samples respectively (refer to Table 7.5). Amplification product bands were also stronger and more defined for segment one. As expected, the primers amplified Pseudocheirus peregrinus (from which they were designed) well, and amplified relatively well in the other species of Petauroidea. However, amplification efficiency was reduced in the macropods.

The results indicate that COI segment one forward primers COIm1bLFd and COIm1bLFf, reverse primers COIm1bLRc and COIm1bLRd, and COI barcoding reverse primers COIFLRc and COIFLRd provided the best results (Figures 7.3 and 7.4). Two to three LNA additions provided better results than one alone, and spiking LNAs into the middle and 5’ end, targeting purines where possible, generally provided the greatest amplification efficiency. Primer pairs require very similar Tm values. The following LNA primer pairs were determined to be optimal for amplification: COI segment one primers COIm1bLFb and COIm1bLRd, and COI barcoding primers COIm1bLFf and COIFLRd.
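The requirement for similar Tm values within a primer pair can be illustrated by ranking the segment one LNA primers from Table 7.4 by their Tm difference. This is a sketch only: the function name `rank_pairs` is hypothetical, and Tm similarity is a single criterion, whereas the pairs reported as optimal above were selected from the amplification results.

```python
# Tm values (°C) for the COI segment one LNA primers, from Table 7.4.
FORWARD_TM = {
    "COIm1bLFa": 55.9, "COIm1bLFb": 56.4, "COIm1bLFc": 58.9,
    "COIm1bLFd": 58.0, "COIm1bLFe": 61.1, "COIm1bLFf": 60.9,
}
REVERSE_TM = {
    "COIm1bLRa": 54.8, "COIm1bLRb": 54.0, "COIm1bLRc": 54.4,
    "COIm1bLRd": 56.9, "COIm1bLRe": 56.6,
}


def rank_pairs(forward: dict, reverse: dict) -> list:
    """Return (delta_tm, forward_name, reverse_name) sorted by delta_tm."""
    pairs = [
        (round(abs(f_tm - r_tm), 1), f_name, r_name)
        for f_name, f_tm in forward.items()
        for r_name, r_tm in reverse.items()
    ]
    return sorted(pairs)


# Print the five most closely matched pairs by Tm.
for delta, f_name, r_name in rank_pairs(FORWARD_TM, REVERSE_TM)[:5]:
    print(f"{f_name} + {r_name}: Tm difference {delta} °C")
```

On these values the pair COIm1bLFb and COIm1bLRd differs by 0.5 °C.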

The LNA primers for the entire barcoding marker (Figure 7.5b) provided greater amplification efficiency than the shortened LNA amplicon primers (Figure 7.5a); however, the larger amplicons also displayed secondary products in some samples, particularly the macropods. Generally, both sets of LNA primers performed better in the possums than in the macropods. Results indicate that LNA primers produce amplification in samples diluted up to an order of magnitude further than DNA primers alone (Figure 7.6). The COI segment one DNA and LNA primers amplified up to a dilution of 1 in 2,500 and 1 in 4,500 respectively (Figure 7.6a), whilst the barcoding marker DNA and LNA primers generally amplified up to a dilution of 1 in 10,000 and 1 in 100,000 respectively (Figure 7.6b).
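The size of the improvement can be worked through from the dilution limits quoted above. The short Python sketch below uses only those four figures; the names `DILUTION_LIMITS` and `fold_gain` are introduced for the example.

```python
import math

# Highest dilution (1 in N) at which amplification was reported (Figure 7.6).
DILUTION_LIMITS = {
    "COI segment one": {"DNA": 2_500, "LNA": 4_500},
    "COI barcoding marker": {"DNA": 10_000, "LNA": 100_000},
}


def fold_gain(limits: dict) -> float:
    """Ratio of the LNA dilution limit to the DNA dilution limit."""
    return limits["LNA"] / limits["DNA"]


for marker, limits in DILUTION_LIMITS.items():
    gain = fold_gain(limits)
    print(f"{marker}: {gain:.1f}-fold "
          f"({math.log10(gain):.2f} orders of magnitude)")
```

The full order of magnitude applies to the barcoding marker primers; for segment one the gain is 1.8-fold.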

Phylogenetic analysis using maximum parsimony and maximum likelihood analysis of COI segment one from contemporary and ancient samples indicates that very little phylogenetic information can be obtained for Diprotodonts (refer to Figure 7.7). The lack of resolution in the tree indicates a hard polytomy, which is contrary to existing phylogenies (Meredith et al. 2009) and suggests that COI segment one is not a phylogenetically informative marker. Seventy-six variable sites were found out of the total 186 base pairs. Sequence changes can be diagnostic and used for identification of known species held on a database. However, if a species is not held on the database, its grouping with related species via phylogenetic analysis would not be resolved.
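The distinction between matching a sequence to a known species on a database and placing an unrepresented species can be illustrated with a simple distance-based lookup. This sketch is not the analysis performed in this study: the reference sequences, species labels, function names and the 2% distance threshold are all hypothetical and chosen only for the example.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites that differ between two sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_match(query: str, reference: dict, max_distance: float = 0.02):
    """Return (name, distance) of the closest reference sequence, or
    (None, distance) when even the closest one exceeds max_distance."""
    name, dist = min(
        ((n, p_distance(query, seq)) for n, seq in reference.items()),
        key=lambda item: item[1],
    )
    return (name, dist) if dist <= max_distance else (None, dist)


# Hypothetical reference database and queries (toy sequences).
reference = {
    "species_A": "ATGGCCTTCACTAACCGTTG",
    "species_B": "ATGGCTTTTACCAATCGATG",
}
known_query = "ATGGCCTTCACTAACCGTTG"
unknown_query = "ATGACATTCTCTGACAGTCG"

print(best_match(known_query, reference))
print(best_match(unknown_query, reference))
```

In the toy data the first query matches a reference sequence exactly, while the second is too distant from every entry to be assigned, which mirrors the limitation described above for species not held on the database.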

Table 7.5 Amplification success rate for each primer pair amplifying the COI marker in three segments in the contemporary Diprotodont samples. Amplification is expressed as one of the following: + successful amplification, +(M) multiple products, - no amplification.

Genus Species              COI segment 1  COI segment 2  COI segment 3
Phascolarctos cinereus     +              +              -
Vombatus ursinus           +              -              +(M)
Trichosurus vulpecula      +(M)           -              +(M)
Trichosurus caninus        +              +(M)           +(M)
Spilocuscus maculatus      -              +(M)           +
Phalanger orientalis       +              +(M)           +
Cercartetus nanus          +              +              +
Petauroides volans         +(M)           +              -
Pseudocheirus peregrinus   +              +              +
Gymnobelideus leadbeateri  +              +              -
Petaurus breviceps         +              -              -
Petaurus norfolcensis      -              -              -
Tarsipes rostratus         +(M)           +              +
Acrobates pygmaeus         +              +(M)           -
Bettongia tropica          +              -              -
Aepyprymnus rufescens      -              -              +
Petrogale assimilis        -              +              -
Petrogale xanthopus        +(M)           -              +
Petrogale penicillata      +(M)           -              +
Macropus rufus             -              +              +(M)
Macropus giganteus         -              +              +(M)
Dendrolagus goodfellowi    +              -              -
Isoodon macrourus          -              -              -
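The counts quoted in the Results (11, 9 and 8 of the 23 contemporary samples) correspond to the single-product (+) entries in Table 7.5. A minimal Python tally, with the table transcribed by hand, is given below as a check. The names are introduced for the example, and the underscore printed for Petaurus norfolcensis in segment one is assumed to be a "-".

```python
# Table 7.5 transcribed as (segment 1, segment 2, segment 3) per species.
TABLE_7_5 = {
    "Phascolarctos cinereus": ("+", "+", "-"),
    "Vombatus ursinus": ("+", "-", "+(M)"),
    "Trichosurus vulpecula": ("+(M)", "-", "+(M)"),
    "Trichosurus caninus": ("+", "+(M)", "+(M)"),
    "Spilocuscus maculatus": ("-", "+(M)", "+"),
    "Phalanger orientalis": ("+", "+(M)", "+"),
    "Cercartetus nanus": ("+", "+", "+"),
    "Petauroides volans": ("+(M)", "+", "-"),
    "Pseudocheirus peregrinus": ("+", "+", "+"),
    "Gymnobelideus leadbeateri": ("+", "+", "-"),
    "Petaurus breviceps": ("+", "-", "-"),
    "Petaurus norfolcensis": ("-", "-", "-"),  # "_" in source, assumed "-"
    "Tarsipes rostratus": ("+(M)", "+", "+"),
    "Acrobates pygmaeus": ("+", "+(M)", "-"),
    "Bettongia tropica": ("+", "-", "-"),
    "Aepyprymnus rufescens": ("-", "-", "+"),
    "Petrogale assimilis": ("-", "+", "-"),
    "Petrogale xanthopus": ("+(M)", "-", "+"),
    "Petrogale penicillata": ("+(M)", "-", "+"),
    "Macropus rufus": ("-", "+", "+(M)"),
    "Macropus giganteus": ("-", "+", "+(M)"),
    "Dendrolagus goodfellowi": ("+", "-", "-"),
    "Isoodon macrourus": ("-", "-", "-"),
}


def tally(table: dict, accepted: set) -> list:
    """Count, per segment, the samples whose result is in `accepted`."""
    return [sum(row[i] in accepted for row in table.values()) for i in range(3)]


print("single product:", tally(TABLE_7_5, {"+"}))
print("any product:   ", tally(TABLE_7_5, {"+", "+(M)"}))
```

Counting multiple-product results as amplification raises the totals to 16, 13 and 13 for segments one, two and three.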

Table 7.6 Amplification success rate for each primer pair amplifying the COI marker in three segments in the archival Diprotodont samples. Amplification is expressed as one of the following: + successful amplification or - no amplification.

Genus Species                COI segment 1  COI segment 2  COI segment 3
Lasiorhinus latifrons        +              +              -
Pseudochirops archeri        +              +              -
Petropseudes dahli           +              +              +
Pseudochirulus herbertensis  -              +              +
Bettongia penicillata        +              +              +
Petrogale brachyotis         +              +              +
Thylogale billardierii       +              -              +
Macropus dorsalis            +              +              -
Macropus parma               +              +              -
Pseudocheirus peregrinus     -              +              +
Pseudocheirus convolutor     +              -              +
Phalanger orientalis (1)     +              -              +
Phalanger orientalis (2)     +              -              -
Dorcopsis sps                +              +              -

Figure 7.3 Comparison of individual forward (a) and reverse (b) COI segment one primers spiked with Locked Nucleic Acids, each amplified with a corresponding DNA primer. Panel (a) shows forward primers COIm1bLFa to COIm1bLFf and panel (b) shows reverse primers COIm1bLRa to COIm1bLRe, each across lanes 1 to 9. Amplifications used a serial dilution of Pseudocheirus peregrinus at the following concentrations: 1/100, 1/1000, 1/2,500, 1/4,500, 1/10,000, 1/100,000, 1/1,000,000 (lanes 1-8 and 2-9 respectively).

Figure 7.4 Comparison of individual reverse COI barcoding primers (COIFLRa to COIFLRe, lanes 1 to 9) spiked with Locked Nucleic Acids, each amplified with a corresponding DNA primer. Amplifications used a serial dilution of Pseudocheirus peregrinus at the following concentrations: 1/100, 1/1000, 1/2,500, 1/4,500, 1/10,000, 1/100,000, 1/1,000,000 (lanes 2-9 respectively).

Figure 7.5 Results of amplification of contemporary samples using LNA primers. Figure 7.5a illustrates amplification results of segment 1 of the COI barcoding marker and Figure 7.5b illustrates amplification results of the entire COI barcoding marker. Each panel comprises an upper and a lower gel with lanes 1 to 15, loaded as follows.

Lane No.  Upper gel                  Lower gel
1         Ladder                     Ladder
2         Phascolarctos cinereus     Acrobates pygmaeus
3         Vombatus ursinus           Bettongia tropica
4         Trichosurus vulpecula      Aepyprymnus rufescens
5         Trichosurus caninus        Petrogale assimilis
6         Spilocuscus maculatus      Petrogale xanthopus
7         Phalanger orientalis       Petrogale penicillata
8         Cercartetus nanus          Macropus rufus
9         Petauroides volans         Macropus giganteus
10        Pseudocheirus peregrinus   Dendrolagus goodfellowi
11        Gymnobelideus leadbeateri  Isoodon macrourus
12        Petaurus breviceps         Nil
13        Petaurus norfolcensis      Nil
14        Tarsipes rostratus         Nil
15        Ladder                     Ladder

Figure 7.6 Results of amplification of serial dilution of Pseudocheirus peregrinus comparing DNA and LNA primer pairs. Figure 7.6a illustrates amplification results of segment 1 of the COI barcoding marker and Figure 7.6b illustrates amplification results of the entire COI barcoding marker; in each panel the DNA primers are shown above the LNA primers, across lanes 1 to 9. Amplifications used a serial dilution of Pseudocheirus peregrinus at the following concentrations: 1/100, 1/1000, 1/2,500, 1/4,500, 1/10,000, 1/100,000, 1/1,000,000 (lanes 2-8 respectively). Lane 1 contains ladder.

Figure 7.7 Phylogenetic analysis of segment 1 of the COI barcoding marker in the Diprotodonts tested, analysed in PAUP* using Maximum Parsimony and Maximum Likelihood analysis. Figures represent bootstrapping values for MP above the line and ML below the line from 10,000 replications and 1,000 replications respectively. Taxa shown on the tree: Phalanger orientalis, Bettongia tropica, Macropus dorsalis, Macropus fuliginosus, Macropus agilis, Aepyprymnus rufescens, Petrogale penicillata, Petrogale xanthopus, Dendrolagus goodfellowi, Tarsipes rostratus, Petaurus breviceps, Petaurus norfolcensis, Gymnobelideus leadbeateri, Vombatus ursinus, Petauroides volans, Acrobates pygmaeus, Spilocuscus maculatus, Trichosurus caninus, Trichosurus vulpecula, Phascolarctos cinereus, Cercartetus nanus, Pseudocheirus peregrinus and Isoodon macrourus. Bootstrap values printed on the tree: 69, 56, 60, 99, 63 and 98.

Discussion

LNA primers demonstrated high specificity and performed optimally in highly conserved regions. Results suggested that two to three LNA additions provided the greatest increase in amplification output. The LNA primers used to amplify the entire COI barcoding marker provided the greatest amplification efficiency and were shown to be robust primers in Diprotodonts. This is likely to be due to the reverse primer being located in a highly conserved area of the gene in Diprotodonts. Of the shortened amplicon primers, those for COI segment one generally amplified well, while the primers that amplified segments two and three provided the poorest results.

These results support the contention that the original design of the primers is crucial to the amplification success rates of the subsequent LNA-containing primers, and that optimising the DNA primers prior to spiking with LNAs will lead to increased performance of the final LNA primer. This was illustrated by the amplification of the serial dilution comparing DNA and LNA versions of the same primer sequence. These results illustrate that LNA-spiked primers provide up to an order of magnitude increase in the amount of amplified product. This is a significant improvement in amplification success and could be crucial when amplifying degraded samples. However, by designing primers in conserved regions, a higher level of amplification efficiency can be reached. The use of alternative amplicons may not be able to overcome issues in primer target regions. The highest level of primer optimisation can be reached by designing primers in conserved regions and then spiking with LNAs.

Whilst sequences were obtained for some of the historical samples for segment one of the COI gene, the sequence was often of insufficient length and quality to be included in further analysis. Amplification success rates in the historical specimens may have been affected by the poor preservation state of some of the samples. The primer positioning and sequence could also have contributed to the lack of usable sequence. The use of shorter amplicons, of approximately 30-50 base pairs, may also overcome some of the amplification failures.

A major problem in amplifying short segments within a gene when targeting a broad range of species is that base changes, which enable species to be distinguished, can hamper the ability of the primer pairs to bind to the target site. Conserved sections are always preferred but can be difficult and sometimes impossible to locate. The forward primer for segment one was placed in the conserved section prior to the commencement of the gene sequence and the reverse primer was placed in a small 5 base pair (position 187 to 208 from the 5’ end of the gene) conserved section (in the 17 species aligned from entire mitochondrial sequences obtained from GenBank at the beginning of this study). However, further COI analysis work reported earlier (Chapter 6) illustrated that within the COI marker in the Diprotodonts there is a lack of conserved primer sites, with most base changes occurring at the third codon position throughout the COI barcoding marker sequence, including positions 187 to 208 from the 5’ end of the gene. Results reflected this existing variation, with fewer sequences obtained than expected.
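The effect of base changes at a primer binding site can be illustrated by counting mismatches between a primer and candidate target sites. The sketch below is an illustration only: the forward primer is COIminilww1aF from Table 7.3, but the two target sites, the function names and the choice of the last five bases as the 3’ region of interest are assumptions made for the example. Target sites are written in the same orientation as the primer.

```python
def mismatches(primer: str, site: str) -> list:
    """0-based positions at which the primer and target site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must be the same length")
    return [i for i, (p, s) in enumerate(zip(primer, site)) if p != s]


def three_prime_mismatches(primer: str, site: str, tail: int = 5) -> list:
    """Mismatches falling within the last `tail` bases of the primer."""
    cutoff = len(primer) - tail
    return [i for i in mismatches(primer, site) if i >= cutoff]


primer = "TCTGCCACTCTACCTATGTTC"  # COIminilww1aF, Table 7.3

# Hypothetical target sites (not sequences from this study).
sites = {
    "site_identical": "TCTGCCACTCTACCTATGTTC",
    "site_variable": "TCTGCTACTCTACCTATGTCC",
}

for name, site in sites.items():
    print(name, mismatches(primer, site), three_prime_mismatches(primer, site))
```

Screening each species in an alignment in this way would flag taxa whose binding sites differ from the primer, particularly near the 3’ end, before any amplification is attempted.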

For the reasons outlined above, amplification of more than one segment for identification purposes may not be practical and may produce a limited data return for the resources invested. Practically, this protocol would be more useful when looking at a known or single species, where specific primers can be found in conserved regions to maximise the ability of the primers to bind repeatedly to the target sequence.

Whilst the Consortium for the Barcode of Life has indicated that segment one of the barcoding marker could provide enough information for the discrimination of most species (Hajibabaei et al. 2006, Meusnier et al. 2008, Wandeler et al. 2007), in the Diprotodonts this does not seem to be the case. Maximum parsimony analysis of the segment produced a tree of very limited informative value. The lack of resolution in the tree indicates a hard polytomy, which is contrary to existing phylogenies (Meredith et al. 2009) and therefore does not support COI segment one being an informative marker. These results are also consistent with those found using the entire barcoding segment in Diprotodonts in Chapter 6. A comparison of the COI, ND2 and Cyt b markers found that ND2 was significantly more useful for species discrimination and phylogenetic reconstruction. Whilst the shortened COI segment may provide some diagnostic features, these were not sufficient for reliable species identification in Diprotodonts. However, this may not necessarily be the case in other species.

Grouping of the species into clades was not entirely as expected with reference to the phylogeny presented in Chapter 5. Apart from Trichosurus grouping together, Spilocuscus grouping as a sister group to Trichosurus and Petaurus grouping together, all other taxa essentially grouped together as a polytomy. It therefore appears that the use of this single shortened amplicon would not provide enough sequence data to adequately identify diprotodonts and additional markers would be needed to confirm identity. Any use of the short sequence to group an unknown sample into the correct taxon could not be relied upon, as the grouping within the tree was not consistent with current expectations.

These amplification techniques, whilst successful in increasing amplification efficiencies, are more difficult to implement when amplifying a broad range of species and would be more effective if amplifying a single species or group of closely related species. Shortened amplicons would be more easily identified and targeted where clear sections of sequence homogeneity can be identified. These targeted, specific primers can then be used as a base to spike LNAs into, further improving amplification success rates.

The design of the primers is crucial to amplification success rates and, whilst the LNA additions do improve PCR success rates, there are no alternative amplicons that can improve on the traditional conserved primers. Optimising the DNA primers prior to spiking with LNAs will lead to increased performance of the final LNA primer. When analysing ancient or highly degraded DNA, the use of amplicons of less than 100 base pairs is suggested. By combining shortened amplicons, targeting conserved regions and utilising the addition of LNAs, significant improvement in amplification efficiency can be achieved.


Chapter 8 – Synopsis

Photo: Thylogale thetis (Red-necked pademelon), a species within the order Diprotodontia. Photo by Linzi Wilson-Wilde.


Wildlife crime species identification

Reports suggest there has been an increase in illegal activities associated with wildlife crime, particularly involving organised crime syndicates (Alacs et al. 2009, Renctas 2001). Apprehension of offenders is reliant on good investigation techniques; however, subsequent prosecution of wildlife crime offenders relies largely on the ability of the prosecutor to prove an offence against a restricted or regulated species. The ability to identify an animal or animal part to species level is vital to the prosecution of the offence. If the animal or part is morphologically identifiable, traditional taxonomic (morphological) methods may be used. If this is not the case, molecular DNA identification methods will be required.

The delimitation of species is underpinned by three key disciplines of evolutionary biology: taxonomy, phylogeography and population genetics (Yang and Rannala). Developing the body of knowledge required for the creation of a species identification method and the provision of expert opinion evidence in courts of law requires considerable expertise and research funding, and is time consuming (sometimes taking years). The markers appropriate for determining species boundaries and for understanding the phylogenetic relationships of closely and distantly related species may differ from those later chosen for a species identification test, once the species boundaries and relationships are understood. Building a database of known DNA sequences for comparison with the sequence results from an unknown sample is vital to the confidence in the opinion that can be obtained from any molecular DNA testing results.

Evaluation of nuclear DNA markers for the identification of species and higher orders

Any species identification test requires underpinning information to support the opinion of the identification of the species. This includes known samples of the species, including closely related species, and also an understanding of the phylogenetic relationship of various relevant species. It has been proposed that an understanding of the phylogenetic relationships of a range of species can be improved by combining mitochondrial and nuclear data. However, the identification of suitable markers can be time consuming and may require considerable optimisation of markers prior to analysis in the species.

In this study the feasibility of screening a large number of markers was investigated, optimising only those that amplified relatively well. One hundred and two markers were screened using eight species from eight families within the Diprotodont order. Primers were rated for their ease of amplification. Eighteen markers were identified as having potential for further work outside the focus of this study (i.e. more suitable for a specific species study). Of the 18 markers, nine were identified for further optimisation work. The amplification performance of each of the markers was assessed in the Diprotodonts, including four markers never before tested in the Diprotodonts (EN2, ODC1, GAPD and PENK1). The results illustrated that ApoB provided the greatest resolution at the deeper nodes and that ApoB, PENK and vWF performed well at the shallow nodes. The information generated regarding the individual markers and their phylogenetic utility can be used to inform future studies in Diprotodonts and potentially other marsupial species.

Development of a database of COI sequences for forensic-based identification purposes

The Barcode of Life Project aims to catalogue DNA sequences for the Cytochrome Oxidase I gene (COI) for every living species on the Barcode of Life Database (BOLD). The database asserts strict quality control requirements and offers wildlife crime forensic scientists an ability to run a preliminary screen over unknown samples. However, the power of any reference database relies on the information contained within the database. Currently there are no COI sequences for Diprotodonts held on BOLD; however, there are references to sequence information for 32 species held on GenBank (which has no controls on the quality of data input). Therefore, there is a need to develop reference sequences for Diprotodonts that can be uploaded into BOLD to extend the use of the database to include Diprotodonts.

This study has resulted in the development of primers to amplify the COI gene in marsupials and other mammals. The study has also produced COI sequences for a broad range of species within the order and most genera. This data can be used to extend BOLD and inform future research into the COI marker and barcoding analysis, and provides the basis for a reference database of Diprotodont COI sequences obtained from vouchered museum specimens.

Evaluation of the utility of COI compared to other widely used mtDNA genes

Whether or not the COI marker can be used for species identification has long been debated from a barcoding perspective. From a forensic species identification perspective, the cytochrome b (cyt b) gene has often been used as an appropriate marker and recommended for use over COI (Tobe et al. 2010). Mitochondrial markers are often targeted for forensic species identification as they can be more easily amplified in degraded samples than nuclear markers. Given the high prevalence of specimens and unknown samples received in the investigation of wildlife crimes that have been exposed to harsh conditions, rendering the DNA degraded, the use of mitochondrial DNA markers is appealing. In Diprotodontia, another mitochondrial DNA gene that has been used is NADH dehydrogenase subunit 2 (ND2). The utility of COI was compared to cyt b and ND2. It was found that ND2 provided better phylogenetic resolution of the species within the Diprotodont order. However, given the large amount of reference sequences available for COI, COI may still be a more appropriate screening tool when the identification of the sample is unknown. If the sample is known to be a marsupial, then the use of ND2 may be preferred. It is recommended that COI not be used in isolation, but rather as one marker in an arsenal of tools available for a forensic scientist to use for species identification.

Evaluation of options for the implementation of DNA-based species identification services in forensic laboratories for Diprotodontia

Species identification can be relatively straightforward for species with a clear gap between related species, such as Phascolarctos cinereus (koala) and Vombatus ursinus (wombat); however, it can be much more problematic when the species boundary is more blurred, such as between Petaurus gracilis () and Petaurus breviceps (Sugar glider). This is due to problems such as introgression, cryptic species and hybridisation. An understanding of evolutionary biology is vital for forensic scientists providing opinion evidence on species identification in wildlife crime. Additionally, dedicated equipment and appropriate training would be required for government laboratories to provide services in this field. A review of the current testing methods and processes (contained in Chapter 2) indicates this would be a difficult task for these laboratories. One approach could be to implement a virtual institute with a nodal system of laboratories, each with particular specialisation and expertise, providing the service for others. The specialisation could be broad ranging but focussed on particular taxonomic clades such as Mammalia, Aves, etc. If a government laboratory were to validate and implement species identification for wildlife crime, it is recommended that only one jurisdictional laboratory in Australia implement this methodology.

What has been developed as a result of this study is a robust phylogeny of the order Diprotodontia to underpin a species identification test. Additionally, information was generated on the phylogenetic performance of numerous markers in Diprotodonts and on potential improvements for the amplification of degraded samples. It is recommended that forensic laboratories implement COI as a screening tool for unknown samples and also maintain an arsenal of other tools for more specific species identification if required. ND2 is recommended for use with Diprotodontia.
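The marker recommendations above can be sketched as a simple decision rule. This is only an illustration of the recommendation as summarised in this chapter, not a validated laboratory procedure; the function name, the group labels and the ordering of the follow-up markers for unknown samples are assumptions made for the example.

```python
from typing import List, Optional

# Hypothetical labels for samples already known to be marsupial in origin.
MARSUPIAL_LABELS = {"marsupial", "diprotodontia"}


def marker_plan(known_group: Optional[str] = None) -> List[str]:
    """Suggest an ordered list of mtDNA markers for one sample.

    known_group is None when the origin of the sample is unknown,
    otherwise a free-text label such as "marsupial" (illustrative only).
    """
    if known_group is not None and known_group.strip().lower() in MARSUPIAL_LABELS:
        # Known marsupial: ND2 gave the better resolution within Diprotodontia,
        # with COI kept as a second marker.
        return ["ND2", "COI"]
    # Unknown origin: screen with COI (largest reference database),
    # but never rely on COI alone. The follow-up order is an assumption.
    return ["COI", "cyt b", "ND2"]


if __name__ == "__main__":
    print(marker_plan())                 # unknown sample
    print(marker_plan("Diprotodontia"))  # sample known to be a marsupial
```

In both branches more than one marker is returned, reflecting the recommendation that COI not be used in isolation.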

In order for any laboratory to specialise in the area of species identification for wildlife crime investigations, further research and technique development are required to ensure sufficient tools are available to scientists to identify relevant species. The process of validation is extensive and time-consuming, as discussed in Chapters 2 and 6. Additionally, the ability to develop further tools and databases of reference samples is also required, as control samples and sequence data for the species of interest may not currently be available.

Improving the analysis of degraded samples

Degraded samples pose a particular problem in DNA analysis, including species identification. Any techniques that can improve DNA-based identification results from degraded specimens and trace samples would be of great benefit not only to forensic scientists but also to any researcher dealing with these types of samples, such as in ecology, conservation and phylogenetic studies. In inhibited and degraded samples, the primers cannot consistently bind to the DNA to facilitate the amplification process. This study investigated techniques that might be useful in improving the ability of the primers to consistently bind to the DNA. The use of primers to amplify shortened amplicons and the use of primer modifications were investigated. This research focussed on the short 5’ segment of the COI barcoding gene as recommended by the BOLI and, additionally, on the use of Locked Nucleic Acid (LNA) primer modifications, using the species from the order Diprotodontia as model organisms.

Results indicated that the shortened segment of the COI gene was not as useful as other markers for species identification and it is not recommended for general use. Better results were obtained from most samples using the standard barcoding segment. However, when extremely degraded (ancient) samples were analysed, no sequence data could be obtained for the barcoding segment, as the DNA was too fragmented to allow the primers to consistently bind to the target region. Results were obtained from these samples using the primers for the shortened segment, and these may be more useful when samples are particularly degraded. However, the primers were not able to bind across a broad range of species, so specifically designed primers to amplify shortened amplicons may be required to improve amplification success rates for a particular species.
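The benefit of a shorter amplicon in highly degraded samples can be illustrated with a simple random-breakage model. This model is an assumption introduced here for illustration and was not part of the experimental work: if each bond between adjacent bases is broken independently with probability p, a target of L bases remains intact with probability (1 - p)^(L - 1). The amplicon lengths used below (650 and 130 bases) and the break probabilities are illustrative values only, not measurements from this study.

```python
def intact_fraction(amplicon_len: int, break_prob: float) -> float:
    """Expected fraction of template copies in which a target of
    amplicon_len bases is unbroken, assuming independent breaks
    with probability break_prob at each of the amplicon_len - 1 bonds."""
    if amplicon_len < 1:
        raise ValueError("amplicon_len must be at least 1")
    if not 0.0 <= break_prob <= 1.0:
        raise ValueError("break_prob must lie between 0 and 1")
    return (1.0 - break_prob) ** (amplicon_len - 1)


if __name__ == "__main__":
    full_len, short_len = 650, 130  # illustrative amplicon lengths in bases
    for p in (0.001, 0.01, 0.02):   # illustrative per-bond break probabilities
        full = intact_fraction(full_len, p)
        short = intact_fraction(short_len, p)
        print(f"p={p}: full={full:.3g}, short={short:.3g}")
```

Under this model the intact fraction falls exponentially with amplicon length, which is consistent with the observation that only the shortened segment could be recovered from the most degraded samples.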

The addition of LNAs to the primers improved the amplification efficiency of the primers by up to an order of magnitude and could be a cost-effective method of improving results from difficult samples. The addition of LNAs did however reduce the ability of the primers to amplify a wide range of species across the Diprotodont order. This is likely due to the LNA additions increasing the specificity of the primers. This would be useful if targeting a particular species to the exclusion of other species. It was also found that LNA primers did not amplify when M13 tails were attached. This is possibly due to interactions between the M13 primers used for sequencing and the LNA section of the primers causing tertiary structures. It is therefore recommended that M13 tails are not used when LNA modifications are incorporated into primers. This will reduce the application of high throughput analysis processes when using LNA modified primers. However, when analysing degraded samples or when requiring high specificity, this may be an appropriate compromise.

Future Directions

Whilst the COI barcoding marker provides part of the solution for forensic laboratories seeking to implement species identification techniques, considerable work needs to be undertaken to develop a database of known control sequences of forensically relevant samples. The work undertaken and information generated in this thesis can be used to inform and populate such a database. However, further species would need to be analysed, particularly from Aves and Reptilia, two classes commonly targeted in wildlife crime activities.

In addition to the development of a database of known samples, extensive work must be undertaken to meet accreditation validation and verification requirements. This process is aimed at understanding the limitations of the method so that an accurate representation of the strength of the evidence can be presented in court. Studies include the determination of false positives, error rates and sensitivity, the results of which must be published in peer reviewed journals. For example, cannabis identification methods were developed using DNA analysis, and considerable work was undertaken after the development studies were completed to provide the validation data needed to implement the method in a forensic laboratory (Howard et al. 2008, Howard et al. 2009).

Prior to the implementation of the testing and methods for species identification in Diprotodonts, as outlined in this study, further work should be conducted to thoroughly validate the methods. In addition, for each laboratory that implements species identification, verification studies must be conducted to ensure expected results are obtained in line with the previously obtained validation results. This ensures consistent results are obtained for the courts regardless of which scientist or laboratory conducts the analysis.

New molecular techniques may also offer assistance as an adjunct to the COI marker such as SNP analysis, particularly for mixed samples. Additionally, with the introduction of next generation sequencers and bioinformatics systems, whole of genome sequencing could be investigated for its application to forensic species identification.

There are few capacity building activities currently occurring globally. The Association of Southeast Asian Nations Wildlife Enforcement Network (ASEAN-WEN) initiative has the greatest potential to affect Australia. However, Australia is not currently involved in the ASEAN-WEN project; TRAFFIC has assisted in this initiative. Additionally, the world’s only laboratory dedicated to the forensic analysis of wildlife crime (the US National Fish and Wildlife Forensic Laboratory) recently announced the creation of a Society for Wildlife Forensic Sciences and an associated tri-annual conference dedicated to wildlife forensic sciences (the first meeting was held in Ashland, Oregon, 19-23 April 2010).

Wildlife crime has the potential to impact to a greater extent on the Australian environment than on that of other countries, due to the unique nature of the region’s fauna and flora. From a forensic viewpoint, the Australia New Zealand Policing Advisory Agency - National Institute of Forensic Science (ANZPAA NIFS) (www.anzpaa.org.au) facilitates the Australian Wildlife Forensics Network. At a recent INTERPOL meeting of the Wildlife Crime Working Group (Manaus, Brazil, September 2009), numerous international representatives provided examples of seizures of Australian animals, illustrating that Australia is being targeted for its unique fauna by organised criminal networks. The INTERPOL Wildlife Crime Working Group is actively involved in conducting targeted operations on organised crime. This involves sharing intelligence and general information, and capability building initiatives including training and method development.

It is clear that wildlife crime in Australia is increasing (Alacs and Georges 2008) and that the extent of the problem is still largely unknown, as illegal shipments can be hidden within the regulated legal trade (Smith et al. 2009). Partnerships between multiple agencies can be effective in identifying contraband and perpetrators and deterring other offenders. Effective action requires collaboration between investigators, forensic specialists and governments of source, transit and recipient countries. It requires the commitment of government and law enforcement agencies, international collaboration and information exchange and, importantly, the subsequent prosecution of offenders. One option suggested is for the prosecution to focus on the theft of a public resource, which is motivated by profit and presents a potential risk to human health in the spread of diseases (such as avian influenza (H5N1) and severe acute respiratory syndrome (SARS) (Wyler and Sheikh 2008)). This effort needs to be made in conjunction with wildlife and habitat protection, promotion of eco trade (which can also target former poachers), public outreach campaigns, sustainable agriculture and funding for further research. Eco-tourism communities can earn 2-3 times as much money from tourism as they can from poaching, and tourism also ensures they have a long term funding source.

Molecular approaches to wildlife crime require further development within Australia, with a coordinated nodal based service provision structure underpinned by funded research and development activities. This would require dedicated government funding and focussed attention to this important issue.

References

Alacs E and Georges A. (2008) Wildlife across our borders: a review of the illegal trade in Australia. Australian Journal of Forensic Sciences 40:147-160.

Alacs E, Georges A, Fitzsimmons NN and Robertson J. (2010) DNA detective: a review of molecular approaches to wildlife forensics. Forensic Science, Medicine and Pathology 6:180-194 doi: 10.1007/s12024-009-9131-7.

Aitken N. (1997) The mitochondrial control region and cytochrome b of the (Trichosurus vulpecular): potential for population markers. School of Biological Sciences Auckland, University of Auckland.

Alonso A, Martin P, Albarrán C, García P, Primorac D, García O, Fernández de Simón L, García-Hirshfeld J, Sancho M and Fernández-Piqueras J. (2003) Specific Quantification of Human Genomes from Low Copy Number DNA Samples in Forensic and Ancient DNA Studies. Croatian Medical Journal 44 (3):273.

Amplitype User Guide (Perkin-Elmer Corporation, 1990).

Amrine-Madsen H, Scally M, Westerman M, Stanhope MJ, Krajewski CW, Springer MS. (2003) Nuclear gene sequences provide evidence for the monophyly of australidelphian marsupials. Molecular Phylogenetics and Evolution 28:186–196

Anon. (1998) Australian Senate Select Committee on the Commercial Utilization of Native Australian Wildlife. Commercial Utilisation of Australian Native Wildlife. Australian Department of the Senate, Canberra www.aph.gov.au/Senate/committee/rrat_ctte/completed_inquiries/1996-99/wild/report/contents.htm.

Anon. (2009) Strengthening forensic science: a path forward. Report for the National Academy of Sciences. The National Academies Press, Washington, D.C. www.nap.edu

Aplin KP and Archer M. (1987) Recent advances in marsupial systematics with a new syncretic classification. In: Archer, M. (Ed.), Possums and Opossums: Studies in Evolution. Surrey Beatty & Sons, Chipping Norton, NSW, pp. xv–lxxii.

Archer M. (1984) The Australian marsupial radiation. In: Archer M, Clayton G (Eds) Vertebrate zoogeography and evolution in Australasia. Hesperian Press, pp477-516.

Asher RJ, Horovitz I and Sánchez-Villagra MR. (2004) First combined cladistic analysis of marsupial phylogenetic relationships. Molecular Phylogenetics and Evolution 33:240-250.

Avise JC. (2000) Phylogeography: the history and formation of species. Massachusetts: Harvard University Press.

Ayres KL, Chaseling J and Balding DJ. (2002) Implications for DNA Identification Arising from an Analysis of Australian Forensic Databases. Forensic Science International 129:90.

Baker ML, Wares JP, Harrison GA and Miller RD. (2004) Relationship among the families and orders of marsupials and the major mammalian lineages based on recombination activating gene-1. Journal of Mammalian Evolution 11:1–16.

Balazs I, Neuweiler J, Gunn P, Kidd KK, Kuhl J and Mingjun L. (1992) Human Population Genetic Studies Using Hypervariable Loci. Analysis of Assamese, Australian, Cambodian, Caucasian, Chinese and Melanesian Populations. Genetics 131:191.

Ballantyne KN, van Oorschot RAH, Mitchell RJ. (2008) Locked Nucleic Acids in PCR Primers Increase Sensitivity and Performance. Genomics 91:301-305.

Beck RMD. (2008) A Dated Phylogeny of Marsupials Using a Molecular Supermatrix and Multiple Fossil Constraints. Journal of Mammalogy 89:175-189.

Beheregaray LB, Sunnucks P, Alpers DL, Banks SC and Taylor AC. (2000) A set of microsatellite loci for the hairy-nosed wombats (Lasiorhinus krefftii and L. latifrons). Conservation Genetics 1:89-92.

Bellis C, Ashton KJ, Freney L, Blair B and Griffiths LR. (2003) A molecular genetic approach for forensic animal species identification. Forensic Science International 134:99-108.

Berry O, Sarre SD, Farrington L, Aitken N. (2007) Faecal DNA detection of invasive species: the case of feral foxes in Tasmania. Wildlife Research 34:1-7.

Berry O and Sarre SD. (2007) Gel-free species identification using melt-curve analysis. Molecular Ecology Notes 7(1):1-4.

Bird GWG. (2005) Paradoxical findings in Ouchterlony tests. Cellular Molecular Life Science 17:408.

Blaxter ML. (2004) The promise of molecular taxonomy. Philosophical Transactions of the Royal Society B 359:669-679.

Brack D. (2004) The Growth and control of international environmental crime. Environmental Health Perspectives 112:80-81.

Brower AVZ. (1996) Gene trees, species trees, and systematics: a cladistics perspective. Annual Review of Ecology and Systematics 27:423-450.

Brower AVZ. (2006) Problems with DNA barcodes for species delimitation: 'ten species' of Astraptes fulgerator reassessed (Lepidoptera: Hesperiidae). Systematics and Biodiversity 4:127–132.

Brunner H and Coman BJ. (1974) The identification of mammalian hairs. Melbourne: Inkata Press.

Budowle B, Garofano P, Hellman A, Ketchum M, Kanthaswamy S, Parson W, van Haeringen W, Fain S and Broad T. (2009) Recommendations for animal DNA forensic and identity testing. International Journal of Legal Medicine 119:295-302.

Budowle B, Chakraborty R, Giusti AM, Eisenberg AJ and Allen RC. (1991) Analysis of the VNTR Locus D1S80 by the PCR Followed by High-Resolution PAGE. American Journal of Human Genetics 48:137.

Budowle B, Masibay A, Anderson SJ, Barna C, Biega L, Brenneke S, Brown BL, Cramer J, DeGroot GA, Douglas D, Duceman B, Eastman A, Giles R, Hamill J, Haase DJ, Janssen DW, Kupferschmid TD, Lawton T, Lemire C, Llewellyn B, Moretti T, Neves J, Palaski C, Schueler S, Sgueglie J, Sprecher C, Tomsey C and Yet D. (2001a) STR Concordance Study. Forensic Science International 124:47.

Budowle B, Shea B, Niezgoda S and Chakraborty R. (2001b) CODIS STR Loci Data from 41 Sample Populations. Journal of Forensic Science 46:453.

Budowle B and Sprecher C. (2001) Concordance Study on Population Database Samples Using the PowerPlex™ 16 Kit and AmpFlSTR® Profiler Plus™ Kit and AmpFlSTR® Cofiler™ Kit. Journal of Forensic Science 3:637.

Budowle B, Ge J, Chakraborty R, Eisenberg AJ, Green R, Mulero J, Lagace R and Hennessy L. (2010) Population genetic analyses of the NGM STR loci. International Journal of Legal Medicine DOI 10.1007/s00414-010-01516-7.

Burgoyne LA. (1996) Solid Medium and Method for DNA Storage. US Patent 5,496,562.

Butler JM. (2010) Fundamentals of Forensic DNA Typing. Elsevier Academic Press, San Diego 520.

Butler JM, Schoske R, Vallone PM, Kline MC, Redd AJ and Hammer MF. (2002) A Novel Multiplex for Simultaneous Amplification of 20 Y chromosome STR Markers. Forensic Science International 129:10.

Butler JM, Shen Y and McCord BR. (2003) The development of reduced sized STR amplicons as tools for analysis of degraded DNA. Journal of Forensic Science 48:1054-1064.

Cardillo M, Bininda-Emonds ORP, Boakes E and Purvis A. (2004) A species-level phylogenetic supertree of marsupials. Journal of Zoology (London) 264:11-31.

Cash HD, Hoyle JW and Sutton AJ. (2003) Development under Extreme Conditions: Forensic Bioinformatics in the Wake of the World Trade Center Disaster. Pacific Symposium on Biocomputing 638-653.

Chakraborty R, Stivers DN, Su B, Zhong Y and Budowle B. (1999) The Utility of Short Tandem Repeat Loci Beyond Human Identification: Implications for Development of New DNA Typing Systems. Electrophoresis 20(8):1682.

Christensen U, Jacobsen N, Rajwanshi VK, Wengel J and Koch T. (2001) Stopped-flow kinetics of locked nucleic acid (LNA)-oligonucleotide duplex formation: studies of LNA/DNA and DNA/DNA interactions. Biochemical Journal 354:481-484.

Christidis L, Horton P and Norman JA (2008) Subspeciation in the Short-tailed Grasswren (Amytornis merrotsyi, Maluridae). Emu 108(4):275-282.

Chung DT, Drabek J, Opel KL, Butler JM, McCord BR, (2004) A study on the effects of degradation and template concentration on the amplification efficiency of the STR Miniplex primer sets. Journal of Forensic Science 49:733–740.

CITES (2010) Convention on International Trade in Endangered Species of Wild Fauna and Flora. http://www.cites.org/eng/disc/what.shtml accessed 20 June 2010.

Clark M and Vandenberg N. (2010) Dog attack: the application of canine DNA profiling in forensic casework. Forensic Science, Medicine and Pathology 6:151-157.

Claridge G, Chea-Leth V and Chhoan IV. The effectiveness of law enforcement against forest and wildlife crime. A study of enforcement disincentives and other relevant factors in South-Western Cambodia. Report prepared for East-West Management Institute, Conservation International and USAid. http://pdf.usaid.gov/pdf_docs/pnadf439.pdf

Colgan DJ. (1999) Phylogenetic studies of marsupials based on phosphoglycerate kinase DNA sequences. Molecular Phylogenetics and Evolution 11:13-26.

Cook D, Roberts M and Lowther J. (2002) The International Wildlife Trade and Organised Crime: A Review of the Evidence and the Role of the UK. United Kingdom: World Wildlife Fund.

Cooper DN, Smith BA, Cooke HJ, Niemann S and Schmidtke J. (1985) An Estimate of Unique DNA Sequence Heterozygosity in the Human Genome. Human Genetics 69:201.

Courchamp F, Angulo E, Rivalan P, Hall RJ, Signoret L, Bull L and Meinard Y. (2006) Rarity value and species extinction: the anthropogenic Allee effect. PLoS Biology 4:2405-2410.

Cracraft J. (2000) Species concepts in theoretical and applied biology: a systematic debate with consequences. In: Species concepts and phylogenetic theory. Wheeler QD and Meier R (eds) Columbia University Press, New York.

Darwin C. (1859) On the origin of species by means of natural selection. Murray London.

Dawnay N, Ogden R, McEwing R and Carvalho RS. (2007) Validation of the barcoding gene COI for use in forensic genetic species identification. Forensic Science International 173(1):1-6.

DeBry RW and Seshadri S. (2001) Nuclear intron sequences for phylogenetics of closely related mammals: an example using the phylogeny of Mus. Journal of Mammalogy 82(2): 280-288.

Decorte R. (2010) Genetic identification in the 21st century—Current status and future developments. Forensic Science International 201(1):160-164.

Deedrick DW and Koch SL. (2004) Microscopy of hair Part II: A practical guide and manual for animal hairs. Forensic Science Communications 6(3) accessed 27 July 2010 at http://www2.fbi.gov/hq/lab/fsc/backissu/july2004/research/2004_03_research02.htm

Denault GC, Takimoto HH, Kwan QY and Pallos A. (1980) Detectability of Selected Genetic Markers in Dried Blood on Aging. Journal of Forensic Sciences 25(3):479.

D’Eustachio P. (2002) High Levels of Mitochondrial DNA Heteroplasmy in Human Hairs. Forensic Science International 130:63.

DeYoung RW and Honeycutt RL. (2005) The molecular toolbox: genetic techniques in wildlife ecology and management. Journal of Wildlife Management 69:1362-1384.

Drummond AJ, Ashton B, Cheung M, Heled J, Kearse M, Moir R, Stones-Havas S, Thierer T and Wilson A. (2010) Geneious v5.0, Available from http://www.geneious.com

Ebach MC and Holdrege C. (2005) DNA barcoding is no substitute for taxonomy. Nature 434:697.

Edwards A, Civitello A, Hammond HA and Caskey CT. (1991) DNA Typing and Genetic Mapping with Trimeric and Tetrameric Tandem Repeats. American Journal of Human Genetics 49:746.

Edwards A, Hammond HA, Lin J, Caskey CT and Chakraborty R. (1992) Genetic Variation at Five Trimeric and Tetrameric Tandem Repeat Loci in Four Human Population Groups. Genomics 12:241.

Eldridge MDB, King JM, Loupis AK, Spencer PBS, Taylor AC, Pope LC and Hall GP. (1999) Unprecedented low levels of genetic variation and inbreeding depression in an island population of the black-footed rock-wallaby Petrogale lateralis. Conservation Biology, 13: 531-541.

Eldridge MD, Browning TL and Close RL. (2001) Provenance of a New Zealand brush-tailed rock-wallaby (Petrogale penicillata) population determined by mitochondrial DNA sequence analysis. Molecular Ecology 10:2561-2567.

Erpenbeck D, Hooper JNA and Worheide G. (2006) CO1 phylogenies in diploblasts and the ‘Barcoding of Life’—are we sequencing a suboptimal partition? Molecular Ecology Notes 6:550–553.

Farris JS, Kallersjo M, Kluge AG and Bult C. (1994) Testing significance of incongruence. Cladistics 10:315-319.

Felsenstein J. (1983) Parsimony in systematics: biological and statistical issues. Annual Review of Ecology and Systematics 14:313-333.

Filatov DA. (2001) Processor of sequences manual. University of Birmingham, England. Available at http://www.biosciences.bham.ac.uk/labs/filatov/proseq.htm1

Findlay I, Taylor A, Quirke P, Frazier R and Urquhart A. (1997) DNA Fingerprinting from Single Cells. Nature 389:555.

Fleming R and Harbison SA. (2010a) The development of a mRNA multiplex RT-PCR assay for the definitive identification of body fluids. Forensic Science International: Genetics 4:244-256.

Fleming R and Harbison SA. (2010b) The use of bacteria for the identification of vaginal secretions. Forensic Science International: Genetics doi:10.1016/j.fsigen.2009.11.008.

Folmer O, Black M, Hoeh W, Lutz R and Vrijenhoek R. (1994) DNA Primers for the Amplification of Mitochondrial Cytochrome c Oxidase 1 from Diverse Metazoan Invertebrates. Molecular Marine Biology and Biotechnology 3:294-299.

Fredholm M and Wintero AK. (1996) Efficient Resolution of Parentage in Dogs by Amplification of Microsatellites. Animal Genetics 27:19.

Frézal L and Leblois R. (2008) Four years of DNA barcoding: Current advances and prospects. Infection Genetics and Evolution, doi:10.1016/j.meegid.2008.05.005.

Fugate HG and Penn SR. (1971) Immunodiffusion technique for the identification of animal species. Journal - Association of Official Analytical Chemists 54:1152-1156.

Fumagalli L, Pope LC, Taberlet P and Moritz C. (1997) Versatile primers for the amplification of the mitochondrial DNA control region in marsupials. Molecular Ecology 6:1199-1201.

Fumière O, Veys P, Boix A, von Holst C, Baeten V and Berben G. (2009) Methods of detection, species identification and quantification of processed animal proteins in feedingstuffs. Base 13:59-70.

Galewski T, Tilak M, Sanchez S, Chevret P, Paradis E and Douzery EJP. (2006) The evolutionary radiation of Arvicolinae rodents (voles and lemmings): relative contribution of nuclear and mitochondrial DNA phylogenies. BMC Evolutionary Biology 6:80. doi:10.1186/1471-2148-6-80.

Gatesy J. (1997) More DNA Support for a Cetacea/Hippopotamidae Clade: The Blood-Clotting Protein Gene γ-Fibrinogen. Molecular Biology and Evolution 14(5):537-543.

Georges A, Birrell J, Saint K, McCord WP and Donnellan S. (1998) A phylogeny for side-necked turtles (Chelonia: Pleurodira) based on mitochondrial and nuclear gene sequence variation. Biological Journal of the Linnean Society, London 67:213-246.

Gill P and Evett I. (1995) Population Genetics of Short Tandem Repeat (STR) Loci. Genetica 96:69-87.

Gill P, Ivanov P, Kimpton CP, Piercy R, Benson NJ, Tully G, Evett I, Hagelberg E and Sullivan KM. (1993). Nature Genetics 6:130.

Glenn TC, Stephen W and Braun MJ. (1999) Effects of a population bottleneck on whooping crane mitochondrial DNA variation. Conservation Biology 13:1097-1107.

Goodwin W, Linacre A and Hadi S. (2007) An Introduction to Forensic Genetics. John Wiley and Sons, Chichester.

Goray M, Mitchell RJ and van Oorschot RAH. (2010) Investigation of secondary DNA transfer of skin cells under controlled test conditions. International Journal of Legal Medicine 12(3):117-120.

Goray M, Eken E, Mitchell RJ and van Oorschot RAH. (2009) Secondary DNA transfer of biological substances under varying test conditions. Forensic Science International: Genetics 4(2):62-67.

Graves JAM and Westerman M. (2002) Marsupial genetics and genomics. Trends in Genetics 18:517-521.

Guindon S and Gascuel O. (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology 52(5):696-704.

Gusmao L and Carracedo A. (2003) Y Chromosome-Specific STRs. Profiles in DNA 6:3.

Hajibabaei M, Singer AC and Hickey DA. (2007) Benchmarking DNA barcodes: an assessment using available primate sequences. Genome 49:851-854.

Hajibabaei M, Smith MA, Janzen DH, Rodriguez JJ, Whitfield JB and Hebert PDN. (2006) A minimalist barcode can identify a specimen whose DNA is degraded. Molecular Ecology Notes 6:959-964.

Haldane JBS. (1956) Can a species concept be justified? In The species concept in palaeontology. Sylvester-Bradley PC (ed) Systematics Association, London.

Hanner RH, Schindel DE, Ward RD and Hebert PDN. (2005) FISH-BOL workshop report, August 26, 2005. For the workshop held at the University of Guelph, June 5-8, 2005 Ontario Canada http://www.fishbol.org/news.php

Hansen BD, French J, Handasyde KA, Kendal TA and Taylor AC. (2003) A set of microsatellite primers for the , Dactylopsila trivirgata (Petauridae: Marsupialia). Molecular Ecology Notes 3(2):212-214.

Harrison RG. (1991) Molecular changes at speciation. Annual Review of Ecology and Systematics 22:281-308.

Hawken RJ, Maccarone P, Toder R, Marshall Graves JA and Maddox JF. (1999) Isolation and characterization of marsupial IL5 genes. Immunogenetics 49:942-948.

Hebert PDN, Cywinska A and Ball SL. (2003a) Biological identifications through DNA barcodes. Proceedings Royal Society of London B (Suppl.) 270:313-321.

Hebert PDN, Ratnasingham S and deWaard JR. (2003b) Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proceedings Royal Society of London B (Suppl.) 270:S96-S99.

Hebert PD, Stoeckle MY, Zemlak TS and Francis CM. (2004) Identification of Birds through DNA Barcodes. PLoS Biology 2(10):e312. doi:10.1371/journal.pbio.0020312.

Heller MJ. (2002) DNA Microarray Technology: Devices, Systems and Applications. Annual Review of Biomedical Engineering 4:129.

Herrnstadt C, Clevenger W and Ghosh SS. (1999) A novel mitochondrial DNA-like sequence in the human nuclear genome. Genomics 60:67-77.

Hertoghs KML, Ellis JH and Catchpole IR. (2003) Use of locked nucleic acid oligonucleotides to add functionality to plasmid DNA. Nucleic Acids Research 31(20):5817-5830.

Hey J. (2001) The mind of the species problem. Trends in Ecology and Evolution 16:326-329.

Hillis DM. (1987) Molecular Versus Morphological Approaches to Systematics. Annual Review of Ecology and Systematics 18:23-42.

Hochmeister MN, Budowle B, Jung J, Borer UV, Comey CT and Dirnhofer R. (1991) PCR-Based Typing of DNA Extracted From Cigarette Butts. International Journal of Legal Medicine 104:229.

Holden J. (1998) By Hook or by Crook–A reference manual on illegal wildlife trade and prosecutions in the UK. The Royal Society for the Protection of Birds, Bedfordshire DOI: 10.1046/j.1365-3008.1999.00061.x.

Holland MM and Parsons TJ. (1999) Mitochondrial DNA Sequence Analysis – Validation and Use for Forensic Casework. Forensic Science Review 11:21.

Holland MM, Roy R, Fraser MD and Liu RH. (1993) Application of Serological and DNA Methods for the Identification of Urine Specimen Donors. Forensic Science Review 5:2.

Horovitz I and Sánchez-Villagra MR. (2003) A morphological analysis of marsupial mammal higher-level phylogenetic relationships. Cladistics 19:181-212.

Höss M, Jaruga P, Zastawny TH, Dizdaroglu M and Paabo S (1996) DNA Damage and DNA Sequence Retrieval from Ancient Tissues. Nucleic Acids Research 24(7):1304-1307.

Houlden BA, England PR and Sherwin WB. (1996a) Paternity exclusion in koala (Phascolarctos cinereus) using hypervariable microsatellites. Journal of Heredity 87:149-152.

Houlden BA, England PR, Taylor AC, Greville WD and Sherwin WB. (1996b) Low genetic variability of the koala Phascolarctos cinereus in south-eastern Australia following a severe population bottleneck. Molecular Ecology 5:269-281.

Howard C, Gilmore S, Robertson J and Peakall R. (2008) Developmental validation of a cannabis sativa STR multiplex system for forensic analysis. Journal of Forensic Science 53(5):1061-1067.

Howard C, Gilmore S, Robertson J and Peakall R. (2009) A cannabis sativa STR genotype database for Australian seizures: forensic applications and limitations. Journal of Forensic Science 54(3):556-563.

Hsieh HM, Huang LH, Tsai LC, Kuo YC, Meng HH, Linacre A and Lee JC. (2003) Species identification of rhinoceros horns using the cytochrome b gene. Forensic Science International 136:1-11.

Huchon D, Catzeflis FM and Douzery EJ. (1999) Molecular evolution of the nuclear von Willebrand factor gene in mammals and the phylogeny of rodents. Molecular Biology and Evolution 16:577-589.

Huelsenbeck J and Ronquist F. (2001) MrBayes: Bayesian inference on phylogenetic trees. Bioinformatics 17:754-755.

IFAW (International fund for animal welfare) (2005) Caught in the web: wildlife trade on the internet. http://www.ifaw.org/ifaw/dfiles/file_562.pdf.

Interpol (2007) Aug 23 Available from: www.interpol.int/Public/EnvironmentalCrime/Wildlife/Default.asp.

Isenberg AR. (August 2002) Forensic Mitochondrial DNA Analysis: A Different Crime-Solving Tool. FBI Law Enforcement Bulletin 16.

Ivanova NV, Zemlak TS and Hanner RH. (2007) Universal primer cocktails for fish DNA barcoding. Molecular Ecology Notes 7:544–548.

Iyer RP, Egan W, Regan JB, Beaucage SL. (1990) 3H-1,2-Benzodithiole-3-one 1,1-dioxide as an improved sulfurizing reagent in the solid-phase synthesis of oligodeoxyribonucleoside phosphorothioates. Journal of the American Chemical Society 112(3):1253–1254.

Jansa SA, Forsman J and Voss RS. (2005) Different patterns of selection on the nuclear genes IRBP and DMP-1 affect the efficiency but not the outcome of phylogeny estimation for didelphid marsupials. Molecular Phylogenetics and Evolution doi:10.1016/j.ympev.2005.06.007

Jansa SA, Forsman J and Voss RS. (2006) Different patterns of selection on the nuclear genes IRBP and DMP-1 affect the efficiency but not the outcome of phylogeny estimation for didelphid marsupials. Molecular Phylogenetics and Evolution 38:363-380.

Jansa, SA and Voss RS. (2000) Phylogenetic studies on didelphid marsupials I: Introduction and preliminary results from IRBP gene sequences. Journal of Mammalian Evolution. 7(1): 43-77.

Jeffreys AJ, Wilson V and Thein SL. (1985a) Hypervariable 'Minisatellite' Regions in Human DNA. Nature 314:67.

Jeffreys AJ, Wilson V and Thein SL. (1985b) Individual-Specific 'Fingerprints' of Human DNA. Nature 316:76.

Johnson CN, Delean S and Balmford A. (2002) Phylogeny and the selectivity of Extinction in Australian Marsupials. Animal Conservation 5:136-142.

Johnson NK and Cicero C. (2004) New mitochondrial DNA data affirm the importance of Pleistocene speciation in North American birds. Evolution 58:1122–1130.

Kauppinen S, Vester B and Wengel J. (2006) Locked Nucleic Acid: High-Affinity Targeting of Complementary RNA for RNomics. Handbook of Experimental Pharmacology 173:405-422. doi:10.1007/3-540-27262-3_21.

Kayser M, Kittler R, Erler A, Hedman M, Lee AC, Mohyuddin A, Mehdi SQ, Rosser Z, Stoneking M, Jobling MA, Sajantila A and Tyler-Smith C. (2004) A Comprehensive Survey of Human Y-Chromosomal Microsatellites. American Journal of Human Genetics 74(6):1183.

Kayser M and Schneider PM. (2009) DNA-based prediction of human externally visible characteristics in forensics: Motivations, scientific challenges, and ethical considerations. Forensic Science International: Genetics 3(3):154-161.

Keller RP and Lodge DM. (2007) Species invasions from commerce in live aquatic organisms: problems and possible solutions. BioScience 57:428-436.

Killian JK, Buckley TR, Stewart N, Munday BL and Jirtle RL. (2001) Marsupials and eutherians reunited: genetic evidence for the Theria hypothesis of mammalian evolution. Mammalian Genome 12:513-517.

Kimura M. (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16:111-120.

Kirsch JAW. (1977) The comparative serology of Marsupialia, and a classification of marsupials. Australian Journal of Zoology Supplementary Series 52:1-152.

Kirsch JA, Lapointe FJ and Springer MS. (1997) DNA-hybridisation studies of marsupials and their implications for metatherian classification. Australian Journal of Zoology 45:211-280.

Kobilinsky L. (1992) Recovery and Stability of DNA in Samples of Forensic Science Significance. Forensic Science Review 4:67.

Koshkin AA, Nielsen P, Meldgaard M, Rajwanshi VK, Singh SK and Wengel J. (1998) LNA (locked nucleic acid): an RNA mimic forming exceedingly stable LNA:LNA duplexes. Journal of the American Chemical Society 120:13252-13253.

Kraaijeveld-Smit FJL, Lindenmayer DB and Taylor AC. (2002) Dispersal patterns and population structure in a small marsupial, Antechinus agilis from two forests analysed using microsatellite markers. Australian Journal of Zoology 50:325-338.

Krajewski C, Moyer GR, Sipiorski T, Fain MG and Westerman M. (2004) Molecular systematics of the enigmatic 'phascolosoricine' marsupials of . Australian Journal of Zoology 52:389-415.

Kress WJ, Wurdack KJ and Zimmer EA. (2005) Use of DNA Barcodes to Identify Flowering Plants. Proceedings of the National Academy of Sciences of the United States of America 102:8369-8374.

Kress WJ and Erickson DL. (2007) A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region. PLoS ONE 2(6):e508. doi:10.1371/journal.pone.0000508.

Lambert DM, Baker A and Huynen L. (2005) Is a large-scale DNA-based inventory of ancient life possible? Journal of Heredity 96(3):279–84.

Latorra D, Arar K and Hurley JM. (2003) Design considerations and effects of LNA in PCR primers. Molecular and Cellular Probes 17:253-259.

Leader-Williams N and Milner-Gulland EJ. (1993) Policies for the enforcement of wildlife laws: the balance between detection and penalties in Luangwa Valley, Zambia. Conservation Biology 7:611-617.

Leonard JA. (2008) Ancient DNA applications for wildlife conservation. Molecular Ecology 17:4186–4196.

Levin JD, Fiala D, Samala MF, Kahn JD and Peterson RJ. (2006) Position-dependent effects of locked nucleic acid (LNA) on DNA sequencing and PCR primers. Nucleic Acids Research 34:e142.

Li YM, Gao ZX, Li XH, Wang S and Niemela J. (2000) Illegal wildlife trade in the Himalayan region of China. Biodiversity and Conservation 9:901-918.

Linacre A. (2009) Forensic Science in Wildlife Investigations. CRC Press, Taylor and Francis Group, Florida USA.

Linacre A and Tobe SS. (2009) Species identification using DNA loci. In: Forensic science in wildlife investigations. Linacre A (ed) CRC Press, Florida.

Lips KR, Brem F, Brenes R, Reeve JD, Alford RA, Voyles J, Carey C, Livo L, Pessier AP and Collins JP. (2006) Emerging infectious disease and the loss of biodiversity in a Neotropical amphibian community. Proceedings of the National Academy of Sciences of the United States of America 103:3165–3170.

Locard E. (1930) L'Enquête Criminelle et les Méthodes Scientifiques. Flammarion 139.

Lopez JV, Yuhki N, Masuda R, Modi W and O’Brien SJ. (1994) Numt, a recent transfer and tandem amplification of mitochondrial DNA to the nuclear genome of the domestic cat. Journal of Molecular Evolution 39:174-190.

Lorente JA, Lorente M, Lorente MJ, Alvarez JC, Entrala C, Lopez-Munoz J and Villanueva E. (1998) Newborn Genetic Identification: Expanding the Fields of Forensic Haemogenetics. Progress in Forensic Genetics 7:114.

Lorenz JG, Jackson WE and Beck JC. (2005) The problems and promise of DNA barcodes for species diagnosis of primate biomaterials. Philosophical Transactions of the Royal Society B 360:1869–1877.

Lowe A, Murray C, Whitaker J, Tully G and Gill P. (2002) The Propensity of Individuals to Deposit DNA and Secondary Transfer of Low Level DNA from Individuals to Inert Surfaces. Forensic Science International 12:25.

Lowther J, Cook D and Roberts M. (2002) Crime and punishment in the wildlife trade. A WWF/TRAFFIC Report. www.wwf.org.uk

Luckett WP. (1994) Suprafamilial relationships within Marsupialia: resolution and discordance from multidisciplinary data. Journal of Mammalian Evolution 2:255–288.

Luikart G, Painter J, Crozier RH, Westerman M and Sherwin WB. (1997) Characterisation of microsatellite loci in the endangered long-footed potoroo Potorous longipes. Molecular Ecology 6:497-498.

Lynch M and Jarrell PE. (1993) A method for calibrating molecular clocks and its application to animal mitochondrial DNA. Genetics 135:1197–1208.

Lyons LA, Laughlin TF, Copeland NG, Jenkins NA, Womack JE and O'Brien SJ. (1997) Comparative anchor tagged sequences (CATS) for integrative mapping of mammalian genomes. Nature Genetics 15:47-56. doi:10.1038/ng0197-47.

Macdonald AJ, Sarre SD and FitzSimmons NN. (2010) Sex chromosome microsatellites: new tools for macropodoid population ecology. In: Macropods: The Biology of Kangaroos, Wallabies and Rat-Kangaroos. Coulson G and Eldridge M (eds) CSIRO Publishing, Canberra.

Macedo-Silva A, Macedo-Silva SC, Barbosa MGA, Alkmin AJ, Vaz M, Shimokomaki M and Tenuta-Filho A. (2000) Hamburger meat identification by dot-ELISA. Meat Science 56:189-192.

Maddison WP. (1997) Gene trees in species trees. Systematic Biology 46:523-536.

Madsen O, Willemsen D, Ursing BM, Arnason U and de Jong WW. (2002) Molecular Evolution of the Mammalian Alpha 2B Adrenergic Receptor. Molecular Biology and Evolution 19(12):2150-2160.

Mallet J. (1995) A species definition for the modern synthesis. Trends in Ecology and Evolution 10:294-299.

Mallet J and Willmott K. (2003) Taxonomy: Renaissance or Tower of Babel? Trends in Ecology and Evolution 18:57-59.

Marshall LG, Case JA and Woodburne MO. (1990) Phylogenetic relationships of the families of marsupials. In: Genoways HH (ed) Current Mammalogy. Plenum Press, New York 433–505.

Martin P, Alonso A, Albarran C and Sancho M. (1996) Identification of the Skeletal Remains of Two 12 Year old Bodies by Nuclear DNA Polymorphisms Analysis. Advances in Forensic Haemogenetics 6:301.

Martinez I and Danielsdottir AK. (2000) Identification of marine mammal species in food products. Journal of the Science of Food and Agriculture 80(4):527-533.

Matocq MD, Shurtliff QR and Feldman CR. (2007) Phylogenetics of the woodrat genus Neotoma (Rodentia: Muridae): implications for the evolution of phenotypic variation in male external genitalia. Molecular Phylogenetics and Evolution 42:637-652.

McCarthy C. (1997) Chromas v1.4 computer package. Griffith University, Australia. Available at http://trishul.sci.gu.edu.au/~conor/chromas.html.

McDowell D. (1997) Wildlife Crime Policy and the Law. Australian Government Publishing Service Canberra.

McTigue P, Peterson RJ and Kahn JD. (2004) Sequence-Dependent Thermodynamic Parameters for Locked Nucleic Acid (LNA)-DNA Duplex Formation. Biochemistry 43:5388-5405.

Menotti-Raymond M, David VA, Stephens JC, Lyons LA and O'Brien SJ. (1997) Genetic individualization of domestic cats using feline STR loci for forensic applications. Journal of Forensic Sciences 42(6):1039-1051.

Meredith RW, Westerman M, Case JA and Springer MS. (2008) A phylogeny and timescale for marsupial evolution based on sequences for five nuclear genes. Journal of Mammalian Evolution 15:1-26.

Meredith RW, Westerman M and Springer MS. (2009) A phylogeny of Diprotodontia (Marsupialia) based on sequences for five nuclear genes. Molecular Phylogenetics and Evolution 51(3):554-71.

Meusnier I, Singer GAC, Landry JF, Hickey DA, Hebert PDN and Hajibabaei M. (2008) A universal DNA mini-barcode for biodiversity analysis. BMC Genomics 9:214.

Millis AL. (2000) Isolation and characterisation of microsatellite loci in marsupial gliders (Petaurus norfolcensis, P. breviceps and P. gracilis). Molecular Ecology 9(10):1681-1683.

Moritz C and Cicero C. (2004) DNA Barcoding: Promise and Pitfalls. PLoS Biology 2:1529-1531. doi:10.1371/journal.pbio.0020354.

Motz M, Pääbo S and Kilger C. (2000) Improved cycle sequencing of GC-rich templates by a combination of nucleotide analogs. BioTechniques 29:268-270.

Mouritzen P, Nielsen PS, Jacobsen N, Noerholm M, Lomholt C, Pfundheller HM, Ramsing NB, Kauppinen S and Tolstrup N. (2004) The ProbeLibrary – expression profiling 99% of all human genes using only 90 dual-labelled real-time PCR probes. BioTechniques 37:492-495.

Munemasa M, Nikaido M, Nishihara H, Donnellan S and Austin CC, et al. (2008) Newly discovered young CORE-SINEs in marsupial genomes. Gene 407:176–185.

Munemasa M, Nikaido M, Nishihara H, Donnellan S, Austin CC, Okada N and Hasegawa M. (2006) Phylogenetic analysis of Diprotodontian marsupials based on complete mitochondrial genomes. Genes and Genetic Systems 81:181-191.

Nakamura Y, Leppert M, O'Connell P, Wolff R, Holm T, Culver M, Martin C, Fujimoto E, Hoff M, Kumlin E and White R. (1987) Variable Number Tandem Repeat (VNTR) Markers for Human Gene Mapping. Science 235:1616.

Napper R. (2000) A National DNA Database the United Kingdom Experience. Australian Journal of Forensic Sciences 32:65.

Neme L. (2010) INTERPOL’S wildlife crime working group meeting. Forensic Science, Medicine and Pathology 6:223-224.

Nielsen CB, Singh SK, Wengel J and Jacobsen JP. (1999) The solution structure of a locked nucleic acid (LNA) hybridized to DNA. Journal of Biomolecular Structure and Dynamics 17:175-191.

Nicod JC and Largiader CR. (2003) SNPs by AFLP(SBA): A Rapid SNP Isolation Strategy for Non-Model Organisms. Nucleic Acids Research 31(5):19.

Nilsson M, Arnason U, Spencer PBS and Janke A. (2004) Marsupial relationships and a timeline for marsupial radiation in South Gondwana. Gene 340:189–196.

Obika S, Nanbu D, Hari Y, Andoh J, Morio K, Doi T and Imanishi T. (1998) Stability and structural features of the duplexes containing nucleoside analogues with a fixed N-type conformation, 2'-O,4'-C-methylene ribonucleosides. Tetrahedron Letters 39:5401-5404.

Ogden R, Dawnay N and McEwing R. (2009) Wildlife DNA forensics – bridging the gap between conservation genetics and law enforcement. Endangered Species Research doi:10.3354/esr00144.

Oldroyd NJ, Urquhart AJ, Kimpton CP, Millican ES, Watson SK, Downes T and Gill PD. (1995) A Highly Discriminating Octoplex Short Tandem Repeat Polymerase Chain Reaction System Suitable for Human Individual Identification. Electrophoresis 16:334.

Olsen SJ. (1985) Origins of the domestic dog: the fossil record. Tucson: The University of Arizona Press.

Osafune T and Baba Y. (2004) Microchip Technology for Genome Analysis: Applications to DNA Identification and Scientific Criminal Investigations. Japanese Journal of Science and Technology for Identification 9(2):95.

Osborne ML, Christidis L and Norman JA. (2002) Molecular Phylogenetics of the Diprotodontia (Kangaroos, wombats, koala, possums and allies). Molecular Phylogenetics and Evolution 25:219-228.

Ossorio PN. (2006) About Face: Forensic Genetic Testing for Race and Visible Traits. The Journal of Law, Medicine and Ethics 34(2):277-292.

Ouchterlony O. (1968) Handbook of Immunodiffusion and Immunoelectrophoresis. Ann Arbor: Ann Arbor Science Publishers Inc.

Ouchterlony O and Nilsson LA. (1986) Immunodiffusion and immunoelectrophoresis. In: Weir DM, Herzenberg LA, Blackwell C, Herzenberg LA, editors. Handbook of Experimental Immunology. Oxford: Blackwell 1(4):32.1–32.50.

Page RDM. (2000) Extracting species trees from complex gene trees: reconciled trees and vertebrate phylogeny. Molecular Phylogenetics and Evolution 14:89-106.

Page RDM and Charleston MA. (1997) From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Molecular Phylogenetics and Evolution 7:231-240.

Parson W, Pegoraro K, Niederstätter H, Föger M and Steinlechner M. (2000) Species identification by means of the cytochrome b gene. International Journal of Legal Medicine 114:23-28.

Parsons TJ, Huel R, Davoren J, Katzmarzyk C, Milos A, Selmanovic A, Smajlovic L, Coble MD and Rizvic A. (2007) Application of novel “mini-amplicon” STR multiplexes to high volume casework on degraded skeletal remains. Forensic Science International: Genetics 1:175–179.

Peabody AJ, Oxborough RJ, Cage PE and Evett IW. (1983) The Discrimination of Cat and Dog Hairs. Journal of the Forensic Science Society 23(2):121-129.

Peppin L, McEwing R, Carvalho GR and Ogden R. (2008) A DNA based approach for the forensic identification of Asiatic black bear (Ursus thibetanus) in traditional Asian medicine. Journal of Forensic Sciences 53:1358-1362.

Petersen M, Nielsen CB, Nielsen KE, Jensen GA, Bondensgaard K, Singh SK, Rajwanshi VK, Koshkin AA, Dahl BM, Wengel J and Jacobsen JP. (2000) The conformations of locked nucleic acids (LNA). Journal of Molecular Recognition 13:44-53.

Phillips MJ, McLenachan PA, Down C, Gibb GC and Penny D. (2006) Combined mitochondrial and nuclear DNA sequences resolve the interrelations of the major Australasian marsupial radiations. Systematic Biology 55:122–137.

Phillips MJ and Pratt RC. (2008) Family-level relationships among the Australasian marsupial '' (Diprotodontia: koala, wombats, kangaroos and possums). Molecular Phylogenetics and Evolution 46:594–605.

Phillips C, Prieto L, Fondevila M, Salas A, Gomez-Tato A, Alvarez-Dios J, Alonso A, Blanco-Verea A, Brion M, Montesino M, Carracedo A and Lareu MV. (2009) Ancestry Analysis in the 11-M Madrid Bomb Attack Investigation. PLoS ONE 4(8):e6583. doi:10.1371/journal.pone.0006583.

Pilar N and Ossorio JD. (2006) About Face: Forensic Genetic Testing for Race and Visible Traits. The Journal of Law, Medicine & Ethics 34(2):277-292.

Prieto L, Montesino M, Salas A, Alonso A, Albarran C, Álvarez S, Cresillo M, DiLonardo AM, Doutremepuich C, Fernández-Fernández I, Vega AG, Gusmão L, López CM, Modesti NM, Palacio AM, Paredes M, Pena SDJ, Pérez-Lezaun A, Pestano JJ, Puente J, Sala A, Vide M, Whittle MR, Yunis JJ and Gómez J. (2003) The 2000-2001 GEP-ISFG Collaborative Exercise on mtDNA: Assessing the Cause of Unsuccessful mtDNA PCR Amplification of Hair Shaft Samples. Forensic Science International 134:46.

Posada D. (2008) jModelTest: Phylogenetic Model Averaging. Molecular Biology and Evolution 25(7):1253-1256.

Prychitko TM and Moore WS. (1997) The utility of DNA sequences of an intron from the β-fibrinogen gene in phylogenetic analysis of woodpeckers (Aves: Picidae). Molecular Phylogenetics and Evolution 8:193–204.

Quantiblot® Human DNA Quantitation Kit, Applied Biosystems, Product insert, Part No N808-0014 (Roche Molecular Systems, Inc, Branchburg, New Jersey, USA, 1996).

Queralt R, Adroer R, Oliva R, Winkfein RJ, Retief JD and Dixon GH. (1995) Evolution of protamine P1 genes in mammals. Journal of Molecular Evolution 40:601-607.

Ratnasingham S and Hebert PDN. (2007) BOLD: the Barcode of Life Data system. Molecular Ecology Notes 7:355–364.

Redd AJ, Agellon AB, Kearney VA, Contreras VA, Karafet T, Par H, Knijff P, Butler JM and Hammer MF. (2002) Forensic Value of 14 Novel STRs on the Human Y Chromosome. Forensic Science International 130:97.

Renctas. (2001) National network to fight the trafficking of wild animals. 1st national report on wild fauna traffic. Brazil; www.rencta.org.br.

Richard ML, Frappier RH and Newman JC. (2003) Developmental validation of a real-time quantitative PCR assay for automated quantification of human DNA. Journal of Forensic Sciences 48(5):1041-1046.

Ridley M. (1985) Selfish DNA Comes of Age. New Scientist 1456:34.

Robertson J. (editor). (1999) Forensic examination of hair 2nd Edition. London: Taylor & Francis.

Robinson-Rechavi M, Ponger L and Mouchiroud D. (2000) Nuclear Gene LCAT Supports Rodent Monophyly. Molecular Biology and Evolution 17(9):1410-1412.

Rokas A and Carroll SB. (2005) More genes or more taxa? The relative contribution of gene number and taxon number to phylogenetic accuracy. Molecular Biology and Evolution 22:1337-1344.

Rogers C and Burgoyne L. (1997) Bacterial Typing: Storing and Processing of Stabilized Reference Bacteria for PCR without Preparing DNA-An Example of an Automated Procedure. Analytical Biochemistry 247:223.

Roy MS, Girman DJ, Taylor AC and Wayne RK. (1994) The use of museum specimens to reconstruct the genetic variability and relationships of extinct populations. Experientia 50:551-557.

Rozen S and Skaletsky H. (2000) Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ 365-386.

Saferstein R. (1982) Identification and grouping of Bloodstains. In: Saferstein R. Forensic Science Handbook. New Jersey: Prentice Hall, Inc. 267-296.

Saiki RK, Scharf S, Faloona F, Mullis KB, Horn GT, Erlich HA and Arnheim N. (1985) Enzymatic Amplification of Beta-Globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia. Science 230:1350.

Saitou N and Nei M. (1987) The Neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4:406-425.

Sambrook J, Fritsch EF and Maniatis T. (1989) Molecular cloning: a laboratory manual, second edition. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

Sanderson MJ and Shaffer HB. (2002) Troubleshooting molecular phylogenetic analysis. Annual Review of Ecology and Systematics 33:49-72.

Schmerer WM, Hummel S and Herrmann B. (1999) Optimized DNA extraction to improve reproducibility of short tandem repeat genotyping with highly degraded DNA as target. Electrophoresis 20:1712-1716.

Sevilla RG, Diez A and Noren M. (2007) Primers and polymerase chain reaction conditions for DNA barcoding teleost fish based on the mitochondrial cytochrome b and nuclear rhodopsin genes. Molecular Ecology Notes 7:730–734.

Shaffer HB, Clark JM and Kraus F. (1991) When molecules and morphology clash: A phylogenetic analysis of the North American ambystomatid (Caudata:Ambystomatidae) salamanders. Systematic Zoology 40(3):284-303.

Simeonov A and Nikiforov TT. (2002) Single nucleotide polymorphism genotyping using short, fluorescently labelled locked nucleic acid (LNA) probes and fluorescence polarization detection. Nucleic Acids Research 30(17):91.

Sinha SK, Budowle B, Arcot SS, Richey SL, Chakraborty R, Jones MD, Wojtkiewicz PW, Schoenbauer DA, Gross AM, Sinha SK and Shewale JG. (2003) Development and Validation of a Multiplexed Y-Chromosome STR Genotyping System, Y-PLEX™6, for Forensic Casework. Journal of Forensic Sciences 48(1):93.

Sites JW, Davis SK, Guerra T, Iverson JB and Snell HL. (1996) Character congruence and phylogenetic signal in molecular and morphological data sets: a case study in the living iguanas (Squamata, Iguanidae). Molecular Biology and Evolution 13:1087-1105.

Smith KF, Behrens M, Schloegel LM, Marano N, Burgiel S and Daszak P. (2009) Reducing the risks of the wildlife trade. Science 324:594-595.

Smith MA, Wood DM and Janzen DH. (2007) DNA barcodes affirm that 16 species of apparently generalist tropical parasitoid flies (Diptera, Tachinidae) are not all generalists. Proceedings of the National Academy of Sciences of the United States of America 104(12):4967–4972.

Song H, Buhay JE, Whiting MF and Crandall KA. (2008) Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified. Proceedings of the National Academy of Sciences of the United States of America 105(36):13486-13491.

Sorenson MD and Quinn TW. (1998) Numts: a challenge for avian systematics and population biology. Auk 115:214–221.

Sperling F. (2003) DNA Barcoding: Deus ex Machina. Newsletter of the Biological Survey of Canada (Terrestrial Arthropods) 22(1) Opinion Page.

Springer MS, Kirsch JA and Case JA. (1997) The chronicle of marsupial evolution. In: Givnish T, Sytsma K (eds) Molecular evolution and adaptive radiations. Cambridge University Press, New York 129-161.

Springer MS, Westerman M, Kavanagh JR, Burk A, Woodburne MO, Kao DJ and Krajewski C. (1998) The origin of the Australasian marsupial fauna and the phylogenetic affinities of the enigmatic Monito del Monte and marsupial mole. Proceedings of the Royal Society of London B 265:2381–2386.

Stamatakis A, Hoover P and Rougemont J. (2008) A Rapid Bootstrap Algorithm for the RAxML Web-Servers. Systematic Biology 57(5):758-771.

Stanhope MJ, Czelusniak J, Si JS, Nickerson J and Goodman M. (1992) A molecular perspective on mammalian evolution from the gene encoding interphotoreceptor retinoid binding protein, with convincing evidence for bat monophyly. Molecular Phylogenetics and Evolution 1:148-160.

Stoeckle M. (2003) Taxonomy, DNA, and the Bar Code of Life. BioScience 53:2-3.

Stoeckle MY and Hebert PD. (2008) Barcode of life. Scientific American 299(4):82-6.

Swofford DL. (2002) PAUP*. Phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland MA.

Symondson WOC. (2002) Molecular identification of prey in predator diets. Molecular Ecology 11:627-641.

Szalay FS. (1994) Evolutionary history of the marsupials and an analysis of osteological characters. Cambridge: Cambridge University Press.

Tamura K, Dudley J and Nei M. (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution 24:1596-1599. Publication PDF at http://www.kumarlab.net/publications

Taylor AC and Cooper DW. (1998a) Microsatellite markers for the phalangerid marsupial, the (Trichosurus vulpecula). Molecular Ecology 7:1780-1782.

Taylor AC and Cooper DW. (1998b) A set of tammar wallaby Macropus eugenii microsatellites tested for genetic linkage. Molecular Ecology 7:925-931.

Taylor AC, Horsup A, Johnson CN, Sunnucks P and Sherwin WB. (1997) Relatedness structure detected by microsatellite analysis and attempted pedigree reconstruction in an endangered marsupial, the northern hairy-nosed wombat, Lasiorhinus krefftii. Molecular Ecology 6:9-19.

Taylor AC, Kraaijeveld K and Lindenmayer DB. (2002) Microsatellites for the greater glider, Petauroides volans. Molecular Ecology Notes 2(1):57-59.

Thalmann O, Hebler J and Poinar HN. (2004) Unreliable mtDNA data due to nuclear insertions: a cautionary tale from analysis of humans and other great apes. Molecular Ecology 13:321–335.

Tobe SS, Kitchener AC and Linacre AMT. (2010) Reconstructing Mammalian Phylogenies: A Detailed Comparison of the Cytochrome b and Cytochrome Oxidase Subunit I Mitochondrial Genes. PLoS ONE 5(11):e14156. doi:10.1371/journal.pone.0014156.

Tobe SS and Linacre A. (2010) DNA typing in wildlife crime: recent developments in species identification. Forensic Science, Medicine and Pathology 6:195-206.

Tolstrup N, Nielsen PS, Kolberg JG, Frankel AM, Vissing H and Kauppinen S. (2003) OligoDesign: optimal design for LNA (locked nucleic acid) oligonucleotide capture probes for gene expression profiling. Nucleic Acids Research 31(13):3758-3762.

Urquhart A, Kimpton CP, Downes TJ and Gill P. (1994) Variation in Short Tandem Repeat Sequences – A Survey of Twelve Microsatellite Loci Used as Forensic Identification Markers. International Journal of Legal Medicine 107:13.

Urquhart A, Oldroyd NJ, Kimpton CP and Gill P. (1995) Highly Discriminating Heptaplex Short Tandem Repeat PCR System for Forensic Identification. BioTechniques 18:116.

Van Oorschot RAH and Jones MJ. (1997) DNA Fingerprinting from Fingerprints. Nature 387:767.

Verma SK and Singh L. (2002) Novel universal primers establish identity of an enormous number of animal species for forensic application. Molecular Ecology Notes 3:28-31.

Wagner RW, Matteucci MD, Lewis JG, Gutierrez AJ, Moulds C and Froehler BC. (1993) Antisense gene inhibition by oligonucleotides containing C-5 propyne pyrimidines. Science 260:1510-1513.

Walsh PS, Metzger DA and Higuchi R. (1991) Chelex 100 as a Medium for Simple Extraction of DNA for PCR-Based Typing from Forensic Material. BioTechniques 10(4):506.

Wandeler P, Hoeck PE and Keller LF. (2007) Back to the future: museum specimens in population genetics. Trends in Ecology and Evolution 22(12):634-642.

Ward RD, Zemlak TS, Innes BH, Last PR and Hebert PDN. (2005) DNA barcoding Australia’s fish species. Philosophical Transactions of the Royal Society B 360:1847-1857.

Watson JD and Crick FH. (1953) A Structure for Deoxyribose Nucleic Acid. Nature 171:737.

Wheeler D, Hope R, Cooper SJB, Dolman G, Webb GC, Bottema CDK, Gooley AA, Goodman M and Holland RAB. (2001) An orphaned mammalian β-globin gene of ancient evolutionary origin. Proceedings of the National Academy of Sciences of the United States of America 98(3):1101-1106.

Whitaker JP, Clayton TM, Urquhart AJ, Downes TJ, Kimpton CP and Gill P. (1995) Short Tandem Repeat Typing of Bodies From a Mass Disaster: High Success Rate and Characteristic Amplification Patterns in Highly Degraded Samples. BioTechniques 18(3):1.

Wild AL and Maddison DR. (2008) Evaluating nuclear protein-coding genes for phylogenetic utility in beetles. Molecular Phylogenetics and Evolution 48:877-891.

Will KW and Rubinoff D. (2004) Myth of the Molecule: DNA barcodes for species cannot replace morphology for identification and classification. Cladistics 20:47-55.

Williams ST and Knowlton N. (2001) Mitochondrial pseudogenes are pervasive and often insidious in the snapping shrimp genus Alpheus. Molecular Biology and Evolution 18:1484–1493.

Wilson MW and Pringle BH. (1956) Cross-reactions in the Ouchterlony plate: analysis of native and halogenated bovine serum albumins. Journal of Immunology 77:324-331.

Wilson DE and Reeder DM. (eds.). (2005) Mammal species of the world: a taxonomic and geographic reference. Third edition. The Johns Hopkins University Press, Baltimore.

Wilson MR, Allard MW, Monson K, Miller KWP and Budowle B. (2002) Recommendations for Consistent Treatment of Length Variants in the Human Mitochondrial DNA Control Region. Forensic Science International 129:35.

Wilson MR, DiZinno JA, Polanskey D, Replogle J and Budowle B. (1995) Validation of Mitochondrial DNA Sequencing for Forensic Casework. International Journal of Legal Medicine 108:68.

Wilson-Wilde L, Norman J, Robertson J, Sarre S and Georges A. (2010) Current issues in species identification for forensic science and the validity of using the cytochrome oxidase I (COI). Forensic Science, Medicine and Pathology 6:233-241.

Wirth T, Le Guellec R and Veuille M. (1999) Directional substitution and evolution of nucleotide content in the cytochrome oxidase II gene in earwigs (Dermapteran Insects). Molecular Biology and Evolution 16(12):1645–1653.

Woese CR and Fox GE. (1977) Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proceedings of the National Academy of Sciences of the United States of America 74(11):5088–5090.

Wong KL, Wang J, But PPH and Shaw PC. (2004) Application of cytochrome b DNA sequences for the authentication of endangered snake species. Forensic Science International 139:49-55.

Wyler LS and Sheikh PA. (2008) CRS report for congress – International illegal trade in wildlife: threats and U.S. policy.

Wyman AR and White R. (1980) A Highly Polymorphic Locus in Human DNA. Proceedings of the National Academy of Sciences of the United States of America 77(11):6754.

Xianlin M. (2009) Law Enforcement activities regarding Falcons, Tibetan Antelope and trade on the internet. Presentation at the Interpol Wildlife Crime Working Group Meeting, Manaus, Brazil.

Yang Z and Rannala B. (2010) Bayesian species delimitation using multilocus sequence data. Proceedings of the National Academy of Sciences of the United States of America 107(20):9264-9269.

Zenger KR and Cooper DW. (2001) A set of highly polymorphic microsatellite markers developed for the eastern grey kangaroo (Macropus giganteus). Molecular Ecology Notes 1:98-100.

“Consider the postage stamp: its usefulness consists in the ability to stick to one thing till it gets there.” Josh Billings (1818-1885)

“Perseverance is the hard work you do after you get tired of doing the hard work you already did.” Newt Gingrich (1943-present)

“Nothing in the world can take the place of persistence. Talent will not; nothing in the world is more common than unsuccessful men with talent. Genius will not; unrewarded genius is a proverb. Education will not; the world is full of educated derelicts. Persistence and determination alone are omnipotent.” Calvin Coolidge (1872-1933)

One final image found on the internet of a person shopping in Walmart, in the USA, carrying a wallaby joey around in their bag. A unique, but inappropriate pet. Image courtesy of www.peopleofwalmart.com.
