Utah State University DigitalCommons@USU

All Graduate Theses and Dissertations Graduate Studies

12-2020

Novel Polymorphisms of ZRSR2 and GPM6B Homologs and Their Use in Sex Identification of Bovine and Porcine Species

Evan K. Peterson Utah State University

Follow this and additional works at: https://digitalcommons.usu.edu/etd

Part of the Animal Sciences Commons, and the Molecular Genetics Commons

Recommended Citation Peterson, Evan K., "Novel Polymorphisms of ZRSR2 and GPM6B Gene Homologs and Their Use in Sex Identification of Bovine and Porcine Species" (2020). All Graduate Theses and Dissertations. 7915. https://digitalcommons.usu.edu/etd/7915

This Thesis is brought to you for free and open access by the Graduate Studies at DigitalCommons@USU. It has been accepted for inclusion in All Graduate Theses and Dissertations by an authorized administrator of DigitalCommons@USU. For more information, please contact [email protected]. NOVEL POLYMORPHISMS OF ZRSR2 AND GPM6B GENE HOMOLOGS AND

THEIR USE IN SEX IDENTIFICATION OF BOVINE AND PORCINE SPECIES

by

Evan K. Peterson

A thesis submitted in partial fulfillment of the requirements for the degree

of

MASTER OF SCIENCE

in

Animal, Dairy, and Veterinary Science (Animal Molecular Genetics)

Approved:

______Lee F. Rickords, Ph.D. Aaron Thomas, Ph.D. Major Professor Committee Member

______S. Clay Isom, Ph.D. Janis L. Boettinger, Ph.D. Committee Member Acting Vice Provost of Graduate Studies

UTAH STATE UNIVERSITY Logan, Utah

2020 ii

Copyright © Evan K. Peterson

All Rights Reserved

iii

ABSTRACT

Novel Polymorphisms of ZRSR2 and GPM6B Gene Homologs and their use in

Sex Identification of Bovine and Porcine Species

by

Evan K. Peterson, Master of Science

Utah State University, 2020

Major Professor: Dr. Lee F. Rickords Department: Animal, Dairy, and Veterinary Science

Bovine and porcine DNA-based sex identification is of importance to livestock producers. Currently, the most efficient sex identification methods use PCR and qPCR to determine genotypic sex based on the presence or absence of Y- specific in a DNA sample. Zinc finger CCCH-type, RNA binding motif and serine/arginine rich 2 (ZRSR2) in Bos taurus (European domestic cattle) and glycoprotein M6B

(GPM6B) in Sus scrofa (domestic pigs) are X-chromosome specific genes with Y- chromosome specific homologs. These Y-chromosome homologs contain deletion regions that were targeted in order to create amplicons of differing length for use in sex identification.

The Bos taurus ZRSR2 gene and its Y-specific homolog (ZRSR2Y) were aligned and two adjacent deletions on ZRSR2Y (totaling 33 bp) were identified. PCR primers were designed to flank these deletions and create amplicons 149 bp and 116 bp long from iv

the X- and Y- respectively. These primers were then tested in PCR and

qPCR reactions with DNA from Bos taurus, Bos indicus (African domestic cattle), Bison

bison (bison), Capra hircus (domestic goat), Ovis aries (domestic sheep), and Bos grunniens (domestic yak). Results from both PCR and qPCR indicate that the primer regions exist in all six bovine species tested as do Y-specific deletion regions of an equivalent length. Additionally, melting curve analysis during qPCR indicates that the two fragments have unique melting temperatures which can be used for rapid sex identification of a DNA sample.

Sus scrofa GPM6B and its Y-specific pseudogene were aligned and several adjacent deletions were identified (totaling 43 bp). PCR primers were designed to flank the deletion regions and create X- and Y-specific amplicons 244 bp and 201 bp long respectively. Testing confirmed the presence of the Y-specific deletion regions as well as the presence of a 16 bp intron deletion variant (rs1111696537) in a subset of the population tested. The ability to identify sex using this primer pair in porcine samples was also confirmed.

(103 Pages) v

PUBLIC ABSTRACT

Novel Polymorphisms of ZRSR2 and GPM6B Gene Homologs and their use in

Sex Identification of Bovine and Porcine Species

Evan K. Peterson

Accurate and cost-effective PCR based sex identification is important in animal production because it gives producers the ability to determine the sex of embryos prior to transfer, saving time and money. The most efficient PCR sex identification assays work by using a single primer pair to amplify a specific target region located on the Y- chromosome and a second, separate target region on the X-chromosome.

This thesis reports the design of two novel assays. The first assay was designed to target the Zinc finger CCCH-type, RNA binding motif and serine/arginine rich 2 (ZRSR2) gene found on the X-chromosome and its Y-chromosome homolog, ZRSR2Y, in

European domestic cattle (Bos taurus). It creates X- and Y-specific amplicons of 149 bp and 116 bp long respectively and is effective for sex identification of Bos taurus, African domestic cattle (Bos indicus), bison (Bison bison), domestic goat (Capra hircus), domestic sheep (Ovis aries), and domestic yak (Bos grunniens).

The second assay was designed to target the glycoprotein M6B (GPM6B) gene found on the X-chromosome in domestic pigs (Sus scrofa) and a homologous pseudogene on the Y-chromosome. It creates X- and Y-specific amplicons of 244 bp and 201 bp long, respectively. The presence of a 16 bp intron deletion variant (rs1111696537) on the X- specific amplicon in a subset of the population was confirmed. Despite the variant, the primer pair is effective for sex identification of domestic pigs. vi

ACKNOWLEDGMENTS

The results presented in this thesis are a result of a collaborative effort. This research was made possible through a grant from the Utah Agricultural Research Station

(project #1330) and the kind efforts of many people. Those I mention here are not the only ones who helped me, but they are the primary ones. Each is responsible, in their own way, for this thesis and all have my deepest and most profound gratitude.

First and foremost, I’d like to thank Dr. Lee Rickords. I met him outside the university as an undergraduate student, when I was unable to find a biology lab that would even let me wash their glassware. He offered me a spot in his lab or his help getting into or any other lab I was interested in. That chance meeting and conversation has changed my life! In the seven years since then, Dr. Rickords has been my P.I., instructor, mentor, major professor, editor, advocate, sounding board, greatest frustration, best chance at self-improvement, and my friend. He has always given me the freedom to choose my own path and has also stepped in (more than once) to stop me from wandering too far afield. He encouraged me to pursue graduate school and has pushed me to keep working when I didn’t think I could finish.

My committee members Drs. Clay Isom and Aaron Thomas, also deserve a special thanks. They have provided crucial guidance at critical moments in the processes of research and data analysis. Additionally, both have spent hours proofreading and helping me make edits during my writing process.

My undying gratitude goes out to Miriam Laker. She started this project and laid

all the foundations for this work. She has been my second brain during all of this and my vii

first contact for every project related question. When I have felt totally inadequate, she

has always reminded me that: this is lab work and that means that no matter how careful

and skilled you are, sometimes things are just not going to work the way they should (or

the way they have the last 40 times).

My family deserves more praise than will fit in these pages. Everything good at

the core of who I am I owe to my parents. Together they have been my best and greatest instructors. My siblings and their spouses and children have supported me physically, spiritually, mentally, and emotionally. Whether I needed someone to tell me I could do

hard things, someone to smile and nod while I babbled about science (allowing me to

remind myself that I love it), someone to distract me from worries and remind me that

there is more to life than grad school, or even just a healthy meal and a free place to do laundry; they have always been there right when I needed them most!

I’d like to thank Marcy Labrum, I owe many of my good lab related habits to her teaching, and her patience. I sincerely appreciate Dr. Abby Benninghoff for teaching me much of what I know about professional presenting. I’m grateful for Sherri Cox, Laurel

Hancey, Charles Delaney, Karma Wood, Cara Allen, and the many other members of the

ADVS staff past and present, I don’t know what most of them did/do, but I know my life would have been a lot harder without them. I’d like to thank my fellow graduate students, those who are finished gave me hope that it was possible and those who are not done yet reminded me that we are all in this together!

Finally, a very special thanks to my beautiful new bride. Since we began dating, she has believed in me far more than I believed in myself and has seen in me potential I viii didn’t think possible. She inspires me and makes me want to be better every day. She wasn’t with me at the beginning, but I wouldn’t have made it to the end without her!

Evan K. Peterson ix

CONTENTS

Page

ABSTRACT ...... iii

PUBLIC ABSTRACT ...... v

ACKNOWLEDGMENTS ...... viii

LIST OF TABLES ...... xiii

LIST OF FIGURES ...... xiv

LIST OF ABBREVIATIONS AND DEFINITIONS ...... xv

CHAPTER

1. LITERATURE REVIEW ...... 1

Introduction ...... 1

Sex Determination ...... 1

Mammalian Sex Identification ...... 7

Current PCR/qPCR Gene Targets ...... 26

Conclusion ...... 32

2. SEX IDENTIFICATION OF BOVID SPECIES USING ZRSR2 & ZRSR2Y

GENE HOMOLOGS ...... 34

Abstract ...... 34

Introduction ...... 35

Materials and Methods ...... 37

Results ...... 41

Discussion ...... 50 x

3. SEX IDENTIFICATION OF PIGS USING GPM6B & ITS Y-CHROMOSOME

HOMOLOGS ...... 52

Abstract ...... 52

Introduction ...... 53

Materials and Methods ...... 54

Results ...... 57

Discussion ...... 58

REFERENCES ...... 61 xi

LIST OF TABLES

Table Page

1 Bovid Tissues Used ...... 38

2 ZRSR2/ZRSR2Y Targeting Primer Pair Results Summary ...... 44

3 Summary of X-Amplicon Sizes Observed ...... 57 xii

LIST OF FIGURES

Figure Page

1 Portion of ZRSR2-ZRSR2Y alignment in Bos taurus ...... 39

2 NCBI BLAST alignment for the amplified region of ZRSR2 ...... 42

3 NCBI BLAST alignment for the amplified region of ZRSR2Y ...... 43

4 Gels showing representative PCR results for bovid species tested...... 45

5 Graph of melting curve of excised X- and Y-specific fragments ...... 46

6 Graphs of melting curves for a subset of Bos taurus samples ...... 47

7 Graphs of melting curves for a subset of Bos indicus samples ...... 48

8 Graphs of melting curves of Bos taurus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens samples ...... 49

9 Portion of Sus scrofa GPM6B and its Y-chromosome pseudogene aligned ...... 55

10 A representative gel showing the amplicons produced by the primer pair targeting GPM6B and its Y-specific homologue ...... 58

11 Sequencing data from long and short X-amplicon generated from GPM6B primer pair compared to the Sus scrofa reference sequence (AC_000187.1 bases 11387210 to 11387329) and to the rs1111696537 deletion ...... 59

xiii

LIST OF ABBREVIATIONS AND DEFINITIONS

AMELX: Amelogenin gene. X-linked gene important in enamel production in mammals.

AMELY: Amelogenin-like gene. Y-linked gene that plays a minor role in enamel

production in mammals.

Amplicon: A piece of DNA produced by PCR or qPCR replication.

APRT: Adenine phosphoribosyl transferase. Enzyme produced by the autosomal gene

APRT.

ASW: Avian sex-specific W-linked. W-linked gene in avian species that is, in part,

responsible for inducing female development of embryos.

Bison bison: American bison, bison, or American buffalo.

BLAST: Basic local alignment search tool. A sequence alignment algorithm and program

for rapid comparison of a query sequence against a sequence database or library.

Bos grunniens: Domestic yak.

Bos indicus: African domestic cattle.

Bos taurus: European domestic cattle.

Capra hircus: Domestic goat.

CDC: Centers for Disease Control and Prevention.

CT value: Cycle threshold. The qPCR cycle at which fluorescence rises above the

threshold differentiating it from the background signal.

Cytogenetic sex identification: Genotypic sex identification method in which a karyotype

is created to determine if the individual has two X-chromosomes (female) or an

X- and a Y-chromosome (male). xiv

DDX3X: DEAD-box helicase 3 X-linked. X-linked member of the DEAD-box gene

family.

DDX3Y: DEAD-box helicase 3 Y-linked. Y-linked member of the DEAD-box gene

family.

Direct sex identification: A category of sex identification methods that rely on direct

examination of some aspect of an individual in order to identify its sex.

DMRT1: Doublesex and mab-3 related transcription factor 1. Z-linked gene involved in

the male sex determination in chickens.

Sex identification by DNA probe of Y-specific targets: Genotypic sex identification

method in which the binding of fluorescent Y-specific probes indicates the

presence of a Y-chromosome.

Eutherian: Mammal of the major group Eutheria, which contains only placental

mammals.

FET1: Female Expressed Transcript 1. W-linked gene in avian species that is, in part,

responsible for inducing female development of embryos.

G6PD: Glucose 6-phosphate dehydrogenase. X-linked enzyme produced by the gene

G6PD.

Gender: The subjective perception of an individual’s own sex and sexual orientation.

Genital ridge: The portion of a developing embryo that is the precursor to the gonads

(also known as the gonadal ridge).

Genotypic sex: The sex of an individual based on the sex chromosomes they possess (XX

is female, XY is male, for example). xv

GPM6B: Glycoprotein M6B. An X-linked gene that codes for a membrane glycoprotein

of the proteolipid family.

Haplodiploidy: A mechanism of sex determination based on whether the individual is

haploid or diploid, with males being haploid and females being diploid (also

known as arrhenotoky).

HMG Box: High Mobility Group Box. A protein domain with DNA binding activity.

HPRT: Hypoxanthine phosphoribosyl transferase. X-linked enzyme produced by the gene

HPRT.

H-Y antigen detection sex identification: Genotypic sex identification method in which

binding of H-Y antibody indicates the presence of a Y-chromosome in the cell.

H-Y antigen: A male specific antigen.

IETS: The International Embryo Transfer Society

Indirect sex identification: A category of sex identification methods in which a secondary

indicator (often the binding of a fluorescently labeled antibody or probe) is used

to indicate the sex of an individual.

Intersex: A general term used to describe an individual whose genotypic sex and

phenotypic sex do not fully match (hermaphrodite, true hermaphrodite,

pseudohermaphrodite, male pseudohermaphrodite, sex regressed, congenital

eunuch, etc. are other terms for intersex individuals)

IVF: In vitro fertilization.

IVP: In vitro-produced. Refers to embryos created through in vitro fertilization. xvi

MCA: Melting curve analysis. A qPCR analysis method that involves the slow raising of

the reaction temperature while fluorescence is being measured. Used to determine

the melting temperature and generate melting curves.

Oligo: Oligonucleotide. A very short strand of DNA.

Osteological sex identification: Phenotypic sex identification method that compares the

measurements of bones that are highly variable in size between sexes against a

population average in order to determine the sex of a given individual in that

population.

Ovis aries: Domestic sheep.

PAR: Pseudoautosomal region. Homologous regions on the X- and Y-chromosomes

capable of crossing over between the sex chromosomes. They also allow for the

sex chromosomes to properly segregate during Meiosis in males.

PCR: Polymerase chain reaction.

PCR fragment: See amplicon.

PCR primer: Oligonucleotide that serves (as part of a pair) to determine which section(s)

of the template DNA are replicated during PCR.

PCR/qPCR sex identification: Genotypic sex identification which uses PCR to create an

amplicon from a portion of the Y-chromosome (and often a different size

amplicon from the X-chromosome). Presence of the Y-specific amplicon indicates

the presence of a Y-chromosome (i.e. a genotypic male). Absence of the Y-

amplicon indicates a lack of Y-chromosome (i.e. genotypic female). xvii

Phenotypic sex: The sex of an individual based on the internal and external genitalia they

possess, as well as expression of secondary sex characteristics.

PAGE: Polyacrylamide gel electrophoresis. Type of electrophoresis that uses a

polyacrylamide gel, generally offers a higher resolution than agarose gels.

Primary sex characteristics: The internal and external genitalia of an individual.

qPCR: Quantitative polymerase chain reaction (also known as real-time polymerase chain

reaction). PCR with the addition of any of several types of fluorescent chemistries

and measurements of fluorescence during the reaction, providing real-time and

potentially quantitative data.

RFLP: Restriction fragment length polymorphism. Method of determining the presence

of polymorphisms based on the length of fragments produced by a restriction

enzyme digestion. rs1111696537: A 16 long intron deletion variant on the X-chromosome in Sus

scrofa.

SE47/SE48: A primer pair developed for PCR sex identification in cattle.

Secondary Sex Characteristics: Characteristics of sex that appear in individuals as the

near sexual maturity.

Sex Determination: The process by which sex is selected and develops in individuals of a

given species.

Sex Identification: Any method by which ascertain the sex of an individual.

SF1: Steroidogenic factor 1. An orphan nuclear receptor that acts in conjunction with SRY

to upregulate SOX9 expression in the process of sex determination. xviii

Sox9: SRY-related HMG box 9. A gene activated and upregulated by SRY gene activity.

Important in maintaining the process of male sex development in sex

determination.

SRY: Sex-determining region Y. Y-linked gene responsible for initiating the cascade that

results in male sex development during embryonic development of therian

mammals.

Sus scrofa: Domestic pigs.

TDF: Testes determining factor. Prior to the discovery of the SRY gene. This was the

name given to the theoretical gene suspected to be responsible for sex

determination in therian mammals.

Therian: Mammal of the major group Theria. Includes all marsupials and placental

mammals.

Ultrasound sex identification: Phenotypic sex identification method that uses ultrasound

technology to view the genital structures of a developing fetus in utero.

Visual sex identification: Phenotypic sex identification method in which sex of an

individual is ascertained by visual examination of primary sex characteristics.

X-linked enzymatic sex identification: Genotypic sex identification method in which the

relative activity of an X-linked enzyme prior to X-inactivation is used to

determine if an embryo’s cells contain one or two X-chromosomes.

ZFX: Zinc finger protein X-linked. Gene on the X-chromosome that codes for a DNA

binding protein. A target for some PCR sex identification assays. xix

ZFY: Zinc finger protein Y-linked. Gene on the Y-chromosome that appears to play a role

in normal male sex development and spermatogenesis. A target for some PCR sex

identification assays.

ZRSR2: Zinc finger CCCH-type, RNA binding motif and serine/arginine rich 2. An X-

linked gene that encodes for U2 small nuclear ribonucleoprotein auxiliary 35 kDa

subunit-related protein 2. A target of the primer pair described in chapter two for

the purpose of sex identification in bovid species.

ZRSR2Y: Zinc finger CCCH-type, RNA binding motif and serine/arginine rich 2 Y-

linked. A Y-linked, bovid specific, gene whose functions are not clearly

understood. A target of the primer pair described in chapter two for the purpose of

sex identification in Bovid species. CHAPTER 1:

LITERATURE REVIEW

Introduction

Sex is an important defining characteristic of sexually reproducing animals. This review of literature will discuss the most common of the various methods by which sex is determined. As the focus of the research is on eutherian species (placental mammals), special attention will be given to the way in which sex is determined in them (X/Y sex determination). This will include a history of the discovery of the so-called testes determining factor (TDF), sex-determining region Y (SRY), on the Y-chromosome and the way in which this gene induces male embryonic development. Discussion will also include the reasons why sex identification of eutherians is important and the various direct and indirect methods by which sex identification has and is accomplished. PCR and qPCR methods will be discussed in detail as they are the methods used in the research presented in the following chapters. The review will conclude with a discussion of the details of several of the specific genes that are currently targets for PCR/qPCR sex identification.

Sex Determination

Sex and Sex Determination

Within the animal kingdom, the vast majority of species undergo sexual reproduction during which haploid cells of individuals of different sexes merge to create a new diploid organism1. The sex of an individual is generally determined during 2

embryonic development, though the exact mechanism by which sex is determined varies

significantly. Methods of sex determination include chromosomal sex determination,

temperature-dependent sex determination2, location or contact-dependent sex determination3, and sequential hermaphroditism4. The most studied of these methods is

chromosomal sex determination. Within chromosomal sex determination, different

species accomplish sex determination through many mechanisms including

haplodiploidy, ZW sex chromosomes, XX/XO determination, and XY sex chromosomes

(including both X/A ratio, Y chromosome determination methods).

Haplodiploidy (also known as arrhenotoky) is the mechanism of sex

determination in all insects of the orders Hymenopterida and Thysanoptera.

Haplodiploidy determines sex based on whether the individual is haploid or diploid, with

males being born from unfertilized (haploid) eggs, and females being born from fertilized

(diploid) eggs5,6. As a result, males cannot have direct male progenitors or descendants.

Additionally, female offspring with the same male and female parents will have a

relatedness coefficient of 0.75 with one another and 0.50 with their mother (on average

they share 75% and 50% of their genomes respectively), in contrast with the standard

0.50 between all siblings and between parents and offspring as seen in species that

reproduce using other means. This altered ratio of relatedness is used to explain the

eusocial behavior of species that exhibit haplodiploidy in sex determination, as the

offspring will have more of their own genes expressed if they help their parent produce

more female offspring than if they were to produce their own offspring, in an

evolutionary strategy known as kin selection7. 3

ZW sex determination is a category of methods by which sex is determined in

birds, as well as some fish, insects, and reptiles. In this system, there are a pair of dissimilar sex chromosomes called the Z and W chromosomes. The Z chromosome is larger than the W chromosome and contains more genes allowing for a visible difference between the sex chromosomes. There is a large amount of variability in the mechanisms by which sex is determined in this system, so much so that the only truly defining characteristic of the system is that heterogametic (or monogametic in some species as described below) individuals (ZW or Z0 respectively) are female, while homogametic individuals (ZZ) are male8,9.

In some species, the presence of the W-chromosome specific genes are needed for

the embryo to develop into a female. For example, in birds the genes FET1 and ASW

(both found on the W-chromosome) are responsible for directing development toward the

female sex10–12. In contrast, in chickens, the double dosage of a Z-chromosome gene

(DMRT1) as a result of being homogametic also induces development of ZZ embryos into males13. Within invertebrates, some species have an altered mechanism in which

they do not directly rely on the W chromosome for determination, as is the case for some

members of the order Lepidoptera that have an altered form of ZW determination in

which ZO and ZZW also generate females9. The inconsistency in method is likely due to

the ZW system evolving more than once, as has been shown with the divergent origins of reptile and bird Z and W chromosomes14.

The last major chromosomal sex determination mechanism is the XX/XY system.

The XY system is similar to the ZW system in the amount of variability and the types of 4

variability in the mechanisms. They are also similar in one sex usually being heterochromatic and the other being homochromatic. Indeed, early research reports used

the terms interchangeably,9 and there has been some debate as to whether the two should

be given separate designations at all15–17. The main difference is that in the XX/XY

system, the male is heterochromatic (XY) and the female is homochromatic (XX). As

with ZW, there are several specific ways in which sex is determined. For example,

Drosophila melanogaster sex is dependent on the ratio of X to autosomal chromosomes

(X:A ratio), such that 1:1 X:A ratio individuals develop female and 1:2 X:A ratio

individuals develop male18,19. Within therians, nearly all species determine sex based on

the presence of a Y-chromosome and the SRY gene on the Y-chromosome in particular,

as will be described below. For simplicity, hereafter all further references to mammals

should be taken to be in reference specifically to therian mammals unless specifically

described otherwise.

Discovery of Mammalian Sex Determination

In most mammals, generation of biological sex is caused by the XY sex-

determination system. In this system, females are most often homogametic, having two

X-chromosomes, while males are heterogametic, having one X- and one Y-chromosome.

All initial research of mammalian sex determining mechanisms was primarily

accomplished studying humans and mice as described here. For the sake of simplicity,

protein and gene naming conventions will be used hereafter regardless of species.

When the X- and Y-chromosomes were first discovered in humans in 192320 it

was believed that they influenced sex determination via the same mechanism as 5

Drosophila melanogaster, namely the X:autosome ratio. This was believed until 1959

when aneupliod individuals in the form of both XXY individuals having testes 21 and XO

individuals with ovaries22 were discovered. These discoveries meant that human sex was

determined by some element within the Y-chromosome and not in the same manner as

Drosophila melanogaster, as the X:autosome ratio would have created phenotypes opposite of what was observed. In 1947, castration of fetal rabbits prior to sexual differentiation was found to cause both XX and XY rabbits to develop female ducts and external genitalia, leading to the realization that 1) the default state of development without gonads is female and 2) development of either testes or ovaries determines sex23.

The three decades following the discovery of the Y-chromosome’s role in determining testes development were spent trying to identify which gene on the Y- chromosome initiated the change in gonadal fate. Studies to find this testes determining factor gene (TDF) were carried out largely by examining the X- and Y-chromosomes of individuals expressing aberrant sex genotype by phenotype interactions (i.e. intersex individuals, XX males, XY females, etc.). After numerous false starts and misidentifications of TDF (including H-Y24, ZFY125, AND ZFY225), the location of the

gene was found to lie within a 60 kb region of the Y-chromosome. This was determined

through analysis of one XX intersex and three XX male humans that carried only a 60 kb

section of the Y-chromosome in common26. Shortly thereafter, another laboratory

identified a gene in this area which they termed SRY (sex-determining region Y)27. The

role of the SRY gene in sex determination of mammals was then confirmed by further

studies of the SRY gene in both humans28,29, and in mice25,30,31. 6

How SRY Induces Sexual Differentiation

The SRY protein is a transcription factor that contains a conserved domain similar to DNA-binding domains known as high mobility group (HMG) boxes27. Many related

transcription factors have been discovered and are termed the SRY-related HMG box

protein (SOX) family32,33. This family of operates by binding the minor groove of DNA at the consensus sequence (A/T)ACAA(T/A)34 and bending the target DNA

sharply at a 60-85° angle35,36. Outside of the HMG domain, no other highly conserved

functional regions have been identified in the SRY protein.

In mice, the SRY gene drives sexual differentiation through expression in early

gonad development. It follows a wave-like pattern beginning in the center of the genital

ridges, in pre-Sertoli cells, and rapidly extending out to the poles37. Expression is rapidly

extinguished following a similar pattern, reducing in expression in the center first, then

moving toward the poles. This occurs with the extreme posterior pole persisting in SRY

expression somewhat longer than the anterior pole37. This same pattern is seen in the

expression of the SRY protein in the genital ridges as well, though delayed by 6-8

hours38. The timing of SRY protein expression is critical for proper sexual

development39–41. Expression of SRY need only be brief (~2 days in mice37,38), though prolonged expression is seen in some mammals including humans42,43.

Across animal species, the SOX9 gene is a major target of SRY proteins, with its expression initiation pattern in the genital ridges closely following the wave pattern of

SRY described in the previous paragraph38,40,44–46. However, unlike SRY, SOX9 expression is maintained once initiated44, and there are indications that this continued expression is 7

needed to prevent Sertoli cells from reverting to their nondifferentiated state, which

would lead to sex regression47. SRY, in conjunction with an orphan nuclear receptor

called steroidogenic factor 1 (SF1), upregulates SOX9 in gonadal tissue48. SOX9 operates with SF1 to create a feedback loop increasing and maintaining its

own expression level independent of SRY48. Beyond maintaining its own expression

level, SOX9 appears to target several other genes in a cascade effect that results in the development of Sertoli cells from pre-Sertoli cells, as well as development of testes, and

ultimately the male sex (for a more complete review on what is known about this complex cascade, see references 47,49–53). While the entirety of the mechanism of mammalian sex determination is not known, and there may be additional direct ways in which SRY influences sex, its role as a switch to begin the cascade of male sex determination via SOX9 activation is clear.

Mammalian Sex Identification

Purpose

Sex identification is important for many applications in humans and other mammals. Major uses include in vitro fertilization (IVF)/embryo transfer, prenatal sex

identification, population ecological studies, lab sample controls, archeology, and various

forensic uses.

The CDC reports that 197,706 assisted reproductive technology procedures (99% of which are in vitro fertilization) were performed in 2016 with the intent of transferring at least one embryo54. Of those procedures, 154,439 (78.1%) led to the actual transfer of 8

an embryo and 65,964 (42.7%) resulted in a live birth54. Hamilton and McManus report

that the average cost to the patient is $10,000-15,000 per attempted IVF treatment55.

Given the low success rates and high costs, sex identification of embryos prior to transfer is a potentially large boon in cases where a child of a particular sex is desired.

The International Embryo Transfer Society (IETS) Data Retrieval Committee reported that, worldwide, during 2018, among cattle, horses, sheep, and goats, a total of

434,137 in vivo-derived (IVD) embryos were transferred (386,133, 22,495, 17,510, and

7,999 respectively) and 744,504 in vitro-produced (IVP) embryos were transferred

(742,908, 1,080, 358, and 158 respectively)56. Given the high cost of these procedures

(highly variable, but generally ranging from $1,000-5,000 per animal born), paying to transplant embryos of an undesired sex is prohibitive, and thus the low cost of embryonic sex identification prior to transfer is preferable.

Sex identification in prenatal fetuses is also beneficial in both humans and other mammals (particularly livestock). In humans this is primarily used to sate the curiosity of the parents57, however, it can also be used to reduce invasive prenatal testing for X-linked

genetic disorders58–60. The impact in animals is more significant economically, as it can

be used by producers to make informed decisions earlier than if sex could not be

determined until parturition. Examples of decisions include potential presale of animals,

sale of pregnant animals dependent on the sex of the fetus, determining if contracts are

completed or if more embryos should be produced, and selective termination of

pregnancies61,62.

In population ecology, sex identification is useful in monitoring populations, 9

estimating sex ratios, determining range and distribution, and understanding the roles of

each sex within an organism’s social structure63–65. Of particular use are sex identification

methods that can be used non-invasively so as to not disrupt the normal interactions of

the population.

Chromosomal based sex identification methods can be used as a primary control

against mislabeled or potentially contaminated mammalian samples. A simple way of

detecting if a biological sample has been contaminated, mislabeled or damaged severely

is to use one of the several chromosomal based sex identification methods that exist66,67.

While normal results do not confirm the identity or purity of a sample, aberrant sex

results are a strong indication that the sample is contaminated or mislabeled.

Archaeology has similar needs to ecology with regard to sex identification. The

main difference is that while ecologists use special methods for sex identification to

avoid disturbing the natural habits of the population, archaeologists need special methods

because they are working with a skeletal record68,69.

Finally, sex identification is used in forensic science in several ways. Perhaps the

most obvious of these is determining the sex of a potential suspect in a criminal

investigation based on the analysis of tissue or DNA70. In situations where hunting one

sex of animal (usually the male) is legal, but it is illegal to take the other, poachers may

strip the meat from the carcass leaving behind all easily identifiable markers of sex71. In

such situations, other sex identification methods may be used to identify game animals

taken illegally71. A similar method is used in countries, such as India, where slaughter of livestock may be illegal depending on the sex or where the price of meat is dependent on 10

the sex of the animal it came from72,73.

Methods of Mammalian Sex Identification

A multitude of different methods of sex identification exist for mammals. These methods can be divided into two broad categories, namely: direct and indirect identification methods. Direct methods are those that look directly at some aspect of the animal to determine sex, and include methods such as visual detection of physical sex

characteristics, osteological sex identification, ultrasound viewing of fetal genital structures, and karyotyping. Indirect methods rely on a secondary indicator to determine sex. These methods include H-Y antigen detection, X-linked enzymatic identification,

DNA probes of Y-specific targets, and PCR/qPCR sex identification.

It is important to recognize that there are three distinct ways sex can be perceived: genotypic sex, phenotypic sex, and gender74. Genotypic sex refers to the sex

chromosomes possessed by an individual (XX is female, XY is male for example).

Phenotypic sex refers to the internal and external genitalia of an individual, as well as

expression of secondary sex characteristics. Gender is the subjective perception of an

individual’s own sex and sexual orientation. As gender is only applicable in humans, and

is far beyond the scope of this work, no more will be said on the subject of gender.

Genotypic and phenotypic sex are generally the same, however, in a small portion of individuals (which varies between species) genotypic and phenotypic sex do not match. In such cases the individual is considered to be intersex (other terms include hermaphrodite, true hermaphrodite, pseudohermaphrodite, male pseudohermaphrodite, sex regressed, congenital eunuch, etc.). Intersex individuals are generally rare; however, 11

they are worth mentioning here as all sex identification methods currently identify either

genotypic or phenotypic sex, but not both. The description of each method, given below

will also note whether it identifies genotypic or phenotypic sex.

Direct Methods

Visual detection of sex characteristics

Visual detection of physical sex characteristics is the easiest, simplest, and

cheapest method of sex identification and is the most common method of identifying phenotypic sex, in many situations. This involves identifying the presence or absence of primary and secondary sex characteristics in the individual. Secondary sex characteristics

appear as the animal reaches sexual maturity and are highly variable from species to

species, though they often include subtle differences in size, bone structure, and

musculature. Primary sex characteristics are generally more consistent between species

and the external genitalia can be used for visual identification in many instances. For

males, the presence of a penis structure is the primary external indicator of sex.

Additionally, in many creatures, testicles outside of the abdominal cavity also indicate the male sex. In females, presence of the vaginal opening is generally the external indicator of phenotypic sex.

For many applications, this is not a viable method of sex identification as it requires the individual to be present and to have genitalia developed enough to be seen.

Meaning, it cannot be used for forensic purposes or for sex identification of embryos or of fetuses in utero and it tends to be an invasive process for ecological purposes. Further, 12

special training or species familiarity may be required in species that are particularly

difficult, or do not follow normal patterns. One such species is the spotted hyena, in which females are generally larger and more muscular than males (opposite from the pattern most commonly observed in mammals), and they have a pseudo-penis75 which

can be confused with the male penis by those unfamiliar with the species.

Osteological

Osteological sex identification uses skeletal measurements of animal remains in

order to identify the phenotypic sex of the individual. As with visual inspection of the animal, osteological sex identification is easier after the individual has reached maturity and secondary sex characteristics are present. This is because as mentioned above,

skeletal changes occur as animals reach sexual maturity. Even before the animal reaches

maturity however, it may be possible to determine sex, though accuracy is generally

much lower.

These changes are often different between species requiring a specific knowledge

of each skeletal sexual dimorphisms in any given species before sex can be determined.

In humans for example, females have smaller and more delicate (lighter) bones and these

differences between the sexes are most easily detected in the nuchal crest, mastoid

process, supraorbital margin, supraorbital ridge/glabella, and mental eminence68. In

contrast, the breadth measurement of metacarpal bones is used to determine sex in

cattle76.

While potentially reliable, interpopulational variance requires means to be

determined for each population before sex identification can be performed with any real 13 degree of accuracy. Additionally, because the osteological features of each sex vary within a given population and the variances overlap, the accuracy in the portion of the population that contains this overlap is decreased68. The “nontrivial error rate” of osteological sex identification has led researchers to recommend use of DNA based sex identification techniques whenever possible68 as those methods have a much higher accuracy. The high error rate and the need for the presence of intact bone limits the usefulness of the method to archaeological applications where DNA sex identification is not possible.

Ultrasound

Ultrasound (sonography or ultrasonography) sex identification uses the emission and detection of high frequency (1-18 MHz) sound waves to determine the phenotypic sex of an animal in utero. The sound waves are produced by a piezoelectric or capacitive transducer, both of which convert electrical energy into high frequency sound emissions.

The emitted sound is partially reflected back to a sensor on the probe with different tissues and structures reflecting and absorbing the sound in different ways. The sensor works in the reverse of the transducer, converting sound into electrical current. The signals coming in from the transducer are then analyzed based on how long it took the echo to return and how strong it was. The scanner uses this information to determine which pixels of the image to light up and to what intensity and this information is used to display the image.

The first published data on the use of ultrasound for fetal imaging was in 195877 and the first published data on fetal sex identification using ultrasound was in 197757. 14

Methods of ultrasound for fetal sex identification vary by animal, for example, in humans

the ultrasound is performed through the abdominal wall, while in cattle transrectal

ultrasound is used. This imaging method uses the same basic premise as visual

inspection, in that both involve observing the genital structures of the subject. The major

difference is that the error rate is higher in ultrasound as it can be difficult to get a good

image of the genitals, though training and experience significantly improve accuracy.

This method of sex identification has the advantage of being rapid and relatively

inexpensive; however, it is of course only useful for in utero sex identification.

Additionally, successful identification is only likely after the point in the pregnancy when

primary sex characteristics have developed.

Cytogenetic

Cytogenetic sex identification is a method for determining the genotypic sex of an

individual. This method works in the same way as generating a karyotype namely,

inducing cell replication, arresting cell replication in metaphase, hypotonic shock,

fixation, staining, and microscopic visualization, primarily looking for the presence of the

Y-chromosome. This methodology has been used for sexing of embryos by taking a

biopsy of the embryos prior to transfer78–81 and by collecting and culturing amniotic fluid

cells82. It can also be used on somatic cells83 so long as they are living and capable of

replication.

The requirement, for cells to be living and replicating, limits the uses of this methodology such that it is not used except for sex identification of embryos or of fetuses in utero and in such cases it is often not the most efficient method. When sexing embryos 15

in this way, it can be difficult to get sufficient numbers of cells in metaphase and of a

high enough quality in a reasonable amount of time. The best way to work around the

lack of cells is to take a larger sample, but this can cause different issues. For example, by

waiting until bovine embryos reach day 14 or day 15, a larger sample may be taken78

without, in theory, compromising the ability of the embryo to develop normally84.

Unfortunately, this late in development, the embryo has hatched and if frozen and thawed

for later implantation the success rate of transfer drops significantly, meaning that the

embryo may need to be transferred before the sexing is actually completed. In contrast,

day-6 or day-7 bovine embryos can still be frozen and thawed but have only a limited

number of cells that can be biopsied without damaging the embryo’s developmental

potential79–81. The only advantage of this method currently is that it does not require

information regarding the genomic sequence of the species, as methods such as DNA probes and PCR methods do.

Indirect Methods

Antigen detection

Detection of antigens such as the Histocompatibility-Y (H-Y) antigen have been

used to identify the genotypic sex of embryos prior to transfer by using the presence of

Y-specific proteins to identify Y-chromosome containing samples. With H-Y antigen this is done by either arresting the development of male embryos through culturing in the presence of H-Y antiserum and complement which causes H-Y positive cells to lyse85 or

by indirect immunofluorescence assays using polyclonal or monoclonal antibodies86,87. 16

By culturing cells with the antiserum and complement, cells containing the

antigen will experience antibody binding followed by the complement cascade, which

will ultimately lead to cell death in any cells containing the Y-specific antigen. In

contrast, the immunofluorescence assay uses binding of a fluorescently tagged secondary

antibody to bind to the primary antibody. Thus when viewed under a fluorescent

microscope, cells expressing the antigen will fluoresce making them potentially easy to

identify. Unfortunately, H-Y antigen detection is not highly accurate, which is thought to be a result of the highly subjective nature of the immunofluorescence assays86. This

subjectivity seems to be due to altered fluorescence depending on the quality of the

embryo and to the inability to detect the antigen at different stages in embryonic

development86.

X-linked enzyme methods

In mammals, females have a higher copy number of all genes on the X-

chromosome because they carry two copies of the chromosome while males only carry

one. This difference in copy number requires that females inactivate one X-chromosome as a form of dosage compensation in order to allow for normal development. While the exact timing of X-inactivation varies between species, there is a short period in the early development of all species where both X-chromosomes are active. X-linked enzymatic sex identification takes advantage of this brief period of doubled X-linked gene products in females to differentiate between the genotypic sexes of embryos by measuring the relative activity of an X-linked enzyme.

Two different X-linked enzymes have been used for this, glucose 6-phosphate 17

dehydrogenase (G6PD)88 and hypoxanthine phosphoribosyl transferase (HPRT)89. In T.J.

Williams’s 1986 report, the sex of 64% of mouse embryos successfully carried to term

was determined accurately88. This was done using a vital colorimetric assay. Briefly,

morula to blastocyst stage embryos were incubated in a medium containing glucose 6-

phosphate, NaDP, and brilliant cresyl blue stain. Embryos were then graded semi-

quantitatively based on the amount of staining. Results showed a bimodal distribution of

staining and embryos with a high staining score were assumed female and low staining

were assumed male, with the result that 72% of the predicted females were sexed

correctly and 57% of the predicted males were sexed correctly, for an overall accuracy of

64%88.

A study conducted by M. Monk and A. H. Handyside used the expression of

HPRT in single mouse blastocysts separated from zona-free eight-cell embryos to determine sex. In addition to the X-linked enzyme, HPRT, they also examined the activity of the autosomal-linked enzyme adenine phosphoribosyl transferase (APRT) for use as a control of base cellular activity. They found a bimodal distribution of

HPRT:APRT ratios 89, similar to the findings of Williams88, and determined that those

with a low ratio were male while those with a high ratio were female. Subsequent

examination of the fetuses indicated 91% (11/12) were accurately sexed as female, and

100% (3/3) were accurately sexed as male89.

As stated above, this method is limited to early embryos as X-inactivation corrects for the double dosage. Significant testing would be required to adapt the method for each new species, because X-inactivation does not occur at a consistent 18

developmental stage across species. An increased mortality rate in extreme stained

embryos88 indicates that there is still some toxicity to the process, but in some species separation of single blastocysts is not as feasible as it is in mice. Taken together, this is a suboptimal form of embryonic sex identification.

DNA probe

DNA probe methods for genotypic sex identification use Southern blotting or dot

blotting of DNA or in situ hybridization of cells with a probe consisting of a relatively

short DNA sequence specific to a conserved region of the Y-chromosome labeled with

either a radioactive or fluorescent tag90–93. These methods detect the presence of the Y-

chromosome in a DNA sample or cell of any kind making this one of the most broadly

useful and accurate (95-99%)92,93 methods of sex identification.

Use of DNA probes for genotypic sex identification came from the use of probes

in attempts to identify TDF in humans94. In searching for TDF, Y-specific sequences

were isolated, those sequences were then replicated, tagged, and used to test individuals

whose genotypic sex did not match their phenotypic sex. This was done to determine

which parts of the Y-chromosome were present or absent in XX males and XY females

respectively95–97.

In the early days of DNA probe sex identification, this method (especially

creating new probes) was a difficult, time and labor intensive process (see references 93,94

for more information). Modern sequence databases and the ability to perform BLAST

searches and order premade labeled probes has significantly reduced the difficulty of

making DNA probes for most species. Despite these advances, use of DNA probes is still 19

a time and labor-intensive method for sex identification, especially compared to PCR and

qPCR methods.

PCR and qPCR

Overview of PCR and qPCR

Polymerase chain reaction (PCR) and quantitative PCR (qPCR) methods of sex

identification both use PCR amplification to identify genotypic sex by either determining

whether a Y-chromosome or two X-chromosomes are present in a DNA sample. Since

PCR and qPCR can be used in a multitude of ways for sex identification and are of

primary importance to the research presented in this thesis; a more detailed explanation of the methods, chemistries, and applications of these techniques will be discussed. An

overview of the methods will be discussed prior to their use in sex determination

procedures.

PCR and qPCR differ mainly in the methods of detection and data analysis. Both

PCR and qPCR operate on the principle of massively replicating a DNA sequence (or

sequences) from the sample DNA template. A pair of short oligonucleotides, called

primers, determines the replicated sequence, called an amplicon. Amplicons created

using PCR are frequently detected and analyzed using gel electrophoresis and a DNA

binding stain. This separates amplicons by size allowing users to determine the

approximate size of the amplicon and determine if amplicons of different sizes are

present.

In qPCR more analysis options are available in addition to gel electrophoresis, 20

including real-time data analysis as the reaction is occurring (qPCR is also known as real-

time PCR). The collection of real-time data is dependent on the use of fluorophores in the reaction that only fluoresce under certain conditions. Throughout the reaction (usually at the end of each cycle), the fluorescence of the fluorophore(s) is measured and recorded.

There are two main classes into which the fluorescent chemistries can be divided, specific and non-specific. Specific chemistries use oligonucleotides with a fluorophore and a quencher bound in proximity to each other (called probes). Under the excitatory wavelength of light, the fluorophore’s emission is absorbed, or quenched, by the adjacent quencher molecule. During the reaction, the probe will bind to a specific sequence in the amplicon and will be digested by the 5’ to 3’ exonuclease of the polymerase allowing the fluorophore and quencher to separate, thus as more copies of the amplicon are made fewer fluorophores are quenched, and greater fluorescent emission is detected.

In contrast are non-specific chemistries. In these methods, the fluorophore only fluoresces when bound to double stranded DNA (dsDNA). As more amplicons are generated, there is more dsDNA and thus more fluorescence with each cycle of the reaction.

Both methods of the chemistry provide fluorescence data, which can be used to quantify the absolute or relative number of copies of the amplified region in the original sample. Absolute quantification is done by comparing the fluorescence of the sample of unknown concentration to a calibration curve, created using DNA standards generated by the same primers, in order to calculate the exact number of sequence copies initially. For this calculation to be accurate, the amplification efficiency (also known as primer or 21

qPCR efficiency) of the sample and the standards needs to be the same.

Amplification efficiency is calculated using serial dilutions of the sample as

follows. qPCR is done on the serial dilutions and the cycle numbers at which the

fluorescence rises above the detection threshold (the Ct values) are graphed against the

log of the sample dilution. The slope of that line is then used in the following equation to

calculate the percent efficiency of the amplification.

An efficiency of 100% means that for every round of replication the number of

copies of PCR product doubles. This is not often the case, but an efficiency of 90-110%

is generally considered adequate for replication, but for accurate quantification data the

efficiencies of the samples and standards must be similar (for a review of the error

introduced to quantitative results by differing efficiencies see reference 98).

Relative quantification is measured in a different way (specifically using delta-

delta-Ct method) and while efficiency is important here as well, small differences in

efficiency between target and reference efficiencies do not introduce as much error as in

absolute quantification98. Quantitative analysis is used for gene expression analysis as a way of determining copy number of cDNA derived from reverse transcription. This can also be used to determine degree of infection in virally infected tissue.

In addition to the quantitative capabilities of qPCR, it can also be used in a

deterministic fashion. For example, using primers that only generate an amplicon when a

certain mutation is present, or when DNA from a particular bacterial or viral strain is

found in the template DNA, the generation of any amplicon (indicated by an increase in

fluorescence) is sufficient to determine its presence. 22

A final method of analysis (and of particular importance in chapter two of this thesis) is melting curve analysis (also called fusion or melting temperature analysis) which allows for the differentiation of amplicons based on their melting temperature.

Melting curve analysis requires a special measurement step after the final extension of the reaction. Traditionally, after the final extension the reaction is heated and cooled to the denaturation and annealing temperatures respectively, as though beginning another cycle of replication. However, once the DNA is annealed, the reaction is slowly heated back up to the melting temperature as frequent fluorescence measurements are taken at regular intervals. This is usually done using an intercalating dye, which is only able to fluoresce when bound to dsDNA. As the DNA melts, separating into single strand DNA

(ssDNA), the amount of fluorescence will decrease. As the melting temperature of a particular amplicon is reached, the rapid conversion of dsDNA to ssDNA will create a correlated rapid decrease in the amount of fluorescence detected. The change in fluorescence is then graphed against the temperature and is known as the melting curve.

A number of factors including GC content, fragment length, and sequence structure influences DNA melting temperatures, thus the presence of multiple peaks in the melting curve analysis indicates the presence of multiple unique amplicons.

Use in sex identification

Sex identification using PCR and qPCR technology can be accomplished using a variety of subtle modifications. Here, detection of the Y-chromosome without a positive control, multiplex and inherent positive controls, differences in techniques depending on the detection mechanism in qPCR, and a qPCR method sex identification based on the 23

relative number of X-chromosomes will be briefly discussed.

Non-quantitative PCR sex identification is accomplished by detecting the

presence of a Y-chromosome, through amplification of a specific portion of the Y-

chromosome, then determining if the amplification occurred and if the Y-chromosome

specific amplicon is present. The major variation in these methods is 1) the chromosomal

region amplified, 2) the absence or presence (and form) of a positive reaction control, and

3) the method by which the amplicon is determined to be present and specific to the Y-

chromosome.

A large variety of Y-chromosomal regions have been targeted for sex

determination and an exhaustive list would be impractical to give, however, these targets

can generally be placed into one of two groups 1) microsatellites/highly repetitive

regions99,100 and 2) gene specific targets. Microsatellites and highly repetitive regions were mainly used in the early years of PCR sex identification, but has fallen out of common use. Gene specific targets are still routinely used.

If the primer only amplifies a target on the Y-chromosome (ex. those targeting the

SRY gene), there can be an increased risk of a false negative (incorrectly determining that a male is a female) because there will be no difference between a failed reaction and an

X-chromosome only containing female sample101–103. Multiplexing (usually duplexing)

with other primer pairs that target the X-chromosome or an autosome can mitigate this

risk somewhat. As a multiplexed reaction where the non-Y-specific amplicon is present, but the Y-specific amplicon is not, indicates that the reaction conditions do allow PCR to proceed104–107. Having a multiplexed reaction can complicate the assay development 24

process however, as it requires consideration of multiple additional factors such as

matching primer annealing temperatures and checking for primer dimer formation among

twice as many primers. In addition to these complications, it does not act as a control for

the Y-specific primers, thus if the primers fail either because they were not added in

proper concentrations or they have been compromised, this will not be identifiable. A

more efficient control can be achieved by selecting a primer that creates two amplicons of

differing sizes, one from the Y-chromosome and one from either the X-chromosome or

an autosomal chromosome108–110,63,64. This eliminates the complications of multiplexing

multiple primer pairs, and if there is a problem with either of the primers it will be detected through this inherent control.

Amplicons are generally visualized by gel electrophoresis and DNA staining with ethidium bromide or other less toxic stains such as Sybr Safe™ (Invitrogen, Life

Technologies, Carlsbad, CA). While visualization is consistent across methodologies, the way in which the amplicon is determined is variable. By amplifying only a Y- chromosomal region, presence of the Y-chromosome can be determined simply by visualizing an amplicon of the expected length111,112. As mentioned above, the use of a

single Y-chromosome specific primer pair has the problem of females and failed reactions being indistinguishable. A different method of identifying the Y-chromosome,

that was briefly used, is through Restriction Fragment Length Polymorphism (RFLP).

RFLP methods create Y-specific and non-Y-specific amplicons during PCR which are then digested using restriction enzymes. Due to differences in the amplicon sequences the

restriction enzyme digestion creates fragments of a length unique to the origin of the 25

amplicon allowing for easy identification of the presence of the Y-chromosome based on

the length of the fragments99,113,114. Relative to methods that create amplicons of differing

length during PCR, RFLP methods add extra work with no benefit. The final and most

important method for identifying Y-specific amplicons, relative to this body of work, is

through generation of a Y-specific and a non-Y-specific amplicon through the PCR that

differ in length64,104–108,110. This method includes the reaction control in each sample and

requires no added steps between PCR and visualization in order to allow for

differentiation between the amplicons.

Due to the genetic variance between species, there is a high probability that

different primer sets will need to be designed for species that are not closely related,

increasing the labor of establishing a working test in a new species.

qPCR can be used to identify genotypic sex by either determining the presence of a Y-chromosome or determining if more than one X-chromosome is present115. Methods that identify sex by determining if a Y-chromosome is present may use the same primers that are used in PCR. An intercalating dye may also be used, in conjunction with melting curve analysis, to determine if the Y-chromosome is present so long as the amplicons differ enough in melting temperature to be distinguished from one another116. Use of a single primer pair that creates amplicons of differing sizes (as described in the PCR

section above) is accomplished in the same ways as multiplexing, however, if the two

amplicons are highly homologous (which is likely), special care must be taken in probe

selection to ensure that probes bind only to their intended amplicon.

qPCR methods that determine if there is more than one X-chromosome rely on 26

the assumption that equivalent quantities of genomic DNA will have near equivalent

chromosomal copy numbers. With this assumption then, a relative copy number of X-

chromosomes can be determined based on CT values using an intercalating dye. This

technique uses the 2 –ΔΔCT method to determine relative copy number of an X-linked amplicon and by extension the genotypic sex of the individual115. This technique will

inaccurately predict phenotypic sex in XO (Turner syndrome) and XXY (Klinefelter

syndrome) individuals which occurs in 1 in 1893117 to 5000118 and 1 in 500118 to 650119

humans respectively. Thus, on average such a test in humans would be expected to not be

an accurate representation of phenotypic sex 0.17%-0.25% of the time. In contrast, PCR

and qPCR techniques that identify Y-chromosome will only have the genotype and

phenotype misalign in cases where an XY individual is phenotypically female or a XX

individual is phenotypically male. This only occurs in about 1 in 80000120 and 1 in

20000121 humans respectively, making PCR and qPCR sex identification based on the

presence of a Y-chromosome far more likely to accurately predict phenotypic sex.

Current PCR/qPCR Gene Targets

Common Origin of X & Y

As most PCR and qPCR sex identification of mammals is done by targeting genes

on the Y-chromosome and a homologous region on the X-chromosome, a brief overview of the evolutionary history of mammalian sex chromosomes is appropriate.

In the 1960s it was proposed that the vertebrate X- and Y-chromosomes were originally a pair of homologous autosomal chromosomes that evolved over time122. This 27

evolution was such that the proto-Y-chromosome developed the genes needed to be the

chromosome of the heterogametic sex (males) and progressively deteriorated until it

could not exchange most of its genetic material with its original homolog, this while the

proto-X-chromosome remained largely the same with the exception of developing

methods of dosage compensation122. The exact way and order in which this occurred is

still not fully known and many different hypotheses have been put forward to explain it

(for a more information on the proposed evolution of sex chromosomes see references

122–128).

Despite this, it is generally agreed that the X- and Y-chromosomes were

originally homologous chromosomes that lost the ability to recombine after the proto-Y-

chromosome gained a sex determining role. Interestingly, the Y-chromosome has become subject to erosion such that it is almost entirely devoid of genes, relative to the X- chromosome; yet it contains an abundance of repetitive DNA sequences124,127,128. The inability of most of the Y-chromosome to recombine is believed to be a major cause for its erosion and loss of function. In contrast, the X-chromosome retains much of its original structure due to its ability to recombine with other X-chromosomes in females.

The erosion of the Y-chromosome in therians throughout the ongoing processes of evolution and speciation has led to a number of species specific differences in the Y- chromosome. These differences range from point mutations between some species to entirely different genes being present. The presence, absence, and variations of genes on the Y-chromosome between genes can possibly give indications of the historical stability of particular regions as well as the evolutionary history of certain species. This can be 28

seen in the examples of gene targets described below. While, the same gene is targeted in

many species, the primer pairs are often not targeting the region of the gene. This is because while different species may have the same X-homologs on the Y-chromosome, the Y-chromosome has eroded differently during the evolution of each species. This lends additional importance to the cross-species comparison of the X- and Y- chromosomal regions detailed here, as the variations between species may help answer questions regarding the past and future evolution of the Y-chromosome.

The Y-chromosome in most mammals has developed such that, outside of a pseudoautosomal region (PAR) located on the distal end of the short arm of the Y- chromosome, which is homologous between X- and Y-chromosomes (allowing for X-Y pairing and gene crossover in meiosis), there is very little homology between the X- and

Y-chromosomes. The few regions of homology between the sex chromosomes that persist outside of the PAR are very unlikely to cross over which is of great importance in

PCR/qPCR methods of sex identification designed to identify the presence of the Y- chromosome and include an inherent reaction control.

The genes described in the remainder of this section, have been, or are currently, used as targets for PCR or qPCR sex identification in various species. With the exception of the SRY gene, all of the genes have a homolog on the X-chromosome. Assays targeting those genes with an X-chromosome homolog have been designed to take advantage of this potential for an inherent positive reaction control as described earlier in the chapter.

Lacking an X-chromosome homolog, the SRY gene must use duplex reactions to provide a positive reaction control. 29

SRY

As described above, the SRY gene is found exclusively on the Y-chromosome in

mammals and is responsible for initiating the change from female (the default sex) to

male in early embryonic development. Its presence always indicates a male regardless of

whether the individual has a Y-chromosome or a translocation event involving the SRY

gene has occurred, making it a reasonable target gene for sex identification.

ZFX/ZFY

In humans, the zinc finger protein X-linked (ZFX) gene codes for a member of the

krueppel C2H2-type zinc-finger protein family. ZFX is a DNA binding protein with a

transcriptional activation domain, a nuclear localization domain, and a DNA binding

domain present in the full-length protein. In mice ZFX is a required transcriptional

regulator needed for self-renewal of embryonic and adult hematopoietic stem cells,

though it is not involved in their growth or differentiation129. Multiple isoforms of ZFX

have been identified and are produced by alternatively spliced transcript variants, the full- length nature of all of these variant isoforms has not yet been determined130. Also, ZFX

escapes X inactivation131.

ZFY may act as a transcription factor in mammals and was once a preeminent

candidate of being the testis-determining factor and was in fact named TDF for a short

time (the term TDF is no longer in common use and is now used in connection with the

SRY gene if at all). In mice, there is evidence that correct ZFY function is required for

proper spermatogenesis, as improper expression of either of the two gene splice variants leads to apoptosis at the mid-pachytene checkpoint132. In humans ZFY is mainly produced 30

in the testis and prostate, though it is also found in several other tissues in the urinary and

digestive systems.

ZFX and ZFY are targets for sex identification in a number of species including humans, cattle, sheep, goats, red deer, and elk107,113.

Amelogenin/Amelogenin-like

The amelogenin gene (AMELX, formerly AMG or AMGX) is an X-linked gene

with a Y-linked paralogue termed amelogenin-like (AMELY, formerly AMGL or AMGY).

Both genes produce functional isoforms of the amelogenin protein, though expression of

AMELY is only ~10% of AMELX expression in humans133. Both AMELX and AMELY

are key extracellular proteins in the process of amelogenesis, through which enamel is

produced133. Amelogenin, operates in tandem with ameloblastins, enamelins, and tuftelins in the process of mineralization in tooth development, with amelogenin making up 50-90% of the proteins present during the stages of enamel development134.

Amelogenin self assembles in developing teeth into nanosphere structures which create

the structure around which early enamel crystallites form135,136.

The chromosomal locations of both AMELX and AMELY were first identified in

humans and mice and were found to be on the short arm of the X-chromosome and near the centromere of the Y-chromosome respectively137. Soon after discovery of their

locations, both AMELX and AMELY were first sequenced in humans from genomic

DNA138. Prior to this, the only sequences of amelogenin came from cloned murine and

bovine cDNA respectively139,140 and the chromosomal origins were not specifically

known. The sequencing of AMELX and AMELY by Nakahori et al identified a significant 31

deletion region on AMELY relative to AMELX, which was able to be used as a target for

sex identification in humans141. The common presence of AMELX and AMELY in mammals and the frequency in which significant deletion regions are seen in AMELY has made it a common target for sex identification in a variety of different species including: humans109,142, nonhuman primates63,143,144, various ungulates (including cattle, sheep,

goats, bison, yaks, pigs, deer, moose, and caribou)102,103,108,110,145–148, cetaceans146,

felids149, and bears65,150.

DEAD-Box Protein (DDX3X/DDX3Y) Gene

DEAD-Box helicase 3 X-linked (DDX3X also called DBX; DDX3; HLP2;

DDX14; CAP-Rf; MRX102) is a member of the DEAD-Box protein family. The DEAD-

Box family of proteins is the largest member of helicase superfamily 2 (SF-2)151,152 and

have nine conserved sequence motifs153,154. Motif II (or Walker B motif) of DEAD-Box

proteins has the amino acid sequence D-E-A-D and is the source of the protein family’s name153. DEAD-Box proteins are RNA helicases with ATPase activity and are involved

in transcription155, pre-mRNA splicing156, ribosome biogenesis157, nuclear export158, translation initiation159,160, RNA degradation161, and organelle gene expression162. For a

more detailed review of the DEAD-Box family of proteins and their functions, see

references 151,163–166.

DDX3X was first identified as a homologue of DDX3Y (initially termed DBY)

that escapes X-inactivation167. As with other DEAD-Box proteins, DDX3X is an RNA

helicase. It has been shown to have DNA helicase activity, a high level of RNA- independent ATPase activity, and ATPase activity appears to be stimulated by both DNA 32

and RNA which is unusual among DEAD-Box proteins168,169. It is expressed ubiquitously

in all tissues170. Haploinsufficiency is linked to intellectual disabilities171. It is directly

targeted by many viruses and has been suggested as the target of a broad spectrum of

antiviral drugs172,173.

DDX3Y is the Y-specific homolog of DDX3X. It was discovered in a systematic

search of the human Y-chromosome and was useful in the discovery of DDX3X167.

Transcription of DDX3Y is broad, but not ubiquitous and is generally lower in most

tissues than DDX3X, with the highest expression of DDX3Y transcription in bone marrow

and testes170. Despite the widespread transcription of the gene, DDX3Y protein is only

produced in testicular tissue174. Unsurprisingly, mutations in DDX3Y are associated with

male infertility ranging from reduced production of germ cells to Sertoli-cell-only (SCO) syndrome175.

DDX3X and DDX3Y are targets for sex identification in a number of species

including American bison, European cattle, sheep, pigs, Japanese weasels, Siberian

weasels, and Japanese martens72,176,177.

Conclusion

Sex determination involves complex mechanisms and is highly variable between

species. In therian species sex determination is controlled by the SRY gene located on the

Y-chromosome, which acts as a switch when present, pushing embryonic development

into the path of male phenotypic development.

Sex identification of humans and animals has a number of important applications 33

including embryo transfer, ecology population studies, lab sample controls, archeology,

and use in criminal justice. Over time, various methods of sex identification have been

developed in order to meet these complex needs. These include direct methods of sex

identification such as visual inspection, ultrasound, and cytogenetic identification of sex

chromosomes by karyotyping, as well as indirect methods such as Y-specific antigen

detection, enzyme based X-chromosome copy number detection, Y-specific DNA probes,

and PCR/qPCR. For determining genotypic sex, PCR and qPCR are the fastest, most

accurate, and most effective methods.

PCR and qPCR sex identification is done in a wide variety of species using primer

pairs that exploit the common ancestry of the X- and Y-chromosomes. This is most often

done by targeting Y-chromosomal regions that contain deletions relative to a largely

homologous X-chromosomal region. Much is still unknown about the evolution of the

sex chromosomes, and the information produced by this method of sex identification can

increase the knowledge of the process by creating direct comparisons of homologous

regions of the X- and Y-chromosomes within a species as well as between species.

The remainder of this thesis reports the development of two new pairs of

PCR/qPCR primers. The first primer pair described creates amplicons from ZRSR2 and

ZRSR2Y which are X- and Y-specific respectively. The second creates amplicons from

the X-specific GPM6B and its Y-specific pseudogene. In both cases, the Y-specific

amplicon contains deletions relative to its X-specific counterpart making it possible to distinguish the amplicons via gel electrophoresis and/or melting curve analysis. Chapter 2 focuses on the ZRSR2 targeting primer pair and demonstrates its efficacy in sex 34

identification of Bos taurus, Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens. Chapter 3 deals with the GPM6B targeting primer pair and shows its efficacy in sex identification in pigs. 35

CHAPTER 2:

SEX IDENTIFICATION OF BOVID SPECIES USING ZRSR2 & ZRSR2Y GENE

HOMOLOGS

Abstract

A single PCR primer pair designed to produce an X-chromosome specific

amplicon and a separate Y-chromosome specific amplicon has a distinct advantage in sex

identification assays. In this case, the single primer pair has a built-in control for the PCR

reaction and eliminates duplex PCR reaction inefficiencies. Primer pairs of this kind

generally target gene regions on the X-chromosome that have homologs on the Y-

chromosome containing deletions or insertions. One example is the amelogenin (AMELX)

gene. Primers spanning deletions in the amelogenin-like gene (AMELY) found on the Y-

chromosome have commonly been used for sex identification108,141,143.

The ZRSR2 and ZRSR2Y genes are shown here to be alternative sex identification primer targets in Bos taurus (European cattle), Bos indicus (African cattle), Bison bison

(American bison), Capra hircus (domestic goat) Ovis aries (domestic sheep), and Bos grunniens (domestic yak). Comparison of the Bos taurus ZRSR2 and ZRSR2Y gene sequences yielded multiple deletion regions located within the ZRSR2Y sequence.

Primers were then designed to flank two of these deletions. DNA samples from the aforementioned species were tested with these primers and proved to be highly accurate in sex identification of all samples tested. Female samples produced a single X-

chromosome specific ZRSR2 amplicon, 149 bp in length, whereas male samples produced 36

two amplicons, one X-chromosome specific ZRSR2 amplicon (149 bp) and one Y-

chromosome specific ZRSR2Y specific amplicon, 116 bp in length. This assay provides

an efficient alternative assay for sex identification in domestic cattle species.

Introduction

Sex identification of DNA samples is an important component of preimplantation

embryo diagnosis, animal identification, and forensic biology. PCR methods are often

used in determining sex ratios in a wildlife population when the only samples available

are fecal, hair or tissue. When sex identification of subjects is an important research

parameter, costs may be reduced through testing and excluding samples of the undesired

sex. Another benefit of sex identification via PCR is the ability to examine population

structure as well as dynamics related to sex in a noninvasive manner.

Amplification of the Y-chromosome specific SRY gene and any one of a number

of X-chromosome specific or autosomal genes in a multiplex reaction is a common

practice in determining the presence of a Y-chromosome in a DNA sample. This method,

however, leaves the potential for false negatives in the event of failure of the primers

targeting SRY. For this reason, the development of assays that target Y-chromosome specific genes with X-chromosome analogs are preferable because careful design allows a single primer pair to create X- and Y-chromosome specific amplicons. Such a design

creates a built in positive reaction control in each reaction. By designing the primer pair to span a deletion unique to one of the genes, two amplicons of unique size can be produced which can be separated via gel electrophoresis. Alternately, real time PCR 37

using either melting curve analysis or X- and Y-chromosome specific probes can provide

equivalent information. The amelogenin gene and its Y-specific analog have frequently

been targeted for this method of single primer pair sex identification in a variety of

different species.

Here a novel alternative primer pair useful in sex determination of several

members of the family Bovidae is presented. The primer pair targets the X-chromosome

specific ZRSR2 gene and a deletion-containing portion of its Y-chromosome specific homolog ZRSR2Y. In humans, the ZRSR2 gene encodes for U2 small nuclear ribonucleoprotein auxiliary 35 kDa subunit-related protein 2. This protein is a critical component in the structure of the spliceosome and plays an important role in eukaryotic precursor messenger RNA splicing and mRNA assembly. It is involved in the recognition of the 3’ splice site and assists in positioning the nucleotide favorably for splicing178.

ZRSR2Y appears to be bovid specific179 and its function has not yet been determined.

Deletions have been found in Bos taurus ZRSR2Y relative to ZRSR2. The primers

described here were designed to produce an X-chromosome specific amplicon from the

ZRSR2 gene, while also spanning two deletions (6 bp and 27 bp) found in ZRSR2Y. PCR

amplification using this primer pair results in a single X-chromosome specific amplicon

of 149 bp in both male and female samples and an additional Y-chromosome specific

amplicon of 116 bp in males. When the amplicons are separated using gel

electrophoresis, female samples reveal a single distinguished band, while male samples

reveal two distinct bands.

A melting curve analysis (MCA) protocol is also described here, which can be 38

used in lieu of gel electrophoresis for the purpose determining the presence or absence of

the aforementioned amplicons based on the presence or absence of melting curve peaks

unique to the X- and Y-specific amplicons. Initial testing indicates that these primers can

be used for Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens sex

identification in addition to Bos taurus.

Materials and Methods

Sample Collection and DNA Extraction

A variety of tissues including blood, ear notch, hair, and muscle were collected

from Bos taurus, Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens as indicated in Table 1. Several DNA extraction kits were used including Agencourt

DNAdvance Genomic DNA Isolation kit (Beckman Coulter Life Sciences, Indianapolis,

IN), PureLink™ Genomic DNA kit (ThermoFisher Scientific, USA), Dneasy Blood and

Tissue kit (Qiagen, USA), Gentra Puregene Tissue kit (Qiagen, USA), Promega Genomic

DNA kit (Promega, USA), and Viogene Blood & Tissue kit (Viogene, Taipei, Taiwan).

In addition to the kits used, modified Laird’s Buffer DNA isolation protocols180 were also

used. Some samples were received as already purified DNA and no information was

available on the tissue or extraction method. The method of DNA extraction had no effect on results. Extracted DNA was stored at -80° Celsius (or C).

Primer Design

To find appropriate primers, the Bos taurus ZRSR2 (XM_003584349.4) and

ZRSR2Y (GQ426330.1) sequences were aligned using Ensemble’s EMBOSS Needle 39

Nucleotide Alignment software. An area with adjacent 6 and 27 base pair deletions in the

Y-chromosome was identified and primers were designed to flank the two deletions (Fig.

1). The forward and reverse primer sequences are 5’-AGTTCAGTCTCTGCTCCC-3’ and 5’-GGGGAAGGCACAGTTC-3’ respectively. The positions of the 5’ base of the forward and reverse primers on the X-chromosome are 134994806 and 134994954 respectively (AC_000187.1). The primers produce amplicons 149 bp long from the X- chromosome and 116 bp long from the Y-chromosome in Bos taurus. A NCBI Primer-

BLAST search was run with the designed primers, which confirmed a minimal likelihood of off target amplification from the Bos taurus genome.

Table 1 | Bovid Tissues Used Species Tissue Number of samples Bos taurus Blood 10 Bos taurus Ear Notch 59 Bos taurus Hair 24 Bos taurus Muscle 50 Bos indicus Muscle 5 Bos indicus Dry Blood 11 Bos indicus Hair 10 Bison bison Unknown 5 Capra hircus Blood 2 Capra hircus Unknown 2 Ovis aries Unknown 22 Bos grunniens Hair 6 Counts of bovid samples obtained from different species and tissue types.

Cross-Species Analysis of Amplified Regions

The X- and Y-specific sequences predicted to be amplified in Bos taurus were

used as query sequences and compared to the reference genomes of Bos indicus

(taxid:9915), Bison bison (taxid:9901), Capra hircus (taxid:9925), Ovis aries 40

(taxid:9940), and Bos grunniens (taxid:30521) to determine the level of conservation of the amplicon in these bovid species. The query sequences were compared to the NCBI

RefSeq Reference and Representative genomes nucleotide database for each species using the NCBI BLAST tool with the megablast program181 to optimize the search for highly similar sequences.

Fig. 1 | Portion of ZRSR2/ZRSR2Y alignment in Bos taurus. Forward and reverse primer locations are indicated by orange and blue bold text respectively. Red dashes indicate deletions on ZRSR2Y.

PCR Amplification

Testing for the presence of the deletion regions in ZRSR2Y and the efficacy of the primers for use in sex identification was performed in Bos taurus, Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens via PCR and gel electrophoresis. PCR amplification was carried out in 15 µl reactions with 0.5 µM of each primer, 20 ng/reaction of template DNA, 1x GoTaq G2 Green Master Mix (Promega, USA, cat.

M7122), and nuclease free water. Amplification was done on Bio-Rad iCycler (Bio-Rad,

Hercules, CA). Thermocycler parameters were as follows: initial denaturation at 95°C for

2 minutes followed by 30 cycles of amplification, each containing denaturation at 95°C 41

for 30 seconds, annealing at 58°C for 30 seconds, and extension at 72°C for 15 seconds.

After the final cycle, a final extension was performed at 72°C for 5 minutes.

Visualization of PCR Product

Each amplified product was subjected to electrophoresis, with Forever Ladder 100 bp molecular weight standard (Seegene, Seoul, South Korea). Electrophoresis was performed on a 6% non-denaturing polyacrylamide gel (73 x 82 x 0.75 mm) using Mini-

PROTEAN® 3 Cell apparatus (Bio-Rad, Hercules, CA) at room temperature for 45-60

minutes at 100 volts using 1x TBE running buffer. The gels were then stained in either a

0.5 μg/ml ethidium bromide solution or a 1X SYBR Safe (Invitrogen, Life Technologies,

Carlsbad, CA) solution for 30 minutes prior to visualization under UV light.

Retesting Using SE47/SE48 Primers

All Bos taurus samples whose amplification did not correlate with the sex on

record (i.e. samples labeled male showing female amplification pattern, and vice versa)

were retested using the ZRSR2/ZRSR2Y targeting primer pair to determine if reaction

contamination was at fault. These samples were also tested using the primers SE47 and

SE48108, which generates X- and Y-specific amplicons from Bos taurus AMELX and

AMELY respectively. This was done to confirm the presence or absence of the Y- chromosome in the original sample and determine if the sample was mislabeled or the sex incorrectly identified at the time of collection.

Bos taurus X and Y Controls for qPCR

X- and Y-chromosome associated amplicons were obtained from a male Bos 42 taurus sample, for use as positive controls in qPCR and MCA. This was done by excising the associated bands from a PAGE gel after electrophoresis and visualization as detailed above. The excised bands were placed in different 0.5 ml tubes each containing 100 μl of

1x TE buffer and incubated at 4°C for 48 hours to allow the amplicons to diffuse into the buffer. After the incubation, the supernatant was moved to new 0.5 ml tubes and stored at

-80°C. The controls were tested using the melting curve analysis (MCA) as described below.

qPCR & Melting Curve Analysis

qPCR amplification was carried out in 10 μl reactions with 0.5 µM of each primer and 10 ng/reaction of template DNA, 1x iTaq Universal SYBR Green Supermix (Bio-

Rad, Hercules, CA) or 1x Perfecta SYBR Green SuperMix (Quanta Biosciences, Inc.,

Beverly, MA), and nuclease free water. qPCR amplification and MCA were performed using an Eppendorf mastercycler ep realplex4 gradient S thermocycler and Realplex 2.2 software (Eppendorf, Hamburg, Germany) was used for analysis. Thermocycler parameters were as follows: initial denaturation at 95°C for 2 minutes followed by 35 cycles of amplification, each containing denaturation at 95°C for 30 seconds, annealing at

59°C (as was determined to be the optimal annealing temperature from the qPCR optimization) for 30 seconds, and extension at 72°C for 15 seconds. After the final cycle a final extension was performed at 72°C for 5 minutes. The MCA consisted of an initial denaturation at 95°C for 15 seconds, a melting curve start phase at 59°C for 15 seconds, followed by temperature increase up to 95°C over the course of 20 minutes with continual fluorescence measurement. 43

Results

Alignment and BLAST Results

The results of the Primer-BLAST search indicated that there would be amplification from Bos taurus with low risk of off target amplification. Figs. 2 and 3

show alignments from the BLAST searches performed in Bos indicus, Bison bison,

Capra hircus, Ovis aries, and Bos grunniens using the predicted Bos taurus X- and Y-

specific amplicon sequences as the query sequence respectively. The X- and Y-specific sequences produced no alignment with Bos grunniens and the Y-specific sequences also produced no alignment from Capra hircus, or Ovis aries. As the PCR results will show, the lack of produced alignment was not a result of the sequence being absent in Capra hircus, Ovis aries, and Bos grunniens.

The Bison bison alignments from both the X- and the Y-specific sequence queries come from unplaced genomic scaffolds. The Capra hircus X-specific query alignment is from an unlocalized X-chromosomal genomic scaffold. All other alignments are localized within the chromosome with all alignments from the X-specific query coming from X- chromosomes and the Bos indicus alignment with the Y-specific query coming from the

Y-chromosome.

PCR Results

All samples tested using PCR and primer pair designed to target ZRSR2 and

ZRSR2Y matched either the recorded sex or the sex as determined by testing with

SE47/SE48 primers in the process of retesting. A summary of result details is found in 44

Table 2 for Bos taurus, Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens. Of the 143 Bos taurus samples tested, 24 had an incorrectly recorded sex

(based on sex identification performed using SE47/SE48 primers). ZRSR2/ZRSR2Y sex identification agreed with all SE47/SE48 test results in all cases.

Query 1 AGTTCAGTCT CTGCTCCCTG ACCTCCTCCC GCCATGGCTC 40 Bos i. 79805303 AGTTCAGTCT CTGCTCCCTG ACCTCCTCCC GCCATGGCNC 79805342 Bison b. 1897 AGTTCAGTCT CTGCTCCCTG ACCTCCTCCC GCCATGGCTC 1936 Capra h. 54474614 AGTTCAGTCT CTGCTCCCTG ACCTCCTCCC GCCTCGGCTC 54474653 Ovis a. 14686911 AGTTCAGTCT CTGCTCCCTG ACCTCCTCCC GCCACGGCTC 14686872

Query 41 CTGGACCTGG AGGAGCTCTG GCTGCGGCTC CTGCGGCTGT 80 Bos i. 79805343 CTGGACCTGG AGGAGCTCTG GCTGCGGCTC CTGCGGCTGT 79805382 Bison b. 1937 CTGGACCTGG AGGAGCTCTG GCTGCGGCTC CTGCGGCTGT 1976 Capra h. 54474654 CTGGACCTGG AGGAGCTCTG GCTGCGGCTC CTGCGGCTAT 54474693 Ovis a. 14686871 CTGGACCTGG AGGAGCTCTG GCTGCGGCTC CTGCGGCTAT 14686832

Query 80 GGCTCCGGCT GCCCCGACCA CGGATGCGGT CCCTTTTTCT 120 Bos i. 79805383 GGCTCCGGCT GCCCCGACCA CGGATGCGGT CCCTTTTTCT 79805422 Bison b. 1977 GGCTCCGGCT GCCCCGACCA CGGATGCGGT CCCTTTTTCT 2016 Capra h. 54474694 GGCTCCGGCT GCCCCGTCCG CGGCTGCGGT CCCTTTTTCT 54474733 Ovis a. 14686831 GGCTCCGGCT GCCCCGTCCG CGGCTGCGGT CCCTTTTTCT 14686792

Query 121 TCCTCTGCTT CGGGAACTGT GCCTTCCCC 149 Bos i. 79805423 TCCTCTGCTT CGGGAACTGT GCCTTCCCC 79805451 Bison b. 2017 TCCTCTGCTT CGGGAACTGT GCCTTCCCC 2045 Capra h. 54474734 TCCTCTGCTT CGGGAACTGT GCCTTCCCC 54474762 Ovis a. 14686791 TCCTCTGCTT CGGGAACTGT GCCTTCCCC 14686763

Fig. 2 | NCBI BLAST alignment for the amplified region of ZRSR2. The predicted X- specific Bos taurus amplicon sequence was used as the query sequence. Query was compared to Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens RefSeq Reference and Representative genomes. Highlighted bases are those which differ from the query sequence. No Bos grunniens alignment was returned from the search.

The 24 samples with an incorrectly recorded sex were obtained as part of a group of 41 samples from a slaughterhouse. Sex identification was performed based on visual identification of secondary sex characteristics of the hanging carcass at the time of collection. These 41 samples were the only samples in which sex identification was not performed via visual inspection of primary sex characteristics of a living animal. Fig. 4 45 shows representative images of the gels produced from ZRSR2/Y amplification of Bos taurus, Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens samples.

Query 1 AGTTCAGTCT CTGCTCCCCA ACCTCCTCCC ACCACGGCTC 40 Bos i. 644465 AGTTCAGTCT CTGCTCCCCA ACCTCCTCCC ACCACGGCTC 644504 Bison b. 13809 AGTTCAGTCT CTGCTCCCCA ACCTCCTCCC ACCACGGCTC 13770

Query 41 ATGGAGGAGC TCTGGCTGCT GTTCCTGCTG CTGTGGTCCC 80 Bos i. 644505 ATGGAGGAGC TCTGGCTGYT GTTCCTGCTG CTGTGGTCCC 644544 Bison b. 13769 ATGGAGGAGC TCTGGCTGCT GTTCCTGCTG CTGTGGTCCC 13730

Query 81 TTTTTCTTCC GCTGCTTTGA GAACTGTGCC TTCCCC 116 Bos i. 644545 TTTTTCTTCC GCTGCTTTGA GAACTGTGCC TTCCCC 644580 Bison b. 13729 TTTTTCTTCC GCTGCTTTGA GAACTGTGCC TTCCCC 13694

Fig. 3 | NCBI BLAST alignment for the amplified region of ZRSR2Y. The predicted Y-specific Bos taurus amplicon sequence was used as the query sequence. Query was compared to Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens RefSeq Reference and Representative genomes. The highlighted base is the only one which differed from the query sequence. No Capra hircus, Ovis aries, or Bos grunniens alignments were returned from the search.

Melting Curve Analysis Results

MCA of the excised X- and Y-specific fragments using the ZRSR2/ZRSR2Y targeting primers revealed that each fragment had a single melting curve peak and that the peaks were unique from one another (Fig. 5). Analysis of whole genome Bos taurus and Bos indicus samples produced both X- and Y-specific peaks from male samples and only X-specific peaks from female samples, as expected (Figs. 6 and 7). MCA of Bison bison, Capra hircus, Ovis aries, and Bos grunniens samples gave results similar to those of the Bos taurus and Bos indicus with males producing X- and Y-specific peaks and females producing only an X-specific peak (Fig. 8).

46

Table 2 | ZRSR2/ZRSR2Y Targeting Primer Pair Results Summary Bos Bos Bison Capra Ovis Bos Total taurus indicus bison hircus aries grunniens Number of Samples 143 26 5 4 22 6 206 Number of Males 91 16 3 2 11 4 127 (Recorded) Number of Females 52 10 2 2 11 2 79 (Recorded) Number of Males 103 ------103 (Corrected)* Number of Females 40 ------40 (Corrected)* Results Match 119 26 5 4 22 6 182 Recorded Sex Results Match 24 0 0 0 0 0 24

PCR SE47/SE48 Sex Accuracy of ZRSR2/Y 100 100 100 100 100 100 100 Primers (%)** Results Match 119 26 5 4 22 6 182

Record** Results Match 24 0 0 0 0 0 24

qPCR SE47/SE48 Sex** Accuracy of ZRSR2/Y 100 100 100 100 100 100 100 Primers (%)** *Corrected numbers of males and females was determined based on results of sex identification done using SE47/SE48 primers in samples where ZRSR2/Y primers produced results contrary to those expected based on the recorded sex. **Results and percent accuracy represent data after retesting protocol was completed. Data for each species tested including total number of samples, recorded number of male and female samples, corrected sex information of samples (if applicable), and accuracy information for both PCR and qPCR testing.

47

a b

c d

e f

Fig. 4 | Gels showing representative PCR results for bovid species tested. The size marker in all gels is Forever Ladder molecular weight standard (100-500 bp in 100 bp increments) (Seegene, Seoul, South Korea). NTC is the negative control in each gel. (a) Bos taurus male (lanes 2-5) and female (lanes 6-9). (b-f) Gels of Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens samples, respectively. The right-most two occupied lanes of each gel (b-f) contains Bos taurus male and female PCR products (from left to right respectively) as a reference for the amplicon size in each other species. 48

Fig. 5 | Graph of melting curve of excised X- and Y-specific fragments. The Y- specific fragment peaks at 84.1°C the X-specific fragment peaks at 88.5°C. 49

a

b

Fig. 6 | Graphs of melting curves for a subset of Bos taurus samples. (a) Female Bos taurus samples which, are distinguished by a single melting curve peak. (b) Male Bos taurus samples which, are distinguished by the presence of two melting curve peaks.

50

a

b

Fig. 7 | Graphs of melting curves for a subset of Bos indicus samples. (a) Female Bos indicus samples, which are distinguished by a single melting curve peak. (b) Male Bos indicus samples, which are distinguished by the presence of two melting curve peaks.

51

a

b

Fig. 8 | Graphs of melting curves of Bos taurus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens samples. (a) A single female of each species. (b) A single male of each species. All species present a nearly identical melting curve to Bos taurus.

52

Discussion

The data from the BLAST searches did not indicate that amplification from the

Bos grunniens X-chromosome, or from the Capra hircus, Ovis aries, or Bos grunniens Y- chromosomes should be expected. However, the PCR and qPCR tests created X- and Y- specific amplicons in all species. Indicating that the reference genomes for Capra hircus,

Ovis aries, and Bos grunniens are incomplete in the regions where ZRSR2Y, and ZRSR2

in the case of Bos grunniens, are located. Given the difficulty of sequencing sex

chromosomes (particularly the Y-chromosome) it is unsurprising that these regions were missing from the reference genome databases. Additionally, it appears, based on the cross-species alignment data and the PCR and qPCR results that the Bison bison alignments matching the X- and Y-specific Bos taurus query sequences belong to the bison X- and Y-chromosomes respectively. Further, the BLAST results combined with the PCR and qPCR results indicates that these regions are conserved on both the X- and

Y-chromosomes between these species and may be largely conserved throughout their respective sub families (Bovidae and Caprinae) if not the whole family Bovidae.

The 100% sex identification accuracy in all Bos taurus, Bos indicus, Bison bison,

Capra hircus, Ovis aries, and Bos grunniens DNA samples tested to date, using either

PCR or qPRC, indicates that the ZRSR2/ZRSR2Y targeting primer pair developed is a viable for use in sex identification in these species. The ZRSR2 and ZRSR2Y genes appear

to be fairly stable with a low likelihood of mutation, as indicated by the conservation of

these genes between species. The apparent stability of these gene regions is a further

indicator that the designed primer pair has a low likelihood of failure due to mutations 53

within a species. Together, this strongly suggests that the primer pair will be useful for

rapid and cost effective sex identification of DNA samples.

In conclusion, the deletion regions identified in the Bos taurus ZRSR2Y gene are

present in Bos taurus, Bos indicus, Bison bison, Capra hircus, Ovis aries, and Bos grunniens. These deletions can be readily identified compared to the homologous region of ZRSR2 by the use of PCR and gel electrophoresis analysis or qPCR and melting curve analysis using the primer pair developed. In addition to confirming the presence of the deletions, a sex identification assay using these same primers that is rapid and highly effective in a variety of bovid species has been developed.

54

CHAPTER 3:

SEX IDENTIFICATION OF PIGS USING GPM6B & ITS Y-CHROMOSOME

HOMOLOG

Abstract

As described previously (see Chapters 1 and 2), PCR and qPCR sex identification

assays that use a single primer pair to generate X- and Y-chromosome specific amplicons

have a distinct advantage over duplexed methods. The presence of a built-in (or inherent) reaction control without the reaction inefficiencies of duplex PCR is the main advantage afforded. Such advantageous primer pairs generally target gene regions on the X-

chromosome with homologs on the Y-chromosome that contain deletions or insertions.

One example is the amelogenin (AMELX) gene, which is targeted for PCR and qPCR sex

identification in various species108,141,143.

The GPM6B gene located on the X-chromosome and its Y-chromosome

pseudogene are shown here to be viable gene targets PCR sex identification in Sus scrofa.

Comparison of the homologous sequences in the Sus scrofa genome showed multiple

deletion regions located within the Y-chromosome pseudogene sequence. A single primer pair flanking several of these deletions was designed, resulting in a Y-chromosome specific amplicon 43 bp shorter than the X-chromosome specific GPM6B amplicon produced in the same reaction. Utilizing this primer pair, correct identification of sex was obtained in 100% of samples tested. In addition to the ability to determine the sex of individuals, a deletion variant in the GPM6B gene present in a portion of the population 55

was also confirmed. Female samples produced a single X-chromosome specific GPM6B amplicon, either 244 bp or 228 bp in length, whereas male samples produced two amplicons, one X-chromosome specific GPM6B amplicon (244 or 228 bp) and one Y- chromosome pseudogene specific amplicon, 201 bp in length.

Introduction

As demonstrated in chapters 1 and 2, sex identification of DNA samples is important in a variety of applications. ZRSR2 and ZRSR2Y were previously investigated as potential targets for a PCR sex identification assay in bovid species, resulting in the development of a novel primer pair useful in this method of sex identification (Chapter

2). Testing of this ZRSR2/ZRSR2Y targeting primer pair in Sus scrofa (domestic pigs) produced no amplification. Subsequent analysis of the genome revealed mismatches between the Bos taurus and Sus scrofa genomes in the primer regions (unpublished data).

A search of literature identified GPM6B as a porcine gene on the X-chromosome with a highly homologous region (a pseudogene) on the Y-chromosome182, making it a

probable target for designing primers capable of being used for sex identification in Sus

scrofa. In humans, the GPM6B gene encodes for a membrane glycoprotein of the

proteolipid protein family183. It is expressed throughout most regions of the brain, and is

believed to function in membrane trafficking and cell-to-cell communication. It is also

involved in the regulation of osteoblast function and bone formation184.

Alignment of the homologs allowed for the identification of deletions on the Y-

specific homolog relative to GPM6B. From this, a single primer pair, was designed to 56

produce an X-chromosome specific amplicon from the GPM6B gene, and an amplicon

spanning several deletions (totaling 43 bp) from the Y-chromosome homolog in Sus

scrofa. PCR using the designed primer pair resulted in an X-chromosome specific

amplicons 244 and/or 228 bp long in female samples and a Y-chromosome specific amplicon 201 bp long (in addition to one of the X-specific amplicons) in males.

The possibility of different sized X-specific amplicons is due to a 16 bp intron deletion variant (rs1111696537) in the amplified region, which is found in a portion of the population. When the amplicons are separated using gel electrophoresis, female samples reveal either a single band or two bands very close together (difficult to distinguish without a lengthened running time on a high-resolution gel), while male samples reveal two distinct bands with greater separation than seen between the possible

X-amplicons.

Materials and Methods

Sample Collection and DNA Extraction

A variety of tissues including blood, hair, muscle, and tail were collected from 59 pigs. Several DNA extraction kits were used including PureLink™ Genomic DNA kit

(ThermoFisher Scientific, USA), Dneasy Blood and Tissue kit (Qiagen, USA), Gentra

Puregene Tissue kit (Qiagen, USA). In addition, modified Laird’s Buffer DNA isolation180 and phenol/chloroform extraction protocols were also used. The method of

DNA extraction had no effect on results. Extracted DNA was stored at -80° C.

57

Primer Design

To find appropriate primers, the Sus scrofa GMP6B gene (NC_010461.4 base

pairs 11383241 to 11388803) and its Y-specific pseudogene (NC_010462.3 base pairs

10459005 to 10464501) were aligned using NCBI’s BLAST tool. From the alignment, a

section containing multiple deletions on the pseudogene was selected (Fig. 9) and primers

were created to flank the region containing the deletions. The forward and reverse primer

sequences are 5’- GGAAAAGCTGGCAAGG -3’ and 5’- CGTTGCACATCTGTGTG -

3’ respectively. The positions of the 5’ base of the forward and reverse primers on the X-

chromosome are 11387210 and 11387453 respectively (AC_000187.1). The primers

were predicted to produce amplicons 244 bp long from the X-chromosome and 201 bp

long from the Y-chromosome in Sus scrofa.

Fig. 9 | Portion of Sus scrofa GPM6B and its Y-chromosome pseudogene aligned. Forward and reverse primer locations are indicated by orange and blue bold text respectively. Red dashes indicate deletions on the pseudogene.

58

PCR Amplification

Testing to identify the presence of the deletion regions in the Y-specific

pseudogene and the efficacy of the primers for use in sex identification was performed in

Sus scrofa via PCR amplification. PCR amplification was carried out in 15 µl reactions with 0.5 µM of each primer, 20 ng/reaction of template DNA, 1x GoTaq G2 Green

Master Mix (Promega, USA, cat. M7122), and nuclease free water. Amplification was done on Bio-Rad iCycler (Bio-Rad, Hercules, CA). Thermocycler parameters were as follows: initial denaturation at 95°C for 2 minutes followed by 30 cycles of amplification, each containing denaturation at 95°C for 30 seconds, annealing at 58°C for 30 seconds, and extension at 72°C for 15 seconds. After the final cycle, a final extension was performed at 72°C for 5 minutes.

Visualization of PCR Product

Each PCR product was subjected to electrophoresis, with Forever Ladder 100 bp molecular weight standard (Seegene, Seoul, South Korea). Electrophoresis was done on a

6% non-denaturing polyacrylamide gel (73 x 82 x 0.75 mm) using Mini-PROTEAN® 3

Cell apparatus (Bio-Rad, Hercules, CA) at room temperature for 45-60 minutes at 100 volts using 1x TBE running buffer. The gels were then stained in either a 0.5 μg/ml ethidium bromide solution or a 1X SYBR Safe (Invitrogen, Life Technologies, Carlsbad,

CA) solution for 30 minutes prior to visualization under UV light.

X-Amplicon Sequencing Analysis

Sequencing analysis of long and short X-amplicons identified during the PCR 59

analysis was done by the Utah State University Center for Integrated Biosystems core genomics lab. PCR product from a female homozygous for the long X-amplicon and a

female homozygous for the short X-amplicon were sequenced on an ABI PRISM™ 3730

DNA Analyzer using the GPM6B targeting reverse primer as sequencing primer (the

reverse primer was chosen as the expected deletion was near the forward primer). The

resulting sequencing data from both the long and short amplicons was then aligned

against the reference genome and the suspected deletion sequence (rs1111696537) using

BioEdit185.

Results

The PCR analysis of porcine samples using the GPM6B targeting primer pair

described above allowed for accurate sex identification of all 59 samples tested (100%

accuracy). A subset of eight individuals expressed at least one shortened X-specific

fragment. Of these animals, two females were heterozygous for the mutation creating the

long and short X-amplicon, making it possible to estimate that the short fragment was 15

to 20 bases shorter than the longer fragment. All 51 other samples displayed only the

longer amplicon. With the exception of the eight samples just mentioned, all female

samples tested showed a single amplicon. All male samples displayed two amplicons

with the Y-amplicon being approximately 201 bp long and the X-amplicon being either

approximately 244 bp or the shortened X-amplicon. Fig. 10 shows a representative gel

image of the PCR results, including the long and short X-specific amplicons. Table 3

gives a summary of the sex and X-amplicon sizes produced. 60

Table 3 | Summary of X-Amplicon Sizes Observed Sex | Amplicon Size Number of Samples Male | Normal X-amplicon 27 Male | Short X-amplicon 3 Female | Both Normal X-amplicon 24 Female | Both Short X-amplicon 3 Female | Mixed X-amplicons 2 Counts of Sus scrofa samples based on sex and the size(s) of amplicons created using the primer pair designed to target GPM6B and its Y-chromosme homolog.

Fig. 10 | A representative gel showing the amplicons produced by the primer pair targeting GPM6B and its Y-specific homologue. The size marker is Forever Ladder molecular weight standard (Seegene, Seoul, South Korea). ♂ Sus scrofa 1 and 2 are from males with normal (longer) X-amplicons. ♂ Sus scrofa 3 and 4 are examples of males with a short X-amplicon. ♀ Sus scrofa 1 and 4 are female samples in which fragments from both X-chromosomes produce normal (long) amplicons. ♀ Sus scrofa 2 is an example of a female with a short X-amplicon. ♀ Sus scrofa 3 is a female with a normal (long) and a short X-fragment. NTC is the no template (negative) control.

Examination of variants within the amplified region indicated that a 16 bp intron variant, rs1111696537, was likely to be the reason for the shortened X-amplicons. This was confirmed by the alignment of the long and short X-amplicons, the relevant portion 61

of which is shown in Fig. 11. The alignment clearly shows a deletion in the same region

as the deletion variant rs1111696537.

X-RefSeq 1 GGAAAAGCTG GCAAGGACTC GGTGCTGAGC ACAGGCAATT 40 Long Fragment 1 ...... 40 Short Fragment 1 ...... ----- 35 rs1111696537 1 CAATT 5

X-RefSeq 41 TCCCCCTCCC TCAATTTCCC CCTCCCTCTA CCCATCAGAT 80 Long Fragment 41 ...... 80 Short Fragment 36 ------...... 64 rs1111696537 6 TCCCCCTCCC T 16

X-RefSeq 81 GTCTGGAAAC CACAAGCCAC ACCTGCCCAA GTTACACATA 120 Long Fragment 81 ...... 120 Short Fragment 65 ...... 104

Fig. 11 | Sequencing data from long and short X-amplicon generated from GPM6B primer pair compared to the Sus scrofa reference sequence (AC_000187.1 bases 11387210 to 11387329) and to the rs1111696537 deletion. Dots represent bases conserved with reference sequence. Orange text represents the forward primer. Red dashes indicate a deletion relative to the reference sequence.

Discussion

Despite the differences in X-amplicon sizes, sex was easily able to be determined based on electrophoresis data. Use of male and female control samples allowed for easy differentiation between male samples and female samples with mixed X-amplicon sizes.

This allowed for 100 % accuracy (59/59 samples) in identifying sex in porcine samples using the primer pair designed to target GPM6B and its Y-chromosome homolog. It is not currently known how broadly useful the designed primer pair is, as it has not been tested extensively in species outside of Sus scrofa.

Further investigation of the amplified region revealed that the shortened X- amplicons were likely the result of the 16 bp intron deletion variant, rs1111696537, 62 resulting in a 228bp X-chromosome fragment. This was confirmed by comparing sequencing data from the long and short X-specific amplicons. Interestingly, the 16 bp region missing in the shorter fragment appears twice in the longer fragment indicating that the variant came about through a duplication event not a deletion.

In summary, a primer pair has been created for use in Sus scrofa, which targets

GPM6B and a section its Y-chromosome homolog that contains deletion regions amounting to a 43 bp or 27 bp difference in length. The section of GPM6B amplified contains a 16 bp deletion variant (rs1111696537) in a portion of the species. This deletion variant is not large enough to complicate PCR sex determination so long as male and female positive control samples are also run on the gel. Altogether, this indicates that the

GPM6B targeting primer pair designed is viable in PCR based sex identification of Sus scrofa.

63

REFERENCES

1. Encyclopedia of Reproduction. vol. 1 (Academic Press, 1998).

2. Charnier, M. Action of temperature on the sex ratio in the Agama agama (Agamidae,

Lacertilia) embryo. C. R. Seances Soc. Biol. Fil. 160, 620–622 (1966).

3. Echiurans: Echiura - Green Bonellia (bonellia Viridis): Species Account.

https://animals.jrank.org/pages/1720/Echiurans-Echiura-GREEN-BONELLIA-

Bonellia-viridis-SPECIES-ACCOUNT.html.

4. Warner, R. R. The adaptive significance of sequential hermaphroditism in animals.

Am. Nat. 109, 61–82 (1975).

5. Agassiz, L. Der Schöpfungsplan: Vorlesungen über die natürlichen Grundlagen der

Verwandtschaft unter den Thieren. (Quandt & Händel, 1875).

6. Schrader, F. & Hughes-Schrader, S. Haploidy in metazoa. Q. Rev. Biol. 6, 411–438

(1931).

7. Trivers, R. L. & Hare, H. Haploidploidy and the evolution of the social insect. Science

191, 249–263 (1976).

8. Sokolow, N. N., Tiniakow, G. G. & Trofimov, J. E. On the morphology of the

chromosomes in Gallinaceae. Cytologia (Tokyo) 7, 466–489 (1936).

9. International Congress of Genetics (11th : 1963 : Hague). Genetics today :

Proceedings of the XI International Congress of Genetics, The Hague, The

Netherlands, September 1963. (New York : Macmillan, 1963).

10. Reed, K. J. & Sinclair, A. H. FET-1: A novel W-linked, female specific gene up-

regulated in the embryonic chicken ovary. Mech. Dev. 119, S87–S90 (2002). 64

11. O’Neill, M. et al. ASW : A gene with conserved avian W-linkage and female

specific expression in chick embryonic gonad. Dev. Genes Evol. 210, 243–249 (2000).

12. Hori, T., Asakawa, S., Itoh, Y., Shimizu, N. & Mizuno, S. Wpkci, encoding an

altered form of PKCI, is conserved widely on the avian W chromosome and expressed

in early female embryos: Implication of its role in female sex determination. Mol. Biol.

Cell 11, 3645–3660 (2000).

13. Smith, C. A. et al. The avian Z-linked gene DMRT1 is required for male sex

determination in the chicken. Nature 461, 267–271 (2009).

14. Matsubara, K. et al. Evidence for different origin of sex chromosomes in snakes,

birds, and mammals and step-wise differentiation of snake sex chromosomes. Proc.

Natl. Acad. Sci. U. S. A. 103, 18190–18195 (2006).

15. Winge, O. The nature of sex chromosomes. in Proc Sixth Int Congr Genet vol. 1

343–355 (1932).

16. Kerr, W. E. Genetics of sex determination. Annu. Rev. Entomol. 7, 157–176

(1962).

17. Robinson, R. Lepidoptera Genetics: International Series of Monographs in Pure

and Applied Biology: Zoology. (Elsevier, 2017).

18. Bridges, C. B. Triploid intersexes in drosophila melanogaster. Science 54, 252–

254 (1921).

19. Bridges, C. B. Sex in relation to chromosomes and genes. Am. Nat. 59, 127–137

(1925). 65

20. Painter, T. S. Studies in mammalian spermatogenesis. II. The spermatogenesis of

man. J. Exp. Zool. 37, 291–336 (1923).

21. Jacobs, P. A. & Strong, J. A. A case of human intersexuality having a possible

XXY sex-determining mechanism. Nature 183, 302–303 (1959).

22. Ford, C. E., Jones, K. W., Polani, P. E., De Almeida, J. C. & Briggs, J. H. A sex-

chromosome anomaly in a case of gonadal dysgenesis (Turner’s syndrome). (1959).

23. Jost, A. Recherches sur la differenciation sexuelle de I’embryon de lapin. Arch

Anat Microsc Morphol Exp 36, 271–315 (1947).

24. Goldberg, E. H., Adinolfi, M., Wiberg, U., Bradley, M. & Sharma, H. H-Y

antigen and sex determination. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 322, 73–81

(1988).

25. Gubbay, J., Koopman, P., Collignon, J., Burgoyne, P. & Lovell-Badge, R. Normal

structure and expression of Zfy genes in XY female mice mutant in Tdy. 7.

26. Palmer, M. S. et al. Genetic evidence that ZFY is not the testis-determining

factor. Nature 342, 937–939 (1989).

27. Sinclair, A. H. et al. A gene from the human sex-determining region encodes a

protein with homology to a conserved DNA-binding motif. Nature 346, 240–244

(1990).

28. Berta, P. et al. Genetic evidence equating SRY and the testis-determining factor.

Nature 348, 448–450 (1990).

29. Jager, R. J., Anvrett, M. & Hall, K. A human XY female with a frame shift

mutation in the candidate testis-determining gene SRY. 348, 3 (1990). 66

30. Koopman, P., Münsterberg, A., Capel, B., Vivian, N. & Lovell-Badge, R.

Expression of a candidate sex-determining gene during mouse testis differentiation.

Nature 348, 450–452 (1990).

31. Koopman, P., Gubbay, J., Vivian, N., Goodfellow, P. & Lovell-Badge, R. Male

development of chromosomally female mice transgenic for Sry. Nature 351, 117–121

(1991).

32. Bowles, J., Schepers, G. & Koopman, P. Phylogeny of the SOX family of

developmental transcription factors based on sequence and structural indicators. Dev.

Biol. 227, 239–255 (2000).

33. Schepers, G. E., Teasdale, R. D. & Koopman, P. Twenty pairs of Sox: Extent,

homology, and nomenclature of the mouse and human Sox transcription factor gene

families. Dev. Cell 3, 167–170 (2002).

34. Harley, V. R., Lovell-Badge, R. & Goodfellow, P. N. Definition of a consensus

DNA binding site for SRY. Nucleic Acids Res. 22, 1500–1501 (1994).

35. Ferrari, S. et al. SRY, like HMG1, recognizes sharp angles in DNA. EMBO J. 11,

4497–4506 (1992).

36. Giese, K., Pagel, J. & Grosschedl, R. Distinct DNA-binding properties of the high

mobility group domain of murine and human SRY sex-determining factors. Proc. Natl.

Acad. Sci. 91, 3368–3372 (1994).

37. Bullejos, M. & Koopman, P. Spatially dynamic expression of Sry in mouse

genital ridges. Dev. Dyn. 221, 201–205 (2001). 67

38. Wilhelm, D. et al. Sertoli cell differentiation is induced both cell-autonomously

and through prostaglandin signaling during mammalian sex determination. Dev. Biol.

287, 111–124 (2005).

39. Eicher, E. M., Washburn, L. L., Whitney, J. B. & Morrow, K. E. Mus

poschiavinus Y chromosome in the C57BL/6J murine genome causes sex reversal.

Science 217, 535–537 (1982).

40. Bullejos, M. & Koopman, P. Delayed Sry and Sox9 expression in developing

mouse gonads underlies B6-YDOM sex reversal. Dev. Biol. 278, 473–481 (2005).

41. Hiramatsu, R. et al. A critical time window of Sry action in gonadal sex

determination in mice. Development 136, 129–138 (2009).

42. Hanley, N. A. et al. SRY, SOX9, and DAX1 expression patterns during human

sex determination and gonadal development. Mech. Dev. 91, 403–407 (2000).

43. Pannetier, M. et al. Goat SRY induces testis development in XX transgenic mice.

FEBS Lett. 580, 3715–3720 (2006).

44. Kent, J., Wheatley, S. C., Andrews, J. E., Sinclair, A. H. & Koopman, P. A male-

specific role for SOX9 in vertebrate sex determination. Development 122, 2813–2822

(1996).

45. Schepers, G., Wilson, M., Wilhelm, D. & Koopman, P. SOX8 is expressed during

testis differentiation in mice and synergizes with SF1 to activate the Amh promoter in

Vitro. J. Biol. Chem. 278, 28101–28108 (2003). 68

46. Sekido, R., Bar, I., Narváez, V., Penny, G. & Lovell-Badge, R. SOX9 is up-

regulated by the transient expression of SRY specifically in Sertoli cell precursors.

Dev. Biol. 274, 271–279 (2004).

47. DiNapoli, L. & Capel, B. SRY and the standoff in sex determination. Mol.

Endocrinol. 22, 1–9 (2008).

48. Sekido, R. & Lovell-Badge, R. Sex determination involves synergistic action of

SRY and SF1 on a specific Sox9 enhancer. Nature 453, 930–934 (2008).

49. Wilhelm, D., Palmer, S. & Koopman, P. Sex determination and gonadal

development in mammals. Physiol. Rev. 87, 1–28 (2007).

50. Polanco, J. C. & Koopman, P. Sry and the hesitant beginnings of male

development. Dev. Biol. 302, 13–24 (2007).

51. Sekido, R. & Lovell-Badge, R. Sex determination and SRY: Down to a wink and

a nudge? Trends Genet. 25, 19–29 (2009).

52. Kashimada, K. & Koopman, P. Sry: The master switch in mammalian sex

determination. Development 137, 3921–3930 (2010).

53. Munger, S. C. & Capel, B. Sex and the circuitry: Progress toward a systems-level

understanding of vertebrate sex determination. WIREs Syst. Biol. Med. 4, 401–412

(2012).

54. Sunderam, S. Assisted reproductive technology surveillance — United States,

2016. MMWR Surveill. Summ. 68, (2019).

55. Hamilton, B. H. & McManus, B. The effects of insurance mandates on choices

and outcomes in infertility treatment markets. Health Econ. 21, 994–1016 (2012). 69

56. Viana, J. 2018 Statistics of embryo production and transfer in domestic farm

animals. Embryo Technol. Newsl. 36, (2019).

57. Stocker, J. & Evens, L. Fetal sex determination by ultrasound. Obstet. Gynecol.

50, 462–466 (1977).

58. Efrat, Z., Akinfenwa, O. O. & Nicolaides, K. H. First-trimester determination of

fetal gender by ultrasound. Ultrasound Obstet. Gynecol. 13, 305–307 (1999).

59. Tachdjian, G. et al. Clinical applications of fetal sex determination in maternal

blood in a preimplantation genetic diagnosis centre. Hum. Reprod. 17, 2183–2186

(2002).

60. Hyett, J. A. et al. Reduction in diagnostic and therapeutic interventions by non-

invasive determination of fetal sex in early pregnancy. Prenat. Diagn. 25, 1111–1116

(2005).

61. Curran, S. Fetal sex determination in cattle and horses by ultrasonography.

Theriogenology 37, 17–21 (1992).

62. Rajamahendran, R., Ambrose, D. J. & Burton, B. Clinical and research

applications of real-time ultrasonography in bovine reproduction: A review. Can. Vet.

J. 35, 563–572 (1994).

63. Villesen, P. & Fredsted, T. A new sex identification tool: One primer pair can

reliably sex ape and monkey DNA samples. Conserv. Genet. 7, 455 (2006).

64. Lindsay, A. R. & Belant, J. L. A simple and improved PCR-based technique for

white-tailed deer (Odocoileus virginianus) sex identification. Conserv. Genet. 9, 443–

447 (2008). 70

65. Xu, X. et al. A reliable, non-invasive PCR method for giant panda (Ailuropoda

melanoleuca) sex identification. Conserv. Genet. 9, 739–741 (2008).

66. Tzvetkov, M. V., Meineke, I., Sehrt, D., Vormfelde, S. V. & Brockmöller, J.

Amelogenin-based sex identification as a strategy to control the identity of DNA

samples in genetic association studies. Pharmacogenomics 11, 449–457 (2010).

67. Heiss, J. A. & Just, A. C. Identifying mislabeled and contaminated DNA

methylation microarray data: An extended quality control toolset with examples from

GEO. Clin. Epigenetics 10, 73 (2018).

68. White, T. D., Black, M. T. & Folkens, P. A. Human Osteology. (Academic Press,

2011).

69. Gibbon, V., Paximadis, M., Štrkalj, G., Ruff, P. & Penny, C. Novel methods of

molecular sex identification from skeletal tissue using the amelogenin gene. Forensic

Sci. Int. Genet. 3, 74–79 (2009).

70. Kreike, J. & Lehner, A. Sex determination and DNA competition in the analysis

of forensic mixed stains by PCR. Int. J. Legal Med. 107, 235–238 (1995).

71. Gilson, A., Syvanen, M., Levine, K. & Banks, J. Deer gender determination by

polymerase chain reaction: Validation study and application to tissues, bloodstains,

and hair forensic samples from California. Calif. Fish Game 84, 159–169 (1998).

72. Gokulakrishnan, P. et al. Sex determination of cattle meat by polymerase chain

reaction amplification of the amelogenin (AMELX/AMELY) gene. Vet. World 5, 526

(2012). 71

73. Zeleny, R. & Schimmel, H. Sexing of beef — a survey of possible methods. Meat

Sci. 60, 69–75 (2002).

74. Purves, D. et al. What is sex? Neurosci. 2nd Ed. (2001).

75. Schmotzer, B. & Zimmermann, A. Über die weiblichen Begattungsorgane der

gefleckten Hyäne. Anat. Anz. 55, 257–264 (1922).

76. Svensson, E. M., Götherström, A. & Vretemark, M. A DNA test for sex

identification in cattle confirms osteometric results. J. Archaeol. Sci. 35, 942–946

(2008).

77. Donald, I., Macvicar, J. & Brown, T. G. Investigation of abdominal masses by

pulsed ultrasound. The Lancet 271, 1188–1195 (1958).

78. Hare, W. C. D., Mitchell, D., Betteridge, K. J., Eaglesome, M. D. & Randall, G.

C. B. Sexing two-week old bovine embryos by chromosomal analysis prior to surgical

transfer: Preliminary methods and results. Theriogenology 5, 11 (1976).

79. Moustafa, L. A., Hahn, J. & Roselius, R. Attempts at sex determination on 6- and

7- day old cattle embyros. Berl. Munch. Tierarztl. Wochenschr. 91, 236–238 (1978).

80. Singh, E. L. & Hare, W. C. D. The feasibility of sexing bovine morula stage

embryos prior to embryo transfer. Theriogenology 14, 421–427 (1980).

81. Rottmann, O. J. Chromosome preparation from single blastomeres after colcemid

treatment and removal from rabbit morulae — Unsuitable for sexing in routine embryo

transfer. Theriogenology 15, 321–326 (1981).

82. Leibo, S. P. & Rall, W. F. Prenatal diagnosis of sex in bovine fetuses by

amniocentesis. Theriogenology 33, 531–552 (1990). 72

83. Stich, H. F. & Hsu, T. C. Cytological identification of male and female somatic

cells in the mouse. Exp. Cell Res. 20, 248–249 (1960).

84. Plante, C., Bousquet, D. & Goff, A. K. Embryonic tissue associated with the

embryonic signal in early pregnancy. Theriogenology USA (1984).

85. Wachtel, S. S. H-Y antigen in the study of sex determination and control of sex

ratio. Theriogenology 21, 18–28 (1984).

86. Anderson, G. B. Identification of embryonic sex by detection of H-Y antigen.

Theriogenology 27, 81–97 (1987).

87. Wachtel, S. et al. Sex selection with monoclonal H-Y antibody. Fertil. Steril. 50,

355–360 (1988).

88. Williams, T. J. A technique for sexing mouse embryos by a visual colorimetric

assay of the X-linked enzyme, glucose 6-phosphate dehydrogenase. Theriogenology

25, 733–739 (1986).

89. Monk, M. & Handyside, A. H. Sexing of preimplantation mouse embryos by

measurement of X-linked gene dosage in a single blastomere. Reproduction 82, 365–

368 (1988).

90. Gosden, J. R., Mitchell, A. R., Gosden, C. M., Rodeck, C. H. & Morsman, J. M.

Direct vision chorion biopsy and chromosome-specific DNA probes for determination

of fetal sex in first-trimester prenatal diagnosis. The Lancet 320, 1416–1419 (1982).

91. Lau, Y.-F., Dozy, A. M., Huang, J. C. & Yuet Wai Kan. A rapid screening test for

antenatal sex determination. The Lancet 323, 14–16 (1984). 73

92. Leonard, M. et al. Sexing bovine embryos using Y chromosome specific DNA

probe. Theriogenology 27, 248 (1987).

93. Bondioli, K. R., Ellis’, S. B., Pryor’, J. H., Williams, M. W. & Harpold’, M. M.

The use of male-specific chromosomal DNA fragments to determine the sex of bovine

preimplantation embryos. 31, 10 (1989).

94. van Vliet, R. A., Verrinder Gibbins, A. M. & Walton, J. S. Livestock embryo

sexing: A review of current methods, with emphasis on Y-specific DNA probes.

Theriogenology 32, 421–438 (1989).

95. Vergnaud, G. et al. A deletion map of the human Y chromosome based on DNA

hybridization. Am. J. Hum. Genet. 38, 109–124 (1986).

96. Affara, N. A. et al. Mapping the testis determinants by an analysis of Y-specific

sequences in males with apparent XX and XO karyotypes and females with XY

karyotypes. Nucleic Acids Res. 15, 7325–7342 (1987).

97. Page, D. C. et al. The sex-determining region of the human Y chromosome

encodes a finger protein. Cell 51, 1091–1104 (1987).

98. Bar, T., Kubista, M. & Tichopad, A. Validation of kinetics similarity in qPCR.

Nucleic Acids Res. 40, 1395–1406 (2012).

99. Witt, M. & Erickson, R. P. A rapid method for detection of Y-chromosomal DNA

from dried blood specimens by the polymerase chain reaction. Hum. Genet. 82, 271–

274 (1989). 74

100. Bradbury, M. W., Isola, L. M. & Gordon, J. W. Enzymatic amplification of a Y

chromosome repeat in a single blastomere allows identification of the sex of

preimplantation mouse embryos. Proc. Natl. Acad. Sci. 87, 4053–4057 (1990).

101. Taberlet, P., Mattock, H., Dubois‐Paganon, C. & Bouvet, J. Sexing free-ranging

brown bears Ursus arctos using hairs found in the field. Mol. Ecol. 2, 399–403 (1993).

102. Sembon, S. et al. Sex identification of pigs using polymerase chain reaction

amplification of the amelogenin gene. Zygote 16, 327–332 (2008).

103. Pfeiffer, I. & Brenig, B. X- and Y-chromosome specific variants of the

amelogenin gene allow sex determination in sheep (Ovis aries) and European red deer

(Cervus elaphus). BMC Genet. 6, 16 (2005).

104. Herr, C. M. Sex of progeny from bovine embryos sexed with a rapid Y-

chromosome-detection assay. Theriogenology 33, 247 (1990).

105. Pomp, D., Good, B. A., Geisert, R. D., Corbin, C. J. & Conley, A. J. Sex

identification in mammals with polymerase chain reaction and its use to examine sex

effects on diameter of day-10 or-11 pig embryos. J. Anim. Sci. 73, 1408–1415 (1995).

106. Phua, A. C. Y., Abdullah, R. B. & Mohamed, Z. A PCR-based sex determination

method for possible application in caprine gender selection by simultaneous

amplification of the Sry and Aml-X genes. J. Reprod. Dev. 49, 307–311 (2003).

107. Han, S. H., Lee, S. S., Ko, M. S. & Cho, I. C. A molecular sex identification using

duplex PCR method for SRY and ZFX-ZFY genes in red deer and elk. J. Anim. Sci.

Technol. 49, 1–8 (2007). 75

108. Ennis, S. & Gallagher, T. F. A PCR-based sex-determination assay in cattle based

on the bovine amelogenin locus. Anim. Genet. 25, 425–427 (1994).

109. Haas-Rochholz, H. & Weiler, G. Additional primer sets for an amelogenin gene

PCR-based DNA-sex test. Int. J. Legal Med. 110, 312–315 (1997).

110. Yamauchi, K. et al. Sex determination based on fecal DNA Analysis of the

amelogenin gene in sika deer (Cervus nippon). J. Vet. Med. Sci. 62, 669–671 (2000).

111. Sasi, R., Fan, Y.-S. & Lin, C. C. Prenatal sexing and detection of zfy gene

sequences in sex chromosome disorders by polymerase chain reaction. J. Clin. Lab.

Anal. 5, 193–196 (1991).

112. Tagliavini, J., Bolchi, A., Bracchi, P. G. & Ottonello, S. Sex determination on

samples of bovine meat by polymerase chain reaction. J. Food Sci. 58, 237–238

(1993).

113. Aasen, E. & Medrano, J. F. Amplification of the Zfy and Zfx genes for sex

identification in humans, cattle, sheep and goats. Bio/Technology 8, 1279–1281

(1990).

114. Amstrup, S. C., Garner, G. W., Cronin, M. A. & Patton, J. C. Sex identification of

polar bears from blood and tissue samples. Can. J. Zool. 71, 2174–2177 (1993).

115. Ballester, M., Castelló, A., Ramayo-Caldas, Y. & Folch, J. M. A quantitative real-

time PCR method using an X-linked gene for sex typing in pigs. Mol. Biotechnol. 54,

493–496 (2013). 76

116. Chang, H.-W. et al. High-throughput avian molecular sexing by SYBR green-

based real-time PCR combined with melting curve analysis. BMC Biotechnol. 8, 12

(2008).

117. Nielsen, J. & Wohlert, M. Chromosome abnormalities found among 34910

newborn children: Results from a 13-year incidence study in Århus, Denmark. Hum.

Genet. 87, 81–83 (1991).

118. Marino, B. S. & Fine, K. S. Blueprints Pediatrics. (Lippincott Williams &

Wilkins, 2013).

119. Boada, R., Janusz, J., Hutaff-Lee, C. & Tartaglia, N. The cognitive phenotype in

Klinefelter Syndrome: A review of the literature including genetic and hormonal

factors. Dev. Disabil. Res. Rev. 15, 284–294 (2009).

120. Michala, L., Goswami, D., Creighton, S. M. & Conway, G. S. Swyer syndrome:

Presentation and outcomes. BJOG Int. J. Obstet. Gynaecol. 115, 737–741 (2008).

121. de la Chapelle, A. The etiology of maleness in XX men. Hum. Genet. 58, 105–

116 (1981).

122. Ohno, S. Sex Chromosomes and Sex-Linked Genes. vol. 1 (Springer, 1966).

123. Charlesworth, B. The Evolution of Sex Chromosomes. Science 251, 1030–1033

(1991).

124. Charlesworth, B. & Charlesworth, D. The degeneration of Y chromosomes.

Philos. Trans. R. Soc. Lond. B. Biol. Sci. 355, 1563–1572 (2000).

125. Graves, J. A. M. The rise and fall of SRY. Trends Genet. 18, 259–264 (2002). 77

126. Waters, P. D., Wallis, M. C. & Graves, J. A. M. Mammalian sex—Origin and

evolution of the Y chromosome and SRY. Semin. Cell Dev. Biol. 18, 389–400 (2007).

127. Bachtrog, D. Y chromosome evolution: Emerging insights into processes of Y

chromosome degeneration. Nat. Rev. Genet. 14, 113–124 (2013).

128. Cortez, D. et al. Origins and functional evolution of Y chromosomes across

mammals. Nature 508, 488–493 (2014).

129. Galan-Caridad, J. M. et al. Zfx Controls the Self-Renewal of Embryonic and

Hematopoietic Stem Cells. Cell 129, 345–357 (2007).

130. Schneider-Gädicke, A. et al. Putative transcription activator with alternative

isoforms encoded by human ZFX gene. Nature 342, 708–711 (1989).

131. Schneider-Gädicke, A., Beer-Romero, P., Brown, L. G., Nussbaum, R. & Page,

D. C. ZFX has a gene structure similar to ZFY, the putative human sex determinant,

and escapes X inactivation. Cell 57, 1247–1258 (1989).

132. Vernet, N., Mahadevaiah, S. K., de Rooij, D. G., Burgoyne, P. S. & Ellis, P. J. I.

Zfy genes are required for efficient meiotic sex chromosome inactivation (MSCI) in

spermatocytes. Hum. Mol. Genet. 25, 5300–5310 (2016).

133. Salido, E. C., Yen, P. H., Koprivnikar, K., Yu, L. C. & Shapiro, L. J. The human

enamel protein gene amelogenin is expressed from both the X and the Y

chromosomes. Am. J. Hum. Genet. 50, 303–316 (1992).

134. Termine, J. D., Belcourt, A. B., Christner, P. J., Conn, K. M. & Nylen, M. U.

Properties of dissociatively extracted fetal tooth matrix proteins. I. Principal molecular

species in developing bovine enamel. J. Biol. Chem. 255, 9760–9768 (1980). 78

135. Fincham, A. G. et al. Self-assembly of a recombinant amelogenin protein

generates supramolecular structures. J. Struct. Biol. 112, 103–109 (1994).

136. Ciba Foundation Symposium 205 ‐ Dental Enamel. (John Wiley & Sons, Ltd,

1997). doi:10.1002/9780470515303.

137. Lau, E. C., Mohandas, T. K., Shapiro, L. J., Slavkin, H. C. & Snead, M. L.

Human and mouse amelogenin gene loci are on the sex chromosomes. Genomics 4,

162–168 (1989).

138. Nakahori, Y., Takenaka, O. & Nakagome, Y. A human X-Y homologous region

encodes “amelogenin”. Genomics 9, 264–269 (1991).

139. Snead, M. L. et al. DNA sequence for cloned cDNA for murine amelogenin

reveal the amino acid sequence for enamel-specific protein. Biochem. Biophys. Res.

Commun. 129, 812–818 (1985).

140. Shimokawa, H., Sobel, M. E., Sasaki, M., Termine, J. D. & Young, M. F.

Heterogeneity of amelogenin mRNA in the bovine tooth germ. J. Biol. Chem. 262,

4042–4047 (1987).

141. Nakahori, Y., Hamano, K., Iwaya, M. & Nakagome, Y. Sex identification by

polymerase chain reaction using X-Y homologous primer. Am. J. Med. Genet. 39,

472–473 (1991).

142. Faerman, M. et al. Sex identification of archaeological human remains based on

amplification of the X and Y amelogenin alleles. Gene 167, 327–332 (1995). 79

143. Ensminger, A. L. & Hoffman, S. M. G. Sex identification assay useful in great

apes is not diagnostic in a range of other primate species. Am. J. Primatol. 56, 129–

134 (2002).

144. Steiper, M. E. & Ruvolo, M. Genetic sex identification in orangutans. Anthropol.

Anz. 61, 1–5 (2003).

145. Weikard, R., Pitra, C. & Kühn, C. Amelogenin cross-amplification in the family

Bovidae and its application for sex determination. Mol. Reprod. Dev. 73, 1333–1337

(2006).

146. Macé, M. & Crouau-Roy, B. A highly polymorphic insertion in the Y-

chromosome amelogenin gene can be used for evolutionary biology, population

genetics and sexing in Cetacea and Artiodactyla. BMC Genet. 9, 64 (2008).

147. Brinkman, T. J. & Hundertmark, K. J. Sex identification of northern ungulates

using low quality and quantity DNA. Conserv. Genet. 10, 1189–1193 (2009).

148. Gurgul, A., Radko, A. & Słota, E. Characteristics of X- and Y-chromosome

specific regions of the amelogenin gene and a PCR-based method for sex identification

in red deer (Cervus elaphus). Mol. Biol. Rep. 37, 2915–2918 (2010).

149. Pilgrim, K. L., Mckelvey, K. S., Riddle, A. E. & Schwartz, M. K. Felid sex

identification based on noninvasive genetic samples. Mol. Ecol. Notes 5, 60–61

(2005).

150. Yamamoto, K. et al. Sex identification of Japanese Black Bear, Ursus thibetanus

japonicus, by PCR based on amelogenin gene. J. Vet. Med. Sci. 64, 505–508 (2002). 80

151. Gorbalenya, A. E. & Koonin, E. V. Helicases: Amino acid sequence comparisons

and structure-function relationships. Curr. Opin. Struct. Biol. 3, 419–429 (1993).

152. Fairman-Williams, M. E., Guenther, U.-P. & Jankowsky, E. SF1 and SF2

helicases: Family matters. Curr. Opin. Struct. Biol. 20, 313–324 (2010).

153. Linder, P. et al. Birth of the D-E-A-D box. Nature 337, 121–122 (1989).

154. Tanner, N. K., Cordin, O., Banroques, J., Doère, M. & Linder, P. The Q motif: A

newly identified motif in DEAD Box helicases may regulate ATP binding and

hydrolysis. Mol. Cell 11, 127–138 (2003).

155. Fuller-Pace, F. V. DExD/H box RNA helicases: Multifunctional proteins with

important roles in transcriptional regulation. Nucleic Acids Res. 34, 4206–4215 (2006).

156. Staley, J. P. & Guthrie, C. Mechanical devices of the spliceosome: Motors,

clocks, springs, and things. Cell 92, 315–326 (1998).

157. Granneman, S., Bernstein, K. A., Bleichert, F. & Baserga, S. J. Comprehensive

mutational analysis of yeast DEXD/H box RNA helicases required for small ribosomal

subunit synthesis. Mol. Cell. Biol. 26, 1183–1194 (2006).

158. Snay-Hodge, C. A., Colot, H. V., Goldstein, A. L. & Cole, C. N. Dbp5p/Rat8p is

a yeast nuclear pore-associated DEAD-box protein essential for RNA export. EMBO J.

17, 2663–2676 (1998).

159. Grifo, J. A. et al. Characterization of eukaryotic initiation factor 4A, a protein

involved in ATP-dependent binding of globin mRNA. J. Biol. Chem. 257, 5246–5252

(1982). 81

160. Linder, P. Yeast RNA helicases of the DEAD-box family involved in translation

initiation. Biol. Cell 95, 157–167 (2003).

161. Py, B., Higgins, C. F., Krisch, H. M. & Carpousis, A. J. A DEAD-box RNA

helicase in the Escherichia coli RNA degradosome. Nature 381, 169–172 (1996).

162. Huang, H.-R. et al. The splicing of yeast mitochondrial group I and group II

introns requires a DEAD-box protein with RNA chaperone function. Proc. Natl. Acad.

Sci. 102, 163–168 (2005).

163. Cordin, O., Banroques, J., Tanner, N. K. & Linder, P. The DEAD-box protein

family of RNA helicases. Gene 367, 17–37 (2006).

164. Linder, P. Dead-box proteins: A family affair—active and passive players in

RNP-remodeling. Nucleic Acids Res. 34, 4168–4180 (2006).

165. Linder, P. & Fuller-Pace, F. V. Looking back on the birth of DEAD-box RNA

helicases. Biochim. Biophys. Acta BBA - Gene Regul. Mech. 1829, 750–755 (2013).

166. Putnam, A. A. & Jankowsky, E. DEAD-box helicases as integrators of RNA,

nucleotide and protein binding. Biochim. Biophys. Acta BBA - Gene Regul. Mech.

1829, 884–893 (2013).

167. Lahn, B. T. & Page, D. C. Functional coherence of the human Y chromosome.

Science 278, 675–680 (1997).

168. Franca, R., Belfiore, A., Spadari, S. & Maga, G. Human DEAD-box ATPase

DDX3 shows a relaxed nucleoside substrate specificity. Proteins Struct. Funct.

Bioinforma. 67, 1128–1137 (2007). 82

169. Garbelli, A., Beermann, S., Di Cicco, G., Dietrich, U. & Maga, G. A motif unique

to the human Dead-Box protein DDX3 is important for nucleic acid binding, ATP

hydrolysis, RNA/DNA unwinding and HIV-1 replication. PLoS ONE 6, (2011).

170. Fagerberg, L. et al. Analysis of the human tissue-specific expression by genome-

wide integration of transcriptomics and antibody-based proteomics. Mol. Cell.

Proteomics 13, 397–406 (2014).

171. Snijders Blok, L. et al. Mutations in DDX3X are a common cause of unexplained

intellectual disability with gender-specific effects on Wnt signaling. Am. J. Hum.

Genet. 97, 343–352 (2015).

172. Valiente‐Echeverría, F., Hermoso, M. A. & Soto‐Rifo, R. RNA helicase DDX3:

At the crossroad of viral replication and antiviral immunity. Rev. Med. Virol. 25, 286–

299 (2015).

173. Brai, A. et al. Human DDX3 protein is a valuable target to develop broad

spectrum antiviral agents. Proc. Natl. Acad. Sci. 113, 5388–5393 (2016).

174. Ditton, H. J., Zimmer, J., Kamp, C., Rajpert-De Meyts, E. & Vogt, P. H. The

AZFa gene DBY (DDX3Y) is widely transcribed but the protein is limited to the male

germ cells by translation control. Hum. Mol. Genet. 13, 2333–2341 (2004).

175. Foresta, C., Ferlin, A. & Moro, E. Deletion and expression analysis of AZFa

genes on the human Y chromosome revealed a major role for DBY in male infertility.

Hum. Mol. Genet. 9, 1161–1169 (2000). 83

176. Lucas, C. G. et al. A novel swine sex-linked marker and its application across

different mammalian species. Transgenic Res. (2020) doi:10.1007/s11248-020-00204-

z.

177. Sekiguchi, T. et al. New methods for species and sex determination in three

sympatric Mustelids, Mustela itatsi, Mustela sibirica and Martes melampus. Mol. Ecol.

Resour. 10, 1089–1091 (2010).

178. Sashital, D. G., Venditti, V., Angers, C. G., Cornilescu, G. & Butcher, S. E.

Structure and thermodynamics of a conserved U2 snRNA domain from yeast and

human. RNA 13, 328–338 (2007).

179. Chang, T.-C., Yang, Y., Retzel, E. F. & Liu, W.-S. Male-specific region of the

bovine Y chromosome is gene rich with a high transcriptomic activity in testis

development. Proc. Natl. Acad. Sci. 110, 12373–12378 (2013).

180. Laird, P. W. et al. Simplified mammalian DNA isolation procedure. Nucleic Acids

Res. 19, 4293 (1991).

181. Zhang, Z., Schwartz, S., Wagner, L. & Miller, W. A greedy algorithm for aligning

DNA sequences. J. Comput. Biol. 7, 203–214 (2000).

182. Skinner, B. M. et al. The pig X and Y chromosomes: Structure, sequence, and

evolution. Genome Res. 26, 130–139 (2016).

183. Olinsky, S. et al. Chromosomal mapping of the human M6 genes. Genomics 33,

532–536 (1996). 84

184. Drabek, K., Peppel, J. van de, Eijken, M. & Leeuwen, J. P. van. GPM6B regulates

osteoblast function and induction of mineralization by controlling cytoskeleton and

matrix vesicle release. J. Bone Miner. Res. 26, 2045–2051 (2011).

185. Hall, T. A. BioEdit: A user-friendly biological sequence alignment editor and

analysis program for Windows 95/98/NT. in Nucleic acids symposium series vol. 41

95–98 ([London]: Information Retrieval Ltd., c1979-c2000., 1999).