Y chromosome dynamics in Drosophila

Amanda Larracuente
Department of Biology

Sex chromosomes

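[Figure: XX and XY sex chromosome pairs. Image: J. Graves]

Sex chromosome evolution

[Figure: stages of sex chromosome evolution: autosomes; proto-sex chromosomes (sex determining); suppressed recombination; differentiation into X and Y]

Reviewed in Rice 1996, Charlesworth 1996

Y chromosomes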

• Male-restricted
• Non-recombining
• Degenerate
• Heterochromatic

Image from Willard 2003

Drosophila

D. melanogaster

[Figure: chromosome map with the centromere (Cen) marked. Hoskins et al. 2015]

• ~40 Mb
• ~20 genes
• Acquired from autosomes
• Heterochromatic:
 - 80% is simple satellite DNA (Lohe et al. 1993)

Photo: A. Karwath

Satellite DNA

• Tandem repeats

• Heterochromatin

• Centromeres, telomeres, Y chromosomes

Yunis and Yasmineh 1970
http://www.chrombios.com

Y chromosome assembly challenges

• Repeats are difficult to sequence
• Underrepresented
• Difficult to assemble

[Figure: sequence reads compared against the genome; short read lengths cannot span repeats]

Single molecule real-time sequencing

• Pacific Biosciences

• Average read length ~15 kb

• Long reads span repeats

• Better assemblies
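Why read length matters for repeats can be sketched with a toy calculation. The example below is only an illustration under stated assumptions: the 8 kb repeat, the 500 bp anchor on each side, the 150 bp short-read length, and the five-read long-read sample (mean 15 kb) are all hypothetical values, not data from the talk. Being long enough is also only a necessary condition, since a read must still start in the right place to cross the repeat.

```python
def spanning_fraction(read_lengths, repeat_len, anchor=500):
    """Fraction of reads long enough to cross a repeat of repeat_len bp
    while keeping `anchor` bp of unique sequence on each side."""
    if not read_lengths:
        return 0.0
    needed = repeat_len + 2 * anchor
    long_enough = sum(1 for n in read_lengths if n >= needed)
    return long_enough / len(read_lengths)


# Hypothetical inputs (assumptions for illustration only).
repeat = 8_000                                         # 8 kb repeat block
short_reads = [150] * 1000                             # short-read library
long_reads = [5_000, 10_000, 15_000, 20_000, 25_000]   # mean 15 kb

short_frac = spanning_fraction(short_reads, repeat)
long_frac = spanning_fraction(long_reads, repeat)
print(f"short reads: {short_frac:.2f}, long reads: {long_frac:.2f}")
```

With these toy numbers no short read can bridge the repeat, while four of the five long reads can.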

[Figure: zero-mode waveguide. Eid et al. 2009]

Comparative Y chromosome evolution in Drosophila

I. Y chromosome assemblies

II. Evolution of Y-linked genes

Drosophila

[Figure: phylogeny with divergence times of 2 Mya and 0.24 Mya. Photo: A. Karwath]

P6C4 coverage per species: ~115X, ~120X, ~85X, ~95X

De novo genome assembly

• Assemble genome: iterative assembly (Canu, Hybrid, Quickmerge)
• Polish reference: Quiver x 2; Pilon

[Diagram: assembled genome compared against chromosome arms 2L, 2R, 3L, 3R, 4, X, and Y, with Y and X/A heterochromatin labeled. Mahul Chakraborty, Ching-Ho Chang]

De novo genome assembly

Species        Total bp       # contigs   NG50 (bp)
D. simulans    154,317,203    161         21,495,729
D. mauritiana  154,866,913    164         22,121,759
D. sechellia   166,750,432    254         19,907,079
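For readers unfamiliar with the NG50 column: it is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of an estimated genome size. N50 uses the assembly's own total instead. A minimal sketch follows; the contig lengths and genome sizes are made-up toy values, and `nx50` is a hypothetical helper name, not a function from any assembler.

```python
def nx50(contig_lengths, genome_size=None):
    """Return N50 (genome_size=None) or NG50 (genome_size given).

    Walk contigs from longest to shortest and report the length at which
    the running total first reaches half of the target size.
    """
    total = genome_size if genome_size is not None else sum(contig_lengths)
    target = total / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0  # contigs cover less than half of the estimated genome


# Toy contig set (arbitrary units), for illustration only.
contigs = [50, 40, 30, 20, 10]
n50 = nx50(contigs)                      # half of 150 is 75
ng50 = nx50(contigs, genome_size=200)    # half of 200 is 100
print(n50, ng50)
```

Because NG50 is tied to a fixed genome size, it stays comparable between assemblies of different total length, which is presumably why it is the statistic reported here.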

Genome assembly

Missing most of Y chromosome!

[Diagram: assembled genome compared against chromosome arms 2L, 2R, 3L, 3R, 4, X, and Y, with Y and X/A heterochromatin labeled]

Underrepresented heterochromatin

[Figure: histograms of region counts by read coverage (0 to 150) in sequenced Dmau males, with regions labeled A, U, X, and Y. Ching-Ho Chang]

Expected:
• ~100X for autosomes
• ~50X for X
• ~50X for Y

Observed:
• ~100X for autosomes
• ~50X for X
• ~35X for Y
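The expected values follow from copy number: a male carries two copies of each autosome but one X and one Y, so X and Y reads should arrive at half the autosomal depth. The sketch below redoes that arithmetic with the coverage figures quoted on the slide. The function and variable names are illustrative only.

```python
def expected_coverage(autosome_cov, copies_in_male):
    """Expected depth for a chromosome with the given copy number in a male,
    relative to autosomes present in two copies."""
    return autosome_cov * copies_in_male / 2


AUTOSOME_COV = 100  # ~100X on autosomes, from the slide

expected = {
    "A": expected_coverage(AUTOSOME_COV, 2),
    "X": expected_coverage(AUTOSOME_COV, 1),
    "Y": expected_coverage(AUTOSOME_COV, 1),
}
observed = {"A": 100, "X": 50, "Y": 35}  # approximate values from the slide

# Ratio below 1 means the chromosome is underrepresented in the reads.
representation = {chrom: observed[chrom] / expected[chrom] for chrom in expected}
for chrom, ratio in representation.items():
    print(f"{chrom}: expected {expected[chrom]:.0f}X, "
          f"observed {observed[chrom]}X, ratio {ratio:.2f}")
```

On these numbers the Y is recovered at about 70% of its expected depth, while autosomes and the X match expectation.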

Y chromosome assembly

• Map reads to reference genome
• Remove X and autosomal reads
• Reassemble leftover reads
• Manual curation
• Confirm Y contigs
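The read-filtering step can be pictured with a toy sketch. It assumes the mapping has already been done and summarized as a dictionary from read ID to the chromosome arm of its best hit (`None` for unmapped reads). That input format, the read IDs, and the decision to keep unmapped reads alongside Y-mapped ones are assumptions made for illustration. The talk does not specify these details.

```python
# Chromosome arms whose reads are discarded before reassembly.
X_AND_AUTOSOMES = {"2L", "2R", "3L", "3R", "4", "X"}


def leftover_reads(best_hit):
    """Return IDs of reads that do not map to the X or an autosome.

    best_hit maps read ID -> chromosome arm of the best alignment,
    or None when the read did not map anywhere.
    """
    return sorted(
        read_id
        for read_id, chrom in best_hit.items()
        if chrom not in X_AND_AUTOSOMES
    )


# Hypothetical mapping summary.
hits = {"r1": "2L", "r2": "X", "r3": "Y", "r4": None, "r5": "3R"}
candidates = leftover_reads(hits)
print(candidates)
```

The leftover set is what would be passed on to reassembly, followed by curation and confirmation of the resulting contigs.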

Identifying Y-linked contigs

• Illumina reads
• Map XX (female) reads
• Map XY (male) reads
• Take female/male ratio

[Figure: histograms of contig counts by female/male ratio, colored by location (2, 3, 4, X, Y, unknown); one panel covers ratios 0 to 3 and another zooms in on 0.0 to 0.3]
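The ratio test can be sketched as follows, assuming each contig's depth has first been normalized to the autosomal depth within each sex. Under that assumption a Y-linked contig has a female/male ratio near 0, an autosomal contig near 1, and an X-linked contig near 2. The cutoffs of 0.1 and 1.5 are hypothetical placeholders, not the thresholds used in the talk.

```python
def classify_contig(female_depth, male_depth, y_max=0.1, x_min=1.5):
    """Classify a contig from normalized female and male read depth.

    Depths are assumed to be scaled so that autosomes equal 1.0 in each sex.
    y_max and x_min are illustrative cutoffs only.
    """
    if male_depth <= 0:
        return "unknown"  # no male coverage, ratio undefined
    ratio = female_depth / male_depth
    if ratio <= y_max:
        return "Y"
    if ratio >= x_min:
        return "X"
    return "autosome"


# Hypothetical contigs: (female depth, male depth).
examples = {
    "contig_a": (0.02, 0.5),   # almost no female reads
    "contig_b": (1.0, 0.5),    # twice the depth in females
    "contig_c": (1.0, 1.0),    # equal depth in both sexes
    "contig_d": (0.0, 0.0),    # no coverage at all
}
calls = {name: classify_contig(f, m) for name, (f, m) in examples.items()}
print(calls)
```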

Y-sensitive assemblies

[Diagram: the reference (X + autosomes) is merged with the Y-enriched assembly to give a new reference]

Y chromosome assembly

~40 Mb

Assembly              Contigs   Total size (bp)   N50 (bp)
Dmel reference (v6)   261       3,977,036         81,922
Dmel PacBio           97        15,245,657        415,400
Dsim PacBio           40        14,132,328        1,189,645
Dsec PacBio           68        14,889,957        606,900
Dmau PacBio           19        14,537,038        2,282,139
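Setting the assembled totals against the ~40 Mb Y chromosome size gives a rough sense of how much of the Y each assembly captures. The sketch below does that arithmetic. Applying the same ~40 Mb figure to all four species is an assumption made here for illustration, since the slide gives only the one size.

```python
Y_SIZE_BP = 40_000_000  # ~40 Mb; assumed here to apply to every species

assembled_bp = {
    "Dmel reference (v6)": 3_977_036,
    "Dmel PacBio": 15_245_657,
    "Dsim PacBio": 14_132_328,
    "Dsec PacBio": 14_889_957,
    "Dmau PacBio": 14_537_038,
}

# Fraction of the assumed Y chromosome size present in contigs.
fraction = {name: bp / Y_SIZE_BP for name, bp in assembled_bp.items()}
for name, frac in fraction.items():
    print(f"{name}: {frac:.1%}")
```

Under that assumption the PacBio assemblies each cover roughly 35 to 38% of the Y, against about 10% for the Dmel v6 reference.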

30-40% of Y chromosomes in contigs!

Ching-Ho Chang

Dmel Y-linked genes

Most Y-linked genes acquired from autosomes

[Diagram: chromosome arms 2L, 2R, 3L, 3R, 4, X, and Y, with Y and X/A heterochromatin labeled]

e.g. Carvalho et al. 2000, 2001; Carvalho et al. 2015; Krsticevic et al. 2015

Dmel Y-linked genes

Gene          Origin   Dmel   Dsim   Dsec   Dmau
kl-5          3R       ✔      ✔      ✔      ✔
PRY           2L       ✔      ✔      ✔      ✔
kl-3          2L       ✔      ✔      ✔      ✔
kl-2          2R       ✔      ✔      ✔      ✔
Pp1-Y1        2R       ✔      ✔      ✔      ✔
ARY           3L       ✔      ✔      ✔      ✔
Ppr-Y         2L       ✔      ✔      ✔      ✔
WDY           2L       ✔      ✔      ✔      ✔
Pp1-Y2        3R       ✔      ✔      ✔      ✔
ORY           3R       ✔      ✔      ✔      ✔
CCY           3R       ✔      ✔      ✔      ✔
Mst77Y1-18ψ   3L       ✔      ✖      ✖      ✖
FDY           3R       ✔      ✖      ✖      ✖
Mst35Y        2R       ✔      ✖      ✖      ✖

Consistent with Koerich et al. 2008; labs of AB Carvalho and AG Clark

Y-linked gene organization

[Figure: gene order along the Y chromosome in each species; cen marks the centromere]

Dmel  kl-5 PRY kl-3 kl-2 Pp1-Y1 ARY Ppr-Y WDY Pp1-Y2 ORY CCY
Dmau  kl-5 Pp1-Y1 ORY kl-2 CCY Pp1-Y2 PRY kl-3 Ppr-Y ARY WDY
Dsim  kl-5 Pp1-Y1 Pp1-Y2 ORY kl-3 PRY Ppr-Y ARY kl-2 CCY WDY
Dsec  kl-5 Pp1-Y1 Pp1-Y2 PRY ORY kl-2 Ppr-Y ARY CCY kl-3 WDY
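One simple way to put a number on the reshuffling is to count how many neighboring gene pairs in Dmel are still neighbors in each simulans clade species. This is offered only as an illustrative measure, not as the analysis used in the talk. The gene orders are taken as listed on the slide, and gene orientation and spacing are not considered.

```python
def adjacencies(order):
    """Set of unordered neighboring gene pairs in a gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


# Gene orders as listed on the slide.
ORDERS = {
    "Dmel": "kl-5 PRY kl-3 kl-2 Pp1-Y1 ARY Ppr-Y WDY Pp1-Y2 ORY CCY".split(),
    "Dmau": "kl-5 Pp1-Y1 ORY kl-2 CCY Pp1-Y2 PRY kl-3 Ppr-Y ARY WDY".split(),
    "Dsim": "kl-5 Pp1-Y1 Pp1-Y2 ORY kl-3 PRY Ppr-Y ARY kl-2 CCY WDY".split(),
    "Dsec": "kl-5 Pp1-Y1 Pp1-Y2 PRY ORY kl-2 Ppr-Y ARY CCY kl-3 WDY".split(),
}

reference = adjacencies(ORDERS["Dmel"])

# Number of Dmel neighbor pairs conserved in each other species.
shared = {
    species: len(reference & adjacencies(order))
    for species, order in ORDERS.items()
    if species != "Dmel"
}
for species, count in shared.items():
    print(f"{species}: {count} of {len(reference)} Dmel adjacencies conserved")
```

By this count only 2 (Dmau), 3 (Dsim), and 1 (Dsec) of the 10 Dmel adjacencies are conserved. The ARY and Ppr-Y pair is the only one shared by all four species.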

Gene duplications

• Most genes have duplicate copies
• Degenerated copies

[Figure: copy number per exon, split into complete and degenerated copies, in Dmel, Dmau, Dsim, and Dsec; panels for ARY (exons 1 to 6) and kl-2 (exons 1 to 12)]

New genes in simulans clade

[Figure: phylogeny with divergence times of 2 Mya and 0.24 Mya, marking genes that moved to the Y in the simulans clade: Hrb27C (2L), CG16781 (X), SRPK (2R), and CG3511 (2R). Photo: A. Karwath]

[Figure: CG3511 parent gene (2R) and the Dmau Y-linked copies, with Invader2 labeled]

• Dmau Y-linked copies of CG3511: all inactivated

Gene      Origin   Dmel   Dsim   Dsec   Dmau
CG3511    2R       ✖      ✖      ✖      ✔ (inactivated)
Hrb27C    2L       ✖      ✔      ✔      ✔
CG16781   X        ✖      ✔      ✔      ✔
SRPK      2R       ✖      ✔      ✔      ✔

Comparative Y chromosome evolution in Drosophila

I. Y chromosome assembly
 - 30-40% of Y chromosomes in contigs

II. Y-linked gene evolution
 - Reorganization
 - Duplication
 - New genes
 - Subfunctionalization

Acknowledgments

Ching-Ho Chang (U. Rochester)

Casey Bergman (UGA)

Center for Integrated Research Computing

sim clade PacBio genomes:
Mahul Chakraborty (UC Irvine)
J.J. Emerson (UC Irvine)
Kristi Montooth (U Nebraska)
Colin Meiklejohn (U Nebraska)
Jeffrey Vedenayagam (NYU)