Emerging opportunities for genomic research - what can we expect

Leif Andersson

Uppsala University & Swedish University of Agricultural Sciences

New molecular technologies make extensive genetic characterizations possible

1. Single Nucleotide Polymorphisms (SNPs)

A SNP chip comprising about 200,000 SNPs costs about 150 USD

1 USD > 1000 genotypes!
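The "1 USD > 1000 genotypes" figure follows directly from the two numbers quoted for the chip. A minimal sketch of the arithmetic, using the approximate values from the slide (the variable names are illustrative only):

```python
# Approximate figures quoted on the slide.
snps_per_chip = 200_000   # genotypes obtained per chip (one per SNP)
chip_cost_usd = 150       # cost of one chip in USD

# Number of genotypes obtained per USD spent.
genotypes_per_usd = snps_per_chip / chip_cost_usd
print(f"about {genotypes_per_usd:.0f} genotypes per USD")
```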

- Genetic relationships

- Genome-wide association analysis

New molecular technologies make extensive genetic characterizations possible

2. Whole Genome Sequencing

30 gigabase pairs (30 x 10^9 bp) of sequence cost about 2,000 USD

Sequence individuals - Sequence one mammal or three birds to 10X coverage
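The "one mammal or three birds" estimate can be checked with a short coverage calculation. This is a sketch only: the genome sizes below (about 3 Gb for a mammal, about 1 Gb for a bird) are round-number assumptions that are not stated on the slide, and the function name is hypothetical.

```python
def genomes_at_coverage(total_bases, genome_size, coverage):
    """How many genomes of a given size can be sequenced to a given coverage."""
    return total_bases / (genome_size * coverage)


total_bases = 30 * 10**9   # 30 Gb of sequence, as quoted on the slide
mammal_genome = 3 * 10**9  # assumption: a mammalian genome is roughly 3 Gb
bird_genome = 1 * 10**9    # assumption: a bird genome is roughly 1 Gb

mammals = genomes_at_coverage(total_bases, mammal_genome, coverage=10)
birds = genomes_at_coverage(total_bases, bird_genome, coverage=10)
print(mammals, birds)
```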

Sequence pools of individuals - For instance, sequence a pool representing a rare breed of pigs to 10X coverage

How some chickens got black skin and black connective tissues

Fibromelanosis in chicken is caused by a dominant allele (FM)

Ben Dorshorst et al. PLoS Genetics 2011

The Silkie Chicken

We used whole-genome sequencing to identify the FM mutation

1. We used a pool of Silkie chickens and a control population

2. Shear the genome into fragments in the range 150 bp to 20,000 bp (in our case, 2,500 bp fragments)

3. Sequence each fragment 50-100 bp from each end

4. Align the reads to the reference genome and analyse the data
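One general way such paired-end data can point to a structural rearrangement is through read pairs whose mapped positions disagree with the known fragment size. The sketch below illustrates that idea only; it is not the authors' pipeline. The `ReadPair` record, the `classify_pair` function, the 500 bp tolerance, and the example coordinates are all hypothetical, and a real analysis would also use read orientation and aligner output.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Mapped positions of the two end reads from one fragment (hypothetical record)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def classify_pair(pair: ReadPair, expected: int = 2500, tolerance: int = 500) -> str:
    """Label a read pair by comparing its mapped span with the expected fragment size."""
    if pair.chrom1 != pair.chrom2:
        return "different_chromosomes"
    span = abs(pair.pos2 - pair.pos1)
    if span > expected + tolerance:
        return "span_too_long"
    if span < expected - tolerance:
        return "span_too_short"
    return "concordant"


# Made-up coordinates for illustration; not data from the study.
pairs = [
    ReadPair("chr20", 10_000, "chr20", 12_480),
    ReadPair("chr20", 50_000, "chr20", 180_000),
    ReadPair("chr20", 70_000, "chr20", 70_300),
]
labels = [classify_pair(p) for p in pairs]
print(labels)
```

Clusters of non-concordant pairs at the same location are what would merit a closer look; a single odd pair is more likely a mapping artifact.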

[Figure: read pairs aligned to the reference genome, with the two reads of a pair about 2.5 kb apart]

Whole genome resequencing revealed a complex structural rearrangement on chromosome 20

The same rearrangement is present in four different breeds carrying the FM mutation

The mutation constitutes a complex genomic rearrangement

One of the duplicated fragments contains an obvious candidate gene for hyperpigmentation: Endothelin 3 (EDN3)

Svarthöns – a local strain of chicken in Sweden

Structural variations have enabled rapid evolution of domesticated animals

Genomic
- Dorsal hair ridge and predisposition to dermoid sinus [oligogenic?]: 133 kb dup (contains FGF3, FGF4, FGF19, ORAOV1). Salmon Hillbertz et al. 2007, Nat Genet 39:1318-20

Intragenic
- ‘High growth’: 19 kb del (3’ end of SH3RF2), WGS. Rubin et al. 2010, Nature 464:587-91

Intronic
- Premature hair graying and susceptibility to melanoma: 4.6 kb dup (intron 6 of STX17). Pielberg et al. 2008, Nat Genet 40:1004-9
- Pea-comb phenotype: ~20-40X amplification (intron 1 of SOX5). Wright et al. 2009, PLoS Genet 5:e1000512

Intergenic
- Dark brown plumage color: 8.3 kb del (upstream of SOX10). Gunnarsson et al. 2011, Pigment Cell Melanoma Res 24:268-74
- Skin wrinkling and susceptibility to familial Shar-Pei fever: 1-10X+ amplification (upstream of HAS2). Olsson et al. 2011, PLoS Genet 7:e1001332

Pace Maker - a single base change reshapes horses’ neural circuits to add pacing to their repertoire

[Figure: cover of Nature, Vol. 488, No. 7413, 30 August 2012. "Gait keeper: A single mutation reshapes horses’ neural circuits to add pacing to their repertoire", page 642]

All horses (in order of increasing speed)
- Walk, 4-beat
- Trot, 2-beat, diagonal
- Canter, 3-beat
- Gallop, 4-beat

Photo: Freyja Imsland

Gaited horses
- Walk, 4-beat
- Trot, 2-beat, diagonal
- Canter, 3-beat
- Gallop, 4-beat
- Toelt, 4-beat
- Pace, 2-beat, lateral

Photo: Freyja Imsland

Gaited horses, by region
- Europe: Icelandic horse
- America: Campolina, Mangalarga Marchador, Missouri Foxtrotter, Paso Fino, Peruvian Paso, Tennessee Walker, Kentucky Mountain Horse
- Asia: Marwari horse

Non-gaited horses
- Akhal teke, Andalusian, Arabian, Exmoor Pony, Friesian, Haflinger, Hanoverian, Lusitano, North Swedish Draft, Norwegian Fjord, Quarter Horse, Shetland Pony, Thoroughbred, Trakehner

Icelandic horses – a special case

4-gaited horses (walk, toelt, trot, gallop) and 5-gaited horses (walk, toelt, trot, gallop + PACE) occur at approximately the same frequency

Trait              Heritability
Ability to pace    ~0.60
Human height       ~0.80

Genome-wide association study

✽ Illumina EquineSNP50 Genotyping BeadChip
✽ Questionnaire - Is your horse capable of pace?
✽ 30 without pace (four-gaited) and 40 with pace (five-gaited)

Genome-wide association analysis
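At each SNP, a case/control analysis of this kind compares allele counts between the two groups. A minimal sketch of a single-SNP allelic chi-square test (1 degree of freedom) follows. The allele counts are invented for illustration, chosen only to match 40 cases and 30 controls with two alleles each; the function name is hypothetical, and the slides do not state which test statistic or software was actually used.

```python
import math


def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls, columns are the two alleles of one SNP.
    Assumes no row or column total is zero.
    """
    n = case_a + case_b + control_a + control_b
    numerator = n * (case_a * control_b - case_b * control_a) ** 2
    denominator = ((case_a + case_b) * (control_a + control_b)
                   * (case_a + control_a) * (case_b + control_b))
    chi2 = numerator / denominator
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: 40 five-gaited cases (80 alleles), 30 four-gaited controls (60 alleles).
chi2, p_value = allelic_chi_square(case_a=70, case_b=10, control_a=30, control_b=30)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2e}")
```

In a genome-wide scan the same test is repeated for every SNP on the chip, so the significance threshold has to be corrected for the number of tests.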

[Figure: SNP alleles at the associated region in cases and controls]

Genome-wide association study

GWA defined a 684 kb interval containing 4 genes

Further fine-mapping refined the IBD (identical-by-descent) region to 438 kb containing DMRT1-3 (Doublesex and mab-3 related transcription factors)

Mutation identification

• Whole genome sequencing of one four-gaited and one five-gaited Icelandic horse
• Identified only one coding mutation in the critical interval, a nonsense mutation in DMRT3 exon 2; almost all other polymorphisms were excluded by genetic analysis

Genotype distributions

Breed                           wt/wt  wt/MUT  MUT/MUT  Total
Icelandic horses, four-gaited       3     103       46    152
Icelandic horses, five-gaited       0       2      103    105
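The contrast between the two groups is easier to see as the frequency of the MUT allele and the share of MUT/MUT homozygotes. A small sketch using the genotype counts from the table above (the function name is illustrative only):

```python
def mut_allele_frequency(wt_wt, wt_mut, mut_mut):
    """Frequency of the MUT allele given genotype counts (two alleles per horse)."""
    total_alleles = 2 * (wt_wt + wt_mut + mut_mut)
    return (wt_mut + 2 * mut_mut) / total_alleles


# Genotype counts (wt/wt, wt/MUT, MUT/MUT) from the table.
four_gaited = (3, 103, 46)
five_gaited = (0, 2, 103)

freq_four = mut_allele_frequency(*four_gaited)
freq_five = mut_allele_frequency(*five_gaited)

# Share of horses that are homozygous for the mutation in each group.
homozygous_four = four_gaited[2] / sum(four_gaited)
homozygous_five = five_gaited[2] / sum(five_gaited)

print(f"MUT allele frequency: four-gaited {freq_four:.2f}, five-gaited {freq_five:.2f}")
print(f"MUT/MUT share: four-gaited {homozygous_four:.2f}, five-gaited {homozygous_five:.2f}")
```

The mutation is common in both groups, but nearly all five-gaited horses are homozygous for it, whereas most four-gaited horses are heterozygous.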

Photo: Freyja Imsland

Genotype distributions

Breed                           wt/wt  wt/MUT  MUT/MUT  Total
Gaited horses
Icelandic horses                    3     105      149    257
Rocky Mountain Horse                0       0       17     17
Kentucky Mountain Saddle Horse      0       2       20     22
[breed name missing]                0       0       40     40
Tennessee Walking Horse             0       1       32     33
Peruvian Paso                       0       0       19     19
Paso Fino                           0       0       45     45
Non-gaited horses
Przewalski’s horse                  6       0        0      6
Gotland Pony                       21       0        0     21
[breed name missing]               22       0        0     22
Swedish                            35       0        0     35
Arabian                            18       0        0     18
Thoroughbred                       35       0        0     35
Shetland Pony                      48       0        0     48
North-Swedish                      24       0        0     24

So what is the function of DMRT3 and where is it expressed?

DMRT3 is expressed in a specific subset of neurons in the spinal cord of mice

DMRT3 neurons are interneurons with inhibitory character that project axons both ipsi- and contralaterally in the spinal cord and make synaptic connections to motor neurons

Trans-synaptic pseudorabies virus-based tracing

How old is the mutation?

Bronze pacing horse poised on a swallow with wings outstretched. Eastern Han Dynasty, 2nd century AD

Plinius (23-79 AD) describing some Spanish horse breeds: “These horses have a unique leg action that differs from that of other horses”.

Detail from a painting done in the 1400s showing two horses in a lateral movement

Conclusions

1. The development of new technologies creates many emerging opportunities for genetic studies of all types of organisms – both basic research and applications

2. The cost is expected to come down even further – 100 USD genome?

3. Collection of unique and well-characterized biological materials will have an increasing value