The coalition between Italian goats and Italian researchers: the Italian Goat Consortium

Paolo Ajmone Marsan and the Italian Goat Consortium Institute of Zootechnics Università Cattolica del Sacro Cuore Piacenza, Italy [email protected]

Cardiff, 17/06/2014 Outline

IMMAGINE

• The Italian Goat Consortium IMMAGINE • SNP diversity in Italian goats • Perspectives Italian goat Consortium

• Hard time for economy in Europe – Harder in Italy than in Central/Northern Europe • Even harder for research funding – Very difficuly for small ruminant research » No way for goat diversity! • From crysis to opportunity – Modest seed funding of a project started in 2008(Innovagen funded by the Ministry of Agriculture) • Coalization and Coordination….. – Definetely a new model for Italian scientists……. Italian goat Consortium

Paola Crepaldi, Coordinator Università degli Studi di Milano, ITALY Fabio PILLA, Maria Silvia D'ANDREA Università degli Studi del Molise, Campobasso, ITALY Paolo AJMONE-MARSAN, Nicola BACCIU, Lorenzo BOMBA, Licia COLLI, Marco MILANESI Università Cattolica del Sacro Cuore, Piacenza, ITALY Paola Crepaldi Antonello CARTA, Tiziana SECHI [email protected] AGRIS, Loc. Bonassai, Sassari, ITALY Stefania CHESSA, Bianca CASTIGLIONI Consiglio Nazionale delle Ricerche, Lodi, ITALY Pooling local efforts and Donata MARLETTA, Salvatore BORDONARO resources for the genomic Università degli Studi di Catania, ITALY characterisation of Italian Salvatore MURRU Associazione Nazionale della Pastorizia, Roma, ITALY goat breeds Riccardo NEGRINI, Raffaele MAZZA Associazione Italiana Allevatori, Roma, ITALY www.italiangoatconsortium.eu Giulio PAGNACCO, Beatrice COIZET, Letizia NICOLOSO, Università degli Studi di Milano, ITALY Alessio VALENTINI Università degli Studi della Tuscia, Viterbo, ITALY Sampling

Camosciata delle Alpi Bionda dell’Adamello

Orobica Saanen Val Passiria

Valdostana Teramana

Grigia Ciociara

Sarda Nicastrese Aspromontana Maltese

Girgentana

Argentata dell’Etna 50K Illumina goat SNP chip

• Discovery on 6 breeds (meat, mixed and milk) • Detection of ~12 million variations with > 10 millionSNPs • 60,000 SNPs (spaced on the genome, with >0.2 MAF, >0.8 Illumina ADT score…) • 52,295 successful loci (tested with 288 goat DNA samples from 10 different breeds) • Pseudochromosomes aligned on cattle • Details on www.goatgenome.org • Sequencing and novel de novo assembly on going at USDA

Dataset cleaning

• Filtering exclusion threshold – MAF < 1% – Missing (SNP) > 5% – Missing (animal) > 5% – HW within breed FDR > 20% • Working Dataset – 15 breeds – 350 animals (15-32 per breed) – 51,136 SNPs SNPchip affected by ascertainment bias (EU Nextgen project) but highly informative for the Italian gene pool Within breed MAF distribution

CAM

SAA

SAR

ARG

BIO X=0 VPS 0

ORO

SAM

GIR

TER

0 4000 8000 12000 16000 20000 24000 28000 32000 36000 40000 44000 48000 52000 Expected vs Observed Heterozygosity K=2

Alps Center South & Islands Cross Validation error plot

ADMIXTURE Software (10 runs) The Best K (K=11)

Alps Center South & Islands Geographic distribution of 11 genomic components Neighbour Net based on Reynolds distance

Reynolds Distance

2 å(xi - yi ) 1 i DRe ynolds = 2 1- å xi yi i Principal Component Analysis

CAM = Camosciata (Alpine) VAL = SAA = Saanen Teramana BIO = Bionda ORO = VPS = Valpassiria GCI = Grigia Ciociara ARG = Argentata NIC = Nicastrese ASP = Aspromontana Alps Center-South MAL = Maltese Siciliana GIR = TER = Teramana SAM = Maltese Sarda Maltese SAR = Sarda Principal Component Analysis

CAM = Camosciata (Alpine) West-East Alps VAL = Valdostana SAA = Saanen BIO = Bionda Aspromontana ORO = Orobica VPS = Valpassiria Girgentana GCI = Grigia Ciociara ARG = Argentata Maltese NIC = Nicastrese ASP = Aspromontana MAL = Maltese Siciliana GIR = Girgentana TER = Teramana Orobica SAM = Maltese Sarda SAR = Sarda

LD in Chromosome 6

2 r

Distance (Mb) Ne

* Historical Ne of Italian goat breeds was estimated using SNeP . For each pair of SNPs within a chromosome the LD is calculated according to Hill & Robertson (1968) using the method of Sved (1971) and correcting for sample size and mutations (Weir & Hill 1980, Hayes 2003, Corbin et al. 2012).

5000

4500 ARG 4000 ASP BIO 3500

CAM

e

z GCI

i 3000

s

n GIR

o 2500 MAL

a l

u NIC

p 2000

o ORO

P

e 1500 SAA v

c SAM e

f 1000

f SAR E 500 TER VAL 0 VPS 0 500 1000 1500 2000 Genera ons Ago Selection signatures

Lositan software Simulation of markers under the neutral model Detection of outliers having Fstor lower than expected under a neutral model at that value of heterozygosity

- 456 markers under directional selection - 629 under balancing selection

What about breed diversity at these loci? Markers under directional/balancing selection

0.02 0.06 0.02 0.06 0.10

5

2 .

Under_sel 0

5

0 .

0 Neutral = 50051 markers

6

0 .

0 Neutral1 Under_Sel = 456 markers

2 0

. Bal_Sel = 629 markers 0

6 Neutral 1 = 456 markers

0 .

Neutral 0 Neutral 2 = 456 markers

2

0 .

0

8

0 .

0 Neutral2

2

0

.

0

8

1

0 .

Bal_sel 0

0

1

0

. 0 0.05 0.20 0.35 0.02 0.06 0.010 0.016 Ntr Dir Bal Ntr1 Ntr2 Ntr 1 0.85 0.89 0.98 0.99 Pearson’ correlation Dir 1 0.70 0.83 0.88 between genetic distances Bal 1 0.85 0.88 between breeds Ntr 1 1 0.97 Ntr 2 1 Directional selection

PCA − PC1 (%VarExp: 13.509) and PC2 (%VarExp: 4.794) d = 5

2

1

0

1

)

8

% (

d ●

e

n

i

a

l

p x

6 ● E

r ● a ●

V ● 4 ● ● ●● 2 ● ● ● 0 ● ● TER● PCs ● ● ● ● ●

● ●

● ●● ● ●●● ●● ● ● ● ●● GIR●●● ●● ●● ●●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●●● ●● ●ASP● ● ● ●● ● ● ● ● GCI ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ●●●● ●● ● ● ● ● ARG● ● ● ● ● ● ●● ●● ● ● ●● VPS●●● ● ● ●● ● ● ●● ● ● ●● NIC● ● ●● ● ● ●● ● ● ●● ●● ●●● ●● ● ● ● ● BIO●● ●● ● ●● ● ●● ● ● ● ● ● ● ● SAA●●● ● ● ● ●● ●●● ●● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ●● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ●OR● ●●O● ● ● ● ● ●●● ● ● ●● ● ● ●●●● ● ●● CAM● ● ● ● ●● ● ● ●● ● ● SAR● ● ● ●● ● ●● ● ● ● ● ● V●●AL● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●

● ● ● ● ● ● ● ● MAL ● ●● ● ● ● ●

SAM ● ● ● ● ● ● ● ● ● ● ● ● Directional selection

CV error

2

7

.

0

0

7

.

0

8

6

.

r

0

o

r

r

e

V

C

6

6

.

0

4

6

.

0

2

6

. 0

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

K

ADMIXTURE Trial K=11

Directional

Selection

8

.

0

y

r

t

s

e

4

c

.

n

0

A

0

.

0

I

L L

S A P

C R

R R

G

O O

M M

C I I

I

A A

P A S

E A

R

A R A

N

B

G G

V

V S T A

M S

A S

C

O

Neutral

Mostly

Alps Center South & Islands Balancing selection

PCA − PC1 (%VarExp: 0.884) and PC2 (%VarExp: 0.846) d = 2

8 .

0 ●

6

.

0

)

%

(

d

e

n

i

a

l

p

4

.

x

0

E

r

a V

● ● ●

2

. 0

0

. 0 ● ● PCs ● ● ●

● ●

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ●● ● ●● ●NIC●● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●● SAR ●ARG GCI● ● ●● ● ● ASP●● ● ● ● ● ● ● ● ●● ●● ● MAL● ● ●● ● CAM●● ●● ●● ● ●● ● ● ●● ●● ●●● ● ●● ● ●●● ●● ●●● ● ●● ●● ● ● ● ● ● ● ● ●●●●● ● ● ● ● ● ● ●VAL● ●● ● ● ● ● ● ● ● ●●● ● SAAVPS●● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●● GIR● ● ● ● BIO● ● ● ●● ● TER ● ● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● ● ● ●● SAM● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ORO● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

● ● ●

● ●

● Balancing selection

CV error

0

8

.

0

5

7

.

0

r

o

r

r

e

V

C

0

7

.

0

5

6

. 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

K

ADMIXTURE Trial K=11

Balancing

Selection

8

.

0

y

r

t

s

e

4

c

.

n

0

A

0

.

0

I

L L

S A P

C R

R R

G

O O

M M

C I I

I

A A

P A S

E A

R

A R A

N

B

G G

V

V S T A

M S

A S

C

O

Neutral

Mostly

Alps Center South & Islands

0.35 0.38 0.41 0.35 0.38 0.410.34 0.38 0.34Correlation0.38 between

2

2

4

4

.

.

0 0

Bal_sel Bal_sel heterozygosities of breeds

9

9 3

0.35 0.38 0.41 0.35 0.38 0.410.34 0.38 0.34 0.38 3

.

.

0

0

2

2

4

4

.

.

9

9

0

0

3

3

.

. 0

0 Bal_sel NeutrBal_selal2 Neutral2

9

9

5

5

3

3

.

.

3

3

.

.

0

0

0

0

9

9

9

3

9

3

.

.

3

3

.

0

.

0 0

Neutral2 NeutrNeutral2al Neutral 0

5

5

6

6

3

3

3

3

.

.

.

.

0

0

0

0

9

9

8

8

3

3

3

3

.

.

.

.

0

0 0

0 Neutral NeutrNeutral1al Neutral1

6

6

4

4

3

3

.

.

3

3

.

.

0

0

0

0

4

4

3

3

8

8

.

.

3

3

0

0

.

. 0

0 Neutral1 Under_selNeutral1 Under_sel

6

6

4

4

2

2

.

.

3

3

.

.

0

0

0

0 4

0.39 0.41 0.39 0.41 0.36 0.39 0.36 0.39 0.26 0.32 0.380.264 0.32 0.38

3

3

.

. 0

Under_selNeutral Under_sel0 Directional Balancing Neutral 1

6

6

2

2

.

. 0 Neutral 1 0 0.87 0.33 0.99 0.39 0.41 0.36 0.39 0.26 0.32 0.38 0.39 0.41 0.36 0.39 Directional0.26 0.32 0.38 1 0.14 0.85 Balancing 1 0.33 Neutral 1 1 Correlation between heterozygosities of individuals

0.30 0.35 0.40

0

5

.

0

0

4 .

Bal_sel 0

0

3

.

0

0

4

.

0

5

3 .

0 Neutral

0

3

.

0

5

4

.

0

5

3 .

Under_sel 0

5

2

.

0

5

1

. 0

0.30 0.40 0.50 0.15 0.25 0.35 0.45

Neutral Directional Balancing Neutral 1 0.76 0.66 Directional 1 0.33 Balancing 1 IMMAGINE

Conclusions IMMAGINE SNPs vs others

• Much higher level of resolution (many thousand vs a few markers) • Robust and non homoplasic • Easier comparability across projects and data merging • Suited to genome wide analyses (ROH, Ne, Selection signatures, GWAS, breeding applications) • However panels should be carefully prepared and evaluated (ascertainment bias)

Italian goats

• Little or no inbreeding. • Variable level of admixture. • Some distinct breeds: Girgentana, Teramana, Orobica, Maltese. • Low Ne nowadays (bottlenecks, breeding management), higher in the past. • Geographic partition of diversity at small geographic scale (North-South and East-West in the Alps). • Markers under selection are valuable for conservation decisions • Neutral marker diversity is a reasonably good proxy of diversity of markers under directional selection

Breeding

• Breeding will be more and more guided by molecular analyses if cost continues to decrease • Methods customised to populations (small vs large pop. improvement, inbreeding control, maintenance of diversity) • Knowledge of population structure is needed for any kind of application to avoid false positives

30 International Networking

ADAPTMAP Goat Adaptmap

Traditional and novel approaches to study adaptation genomics: Selection signatures Spatial analysis

Enriched SNP panel Detection of new variation

To chacterize the study Population studies based on Mutation Classification Mutation effect on protein structure and function used as a tag for adaptation

Alessandra Stella

NextGen project

Pierre Taberlet

Francois Pompanon

Next generation methods to preserve farm animal biodiversity by optimizing present and future breeding options. 33

Interdisciplinarity and training GIS

Training

34 Future challenges

Copy Number Variations (CNVs)

Mefford and Eichler 2009 See following presentation of Fernando Garcia Final consideration

• Very fast molecular tool development. • Faster than our capacity to understand.

Under these circumstances any loss of diversity before characterization is a loss of unvaluable opportunity for science and agriculture

36 ACKNOWLEDGMENTS

Marco Milanesi Elia Vajana Lorenzo Bomba Licia Colli

Goat farmers International Goat Consortium (SNPChip) ASSONAPA (Italian small ruminant breeder association)