Triticum dicoccoides в Хайфском Университете: Генофонд, структурная и функциональная геномика, клонирование генов, иприложения

http://evolution.haifa.ac.il [email protected]/

ВИР, Санкт-Петербург, 5 октября 2011 The Institute of Evolution (IOE)

IOE was founded in 1976, by Prof. E. Nevo, now foreign member of the USA National Academy of Science. IOE includes ~30 researchers; 11 are full or associated professors. Research activities • Target organisms: microorganisms, plants, animals, humans •Basic scienceis linked to applied research in agriculture, medicine, biotechnology, and industry • Teaching at all levels – B.Sc., M.Sc., PhD, and posdocs • Annual publication ~60-100 research articles, 2 books, 1-2 patents • Grants: from BSF, BARD, GIF, DIP, CDR, INTAS, FP7, ISF The Institute of Evolution Research fields

‰ Biodiversity ‰ Molecular Biology & Evolution ‰ Evolutionary Cytogenetics ‰ Population Genetics ‰ Phylogenetics ‰ Genomics & Proteomics ‰ Bioinformatics ‰ Community Ecology ‰ Fungi Biotechnology ‰ Plant Ecology and Pollination ‰ Palaeobotany and Palaeoecology Evolution & Biodiversity studies

• Ecology and genetics of natural populations • Endangered species • Evolution in action • Assessment of environmental quality • Genetic resources and their utilization Wild cereals evolution & utilization program

• Origin of polyploid genome • Genetics of domestication • Genetic resources: evaluation and conservation (Gene Bank) • Cereal genomics: genetic/physical mapping, QTL analysis, MAS, cloning and sequencing

Wild progenitors of the Old World main domesticated plants

wild barley wild wheat H. spontaneum T. dicoccoides ErosionErosion ofof GeneticGenetic DDiversityiversity inin CCroprop PPlantslants

Domestication and Wild species plant breeding practices caused an erosion in the Domestication genetic diversity of crop plants. Landraces The most promising source for novel Plant Breeding genes and alleles are the gene pools of wild relatives of Modern varieties cultivated crops. Wild Emmer Wheat ™ discovered in Israel in 1906 ™ is the progenitor of domesticated wheat

Triticum turgidum ssp. dicoccoides (Genome AABB) Aaronsohn’s vision was to use wild emmer wheat gene pool for improvement of cultivated wheat

100 year later ……

Spike

Bread wheat

T. aestivum 2n=6x=42, AABBDD 2n=6x=42, with

crossed

be

Can Wild Emmer Wheat

T. 2n=4x=28, AABB, 2n=4x=28, In the last 50 years about 50% of the wild wheat natural populations in Israel were lost due to urban/industrial development Characterization of Wild Emmer Wheat Populations in Israel

Rosh Pinna Amirim Tabigha Nahef Achihood Yehudiya Ammiad Haifa Gamla Tiberias Yehudiya, April, 2005 Beit Oren Kohav Hayarden Bat Shlomo Mt. Gilbboa Agronomic traits Ma’ale Merav Yabad - Disease resistance Mediterranean sea Mediterranean Mediterranean sea Mediterranean Borj el Maliech Camp Brosh - Drought tolerance

Tel Aviv Ma’ale Efraim Gitit -Grain proteins Givat Koach Aman - Grain minerals (Fe, Zn) Taiyiba Kohav Hashahar - Salt tolerance Sanhedriya Jerusalem - Herbicide resistance

J’aba - Photosynthetic yield Biodiversity studies

Gene Bank of wild relatives of crop plants

● Wild emmer wheat, Triticum dicoccoides (from Israel, Turkey, Jordan) ● Wild barley, Hordeum spontaneum (from Israel, Turkey, and Iran) ● Wild wheat relative, Aegilops complex (170 populations of 11 species) ● Wild oat, Avena sterilis (1000 genotypes from Israel) ● Brachypodium, an important model cereal with a small genome ● Wild lettuce, Lactuca serriola (from Israel, USA, Europe; ~ 22 countries) ● Wild lettuce, Lactuca saligna (600 genotypes from Israel) ● Wild lettuce, Lactuca aculeata (180 genotypes from Israel, Turkey, Jordan)

The gene bank is used for basic research (genomics, population genetics and cytogenetics, domestication evolution) and enables coordinated and integrated sampling and conservation, and breeding utilization of wild germplasm. Wild Emmer Wheat Genetic Diversity

Agronomic traits Genetic Markers - Disease resistance -Allozymes - Drought tolerance - SSRs - Grain proteins - RAPDs - Salt tolerance - RGAs - Herbicide resistance -SNPs - Photosynthetic yield

=► Rich diversity for crop improvement Wheat Domestication Genomics

Study of genomic distribution of domestication QTLs in a cross T. durum × T. dicoccoides

- How many loci (QTLs) involved - How domestication QTLs are distributed

● among genomes and chromosomes ● within chromosomes ● linkage vs. pleiotropy - Characteristics of the QTLs Genetic dissection of complex traits

development genotype phenotype + markers environment observations (data)

“explaining” the phenotype QTL analysis ApplicationsApplications Genetics QTL of economic or medical importance Mol. Biol, Physiol., Fitness-related QTL Comp.Sci. Statistics Gene expression as molecular phenotype Appl. Math. About the map

Number of markers - 549 Chr 1A Chr 1B

Total map length - 3170 cM ● A genome - 1589 ● B genome - 1581 Average interval -9.9 cM ● A genome -11.1 ● B genome - 8.9 Marker clustering (χ2) ● A genome - 0.92 ns ● B genome - 19.97

Peng et al. 2000, Genome Res. 10:1509-31 Islands of negative interference in wheat chromosomes (e.g., 1B)

Negative interference seems to be not an exclusion (in wheat, barley, rye, Drosophila, and other species.

Together with marker reading errors, it may result in wrong marker orders Æ disagreement of genetic and physical maps Highly pleiotropic Domestication Syndrome Factors (DSF) in tetraploid wheat

DSF Chr. Involved QTL effects Interval Position (cM) DSF1 1B KNP, YLD, GWH, SNP, SWP Xgwm273a-Xgwm403a 68.6 ± 8.8 DSF2 1B KNP, KNS, KNL, YLD, SNP, SLS, SWP, SSW Xgwm124-P57M52u 133.9 ± 22.8 DSF3 2A KNP, KNS,YLD, HD, SNP, SWP, SNP, SWP Xgwm630c-Xgwm294 150.8 ± 31.4 DSF4 2A KNP, KNS, KNL, YLD, GWH, SLS, SSW, SWP Xgwm294-Br 216.8 ± 11.8 DSF5 3A KNP, KNS, KNL, YLD, SSW, SWP Xgwm218-Xgwm638 148.1 ± 28.4 DSF6 5A KNP, KNS, KNL, YLD, HT, SLS, SSW, SWP Xgwm154-P56M50m 72.5 ± 12.6 DSF7 5A all traits Xgwm186-P56M53c 144.2 ± 29.3

DSFs - domestication syndrome factors Summary of the mapping results

● Seven intervals of strong effects (DSFs) - domestication syndrome factors

● Six out of seven DSFs appear as linked pairs - some with same-sign QTL effects (e.g., on 5A), but most with opposite-sign effects (e.g., on 2A)

● AnAn excessexcess ofof detecteddetected QTLQTL effectseffects inin AA genomegenome

● Remarkably, DSFs coincide with gene-rich regions Mapping wheat domestication syndrome factors

at 1B Physical interval Marker cluster 1BS.sat18-0.50-1.00 XksuE19, Nor, 1BS.sat19-0.31-0.50 1BS.sat- XksuD14a, XksuD14b, 0.31 DSFs on 1B 1BS9-0.84-1.06 XksuD14c, Xcdo388, 1BS10-0.50-0.84 Xmwg36, Xmwg938, Xbcd1434 C-1BS10-0.50

168 ESTs* C1B C-1BL6-0.32 Nor

1BL6-0.32-0.47

1BL1-0.47-0.69 68 ESTs*

1BL2-0.69-0.85 Xbcd 508, Adh, Xbcd304, XksuG34, Xmwg710, Xbcd1562, Xwg241, 1BL3-0.85-1.00 XksuE11, Xabg387, XksuI27 Peng et al. 2003 *EST loci mapped as of July, 2003 Gill et al. 1996 PNAS 100: http://wheat.pw.usda.gov/NSF/proj Genetics 144: 2489-2494 ect/ mapping_data.html 1883-1891 Physical intervals

DSFs on 2A 2AS6-0.51-1.00

C-2AS6-0.51 Marker cluster

C2A Xpsr101, Xpsr102, Xpsr112, 2AL2-0.00 Xpsr388, XksuF15, XksuD22, XksuE16, XksuF2, XksuD8, Xbcd135 C-2AL1-0.85 305 ESTs* 104 ESTs*

XksuF41, XksuH16, 2AL1-0.85-1.00 XksuI24, Xwg645 Peng et al. * EST loci mapped as of 2003 Delaney et al. 2003 PNAS http://wheat.pw.usda.gov/NSF/ 2003 Theor Appl 100: 2489-94 project/mapping_data.html Genet 91: 568-73 Marker cluster Physical intervals XksuE2, XksuG13, XksuG53, 3AS4-0.45- XksuH7, XksuI32, Xpsr123, 1.00 Xpsr598, Xpsr698, Xpsr903, Xpsr902, Xpsr910, Xpsr926, 3AS2-0.23- Xglk724 0.45

C-3AS2- 183 ESTs* 0.23 C3A

C-3AL3-0.42 DSFs on 3A

3AL3-0.42- 0.78

3AL5-0.78- 1.00

Peng et al. * EST loci mapped as of 2003 Delaney et al. 2003 PNAS http://wheat.pw.usda.gov/NSF/ 2003 Theor Appl 100: 2489-94 project/mapping_data.html Genet 91: 780-2 Physical intervals

DSFs on 5A 5AS7/10-0.98-1.00 5AS3-0.75-0.98

5AS1-0.40-0.75

122 ESTs* C-5AS1-0.40

C5A

Marker clusters C-5AL12-0.35 XksuG7, Xbcdb351, Xbcd157, Xcdo412, XksuH1, Xksu8,Xtag614,

Xwg564, Xwg522, XksuQ45a 5AL12-0.35-0.57

Xcdo385, Xbcd1734, Xbcd1235a, Xbcd21, Xmwg604, Xmwg805, Xbcd9a, Xcdo388, Xabc168, Xmgb63, 5AL10-0.57-0.78

Xmgb1, Xmwg522, Xbcd183, Xcdo1326, and other 60 5AL17-0.78-0.87 markers 62 ESTs* 5AL23-0.87-1.00

Peng et al. * EST loci mapped as of 2003 Faris et al. 2000 2003 PNAS http://wheat.pw.usda.gov/NSF/ Genetics 154: 100: 2489-94 project/mapping_data.html 823-835 Summary of the mapping results

20

16

12

8 QTL effects 4

0 1 2 3 4 5 6 7 S 1 2 3 4 Single-QTL A 5 6 7 ← genom B Linked-QTL e → S+L Functional asymmetries in the tetraploid wheat genome

A B P1:1 QTL effects (FDR 5%) 37 16 0.004 Disease resistance genes 45 75 0.008 Mapped ESTs ×103 5.3 5.9 ns Mapped AFLP, % 40 60 0.007 Segregation-distorted loci 23 8 0.0004

Clustering of markers, χ2 0.92 19.9 <10-6 Genome asymmetry in population variation of wild wheat T. dicoccoides (Li et al. 2000, Mol. Biol. Evol. 17: 851-862)

32 7 Genome

28 6 24 5 A 20 •• 4 16

12 3 B SSR allele number •• Mean repeat number Mean repeat 8 2

Am T a Ye Am - Ammiad, Ta - Tabigha, Ye -Yehuddya Hypothesis

In the wild, B genome plays a more important role in adaptation of the allopolyploid to ecological conditions than A. Thus, during domestication the response to artificial selection was provided by allelic variation at A genome loci.

It is a testable hypothesis, but needs a high resolution of QTL analysis complemented by functional genomics and sequencing candidate genes Our new ISF project on wheat Domestication Genomics Evolutionary Cytogenetics

Genome in situ hybridization enables revealing and scoring evolutionary divergence of genomes as well as introgressions caused by interspecific recombination Some Applied Wheat Genomic Studies

• Software: Multi-Point, Multi-QTL and LTC - powerful tools for Genetic and Physical Mapping that enabled us to develop better maps of the complex wheat genome • Yr15, YrH52, PmG3M, Gpc-B1 and Yr36 - novel genes derived from wild emmer wheat • Gpc-B1 was the first QTL to be cloned in wheat • Yr36 (and Lr34) were the first quantitative resistance genes cloned in wheat • The cloning of these genes is emphasizing the importance of wild wheat gene pool for crop improvement Our mapping tools are distributed worldwide

USA, Canada, UK, France, Germany, Italy, Spain, Poland, Holland, Portugal, Czech Republic, Belgium, Switzerland, Russia, India, Japan, China, South Korea, Kenya, South Africa, Argentina, Australia, Israel Positional cloning of agriculturally important genes derived from wild emmer wheat

Gene Trait Type Location Current status

High grain protein Cloned Gpc-B1 and mineral QTL 6BS content Uauy et al. Science 2006 High-temperature Cloned stripe rust Yr36 QTL 6BS Fu et al. Science 2009 resistance

Stripe rust Major Yr15 1BS Physical mapping resistance gene

Stripe rust Fine mapping/ YrH52 Major 1BS resistance gene physical mapping

Powdery mildew Major PmG3M 6BL Fine mapping resistance gene Complexity of Positional Cloning in Wheat

Relative Genome Size 1. Genome size 17,000 Mbp Arabidopsis 145 Mbp (110×) Rice 430 Mbp (35×) Tomato 900 Mbp (16×) Human 3,500 Mbp (5×) 2. Polyploidy (Bikram Gill, personal comm.) Triticum aestivum (6x) AABBDD Triticum dicoccoides (4x) AABB

95% similarity between A, B & D genomes 3. Repetitive sequences More than 80% repetitive (mainly transposon elements) Wheat Genome is not sequenced (yet !!!) A QTL for Grain Protein Content was identified using a set of Chromosome Substitution Lines

A T. turgidum ssp. durum B Low GPC 1 2 3 4 5 6 7 LDN

A T. turgidum ssp. dicoccoides

B High GPC

1 2 3 4 5 6 7 Dic (Joppa et al., 1990) QTL mapping of Gpc-B1 on 6BS

2.6-cM Olmos et al, 2003 9 8 7 6 Exp. 1 5 Exp. 2 4 Exp. 3 3 2 LOD=3-8 1 PEV=71-87%

Xgwm508 NORXcdo365 Xucw67 Xucw66 Xgwm193

Wheat 6BS

XksuG8 Xbcd342 Xpsr8 Xcmwg652 10-cM Complete Physical Map of Gpc-B1

Rice chr. 2S physical map Xucw Xucw Xucw 79 EST1 EST2 EST3 EST4 EST5 71 85 5-kb

Linkage Map GPC region (chr 6BS wheat) Xucw791 Xucw84 1 Xucw71

Dic B1 Dic B2 Dic B3 Dic B4 Dic B5 Xucw89 Xucw86

≈ 250-kb

Distelfeld et al. (2006) New Phytologist Proposed model for Gpc-B1 effects

Earlier Senescence Degredation of cell components

Higher levels of soluble proteins and minerals available for remobilization

More efficient remobilization LDN DIC

Higher grain protein and mineral concentrations (N, Zn, Fe) QTL effects of GPC-B1 on: Protein, Zn and Fe concentrations in the grain

LDN DIC LDN DIC LDN DIC

140 70 45 130 65 120 40 110 60 35 100 55 90 30 Zn (mg x kg -1) 50 Fe (mg x kg -1) GPC(g x kg -1) 80 45 25 70 60 40 20 12 12 12

Year Year Year

GPC Zn Fe

Average increase over 2 years of: 18 % in GPC 11 % in Zinc 15 % in Iron

Distelfeld et al. 2007, In collaboration with I.Cakmak & H.Budak ImportanceImportance ofof GPC-B1 for wheat nutritional value

¾600 million tons of wheat per year Zinc deficiency in diet can affect a Produce more than range of functions: 60 million tons of protein/year • Immune system •Growth rate • Brain development • Sexual Reproduction QTL Mapping of Valuable Traits in Tetraploid Wheat

Gitit × LDN Mapping Population

(152 F6 Recombinant Inbred Lines)

152

Spike Morphology • Color Mineral Content • Length Drought tolerance • Seeds per spike • Shape • Weight • etc……. Genetic map based on 600 DNA markers

2300cM long, 7.5cM/Marker QTL map of grain nutrients 2A 2B 7A 7B 0.0 2.0 0.0

9.9 16.7 GPC

0.0 24.1 Ca

6.0 29.6 K 32.1 11.1 35.9 38.4 Mn 40.7 Fe

P 48.9 0.0 47.0 50.9 7.0 57.1 10.1

13.2 59.9 Cu 51.4 72.5 52.9 76.9 56.0 K Ca

Zn 79.6 Fe GPC 74.9 57.8 81.9

Mg 61.4 80.6 64.2 46.0 83.6 Cu Mg

47.6 Zn

102.7 Fe 82.4 S 50.1 95.8 51.5

88.4 Mn 54.6 116.5 S 110.4 P Fe

120.5 Zn Fe 112.0 71.3 Mg Zn 128.0 114.1 S GPC 109.0 77.8 Cu 118.4 91.1

K 150.4 131.7 P 141.2 159.8 104.2 146.7 142.8 166.2 153.3 168.9 GPC 170.8 163.4 176.0 124.9 162.8 182.9 172.1 129.8 173.7 170.4 137.0 138.1 182.3 181.3 188.8 Peleg et al. TAG (2009) Emmer wheat as a source for drought stress tolerance

Drought stress is a major challenge for several fields in plant biology. In this study we used a multidisciplinary approach: ‰ Population genetics ‰ QTL mapping ‰ Functional genomics Assisted by ‰ Plant physiology ‰ Metabolomics (a) (b)

SC SD RC RD SC SD RC RD (c) (d)

SC SD RC RD SC SD RC RD

(e) (f)

SC SD RC RD SC SD RC RD Transcriptome analysis to reveal candidate genes for drought stress tolerance in emmer wheat

Parallel Plot 1.5 R=resistant S=susceptible 1.0 D=drought W=well-watered

0.5 Expression patterns of 300 0.0 up-regulated transcripts -0.5 identified in the resistant -1.0 genotype

-1.5 S S R R StdLsmean geno*irri Y D StdLsmean geno*irri Y C StdLsmean geno*irri A D WStdLsmean geno*irri A C D R D Transcriptome Analysis of Response to Drought

Langdon Gitit 18-16 5B

0.0 Xgwm234

9.5 Xgwm443

19.8 X346957

Wild emmer wheat 34.2 Xgwm1180 Durum wheat 38.7 XwPt-1951 43.0 Xbarc128a 49.6 Xgwm1108 Flroll 54.8 Xgwm371 59.9 Xgwm831 62.0 Xgwm1191 66.5 Xgwm499

49 77.0 Xwmc415b

87.7 X348501 89.8 X374741 42 95.6 XwPt-3661 98.4 Xgwm1043 104.0 Xgwm777 LDN G 18-16 Candidate gene mapping 109.8 XwPt-0498 HI_w 35 50 transcripts 207 transcripts and QTL co-localization 121.3 XwPt-1733 128.2 Xgwm408 62K transcripts 133.2 XwPt-5896 OP 28 Chl d13C DH-M_d geno*irri = (G d)-(L d)

21 156.7 X306463 163.9 XwPt-6880

173.2 X310864 14 181.5 Xgwm1016b d13C 191.2 Xgwm814 -log10(p-value) for Diff of 7 202.4 X304255 208.6 X305419 209.8 XwPt-0837 0

-3 -2 -1 0 1 2 3

Diff of Geno*Irri= (G18-16 drought)-(LDN drought) Photosynthetic rateofdroughtresistantvs. drought susceptible genotypesunderdroughtstress

-2 -1 CO2 assimilation, μmol m s Intercellular CO 2 ppm A24-39 (S) Y12-3 (R) 30% diff GO enrichment analysis

Number of transcripts and Genome Biological process Cluster frequencies frequency of GO term Out of 837 Out of 514 56,985 used DETs - R DETs – S rice genes Response to stress 104, 12.4% 61, 11.9% 2516, 4.4% Response to endogenous 76, 9.1% n.s. 3963, 7.0% stimulus Response to abiotic 70, 8.4% 48, 9.3% 2308, 4.1% stimulus Response to biotic 47, 5.6% 33, 6.4% 2019, 3.5% stimulus Response to external 27, 3.2% 14, 2.7% 797, 1.4% stimulus Response to stimulus 180, 21% 121, 23.5% 6182, 10.8% Adaptive mechanisms of drought response identified in the resistant genotype

9Abscisic acid (ABA) dependent stomatal closure 9Cell wall adjustment 9Cuticular wax formation 9Lignification 9Osmoregulation and cell homeostasis 9Energy transfer 9Dehydration protection 9Maintained metabolism and catalytic activity 9Delayed senescence International effort for wheat genome physical mapping Chromosome 1BS - University of Haifa Physical mapping with BAC clones

Genomic library: Breaking the DNA, cloning the fragments in BACs, and ordering

Let us cut the isolated DNA with a restriction enzyme taken at a low concentration many sites will remain unrestricted

1 2 3 4 5 Cleavage site 6 1,...,6 Cloned DNA Fragments Æ BAC library BAC Fingerprinting by Fragment Separation

96 samples, 25 marker lanes Marker every fifth lane

Marra et al., Genome Res., 7, 1072-1084 (1997) Building a contig map

An ordered set of clones Genome mapping problems are computationally challenging

“… We have been looking at the assemblies of large genomes … and for every ‘draft’ genome we look at, we find hundreds - and sometimes thousands - of mis-assemblies”.

Salzberg & Yorke (2005) Beware of mis-assembled genomes. Bioinformatics, 21: 4320-4322 Network representation of significant clone overlaps

vertices correspond to clones and edges – to significant clone overlaps

PAG-19 2011 Putative Q-clones and Q-overlaps

Frenkel et al. 2010

PAG-19 2011 Network representation of significant clone overlaps

clones

clones from the selected diametric path (MTP)

wheat 1B

PAG-19 2011 Identification of contig non-linearity

PAG-19 2011 1BS assembly: FPC vs. LTC

In total 49,412 clones (av. size 113 kb), coverage 17.7

FPC LTC Contigs with ≥6 clones 517 385 Clones in contigs 33,262 33,912 Mean clones/contig 64.3 88.1 Clones in MTPs 3,647 3,827 Coverage by MTP 270 Mb (86%) 283 Mb (90%)

MTPs were constructed by LTC using e-25 cutoff Parallel genomes for 1BS 315 Mbp

38 cM 0 1Bs

Bd2 30 40 Mbp

R5 12 Mbp 0

Sb9 17 Mbp 0 1Hs 50 cM 0 Genomic Organization and Diversity of Stripe Rust Resistance Genes Derived from Wild Emmer Wheat

How many genes are involved? What is their genomic organization?

Theoretical

calculations Haifa Tiberias = Mediterrane an sea Mediterrane an sea

11 novel Yr Tel Aviv Aman genes Jerusalem Yr15 yr15 Prospects for association mapping in T. dicoccoides

Decay in LD between SNP markers across genome Prospects for association mapping in T. dicoccoides

Decay in LD within leaf rust resistance gene LR10 in Israeli populations of wild emmer

Sela et al. TAG 122:175–187, 2011 Participating groups

Tzion Fahima, Tamar Krugman, Abraham Korol, Eviatar Nevo, Vladimir Frenkel, Yuchun Li, Junhua Peng, Olga Raskina, Alexander Belyaev, Hanan Sela, Assaf Distelfeld, Dina Raats, Institute of Evolution, University of Haifa, Israel

Jorge Dubcovsky, Cristobal Uauy, Lynn Epstein, Department of Plant Science, University of Calofornia, Davis

Yehoshua Saranga, Zvi Peleg, Asaf Avneri, Institute of Plant Science & Genetics, Hebrew University of Jerusalem, Israel Boulos Chalhoub, Veronique Chague, Sandrine Balzergue, Nathalie Boudet, Organization and Evolution of Plant Genomes, Unité de Recherche en Génomique Végétale, Evry, France

Etienne Paux, Catherine Feuillet, INRA, Amélioration et Santé des Plantes, Clermont-Ferrand, France

Aaron Fait, Lydia Quansah, Ben Gurion University of the Negev, Israel

Marion Röder, Institute of Plant Genetics & in Crop Plant Research (IPK), Gatersleben, Germany Благодарю за терпение

Please, visit our website: http://evolution.haifa.ac.il