BREEDWHEAT 2011 - 2020

Collection of major results

CONTENTS

Introduction
1 Participation in bread wheat sequencing and development of molecular tools
2 Better characterizing genetic resources
3 Developing ecophysiological models and phenotyping methods to understand how wheat functions
4 Quantifying tolerance to biotic and abiotic constraints for yield
5 Understanding storage protein synthesis to improve grain quality
6 New breeding methods based on high-throughput genotyping and phenotyping
7 Managing and exploiting the project data effectively

INTRODUCTION


We are now in December 2020, and the BreedWheat project is coming to an end. This collection summarizes the many advances made in different fields over these nine years. In this introduction we simply recall the objectives set when the project was written in 2010 to respond collectively to the challenges of global change, and of climate change in particular.

« Studies performed in BreedWheat and supported by high throughput phenotyping and genotyping capabilities will (i) combine genetic, genomic, and ecophysiology analyses with high throughput phenotyping and genotyping to perform association studies and identify markers and candidate genes for yield and quality traits under abiotic and biotic stress, and (ii) develop innovative tools, methods, and plant material to optimize the utilization of genetic resources and breed for new wheat varieties with improved quality, sustainability, and productivity. BreedWheat will also strengthen French leadership in the international effort currently underway to obtain a bread wheat genome reference sequence within the IWGSC and position our scientists to exploit the sequence rapidly and efficiently to increase the characterization and use of genetic diversity as well as to extend the benefit of comparative genomics studies. »

The diversity and originality of the tools and results generated by BreedWheat show that these objectives have been fully achieved. From all the outputs of BreedWheat, including the 50 scientific articles published to date, the following materials have been selected to illustrate the major advances. They constitute a core collection, a « legacy » intended for the project's internal audiences and for scientists. It will be adapted and popularized for the project's many other audiences.

To make this collection easier to read, we have structured the document in seven sections, although there are many links between the different parts.

To find out more

External version: You can consult the full text of the documents presented here using the link that accompanies each of them, and visit the website www.breedwheat.fr

Internal version: You can also access the complete list of the documents presented here in the private area of the BreedWheat project: www.sites.inra.fr/site/breedwheat

The BreedWheat communication group

Page 4

SPECIAL REPORT: « Investissements d'avenir » programme

BREEDWHEAT PROJECT: BREAD WHEAT VARIETIES for today and tomorrow

Étienne Paux ♦ Jérémy Derory ♦ Stéphane Lafarge ♦ Gilles Charmet ♦ Jacques Le Gouis, with contributions from Marion Bondoux, Jean-Pierre Cohan, Michael Alaux, Geoffrey Perchet and the BreedWheat consortium

BreedWheat is a collaborative project that aims to develop innovative tools for breeding and characterizing wheat varieties adapted to current and future constraints. Here is a review of nine years of achievements.

Sequencing the bread wheat genome required an international effort of more than thirteen years. © M. Moquet - ARVALIS-Institut du végétal

September 2020 - No. 480 of Perspectives Agricoles

Page 5

1 Participation in bread wheat sequencing and development of molecular tools

Page 7: Shifting the limits in wheat research and breeding using a fully annotated reference genome. International Wheat Genome Sequencing Consortium (IWGSC) - Science, 2018

Page 8: Linking the International Wheat Genome Sequencing Consortium bread wheat reference genome sequence to wheat genetic and phenomic data. Michael Alaux, Jane Rogers, Thomas Letellier, Raphaël Flores, Françoise Alfama, Cyril Pommier, Nacer Mohellibi, Sophie Durand, Erik Kimmel, Célia Michotey, Claire Guerche, Mikaël Loaec, Mathilde Lainé, Delphine Steinbach, Frédéric Choulet, Hélène Rimbert, Philippe Leroy, Nicolas Guilhot, Jérôme Salse, Catherine Feuillet, International Wheat Genome Sequencing Consortium, Etienne Paux, Kellye Eversole, Anne-Françoise Adam-Blondon and Hadi Quesneville - Genome Biology, 2018

Page 9: Séquençage du génome du blé : un grand pas pour la science, un petit pas pour l'agriculture (Sequencing the wheat genome: a giant leap for science, a small step for agriculture). Etienne Paux - Rencontre filière semences céréales & protéagineux - 2019

Page 10: High throughput SNP discovery and genotyping in hexaploid wheat. Hélène Rimbert, Benoît Darrier, Julien Navarro, Jonathan Kitt, Frédéric Choulet, Magalie Leveugle, Jorge Duarte, Nathalie Rivière, Kellye Eversole on behalf of the International Wheat Genome Sequencing Consortium, Jacques Le Gouis on behalf of the BreedWheat Consortium, Alessandro Davassi, François Balfourier, Marie-Christine Le Paslier, Aurélie Berard, Dominique Brunel, Catherine Feuillet, Charles Poncet, Pierre Sourdille, Etienne Paux - PLoS ONE, 2018

RESEARCH ARTICLE SUMMARY

WHEAT GENOME

Shifting the limits in wheat research and breeding using a fully annotated reference genome

International Wheat Genome Sequencing Consortium (IWGSC)*

INTRODUCTION: Wheat (Triticum aestivum L.) is the most widely cultivated crop on Earth, contributing about a fifth of the total calories consumed by humans. Consequently, wheat yields and production affect the global economy, and failed harvests can lead to social unrest. Breeders continuously strive to develop improved varieties by fine-tuning genetically complex yield and end-use quality parameters while maintaining stable yields and adapting the crop to regionally specific biotic and abiotic stresses.

RATIONALE: Breeding efforts are limited by insufficient knowledge and understanding of wheat biology and the molecular basis of central agronomic traits. To meet the demands of human population growth, there is an urgent need for wheat research and breeding to accelerate genetic gain as well as to increase and protect wheat yield and quality traits. In other plant and animal species, access to a fully annotated and ordered genome sequence, including regulatory sequences and genome-diversity information, has promoted the development of systematic and more time-efficient approaches for the selection and understanding of important traits. Wheat has lagged behind, primarily owing to the challenges of assembling a genome that is more than five times as large as the human genome, polyploid, and complex, containing more than 85% repetitive DNA. To provide a foundation for improvement through molecular breeding, in 2005, the International Wheat Genome Sequencing Consortium set out to deliver a high-quality annotated reference genome sequence of bread wheat.

RESULTS: An annotated reference sequence representing the hexaploid bread wheat genome in the form of 21 chromosome-like sequence assemblies has now been delivered, giving access to 107,891 high-confidence genes, including their genomic context of regulatory sequences. This assembly enabled the discovery of tissue- and developmental stage–related gene coexpression networks using a transcriptome atlas representing all stages of wheat development. The dynamics of change in complex gene families involved in environmental adaptation and end-use quality were revealed at subgenome resolution and contextualized to known agronomic single-gene or quantitative trait loci. Aspects of the future value of the annotated assembly for molecular breeding and research were exemplarily illustrated by resolving the genetic basis of a quantitative trait locus conferring resistance to abiotic stress and insect damage as well as by serving as the basis for genome editing of the flowering-time trait.

CONCLUSION: This annotated reference sequence of wheat is a resource that can now drive disruptive innovation in wheat improvement, as this community resource establishes the foundation for accelerating wheat research and application through improved understanding of wheat biology and genomics-assisted breeding. Importantly, the bioinformatics capacity developed for model-organism genomes will facilitate a better understanding of the wheat genome as a result of the high-quality chromosome-based genome assembly. By necessity, breeders work with the genome at the whole chromosome level, as each new cross involves the modification of genome-wide gene networks that control the expression of complex traits such as yield. With the annotated and ordered reference genome sequence in place, researchers and breeders can now easily access sequence-level information to precisely define the necessary changes in the genomes for breeding programs. This will be realized through the implementation of new DNA marker platforms and targeted breeding technologies, including genome editing.

[Figure: wheat plant with grain, spike, leaves and roots labelled (left panel; scale bar 5 cm) and circular plot of the genome with concentric heatmap tracks (right panel; colour scale from Min to Max).] Wheat genome deciphered, assembled, and ordered. Seeds, or grains, are what counts with respect to wheat yields (left panel), but all parts of the plant contribute to crop performance. With complete access to the ordered sequence of all 21 wheat chromosomes, the context of regulatory sequences, and the interaction network of expressed genes (all shown here as a circular plot, right panel, with concentric tracks for diverse aspects of wheat genome composition), breeders and researchers now have the ability to rewrite the story of wheat crop improvement. Details on value ranges underlying the concentric heatmaps of the right panel are provided in the full article online.

The list of author affiliations is available in the full article online.
*Corresponding authors: Rudi Appels (rudi.appels@unimelb.edu.au); Kellye Eversole; Nils Stein
Cite this article as International Wheat Genome Sequencing Consortium, Science 361, eaar7191 (2018). DOI: 10.1126/science.aar7191
Read the full article at http://dx.doi.org/10.1126/science.aar7191

International Wheat Genome Sequencing Consortium (IWGSC), Science 361, 661 (2018) 17 August 2018 1 of 1


Page 7

Alaux et al. Genome Biology (2018) 19:111 https://doi.org/10.1186/s13059-018-1491-4

DATABASE Open Access

Linking the International Wheat Genome Sequencing Consortium bread wheat reference genome sequence to wheat genetic and phenomic data

Michael Alaux1*, Jane Rogers2, Thomas Letellier1, Raphaël Flores1, Françoise Alfama1, Cyril Pommier1, Nacer Mohellibi1, Sophie Durand1, Erik Kimmel1, Célia Michotey1, Claire Guerche1, Mikaël Loaec1, Mathilde Lainé1, Delphine Steinbach1,4, Frédéric Choulet3, Hélène Rimbert3, Philippe Leroy3, Nicolas Guilhot3, Jérôme Salse3, Catherine Feuillet3,5, International Wheat Genome Sequencing Consortium6, Etienne Paux3, Kellye Eversole7, Anne-Françoise Adam-Blondon1 and Hadi Quesneville1

Abstract

The Wheat@URGI portal has been developed to provide the international community of researchers and breeders with access to the bread wheat reference genome sequence produced by the International Wheat Genome Sequencing Consortium. Genome browsers, BLAST, and InterMine tools have been established for in-depth exploration of the genome sequence together with additional linked datasets including physical maps, sequence variations, gene expression, and genetic and phenomic data from other international collaborative projects already stored in the GnpIS information system. The portal provides enhanced search and browser features that will facilitate the deployment of the latest genomics resources in wheat improvement.

Keywords: Data integration, Information system, Big data, Wheat genomics, genetics and phenomics

* Correspondence: 1URGI, INRA, Université Paris-Saclay, 78026 Versailles,
Full list of author information is available at the end of the article

Background

The International Wheat Genome Sequencing Consortium (IWGSC) [1] is an international collaborative group of growers, academic scientists, and public and private breeders that was established to generate a high-quality reference genome sequence of the hexaploid bread wheat, and to provide breeders with state-of-the-art tools for wheat improvement. The vision of the consortium is that the high-quality, annotated ordered genome sequence integrated with physical maps will serve as a foundation for the accelerated development of improved varieties and will empower all aspects of basic and applied wheat science to address the important challenge of food security. A first analysis of the reference sequence produced by the consortium (IWGSC RefSeq v1.0) was recently published [2].

To ensure that wheat breeding and research programs can make the most of this extensive genomic resource, the IWGSC endorsed the establishment of a data repository at URGI (Unité de Recherche Génomique Info/research unit in genomics and bioinformatics) from INRA (Institut National de la Recherche Agronomique/French national institute for agricultural research) to develop databases and browsers with relevant links to public data available worldwide. The IWGSC data repository is thus hosted by URGI to support public and private parties in data management as well as analysis and usage of the sequence data. Wheat functional genomics (expression, methylation, etc.), genetic, and phenomic data have increased concurrently, requiring the development of additional tools and resources to integrate different data for biologists and breeders. To manage this escalation of data, URGI has built this data repository for the wheat community with the following specific aims: (1) to store resources for which no public archive exists (e.g. physical maps,

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Page 8

Page 9

RESEARCH ARTICLE

High throughput SNP discovery and genotyping in hexaploid wheat

Hélène Rimbert1☯, Benoît Darrier1☯, Julien Navarro1, Jonathan Kitt1, Frédéric Choulet1, Magalie Leveugle2, Jorge Duarte2, Nathalie Rivière2¤a, Kellye Eversole3, on behalf of The International Wheat Genome Sequencing Consortium¶, Jacques Le Gouis4, on behalf of The BreedWheat Consortium¶, Alessandro Davassi5, François Balfourier1, Marie-Christine Le Paslier6, Aurélie Berard6, Dominique Brunel6, Catherine Feuillet1¤b, Charles Poncet1, Pierre Sourdille1, Etienne Paux1

1 GDEC, INRA, Université Clermont Auvergne, Clermont-Ferrand, France, 2 Biogemma, Chappes, France, 3 IWGSC, Eversole Associates, Bethesda, Maryland, United States of America, 4 BreedWheat, Clermont-Ferrand, France, 5 Affymetrix, High Wycombe, United Kingdom, 6 EPGV US 1279, INRA, CEA, IG-CNG, Université Paris-Saclay, Evry, France

☯ These authors contributed equally to this work.
¤a Current address: Limagrain, Chappes, France
¤b Current address: CropScience, Morrisville, North Carolina, United States of America
¶ Membership of the International Wheat Genome Sequencing Consortium and the BreedWheat Consortium can be found in the Acknowledgments.

OPEN ACCESS

Citation: Rimbert H, Darrier B, Navarro J, Kitt J, Choulet F, Leveugle M, et al. (2018) High throughput SNP discovery and genotyping in hexaploid wheat. PLoS ONE 13(1): e0186329. https://doi.org/10.1371/journal.pone.0186329

Editor: Aimin Zhang, Institute of Genetics and Developmental Biology Chinese Academy of Sciences, CHINA

Received: May 9, 2017
Accepted: September 13, 2017
Published: January 2, 2018

Copyright: © 2018 Rimbert et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Because of their abundance and their amenability to high-throughput genotyping techniques, Single Nucleotide Polymorphisms (SNPs) are powerful tools for efficient genetics and genomics studies, including characterization of genetic resources, genome-wide association studies and genomic selection. In wheat, most of the previous SNP discovery initiatives targeted the coding fraction, leaving almost 98% of the wheat genome largely unexploited. Here we report on the use of whole-genome resequencing data from eight wheat lines to mine for SNPs in the genic, the repetitive and non-repetitive intergenic fractions of the wheat genome. Eventually, we identified 3.3 million SNPs, 49% being located on the B-genome, 41% on the A-genome and 10% on the D-genome. We also describe the development of the TaBW280K high-throughput genotyping array containing 280,226 SNPs. Performance of this chip was examined by genotyping a set of 96 wheat accessions representing the worldwide diversity. Sixty-nine percent of the SNPs can be efficiently scored, half of them showing a diploid-like clustering. The TaBW280K was proven to be a very efficient tool for diversity analyses, as well as for breeding as it can discriminate between closely related elite varieties. Finally, the TaBW280K array was used to genotype a population derived from a cross between Chinese Spring and Renan, leading to the construction of a dense genetic map comprising 83,721 markers. The results described here will provide the wheat community with powerful tools for both basic and applied research.

Data Availability Statement: Sequences from Renan, Robigus, Premio and Xi19 have been deposited on the EMBL European Nucleotide Archive (PRJEB16737). Sequences from Yitpi, Xiaoyan, Volcani and Westonia can be accessed at the following address www.bioplatforms.com/wheat-sequencing. The list of 3,289,847 intervarietal SNPs with fraction (ISBP or low-copy sequence), chromosomal origin, corresponding IWGSC contig, and context sequence can be downloaded either at https://figshare.com/articles/Supplemental_Table_S1_zip/5501329 or at https://

PLOS ONE | https://doi.org/10.1371/journal.pone.0186329 January 2, 2018 1 / 19
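As a quick check on the orders of magnitude, the rounded subgenome shares quoted in the abstract can be turned back into approximate SNP counts. The short Python sketch below does only that arithmetic. The variable and function names are illustrative and do not come from the article, and because the published percentages are rounded, the resulting counts are approximations rather than the authors' exact figures.

```python
# Figures quoted in the abstract and data availability statement.
TOTAL_SNPS = 3_289_847                              # intervarietal SNPs
SUBGENOME_PERCENT = {"A": 41, "B": 49, "D": 10}     # rounded shares
ARRAY_SNPS = 280_226                                # SNPs on the TaBW280K array
SCORABLE_PERCENT = 69                               # "efficiently scored"


def approx_count(total, percent):
    """Approximate count implied by a rounded percentage of a total."""
    return round(total * percent / 100)


# Approximate number of SNPs per subgenome (illustrative helper, not from the paper).
per_subgenome = {
    name: approx_count(TOTAL_SNPS, pct) for name, pct in SUBGENOME_PERCENT.items()
}
# Approximate number of array markers that can be scored efficiently.
scorable_on_array = approx_count(ARRAY_SNPS, SCORABLE_PERCENT)

for name, count in per_subgenome.items():
    print(f"{name}-genome: about {count:,} SNPs")
print(f"Scorable array markers: about {scorable_on_array:,}")
```

On these figures the B-genome carries roughly 1.6 million of the SNPs, against about 1.3 million for the A-genome and 0.3 million for the D-genome, and about 193,000 array markers fall in the efficiently scored class.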

Page 10

2 Better characterizing genetic resources

Page 12: Characterization and exploitation of genetic diversity - BreedWheat consortium booklet, 2014

Page 13: Worldwide phylogeography and history of wheat genetic diversity. F. Balfourier, S. Bouchet, S. Robert, R. De Oliveira, H. Rimbert, J. Kitt, F. Choulet, International Wheat Genome Sequencing Consortium, BreedWheat Consortium, E. Paux, 2019

Page 12

SCIENCE ADVANCES | RESEARCH ARTICLE

GENETICS

Worldwide phylogeography and history of wheat genetic diversity

François Balfourier1*†, Sophie Bouchet1†, Sandra Robert1, Romain De Oliveira1, Hélène Rimbert1, Jonathan Kitt1, Frédéric Choulet1, International Wheat Genome Sequencing Consortium2‡, BreedWheat Consortium3‡, Etienne Paux1*

1GDEC, INRA, Université Clermont Auvergne, 5 chemin de Beaulieu, 63000 Clermont-Ferrand, France. 2IWGSC, 2841 NE Marywood Ct, Lee's Summit, MO 64086, USA. 3BreedWheat, 5 chemin de Beaulieu, 63000 Clermont-Ferrand, France.
*Corresponding authors: F.B.; E.P. (etienne.paux@inra.fr)
†These authors contributed equally to this work.
‡List of members can be found in the Supplementary Materials.

Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).

Since its domestication in the Fertile Crescent ~8000 to 10,000 years ago, wheat has undergone a complex history of spread, adaptation, and selection. To get better insights into the wheat phylogeography and genetic diversity, we describe allele distribution through time using a set of 4506 landraces and cultivars originating from 105 different countries genotyped with a high-density single-nucleotide polymorphism array. Although the genetic structure of landraces is collinear to ancient human migration roads, we observe a reshuffling through time, related to breeding programs, with the appearance of new alleles enriched with structural variations that may be the signature of introgressions from wild relatives after 1960.

INTRODUCTION

Bread wheat (Triticum aestivum L.) is an allohexaploid species originating from two successive rounds of hybridization. The second hybridization event is thought to have occurred in the Fertile Crescent during the Neolithic, ~8000 to 10,000 years ago (1, 2). Then, bread wheat germplasm has evolved along ancient human migration roads. It has been spread by the first farmers from this area both westward, to Europe, and eastward, to Asia, from 8500 to 2300 before the present (3). After their dissemination in Europe and Asia, domesticated wheat populations have adapted to local environments, becoming so-called landraces. From the 16th century, bread wheat was introduced in the New World, first in Latin America and then in Northern America and Australia (3). During the past two centuries, breeding programs were organized in Europe and Asia to improve these landraces. Last, after the Second World War, the introduction of dwarf genes in crops during the Green Revolution, particularly in wheat, contributed to marked modifications in the gene pool over the world (4). Today, with more than 220 million hectares and almost 750 metric megatons produced every year, wheat is one of the most cultivated and consumed crops worldwide, providing 15% of calories consumed every day. Since the transition from hunting-gathering to agriculture, bread wheat has been essential to the rise of civilizations. It has repeatedly been shaped by selection to meet human needs and adaptation to different environments. Here, we report on a worldwide phylogeographical study aiming at understanding this complex history of wheat dissemination and differentiation.

RESULTS AND DISCUSSION

Defining haplotypic blocks in the wheat genome

Since previous studies highlighted the importance of both geographical and temporal effects in structuring wheat diversity (5), a set of 4506 wheat accessions was sampled that represented the worldwide diversity (data file S1). These accessions were categorized in sets that were relevant in terms of agricultural practices: landraces corresponding to the original pool of worldwide diversity, traditional cultivars registered before the Green Revolution and the global introduction of dwarf genes (1960), and modern varieties registered after 1960. Following genotyping on a high-density single-nucleotide polymorphism (SNP) array containing 280,226 SNPs (6), a set of 113,457 high-quality SNPs showing less than 2% missing data was selected (data file S2). This dataset comprised 99,333 polymorphic high-resolution (PHR) biallelic SNPs and 14,124 off-target variants (OTVs), i.e., markers that detect both nucleotide polymorphisms and presence-absence variations. The genomic position of these SNPs was determined using the International Wheat Genome Sequencing Consortium (IWGSC) RefSeq v1.0 (7). The distribution of markers on the homoeologous genomes was consistent with previous studies (40% on the A-genome, 48% on the B-genome, and 12% on the D-genome) (8, 9).

However, as chip-designed markers can lead to ascertainment bias due to the marker type and selection (10), we inferred haplotype blocks and corresponding alleles along the genome (Fig. 1 and data file S3). We parsed the bread wheat genome in 8741 sizable regions over which there was little evidence for historical recombination and within which only a few common haplotypes were observed (data file S3). Mean similarities between accessions were 0.69 (SD = 0.07), 0.70 (SD = 0.05), and 0.49 (SD = 0.07) using 113,457 SNPs, 58,602 pruned SNPs, or 8741 haplotypes, respectively. Similarities calculated with haplotypes were much lower on average. The difference between similarities calculated with haplotypes or SNPs was low for individuals that were very similar, intermediate when individuals were very different, and high when individuals were moderately different. The smaller similarities using haplotypes may be due to the higher number of alleles, particularly rare ones. It may allow us to reveal recent differentiations that are not tractable with SNPs. The number of haplotype blocks was highly correlated with the overall number of SNPs per chromosome (Pearson correlation coefficient R = 0.79, P = 2 × 10⁻⁵). The average number of haplotypes per block was 4, ranging from 2 to 20. The median size of haplotype blocks was 105 kb, with 85% of the blocks being shorter than 1 Mb. The mean size was 863 ± 4595 kb. This huge SD reflects the structural partitioning of wheat chromosomes. The size was

Balfourier et al., Sci. Adv. 2019; 5 : eaav0536 29 May 2019 1 of 10
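The drop in similarity when moving from SNPs to haplotypes, reported in the excerpt above, can be illustrated on toy data. The sketch below is a minimal illustration, not the authors' pipeline: it assumes a simple-matching similarity (the fraction of markers with identical alleles) and fixed three-SNP blocks, whereas this excerpt specifies neither the similarity measure used nor how the 8741 blocks were inferred. All names and genotype values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def matching_similarity(a, b):
    """Fraction of markers at which two accessions carry the same allele."""
    if len(a) != len(b) or not a:
        raise ValueError("accessions must share the same non-empty marker set")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_similarity(genotypes):
    """Mean similarity over all pairs of accessions (dict: name -> alleles)."""
    return mean(
        matching_similarity(genotypes[p], genotypes[q])
        for p, q in combinations(sorted(genotypes), 2)
    )


def to_haplotypes(alleles, block_size):
    """Recode consecutive SNPs as one multi-allelic marker per block (toy rule)."""
    return [
        tuple(alleles[i:i + block_size])
        for i in range(0, len(alleles), block_size)
    ]


# Hypothetical biallelic SNP calls (0/1) for three accessions, six SNPs each.
snps = {
    "acc1": [0, 0, 1, 1, 0, 0],
    "acc2": [0, 0, 1, 1, 1, 0],
    "acc3": [1, 0, 1, 0, 1, 0],
}
# Two blocks of three SNPs; a single differing SNP makes the whole block differ.
haplotypes = {name: to_haplotypes(calls, 3) for name, calls in snps.items()}

snp_similarity = mean_pairwise_similarity(snps)
haplotype_similarity = mean_pairwise_similarity(haplotypes)
print(f"SNP-based: {snp_similarity:.2f}, haplotype-based: {haplotype_similarity:.2f}")
```

Because one mismatching SNP is enough to make two haplotype alleles differ, the haplotype-based similarity in this toy set is lower than the SNP-based one, which is the direction of the effect described in the text.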

Page 13

3 Developing ecophysiological models and phenotyping methods to understand how wheat functions

Page 15: CN-Wheat, a functional–structural model of carbon and nitrogen metabolism in wheat culms after anthesis. II. Model evaluation. Romain Barillot, Camille Chambon and Bruno Andrieu - Annals of Botany, 2016

Page 16: A method to estimate plant density and plant spacing heterogeneity: application to wheat crops. Shouyang Liu, Frédéric Baret, Denis Allard, Xiuliang Jin, Bruno Andrieu, Philippe Burger, Matthieu Hemmerlé and Alexis Comar - Plant Methods, 2017

Page 17: Bridging the gap between ideotype and genotype: Challenges and prospects for modelling as exemplified by the case of adapting wheat (Triticum aestivum L.) phenology to climate change in France. David Gouache, Matthieu Bogard, Marie Pegard, Stéphanie Thepot, Cécile Garcia, Delphine Hourcade, Etienne Paux, François-Xavier Oury, Michel Rousset, Jean-Charles Deswarte, Xavier Le Bris - Field Crops Research, 2017

Page 18: Management and characterization of using crop growth model stress via PhenoField®, a high throughput field phenotyping platform. K. Beauchene, F. Leroy, A. Fournier, C. Huet, M. Bonnefoy, J. Lorgeou, B. De Solan, B. Piquemal, S. Thomas and J-P Cohan, 2019

Annals of Botany Page 1 of 17 doi:10.1093/aob/mcw144, available online at www.aob.oxfordjournals.org

CN-Wheat, a functional–structural model of carbon and nitrogen metabolism in wheat culms after anthesis. II. Model evaluation

Romain Barillot, Camille Chambon and Bruno Andrieu*

UMR ECOSYS, INRA, AgroParisTech, Université Paris-Saclay, 78850 Thiverval-Grignon, France

*For correspondence.

Received: 26 January 2016 Returned for revision: 28 May 2016 Accepted: 1 June 2016

• Background and Aims Simulating resource allocation in crops requires an integrated view of plant functioning and the formalization of interactions between carbon (C) and nitrogen (N) metabolisms. This study evaluates the functional–structural model CN-Wheat developed for winter wheat after anthesis.

• Methods In CN-Wheat the acquisition and allocation of resources between photosynthetic organs, roots and grains are emergent properties of sink and source activities and transfers of mobile metabolites. CN-Wheat was calibrated for field plants under three N fertilizations at anthesis. Model parameters were taken from the literature or calibrated on the experimental data.

• Key Results The model was able to predict the temporal variations and the distribution of resources in the culm. Thus, CN-Wheat accurately predicted the post-anthesis kinetics of dry masses and N content of photosynthetic organs and grains in response to N fertilization. In our simulations, when soil nitrates were non-limiting, N in grains was ultimately determined by availability of C for root activity. Dry matter accumulation in grains was mostly affected by photosynthetic organ lifespan, which was regulated by protein turnover and C-regulated root activity.

• Conclusions The present study illustrates that the hypotheses implemented in the model were able to predict realistic dynamics and spatial patterns of C and N. CN-Wheat provided insights into the interplay of C and N metabolism and how the depletion of mobile metabolites due to grain filling ultimately results in the cessation of resource capture. This enabled us to identify processes that limit grain mass and protein content and are potential targets for plant breeding.

Key words: Amino acids, carbon, cytokinins, fructans, process-based functional–structural plant model, nitrogen, proteins, plant metabolism and physiology, sink–source relations, sucrose, Triticum aestivum, wheat.

INTRODUCTION

For monocarpic species, such as winter wheat (Triticum aestivum), the post-anthesis period is a crucial stage when complex interactions occur among vegetative organs and the growing grains. Aerial vegetative organs are the main source of carbon (C) due to their photosynthetic activity, in particular laminae and chaffs (Araus et al., 1993), and also through the constitution of reserve pools, e.g. fructans accumulated in stems (Schnyder, 1993). Besides, phloem C is necessary to maintain root activity, in particular regarding nitrate uptake (Simpson et al., 1983; Thornley and Cannell, 2000). Post-anthesis nitrate uptake can contribute 5–40 % of final grain nitrogen (N) according to environmental conditions and genotypes (Kichey et al., 2007; Bogard et al., 2010). Nevertheless, the mechanisms involved in the decrease of uptake capacity after anthesis (Oscarson et al., 1995) and the signal of N satiety (Taulemesse et al., 2015) still constitute areas of current research. Whole-plant behaviour results from trade-offs in the allocation of resources between roots, leaves, stems and grains. These trade-offs raise some crucial issues for plant scientists. How is the allocation of resources among shoot and roots regulated? This plastic trait is involved in the ability of plants to compete for above- and below-ground resources. What determines the balance between the maintenance of vegetative activity (photosynthesis and N uptake) and the growth of reproductive organs (grain filling)? On one hand, the availability of C and N for grain growth determines their biomass (crop production) and protein content. On the other hand, C is also needed to maintain root activity, which will affect the amount of N in photosynthetic organs, thus determining their activity and thereby sugar synthesis. This trade-off is under environmental control but also depends on the genotype, as illustrated by the stay-green behaviour of some crop cultivars. These are characterized by their ability to maintain significant areas of vegetative tissues that are still photosynthetic at maturity (Benbella and Paulsen, 1998; Borrell et al., 2001), but the underlying mechanisms remain unclear.

The trade-offs cited above result from the balance among several physiological processes, and the genetic determinants regulating these processes constitute crucial targets for agronomists and breeders. The diversity of the underlying processes and their genotypic and environmental regulation make it difficult to associate genetic parameters with specific plant traits. It therefore appears that an integrated view of plant functioning is necessary to unravel the partitioning rules of resources, their environmental variability and their impact on measurable traits (White et al., 2015). Functional–structural plant models that account for interactions between plant structure, functioning and environment (Godin and Sinoquet, 2005; DeJong et al., 2011)

© The Author 2016. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.

Page 15 Liu et al. Plant Methods (2017) 13:38 DOI 10.1186/s13007-017-0187-1

METHODOLOGY (Open Access)

A method to estimate plant density and plant spacing heterogeneity: application to wheat crops

Shouyang Liu1*, Fred Baret1, Denis Allard2, Xiuliang Jin1, Bruno Andrieu3, Philippe Burger4, Matthieu Hemmerlé5 and Alexis Comar5

Abstract

Background: Plant density and its non-uniformity drive the competition among plants as well as with weeds. They therefore need to be estimated with small uncertainties. An optimal sampling method is proposed to estimate the plant density in wheat crops from plant counting and reach a given precision.

Results: Three experiments were conducted in 2014, resulting in 14 plots across varied sowing densities, cultivars and environmental conditions. The coordinates of the plants along the row were measured on high-resolution RGB images taken from ground level. Results show that the spacings between consecutive plants along the row direction are independent and follow a gamma distribution under the varied conditions experienced. A gamma count model was then derived to define the optimal sample size required to estimate plant density for a given precision. Results suggest that measuring the length of segments containing 90 plants will achieve a precision better than 10%, independently of the plant density. This approach appears more efficient than the usual method based on fixed-length segments in which the number of plants is counted, because for that method the optimal length for a given precision on the density estimate depends on the actual plant density. The gamma count model parameters may also be used to quantify the heterogeneity of plant spacing along the row by exploiting the variability between replicated samples. Results show that to achieve a 10% precision on the estimates of the 2 parameters of the gamma model, 200 elementary samples, each corresponding to the spacing between 2 consecutive plants, should be measured.

Conclusions: This method provides an optimal sampling strategy to estimate the plant density and quantify the plant spacing heterogeneity along the row.

Keywords: Wheat, Gamma-count model, Density, RGB imagery, Sampling strategy, Plant spacing heterogeneity
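The sampling argument in the abstract can be explored numerically. The sketch below is not the authors' gamma-count derivation. It is a small Monte Carlo illustration, assuming independent gamma-distributed spacings, of why the relative spread of a density estimate (number of plants divided by the measured segment length) depends on the number of plants in the segment and not on the mean spacing. The function names, the shape and scale values and the use of the coefficient of variation (CV) as the precision measure are assumptions made for illustration; the paper's own definition of precision may differ.

```python
import random
import statistics


def density_cv(shape, scale, n_plants, n_rep=2000, seed=0):
    """Empirical CV of the density estimate n_plants / segment_length.

    Each replicate measures the length of a row segment made of n_plants
    consecutive spacings drawn from Gamma(shape, scale).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        length = sum(rng.gammavariate(shape, scale) for _ in range(n_plants))
        estimates.append(n_plants / length)
    return statistics.stdev(estimates) / statistics.mean(estimates)


def approx_cv(shape, n_plants):
    """Large-sample approximation of the same CV.

    The segment length is Gamma(n_plants * shape, scale), whose CV is
    1 / sqrt(n_plants * shape); the scale (mean spacing) cancels out.
    """
    return (n_plants * shape) ** -0.5


if __name__ == "__main__":
    # Hypothetical shape and scale, chosen only for the demonstration.
    for n in (30, 90, 200):
        print(n, round(density_cv(2.0, 1.5, n), 3), round(approx_cv(2.0, n), 3))
```

Because the scale parameter cancels, a rule expressed as a number of plants per segment transfers across sowing densities, whereas a rule expressed as a fixed segment length does not.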

Background

Plant density at emergence is governed by the sowing density and the emergence rate. For a given plant density, the uniformity of plant distribution at emergence may significantly impact the competition among plants as well as with weeds [1, 2]. Plant density and uniformity is therefore a key factor explaining production, although a number of species are able to compensate for low plant densities by a comparatively significant development of individual plants during the growth cycle. For wheat crops which are largely cultivated over the globe, tillering is one of the main mechanisms used by the plant to adapt its development to the available resources that are partly controlled by the number of tillers per unit area. The tillering coefficient therefore appears as an important trait to be measured. It is usually computed as the ratio of the number of tillers per unit area divided by the plant density [3]. Plant density is therefore one of the first variables measured commonly in most agronomical trials.

Crops are generally sown in rows approximately evenly spaced by seedling devices. Precision seedling systems mostly used for crops with plants spaced on the row by

*Correspondence: 1 INRA, UMR-EMMAH, UMT-CAPTE, UAPV, 228 Route de l’aérodrome CS 40509, 84914 Avignon, France

© The Author(s) 2017. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Page 16 Field Crops Research 202 (2017) 108–121


Bridging the gap between ideotype and genotype: Challenges and prospects for modelling as exemplified by the case of adapting wheat (Triticum aestivum L.) phenology to climate change in France

David Gouache a,b,*, Matthieu Bogard a,d, Marie Pegard b,1, Stéphanie Thepot c,2, Cécile Garcia c, Delphine Hourcade d, Etienne Paux e,f, François-Xavier Oury e,f, Michel Rousset g, Jean-Charles Deswarte c, Xavier Le Bris h

a ARVALIS Institut du Végétal, Rue de Noetzlin, Bat. 630, F-91405 Orsay, France
b ARVALIS Institut du Végétal, Station Expérimentale, F-91720 Boigneville, France
c ARVALIS Institut du Végétal, Route de Châteaufort ZA des Graviers, F-91190 Villiers le Bâcle, France
d ARVALIS Institut du Végétal, 6, Chemin de la Côte Vieille, F-31450 Baziège, France
e INRA, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 5 Chemin de Beaulieu, F-63039 Clermont-Ferrand, France
f Université Blaise Pascal, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, F-63177 Aubière Cedex, France
g INRA, UMR 0320/UMR 8120 Génétique Végétale, F-91190 Gif-sur-Yvette, France
h ARVALIS Institut du Végétal, Station Expérimentale de La Jaillière, F-44370 La Chapelle Saint Sauveur, France

Article history: Received 19 June 2015. Received in revised form 1 December 2015. Accepted 18 December 2015. Available online 13 January 2016.

Keywords: Wheat, Earliness, Stem extension, Marker-based model, Plasticity, Photoperiod sensitivity

Abstract

Simulations using crop models can assist in designing ideotypes for current and future agricultural conditions. This approach consists of running simulations for different “in silico genotypes” obtained by varying the most sensitive genotypic parameters of these models, and analyzing results obtained for different environments, so as to identify the best genotypes for a target population of environments. However, this approach has rarely been used to guide commercial breeding programs so far. In this paper, we attempt to address some of the gaps yet to be filled before this kind of approach can be implemented, and identify some remaining issues that should be addressed in future research. Our focus is on optimizing wheat phenology, integrating simulations from a modified version of the ARCWHEAT model of wheat growth stages with available knowledge on the genetic control of wheat phenology obtained via molecular markers. Based on simulations, stem extension could be advanced by 10 days in 2025–2049 without increasing frost risks, thus opening up opportunities for lengthening the rapid growth period. Analysis of the current genetic variability for major phenology genes in French elite varieties showed that the insensitive PpdD1–spring Vrn3 allele combination appears undesirable and that current genotypes with early stem extension are unstable (i.e. they show a strong response to temperature and can start stem extension very early in case of mild winter temperatures). We finally use a case study on gene-based modelling of wheat phenology in France to illustrate how it can be used to dissect the genetic basis of the quantitative nature of the three components of earliness, beyond the effects of major genes. We identify the need to link the variability for optimized model parameters and the allelic variations at the gene level as a critical step of this type of approach.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Wheat is one of the world’s key staple crops (Reynolds et al., 2009), and has suffered worrisome negative impacts of climate change in many regions worldwide, including high-yield potential breadbaskets of Europe such as France (Brisson et al., 2010) and both higher and lower-yielding areas in developing countries (Lobell et al., 2005; Ortiz et al., 2008; Liu et al., 2014). These existing trends are projected to continue, with a reported 6% worldwide reduction in production per degree Celsius of

Abbreviation: RMSEP, root mean square error of prediction.
* Corresponding author (D. Gouache) at: ARVALIS Institut du Végétal, Station Expérimentale, 91720 Boigneville, France.
1 Present address: INRA, UR0588 Amélioration Génétique et Physiologie Forestières, F-45075 Orléans, France.
2 Present address: Bayer CropScience, Ferme Paly, F-91490 Milly-la-Forêt, France.
http://dx.doi.org/10.1016/j.fcr.2015.12.012

Page 17 ORIGINAL RESEARCH published: 16 July 2019 doi: 10.3389/fpls.2019.00904

Management and Characterization of Abiotic Stress via PhénoField®, a High-Throughput Field Phenotyping Platform

Katia Beauchêne1*, Fabien Leroy1, Antoine Fournier1, Céline Huet1, Michel Bonnefoy1, Josiane Lorgeou2, Benoît de Solan2, Benoît Piquemal2, Samuel Thomas2 and Jean-Pierre Cohan3

1 ARVALIS – Institut du Végétal, Ouzouer-le-Marché, France, 2 ARVALIS – Institut du Végétal, Boigneville, France, 3 ARVALIS – Institut du Végétal, La Chapelle-Saint-Sauveur, France

In order to evaluate the impact of water deficit in field conditions, researchers or breeders must set up large experiment networks in very restrictive field environments. Experience shows that half of the field trials are not relevant because of climatic conditions that do not allow the stress scenario to be tested. The PhénoField® platform is the first field-based infrastructure in the European Union to ensure protection against rainfall for a large number of plots, coupled with the non-invasive acquisition of crops’ phenotype. In this paper, we will highlight the PhénoField® production capability using data from the 2017 wheat trial. The innovative approach of the PhénoField® platform consists of the use of automatic irrigating rainout shelters coupled with high-throughput field phenotyping to complete conventional phenotyping and densified micrometeorological measurements. Firstly, to test various abiotic stresses, automatic mobile rainout shelters allow fine management of fertilization or irrigation by controlling daily the intensity and period of application of the desired limiting factor on the evaluated crop. This management is based on micrometeorological measurements coupled with a simulation of a carbon, water and nitrogen crop budget. Furthermore, as high-throughput plant phenotyping under controlled conditions is well advanced, comparable evaluation in field conditions is enabled through phenotyping gantries equipped with various optical sensors. This approach, giving access to either similar or innovative variables compared with manual measurements, is moreover distinguished by its capacity for dynamic analysis. Thus, the interactions between genotypes and the environment can be deciphered and better detailed since this gives access not only to the environmental data but also to plant responses to limiting hydric and nitrogen conditions. Further data analyses provide access to the curve parameters of various indicator kinetics, which are all the more integrative and relevant to plant behavior under stressful conditions. All these specificities of the PhénoField® platform open the way to the improvement of various categories of crop models, the fine characterization of variety behavior throughout the growth cycle and the evaluation of particular sensors better suited to a specific research question.

Keywords: field phenotyping, drought tolerance, high throughput, rainout shelters, remote sensors

Edited by: Yanbo Huang, United States Department of Agriculture (USDA), United States
Reviewed by: Sebastien Christian Carpentier, Bioversity International, Belgium; Quan Qiu, Beijing Research Center of Intelligent Equipment for Agriculture, China
*Correspondence: Katia Beauchêne
Specialty section: This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science
Received: 28 September 2018. Accepted: 26 June 2019. Published: 16 July 2019.
Citation: Beauchêne K, Leroy F, Fournier A, Huet C, Bonnefoy M, Lorgeou J, de Solan B, Piquemal B, Thomas S and Cohan J-P (2019) Management and Characterization of Abiotic Stress via PhénoField®, a High-Throughput Field Phenotyping Platform. Front. Plant Sci. 10:904. doi: 10.3389/fpls.2019.00904


Page 18

4. Quantifying tolerance to biotic and abiotic constraints for yield

Page 20 Breeding for bread wheat adaptation to abiotic and biotic stress. BreedWheat Consortium booklet, 2015
Page 21 Coexpression network and phenotypic analysis identify metabolic pathways associated with the effect of warming on grain yield components in wheat. Christine Girousse, Jane Roche, Claire Guerin, Jacques Le Gouis, Sandrine Balzegue, Said Mouzeyar, Mohamed Fouad Bouzidi. PLoS ONE, 2018
Page 22 Using environmental clustering to identify specific drought tolerance QTLs in bread wheat (T. aestivum L.). G. Touzy, R. Rincent, M. Bogard, S. Lafarge, P. Dubreuil, A. Mini, J-C. Deswarte, K. Beauchêne, J. Le Gouis and S. Praud, 2019
Page 23 Phenotypic characterization and identification of genomic regions associated with FHB resistance in the French elite bread wheat varieties. Lasserre-Zuber P, Saintenac C, Serre F, Soudière O, Joannin M, Roche S, Desray P, Tourvieille D, Langin T, 2015
Page 24 Improving Septoria tritici blotch (STB) resistance in wheat by genome wide association studies. Guylaine Hebert, Stéphane Lafarge, Gaëtan Touzy, Cyrille Saintenac, Thierry Marcel, Etienne Paux, Jacques Le Gouis, Sébastien Praud, The BreedWheat Consortium, 2017

Page 20 RESEARCH ARTICLE Coexpression network and phenotypic analysis identify metabolic pathways associated with the effect of warming on grain yield components in wheat

Christine Girousse1, Jane Roche1, Claire Guerin1, Jacques Le Gouis1, Sandrine Balzegue2¤, Said Mouzeyar1, Mohamed Fouad Bouzidi1*

1 GDEC, Université Clermont Auvergne, INRA, Clermont-Ferrand, France
2 Transcriptomic Platform of iPS2, INRA, Evry, France
¤ Current address: IRHS, INRA, AGROCAMPUS-Ouest, Université d’Angers, SFR 4207 QUASAV, Beaucouzé, France

Abstract

Wheat grains are an important source of human food but current production amounts cannot meet world needs. Environmental conditions such as high temperature (above 30˚C) could affect wheat production negatively. Plants from two wheat genotypes have been subjected to two growth temperature regimes. One set has been grown at an optimum daily mean temperature of 19˚C while the second set of plants has been subjected to warming at 27˚C from two to 13 days after anthesis (daa). While warming did not affect mean grain number per spike, it significantly reduced other yield-related indicators such as grain width, length, volume and maximal cell numbers in the endosperm. Whole genome expression analysis identified 6,258 and 5,220 genes, respectively, whose expression was affected by temperature in the two genotypes. Co-expression analysis using WGCNA (Weighted Gene Coexpression Network Analysis) uncovered modules (groups of co-expressed genes) associated with agronomic traits. In particular, modules enriched in genes related to nutrient reservoir and endopeptidase inhibitor activities were found to be positively associated with cell numbers in the endosperm. A hypothetical model pertaining to the effects of warming on gene expression and growth in wheat grain is proposed. Under moderately high temperature conditions, network analyses suggest a negative effect of the expression of genes related to seed storage proteins and starch biosynthesis on the grain size in wheat.

OPEN ACCESS
Citation: Girousse C, Roche J, Guerin C, Le Gouis J, Balzegue S, Mouzeyar S, et al. (2018) Coexpression network and phenotypic analysis identify metabolic pathways associated with the effect of warming on grain yield components in wheat. PLoS ONE 13(6): e0199434. https://doi.org/10.1371/journal.pone.0199434
Editor: Aimin Zhang, Institute of Genetics and Developmental Biology Chinese Academy of Sciences, CHINA
Received: February 15, 2018. Accepted: June 7, 2018. Published: June 25, 2018.
Copyright: © 2018 Girousse et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
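The co-expression step named in the abstract (WGCNA) starts from a weighted network in which the connection strength between two genes is a power of the absolute correlation of their expression profiles. The sketch below illustrates only that first step, the unsigned soft-threshold adjacency a_ij = |cor(x_i, x_j)|^beta. It is not the authors' pipeline: WGCNA itself is an R package, the later module-detection steps are omitted, and the gene names, expression values and the choice beta = 6 are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(i, j)| ** beta for every gene pair."""
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


if __name__ == "__main__":
    # Hypothetical expression values over four samples.
    toy = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [1.0, 3.0, 2.0, 4.0],
    }
    for pair, weight in soft_threshold_adjacency(toy).items():
        print(pair, round(weight, 4))
```

Raising the correlation to a power keeps strong links close to 1 and pushes weak ones towards 0 without imposing a hard cut-off, which is why groups of tightly co-expressed genes stand out as modules in the later clustering steps.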

Data Availability Statement: All relevant data are within the paper and its Supporting Information files.
Funding: This work was supported by the BreedWheat project (ANR-10-BTBR-0003).
Competing interests: The authors have declared that no competing interests exist.

Introduction

According to the fifth Assessment Report released by the Intergovernmental Panel on Climate Change (http://www.ipcc.ch/report/ar5/syr/), global surface temperature change, for the end of the 21st century, is projected to exceed 1.5˚C to 2˚C depending on the RCP scenario (Representative Concentration Pathways). The rise in the average global temperature will be


Page 21 Theoretical and Applied Genetics (2019) 132:2859–2880 https://doi.org/10.1007/s00122-019-03393-2

ORIGINAL ARTICLE

Using environmental clustering to identify specific drought tolerance QTLs in bread wheat (T. aestivum L.)

Gaëtan Touzy1,2 · Renaud Rincent3 · Matthieu Bogard4 · Stephane Lafarge2 · Pierre Dubreuil2 · Agathe Mini3 · Jean-Charles Deswarte5 · Katia Beauchêne6 · Jacques Le Gouis3 · Sébastien Praud2

Received: 8 October 2018 / Accepted: 6 July 2019 / Published online: 19 July 2019 © Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract

Key message: Environmental clustering helps to identify QTLs associated with grain yield in different water stress scenarios. These QTLs could be useful for breeders to improve grain yields and increase genetic resilience in marginal environments.

Abstract: Drought is one of the main abiotic stresses limiting winter bread wheat growth and productivity around the world. The acquisition of new high-yielding and stress-tolerant varieties is therefore necessary and requires improved understanding of the physiological and genetic bases of drought resistance. A panel of 210 elite European varieties was evaluated in 35 field trials. Grain yield and its components were scored in each trial. A crop model was then run with detailed climatic data and soil water status to assess the dynamics of water stress in each environment. Varieties were registered from 1992 to 2011, allowing us to test genetic progress over time. Finally, a genome-wide association study (GWAS) was carried out using genotyping data from a 280 K SNP chip. The crop model simulation allowed us to group the environments into four water stress scenarios: an optimal condition with no water stress, a post-anthesis water stress, a moderate-anthesis water stress and a high pre-anthesis water stress. Compared to the optimal water condition, grain yield losses in the stressed conditions were 3.3%, 12.4% and 31.2%, respectively. This environmental clustering improved understanding of the effect of drought on grain yields and explained 20% of the G × E interaction. The greatest genetic progress was obtained in the optimal condition, mostly represented in France. The GWAS identified several QTLs, some of which were specific to the different water stress patterns. Our results make breeding for improved drought resistance to specific environmental scenarios easier and will facilitate genetic progress in future environments, i.e., water stress environments.

Abbreviations
G × E: Genotype-by-environment
QTL: Quantitative trait loci
ETs: Environmental types
GWAS: Genome-wide association study
OPT: Optimal condition
MET: Multi-environment trials
LWD: Late water deficit
MWD: Medium water deficit
HWD: High water deficit
PH: Plant height
HD: Heading date
SA: Spikes per area

Communicated by Albrecht E. Melchinger.

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s00122-019-03393-2) contains supplementary material, which is available to authorized users.

* Corresponding author: Sébastien Praud
1 Arvalis-Institut du végétal, Biopôle Clermont Limagne, 63360 Saint-Beauzire, France
2 Centre de recherche de Chappes, Biogemma, Route d’Ennezat CS90216, 63720 Chappes, France
3 INRA, UCA UMR 1095, Génétique, Diversité et Ecophysiologie des Céréales, 24 Avenue des Landais, 63177 Aubière Cedex, France
4 Arvalis-Institut du végétal, 6 Chemin de la côte vieille, 31450 Baziège, France
5 Arvalis-Institut du végétal, Route de Châteaufort, ZA des graviers, 91190 Villiers-le-Bâcle, France
6 Arvalis-Institut du végétal, 45 voie Romaine, Ouzouer Le Marché, 41240 Beauce La Romaine, France


Page 22 PHENOTYPIC CHARACTERIZATION AND IDENTIFICATION OF GENOMIC REGIONS ASSOCIATED WITH FHB RESISTANCE IN THE FRENCH ELITE BREAD WHEAT VARIETIES Lasserre-Zuber P.1,2, Saintenac C.1,2, Serre F.3, Soudière O.1,2, Joannin M.3, Roche S.3, Desray P. 3, Tourvieille D.3, Langin T.1,2

1 INRA, UMR 1095, Genetics, Diversity and Ecophysiology of Cereals, F 63100 Clermont-Ferrand, France
2 UBP, UMR 1095, Genetics, Diversity and Ecophysiology of Cereals, F 63100 Clermont-Ferrand, France
3 INRA, UE 1375, PHénotypage Au Champ des Céréales, F 63100 Clermont-Ferrand, France

Introduction

Funded by the French Stimulus Initiative, the BreedWheat project aims to provide sustainable solutions for breeding safe, high-quality wheat. While yield losses related to Fusarium head blight (FHB) average 0.2 t/ha/year in France, grain contamination by mycotoxins may reach a significant percentage of the production. To evaluate the level of FHB resistance of 220 bread wheat cultivars, a 3-year field trial network was set up at four locations in France. This study focuses on the field trial carried out at INRA Clermont-Ferrand and inoculated with a pathogenic, mycotoxigenic isolate of Fusarium graminearum. A precise phenotypic characterization of the cultivars and a genome-wide association study (GWAS) identified major wheat genomic regions involved in resistance.

Material & Methods

Material and field trial design
➢ 220 elite European cultivars (73% from France, 8% from UK and 5% from Italy)
➢ Artificial inoculation of 3 randomized replicates (in blocks) with an isolate of F. graminearum

Phenotyping
➢ Disease severity (1-9 scale) and incidence (% of diseased spikes) at ~350 and ~450°C.dpi
➢ Fusarium damaged kernels (FDK) (%), relative grain weight (RGW: infected/non-infected ratio)
➢ Disease severity and incidence at 400°C.dpi were calculated from scores at ~350 and ~450°C.dpi for a more accurate comparison between cultivars

Genotyping
➢ SNP genotyping (TaBW420K 240k SNP array, Axiom technology)
➢ 133 487 polymorphic SNPs, 92k SNPs suitable for GWAS, 57k genetically mapped

Statistical analyses
➢ LD, structuration and kinship analyses were performed by Biogemma (R statistics software)
➢ Heritability: mixed model with genotype as random effect: trait (corrected for block effects) ~ genotype + replicates + error
➢ Adjustment of phenotypic means: ANOVA model: trait ~ genotype + block + replicate in block + error
➢ Correlation analyses on adjusted means for all phenotypic traits
➢ GWAS on adjusted means using a polygenic model with kinship information: trait ~ SNPmarker + polygenic effects + error

Table 1: Genotypic broad sense heritability.

TRAIT                  HERITABILITY
SEVERITY_350°C.dpi     0.86
SEVERITY_450°C.dpi     0.92
SEVERITY_400°C.dpi     0.92
INCIDENCE_350°C.dpi    0.89
INCIDENCE_450°C.dpi    0.90
INCIDENCE_400°C.dpi    0.91
FDK                    0.91
RELATIVE_WEIGHT        0.80

Results

Figure 1: Trial heatmaps of raw data (a), residuals of the ANOVA model (b) and adjusted means following the ANOVA model (c) for the trait FDK. Abscissa and ordinate indicate positions in the field.

➢ Estimations of heritability were high, indicating the reliability of the method. (Table 1)

➢ An adjustment of phenotypic means for block and replicate effects was performed by trait for each cultivar.
➢ Residual trial heatmaps by trait indicated the absence of other environmental effects. (Figure 1)
➢ Correlation coefficients were high between:
   - Severity and incidence at 400°C.dpi (0.82)
   - FDK and incidence at 400°C.dpi (0.7)
   - RGW and FDK (-0.73) (Figure 2)

Figure 2: Trait data distributions, X-Y plots and correlation matrix between phenotypic traits based on adjusted means.

➢ Preliminary GWAS results showed several significant QTLs for severity at 400°C.dpi (8 QTLs), incidence at 400°C.dpi (7 QTLs) and FDK (13 QTLs). (Figure 3)
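The heritabilities in Table 1 come from a mixed model with genotype as a random effect. As a rough illustration of what such a number measures, the sketch below computes an entry-mean broad-sense heritability, H² = σ²g / (σ²g + σ²e / r), from a balanced genotype-by-replicate layout using ANOVA mean squares. This is one common estimator and not necessarily the one fitted for the poster. It assumes block effects have already been removed, and the function name and example values are hypothetical.

```python
def broad_sense_heritability(plots):
    """Entry-mean heritability from a balanced one-way layout.

    `plots` maps each genotype to its list of replicate values.
    Variance components are estimated by the method of moments.
    """
    g = len(plots)
    r = len(next(iter(plots.values())))
    assert all(len(v) == r for v in plots.values()), "balanced design expected"

    means = {k: sum(v) / r for k, v in plots.items()}
    grand = sum(means.values()) / g

    # Mean squares between genotypes and within genotypes (error).
    ms_geno = r * sum((m - grand) ** 2 for m in means.values()) / (g - 1)
    ms_err = sum((y - means[k]) ** 2 for k, v in plots.items() for y in v) / (g * (r - 1))

    var_g = max((ms_geno - ms_err) / r, 0.0)  # genotypic variance, floored at 0
    return var_g / (var_g + ms_err / r)


if __name__ == "__main__":
    # Hypothetical FDK-like scores for three cultivars, three replicates each.
    demo = {"cv1": [10.0, 12.0, 11.0], "cv2": [30.0, 28.0, 29.0], "cv3": [20.0, 21.0, 19.0]}
    print(round(broad_sense_heritability(demo), 3))
```

Values close to 1, as in Table 1, mean that differences between cultivar means are large compared with the noise between replicates of the same cultivar.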

Conclusion

➢ The assessment of complementary phenotypic traits gave a precise description of the FHB resistance levels present in this panel, mostly consisting of elite French wheat varieties.

➢ Preliminary GWAS results showed the presence of 28 QTLs, some of which may be new sources of resistance.

➢ The associated SNP markers will contribute to more accurate breeding for increased and more durable FHB resistance.

Figure 3: Manhattan plot of chromosome 7A for the trait FDK (x-axis: map position, cM).

Acknowledgments: To Jacques Le Gouis, Renaud Rincent, and Delphine Ly for their advice. This study was possible thanks to the French Stimulus Initiative financing.

Page 23 Improving Septoria tritici Blotch (STB) resistance in Wheat by Genome Wide Association Studies

Guylaine Hebert, Stéphane Lafarge, Gaëtan Touzy, Cyrille Saintenac, Thierry Marcel, Etienne Paux, Jacques Le Gouis, Sébastien Praud The BreedWheat Consortium

BreedWheat: French research project on wheat, 2011-2020

• 15 public research laboratories (INRA, Universities…)
• 1 technical institute (Arvalis)
• 10 breeding companies
• 1 competitiveness cluster (Céréales Vallée)

[Map of partner locations: Versailles-Grignon, Angers-Nantes, Auvergne-Rhône-Alpes, PACA, Bordeaux, Toulouse]

Page 24

5. Understanding storage protein synthesis to improve grain quality

Page 26 Genome-wide identification of chromosomal regions determining nitrogen use efficiency components in wheat. Jacques Le Gouis, Fabien Cormier, Agathe Mini, Katia Beauchêne, David Gouache, Sébastien Praud, Stéphane Lafarge, 2015
Page 27 Proteomic data integration highlights central actors involved in einkorn (Triticum monococcum) grain filling in relation to seed storage protein composition. Bancel E, Bonnot T, Davanture M, Alvarez D, Zivy M, Martre P, Ravel C, 2019
Page 28 The bZIP transcription factor SPA heterodimerizing protein represses glutenin synthesis in Triticum aestivum. Julie Boudet, Marielle Merlino, Anne Plessis, Jean-Charles Gaudin, Mireille Dardevet, Sibille Perrochon, David Alvarez, Thierry Risacher, Pierre Martre, Catherine Ravel. Plant Journal, 2018

Genome-wide identification of chromosomal regions determining nitrogen use efficiency components in wheat

Jacques Le Gouis, Fabien Cormier, Agathe Mini, Katia Beauchêne, David Gouache, Sébastien Praud, Stéphane Lafarge

9th International Wheat Conference, Sydney, 20-25 September 2015

What are the N traits to select for?

Traits taken into account in French national registration trials

• GPD = Grain Protein Deviation
  – Proposed by Monaghan et al (2001)
  – Bonus for GPD+ lines during registration in France (2007)
  – About 30 lines awarded the bonus since 2007

• NUE = Nitrogen Use Efficiency?
  – First official trials at 3 N levels harvested in 2013
  – Identification of potential criteria after two experimental years
  – Bonus for « efficient » lines by 2016?

[Figure: grain protein content versus grain yield (R ≈ -0.8), with GPD+ and GPD- lines marked]

[Figure: grain yield (t/ha) and grain protein concentration (%) versus N applied (0, 80, 140, 200 and 260 kg N/ha)]
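Grain protein deviation is the residual of a line from the regression of grain protein concentration on grain yield, so a GPD+ line carries more protein than its yield would predict. The sketch below computes that residual with an ordinary least-squares line. The function name and the example values are hypothetical, and the registration trials may use a different regression set-up (for example, one regression per environment).

```python
def grain_protein_deviation(yields, proteins):
    """Residuals of the least-squares line protein = intercept + slope * yield.

    A positive value is a GPD+ line, a negative value a GPD- line.
    """
    n = len(yields)
    mean_x = sum(yields) / n
    mean_y = sum(proteins) / n
    sxx = sum((x - mean_x) ** 2 for x in yields)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(yields, proteins))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(yields, proteins)]


if __name__ == "__main__":
    # Hypothetical lines: grain yield (t/ha) and grain protein concentration (%).
    grain_yield = [6.0, 8.0, 10.0, 8.0]
    protein = [13.0, 12.0, 11.0, 13.0]
    for gy, gpd in zip(grain_yield, grain_protein_deviation(grain_yield, protein)):
        print(gy, round(gpd, 2))
```

Working on the residual rather than on raw protein content avoids penalising high-yielding lines for the negative yield-protein relationship shown on the slide.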


Page 26 ORIGINAL RESEARCH published: 04 July 2019 doi: 10.3389/fpls.2019.00832

Proteomic Data Integration Highlights Central Actors Involved in Einkorn (Triticum monococcum ssp. monococcum) Grain Filling in Relation to Grain Storage Protein Composition

Emmanuelle Bancel1,2*, Titouan Bonnot1,2†, Marlène Davanture3, David Alvarez1,2, Michel Zivy3, Pierre Martre1,2†, Sébastien Déjean4 and Catherine Ravel1,2

1 UMR GDEC, Institut National de la Recherche Agronomique (INRA), Université Clermont Auvergne, Clermont-Ferrand, France, 2 UMR1095, Genetics Diversity and Ecophysiology of Cereals, Clermont Auvergne University, Clermont-Ferrand, France, 3 UMR GQE, Institut National de la Recherche Agronomique (INRA), Centre National de la Recherche Scientifique (CNRS), Agro ParisTech, Université Paris-Sud – Université Paris-Saclay, Gif-sur-Yvette, France, 4 Institut de Mathématiques de Toulouse, UMR5219 Université de Toulouse, Centre National de la Recherche Scientifique (CNRS), Toulouse, France

Albumins and globulins (AGs) of wheat endosperm represent about 20% of total grain proteins. Some of these physiologically active proteins can influence the synthesis of storage proteins (SPs) (gliadins and glutenins) and, consequently, the rheological properties of wheat flour and processing. To identify such AGs, data (published by Bonnot et al., 2017) on the abundance of 352 AGs and of the different seed SPs during grain filling and in response to different nitrogen (N) and sulfur (S) supply were integrated with the mixOmics R package. Relationships between AGs and SPs were first unraveled using the unsupervised method sparse Partial Least Square, also known as Projection to Latent Structure (sPLS). Then, data were integrated using a supervised approach taking into account the nutrition and the grain developmental stage. We used the block.splsda procedure, also referred to as DIABLO (Data Integration Analysis for Biomarker discovery using Latent variable approaches for Omics studies). These approaches led to the identification of discriminant and highly correlated features from the two datasets (AGs and SPs) which are not necessarily differentially expressed during seed development or in response to N or S supply. Eighteen AGs were correlated with the quantity of SPs per grain. A statistical validation of these proteins by genetic association analysis confirmed that 5 out of this AG set were robust candidate proteins able to modulate the seed SP synthesis. In conclusion, this latter result confirmed that the integrative strategy is an adequate way to reduce the number of potentially relevant AGs for further functional validation.

Keywords: albumin-globulin, data integration, nitrogen, proteomic, grain development, storage protein, sulfur, Triticum monococcum

Edited by: Antonio Masi, University of Padua, Italy
Reviewed by: Venkatesh Periyakavanam Thirumalaikumar, Max Planck Institute of Molecular Plant Physiology, Germany; Carlos Alberto Labate, University of São Paulo, Brazil
*Correspondence: Emmanuelle Bancel
†Present address: Titouan Bonnot, Department of Botany and Plant Sciences, University of California, Riverside, Riverside, CA, United States; Pierre Martre, LEPSE, INRA, Montpellier SupAgro, Université de Montpellier, Montpellier, France
Specialty section: This article was submitted to Plant Proteomics, a section of the journal Frontiers in Plant Science
Received: 15 February 2019. Accepted: 07 June 2019. Published: 04 July 2019.
Citation: Bancel E, Bonnot T, Davanture M, Alvarez D, Zivy M, Martre P, Déjean S and Ravel C (2019) Proteomic Data Integration Highlights Central Actors Involved in Einkorn (Triticum monococcum ssp. monococcum) Grain Filling in Relation to Grain Storage Protein Composition. Front. Plant Sci. 10:832. doi: 10.3389/fpls.2019.00832

Frontiers in Plant Science | www.frontiersin.org 1 July 2019 | Volume 10 | Article 832

Page 27

The Plant Journal (2019) doi: 10.1111/tpj.14163

The bZIP transcription factor SPA Heterodimerizing Protein represses glutenin synthesis in Triticum aestivum

Julie Boudet1,*,**, Marielle Merlino1,**, Anne Plessis1,†, Jean-Charles Gaudin2,‡, Mireille Dardevet1, Sibille Perrochon1, David Alvarez1, Thierry Risacher3,§, Pierre Martre1,¶ and Catherine Ravel1

1 UMR GDEC, INRA, Clermont Auvergne University, 63000, Clermont-Ferrand, France, 2 UR BIA, INRA, 44316, Nantes, France, and 3 Biogemma, Centre de Recherche de Chappes, 63720, Chappes, France

Received 6 March 2018; accepted 31 October 2018. *For correspondence (e-mail [email protected]). **These authors contributed equally to this manuscript. †Present address: School of Biological Sciences, Plymouth University, Drake Circus, Plymouth, PL4 8AA, UK. ‡Present address: UR AGPF, INRA, 45075, Orléans, France. §Present address: Limagrain Europe, 63720, Chappes, France. ¶Present address: UMR LEPSE, INRA, Montpellier SupAgro, 34060, Montpellier, France.

SUMMARY

The quality of wheat grain is mainly determined by the quantity and composition of its grain storage proteins (GSPs). Grain storage proteins consist of low- and high-molecular-weight glutenins (LMW-GS and HMW-GS, respectively) and gliadins. The synthesis of these proteins is essentially regulated at the transcriptional level and by the availability of nitrogen and sulfur. The regulation network has been extensively studied in barley where BLZ1 and BLZ2, members of the basic leucine zipper (bZIP) family, activate the synthesis of hordeins. To date, in wheat, only the ortholog of BLZ2, Storage Protein Activator (SPA), has been identified as playing a major role in the regulation of GSP synthesis. Here, the ortholog of BLZ1, named SPA Heterodimerizing Protein (SHP), was identified and its involvement in the transcriptional regulation of the genes coding for GSPs was analyzed. In gel mobility shift assays, SHP binds cis-motifs known to bind to bZIP family transcription factors in HMW-GS and LMW-GS promoters. Moreover, we showed by transient expression assays in wheat endosperm that SHP acts as a repressor of the activity of these gene promoters. This result was confirmed in transgenic lines overexpressing SHP, which were grown with low and high nitrogen supply. The phenotype of SHP-overexpressing lines showed a lower quantity of both LMW-GS and HMW-GS, while the quantity of gliadin was unchanged, whatever the nitrogen availability. Thus, the gliadin/glutenin ratio was increased, which suggests that gliadin and glutenin genes may be differently regulated.

Keywords: bZIP transcription factor, gluten, SPA Heterodimerizing Protein, storage proteins, wheat (Triticum aestivum L.).

INTRODUCTION

In cereal grains, nitrogen and sulfur, which are needed to sustain embryo germination and early seedling development, are mainly stored in the grain storage proteins (GSPs) gliadin and glutenin. The quantities and proportions of GSPs, which differ in their ability to form polymers, are key determinants of the end-use value of wheat (Triticum aestivum L.) grain. Glutenins play an important role in strengthening wheat dough by conferring elasticity, while gliadins contribute to its viscous properties by conferring extensibility (Branlard et al., 2001). Glutenins can form very large macropolymers during grain desiccation and are composed of low- and high-molecular-weight subunits (LMW-GS and HMW-GS, respectively) (Shewry et al., 1997; Shewry and Halford, 2002). Gliadins are monomeric proteins that are classified according to their electrophoretic mobility and amino acid sequence as α-, β-, γ- or ω-gliadins.

The quantity and composition of GSPs in mature grain are strongly affected by the nitrogen and sulfur nutrition of the parent plant. A high nitrogen supply increases the amount of GSPs at maturity (Shewry et al., 2001; Triboï et al., 2003; Chope et al., 2014). The GSP subclasses differ in their relative proportions of sulfur-containing amino

© 2018 The Authors. The Plant Journal © 2018 John Wiley & Sons Ltd

Page 28

6 New breeding methods based on high-throughput genotyping and phenotyping

Page 30 A new way to use molecular markers to facilitate breeding: the genomic selection - Livret du consortium Breedwheat. 2016
Page 31 BWGS: a R package for genomic selection and its application to a wheat breeding programme G. Charmet, L.-G. Tran, J. Auzanneau, R. Rincent, S. Bouchet, 2019
Page 32 Phenomic selection is a low-cost and high-throughput method based on indirect predictions: proof of concept on wheat and poplar Renaud Rincent, Jean-Paul Charpentier, Patricia Faivre-Rampant, Etienne Paux, Jacques Le Gouis, Catherine Bastien, and Vincent Segura - G3, 2018

Page 30 PLOS ONE

COLLECTION REVIEW BWGS: A R package for genomic selection and its application to a wheat breeding programme

Gilles Charmet1, Louis-Gautier Tran1, Jérôme Auzanneau2, Renaud Rincent1, Sophie Bouchet1

1 INRAE-UCA, UMR GDEC, Clermont-Ferrand, France, 2 Agri-Obtentions, Ferme de Gauvilliers, Orsonville, France

* [email protected]

Abstract

We developed an integrated R library called BWGS to enable easy computation of Genomic Estimates of Breeding values (GEBV) for genomic selection. BWGS, for BreedWheat Genomic selection, was developed in the framework of a cooperative private-public partnership project called Breedwheat (https://breedwheat.fr) and relies on existing R-libraries, all freely available from CRAN servers. The two main functions enable to run 1) replicated random cross validations within a training set of genotyped and phenotyped lines and 2) GEBV prediction, for a set of genotyped-only lines. Options are available for 1) missing data imputation, 2) markers and training set selection and 3) genomic prediction with 15 different methods, either parametric or semi-parametric. The usefulness and efficiency of BWGS are illustrated using a population of wheat lines from a real breeding programme. Adjusted yield data from historical trials (highly unbalanced design) were used for testing the options of BWGS. On the whole, 760 candidate lines with adjusted phenotypes and genotypes for 47 839 robust SNP were used. With a simple desktop computer, we obtained results which compared with previously published results on wheat genomic selection. As predicted by the theory, factors that are most influencing predictive ability, for a given trait of moderate heritability, are the size of the training population and a minimum number of markers for capturing every QTL information. Missing data up to 40%, if randomly distributed, do not degrade predictive ability once imputed, and up to 80% randomly distributed missing data are still acceptable once imputed with Expectation-Maximization method of package rrBLUP. It is worth noticing that selecting markers that are most associated to the trait do improve predictive ability, compared with the whole set of markers, but only when marker selection is made on the whole population. When marker selection is made only on the sampled training set, this advantage nearly disappeared, since it was clearly due to overfitting. Few differences are observed between the 15 prediction models with this dataset. Although non-parametric methods that are supposed to capture non-additive effects have slightly better predictive accuracy, differences remain small. Finally, the GEBV from the 15 prediction models are all highly correlated to each other. These results are encouraging for an efficient use of genomic selection in applied breeding programmes and BWGS is a simple and powerful toolbox to apply in breeding programmes or training activities.

OPEN ACCESS

Citation: Charmet G, Tran L-G, Auzanneau J, Rincent R, Bouchet S (2020) BWGS: A R package for genomic selection and its application to a wheat breeding programme. PLoS ONE 15(4): e0222733. https://doi.org/10.1371/journal.pone.0222733

Editor: Lewis Lukens, University of Guelph, CANADA

Published: April 2, 2020

Copyright: © 2020 Charmet et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All relevant data are available: (https://forgemia.inra.fr/umr-gdec/bwgs).

Funding: This work was supported by the BreedWheat project thanks to funding from the French Government managed by the National Research Agency (ANR) in the framework of Investments for the Future (ANR-10-BTBR-03), France AgriMer and the French Fund to support Plant Breeding (FSOV). https://breedwheat.fr/.

Competing interests: The authors have declared that no competing interests exist.
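As an illustration of what "replicated random cross validation" and "predictive ability" mean in the abstract above, here is a minimal Python sketch. It is not BWGS (an R package) and does not reproduce any of its function names or options: it uses plain ridge regression on marker scores as a stand-in for an RR-BLUP-type model, and the simulated genotypes, the penalty `lam` and the 80/20 split are assumptions chosen only for the example.

```python
import numpy as np


def ridge_fit_predict(Z_train, y_train, Z_test, lam):
    """Ridge regression on marker scores; returns predictions for Z_test."""
    mu = y_train.mean()
    z_mean = Z_train.mean(axis=0)
    Zc = Z_train - z_mean
    # Dual (lines x lines) form: cheaper when markers outnumber lines.
    K = Zc @ Zc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu)
    beta = Zc.T @ alpha  # marker effects
    return mu + (Z_test - z_mean) @ beta


def replicated_random_cv(Z, y, lam, n_reps=20, test_frac=0.2, seed=0):
    """Predictive ability (correlation of predicted vs observed) per replicate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    abilities = []
    for _ in range(n_reps):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        y_hat = ridge_fit_predict(Z[train], y[train], Z[test], lam)
        abilities.append(np.corrcoef(y_hat, y[test])[0, 1])
    return np.array(abilities)


# Simulated population (made-up sizes): markers coded -1/0/1, 10 causal loci.
rng = np.random.default_rng(1)
n_lines, n_markers, n_qtl, h2 = 300, 200, 10, 0.8
Z = rng.integers(-1, 2, size=(n_lines, n_markers)).astype(float)
g = Z[:, :n_qtl] @ rng.normal(size=n_qtl)
y = g + rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n_lines)

abilities = replicated_random_cv(Z, y, lam=30.0)
print("mean predictive ability over", len(abilities), "replicates:",
      round(float(abilities.mean()), 2))
```

The overfitting effect reported in the abstract would appear in such a scheme if markers were pre-selected using all lines before the split, because the test lines would then have influenced the model.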

PLOS ONE | https://doi.org/10.1371/journal.pone.0222733 April 2, 2020 1 / 20

Page 31 GENOMIC PREDICTION

Phenomic Selection Is a Low-Cost and High-Throughput Method Based on Indirect Predictions: Proof of Concept on Wheat and Poplar

Renaud Rincent,* Jean-Paul Charpentier,†,‡ Patricia Faivre-Rampant,§ Etienne Paux,* Jacques Le Gouis,* Catherine Bastien,† and Vincent Segura†,1

*GDEC, INRA, UCA, 63000 Clermont-Ferrand, France, †BioForA, INRA, ONF, 45075 Orléans, France, ‡GenoBois analytical platform, INRA, 45075 Orléans, France, and §EPGV, INRA, CEA-IG/CNG, 91057 Evry, France

ORCID IDs: 0000-0003-0885-0969 (R.R.); 0000-0002-6029-0498 (J.-P.C.); 0000-0002-3094-7129 (E.P.); 0000-0001-5726-4902 (J.L.G.); 0000-0002-9391-6637 (C.B.); 0000-0003-1860-2256 (V.S.)

ABSTRACT Genomic selection - the prediction of breeding values using DNA polymorphisms - is a disruptive method that has widely been adopted by animal and plant breeders to increase productivity. It was recently shown that other sources of molecular variations such as those resulting from transcripts or metabolites could be used to accurately predict complex traits. These endophenotypes have the advantage of capturing the expressed genotypes and consequently the complex regulatory networks that occur in the different layers between the genome and the phenotype. However, obtaining such omics data at very large scales, such as those typically experienced in breeding, remains challenging. As an alternative, we proposed using near-infrared spectroscopy (NIRS) as a high-throughput, low cost and non-destructive tool to indirectly capture endophenotypic variants and compute relationship matrices for predicting complex traits, and coined this new approach "phenomic selection" (PS). We tested PS on two species of economic interest (Triticum aestivum L. and Populus nigra L.) using NIRS on various tissues (grains, leaves, wood). We showed that one could reach predictions as accurate as with molecular markers, for developmental, tolerance and productivity traits, even in environments radically different from the one in which NIRS were collected. Our work constitutes a proof of concept and provides new perspectives for the breeding community, as PS is theoretically applicable to any organism at low cost and does not require any molecular information.

KEYWORDS Poplar; Wheat; breeding; endophenotypes; Near InfraRed Spectroscopy (NIRS); Genomic Prediction; GenPred; Shared Data Resources
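The idea of computing a relationship matrix from spectra and using it for prediction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the spectra are simulated, no spectral pre-processing is applied, the kernel is a simple cross-product of standardised wavelengths, and the prediction step is a kernel ridge regression with an arbitrary penalty `lam`.

```python
import numpy as np


def spectral_relationship(S):
    """Lines x lines relationship matrix from a lines x wavelengths matrix.

    Assumes every wavelength varies across lines (non-zero standard deviation).
    """
    S_std = (S - S.mean(axis=0)) / S.std(axis=0)
    return S_std @ S_std.T / S.shape[1]


def kernel_predict(K, y, train, test, lam):
    """Predict test lines from training phenotypes through the kernel K."""
    mu = y[train].mean()
    K_tt = K[np.ix_(train, train)]
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha


# Simulated example (all values made up): spectra and trait share 5 hidden
# factors standing in for the endophenotypes captured by NIRS.
rng = np.random.default_rng(2)
n_lines, n_wavelengths = 150, 300
latent = rng.normal(size=(n_lines, 5))
spectra = latent @ rng.normal(size=(5, n_wavelengths)) \
    + 0.5 * rng.normal(size=(n_lines, n_wavelengths))
signal = latent @ rng.normal(size=5)
trait = signal + 0.5 * signal.std() * rng.normal(size=n_lines)

K = spectral_relationship(spectra)
perm = rng.permutation(n_lines)
test, train = perm[:30], perm[30:]
predicted = kernel_predict(K, trait, train, test, lam=1.0)
ability = float(np.corrcoef(predicted, trait[test])[0, 1])
print("predictive ability on held-out lines:", round(ability, 2))
```

Replacing the spectral kernel with a marker-based one turns the same prediction step into a genomic prediction, which is what makes the two approaches directly comparable.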

To meet the world's current and future challenges, especially in terms of food and energy supplies, there is a great need to develop efficient crop varieties, livestock breeds or forest materials through breeding. Until recently, the selection of promising individuals in animal and plant breeding was mostly based on their phenotypic records. This approach was a strong limit to genetic progress as the high costs of phenotyping strongly constrain the number of candidates that can be evaluated, especially when there are interactions between individuals and environments that necessitate the evaluation of selection candidates in various environments. Another strong constraint - typical in perennial crops, trees or animals - is that it can sometimes take several years to evaluate phenotypes, which increases the duration of selection cycles. These limitations are some of the main reasons why genomic selection (GS) has become so popular in the last two decades. Its principle is based on a combination of phenotypic records and genome-wide molecular markers to train a prediction model that can in turn be used to predict the performances of - potentially unphenotyped - individuals (Meuwissen et al. 2001). We can thus select more individuals faster, which increases genetic gain. The development of high-throughput genotyping tools at decreasing costs has made GS possible for many animal and plant species. It can be used both in pre-breeding to screen diversity material (Crossa et al. 2016; Yu et al. 2016; Gorjanc et al. 2016) and in breeding to make the schemes more efficient (Heffner et al. 2010;

Copyright © 2018 Rincent et al.
doi: https://doi.org/10.1534/g3.118.200760
Manuscript received September 26, 2018; accepted for publication October 20, 2018; published Early Online October 29, 2018.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Supplemental material available at Figshare: https://doi.org/10.25387/g3.7243256.
1Corresponding author: BioForA, INRA, ONF, 45075 Orléans, France. E-mail: [email protected]

Volume 8 | December 2018 | 3961

Page 32

7 Managing and exploiting the project data effectively

Page 34 Big Data Management - Livret du Consortium Breedwheat. 2017
Page 35 Applying FAIR principles to plant phenotypic data management in GnpIS C. Pommier, C. Michotey, G. Cornut, P. Roumet, E. Duchêne, R. Flores, A. Lebreton, M. Alaux, S. Durand, E. Kimmel, T. Letellier, G. Merceron, M. Laine, C. Guerche, M. Loaec, D. Steinbach, M. A. Laporte, E. Arnaud, H. Quesneville, and A. F. Adam-Blondon, 2019
Page 36 BreedWheat GWAS data in GnpIS information system Mathilde Laine, Thomas Letellier, Raphaël Flores, Nacer Mohellibi, Cyril Pommier, Sophie Durand, Anne-Françoise Adam-Blondon, Hadi Quesneville, Frederic Sapet, Agathe Mini, Stephane Lafarge, Etienne Paux, François Balfourier, Jacques Le Gouis and Michael Alaux, 2018
Page 38 Ongoing wheat projects managing phenotyping data Michael Alaux, 2018

Page 34 AAAS Plant Phenomics Volume 2019, Article ID 1671403, 15 pages https://doi.org/10.34133/2019/1671403

Research Article Applying FAIR Principles to Plant Phenotypic Data Management in GnpIS

C. Pommier1,⋆, C. Michotey1, G. Cornut1, P. Roumet2, E. Duchêne3, R. Flores1, A. Lebreton1, M. Alaux1, S. Durand1, E. Kimmel1, T. Letellier1, G. Merceron1, M. Laine1, C. Guerche1, M. Loaec1, D. Steinbach1, M. A. Laporte4, E. Arnaud4, H. Quesneville1, and A. F. Adam-Blondon1

1 URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
2 AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
3 UMR SVQV, rue de Herrlisheim, Colmar, France
4 Bioversity International, parc Scientifique Agropolis II, Montpellier cedex, France

⋆Correspondence should be addressed to C. Pommier; [email protected]

Received 8 January 2019; Accepted 8 April 2019; Published 30 April 2019

Copyright © 2019 C. Pommier et al. Exclusive Licensee Nanjing Agricultural University. Distributed under a Creative Commons Attribution License (CC BY).

GnpIS is a data repository for plant phenomics that stores whole field and greenhouse experimental data including environment measures. It allows long-term access to datasets following the FAIR principles: Findable, Accessible, Interoperable, and Reusable, by using a flexible and original approach. It is based on a generic and ontology driven data model and an innovative software architecture that uncouples data integration, storage, and querying. It takes advantage of international standards including the Crop Ontology, MIAPPE, and the Breeding API. GnpIS allows handling data for a wide range of species and experiment types, including multiannual perennial plants experimental network or annual plant trials with either raw data, i.e., direct measures, or computed traits. It also ensures the integration and the interoperability among phenotyping datasets and with genotyping data. This is achieved through a careful curation and annotation of the key resources conducted in close collaboration with the communities providing data. Our repository follows the Open Science data publication principles by ensuring citability of each dataset. Finally, GnpIS compliance with international standards enables its interoperability with other data repositories hence allowing data links between phenotype and other data types. GnpIS can therefore contribute to emerging international federations of information systems.

1. Introduction

Plant phenotyping regroups all the observations and measures that can be made on a precisely identified plant material in a characterized environment. This very general definition of phenomics [1] includes diverse types of properties and variables measured at different physical [2] and temporal scales, ranging from field observation of plant populations to molecular cell characterizations, including for some research community metabolomics or gene expression. The acquisition of these data is conducted in various experimental facilities like greenhouses, fields, phenotyping networks, or natural sites. It can be done using many different devices from hand measurements to high throughput means. The resulting complex and heterogeneous datasets include all the environment and phenotypic variable values at each relevant scale (plant, micro plot, ...) and very importantly the identification of the phenotyped germplasm, i.e., the plant material being experimented. In addition, there are often relationships between levels, i.e., physical scales, inside datasets and between different datasets. The resulting rich wealth of data is usually formatted in a very heterogeneous manner and is difficult to integrate automatically.

Phenotyping experiments are expensive and are not exactly reproducible since the environmental conditions are difficult if not impossible to completely control. Furthermore, most traits are highly dependent on genotype by environment interactions, which increases again the uniqueness and the value of the data collected to describe environmental conditions and resources available to the plants during their

Page 35 BreedWheat GWAS data in GnpIS information system

Mathilde LAINE1, Thomas LETELLIER1, Raphaël Flores1, Nacer MOHELLIBI1, Cyril POMMIER1, Sophie Durand1, Anne-Françoise Adam-Blondon1, Hadi QUESNEVILLE1, Frédéric SAPET2, Agathe MINI2, Stéphane Lafarge2, Etienne PAUX3, François BALFOURIER3, Jacques LE GOUIS3 and Michael ALAUX1

1: INRA, UR1164 URGI - Research Unit in Genomics-Info, INRA de Versailles, Route de Saint-Cyr, Versailles, 78026, France
2: Biogemma, route Ennezat, Chappes, 63720, France
3: INRA, UMR 1095 GDEC – Génétique Diversité Ecophysiologique des Céréales, INRA de Clermont-Ferrand, Domaine de Crouël, 5 chemin de Beaulieu, Clermont-Ferrand, 63039, France

Abstract

The BreedWheat project (https://breedwheat.fr/) aims to support the competitiveness of the French wheat breeding sector by addressing societal demands for sustainable, high-quality production. The project develops new breeding methodologies and uses untapped genetic resources to identify and combine alleles of interest for new varieties that perform better under environmentally friendly cropping conditions and are adapted to climate change.

URGI (Unité de Recherche en Génomique Info) is an INRA research unit in genomics and bioinformatics dedicated to plants and their parasites. It develops and maintains an information system for genomics and genetics: GnpIS (Steinbach et al., Database 2013, doi:10.1093/database/bat058).

The BreedWheat data available in GnpIS comprise genetic resources (a collection of 5232 accessions), polymorphisms (724020 SNPs from 10 sources), genotyping (Affymetrix Axiom TaBW420K array), phenotyping (48000 micro-plots at 21 locations) and association results (775621 association results computed from the phenotyping and genotyping values): https://wheat-urgi.versailles.inra.fr/Projects/BreedWheat.

The GnpIS interface displays the association values (with links to the related metadata, phenotyping values and genotyping values); these data can be filtered on several criteria (e.g. p-value) and visualized graphically (QQ plot, boxplot based on the genotyping alleles, Manhattan plot mapped onto the IWGSC RefSeq v1.0).

Summary

The BreedWheat project (https://breedwheat.fr/) aims to support the competitiveness of the French wheat breeding sector by responding to societal challenges for sustainable, high-quality production. Moreover, the BreedWheat project characterizes as yet poorly exploited genetic resources to expand the diversity of the elite germplasm. Finally, new breeding methods are developed and evaluated for their socioeconomic impact.

The URGI (Research Unit in Genomics Info) is an INRA research unit in genomics and bioinformatics dedicated to plants and their parasites. It develops and maintains an information system in genomics and genetics: GnpIS (Steinbach et al., Database 2013, doi: 10.1093/database/bat058).
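The filtering of association results by p-value and their display as a Manhattan plot, as described for the GnpIS interface, can be illustrated with a short Python sketch. This is not the GnpIS data model or API: the `Association` record, its field names, the marker and trait names and all values are hypothetical and serve only to show the principle.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One marker-trait association result (hypothetical field names)."""
    marker: str
    chromosome: str
    position: int
    trait: str
    p_value: float


def filter_by_p_value(results, threshold):
    """Keep the associations whose p-value is at or below the threshold."""
    return [r for r in results if r.p_value <= threshold]


def manhattan_points(results):
    """(chromosome, position, -log10 p) triples ordered along the genome."""
    return sorted(
        (r.chromosome, r.position, -math.log10(r.p_value)) for r in results
    )


# Made-up records for illustration only.
results = [
    Association("marker_A", "1A", 1_200_000, "yield", 3e-2),
    Association("marker_B", "3B", 45_000_000, "yield", 1e-8),
    Association("marker_C", "1A", 900_000, "yield", 4e-6),
    Association("marker_D", "7D", 10_500_000, "yield", 0.6),
]

# Bonferroni-style threshold for the number of tests in this toy set.
threshold = 0.05 / len(results)
significant = filter_by_p_value(results, threshold)
print([r.marker for r in significant])
for chromosome, position, score in manhattan_points(results):
    print(chromosome, position, round(score, 2))
```

On a real dataset the threshold would be chosen for the full number of tested markers, and the triples would feed a plotting library rather than `print`.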

Page 36

Page 37 Data management @ INRA (national and EU wheat projects)

Michael Alaux, 24 September 2018

Ongoing wheat projects managing phenotyping data

Page 38

@breedwheat www.breedwheat.fr

This project receives funding from the French Government managed by the National Research Agency (ANR) in the framework of the Investments for the Future programme (ANR-10-BTBR-03), France AgriMer and the French Fund to support Plant Breeding (FSOV). The project has been labelled by the GIS Biotechnologies Vertes.