Bioinformatics Assisted Breeding, from QTL to Candidate Genes Pierre-Yves Chibon
Thesis committee

Promotor
Prof. Dr R.G.F. Visser
Professor of Plant Breeding
Wageningen University

Co-promotor
Dr H.J. Finkers
Senior Scientist, Wageningen UR Plant Breeding
Wageningen University & Research Centre

Other members
Prof. Dr P.C. de Ruiter, Wageningen University
Dr E. Schultes, Leiden University Medical Centre
Dr J.P.H. Nap, Hanze University of Applied Sciences, Groningen
Dr R.A. de Maagd, Plant Research International, Wageningen

This research was conducted under the auspices of the Graduate School: Experimental Plant Sciences (EPS)

Bioinformatics assisted breeding, From QTL to candidate genes

Pierre-Yves Chibon

Thesis
submitted in fulfillment of the requirements for the degree of doctor at Wageningen University by the authority of the Rector Magnificus Prof. Dr M. J. Kropff, in the presence of the Thesis committee appointed by the Academic Board to be defended in public on Thursday, November 7th 2013 at 11 a.m. in the Aula.

Pierre-Yves Chibon
Bioinformatics assisted breeding, from QTL to candidate genes
PhD thesis Wageningen University, Wageningen, The Netherlands, 2013
With references, with summaries in English, French and Dutch.
ISBN: 978-94-6173-736-6

Contents

Chapter 1: General introduction
Chapter 2: Genetic analysis of metabolites in apple fruits indicates an mQTL hotspot for phenolic compounds on Linkage Group 16
Chapter 3: MQ2: Visualizing multi-trait mapped QTL results
Chapter 4: Marker2sequence, mine your QTL regions for candidate genes
Chapter 5: Identification of transcription factor binding sites in tomato
Chapter 6: Annotex: Exploring the genome annotation
Chapter 7: General discussion
References
Summary
Samenvatting
Résumé
Acknowledgements
Curriculum vitae
Publications
Abbreviation table

API     Application Programming Interface
BLAST   Basic Local Alignment Search Tool
cDNA    Complementary DNA
cM      centiMorgan
CSV     Comma Separated Values
DART    Diversity Arrays Technology
DNA     Deoxyribonucleic Acid
EBI     European Bioinformatics Institute
FAO     Food and Agriculture Organization
FAQ     Frequently Asked Questions
FTP     File Transfer Protocol
GCMS    Gas Chromatography Mass Spectrometry
GO      Gene Ontology
HTTP    Hypertext Transfer Protocol
IL      Introgression Line
ITAG    International Tomato Annotation Group
JSON    JavaScript Object Notation
LCMS    Liquid Chromatography Mass Spectrometry
LG      Linkage Group
M2S     Marker2sequence
MAS     Marker Assisted Selection
MEME    Multiple EM for Motif Elicitation
MFLP    Microsatellite-anchored Fragment Length Polymorphism
MIME    Multipurpose Internet Mail Extensions
mRNA    Messenger RNA
NAR     Nucleic Acids Research
NCBI    National Center for Biotechnology Information
NGS     Next Generation Sequencing
PGSC    Potato Genome Sequencing Consortium
PPI     Protein-Protein Interaction
QTL     Quantitative Trait Loci
RAPD    Random Amplified Polymorphic DNA
RDF     Resource Description Framework
REST    Representational State Transfer
RFLP    Restriction Fragment Length Polymorphism
RNA     Ribonucleic Acid
RSAT    Regulatory Sequence Analysis Tools
SNP     Single Nucleotide Polymorphism
SOAP    Simple Object Access Protocol
SPARQL  SPARQL Protocol and RDF Query Language
SSR     Simple Sequence Repeat
TCP/IP  Transmission Control Protocol / Internet Protocol
TF      Transcription Factor
TFBS    Transcription Factor Binding Site
URI     Uniform Resource Identifier
URL     Uniform Resource Locator
W3C     World Wide Web Consortium
WSDL    Web Services Description Language
WWW     World Wide Web
XML     eXtensible Markup Language

Chapter 1: General introduction

Plant breeding is a key factor in the future

A growing world population

According to the United Nations (UN), the world population was just above 2.5 billion people in 1950, just under 6.2 billion in 2000, and passed 7 billion in 2010.
Estimates from 2011 predict that more than 7.5 billion humans will be living on the planet in 2017. The world population will thus have tripled in less than 70 years. Projections are that the world population will reach 9 billion in 2038 and 10 billion in 2057, and that by the end of the century, in 2100, it will be just below 11 billion (United Nations, Department of Economic and Social Affairs, Population Division (2011). World Population Prospects: The 2010 Revision, CD-ROM Edition - http://esa.un.org/unpd/wpp/Excel-Data/population.htm).

Maslow’s hierarchy of needs places access to food, one of the physiological needs, among the most important needs (Maslow 1943). The Food and Agriculture Organization (FAO) believes that food security will be one of the major challenges for the coming years: “Producing 70 percent more food for an additional 2.3 billion people by 2050 while at the same time combating poverty and hunger, using scarce natural resources more efficiently and adapting to climate change are the main challenges world agriculture will face in the coming decades” (http://www.fao.org/news/story/en/item/35571/).

We humans depend on agriculture, directly or indirectly, for food but also for fuel and clothing, and we compete with it for land for housing. As the world population increases, the competition between land for agriculture and land for urban development will increase further, but agricultural techniques and breeding will help mitigate this. For example, between 1960 and 2000, the land used for agriculture worldwide increased by 11%, to 1.5 billion ha, while the world population doubled (http://www.fao.org/docrep/005/y4252e/y4252e06a.htm). This low increase in land used for agriculture is due to improved crops and agricultural techniques. Between 1961 and 1999, these improvements reduced by 56% the arable land required to produce a given quantity of grain.
Over this period, the world average grain yield increased from 1.4 t/ha to 3.05 t/ha (http://www.fao.org/docrep/005/y4252e/y4252e06a.htm). Plant breeding is therefore a key issue for the coming years.

A short history of plant breeding and its goals

Prehistoric visual selection of plants that facilitated harvesting or increased productivity led to the first domesticated varieties (Harlan 1975). Since the domestication of the first plants 13,000 to 11,000 years ago, mankind has tried to develop plants, especially food plants, that better serve its needs. In recent years, this process has become a recognized scientific discipline named plant breeding (Allard 1999). The hybridizations and selection pressure applied by mankind over these 10,000 years have resulted in the domestication of wild varieties into hundreds of thousands of breeds, forming the basis of our current crops (McCouch 2004). This selection process, however, has narrowed the genetic base of the plants used for food production (Tester and Langridge 2010), leading to a situation where, for instance, more than 95% of all winter wheat varieties used in Russia in 2006 were descendants of only two cultivars (Mba, Guimaraes et al. 2012). This narrow genetic base directly endangers food security, as crops worldwide become susceptible to the same biotic or abiotic stresses. Modern breeders therefore turn to old and wild varieties to find genes that improve traits such as yield and resistance in current crops (Gur and Zamir 2004).

Breeders have two possibilities to improve current crops (McCouch 2004): they can either select a superior individual among the existing possibilities, or efficiently swap, replace or recombine to build a biological system from an expanding range of possibilities, which includes wild and old varieties containing traits lost in the course of domestication (Gur and Zamir 2004).
Modern breeding relies on the revolution brought about by advances in biotechnology, genomics, and the development and application of molecular markers (Moose and Mumm 2008).

The evolution of modern breeding: from marker development to genome sequencing

Modern breeding that integrates new biotechnological approaches started in the early 1980s with the production of the first transgenic plants using Agrobacterium tumefaciens transformation (Bevan, Flavell et al. 1983; Fraley, Rogers et al. 1983; Herrera-Estrella,