Leafing Through the Genomes of Our Major Crop Plants: Strategies for Capturing Unique Information
Total Page:16
File Type:pdf, Size:1020Kb
REVIEWS Leafing through the genomes of our major crop plants: strategies for capturing unique information Andrew H. Paterson Abstract | Crop plants not only have economic significance, but also comprise important botanical models for evolution and development. This is reflected by the recent increase in the percentage of publicly available sequence data that are derived from angiosperms. Further genome sequencing of the major crop plants will offer new learning opportunities, but their large, repetitive, and often polyploid genomes present challenges. Reduced- representation approaches — such as EST sequencing, methyl filtration and Cot-based cloning and sequencing — provide increased efficiency in extracting key information from crop genomes without full-genome sequencing. Combining these methods with phylogenetically stratified sampling to allow comparative genomic approaches has the potential to further accelerate progress in angiosperm genomics. Crop domestication ranks among the greatest of human them, and their frequent polyploidy. I also highlight achievements, and is closely correlated with human popu - new technologies and approaches that might accelerate lation growth and social evolution1. Scientific plant progress in our understanding of crop genomes. breeding, building on the accomplishments of domestica- tion, predates the discovery of the Mendelian principles Applications of crop genome information and is now complemented by genetic engineering. The Combining comprehensive sequence information with independent but sometimes convergent2–4 domestications knowledge of the morphological and physiological diver- of about 200 species5 has provided a broad sampling of sity of angiosperms and their well-understood phylogeny angiosperm diversity among our major crops. In addi- promises to answer many questions about crop genome tion, the intensive directional selection that is applied evolution and function. Genetic improvement of crops is during crop breeding has produced many extraordinary an essential means for meeting many basic needs of the models for aspects of plant morphology and develop- developing world, and global genetic resource collections ment, such as the single-celled seed epidermal trichomes for these plants and their relatives (BOX 1) form the basis (‘lint fibres’) of Gossypium spp. (cotton)6, the enlarged of future progress in this area. A genome sequence for vegetative and floral meristems of the Brassica genus, one representative of these species provides a platform and the enlarged carbohydrate-rich seeds of cereals. Crop for the formulation of hypotheses — for example about genomes, therefore, not only hold information that can possible relationships between specific DNA elements be used for further improvements, but also offer insights and phenotypes. Genetic resource collections can then into angiosperm biology in general. be used to test such hypotheses, translating hard-won The recent sequencing of the first angiosperm functional data that are gained from botanical (or other) genomes — Arabidopsis thaliana and two Oryza sub- models into increased crop productivity and/or quality, species — foreshadowed new opportunities that could and perhaps also elucidating adaptations that contributed Plant Genome Mapping result from having the complete genetic blueprints for to the evolutionary success of the angiosperms. Laboratory, University of our major crops. Here I explore the opportunities and Georgia, Athens, Georgia obstacles that are associated with obtaining and using Applications to crop improvement. Studying the relation- 30602, USA. e-mail: paterson@dogwood. these genomic sequences. I discuss the challenges that ship between molecular and morphological diversity is botany.uga.edu arise from the large and variable sizes of plant genomes, a key step in probing the consequences of prehistoric doi:10.1038/nrg1806 the abundance and distribution of repetitive DNA within domestications, and is also important for further 174 | MARCH 2006 | VOLUME 7 www.nature.com/reviews/genetics © 2006 Nature Publishing Group REVIEWS Box 1 | Global germplasm resources a Populus deltoides genotype, consisting of 47,000 stems that occupy 43 ha (REF. 12). Several others of a similar size The fact that most angiosperm seeds can be preserved in a viable condition for many are known, some as much as 43,000 years old13. years by storage at low temperature and humidity has led to the establishment of Further sequence information could allow the deduc- genebanks for most of the world’s leading crops. These provide insurance against habitat loss or crop disease epidemics, and are also useful as an archive from which scientists can tion of the gene repertoire and organization of the last obtain well-defined materials for research. Particularly noteworthy are the 11 genebanks universal common ancestor of the angiosperms. The that are currently maintained by the Consultative Group on International Agricultural Arabidopsis thaliana and Oryza spp. sequences have Research (CGIAR), in close association with breeding programmes that study and use the allowed us to begin to unravel the consequences of germplasm. As of 2001, the CGIAR genebanks held about 666,000 germplasm accessions genome duplications that have fragmented ancestral (plant or seed samples) of crops, forages and agroforestry trees93. gene orders and functions across multiple chromosomes. Several countries also maintain substantial germplasm collections, often focusing on Many other angiosperm genomes have undergone lineage- crops of national priority. The US Department of Agriculture Germplasm Resources specific duplications of sufficient size to be discernible Information Network collection is particularly extensive, including 464,864 accessions of from EST data14. However, smaller duplications might 11,329 species and 1,837 genera. Most of these taxa are represented by few accessions also have occurred that are too small to be resolved using — only 400 genera are present in more than 20 accessions, with the depth of REF. 15 representation being closely related to economic importance. this approach (for examples, see ) but which might often result in phenotypic consequences16. Detailed genetic maps that are based on sequence-tagged sites (STSs) reveal early clues about these duplications17, but the improving crops. The remarkable success that has been extent of their effect on genome structure and function achieved in modifying angiosperms for agricultural use awaits further sequencing. has been based largely on phenotypic data, rather than A better understanding of the nature and duration on a genetic, biochemical or molecular understanding. of the euchromatic state in angiosperms could also be Independent domestications that seem to be convergent obtained through an increased availability of sequence at the phenotypic level show non-random correlations information. Defined originally by differential hetero- in the locations of QTLs for the corresponding traits2–4. pycnosis (instrinsic stainability), euchromatin and This indicates that Vavilov’s ‘law of homologous series heterochromatin represent evolutionarily distinct in variation’7 (an early recognition of the fundamental components of genomes that can be detected by several similarity between different cultivated species) might comparative or in silico approaches18. Detailed analy- hold at the molecular level. However, frequent genome sis shows that most heterochromatin in rice has been duplications in plants create opportunities for subfunction- restructured since its divergence from Sorghum bicolor Germplasm alization and neofunctionalization — mechanisms by which (sorghum) about 42 million years ago, whereas gene The hereditary materials within a species. genes might have come to carry out different functions order in the neighbouring euchromatin has largely been from their ancestors or extant relatives in other taxa8. preserved18. A combination of cytological and sequence Subfunctionalization Sequence information for other angiosperms will allow information from other branches of the angiosperm Division of ancestral functions us to explore how such changes in gene function have family tree will help to explore the duration over which of a gene between duplicated affected botanical diversity. the euchromatic state has been preserved, and might copies of the original gene. A better understanding of the evolutionary fates of also provide clues about the factors that define it. Neofunctionalization duplicated genes could also be important for dissecting Increased sequencing could also explain why hetero- Evolution of new function(s) for epistasis, which has an important role in the genetics of zygosity in self-pollinating plants persists to a higher a gene, which are thought to be crop productivity, but has been difficult to study using degree than would be expected on the basis of Mendelian made possible by duplication 9,10 19 of the gene, with one copy molecular marker approaches . For example, the divi- principles of segregation . The identification of relictual retaining the ancestral sion of ancestral functions among different genes by heterozygosity in primary sequence is now possible, to a function. subfunctionalization could create ‘webs’ of permanent degree, using whole-genome shotgun (WGS) sequenc- interdependence, which might be a foundation for the ing approaches (detailed below)20, or by resequencing Epistasis formation of epistatic relationships among unlinked approaches. Such studies could