Comparative Physical Mapping Links Conservation of Microsynteny to Chromosome Structure and Recombination in Grasses

Comparative physical mapping links conservation of microsynteny to chromosome structure and recombination in grasses John E. Bowers*, Miguel A. Arias*, Rochelle Asher*, Jennifer A. Avise*, Robert T. Ball*, Gene A. Brewer*, Ryan W. Buss*, Amy H. Chen*, Thomas M. Edwards*, James C. Estill*, Heather E. Exum*, Valorie H. Goff*, Kristen L. Herrick*, Cassie L. James Steele*, Santhosh Karunakaran*, Gmerice K. Lafayette*, Cornelia Lemke*, Barry S. Marler*, Shelley L. Masters*, Joana M. McMillan*, Lisa K. Nelson*, Graham A. Newsome*, Chike C. Nwakanma*, Rosana N. Odeh*, Cynthia A. Phelps*, Elizabeth A. Rarick*, Carl J. Rogers*, Sean P. Ryan*, Keimun A. Slaughter*, Carol A. Soderlund†, Haibao Tang*, Rod A. Wing‡, and Andrew H. Paterson*§ *Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602; and †Arizona Genomics Computational Laboratory and ‡Arizona Genomics Institute, University of Arizona, Tucson, AZ 95721 Edited by Ronald L. Phillips, University of Minnesota, St. Paul, MN, and approved July 28, 2005 (received for review March 22, 2005) Nearly finished sequences for model organisms provide a founda- duplication in sorghum appears to be Ϸ70 mya (1) vs. Ϸ12 mya in tion from which to explore genomic diversity among other taxo- maize (4) and Ͻ5 mya in sugarcane (5), promising a higher success nomic groups. We explore genome-wide microsynteny patterns rate in relating sorghum genes to phenotypes by knockouts than between the rice sequence and two sorghum physical maps that either maize or sugarcane genes. Comparison of SB and closely integrate genetic markers, bacterial artificial chromosome (BAC) related Sorghum propinquum (SP) promises to shed new light on fingerprints, and BAC hybridization data. The sorghum maps genes related to domestication (2) and expedite dissection of the largely tile a genomic component containing 41% of BACs but 80% genetic control of key features of ‘‘Johnsongrass’’ (Sorghum of single-copy genes that shows conserved microsynteny with rice halepense), a polyploid descendant of SB and SP that is one of the and partially tile a nonsyntenic component containing 46% of BACs world’s most noxious weeds (6). but only 13% of single-copy genes. The remaining BACs are We describe two genetically anchored Sorghum physical maps. centromeric (4%) or unassigned (8%). The two genomic compo- Anchoring of thousands of genetically mapped sequence-tagged nents correspond to cytologically discernible ‘‘euchromatin’’ and sites (STSs) permits their detailed comparison with the rice se- ‘‘heterochromatin.’’ Gene and repetitive DNA distributions support quence, to explore levels and patterns of conserved microsynteny this classification. Greater microcolinearity in recombinogenic (eu- on a genome-wide scale. Microsynteny is conserved in euchromatin chromatic) than nonrecombinogenic (heterochromatic) regions is but not in heterochromatin. A strong association of conserved consistent with the hypothesis that genomic rearrangements are microsynteny with recombinogenicity suggests that intergenic gene usually deleterious, thus more likely to persist in nonrecombino- rearrangement (i.e., changes in gene order or location in the genic regions by virtue of Muller’s ratchet. Interchromosomal genome) is generally deleterious. Hybridization of conserved STS centromeric rearrangements may have fostered diploidization of a markers to bacterial artificial chromosome (BAC) libraries for polyploid cereal progenitor. Model plant sequences better guide other cereal and noncereal monocots may leverage conserved studies of related genomes in recombinogenic than nonrecombi- microsynteny to study their euchromatin. However, the high levels nogenic regions. Bridging of 35 physical gaps in the rice sequence of colinearity suggested by cross-species comparisons of genetic by sorghum BAC contigs illustrates reciprocal benefits of compar- recombination maps mask much rearrangement in recombination- ative approaches that extend at least across the cereals and ally inert (heterochromatic) regions. Such regions may require perhaps beyond. physical rather than genetic characterization, with little guidance from cross-species data. comparative genomics ͉ Oryza ͉ synteny Materials and Methods he grasses (Poaceae) provide most of mankind’s caloric intake Probe Selection, Design, and Hybridization. Overgo (7) probes were Tand a growing share of our fuel. The best-studied grasses, designed from sequences of interest by BLAST comparison leading cereal crops, shared a common paleopolyploid ancestor against all plant species to identify the most-conserved 40-bp Ϸ42–47 million years ago (mya) (1). Cereals show much colinearity segment, using a Microsoft VISUAL BASIC program. Repetitive of genetic maps and often have important traits controlled by probes were filtered by comparison with highly repetitive sor- quantitative trait loci at corresponding locations (2). Despite these ghum DNA (8) and The Institute for Genome Research (TIGR) similarities, the cereals have diverged remarkably in genome size grass repeat database. from Ϸ430 million base pairs (MBP) in rice (3) to 15,966 MBP in Overgo probes were radioactively labeled, then hybridized in wheat (3), largely due to differential repetitive DNA amplification pools of up to 24 probes (7). Most experiments consisted of 576 probes, applied to macroarrays of 18,432 BACs each, in 72 hybrid- and elimination. izations of 24 probes, each using row–column–diagonal pools. As a model for tropical grasses, sorghum [Sorghum bicolor (SB)] is a logical complement to rice (Oryza), in that it has biochemical and morphological specializations to improve carbon assimilation at This paper was submitted directly (Track II) to the PNAS office. high temperatures (C4 photosynthesis). By contrast, rice uses C3 Ϸ Abbreviations: mya, million years ago; MBP, million base pair; SB, Sorghum bicolor; SP, photosynthesis more typical of temperate grasses. The 760-MBP Sorghum propinquum; BAC, bacterial artificial chromosome. Ϸ (3) sorghum genome is a logical bridge to the 2,500-MBP (3) Data deposition: The sequences reported in this paper have been deposited in the GenBank maize genome, and the Ϸ4,000-MBP (3) genome of sugarcane, the database (accession nos. BZ691020–BZ697099, CL696780–CL705752, and CW784313– world’s leading biomass͞biofuels crop. Sorghum shared common CW7926311). ancestry with maize (12 mya) and sugarcane (5 mya), much more §To whom correspondence should be addressed. E-mail: [email protected]. recently than rice (42–47 mya). The most recent whole-genome © 2005 by The National Academy of Sciences of the USA 13206–13211 ͉ PNAS ͉ September 13, 2005 ͉ vol. 102 ͉ no. 37 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0502365102 Downloaded by guest on September 26, 2021 Table 1. Overgo probes applied to monocots No. of Average no. Genome overgos No. of BACs of anchors Taxon coverage No. of BACs tested Scored Worked anchored per BAC Zea mays 12.7ϫ 221,184 2,623 2,623 2,161 85,437 0.39 SP 9.0ϫ 55,296 7,573 7,143 5,683 148,758 2.69 SB cv. BTϫ623 11.0ϫ 73,728 6,470 6,349 5,147 139,434 1.89 O. sativa cv. Nipponbare 20.0ϫ 73,728 2,297 2,297 1,590 29,064 0.39 Saccharum spp. Cv. R570 14ϫ* 92,160 5,974 5,574 4,639 80,535 0.87 Banana 6.0ϫ 36,864 1,276 1,276 679 13,059 0.35 Total 571,392 30,607 26,200 19,899 476,254 0.83 Data are available at: www.plantgenome.uga.edu͞bacman͞BACManwww.php. The SB library is from cultivar BTX623. Others are as cited in the text. *The sugarcane BAC library contains Ϸ14ϫ coverage of the basic chromosome set, which equates to Ϸ1.3ϫ coverage of each locus in the dodecaploid source cultivar. Labeled filters were exposed to x-ray film for 2 weeks. Films were (based on high band numbers or formation of ‘‘Q’’ contigs) to manually scored onto transparencies, which were scanned into a represent multiple and͞or chimeric clones were voided. computer and read by using text recognition software (ABBYY FINEREADER 5.0, ABBYY Software, Moscow). Data were trans- BAC End Sequencing. A subset of contig-terminal SP BACs from a ferred into the database system ‘‘BACMan’’ (www.intl-pag.org͞ preliminary FPC assembly were end-sequenced as described (12), pag͞11͞abstracts͞C01࿝C4࿝XI.html), which was used to deconvo- yielding 15,103 sequences. lute multiplexed results into individual probe to BAC assignments. Comparison of Sorghum with Rice. Sorghum contigs were anchored Physical Map Assembly. Physical maps of both SP and SB BACS to a rice genome assembly (Ver. 2.0, www.tigr.org), based in part on were constructed by using HindIII agarose fingerprinting (9). Gel sequence described elsewhere (13–15). Probe sequences were images were processed by using IMAGE (10) and assembled into masked for repeats by using REPEATMASKER (www.repeatmasker. contigs by using FPC, Ver. 7.2 (11), also incorporating overgo data. org) and compared with rice using BLAST (identifying the best hit Automated assembly was performed with a cutoff of 10Ϫ11, toler- at e Յ10Ϫ8 and using mismatch penalties Ϫ2 vs. the default of Ϫ3 ance 7 (CpM, contigs plus markers) of 10Ϫ10,10Ϫ9, and 10Ϫ8.All to detect similarities down to nearly 66%). Based on these criteria, contigs containing ‘‘Q’’ (questionable) clones were manually edited sequences useful in synteny analysis included 42.4% of hybridiza- and split at higher thresholds until none remained. Two manual tion probes detecting five contigs or fewer and 16.8% (2,625 of merge cycles (at cutoff 10Ϫ10 and 10Ϫ9, respectively) were applied 15,103) of BAC ends showing no match to repetitive sequences and to contigs that were adjacent

Comparative Physical Mapping Links Conservation of Microsynteny to Chromosome Structure and Recombination in Grasses

Constraints on Genes Shape Long-Term Conservation of Macro-Synteny in Metazoan Genomes

About Synteny Analysis

“Parent-Daughter” Relationships Among Vertebrate Paralogs

Calcisponges Have a Parahox Gene and Dynamic Expression of 2 Dispersed NK Homeobox Genes 3

Breakup of a Homeobox Cluster After Genome Duplication in Teleosts

Inferring Synteny Between Genome Assemblies: a Systematic Evaluation Dang Liu1,2, Martin Hunt3,4 and Isheng J Tsai1,2*

A Fast, General Synteny Detection Engine

Identifying Regions of Conserved Synteny Between

Assembly and Validation of Conserved Long Non-Coding Rnas in the Ruminant

Multiple Roles of Secretory Lipocalins (Mup, Obp) in Mice

Whole-Genome Microsynteny-Based Phylogeny of Angiosperms Tao

Synteny-Based Analyses Indicate That Sequence Divergence Is Not the Main Source of Orphan Genes Nikolaos Vakirlis1, Anne-Ruxandra Carvunis2, Aoife Mclysaght1