A Genomics-Guided Approach for Discovering and Expressing Cryptic Metabolic Pathways

TECHNICAL REPORT

Emmanuel Zazopoulos1, Kexue Huang1, Alfredo Staffa1, Wen Liu2, Brian O. Bachmann1, Koichi Nonaka2, Joachim Ahlert2,3, Jon S. Thorson2,3, Ben Shen2,4, and Chris M. Farnet1*

1Ecopia BioSciences, Inc., 7290 Frederick Banting, Montreal, Quebec H4S 2A1, Canada. 2Division of Pharmaceutical Sciences, 3Laboratory for Biosynthetic Chemistry, and 4Department of Chemistry, University of Wisconsin, Madison, WI 53706. *Corresponding author ([email protected]).

Published online 21 January 2003; doi:10.1038/nbt784

Nature Biotechnology, volume 21, February 2003, page 187. © 2003 Nature Publishing Group. http://www.nature.com/naturebiotechnology

Genome analysis of actinomycetes has revealed the presence of numerous cryptic gene clusters encoding putative natural products (refs. 1,2). These loci remain dormant until appropriate chemical or physical signals induce their expression. Here we demonstrate the use of a high-throughput genome scanning method to detect and analyze gene clusters involved in natural-product biosynthesis. This method was applied to uncover biosynthetic pathways encoding enediyne antitumor antibiotics in a variety of actinomycetes. Comparative analysis of five biosynthetic loci representative of the major structural classes of enediynes reveals the presence of a conserved cassette of five genes that includes a novel family of polyketide synthase (PKS) (refs. 3,4). The enediyne PKS (PKSE) is proposed to be involved in the formation of the highly reactive chromophore ring structure (or "warhead") found in all enediynes (refs. 3,4). Genome scanning analysis indicates that the enediyne warhead cassette is widely dispersed among actinomycetes. We show that selective growth conditions can induce the expression of these loci, suggesting that the range of enediyne natural products may be much greater than previously thought. This technology can be used to increase the scope and diversity of natural-product discovery.

We have developed a high-throughput genome scanning method to discover metabolic loci independently of their expression. This approach takes advantage of the fact that the genes required for secondary metabolite biosynthesis are typically clustered together in a bacterial genome (ref. 5). A shotgun DNA sequencing approach is used to generate short (700 bp) random genome sequence tags (GSTs) from a library of genomic DNA prepared from a microorganism. GSTs derived from genes that are likely to be involved in the biosynthesis of natural products are identified by sequence comparisons to a database of microbial gene clusters known to be involved in natural-product biosynthesis. Selected GSTs are then used to design screening probes to identify cloned subgenomic fragments (for example, cosmids or bacterial artificial chromosomes (BACs)) containing the genes of interest as well as the neighboring genes that together may constitute a biosynthetic gene cluster (Fig. 1).

Genome scanning provides an efficient way to discover natural-product gene clusters because the analysis of a relatively small number of GSTs provides reasonable assurance of full genome representation. For example, analysis of 1,000 GSTs from a genome of 8.5 Mbp (the approximate size of the genome of an antibiotic-producing actinomycete) provides DNA sequence sampling every 8.5 kbp (assuming random library coverage). Given that natural-product gene clusters range in size from 20 to 200 kbp (refs. 6,7; C.F., unpublished data), it is expected that any given gene cluster will be represented by anywhere from 2 to more than 20 of 1,000 GSTs analyzed. To date, we have used the genome scanning approach to successfully identify more than 450 natural-product gene clusters in a variety of actinomycetes (C.F., unpublished data).

We used the genome scanning method to isolate enediyne biosynthesis genes from a variety of actinomycete strains known to produce enediynes, a potent class of antitumor antibiotics (ref. 8). The enediynes induce irreversible DNA damage by a mechanism that involves cycloaromatization of the warhead chromophore (Fig. 2A) to form highly reactive benzenoid diradicals that strip hydrogen atoms from the sugar phosphate backbone of the DNA helix (ref. 9). We chose the dynemicin and macromomycin biosynthesis gene clusters to demonstrate the effectiveness of the genome scanning method. Comparison with the C-1027 (ref. 3), calicheamicin (ref. 4), and neocarzinostatin (W.L., K.N., L. Nie, J. Bae, and B.S., unpublished data) gene clusters reveals that the homology among all these loci is limited to a set of five genes, including the gene encoding PKSE, that form a putative "warhead gene cassette" (Fig. 2B). The conserved genes are generally arranged in a presumed operon with [...]

We compared the other warhead cassette proteins to protein sequences present in the GenBank nonredundant database to assess putative functions. One protein family (TEBC) is similar to the 4-hydroxybenzoyl-CoA thioesterase of Pseudomonas sp. strain CBS-3 in regions of the protein that have been shown to have an important role in catalysis and thus may be involved in polyketide chain release, cyclization, or both (see Supplementary Fig. 3 online). Three families of unknown proteins (UNBL, UNBV, and UNBU) show no significant homology to proteins in the public databases and therefore represent novel protein families that appear to be specific to enediyne biosynthetic loci. Structural analysis of the UNBV proteins predicts that they are secreted proteins with N-terminal signal sequences, whereas the UNBU proteins are predicted to be integral membrane proteins with seven or eight putative membrane-spanning alpha helices (see Supplementary Figs. 4–6 online). Although the functions of the TEBC, UNBL, UNBV, and UNBU proteins remain unknown, their strict association with the warhead PKS and their presence in all enediyne biosynthetic loci strongly suggest that they have essential roles in the formation, stabilization, or transport [...]

Figure 1. A diagrammatic view of the genome scanning method for high-throughput discovery of natural-product biosynthetic gene clusters. Natural-product biosynthetic genes (in color) are clustered in the bacterial genome (for simplicity, only a single gene cluster is shown). High-molecular-weight genomic DNA is randomly fragmented and small fragments are used to prepare a genome sampling library (GSL) in a plasmid vector while large fragments are used to prepare a cluster identification library (CIL) in a cosmid or BAC vector. Gene sequence tags (GSTs) are generated from the GSL clones using a universal primer located in the plasmid vector. The GSTs are compared to a database of natural-product biosynthetic genes to identify tags derived from genes involved in natural-product biosynthesis ("hot" GSTs, colored inserts; step 1). These genes are then used as probes to identify CIL clones containing the corresponding genes as well as their neighboring genes ("hot" CIL clones). Overlapping CIL clones may be identified by restriction fragment length mapping or during the subsequent sequencing step. The hot CIL clones are randomly fragmented and used to prepare a second plasmid library that provides templates for sequencing (step 2). Sequencing and assembly of the selected CIL clones result in a complete natural-product gene cluster that is then annotated and entered into the database (step 3).

Figure 2. Chemical structures of enediynes and genes involved in warhead formation. (A) The structures of the enediynes dynemicin (1), calicheamicin (2), neocarzinostatin (3), and C-1027 (4). The common cyclododecylpolyene skeleton found in all warheads is highlighted in red. The complete structure of macromomycin has yet to be elucidated; however, the limited structural information available is consistent with a chromophore ring system similar to that found in C-1027 (ref. 20). (B) Organization of the warhead gene cassettes found in the dynemicin (DYNE), calicheamicin (CALI), macromomycin (MACR), neocarzinostatin (NEOC), and C-1027 loci. (C) Domain organization of the warhead PKS, consisting of KS (ketoacyl synthase), AT (acyl transferase), ACP (acyl carrier protein), KR (keto reductase), DH (dehydratase), and PPTE (4′-phosphopantetheinyl transferase). (D) Organization of the warhead cassette genes found in loci from actinomycete strains not previously reported to produce enediyne natural products. (E) Warhead cassette genes from actinomycete strains newly isolated from soil samples. 007A, locus found in Amycolatopsis orientalis; 009C, locus found in Streptomyces ghanaensis; 028D, locus found in Kitasatosporia sp.; 054A, locus found in Micromonospora megalomicea subsp. nigra; 059A, locus found in Streptomyces cavourensis subsp. washingtonensis; 132H, locus found in Saccharothrix aerocolonigenes; 135E, locus found in Streptomyces kaniharaensis; 145B, locus found in Streptomyces citricolor. Loci 046E, 100B, and 171B were found in new actinomycete isolates (Ecopia BioSciences Inc.).
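The sampling estimate quoted in the text (1,000 GSTs from an 8.5 Mbp genome, clusters of 20 to 200 kbp) can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' software: the function names are hypothetical, tag positions are assumed to be uniform and independent across the genome (the "random library coverage" assumption stated in the text), and the 700 bp tag length and edge effects are ignored.

```python
def sampling_interval_bp(genome_bp, n_tags):
    """Average spacing between tags if n_tags are spread over the genome."""
    return genome_bp / n_tags


def expected_tags_in_cluster(genome_bp, n_tags, cluster_bp):
    """Expected number of tags that start inside a cluster of cluster_bp.

    Assumption: each tag start is uniform over the genome, so a tag lands
    in the cluster with probability cluster_bp / genome_bp.
    """
    return n_tags * cluster_bp / genome_bp


def prob_at_least_one_tag(genome_bp, n_tags, cluster_bp):
    """Probability that at least one of n_tags independent tags hits the cluster."""
    p_hit = cluster_bp / genome_bp
    return 1.0 - (1.0 - p_hit) ** n_tags


if __name__ == "__main__":
    genome = 8.5e6   # 8.5 Mbp, the genome size used in the text
    tags = 1000      # number of GSTs analyzed
    print("sampling interval (bp):", sampling_interval_bp(genome, tags))
    for cluster in (20_000, 200_000):  # 20 kbp and 200 kbp clusters
        print(
            cluster,
            "bp cluster: expected tags =",
            round(expected_tags_in_cluster(genome, tags, cluster), 2),
            ", P(at least one tag) =",
            round(prob_at_least_one_tag(genome, tags, cluster), 3),
        )
```

Under these assumptions the expected tag count scales linearly with cluster size, which is why the text's range of 2 to more than 20 tags tracks the 20 to 200 kbp range of cluster sizes. The probability function also shows that the smallest clusters are the ones most likely to be missed at a fixed number of GSTs.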
