Next Generation Soil Metagenomics Large-insert BAC Libraries for the Discovery of Natural Products and Drugs

Mark Liles², David Mead¹, Xing Cong Li³, Kavita Kakirde², Rosa Ye¹, Megan Wagner¹, Amanda Krerowicz¹, Molly Staley², Svetlana Jasinovica¹, Melissa R Jacob³, Ameeta Kagarwal³, Peter Ladell¹, Ronald Godiska¹, Cheng-Cang Wu¹

¹Lucigen Corporation, Middleton, WI 53562; ²Department of Biological Sciences, Auburn University, AL; ³The National Center for Natural Products Research (NCNPR), University of Mississippi, Oxford, MS

Abstract

An ideal metagenomic library for screening large pathways for novel small molecules would contain randomly sheared DNA inserts > 100 kb capable of heterologous expression in a broad range of bacterial hosts. Culture-independent metagenomic methods are limited by the challenges of constructing unbiased large-insert libraries greater than 50 kb for screening and heterologous expression. Most metabolic pathways for natural products are greater than 50 kb, precluding the discovery of intact operons for new drugs. Soil microbial communities are highly diverse and have provided a large proportion of existing drugs derived from cultivated bacteria and fungi. Unfortunately, less than 1% of viable microorganisms in soil can be recovered by traditional culturing techniques, and soil metagenomic DNA is notoriously difficult to work with. We have succeeded in preparing and cloning high quality, high molecular weight metagenomic DNA with an average insert size of 110 kb from soil using broad host range bacterial artificial chromosome (BAC) shuttle vectors for expression in gram-positive and gram-negative hosts. Heterologous expression of a BAC library containing 19,200 clones in Escherichia coli for functional screening identified 32 hits inhibiting the growth of methicillin-resistant Staphylococcus aureus (MRSA). Sequence analysis of the anti-MRSA clones revealed novel secondary metabolic pathways and unique sequences compared to the GenBank database. 16S rRNA analysis of the library indicates a very diverse assemblage of microbial genomes representing 9 bacterial phyla. Interestingly, we found that multiple clones encoding different pathways/enzymes were capable of modifying the chloramphenicol that was added exogenously to the culture medium, thereby resulting in modification of an existing antimicrobial scaffold. These new metagenomic technologies have significant potential for discovering novel natural products and drugs as well as revealing large tracts of metagenomes and secondary metabolic pathways from previously unexplored members of soil microbial communities.

Introduction

Methicillin-resistant Staphylococcus aureus (MRSA) infections cost $14.5 billion in 2003 [5], and MRSA strains caused approximately 19,000 deaths in the United States in 2007 (WHO). The discovery of new antibiotic compounds is becoming increasingly important with the rise in the incidence of multi-drug resistant microbial pathogens worldwide. The huge costs and a high rate of antibiotic rediscovery have limited the investments of pharmaceutical industries. Different approaches have been used for natural product discovery that employ both culture-dependent and -independent methods [2,3]. Functional analysis of metagenomic libraries has enabled identification of diverse and novel secondary metabolites, such as the pentacyclic polyketide erdacin encoded by type II PKS [4,5]. Here we use a culture-independent and function-based approach to screen a shuttle BAC soil metagenomic library containing randomly sheared inserts of >100 kb for anti-MRSA activity. This was achieved using randomly sheared DNA, thereby removing bias and improving the quality of the library.

Library Construction and Screening

Soil Sample Source                                                                                      # of BAC clones   Avg. Insert Size
Cullars Rotation agricultural soil, Auburn, AL                                                          19,200            110 kb
Iowa Morris Prairie soil (Terra-base metagenomic sequence data available from DOE JGI*)                 103,680           100 kb

*https://www.orau.gov/gtl2012/abstracts/jansson01.pdf

Figure 1. Diagram of two broad host range shuttle BAC vectors, pBACSBO and pSMART BAC-S, that allow construction of libraries in high transformation efficiency E. coli strains and high-throughput transfer

Methods and Results

Phylogenetic Diversity of the Auburn Soil Metagenomic Library

Pooled metagenomic BAC DNA was used as a template in a PCR with universal bacterial 16S rRNA gene-specific primers 27F and 1492R. The resultant PCR products were cloned into Lucigen’s pSmart vector and 768 transformants were Sanger sequenced using both 27F and 907R primers. Sequence reads were trimmed for quality, assembled to produce a ~800 bp consensus sequence, and compared to the GenBank nr/nt database by BLASTn. In total, 318 non-E. coli rRNA sequences were identified from the soil metagenomic library. A phylogenetic analysis revealed that these sequences are affiliated with bacteria in 9 different phyla (Table 2), with extensive diversity present especially within the alpha-Proteobacteria and Bacteroidetes (Fig. 3A). A rarefaction curve was generated using the software package MOTHUR (Fig. 3B) and clearly demonstrated that a significant portion of the metagenomic library diversity had been sampled in this survey.

Table 2. Bacterial phyla in the Cullars soil metagenomic library based on the % relative abundance of 16S rRNA genes.

Phylum                   % relative abundance
Acidobacteria            11.0
Bacteroidetes            21.7
Gemmatimonadetes         12.9
Alpha-Proteobacteria     19.8
Beta-Proteobacteria      6.6
Gamma-Proteobacteria     6.6
Delta-Proteobacteria     4.4
Verrucomicrobia          1.9
Planctomycetes           1.6
Chloroflexi              0.3
Actinobacteria           7.9
Firmicutes               5.3

Figure 3. The metagenomic library ribotype diversity was based on a maximum parsimony analysis (Panel A), and sampling redundancy was assessed by a rarefaction curve at different % identity cutoffs (Panel B; axes: Number of Unique OTUs versus Total Number of Sequences).

Sequence Diversity of Anti-MRSA BAC Clones

Genetic analysis of 32 validated anti-MRSA BAC clones was carried out using bar-coded NGS libraries, which were sequenced by Roche 454, Illumina, or Ion Torrent methods. These BAC clones had an average insert size of 113.5 kb (data not shown), in agreement with the average insert size of randomly selected clones analyzed by restriction digests and pulsed field gel electrophoresis. The anti-MRSA clones had a great diversity of predicted gene products, along with many open reading frames (ORFs) with no significant similarity to genes in the GenBank database. Three of the clones had low similarity to genes involved in antimicrobial synthesis, including clone P6L4, which contained a predicted gene product with 34% amino acid identity to a polyketide cyclase (Fig. 4). The most striking feature of all 32 of the sequenced anti-MRSA BAC clones is the overall low sequence identity to known database genes and the lack of any identifiable small molecule pathway, indicating a high level of novel genes present in the library.

Figure 4. 32 of the anti-MRSA BAC clone inserts have been sequenced, and depicted here are the predicted open reading frames (ORFs) for three BAC clones with their respective nearest BLASTx result. Zero identity indicates no GenBank database match.

Chemical Diversity of Anti-MRSA BAC Clones

Using the CLSI microdilution protocol, 31 validated metagenomic clones from the E. coli host have been rescreened for in vitro antibacterial activity against S. aureus ATCC 29213 and methicillin-resistant S. aureus (MRSA) ATCC 33591. Culture supernatants/lysates were initially tested at 95% vol/vol to generate % inhibitions, with active samples (≥50% inhibition) proceeding to dose response studies to generate IC50s to prioritize seven hits, which exhibited IC50s ranging from 10.32 to 44.95 (% vol/vol) against the two pathogens. All seven hits proceeded to scale-up fermentation and three clones have been analyzed chemically. An ethyl acetate extract of the supernatant of BAC clone P35B14 was chromatographed on silica gel to afford diketopiperazines 1-6. The diketopiperazines are a class of dipeptide antibiotics previously isolated from marine organisms and associated bacteria. cyclo-Prolinyl-tyrosine (1) was reported to be more active than tetracycline and streptomycin against S. aureus ATCC 25923 and E. coli ATCC 25922. cyclo-Prolinyl-phenylalanine (2) and its analogs were also reported to possess potent activity against Vibrio anguillarum. BAC clone P6B5 gave structures 7-9 plus novel compounds. BAC clone P6B5 encodes genes for a putative esterase, a carboxylesterase and a metallophosphoesterase that are presumably reactivating the chloramphenicol activity counteracted by the chloramphenicol acetyl transferase encoded on the BAC vector. Novel chloramphenicol derivatives have been isolated from this clone.

Figure 5. Extracts of metagenomic clone supernatants were chromatographed on silica gel to identify diketopiperazines 1-6 from clone P35B14; chloramphenicol derivatives 7-9 and novel compounds (not shown) from clone P6B5 presumably biosynthesized by acylation of exogenously added chloramphenicol through a series of acyltransferases. The structure elucidation of these compounds was achieved by NMR and MS analyses.

Summary and Conclusions

1. Two soil metagenomic BAC libraries were constructed using a random shear cloning method resulting in an average insert size of 100~110 kb.
2. The genetic diversity of the soil metagenomic library has been confirmed by both 16S analysis and functional BAC sequencing.
3. An in situ lysis method was used to rapidly screen metagenomic clones for MRSA growth inhibition.
4. 32