Microbial diversity, deep sequence analysis and metagenomes: the environmental gene pool Novel enzymes Community structure and function New sources of antibiotics New resistance Pathogen detection genes Metapopulation analysis Metagenome: -recovering uncultured bacteria -uncovering rare taxa -biogeography -capturing HMW DNA in expression libraries Metatranscritome: -IVET plant-microbe interactome -heavy metal resistome -antibiotic biosynthesis Metaproteome: -diverse proteins -enzymes after enrichments

Enzymes with useful properties for exploitation: some examples

1. Soil secretome: hydrolytic exoenzymes eg glycosyl hydrolases; chitinases 2. Soil antibiotic resistome: eg antibiotic disabling enzymes 3. Soil reservoir drug discovery: antibiotic biosynthesis enzymes eg polyketide synthases and non-ribosomal peptide synthetases ENDOCHITINASES have a groove-like Family 19 chitinases structure which is open at both ends and cleave at random intervals within the chitin chain. Endo EXOCHITINASES have a tunnel-like Endochitinases structure and cleave chitooligo- saccharides from the non-reducing end of a chitin chain. Exo

 5 clusters Plant class I and II Nematoda chitinases chitinases  Once thought only to occur in higher plants until found in Strepto- chitinases myces griseus HUT 6037. (Ohno, et al. 1996) family 19  Actinobacteria dominate chitinases bacterial F19 databases  Other bacteria are very Plant class IV chitinases distantly related Mainly Proteobacteria and viral chitinases

(Kawase, et al. 2004) Properties if family 18 and 19 chitinases

Multiplicity of genes  ? Synergy of proteins Family 18 Family 19 Catalysis model Substrate assisted General acid-base

Mechanism Retention Inversion mechanism mechanism

Position of anomeric Equatorial (b) Axial (a) oxygen at C1 Exoxhitinase or Exo- and Endo- Endo- Endochitinase Inhibitor (s) Allosamidin Amidines, amidrazones and nojiritetrazoles Family 19 diversity and biogeography

Sourhope Cayo Blanco No chitin α-chitin β-chitin No chitin α-chitin β-chitin -ve +ve

PCR using family 19 chitinase primers • Degenerate primers by (Williamson et al. 2000) • Great diversity both within and between sites • Family 19 chitinases may be specific to different allomorphs of chitin • Physiochemical properties maybe more important than environmental factors (LeCleir, et al. 2004) Enzyme molecular diversity Enrichment alters diversity Metagenomic and metaproteomic analysis of soil Soil Microcosm (+/- substrate amendment)

Extraction of Extraction of Isolates on high molecular metaproteome indicator plates weight eDNA and metaexoproteome

l-zap metagenomic Fosmid metagenomic library (4-10 kb 454 Roche library (35kb insert) insert) (Epicentre kit) Pyrosequencing (Stratagene)

PCR screening for Functional Analysis and functional genes activity screening bioinformatics Metaproteome combined with metaexoproteome of a soil

Metaproteome

metaexoproteome New sources of diversity from soil bacteria Expression screening of metagenomic libraries The cellular biomass was Air dried Warwick soil collected washed prior to Fresh Warwick Soil immobilisation in an LMP agarose ‘plug’. The LMP ‘plug’ was treated with lysozyme, proteinase K and SDS to lyse the cells

Bifunctional BAC vector for expression in Streptomyces lividans Nycodenz extraction (Bakken and Lindahl 1995)

Aqueous layer Cell biomass at interface Nycodenz (van der Geize et al., 2006)

Soil pellet Metagenome libraries: Expression screening for chitinase/lysozyme activity in lambda ZapII libraries SOLR cells infected with phagemids to give c 150 clones per well. After 24 hrs of incubation 4-Muf-DiNag is added to 50mm. The microtitre dish contains 14, 000 clones viewed under long-wave UV. Positive clones fluoresce as 4-MUF accumulates in the media. Positive wells are diluted twice before plating to isolate single colonies. The plasmid DNA has been isolated from +’s and transformed into the same backround to test for same positive signal. Inserts sequenced. Screen improved by addition of F18 or F19 inhibitors and some of these inhibit lysozyme Also screening fosmid and plasmid libraries inserts 40 4 Library sizes 0.5 2.6 Gb Antibiotic resistance (soil resistome)

Soil acts as an environmental reservoir for antibiotic resistance genes –associated with ab biosynthesis clusters -in closely related non-producers -in unrelated non-producers indigenous soil bacteria -in unrelated non-producers exotic bacteria = pathogens/commensals added to soil •Potential for selection for resistance -pollution •HGT of resistance genes- mobilome •Pathogens can survive in soil -Acquire integrons/plasmids -Act as source of antibiotic resistance – Possible carriage by amoebae-HGT Reservoirs of antibiotic resistance genes in diverse environments: RESERVOIR survey RESERVOIR environments: diverse in genes resistance antibiotic of Reservoirs aph6-Ic (deg) aac(3)-III/IV aac(6’) aac(3)-II/VI adenylase aph(2”) ant(2”) aac(3)-I aph6- aph6- aphD strB1 -II/Ib aph3 tetM ant3 stsC strA tetH tetG tetD tetO tetK tetK tetC tetE tetB tetA tetT tetL Id Ic -I -I

Non-producers Non-producers Producers Non-producers Tetracycline Gentamicin Streptomycin Streptomycin ol Riopee aue eae Seawater Soil Rhizosphere Manure Sewage

Flow chart of metagenomic approaches

DNA extracted Gel- End-repair from samples by Fractionate to make soil plug methods 4-10 kb DNA blunt- fragments ended

Analyse Ligate sample Transform E efficiency and DNA with blunt coli estimate the ended/dephos competent coverage plasmid pCF430 cell Transposon Amplified and Screen for new mutagenesis store the libraries phenotype and sequencing Broad-host-range expression vectors that carry the - arabinose-inducible Escherichia coli araBAD promoter and the araC regulator Metagenomic Library and ESBL gene screening

Reed Bed Sewage 1 Month Control Grass FYM soil Cake Cake Soil Land Soil Applied Applied Grass Soil No. of clones 386 000 500000 170000 630000 210000

Average insert size 4.64 4.12 4.40 3.70 2.85 3.71 (Kb)

Clones with inserts 65 65 85 50 75 85 (%) Coverage (Gb) 0.63 1.59 1.87 0.32 1.53 1.47

No of cefotaxime 0 2 1 0 0 0 resistance No of ceftazidime 2 1 0 0 0 0 resistance No of imipenem 1 ? 2 ? 0 0 0 0 resistance No of amp res 4 ? 5? 1 ? 0 0 0 clones Hit rate 1/80 1/150 1/900 0 0 0

Resistance to b-lactam antibiotics PSPA7_2859 Pseudomonas aeruginosa PA7 pvsA Pseudomonas fluorescens SBW25 pvdD Pseudomonas aeruginosa 206-12 60 Druridge 38 psvA Pseudomonas fluorescens 100 Druridge 26 Diversity of NRPS Athens 26 PSPA7_2858 Pseudomonas aeruginosa PA7 Druridge 16 Athens 12 100 Cockle 45 92 Cockle 48 Cockle 36 Polar effects? 92 Daci_4753 Delftia acidovorans SPH-1 RS06179 Ralstonia solanacearum GMI1000 megaplasmid RSMK04952 Ralstonia solanacearum MolK2 84 RSIPO_02940 Ralstonia solanacearum IPO1609 100 bglu_2g09010 Burkholderia glumae BGR1 chromosome 2 sequences from the Athens 29 Athens 8 sypB Bradyrhizobium sp. BTAi1 Druridge 1 51 BURPS1106A_A2213 Burkholderia pseudomalleiAntarctic 1106a library 93 Cockle 28 Athens 48 63 Druridge 17 Athens 5 and the 3 European 100 Cockle 3 Cockle 41 Athens 35 65 Cockle 25 89 Cockle 4 soils with markers Druridge 13 Athens 13 82 Athens 25 64 ECA1487 Pectobacterium atrosepticum SCRI1043 53 Druridge 28 from GenBank in in 80 Druridge 7 59 massB Pseudomonas fluorescens SS101 Antarctic 215 Druridge 32 bacB Bacillus subtilis 916 bold. The tree was Cockle 2 86 Athens 39 96 Druridge 15 99 Athens 7 PSPTO_4519 Pseudomonas syringae DC3000 constructed using Cockle 13 71 Cockle 37 Antarctic 244 63 Cockle 16 MXAN_4403 Myxococcus xanthus DK 1622 the neighbor-joining Cockle 35 Druridge 22 50 100 Druridge 30 Druridge 37 method; the 69 Athens 4 Athens 10 Athens 36 Cockle 20 Cockle 38 numbers besides the 88 Athens 46 76 Cockle 29 100 Cockle 7 Athens 6 Athens 24 branches indicate 100 Cockle 39 84 Cockle 50 Antarctic 318 Druridge 11 Druridge 33 the percentage 100 Druridge 2 60 Athens 23 Athens 33 51 snbDE S. virginiae MXAN_3636 Myxococcus xanthus DK 1622 bootstrap value of Athens 28 MXAN 3779 Myxococcus xanthus DK 1622 98 Antarctic 337 Cockle 40 Cockle 21 1000 replicates. The Athens 14 Athens 34 Antarctic 283 MXAN_4532 Myxococcus xanthus DK 1622 Antarctic 326 scale bar indicates Athens 41 Athens 37 92 Athens 42 89 Antarctic 9 10% nucleotide Antarctic 14 nrps2-1 50 Antarctic 27 67 74 Antarctic 353 Antarctic 78 nrps2-1 S. avermitilis MA-4680 dissimilarity S. avermitilis Antarctic 104 55 sce8255 Sorangium cellulosum 'So ce 56' Antarctic 335 72 Druridge 31 MA-4680 Athens 17 Druridge 39 sce2387 Sorangium cellulosum 'So ce 56' Hoch_1747 Haliangium ochraceum DSM 14365 Antarctic clones 78 scpsB Saccharothrix mutabilis subsp. capreolus 73 Druridge 9 95 Druridge 3 and 104 74 Cockle 5 SACE_4288 Saccharopolyspora erythraea NRRL2338 snbC S. pristinaespiralis 100 Athens 16 83 79 Athens 18 Tcur_1886 Thermomonospora curvata DSM 43183 Amir_3602 Actinosynnema mirum DSM 43827 86 visE S. virginiae 0.1 Hit rate 1/100 Antibiotics in nature

Microbial metabolic exchange has important roles in ecology and the survival of higher organisms. (a) Symbiotic bacteria of the brine shrimp produce the antifungal compound istatin, thereby protecting shrimp embryos from pathogenic fungi. (b) Actinomyces spp. symbionts of leaf-cutting ants produce metabolites that protect the fungus farmed by the ants from a pathogenic fungus. Environmental mycobacteria as opportunistic pathogens Particular risk to immunocompromised persons such as those with HIV/AIDS Pre-existing lung diseases and Helminth Infections Common causative Clinical disease Unusual causative species species M. asiaticum M. branderi M. celatum Atypical pulmonary M. abscessus M. avium M. fortuitum M. gordonae M. haemophilum M. disease Pulmonary complex M. intermedium M. lentiflavum M. kansasii M. malmoense M. Disease magdeburgensis M. shimodii M. simiae M. xenopi smegmatis M. szulgai M. avium complex M. M. abscessus M. conspicuum M. fortuitum M. Disseminated chelonae M. genavense M. malmoense M. marinum M. haemophilum M. kansasii Cervical Lymphadenitis Disease sherrisii M. simiae M. triplex M. xenopi M. scrofulaceum M. abscessus M. bohemicum M. chelonae M. M. avium complex M. fortuitum M. haemophilum Lymphadenitis malmoense M. M. heidelbergense M. interjectum scrofulaceum M. kansasii M. lentiflavum M. tusciae M. abscessus M. chelonae Cutaneous M. haemophilum M.kansasii M. malmoense M. M. fortuitum M. marinum M. smegmatis Disease ulcerans M. aurum M. avium M. gordonae M. Nosocomial M. abscessus M. fortuitum mucogenicum M. neoaurum M. simiae M. chelonae Disease M. smegmatis M. xenopi

Aquarium Granuloma Diverse sampling location: Ethiopia

Woldiya (Wo1-Wo8) Gondar Warm lowland →tepid agricultural highlands Warm agricultural mid- (Go1-Go4) Hot agricultural Butajira highlands highland (Bu1- Bu4) Gambella (Ga1-Ga4) Hossana (Ho1- Ho4) Tropical agricultural lowlandJinka (Ji1- Warm agricultural mid-highlands Omorate Turmi Ji6) (Om1-Om2) (Tu1- Bale Tu2) (Ba1- Ba8)

Hot agricultural mid- highlands Hot arid lowland Hot arid lowland Warm lowland →Dense forest → Afro- desert alpine Ethiopian EM diversity study

Culture independent 16SrRNA target gene

Two sets of primers to target EM Sampling: village Pyrosequencing output Mycobacteriumlevel genus (JSY16S F/R ) Cluster based approach Phylogenetic slow-growingSOIL mycobacteria (APTK16SWATER F/R) Five sampling Two sampling (OTUs) approach locations locations Sampling effort Rarefaction curves Unique branch length Unifrac PcoA Soil collected fromPyrosequencing Water (100 ml) three positions to filtered through 0.22 Canonical Neighbour-joining make a composite µm filter correspondence bootstrapped sample analysis (CCA) phylogenetic trees PowerWater DNA FASTDNA extraction Readings for each composite extraction from Mantel tests 0.22µm filters Soil temperatureBLAST matches sample Soil pH Multivariate GLMs : Multivariate GLMsSoil moistureProportion of slow DNA pooled from the DNA pooled from the for OTU diversityWater growers and five composite two samples temperaturepathogenic species samples Water pH Altitude Pyrosequencing of Pyrosequencing of Longitude village soil DNA village water DNA Latitude genus level : DNA from soils Alpha rarefaction curves Slow-growing EM in Ethiopian soils Alpha rarefaction Diversity of slow-growing EM in soils Hot arid lowlands Warm agricultural midlands Tepid highlands

* *

* *

*

BLAST results at a 97% identity and 90% alignment coverage Bale Slow-growing EM water samples: Butajira taxonomy plots Jinka Gambella Gondar Ji1-6 Om1-2 Tu1-2 Ho1-4 Bu1-4 Ga1-4 Ba1-8 Wo1-8 Go1-4 Hossana Woldiya

Higher prevalence of M. complex, and M. gordonae, M. malmoense, M. conspicuum , M. colombiense Detection of M. bovis in Ethiopian samples M. tuberculosis complex pyrosequencing reads

M. bovis RD4Mtb qpcr detection Sweeney et al. 2007 Pontiroli et al. 2011

1.00E+04

1.00E+03

cell copies cell 1.00E+02 gram/per ml gram/per 1.00E+01 per per M. bovis bovis M.

1.00E+00 Ba8 Bu3 Bu4 Ga1 Ga3 Wo5 Data suggests water sources are a potential 2.38% (1/42) of soils were positive for M. bovis reservoir of M. bovis infection 11.90% (5/42) of water samples were positive for M. bovis Human-Environment-Livestock-Interface ACKNOWLEDGEMENTS Co-workers Chitinases: Ashley Johnson-Rollings, Helena Wright Antibiotics: Paris Laskaris, Nikos Kyratsous Resistome: Will Gaze, Lihong Zhang, Greg Amos Community analysis: Tanya Khera collaborator Orin Courtenay Collaborators: Peter Hawkey, University of Birmingham, UK David Pearce BAS, Cambridge, UK Brian Oakley, CDC Atlanta, USA Girum Erenso, Armaue Hansen Research Institute, Ethiopia