Quantifying Environmental Adaptation of Metabolic Pathways in Metagenomics

Quantifying Environmental Adaptation of Metabolic Pathways in Metagenomics

Quantifying environmental adaptation of metabolic pathways in metagenomics Tara A. Gianoulisa,1, Jeroen Raesb,1, Prianka V. Patelc, Robert Bjornsond, Jan O. Korbelc,b, Ivica Letunicb, Takuji Yamadab, Alberto Paccanaroe, Lars J. Jensenb,f, Michael Snyderc,g, Peer Borkb,h,2, and Mark B. Gersteina,c,d,2 aProgram in Computational Biology and Bioinformatics, Departments of cMolecular Biophysics and Biochemistry, dComputer Science, and gMolecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06520; eRoyal Holloway, University of London, Egham, TW20 0EX United Kingdom; fNovo Nordisk Foundation Center for Protein Research, University of Copenhagen, DK-2200 Copenhagen, Denmark; hMax Delbru¨ck Center for Molecular Medicine, D-13125 Berlin-Buch, Germany; and bEuropean Molecular Biology Laboratory, D-69117 Heidelberg, Germany Edited by Michael Levitt, Stanford University School of Medicine, Stanford, CA, and approved November 20, 2008 (received for review August 14, 2008) Recently, approaches have been developed to sample the genetic rine), providing evidence for genomic adaptations. Further, content of heterogeneous environments (metagenomics). How- variation in specific community biological processes have been ever, by what means these sequences link distinct environmental shown for different water column zones at a single geographic conditions with specific biological processes is not well under- site (9), different climatic regions in the ocean (10), and, more stood. Thus, a major challenge is how the usage of particular recently, among 9 ecosystems (7). pathways and subnetworks reflects the adaptation of microbial The wealth of information generated from these studies communities across environments and habitats—i.e., how network emphasizes the importance of investigating relative differences dynamics relates to environmental features. Previous research has in biological processes among qualitatively different environ- treated environments as discrete, somewhat simplified classes ments. However, to date, none of them have directly incorpo- (e.g., terrestrial vs. marine), and searched for obvious metabolic rated multiple, specific measurements of the environment. By differences among them (i.e., treating the analysis as a typical treating the environment explicitly as a set of complex, contin- classification problem). However, environmental differences result uous features, rather than relying on an implicit subjective from combinations of many factors, which often vary only slightly. classification, one can build models to determine how a diverse Therefore, we introduce an approach that employs correlation and array of biochemical activities, and particularly metabolic ver- regression to relate multiple, continuously varying factors defining satility, reflect sets of or specific environmental differences. an environment to the extent of particular microbial pathways Providing an ideal dataset for exploring these environmental– present in a geographic site. Moreover, rather than looking only at biochemical links, the Global Ocean Survey (GOS) collected individual correlations (one-to-one), we adapted canonical corre- quantitative environmental features and metagenomic se- lation analysis and related techniques to define an ensemble of quences from Ͼ40 different aquatic sites (10). Here, we used weighted pathways that maximally covaries with a combination of environmental variables (many-to-many), which we term a meta- GOS data to investigate and develop multivariate approaches to bolic footprint. Applied to available aquatic datasets, we identified systematically relate metabolic pathway usage directly to quan- footprints predictive of their environment that can potentially be titative environmental differences. These approaches allowed us used as biosensors. For example, we show a strong multivariate to address multiple relationships simultaneously, as well as to correlation between the energy-conversion strategies of a com- relate specific environmental features to metabolic processes at munity and multiple environmental gradients (e.g., temperature). different levels of resolution, including 14 broad functional Moreover, we identified covariation in amino acid transport and categories, 111 pathways, 141 modules (sections of pathways), cofactor synthesis, suggesting that limiting amounts of cofactor 191 operons, and 15,554 orthologous groups (OGs). By identi- can (partially) explain increased import of amino acids in nutrient- fying environmentally-dependent pathways involved in energy limited conditions. conversion, amino acid metabolism, and cofactor synthesis, among others, we were able to define metabolic footprints of environmental genomics ͉ network dynamics ͉ microbiology ͉ distinct environments. Our study provides an analytical frame- canonical correlation analysis work for uncovering ways, in which microbes adapt to (and perhaps even) how they change their environment. icrobes function as highly interdependent communities. Results MFundamental to the maintenance of the energy balance of the ecosystem, the recycling of nutrients, and the neutralization Quantitative Approach for Footprint Detection. We mapped 37 and degradation of toxins and other detritus (1), microbial size-filtered GOS sites (Table S1) to their respective environ- community processes are intimately intertwined with ecosystem mental and metabolic features at several levels of complexity functioning. Thus, it is critical to understand the complex interplay between the influence of the environment on microbial commu- Author contributions: T.A.G., J.R., M.S., P.B., and M.B.G. designed research; T.A.G., J.R., and nities, and, in turn, the microbes’ reshaping of their environment. P.V.P. performed research; T.A.G., R.B., J.O.K., I.L., T.Y., A.P., and L.J.J. contributed new Until recently, the tools to systematically study global com- reagents/analytic tools; T.A.G. and J.R. analyzed data; and T.A.G., J.R., M.S., P.B., and M.B.G. munity function and environment at the molecular level were not wrote the paper. available, because complex microbial communities are generally The authors declare no conflict of interest. not amenable to laboratory study (2). The recent advent of direct This article is a PNAS Direct Submission. sequencing of environmental samples (i.e., metagenomics) has Freely available online through the PNAS open access option. allowed the first large-scale insights into the function of these 1T.A.G. and J.R. contributed equally to this work. complex microbial communities. 2To whom correspondence may be addressed. E-mail: [email protected] or mark. Comparative metagenomics approaches have revealed signif- [email protected]. icant variation in sequence composition (3), genome size (4), This article contains supporting information online at www.pnas.org/cgi/content/full/ evolutionary rates (5), and metabolic capabilities (6–8) among 0808022106/DCSupplemental. qualitatively dissimilar environments (e.g., terrestrial vs. ma- © 2009 by The National Academy of Sciences of the USA 1374–1379 ͉ PNAS ͉ February 3, 2009 ͉ vol. 106 ͉ no. 5 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0808022106 Downloaded by guest on September 27, 2021 C DPM Footprint A Quantitative Environmental Metadata 23˚C 36.6 NaCl 10 m Site-Set 1 P1 P2 P3 P4 P5 24.3˚C 36.6 NaCl 5 m B1 B1 B3 B2 B3 B1 B2 B5 B2 ATGCTCTCTCTC - - - - - - - - B4 B ATCGTTTTTGGG- - - - - - GTGTCATTATCTCT- - - - - - Pathway ATGCTCTGCCGCGC- - - - - - - - - - - - - - - - - - - - - - - - - - - B4 Site-Set 2 B5 ATCGTCGCTGCTGC - - - - - - - Assignment - - - - - - - - - - - - - - - - - - - - (1) Partition sites based (2) Test for pathways that on the environment. differ between site-sets. C Linear combination of Linear combination of 1.0 C ) environmental features. pathways ( NaCl P3 A D P1 Depth Temp P1 F s (Dim I I I o P2 .0 0.5 ight B5 I o 0 max B1 .. B5 P4 . .. t P5 max . B4 B1 . II p 0.5 corr B3 . .. We lized corr . II B4 . B3 Temp B2 r B2 i rma No NaCl P2 P3 n 1.0 Depth ..........1.0 t 1.0 0.5 0.0 0.5 ( , Normalized Weights (Dim I) Fig. 1. Schematic representation of approach. The large squares labeled B1, B2, etc. represent the geographic sites (buckets). Each bucket has sequence and environmental feature data associated with it. (A) Mapping quantitative environmental features [salinity (ppt), sample depth (position in water column from which the sample was collected), water column depth (measured from surface to floor), and chlorophyll]. (B) Metagenome-derived metabolism at different levels of resolution (see Materials and Methods). Reads are color-coded according to their corresponding pathway elements (shapes). Different pathways are represented by different shapes (square, circle, etc.). All of the instances of a particular pathway are summed and normalized to compute the pathway score. (C) Schematic representation of DPM (see details in text). (D) Schematic representation of CCA (see details in text). (pathways, modules, and operons; see Fig. 1 A and B). These in predicting single environmental features (Fig. 2), there are data can naturally be represented as matrices, where the rows are limitations to viewing each environmental measurement in geographic sites and the columns are either environmental or isolation, because there are hidden dependencies among the metabolic features. We interrelated these matrices to examine how environmental features. pathway usage across different sites is related to environmental To discover the complex, higher order interactions between parameters. The simplest and

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    6 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us