Analytical Strategies for the Study of Plants, Microbes, and Microbial Communities
Systems Biology Strategies and Technologies for Understanding Microbes, Plants, and Communities

176
Towards an Understanding of Proteins that Govern the Structure and Function of Synechocystis sp. PCC6803 and Multiple Cyanothece Strains

Stephen J. Callister,2* Jon M. Jacobs,2 Jana Stöckel,4 Jason E. McDermott,3 Lee Ann McCue,3 Carrie D. Nicora,2 Ljiljana Paša-Tolić,1 Louis A. Sherman,6 Himadri B. Pakrasi,4,5 and David W. Koppenaal1,2 (david.[email protected])

1Environmental Molecular Sciences Laboratory and 2Biological and 3Computational Sciences Division, Pacific Northwest National Laboratory, Richland, Wash.; 4Depts. of Biology and Energy, 5Environmental and Chemical Engineering, Washington University, St. Louis, Mo.; and 6Dept. of Biological Sciences, Purdue University, West Lafayette, Ind.

* Presenting author

Project Goals: The primary goal of this project is to apply a systems biology approach to understand the network of genes and proteins that govern the structure and function of Cyanobacteria. These microorganisms make significant contributions to harvesting solar energy, planetary carbon sequestration, metal acquisition, and hydrogen production in marine and freshwater ecosystems. Cyanobacteria are also model microorganisms for studying the fixation of carbon dioxide through photosynthesis at the biomolecular level. Importantly, this project addresses critical U.S. Department of Energy science needs, provides model microorganisms to apply high-throughput biology and computational modeling, and takes advantage of EMSL’s experimental and computational capabilities.

In the initial phase of this project, proteomics characterizations of Synechocystis sp. PCC6803 and Cyanothece sp. ATCC 51142 were performed to improve our understanding of the proteome makeup of these model organisms under normal and perturbed growth conditions. We have developed the most complete quantitative proteome analysis of Synechocystis sp. PCC6803 under various critical environmental perturbations, using a high-sensitivity mass spectrometry approach spanning 33 physiological conditions. The resulting proteome dataset consists of 22,318 unique peptides, corresponding to 2,369 unique proteins, covering 65% of the predicted proteins. Quantitative analysis of changes in protein abundance under environmental perturbations has led to the identification of the key proteins required for the maintenance of cellular fitness necessary for cell survival.

We also examined the impact of diurnal rhythms on the protein level of Cyanothece 51142. We identified a total of 3,616 proteins with high confidence, which accounts for ~68% of the predicted proteins based on the completely sequenced Cyanothece 51142 genome. About 77% of identified proteins could be assigned to functional categories. Quantitative proteome analysis revealed that ~3% of the proteins exhibit oscillations in their abundance under alternating light-dark conditions. The majority of these cyclic proteins are associated with central intermediary metabolism, photosynthesis, and biosynthesis of cofactors. Our data also suggest that diurnal changes in the activities of several enzymes are mainly controlled by turnover of related cofactors and key players, but not of entire protein complexes.

While Cyanothece sp. ATCC 51142 continues to represent a model organism for proteomics investigations, six additional Cyanothece strains have either finished or draft genome sequences. Having genome sequences for this number of strains allows for comparison of Cyanothece at the level of the core genome and core proteome. While the core genome predicts the common phenotype of Cyanothece, the core proteome represents the actual protein phenotype characteristic of Cyanothece. As such, all strains were cultured in the laboratory under growth conditions best representing their natural environments, and proteins extracted from cells have been analyzed by LC-MS/MS. Results from over 460 LC-MS/MS analyses have been used to develop proteomics databases for strains PCC8801, PCC8802, PCC7424, PCC7425, PCC7822, and ATCC51142. An additional proteomics database for ATCC51472 is pending completion of a draft genome sequence for this strain. The proteomics database for the more distantly related Synechocystis sp. PCC6803 was also included in this analysis to better constrain the core proteome to represent Cyanothece.

While the core proteome of Cyanothece is composed of a large percentage of proteins involved in energy production, translation, and amino acid production, a significant portion of the core proteome is also made up of proteins having no predicted function or only a general assigned function. Of interest was the observation of hypothetical and conserved hypothetical proteins, which suggests the importance of these proteins in defining the general lifestyle of Cyanothece and also the need for additional functional characterization of these proteins to better understand Cyanothece from a systems biology perspective.

The research was performed as part of an EMSL Scientific Grand Challenge project at the W.R. Wiley Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the U.S. Department of Energy’s Office of Biological and Environmental Research (BER) program located at Pacific Northwest National Laboratory. PNNL is operated for the Department of Energy by Battelle.

177
Deciphering Microbial Community Dynamics via Observational and Experimental ‘Metatranscriptomics’: Developments and Applications

Edward F. DeLong1 ([email protected]), Elizabeth Ottesen1* ([email protected]), Adrian Sharma1* (sharma.[email protected]), Yanmei Shi,1 Gene Tyson,1 Frank Stewart,1 Rex Malmstrom,1 Sallie W. Chisholm,1 Jamie Becker,2 and Dan Repeta2

1Massachusetts Institute of Technology, Cambridge and 2Woods Hole Oceanographic Institution, Woods Hole, Mass.

http://openwetware.org/wiki/DeLong_Lab

Project Goals: One of our central project goals is to develop and refine methods for studying microbial community gene expression in the environment, referred to here as ‘metatranscriptomics’. There are a variety of diverse applications of these new metatranscriptomic methods. The approach can be used to verify that hypothetical ORFs identified in metagenomic projects are indeed transcribed and expressed. In surveys, ‘observational metatranscriptomics’ can be used to survey both the nature and abundance of different RNA species (including ribosomal RNA, messenger RNA, and small non-coding RNAs) existing within any given microbial community. Finally, ‘experimental metatranscriptomics’ can be leveraged to reveal transcriptional responses of microbial communities to environmental perturbation, as well as to discover the specific metabolic pathways involved in matter and energy cycling. In the course of developing methods and protocols for metatranscriptomics, we have explored many of these specific applications. We report [...]

[...] developed a protocol for subtractive hybridization of small and large subunit RNAs using sample-specific probes, that is applicable across diverse environmental samples. To test the method, rRNA-subtracted and unsubtracted transcriptomes were pyrosequenced from several different ocean bacterioplankton communities, representing ~350 Mbp of metatranscriptomic data. The new subtractive hybridization method reduced bacterial rRNA transcript abundance by 40 to 58%, increasing recovery of non-rRNA sequences up to fourfold. To evaluate this method, we established criteria for detecting sequences replicated artificially via pyrosequencing errors and identified such replicates as a significant component (6 to 39%) of total pyrosequencing reads. Following replicate removal, statistical comparisons of reference genes (identified via BLASTX to NCBI-nr) between technical replicates and between rRNA-subtracted and unsubtracted samples showed low levels of differential transcript abundance (<0.2% of reference genes). However, gene overlap between datasets was remarkably low, with no two datasets (including duplicate runs from the same pyrosequencing library template) sharing greater than 17% of unique reference genes. These results suggest that current levels of pyrosequencing capture a small subset of total mRNA diversity, underscoring the importance of rRNA subtraction to enhance sequencing coverage across the functional transcript pool.

Novel small RNAs revealed by metatranscriptomics

Previous metatranscriptomic studies have suggested that many cDNA sequences share no significant homology with known peptide sequences, and therefore might represent transcripts from uncharacterized proteins. We found that a large fraction of cDNA sequences detected in metatranscriptomic datasets consists of well-known small RNAs (sRNAs), as well as new groups of previously unrecognized putative sRNAs (psRNAs). These psRNAs mapped specifically to intergenic regions of microbial genomes recovered from similar habitats, displayed characteristic conserved secondary structures, and were frequently flanked by genes that suggested potential regulatory functions. Depth-dependent variation of psRNAs generally reflected known depth distributions of broad taxonomic groups, but fine-scale differences in the psRNAs within closely related populations suggested potential roles in niche adaptation. Genome-specific mapping of a subset of psRNAs derived from predominant