Proteomics of the Chloroplast: Experimentation and Prediction Klaas Jan Van Wijk
Trends in Plant Science, Reviews, October 2000, Vol. 5, No. 10. 1360-1385/00/$ – see front matter © 2000 Elsevier Science Ltd. All rights reserved. PII: S1360-1385(00)01737-4

New technologies, in combination with increasing amounts of plant genome sequence data, have opened up incredible experimental possibilities to identify the total set of chloroplast proteins (the chloroplast proteome) as well as their expression levels and post-translational modifications in a global manner. This is summarized under the term ‘proteomics’ and typically involves two-dimensional electrophoresis or chromatography, mass spectrometry and bioinformatics. Complemented with nucleotide-based global techniques, proteomics is expected to provide many new insights into chloroplast biogenesis, adaptation and function.

Chloroplasts are chlorophyll-containing plastids and originate from proplastids, which are generally maternally inherited via the embryo. Although the study of the chloroplast is a classic field in plant biology, there is no good overview of the total set of chloroplast proteins (the chloroplast proteome). Improvements in two-dimensional electrophoresis (2-DE) and mass spectrometry have, in combination with increasing amounts of sequence data from Arabidopsis, rice, maize and other plant species, opened up fantastic experimental possibilities enabling the chloroplast proteins as well as their expression levels and post-translational modifications to be identified rapidly. This is summarized under the term ‘proteomics’. Proteomics typically involves biochemical purification techniques such as 2-DE, chromatography or affinity purification, mass spectrometry and bioinformatics [1,2]. Complemented with other functional genomics techniques such as cDNA or oligonucleotide microarrays (Box 1) and reverse genetics, a better understanding of chloroplast biogenesis, adaptation to the environment, signal transduction and metabolic pathways can be obtained.

Here, we discuss in detail such an experimental approach to the characterization of the chloroplast proteome. A complete characterization includes not only the identification of proteins but also studies of their expression levels, post-translational modifications, protein–protein interactions and apparent discrepancies between the identified proteins and their predicted protein sequence from nucleotide sequencing data. We also comment on possibilities and limitations for the theoretical prediction of the chloroplast proteome based on targeting or presequence information.

Experimental characterization of the chloroplast proteome by 2-DE and mass spectrometry

The improvement of 2-DE through the development of immobilized pH gradients [3] and optimization of solubilization techniques [4,5] now allows the reproducible separation of more than 2000 proteins on a single 2-DE gel. Such gel-separated proteins can be identified by mass spectrometry if genomic information is available [1,2,6] (Box 2). In addition, mass spectrometry is a powerful tool for analyzing isoforms, secondary modifications of proteins (e.g. glycosylation and phosphorylation) and proteolysis using low amounts (picomoles to attomoles) of proteins [7–9]. Thus, a proteomics approach allows us to bridge the gap between genomic sequence information and the actual protein population in a cell.

Proteomics is already an important tool in medical research and the analysis of yeast and prokaryotes. With the rapid progress of the sequencing of the Arabidopsis genome and ongoing EST and genomic sequencing of many agricultural crops, proteomics is bound to become another important tool for plant biology. Several plant proteomics studies have been published in recent years [10]. However, in these studies, protein identification was achieved through Edman sequencing, which necessarily limited the identification of proteins in terms of cost, speed and sensitivity. Mass spectrometry will allow identification at a much higher speed and with 100–1000 times less protein. Two plant proteomics studies using mass spectrometry have been published recently, one concerning anoxia tolerance in maize root tips [11] and the other on pea thylakoid proteins [12]. An explosion of plant proteomics initiatives can be expected in the coming years.

Compartmenting the chloroplast proteome

From a biochemical point of view, the chloroplast can be divided into several compartments, with each compartment having its own specific subset of proteins, or subproteome. To characterize the chloroplast proteome fully, either experimentally or by prediction, it is useful (and probably essential) to subdivide the chloroplast proteome into such subproteomes [13]. Only then can we devise optimal experimental strategies to identify and characterize most proteins, including those that are hydrophobic [5], of low abundance or transiently expressed [14]. For each of the chloroplast compartments, we briefly review current knowledge of each corresponding subproteome and discuss possible experimental and theoretical strategies for further characterization.

Stromules, the chloroplast envelope and vesicles

Starting from the cytosolic side of the chloroplast, the first compartment is the chloroplast outer and inner envelope. The envelope is the site of transport of metabolites, proteins and messengers between plastids and the cytosol [15,16]. The inner envelope membrane is also a site for the biosynthesis of several products (e.g. lipids and pigments [15,17,18]) and has also been implicated in DNA replication and transcription of the chloroplast genome [19]. Several protein complexes involved in the translocation of nucleus-encoded chloroplast proteins have been characterized in great detail [20,21]. At least 100 protein bands can be resolved on one-dimensional electrophoresis (1-DE) silver- or Coomassie-stained gels of purified inner and outer membranes. However, 1-DE gels do not have sufficient resolution to obtain a global overview of the envelope proteome or to study post-translational modifications or changes in protein expression.

No successful systematic analysis of the envelope proteome has been carried out to date, which is partly related to the hydrophobic nature of this subproteome. The 2-DE of envelope membrane proteins was reported to be unsuccessful in the recovery of

Box 2. Mass spectrometry for the analysis of proteins and peptides

Mass spectrometry is the preferred method in the study of protein identification. Mass spectrometers with ‘soft’ ionization techniques allow the rapid identification of proteins provided that genomic or cDNA sequence is available. Protein identification and characterization generally involves two types of mass spectrometers:

• Matrix-assisted laser desorption–ionization time-of-flight (MALDI-TOF) mass spectrometers – accurately measure the masses of a protein (mixture) or of proteolytic digests of gel separated or otherwise purified proteins. A selected protein spot is digested with a site-specific protease such as trypsin, resulting in a set of peptides. The masses of the peptides are then measured by MALDI-TOF MS, resulting in a list of peptide masses. For each entry in the nucleotide and protein databases, the masses of the predicted tryptic peptides are calculated and compared (within the experimental mass accuracy) with the list of measured peptide masses using web-based search engines (e.g. Protein Prospector, Mascot, Profound). The correct protein will have many ‘matching’ peptides. Proteins can be identified even when they consist of a mixture of two or three proteins. This method relies on the mass accuracy (5–15 ppm) and sensitivity (femtomole range) of the latest generation of MALDI-TOF MS instruments.

• Tandem mass spectrometer – often coupled to liquid chromatography, using electrospray ionization with a collision cell for induced fragmentation (the technique is thus abbreviated to ESI-MS/MS). When a protein cannot be positively identified by MALDI-TOF MS, peptide sequence tags are obtained by ESI-MS/MS. The individual peptides are ‘screened’ in the first section of the tandem mass spectrometer and selected peptides are subsequently further fragmented along the protein backbone by collision with argon or nitrogen molecules. This is termed

Box 1. Functional genomics tools to understand cellular processes

Transcriptomics

Definition
• The systematic analysis of accumulated transcripts in the cell or tissue

Type of information
• The accumulation of transcripts, indicating the level of gene expression

Strong points
• Extremely sensitive
• Many different transcripts can be monitored simultaneously

Weak points
• mRNA levels often do not correspond to protein accumulation
• Cannot study post-translational modifications
• Location of mRNA does not provide information about the location of the gene product

Proteomics

Definition
• The systematic analysis of the proteins (the proteome) of a cell, tissue, organelle or membrane

Type of information
• The identification and expression level of the proteome
• Post-translational modifications and protein–protein interactions (protein complexes)

Strong points
• Relatively fast and sensitive (femtomole range) identification of proteins
• Protein–protein interactions can be studied
• Monitors the protein directly, rather than monitoring the mRNA

Weak points
• With current technology, the number of proteins that can be simultaneously followed is not as high as with transcriptomics
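The database search described in Box 2 for MALDI-TOF data (digest each database entry in silico with trypsin, calculate the peptide masses, and count how many fall within the experimental mass accuracy of a measured mass) can be illustrated with a minimal Python sketch. It is not the algorithm of Protein Prospector, Mascot or Profound, which use more elaborate scoring. Several details are assumptions made for illustration only: the cleavage rule (after K or R, but not before P), the use of singly protonated monoisotopic masses, the absence of missed cleavages and modifications, the function names, and the two toy sequences.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # assumption: peptides are observed as [M+H]+ ions


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mh(peptide):
    """Monoisotopic mass of the singly protonated peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def count_matches(measured_masses, sequence, tol_ppm=15.0):
    """Count predicted tryptic peptides matched by a measured mass."""
    hits = 0
    for peptide in tryptic_peptides(sequence):
        predicted = peptide_mh(peptide)
        tolerance = predicted * tol_ppm / 1e6  # ppm window in Da
        if any(abs(m - predicted) <= tolerance for m in measured_masses):
            hits += 1
    return hits


def rank_database(measured_masses, database, tol_ppm=15.0):
    """Return (entry name, matching peptide count), best match first."""
    scores = [(name, count_matches(measured_masses, seq, tol_ppm))
              for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy database; these are not real protein sequences.
TOY_DATABASE = {
    "toy_protein_A": "MKAGRPLKWSEDR",
    "toy_protein_B": "GASPVTKCLINDRQEMHFYW",
}

if __name__ == "__main__":
    # Simulate a measured peak list: peptides of toy_protein_A, 5 ppm high.
    measured = [peptide_mh(p) * (1 + 5e-6)
                for p in tryptic_peptides(TOY_DATABASE["toy_protein_A"])]
    for name, hits in rank_database(measured, TOY_DATABASE):
        print(name, hits)
```

Counting matches is the simplest possible score. It reflects the statement in Box 2 that the correct protein will have many matching peptides, but a real search would also have to account for missed cleavages, modifications and the chance of random matches in a large database.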