View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Available online at www.sciencedirect.com

Biochimica et Biophysica Acta 1778 (2008) 1698–1713 www.elsevier.com/locate/bbamem

Review Proteome of the Escherichia coli envelope and technological challenges in membrane proteome analysis ⁎ Joel H. Weiner a,b, , Liang Li c

a Membrane Research Group and The Institute for Biomolecular Design, University of Alberta, Canada b Department of and Institute for Biomolecular Design, University of Alberta, Edmonton, Alberta, Canada T6G 2E7 c Department of Chemistry and The Institute for Biomolecular Design, University of Alberta, Edmonton, Alberta, Canada T6G 2E1 Received 18 April 2007; received in revised form 19 July 2007; accepted 23 July 2007 Available online 11 August 2007

Abstract

The envelope of Escherichia coli is a complex organelle composed of the outer membrane, periplasm-peptidoglycan layer and cytoplasmic membrane. Each compartment has a unique complement of , the proteome. Determining the proteome of the envelope is essential for developing an in silico bacterial model, for determining cellular responses to environmental alterations, for determining the function of proteins encoded by genes of unknown function and for development and testing of new experimental technologies such as mass spectrometric methods for identifying and quantifying hydrophobic proteins. The availability of complete genomic information has led several groups to develop computer algorithms to predict the proteome of each part of the envelope by searching the genome for leader sequences, β-sheet motifs and stretches of α- helical hydrophobic amino acids. In addition, published experimental data has been mined directly and by machine learning approaches. In this review we examine the somewhat confusing available literature and relate published experimental data to the most recent gene annotation of E. coli to describe the predicted and experimental proteome of each compartment. The problem of characterizing integral versus membrane- associated proteins is discussed. The E. coli envelope proteome provides an excellent test bed for developing mass spectrometric techniques for identifying hydrophobic proteins that have generally been refractory to analysis. We describe the gel based and solution based proteome analysis approaches along with protein cleavage and proteolysis methods that investigators are taking to tackle this difficult problem. © 2007 Elsevier B.V. All rights reserved.

Keywords: Cytoplasmic membrane; Periplasm; Outer membrane; ; ; Polyacrylamide gel electrophoresis

Contents

1. Introduction ...... 1699 2. E. coli envelope composition ...... 1699 3. The outer membrane proteome ...... 1700 3.1. Predicted outer membrane proteome ...... 1700 3.2. Experimental outer membrane proteome ...... 1701 4. The periplasmic proteome ...... 1702 4.1. Predicted periplasmic proteome ...... 1702 4.2. Experimental periplasmic proteome ...... 1703

Abbreviations: 2MEGA, dimethylation after lysine guanidination; ABC, ATP binding cassette; CNBr, cyanogen bromide; ESI, electrospray ionization; ICAT, isotope-coded affinity tag; IEF, isoelectric focusing; IMP, inner membrane protein; LC, liquid chromatography; MAAH, microwave-assisted acid hydrolysis; MALDI- TOF, matrix assisted laser desorption/ionization-time of flight; MS, mass spectrometry; OMP, outer membrane protein; ORF, open reading frame; PAGE, polyacrylamide gel electrophoresis; PMF, peptide mass fingerprinting; SDS, sodium dodecyl sulfate; TFA, trifluoroacetic acid; TMM, transmembrane ⁎ Corresponding author. Department of Biochemistry and Institute for Biomolecular Design, University of Alberta, Edmonton, Alberta, Canada T6G 2H7. Tel.: +1 780 492 2761; fax: +1 780 492 0886. E-mail addresses: [email protected] (J.H. Weiner), [email protected] (L. Li).

0005-2736/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.bbamem.2007.07.020 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1699

5. The cytoplasmic membrane proteome ...... 1704 5.1. Predicted cytoplasmic membrane proteome ...... 1704 5.2. Respiratory complexes...... 1705 5.3. Experimental cytoplasmic membrane proteome ...... 1705 6. Comparison with microarray ...... 1706 7. Methodology for membrane proteome analysis...... 1706 7.1. Gel-based proteome analysis platform...... 1706 7.2. Solution-based proteome analysis platform ...... 1708 8. Future perspectives ...... 1710 Acknowledgements ...... 1710 References ...... 1710

1. Introduction cellular proteome is associated with the envelope and at least one half of these proteins are of unknown or putative function. It is The availability of complete genomic sequences for hundreds quite possible that the functionally uncharacterized proteins of microorganisms, as well as transcriptome measurements of catalyze unique metabolic transitions and these may prove mRNA levels under a variety of physiologic conditions, has useful for the biotechnology industry. Thus, a full understanding spurred investigators to determine the complete protein compo- of E. coli requires the development of novel technologies sition, or proteome, of entire bacterial cells and how the pro- including proteomic analysis to determine the function of these teome changes with physiologic shifts. Membrane proteins proteins. Proteomic studies under varying growth, regulatory which play essential roles in energetics, metabolism, signal and stress conditions will help with the functional assignment of transduction and transport compose as much as 40% of the unknown proteins. In addition, because of the hydrophobicity of entire proteome of a bacterial cell. The hydrophobic nature of membrane proteins there are significant technical difficulties in membrane proteins has hindered their investigation at the studying the membrane proteome and E. coli provides a standard functional and structural level. Similarly, characterizing the for evaluating and validating new technologies. E. coli offers an membrane proteome has lagged studies of the soluble prote- excellent model system to develop new extraction, solubiliza- ome. However, recent technical advances in separation metho- tion, proteolytic, electrophoretic and mass spectrometry techni- dologies by 2-dimensional electrophoretic techniques and 2- ques which will be applied to the more complex proteomes of dimensional HPLC methodologies, coupled with improved higher organisms. proteolytic methodologies, and advanced mass spectrometry instrumentation is opening the membrane proteome to experi- 2. E. coli envelope composition mental evaluation. This review will summarize current data on the predicted and The envelope of E. coli is a complex structure composed of observed proteome of the Escherichia coli K-12 envelope along the outer membrane, periplasm/peptidoglycan layer and the with advances in methodology that allow more detailed ex- cytoplasmic membrane. The envelope is usually prepared by ploration of this proteome. lysozyme-EDTA lysis [3], French press lysis [4] or Avestin The complete proteome of E. coli is being examined for EmulsiFlex [5] lysis of bacterial cultures. These processes form many reasons. The International E. coli Alliance [1] brings vesicles that can be isolated by differential centrifugation. Inner together biochemists, system biologists and computer scientists and outer membranes can be isolated by density centrifugation with the aim of developing a detailed understanding of the entire comprising an 8000×g spin to sediment unbroken cells and transcriptome, proteome, interactome, metabolome and phy- debris followed by a high speed spin (150,000×g) to sediment siome of E. coli in response to physiologic, genetic or environ- inner and outer membrane vesicles [6]. Inner from outer mem- mental variation with the aim of creating an in silico E. coli brane vesicles can then be separated by density centrifugation where changes can be predicted and experimentally verified. using a sucrose gradient, with the outer membranes having a Whereas the transcriptome determined by microarray techni- higher density. In most cases vesicle integrity is such that ques gives a whole genome profile at the mRNA level, pro- cytosolic proteins trapped in the vesicles can be removed by teomics gives information on expression levels. Proteomics buffer washing. Loosely-associated extrinsic proteins are often provides information on post-translational modifications which removed by washing with chaotropic agents such as Na car- is not possible by microarray and proteomics can be used to bonate [7]. Fractions containing periplasmic proteins can be investigate subcellular localization such as the envelope. prepared by the cold osmotic shock method [8]. The E. coli genome contains 4452 open reading frames One of the first attempts to characterize the envelope pro- (ORF). Of these 2403 (54.1%) have an experimentally deter- teome was carried out by Ames and Nakaido in the mid-1970s mined function and a further 1425 (32%) have a computationally [9] using sodium dodecyl sulfate solubilization coupled with determined function. The remaining 663 (14.9%) are of un- O'Farrell two-dimensional polyacrylamide gels (isoelectric known function. Interestingly, 238 ORFs (5.3%) do not have focusing (IEF) in the first dimension and SDS polyacrylamide homologues in other genomes [2]. About one third of the total gel electrophoresis (PAGE) in the second dimension). They 1700 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

Table 1 length that cross the membrane. Many algorithms are available Predicted outer membrane proteome to identify these, ranging from the original Kyte Doolittle Prediction β-barrel Lipoprotein Reference algorithm [13] to the TMHMM algorithm [14–17] which gives Molloy-SwissProt38 39 23 [21] the most reliable prediction of the presence of transmembrane BOMP 103 N/A [17] α-helices. The outer membrane encompasses a limited number TMB-HUNT 69 N/A [22] of proteins and structural studies of several of them [18,19] have PSORTdb 91 Does not list [23] shown that they are comprised of β-barrel motifs and lack the α- Riley et al. 23 101 [2] helices found in cytoplasmic membrane proteins. The β-barrel structures form mono, di and trimeric structures with 8–22 β- noted the complexity of the sample and observed over 150 spots on the gel. It is not clear that this represented 150 different Table 2 proteins as proteolysis, post-translational modification and iso- Outer membrane proteome comparison electric changes (e.g. deamidation) could have resulted in mul- Riley Riley Riley Riley Molloy Fountoulakis Lopez- Gevaert tiple spots for the same polypeptide. Technology at the time did Campistrous not allow them to identify the proteins. As will be described acrE ompG ybjR yidX btuB btuB btuB aceE below, recent advances in analytical techniques such as the bcsZ ompL ycaL yjaH cirA cirA cirA fimD melding of two-dimensional liquid chromatography, improved blc ompN ycbS yjbF fadL fecA fadL lolB borD ompW yccZ yjcP fepA fepA fecA nlpB two dimensional gel methods, mass spectrometry, selective btuB ompX ycdR yjeI fhuA fhuA fhi ompA proteolysis and bioinformatics allow relatively facile identifica- cirA osmB ycdS yjfO fhuE fimD fhuA ompC tion of the individual proteins. csgG osmE yceB ymcC flu flu hemM ompF The entire genomic sequence of E. coli MG1655 has been cusC pal ycfL ynbE lamB lolB imp ompP available for many years [10], and multiple attempts have been ecnA panE ycfM yncD nlpB lamB lamB ompX ecnB phoE ycjN yoaF ompA mltA mulI rlpA made to assign a function and location for each open reading emtA pldA ydbA yohG ompC nfrA nlpB ycjI frame. Bioinformatic attempts to assign a function are often fadL rcsF ydbJ ypdI ompF nlpB ompC yedD carried out in parallel with experimental approaches to define fecA rlpA ydcL yqhH ompP ompA ompF yfeY function and location and by inference those proteins associated fepA rzoD yddW yraJ ompT ompC ompT yfiO with the envelope. The challenge of such efforts is to provide fhuA rzoR ydeK yraP ompW ompF ompX ygdI fhuE sfmD yeaY ysaB ompX ompT osmE yjeI insightful information that is neither contradictory nor confus- fimD slp yecR pal pal pal yraP ing. This is not a simple exercise. fiu slyB yedD slp pldA rlpA flgH smpA yegR slyB tolC slp 3. The outer membrane proteome fliL spr yehB tolC tsx slyB hslJ srlD yfaZ tsx yaeT tolC kefA tolC yfcT ybhC yciD tsx 3.1. Predicted outer membrane proteome lamB tsx yfeN ybiL yeaF yajG lepB uidC yfeY yaeT yncD ybhC The outer membrane consists of phospholipids, membrane lolB vacJ yfgH yeaF ybjP proteins, lipoproteins and lipopolysaccharide. It forms the mipA wza yfgL yciD outermost layer of the cell envelope and is covalently attached mltA yaeF yfhG yeaF mltB yaeT yfiB yedD to the peptidoglycan by an almost continuous layer of small mltC yafT yfiL yfgL lipoprotein molecules (Lpp or Braun's lipoprotein) [11].The mltD yaiW yfiM yfiO peptidoglycan (murein) layer is a lattice formed by repeating mulI yajG yfiO yifL disaccharides interconnected by peptides that make up the nfrA yajI ygdI ynfB bacterial cell wall. It is a polymer of N-acetylglucosamine nlpB ybaY ygdR yraM nlpC ybbC ygeR yraP (gluNAc) alternating with molecules of N-acetylmuramic acid nlpD ybdA yggG (murNAc). In the eubacteria these molecules are cross-linked nlpE ybeT yghG by pentapetides (L-alanine-D-glutamate-meso-diaminopimelic nlpI ybfM ygiB acid-D-alanine-D-alanine). These peptides are unusual in that nmpC ybfN yhcD nrfG ybfP yhdV they contain the rare D-enantiomers of the Ala and Glu residues ompA ybgQ yhfL [12]. ompC ybhC yiaD The outer membrane of Gram-negative bacteria is the inter- ompF ybjP yidQ face between the cell and the environment and this distinguishes The predicted outer membrane proteome from the Riley consortium annotation Gram-negative from Gram-positive organisms that lack this [2]. This list includes those proteins listed as outer membrane by “cell layer. The double leaflet membrane is unlike a typical phos- localization” as well as those proteins listed as outer membrane by “gene product pholipid bilayer as it has phospholipids on the internal leaflet description” that are underlined. A comparison of the outer membrane proteins and lipopolysaccharide on the outer leaflet. determined by Molloy [21], Fountoulakis [25] and Lopez-Campistrous [28] using 2D gel electrophoresis and Gevaert [26] using LC-MS was undertaken by Integral membrane proteins of the inner and outer membrane combining the Supplementary Data in their publications with the Riley et al. differ in structure. Cytoplasmic membrane proteins are hydro- annotation. Proteins in underline italics are common to all three reports. The phobic and are composed of α-helices 15–25 amino acids in unique proteins identified by Gevaert [26] by LC MS are in bold. J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1701 barrels. The β-barrel is composed of alternate polar and non- branal β-barrel. In E. coli K12, as many as 782 proteins have a polar amino acids. The non-polar amino acids point into the signal sequence and they predict that 69 are transmembranal β- lipid and protein interface, while the polar amino acids point barrel proteins. into the interior of the barrel. Because of this complexity, facile Unlike transmembranal β-barrel proteins, outer membrane prediction of the location of the transmembrane domains of lipoproteins are easier to identify. These proteins can be outer membrane proteins remains elusive. identified by the presence of a leader with a common consensus In addition to the outer membrane proteins with the β-barrel sequence [20]. The leader is typically between 15 and 40 amino architecture, many lipoproteins [20] are found on the outer acid residues in length, and has at least one arginine or lysine in membrane including the major outer membrane protein MulI the first seven residues. The leader is cleaved by signal peptidase (Lpp, Braun's lipoprotein) [11] that links the outer membrane to II, on the amino terminal side of the cysteine residue which is the peptidoglycan layer. It is estimated that each log phase E. then enzymatically modified by the addition of N-acyl and S- coli cell contains 1.6×105 copies of this protein, making it one diacyl glyceryl groups [20]. This lipid serves to anchor these of the most abundant in the cell. Other major outer membrane proteins to the outer leaflet of the inner membrane or inner leaflet proteins include OmpA ∼105 copies per cell, OmpC ∼2×104 of outer membrane so that they can function in the aqueous copies per cell and OmpF ∼104 copies per cell. These proteins periplasmic interface. form hydrophilic channels allowing free diffusion of small PSORTdb is a database that combines experimentation and molecules across the outer membrane. The outer membrane also computational prediction to determine the subcellular location contains specific channels such as PhoE for phosphate, LamB of a protein [23] (Tables 1 and 2). Their analysis of the E. coli for maltose and maltodextrins, Tsx for nucleosides. These genome lists 91 outer membrane proteins, but does not dis- proteins also serve as attachment sites for colicins and tinguish between β-barrel proteins and lipoproteins. bacteriophages (e.g. Tsx for phage T6, LamB for phage λ) Recently, a consortium of investigators has produced a new from which their names derive. There are also high affinity annotation of the ORFs of E. coli [2]. The E. coli consortium receptors for ferric iron (FepA, FhuA), vitamin B12 (BtuB), fatty database relied more on experimental data for annotation and acids (FadL). found that only 23 β-barrel proteins are identified along with β-barrel structures are difficult to predict due to variable 142 potential outer membrane lipoproteins. This listing comes properties and the short stretch of hydrophobic amino acids. from a combination of proteins annotated as “outer membrane” Molloy [21] carried out one of the earlier analyses of the outer by “cell location” (125 proteins) as well as 17 proteins anno- membrane proteome using the SwissProt Release 37 database that tated with an outer membrane location by “gene product de- had 58 potential outer membrane proteins based on a combination scription” (underlined in Table 2). of experiment, computer driven prediction and similarity to OMPs from other organisms. Thirty-seven β-barrel integral proteins were 3.2. Experimental outer membrane proteome predicted to have a pI of 4–7, two integral proteins had a pI N7. There were 10 lipoprotein OMPs with pI of 4–7 and 9 proteins Analysis of the outer membrane is complicated by a series of with pI N7(Tables 1 and 2). No OMPs had a predicted pI b4. The technical problems including the extraction and solubilization more recent release of the SwissProt database Release 42 reveals of outer membrane proteins prior to 2D PAGE, solubilization of 71 OMPs, but this list contains 12 proteins (FhiA, GspD, HlpA, intractable proteins, trypsin , artifacts due to the issue HofQ, Imp, PgpB, YiaT, YifL, YnfB, YpjA, YqiG, YraM) that are of loading large amounts of protein onto 2D gels, protein probably not OMPs based on the Riley annotation [2]. microanalysis by mass spectrometry and the databases used for BOMP [17] (Tables 1 and 2) was developed to predict identification of peptides. These problems also hold for the integral β-barrel OMPs. BOMP uses a C-terminal pattern inner membrane proteome. typical of internal β-barrel proteins and comparison to stretches Molloy [21] used a combination of 2D PAGE and matrix of amino acids typically found in transmembranal β-strands of assisted laser desorption/ionization-time of flight (MALDI- proteins with resolved crystal structures. These workers found TOF) MS to characterize the outer membrane proteome. E. coli ninety-one β-barrel proteins using their algorithm and an cells were broken by French press lysis to prepare a total additional 12 proteins based on BLAST searches for a total of envelope fraction that was washed with 0.1 M Na carbonate to 103 of the 4346 polypeptides in E. coli to be possible integral remove peripheral proteins. The whole envelope was solubilized OMPs. Sixty-seven of the proteins were previously annotated as in 7 M urea, 2 M thiourea, 1% ASB14 (amidosulfo- OMP within SwissProt leaving thirty-six possible additional betaine with an alkyl tail containing 14–16 carbons [24] OMPs. Using this algorithm on various prokaryotic genomes (Table 2). Apparently, this surfactant mixture did not solubilize gave a range of 1.8% to 3% OMPs. proteins from the inner membrane or proteolytic digestion of The TMB-Hunt algorithm [22] (Tables 1 and 2) is designed integral peptides taken from gel plugs was not efficient. Molloy to find β-barrel proteins based on the presence of a signal identified 40 unique proteins, 75% of which were outer mem- sequence and amino acid composition. Using the NCBI FTP brane or membrane-associated. They identified 21 of the 37 site, various strains of E. coli (including pathogens) have a total predicted integral OMPs (78% of OMP not including hypothet- of 5341 different proteins and, of these, 1032 polypeptides have ical) plus one probable OMP, 5 lipoprotein OMPs, 2 cytoplasmic a signal sequence to direct them to the periplasm or outer membrane associated polypeptides (AtpB, NuoC), 1 cytoplas- membrane. Eighty-seven proteins were identified as transmem- mic membrane lipoprotein (AcrA), 4 periplasmic proteins, 1702 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

cytoplasmic proteins were found in the outer membrane fraction. Fountoulakis [25] also found a large number of cytoplasmic proteins associated with the membrane envelope. This highlights the difficulty of using fractionation methodol- ogies as proteins from the cytoplasm can adventitiously stick to other fractions giving erroneous localization. Inner membrane proteins can be found in the outer membrane fraction if there are zones of adhesion between the membranes [29]. Lai et al. [30] extended the proteomic characterization of the E. coli membrane to one of the first functional analysis. Using 2D PAGE, image analysis techniques and trypsin in gel digestion with MALDI mass spectrometry they compared the membrane proteome of minicells that are enriched in polar proteins to rod cell membranes. One hundred seventy-three spots were analyzed and 54 proteins identified. Thirty-six spots were enriched in minicells and 15 spots in rod cells showing that there is polar distribution of envelope proteins. Examination of their results indicated that they identified 14 integral outer membrane proteins, 6 lipoproteins and 4 associated cytoplasmic membrane proteins. Only 1 integral membrane protein, YiaF, was identified. In a recent study Marini et al. [31] examined new candidate Fig. 1. Venn diagram noting the overlapping and uniquely identified proteins in the outer membrane proteins that have been identified by bioinfor- experimentally identified outer membrane proteins by Molloy (M), Fountoulakis matics prediction. The proteins were cloned and overexpressed, (F) and Lopez-Campistrous (L) data as taken from Table 2 and [21,25,28]. and finally localized by cell fractionation to the outer mem- brane. They confirmed the outer membrane localization for five flagellin, 2 cytoplasmic, 2 unknown, and 1 unknown lipoprotein. proteins – YftM, YaiO, YfaZ, CsgF, and YliI – and also provide Re-analysis of their list with the Riley listing indicates 25 OMPs preliminary data supporting an outer membrane location for a (Table 2). sixth — YfaL. Fountoulakis and Glasser [25] carried out a proteomic Thus, unlike the predicted outer membrane proteome, exper- analysis of E. coli envelopes composed of both inner and outer iments to date have only identified a limited number of proteins membranes. They solubilized the membranes with mild in the outer membrane fraction. (CHAPS) or strong detergents (SDS, sodium cholate or sodium deoxycholate), carried out 2D PAGE, in gel proteolysis and 4. The periplasmic proteome MALDI-TOF MS. They identified a total of 394 different gene products of which 25 were annotated to the outer membrane (24 4.1. Predicted periplasmic proteome using the Riley annotation list in Table 2). In one of the first gel-free proteomic analysis of the E. coli The periplasmic space is a gel-like layer composed of soluble proteome, Gevaert et al. [26] identified more then 800 proteins. proteins located between the outer and inner membranes. It is They analyzed only methionine containing peptides in a total highly enriched in and nucleases that would be peptide mixture of an unfractionated lysate. They identified 11 detrimental if they were located in the cytoplasm. The periplasm outer membrane proteins using the MASCOT search engine for contains many substrate binding proteins to capture nutrients mass spectrometry [27] and a database of E. coli methionine which are often present at very low concentrations. Proteins containing peptides and 14 proteins using the PROCORR localized in the periplasmic space can be identified by searching database at the 95% confidence limit (12 proteins at 98% the genome for leader sequences associated with Sec [32] or Tat confidence). Re-analysis of their data using the Riley et al. [2] motifs [33] which address these proteins to their respective database indicates that 17 OMPs were identified (Table 2). translocons. Several algorithms and databases are available for Lopez-Campistrous et al. [28] carried out a large scale 2D gel this analysis. PSORTdb predicts 141 periplasmic proteins [23]. analysis of the E. coli proteome in cells fractionated into cyto- Nielsen et al. [34] used an approach based on neural networks plasm, inner membrane, periplasm and outer membrane by the trained on separate sets of prokaryotic sequences and identified Yamato procedure [6]. They identified 60 proteins in the 224 proteins. isolated outer membrane fraction. Thirty-one proteins are The Project Cybercell (www.projectcybercell.ca) database characterized in Swiss-Prot as outer membrane proteins (34 lists 200 periplasmic proteins. The recent consortium database proteins using the Riley [2] annotation Table 2). Fig. 1 is a Venn [2] lists 367 periplasmic proteins (Table 3) using a combination diagram, graphically depicting the degree of overlap and of “cell localization” and “gene product description” and is the uniquely identified proteins between the Molloy, Fountoulakis, most complete annotation of E. coli. This listing may be an and Lopez-Campistrous databases. A large number of soluble over-estimate as it includes some proteins that are associated J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1703

Table 3 Table 3 (continued) Predicted and experimental periplasmic proteome Riley Riley Riley Riley Lopez-Cam Gevaert Riley Riley Riley Riley Lopez-Cam Gevaert fecB pth ydhS ynfD oppA ackA hdeB treA yfcU agp ackA fepB ptr ydiY ynfE osmY agp hisJ trxB yfdX ansB alsB fhuD puuE ydjG ynfF potD alsB hlpA tynA yfeK araF ansB fhuE rbsB yebF ynhG prc amiA hofQ ugpB yfeW artI aphA fimC recG yecG ynjB pspE amiB htrE uidC yffQ bglX araF fimF rffD yecT ynjE pta amiC hybA ushA yfgI cpdB artI fimI rna yedS ynjH pth ampC iap wcaM yfhD dppA bglX fkpA rseB yedX yobA ptrA ampH imp xylF yfiR dsbA btuB flgA rsxG yedY yodA rbsB ansB ivy yaaI yfjT dsbC cpxP flgD sapA yeeJ yoeA rseB aphA kdsC yacC ygcG eco dacA flgI sbp yeeZ ypdH slt appA ligB yacH ygcI fkpA dacC flhE sfmA yegJ ypeC sufI araF livJ yadM ygeQ fliY dcrB fliY sfmC yehA yphF surA argT livK yadN yggE glnH degP frdA sfmF yehC yqhG tolB artI lolA yaeT yggM glpQ dppA frwB sfmH yehE yqiH tolC artJ lsrB yagW yggN hisJ dsbA frwD slt yehZ yqiI torT aslA malE yagX yghX livJ dsbG galF sodC yejA yqjC trxB asr malM yagY ygiL livK erfK gatB spy yejO yraH ugpB bax malS yagZ ygiS lolA fabF ggt ssuA yfaL yraI ushA bglH marB yahB ygiW malE fdoG glcG sufI yfaP yraJ yaeT bglJ mdh yahJ ygjG mdh fhuE glnH surA yfaQ yraK yahO bglX mdoD yahO ygjJ mdoG fimC glpQ tauA yfaS yrbC yciN btuB mdoG yaiT ygjK mglB fliY gltF tbpA yfaT yrfA ydcS btuF mepA ybaE yhbN modA galF gltI tolB yfcO ytfJ ydeN carB metQ ybaV yhcA mppA glpQ gntX tolC yfcQ ytfM ydgH ccmB metR ybcL yhcD nikA gltI gpsA torA yfcR ytfQ yeeZ ccmE mglB ybcS yhcE oppA gltL gspD torT yfcS znuA yhhA ccmH miaA ybfC yhcF osmY gltX hcaD torZ zraP yliB chbB modA ybgD yhcN potD glyA ynfF chiA mpl ybgF yhdW potF glyQ ynjE citE mppA ybgO yheM proX glyS ytfQ coaE mqo ybgP yhhA ptr gnd znuA cpdB mviM ybgQ yhjJ pstS gnsB The theoretical periplasmic proteome is taken from the Riley annotation [2]. The cpxP napA ybgS yiaO rbsB gntR list includes those proteins listed as periplasmic by “cell localization” as well as creA napB ycbB yiaT slt gor those proteins listed as periplasmic by “gene product description” shown in csgA napC ycbF yibG sodC gph bold. The underlined proteins are annotated as periplasmic, but it is this is not the csgB napD ycbQ yieL sufI gpmB case for DmsA, FrdA, OmpA, OmpG, OmpT, YnfE and YnfF. A listing of the csgC napF ycbR yifB surA greA periplasmic proteins determined by Lopez-Campistrous [28] using 2D PAGE csgE napH ycbV yigE thrC greB methodology and Gevaert [26] using LC-MS was undertaken using the cueO nfo yccT yiiQ tolB groL Supplementary Data in their publications with the Riley et al. annotation [2]. cusF nfrA ycdB yiiX treA grpE cybC nikA ycdK yijF ugpB grxB cysP nrfA ycdO yjbG ushA gshA dacA nrfB ycdS yjcO ybgF gshB with the cytoplasmic membrane, but have Tat leaders (e.g. dacB nrfC yceI yjcS ycdO gsk DmsA, YnfE, YnfF), proteins that are associated with the outer dacC ompA ycfR yjfJ ydcS gst membrane (e.g. OmpA, OmpG, OmpT) and proteins that are dacD ompG ycgH yjfN ydeN guaA clearly on the cytoplasmic side of the membrane (e.g. FrdA). dcrB ompT ycgJ yjfY yehZ guaB ddpA oppA ycgK yjhA yggN guaD Nonetheless, the consortium database provides the best estimate degP osmY ychP yjhT ygiW gudD of the number of periplasmic proteins. degQ paaH yciM yjjA yhbN gudX degS pbl ycjN ykfB yncE gyrA 4.2. Experimental periplasmic proteome dmsA pbpG ydaS ykfF ynjE gyrB dppA phnD ydbD ykgI yrbC hdeB dsbA phnP ydcA ykgJ ytfQ hisJ E. coli is a facultative anaerobe that can grow in diverse dsbC phoA ydcS yliB ivy environments by inducing the synthesis of appropriate trans- dsbG potD yddB yliI lolA porters and enzymes. The periplasmic protein composition can eco potF yddL ymcA malE respond to changes in the physiologic or environmental state ecpD ppiA ydeI ymcB malS and we can expect that only a subset of the proteins in the endA prc ydeN ymgD mdh erfK prkB ydeR yncD mdoG predicted periplasmic proteome will be seen in any one con- eutP proX ydfD yncE mglB dition. Most of the studies reported are carried out with cells fabF pspE ydgD yncI miaA grown in Neidhardt's glucose minimal medium. fdoG pstS ydgH yncJ mqo Several studies of the soluble protein proteome of E. coli fecA pta ydhO ynfB napA have been carried out [35–37]. These studies did not distinguish (continued on next page) 1704 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

membrane is the location of a variety of crucial metabolic processes, such as respiration and the synthesis of lipids and cell wall constituents. Finally, the membrane contains special receptor molecules that help bacteria detect and respond to chemicals in their surroundings. Based on protein diversity the cytoplasmic membrane is by far the largest and most complex compartment of the envelope. As for the other compartments, the protein composition of the membrane will change with physiologic and environmental conditions. It is useful to define all the open reading frames on the genome that can be membrane-bound and several groups have developed algorithms to search the genomic sequence for potential integral membrane proteins. Defining what is a membrane-associated protein has been the subject of infinite debate [38]. The simple situation involves those proteins that Fig. 2. Venn diagram noting the overlapping and uniquely identified proteins in traverse the membrane as an α-helical bundle or contain a lipid the experimentally identified periplasmic proteins by Lopez-Campistrous (L) and Gevaert (G) data as taken from Table 3 and [26,28]. anchor. Far more difficult is defining peripheral or extrinsic proteins that are associated with the membrane. There will be proteins that migrate from cytoplasm to membrane due to post- between cytoplasmic and periplasmic proteins or localization. translational modifications or physiologic changes. An example In the Gevaert analysis (see above) [26] of 800 E. coli proteins is proline dehydrogenase, PutA [39,40] that migrates to the they found 39 periplasmic proteins using MASCOT and 42 membrane upon reduction. Early studies by the Corbin et al. periplasmic proteins using PROCORR. We have re-analyzed [36] group attempted to define a database of known and pre- their supplementary data and find 96 periplasmic proteins in dicted membrane proteins by combining data from the E. coli their total database (Table 3). This assumes that their peptide entry point (http://coli.berkeley.edu/cgi-bin/ecoli/coli_entry.pl) assignments were correct. (156 proteins), proteins listed as membrane proteins in the In the large-scale 2D PAGE study of Lopez-Campistrous et GenProtEC database (http://genprotec.mbl.edu) (634 proteins), al. [28] 107 different proteins were identified on the gel of the or proteins that contained at least two transmembrane (TM) periplasmic fraction isolated by cold osmotic shock of which at helices by the PHD algorithm [41] (821 proteins). This created least 54 were periplasmic proteins based on the Riley et al. a database of 1017 proteins that occurred in at least one of the database [2]. Fig. 2 is a Venn diagram, graphically depicting the databases. The GenProt database includes integral membrane degree of overlap and uniquely identified proteins between the proteins as well as some extrinsic proteins that are part of Gevaert, and Lopez-Campistrous databases. multi-subunit respiratory complexes, but it is not compre- Hervey et al. (ASM Meeting 2006 Abstract K-125) used hensive. For example, FrdA and FrdB, extrinsic subunits of the differential isotope labeling with 14Nand15N minimal medium fumarate reductase complex [42] are not included, but DmsA coupled with LC-MS-MS to identify periplasmic proteins and DmsB, extrinsic subunits of the DMSO reductase complex released by cold osmotic shock. They analyzed 667 proteins [43], are included. The extrinsic components of ATP Binding and found 103 with a high periplasmic:whole cell mass ratio. Of Cassette (ABC) transporter class of membrane proteins [44] these 39 were annotated as periplasmic in Swissprot, an are generally not included. Furthermore, some proteins, known additional 43 contained a signal sequence based on various to be membrane-associated through hydrophobic interactions prediction algorithms and 21 of the 103 were membrane such as Dld, D-lactate dehydrogenase [45] or GlpD, glycerol-3- proteins. Twenty-one proteins were cytoplasmic and presum- phosphate dehydrogenase [46] are defined as cytoplasmic in all ably arose from cell lysis. Twelve known periplasmic proteins of the databases. had low periplasmic:whole cell mass ratios either because they Further complicating the creation of a database, many of the were associated with the membrane or had unusual turnover proteins identified in the Corbin et al. database [36] may not properties. be associated with the membrane. This is because single transmembrane-spanning proteins can often be confused with 5. The cytoplasmic membrane proteome proteins that will be exported through the Sec [47] or Tat [48] translocons to the periplasm. These proteins have an amino 5.1. Predicted cytoplasmic membrane proteome terminal leader that can mimic a transmembranal helix [49]. Proteins with two or more helices are more readily identified. The cytoplasmic membrane (inner membrane or plasma Wallin and von Heijne [50] carried out a statistical analysis of membrane) retains the cytoplasm and separates it from the helix bundle proteins in many organisms including E. coli. surrounding environment. This membrane also serves as a They found that 20–30% of all the open reading frames in an selectively permeable barrier: it allows particular ions and organism code for α-helical membrane proteins. In E. coli the molecules to pass, either into or out of the cell, while value ranges from 24% (∼1028 proteins) using two “certain” preventing the movement of others. The bacterial cytoplasmic transmembrane helices for the analysis to 40% (∼1714 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1705 proteins) for proteins with two “putative” or “certain” trans- In the Gevaert analysis [26] of 800 E. coli proteins they membrane helices. found 56 IMPs using MASCOT and 68 IMPs using PROCORR The Riley et al. [2] database identified 757 integral mem- (13% of the proteins in their database (69/525). In the large- brane proteins, 144 membrane anchored proteins, 12 inner scale analysis of the E. coli proteome carried out by Lopez- membrane lipoproteins and 11 membrane associated lipopro- Campistrous et al. [28], 479 spots on the cytoplasmic membrane teins that could be on the inner membrane. gel could be identified by MALDI-TOF MS. This corresponded to 164 unique proteins. When these proteins were characterized 5.2. Respiratory complexes as to subcellular location using the Riley annotation, we find 94 cytoplasmic, 16 periplasmic, 12 outer membrane, 2 outer mem- While multi-spanning integral membrane proteins and lipo- brane lipoproteins, 14 inner membrane, 16 membrane associ- proteins can be identified by machine searching techniques, the ated, 6 membrane lipoprotein and 3 unknown. Although only 30 proteome also contains many extrinsic polypeptides that are proteins (18%) were identified as inner membrane or mem- hydrophilic subunits of respiratory chain complexes. This brane-associated, twenty-five of the cytoplasmic proteins are problem was noted above in consideration of the GenProt extrinsic components of respiratory complexes, F0F1 ATPase or database. The subunits of respiratory complexes should be con- ABC transporters and six are membrane lipoproteins (19%). sidered part of the membrane proteome, as they will be found in Thus, 37% of the proteins identified were clearly associated with experimental proteomic analysis. the cytoplasmic membrane. Nonetheless, this study showed that The structures of the succinate dehydrogenase (SdhCDAB) relatively few of the α-helical membrane proteins were complex [51], the fumarate reductase (FrdABCD) complex identified. [42,52], the formate dehydrogenase complex [53], and the Corbin et al. [36] used SDS solubilization and an LC MS/MS nitrate reductase complex [5] have all been determined by X-ray approach to analyze the membrane protein profile as part of a crystallographic techniques and each is composed of hydro- global analysis of protein expression in E. coli. They reported a philic subunits bound to hydrophobic, membrane-integral, “long list” of 1147 proteins with at least one peptide used for anchoring subunit(s). These hydrophilic subunits are exposed identification. Of these, 287 were classified as membrane to the cytoplasmic or periplasmic side of the membrane and are proteins using their database of 1017 membrane proteins. clearly part of the membrane proteome. Unfortunately, they are Yan et al. [56] used fluorescence difference 2D gel electro- often annotated as cytoplasmic or periplasmic, depending on the phoresis (2D-DIGE) with proteins differentially labeled with side of the membrane to which they are bound, as in the Riley et Cy3 (control) and Cy5 (benzoic acid treatment of cells to reduce al. database [2]. This analysis can still be open to error. For the proton motive force) dyes. They analyzed the gels by example, FrdA which has been shown in many publications to DeCyder software and studied 197 spots by tryptic digestion be on the cytoplasmic side of the membrane [42,52] is listed as and MALDI-TOF and QTOF mass spectrometry to identify the periplasmic [2]. Similar problems arise with subunits of ATP proteins. Although they indicate that a number of membrane Binding Cassette transporters. proteins were identified, examination of their data indicates that A further complication of determining the membrane pro- these are outer membrane or peripheral proteins. teome relates to soluble cytoplasmic proteins that will interact In a recent study Ji et al. [57] compared N-terminal dime- with membrane proteins through various types of functional thylation after lysine guanidation (2MEGA labeling) in concert association or metabolons. These peripheral membrane proteins with 2-dimensional LC MS/MS to analyze the membrane will only be identified by experimental analysis of the proteome proteome and reduce the number of false positive proteins. Six or tagging and immunologic precipitation studies [54]. These hundred forty proteins were identified in the membrane fraction, functional associations can be very important. For example, in which included 258 membrane and membrane-associated the red blood cell the soluble enzyme catalase is associated with proteins. The labeling method resulted in 153 integral proteins the Band III chloride/bicarbonate exchanger [55] to form a being identified compared to only 77 proteins in the unlabeled metabolon. Current efforts often utilize a Na carbonate wash to sample. remove peripheral proteins [7]. Our view is that these associated Molloy et al. [58] used organic solvent extraction with proteins must be considered part of the proteome and as meth- chloroform:methanol to try to enrich for integral membrane odologies improve it will be possible to analyze these peripheral proteins prior to 2D gel electrophoresis. This did not result in proteins in detail. the identification of any additional integral proteins. Zhang et al. [59] used a combination of SDS and methanol assisted protein 5.3. Experimental cytoplasmic membrane proteome solubilization with LC MS/MS to provide a further character- ization of the membrane proteome. They identified 431 Until recently, analysis of the membrane proteome relied on different proteins of which 217 (50%) were membrane intrinsic 2D PAGE and this method identified primarily hydrophilic (168 with two or more transmembrane α-helices) and an addi- polypeptides associated with the membrane. It failed to identify tional 55 were lipoproteins or components of membrane-bound many integral proteins. This is the result of difficulty in solu- complexes. Additionally, 29 outer membrane proteins were bilizing these proteins in a detergent suitable for isoelectric identified in this analysis. focusing in the first dimension and problems with in situ prote- Daley et al. [49], Granseth et al. [60] have taken a different olysis of the proteins to obtain peptides for MS analysis. approach to study the E. coli proteome. Although about 1000 1706 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 proteins have at least two transmembrane helices, they studied tion of the complex peptide mixture and mass spectrometric 737 inner membrane proteins that traverse the membrane at least sequencing of individual peptides for identification with elec- twice and are greater than 100 amino acids in length. Each open trospray ionization (ESI) [66] or MALDI [67] (i.e., solution- reading frame was tagged at the C-terminus with alkaline based method). As discussed below, each platform has its pros phosphatase and green fluorescent protein. Over six hundred and cons and they often provide complementary information on membrane proteins could be expressed and their topology a proteome. While technical advances in both platforms have determined. A tagging approach has great benefit for determining been made for membrane proteome analysis, the solution-based the subcellular localization of each protein and in many cases will method has gained popularity in the past few years due to its provide functional information. For example, it has been possible impressive proteome coverage and rapid quantification power. to localize cell division proteins within the bacterial cell [61], However, this approach has limitations as the chimeric proteins 7.1. Gel-based proteome analysis platform are normally expressed from an inducible promoter. In order to compare different metabolic, environmental or genetic variations There are several ways of implementing the gel-based a detailed profile of proteins identifiable by MS techniques and/ membrane proteome analysis platform [38]. Isoelectric focusing or 2-dimensional PAGE will be required. (IEF) of proteins can be carried out using a solution-based IEF system [68–71] or an immobilized IEF gel strip to provide the 6. Comparison with microarray first dimension protein separation. Polyacrylamide gel electro- phoresis (PAGE) can then be employed to separate the proteins Both Corbin et al. [36] and Zhang et al. [59] were able to further according to their molecular sizes. The combination of compare their protein profile with Affymetrix microarray data IEF and PAGE (i.e. 2D-PAGE) provides an efficient means of of the transcriptome obtained under identical growth conditions. separating proteins from a complex proteome sample. Because In both studies there was a strong correlation between mRNA of the high resolving power, particularly for separating proteins expression level and protein identification indicating that the with different conformers or different degrees of modifications, most highly expressed proteins were found. Additional purifi- 2D-PAGE is an excellent tool to unravel subtle changes in the cation and LC techniques will be required to identify low proteome. For example, it is possible to monitor the change of abundance proteins. posttranslational modifications of proteins from cells of dif- ferent states that is difficult or impossible to detect using the 7. Methodology for membrane proteome analysis solution-based platform [56]. Applying 2D-PAGE to separate very hydrophobic proteins Proteome analysis generally involves several steps including such as integral membrane proteins is challenging. Solubiliza- cell lysis, protein extraction, protein separation, protein tion of these proteins requires the use of strong , such digestion, mass spectrometric analysis of peptides, and data as SDS which are not compatible with IEF. Several zwitterionic processing for peptide and protein identification. Each step is detergents have been shown to improve the performance of 2D- important and should be optimized to generate a comprehensive PAGE separation of integral membrane proteins [38,72,73]. profile of the E. coli membrane proteome. While there are many However, hydrophobic proteins may precipitate at their iso- sample handling protocols and analytical techniques available electric points during the IEF separation process, resulting in to detect hydrophilic and readily soluble proteins, analysis of sample loss and difficulty for second dimension separation. 2D- hydrophobic membrane proteins is still a challenging task. PAGE analysis of the yeast, tobacco or Arabidopsis proteome Fortunately, in the past several years, a number of research revealed that many integral transmembrane proteins could not be groups have been actively involved in developing new and identified [64]. To overcome the problem of IEF, Zahedi et al. improved analytical tools to characterize the membrane reported an interesting technique based on the use of cationic proteome of cells, tissues, and other samples. Good progress detergent benzyldimethyl-n-hexadecylammonium chloride for has been made for membrane proteome analysis. Characteriza- the first dimension protein separation followed by anionic tion of the membrane proteome of a relatively simple detergent SDS-PAGE separation ([74]). While separation of microorganism, E. coli, is one of the excellent ways of judging several integral membrane proteins was demonstrated, it remains the performance of a newly developed technique. Although to be seen whether this technique offers sufficient separation techniques described below may not directly involve the power for a complicated membrane proteome. It appears that analysis of the E. coli membrane proteome, they should be 1D-SDS-PAGE is at present a better choice for handling very applicable to E. coli samples with or without any modifications. hydrophobic membrane proteins, albeit with reduced resolving There are mainly two major analytical platforms commonly power compared to 2D-PAGE [75]. used for analyzing membrane proteomes. One uses gel electro- One of the major advantages of the gel-based method for phoresis for proteome display followed by in-gel digestion of proteome analysis is that it can provide quantitative information protein spots of interest and mass spectrometric analysis of the on proteins directly, offering a convenient means of comparing resultant peptides for peptide and protein identification (i.e., proteomes of different samples. If protein isoforms and modi- gel-based method) [7,21,38,62–65]. Another platform involves fied proteins are resolved in the gel, they can be profiled and the digestion of a proteome with little or no separation at the relative distributions of these proteins can potentially be cor- protein level followed by liquid chromatography (LC) separa- related with a certain phenotype of the samples to study their J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1707 functions. To generate quantitative information, gels are often larger number of small peptides than can be obtained using stained with Coomassie blue or silver. Fluorescence dyes are CNBr alone, trypsin can be used to digest further the CNBr- being increasingly used for protein display. Differential display cleaved fragments. The protocol has been applied for the and analysis of relative abundance changes of two proteomes is analysis of bacteriorhodopsin, nitrate reductase 1 gamma chain possible by labeling individual proteomes with different dyes (NarI E. coli), and a complex protein mixture extracted from the and mixing of the labeled proteins followed by gel separation endoplasmic reticulum membrane of mouse liver [77]. In these and fluorescence detection of labeled proteins with different instances it was demonstrated that the sensitivity of membrane wavelengths in a gel [56]. These staining methods are com- protein identification is in the low picomole regime that is patible with mass spectrometric analysis, providing high quality compatible with Coomassie staining of gel-spots. reagents are used (i.e., using mass spectrometry compatible Mass spectrometric analysis of the extracted peptides from gel reagents available from leading gel electrophoresis suppliers). spots can be done on a variety of instruments. When 2D-PAGE is Radiolabeling or Western blotting of proteins of interest can be used for protein separation, an individual spot may contain a more sensitive than silver or fluorescence staining methods protein with relatively high abundance plus a few other minor for detecting proteins in a gel, but the sensitivity of mass components. As long as one protein is a dominant species in a spectrometric techniques may not be sufficient to identify these spot, peptide mass mapping or fingerprinting (PMF) can be proteins. applied to identify the protein. In PMF, the peptide extract is To identify the proteins detected in a gel by mass spectro- analyzed directly by MALDI or ESI MS. Peptide masses deter- metric techniques, all gel spots (e.g., in qualitative proteome mined can be searched against a proteome database comprised of profiling) or a selected few spots of interest (e.g. only the protein names, protein sequences and their predicted peptide proteins showing significant changes in relative quantification fragments after enzyme digestion with known specificity. In the of proteomes) are excised and subjected to in-gel digestion with case of trypsin digestion, expected sequences and masses of an enzyme, such as trypsin, or a chemical reagent, such as tryptic peptides from cleavage of terminal lysine or arginine cyanogen bromide (CNBr). Extracting or blotting proteins from residues can be readily generated apriorifor any given protein in a gel band to a solution for digestion is less efficient than in-gel a proteome whose sequence is known. By comparing the number digestion where proteins are degraded into peptides which can and quality of peptide mass matching between the experimental be more readily extracted into a solution than the intact proteins. data and the predicted peptides in the database, one can often As expected, in-gel digestion of membrane proteins and sub- identify a protein with high confidence. Several search engines sequent sample preparation for mass spectrometric analysis are are available to perform PMF. For example, the web-based, free much more difficult than in the case of analyzing hydrophilic search engine MASCOTcan be used to perform quickly PMF and proteins. The standard trypsin digestion protocol can be applied to generate a report of protein candidates along with confidence to membrane proteins where trypsin digestion can take place in levels of the matches [78]. PMF works well if many peptides are the gel and the resulting peptides effectively extracted for MS detected from a protein and the mass measurement accuracy for analysis. However, the method would fail if digestion is in- determining the peptide mass is moderately high. It is a rapid efficient due to lack of cleavage sites or lack of accessibility of method and does not require the use of sophisticated mass the to the cleavage sites, such as those within the spectrometric instrumentation. However, if only one or two membrane-spanning segments of a transmembrane protein. The peptides are detected from a gel spot or the gel spot contains more method would also fail if the resulting peptides are too large or than one dominant protein, PMF would fail to arrive at a unique too hydrophobic to be extracted or to be analyzed by MS and match. In this case, a more powerful technique, tandem mass MS/MS analysis. spectrometry or MS/MS, is required for protein identification. New digestion and sample handling protocols are being There are a number of different types of tandem mass spec- developed to tackle these problems. For example, van Montfort trometric instruments being used to produce MS/MS spectra. In et al. reported a protocol involving sequential in-gel trypsin and general, they start with the generation of a mass spectrum of the CNBr digestion [76] in which larger fragments generated from peptides (MS mode), followed by selecting a peptide ion and in-gel trypsin digestion were further digested by CNBr to yield subjecting it to collision-induced dissociation using a collision smaller peptides which could be more efficiently extracted for gas introduced to the mass spectrometer (MS/MS mode). The MS analysis. Quach et al. have developed an improved in-gel intensities and mass-to-charge values (m/z's) of the fragment digestion protocol that involves the use of CNBr digestion, ions are recorded to produce a MS/MS spectrum. Peptide ion followed by further trypsin digestion [77]. The chemical fragmentation takes place mainly from the breakage of the cleaving reagent CNBr is expected to interact readily with a amide bonds between amino acids. Although a MS/MS spec- membrane protein, compared to a protease which is bulky and trum usually does not contain complete amino acid sequence must retain its optimal conformer for activity. CNBr selectively information for de novo sequencing of a peptide, it contains cleaves methionine, but due to the low number of methionine in partial information on peptide sequence. For protein identifica- residues in proteins, CNBr cleavage produces a small number of tion, an un-interpreted MS/MS spectrum can be entered into a large peptide fragments with MW typicallyN2000 Da that are database search engine where the fragment ion masses (some also difficult to extract from gel pieces with high efficiency. In search engine also considers intensities) are searched against the addition, these large peptides cannot be readily fragmented to predicted fragment ion spectra of individual peptides of protein produce MS/MS spectra for protein identification. To generate a digests for possible matches. 1708 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

From the automated MS/MS database search, a report is proteins, dissolving all proteins in a solvent system that is generated including matched peptide sequences from a given comparable with enzymatic or chemical protein digestion can MS/MS spectrum and their corresponding proteins along with be challenging. An efficient means of degrading proteins into matching scores. A probability-based matching algorithm has peptides of suitable length for LC separation and MS analysis is been introduced recently in several search engines to provide a also vital to the success of the solution-based platform. statistical evaluation of the fragment ion and sequence matches. Among the reported protein digestion methods, trypsin diges- For example, in the MASCOT search engine, for each peptide tion is still the most commonly used due to trypsin's relatively sequence matched to a MS/MS spectrum, a matching score as high enzyme specificity. Trypsin generates peptides of near ideal well as a threshold score is listed [78]. The threshold score defines size (b30 residues containing basic amino acids Arg and/or Lys) the minimum score required to call a positive identification within for MS and MS/MS and digestion can be carried out as long as the a confidence level (e.g., N95% certainty). For a peptide with a solvent system used to dissolve the membrane proteins does not matching score of slightly above the threshold (e.g., confidence totally denature the trypsin. Ammonium bicarbonate buffer is level of between 95% and 99%), manual comparison of the MS/ widely used to dissolve soluble proteins and may dissolve a MS spectrum with the predict fragmentation pattern of the peptide membrane protein containing extensive hydrophilic moieties. To ion may be carried out to confirm or discard the peptide match increase the solubility of membrane proteins, surfactants can be from the automated database search step. added to a solution. For example, 0.5% SDS has been used by MS/MS spectral acquisition can be carried out in conjunction Han et al. to solubilize a membrane-enriched microsomal fraction, with liquid chromatography (LC) separation of the extracted followed by trypsin digestion in dilute SDS solution [80]. peptides from a gel spot. Either LC-ESI or LC-MALDI MS and However, surfactants may adversely affect the digestion process. MS/MS or both may be used for analyzing a sample [67]. This For example, 1% SDS, a strong ionic surfactant, can be used to is particularly important for identifying proteins separated only dissolve many membrane proteins, but trypsin will be denatured by one-dimensional gel electrophoresis. A gel band from 1D- at this high concentration of SDS, making its useless for digestion. PAGE may contain many proteins. Thus, the rapid and simple However, by diluting the membrane protein solution to about PMF method would fail to generate unique protein identifica- 0.1% SDS, protein digestion can proceed with reasonably good tion. The peptide sample from a 1D gel band is a complex efficiency [59]. A cleavable detergent, 3-[3-(1,1-bisalkylox- mixture. Direct MS/MS analysis of the mixture may result in the yethyl)pyridin-1-yl]propane-1-sulfonate (PPS) is compatible detection of only a few readily detectable peptides (i.e. high with trypsin digestion and has been useful to dissolve membrane abundance and more ionizable peptides) due to an ion sup- proteins [81]. Wu et al. reported a method for comprehensive pression effect, i.e., a few peptides dominating the spectrum at membrane protein analysis using non-specific Proteinase K for the expense of other low abundance and less ionizable peptides. digestion and subsequent analysis by LC-ESI MS/MS [66]. Peptide separation by LC can significantly reduce the ion Both organic acids (e.g. trifluoroacetic acid (TFA)) and organic suppression effect, resulting in the detection of a greater number solvents (e.g. methanol) have been reported to be effective in of peptides from a complex mixture [79]. dissolving membrane proteins. Washburn et al. developed a To summarize the application of the gel-based platform for surfactant-free method that used 90% formic acid to solubilize E. coli membrane proteome analysis, it has been shown that, proteins in the presence of CNBr, with further enzymatic if the membrane proteins are amenable to 2D-PAGE separation, digestion of the CNBr-cleaved protein fragments by LysC and relative quantification of proteomes can be carried out at the trypsin [82]. Recent studies suggest that trypsin is functional for protein level with the possibility of examining proteins with digestion with a methanol concentration of up to about 65% [83– different conformers or modifications. Protein identification can 87]. Thus, a high concentration of methanol can be used to be carried out by in-gel digestion, peptide extraction and PMF solubilize membrane proteins, followed by trypsin digestion and or MS/MS of the peptides. If only 1D-PAGE can be used to this technique has been reported to be useful for membrane separate the membrane proteins, due to its limited separation proteome analysis [85–87]. The use of an organic solvent, such as power, individual gel bands would contain mixtures of proteins, 60% methanol, to solubilize membrane proteins has one major making quantitative analysis of individual proteins difficult. advantage compared to the use of a solvent system containing a Identification of proteins residing in a gel band can be done by strong surfactant, such as SDS. The organic solvent can be readily using in-gel digestion, peptide extraction and MS/MS or LC removed after protein digestion, while a strong surfactant, such as MS/MS of the peptide mixture. SDS, is difficult to remove. The advantage of SDS over organic solvents is that it can 7.2. Solution-based proteome analysis platform dissolve a wider range of proteins, including misfolded and precipitated proteins. As SDS can degrade the performance of In the past several years, a solution-based or gel-free tech- reversed-phase separations and MS analyses of peptides, the nical platform has been developed for membrane proteome remaining SDS in the digested peptide sample must be removed analysis [79]. In the solution-based method, the entire protein by using an ion-exchange column. In a solution-based method, mixture is first digested and the resulting peptides analyzed by where two-dimensional (2D) peptide separation is used, the first LC MS/MS. In applying this method to the membrane pro- dimension of separation is generally based on ion-exchange teome, protein solubilization and digestion must proceed with chromatography. Thus, SDS removal from the digested peptide high efficiency. Due to the hydrophobic nature of membrane sample is integrated into the first-dimensional peptide separation. J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1709

Zhang et al., compared the ability of 60% methanol and 1% Considering the complexity of the membrane proteome, it SDS to dissolve the inner membrane fraction of an E. coli K12 appears that one method of solubilization and digestion cannot cell lysate [59]. By using trypsin digestion and 2D-LC ESI MS/ be universally applied to handle all proteins. A combination MS they found that 358 proteins (1417 unique peptides) and 299 of several complementary methods may result in greater proteins (892 peptides) were identified from the methanol- proteome coverage. More recently, Wang et al. reported a solubilized protein mixture and the SDS-solubilized sample, sequential protein solubilization and digestion protocol for respectively. The methanol method detected more hydrophobic zebrafish liver proteome analysis [90]. In their work, it was peptides, resulting in a greater number of proteins identified, than found that, after dissolving the protein pellet from the liver the SDS method. 159 out of 358 proteins (44%) detected by SDS tissue extracts in a basic buffer and subjecting it to trypsin solubilization and 120 out of 299 proteins (40%) detected by digestion, a non-soluble residue remained in the vials. The methanol solubilization were integral membrane proteins. Of a residue was subjected to additional levels of digestion, first total of 190 integral membrane proteins 70 were identified by methanol-assisted trypsin digestion, followed by SDS- exclusively in the methanol-solubilized sample, 89 were assisted trypsin digestion and finally by the MAAH method. identified by both methods, and only 31 proteins were exclusively The peptide mixtures were pooled and subjected to strong identified by the SDS method. In addition, it was determined that cation exchange chromatographic separation, followed by a the protein solubilization potential of SDS or methanol was not reversed-phase column and LC-ESI MS/MS analysis. Pro- the crucial parameter determining overall performance of each teome analysis using the combined solubilization/digestion method but rather the compatibility of these two reagents with methods led to the identification of 1204 unique proteins. protein digestion, downstream peptide separation and ESI-MS Among the 1204 proteins identified, 224 (19%) were found analysis was critical for their analytical performance. The better in all three samples, while 113 (9%), 420 (35%), and 214 compatibility of the methanol method with 2D-LC-MS/MS (18%) proteins or related protein groups were uniquely resulted in a higher number of total peptide and protein iden- observed in buffer/methanol digest, SDS digest, and MAAH tification, higher reproducibility and pronounced bias of the digest, respectively. method towards hydrophobic peptide and integral membrane In the solution-based method, relative quantification of proteins. Using the combined datasets obtained from the two proteomes of different samples is commonly done using stable methods, it was shown that there was no bias in the experimental isotopic labeling of proteins or peptides [80]. One sample is proteome compared to the predicted membrane proteome [59]. labeled with a light isotope and another one with a heavy isotope. To achieve maximum digestion efficiency, Ji et al. applied Isotope incorporation can be achieved at the protein level during two consecutive digestion steps for the analysis of the E. coli cell growth using either an isotope-enriched minimal medium or membrane proteome [57] in which membrane proteins were conventional medium plus isotope-labeled amino acids [91,92]. first dissolved in 60% methanol and digested with trypsin for This is followed by enzyme or chemical digestion of labeled 5 h. In a second step un-dissolved protein was pelleted and re- proteins to generate isotope-tagged peptides. Note that this suspended in 0.05% SDS with trypsin and digested overnight. labeling strategy works only for cells that can grow in the special Both digests were pooled and subjected to LC-ESI MS/MS culture medium and cannot be readily used for other samples, analysis. such as tissues or body fluids. An alternative strategy is to use When enzymes or chemical reagents are used to degrade chemical derivatization to attach an isotope tag to the digested membrane proteins, an important requisite for these protein peptides of a proteome. In both strategies, the isotope labeled degradation methods to work is that the proteins to be degraded peptides from a mixture of light- and heavy-isotope-labeled must be dissolved in a suitable solution. Unfortunately, during samples produce pairs of peaks in a mass spectrum when the the protein sample workup, many proteins may become highly mixture is analyzed using the MS scan mode. The relative de-natured and are not readily soluble in any solvent including intensities of a pair of peaks can be used to determine the that containing strong surfactant. As a consequence, these pro- abundance change of the peptide and its corresponding protein. teins are not detected by the solution-based proteome analysis MS/MS spectra of the peptide pair can be generated for peptide method. Zhong et al. have recently described a method that does and protein identification. not require the use of a solvent to solubilize proteins by using There are many isotope labeling reactions reported for rela- microwave-assisted acid hydrolysis (MAAH) [88] in 25% tive proteome quantification. Among them, the isotope-coded aqueous TFA to degrade proteins into peptides for MS char- affinity tag (ICAT) approach pioneered by Aebersold and acterization [89]. Compared to enzymatic digestion, MAAH is coworkers [80,93–95] has been extensively used. The main fast and detergent-free. It involves a simple sample handling advantage of this method is that it enriches peptides containing process and there are no background peptides, such as those the rare amino acid cysteine, thereby significantly reducing the from protease autolysis in enzyme digestion, introduced in complexity of the peptide mixture and increasing the dynamic MAAH. MAAH is particularly useful in dealing with membrane range of MS analysis. On the other hand, the use of the ICAT proteome samples of cultured cells and tissue samples. The reagents fails for quantification of cysteine-free proteins. In application of this method, in combination with LC-MALDI MS addition, the ICAT reagents are structurally complex and thus and MS/MS, was illustrated in the analysis of membrane the cost of the reagents is high. As alternatives to ICAT, other proteins isolated from a human breast cancer cell line [89] and a chemical labeling protocols of peptides after protein digestion heart tissue sample [Mulu et al., submitted]. have been developed and recently reviewed [96]. 1710 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

Simple labeling chemistry and inexpensive reagents are historically “soluble” cytoplasmic proteins will be shown to important in applying the isotope labeling approach for relative be part of functional complexes on the membrane. New 18 quantification of proteomes. As an example, H2 O can be used nanotechniques along with rapid proteomic approaches will in protease digestion to introduce the 18O tag through the hy- allow the characterization of bacterial infections without the drolysis reaction to label all proteolytic peptides uniformly at the need to wait for cultures to grow in the clinical laboratory. C-terminus [97–99]. Another example is differential dimethyl Proteomics is already being used to identify increased labeling of N-termini and α-amino groups of lysine residues of expression of proteins in the clinical situation and the formation tryptic peptides with d(0)- or d(2)-formaldehyde [100–102]. of biofilms [107]. The techniques pioneered with the model Dimethyl labeling combined with LC-MALDI MS and MS/MS organism E. coli will be applied to the plasma membranes of has been applied to determine the proteins differentially expressed eukaryotic cells and the membranes of organelles. between an E-cadherin-deficient human carcinoma cell line (SCC9) and its transfectants expressing E-cadherin (SCC9-E) Acknowledgements [103]. A total of 5480 peptide pairs were examined and 320 of them showed relative intensity changes of greater than 2-fold Research in the authors' laboratories is supported by the which led to the identification of 49 differentially expressed Canadian Institutes of Health Research and the Natural Sciences proteins. More recently, Ji et al. reported a modified N-terminal and Engineering Research Council of Canada. JHW is a Canada dimethyl labeling strategy, in which the N-termini of tryptic Research Chair in Membrane Biochemistry, LL is a Canada peptides are differentially labeled with either d(0),12C-formalde- Research Chair in Analytical Chemistry. JHW would like to hyde or d(2),13C-formaldehyde after lysine residues in peptides thank the Alberta Heritage Foundation for Medical Research for are blocked by guanidination [57]. Guanidination is known to be support during the writing of this review and Dr. Fraser effective for selective labeling of lysine residues in peptides [104]. Armstrong and the Laboratory of Inorganic Chemistry at It has been demonstrated that N-terminal dimethylation (2ME) Oxford University. We thank Philip Winter for preparation of after lysine guanidination (GA) or 2MEGA provides uniform 6- the Venn diagrams. Da differential isotope tags on peptides that facilitates protein identification and quantification [57]. References In summary, with the development of new protocols for solu- bilizing and digesting membrane proteins as well as improved [1] C. Holden, Cell biology. Alliance launched to model E. coli, Science isotope labeling chemistry, the solution-based method will play an 297 (2002) 1459–1460. increasingly important role in membrane proteomics. It is [2] M. Riley, T. Abe, M.B. Arnaud, M.K. Berlyn, F.R. Blattner, R.R. Chaudhuri, J.D. Glasner, T. Horiuchi, I.M. Keseler, T. Kosuge, H. Mori, anticipated that this method, combined with multi-dimensional N.T. Perna, G. Plunkett III, K.E. Rudd, M.H. Serres, G.H. Thomas, N.R. LC separation techniques, will allow us to examine the E. coli Thomson, D. Wishart, B.L. Wanner, Escherichia coli K-12: a coopera- membrane proteome with unprecedented coverage and accuracy. tively developed annotation snapshot-2005, Nucleic Acids Res. 34 In this review we have taken the protein identifications as (2006) 1–9. published by the investigators. In many studies confidence [3] P. Owen, H.R. Kaback, Antigenic architecture of membrane vesicles from Escherichia coli, Biochemistry 18 (1979) 1422–1426. limits are not presented and the number of false positives is not [4] P. Dickie, J.H. Weiner, Purification and characterization of membrane- listed. In the Gevaert study [26], they indicate that of the 689 bound fumarate reductase from anaerobically grown Escherichia coli, proteins identified using the PROCORR algorithm approxi- Can. J. Biochem. 57 (1979) 813–821. mately 42 false positives are obtained at the 98% confidence [5] M.G. Bertero, R.A. Rothery, M. Palak, C. Hou, D. Lim, F. Blasco, J.H. limit. This rises to 69 false positives at 95% confidence. The Weiner, N.C. Strynadka, Insights into the respiratory electron transfer pathway from the structure of nitrate reductase A, Nat. Struct. Biol. 10 issues of variation in mass spectrometry technology and iden- (2003) 681–687. tification algorithms need to be addressed so that readers can [6] I. Yamato, M. Futai, Y. Anraku, Y. Nonomura, Cytoplasmic membrane have confidence in the data presented. As proteomics moves to vesicles of Escherichia coli. II. Orientation of the vesicles studied by rely on increased quantitative analysis this information will localization of enzymes, J. Biochem. (Tokyo) 83 (1978) 117–128. become even more important. [7] M.P. Molloy, N.D. Phadke, J.R. Maddock, P.C. Andrews, Two- dimensional electrophoresis and peptide mass fingerprinting of bacterial outer membrane proteins, Electrophoresis 22 (2001) 1686–1696. 8. Future perspectives [8] H.C. Neu, L.A. Heppel, The release of enzymes from Escherichia coli by osmotic shock and during the formation of spheroplasts, J. Biol. Chem. Membrane proteomic studies are rapidly progressing with 240 (1965) 3685–3692. ever more different proteins identified with increasing levels of [9] G.F. Ames, K. Nikaido, Two-dimensional gel electrophoresis of membrane proteins, Biochemistry 15 (1976) 616–623. confidence. Future studies will utilize newly devised quantita- [10] F.R. Blattner, G. Plunkett III, C.A. Bloch, N.T. Perna, V. Burland, M. tive methods to monitor the membrane proteome and how it Riley, J. Collado-Vides, J.D. Glasner, C.K. Rode, G.F. Mayhew, J. Gregor, changes as the physiologic or environmental conditions change N.W. Davis, H.A. Kirkpatrick, M.A. Goeden, D.J. Rose, B. Mau, Y. [105,106]. The turnover of membrane proteins will be Shao, The complete genome sequence of Escherichia coli K-12, Science – investigated providing the ability to correlate mRNA turnover 277 (1997) 1453 1474. [11] V. Braun, Covalent lipoprotein from the outer membrane of Escherichia with protein turnover. It will also be possible to monitor post- coli, Biochim. Biophys. Acta 415 (1975) 335–377. translational modifications of membrane proteins. More [12] W. Vollmer, J.V. Holtje, Morphogenesis of Escherichia coli, Curr. Opin. complex will be the identification of metabolons where Microbiol. 4 (2001) 625–633. J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1711

[13] J. Kyte, R.F. Doolittle, A simple method for displaying the hydropathic Root, J. McAuliffe, M.I. Jordan, S. Kustu, E. Soupene, D.F. Hunt, Toward a character of a protein, J. Mol. Biol. 157 (1982) 105–132. protein profile of Escherichia coli: comparison to its transcription profile, [14] C.P. Chen, A. Kernytsky, B. Rost, Transmembrane helix predictions Proc. Natl. Acad. Sci. U. S. A. 100 (2003) 9232–9237. revisited, Protein Sci. 11 (2002) 2774–2791. [37] R.A. VanBogelen, K.Z. Abshire, B. Moldover, E.R. Olson, F.C. [15] E.L. Sonnhammer, G. von Heijne, A. Krogh, A hidden Markov model for Neidhardt, Escherichia coli proteome analysis using the gene-protein predicting transmembrane helices in protein sequences, Proc. Int. Conf. database, Electrophoresis 18 (1997) 1243–1251. Intell. Syst. Mol. Biol. 6 (1998) 175–182. [38] V. Santoni, M. Molloy, T. Rabilloud, Membrane proteins and proteomics: [16] A. Krogh, B. Larsson, G. von Heijne, E.L. Sonnhammer, Predicting un amour impossible? Electrophoresis F 21 (2000) 1054–1070. transmembrane protein topology with a hidden Markov model: [39] J.M. Wood, Membrane association of proline dehydrogenase in Escher- application to complete genomes, J. Mol. Biol. 305 (2001) 567–580. ichia coli is redox dependent, Proc. Natl. Acad. Sci. U. S. A. 84 (1987) [17] F.S. Berven, K. Flikka, H.B. Jensen, I. Eidhammer, BOMP: a program to 373–377. predict integral beta-barrel outer membrane proteins encoded within genomes [40] E.D. Brown, J.M. Wood, Conformational change and membrane of Gram-negative bacteria, Nucleic Acids Res. 32 (2004) W394–W399. association of the PutA protein are coincident with reduction of its [18] A. Pautsch, G.E. Schulz, High-resolution structure of the OmpA mem- FAD cofactor by proline, J. Biol. Chem. 268 (1993) 8972–8979. brane domain, J. Mol. Biol. 298 (2000) 273–282. [41] B. Futcher, G.I. Latter, P. Monardo, C.S. McLaughlin, J.I. Garrels, A [19] S.K. Buchanan, B.S. Smith, L. Venkatramani, D. Xia, L. Esser, M. Palnitkar, sampling of the yeast proteome, Mol. Cell. Biol. 19 (1999) 7357–7368. R. Chakraborty, D. van der Helm, J. Deisenhofer, Crystal structure of the [42] B.D. Lemire, J.J. Robinson, R.D. Bradley, D.G. Scraba, J.H. Weiner, outer membrane active transporter FepA from Escherichia coli,Nat.Struct. Structure of fumarate reductase on the cytoplasmic membrane of Biol. 6 (1999) 56–63. Escherichia coli, J. Bacteriol. 155 (1983) 391–397. [20] H.C. Wu, Biosynthesis of lipoproteins, American Society for Microbi- [43] P.T. Bilous, S.T. Cole, W.F. Anderson, J.H. Weiner, Nucleotide sequence ology, 1996. of the dmsABC operon encoding the anaerobic dimethylsulphoxide [21] M.P. Molloy, B.R. Herbert, M.B. Slade, T. Rabilloud, A.S. Nouwens, K.L. reductase of Escherichia coli, Mol. Microbiol. 2 (1988) 785–795. Williams, A.A. Gooley, Proteomic analysis of the Escherichia coli outer [44] K.P. Locher, Structure and mechanism of ABC transporters, Curr. Opin. membrane, Eur. J. Biochem. 267 (2000) 2871–2881. Struck. Biol. 14 (2004) 426–431. [22] A.G. Garrow, A. Agnew, D.R. Westhead, TMB-Hunt: an amino acid [45] E.A. Pratt, J.A. Jones, P.F. Cottam, S.R. Dowd, C. Ho, A biochemical composition based method to screen proteomes for beta-barrel trans- study of the reconstitution of D-lactate dehydrogenase-deficient mem- membrane proteins, BMC Bioinformatics 6 (2005) 56. brane vesicles using fluorine-labeled components, Biochim. Biophys. [23] S. Rey, M. Acab, J.L. Gardy, M.R. Laird, K. deFays, C. Lambert, F.S. Acta 729 (1983) 167–175. Brinkman, PSORTdb: a protein subcellular localization database for [46] A. Schryvers, E. Lohmeier, J.H. Weiner, Chemical and functional bacteria, Nucleic Acids Res. 33 (2005) D164–D168. properties of the native and reconstituted forms of the membrane-bound, [24] M. Chevallet, V. Santoni, A. Poinas, D. Rouquie, A. Fuchs, S. Kieffer, M. aerobic glycerol-3-phosphate dehydrogenase of Escherichia coli, J. Biol. Rossignol, J. Lunardi, J. Garin, T. Rabilloud, New zwitterionic detergents Chem. 253 (1978) 783–788. improve the analysis of membrane proteins by two-dimensional [47] A.J. Driessen, P. Fekkes, J.P. van der Wolk, The Sec system, Curr. Opin. electrophoresis, Electrophoresis 19 (1998) 1901–1909. Microbiol. 1 (1998) 216–222. [25] M. Fountoulakis, R. Gasser, Proteomic analysis of the cell envelope [48] B.C. Berks, T. Palmer, F. Sargent, The Tat protein translocation pathway fraction of Escherichia coli, Amino Acids 24 (2003) 19–41. and its role in microbial physiology, Adv. Microb. Physiol. 47 (2003) [26] K. Gevaert, J. Van Damme, M. Goethals, G.R. Thomas, B. Hoorelbeke, 187–254. H. Demol, L. Martens, M. Puype, A. Staes, J. Vandekerckhove, [49] D.O. Daley, M. Rapp, E. Granseth, K. Melen, D. Drew, G. von Heijne, Chromatographic isolation of methionine-containing peptides for gel- Global topology analysis of the Escherichia coli inner membrane free proteome analysis: identification of more than 800 Escherichia coli proteome, Science 308 (2005) 1321–1323. proteins, Mol. Cell. Proteomics 1 (2002) 896–903. [50] E. Wallin, G. von Heijne, Genome-wide analysis of integral membrane [27] M. Hirosawa, M. Hoshida, M. Ishikawa, T. Toya, MASCOT: multiple proteins from eubacterial, archaean, and eukaryotic organisms, Protein alignment system for protein sequences based on three-way dynamic Sci. 7 (1998) 1029–1038. programming, Comput. Appl. Biosci. 9 (1993) 161–167. [51] V. Yankovskaya, R. Horsefield, S. Tornroth, C. Luna-Chavez, H. [28] A. Lopez-Campistrous, P. Semchuk, L. Burke, T. Palmer-Stone, S.J. Brokx, Miyoshi, C. Leger, B. Byrne, G. Cecchini, S. Iwata, Architecture of G. Broderick, D. Bottorff, S. Bolch, J.H. Weiner, M.J. Ellison, Localization, succinate dehydrogenase and reactive oxygen species generation, Science annotation, and comparison of the Escherichia coli K-12 proteome under 299 (2003) 700–704. two states of growth, Mol. Cell. Proteomics 4 (2005) 1205–1209. [52] T.M. Iverson, C. Luna-Chavez, L.R. Croal, G. Cecchini, D.C. Rees, [29] M.E. Bayer, Zones of membrane adhesion in the cryofixed envelope of Crystallographic studies of the Escherichia coli quinol-fumarate reductase Escherichia coli, J. Struct. Biol. 107 (1991) 268–280. with inhibitors bound to the quinol-binding site, J. Biol. Chem. 277 (2002) [30] E.M. Lai, U. Nair, N.D. Phadke, J.R. Maddock, Proteomic screening and 16124–16130. identification of differentially distributed membrane proteins in Escher- [53] M. Jormakka, S. Tornroth, J. Abramson, B. Byrne, S. Iwata, Purification ichia coli, Mol. Microbiol. 52 (2004) 1029–1044. and crystallization of the respiratory complex formate dehydrogenase-N [31] P. Marani, S. Wagner, L. Baars, P. Genevaux, J.W. de Gier, I. Nilsson, R. from Escherichia coli, Acta Crystallogr., D Biol. Crystallogr. 58 (2002) Casadio, G.G. von Heijne, New Escherichia coli outer membrane 160–162. proteins identified through prediction and experimental verification, [54] G. Butland, J.M. Peregrin-Alvarez, J. Li, W. Yang, X. Yang, V.Canadien, A. Protein Sci. 15 (2006) 884–889 (2002 #74). Starostine, D. Richards, B. Beattie, N. Krogan, M. Davey, J. Parkinson, J. [32] J.D. Bendtsen, H. Nielsen, G. von Heijne, S. Brunak, Improved prediction Greenblatt, A. Emili, Interaction network containing conserved and essen- of signal peptides: SignalP 3.0, J. Mol. Biol. 340 (2004) 783–795. tial protein complexes in Escherichia coli, Nature 433 (2005) 531–537. [33] J.D. Bendtsen, H. Nielsen, D. Widdick, T. Palmer, S. Brunak, Prediction [55] B.V. Alvarez, G.L. Vilas, J.R. Casey, Metabolon disruption: a mechanism of twin-arginine signal peptides, BMC Bioinformatics 6 (2005) 167. that regulates bicarbonate transport, EMBO J. 24 (2005) 2499–2511. [34] H. Nielsen, J. Engelbrecht, S. Brunak, G. von Heijne, Identification of [56] J.X. Yan, A.T. Devenish, R. Wait, T. Stone, S. Lewis, S. Fowler, prokaryotic and eukaryotic signal peptides and prediction of their Fluorescence two-dimensional difference gel electrophoresis and mass cleavage sites, Protein Eng. 10 (1997) 1–6. spectrometry based proteomic analysis of Escherichia coli, Proteomics 2 [35] A.J. Link, K. Robison, G.M. Church, Comparing the predicted and (2002) 1682–1698. observed properties of proteins encoded in the genome of Escherichia [57] C. Ji, A. Lo, S. Marcus, L. Li, Effect of 2MEGA labeling on membrane coli K-12, Electrophoresis 18 (1997) 1259–1313. proteome analysis using LC-ESI QTOF MS, J. Proteome Res. 5 (2006) [36] R.W. Corbin, O. Paliy, F. Yang, J. Shabanowitz, M. Platt, C.E. Lyons Jr., K. 2567–2576. 1712 J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

[58] M.P. Molloy, B.R. Herbert, K.L. Williams, A.A. Gooley, Extraction of [79] C.C. Wu, J.R. Yates, The application of mass spectrometry to membrane Escherichia coli proteins with organic solvents prior to two-dimensional proteomics, Nat. Biotechnol. 21 (2003) 262–267. electrophoresis, Electrophoresis 20 (1999) 701–704. [80] D.K. Han, J. Eng, H. Zhou, R. Aebersold, Quantitative profiling of [59] N. Zhang, R. Chen, N. Young, D. Wishart, P. Winter, J.H. Weiner, L. Li, differentiation-induced microsomal proteins using isotope-coded affinity Comparison of SDS- and methanol-assisted protein solubilization and tags and mass spectrometry, Nat. Biotechnol. 19 (2001) 946–951. digestion methods for Escherichia coli membrane proteome analysis by [81] J.L. Norris, N.A. Porter, R.M. Caprioli, Mass spectrometry of intra- 2-D LC-MS/MS, Proteomics 7 (2007) 484–493. cellular and membrane proteins using cleavable detergents, Anal. Chem. [60] E. Granseth, D.O. Daley, M. Rapp, K. Melen, G. von Heijne, 75 (2003) 6642–6647. Experimentally constrained topology models for 51,208 bacterial inner [82] M.P. Washburn, D. Wolters, J.R. Yates III, Large-scale analysis of the membrane proteins, J. Mol. Biol. 352 (2005) 489–494. yeast proteome by multidimensional protein identification technology, [61] C.A. Hale, P.A. de Boer, Recruitment of ZipA to the septal ring of Nat. Biotechnol. 19 (2001) 242–247. Escherichia coli is dependent on FtsZ and independent of FtsA, [83] L.M. Simon, M. Kotorman, G. Garab, I. Laczko, Structure and activity of J. Bacteriol. 181 (1999) 167–176. a-chymotrypsin and trypsin in aqueous organic media, Biochem. [62] B. Herbert, Advances in protein solubilization for 2-dimensional Biophys. Res. Commun. 280 (2001) 1367–1371. electrophoresis, Electrophoresis 20 (1999) 660–663. [84] W.K. Russell, Z.Y. Park, D.H. Russell, Proteolysis in mixed organic- [63] M.P. Molloy, Two-dimensional electrophoresis of membrane proteins aqueous solvent systems: applications for peptide mass mapping using using immobilized pH gradients, Anal. Biochem. 280 (2000) 1–10. mass spectrometry, Anal. Chem. 73 (2001) 2682–2685. [64] V. Santoni, S. Kieffer, D. Desclaux, F. Masson, T. Rabilloud, Mem- [85] J. Blonder, M.L. Hale, D.A. Lucas, C.F. Schaefer, L.-R. Yu, T.P. Conrads, brane proteomics: use of additive main effects with multiplicative H.J. Issaq, B.G. Stiles, T.D. Veenstra, Proteomic analysis of detergent- interaction model to classify plasma membrane proteins according to resistant membrane rafts, Electrophoresis 25 (2004) 1307–1318. their solubility and electrophoretic properties, Electrophoresis 21 [86] J. Blonder, M.B. Goshe, R.J. Moore, L. Pasa-Tolic, C.D. Masselon, M.S. (2000) 3329–3344. Lipton, R.D. Smith, Enrichment of integral membrane proteins for [65] V. Santoni, P. Doumas, D. Rouquie, M. Mansion, T. Rabilloud, M. proteomic analysis using liquid chromatography-tandem mass spectrom- Rossignol, Large scale characterization of plant plasma membrane etry, J. Proteome Res. 1 (2002) 351–360. proteins, Biochimie 81 (1999) 655–661. [87] J. Blonder, P. Conrads Thomas, L.-R. Yu, A. Terunuma, M. Janini [66] C.C. Wu, M.J. MacCoss, K.E. Howell, J.R. Yates, A method for the George, J. Issaq Haleem, C. Vogel Jonathan, D. Veenstra Timothy, comprehensive proteomic analysis of membrane proteins, Nat. Biotech- A detergent- and cyanogen bromide-free method for integral mem- nol. 21 (2003) 532–538. brane proteomics: application to Halobacterium purple membranes [67] N. Zhang, N. Li, L. Li, Liquid chromatography MALDI MS/MS for and the human epidermal membrane proteome, Proteomics 4 (2004) membrane proteome analysis, J. Proteome Res. 3 (2004) 719–727. 31–45. [68] T.Q. Shang, J.M. Ginter, M.V. Johnston, B.S. Larsen, C.N. McEwen, [88] H. Zhong, Y.Zhang, Z. Wen, L. Li, Protein sequencing by mass analysis of Carrier ampholyte-free solution isoelectric focusing as a prefractionation polypeptide ladders after controlled protein hydrolysis, Nat. Biotechnol. method for the proteomic analysis of complex protein mixtures, 22 (2004) 1291–1296. Electrophoresis 24 (2003) 2359–2368. [89] H. Zhong, S.L. Marcus, L. Li, Microwave-assisted acid hydrolysis of [69] G. Weber, M. Islinger, P. Weber, C. Eckerskorn, A. Voelkl, Efficient proteins combined with liquid chromatography MALDI MS/MS for separation and analysis of peroxisomal membrane proteins using free- protein identification, J. Am. Soc. Mass Spectrom. 16 (2005) 471–481. flow isoelectric focusing, Electrophoresis 25 (2004) 1735–1747. [90] N. Wang, L. MacKenzie, A.G. De Souza, H. Zhong, G. Goss, L. Li, [70] T. McDonald, S. Sheng, B. Stanley, D. Chen, Y. Ko, R.N. Cole, P. Proteome profile of cytosolic component of zebrafish liver generated by Pedersen, J.E. Van Eyk, Expanding the subproteome of the inner mito- LC-ESI MS/MS combined with trypsin digestion and microwave-assisted chondria using protein separation technologies. One- and two-dimen- acid hydrolysis, J. Proteome Res. 6 (2007) 263–272. sional liquid chromatography and two-dimensional gel electrophoresis, [91] X. Chen, L. Sun, Y. Yu, Y. Xue, P. Yang, Amino acid-coded tagging Mol. Cell. Proteomics 5 (2006) 2392–2411. approaches in quantitative proteomics, Expert Rev. Proteomics 4 (2007) [71] X. Zuo, K.-B. Lee, D.W. Speicher, Fractionation of complex proteomes 25–37. by microscale solution isoelectrofocusing using ZOOM IEF fractionators [92] M. Mann, Functional and quantitative proteomics using SILAC, Nat. to improve protein profiling, Proteomics Protocols Handbook, 2005, Rev., Mol. Cell Biol. 7 (2006) 952–958. pp. 97–117. [93] T.J. Griffin, D.K.M. Han, S.P. Gygi, B. Rist, H. Lee, R. Aebersold, K.C. [72] M. Aivaliotis, W. Haase, M. Karas, G. Tsiotis, Proteomic analysis of Parker, Toward a high-throughput approach to quantitative proteomic chlorosome-depleted membranes of the green sulfur bacterium Chloro- analysis: expression-dependent protein identification by mass spectrom- bium tepidum, Proteomics 6 (2006) 217–232. etry, J. Am. Soc. Mass Spectrom. 12 (2001) 1238–1246. [73] R. Henningsen, B.L. Gale, K.M. Straub, D.C. DeNagel, Application of [94] S.P. Gygi, B. Rist, T.J. Griffin, J. Eng, R. Aebersold, Proteome analysis of zwitterionic detergents to the solubilization of integral membrane proteins low-abundance proteins using multidimensional chromatography and for two-dimensional gel electrophoresis and mass spectrometry, Proteo- isotope-coded affinity tags, J. Proteome Res. 1 (2002) 47–54. mics 2 (2002) 1479–1488. [95] H. Zhou, A. Ranish Jeffrey, D. Watts Julian, R. Aebersold, Quantitative [74] R.-P. Zahedi, C. Meisinger, A. Sickmann, Proteomics 5 (2005) 3581–3588. proteome analysis by solid-phase isotope tagging and mass spectrometry, [75] R.J. Simpson, L.M. Connolly, J.S. Eddes, J.J. Pereira, R.L. Moritz, G.E. Nat. Biotechnol. 20 (2002) 512–515. Reid, Electrophoresis 21 (2000) 1707–1732. [96] A. Leitner, W. Linder, Chemistry meets proteomics: the use of chemical [76] B.A. van Montfort, M.K. Doeven, B. Canas, L.M. Veenhoff, B. Poolman, tagging reactions for MS-based proteomics, Proteomics 6 (2006) G.T. Robillard, Combined in-gel tryptic digestion and CNBr cleavage for 5418–5434. the generation of peptide maps of an integral membrane protein with [97] W.-J. Qian, M.E. Monroe, T. Liu, J.M. Jacobs, G.A. Anderson, Y.Shen, R.J. MALDI-TOF mass spectrometry, Biochim. Biophys. Acta, Bioenerg. Moore,D.J.Anderson,R.Zhang,S.E.Calvano,S.F.Lowry,W.Xiao,L.L. 1555 (2002) 111–115. Moldawer,R.W.Davis,R.G.Tompkins,D.G.Camp,R.D.Smith,Quan- [77] T.T.T. Quach, N. Li, D.P. Richards, J. Zheng, B.O. Keller, L. Li, titative proteome analysis of human plasma following in vivo lipopolysac- Development and applications of in-gel CNBr/tryptic digestion combined charide administration using 16O/18O labeling and the accurate mass and with mass spectrometry for the analysis of membrane proteins, J. Proteome time tag approach, Mol. Cell. Proteomics 4 (2005) 700–709. Res. 2 (2003) 543–552. [98] X. Yao, A. Freas, J. Ramirez, P.A. Demirev, C. Fenselau, Proteolytic 18O [78] D.N. Perkins, D.J.C. Pappin, D.M. Creasy, J.S. Cottrell, Probability- labeling for comparative proteomics: model studies with two serotypes of based protein identification by searching sequence databases using mass adenovirus. [Erratum to document cited in CA135:089410], Anal. Chem. spectrometry data, Electrophoresis 20 (1999) 3551–3567. 76 (2004) 2675. J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713 1713

[99] K.L. Johnson, D.C. Muddiman, A method for calculating 16O/18O SCC9 transfectants expressing E-cadherin by dimethyl isotope labeling, LC- peptide ion ratios for the relative quantification of proteomes, J. Am. Soc. MALDI MS and MS/MS. [Erratum to document cited in CA143:169101], Mass Spectrom. 15 (2004) 437–445. J. Proteome Res. 4 (2005) 1872. [100] J.-L. Hsu, S.-Y. Huang, N.-H. Chow, S.-H. Chen, Stable-isotope [104] J.R. Kimmel, Guanidination of proteins, Methods Enzymol. 11 (1967) dimethyl labeling for quantitative proteomics, Anal. Chem. 75 (2003) 584–589. 6843–6852. [105] A. Gilchrist, C.E. Au, J. Hiding, A.W. Bell, J. Fernandez-Rodriguez, S. [101] C. Ji, L. Li, Quantitative proteome analysis using differential stable Lesimple, H. Nagaya, L. Roy, S.J. Gosline, M. Hallett, J. Paiement, R.E. isotopic labeling and microbore LC-MALDI MS and MS/MS, J. Proteome Kearney, T. Nilsson, J.J. Bergeron, Quantitative proteomics analysis of Res. 4 (2005) 734–742. the secretory pathway, Cell 127 (2006) 1265–1281. [102] J.E. Melanson, S.L. Avery, D.M. Pinto, High-coverage quantitative [106] J.J. Bergeron, M. Hallett, Peptides you can count, Nat. Biotechnol. proteomics using amine-specific isotopic labeling, Proteomics 6 (2006) 25 (2007) 61–62. 4466–4474. [107] R. Orme, C.W. Douglas, S. Rimmer, M. Webb, Proteomic analysis of [103] C. Ji, L. Li, M. Gebre, M. Pasdar, L. Li, Identification and quantification of Escherichia coli biofilms reveals the overexpression of the outer differentially expressed proteins in E-cadherin deficient SCC9 cells and membrane protein OmpA, Proteomics 6 (2006) 4269–4277.