Template for Electronic Submission to ACS Journals s14


Enhanced Methylarginine Characterization by Post-Translational Modification-Specific Targeted Data Acquisition and Electron-Transfer Dissociation Mass Spectrometry

Gene Hart-Smith, Jason K. K. Low, Melissa A. Erce, Marc R. Wilkins*

NSW Systems Biology Initiative, University of New South Wales, Sydney, New South Wales 2052, Australia

Address reprint requests to Marc R. Wilkins, NSW Systems Biology Initiative, University of New South Wales, Sydney, New South Wales 2052, Australia. Phone: +61-2-9385-3633; Fax: +61-2-9385-3950; E-mail: [email protected]

Supplementary Material

In-depth experimental methods for protein preparation

Table S-1: primers used for PCR, site-directed mutagenesis, SLIM and gene sequencing

Figure S-1: coomassie stained gel of purified proteins, and fluorography results

Criteria used to assign the confidence of methylated peptide identifications

Table S-2: complete list of methylated precursor ion identifications from in vitro methylated His-Npl3

Table S-3: complete list of proteins identified in the protein mixture featuring in vivo methylated Npl3

Table S-4: complete list of methylated precursor ion identifications from in vivo methylated Npl3

Scheme S-1: illustration of the arginine methylation sites observed in in vitro methylated His-Npl3

Scheme S-2: illustration of the arginine methylation sites observed in in vivo methylated Npl3

Further discussions on observed Npl3 arginine methylation sites

Figures S-2 and S-3: examples of ETD and CID derived MS/MS spectra from methylated peptide ions

Figure S-4: comparative ion intensities and charge states of methylated peptide ions observed in the CID versus ETD experiments

Figures S-5 and S-6: histograms illustrating the data of Figure 1 segmented by peptide charge state, and peptide mass

Figure S-7: comparative ion intensities and numbers of methylated peptide ions observed in the DDA versus targeted data acquisition experiments conducted upon in vitro methylated His-Npl3

Figure S-8: average percentages of high and very high confidence methylated peptide ions selected for MS/MS in the DDA versus targeted data acquisition experiments conducted upon in vitro methylated His-Npl3

Figure S-9: comparative ion intensities of methylated peptide ions observed in the DDA versus targeted data acquisition experiments conducted upon in vivo methylated Npl3

Figure S-10: average percentages of high and very high confidence methylated peptide ions selected for MS/MS in the DDA versus targeted data acquisition experiments conducted upon in vivo methylated Npl3

Figure S-11: average percentages of methylated features extracted after 1, 2 and 3 MaxQuant analyses, and abundances of extracted and unextracted methylated peptide features

Further discussions on FindMod analyses

References and author contributions

Experimental

Yeast strains and growth conditions

Saccharomyces cerevisiae strain BY4741 (MATa his3Δ1 leu2Δ0 ura3Δ0 met15Δ0; Open Biosystems) was used in this study. Yeast strains were grown at 30°C in YEPD containing 2% glucose, 2% bactopeptone and 1% yeast extract.

Cloning of His-Hmt1 and His-Npl3

Using tailed primers containing either NdeI or BamHI restriction sites (Table S-1), the hmt1 and npl3 genes were amplified by PCR from 50 ng of genomic DNA prepared using standard methods. PCR products were purified (PCR purification kit, Qiagen), and then digested with NdeI and BamHI. The digested fragments were ligated into the NdeI and BamHI sites of pET-15b (Novagen). All amplified sequences were verified by automatic sequencing at the University of New South Wales Ramaciotti Centre for Gene Function Analysis.

Table S-1. Primers used for PCR amplification, site-directed mutagenesis, SLIM and sequencing of gene sequences from genomic DNA.

| Use | Primer | Sequence (5' to 3') | TA (°C) |
|---|---|---|---|
| Amplification/Cloning¹ | pET15b-hmt1-NdeI-Fwd | GGGAATTCCATATGAGCAAGACAGCCGTG | 63.3 |
| Amplification/Cloning¹ | pET15b-hmt1-BamHI-Rev | CGCGGATCCTTATTAATGCATTAAATAAGAACCTTCG | 61.4 |
| Amplification/Cloning¹ | pET15b-npl3-NdeI-Fwd | GGGAATTCCATATGTCTGAAGCTCAAGAAACTCAC | 63.0 |
| Amplification/Cloning¹ | pET15b-npl3-BamHI-Rev | CGCGGATCCTTATTACCTGGTTGGTGATCTTTCAC | 64.2 |
| Site-directed mutagenesis | BG1805-npl3-fix-Fwd² | CTACTCCAGAGGCGGATATGGTGGCTC | 67.0 |
| Site-directed mutagenesis | BG1805-npl3-fix-Rev² | GAGCCACCATATCCGCCTCTGGAGTAG | 67.0 |
| SLIM for MORF vectors | BG1805-SLIM-Fwd-Long³ | GTACCCCGGCGCGCCTCAACCAGCATAATCAGGAAC | 60.0 |
| SLIM for MORF vectors | BG1805-SLIM-Rev-Short | CCAATTCGCCCTATAGTGAG | 60.0 |
| SLIM for MORF vectors | BG1805-SLIM-Rev-Long³ | TGAGGCGCGCCGGGGTACCCAATTCGCCCTATAGTGAG | 60.0 |
| SLIM for MORF vectors | BG1805-SLIM-Fwd-Short | ACCAGCATAATCAGGAAC | 60.0 |
| Sequencing | pET15b-T7 promoter primer | TAATACGACTCACTATAGGG | 51.6 |
| Sequencing | pET15b-T7 terminator primer | GCTAGTTATTGCTCAGCGG | 60.7 |
| Sequencing | BG1805 F5 sequencing | CATTTTCGGTTTGTATTACTTCTTATTC | 65.1 |
| Sequencing | BG1805 Rev sequencing | GCCGTCGTTTTACAACGT | 61.7 |

¹ Underlined sequences indicate engineered restriction enzyme recognition sites, where CATATG and GGATCC are NdeI and BamHI sites respectively.
² Underlined base is the location where the point mutation would be corrected.
³ Underlined bases indicate the 5' adaptor sequences of the tailed primers that create the necessary overhangs for the SLIM reaction.
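Two properties of the primers in Table S-1 can be checked directly from their sequences: the cloning primers should contain the stated NdeI (CATATG) or BamHI (GGATCC) recognition sites, and the two site-directed mutagenesis primers should be reverse complements of one another. The following is a minimal sketch of such a check; the helper names `reverse_complement` and `find_sites` are illustrative and are not part of the original methods.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sites as stated in the footnote to Table S-1.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(primer: str) -> dict:
    """Map each recognition site found in the primer to its 0-based start index."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


# Primer sequences copied from Table S-1.
hmt1_fwd = "GGGAATTCCATATGAGCAAGACAGCCGTG"
hmt1_rev = "CGCGGATCCTTATTAATGCATTAAATAAGAACCTTCG"
fix_fwd = "CTACTCCAGAGGCGGATATGGTGGCTC"
fix_rev = "GAGCCACCATATCCGCCTCTGGAGTAG"

print(find_sites(hmt1_fwd))                    # sites present in the forward cloning primer
print(find_sites(hmt1_rev))                    # sites present in the reverse cloning primer
print(reverse_complement(fix_fwd) == fix_rev)  # mutagenesis primers are complementary
```

The same check can be applied to the npl3 cloning primers by substituting their sequences.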

Expression and Purification of Recombinant Proteins from Escherichia coli

His-Hmt1 and His-Npl3 recombinant proteins were expressed in Escherichia coli BL21(DE3) Rosetta. Cells were lysed via sonication (Branson Digital Sonifier) in binding buffer (50 mM Na-phosphate buffer pH 8.0, 0.5 M NaCl, 40 mM imidazole, 20% glycerol, 0.25% Triton X-100, 10 mM β-mercaptoethanol). The lysate was clarified by centrifugation at 20,000 g for 20 min at 4°C. The supernatant was passed through a Ni2+-charged Sepharose column (GE Healthcare), which was washed with 15 column volumes (CV) of binding buffer, followed by 5 CV of binding buffer without Triton X-100. Bound proteins were eluted with 2 CV of elution buffer (50 mM Na-phosphate buffer pH 7.4, 0.5 M NaCl, 500 mM imidazole, 20% glycerol, 10 mM β-mercaptoethanol). Buffer exchange and concentration of eluted proteins were achieved using a Centricon Plus-20 centrifugal column (Amicon), reducing NaCl to 200 mM and imidazole to 40 mM. Glycerol was added to a final concentration of 50% before storage at -80°C. Proteins were quantified using a modified Lowry assay (Bio-Rad).

In vitro Methylation of His-Npl3 and Fluorography

Methylation of His-Npl3 was performed in a 200 μL solution containing 50 mM HEPES pH 7.5, 1 mM EDTA, 200 mM NaCl, 50 μM SAM, 20 μg His-Npl3 substrate and 20 μg His-Hmt1 enzyme. The solution was incubated at 30°C for 1 h, after which the methylation reactions were stopped by adding gel loading buffer (1x) and boiling for 10 min. Proteins were then resolved on a 4-12% Bis-Tris gel (Bio-Rad) with MES-SDS PAGE buffer and stained with Biosafe coomassie (Bio-Rad).

Prior to MS analysis, fluorography was performed to support the efficacy of in vitro methylation. His-Npl3 was methylated as above with the following changes: 1 μM [3H]-SAM (PerkinElmer, 78 Ci/mmol) and 1 μg His-Npl3 substrate were used. Proteins were resolved on a 4-12% Bis-Tris gel (Bio-Rad) with MES-SDS PAGE buffer, then electroblotted onto a PVDF membrane using standard methods. The membrane was dried, sprayed with EN3HANCE (PerkinElmer) and exposed to pre-flashed film (Hyperfilm MP, GE Healthcare) at -80°C for 1-5 days.

Site-Directed Mutagenesis prior to Overexpression and in vivo Methylation of Npl3

Mutagenesis was performed in two separate reactions, each containing either the forward or the reverse primer (see Table S-1); ssDNA plasmids were then generated under standard PCR conditions. After amplification, the two reactions were allowed to hybridise using the following program: 95°C for 5 min, 90°C for 1 min, 80°C for 1 min, 70°C for 30 s, 60°C for 30 s, 50°C for 30 s, 40°C for 30 s, 37°C for 30 s and finally 25°C. The hybridised products were purified and concentrated to 30 μL (QIAquick PCR Purification kit, Qiagen), subjected to a DpnI restriction enzyme digest (New England Biolabs) and transformed into chemically competent E. coli JM109 cells (New England Biolabs). The plasmid was verified via sequencing at the University of New South Wales Ramaciotti Centre for Gene Function Analysis.

Protein Digestion

Excised polyacrylamide gel bands containing 0.5 μg of methylated or non-methylated His-Npl3, or gel bands containing in vivo methylated Npl3, were destained, reduced and alkylated following the procedure described by Shevchenko et al. [1]. After removal of the liquid phase post-alkylation, 40 μL of 0.1 M NH4HCO3 containing 40 ng of trypsin (Promega) was added to each gel slice, which was then incubated for 16 h at 35°C. The digest solutions were removed to new microfuge tubes, and the gel slices were treated sequentially with each of the following solutions for 30 min each, with each solution removed and pooled afterwards: 50 μL 0.1% trifluoroacetic acid (TFA); 50 μL 0.1% TFA/60% acetonitrile (ACN); and 50 μL 100% ACN. The pooled digest solutions were then dried down (Savant SPD1010, Thermo Fisher Scientific) before resuspension in 20 μL of 1% formic acid and 0.05% heptafluorobutyric acid (HFBA).

[Figure S-1: gel images, panels (a) and (b). Lanes are labelled by the presence (+) or absence (-) of His-Npl3, His-Hmt1 and [3H]-SAM; molecular weight markers at 188, 98, 62, 49, 38, 28, 17 and 14 kDa; His-Npl3 band indicated.]

Figure S-1. (a) Coomassie stained gel of purified His-Hmt1 (located at ~40 kDa) and His-Npl3 (located at ~60 kDa). (b) Pre-flashed film following exposure to radiolabelled methylated His-Npl3.

Criteria Used to Assign the Confidence of Methylated Peptide Ion Identifications

The criteria used to classify methylated Npl3 peptide ions identified via LC-MS/MS as “very high”, “high” or “medium” confidence are outlined below. Each separate criterion is marked with a Roman numeral; these Roman numerals are referred to throughout the text.

Very High Confidence Identifications

For a precursor ion to be identified as a very high confidence methylated peptide: (i) at least one LC-MS/MS analysis of the precursor ion must produce a methylated peptide match as the top scoring Mascot hit at p<0.05 (corresponding to score thresholds of 36 to 38 for the present searches), and manual inspection of the peptide backbone fragment ion series must confirm the localization of the site(s) of methylation. The observation of strong neutral loss-derived product ions characteristic of the precursor ion’s methylation type (mono- or di-), as implied from its Mascot identification, may be used to support its identity, but is not necessary for the precursor ion to be classified as a very high confidence identification (e.g. ETD derived neutral losses are not observed from peptides of mass >2000 Da otherwise identified at very high confidence according to the abovementioned criteria).

High Confidence Identifications

For a precursor ion to be identified as a high confidence methylated peptide: (ii) at least one LC-MS/MS analysis of the precursor ion must produce a methylated peptide match with a Mascot ion score ≥20, and the peptide must feature methylated amino acid residues identical to those found in very high confidence methylated peptides (i.e. both the site and type of methylation, either mono- or di-methylation, are already confirmed). Identification is also considered high confidence if (iii) at least one LC-MS/MS analysis of the precursor ion produces a methylated peptide match with a Mascot ion score ≥20, and the precursor ion is a different charge state of another precursor ion that produces the same methylated peptide match at very high confidence according to the criteria listed above; precursor ions must also elute at identical retention times to be confirmed as different charge states of the same peptide. Additionally, (iv) if one of the criteria (v) to (vii) is fulfilled (outlined below), and MS/MS reveals strong neutral loss-derived product ions characteristic of the methylation type (mono- or di-) implied from the precursor ion’s Mascot identification, the precursor ion is considered a high confidence methylated peptide.
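Criterion (iii) relies on recognising two precursor ions as different charge states of the same peptide. A minimal sketch of the mass side of that check is given below, assuming protonated ions so that the neutral monoisotopic mass is M = z(m/z - m_proton). The 10 ppm tolerance is an illustrative assumption, not a value stated in the methods, and the retention time requirement is not modelled.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass of an [M + zH]z+ ion."""
    return charge * (mz - PROTON_MASS)


def same_peptide_mass(mz_a: float, z_a: int, mz_b: float, z_b: int,
                      tol_ppm: float = 10.0) -> bool:
    """True if the two ions have the same neutral mass within tol_ppm.

    tol_ppm is a hypothetical tolerance chosen for illustration only.
    """
    m_a = neutral_mass(mz_a, z_a)
    m_b = neutral_mass(mz_b, z_b)
    return abs(m_a - m_b) / m_a * 1e6 <= tol_ppm


# Ion IDs 15 (2+) and 16 (3+) of Table S-2: both assigned to R.GGFSR*GGFGGPR.
print(same_peptide_mass(583.2966, 2, 389.2004, 3))

# Ion ID 15 (mono-methylated) versus ion ID 18 (di-methylated, 3+): different masses.
print(same_peptide_mass(583.2966, 2, 393.8724, 3))
```

In practice a mass match alone would not satisfy criterion (iii); co-elution and a Mascot ion score ≥20 are also required.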

Medium Confidence Identifications

For a precursor ion to be identified as a medium confidence methylated peptide: (v) at least one LC-MS/MS analysis of the precursor ion must produce a methylated peptide match as the top scoring hit with a Mascot ion score ≥20; or (vi) LC-MS/MS analysis of the precursor ion must produce a methylated peptide match as the top scoring hit with a Mascot ion score <20, and the precursor ion is a different charge state of another precursor ion that produces the same methylated peptide match at a Mascot ion score ≥20; or (vii) LC-MS/MS analysis of the precursor ion must produce a methylated peptide match as the top scoring hit with a Mascot ion score <20, and the peptide features methylated amino acid residues identical to those found in very high confidence methylated peptides; or (viii) LC-MS/MS analysis of the precursor ion must produce a methylated peptide match as the top scoring hit with a Mascot ion score <20, and MS/MS reveals strong neutral loss-derived product ions characteristic of the methylation type (mono- or di-) implied from the precursor ion’s Mascot identification.
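The criteria (i) to (viii) above can be read as a small decision procedure. The sketch below is one possible encoding, not the authors' software: the `Evidence` fields and the `classify` function are hypothetical names, the manual inspection steps are reduced to boolean flags, and the significance threshold defaults to 37 as an assumed midpoint of the stated 36 to 38 range.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    score: float                                 # best Mascot ion score for the precursor ion
    top_hit: bool = True                         # methylated match is the top scoring hit
    site_localized: bool = False                 # manual inspection confirms the site(s)
    sites_seen_at_very_high: bool = False        # same residues found in a very high ID
    charge_variant_of_very_high: bool = False    # co-eluting charge state of a very high ID
    charge_variant_scores_20_plus: bool = False  # charge state of an ion scoring >= 20
    strong_neutral_losses: bool = False          # methylation-specific neutral losses seen


def classify(ev: Evidence, significance_threshold: float = 37.0):
    """Return (confidence level, list of criteria met); the threshold is an assumption."""
    if ev.top_hit and ev.score >= significance_threshold and ev.site_localized:
        return "very high", ["i"]

    medium = []
    if ev.top_hit and ev.score >= 20:
        medium.append("v")
    if ev.top_hit and ev.score < 20:
        if ev.charge_variant_scores_20_plus:
            medium.append("vi")
        if ev.sites_seen_at_very_high:
            medium.append("vii")
        if ev.strong_neutral_losses:
            medium.append("viii")

    high = []
    if ev.score >= 20 and ev.sites_seen_at_very_high:
        high.append("ii")
    if ev.score >= 20 and ev.charge_variant_of_very_high:
        high.append("iii")
    # Criterion (iv): any of (v) to (vii) plus strong neutral losses.
    v_to_vii = [c for c in medium if c in ("v", "vi", "vii")]
    if ev.strong_neutral_losses and v_to_vii:
        high += ["iv"] + v_to_vii

    if high:
        return "high", high
    if medium:
        return "medium", medium
    return "low", []


# Score 22 with strong neutral losses, as for ion ID 3 of Table S-2.
print(classify(Evidence(score=22, strong_neutral_losses=True)))
```

Returning the list of criteria alongside the level mirrors the superscript Roman numerals used in Tables S-2 and S-4.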

Table S-2. The highest confidence identification obtained for each methylated and putatively methylated precursor ion observed across LC-MS/MS analyses of three replicate samples of in vitro methylated His-Npl3, each digested using trypsin. Mono-methylated arginine residues are marked with an asterisk, and di-methylated arginine residues are marked with two asterisks. The “Sample #” column refers to the methylated His-Npl3 sample in which the listed Mascot ion score was obtained. Targeted data acquisition has been abbreviated as TDA in the precursor ion selection method; TDA/DDA refers to the combined use of targeted data acquisition and DDA within a single LC-MS/MS experiment. The criteria used to classify the confidence of identifications are outlined in the section above, and the superscript Roman numerals refer to the specific classification criteria used. Neutral losses are abbreviated as follows: mono-methylamine (MMA), mono-methylguanidine (MMG), asymmetric di-methylamine (aDMA) and di-methylguanidine (DMG).

| Ion ID# | Observed m/z | Charge state | Peptide sequence | Sequence position | Mascot ion score | Sample # | MS/MS method | Precursor ion selection method | Confidence of identification | Methylation-specific neutral losses |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 524.2584 | 2+ | R.QEGELSNTR* | 118-126 | 8 | 2 | ETD | TDA | low | none |
| 2 | 799.8869 | 2+ | R.ENSLETTFSSVNTR* | 222-235 | 15 | 2 | CID | TDA | low | possible MMG |
| 3 | 404.2147 | 2+ | R.SNR*GGFR | 282-288 | 22 | 3 | ETD | DDA | high (iv, v) | MMA, MMG |
| 4 | 411.2226 | 2+ | R.SNR**GGFR | 282-288 | 21 | 3 | ETD | TDA | high (iv, v) | aDMA, DMG |
| 5 | 418.2304 | 2+ | R.SNR**GGFR* | 282-288 | 14 | 2 | ETD | DDA | medium (viii) | MMA, MMG, aDMA, DMG |
| 6 | 365.537 | 3+ | R.GR*GGFR*GGFR | 289-298 | 54 | 3 | ETD | TDA | very high (i) | MMA, MMG |
| 7 | 547.8021 | 2+ | R.GR*GGFR*GGFR | 289-298 | 19 | 3 | ETD | TDA | high (iv, vii) | MMA, MMG |
| 8 | 561.8182 | 2+ | R.GR**GGFR**GGFR | 289-298 | 3 | 3 | ETD | TDA | medium (viii) | aDMA, DMG |
| 9 | 434.2326 | 2+ | R.GGFR*GGFR | 291-298 | 33 | 3 | ETD | TDA | high (ii) | MMA, MMG |
| 10 | 441.2405 | 2+ | R.GGFR**GGFR | 291-298 | 22 | 3 | ETD | TDA | high (iv, v) | aDMA, DMG |
| 11 | 433.567 | 3+ | R.GGFR*GGFR*GGFR | 291-302 | 23 | 3 | ETD | DDA | high (iv, v) | MMA, MMG |
| 12 | 649.8469 | 2+ | R.GGFR*GGFR*GGFR | 291-302 | 4 | 3 | ETD | DDA | high (iv, vi) | MMA, MMG |
| 13 | 477.7488 | 2+ | R.GGFR*GGFSR | 299-307 | 37 | 3 | ETD | DDA | very high (i) | MMA, MMG |
| 14 | 532.9431 | 3+ | R.GGFR*GGFSRGGFGGPR* | 299-313 | 21 | 3 | ETD | TDA | high (ii) | MMG |
| 15 | 583.2966 | 2+ | R.GGFSR*GGFGGPR | 303-314 | 48 | 3 | ETD | TDA | very high (i) | MMA, MMG |
| 16 | 389.2004 | 3+ | R.GGFSR*GGFGGPR | 303-314 | 48 | 3 | ETD | TDA | very high (i) | MMA, MMG |
| 17 | 590.3045 | 2+ | R.GGFSR**GGFGGPR | 303-314 | 43 | 2 | ETD | TDA | very high (i) | aDMA, DMG |
| 18 | 393.8724 | 3+ | R.GGFSR**GGFGGPR | 303-314 | 49 | 3 | ETD | TDA | very high (i) | aDMA, DMG |
| 19 | 597.3113 | 2+ | R.GGFSR**GGFGGPR* | 303-314 | 41 | 2 | ETD | DDA | very high (i) | MMA, MMG, aDMA, DMG |
| 20 | 603.3083 | 3+ | R.GGFSR*GGFGGPR*GGFGGPR | 303-321 | 55 | 3 | ETD | TDA | very high (i) | MMA, MMG |
| 21 | 904.4609 | 2+ | R.GGFSR*GGFGGPR*GGFGGPR | 303-321 | 16 | 3 | ETD | TDA | high (iv, vi) | MMG |
| 22 | 456.237 | 4+ | R.GGFSR*GGFGGPR**GGFGGPR | 303-321 | 38 | 3 | ETD | TDA | very high (i) | MMA, MMG, aDMA, DMG |
| 23 | 607.9801 | 3+ | R.GGFSR**GGFGGPR*GGFGGPR | 303-321 | 6 | 3 | CID | DDA | medium (vii) | possible DMG |
| 24 | 459.7404 | 4+ | R.GGFSR**GGFGGPR**GGFGGPR | 303-321 | 52 | 2 | ETD | TDA | very high (i) | aDMA, DMG |
| 25 | 612.6517 | 3+ | R.GGFSR**GGFGGPR**GGFGGPR | 303-321 | 87 | 2 | ETD | TDA | very high (i) | aDMA, DMG |
| 26 | 430.5548 | 3+ | R.GGFGGPR*GGFGGPR | 308-321 | 53 | 3 | ETD | DDA | very high (i) | MMA, MMG |
| 27 | 645.3286 | 2+ | R.GGFGGPR*GGFGGPR | 308-321 | 37 | 3 | ETD | TDA/DDA | very high (i) | MMA, MMG |
| 28 | 652.3362 | 2+ | R.GGFGGPR*GGFGGPR* | 308-321 | 6 | 2 | ETD | TDA | high (iv, vii) | MMG |
| 29 | 486.9006 | 3+ | R.GGFGGPR*GGYGGYSR | 315-329 | 61 | 3 | ETD | TDA | very high (i) | MMA, MMG |
| 30 | 729.8474 | 2+ | R.GGFGGPR*GGYGGYSR | 315-329 | 50 | 2 | ETD | TDA | very high (i) | MMA, MMG |
| 31 | 491.5729 | 3+ | R.GGFGGPR**GGYGGYSR | 315-329 | 33 | 2 | ETD | DDA | high (iv, v) | aDMA |
| 32 | 736.8556 | 2+ | R.GGFGGPR**GGYGGYSR | 315-329 | 9 | 2 | ETD | TDA | high (iv, vi) | aDMA, DMG |
| 33 | 496.2442 | 3+ | R.GGFGGPR**GGYGGYSR* | 315-329 | 17 | 2 | ETD | DDA | high (iv, vii) | aDMA, DMG |
| 34 | 762.0255 | 3+ | R.GGFGGPR*GGYGGYSR**GGYGGYSR | 315-337 | 16 | 3 | ETD | TDA | medium (vii) | none |
| 35 | 766.6983 | 3+ | R.GGFGGPR**GGYGGYSR**GGYGGYSR | 315-337 | 27 | 3 | ETD | DDA | high (iv, v) | aDMA |
| 36 | 415.6933 | 2+ | R.GGYGGYSR* | 322-329 | 62 | 2 | ETD | TDA | very high (i) | MMA, MMG |
| 37 | 543.2461 | 3+ | R.GGYGGYSR*GGYGGYSR | 322-337 | 80 | 3 | ETD | TDA/DDA | very high (i) | MMA, MMG |
| 38 | 814.366 | 2+ | R.GGYGGYSR*GGYGGYSR | 322-337 | 75 | 3 | ETD | DDA | very high (i) | MMA, MMG |
| 39 | 547.9183 | 3+ | R.GGYGGYSR**GGYGGYSR | 322-337 | 48 | 3 | ETD | DDA | very high (i) | aDMA |
| 40 | 552.59 | 3+ | R.GGYGGYSR**GGYGGYSR* | 322-337 | 30 | 2 | ETD | DDA | high (iii) | MMA, aDMA |

Table S-3. The proteins identified in the trypsin digested protein mixture featuring in vivo methylated Npl3 used to produce Figures 5 and 6 of the main text. All protein identifications are derived from ETD LC-MS/MS data. The highest Mascot scores, and the highest number of peptide sequence matches (of Mascot ion scores corresponding to p > 0.05) obtained from single experiments are listed.

| Protein Name | Protein Accession # | Mascot Score | Peptide Sequence Matches (p > 0.05) |
|---|---|---|---|
| NOP3_YEAST | Q01560 | 659 | 25 |
| K2C1_HUMAN | P04264 | 436 | 22 |
| K1C9_HUMAN | P35527 | 396 | 16 |
| ILV3_YEAST | P39522 | 391 | 12 |
| KPYK1_YEAST | P00549 | 330 | 12 |
| HSP71_YEAST | P10591 | 320 | 15 |
| TCPQ_YEAST | P47079 | 280 | 13 |
| IF2G_YEAST | P32481 | 227 | 9 |
| K1C16_HUMAN | P08779 | 196 | 9 |
| ACH1_YEAST | P32316 | 192 | 10 |
| YD11A_YEAST | Q03856 | 191 | 9 |
| K2C6B_HUMAN | P04259 | 172 | 9 |
| K2C6A_HUMAN | P02538 | 163 | 9 |
| ALDH6_YEAST | P54115 | 163 | 7 |
| K1C14_HUMAN | P02533 | 137 | 8 |
| WTM1_YEAST | Q12363 | 131 | 8 |
| DAK1_YEAST | P54838 | 129 | 8 |
| GAL1_YEAST | P04385 | 123 | 6 |
| HSP75_YEAST | P11484 | 121 | 7 |
| GAL10_YEAST | P04397 | 120 | 5 |
| TRYP_PIG | P00761 | 120 | 3 |
| ACEA_YEAST | P28240 | 113 | 5 |
| PDC1_YEAST | P06169 | 108 | 8 |
| K1C10_HUMAN | P13645 | 106 | 5 |
| K2C5_HUMAN | P13647 | 103 | 5 |
| HSP60_YEAST | P19882 | 101 | 6 |
| DLD1_YEAST | P32891 | 99 | 4 |
| PUR6_YEAST | P21264 | 98 | 7 |
| PCKA_YEAST | P10963 | 93 | 1 |
| HSP73_YEAST | P09435 | 92 | 5 |
| 2AAA_YEAST | P31383 | 83 | 2 |
| K22E_HUMAN | P35908 | 83 | 5 |
| KC12_YEAST | P23292 | 83 | 3 |
| RSC9_YEAST | Q03124 | 78 | 1 |
| HSP77_YEAST | P12398 | 77 | 5 |
| PUT2_YEAST | P07275 | 77 | 4 |
| VATA_YEAST | P17255 | 76 | 6 |
| ADH1_YEAST | P00330 | 73 | 6 |
| HSP82_YEAST | P02829 | 72 | 2 |
| PGK_YEAST | P00560 | 71 | 4 |
| SYDC_YEAST | P04802 | 69 | 4 |
| PHSG_YEAST | P06738 | 67 | 4 |
| OST1_YEAST | P41543 | 65 | 1 |
| VAC8_YEAST | P39968 | 64 | 3 |
| LCB1_YEAST | P25045 | 61 | 1 |
| DPH2_YEAST | P32461 | 60 | 1 |
| EF3A_YEAST | P16521 | 59 | 1 |
| EFM1_YEAST | P38732 | 59 | 1 |
| NAP1_YEAST | P25293 | 59 | 1 |
| NMD3_YEAST | P38861 | 59 | 1 |
| ENO1_YEAST | P00924 | 56 | 3 |
| AGM1_YEAST | P38628 | 56 | 2 |
| HSP60_YEAST | P19882 | 55 | 4 |
| HSP7F_YEAST | P32589 | 53 | 1 |
| G3P3_YEAST | P00359 | 52 | 2 |
| TCPE_YEAST | P40413 | 51 | 2 |
| HXKA_YEAST | P04806 | 51 | 1 |
| EF1A_YEAST | P02994 | 51 | 3 |
| K2C72_HUMAN | Q14CN4 | 51 | 2 |
| SAC1_YEAST | P32368 | 51 | 2 |
| TED1_YEAST | P40533 | 49 | 1 |
| SIR4_YEAST | P11978 | 49 | 1 |
| IMA1_YEAST | Q02821 | 48 | 4 |
| G6PI_YEAST | P12709 | 48 | 1 |
| ENP1_YEAST | P38333 | 44 | 2 |
| PABP_YEAST | P04147 | 44 | 2 |
| CBPY_YEAST | P00729 | 44 | 2 |
| TAF6_YEAST | P53040 | 40 | 2 |
| SYNC_YEAST | P38707 | 43 | 2 |
| SSZ1_YEAST | P38788 | 42 | 1 |
| FAS2_YEAST | P19097 | 42 | 2 |
| DED1_YEAS7 | A6ZP47 | 41 | 2 |
| NOP56_YEAST | Q12460 | 39 | 1 |
| FMP25_YEAST | Q08023 | 38 | 1 |
| SYRC_YEAST | Q05506 | 37 | 2 |

Total Proteins = 75

Table S-4. The highest confidence identification obtained for each methylated and putatively methylated precursor ion observed across 6 ETD LC-MS/MS analyses of the sample used to produce Figures 5 and 6 of the main text, and Table S-3: a trypsin digested complex protein mixture, featuring in vivo methylated His-Npl3. Mono-methylated arginine residues are marked with an asterisk, and di-methylated arginine residues are marked with two asterisks. Targeted data acquisition has been abbreviated as TDA in the precursor ion selection method. The criteria used to classify the confidence of identifications are outlined in the section above, and the superscript Roman numerals refer to the specific classification criteria used. Neutral losses are abbreviated as follows: mono-methylamine (MMA), mono-methylguanidine (MMG), asymmetric di-methylamine (aDMA) and di-methylguanidine (DMG).

| Ion ID# | Observed m/z | Charge state | Peptide sequence | Sequence position | Mascot ion score | Precursor ion selection method | Confidence of identification | Methylation-specific neutral losses |
|---|---|---|---|---|---|---|---|---|
| 1 | 404.2148 | 2+ | R.SNR*GGFR | 282-288 | 22 | TDA | high (iv, v) | MMA, MMG |
| 2 | 411.2222 | 2+ | R.SNR**GGFR | 282-288 | 21 | TDA | high (iv, v) | aDMA, DMG |
| 3 | 365.5369 | 3+ | R.GR*GGFR*GGFR | 289-298 | 21 | TDA | high (iv, v) | MMG |
| 4 | 370.2048 | 3+ | R.GR**GGFR*GGFR | 289-298 | 10 | TDA | medium (viii) | MMA, MMG, aDMA |
| 5 | 374.8813 | 3+ | R.GR**GGFR*GGFR* | 289-298 | 15 | TDA | medium (viii) | MMG, aDMA, DMG |
| 6 | 434.2328 | 2+ | R.GGFR*GGFR | 291-298 | 31 | TDA | high (iv, v) | MMA, MMG |
| 7 | 441.2404 | 2+ | R.GGFR**GGFR | 291-298 | 26 | DDA | high (iv, v) | aDMA, DMG |
| 8 | 477.7486 | 2+ | R.GGFR*GGFSR | 299-307 | 32 | DDA | high (iv, v) | MMA, MMG |
| 9 | 583.2964 | 2+ | R.GGFSR*GGFGGPR | 303-314 | 36 | TDA | high (iii) | MMA, MMG |
| 10 | 389.2004 | 3+ | R.GGFSR*GGFGGPR | 303-314 | 48 | DDA | very high (i) | MMA, MMG |
| 11 | 590.3040 | 2+ | R.GGFSR**GGFGGPR | 303-314 | 43 | DDA | very high (i) | aDMA, DMG |
| 12 | 393.8723 | 3+ | R.GGFSR**GGFGGPR | 303-314 | 71 | TDA | very high (i) | aDMA, DMG |
| 13 | 430.5546 | 3+ | R.GGFGGPR*GGFGGPR | 308-321 | 50 | DDA | very high (i) | MMA, MMG |
| 14 | 645.3284 | 2+ | R.GGFGGPR*GGFGGPR | 308-321 | 19 | TDA | high (iii) | MMA, MMG |
| 15 | 486.9005 | 3+ | R.GGFGGPR*GGYGGYSR | 315-329 | 50 | DDA | very high (i) | MMA, MMG |
| 16 | 729.8469 | 2+ | R.GGFGGPR*GGYGGYSR | 315-329 | 50 | TDA | very high (i) | MMA, MMG |
| 17 | 736.8550 | 2+ | R.GGFGGPR**GGYGGYSR | 315-329 | 4 | TDA | low | none |
| 18 | 415.6930 | 2+ | R.GGYGGYSR* | 322-329 | 54 | TDA | very high (i) | MMA, MMG |
| 19 | 814.3668 | 2+ | R.GGYGGYSR*GGYGGYSR | 322-337 | 61 | TDA | very high (i) | MMA, MMG |
| 20 | 821.3748 | 2+ | R.GGYGGYSR**GGYGGYSR | 322-337 | 23 | TDA | high (iv, v) | aDMA |
| 21 | 732.8339 | 2+ | R.GGYGGYSR*GGYGGSR | 330-344 | 41 | TDA | very high (i) | MMA, MMG |
| 22 | 488.8918 | 3+ | R.GGYGGYSR*GGYGGSR | 330-344 | 29 | TDA | high (iii) | MMA, MMG |
| 23 | 739.8419 | 2+ | R.GGYGGYSR**GGYGGSR | 330-344 | 30 | DDA | high (iii) | aDMA, DMG |
| 24 | 493.5640 | 3+ | R.GGYGGYSR**GGYGGSR | 330-344 | 42 | TDA | very high (i) | aDMA, DMG |
| 25 | 499.8954 | 3+ | R.GGYDSPR*GGYDSPR | 345-358 | 25 | TDA | high (iv, v) | MMA, MMG |
| 26 | 749.3400 | 2+ | R.GGYDSPR*GGYDSPR | 345-358 | 35 | DDA | high (iv, v) | MMA, MMG |
| 27 | 504.5665 | 3+ | R.GGYDSPR**GGYDSPR | 345-358 | 3 | DDA | medium (viii) | aDMA, DMG |
| 28 | 643.2997 | 2+ | R.GGYDSPR*GGYSR | 352-363 | 25 | DDA | high (iv, v) | MMA, MMG |
| 29 | 429.2022 | 3+ | R.GGYDSPR*GGYSR | 352-363 | 24 | TDA | high (iv, v) | MMA, MMG |
| 30 | 657.9847 | 3+ | R.GGYDSPR**GGYSR**GGYGGPR | 352-370 | 34 | TDA | high (iv, v) | aDMA, DMG |
| 31 | 599.2914 | 2+ | R.GGYSR*GGYGGPR | 359-370 | 26 | TDA | high (iv, v) | MMA, MMG |
| 32 | 399.8635 | 3+ | R.GGYSR*GGYGGPR | 359-370 | 15 | TDA | high (iv, vi) | MMA, MMG |
| 33 | 404.5356 | 3+ | R.GGYSR**GGYGGPR | 359-370 | 46 | TDA | very high (i) | aDMA, DMG |
| 34 | 606.2995 | 2+ | R.GGYSR**GGYGGPR | 359-370 | 34 | TDA | high (iv, v) | aDMA, DMG |
| 35 | 748.8462 | 2+ | R.NDYGPPR*GSYGGSR | 371-384 | 32 | TDA | high (iii) | MMA, MMG |
| 36 | 499.5674 | 3+ | R.NDYGPPR*GSYGGSR | 371-384 | 42 | TDA | very high (i) | MMA, MMG |
| 37 | 755.8564 | 2+ | R.NDYGPPR**GSYGGSR | 371-384 | 22 | TDA | high (iv, v) | aDMA, DMG |
| 38 | 504.2387 | 3+ | R.NDYGPPR**GSYGGSR | 371-384 | 22 | TDA | high (iv, v) | aDMA, DMG |
| 39 | 738.3439 | 3+ | R.NDYGPPR*GSYGGSR*GGYDGPR | 371-391 | 11 | TDA | high (iv, vii) | MMA, MMG |

Scheme S-1. The locations of methylarginine residues identified in the amino acid sequence of in vitro methylated Npl3 using LC-MS/MS. Amino acids observed in methylated tryptic peptides are shown in black and bolded.

Scheme S-2. The locations of methylarginine residues identified in the amino acid sequence of in vivo methylated Npl3 using LC-MS/MS. Amino acids observed in methylated tryptic peptides are shown in black and bolded.

Npl3 Arginine Methylation Sites: Comparisons to Previous Research and Observed Amino Acid Motifs

When considering the methylated peptides identified across the three in vitro methylated His-Npl3 samples, if only very high and high confidence identifications are considered, 17 arginine methylation sites can be localized; 4 of these sites are exclusively mono-methylated, 12 show variable mono- and di-methylation, and 1 site displays evidence for di-methylation only. When medium confidence identifications are also considered, a further site shows variable mono- and di-methylation (amino acid 290), and an additional arginine mono-methylation site can be localized (amino acid 288). In the in vivo methylated Npl3 sample, of the 13 sites localized at very high or high confidence, 3 are exclusively mono-methylated, and 10 show variable mono- and di-methylation. When medium confidence identifications are also considered, a further site shows exclusive mono-methylation (amino acid 298).

As elaborated upon in the CID versus ETD section of the main text, all identified di-methylated arginine residues show strong evidence (from ETD derived neutral losses, some of which are specified in Table 1 and all of which are specified in Tables S-2 and S-4) to suggest that the di-methylations are asymmetric. The exception is amino acid 391, for which the di-methylation type cannot be specified: when using ETD, any neutral loss-derived product ions generated from the peptides from which this residue is identified fall at m/z values beyond the mass range of the employed linear ion trap.
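Whether a neutral loss-derived product ion falls inside the trap's mass range can be estimated from the precursor m/z and charge. The sketch below is a rough illustration under stated assumptions: each loss is taken as the intact neutral molecule (methylamine CH5N, dimethylamine C2H7N, methylguanidine C2H7N3, dimethylguanidine C3H9N3), the loss is assumed to occur from a charge-reduced precursor that has captured `electrons` electrons, and the electron mass is ignored. The function names and the choice of charge-reduced species are assumptions, not details taken from the main text.

```python
# Monoisotopic atomic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740}

# Elemental compositions assumed for the neutral losses named in Tables S-2 and S-4.
NEUTRAL_LOSSES = {
    "MMA": {"C": 1, "H": 5, "N": 1},   # mono-methylamine
    "aDMA": {"C": 2, "H": 7, "N": 1},  # di-methylamine
    "MMG": {"C": 2, "H": 7, "N": 3},   # mono-methylguanidine
    "DMG": {"C": 3, "H": 9, "N": 3},   # di-methylguanidine
}


def loss_mass(name: str) -> float:
    """Monoisotopic mass of a named neutral loss."""
    return sum(MONO[element] * count for element, count in NEUTRAL_LOSSES[name].items())


def charge_reduced_loss_mz(precursor_mz: float, charge: int, loss: str,
                           electrons: int = 1) -> float:
    """Approximate m/z of a neutral-loss product of a charge-reduced precursor."""
    reduced_charge = charge - electrons
    if reduced_charge < 1:
        raise ValueError("the product ion must retain at least one charge")
    # Electron mass (~0.0005 Da per electron) is neglected in this sketch.
    return (precursor_mz * charge - loss_mass(loss)) / reduced_charge


# Ion ID 17 of Table S-2 (590.3045, 2+) losing di-methylamine after one electron transfer.
print(round(charge_reduced_loss_mz(590.3045, 2, "aDMA"), 4))
```

For a triply charged precursor such as ion ID 39 of Table S-4, the same function with `electrons=2` gives a singly charged product above m/z 2000, which illustrates how larger peptides can push these products out of range.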

Using MALDI-FT-ICR data, in which MS/MS was not applied, McBride et al. suggested the presence of 17 arginine methylation sites in Npl3 [2]. For the in vitro methylated Npl3, the LC-MS/MS identifications described in the main text allow 15 of these 17 sites to be confirmed with very high or high confidence, medium confidence evidence exists for a further one of these sites (amino acid 288), and no firm evidence can be found for the remaining site listed by McBride et al. (amino acid 344). Additionally, 2 methylation sites that were not listed by McBride et al. can be localized at very high or high confidence from the present data (amino acids 391 and 404). For the in vivo methylated Npl3, 12 of the 17 sites listed by McBride et al. can be confirmed with very high or high confidence, while medium confidence evidence exists for a further one of these sites (amino acid 298).

The present results indicate that the majority of Npl3 arginine methylation occurs on Arg-Gly-Gly (RGG) motif arginines (14 very high or high confidence sites in the in vitro methylated Npl3, 12 such sites in the in vivo methylated Npl3), whilst evidence also exists for methylation of 3 RG motif arginines in the in vitro methylated Npl3 (amino acids 288, 377 and 391) and 1 RG motif arginine in the in vivo methylated Npl3 (amino acid 377); these general findings are consistent with those implied by McBride et al. in their study, though some differences exist in the specific sites reported. Additionally, the present data indicate that methylation of arginine in RD motifs is also possible (mono-methylation of amino acid 404 in the in vitro methylated Npl3).

Figure S-2. MS/MS spectra obtained from the doubly charged tryptic Npl3 peptide GGFSR(di-meth.)GGFGGPR (Ion ID# 18 of Table S-2) via (a) CID, and (b) ETD. All singly charged b and y ions are labeled in spectrum (a), and all singly charged c’ and z’ ions are labeled in spectrum (b). All precursor or charge-reduced precursor ions, and product ions derived from neutral losses of -H2O, -NH3 or methylation specific fragments are labeled in each spectrum. Exact masses for peptide backbone fragments are listed in the tables to the right of the spectra, and observed fragments are shown in black and bolded.

Figure S-3. MS/MS spectrum obtained from the doubly charged tryptic Npl3 peptide GGFGGPR(di-meth.)GGYGGYSR (Ion ID# 33 of Table S-2) via ETD. All singly charged c’ and z’ ions are labeled. All precursor or charge-reduced precursor ions, and product ions derived from neutral losses of -NH3 or methylation specific fragments are labeled. Exact masses for peptide backbone fragments are listed in the tables to the right of the spectra, and observed fragments are shown in black and bolded.

[Figure S-4: bar charts comparing CID and ETD. Panel (a): average ion intensity (cps); panel (b): % 2+ ions; panel (c): % 3+ ions; panel (d): % 4+ ions. All four comparisons are marked n.s.]

Figure S-4. The (a) average maximum ion intensity of individual methylated peptide ions, (b) average percentage of 2+ charged methylated peptide ions, (c) average percentage of 3+ charged methylated peptide ions, and (d) average percentage of 4+ charged methylated peptide ions observed in the 3 CID (left) and 3 ETD (right) LC-MS/MS experiments used to produce Figure 2. Brackets indicate statistical comparisons from two-tailed t-tests (n.s. refers to p>0.05).

Figure S-5. Histograms illustrating Mascot ion score distributions obtained for methylated Npl3 derived precursor ions from the 3 CID (left) and 3 ETD (right) LC-MS/MS experiments used to produce Figure 2, segmented by the charge states of the precursor ions. Mascot ion scores are associated with precursor ions corresponding to the very high, high and medium confidence methylated precursor ions in Table S-2. Searches returning incorrect peptide matches are assigned scores of 0. Data columns are normalized to the highest value in each histogram.

Figure S-6. Histograms illustrating Mascot ion score distributions obtained for methylated Npl3 derived precursor ions from the 3 CID (left) and 3 ETD (right) LC-MS/MS experiments used to produce Figure 2, segmented by the masses of the precursor ions. Mascot ion scores are associated with precursor ions corresponding to the very high, high and medium confidence methylated precursor ions in Table S-2. Searches returning incorrect peptide matches are assigned scores of 0. Data columns are normalized to the highest value in each histogram.

Figure S-7. The (a) average maximum ion intensity of individual methylated peptide ions, and (b) average number of methylated peptide ions observed in the 2 DDA (left) and 3 PTM-specific targeted data acquisition (right) ETD LC-MS/MS experiments conducted upon the trypsin digested in vitro methylated His-Npl3 sample used to produce Figures 3 and 4. Brackets indicate statistical comparisons from two-tailed t-tests (n.s. refers to p>0.05).

Figure S-8. The (a) average percentages of very high confidence methylated peptide ions selected for MS/MS, and (b) average percentages of high and very high confidence methylated peptide ions selected for MS/MS in the 2 DDA (left) and 3 PTM-specific targeted data acquisition (right) ETD LC-MS/MS experiments conducted upon the trypsin digested in vitro methylated His-Npl3 sample used to produce Figures 3 and 4. Brackets and p values indicate statistical comparisons from two-tailed t-tests.

Figure S-9. The average maximum ion intensity of individual methylated peptide ions observed in the 2 DDA (left) and 2 PTM-specific targeted data acquisition (right) ETD LC-MS/MS experiments conducted upon the trypsin digested protein mixture featuring in vivo methylated Npl3 used to produce Figures 5 and 6. Brackets indicate statistical comparisons from two-tailed t-tests (n.s. refers to p>0.05).

[Figure S-10: bar charts comparing DDA and TDA, titled "in vivo methylated Npl3 peptides from complex digest mixture". Panel (a): # very high confidence methylated peptide ions selected (n.s.); panel (b): # high and very high confidence methylated peptide ions selected (p = 0.033).]

Figure S-10. The (a) average numbers of very high confidence methylated peptide ions selected for MS/MS, and (b) average numbers of high and very high confidence methylated peptide ions selected for MS/MS in the 2 DDA (left) and 2 PTM-specific targeted data acquisition (right) ETD LC-MS/MS experiments conducted upon the trypsin digested protein mixture featuring in vivo methylated Npl3 used to produce Figures 5 and 6. Brackets and p values indicate statistical comparisons from two-tailed t-tests (n.s. refers to p>0.05).

[Figure S-11: bar charts. Panel (a): % methylated peptide features extracted after 1, 2 and 3 MaxQuant analyses; panel (b): average ion intensity (cps) of unextracted versus extracted peptide features (p = 0.020).]

Figure S-11. (a) Average percentages of methylated peptide features extracted from MS-1 data following 1, 2 and 3 MaxQuant analyses. (b) Average maximum ion intensities of methylated peptide features which are unextracted (left) and extracted (right) in individual MaxQuant analyses. All data are produced from 4 replicate LC-MS analyses of the trypsin digested in vitro methylated His-Npl3 sample used to produce Figures 3 and 4. Brackets and p values indicate statistical comparisons from two-tailed t-tests.

FindMod Analyses: Descriptions of Total Peptide Matches, False Discovery Rates and False Negative Rates

As described in the main text, the different digested methylated His-Npl3 samples produce between 8,500 and 23,000 features (as determined by MaxQuant). Following FindMod analyses of the monoisotopic masses associated with these features, no more than 290 matches (typically ~70 to 120 matches) to putatively methylarginine-carrying peptides are returned after duplicate matches are culled (e.g. masses returning matches to peptides featuring either two mono-methylation sites or one di-methylation site). The false positive rates associated with these analyses, which are dependent on the number of features and the number of methylated peptides observed in a given sample, vary between 10 and 75% (peptides not found in Table S-2 are considered false positives). Moreover, no false negatives are returned following FindMod analysis; when the monoisotopic masses associated with each methylated peptide ion identified in the current study (i.e. the ions of Table S-2) are run through FindMod, matches to methylarginine-carrying peptides are returned in each case.
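The duplicate matches mentioned above arise because two mono-methylations and one di-methylation add the same mass (two CH2 units) to a peptide, so a mass-only search cannot tell them apart. The sketch below illustrates that equivalence, one way of culling such duplicates, and the false positive rate as defined in the text (matches absent from Table S-2 divided by all matches). The function names and the tuple layout are hypothetical and do not describe FindMod's interface.

```python
# Mass added per methyl group: net addition of CH2 (monoisotopic, Da).
METHYL_SHIFT = 12.0 + 2 * 1.00782503


def total_shift(n_mono: int, n_di: int) -> float:
    """Total mass shift for a peptide with n_mono mono- and n_di di-methylated sites."""
    return (n_mono + 2 * n_di) * METHYL_SHIFT


def cull_duplicates(matches):
    """Keep one match per (sequence, total methyl groups).

    matches: iterable of (sequence, n_mono, n_di) tuples; the first match seen
    for each isobaric group is retained.
    """
    kept = {}
    for sequence, n_mono, n_di in matches:
        kept.setdefault((sequence, n_mono + 2 * n_di), (sequence, n_mono, n_di))
    return list(kept.values())


def false_positive_rate(matched, confirmed) -> float:
    """Fraction of matched peptides that are not among the confirmed peptides."""
    matched = set(matched)
    if not matched:
        return 0.0
    return len(matched - set(confirmed)) / len(matched)


# Two mono-methylations and one di-methylation are isobaric.
print(total_shift(2, 0) == total_shift(0, 1))

# Three mass matches to the same backbone collapse to two distinct masses.
print(len(cull_duplicates([
    ("GGFSRGGFGGPR", 2, 0),
    ("GGFSRGGFGGPR", 0, 1),
    ("GGFSRGGFGGPR", 1, 0),
])))
```

As a consistency check on the shift, ion IDs 15 and 17 of Table S-2 (mono- versus di-methylated GGFSRGGFGGPR, both 2+) differ by 7.0079 in m/z, or about 14.016 Da, which agrees with one additional methyl group.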

Author Contributions

Mass spectrometry experiments, data analysis and manuscript preparation were undertaken by G.H.-S. Contributions to manuscript preparation were obtained from J.K.K.L. and M.R.W. Conceptualization of experimental methods was undertaken by G.H.-S., J.K.K.L. and M.A.E. Cloning of genes, and expression, purification and methylation of recombinant proteins were conducted by J.K.K.L. and M.A.E. Fluorography was conducted by J.K.K.L. Additional intellectual input into experimental design and data interpretation was provided by M.R.W.

References

1. Shevchenko, A., Wilm, M., Vorm, O., Mann, M.: Mass spectrometric sequencing of proteins from silver-stained polyacrylamide gels. Anal. Chem. 68, 850-858 (1996)

2. McBride, A. E., Cook, J. T., Stemmler, E. A., Rutledge, K. L., McGrath, K. A., Rubens, J. A.: Arginine methylation of yeast mRNA-binding protein Npl3 directly affects its function, nuclear export, and intranuclear protein interactions. J. Biol. Chem. 280, 30888-30898 (2005)
