bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Mutations in disordered regions cause disease by creating endocytosis motifs
Katrina Meyer1, Bora Uyar2, Marieluise Kirchner3, Jingyuan Cheng1, Altuna Akalin2, Matthias Selbach1,4
1 Proteome Dynamics, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str. 10, D-13092 Berlin, Germany
2 Bioinformatics Platform, Max Delbrück Center for Molecular Medicine, Robert-Rössle- Str. 10, D-13092 Berlin, Germany
3 BIH Core Facility Proteomics, Robert-Rössle-Str. 10, D-13092 Berlin, Germany
4 Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany
Corresponding author: Matthias Selbach Max Delbrück Centrum for Molecular Medicine Robert-Rössle-Str. 10, D-13092 Berlin, Germany Tel.: +49 30 9406 3574 Fax.: +49 30 9406 2394 email: [email protected]
Word count: 2741
bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Abstract
2 Mutations in intrinsically disordered regions (IDRs) of proteins can cause a wide
3 spectrum of diseases. Since IDRs lack a fixed three-dimensional structure, the
4 mechanism by which such mutations cause disease is often unknown. Here, we employ
5 a proteomic screen to investigate the impact of mutations in IDRs on protein-protein
6 interactions. We find that mutations in disordered cytosolic regions of three
7 transmembrane proteins (GLUT1, ITPR1 and CACNA1H) lead to an increased binding
8 of clathrins. In all three cases, the mutation creates a dileucine motif known to mediate
9 clathrin-dependent trafficking. Follow-up experiments on GLUT1 (SLC2A1), a glucose
10 transporter involved in GLUT1 deficiency syndrome, revealed that the mutated protein
11 mislocalizes to intracellular compartments. A systematic analysis of other known
12 disease-causing variants revealed a significant and specific overrepresentation of
13 gained dileucine motifs in cytosolic tails of transmembrane proteins. Dileucine motif
14 gains thus appear to be a recurrent cause of disease.
1 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
15 Genome sequencing technologies have greatly facilitated the discovery of human
16 protein variants. In many cases it is not known if such variants cause disease, and even
17 when associations have been discovered, determining the mechanisms by which this
18 happens remains a major challenge 1. Most disease-causing missense mutations affect
19 evolutionarily conserved amino acids within structured regions of a protein and
20 destabilize its structure 2,3. However, over 20% of human disease mutations occur in so
21 called intrinsically disordered regions (IDRs) 4. Contrary to the traditional understanding
22 of protein structure and function, it is now clear that IDRs represent a functionally
23 important and abundant part of eukaryotic proteomes 5,6. Yet, since IDRs lack a defined
24 tertiary structure and are typically poorly conserved, the classical structure-function
25 paradigm cannot explain how mutations in IDRs cause disease.
26
27 One way to approach this issue is by analyzing protein-protein interactions (PPIs),
28 which can help to understand how mutations cause disease 7,8. The impact of PPIs on
29 disease is highlighted by the enrichment of missense mutations on interaction interfaces
30 of proteins associated with the corresponding disorders 9. Moreover, comparing the
31 interaction partners of wild-type proteins and their disease-associated variants can
32 reveal disease mechanisms 10,11. We therefore sought to systematically investigate how
33 mutations in IDRs affect PPIs.
34
35 IDRs often harbour short linear motifs (SLiMs) which mediate their function 12,13. These
36 SLiMs typically fall into two major classes -- motifs that mediate interactions with
37 globular domains and/or motifs which harbor posttranslational modification sites 14.
2 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
38 Mutations in IDRs can cause disease by disrupting such motifs or by creating novel
39 ones. A number of examples of such pathogenic changes in motifs have been reported
40 in the literature 15–18. Additionally, computational studies have revealed that pathogenic
41 mutations often target SLiMs 19–21. Despite these insights, however, there has not yet
42 been a systematic experimental analysis of the way disease-causing mutations in IDRs
43 affect interactions. One reason for this is that the small binding area between SLiMs and
44 cognate domains results in low binding affinities, which makes it difficult to study these
45 interactions 22.
46
47 A peptide-based interaction screen on disease-causing mutations
48 We reasoned that quantitative interaction proteomics with immobilized synthetic
49 peptides should enable us to systematically assess the impact of mutations in IDRs.
50 Such peptide pull-downs can maintain specificity even with low affinity interactions 23.
51 We therefore designed a scalable proteomic screen that employs peptides synthesized
52 on cellulose membranes (Fig. 1 A). These membranes carry peptides with a length of
53 15 amino acids that correspond to IDRs in both the wild-type and mutant form.
54 Membranes are incubated with cell extracts to pull-down interacting proteins. After
55 washing, peptide spots are excised and the proteins associated with them are identified
56 and quantified by shotgun proteomics. The main challenge in such interaction screens
57 is to distinguish specific interaction partners from non-specific contaminants 24–27. We
58 addressed this challenge through the use of two levels of quantification. First, two
59 replicates of a pull-down with a specific peptide sequence are compared to all other
60 peptide pull-downs via label-free quantification (LFQ) 28. This LFQ-filter selects proteins 3 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
61 that bind specifically to a given peptide. Secondly, the screen uses SILAC-based
62 quantification 29 to identify differential interaction partners of the wild-type and disease-
63 causing form of a peptide. This requires incubating the two replicates of the membrane
64 with cell lysates that have been differentially SILAC labeled. Wild-type peptide spots
65 from the heavy pull-down are combined with spots from the light pull-down that
66 correspond to the mutant forms of the same peptide, and vice versa. SILAC ratios give
67 a measure of the degree to which each particular mutation affects a specific interaction.
68
69 For the screen we selected 128 mutations in IDRs which are known to cause
70 neurological diseases (Fig. S1, Table S1). We included a peptide from an IDR in the
71 SOS1 protein that contains a proline-rich motif by which it is known to recruit several
72 specific binders via their SH3 domains 23. We analyzed the 2 x 129 pull-down samples
73 by using high-resolution shotgun proteomics in 45 min runs, resulting in a total
74 measurement time of about eight days. Replicates of the same peptide clustered with
75 each other with a median correlation coefficient of 0.87 (Pearson’s R), indicating good
76 reproducibility (Fig. S2). The LFQ data identified nine specific interactors of the SOS1
77 peptide, including four of the five that were previously known (Fig. 1 B). In the
78 corresponding SILAC data, seven of the nine LFQ-specific binders show preferential
79 binding to the wild-type compared to the mutant which has a disrupted proline-rich motif
80 (Fig. 1 C). Importantly, all interactors that are both specific (LFQ) and differential
81 (SILAC) contain SH3 domains. To further assess the relationship between peptide
82 motifs and cognate domains we also analyzed all pull-downs combined. We found that
83 mutations which disrupt a predicted SLiM in the peptide tend to reduce binding of
4 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
84 proteins with cognate domains (Fig. S3). Conversely, the gain of a SLiM in a peptide
85 tends to increase its binding to proteins with matching domains. In summary, these data
86 demonstrate that our screen efficiently detects how mutations in IDRs affect interactions
87 mediated by SLiMs.
88
89 A quantitative interaction network for disease-associated IDRs
90 Individual pull-downs typically led to the identification of ~ 400 proteins. If all of these
91 proteins were considered specific binders, this would correspond to ~ 400 binary
92 interactions per pulldown and a total of more than 100,000 interactions. However, since
93 many proteins are in fact background binders, we applied our quantitative filters with
94 cut-offs derived from the SOS1 control (Fig. 2 A). About half of the 2x128 peptides
95 showed at least one specific binder according to the LFQ-filter (Fig. S4). Applying the
96 LFQ-filter dramatically reduced the total number of interactions to 618. All of these 618
97 interactions are specific for the wild-type and/or the mutant form of a peptide as
98 compared to all other peptides in the screen (Table S2). However, not all of these
99 specific interactions are differential, i.e. affected by the mutation. Therefore, we next
100 applied the SILAC-filter, which led to a final list of 180 differential interactions (Table
101 S3). 111 of these interactions are lost through mutations in the peptide, while 69 are
102 gained. Of note, since pull-downs can also capture indirect binders, not all of these
103 interactions are necessarily direct.
104
105 To provide an overview of the data we displayed all of the differential interactions as a
106 network (Fig. 2 B). This revealed that several wild-type or mutant peptides shared 5 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
107 differential interactors, suggesting functional similarities. Moreover, subnetworks were
108 enriched in specific gene ontology terms (Fig. S5). The figure highlights two
109 subnetworks that we find particularly interesting. One is enriched in proteins that are
110 connected to clathrin-coated vesicles (see below). The other is enriched in splicing
111 factors (insets in Fig. 2 B) that interact with an IDR corresponding to amino acids 512-
112 526 of Fused in sarcoma (FUS). These interactions are disrupted by the R521C mutant.
113 FUS is an RNA-binding protein that is best known for its role in amyotrophic lateral
114 sclerosis (ALS) 30. The R521C and other mutations in the C-terminal region of the
115 protein are thought to be pathogenic because they disrupt a nuclear localisation signal
116 31. Our data suggests that impaired binding of splicing factors could be an
117 additional/alternative explanation for the pathogenicity of this mutation. This observation
118 is interesting because FUS has already been implicated in splicing 32–34. In fact, the C-
119 terminal region of the protein was found to interact with SRSF10 even before
120 pathogenic mutations in this region were identified 35.
121
122 Recruitment of clathrins through gains of dileucine motifs
123 The finding we considered most interesting is that mutated IDRs from CACNA1H,
124 GLUT1/SLC2A1 and ITPR1 led to specific interactions with clathrins (Fig. 3 A). The
125 corresponding SILAC data revealed that in all three cases clathrin had a strong
126 preference for the mutant form of the peptides over the wild-type (Fig. 3 B). Clathrins
127 mediate endocytosis and intracellular trafficking of transmembrane proteins. They are
128 recruited to the membrane by adaptor proteins that recognize specific cargo, form
129 clathrin-coated pits which are then pinched off by dynamin 36,37. Our finding that 6 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
130 clathrins are specifically recruited to mutated IDRs suggests that the mutations might
131 affect protein trafficking. Intriguingly, the three mutations share several other features
132 beyond an increased affinity for clathrin. First, all three mutations affect transmembrane
133 proteins -- a calcium channel (CACNA1H) and a glucose transporter (GLUT1) residing
134 in the plasma membrane and an inositol 1,4,5-trisphosphate receptor (ITPR1) in the ER
135 (Fig. 3 C). Second, all three mutations affect disordered regions exposed to the cytosol,
136 which makes them accessible to cytosolic adaptor proteins that mediate clathrin
137 recruitment. Third, all three mutations involve the change of a proline to a leucine
138 residue and thereby result in the appearance of a novel dileucine motif (“LL”) in the IDR
139 (Fig. 3 D). Such motifs are known to recruit clathrin to the plasma membrane or
140 intracellular locations 38. The classical dileucine motif for clathrin-dependent endocytosis
141 is [D/E]XXXL[L/I] 39, but variations of this scheme are common 38,40,41.
142
143 The P485L mutation causes mislocalization of the glucose transporter
144 GLUT1
145 The data presented so far are derived from our artificial in vitro screen based on short
146 peptides. We therefore selected the P485L mutation in GLUT1 for follow-up
147 experiments. This mutation causes GLUT1 deficiency syndrome (G1DS), a disorder
148 characterized by seizures and developmental delays 42,43. GLUT1 facilitates glucose
149 transport into the brain across the blood-brain barrier. GLUT1 mutations in G1DS
150 patients impair this glucose transport which gives rise to the disease phenotype.
151
152 To determine the impact of the P485L mutation on the subcellular localization, we 7 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
153 generated stable inducible cell lines expressing epitope tagged full-length wild-type or
154 mutant GLUT1. While the wild-type protein mainly localized to the plasma membrane,
155 the P485L mutant showed a more vesicular pattern (Fig. 4 A). Next we tested whether
156 the mutant form of GLUT1 was taken up via endocytosis by adding fluorescently labeled
157 transferrin to GLUT1 expressing cells. In contrast to wild-type GLUT1, the mutant
158 extensively colocalized with endocytosed transferrin (Fig. 4 B). To systematically
159 characterize the endocytic compartment in which GLUT1_P485L resides, we used
160 BioID as a proximity labeling method 44. We performed this experiment in a comparative
161 manner for both wild-type and mutant GLUT1 with SILAC-based quantification (see
162 Methods). GLUT1_P485L showed increased colocalization with proteins involved in
163 clathrin-mediated endocytosis, the trans-Golgi network (TGN), retrograde endosome-to-
164 TGN transport and lysosomes (Fig. 4 C). Of note, AP2B1 also showed increased
165 association with mutated GLUT1. This protein is part of the adaptor protein 2 (AP2)
166 complex that recognizes dileucine motifs to mediate endocytosis 38,40,41. Together, these
167 data indicate that the P485L mutation causes internalization of Glut1 via AP2-mediated
168 clathrin-dependent endocytosis.
169
170
171 Gains in dileucine motifs as a general disease mechanism
172 We next wondered if dileucine motif gains might be a more general mechanism by
173 which diseases could arise. To test this, we conducted a search of missense mutations
174 known to cause disease and occurring within disordered cytosolic regions of
175 transmembrane proteins. We found four additional pathogenic dileucine motif gains 8 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
176 (Fig. 5 A, Table S4). These mutations affect different proteins and cause a range of
177 diseases. Since we focused our follow-up experiments on GLUT1, we cannot state with
178 certainty if the other dileucine motif gains also cause protein mistrafficking. Alternatively,
179 the mutations could cause disease by different mechanisms and might just create
180 dileucine motifs as a by-product. If that was the case dileucine motifs should be
181 homogeneously distributed between disease-causing mutations and non-pathogenic
182 polymorphisms. Alternatively, if dileucine motif gains are responsible for pathogenesis,
183 they would be predicted to occur more often in disease than in non-pathogenic variants.
184 Moreover, pathogenic dileucine motif gains should be specific for cytosolic regions of
185 transmembrane proteins since this is where they exert their function. To test these
186 predictions, we compared the frequency of dileucine motif gains that have been found in
187 disease-causing mutations to their appearance in non-pathogenic polymorphisms. A
188 global survey of all disordered regions of the entire proteome revealed that gains in
189 dileucine motif gains occurred at about the same rate in disease and non-pathogenic
190 variants (OR = 0.81, p-value = 0.319, two-sided Fisher’s Exact Test). In the cytosolic
191 tails of transmembrane proteins, however, we observed a 3.7-fold enrichment of
192 dileucine motifs implicated in disease (OR = 3.7, p-value = 0.017, two-sided Fisher’s
193 Exact Test, Fig. 5 B). Disordered extracellular regions of transmembrane proteins do
194 not show this enrichment. We conclude that dileucine motif gains in disordered regions
195 of the cytosolic segments of transmembrane proteins are significantly and specifically
196 enriched in disease. To further assess the significance of this finding, we systematically
197 searched within cytosolic regions of transmembrane proteins for all other annotated
198 SLiMs contained in the ELM database 39. Intriguingly, of all 263 SLiMs tested, the
9 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
199 dileucine motif (LIG_diLeu_1) was the only significantly enriched motif in disease (Fig. 5
200 C).
201
202 Discussion
203 Understanding the functional relevance of protein variants is a major challenge in the
204 era of personal genomics -- especially for missense mutations in disordered regions.
205 Our proteomic screen provides a first systematic experimental analysis of how
206 mutations in disordered regions affect protein-protein interactions. Our results show that
207 the method can (i) capture known interactions, (ii) detect how mutations in SLiMs affect
208 binding of cognate domains and (iii) provide novel mechanistic insights into
209 pathogenesis. The peptide-based method is especially useful for mutations in proteins
210 that are otherwise difficult to study, such as large transmembrane proteins.
211 Nevertheless, it is also important to consider the intrinsic limitations of the approach.
212 Most importantly, in vitro pull-downs do not necessarily reflect physiological interactions
213 in vivo. For example, artifactual binding can occur when combining peptides and
214 proteins that never see each other in the cell 45. Moreover, taking IDRs out of the
215 context of the full length protein and immobilizing them as short peptides can affect
216 interactions. Finally, amino acids within IDRs often carry posttranslational modifications
217 -- a possibility which we did not consider here. In the future, it will be interesting to
218 include modified peptides, especially since mutations often affect modification sites 19,20.
219
220 Our screen revealed that a mutation in the cytosolic tail of the glucose transporter Glut1
221 generates a dileucine motif, which leads to clathrin-dependent endocytosis of the
10 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
222 protein. Consistently, we found that mutant Glut1 mislocalizes from the plasma
223 membrane to endocytic compartments. The finding that pathogenic mutations in
224 cytosolic tails of other transmembrane proteins also create dileucine motifs is
225 particularly intriguing. It is tempting to classify diseases caused by such motif gains as
226 “dileucineopathies”. Whether the other dileucine motif gains cause protein
227 mislocalization similar to Glut1 remains to be investigated. The observation that these
228 mutations are significantly and specifically enriched in cytosolic tails suggests that at
229 least some of them are functional. Why are dileucine motif gains a recurrent cause of
230 disease? We think this is due to a combination of several factors: First, the dileucine
231 motifs are not very complex and can thus arise by chance. Second, proline codons can
232 mutate to leucine codons by changing a single base. Third, proline is overrepresented in
233 IDRs, which are also the sites where the motif needs to be located in order to be
234 functional.
235
236 Bioinformatic studies have established that pathogenic mutations in disordered regions
237 often affect SLiMs 19–21. However, whether these predicted motif changes affect protein-
238 protein interactions is not clear. Moreover, many motifs have not yet been defined and
239 thus escape computational predictions 14. The biochemical approach presented here
240 provides a useful complementary strategy to computational studies. Key advantages of
241 our set-up are its scalability (by using synthetic peptides) and specificity (by employing
242 two quantitative filters). While we focused on Mendelian neurological disorders here, the
243 approach can also be applied to other types of variants such as somatic mutations in
244 cancer and non-pathogenic polymorphisms.
11 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
245 Acknowledgements
246 We would like to thank Michael Krauß, Volker Haucke (both FMP Berlin) and Philip Kim
247 (University of Toronto) for helpful discussions and Russ Hodge (MDC Berlin) for
248 valuable comments on the manuscript. We also thank Daniel Perez Hernandez (MDC
249 Berlin) and Ulf Reimer (JPT) for practical suggestions in setting up the screen. Markus
250 Landthaler (MDC Berlin) kindly provided stable cell lines and the Advanced Light
251 Microscopy Technology Platform (MDC Berlin) helped with microscopy.
252
253 Author contributions
254 K.M. and M.K. established the screen and selected the candidates. K.M. performed
255 most of the wet-lab experiments with contribution from M.K. and J.C.. K.M. processed
256 and analyzed the mass spectrometric data. B.U. carried out most of the remaining
257 bioinformatic analyses (mainly motif based analyses) supervised by A.A.. M.S.
258 conceived and supervised the work and wrote the manuscript with input from all
259 authors.
260
261 Competing financial interest statement 262 263 The authors declare no competing financial interest.
12 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
264 Figure Legends
265 Fig. 1: Quantitative interaction screen with disease-associated disordered regions 266 A, Cellulose membranes with synthetic wild-type (circles) and mutated (stars) peptides 267 are incubated with lysate from light (light red) or heavy (signal red) SILAC labeled cells 268 to pull-down interacting proteins. Spots are excised, corresponding wild-type/mutant 269 pairs are combined and analyzed by quantitative shotgun proteomics. Label free 270 quantification (LFQ) identifies specific interactors by comparing both replicates to all 271 other pull-downs. SILAC identifies differential binders by directly comparing 272 corresponding wild-type and mutant pairs. B, C Results for a SOS1-derived peptide with 273 a proline-rich motif as a benchmark. B, Volcano plot from the LFQ data for wild-type 274 SOS1. Specific binders are shown as red dots. 4 out of 5 known binders (red gene 275 names) are detected. C, SILAC log2 fold changes for differential binders of the wild-type 276 and mutant SOS1 peptide. Proteins with SH3 domains are shown with black outlines. 277 278 Fig. 2: Differential interactors of wild-type and mutant IDRs 279 A, Quantitative filters to select specific and differential interactions. Only a minor fraction 280 of all detected interactions is specific (LFQ filter). Moreover, only a fraction of specific 281 interactions is differential (SILAC filter), i.e. show preferential binding to the wild-type or 282 mutant form of a peptide. Mutation-induced interaction losses are more frequent than 283 mutation-induced gains. 284 B, Network of all differential interactions. Peptides (rectangles) and interacting proteins 285 (ovals) are presented as nodes. The edges indicate preferential binding to the wild-type 286 (blue) or mutant (red) form of a peptide (edge width indicates SILAC ratios). Highlighted 287 subnetworks are enriched in splicing regulators and clathrin-coated vesicle proteins 288 (see text).
13 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
289 290 Fig. 3: Recruitment of clathrins by recurrent gains of dileucine motifs 291 A, Volcano plots for pull-downs with mutated peptides derived from CACNA1H, GLUT1 292 and ITPR1. Specific binders (relative to all other pull-downs) are highlighted in red. All 293 three peptides specifically interact with clathrins. 294 B, Corresponding SILAC plots show that clathrins and related proteins preferentially 295 bind to the mutant form of peptides (relative to the wild-type). 296 C, Graphical representation of the mutation sites. All three mutations affect cytosolic 297 regions of transmembrane proteins. CACNA1H and GLUT1 are located in the plasma 298 membrane and ITPR1 in the ER. 299 D, Aligning the three peptide sequences reveals a common gain of a dileucine motif. 300 301 302 Fig. 4: A mutation-induced dileucine motif gain causes mislocalization of the 303 glucose transporter GLUT1 304 A, Confocal images of GLUT1 localization in Hek cells stably expressing FLAG-GLUT1 305 reveal that the wild-type is localized mainly at the cell membrane while the P485L 306 mutant is mislocalized to cytoplasmic compartments. 307 B, GLUT1 expressing HeLa cells are incubated with fluorescently labeled transferrin for 308 10 min before fixation. Mutant but not wild-type GLUT1 extensively co-localizes with 309 endocytosed transferrin. 310 C, Comparison of proteins co-localizing with wild-type and mutant GLUT1 by proximity 311 labeling (BioID). The upper panel shows SILAC log2 fold changes from two replicate 312 experiments with swapped isotope labels. Proteins are colored according to their typical 313 subcellular compartment (lower panel). The P485L mutant shows increased 314 colocalization with proteins involved in in clathrin-dependent endocytosis (CDE), the 315 trans-Golgi network (TGN), retrograde endosome-to-TGN transport and lysosomes. 316 Figure adapted from 46.
14 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
317 Fig. 5: Mutation-induced gains of dileucine motifs are a significant cause of 318 disease 319 A, A systematic bioinformatic search revealed four additional pathogenic mutations in 320 cytosolic segments of transmembrane proteins that create dileucine motifs. 321 B, Relative frequency of dileucine motif gains in disease mutations and polymorphisms 322 in different disordered regions (IUPred Score >= 0.4) of the proteome. Dileucine motif 323 gain is significantly enriched only in disordered regions of the cytoplasmic domains of 324 transmembrane proteins (two-sided Fischer’s exact test). 325 C, Comparison of gained motifs in disordered regions of cytoplasmic tails of 326 transmembrane proteins reveals the dileucine motif to have the most significant specific 327 enrichment when compared with polymorphisms.
15 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
328 References
329
330 1. Cooper, G. M. & Shendure, J. Needles in stacks of needles: finding disease-causal
331 variants in a wealth of genomic data. Nat. Rev. Genet. 12, 628–640 (2011).
332 2. Subramanian, S. & Kumar, S. Evolutionary anatomies of positions and types of disease-
333 associated and neutral amino acid mutations in the human genome. BMC Genomics 7, 306
334 (2006).
335 3. Yue, P., Li, Z. & Moult, J. Loss of protein structure stability as a major causative factor in
336 monogenic disease. J. Mol. Biol. 353, 459–473 (2005).
337 4. Vacic, V. et al. Disease-associated mutations disrupt functionally important regions of
338 intrinsic protein disorder. PLoS Comput. Biol. 8, e1002709 (2012).
339 5. Wright, P. E. & Jane Dyson, H. Intrinsically disordered proteins in cellular signalling and
340 regulation. Nat. Rev. Mol. Cell Biol. 16, 18–29 (2014).
341 6. Uversky, V. N., Oldfield, C. J. & Dunker, A. K. Intrinsically disordered proteins in human
342 diseases: introducing the D2 concept. Annu. Rev. Biophys. 37, 215–246 (2008).
343 7. Ryan, C. J. et al. High-resolution network biology: connecting sequence with function. Nat.
344 Rev. Genet. 14, 865–879 (2013).
345 8. Wang, P. I. & Marcotte, E. M. It’s the machine that matters: Predicting gene function and
346 phenotype from protein networks. J. Proteomics 73, 2277–2289 (2010).
347 9. Wang, X. et al. Three-dimensional reconstruction of protein networks provides insight into
348 human genetic disease. Nat. Biotechnol. 30, 159–164 (2012).
349 10. Zhong, Q. et al. Edgetic perturbation models of human inherited disorders. Mol. Syst. Biol.
350 5, 321 (2009).
351 11. Hosp, F. et al. Quantitative interaction proteomics of neurodegenerative disease proteins.
16 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
352 Cell Rep. 11, 1134–1146 (2015).
353 12. Van Roey, K. et al. Short linear motifs: ubiquitous and functionally diverse protein
354 interaction modules directing cell regulation. Chem. Rev. 114, 6733–6778 (2014).
355 13. Fuxreiter, M., Tompa, P. & Simon, I. Local structural disorder imparts plasticity on linear
356 motifs. Bioinformatics 23, 950–956 (2007).
357 14. Tompa, P., Davey, N. E., Gibson, T. J. & Babu, M. M. A million peptide motifs for the
358 molecular biologist. Mol. Cell 55, 161–169 (2014).
359 15. Kadaveru, K., Vyas, J. & Schiller, M. R. Viral infection and human disease--insights from
360 minimotifs. Front. Biosci. 13, 6455–6471 (2008).
361 16. Silvis, M. R. et al. A mutation in the cystic fibrosis transmembrane conductance regulator
362 generates a novel internalization sequence and enhances endocytic rates. J. Biol. Chem.
363 278, 11554–11560 (2003).
364 17. Cordeddu, V. et al. Mutation of SHOC2 promotes aberrant protein N-myristoylation and
365 causes Noonan-like syndrome with loose anagen hair. Nat. Genet. 41, 1022–1026 (2009).
366 18. Vogt, G. et al. Gains of glycosylation comprise an unexpectedly large group of pathogenic
367 mutations. Nat. Genet. 37, 692–700 (2005).
368 19. Radivojac, P. et al. Gain and loss of phosphorylation sites in human cancer. Bioinformatics
369 24, i241–i247 (2008).
370 20. Narayan, S., Bader, G. D. & Reimand, J. Frequent mutations in acetylation and
371 ubiquitination sites suggest novel driver mechanisms of cancer. Genome Med. 8, 55
372 (2016).
373 21. Uyar, B., Weatheritt, R. J., Dinkel, H., Davey, N. E. & Gibson, T. J. Proteome-wide analysis
374 of human disease mutations in short linear motifs: neglected players in cancer? Mol.
375 Biosyst. 10, 2626–2642 (2014).
376 22. Neduva, V. & Russell, R. B. Linear motifs: Evolutionary interaction switches. FEBS Lett.
17 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
377 579, 3342–3345 (2005).
378 23. Schulze, W. X. & Mann, M. A novel proteomic screen for peptide-protein interactions. J.
379 Biol. Chem. 279, 10756–10764 (2004).
380 24. Gingras, A.-C. & Raught, B. Beyond hairballs: The use of quantitative mass spectrometry
381 data to understand protein-protein interactions. FEBS Lett. 586, 2723–2731 (2012).
382 25. Gstaiger, M. & Aebersold, R. Applying mass spectrometry-based proteomics to genetics,
383 genomics and network biology. Nat. Rev. Genet. 10, 617–627 (2009).
384 26. Meyer, K. & Selbach, M. Quantitative affinity purification mass spectrometry: a versatile
385 technology to study protein–protein interactions. Front. Genet. 6, (2015).
386 27. Smits, A. H. & Vermeulen, M. Characterizing Protein-Protein Interactions Using Mass
387 Spectrometry: Challenges and Opportunities. Trends Biotechnol. 34, 825–834 (2016).
388 28. Cox, J. et al. Accurate proteome-wide label-free quantification by delayed normalization
389 and maximal peptide ratio extraction, termed MaxLFQ. Mol. Cell. Proteomics 13, 2513–
390 2526 (2014).
391 29. Mann, M. Functional and quantitative proteomics using SILAC. Nat. Rev. Mol. Cell Biol. 7,
392 952–958 (2006).
393 30. Deng, H., Gao, K. & Jankovic, J. The role of FUS gene variants in neurodegenerative
394 diseases. Nat. Rev. Neurol. 10, 337–348 (2014).
395 31. Dormann, D. et al. ALS-associated fused in sarcoma (FUS) mutations disrupt Transportin-
396 mediated nuclear import. EMBO J. 29, 2841–2857 (2010).
397 32. Rogelj, B. et al. Widespread binding of FUS along nascent RNA regulates alternative
398 splicing in the brain. Sci. Rep. 2, 603 (2012).
399 33. Ishigaki, S. et al. Position-dependent FUS-RNA interactions regulate alternative splicing
400 events and transcriptions. Sci. Rep. 2, 529 (2012).
401 34. Qiu, H. et al. ALS-associated mutation FUS-R521C causes DNA damage and RNA splicing
18 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
402 defects. J. Clin. Invest. 124, 981–999 (2014).
403 35. Yang, L., Embree, L. J., Tsai, S. & Hickstein, D. D. Oncoprotein TLS Interacts with Serine-
404 Arginine Proteins Involved in RNA Splicing. J. Biol. Chem. 273, 27761–27764 (1998).
405 36. McMahon, H. T. & Boucrot, E. Molecular mechanism and physiological functions of
406 clathrin-mediated endocytosis. Nat. Rev. Mol. Cell Biol. 12, 517–533 (2011).
407 37. Ferguson, S. M. & De Camilli, P. Dynamin, a membrane-remodelling GTPase. Nat. Rev.
408 Mol. Cell Biol. 13, 75–88 (2012).
409 38. Pandey, K. N. Functional roles of short sequence motifs in the endocytosis of membrane
410 receptors. Front. Biosci. 14, 5339–5360 (2009).
411 39. Dinkel, H. et al. ELM 2016—data update and new functionality of the eukaryotic linear motif
412 resource. Nucleic Acids Res. 44, D294–D300 (2015).
413 40. Staudt, C., Puissant, E. & Boonen, M. Subcellular Trafficking of Mammalian Lysosomal
414 Proteins: An Extended View. Int. J. Mol. Sci. 18, (2016).
415 41. Traub, L. M. Tickets to ride: selecting cargo for clathrin-regulated internalization. Nat. Rev.
416 Mol. Cell Biol. 10, 583–596 (2009).
417 42. De Vivo, D. C. et al. Defective Glucose Transport across the Blood-Brain Barrier as a
418 Cause of Persistent Hypoglycorrhachia, Seizures, and Developmental Delay. N. Engl. J.
419 Med. 325, 703–709 (1991).
420 43. Pascual, J. M. et al. Structural signatures and membrane helix 4 in GLUT1: inferences from
421 human blood-brain glucose transport mutants. J. Biol. Chem. 283, 16732–16742 (2008).
422 44. Roux, K. J., Kim, D. I., Raida, M. & Burke, B. A promiscuous biotin ligase fusion protein
423 identifies proximal and interacting proteins in mammalian cells. J. Cell Biol. 196, 801–810
424 (2012).
425 45. Gibson, T. J., Dinkel, H., Van Roey, K. & Diella, F. Experimental detection of short
426 regulatory motifs in eukaryotic proteins: tips for good practice as well as for bad. Cell
19 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
427 Commun. Signal. 13, (2015).
428 46. Raiborg, C. & Stenmark, H. The ESCRT machinery in endosomal sorting of ubiquitylated
429 membrane proteins. Nature 458, 445–452 (2009).
20 a Defining specific interactions
SILAC Li 1.5 ght
p-value
bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright1.0 holder for this preprint (which was not certified by peer review)SILAC is the He author/funder. All rights reserved. No reuse allowed without permission. avy Log10 -
0.5 LFQ specific
0.0 Log2FC Mut/other peptides Log2FC Wt/other peptides SILAC differential Defining differential interactions Quantitative Peptide array shotgun proteomics
FC Wt/Mut FC
Ratio H/L Log2 H/L Ratio
Ratio L/H Log2FC Wt/Mut
b SNX9 PACSIN3 c PACSIN2 BIN1 NCK1 GRB2 EPS15L1 CD2AP RCN2 SH3 domain 8 1.5 6 GRB2 CRKL CD2AP NCK1 Mut 4 SNX9 PACSIN3 CAPZA1 PACSIN2 CLTC 1.0 2 PFN2
C Wt/ C p value p BIN1
0
-log[10] -log[10] L Log2F L -2 0.5 -4
Ratio H/ Ratio -6
0.0 -8
-10 -8 -6 -4 -2 0 2 4 6 8 10 -8 -6 -4 -2 0 2 4 6 8 Log2FC Wt/other peptides Ratio L/H Log2FC Wt/Mut
Fig.1 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
a Peptide-protein interactions Fig. 2
lost
gained
101,765 618 180 detected specific differential interactions interactions interactions
CCDC137 RPL39P5 C19orf53 MORF4L2 ZNF768 RUVBL2 RUVBL1 CSNK1D LANCL1 b IPO11 regulation of mRNA splicing, SUPT5H TINF2_K280E DDOST via spliceosome (p = 3.9e-12) TUBB2B RPUSD3 CACNA1A_R2135C DARS2_Y629C PCNA SRRM2 SRSF1 SRSF2 UBA1_S547G CSNK1A1 SLTM SSR4 RPN2 TRA2B EARS_E96K SPAST_P293L DDX39B PDIA3 SRSF3 RTL1 AURKA CALU MAP1B YWHAG PPM1G TPM3 FUS_R521C HDGFRP2 ALYREF HIST1H1D SEC63 L1CAM_S1194L EMG1 YWHAH COPA SRSF9 NASP SRP14 SRSF6 CCDC47 RBMX ZC4H2_R213W FAM91A1 CTNNBL1 PDIA6 HNRNPK NXF1 H1FX TRPV4_R315W SFPQ SRSF5 NOC2L RBBP4 TCOF1 MAPRE1 DARS2 SRSF10 HNRNPA1 AP3B1 NAP1L1 FUS_R216C DDX49 HDGF BCKDK CHMP2B_D148Y CASR_R898Q GJB1_R230C YWHAZ G3BP2 HNRNPA0 PANK2_E134G C1QBP HSP90AA1 PDIA4 CALR FTSJ3 HSP90B1 CKB NONO
EARS2_R168G PTMA MAP2 Clathrin-coated vesicle (p = 1.4e-05) PDCD11 P4HB TCF4_R565W NME4 ITPR1_P1059L UBL5 HUWE1 CACNA1H_P648L G3BP TMEM240_P170L POLRMT CHTOP AF1Q
PYCR1 SET PRIC295 ERCC6_P1042L ANP32A MDN1 WDR11 CLTC CLTB SOD1_I152T ZC2HC1A AP2A1 CLTA WWOX_P47T CC2D2A_T1114M XPO1 CC2D1A DYNC1H1 RPUSD4 PHGDH ADAR_K999N CKMT1A
SETX_R1294C RRP12 SLC2A1_P485L ANP32E ZEB2_Q1119R FLOT2 ANP32B DEPD5_S1073R
ACADM DKC1_G402R UBR4 CLN6_R6T DECR1 CLINT1
GJB1_R264C ERLIN2 CD2AP IPO9 FLOT1 SETBP1_I871T GJB1_C280G IF140T_E664K MATR3_T622A MECP2_G161V SMPD1_A196P EGR2_I268N GCH1_P23L TYMP_R44Q CDKL5_N399T CACNA1H_A748V AAAS_Q15K
CAPZA1 IPO7 DNCL1 POLDIP2 TRAF2 SPTAN1 GPHN RCN2 ERH VCP SSBP1 MYCBP2 a CACNA1H P648L GLUT1 P485L ITPR1 P1059L CLTA CLINT1 CLTC CLTC CLTA CLTB CLTA CLTB DECR1
1.5 1.5 1.5
bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1.0 1.0 1.0 -log[10]pvalue -log[10]pvalue -log[10]pvalue
0.5 0.5 0.5
0.0 0.0 0.0
-8 -6 -4 -2 0 2 4 6 8 -8 -6 -4 -2 0 2 4 6 8 -8 -6 -4 -2 0 2 4 6 8 Log2FC Mut/other peptides Log2FC Mut/other peptides Log2FC Mut/other peptides b
-2 -2 -2 CLINT1 DECR1 -4 CLTA -4 -4 CLTC CLTC CLTB -6 -6 CLTB -6 CLTC Ratio H/L Log2FC Wt/Mut Log2FC Ratio H/L Ratio H/L Log2FC Wt/Mut Log2FC Ratio H/L CLTA Wt/Mut Log2FC Ratio H/L -8 -8 -8 CLTA
-8 -6 -4 -2 0 2 4 6 8 -8 -6 -4 -2 0 2 4 6 8 -8 -6 -4 -2 0 2 4 6 8 Ratio L/H Log2FC Wt/Mut Ratio L/H Log2FC Wt/Mut Ratio L/H Log2FC Wt/Mut c d
GLUT1 CACNA1H wild-type mutant CACNA1H PGTGGHGPLSLNSPD PGTGGHGLLSLNSPD GLUT1 TPEELFHPLGADSQV TPEELFHLLGADSQV
L L L P648L ITPR1 GGSEENTPLDLDDHG GGSEENTLLDLDDHG P485L L
2.0 2.0 ITPR1 L L FG H F H P1059L 1.0 G E E D D 1.0 G E G D D bits E PGS GHH G SPG HH P cytoplasm D P S G G G ER E L G E LDS 0.0 TPTGLNT SANDQV TGL S QV L 0.0 TP NT AND 5 P 10 15 LL 5 10 15
Fig. 3 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. a c FLAG-GLUT1 Wt FLAG-GLUT1 P485L 4
2
0 TFRC AP2B1 ST7 AGFG1 APP CLCN7 -2 CPD IGF2R ITM2B ITM2C
Ratio M/H Log2FC Wt/Mut Log2FC Ratio M/H CLINT1 SMAP2
b -4 SNX1 FLAG-GLUT1 Wt Transferrin Overlay -6 -4 -2 0 2 4 6 Ratio H/M Log2FC Wt/Mut
Plasma membrane
Cytosol
CIE Fast recycling
CDE Slow recycling
Recycling endosome Early Endocytic endosome FLAG-GLUT1 P485L Transferrin Overlay degradation Late endosome TGN
MVE Golgi
Endoplasmic reticulum
Lysosome Nucleus
Fig. 4 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. a L1CAM RET CACNA1H
S1149L L L L L P618L P189L L
L P1039L
L RHBDF2 L
ER cytoplasm
b c LIG_diLeu_1 p = 0.017
1.5
3
n.s. 1 2
n.s.
-log10(pval)
Odds Ratio dileucine Ratio Odds gain disease mutations disease polymorphisms vs
n.s. 1 0.5
0 All proteins Transmembrane Cytosplasmic Extracellular proteins domains domains -2 0 2 of of transmembrane transmembrane log2oddsRatio proteins proteins polymorphism disease
Fig. 5 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Materials and Methods
2 Peptide-protein interaction screen
3 Candidate selection
4 Disease mutations in humans were taken from UniProt annotations 1 of Online 5 Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine, 6 Johns Hopkins University (Baltimore, MD), https://omim.org/. This dataset consists of 7 experimentally validated missense mutations that contribute to inherited diseases. 8 Inherited disease mutations were downloaded from UniProt 9 (http://www.uniprot.org/docs/humsavar.txt, Release: 2015_07 of 24-Jun-2015, 2). Only 10 mutations that were associated to ‘Disease’ were kept. ‘Unclassified’ mutations or 11 ‘Polymorphisms’ were excluded. The 26,649 disease mutations were further filtered by 12 applying a disorder cut-off. Disorder tendencies of 15 amino acids (AAs) long peptides, 13 with the AA mutated in disease if possible located at position eight, were predicted 14 using IUPred 3 using the ‘SHORT’ profile considering sequential neighborhood of 25 15 residues. IUPred disorder scores above 0.5 denote regions of the proteins that have 16 95% likelihood to be disordered. For filtering, the mean disorder score for all 15AA as 17 well as the mutation position were required to be >0.5. This resulted in 1,878 disease 18 mutations in disordered regions. Next we assigned disease classes to 3,119 different 19 diseases included in the humsavar database by combining a manual approach with 20 automatic annotation with the Human Phenotype Ontology database, HPO 4. We 21 selected 305 mutations causing neurological diseases. After manual inspection, we 22 remained with 128 mutations causing 124 distinct neurological diseases that were used 23 for the peptide-protein interaction screen.
24
25 Experimental setup
26 Peptides of 15 AAs, in total 128 wild-type peptide and 128 related peptides containing 27 the disease causing mutation (256 peptides) plus one control peptide pair were bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
28 synthesized in situ on cellulose membrane using PepTrack™ techniques (JPT Peptide 29 Technologies, Berlin, Germany). Two of those peptide filters were moistened in cell lysis 30 buffer [50 mM HEPES pH 7.6 at 4°C, 150mM NaCl, 1 mM EGTA, 1mM MgCl2, 10% 31 Glycerol, 0.5% Nonidet P-40, 0.05% SDS and 0.25% Sodiumdeoxycholate, 32 supplemented with protease inhibitor (Roche) and benzonase (Merck)]. In order to 33 reduce unspecific binding, the membrane was incubated with 1 mg/ml yeast t-RNA 34 (Invitrogen) for 10 min and then washed twice with cell lysis buffer. The entire peptide 35 libraries were incubated with 15 ml of light or heavy SILAC labeled cell lysate (5 mg/ml) 36 from SH-SY5Y cells for 2h. Membranes were washed three times and air dried.
37 38 Cell culture 39 SH-SY5Y, Hek293 and HeLa cells were cultured under standard cell culture conditions. 40 In brief, cells were cultured in DMEM (life technologies) complemented with 10% fetal 41 calf serum (Pan-Biotech). 42 Cells used for SILAC based experiments were cultured in SILAC DMEM (life 43 technologies) complemented with glutamine (Glutamax, life technologies), Pyruvate (life 44 technologies), non-essential amino acids (life technologies) and 10% dialyzed fetal calf 45 serum (Pan-Biotech). The SILAC DMEM was supplemented with standard L-arginine 46 (Arg0, Sigma-Aldrich) and L-lysine (Lys0, Sigma-Aldrich)(“light”) as in 5. Alternatively, 47 Arg6 and Lys4 (“medium-heavy”) or Arg10 and Lys8 (“heavy”) were added in place of
48 their light counterparts. Cells were cultured at 37°C and 5% CO2.
49
50 Sample preparation for mass spectrometric analysis
51 Single spots were punched out from cellulose membrane with a 2mm diameter ear 52 punch (Carl Roth) and SILAC pairs were placed together in a 96-well plate (Thermo 53 Scientific) prepared with 30 µl of denaturation buffer [6M urea (Sigma-Aldrich), 2M 54 thiourea (Sigma-Aldrich), 10mM HEPES, pH 8]. Samples were reduced by incubating 55 with 10µl of 3.3 mM DTT (Sigma-Aldrich) for 30 min at RT, followed by an alkylation 56 step using 10µl of 18.3 mM iodoacetamide (IAA) (Sigma-Aldrich) for 60 min at RT. The 57 samples were first digested using 1µg endopeptidase LysC (Wako, Osaka, Japan) for 4 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
58 hours. The samples were diluted by adding 100 µl of 50mM ammonium bicarbonate (pH 59 = 8.5), and finally digested with 1µg trypsin (Promega) for 16h. The digestion was 60 stopped by acidifying each sample to pH < 2.5 by adding 10% trifluoroacetic acid 61 solution. The peptide extracts were purified and stored on stage tips according to 6.
62
63 LC-MS/MS analysis
64 Peptides were eluted using Buffer B (80% Acetonitrile and 0.1% formic acid) and 65 organic solvent was evaporated using a speedvac (Eppendorf). Samples were diluted in 66 Buffer A (5% acetonitrile and 0.1% formic acid). Peptides were separated on a 67 reversed-phase column with 45 min gradient with a 250 nl/min flow rate of increasing 68 Buffer B concentration on a High Performance Liquid Chromatography (HPLC) system 69 (ThermoScientific). Peptides were ionized using an electrospray ionization (ESI) source 70 (ThermoScientific) and analyzed on a Q-exactive plus Orbitrap instrument 71 (ThermoScientific). Dynamic exclusion for selected precursor ions was 30 s. The mass 72 spectrometer was run in data dependent mode selecting the top 10 most intense ions in 73 the MS full scans, selecting ions from 300 to 1700 m/z (Orbitrap resolution: 70,000; 74 target value: 1,000,000 ions; maximum injection time of 120 ms). The resulting MS/MS 75 spectra from the Orbitrap had a resolution of 17,500 after a maximum ion collection time 76 of 60 ms with a target of reaching 100,000 ions.
77 78 Data analysis 79 The resulting raw files were analyzed using MaxQuant software version 1.5.2.8 7. 80 Default settings were kept except that ‘match between runs’ and ‘re-quantify’ was turned 81 on. Lys8 and Arg10 were set as labels and oxidation of methionines and n-terminal 82 acetylation were defined as variable modifications. Carbamidomethyl of cysteines was 83 set as fixed modification. The in silico digests of the human Uniprot database (2015-12), 84 a fasta file containing all peptides used for pull-down and a database containing 85 common contaminants were done with Trypsin/P. The false discovery rate was set to 86 1% at both the peptide and protein level and was assessed by in parallel searching a 87 database containing the reversed sequences from the Uniprot database. Following bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
88 statistics and figures were done using R (R version 3.2.1, RStudio Version 1.0.143). 89 The resulting text files were filtered to exclude reverse database hits, potential 90 contaminants, and proteins only identified by site. Missing LFQ-intensity values were 91 imputed with random noise simulating the detection limit of the mass spectrometer 8. To 92 this end, imputed values were taken from a log normal distribution with 0.25× the 93 standard deviation of the measured, logarithmized values, down-shifted by 1.8 standard 94 deviations. In this way, a distribution of quantitative values for each protein across 95 samples is obtained. For determination of specific interactions, two replicated pull- 96 downs for the same peptide were tested against all other pull-downs, excluding the 97 corresponding variant peptide, by the nonparametric Mann–Whitney U test. Resulting p- 98 values and fold-changes (log2 space) have been plotted as volcano plots to determine 99 cut-offs. For cut-offs, an approach was used that employs a graphical formula to 8 10( ) 100 combination a fold-change and p-value cut-off : | | with x: enrichment 𝑐𝑐 101 factor of a protein, p: p-value of the Mann–Whitney−𝑙𝑙𝑙𝑙𝑙𝑙 U test𝑝𝑝 calculated≥ 𝑥𝑥 −𝑥𝑥0 from replicates, x0: 102 fixed minimum enrichment, c: curvature parameter. The curvature parameter c 103 determines the maximum acceptable p-value for a given enrichment x.
104 The parameters c and x0 can be optimized based on prior knowledge of known true and 105 false positives 8. Here, cut-offs were chosen according to known interaction partners of 8,9 106 the SOS1 control peptide . This resulted in x0=0, c=8. 107 This cut-off was applied to all other pull-downs to separate specific binders from 108 background. SILAC ratios were normalized by subtracting the median SILAC ratio of 109 every experiment from all SILAC ratios in that experiment. To define interaction partners 110 that bind differentially to wild-type and mutant peptide, a SILAC cut-off has been 111 defined. For wild-type specific interaction partners, the mean SILAC ratio of the two 112 replicates needed to be >1 and none of the two ratios <0 (mutant specific mean SILAC 113 ratio < -1 and none of the two ratios >0). Resulting figures were modified in Inkscape 114 (0.91). bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
115 Follow-up on GLUT1 116 117 Generation of vectors 118 We purchased SLC2A1 (GLUT1) from Harvard Plasmid repository (HsCD00378964). A 119 stop codon has been added to the gene with the following primers 120 Fw:TCCCAAGTGTAATTGCCAACTTTCTTGTACAAAGTTG, 121 Rev:ATCAGCCCCCAGGGGATG. 122 P485L Mutation has been introduced by changing c.1454 C>T 10 with Q5® Site-Directed 123 Mutagenesis Kit (NEB) Fw:CTGTTCCATCtCCTGGGGGCT, 124 Rev:CTCCTCGGGTGTCTTGTCAC. 125 SLC2A1 has been further cloned into two destination vectors adding either an N- 126 terminal FLAG-HA Tag or a BirA-FLAG Tag with Gateway cloning strategy (Thermo 127 Fisher Scientific).
128 BioID
129 Light, medium-heavy and heavy labelled Hek293 cells have either been mock- 130 transfected (light cells served as a control for background binding) or transiently 131 transfected with BirA-FLAG-GLUT1 wt or mutant (medium-heavy and heavy labeled 132 cells). SILAC labeling allowed for quantitative comparison of proteins that have been 133 proximity labelled by the transiently expressed constructs (Forward experiment: Light - 134 Control, Medium-heavy - wt, Heavy - mut; Label swap experiment: Light - Control, 135 Medium-heavy - mut, Heavy - wt). All three cell lines have been incubated for 24h in cell 136 culture medium containing biotin. BioID experiment has been performed essentially as 137 in 11, with minor adaptations. 138 Mass spec setup and analysis was done similarly as to samples from peptide pull- 139 downs, but digested peptides were separated on a 2,000 mm monolithic column with a 140 100-µm inner diameter filled with C18 material that was kindly provided by Yasushi 141 Ishihama (Kyoto University) using a 4 hr linear gradient with a 300 nl/min flow rate of 142 increasing Buffer B concentration on a High Performance Liquid Chromatography 143 (HPLC) system (ThermoScientific). bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
144 GLUT1 localization
145 Hek293 cells stably expressing either FLAG-tagged GLUT1 wt or P485L mutant 146 exhibiting tetracycline-inducible expression were generated using the Flp-In system 147 developed by Life Technologies according to the manufacturer’s protocol. After 148 induction for 24h in doxycyclin (0.1 µg/ml) containing media, cells have been stained 149 against FLAG (F1804, SIGMA). Nucleus has been stained with DAPI (Sigma).
150 Transferrin uptake 151 HeLa cells seeded on coverslips coated with poly-l-lysine (Sigma) have been transiently 152 transfected with FLAG-GLUT1-wt or -P485L construct. After 24h they were serum- 153 starved for 1 h and used for Transferrin (Tf) uptake. For Tf uptake, cells were treated 154 with 20 μg ml−1 Tf-Alexa568 (life technologies) for 10 min at 37 °C.
155 Fluorescence Microscopy 156 Cells were cultured on poly-l-lysine (Sigma) coated coverslips and fixed with 4% PFA. 157 Standard procedures were used for immunostaining. Images were acquired by confocal 158 microscopy (Transferrin uptake: Zeiss Observer.Z1; GLUT1 localization: Leica, 159 DMI6000). Images were further processed with Fiji 12. 160
161 Analysis of human missense variants and short linear motifs 162 (SLiMs)
163 SLiM regular expression patterns
164 262 annotated SLiM class definitions (regular expression patterns) were downloaded 165 from the Eukaryotic Linear Motif (ELM) database 13. In order to analyse dileucine motifs, 166 an additional motif ‘.LL.’ was added to this compilation and named ‘LIG_diLeu_1’ in 167 order to conserve the naming convention followed by the ELM database. 168 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
169 Pathogenic and non-pathogenic missense variants
170 For the analysis of the missense variants that lead to de novo SLiM instances in protein 171 sequences Uniprot Humsavar dataset (version 12-Apr-2017) 2 was downloaded and 172 filtered for missense variants. Variants that are classified as ‘Disease’ or ‘Polymorphism’ 173 in this dataset were selected. 174
175 Protein domains
176 PFAM domain annotations of proteins were downloaded from the PFAM database 177 (ftp://ftp.ebi.ac.uk/pub/databases/Pfam//releases/Pfam30.0/proteomes/9606.tsv.gz) 14. 178
179 SLiM - PFAM associations
180 PFAM domains and SLiM classes that are known to interact were downloaded from the 181 ELM database (http://elm.eu.org/interactiondomains). 182 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
183 Analysis of gain of SLiMs via missense variants in disordered regions
184 For each reviewed human protein from Uniprot (20191 proteins), the disorder scores of 185 each residue were calculated using IUPred (using the ‘short’ setting). Using a IUPred 186 disorder score cut-off of 0.4, the missense variants in disordered regions were selected. 187 The missense variants that overlap PFAM domains were further filtered out based on 188 the PFAM domain annotations found in the protein feature files downloaded from 189 Uniprot in GFF format (e.g. the link to the GFF file for GLUT1 is 190 http://www.uniprot.org/uniprot/P11166.gff). These protein feature files were also used to 191 detect the transmembrane proteins and their cytoplasmic/extracellular regions. The 192 missense variants in disordered regions and not overlapping any PFAM domains were 193 further classified as variants from 1) the whole proteome, 2) the transmembrane 194 proteins (only those that have annotation of at least one cytoplasmic domain or an 195 extracellular domain, in total 3836 proteins), 3) the cytoplasmic domains of 196 transmembrane proteins, and 4) extracellular domains of transmembrane proteins. For 197 each of these classes, the number of disease-causing variants and the number of 198 polymorphisms that lead to a gain of SLiMs was counted and a two-sided Fisher’s Exact 199 Test was applied to see if there is a statistically significant difference for the likelihood of 200 a given class of SLiMs to be gained via disease-causing variant compared to that of 201 polymorphisms.
202 Peptide-Protein Interaction Network Analysis
203 180 peptide-protein interactions that passed the strict LFQ filter and showed significant 204 differential SILAC ratios between wild-type and mutant forms of the peptides were used 205 to compose a peptide-protein interaction network. The network was visualized using 206 Cytoscape 3.5.1 15. Sub-graphs of the significant interactions were generated using R 207 package igraph (version 1.0.1) 16 (using fastgreedy.community function) and visualized 208 using the R packages ggnetwork 17 and ggplot2 18. Enriched GO terms for each sub- 209 graph were calculated using the topGO R package 19. 210 211 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
212 References 213
214 1. UniProt Consortium. Reorganizing the protein space at the Universal Protein Resource
215 (UniProt). Nucleic Acids Res. 40, D71–5 (2012).
216 2. Famiglietti, M. L. et al. Genetic variations and diseases in UniProtKB/Swiss-Prot: the ins
217 and outs of expert manual curation. Hum. Mutat. 35, 927–935 (2014).
218 3. Dosztányi, Z., Csizmok, V., Tompa, P. & Simon, I. IUPred: web server for the prediction of
219 intrinsically unstructured regions of proteins based on estimated energy content.
220 Bioinformatics 21, 3433–3434 (2005).
221 4. Köhler, S. et al. The Human Phenotype Ontology in 2017. Nucleic Acids Res. 45, D865–
222 D876 (2017).
223 5. Schwanhäusser, B. et al. Global quantification of mammalian gene expression control.
224 Nature 473, 337–342 (2011).
225 6. Rappsilber, J., Ishihama, Y. & Mann, M. Stop and go extraction tips for matrix-assisted
226 laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in
227 proteomics. Anal. Chem. 75, 663–670 (2003).
228 7. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized
229 p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol.
230 26, 1367–1372 (2008).
231 8. Keilhauer, E. C., Hein, M. Y. & Mann, M. Accurate protein complex retrieval by affinity
232 enrichment mass spectrometry (AE-MS) rather than affinity purification mass spectrometry
233 (AP-MS). Mol. Cell. Proteomics 14, 120–135 (2015).
234 9. Schulze, W. X. & Mann, M. A Novel Proteomic Screen for Peptide-Protein Interactions. J.
235 Biol. Chem. 279, 10756–10764 (2003).
236 10. Slaughter, L., Vartzelis, G. & Arthur, T. New GLUT-1 mutation in a child with treatment- bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
237 resistant epilepsy. Epilepsy Res. 84, 254–256 (2009).
238 11. Couzens, A. L. et al. Protein interaction network of the mammalian Hippo pathway reveals
239 mechanisms of kinase-phosphatase interactions. Sci. Signal. 6, rs15 (2013).
240 12. Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat.
241 Methods 9, 676–682 (2012).
242 13. Dinkel, H. et al. ELM 2016--data update and new functionality of the eukaryotic linear motif
243 resource. Nucleic Acids Res. 44, D294–300 (2016).
244 14. Finn, R. D. et al. The Pfam protein families database: towards a more sustainable future.
245 Nucleic Acids Res. 44, D279–85 (2016).
246 15. Shannon, P. et al. Cytoscape: a software environment for integrated models of
247 biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
248 16. Csardi, G. & Nepusz, T. The igraph software package for complex network research.
249 InterJournal, Complex Systems 1695 (2006).
250 17. Briatte, F. ggnetwork: Geometries to Plot Networks with ‘ggplot2’, R package version 0.5.1.
251 CRAN (2016).
252 18. Wilkinson, L. ggplot2: Elegant Graphics for Data Analysis by WICKHAM, H. Biometrics 67,
253 678–679 (2011).
254 19. Alexa, A. & Rahnenfuhrer, J. topGO: Enrichment Analysis for Gene Ontology. R package
255 version 2.24.0. CRAN (2016).
256 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 1 Supplementary figure legends 2 3 Fig. S1: Candidate selection for peptide-protein interaction screen 4 Candidates were selected from missense disease mutations in the Humsavar 5 database (Uniprot) by selecting mutations in disordered regions that cause 6 neurological diseases. Only mutations in disordered regions of a protein were kept. 7 Mutations were further filtered for those that cause neurological diseases and the 8 final candidate set was manually selected. 9 10 Fig. S2: Reproducibility of technical replicates 11 Pearson’s R shows significantly higher correlation of technical replicates than 12 correlations between all pull-downs. 13 14 Fig. S3: SILAC ratio distributions of detected interactions that can be 15 explained by presence of SLiMs in the peptides and PFAM domains in the 16 interaction partners 17 Peptide-protein interactions detected in the screen were classified as ‘gained’ or 18 ‘lost’ according to the following criteria: An interaction is classified as ‘gained’ if the 19 mutant peptide sequence matches a SLiM pattern that doesn’t match the wild-type 20 peptide sequence and the mutant peptide has an interaction partner that contains a 21 compatible PFAM domain to bind that SLiM instance. On the other hand, an 22 interaction is classified as ‘lost’ if the wild-type peptide sequence matches a SLiM 23 pattern that is not matched in the mutant peptide sequence and the wild-type peptide 24 has an interaction partner that contains a compatible PFAM domain to bind that 25 SLiM instance. Gained and lost interactions are further sub-classified as ‘LFQ 26 positive’ and ‘LFQ negative’ depending on whether the peptide-protein interaction 27 has an LFQ value that passes the loose LFQ cut-off. The median SILAC ratio 28 distributions (wild-type versus mutant) of each of these four categories of interactions 29 (‘Gained interactions - LFQ negative’, ‘Gained interactions - LFQ positive’, ‘Lost 30 interactions - LFQ negative’, and ‘Lost interactions - LFQ positive’) are compared 31 with the median SILAC ratio distributions of all detected interactions from the array 32 using a Wilcoxon-Mann-Whitney test. Compared to the background distribution of 33 median SILAC ratios (in red), the gained interactions that pass the LFQ filter (in bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 34 green) show a significant negative skew while the lost interactions that pass the LFQ 35 filter (in purple) show a significant positive skew. 36 37 Fig. S4: Impact of specificity cut-off (LFQ) and differential cut-off (SILAC) on 38 peptide candidates 39 After applying the specificity cut-off (derived from control peptide, see Methods) on 40 all interactions, only about half of the 2x128 peptides showed at least one specific 41 binder according to the LFQ-filter (left pie chart, red). More than one third of all 42 peptide pairs with specific interactors for wild-type and/or mutant show differential 43 interactions after applying the SILAC cut-off (see Methods) (right pie chart, red). 44 45 Fig. S5: GO Term analysis of subclusters of the peptide-protein interaction 46 network composed of significant differential interactions 47 180 peptide-protein interactions that passed the strict LFQ filter and showed 48 significant differential SILAC ratios between wild-type and mutant forms of the 49 peptides were used to compose a peptide-protein interaction network. Communities 50 (sub-graphs) of this network was extracted using fastgreedy.community function of 51 the igraph R package (See Methods). GO term enrichment was calculated for the 52 nodes of each subgraph of the network. Within the subgraphs, the edges 53 (interactions) are colored differentially depending on whether the interaction is with a 54 mutant or wild-type peptide. The peptides are depicted as triangles while the proteins 55 are depicted as circles and they are differentially colored. Below the subgraph figure, 56 a bar plot of the top 10 enriched GO terms are displayed (differentially colored by the 57 GO term category as ‘Biological Process’, ‘Molecular Function’ or ‘Cellular 58 Compartment), in which the x axis shows the log10 p-values corrected for multiple- 59 testing correction using the Benjamini Hochberg method. bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Fig. S1
Mutation database 26,649 Humsavar UniProt annotations of OMIM
Disorder prediction 1,878 IUPred
Disease classification 305 Neurological diseases Human Phenotype Ontology, Manual Manual selection 128 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Fig. S2 p < 2.2e-16
0.9
0.8
0.7 Pearson's R
0.6
0.5
All Replicates bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 3.4e−11
0.052 10 0.00064
0.6
5
0 Median.SILAC.ratio.Wt/Mut
−5
N: 99526 N: 38 N: 15 N: 112 N: 40
All interactions
Fig. S3 Lost interactions (specific (LFQ)) Gained interactions (specific Lost(LFQ)) interactions (unspecific (LFQ)) Gained interactions (unspecific (LFQ)) All peptides bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted MayPeptide 24, 2017. pairs The with copyright specific interactors holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
n = 136 n = 120 n = 76 n = 44
peptides with specific interactors peptide pairs with differential interactions
Fig. S4 bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
RTL1 ● HDGFRP2 A ● ● ● ● MAP1B ● NOC2L AP3B1 TCOF1 ● CCDC47 SRRM2 ● TRPV4_R315W PDIA6 NASP ●SEC63 ●● TINF2_K280E PDIA3 ● DKFZp686J1372 ● PCNA YWHAG CTNNBL1 ● ● CALU TRA1 ● ● RBBP4 ● DDX39B MAPRE1● PPM1G ● ● CALR● ● NAP1L1 YWHAH ● SPAST_P293L ZC4H2_R213W ● TM240_P170L ●HDGF CXB1_R230C P4HB PDIA4 ● ● ● ● HNRNPK SUPT5H PTMA ● ● ● ● YWHAZ CKB HSP90AA1
mutant−edge peptide protein wt−edge
physiological muscle hypertrophy positive regulation of cardiac muscle ce... regulation of DNA damage checkpoint U4 snRNP positive regulation of DNA−templated tra... U6 snRNA binding Term U4 snRNA binding transcription export complex RNA binding poly(A) RNA binding 0 5 10 15 −log10(bh)
Biological_Process Fig. S5 Cellular_Compartment Molecular_Function bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
DARS2 L1CAM_S1194L B CC2D1A● ● G3BP FAM91A1 ● SRP14 ● ● ● ●NXF1 G3BP2 ● H1FX EMG1●● RPUSD4 ALYREF● ● HNRNPA0 AP2A1● RBMX ● CHM2B_D148Y HIST1H1D ●HNRNPA1 ● NONO ZC2HC1A ● BCKDK ● FTSJ3 ● MAP2● ● NME4 ● ● ● COPA DDX49 ● ● PYCR1 PDCD11 ● POLRMT SFPQ WDR11● ● ● CHTOP
mutant−edge peptide protein wt−edge
macromolecular complex organic cyclic compound binding heterocyclic compound binding transcription export complex membrane−enclosed lumen organelle lumen Term intracellular organelle lumen nucleic acid binding RNA binding poly(A) RNA binding 0 3 6 9 −log10(bh)
Cellular_Compartment Molecular_Function bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
C19orf53● C ● ● ● RPL39P5 AURKA CCDC137 ● DDOST ZNF768 ● ● ALYREF SYEM_E96K LANCL1● ● ● EMG1 SLTM CAC1A_R2135C ●MORF4L2 CSNK1D ● RPUSD3 ● ● ● CSNK1A1 RPN2
mutant−edge peptide protein wt−edge
positive regulation of proteasomal prote... Term
positive regulation of proteasomal ubiqu...
0.0 0.3 0.6 0.9 −log10(bh)
Biological_Process bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● D PANK2_E134G ANP32B ANP32E● SYEM_R168G ● SETX_R1294C C1QBP ●ANP32A DSRAD_K999N FUS_R216C ●SET
WWOX_P47T CASR_R898Q ITF2_R565W ● PTMA TRA1 HUWE1● ●
mutant−edge peptide protein wt−edge
nuclear part
nuclear lumen
membrane−enclosed lumen Term organelle lumen
intracellular organelle lumen
0.0 0.5 1.0 1.5 2.0 −log10(bh)
Cellular_Compartment bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● E ACADM ●CKMT1A ● ZEB2_Q1119R RRP12 ●XPO1 ● C2D2A_T1114M DEPD5_S1073R UBR4 PRIC295●
CLN6_R6T MDN1 ● ● POLRMT
mutant−edge peptide protein Term
−log10(bh) bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● F ● DKFZp686F18120 SRSF2 ● SRSF1 ● SRSF6 SRSF9 FUS_R521C ● ● SRSF3 ●SRSF10 SRSF5●
peptide protein wt−edge
mRNA transport RNA splicing, via transesterification re... mRNA splicing, via spliceosome RNA splicing, via transesterification re... termination of RNA polymerase II transcr... ribonucleoprotein complex localization Term RNA export from nucleus ribonucleoprotein complex export from nu... mRNA−containing ribonucleoprotein comple... mRNA export from nucleus 0.0 2.5 5.0 7.5 10.0 12.5 −log10(bh)
Biological_Process bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● G CLTB ITPR1_P1059L ●CLTC
DECR1● GTR1_P485L CLTA●
CAC1H_P648L CLINT1●
mutant−edge peptide protein
Golgi−associated vesicle coated pit coated vesicle Golgi−associated vesicle membrane vesicle coat clathrin coat Term clathrin−coated vesicle trans−Golgi network transport vesicle clathrin vesicle coat clathrin binding 0 2 4 6 −log10(bh)
Cellular_Compartment Molecular_Function bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● H RUVBL1 RUVBL2 ● IPO11 ●
SYDM_Y629C SSR4● ●CALU UBA1_S547G
PCNA TUBB2B● ●
peptide protein wt−edge
acetyltransferase complex protein acetyltransferase complex SWI/SNF superfamily−type complex histone acetyltransferase complex histone methyltransferase complex MLL1 complex Term MLL1/2 complex INO80−type complex DNA helicase complex Ino80 complex 0.0 0.5 1.0 1.5 2.0 −log10(bh)
Cellular_Compartment bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● I DYNC1H1 ●PHGDH
ERCC6_P1042L
SODC_I152T ●AF1Q UBL5●
mutant−edge peptide protein wt−edge Term
−log10(bh) bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● J FLOT2
●ERLIN2 DKC1_G402R
CXB1_R264C
FLOT1●
mutant−edge peptide protein wt−edge
positive regulation of toll−like recepto... cortical actin cytoskeleton positive regulation of myoblast fusion positive regulation of heterotypic cell−... cell−cell adhesion mediated by cadherin membrane biogenesis Term glutamate receptor binding membrane assembly ionotropic glutamate receptor binding membrane raft organization 0 5 10 15 20 −log10(bh)
Biological_Process Cellular_Compartment Molecular_Function bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
CAC1H_A748V K●CD2AP
CAPZA1●
peptide protein wt−edge Term
−log10(bh) bioRxiv preprint doi: https://doi.org/10.1101/141622; this version posted May 24, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
● L IPO9
AAAS_Q15K
IPO7●
mutant−edge peptide protein
GTPase binding
small GTPase binding
Ras GTPase binding
Term histone binding
protein transporter activity
Ran GTPase binding
0 1 2 −log10(bh)
Molecular_Function