Supplementary material for

Multiple polyvalency provided by intrinsically disordered segments is a key feature of postsynaptic scaffold

Annamária Kiss-Tóth, Bálint Péterfia, Annamária F. Ángyán, Balázs Ligeti, Gergely Lukács, Zoltán Gáspári

Supplementary results

Gene ontology analysis

Results of the GO term analysis (using the PANTHER GO-slim cellular component option) are shown in Table S2. GO terms enriched in the sets with an above-average number of IDRs/ANCHOR regions point to the prevalence of cytoskeletal and nuclear proteins in these sets. Nevertheless, the term ‘Postsynaptic membrane’ is also identified. Related enrichment analyses using the full GO term list are consistent with these results.
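As a reminder of what "enriched" means in this kind of analysis, the following minimal sketch computes a one-sided hypergeometric tail probability for a single GO term. It is an illustration only: the function name and the toy counts are hypothetical, and the statistics and multiple-testing correction actually applied by PANTHER may differ.

```python
from math import comb


def enrichment_p_value(background_size, background_hits, sample_size, sample_hits):
    """Return P(X >= sample_hits) for a hypergeometric variable X.

    background_size: number of proteins in the reference set
    background_hits: reference proteins annotated with the GO term
    sample_size:     number of proteins in the analysed subset
    sample_hits:     subset proteins annotated with the GO term
    """
    max_hits = min(background_hits, sample_size)
    total = comb(background_size, sample_size)
    # Sum the probabilities of observing sample_hits or more annotated proteins.
    favourable = sum(
        comb(background_hits, i) * comb(background_size - background_hits, sample_size - i)
        for i in range(sample_hits, max_hits + 1)
    )
    return favourable / total


# Hypothetical toy counts: 10 reference proteins, 3 carry the term,
# and all 3 proteins of the analysed subset carry it as well.
p_value = enrichment_p_value(10, 3, 3, 3)
```

A small tail probability indicates that the term occurs in the subset more often than expected by chance; with many GO terms tested, a multiple-testing correction would be needed on top of this.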

Distribution of selected descriptors in the functional groups

Figures S1-S4 contain box plots showing the distribution of selected descriptors in differently defined protein subsets. Figure S1 shows the same analysis as Figure 2 of the main text, with proteins that occur in multiple subsets omitted. Figures S2-S3 show results on the nonredundant and the reviewed (SwissProt) sets, with no overlaps in the subsets allowed. Note that the number of proteins in a non-overlapping subset can be very low (Table S3), especially for proteins involved in synaptic processes, as there are many such sets defined. Therefore, we have chosen to include the results on overlapping subsets in the main text and Figure S2.
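The construction of exclusive subsets, in which proteins belonging to more than one subset are omitted, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the subset names and the protein identifiers are hypothetical and do not reproduce the actual pipeline or data.

```python
from collections import Counter


def exclusive_subsets(subsets):
    """Drop every protein that occurs in more than one subset.

    subsets: dict mapping a subset name to a set of protein identifiers.
    Returns a dict with the same keys containing only proteins unique to each subset.
    """
    # Count in how many subsets each protein appears.
    membership = Counter(
        protein for members in subsets.values() for protein in set(members)
    )
    return {
        name: {protein for protein in members if membership[protein] == 1}
        for name, members in subsets.items()
    }


# Hypothetical toy data: protein "P3" belongs to two subsets and is removed from both.
overlapping = {
    "set_a": {"P1", "P2", "P3"},
    "set_b": {"P3", "P4"},
    "set_c": {"P5"},
}
exclusive = exclusive_subsets(overlapping)
```

Because shared proteins are removed from every subset they occur in, subsets with many overlaps can shrink considerably, which is the reason for the low protein counts noted above.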

Figure S1. Box plots of selected descriptors in the full human proteome set. Subsets are exclusive, i.e. proteins in multiple subsets have been omitted.

Figure S2. Box plots of selected descriptors in the nonredundant protein set. Sets are non-exclusive (a protein can be present in multiple sets).

Figure S3. Box plots of selected descriptors in reviewed proteins (SwissProt set). Sets are non-exclusive (a protein can be present in multiple sets).

Supplementary discussion

We have selected some proteins for more detailed discussion, both from the Gene Ontology analysis and from the PSSCAF (postsynaptic scaffold) set. It should be noted that we do not consider any of our sets perfect in the sense of containing only proteins with well-established postsynaptic scaffold function. Nevertheless, the trends observed are clear, and this short literature survey on selected proteins is consistent with our main findings. The proteins discussed below are listed in Table S6.

Proteins bassoon and piccolo

The structurally similar proteins piccolo (PCLO) and bassoon (BSN) are primarily known for their role in presynaptic axon terminals (1), yet both are listed in SynaptomeDB as postsynaptic proteins and have also been associated with PSD-related terms in our analysis. Both proteins are characterized by an exceptionally high interaction diversity descriptor. Piccolo is more than 5000 residues long (Figures S6 and S7), and bassoon almost 4000. Piccolo has been modeled as an elongated scaffold protein with a length of 80 nm (2). Piccolo contains only a few annotated globular domains, a PDZ and two C2 domains, but has multiple proline-, glutamine- and serine-rich regions as well as a segment with 10-residue repeats. Based on predictions, the presence of multiple coiled coil segments has been suggested (2), along with an uncertain SAH region (3).

Figure S6. ELM server prediction output for human Piccolo (UniProt ID Q9Y6V0) highlighting LIG and DOC ELM classes.

[Plot: Score (0 to 1) versus Position (1 to 5001); series: PCLO_HUMAN, PCLO_MOUSE, G3QNK0_GORGO, H2QUU9_PANTR, K7ESZ2_PONAB]

Figure S7. ANCHOR server prediction output for human Piccolo (UniProt ID Q9Y6V0) and its orthologs.

Microtubule-associated protein 2

MAP2 is a protein of more than 1800 residues and is largely disordered (Figures S8 and S9). Its suggested function is the stabilization of microtubules. It has tau/MAP repeats with a possible role in microtubule binding, which is regulated by phosphorylation (4). MAP2 has been shown to be involved in the regulation of cellular process outgrowth (5). MAP2 is a typical example of a cytoskeletal protein involved in neuronal processes.

Figure S8. ELM server prediction output for human Microtubule-associated protein 2 (UniProt AC P11137) highlighting LIG and DOC ELM classes.

[Plot: Score (0 to 1) versus Position (1 to 1751); series: MTAP2_HUMAN, MTAP2_MOUSE, G3RLX2_GORGO, H2R011_PANTR, H2P8G4_PONAB]

Figure S9. ANCHOR server prediction output for human Microtubule-associated protein 2 (UniProt AC P11137) and its orthologs.

Leucine-rich repeat-containing protein 7

Leucine-rich repeat-containing protein 7 (LRRC7), also known as Densin-180, contains an N-terminal array of LRR repeats, a central disordered region and a C-terminal PDZ domain (Figures S10 and S11). This protein has been shown to associate with PSD-95 through MAGUIN-1. More specifically, the PDZ domain of LRRC7 binds to the C-terminus of MAGUIN-1 (6). Splice variants of LRRC7, including those with deletions in the disordered central region of the protein, show different partner binding properties and subcellular localization patterns in rats (7).

Figure S10. ELM server prediction output for human LRRC7 (UniProt ID Q96NW7) highlighting LIG and DOC ELM classes.

[Plot: Score (0 to 1) versus Position (1 to 1501); series: LRRC7_HUMAN, LRRC7_MOUSE, G3RGK3_GORGO, H2PZ79_PANTR, H2N730_PONAB]

Figure S11. ANCHOR server prediction output for human LRRC7 (UniProt ID Q96NW7) and its orthologs.

Neuron navigator

Neuron navigator is an example of a protein that was identified only in our Gene Ontology analysis; although it clearly has a neuronal function, it is not present in any of our functional sets. Neuron navigator is 1877 residues long and contains four coiled coil regions annotated in UniProt (Figures S12 and S13). Two related proteins, NAV2 and NAV3, are also present in the human proteome, and all three are expressed, to different extents, in developing and/or adult brain tissue. Neuron navigator is homologous to the C. elegans gene unc-53, which is involved in axon guidance (8).

Figure S12. ELM server prediction output for human NAV1 (UniProt ID Q8NEY1) highlighting LIG and DOC ELM classes.

[Plot: Score (0 to 1) versus Position (1 to 1751); series: NAV1_HUMAN, NAV1_MOUSE, G3QV71_GORGO, H2RDF1_PANTR, H2N472_PONAB]

Figure S13. ANCHOR server prediction output for human NAV1 (UniProt ID Q8NEY1) and its orthologs.

TNIK and MINK1

TNIK (TRAF2 and NCK-interacting protein kinase) and MINK1 (Misshapen-like kinase 1) are members of the MAP4K4 kinase family. They contain an N-terminal kinase domain and a C-terminal CNH domain, with a long, largely disordered segment between the two (Figures S14 and S15). TNIK has been shown to be localized in dendritic spines, with a possible role in regulating synaptic strength (9). Both TNIK and MINK1 were shown to interact with TANC1, a postsynaptic scaffold protein (10).

Figure S14. ELM server prediction output for human TNIK (UniProt ID Q9UKE5) highlighting LIG and DOC ELM classes.

[Plot: Score (0 to 1) versus Position (1 to 1251); series: TNIK_HUMAN, TNIK_MOUSE, G3QX47_GORGO, H2R2F6_PANTR, H2PBZ9_PONAB]

Figure S15. ANCHOR server prediction output for human TNIK (UniProt ID Q9UKE5) and its orthologs.

Guanylate kinase associated protein (GKAP/DLGAP1)

GKAP (Guanylate kinase associated protein), also known as DLGAP1 (disks-large-associated protein 1), was identified as a protein that binds the guanylate kinase-like (GK) domain of the major PSD protein PSD-95 (11, 12). GKAP contains a long disordered region with several identified binding motifs and a C-terminal helical GH1 domain (Figures S16 and S17; 13). Through the interaction of GKAP with the GK domain of PSD-95, GKAP was shown to colocalize with NMDA receptors and Shaker K+ channels in vivo (11). GKAP also binds the GK domains of several other, but not all, MAGUK (membrane-associated guanylate kinase) proteins, such as PSD-93 and SAP102 (11). GKAP binds GK domains through a region containing 5 copies of a repeat sequence of 14 amino acid residues. Constructs containing tandem copies of repeats (repeats 1-2 or 3-4) showed strong GK domain binding, which could not be competitively inhibited by a single copy of the repeat. It was suggested that the sequence/structural context of these repeats might be an important factor in the interaction (11). GKAP also binds the PDZ domain of Shank proteins with its C-terminal residues (14) and the dynein light chain (DLC) with an internal binding site (15, 16).

Figure S16. ELM server prediction output for human GKAP (UniProt ID O14490) highlighting LIG and DOC ELM classes.

[Plot: Score (0 to 1) versus Position (1 to 751); series: DLGP1_HUMAN, DLGP1_MOUSE, G3SKG4_GORGO, H2QE80_PANTR, H2NVX7_PONAB]

Figure S17. ANCHOR server prediction output for human GKAP (UniProt ID O14490) and its orthologs.

The Shank protein family

The Shank (SH3 and multiple ankyrin repeat domains protein) family proteins (Shank1, 2 and 3) are among the most important scaffold proteins of the PSD (14). Shank proteins contain 5-6 ankyrin repeats, an SH3 and a PDZ domain, a long proline-rich region and a C-terminal SAM (sterile alpha motif) domain (Figures S18 and S19; 17). The SAM domain of Shank3 was shown to form sheets composed of parallel helical fibers (18). This self-organizing property of the SAM domain is proposed to be physiologically relevant even in full-size Shank molecules and is likely key for organizing the protein network. The proline-rich domain of Shank proteins was shown to be responsible for binding the SH3 domain of the actin nucleation factor cortactin, and this interaction is key in maintaining and remodeling the structure of the PSD (19). Besides binding GKAP, the Shank PDZ domain has been shown to dimerize, providing yet another way of forming a complex scaffold and a means of association with dimeric partners (20). Shank proteins exhibit very high interaction diversity values in our survey, which is primarily attributable to their long proline-rich region with a number of SH3 domain binding sites.

Figure S18. ELM server prediction output for human Shank1 (UniProt ID Q9Y566) highlighting LIG and DOC ELM classes.

[Plot: Score (0 to 1) versus Position (1 to 2001); series: SHAN1_HUMAN, SHAN1_MOUSE, G3S561_GORGO, H2RDR5_PANTR, H2NZS7_PONAB]

Figure S19. ANCHOR server prediction output for human Shank1 (UniProt ID Q9Y566) and its orthologs.

Supplementary references

1. Fenster, S. D. et al. Piccolo, a presynaptic zinc finger protein structurally related to bassoon. Neuron 25, 203–214 (2000).
2. Terry-Lorenzo, R. T. et al. Trio, a Rho family GEF, interacts with the presynaptic active zone proteins piccolo and bassoon. PLoS ONE 11, (2016).
3. Kovács, Á. et al. Detection of single alpha-helices in large protein sequence sets using hardware acceleration. Journal of Structural Biology 204, 109–116 (2018).
4. Ozer, R. S. & Halpain, S. Phosphorylation-dependent localization of microtubule-associated protein MAP2c to the actin cytoskeleton. Molecular Biology of the Cell 11, 3573–3587 (2000).
5. Zamora-Leon, S. P. & Shafit-Zagardo, B. Disruption of the actin network enhances MAP-2c and Fyn-induced process outgrowth. Cell Motility and the Cytoskeleton 62, 110–123 (2005).
6. Ohtakara, K. et al. Densin-180, a synaptic protein, links to PSD-95 through its direct interaction with MAGUIN-1. Genes to Cells 7, 1149–1160 (2002).
7. Jiao, Y., Robison, A. J., Bass, M. A. & Colbran, R. J. Developmentally regulated alternative splicing of densin modulates protein–protein interaction and subcellular localization. Journal of Neurochemistry 105, 1746–1760 (2008).
8. Maes, T., Barceló, A. & Buesa, C. Neuron navigator: A human gene family with homology to unc-53, a cell guidance gene from Caenorhabditis elegans. Genomics 80, 21–30 (2002).
9. Burette, A. C. et al. Organization of TNIK in dendritic spines. Journal of Comparative Neurology 523, 1913–1924 (2015).
10. Nonaka, H. et al. MINK is a Rap2 effector for phosphorylation of the postsynaptic scaffold protein TANC1. Biochemical and Biophysical Research Communications 377, 573–578 (2008).
11. Kim, E. et al. GKAP, a novel synaptic protein that interacts with the guanylate kinase-like domain of the PSD-95/SAP90 family of channel clustering molecules. The Journal of Cell Biology 136, 669–678 (1997).
12. Satoh, K. et al. DAP-1, a novel protein that interacts with the guanylate kinase-like domains of hDLG and PSD-95. Genes to Cells 2, 415–424 (1997).
13. Tong, J., Yang, H., Eom, S. H., Chun, C. & Im, Y. J. Structure of the GH1 domain of guanylate kinase-associated protein from Rattus norvegicus. Biochemical and Biophysical Research Communications 452, 130–135 (2014).
14. Naisbitt, S. et al. Shank, a novel family of proteins that binds to the NMDA receptor/PSD-95/GKAP complex and cortactin. Neuron 23, 569–582 (1999).
15. Naisbitt, S. et al. Interaction of the postsynaptic density-95/guanylate kinase domain-associated protein complex with a light chain of myosin-V and dynein. The Journal of Neuroscience 20, 4524–4534 (2000).
16. Rapali, P. et al. Directed evolution reveals the binding motif preference of the LC8/DYNLL hub protein and predicts large numbers of novel binders in the human proteome. PLoS ONE 6, (2011).
17. Sala, C., Vicidomini, C., Bigi, I., Mossa, A. & Verpelli, C. Shank synaptic scaffold proteins: keys to understanding the pathogenesis of autism and other synaptic disorders. Journal of Neurochemistry 135, 849–858 (2015).
18. Baron, M. K. An architectural framework that may lie at the core of the postsynaptic density. Science 311, 531–535 (2006).
19. MacGillavry, H. D., Kerr, J. M., Kassner, J., Frost, N. A. & Blanpied, T. A. Shank-cortactin interactions control actin dynamics to maintain flexibility of neuronal spines and synapses. European Journal of Neuroscience 43, 179–193 (2015).
20. Im, Y. et al. Crystal structure of the Shank PDZ-ligand complex reveals a class I PDZ interaction and a novel PDZ-PDZ dimerization. J Biol Chem. 278, 48099–48104 (2004).