ARTICLE

DOI: 10.1038/s41467-018-03523-2 OPEN An AP-MS- and BioID-compatible MAC-tag enables comprehensive mapping of interactions and subcellular localizations

Xiaonan Liu 1,2, Kari Salokas 1,2, Fitsum Tamene1,2,3, Yaming Jiu 1,2, Rigbe G. Weldatsadik1,2,3, Tiina Öhman1,2,3 & Markku Varjosalo 1,2,3

1234567890():,; Protein-protein interactions govern almost all cellular functions. These complex networks of stable and transient associations can be mapped by affinity purification mass spectrometry (AP-MS) and complementary proximity-based labeling methods such as BioID. To exploit the advantages of both strategies, we here design and optimize an integrated approach com- bining AP-MS and BioID in a single construct, which we term MAC-tag. We systematically apply the MAC-tag approach to 18 subcellular and 3 sub-organelle localization markers, generating a molecular context database, which can be used to define a protein’s molecular location. In addition, we show that combining the AP-MS and BioID results makes it possible to obtain interaction distances within a protein complex. Taken together, our integrated strategy enables the comprehensive mapping of the physical and functional interactions of , defining their molecular context and improving our understanding of the cellular interactome.

1 Institute of Biotechnology, University of Helsinki, Helsinki 00014, Finland. 2 Helsinki Institute of Life Science, University of Helsinki, Helsinki 00014, Finland. 3 Proteomics Unit, University of Helsinki, Helsinki 00014, Finland. Correspondence and requests for materials should be addressed to M.V. (email: markku.varjosalo@helsinki.fi)

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2

ajority of proteins do not function in isolation and their proteins. This allows us to validate the correct localization of the Minteractions with other proteins define their cellular MAC-tagged marker proteins as well as to monitor the localiza- functions. Therefore, detailed understanding of tion of the in vivo biotinylated interactors. These interactions also protein–protein interactions (PPIs) is the key for deciphering provide information about the cellular functions of the 18 marker regulation of cellular networks and pathways. During the last proteins. Furthermore, the 18 localization markers and their 1911 decade, the versatile combination of affinity purification and mass interactions form a reference molecular context repository, which spectrometry (AP-MS) revolutionized the detailed characteriza- can be used for ‘mass spectrometry (MS) microscopy’ analysis of tion of protein complexes and protein-interaction networks1. The a protein of interest. Moreover, the combined analysis with AP- AP-MS approach relies on expression of a bait protein coupled MS and BioID allows deriving relative spatial distances for pro- with an epitope tag or antibodies targeting the endogenous bait teins within a complex. Taken together, the MAC-tag and cor- protein, allowing purification of the bait protein together with the responding analysis approaches provide a plethora of information associating proteins (preys). This approach has been proven well on the cellular functions and the molecular context of proteins. suited for even large-scale high-throughput studies, and to yield highly reproducible data in both intra- and inter-laboratory Results usage2. The most commonly used epitope tags in medium to Development of MAC-tag based AP-MS and BioID pipelines. large-scale studies include FLAG3, His4, MYC5,HA6, GFP7 and To generate a versatile approach for identification of both stable Strep8, of which the Strep-tag has become the gold-standard in physical and transient functional protein-protein interactions we affinity purification proteomics due to unparalleled protein purity integrated and optimized the BioID approach with our single-step in physiological purification conditions as well as the possibility Strep AP-MS pipeline10,19. Both of these approaches have become for native competitive elution using biotin. the method of choice for interactomics analyses. We have recently AP-MS can also be combined with quantitative proteomics shown the effectiveness of using these approached compli- approaches to better understand the protein complex stoichio- mentarily10,19. However, the complementary use of the two metry9 and the dynamics of protein–complex (dis)assembly1,10. techniques has been labor-intense, involving tagging of the bait The combination of AP-MS with other techniques, such as bio- proteins with BirA* and Strep-tag individually, as well as gen- chemical fractionation, intact mass measurement and chemical eration of two set of lines per bait. To overcome the major crosslinking11,12, has been used to characterize supramolecular limitations, we have developed an integrated experimental organization of protein complexes. workflow utilizing a MAC-tag containing both StrepIII-tag and Although AP-MS remains the most used method for mapping BirA* (Supplementary Fig. 1a). In addition to optimizing the protein-protein interactions, the recently developed proximity experimental steps, we focused on the compatibility of the two labeling approaches, such as BioID13 and APEX14, have become methods and to the simplicity of the analysis pipelines to generate complementary and somewhat competing methods. BioID a process with improved performance and reproducibility on involves expression of the protein of interest fused with a pro- detecting protein-protein interactions. In contrast to coupling karyotic biotin ligase (BirA) and the subsequent biotinylation of BirA* with epitope tags such as Myc10 or FLAG20, the two MAC- the amine groups of the neighboring proteins when excess of tag pipelines differ only in the activation of the BirA* by addition biotin is added to the cells. Whereas the wild-type BirA from E. of biotin to the cell culture media and harsher lysis condition in Coli is capable of transferring the biotin only to a substrate the BioID pipeline (Fig. 1 and Supplementary Fig. 1a, b). Without bearing a specific recognition sequence, the generation of a pro- biotin addition the BirA* in the MAC-tag is inactive (Supple- miscuous BirA* (Arg118Gly mutant) allows the biotinylation of mentary Fig. 1b, c), resulting in identical (cor = 0.88–0.99) single- any protein found within a 10 nm labeling radius13,15. While step affinity purification results as vector with only StrepIII-tag BioID has the abilities to capture weak and/or transient protein- (Supplementary Fig. 1d, e). Similarly, when biotin was added the protein interactions, the identified interactions are not limited to results compare (cor = 0.95–0.97) to that of a vector with BirA* direct binders but can include proximate proteins as well. (Supplementary Fig. 1d, e and Supplementary Data 1d). In order to avoid artefactual interactions caused by over- The developed integrated approach significantly enhances expression of the bait proteins, majority of the large-scale inter- (> two-fold) the throughput of generating bait-expressing cell action proteomic studies employ the Flp-In T-REx 293 cell line lines, facilitates a comprehensive analysis of protein-protein allowing moderate and inducible bait protein expression from interactions utilizing both the BioID and AP-MS, and allows isogenic cell clones16. Although the system allows rapid genera- analysis of protein complexes and even transient functional tion of transgene stably expressing cell lines, comprehensive interaction networks with high sensitivity and reproducibility. analyses utilizing complementarily both AP-MS and BioID is Additionally, the MAC-tag allows visualization of the bait protein resource-intense in the respect of construct and cell line gen- with anti-HA antibody detecting the HA epitope. This versatility eration. To address this caveat and allow high-throughput com- of our approach was expected to give detailed view on the bait prehensive interactome analyses, we generated a Gateway®- protein formed complexes, interactions, and actual molecular compatible MAC (Multiple Approaches Combined) -tag enabling context via the detected stable, transient and/or proximal both the single-step Strep AP-MS as well as the BioID analysis interactions. with a single construct and with single affinity reagent, which decreases the number of required individual cell lines by 50% and should improve the data reproducibility, respectively. In addition MAC-tagged cellular localization markers localize correctly. to allow visualization of tagged bait protein by immunohis- We then went on and evaluated the MAC-tag system with 18 tochemistry, we included as well a nine hemagglutinin bona fide cellular localization markers (Supplementary Data 1a) (HA)-epitope. The HA epitope also facilitates additional follow- that cover most of cellular organelles to have more comprehen- up approaches such as ChIP-Seq17 and purification of the sive view of the application of our integrated multiple approach crosslinked proteins for cross-linking coupled with mass spec- system. Initially 18 localization markers were cloned to the MAC- trometry (XL-MS)18, making the MAC-tag almost as versatile as tag vector, and as a first step we explored their localization using the Swiss Army knife. fluorescence microscopy. The tagged-localization markers were To benchmark the usability and performance of the MAC-tag visualized with anti-HA antibody and the in vivo biotinylated we applied it to 18 bona fide subcellular localization marker interactors with Alexa Fluor 594 streptavidin (Fig. 2). These

2 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 ARTICLE

ORF MAC-tag N-MAC-Tag BirA* HA StrepIII ORF insert Bait StrepIII BirA* C-MAC-Tag ORF insert StrepIII HA BirA* Bait Flp-in recombination Prey

Strep-tactin® Sepharose

AP-MS Select and expand isogenic cell clones BioID +Tetracycline +Tetracycline & biotin

Bait Bait

Collection of cells and sample preparation Mild lysis Harsh lysis

Single-step streptavidin purification

Bait Bait

LC-MS/MS analysis

Bait Bait Bioinformatics

MS microscopy Protein complex components Transient or proximal and stoichiometrics interactions

Fig. 1 MAC-tag-based workflow for identification of protein complexes and interactions. Gateway compatible MAC-tag destination vectors containing StrepIII, HA and BirA* were designed to allow the of interest either C-terminal or N-terminal tagging. The expression vector can then be transfected into Flp-In T-REx 293 to establish the transgenic stably and inducible expressing isogenic cell lines. For the AP-MS and BioID analysis approaches, the cell line is separated into two cultures, BioID cells receiving addition of 50 μM biotin in their culture medium. In the following protein extraction process, optimized lysis and affinity purification conditions for both analysis approaches were used. The interacting proteins were then analyzed by quantitative mass spectrometry and high-confidence interaction proteins (HCIPs) were inferred via stringent statistical filtering. This integrated workflow allows laborless generation of cellular material for analyses, and results in integrated view of the formed protein complexes, protein-protein interactions and detailed molecular context definition subcellular markers included: mitochondria (-inducing BirA* does not change the correct localization of these proteins. factor 1, AIFM1); (Calnexin, CALX); We also verified this for several of the used marker proteins (11/ peroxisome (Catalase, CATA); early endosome (Early endosome 11 tested) using specific antibodies against the corresponding antigen 1, EEA1); cytoplasmic peripheral plasma membrane endogenous protein (Supplementary Fig. 2 and Supplementary marker (Ezrin, EZRI); nucleolus marker (rRNA 2′-O-methyl- Data 1a). Furthermore, the localization of the in vivo biotinylated fibrillarin, FBRL); cis-Golgi marker (Golgin subfamily interactors correlates well with that of the corresponding locali- A member 2, GOGA2); chromatin (Histone H3.1, H31); exosome zation marker (Fig. 2). In addition to verifying the correct loca- (Heat shock cognate 71 kDa protein, HSP7C); lysosome (Lyso- lization of marker proteins, the results highlight the usability of some-associated membrane glycoprotein 1, LAMP1); nuclear our MAC-tag constructs for fluorescence microscopy on detect- envelope marker (Prelamin-A/C, LMNA); proteasome (Protea- ing both the tagged protein of interest as well as the interacting some subunit alpha type-1, PSA1); recycling endosome (Ras- proteins. related protein -11A, RAB11A); late endosome (Ras-related protein Rab-9A, RAB9A); microtubule (Tubulin alpha-1A chain, TBA1A); centrosome (Tubulin gamma-1 chain, TBG1); trans- Identifying the interactions of the localization markers. Golgi (trans-Golgi network integral 2, Although many proteins and proteins families have been exten- TGON2); and (40S ribosomal protein S6, RS6) (Sup- sively studied with wide-range of cell biological or biochemical plementary Data 1a). All of the 18 MAC-tagged marker proteins methods, others and we have shown the AP-MS and BioID can localized to their corresponding well documented cellular com- reveal wealth of new molecular and functional informa- partments, illustrating that the MAC-tag or the activation of the tion10,20,21. However, for many proteins not much is known and

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2

Ribosome Exosomes

RS6 HSP7C Nuclear envelope Plasma membrane

LMNA EZRI Nucleolus Endoplasmic reticulum

FBRL CALX Mitochondria Early endosome

AIFM1 EEA1 Lysosome Late endosome

LAMP1 RAB9A Proteasomes Recycling endosome

PSA1 RAB11A Centrosome Peroxisomes

TBG1 CATA Chromatin Cis-Golgi apparatus

H31 GOGA2 Microtubules Trans-Golgi apparatus

TBA1A TGON2

+ Biotin – Biotin + Biotin biotinylated bait protein bait protein proteins

Fig. 2 Fluorescence microscopy analysis of the cellular localization markers. The 18 subcellular localization markers fused with MAC-tag were visualizedby immunofluorescence staining using Alexa Fluor 488 labeled anti-HA immunostaining (green), their in vivo biotinylated interactors with Alexa Fluor 594 streptavidin (red), and cell nuclei with DAPI (blue) (Scale bar: 10 μm) there has not been systematic methods to efficiently and com- information from both the physical and functional interactions prehensively characterize them. As shown in Fig. 2, the resolution formed by these proteins. The analysis resulted in 26527 inter- of standard fluorescence microscopy does not allow capturing actions from BioID and 9390 from AP-MS, of which 2118 high- information of the protein dynamic localization and molecular confidence interactions (HCIs) from BioID and 679 interactions context. Therefore, we MAC-tagged 18 known cellular localiza- from AP-MS were retained after using stringent statistical filter- tion markers and subjected them to our integrated method to ing (Fig. 3a, b and Supplementary Data 1c). The identified obtain detailed molecular context proteome map with average connectivity (38) of the 18 localization markers, identified

4 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 ARTICLE using AP-MS, matches well with the published large-scale stu- Fig. 3). In this case, the proteins detected solely with BioID likely dies10,22. As the BioID is also able to capture highly transient and represent cargo proteins in endosomal transit. Interestingly the close-proximity interactions, the total number of identified ratio of newly identified vs. known interactions in total is almost interactors as well as interactions per bait is higher than that of two-fold higher with BioID (11.3) than with AP-MS (6.8), AP-MS (Fig. 3a, b). This is seen for example with Rab9A and potentially reflecting the sensitivity of BioID to identify more Rab11A, two regulators of endosomal transport, for which BioID transient and proximal interactions (Fig. 3a). provides 16 times and 11 times more high-confidence interacting However, the complementary nature of these two methods is proteins (HCIPs) than AP-MS, respectively (Supplementary illustrated by their overlap as well as with their individually

a c 2000 PSMG3 PSMG1

PSMG2 PSMG4 MYH14

1500 MYO1D MYO6

PSA7 PSB1 PSME2 MYO1B MYO1C 1000 PSA3 PSA4 2 17 17 MYH10 PSA1 PSA2 POMP PSB6 PSA1 PSB2 PSA5 # of interactions 500 ECH1 PSB4 PSB3 AIMP2 PSME1 PSB7 GLO2 YES ARF4 AP-MS RBM28 SC16A GNAI3 0 HYOU1 TMOD3 GRPE1 BioID AP-MS BioID

BCCIP NOSIP d SENP1 PSIP1 AP-MS interaction IMA1 IMB1 GRPE1 PIAS2 BioID interaction YTHD3 RAD50 IMA7 b IMA6 PTN1 WRIP1 FBRL DPY30 AKP8L IMA5 Prey–prey interaction 400 IMA3 PPIL4 PARP1 ITB1 LBR IMA4 GNL3 BAF RFA3 PSMD7 LMNB2 LMNB1 ARF RAE1L C1QBP P53 F169A ZKSC4 300 RFA2 PSMD6 LAP2B EMD HDGF LMNA LAP2A TOIP1 ZN691 NUP37 LMNA NUP43 NU133 ZKSC3

NUP98 MAN1 SEH1 ALKB5 CMTR1 ELYS MYSM1 EMC1 200 PGAM5 HMBX1 RFA1 UBP7 NUP50 GLO2 NEUA E1B55 NU107 27 13 83 PIMT KBTB6 ZN724 EMC2 LYRIC VRK2 SCFD1 TPR TFAM NU160 SEC13 NU155 HDGR2 DHX40 STIM1 GPKOW 100 NU153 TYDP2 LANC1 P121A CX057 DDX54 LEMD2 SC16A NUP85 DNJC8 LMAN1 SSBP ZKSC1 PCY1A CCD47 KI67 CATIN

# of interactions/bait RBM33 PPIL2 DHX30 SPT5H ACD11 FEN1 K1143 0 ACSL3 NSMA3 ARI3A AP-MS

SRPRA YKT6 VAPA AN32E P121C SRPRB BioID AP-MS BioID EDF1 CHCH2 RPC3

PGRC2 LSG1 SMHD1 CX056 WDR70 e

LTV1 WDR36 DDX50 NOL10 NOL8 NOC4L DIEXF Centrosome REPI1 GPTC4 ESF1 NOL9 HEAT1

NU188

RM47

(TBG1) NOM1

SAS10

PWP2

DDX21

WDR43

AATF TSR1 KNOP1 ZC21A BRX1 BMS1 PFD5 TCPW RSBN1 DHX37 DPYL5 TM263 PDCL3 POP1 ELOA1 UBP36 HP1B3 MBB1A TBA1C T11L2 BYST

SDA1 NOG2 GLYR1 ABT1 CEP41 DDX24 DPYL2 DDX10 RL8 PFD4 TF3C3 DIM1 DPYL3 TBG1 ZKSC8 DDX31 TBL3 CTCF HN1 RFC4 RPP30 PESC RL26L DPYL1 PFD1 ECHA RFC3 RL13A RL18 TBA1A RFC1 RL3 HIG1A WDR3 CAMP2

RL19

AP3M1 RCL1 TBA1B RT34

GRIP1 LAS1L PFD3 NOP14 TBAL3 AT1B3 IF2B3 RL32 TXND9 LONP2 ACD11 ARL1 MPP10 PTGR3 HSDL2 DPYL4 TTC4 RT12 RL10A ISOC1 TYSD1 GBB2 NOP56 RL31 NPA1P NUD12 ACOX1 RS23 RL17 RS14 ARF RS9 PWP1 RL26 RS3A DHRS4 ECHP

PECR PEX5

PLPL8 HMGCL Microtubule NOC2L RT16 RL36 RRP15 RBM19 RM11 RT18B RLP24 RPOM NOL12 DECR2 ACACB TEX10 CATA HYES NVL PK1IP

PRKRA FANCI HACL1 NLTP (TBA1A) NUP93 NPM3 IMP3

SSF1 ESYT2 UTP20 OCTC G6PD SMC1A Nucleolus RL1D1 MOV10

GSTK1 THIK SMC3 PARN Z286A

VWA8 RLA1 NMNA1 ECI2

ADAS PEX14 RLA2 RPF2

ACOT8 (FBRL) ACOX3 PAHX GNPAT RLA0 NO66 ARF4 STAG2 INT1 BAP18 UBR7 MGAP SHCBP P66A NPM NOL6 SP20H KI18B TF3C2 UBE2O UT14A UBP48 RPA49 PSMG4 NELFB ANM1 PSMG2 PB1 DHX33 PSME1 DDX47 RS17 P66B NP1L4 TAD2B PSB7 BBX YTDC2 PSB2 WAPL STRBP KAT2A RL35A CCD86 PSMG3 POGZ TDRD3 PSB6 DHX40 APC16 FUBP3 MEPCE GZF1 Peroxisome ZBT10 RPA2 RL7 TMOD3 DHB4 RSBNL RENT1 PSB4 LMBL3 PSA1 TAF2 ALG13 NUCL WDR12 PSME2 ZFP91 SRFB1 PSA5 ARID2

Z512B ZCHC3 RL13 MAK16 POMP ZMYM4 LAR1B PSA4 (CATA) EHMT2 RIF1 GAR1 RL12 GNL3 PSA7 FBRL PSA3 NOLC1 ZN830 PSMG1 PSB1 RNH2B GNAI3 F120A RL4 WDR18 UIMC1 NCBP1 SMU1 RS6 MTA3 NOP58 NOP2 UTP23 PDS5A OTUD4

KIFC1 UTP18 RL5 LLPH TAF9 NH2L1

NUSAP IF2B2 GRWD1 ELF2 NP1L1 SPB1 RL28

PCGF6 ZN850 Proteasome RD23A NUD19 TAF5 RL29 EBP2 RPAC1 COIL RGAP1 U3IP2 ECH1 RPA1 DNJA1 RBBP5 DKC1 RL14 Ribosome MCCA CSDE1 IPO7

APC1 HELLS NUMA1 (PSA1) ECHB IMP4 ADNP MDC1 RRP1 (RS6) RM24 ZN644 HDGR2 MYSM1 CAF1B NEUA IF6 RT27

ZN691 COX15 EHMT1 MAVS MRRP1 UBP30 MIC27 CATIN ACOT1 C1TM RL37A H1X T126B SCMC2 CAF1A MIRO2 TTC19 HDGF MICU2 CL073 CHD8 ZKSC4 RS8 S39A7 AKAP1 ATPF1 CDC16 P33MX ATD3B LAP2A COA6 NOC3L COA7 DMAP1 AN32E RS13 RT31 LACTB F136A CCD58 CSTF3 FEN1 RL7A DDX55 P5CR2 COX19 SMHD1 UBN1 AKP8L RT22 IMA5 TIM50 CHD3 PARP1 OMA1 PSIP1 RS4X NGDN DNJC8 NDUA9 RT24 BRD7 H31 DDX51 RT23 RT26 RS24 NFIA SIM12 CLPP PTCD3 RL30 PELP1 RAD18 YMEL1 NDUA5 BCCIP BPTF RL11 LN28B

NUCG HCD2 DDX56 RT25 CC137 PDS5B DDX52

ELYS RT10 NEPRO COX17 TF3C1 GPKOW PPIL4 MICU1 UBP7 NUP43 SURF6 RT05 RL7L ASF1B K1143 CMTR1 PCID2 TACO1 GRSF1 RT15 KIF23 SEH1 RL36A MA7D1 Chromatin KBTB6 EXOSX PUM3 SSBP RL18A RRS1 HIG2A ZMYM2 RL6 BOP1 P121A CEBPZ RM28 H33 ALKB5 UBXN4 IP6K1 CHMP7 GRM1A NPL4 CYC CLPB RN185 MSPD2 WRIP1 HMGX4 TOIP2 CX057 CLMN CBX1 MACOI DNJC1 ACAD9 TIM10 ZKSC3 MMGT1 NDC1 CDC23 (H31) IMB1 UBE2N SARNP TEX2 TIDC1 CA043 RPC3 APC5 GPAT4 PPIL2 OSBL8 ATP5E TOM22 HIRA SND1 NUP85 EGLN1 TF3C4 C1QBP SYIM CMC4 BLM NFXL1 LMNA RBGPR CX056 ALR KMT2B ATD3A LMNB2 TMED8 MSH6 RL39 F134B LYAR WDR70 KAD2 ASF1A SRP68 TAPT1 NDUBB RAE1L AT2A2 SENP2 DNLI3 REXO4

SENP1 KI67 CCHL ZN281 RINT1 OCAD1 RRBP1 PREB TAF6 IMA1

TRRAP ACBD5 NDUA8 NUP50 MIC19 EMC7 PRI1 NU153

ZSC21 NOSIP WDR44 NDUV1 AAAS NDUF4 YES NASP PCY1A NSMA3 LSG1 MYO6 TF3C5 P121C CHCH2 LMNB1 MIA3 ACOT2 MYO1B RFX5 MS18A NDUAD NU133 SPT5H TYDP2 AIFM1 PSB3 TAF6L CALX ATRX IMA4 F169A GPAT3 ATPO PIAS2 E2AK3 CMC2 SC16A IMA3 NUP37 MORC2 EDF1 HYOU1 IMA6 IMA7 MSH2 NU155 ANO6 IF2M EMC3 RT28 MEN1 NU160 BUD31 NUP98 INF2 RBM28 TAF4 ETFB STIM2 PDIP2 RFC5

RFXAP GSCR2 VAS1 CUX1 STX18 ODO2 TDRKH NOP16 HMGB3 GLO2 DEK NAT10 NSDHL Nuclear envelope CS025 PPM1G HEM6 T2FA AFG32 DHX30 RAD21 DDX54 Mitochondria RFC2 KRI1 RB3GP UB2J1 MED14 RRP12 SAM50 PHB CDC27 AQR NOG1 SPCS2 T2EB EndoplasmicCHTOP reticulum Z780A GLYM HMGB2 RRP1B TIM8B RNH2A UBE4A DNMT1 (LMNA) M4K5 KIF4A TYY1 RBM34 WIZ BRD2 XRCC6 NSUN2 P5CR1 TAF1 PHLP TIM13 (AIFM1) XRCC5 T2FB TBP MSH3 CLGN REEP5 EMC8 HAX1 QCR1 MXRA7 ADCY9(CALX) SPCS3 TACC1 NDUS8 SDHA SNP29 TXTP UFL1 BAP31 MIA40 CLPX RT29 CAMLG CDCA3 TACC2 NU205 SOAT1 DNM1L PPIF RPN1 KAP0 NCK1 HMOX2 SERPH

NDUS1 MIC60 EMD HTRA2 ATPG DDX18 ZCCHV DSG2 RABL3 STX5

ATPA TRIA1 UNC5B

RHG01 MRT4 SCO2 AT5F1 VEZA QCR2 NDUS5 MK67I OST48 CG050 DDX27 TRM1L ECM29 COX5A TIM8A S4A7 ZN512 RFA1 AT2B1 RRP5 OPA1 NDUV2 MAGD2 EEA1 VIP2 UBR4 PGRC1 NDUS2 HXK1 NIP7 RPN2 SRP72 KRR1 SC24A NDUS6 ATP5J ARI4B SC24B RUFY2 ARI3A NKRF MTX2 CX6B1 LMAN1 WDR6 STAU1 RRP8 NDUS3 COX41 SCFD1 DLG1 NIBL1 TOIP1 ITB1 ERAL1 KCRU DEN4C OSBL9 FCHO2 P85A CMC1 THIO SYAP1 EHBP1 MAN1 CI072 PEAK1 PGRC2 ATP5H COX6C LBR YKT6 CC124 OSB11 ATP5L CY1 ECD NOTC2 QCR6 TOM70 GOGA3 GLSK ATPB SMCR8 LETM1 AT1A1 LIN7C NDUBA COX2 TOM40 DHE3 NU107 Early endosome LAP2B ESYT1 STT3A RAE2 ELP3 GCP60 CC115 TRPM7 NUMB GRPE1 SNAA TM209 RFIP1 CK5P3 CSKP VRK2 RFIP5 BASP1 PURA CTND1 WDR41 EMC1 HGS CYFP2 LYRIC CHP1 SWP70 DDRGK RTN4 RABX5 SC23B TFAM NSF TM230 CCD47 TMX1 FERM2 CBPD EMC2 TIM44 LRC59 VAPB (EEA1) MYH10 MYH14 SC23A AKA12 SCRIB DCTN4 AUP1 CNBP STIM1 ACSL3 JPH1 RAB11A TM199 SRPRA MYO1C PDZD8 CLCC1 MYO1D BASI COPE RBGP1 UBP11 SEC63 SAHH2 ANKL2 RABE1 FA21A VATF CKAP4 IF2A RAB5C S12A2 FND3A SYK IRS4 TPPC9 KTN1 RFA2 ACTY DVLP1

AL3A2 TRI13 STRUM CDKAL DD19A KIF5C PDIP3

TPC10 SC22B DIP2A EMC4 DJC11 STBD1 RENR IF2B CALU MLF2 STT3B GORS2 ETFA

SRPRB BAG2 RGPA1 S6A15 UBP47 SEPT8' DOCK7 ZC3HF ERBIN VAPA ENAH MPRI NEB2 PTN1 ARFG1 Recycling endosome EFHD1 S11IP HPS3 TM55B PHAR4 NBEA STON2 TB22A SMAP2 HM13 DEP1B SRBS1 STAM1 WDR20 STX12 RRAGA GCC1 ZNT5 BORC5 PALM2 NCOA7 AP2M1 EIF2A UBP32 LCAP SNX27 RUFY3 SIR2 PK3C3 AR13B KI16B BORC7 TB182 ARFG2 RBM14 RUFY1 VPS45 ANR27 RLGPB NISCH PARG VPS8 VP13C KS6C1 RAB1C CA198 TECR LRBA BORC8 (RAB9A) PACE1 PTN23 R3HCL PI3R4 ZFY16 M21D2 UBIA1 E41L1 S29A1 GOGA5 PKHM3 DP13A 4F2 GRIN1 GCC2 KXDL1 F135A SNX6 K1468 VAMP8 VAMP3 SH24A SNP23 DEN6A TFR1 CYFP1 OSB10 ABCE1 SCAM1 STX7 BIG3 SNX3 F219A ZFYV9 TRIPB RRAGC LZTL1 EPHA2 RAB9A DYST AP1M1 RAB6A TPPC3 K1211 HOOK3 GTR1 RCN1 CYLD WDR11 TPC6B MIO MPRIP SEM4C SHLB2 VP13B KIF1A LTOR2 RBNS5 CF132 BI2L1 MON2 RHG35 ADDG TNIK SRP14 VPS16 LTOR5 COBL1 PPR21 RFA3 XPOT SYFB NCKP1 GOLP3 RAI14 ABCD3 SHSA2 ODP2 PKN2 RAE1 AT133 ANM7 TARA PAK4 PDXD1 TAB1 RICTR NR3L1 EZRI ABI1 SPE39 RGPA2 MARK2 AAAT ARP10 SCYL2

LIPB1 RFTN1 PLSI ARFG3 VP33A TBCD5 BORC6 ZNT6 DIP2B LSR DSC3 SAR1A Late endosome BMP2K SEPT6' SNX29 ITSN2 CNNM3 URFB1 PTK7 ABCF2 PACN2 LTOR3 ABI2 TBC23 SNTB1 LTOR1 P55G TPPC1 AP1AR PTN13 TR112 BET1L STX8 MTMR1 KLC4 WASH7 ZDH20 ZDHC5 ASAP1 DSC2 (RAB11A) NUMBL GGA3 GAB1 VATD CLCN7 SAHH3 SNX30 PKHA1 LIMC1 RAB10 AP2A1 HACD3 ABR DPM1 NDRG1 RAB7A LAMP1 GORAB AKAP2 OCRL PLCB1 SNAPN CDC42 ANK3 VPS54 VATA VATB2

WIPI4 FA83B ANK2 T106B VPS18 S19A1 WASF2 UTRO TJAP1

EF2K DJB11 DHCR7 ADDB RADI NHRF1 GRAP1 BAIP2 NMT1 ADDA DIC LSM1 SHC1 RIC1 PlasmaM3K7 membrane P85B CTNB1 MOES OGA VTI1B ICAL CAPON SRGP2 MARK3 PLST COBL RABL6 BCR ROCK2 ELP1 SVIL TAU CADH2 MTMR5 S12A6 1C07 PDIA4

ITSN1 TGON2 (EZRI) GGA1 DP13B VANG1 TM87A TPC12 S39AE K319L KCNB2

RAB21 RELL1 S38A2 DNJC7 SDCG3 METH ARHG2 ARHGA LRP8 DPH2 HS105 TPPC5 VAT1

TPC2L RPAP1 RPGF6 CPNE8 SPY4 Lysosome GAK PICAL DPH1 HUWE1 S39AA TPC11 VP33B TRIM9 PP6R3 Trans-Golgi GOGA4 F91A1 S38A1 SYNRG TANC1 EDRF1 AP1B1 STEA3 GOPC (LAMP1) S12A4 WNK1 KCC2D GALT2 KDM2B LRC49 MPP7 TB22B PAXI1 COG1 AP4E1 WDCP CLH1 (TGON2) UBN2 ATP7A FA60A OCLN VPS51 MTMR6 HPBP1 HSP72 HSP7C TPGS1 AT2C1 STRN4 L2GL1 KDIS TPPC4 ATX10 AAKG1 STRP1 EFNB1 GOGB1 TPPC8 ZFP30 S23IP FYV1 BAG5 IRAK1 GOGA2 DAB2 ZN658 EPN4 SNX5 PHOCN CCD53 CHIP HSP74 RT35 FLOT2 ZNT1 STRN3 DMXL1 PP6R1 NCOR2 HS71L USO1 TMF1 SNX1 TPC13

DVL2 TLE1 FA21C RPB2 KMT2D S20A2 AP3B1 VAMP2 BUB1B CLCA ML12A KDM6A LAT1 AP3D1 WDR7 STRN NCOR1 DIAP1 ASNS MOT1 STX16 AP2B1 DJC13 BAG3 CXA1 STAM2 HAUS5 HS74L CN37 1A03 AP4S1

RAB1A

S12A7

RAB14

BIG2

TPD52

EPN1 EPHA4

UH1BL Cis-Golgi ARFP1 Exosome ILVBL AGFG1 F234B CTR2 WASH3 VAMP7 RCAS1 MRP (HSP7C) (GOGA2) SCAM3 SC6A8 SURF4 VDAC2 TPD54 MUC18 SPY1 UBP8 AP1G1 ATP7B STX6 P4HA1 STX10 TPC2B SNX2 VAMP4 A4 PPR37 GOSR1 AP2A2 ZFPL1 EPN2 SNX4 Newly discovered interaction Known interaction

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 detected interactions, such as the ones formed by proteosomal representing the particular time point when the image is taken. marker PSA123 and nuclear envelope marker LMNA (Fig. 3c, d). However, cellular proteins are highly diverse in their spatio- With PSA1 the overlap of AP-MS (green edges) and BioID temporal properties, thus making their characterization with (yellow edges) identified interactions is 17 components of the 20S microscopy alone extremely challenging. The BioID, in principle, core proteasome complex involved in the proteolytic degradation overcomes these limitations as monitoring of the biotinylated of most intracellular proteins (Fig. 3c and Supplementary close-proximity proteins and their quantities should allow Data 1c). BioID also captures myosins (MYH10 and MYH14) defining the BirA* -tagged bait proteins detailed molecular con- and unconventional myosins (MYO1B-D and -6), which have text within certain time period (Supplementary Fig. 3). Using the high turnover rates and after use they are either refolded for reuse 2118 high-confidence interactions from BioID, we generated a or degraded by the proteasome24. Additionally, BioID identifies cellular compartment-specific protein interaction map to the 18 proteasome activator complex subunits 1 (PSME1) and 2 bona fide localization markers (Fig. 3e and Supplementary (PSME2), which are part of the 11S (PA28) Data 1c). The HCIPs domains as well as the (GO) immunoproteasome25. term profiles for each marker were unique (Supplementary LMNA is a component of the nuclear lamina, playing an Data 2a-f and Supplementary Figs 4, 5). However, we identified important role in nuclear assembly, chromatin organization, also shared HCIs between the endomembrane system consisting and framework for the nuclear envelope and telomere of ER (CALX), the Golgi (GOGA2 and TGON2), endosomes dynamics. Not surprisingly both the AP-MS and BioID identify (EEA1, RAB9A and RAB11A) and lysosome (LAMP1). The four interactions with lamin-B (LMNB)1, LMNB2, lamin-B receptor organelles shared 17 interactors, and the combination of any (LBR), lamina-associated polypeptide (LAP)2A and LAP2B, three locations shared in total 87 interactions (Supplementary inner nuclear membrane protein Man1 (MAN1), and emerin Fig. 6a). These four organelles are involving in two major intra- (EMD). Another group of interacting proteins are nuclear pore cellular trafficking pathways: The exocytic pathway (ER via Golgi complexes (NPCs) components: nuclear envelope pore mem- (53 shared interactions) to the plasma membrane); and the brane protein POM (P121A, P121C), Nuclear pore complex endocytic pathway (plasma membrane via endosomes to (1) protein Nup (NU) 107, NU133, NU153, NU155, NU160, Golgi (101 interactions) and (2) lysosome (61 interaction) to ER NUP50, NUP85, NUP98, NUP37, NUP43, nucleoporin SEH1, (86 interactions)). This organization is also well visible with, nucleoprotein TPR, ELYS and SEC13, of which only compo- within a cell, the physically farthest from each other locating nents of the nuclear basket NUP50, TPR, SEC13 are detected of endosome and ER, sharing the least interactions of the all possible low abundance with AP-MS (Fig. 3d, and Supplementary binary combinations of the four locations. Data 1c). This suggests that for the correct localization to the Similarly, chromatin, nucleolus and nuclear envelope are all nucleoplasmic side of the nuclear envelope, LMNA needs to sub-structures in the nucleus and shared interactors with each pass through nuclear pore and during this process it transiently other (Supplementary Fig. 6b); nuclear envelope (LMNA) 22 interacts and in vivo biotinylates the NPCs components. with chromatin (H31) and chromatin 31 with nucleolus Similarly, the importin transport proteins importin subunit (FBRL). We previously already discussed the role of nuclear alpha (IMA1, IMA3, IMA4, IMA5, IMA6 and IMA7), importin envelope on chromatin organization, and chromatin control the subunit beta-1(IMB1) are detected with BioID, but only IMA5 structure of nucleolus via ribosome DNA. Nucleolus is the place and IMA7 in AP-MS. In addition, several histone modifiers and where ribosomal RNA transcription and the ribosome assembly chromatin remodelers are detected (AN32E, EDF1, MYSM1 occur. Ribosome (RS6) was detected as an outlier as it shared and PARP1). Somewhat surprisingly our analysis also identifies many of the interactors with other localization markers. This is several proteins involved in cell cycle and mitosis (RAD50, explained by the fact that protein requires ARF, KI67, HDGR2, WRIP1, AKP8L, BCCIP and P53). , which are after the synthesis of the MAC-tagged Both of the examples show that the detected high-confidence protein immediately in vivo biotinylated in the BioID approach. interacting proteins are highly specific for the studied location, Therefore the ribosome (RS6) was excluded from the further as illustrated by the retrieval of the HCIPs localization analyses. However, the other localization markers, such as information from CellWhere database26 (Fig. 3c, d). The mitochondria (AIFM1), cytoplasmic peripheral plasma mem- proteins with the highest ranking for the particular location brane (EZRI), exosome (HSP7C), peroxisome (CATA), micro- from CellWhere are shown in dark green and for the rest of the tubule (TBA1A), proteasome (PSA1) and centrosome (TBG1) ranksthenodecolorislightgreen.ProteinswithnoCellWhere had highly unique molecular context signature, which suggest ranking are shown in gray (Fig. 3c,d and Supplementary Fig. 3). the usability of this reference set in tracking of protein of interests dynamic localization in intracellular environment. Molecular context map reveals organelle-specific profiles.In Additionally comparison of the HCIPs cellular locations from addition to lacking molecular level resolution, standard fluores- CellWheredatabaseshowedthemtobeassignedtothecorrect cence microscopy is often used to produce static images cellular localization according to their bait protein, further

Fig. 3 Generation of comprehensive interactome maps for the bona fide cellular markers. The 18 localization markers we subjected to our integrated analysis, resulting in identification of 679 HCIs from the AP-MS and 2118 HCIs from BioID analysis. a The distribution of the number of known (blue) and newly identified (red) interactions within 18 bona fide subcellular organelle/structure markers, illustrate the need for systematic analyses. b The distribution of the number of interactions per localization marker by AP-MS or BioID purification approach shows similar distribution of connectivity as other publications using these approaches individually. Boxplots show the median, the 25th and 75th percentile, Tukey whiskers (median ± 1.5IQR). c, d The protein-protein interaction network and molecular context for proteasome organelle marker (PSA1) and nuclear envelope (LMNA). The HCIs that were identified from AP-MS (green line) and BioID (yellow line) are shown together with the known prey-prey interactions (dashed gray line). The nodes are color-coded based on the localization rank obtained from the CellWhere database (key: dark green = primary cellular localization for the corresponding protein, light green = possible localization, gray = different or no localization assigned for the protein). The Venn diagram highlights the complementary nature of the AP-MS and BioID approaches. e The reference molecular context map for the 18 subcellular organelles/structures. The unique high- confidence interactors from the BioID analysis are arranged in a circle around the corresponding localization marker and the shared interactors are shown with corresponding colors representing multiple localizations. Preys with more than four subcellular localizations are shown in gray color. The newly identified interactions are shown in pink edges and the known interactions with blue

6 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 ARTICLE

a Σn CQ Σn CR i =1 PSMi i =1 PSMi Localization score = m n C * k R Σ Σ Q Σ PSM l i PSMil r r

Q C R Q:Query PSM; R: Reference PSM C:Common proteins. k:total number of reference proteins Interaction context m: 14 cellular compartments bc d NUC

NE PER SURF6 CE152 OFD1 SC16A LANC1 A MIT PM GLO2 DPM1 NOG1 DDX54 U MIA3 TF3C3 SPB1 HYOU1

SC16A RRP12 INCE HS90B R MIC PRO CDC37 ZFP30 RIF1 RBM28 AURKB K 0.50 0.75 AURKB SUGP2 MYH14 NAT10 CP131 LYS CEN B CE192 MBB1A ARF DDX18 GRPE1 INCE NIM1 HS90A LYAR

GOL CHR NU205 DHX37 CDC37 DDX24 BOREA BIRC5 EXO END ER NUC

EHBP1 TM87A ARFG2 UNC5B VANG1 PACE1 ZNT5 M21D2 NE PER COG1 RELL1 RGP1 HUWE1 GOSR2 TFG SCYL2

KCNB2 ZNT6 DDX54 SC16A PCH2 VAMP7 SNAG VTI1A CYFP2 SNAG SC24B ATD3B CG050 SCAM3 ZN512 NU188 FANCI MIC60

STRBP HACD3 IF2A TJAP1

SC31A SNX3 MIT PM WDR11 STAU1 AHSA1 NOP16 RIC1 EPHA2 VPS51 STT3A RRBP1 AT2A2 SRPRB SPY4 LMAN1 BCCIP RRP1B BAF SC22B IF2B NUMB DDX18 SC22B

TEX10 RAB5C VPS45 CNBP PDXD1 F91A1 WAC2A VDAC2 HM13 LAT1 YKT6 ARFP1 ARFG3 F135A ECHA CANT1 EPHA4 RAB9A TMF1 NBEA SERPH DDX24 LRBA KDIS BASI STX5 VAMP8 DYST RABL3 B AGFG1 RM47 MPRI SYEP EPN4 SNX6 RABE1 ECH1 ERLN2 NSUN2 TRIPB JPH1 TFR1 SC23B VP13B CXA1 SC24A NIP7 TIM44 STEA3 CASP WDR6 K319L RT18B CALU HS71A YKT6 OCLN STAM2 ACOT9 MRRP1 GOGA4 AKP8L MIC PRO TECR PDIP3 HYOU1 ARHGA GLO2 LYAR DMXL1 DHX30 ATPG DIC E GSCR2 GRPE1 BET1 TPPC8 SMC3 BET1 UBR4 STX12 0.50 0.75 UBIA1 LSR GOGA5 RBM28 LCAP DHCR7 SAR1A RPN2 TRM1L ARHG2 GORAB SC23A BIG1 ITM2C MAN1 SEC63 GCC2 SMC2 SYAP1 VTI1A ESYT2 ZCCHV SYIC GO45 S12A4 MON2 S23IP SMC1A S6A15 RBM34 TPC11 FLOT2 GOGA3 SEC13 STX7 T PESC ZFY16 SYDC MK67I TR10B GOGA2 LYS CEN TB22B VAPB RRP8 NU205 XPOT GCP60 SYLC RABL6 GOSR1 SCFD1 DD19A STX16 STX10 4F2 NU155 GCC1 MRT4 DPM1 FYN VAPA SYK EMD RRP5 GGA3 GORS2 DDX52 GOSR2 CBPD S38A2 STX7 LIS1 S4A7 RAB6A

NKRF SCFD1 1 ZFPL1 APBA2 AIMP1 UBP8 DHE3 BET1L AT2B1 VAMP3 NOG1 CLCN7 USO1 STX6 SNP23 1A03 COPG1 RCN1 SNP23

SURF6 SNAA SRPRA DP13A CLPX GOSR1

GTR1 CN37 HGS DDX56 TXTP NSF GOL CHR ARF4 S12A2 ARF6 GOPC S39AE A4 STX5 ATPO RBM14

TMM33 STT3B SC16A SNAA

GNAS2 NUP93 TFAM RRP12 NOTC2 GOGB1 NSF CKAP4 OCRL KRR1 GBF1 EXO END DSG2 STAM1 SPB1 ER NUC

NE PER NU205 PNISR MTA3 SUGP2 TPR

EHMT1 HDGR2 MAZ MIT PM DDX56 NOG2 SC16A MED17 RPA1 CD11B CTR9 CDC37 TAF4 MAT1 PWP1 TAF2 CCNH ERCC2 C RBM28 ELOA1 RRP1B TAF5 TF2H5 TF3C1 TF2H3 RIF1 LYAR CCNL1 DDX10 TF2H3 TF2H1 YTHD3 D MIC PRO TF3C5 T2EA T2H2L MGAP CDK7 TAF6 0.50 0.75 CDK7 ZFP30 TF3C3 TF2H2 ECH1 SPB1 ERCC5 K TF2H5 ERCC3 T2H2L TF2H4 CCNL2 T2EB TF2H1 LYS CEN DDX52 RL39 PCGF6 TAF1 7 ERCC5 TF2H2 NOG1 CCNH RT12 ZN281 ELL CDC37 TF2H4 NAT10 ERCC2 RPA2 ERCC3 MAT1 GOL CHR MED31 PRI1

LAS1L KI67 ACD11 PESC COIL CNBP SURF6 EXO END ER

NUC

ZSC21 BCCIP NE PER MED13 SBNO1 ESF1 MED16 MED24 MED22 MED4 PESC FANCJ

EHMT1 MED11 MED21 RAD18 NU205 PPM1G ISY1 CX057 NADAP MIT PM MED17 MED6 ZN281 CDC37 MED16 KIF4A MED22 MED4 MTA3 LAS1L MED17 MED6 MAT1 MED15 MED30 MED13 MED30

C TAF6L MED11 MED25 HDGF ZEP1 NACC1 PAF1 T2FA MED23 MED25 MED24 RPC3 TADA3 MED21 AQR SYF1 TF3C4 MBIP1 MED15 MED14 MED7 MED14 PRO ZN131 MTBP MIC TF3C1 YETS2 D WDHD1 ZN830 MED23 MED8

CDK8 CDK8 ZZZ3 KAT2B MED27 MED8 0.50 0.75 MED7 MED9 MFAP1 WIZ CTR9 PNISR

K MED27 WDR61 WDR5 MED12 MED20 MED18 CD11B DPOA2 CSR2B CCNC LYS CEN DDX10 MED20 MED18 SPB1 DPOLA KAT2A TAD2A MED31 MED19 MSH2 TP53B MED31 MED12 MD13L MED10 8 NOG2 SUGP2 IMA5 MDC1 MED29 PRI2 MD12L MED1 MED28 CCNC MED9 PPIL2 T2EB ARI4B MGAP RIF1 T2EA MD13L MED19 BRE1A PRI1 GOL CHR TF3C3 PCID2 WAPL SC16A MED10

ZMYM4 MYSM1 CDC37 MED29 MD12L MED28 P4R3A ELOA1 MED1 EXO END TF3C5 ER

NUC

NE PER CP131 KDIS CP110 BI2L1 MAZ CADM1 WDR62 F193B

MIT PM CC138 SC16A HAUS5 LTV1 SPAG5 ALMS1 SC16A AXIN2 G FRAT1 HAUS4 MACF1 PEX5 ATG2B AXIN1 GSKIP TRAK2 DYST S SWP70 DLG5 CLAP2 PP2BA NU205 MIC PRO MA7D3 DCP1A IPP2 E41L3 STON1 ECHB GSK3B K 0.50 0.75 SDCG3 GSK3B CDC37 NBR1 CE350 PRC2C OFD1 EDC3 CE152 KAP1 RBM33 3 LYS CEN PRUNE FR1OP KANK2 APC ECD ERBIN PRUNE CDC37 KAPCA KAP0

VP13D TRAK1 b LAR1B KAP0 CRBG3 CTNB1 GSKIP KAPCB PP6R3 GOL CHR RL39 AXIN1 TPGS1 PP6R1 KAPCA LRC49 DCP1B AKA11 EXO END ER NUC

NE PER MAN1 UH1BL RAE1 LANC1 SC16A RBM33 MIT PM EMD DDX52 R GLO2 VAMP3 TPR ODP2 VPS8 PGTA ABCD3 DP13B

A ARF MPRI PIMT GNL3 MIC PRO NOG1 PGTB2 0.50 0.75 RAB5A RAB5A B DDX54 RBNS5

S6A15 RAE2 LYS CEN UBP11 SYK 5 NU205 OSBL9

GRPE1 DDX24 VP33A DP13A A TCAL1 YTHD3 RAE1 GOL CHR F135A VPS18 PGTA PGTB2 LCAP VPS45 RAE2 UBP11 EXO END ER

Fig. 4 Molecular level localization mapping using the reference molecular context maps. a The schematic overview of the MS microscopy approach to assess the queried protein localization using our reference interaction context. b The polar plot shows the location of query protein observed by MS microscopy. Each sector represents one subcellular location defined by our reference database. The color assigned to each of the localization is based on the annotation frequency (Pink: 0–0.5; Yellow: 0.5–0.75; Green: 0.75–1). c, d The PPI network obtained from BioID and AP-MS are shown separately. The localization of prey proteins was verified by CellWhere database. Node color scheme coordinates the observation localization from Fig. 4b

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 reinforcing the idea that proteins that share their interaction MS microscopy using the molecular context map as a reference. profiles are proximal. Finally, total of 1911 HCIs (excluding RS6 Despite the biological significance of dynamic subcellular locali- interactions), collapsed to 14 subcellular localizations were zations of proteins, simple tools for detecting the relative sub- integrated to build up the reference molecular context map cellular distribution have not been extensively developed. To test (Fig. 3e). Overlaying of the protein of interests BioID PPIs with the applicability of the MS microscopy on this, we selected our molecular map should allow defining the dynamic dynamic cytoplasmic signaling molecules aurora kinase B localization of the protein. In principle, the developed MS (AURKB), cyclin-dependent kinase (CDK) 7, CDK8, and glyco- microscopy approach could have high impact on cellular gen synthase kinase-3 beta (GSK3B), as well as additional mar- quantitative biology (Fig. 4a). kers for cellular locations ras-related protein Rab-5A (RAB5A)

abNUC

RM16 RM10 NE PER RM04 RM02 RM01 RM55 GRP75 RM50 CDCA3 SYNM DHTK1 TACO1 FMC1 RM28 TR61B FAD1 RM52 HINT2 THNS1 RM21 NT5D2 SUV3 RM51 GCDH HIBCH DUT AL1B1 MRS2 PREY HEM1 ACOT1 CK5P1 IDH3B MIT PM RM49 CISY PDK1 RM24 VWA8 GATA TIDC1 SYMM SYDM ECH1 GRSF1 SYHM RM47 IPYR2 MGME1 ECHB PDK3 RM45 SYIM CQ080 GLYM IF2M PLPL8 SYTM RRP12 CLPP SSBP AASS MPPA LY R M 7 LPPRC CH60 TMLH ECHM RM44 ABHDA NO66 EFGM DPOG2 CECR5 NSUN2 OXSM TIM16 SYCM CG050 NUD19 MIMIT THG1 ATPF2 XPP3 MRRP1 RM53 DDX18 THIM RM41 CN37 GLSK SCO1 RF1M SYVM NDKM ATD3B ACOT9 TRM5 GATB RT30 RT29 NAKD2 ETFB RT11 RT27 RM43 COQ3 PAPD1 RF1ML UQCC2 SYK MAOM RM19 ACS2L RT34 RT05 CIA30 MIC PRO PPCEL RT02 TRUA ACADM HS71A SYWM RT23 RT16 MMAB DPOG1 HEMH TR112 RT24 RT22 SYYM G45IP RM11 NDUF5 AKIP RT07 ECI2 NDUF4 IDH3A ODBA MRM2 PTCD3 RT14 MASU1 MUTA PRDX3 CF203 RT33 RT18B 0.50 0.75 SDHF4 MTDC ACADV MPPB RM20 KAD3 KAD4 RT18A RT10 P5CR2 OAT RM37 P5CR1 RNZ2 RT18C RT28 AL4A1 LY P L 1 ACOT2 RT06 RM42 RT35 PDIP2 GATC RBM28 SYG CAF17 CH10 PNPT1 RM46 RT31 RT25 RM14 ANGE2 ACON ES1 ZNT9 RT36 RT26 ECHA UQCC1 EFTS PDE12 RT09 PHB2 ATPF1

LY S CEN RM17 HSDL2 NDUF7 RT63 HCD2 RM03 CHTOP C1TM ACDSB PTCD1 ALDH2 SYSM SPRY4 ODO1 SETX GTPB3 ODP2 MTEF3 TIM13 TRM11 ARGI2 DLDH BCS1 ECSIT GRPE1 RM22 RM09 DCXR GLO2 SUCA TIM14 NFS1 NIPS1 CLYBL CH082 ACAD9 DNJA3 GLRX5 PREP RT4I1 PPIF ODO2 RM48 AL7A1 PDPR NT5D3 ADT1 PHB DHX30 RM39 GUF1 NDUF3 NUBPL STML2 SSDH PTPM1 NNTM LONM RM18 DHB8 OXA1L TIM44 RM23 TRAP1 S2540 ADT2 LETM1 DDX54 SCOT1 COASY CLPX RM27 NUDT8 THIL RM12 MAIP1 SUCB1 SUCB2 GOL CHR SMC1A ERAL1 SYLM SHC1 AFG32 RM32 HCDH ETFA SYAM MRM3 THEM4 SLIRP LY R M 4 ABCB7 RM38

RM33 RM15 RM35 RM40 EXO END RM30 ICT1 RM13 ER AOX

NDUBA NDUS4 NDUA2

NDUA9 NDUV2 NDUV1 NDUS5

NDUS6 ACPM NDUAD NDUS2 ATP5L ATP5J ATPG

COX2 COX41 NDUA5 NDUA7 NDUS7 NDUS3 QCR7 UCRI ATPA AT5F1 ATPB

NDUAC NDUS1 NDUA6 NDUS8 SDHA SDHB QCR2 QCR1 COX5A COX5B ATPO ATP5H ATPD

c + + + 100% 4H 4H 2H

90% e– e– 80% cyt c 70% 60% Complex e– e– Complex Complex Complex 50% UQ IMM I III IV V 40% 30% Complex 20% II 10%

Identified/total 19/44 2/4 4/10 4/19 9/19 d 60

40

abundance 20 Average component

0 Complex Complex Complex Complex Complex I II III IV V

Fig. 5 AOX localizes to inner mitochondrial membrane and in the vicinity of complex II. a The AOX from Ciona Intestinalis, was introduced to cells and the MS microscopy assigns the AOX to localize to mitochondria. b The BioID approach identifies 333 interactions of which 93.1% (310) were mitochondrial (green), 0.3% (1) was peroxisomal (pink) and 6.6% (22) were unassigned (gray), based on CellWhere database. c Total of 38 of the interactors were components of the mitochondrial respiratory chain complexes (complex I- V, key: color gradient indicates the percentage of proteins of the each individual complexes identified). d The average component abundance shows that the AOX associates most with the complex II, which is in agreement with the AOX suggested functional role. Boxplots show the median, the 25th and 75th percentile, Tukey whiskers (median ± 1.5IQR) and outliers (.)

8 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 ARTICLE and Golgi vesicular membrane-trafficking protein p18 (BET1) BET1, involved in the docking process of ER-derived vesicles with and applied our approach to them (Fig. 4b–d, Supplementary the cis-Golgi membrane is assigned to Golgi (Fig. 4b–d). Essential Data 1a–c, and Supplementary Fig. 7). Aurora kinase plays an component of the transcription factor II H (TFIIH), CDK7, and important role in cellular division by controlling chromatic seg- complex associating CDK8 are predominantly asso- regation, which matches well to its interactions overlaying with ciating with chromatin (Fig. 4b–d). This finding is in line with chromatin marker H31 (Fig. 4b–d). Similarly SNARE protein their important role in transcription regulation. Importantly,

ab Outer mitochondrial Inner mitochondrial Mitochondrial membrane (OMM) membrane space (IMS) matrix

OMM IMS

Matrix TOM20 SCO1 PDK1

IMM

c TOMM22 MRM1

OMM OMM

Matrix 0.50 0.75 Matrix 0.50 0.75

IMS IMS

MGST3 PLRKT

OMM OMM

Matrix 0.50 0.75 Matrix 0.50 0.75

IMS IMS

SFXN1 PTH2

OMM OMM

Matrix 0.50 0.75 Matrix 0.50 0.75

IMS IMS

COX14 TR61B

OMM OMM Matrix 0.50 0.75 Matrix 0.50 0.75

IMS IMS

COX41 HXK1

OMM OMM

Matrix 0.50 0.75 Matrix 0.50 0.75

IMS IMS

CH60 AOX

OMM OMM

Matrix 0.50 0.75 Matrix 0.50 0.75

IMS IMS

NUC AKIP NE PER

MT PM

OMM MIC PRO

Matrix 0.50 0.75 0.50 0.75 LYS CEN IMS Nucleus GOL CHR mitochondria

EXO END ER

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 these examples show the high resolution of the MS microscopy Molecular localization of the transgenic alternative oxidase. to distinct exact molecular locations, which could not be Alternative oxidase (AOX) present in many lower eukaryotes, but resolved by fluorescence microscopy (Supplementary Fig. 7). not in vertebrates, transfer electrons directly from ubiquinol to Glycogen synthase kinase β (GSK3B) phosphorylates many oxygen in a non-proton-motive manner38. Transgenic expression substrates in mammalian cells, and functions in many physio- of AOX in mammalian systems has been suggested as a ther- logical processes, and acts as an important regulator in Wnt and apeutic option for treating mitochondrial disease induced by Hedgehog signaling pathways27. Somewhat, to our surprise our OXPHOS dysfunction39, and additionally it has been shown that MS microscopy showed GSK3B localization to Golgi and exo- even broad expression of AOX does not disturb normal phy- somes. Recent research have demonstrated that a portion of siology in mice40. The exact molecular context of AOX in GSK3B is localized to the trans-Golgi network through per- mitochondria membrane is not currently known, but it is thought ipheral protein p23028 and that cytoplasmic GSK3B relocalizes to locate close to complex II based on its alleviating effects after to the same endosome as the internalized Wnt ligand29.Itis toxic or pathological inhibition of the mitochondrial respiratory plausible that this colocalization of GSK3B continues with chain41. Therefore, we decided to apply our MS microscopy active Wnt through endosomal organelles onto exosomes30.For method to define the molecular context of Ciona intestinalis’ validation of our endosomal location, we choose RAB5A which AOX in human cells and possibly shed some light on its inter- is known to localize to early endosomes and is involved in the actions with respiratory chain. Our approach identifies AOX recruitment of RAB7A and the maturation of these organelles predominantly to localize to mitochondria (Fig. 5a and Supple- to late endosomes31.Inouranalysiswecanconfirm the (early)- mentary Fig. 7). More specifically, >90% of AOX interactors endosomal location as well as detect a fraction of Golgi loca- (Fig. 5b) belong to mitochondria according to CellWhere data- lization, which could be related to the fusion of trans-Golgi base and 48% among them have the GO term annotation of network-derived vesicles with the early endosome32. mitochondria inner membrane. Furthermore, 38 of the inter- Our generated reference molecular context proteome map actors are components of the mitochondrial respiratory com- could be expected to be applicable with previously published plexes I–V. From these AOX prefers interactions with complex II heterogenous BioID datasets and even with other cell types. To (2/4 components detected), complex I (19/44) and complex V (9/ – test this we selected seven studies13,21,33 37 using BioID with 19) (Fig. 5c), which is also visible from the quantitative interactor heterogeneous protocols and processed the reported interac- abundance (Fig. 5d). The detected higher quantitative abundance tions against our reference molecular context proteome map. of the complex II proteins is in agreement with the immunoe- With this analysis we could show that our approach identified lectron microscopy findings of the complex II locating on the the correct subcellular localization for 22/25 of the tested inner mitochondrial membrane42, whereas the other complexes experiments (Supplementary Data 3, http://www.biocenter. (I,III,IV and V) would be on the cristae membrane. Therefore, helsinki.fi/bi/protein/msmic/example.pdf), validating and our MS microscopy findings confirm the functionally suggested further extending the general usability of our MS microscopy location (IMM, in vicinity of complex II) of transgenic AOX in approach. Additionally, we selected two commonly used human human cells. cell lines, the human bone osteosarcoma U-2 OS and the human prostate cancer DU-145 and generated cell line specific reference molecular context proteome maps. The 17 used Sub-organelle level resolution of the MS microscopy. Although localization markers in both cell lines align well with their for many studies the mapping of protein localization to 14 respective localization, when processed against the Flp-In T- compartments is sufficient, in theory the MS microscopy allows REx 293 reference molecular context proteome map (Supple- even better resolution. The resolution of MS microscopy is mentary Fig. 8, 9 and Supplementary Data 1c). We then dependent on the in vivo biotinylation radius (10–50 nm), which compared the BioID results of AURKB, BET1, CDK7, CDK8, then in principle should allow 5–10-fold higher resolution than GSK3B and RAB5A from the Flp-In T-REx 293 (Fig. 4cand standard confocal microscope. To benchmark the MS microscopy Supplementary Data 1c), with the U-2 OS, DU-145 and Flp-In on sub-organelle analyses, we selected three well-documented T-REx 293 generated maps, and the results align extremely well mitochondrial proteins and generated a sub-organelle molecular (Supplementary Fig. 10). context proteome map of the . The mitochondrion These examples clearly establish the applicability of the MS is a double membrane-bound organelle possessing an outer microscopy in defining the molecular context of many protein membrane (OMM), an inner membrane (IMM), and mitochon- (s), and also it´s expandability to cover also different cell types drial matrix within IMM (Fig. 6a). The intermembrane space or sub-organelle structures. In addition to analysis of wild-type (IMS) is 10 ∼ 20 nm in diameter, and therefore should allow, for proteins our system should be useful for defining possible example, almost complete biotinylation and subsequent identifi- altered molecular context in human diseases caused by either cation of the proteins in the IMS with MS. The three mito- somatic or germline genetic alteration, as well as for example chondrial marker proteins; OMM) outer mitochondrial analyzing functions of transgenic proteins not expressed in membrane receptor Tom20 (TOMM20, dark green); IMS) pro- human cells. tein SCO1 homolog (SCO1, light green); and matrix) pyruvate

Fig. 6 Sub-organelle level molecular context map of the mitochondria. a Mitochondrion can be divided to four compartments, namely to the outer mitochondrial membrane (dark green), inner mitochondrial space (light green), inner mitochondrial membrane (gray) and the (yellow). b The three mitochondrial proteins, TOM20, SCO1 and PDK1, used for generation of the mitochondrial sub-organelle molecular context map, with their PPI network obtained from the BioID (key: the interacting proteins are colored according to their corresponding bait mitochondrial location. Known (blue), newly identified (red) and prey-prey (black dashed line) interactions are color-coded). c Confocal microscopy analysis fails to provide sub-organelle level information of mitochondrial protein, whereas MS microscopy allows assigning the proteins within mitochondrial compartments. Confocal microscopy (HC PL APO 93×/1.30 GLYC motCORR) was applied to observe the mitochondrial localization MAC-tagged mitochondrial proteins. The MAC- tagged bait proteins are visualized with anti-HA immunostaining (green), nucleus with DAPI (blue), and mitochondria by co-transfection with pDsRed- Mito vector (red), Scale bar: 10 μm. The MS microscopy analysis and the resulting polar plots assign the mitochondrial proteins to their corresponding mitochondrial compartments. The color assigned to each sub-organelle location is based on the annotation frequency (green: 0.75–1; yellow: 0.5–0.75; pink: 0–0.5)

10 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 ARTICLE

a CDK7 b CDK8 c MED13

CCNH MED12 MED12 ERCC3 MAT1 10 MED14 ERCC2 MED14 MED1 10 MED1 TFIIH1 10 MED24 TFIIH4 MED13 TFIIH2 MED17 MED15 MED24 MED23 MED17 TFIIH3 CCNC MED27 MED23 MED27 MED15 MED4 CDK8 MED16 MED6 MED16 MED6 MED8 MED19 MED4 MED19 ERCC5 MED7 MED25 MED21 MED20 MED20 MED22 MED10 MED29 MED29 MED7 MED10 BioID bait normalized PSM BioID bait normalized PSM CCNCMED8 MED22 MED25 BioID bait normalized PSM MED11 MED9 MED11 MED30 MED28 MED31 MED18 MED28 MED30 TFIIH5 MED18 MED9 1 1 MED31 1 MED21

110110110 AP-MS bait normalized PSM AP-MS bait normalized PSM AP-MS bait normalized PSM d e MED10 MED19

CDK7 MED7 MED31 MED21 MED6 MED9 MED13 MED8

MED12 ERCC2 MED4 MED11 MED22 ERCC3 TFIIH2 MED17 MED1 CNCC MED18 MED14 MED20 CDK8 TFIIH5 MED15 TFIIH3

TFIIH4

MED16

f g h 1.00

MED31 MED11 MED18 0.75 MED22 CDK7 MED12 MED30 1.00 MED20 MED9 MED10 MED28 MED25 MED29 MED21 0.50 CCNH MED19 MED7 MED4 MED8 MED6 MED27 ERCC2 0.75 MED16 0.25 MED15 CCNC

MAT1 to CDK8 (PSM) Relative distance 0.00 ERCC3 MED17 Spearman’s r = 0.89, p = 4e−10 0.50 MED23 TFIIH1 0.00 0.25 0.50 0.75 1.00 TFIIH4 Relative distance to CDK8 (MS1) 0.25 MED24 TFIIH2 1.00 TFIIH3 MED1 MED14 0.00 0.75 ERCC5 Pearson’s r = 0.95, p = 1.7e–13

Relative distance to CDK8 (PSM) Spearman’s r = 0.84, p = 5.9e–08 TFIIH5 0.50 0.000.50 1.00 1.50 0.00 0.25 0.50 0.75 1.00 0.25 to MED13 (PSM) Relative distance to CDK7 Relative distance to MED13 (PSM) Relative distance Spearman’s r = 0.86, p = 9.9e−09 0.00 0.00 0.25 0.50 0.75 1.00 Relative distance to MED13 (MS1)

Fig. 7 Characterization of interaction distances by integration of MAC-tag data. a–c Distance based topology of protein complexes. The AP-MS and BioID data was blotted based on the bait normalized prey abundances and the correlated data was used to derive interaction distances for CDK7 and the TFIIH complex, as well as for CDK8 and MED13 with the Mediator complex. The CDK7 formed CAK-complex components are shown in gray and the Mediator complex components assigned to the Head (magenta), the Middle (cyan) and the Tail (green) are color-coded. d, e The derived interaction distances for CDK7, CDK8 and the MED13 are fitted into EM derived complex structures and suggested fitted interaction surface is shown in green dashed line ellipses. The color-coding in e corresponds with b, c. f Relative distances for bait protein and the other complex components can be calculated. (g, h) The calculated relative distances (using either PSM or MS1 intensity values) derived from the integrated AP-MS and BioID data results to extremely high correlation (Pearson’s and Spearman’s) and p-value as indicated (t-test) for CDK8 and MED13, two neighboring units in the Mediator kinase module

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 dehydrogenase kinase isoform 1 (PDK1, yellow)43 were processed Our results are in line with both hypothesis, the evidence through our BioID pipeline to generate the mitochondrial sub- suggesting that TFIIH1 and ERCC3 have short inter-distances as organelle map. This resulted in a mitochondrial sub-organelle well as that they both are close to the CAK module (Fig. 7d, f). reference proteome database consisting of 121 (OMM), 102 TFIIH2-4 are part of core ring structure of TFIIH complex (IMS) and 235 (matrix) proteins (Fig. 6b). For testing the gen- located adjacent to CDK7, thus having highly similar distance to erated mitochondrial reference database, we processed additional CDK7 (Fig. 7a, f). Similarly, ERCC5 and TFIIH5 are in longer 13 mitochondrial proteins with our MS microscopy and defined distances from CDK7, suggesting that they are located on the their mitochondrial sub-organelle localization. Using a confocal opposite side of the complex from CAK (Fig. 7a, d, f). microscopy we could confirm that all of the proteins are indeed The transcriptional co-activator Mediator complex has more mitochondrial, however it was impossible to obtain more detailed than 30 subunits and is ~30 nm in size55, and therefore on the sub-organelle localization information. However, using our MS upper detection limits. The Mediator complex is composed of 4 microscopy we could assign the mitochondrial proteins to the modules, the head56,57, the middle58, the tail59 and the kinase three mitochondrial sub-compartments. Majority of the mito- module60,61. The evolutionarily conserved and dissociable kinase chondrial proteins were localized solely to mitochondria when module is formed by CDK8 together with cyclin C (CCNC), using the MS microscopy on the whole cell level mainly, and also mediator of RNA polymerase II transcription subunit mediator – similarly to a single compartment within the mitochondria when complex subunit (MED) 12 and 13, (Fig. 7e)56 59. To test and using the MS microscopy on a sub-organelle level (Fig. 6c). Of the validate the reproducibility of our approach, in addition to CDK8, tested 13 mitochondrial proteins, eight (MRM1, MGST3, PLRKT, we additionally choose MED13 for analysis. Additionally this SFXN1, PTH2, COX14, TR61B and AKIP; Supplementary would allow more accurate prediction of the kinase-module Data 1a) had been analyzed previously using APEX43. Our results docking surface to the Mediator core complex. To our surprise were in good agreement with the seven of these proteins, how- the overall correlation of CDK8 and MED13 distances from the ever, we also detected COX14 in the IMS, and AKIP in the Mediator core is extremely high (c = 0.95) (Fig. 7b, c, g, h and ‘nucleus’ and ‘chromatin’—location in which AKIP has been Supplementary Data 1e), confirming that these two proteins are reported to also function44. highly proximal. Based on the analysis the closest Mediator subunits for both CDK8 and MED13 are MED12, MED14, MED1, MED24, MED23, MED17, MED15, MED27, MED16 and Defining interaction distances within a protein complex. MED6/459. This suggests that the kinase module is docking Others and we have shown that AP-MS offers accurate quanti- horizontal to the MED14 ranging from RM1 and RM2, the two fication of complex composition allowing calculations on com- repeats of a structural domain on MED14. plex stoichiometry9,10,45. With BirA* the labeling radius is limited For additional validation of our approach, we MAC-tagged (circa 10 nm), and it has been used to obtain rough maps of three components (ARPC1B, ARP2 and ARP3B) of the 220 kDa spatial distribution of proteins within structures by reciprocally and 7 seven subunit ARP2/3 complex with high-resolution crystal analyzing BirA*-tagged proteins throughout the structure15.As structure62. The obtained relative distances of the three MAC- the in vivo biotinylation is enzymatic reaction deriving relational tagged components to the other components of the ARP2/3 interaction abundances of participants cannot be done. However, complex are in good agreement with the ARP2/3 structure it can be reasoned that the more proximal proteins will be more (Supplementary Figure 11a–c and Supplementary Data 1f). efficiently biotinylated and purified in larger abundances than The CDK7, CDK8, MED13, ARP2/3 examples benchmark proteins further away15,46. On the AP-MS side this would cor- another utility of our MAC-tag system and shows that by relate with the likelihood of more abundant interactors being integration of AP-MS and BioID it is possible to derive more direct than low abundant, in which the interaction could be information on complex structure, interaction distances and mediated by other proteins and the interaction with the bait possible distance constraints. would be secondary or tertiary etc. Therefore, by blotting both the BioID and AP-MS data, in theory, we could obtain relative dis- tance of the MAC-tagged bait protein to its interacting protein in Discussion a complex. For testing this hypothesis, we selected CDK7 and In this study, we developed and optimized an integrated workflow CDK8 for which we have previously identified successfully based around MAC-tag, for characterization of the molecular quantitative complex compositions22. context of many proteins of interest from human cells. This We applied our dual-approach, and with both AP-MS and workflow features state-of-the-art affinity purification using BioID we could detect the CDK7 interactions with TFIIH core Strep-tag to identify and quantify protein-protein interaction and – components (Fig. 7a)47 49. The size of the TFIIH is estimated to protein complex stoichiometry; identification of transient or be ~10 nm47, which is still within the BirA* biotinylation range close-proximity interactions with BioID; visualization of the bait and should allow measurement of the CDK7 interaction distances protein and the proximal interactors with immunofluorescence for all of the complex components. Before associating with microscopy; and defining the molecular context with MS TFIIH, CDK7 associates and forms cyclin-dependent kinase microscopy utilizing the reference dataset obtained by identifying (CDK)-activating kinase (CAK) complex with two regulatory proximal interactors for bona fide subcellular localization mar- subunits; cyclin H (CCNH) required for CDK7 activity and with kers. Additionally our integrated workflow reduces the generation RING finger protein CDK-activating kinase assembly factor of the required cell lines for AP-MS and BioID to half, and the use MAT1 (MAT) which modulates the substrate specificity of the of a single affinity reagent simplifies the combinatorial use of both complex50. In agreement, both CCNH and MAT1 are detected as AP-MS and BioID approaches. Additionally, the use of a single closest to CDK7, followed by ERCC2, ERCC3 and TFIIH1. The affinity reagent facilitates the filtering of the HCIPs (mainly due ERCC2, TFIIH basal transcription factor complex XPD to the uniform background from the unspecific binders). Other subunit is the bridge linking the CAK module with TFIIH ring- advantages of using Strep-Tactin® in AP-MS include the lower like core and has been shown to directly interact with TFIIH basal cost and higher binding capacity of Strep-Tactin® compared to transcription factor complex helicase XPB subunit (ERCC3)51 antibody beads63 (e.g., HA22, Myc10 and FLAG20, as well as the and TFIIH152. CDK749 also has been reported to directly possibility for native elution with biotin instead of on-bead interact with TFIIH148 (Fig. 7d and Supplementary Data 1e)53,54. digestion20. The possibility for native elution also allows the use

12 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 ARTICLE of the purified protein complexes for example on enzymatic empower, not only the interaction proteomics community, but reactions, such as kinase assays22. In addition as the majority of also cell/molecular/structural biologists, with an experimentally interaction proteomics studies (AP-MS and/or BioID/APEX) are proven integrated workflow for mapping in detail the physical using the biotin-avidin based purification approaches, also the and functional interactions and the molecular context of proteins comparison of the obtained results with other publications is in human cells. easier with using only Strep-Tactin®. In addition to analyzing the physical and functional interac- Methods tions formed by 18 cellular localization markers, we used our Generation of MAC-tag Gateway® destination vectors. To generate Gateway integrated workflow to map interactions for four kinases compatible destination vectors, plasmids containing the tags (C-terminal: StrepIII/ ® (AURKB, CDK7, CDK8 and GSK3B), as well as for two addi- HA/BirA*, N-terminal: BirA*/HA/StrepIII) were synthesized by GeneArt , Life Technologies. These were digested with restriction and inserted into N- tional localization markers (BET1 and RAB5A). In addition to terminal: pcDNA5/FRT/TO/StrepIII/HA/GW64 or C-terminal: pcDNA5/FRT/TO/ identifying 539 interactions for these six proteins, we could StrepIII/HA/GW2 in which entire StrepIII/HA tag was removed. All the Gateway validate the accuracy of the MS microscopy method for identi- compatible entry clones, which contain subcellular marker gene of interested, were fying correct cellular localization for these proteins. We also from Human ORFeome collection. The MAC-tag constructs are made available via applied the analyses for BioID identified filtered interactions from Addgene.org. 7 publications, and derived the cellular localizations for the used fi Immunofluorescence. HeLa cells (American Type Culture Collection, ATCC, baits. The identi ed localizations were in good agreement with CCL-2) were transfected with vectors containing MAC-tagged gene of interest and the corresponding reported localizations. This illustrates the cultured either with or without supplemental biotin. Bait proteins were detected general usability of the MS microscopy and the web application, with anti-HA antibody (Biolegend, MMS-101R, dilution 1:500), followed by Alexa but also suggests that the MS microscopy would be extendable as Fluor 488-conjugated secondary antibody (Thermo Fisher Scientific, A-11001, a communal effort to cover the cell even in greater detail. The 1:800). Biotinylated proteins were detected with Alexa Fluor 594 streptavidin (Thermo Fisher Scientific; S11227, 1:800). DAPI staining was used to determine the cellular signaling state varies between the stimuli, but also with nuclei. Selected endogenous proteins were detected with specific antibodies different cell types. Although our MS microscopy resulted in (Supplementary Data 1a) and subsequently with Alexa Fluor 594 -conjugated highly similar results using data from three different cell lines, the anti-rabbit antibody (Thermo Fisher Scientific, A11012, dilution 1:800). Wide-field fl biological variation and cellular heterogeneity needs to be taken uoresce microscope (Leica, Leica DM6000, Welzlar, Germany) with HCXPL APO 63×/1.40–0.60 oil objective was used to image the samples. For imaging sub- into consideration with MS microscopy as with any biological mitochondrial proteins, confocal microscopy (Leica TCS SP8 STED, Leica) with experiment. HC PL APO 93×/1.30 motCORR glycerol object was used. The image files were Furthermore, we could show with an exogenous protein, AOX processed with LAS X (Leica), and ImageJ softwares. that our MS microscopy identifies AOX to localize to mito- chondria. Additional analysis shows that AOX localizes to inner Cell culture. For generation of the stable cell lines inducibly expressing the MAC- mitochondrial membrane from the mitochondrial matrix side tagged versions of the baits, Flp-In™ T-REx™ 293 cell lines (Invitrogen, Life fi Technologies, R78007, cultured in manufacture’s recommended conditions) were and is in close-proximity with Complex II. Our ndings validate, co-transfected with the expression vector and the pOG44 vector (Invitrogen) using the functionally suggested vicinity of AOX with Complex II. We the Fugene6 transfection reagent (Roche Applied Science). Two days after − further extended our analysis on sub-organelle level with mito- transfection, cells were selected in 50 μgml 1 streptomycin and hygromycin − chondria, by addition of three additional markers, TOMM22 for (100 μgml 1) for 2 weeks, and then the positive clones were pooled and amplified. fl OMM, SCO1 for IMS and PDK1 for the mitochondrial matrix. Stable cells expressing MAC-tag fused to green uorescent protein (GFP) were used as negative controls and processed in parallel to the bait proteins. Using this sub-organelle reference proteome map, we applied MS Each stable cell line was expanded to 80% confluence in 20 × 150 mm cell − microscopy on 13 mitochondrial proteins and could identify their culture plates. Ten plates were used for AP-MS approach, in which 1 μgml 1 sub-organelle locations, whereas the confocal microscopy (93×) tetracycline was added for 24 h induction, and 10 plates for BioID approach, in μ failed to do so. which in addition to tetracycline, 50 M biotin was added for 24 h before harvesting. Cells from 5 × 150 mm fully confluent dishes (~5 × 107 cells) were Identifying the complex components in a stoichiometry fash- pelleted as one biological sample. Thus, each bait protein has two biological ion has been shown to be possible with affinity purification mass replicates in two different approaches. Samples were snap frozen and stored spectrometry9,10. However, obtaining any further spatial infor- at −80 °C. mation of the complex formation has only been possibly in Human osteosarcoma cell line U-2 OS (ATCC, HTB-9) and prostate cancer cell 18 line DU145 (ATCC, HTB-81) were routinely maintained in ATCC-recommended combination with XL-MS . We could now show with the TFIIH conditions. For transient transfections, DU-145 or U-2 OS cells were seeded 24 h and Mediator complex as model complexes, that by utilizing both before transfection to 7 × 150 mm cell culture plates. Transient transfections for the AP-MS and BioID approaches we can obtain relative inter- U-2 OS were conducted using DreamFect Gold transfection reagent (OZ action distances for proteins in a complex. Based on the inter- Biosciences, Marseille, France), and for DU-145 with Helix-IN™ (FP29, OZ Biosciences) according to the manufacturer’s instructions. The in vivo biotinylation action distances it is possible to obtain an estimate for the was activated by addition of biotin 24 h post-transfection to a final concentration of interaction surfaces for proteins or structures, such as with the 50 μM. Cells from the 7 × 150 mm cell culture plates (~7 × 107 cells) were collected kinase submodule of the Mediator complex. We agree that and snap frozen. Analyses of the baits were performed as two biological replicates. deriving the interaction distances might only be applicable when using single affinity reagent (Strep-Tactin®) for both the AP-MS Affinity purification of the interacting proteins. For AP-MS approach, the and BioID. sample was lysed in 3 ml of lysis buffer 1 (0.5% IGEPAL, 50 mM Hepes, pH 8.0, In summary, our study showed that the integrated workflow 150 mM NaCl, 50 mM NaF, 1.5 mM NaVO3, 5 mM EDTA, supplemented with 0.5 mM PMSF and protease inhibitors; Sigma). and the reference molecular context proteome map generated For BioID approach, Cell pellet was thawed in 3 ml ice-cold lysis buffer 2 (0.5% here, allows an easy way to probe the molecular localization of IGEPAL, 50 mM Hepes, pH 8.0, 150 mM NaCl, 50 mM NaF, 1.5 mM NaVO3,5 protein of interest, and additionally an online resource of our mM EDTA, 0.1% SDS, supplemented with 0.5 mM PMSF and protease inhibitors; BioID based MS microscopy approach is available at www. Sigma). Lysates were sonicated, treated with benzonase. fi Cleared lysate was obtained by centrifugation and loaded consecutively on biocenter.helsinki. /bi/protein/msmic. We also showed that it is spin columns (Bio-Rad) containing lysis buffer 1 prewashed 200 μl Strep-Tactin also usable for existing BioID datasets and can be expanded (with beads (IBA, GmbH). The beads were then washed 3 × 1 ml with lysis buffer 1 additional localizations or cell types) once more data comes and 4 × 1 ml with wash buffer (50 mM Tris-HCl, pH 8.0, 150 mM NaCl, 50 mM available. The molecular image obtained from the MS microscopy NaF, 5 mM EDTA). Following the final wash, beads were then resuspended in 2 × 300 μlelutionbuffer(50mMTris-HCl,pH8.0,150mMNaCl,50mMNaF, analysis considers the weights of interactors and provides more 5 mM EDTA, 0.5 mM Biotin) for 5 mins and eluates collected into an Eppendorf dynamic localization information at the molecular level. The tubes, followed by a reduction of the cysteine bonds with 5 mM Tris(2- developed MAC-tag and the integrated approach should carboxyethyl)phosphine (TCEP) for 30minsat37°Candalkylationwith10

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 13 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 mM iodoacetamide. The proteins were then digested to peptides with Results visualization. The MS microscopy analyses are presented as polar plots fi sequencing grade modi ed trypsin (Promega, V5113) at 37 °C overnight. After created with an in-house python script, where the circle has been equally divided quenching with 10% TFA, the samples were desalted by C18 reversed-phase into 14 sectors, each sector representing one specific subcellular location. For the ’ spin columns according to the manufacturer sinstructions(Harvard sub-organelle localization analysis, three sectors representing matrix, IMS and Apparatus). The eluted peptide sample was dried in vacuum centrifuge and OMM are shown in the plot. Differently colored sector areas indicate the possible fi μ reconstituted to a nal volume of 30 lin0.1%TFAand1%CH3CN. location score of the query bait, with scores between 0 and 0.5 marked in red, between 0.5 and 0.75 in yellow, and 0.75 and 1 in green. Liquid chromatography–mass spectrometry (LC-MS). Analysis was performed Online interface. We have developed a web application (R-shiny; http://www. on a Q-Exactive mass spectrometer using Xcalibur version 3.0.63 coupled with an biocenter.helsinki.fi/bi/protein/msmic) for visualization of protein localizations by EASY-nLC 1000 system via an electrospray ionization sprayer (Thermo Fisher MS microscopy. A user can upload an input file (after SAINT and CRAPome Scientific). In detail, peptides were eluted and separated with a C18 precolumn filtering) and visualize the bait protein’s dynamic localization. The polar plot as (Acclaim PepMap 100, 75 μm x 2 cm, 3 μm, 100 Å, Thermo Scientific) and ana- well as a parsed data matrix can also be downloaded. lytical column (Acclaim PepMap RSLC, 75 μm × 15 cm, 2 μm, 100 Å; Thermo Scientific), using a 60 min buffer gradient ranging from 5 to 35% buffer B, followed by a 5 min gradient from 35 to 80% buffer B and 10 min gradient from 80 to 100% Determining relative intramolecular distances of complexes. The PSM values − fi buffer B at a flow rate of 300 nl min 1 (buffer A: 0.1% formic acid in 98% HPLC from both the BioID and AP-MS approaches were rst averaged from the replicate grade water and 2% acetonitrile; buffer B: 0.1% formic acid in 98% acetonitrile and samples and the averaged PSMs from the preys were normalized by dividing 2% water). For direct LC-MS analysis, 4 μl peptide samples were automatically them by the averaged PSMs of the baits. The normalized PSMs were then used to loaded from an enclosed cooled autosampler. Data-dependent FTMS acquisition calculateq theffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi relative Euclidean distance between the baits and the preys using the − was in positive ion mode for 80 min. A full scan (200–2000 m z 1) was performed ð BioID À BioIDÞ2 þð APÀMS À APÀMSÞ2 formula PSMPrey PSMBait PSMPrey PSMBait in an in-house with a resolution of 70,000 followed by top10 CID-MS2 ion trap scans with R script. These values (in log10 scale) were also used to create the scatter plots. resolution of 17,500. Dynamic exclusion was set for 30 s. Acquired MS2 spectral Pearson’s and Spearman’s correlations were used to assess the correlations between data files (Thermo.RAW) were searched with Proteome Discoverer 1.4 (Thermo the relative distance measures of the CDK8 and MED13 baits (the correlation Scientific) using SEQUEST search engine of the selected human component of coefficients and their associated p-values are shown). MaxQuant (version 1.6)69 UniProtKB/SwissProt database (http://www.uniprot.org/, version 2015–09). The was used to obtain the parent-ion mass (MS1) information. following parameters were applied: Trypsin was selected as the and a maximum of 2 missed cleavages were permitted, precursor mass tolerance at ±15 ppm and fragment mass tolerance at 0.05 Da. Carbamidomethylation of cysteine, Crystal structure and CryoEM docking. The human core TFIIH crystal structure was defined as static modifications. Oxidation of methionine and biotinylation of ( (PDB): 5IVW) was docked into cryoEM reconstructions fi described by He et al. (Electron Microscopy Data Bank (EMDB): 8131)48. The crystal and N-termini were set as variable modi cations. All reported data were 49 based on high-confidence peptides assigned in Proteome Discoverer with structure of human CDK7 (Protein Data Bank: 1UA2) was simply placed in the FDR<1%). possible interaction direction using Chimera. The human CDK8 crystal structure (PDB: 3RGF)61 was docked into cryoEM reconstructions described by Tsai et al. (EMDB: 5588)60. The mediator complex structure for head module (PDB:4GWP57, 58 59 Identification of the HCIs. Significance Analysis of INTeractome (SAINT)-express middle module , tail module(5U0S) were docked into cyroEM reconstructions 56 version 3.6.065,66 and Contaminant Repository for Affinity Purification (CRA- described by Tsai et al. (EMDB:2634) . Crystal structure of Arp2/3 complex (PDB: 62 Pome, http://www.crapome.org/)67 were used as statistical tools for identification 4JD2) was showing in surface representation. of specific high-confidence interactions from our AP-MS data. 16 GFP control runs (8 N-terminal MAC-GFP and 8 C-terminal MAC-GFP) were used as control Data availability. The MS data for all the 518 runs are available at the Pepti- counts for each hit and the final results only considering proteins with SAINT deAtlas raw data repository under the accession code PASS01076 (see also score ≥ 0.73. This corresponds to an estimated protein-level Bayesian FDR of <0.05. Supplementary Data 1b). The protein interactions from this publication have Furthermore, we used the CRAPome database with a cutoff frequency of ≥20% been submitted to the IMEx consortium (http://www.imexconsortium.org) (≥82) except the average spectral count fold change ≥3 was set for assigning HCIs. through IntAct70 and assigned the identifier IM-26301. Results of using MS microscopy on published BioID datasets are available at http://www.biocenter. helsinki.fi/bi/protein/msmic/example.pdf. Other data that support this study are Clustering analysis. Prey protein frequency count matrix was generated using available from the corresponding author upon reasonable request. DAVID gene functional classification tool to provide the gene ontology (GO) terms (domains, biological process and molecular function). The p-values associated with each annotation terms has p < 0.01(by a modified Fisher’s exact test). Hierarchical Received: 1 February 2018 Accepted: 21 February 2018 cluster was performed by centered correlation (both baits and interactors; average linkage) using Cluster 3.0 and the clusters were visualized with Tree View 1.1.6 and the matrix2png web server (http://www.chibi.ubc.ca/matrix2png/).

Networks and maps. Protein interaction networks are constructed from SAINT References data that were imported into Cytoscape 3.2.168. The known prey-prey interaction 1. Dunham, W. H., Mullin, M. & Gingras, A. C. Affinity-purification coupled to data were obtained from PINA2 database (http://omics.bjcancer.org/pina/) mass spectrometry: basic principles and strategies. Proteomics 12, 1576–1590 (2012). 2. Varjosalo, M. et al. Interlaboratory reproducibility of large-scale human fi MS microscopy database construction. The high-con dence interacting proteins protein-complex analysis by standardized AP-MS. Nat. Methods 10, 307–314 fi (HCIPs) obtained from previous ltering steps were sorted according to the cor- (2013). responding bait protein localization information to build the reference database, 3. Einhauer, A. & Jungbauer, A. The FLAG peptide, a versatile fusion tag for the containing the following localization information: peroxisome, microtubule, purification of recombinant proteins. J. Biochem. Biophys. Methods 49, endosome (combined: early, late and recycling endosome), proteasome, nuclear 455–465 (2001). envelope, Golgi (combined: trans-Golgi and cis-Golgi), lysosome, nucleolus, plasma 4. Bornhorst, J. A. & Falke, J. J. Purification of proteins using polyhistidine membrane, endoplasmic reticulum, mitochondria, centrosome, chromatin, exo- affinity tags. Methods Enzymol. 326, 245–254 (2000). some. To elucidate sub-organelle localization (OMM, IMS and matrix) within 5. Kolodziej, P. A. & Young, R. A. Epitope tagging and protein surveillance. mitochondria, three mitochondrial bait proteins were used. Methods Enzymol. 194, 508–519 (1991). 6. Tai, T. N., Havelka, W. A. & Kaplan, S. A broad-host-range vector system for – Score calculation cloning and translational lacZ fusion analysis. Plasmid 19, 175 188 (1988). . Final localization scores for all localization groups of a given fl bait of interest were calculated by first dividing the sum of the normalized peptide- 7. Gerdes, H. H. & Kaether, C. Green uorescent protein: applications in cell – spectrum match (PSM) values of the interactors that matched to the bait of interest biology. FEBS Lett. 389,44 47 (1996). and a localization group by the sum of PSM values of all bait interactors that 8. Skerra, A. & Schmidt, T. G. Use of the Strep-Tag and streptavidin for fi matched any localization group. This was then multiplied by the sum of PSM detection and puri cation of recombinant proteins. Methods Enzymol. 326, – values of interactors of the localization group that matched the bait interactors of 271 304 (2000). interest divided by the sum of PSM values of all interactors of the localization 9. Wepf, A., Glatter, T., Schmidt, A., Aebersold, R. & Gstaiger, M. Quantitative group. The PSMs were bait normalized, i.e., the PSM of each interactor was divided interaction proteomics using mass spectrometry. Nat. Methods 6, 203–205 by the PSM of the bait. The score reflects subcellular localization by numerically (2009). describing the similarity of the subcellular environment and the time spent there 10. Yadav, L. et al. Systematic Analysis of Human Protein Phosphatase between the bait of interest and each localization group. Interactions and Dynamics. Cell Syst. 4, 430–444.e5 (2017).

14 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2 ARTICLE

11. Herzog, F. et al. Structural probing of a protein phosphatase 2A network by 43. Lee, S. Y. et al. APEX Fingerprinting Reveals the Subcellular Localization of chemical cross-linking and mass spectrometry. Science 337, 1348–1352 (2012). Proteins of Interest. Cell Rep. 15, 1837–1847 (2016). 12. Jennebach, S., Herzog, F., Aebersold, R. & Cramer, P. Crosslinking-MS 44. Sastri, M. et al. A kinase interacting protein (AKIP1) is a key regulator of analysis reveals RNA polymerase I domain architecture and basis of rRNA cardiac stress. Proc. Natl Acad. Sci. USA 110, E387–E396 (2013). cleavage. Nucleic Acids Res. 40, 5591–5601 (2012). 45. Kean, M. J., Couzens, A. L. & Gingras, A. C. Mass spectrometry approaches to 13. Roux, K. J., Kim, D. I., Raida, M. & Burke, B. A promiscuous biotin ligase study mammalian kinase and phosphatase associated proteins. Methods 57, fusion protein identifies proximal and interacting proteins in mammalian 400–408 (2012). cells. J. Cell Biol. 196, 801–810 (2012). 46. Choi-Rhee, E., Schulman, H. & Cronan, J. E. Promiscuous protein 14. Rhee, H. W. et al. Proteomic mapping of mitochondria in living cells via biotinylation by Escherichia coli biotin protein ligase. Protein Sci. 13, spatially restricted enzymatic tagging. Science 339, 1328–1331 (2013). 3043–3050 (2004). 15. Kim, D. I. et al. Probing nuclear pore complex architecture with proximity- 47. Schultz, P. et al. Molecular structure of human TFIIH. Cell 102, 599–607 dependent biotinylation. Proc. Natl Acad. Sci. USA 111, E2453–E2461 (2014). (2000). 16. Ward, R. J., Alvarez-Curto, E. & Milligan, G. Using the Flp-In T-Rex system to 48. He, Y. et al. Near-atomic resolution visualization of human transcription regulate GPCR expression. Methods Mol. Biol. 746,21–37 (2011). opening. Nature 533, 359–365 (2016). 17. Kidder, B. L., Hu, G. & Zhao, K. ChIP-Seq: technical considerations for 49. Lolli, G., Lowe, E. D., Brown, N. R. & Johnson, L. N. The crystal structure of obtaining high-quality data. Nat. Immunol. 12, 918–922 (2011). human CDK7 and its protein recognition properties. Structure 12, 2067–2079 18. Leitner, A., Faini, M., Stengel, F. & Aebersold, R. Crosslinking and mass (2004). spectrometry: an integrated technology to understand the structure and 50. Yankulov, K. Y. & Bentley, D. L. Regulation of CDK7 substrate specificity by function of molecular machines. Trends Biochem. Sci. 41,20–32 (2016). MAT1 and TFIIH. EMBO J. 16, 1638–1646 (1997). 19. Yao, Z. et al. A global analysis of the receptor tyrosine kinase-protein 51. Giglia-Mari, G. et al. A new, tenth subunit of TFIIH is responsible for the phosphatase interactome. Mol. Cell 65, 347–360 (2017). DNA repair syndrome trichothiodystrophy group A. Nat. Genet. 36, 714–719 20. Hesketh, G. G., Youn, J. Y., Samavarchi-Tehrani, P., Raught, B. & Gingras, A. (2004). C. Parallel exploration of interaction space by bioid and affinity purification 52. Drapkin, R. et al. Dual role of TFIIH in DNA excision repair and in coupled to mass spectrometry. Methods Mol. Biol. 1550, 115–136 (2017). transcription by RNA polymerase II. Nature 368, 769–772 (1994). 21. Lambert, J. P., Tucholska, M., Go, C., Knight, J. D. & Gingras, A. C. Proximity 53. Shiekhattar, R. et al. Cdk-activating kinase complex is a component of human biotinylation and affinity purification are complementary approaches for the transcription factor TFIIH. Nature 374, 283–287 (1995). interactome mapping of chromatin-associated protein complexes. J. Proteom. 54. Rossignol, M., Kolb-Cheynel, I. & Egly, J. M. Substrate specificity of the cdk- 118,81–94 (2015). activating kinase (CAK) is altered upon association with TFIIH. EMBO J. 16, 22. Varjosalo, M. et al. The protein interaction landscape of the human CMGC 1628–1637 (1997). kinase group. Cell Rep. 3, 1306–1320 (2013). 55. Maeshima, K. et al. The physical size of transcription factors is key to 23. Rubin, D. M. & Finley, D. Proteolysis. The proteasome: a protein-degrading transcriptional regulation in chromatin domains. J. Phys. Condens. Matter 27, organelle? Curr. Biol. 5, 854–858 (1995). 064116 (2015). 24. Kachur, T. M. & Pilgrim, D. B. Myosin assembly, maintenance and 56. Tsai, K. L. et al. Subunit architecture and functional modular rearrangements degradation in muscle: Role of the chaperone UNC-45 in myosin thick of the transcriptional mediator complex. Cell 157, 1430–1444 (2014). filament dynamics. Int. J. Mol. Sci. 9, 1863–1875 (2008). 57. Robinson,P.J.,Bushnell,D.A.,Trnka,M.J.,Burlingame,A.L.&Kornberg,R.D. 25. Livneh, I., Cohen-Kaplan, V., Cohen-Rosenzweig, C., Avni, N. & Ciechanover, A. Structure of the mediator head module bound to the carboxy-terminal The life cycle of the 26S proteasome: from birth, through regulation and domain of RNA polymerase II. Proc. Natl Acad. Sci. USA 109,17931–17935 function, and onto its death. Cell Res. 26,869–885 (2016). (2012). 26. Zhu, L. et al. CellWhere: graphical display of interaction networks organized 58. Lariviere, L. et al. Model of the Mediator middle module based on protein on subcellular localizations. Nucleic Acids Res. 43, W571–W575 (2015). cross-linking. Nucleic Acids Res. 41, 9266–9273 (2013). 27. Varjosalo, M. & Taipale, J. Hedgehog: functions and mechanisms. Dev. 59. Tsai, K. L. et al. Mediator structure and rearrangements required for 22, 2454–2472 (2008). holoenzyme formation. Nature 544, 196–201 (2017). 28. Adachi, A. et al. Golgi-associated GSK3beta regulates the sorting process of 60. Tsai, K. L. et al. A conserved Mediator-CDK8 kinase module association post-Golgi membrane trafficking. J. Cell. Sci. 123, 3215–3225 (2010). regulates Mediator-RNA polymerase II interaction. Nat. Struct. Mol. Biol. 20, 29. Taelman, V. F. et al. Wnt signaling requires sequestration of glycogen synthase 611–619 (2013). kinase 3 inside multivesicular endosomes. Cell 143, 1136–1148 (2010). 61. Schneider, E. V. et al. The structure of CDK8/CycC implicates specificity in 30. Gross, J. C., Chaudhary, V., Bartscherer, K. & Boutros, M. Active Wnt the CDK/cyclin family and reveals interaction with a deep pocket binder. J. proteins are secreted on exosomes. Nat. Cell Biol. 14, 1036–1045 (2012). Mol. Biol. 412, 251–266 (2011). 31. Huotari, J. & Helenius, A. Endosome maturation. EMBO J. 30, 3481–3500 62. Luan, Q. & Nolen, B. J. Structural basis for regulation of Arp2/3 complex by (2011). GMF. Nat. Struct. Mol. Biol. 20, 1062–1068 (2013). 32. Woodman, P. G. Biogenesis of the sorting endosome: the role of Rab5. Traffic 63. Lichty, J. J., Malecki, J. L., Agnew, H. D., Michelson-Horowitz, D. J. & Tan, S. 1, 695–701 (2000). Comparison of affinity tags for protein purification. Protein Expr. Purif. 41, 33. Chan, P. K. et al. BioID data of c-MYC interacting protein partners in cultured 98–105 (2005). cells and xenograft tumors. Data Brief. 1,76–78 (2014). 64. Glatter, T., Wepf, A., Aebersold, R. & Gstaiger, M. An integrated workflow for 34. Couzens, A. L. et al. Protein interaction network of the mammalian Hippo charting the human interaction proteome: insights into the PP2A system. Mol. pathway reveals mechanisms of kinase-phosphatase interactions. Sci. Signal. 6, Syst. Biol. 5, 237 (2009). rs15 (2013). 65. Teo, G. et al. SAINTexpress: improvements and additional features in 35. Kim, D. I. et al. An improved smaller biotin ligase for BioID proximity Significance Analysis of INTeractome software. J. Proteom. 100,37–43 (2014). labeling. Mol. Biol. Cell 27, 1188–1196 (2016). 66. Choi, H. et al. SAINT: probabilistic scoring of affinity purification-mass 36. Liyanage, S. U. et al. Characterizing the mitochondrial DNA polymerase spectrometry data. Nat. Methods 8,70–73 (2011). gamma interactome by BioID identifies Ruvbl2 localizes to the mitochondria. 67. Mellacheruvu, D. et al. The CRAPome: a contaminant repository Mitochondrion 32,31–35 (2017). for affinity purification-mass spectrometry data. Nat. Methods 10, 730–736 37. Schweingruber, C., Soffientini, P., Ruepp, M. D., Bachi, A. & Muhlemann, O. (2013). Identification of Interactions in the NMD Complex Using Proximity- 68. Shannon, P. et al. Cytoscape: a software environment for integrated Dependent Biotinylation (BioID). PLoS ONE 11, e0150239 (2016). models of biomolecular interaction networks. Genome Res. 13, 2498–2504 38. Andjelkovic, A. et al. Diiron centre in Ciona intestinalis alternative (2003). oxidase abolish enzymatic activity and prevent rescue of cytochrome oxidase 69. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, deficiency in flies. Sci. Rep. 5, 18295 (2015). individualized p.p.b.-range mass accuracies and proteome-wide protein 39. Fernandez-Ayala, D. J. et al. Expression of the Ciona intestinalis alternative quantification. Nat. Biotechnol. 26, 1367–1372 (2008). oxidase (AOX) in Drosophila complements defects in mitochondrial oxidative 70. Orchard, S. et al. The MIntAct project—IntAct as a common curation phosphorylation. Cell. Metab. 9, 449–460 (2009). platform for 11 molecular interaction databases. Nucleic Acids Res. 42, 40. Szibor, M. et al. Broad AOX expression in a genetically tractable mouse model D358–D363 (2014). does not disturb normal . Dis. Model Mech. 10, 163–171 (2017). 41. Hakkaart, G. A., Dassa, E. P., Jacobs, H. T. & Rustin, P. Allotopic expression of a mitochondrial alternative oxidase confers cyanide resistance to human cell Acknowledgements respiration. EMBO Rep. 7, 341–345 (2006). We thank Sini Miettinen for technical assistance and Drs Salla Keskitalo and Petri 42. Wilkens, V., Kohl, W. & Busch, K. Restricted diffusion of OXPHOS complexes Auvinen for critical reading and comments on the manuscript. Imaging was performed in dynamic mitochondria delays their exchange between cristae and engenders at the Light Microscopy Unit, Institute of Biotechnology. This study was supported by a transitory mosaic distribution. J. Cell Sci. 126, 103–116 (2013). grants from the Academy of Finland (nos. 288475 and 294173), Sigrid Jusélius

NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications 15 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03523-2

Foundation, University of Helsinki Three-year Research Grant, Biocentrum Helsinki, Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in Biocentrum Finland, HiLIFE and Instrumentarium Research Foundation. published maps and institutional affiliations.

Author contributions Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, M.V. and X.L. conceived the study and designed experiments. M.V., X.L., K.S., Y.J., R.W., adaptation, distribution and reproduction in any medium or format, as long as you give T.Ö. and F.T. performed experiments and data analysis. M.V., X.L., K.S. and F.T. par- appropriate credit to the original author(s) and the source, provide a link to the Creative ticipated in manuscript preparation. M.V. and X.L. wrote the manuscript. Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless Additional information indicated otherwise in a credit line to the material. If material is not included in the Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467- article’s Creative Commons license and your intended use is not permitted by statutory 018-03523-2. regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/ Competing interests: The authors declare no competing interests. licenses/by/4.0/.

Reprints and permission information is available online at http://npg.nature.com/ reprintsandpermissions/ © The Author(s) 2018

16 NATURE COMMUNICATIONS | (2018) 9:1188 | DOI: 10.1038/s41467-018-03523-2 | www.nature.com/naturecommunications