Disparate binding kinetics by an intrinsically disordered domain enables temporal regulation of transcriptional complex formation

Neil O. Robertson^a, Ngaio C. Smith^a, Athina Manakas^a, Mahiar Mahjoub^a, Gordon McDonald^b, Ann H. Kwan^a, and Jacqueline M. Matthews^a,1

^aSchool of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia; and ^bCentre for Translational Data Science, University of Sydney, Sydney, NSW 2006

Edited by G. Marius Clore, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, MD, and approved March 26, 2018 (received for review August 18, 2017)

Intrinsically disordered regions are highly represented among mammalian transcription factors, where they often contribute to the formation of multiprotein complexes that regulate gene expression. An example of this occurs with LIM-homeodomain (LIM-HD) proteins in the developing spinal cord. The LIM-HD protein LHX3 and the LIM-HD cofactor LDB1 form a binary complex that gives rise to interneurons, whereas in adjacent cell populations, LHX3 and LDB1 form a rearranged ternary complex with the LIM-HD protein ISL1, resulting in motor neurons. The protein–protein interactions within these complexes are mediated by ordered LIM domains in the LIM-HD proteins and intrinsically disordered LIM interaction domains (LIDs) in LDB1 and ISL1; however, little is known about how the strength or rates of binding contribute to complex assemblies. We have measured the interactions of LIM:LID complexes using FRET-based protein–protein interaction studies and EMSAs and used these data to model population distributions of complexes. The protein–protein interactions within the ternary complexes are much weaker than those in the binary complex, yet surprisingly slow LDB1:ISL1 dissociation kinetics and a substantial increase in DNA binding affinity promote formation of the ternary complex over the binary complex in motor neurons. We have used mutational and protein engineering approaches to show that allostery and modular binding by tandem LIM domains contribute to the LDB1LID binding kinetics. The data indicate that a single intrinsically disordered region can achieve highly disparate binding kinetics, which may provide a mechanism to regulate the timing of transcriptional complex assembly.

intrinsically disordered proteins | binding kinetics | protein–protein interactions | protein–DNA interactions | transcriptional regulation

Significance

Different combinations and permutations of transcription factors work together to regulate the expression of target genes. These proteins often contain high levels of intrinsically disordered regions, which are important mediators of protein–protein interactions. We show that unusual binding kinetics associated with an intrinsically disordered region in a transcriptional coregulator can regulate the formation of transcriptional complexes that lead to the specification of neuronal cell subtypes. Notably, a single intrinsically disordered region shows selective differences in binding kinetics for proteins of the same family, which have implications for how intrinsic disorder contributes to regulatory processes and complexity in higher organisms.

Author contributions: N.O.R., N.C.S., A.H.K., and J.M.M. designed research; N.O.R., N.C.S., and A.M. performed research; N.O.R., N.C.S., A.M., M.M., G.M., A.H.K., and J.M.M. analyzed data; and N.O.R., N.C.S., A.M., A.H.K., and J.M.M. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Published under the PNAS license.
^1To whom correspondence should be addressed. Email: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1714646115/-/DCSupplemental.

Intrinsically disordered regions (IDRs) are protein domains that lack a well-defined 3D structure. IDRs are frequently involved in making protein–protein interactions. It has been suggested that interactions involving IDRs offer many advantages over those involving solely structured domains, including the combination of high specificity with low affinity (Kd), increased sensitivity to environmental conditions and posttranslational modifications, increased flexibility, and the ability to provide a hub for multiple interactions (1, 2). Studies of IDR binding kinetics suggest that disorder can increase both association and dissociation rate constants (kon and koff, respectively) (3–6). Eukaryotic transcription factors are highly enriched in IDRs compared with other eukaryotic proteins and prokaryotic transcription factors (7). Despite their biological importance, relatively few studies report quantitative data for IDR-mediated interactions (8).

The LIM-homeodomain (LIM-HD) and LIM-only (LMO) proteins provide a model for the role of IDRs in transcriptional complexes. All LIM-HD/LMO proteins contain two LIM domains (LIM1+2) arrayed in tandem that take part in protein–protein interactions. LIM-HD proteins also contain a central DNA binding homeodomain (HD) and an intrinsically disordered C-terminal domain (Fig. 1A). LIM domain binding protein 1 (LDB1) interacts with all LIM-HD/LMO proteins through a disordered LIM interaction domain (LID) (Fig. 1A), which folds on binding to LIM1+2 domains to form extended modular complexes (9–11). Competition for LDB1 by LIM-HD/LMO proteins contributes to a so-called transcriptional “LIM code” that helps determine cell fate in the developing spinal cord (12). A binary complex comprising LDB1 bound to the LIM-HD protein LHX3 triggers differentiation of V2-interneurons (V2-INs) (Fig. 1B) (13). In neighboring cells, a ternary complex is formed comprising LDB1, LHX3, and a second LIM-HD protein, ISL1. Here, LDB1LID contacts ISL1LIM1+2, forcing LHX3LIM1+2 to bind a LID in the C-terminal domain of ISL1. The ternary complex triggers differentiation of spinal motor neurons (sMNs) (Fig. 1C) (13, 14). Paralogues ISL2 and LHX4 can form similar complexes in developing sMNs (Fig. 1 B and C) (15–17). Despite low sequence identity (Fig. 1D), LDB1LID and ISL1/2LID form very similar structures when in complex with their partners (11). sMN progenitors also express LMO4, which has been shown to inhibit the formation of the binary complex over the ternary complex by competing with LHX3/4 for LDB1 (18, 19).

A lack of quantitative protein–protein and protein–DNA binding data means that it is unclear what mechanism governs the regulation of ternary and binary complexes in sMNs. Efforts to quantify LID:LIM1+2 interactions were hampered, as the

LIM1+2 domains aggregate when expressed in isolation. However, engineered “tethered complexes,” comprising the LIM domains fused to LIDs via a flexible linker, are soluble and stable (10, 20). We recently developed a FRET-based solution method to study these interactions (21). Monomeric FRET protein pairs mYPet and mECFP are fused to the termini of the interacting domains within a tethered complex that has a human rhinovirus 3C protease (HRV 3C)-cleavable peptide linker. After proteolysis of the linker, the loss of FRET by competition with a nonfluorescent LID peptide or by dilution of the complex is monitored (Fig. 2A), allowing the determination of a broad range of values for Kd, koff, and inferred kon.

Here, we report quantification of the protein–protein interactions involved in the sMN LIM code using solution FRET-based methods. The binding kinetics of the LDB1:ISL1/2 interactions are much slower than those of the other LID:LIM1+2 interactions in the system. We combined these data with measurements of protein–DNA interactions using EMSAs and modeled the changes in populations of complexes. The formation of sMN ternary complexes is likely to be reliant on both the unusual kinetics of LDB1 binding by the different LIM-HD proteins and the higher affinity of the ternary complex for its target DNA sites. Binding data for mutants and single LIM domains suggest that small differences in related ordered LIM domains can modulate the LDB1LID binding mechanism to produce highly disparate binding kinetics for a single IDR.

www.pnas.org/cgi/doi/10.1073/pnas.1714646115 | PNAS Latest Articles

Fig. 1. Protein–protein and protein–DNA interactions in the LIM code. (A) Domain structures of the LIM-HD/LMO proteins and LDB1: tandem LIM domains (LIM1+2), HD, LID, and self-association (SA) domain. The LID (broken box) in LIM-HDs has only been found in ISL1/2. (B) LDB1 and LHX3/4 form a binary complex to regulate V2-IN development. (C) LDB1 binds ISL1/2, which in turn, binds LHX3/4 to form a ternary complex that regulates sMN development. (D) Structure-based sequence alignment of the LDB1 and ISL1/2 LIDs.

Results

LID:LIM1+2 Interactions Have Disparate Affinities and Kinetics. We measured Kd and koff values for complex formation between the LIDs from LDB1, ISL1, and ISL2 and the LIM1+2 domains from ISL1, ISL2, LHX3, and LHX4 (Table 1). The homologous competition approach, in which MBP-LDB1LID was titrated into cleaved fluorescent complexes, was used to estimate the stronger binding affinities (Kd ≤ 2 nM) (Fig. 2B), and dilution experiments were used to estimate weaker binding affinities (Kd > 10 nM) (Fig. 2C). koff values were determined by titrating with a large excess of MBP-LDB1LID (Fig. 2D). koff values for slowly dissociating complexes (koff < 10^−5 s^−1) were fitted using a fixed final FRET efficiency taken from the faster (koff ∼ 10^−4 s^−1) experiments.

Fig. 2. Highly variable binding affinities and kinetics occur in the protein–protein and protein–DNA interactions of the LIM code. (A) Protein fusion constructs and assay design for the FRET-based interaction studies. (B–D) Representative LID:LIM1+2 FRET-based competition, dilution, and dissociation assays, respectively (n = 3–7). Coloring is consistent across these panels. (E) Relationship of experimentally determined koff and inferred kon to Kd for the LID:LIM1+2 interactions. (F and G) Representative EMSAs of the LHX3HD and 2HDLL constructs binding to DNAb and DNAt, respectively. Construct schematics and coloring are consistent with Fig. 1C. Gels were imaged using fluorescein-labeled oligonucleotides. (H) EMSA binding curves for LHX3HD and 2HDLL constructs binding to DNAb/i/t. Curves are representative of replicate experiments (n = 2–3).

The binding affinities of this network of like interactions span six orders of magnitude from the weakest intramolecular LID:LIM1+2 interactions in the ISL1/2 proteins (Kd ∼ 10^−4–10^−5 M) through the intermediate LDB1LID:ISL1/2LIM1+2 and ISL1/2LID:LHX3/4LIM1+2 (Kd ∼ 10^−7–10^−8 M) to the strongest LDB1LID:LHX3/4LIM1+2 interactions (Kd ≤ 10^−9 M), which were similar to that previously reported for LDB1LID:LMO4LIM1+2 (21). The measured koff values and inferred kon values varied by up to four orders of magnitude (Table 1 and Fig. S1 A–C). It is generally

expected that differences in Kd in like systems will be heavily influenced by differences in koff. This holds true for the measured interactions of LHX3/4LIM1+2 and LMO4 with LDB1LID and ISL1/2LID, as a linear trend line fits well to Kd and koff for those interactions (Fig. 2E). However, the values for ISL1/2LIM1+2 and LMO2 binding to LDB1LID cluster together, well away from that trend line, due to large differences in kon.

Binary and Ternary Complex Mimetics Have High DNA Binding Affinity and Specificity. We used EMSAs to assess binding of HD-containing protein constructs to DNA, as more quantitative methods could not clearly distinguish between higher-affinity (presumably specific) and low-affinity (probably nonspecific) binding events (22). Nontagged isolated HDs from ISL1 and LHX3 were used to assess binding by individual proteins, whereas GST-tagged proteins and engineered fusion constructs provided models for dimeric binding. The oligonucleotide sequences used contain binding sites for ISL1, LHX3, and the ternary complex (henceforth referred to as DNAi, DNAb, and DNAt, respectively).

A surprising outcome was that the ISL1HD bound poorly to all oligonucleotides tested (Fig. S2 A–C). Kd estimates derived from the disappearance of the free oligonucleotide indicate that binding is in the range of Kd values approximately micromolar or higher (Tables 1 and 2 and Fig. S1D), which is at the limit of detection by EMSA. Use of a GST-ISL1HD fusion had no effect on DNA binding affinity (Table 2). These results are consistent with genome occupancy studies, showing that in vivo DNA binding by ISL1 is dictated by its partner proteins (23, 24). In contrast, the isolated LHX3HD bound strongly to DNAb but poorly to the other oligonucleotides (Fig. 2 F and H and Table 2). GST-LHX3HD showed improved binding for DNAt but reduced binding to DNAb (Tables 1 and 2 and Fig. S2 D–F), highlighting the potential for binding artifacts in experimental systems (25). Incubation of both ISL1HD and LHX3HD with DNAt did not improve binding over the single proteins, suggesting that the isolated HDs do not bind cooperatively (Fig. S2G).

Fusion constructs mimicked the ternary complex: 2HDLL comprised ISL1HD-LID and LHX3LIM1+2-HD, whereas 2HDN comprised ISL1HD and LHX3HD. 2HDLL showed strong binding to DNAt and some evidence of weaker, probably nonspecific binding to DNAi and DNAb (Fig. 2 G and H, Table 2, and Fig. S2 H and I). The 2HDN fusion protein behaves similarly, albeit with slightly higher Kd for DNAt (Tables 1 and 2 and Fig. S2 J–L).

Formation of Binary Complex over Ternary Complex Is Favored at Equilibrium. Competing protein–protein and protein–DNA binding events were simulated based on their equilibrium binding constants (Fig. S3). We used species concentrations of 1 nM to 100 μM to capture the common range of in vivo transcription factor concentrations (SI Materials and Methods), as nuclear concentrations of these proteins are not known. For each simulation, the total concentrations for all components were identical. LDB1 self-association and intramolecular ISL1/2 LID:LIM1+2 interactions were excluded from modeling.

The equilibrium states of LDB1, ISL1, and LHX3 were modeled in the absence of DNA. Here, the binary LDB1:LHX3 complex was the most populated species at all concentrations (Fig. 3A). Adding DNA showed varied effects. For V2-IN simulations (no ISL1/2), the binary complex bound its specific DNAb site with few off-target interactions (Fig. 3B and Fig. S4A). For sMN simulations, the presence of DNAt increased ternary complex formation relative to simulations without DNA, but the inclusion of DNAb led to a decrease in the ternary complex and an increase in the binary complex (Fig. 3B and Fig. S4 A and B). LMO4 is known to protect the ternary complex during sMN development by preferentially disrupting the binary complex (13, 18, 19). The addition of LMO4 to the equilibrium modeling resulted in the preferential formation of an LDB1:LMO4 complex, similar decreases in the ternary and binary complexes, and an increase in the transcriptionally unproductive ISL1:LHX3 complex (Fig. 3A and Fig. S4 B and C).

Sequestration of LDB1 by ISL1 Provides a Basis for Ternary Complex Formation. We hypothesized that the unusual binding kinetics that we detected would resolve the apparent contradictions between in vivo data and our equilibrium modeling and investigated the effects of sequestration of LDB1 by ISL1. In these “free” simulations, LDB1 and ISL1 were equilibrated before the addition of LHX3, leading to rapid formation of the ternary complex followed by slower dissociation of LDB1:ISL1 and formation of the binary complex (Fig. 3C and Fig. S5 B and C). At 1 μM protein concentrations, the binary complex took 125 min to overtake the concentration of the ternary complex (henceforth referred to as the cross-over time; tC) (Fig. 3C). As Lmo4 is up-regulated by the ternary complex (26, 27), we performed simulations in which LMO4 was added at intervals after the introduction of LHX3. Addition of LMO4 30–120 min after LHX3 caused a more than twofold increase in the tC (Fig. 3C and Fig. S5D). That is, addition of LMO4 temporarily recreated the preferential disruption of the binary complex in both this kinetic model and in vivo.

To further investigate LDB1 sequestration in this system, additional simulations were performed assuming that LDB1 and ISL1 were fully complexed before addition of LHX3. These “complexed” simulations showed substantially increased tC values at low protein concentrations (≤1 μM) but minimal effect at higher concentrations (≥10 μM) (Fig. 3D). Addition of LMO4 again increased tC values (Fig. 3D). Equilibrium modeling using unequal protein concentrations showed that the highest ternary:binary ratio occurred with excess ISL1 and that the lowest

Table 1. Binding parameters for protein–protein interactions in the LIM code

LID    LIM1+2   Kd (M)                 koff (s^−1)             kon (M^−1 s^−1)
LDB1   ISL1     2.9 ± 0.3 × 10^−8      2.86 ± 0.02 × 10^−5     1.0 ± 0.1 × 10^3
LDB1   ISL2     2.5 ± 0.1 × 10^−8      1.88 ± 0.02 × 10^−5     7.6 ± 0.3 × 10^2
LDB1   LHX3     4.1 ± 0.4 × 10^−10     6.3 ± 0.1 × 10^−4       1.5 ± 0.2 × 10^6
LDB1   LHX4     1.0 ± 0.1 × 10^−9      1.25 ± 0.01 × 10^−3     1.3 ± 0.1 × 10^6
LDB1   LMO4     1.8 ± 0.1 × 10^−9*     6.3 ± 0.02 × 10^−4*     3.5 ± 0.2 × 10^5*
ISL1   ISL1     1.7 ± 0.3 × 10^−4      N.D.                    N.D.
ISL1   LHX3     2.7 ± 0.3 × 10^−7      2.9 ± 0.3 × 10^−1       1.1 ± 0.2 × 10^6
ISL1   LHX4     5.4 ± 0.5 × 10^−8      6.8 ± 0.1 × 10^−2       1.3 ± 0.1 × 10^6
ISL2   ISL2     4.4 ± 0.7 × 10^−5      N.D.                    N.D.
ISL2   LHX3     1.6 ± 0.3 × 10^−7      2.2 ± 0.1 × 10^−1       1.4 ± 0.4 × 10^6
ISL2   LHX4     3.3 ± 0.5 × 10^−8      5.8 ± 0.7 × 10^−2       1.8 ± 0.3 × 10^6

LID:LIM1+2 Kd and koff values and inferred kon were determined using FRET (n = 3–7). Constants represent the mean ± SEM from replicate experiments. ISL1/2 LID:LIM1+2 binding kinetics were not determined (N.D.).
*Data for LDB1LID:LMO4LIM1+2 were reported in ref. 25.

Table 2. Binding parameters for protein–DNA interactions in the LIM code

HD            DNAi Kd (M)          DNAb Kd (M)           DNAt Kd (M)
ISL1HD        5.9 ± 0.3 × 10^−6    2.0 ± 0.9 × 10^−6     3 ± 2 × 10^−6
GST-ISL1HD    6 ± 5 × 10^−6        4 ± 4 × 10^−6         >2 × 10^−6*
LHX3HD        2 ± 1 × 10^−6        9 ± 3 × 10^−9         4 ± 3 × 10^−6
GST-LHX3HD    >2 × 10^−6*          5 ± 3 × 10^−6         5 ± 5 × 10^−7
2HDN          5 ± 2 × 10^−7        4.4 ± 0.9 × 10^−7     1.6 ± 0.4 × 10^−7
2HDLL         7 ± 5 × 10^−7        4 ± 2 × 10^−6         2.4 ± 0.3 × 10^−8

Kd values were determined using EMSA (n = 2–3). Constants represent the mean ± SEM from replicate experiments.
*Data were more qualitative due to poor resolution of bands in some repeats (Fig. S2).
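The "inferred kon" column of Table 1 can be checked against the measured Kd and koff values. The sketch below assumes a simple two-state 1:1 binding model, in which kon = koff/Kd; the section does not spell out this relation, so treat it as an assumption rather than the authors' stated procedure. The dictionary and function names are illustrative only, and only the mean values from Table 1 are used.

```python
# Mean values transcribed from Table 1.
# Key: (LID, LIM1+2 partner); value: (Kd in M, koff in s^-1, reported kon in M^-1 s^-1)
TABLE_1 = {
    ("LDB1", "ISL1"): (2.9e-8, 2.86e-5, 1.0e3),
    ("LDB1", "ISL2"): (2.5e-8, 1.88e-5, 7.6e2),
    ("LDB1", "LHX3"): (4.1e-10, 6.3e-4, 1.5e6),
    ("LDB1", "LHX4"): (1.0e-9, 1.25e-3, 1.3e6),
    ("ISL1", "LHX3"): (2.7e-7, 2.9e-1, 1.1e6),
    ("ISL1", "LHX4"): (5.4e-8, 6.8e-2, 1.3e6),
    ("ISL2", "LHX3"): (1.6e-7, 2.2e-1, 1.4e6),
    ("ISL2", "LHX4"): (3.3e-8, 5.8e-2, 1.8e6),
}


def inferred_kon(kd, koff):
    """Association rate constant under an assumed two-state 1:1 model."""
    return koff / kd


def compare_with_table(table=TABLE_1):
    """Return (pair, inferred kon, reported kon, inferred/reported) per row."""
    rows = []
    for pair, (kd, koff, reported) in table.items():
        kon = inferred_kon(kd, koff)
        rows.append((pair, kon, reported, kon / reported))
    return rows


if __name__ == "__main__":
    for (lid, lim), kon, reported, ratio in compare_with_table():
        print(f"{lid}LID:{lim}LIM1+2  inferred {kon:.2e}  "
              f"reported {reported:.1e}  ratio {ratio:.2f}")
```

Under this assumption every row agrees with the tabulated kon to within rounding. The roughly thousandfold gap between LDB1LID binding to ISL1/2LIM1+2 and to LHX3/4LIM1+2 is therefore a difference in association rate rather than dissociation rate alone.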

occurred with excess LDB1 (Fig. S4D). Therefore, sequestration of LDB1 by ISL1 would allow a temporally stable ternary complex to form on addition or expression of LHX3.

To partially corroborate these results, we performed a competition native gel shift assay in which cleaved nonfluorescent complexes (LHX3LIM1+2:LDB1LID, GB1-ISL1LIM1+2:LDB1LID, and MBP-LMO4LIM1+2:LDB1LID) in various combinations were incubated with mYPet-LDB1LID for either 5 or 100 min before separation on a native gel. The difference in molecular masses of the tags (GB1 ∼ 11 kDa, MBP ∼ 42 kDa) separates complexes that contain ISL1LIM1+2, LHX3LIM1+2, or LMO4LIM1+2. Here, persistent binding of ISL1LIM1+2 to LDB1LID was evident when mYPet-LDB1LID was incubated with GB1-ISL1LIM1+2 for long periods before the addition of LHX3LIM1+2 and MBP-LMO4LIM1+2 (Fig. 3E), which is consistent with sequestration of LDB1 by ISL1 temporarily preventing the formation of other LDB1 complexes.

Allostery and Modularity of Tandem LIM Domains Contribute to LDB1LID Binding. We probed the mechanisms of disparate LID:LIM1+2 binding kinetics through several approaches. FRET-based binding experiments for LDB1LID:ISL1LIM1+2 and LDB1LID:LHX3LIM1+2 at varying NaCl concentrations tested the role of electrostatic steering. If this phenomenon was a major contributor to association, increased ionic strength would lead to a convergence in LDB1LID kon values for ISL1LIM1+2 and LHX3LIM1+2 (28). No such convergence was observed (Fig. S6 A–F and Table S2).

The sequence and structure of the LID:LIM1+2 binding interface and the overall folds of the complexes are very similar between the different LIM1+2 domains (Fig. S6 G, H, and K). A comparison of LIM1+2 sequences revealed five sites that correlated with binding kinetics, although these residues did not directly contact LDB1LID (Fig. 4A and Fig. S6K). We made chimeric LIM1+2 domains that swapped the residues between the slow-binding LMO2 and the fast-binding LMO4. The chimeras had WT-like LDB1LID Kd values, but their koff values converged, such that the inferred LDB1LID kon of the chimeric LMO4LIM1+2 was ∼7-fold higher than that of the chimeric LMO2LIM1+2 compared with ∼170-fold for the WT domains (Fig. 4B, Fig. S6 I and J, and Table S3). These data indicated that the targeted sequence differences are important, but additional factors must contribute to disparate kinetics.

We measured 15N relaxation data of high kon LDB1LID:LHX3LIM1+2 and low kon LDB1LID:LMO2LIM1+2 for which we have partial NMR assignments to identify specific regions that might contribute to differences in binding. Overall, the datasets are very similar (Fig. S7A), but we could not assign peaks for part of a β-strand in LMO2LIM1 that contacts LDB1LID, suggesting that conformational exchange may be present. This region is highly ordered in LMO4LIM1 (9), and comparisons of X-ray crystal structures suggest that a loop in this region may be more flexible than the rest of LMO2LIM1+2 and LMO4LIM1+2 (Fig. S7 C and D). Given these apparent differences in the LIM1 domains from the fast and slow complexes, we investigated the contributions of the separate LIM domains from LMO2 and LMO4 for binding to LDB1LID. LDB1LID:LMO2LIM1 had the lowest Kd of the individual LIM domains (Kd = 110 ± 10 nM)

Fig. 3. Equilibrium and kinetic modeling of the LIM code. (A) Complex formation in the sMN system at equilibrium modeled across various protein concentrations. (B) Comparison of the ternary and binary complexes bound to their specific DNA targets in the presence of different DNA sequences in the V2-IN (LDB1, LHX3, and DNAb) and sMN (LDB1, ISL1, LHX3, and DNAt) systems. Points and error bars represent the complex concentration calculated using the Kd and the range calculated from Kd ± SEM. (C) Kinetic modeling of the binary and ternary complexes after the addition of LHX3 to a preequilibrated LDB1:ISL1 complex. LMO4 was added 30, 60, 120, 150, or 200 min after the addition of LHX3. Each species concentration was set at 1 μM. (D) Comparison of tC values from free and complexed simulations across different protein concentrations. (E) Competition native gel where mYPet-LDB1LID was incubated with cleaved LDB1LID:LIM1+2 complexes for the indicated times. The gel was imaged using mYPet fluorescence.

Fig. 4. Investigation of the sequence and structural basis of disparate LDB1LID binding properties by different LIM1+2 domains. (A) Residues in the LIM1+2 domains (gray) found to correlate with the differential LDB1LID (cyan) binding are highlighted in red on the crystal structures of LDB1LID:LMO2LIM1+2 (Upper; ID code 2XJY) and LDB1LID:LMO4LIM1+2 (Lower; Protein Data Bank ID code 1RUT). (B) Inferred LDB1LID kon values for WT and chimeric LMO2/4LIM1+2. (C) Single LIM FRET dilution assays for LDB1LID:LMO2/4LIM1/2. (D) Comparison of ΔG° for the LDB1LID:LMO2/4LIM1/2/1+2 interactions.
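The "free" kinetic simulation behind Fig. 3C (LDB1 and ISL1 preequilibrated, then LHX3 added) can be approximated with a small mass-action model. The sketch below is not the authors' MATLAB model, whose equations are in SI Materials and Methods. It is a simplified stand-in that assumes: the mean rate constants from Table 1; no DNA, no LMO4, no LDB1 self-association, and no intramolecular ISL1 LID:LIM1+2 interaction; the two interfaces in the ternary complex behaving independently with the same rate constants as in the pairwise complexes; and plain fixed-step Euler integration. All names are illustrative.

```python
import math

# Mean rate constants from Table 1 (kon in M^-1 s^-1, koff in s^-1).
KON_LX, KOFF_LX = 1.5e6, 6.3e-4    # LDB1LID:LHX3LIM1+2 (binary complex)
KON_LI, KOFF_LI = 1.0e3, 2.86e-5   # LDB1LID:ISL1LIM1+2
KON_IX, KOFF_IX = 1.1e6, 2.9e-1    # ISL1LID:LHX3LIM1+2


def preequilibrate(total_l, total_i, kd):
    """Closed-form 1:1 equilibrium: concentration of the LDB1:ISL1 complex."""
    s = total_l + total_i + kd
    return (s - math.sqrt(s * s - 4.0 * total_l * total_i)) / 2.0


def crossover_time(total=1e-6, dt=0.05, t_max=40000.0):
    """Time (s) at which binary LDB1:LHX3 overtakes the ternary complex.

    L = LDB1, I = ISL1, X = LHX3. Returns None if no cross-over occurs
    before t_max.
    """
    li = preequilibrate(total, total, KOFF_LI / KON_LI)
    l = i = total - li
    x, lx, ix, lix = total, 0.0, 0.0, 0.0   # LHX3 added at t = 0
    ternary_led = False
    t = 0.0
    while t < t_max:
        r1 = KON_LX * l * x - KOFF_LX * lx     # L + X  <-> LX  (binary)
        r2 = KON_LI * l * i - KOFF_LI * li     # L + I  <-> LI
        r3 = KON_IX * i * x - KOFF_IX * ix     # I + X  <-> IX
        r4 = KON_IX * li * x - KOFF_IX * lix   # LI + X <-> LIX (ternary)
        r5 = KON_LI * l * ix - KOFF_LI * lix   # L + IX <-> LIX (ternary)
        l += dt * (-r1 - r2 - r5)
        i += dt * (-r2 - r3)
        x += dt * (-r1 - r3 - r4)
        lx += dt * r1
        li += dt * (r2 - r4)
        ix += dt * (r3 - r5)
        lix += dt * (r4 + r5)
        t += dt
        if lix > lx:
            ternary_led = True      # ternary complex currently dominates
        elif ternary_led:
            return t                # binary has caught up: cross-over time
    return None


if __name__ == "__main__":
    tc = crossover_time()
    print("no cross-over" if tc is None else f"tC = {tc / 60.0:.0f} min")
```

With 1 μM of each protein this reduced scheme should give a cross-over on the order of two hours, the same scale as the 125 min reported for the full model. The exact value depends on the simplifications listed above and on the step size, so it should not be read as a reproduction of Fig. 3C.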

(Fig. 4C), with koff = 0.174 ± 0.006 s^−1 and an inferred kon = 1.6 ± 0.2 × 10^6 M^−1 s^−1 (Fig. S7E and Table S4). The other single LIM domains all had lower yet similar affinities for LDB1LID (Kd ∼ 5–9 μM) (Fig. 4C and Table S4). The koff values for these interactions could not be measured, as the complexes dissociated within the ∼1 s of dead time of the assay, implying koff ≥ 3 s^−1 (Fig. S7 E and F) and an inferred kon ≥ 10^5 M^−1 s^−1, close to the kon of LMO2LIM1 and LMO4LIM1+2 (∼10^6 M^−1 s^−1) and well above that of LMO2LIM1+2 (10^3 M^−1 s^−1) (Table S4). The equilibrium data indicate that these LID:LIM1+2 interactions are subadditive compared with the individual LIM1 and LIM2 domains, particularly for LMO2 [ΔΔG°([LIM1+LIM2]−LIM1+2) = 24.6 ± 0.6 kJ mol^−1 compared with 8.7 ± 0.4 kJ mol^−1 for LMO4] (Fig. 4D).

Discussion

IDRs can bind their ordered partners with kon values that range from 10^2 to 10^7 M^−1 s^−1 (28). The variation in kon is at least partially determined by the folding propensities of the domain. For example, two IDRs, the transactivation domain (TAD) of c-Myb and the phosphorylated KID (pKID) domain from CREB, bind to the same region of the ordered KIX domain from CREB binding protein (CBP) using mixed “conformational selection and induced folding” and nearly pure “induced folding” mechanisms, respectively (29, 30). The lower kon of CREBpKID relative to c-MybTAD correlates with the lower helical propensity of the pKID sequence (30). In contrast, we have found that a single IDR, LDB1LID, can bind to related ordered partners with inferred kon values that vary by up to three orders of magnitude (Tables 1 and 2). This variation in kon is not based on differences in electrostatic steering (Fig. S6) (21) but instead, points to mechanistic differences in LDB1LID binding conferred by the ordered LIM domains. There is some precedent for ordered proteins influencing the folding and binding of their IDR partners (28). For example, mutations of the ordered MCL-1 open its binding pocket for the disordered BH3 motif from p53 upregulated modulator of apoptosis (PUMA), increasing kon by approximately twofold (31), whereas mutations in CBPKIX changed the transition state of complex formation with c-MybTAD, decreasing kon by less than or equal to twofold (32). The large range of binding kinetics that LDB1LID possesses for related LIM1+2 domains is highly unusual.

Our binding data for single LIM domain interactions from LMO2 and LMO4 suggest that modular binding to the flexible LDB1LID drives the disparate kinetics. For LMO2, affinities for LDB1LID suggest that the first step of binding is by LIM1, with a significant energetic penalty incurred for the subsequent binding of LIM2. The inferred LDB1LID kon of LMO2LIM1+2 is much lower than for LMO2LIM1/2 (Table S4), indicating that the binding of the second module is a rate-limiting step for complex formation. These barriers to association are likely caused by restrictions in the conformational changes available to LDB1 and LMO2. For LMO4, the individual LIM domains interact similarly with LDB1LID, and although an energetic penalty is still incurred in binding both domains, this is smaller than it is for LMO2.

The origin of the differences in LDB1LID binding kinetics may arise from a loop in LIM1 that appears to be flexible in LMO2 (low kon) but not in LMO4 or LHX3 (high kon), but the sequences of the loops are very similar (Fig. S7E). The same loop seems well-ordered in the crystal structure of ISL1 (low kon), but other loops in ISL1LIM2 appear to be more flexible (Fig. S7F), suggesting that the loops surrounding the β-strands that contact LDB1 contribute to binding kinetics. Notably, sequences that correlate with the disparate kinetics and were tested in chimeras tend to cluster away from the binding site, indicating allosteric regulation of LID:LIM1+2 binding. LIM domains are reported to contain extensive hydrogen bonding networks and high flexibility (33), which could facilitate this type of allostery.

The ability of a single IDR to selectively bind ordered partners with different binding kinetics is likely to be important for cellular events. Our kinetic and equilibrium simulations provide a model for how this could occur in the sMN LIM code. In our model, sequestration of LDB1 by ISL1 into a long-lived complex followed by the introduction of LHX3 would lead to the formation of a temporally stable ternary complex (Fig. 5). This timing is consistent with embryonic and induced pluripotent stem cell models of sMN development that show expression of ISL1 several days before LHX3, with both proteins expressed for several days, which would provide time for the transitions predicted in the kinetic model (34, 35). However, additional investigation of the in vivo kinetics of complex formation is required, as the rate constants are likely to differ as a result of factors, including crowded cellular environments and posttranslational modification. Nevertheless, we expect the general trends identified here to occur in vivo given the large relative differences in binding parameters. Note that the kinetic model but not the equilibrium model accounts for favored ternary complex formation and its regulation by LMO4 (13, 19).

Expression of LMO4 and the HD transcription factor 9 (HB9) is up-regulated by the ternary complex (26, 27). The similar LDB1 binding rates of LHX3 and LMO4 teamed with the low LDB1:ISL1 koff mean that LMO4 is able to compete with LHX3 to temporarily inhibit the formation of the binary complex but not the ternary complex, leaving LHX3 free to form the ternary complex. At the same time, HB9 competes with the binary complex for binding to DNA (18, 26), thus mimicking the conditions of the equilibrium modeling in the absence of DNAb, which favors ternary–DNA complex formation (Fig. 3B). When slow dissociation of the LDB1:ISL1 leads to the eventual breakdown of the ternary complex, HB9 and LMO4 would continue to prevent formation of the binary complex.

Mechanisms other than timing of expression may contribute to LDB1:ISL1 sequestration. sMN progenitors express the E3 ubiquitin ligase RLIM, which targets LDB1 for proteasomal degradation, as well as ssDNA binding protein 1 (SSBP1), which protects LDB1:LIM-HD complexes from degradation and is required for full ternary complex activity (36–38). RLIM and SSBP1 could work together in these cells to protect nascent LDB1:ISL1 complexes and remove free LDB1 to prevent binary complex formation.

The expressions of ISL2 and LHX4 are also induced by the ternary complex and can promote sMN development (16, 19, 39, 40). Our binding data show ISL2LID:LHX4LIM1+2 has an eightfold

Fig. 5. Mechanism for the kinetic and thermodynamic regulation of ternary complex formation. (A) LDB1 is sequestered by ISL1, disrupting the low-affinity intramolecular LID:LIM1+2 interaction and allowing the orphaned ISL1LID to interact with LHX3LIM1+2. (B) The intact ternary complex interacts with target DNA, up-regulating the expression of regulator proteins LMO4 and HB9. (C) LMO4 and HB9 cooperate to disrupt binary complex formation by competing for protein and DNA binding partners, respectively. SA, self-association.
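The subadditivity comparison for single versus tandem LIM domains (the ΔΔG° values quoted with Fig. 4D) rests on converting each Kd to a standard free energy. A minimal sketch of that arithmetic follows. It assumes ΔG° = RT ln(Kd / 1 M), a temperature of 298.15 K (the assay temperature is not given in this section), and a sign convention in which a positive ΔΔG° means the tandem domain binds more weakly than the sum of its parts; the paper's own convention may differ. Only the 110 nM value for LMO2LIM1 comes from the text. The LIM2 value is an arbitrary pick from the reported ∼5–9 μM range, and the tandem Kd is a purely hypothetical placeholder because Table S3 is not reproduced here.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ mol^-1 K^-1
T_ASSUMED = 298.15      # K; assumption, not stated in this section


def delta_g(kd_molar, temperature=T_ASSUMED):
    """Standard binding free energy (kJ/mol) from Kd, 1 M standard state."""
    return R_KJ * temperature * math.log(kd_molar)


def subadditivity(kd_lim1, kd_lim2, kd_tandem, temperature=T_ASSUMED):
    """ddG = dG(LIM1+2) - [dG(LIM1) + dG(LIM2)], in kJ/mol.

    Positive values mean the tandem domain binds more weakly than
    additive single-domain energies would predict (assumed convention).
    """
    return delta_g(kd_tandem, temperature) - (
        delta_g(kd_lim1, temperature) + delta_g(kd_lim2, temperature)
    )


if __name__ == "__main__":
    kd_lim1 = 110e-9    # LDB1LID:LMO2LIM1, from the text
    kd_lim2 = 7e-6      # assumption: a value inside the reported 5-9 uM range
    kd_tandem = 2e-8    # hypothetical placeholder, NOT a measured value
    print(f"ddG = {subadditivity(kd_lim1, kd_lim2, kd_tandem):.1f} kJ/mol")
```

Because two of the three inputs are placeholders, the printed number illustrates the calculation only and is not a check on the published 24.6 kJ mol^−1.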

higher Kd than the ISL1-LHX3 equivalent, resulting in higher ratios of ternary:binary complexes (Fig. S8). Thus, ISL2 and LHX4 could be even more effective than the canonical ISL1/LHX3 pair for sMN production from inducible pluripotent stem cells.

In summary, the modular and competing interactions between the intrinsically disordered LIDs and LIM domains arrayed in tandem result in variable binding kinetics between a single IDR and a family of related binding proteins. This phenomenon provides a mechanistic basis for the temporal regulation of transcription factor assembly and the resulting expression of sets of genes, with profound impacts on the development and evolution of complex organisms.

Materials and Methods

Construct Cloning, Expression, and Purification. All constructs were generated by PCR using genes for mouse proteins and cloned into vectors for expression with 6×His, 6×His GB1, GST, or MBP tags for purification. The proteins were expressed in Escherichia coli BL21(DE3) or Rosetta II cells and purified by batch affinity chromatography followed by cation exchange and/or size exclusion chromatography (SI Materials and Methods).

Fluorescence Spectroscopy and FRET-Based Interaction Methods. FRET-based LID:LIM1+2 assays were described previously (21). Experimental details, acquisition parameters, and binding equations are described in SI Materials and Methods.

EMSA. Fluorescence EMSAs have been described previously (41). Oligonucleotides were designed to mimic in vivo promoters that were shown to bind either ISL1HD or LHX3HD alone or both HDs. More detail and binding equations are in SI Materials and Methods.

Native Gel Competition Assay. Cleaved LID:LIM1+2 complexes (1 μM) were incubated with mYPet-LDB1LID (50 nM) at room temperature for 5 or 100 min as indicated. Samples were resolved on a 4–16% acrylamide NativePAGE Bis·Tris gel (Thermo Fisher Scientific) in 100 mM Bis·Tris·tricine buffer at 150 V for 105 min. mYPet fluorescence was visualized using the Typhoon FLA 9500 scanner (GE Healthcare) using the default GFP filter set with excitation at 473 nm and emission recorded at 510 nm.

Modeling Equilibria. Competing protein–DNA and protein–protein interactions were modeled numerically in MATLAB (MathWorks) using either equilibrium or binding kinetics data. Binding affinities and rate constants were used to simulate competitive binding over a range of species concentrations (0.001–100 μM), with all species in a single simulation kept to equal total concentrations. The mean affinities and rate constants were used to calculate the amount of complex formed. To account for the errors in the EMSA data, simulations involving DNA species were repeated with Kd = mean ± SEM. The populations of the relevant species are displayed as the fraction bound. Equilibrium equations were reduced in Mathematica (Wolfram) before being solved numerically in MATLAB. The interactions and equations used are detailed in SI Materials and Methods.

ACKNOWLEDGMENTS. N.O.R., N.C.S., and A.M. were supported by Australian Postgraduate Awards. The work was supported by Grants DP140102318 and DP170103539 from the Australian Research Council.
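As an illustration of the kind of calculation described under Modeling Equilibria, the following minimal Python sketch solves the equilibrium for two LIM proteins (A and B) competing for a single LID-containing partner (L), with all three species at the same total concentration. It is not the MATLAB/Mathematica implementation used in this work, whose interactions and equations are given in SI Materials and Methods. The two-competitor 1:1 binding scheme, the function name `competitive_equilibrium`, and the Kd values are assumptions chosen only for illustration.

```python
def competitive_equilibrium(total, kd_a, kd_b):
    """Fractions of L bound to A and to B at equilibrium.

    Assumed scheme (illustrative): L + A <-> LA and L + B <-> LB, with L, A
    and B all at the same total concentration `total` (same units as the Kd).
    """
    def excess(free_l):
        # Mass balance for L: free L plus both complexes minus total L.
        la = total * free_l / (kd_a + free_l)
        lb = total * free_l / (kd_b + free_l)
        return free_l + la + lb - total

    # excess() rises monotonically from -total at 0, so bisect on [0, total].
    lo, hi = 0.0, total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    free_l = 0.5 * (lo + hi)
    la = total * free_l / (kd_a + free_l)
    lb = total * free_l / (kd_b + free_l)
    return la / total, lb / total


# Hypothetical dissociation constants in uM (not values from this study).
KD_A, KD_B = 0.01, 1.0

# Log-spaced equal total concentrations spanning 0.001-100 uM.
concentrations = [10 ** (-3 + 0.5 * i) for i in range(11)]
for c in concentrations:
    frac_a, frac_b = competitive_equilibrium(c, KD_A, KD_B)
    print(f"{c:10.4f} uM  LA={frac_a:.3f}  LB={frac_b:.3f}")
```

In this simplified scheme the competitor with the lower Kd always captures the larger share of L, because the ratio of the two complexes equals (Kd_B + [L]free)/(Kd_A + [L]free). Repeating the scan with each Kd set to its mean ± SEM would mimic the error treatment described for the EMSA-based simulations.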

1. Liu Z, Huang Y (2014) Advantages of proteins being disordered. Protein Sci 23:539–550.
2. Berlow RB, Dyson HJ, Wright PE (2015) Functional advantages of dynamic protein disorder. FEBS Lett 589:2433–2440.
3. Dogan J, Jonasson J, Andersson E, Jemth P (2015) Binding rate constants reveal distinct features of disordered protein domains. Biochemistry 54:4741–4750.
4. Huang Y, Liu Z (2009) Kinetic advantage of intrinsically disordered proteins in coupled folding-binding process: A critical assessment of the “fly-casting” mechanism. J Mol Biol 393:1143–1159.
5. Shammas SL, Travis AJ, Clarke J (2013) Remarkably fast coupled folding and binding of the intrinsically disordered transactivation domain of cMyb to CBP KIX. J Phys Chem B 117:13346–13356.
6. Umezawa K, Ohnuki J, Higo J, Takano M (2016) Intrinsic disorder accelerates dissociation rather than association. Proteins 84:1124–1133.
7. Minezaki Y, Homma K, Kinjo AR, Nishikawa K (2006) Human transcription factors contain a high fraction of intrinsically disordered regions essential for transcriptional regulation. J Mol Biol 359:1137–1149.
8. Shammas SL (2017) Mechanistic roles of protein disorder within transcription. Curr Opin Struct Biol 42:155–161.
9. Deane JE, et al. (2003) Structural basis for the recognition of ldb1 by the N-terminal LIM domains of LMO2 and LMO4. EMBO J 22:2224–2233.
10. Deane JE, et al. (2004) Tandem LIM domains provide synergistic binding in the LMO4:Ldb1 complex. EMBO J 23:3589–3598.
11. Matthews JM, Potts JR (2013) The tandem β-zipper: Modular binding of tandem domains and linear motifs. FEBS Lett 587:1164–1171.
12. Shirasaki R, Pfaff SL (2002) Transcriptional codes and the control of neuronal identity. Annu Rev Neurosci 25:251–281.
13. Thaler JP, Lee SK, Jurata LW, Gill GN, Pfaff SL (2002) LIM factor Lhx3 contributes to the specification of motor neuron and interneuron identity through cell-type-specific protein-protein interactions. Cell 110:237–249.
14. Bhati M, et al. (2008) Implementing the LIM code: The structural basis for cell type-specific assembly of LIM-homeodomain complexes. EMBO J 27:2018–2029.
15. Gadd MS, et al. (2011) Structural basis for partial redundancy in a class of transcription factors, the LIM homeodomain proteins, in neural cell type specification. J Biol Chem 286:42971–42980.
16. Sharma K, et al. (1998) LIM homeodomain factors Lhx3 and Lhx4 assign subtype identities for motor neurons. Cell 95:817–828.
17. Tsuchida T, et al. (1994) Topographic organization of embryonic motor neurons defined by expression of LIM homeobox genes. Cell 79:957–970.
18. Lee S, et al. (2008) A regulatory network to segregate the identity of neuronal subtypes. Dev Cell 14:877–889.
19. Song MR, et al. (2009) Islet-to-LMO stoichiometries control the function of transcription complexes that specify motor neuron and V2a interneuron identity. Development 136:2923–2932.
20. Deane JE, et al. (2001) Design, production and characterization of FLIN2 and FLIN4: The engineering of intramolecular ldb1:LMO complexes. Protein Eng 14:493–499.
21. Robertson NO, Shah M, Matthews JM (2016) A quantitative fluorescence-based assay for assessing LIM domain-peptide interactions. Angew Chem Int Ed Engl 55:13236–13239.
22. Smith NC, Matthews JM (2016) Mechanisms of DNA-binding specificity and functional gene regulation by transcription factors. Curr Opin Struct Biol 38:68–74.
23. Cho HH, et al. (2014) Isl1 directly controls a cholinergic neuronal identity in the developing forebrain and spinal cord by forming cell type-specific complexes. PLoS Genet 10:e1004280.
24. Rhee HS, et al. (2016) Expression of terminal effector genes in mammalian neurons is maintained by a dynamic relay of transient enhancers. Neuron 92:1252–1265.
25. Wissmueller S, et al. (2011) Protein-protein interactions: Analysis of a false positive GST pulldown result. Proteins 79:2365–2371.
26. Lee S, et al. (2012) Fusion protein Isl1-Lhx3 specifies motor neuron fate by inducing motor neuron genes and concomitantly suppressing the interneuron programs. Proc Natl Acad Sci USA 109:3383–3388.
27. Erb M, et al. (2017) The Isl1-Lhx3 complex promotes motor neuron specification by activating transcriptional pathways that enhance its own expression and formation. eNeuro 4:ENEURO.0349-16.2017.
28. Shammas SL, Crabtree MD, Dahal L, Wicky BI, Clarke J (2016) Insights into coupled folding and binding mechanisms from kinetic studies. J Biol Chem 291:6689–6695.
29. Sugase K, Dyson HJ, Wright PE (2007) Mechanism of coupled folding and binding of an intrinsically disordered protein. Nature 447:1021–1025.
30. Arai M, Sugase K, Dyson HJ, Wright PE (2015) Conformational propensities of intrinsically disordered proteins influence the mechanism of binding and folding. Proc Natl Acad Sci USA 112:9614–9619.
31. Rogers JM, et al. (2014) Interplay between partner and ligand facilitates the folding and binding of an intrinsically disordered protein. Proc Natl Acad Sci USA 111:15420–15425.
32. Toto A, et al. (2016) Molecular recognition by templated folding of an intrinsically disordered protein. Sci Rep 6:21994.
33. Konrat R, Weiskirchen R, Bister K, Krautler B (1998) Bispheric coordinative structuring in a protein: NMR analysis of a point mutant of the carboxy-terminal LIM domain of quail cysteine- and glycine-rich protein CRP2. J Am Chem Soc 120:7127–7128.
34. Hester ME, et al. (2011) Rapid and efficient generation of functional motor neurons from human pluripotent stem cells using gene delivered transcription factor codes. Mol Ther 19:1905–1912.
35. Mazzoni EO, et al. (2013) Synergistic binding of transcription factors to cell-specific enhancers programs motor neuron identity. Nat Neurosci 16:1219–1227.
36. Lee B, Lee S, Agulnick AD, Lee JW, Lee SK (2016) Single-stranded DNA binding proteins are required for LIM complexes to induce transcriptionally active chromatin and specify spinal neuronal identities. Development 143:1721–1731.
37. Ostendorff HP, et al. (2002) Ubiquitination-dependent cofactor exchange on LIM homeodomain transcription factors. Nature 416:99–103.
38. Xu Z, et al. (2007) Single-stranded DNA-binding proteins regulate the abundance of LIM domain and LIM domain-binding proteins. Genes Dev 21:942–955.
39. Hutchinson SA, Eisen JS (2006) Islet1 and Islet2 have equivalent abilities to promote motoneuron formation and to specify motoneuron subtype identity. Development 133:2137–2147.
40. Thaler JP, et al. (2004) A postmitotic role for Isl-class LIM homeodomain proteins in the assignment of visceral spinal motor neuron identity. Neuron 41:337–350.
41. Wilkinson-White L, et al. (2015) GATA1 directly mediates interactions with closely spaced pseudopalindromic but not distantly spaced double GATA sites on DNA. Protein Sci 24:1649–1659.

www.pnas.org/cgi/doi/10.1073/pnas.1714646115