
BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Real-time observation of ligand-induced allosteric transitions in a PDZ domain

Olga Bozovic(a,1), Claudio Zanobini(a,1), Adnan Gulzar(b,1), Brankica Jankovic(a), David Buhrke(a), Matthias Post(b), Steffen Wolf(b), Gerhard Stock(b,2), and Peter Hamm(a,2)

(a) Department of Chemistry, University of Zurich, 8057 Zurich, Switzerland; and (b) Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany

Edited by Martin Gruebele, University of Illinois at Urbana–Champaign, Urbana, IL, and approved August 24, 2020 (received for review June 23, 2020)

While allostery is of paramount importance for protein regulation, the underlying dynamical process of ligand (un)binding at one site, resulting time evolution of the protein structure, and change of the binding affinity at a remote site are not well understood. Here the ligand-induced conformational transition in a widely studied model system of allostery, the PDZ2 domain, is investigated by transient infrared spectroscopy accompanied by molecular dynamics simulations. To this end, an azobenzene-derived photoswitch is linked to a peptide ligand in a way that its binding affinity to the PDZ2 domain changes upon switching, thus initiating an allosteric transition in the PDZ2 domain protein. The subsequent response of the protein, covering four decades of time, ranging from ∼1 ns to ∼10 µs, can be rationalized by a remodeling of its rugged free-energy landscape, with very subtle shifts in the populations of a small number of structurally well-defined states. It is proposed that structurally and dynamically driven allostery, often discussed as limiting scenarios of allosteric communication, actually go hand-in-hand, allowing the protein to adapt its free-energy landscape to incoming signals.

allostery | PDZ domains | transient infrared spectroscopy | molecular dynamics simulations

Significance

Allostery is a process in which a signal sensed upon ligand-binding at a distal site is transduced to the effector site, allowing for regulation of the activity of the latter. The propagation of an allosteric signal is a nonequilibrium process, but neither the nature of the signal (e.g., whether it is a structural or dynamical change) nor its dynamical properties on a molecular level, including its speed, are known. The real-time observation of such a signal requires the design of protein systems, in which one can synchronize ligand (un)binding events. Such a design is presented here, allowing us to investigate its allosteric transition in unprecedented detail.

The authors declare no competing interest.
This article is a PNAS Direct Submission.
Published under the PNAS license.
1 O.B., C.Z., and A.G. contributed equally to this work.
2 To whom correspondence may be addressed. Email: [email protected] or [email protected].
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2012999117/-/DCSupplemental.
First published October 5, 2020.
www.pnas.org/cgi/doi/10.1073/pnas.2012999117
PNAS | October 20, 2020 | vol. 117 | no. 42 | 26031–26039

Allostery represents the coupling of two sites in a protein or a protein complex, where the binding of a ligand to the distal site modifies the affinity at the active site (1). Since protein function is intimately related to protein structure, ligand-induced changes of the protein's biological function (e.g., transition from an inactive to an active state) are often associated with a change of the protein's mean structure (2). On the other hand, ligand (un)binding may also alter the protein's flexibility, the variance of the structure, which changes the structural fluctuations and gives an entropic contribution to the free energy (3). The latter scenario, termed "dynamic allostery," has been invoked to explain the apparent absence of conformational change upon ligand (un)binding (3–10). Studying the effects of dynamic allostery has been done mainly by NMR spectroscopy (11–13), which, however, only accounts for equilibrium dynamics.

While both models, structural change vs. dynamic change, may appear plausible, the nature of the "allosteric signal" is not known. Its study ultimately requires a stringent examination of the genesis of allostery. This includes three steps (14): 1) the (un)binding of a ligand (usually initiated by a change of its concentration) causes 2) the atoms of the protein to undergo a nonequilibrium time evolution, which 3) eventually leads to a change of the binding affinity at a remote site. This so-called "allosteric transition" is a nonequilibrium process and has been observed directly only rarely, in part because of the smallness of the structural changes of the protein, which makes the transition pathways challenging to observe experimentally (15), and also because of the time-scale limitations of molecular dynamics (MD) simulations (16–18). In this work, we outline an approach to study the first two steps, i.e., the ligand-induced allosteric transition, employing a PDZ2 domain as model system.

PDZ domains are considered prime examples of dynamic allostery (4, 6, 19). Known for their modest conformational change upon ligand-binding, PDZ domain-mediated interactions play a pivotal role in many signal transduction complexes (20, 21). Allosteric information flow in PDZ domains is thought to be transduced via conserved allosteric networks in the protein (4, 22–25). The system considered here is the PDZ2 domain from hPTP1E (human protein tyrosine phosphatase 1E) (26) and a RA-GEF-2 (Ras/Rap1 associating guanidine nucleotide exchange factor 2) peptide derivative with an azobenzene moiety (27) linked as photoswitch (Fig. 1). It was recently reported that the phosphorylation of the serine (−2) residue, a common target for regulatory processes of PDZ domains (28), leads to an approximately five- to sevenfold difference in the binding affinity toward the PDZ2 domain in a very similar system (29). We will see that the binding affinity can be perturbed to the same extent (approximately fivefold) by introducing a photoswitchable element on the ligand instead. Since the PDZ2 domain is not modified at all, this strategy leads to a much less artificial construct than that obtained in our previous study (30, 31), where the photoswitch was covalently linked across the binding pocket of the PDZ2 domain. In addition, using the ligand as a trigger, one can apply this strategy to virtually any system.

By photoisomerizing the azobenzene moiety, we change the binding affinity of the ligand at a precisely defined point in time. We employ time-resolved vibrational spectroscopy in connection with an isotope labeling strategy to monitor the structural change of the protein in real time and perform extensive (more than 0.5 ms aggregate simulation time) all-atom nonequilibrium MD
simulations combined with Markov modeling to interpret the experimental results in terms of the structural evolution of the system. We find that the mean structural change of the protein is rather small. However, in both experiment and MD simulations, the free-energy surface of the protein can be characterized by a small number of metastable conformational states. In agreement with the view of allostery as an interconversion between the relative population of metastable states, we see how the ligand-induced response of the PDZ2 domain is best described as remodeling of the free-energy landscape (32–36) and how the response is transduced from the ligand to the protein without introducing a significant structural change.

Fig. 1. Ligand-switched PDZ2 domain. Main secondary structural elements and Cα distances d20,71 and d4,55 discussed below (Results, MD Simulations) are indicated. In the trans conformation of the photoswitch (red), the ligand (blue) fits well in the binding pocket, while it starts to move out when switching to cis.

Results

Experimental. To set the stage, we have investigated the influence of photoswitching of the ligand on its binding affinity. By choosing the spacing between the anchoring points of the azobenzene moiety, the peptide ligand was designed such that the longer trans conformation mimics the native extended β-strand conformation, while the cis configuration shortens the peptide and perturbs it from its extended form. To that end, the alanine residue at position −1 (the ligand is labeled by negative numbers) was chosen as the first anchoring spot for the photoswitch, since it has been shown that a mutation at this position does not significantly affect the binding, while residues that are crucial for binding [Val(0), Ser(−2), and Val(−3)] are preserved (37, 38). The second anchoring point chosen was Asp(−6), which allows the peptide to be maximally stretched in the trans configuration of the photoswitch. Protein and peptide have been expressed/synthesized using standard procedures (30, 39) (see Materials and Methods for details). The dissociation constants (K_D) in the two configurations of the photoswitchable peptide were determined by isothermal titration calorimetry (ITC), fluorescence, and circular dichroism (CD) spectroscopy (SI Appendix, Figs. S2 and S3) (40). The obtained values averaged for all methods (K_D,trans = 2.0 ± 0.6 µM, K_D,cis = 9.6 ± 0.5 µM; SI Appendix, Table S1) reveal an appreciable, approximately fivefold difference in the binding affinity, with the cis state being the destabilized one, as anticipated.

Considering these binding affinities and the relatively high concentrations needed for the transient infrared (IR) experiment (1.25 mM for the peptide and 1.5 mM for the protein), it is clear that most of the ligands are bound to a protein in both states of the photoswitch (97% in cis and 99% in trans); hence, we will not observe many binding or unbinding events. Furthermore, as binding and unbinding in similar PDZ/ligand systems was observed to occur on 10- to 100-ms time scales (41), these processes are hardly within the time window of our experiment. Nevertheless, we will be able to observe the adaptation of the protein to a perturbed peptide conformation in the binding pocket and its transition to unspecific binding on the protein surface.

We investigate the ligand-induced conformational transition with the help of transient IR spectroscopy in the range of the amide I band (see Materials and Methods for details) (42–44). This band originates mostly from the C=O stretch vibration of the peptide/protein backbone and is known to be strongly structure-dependent (45). While one cannot invert the problem and determine the structure of a protein from the amide I band, any change in protein structure will cause small but distinct changes in this band (Fig. 2 A–C).

Fig. 2 shows the transient IR response in the spectral region of the amide I vibration after photoswitching in either the trans-to-cis (Fig. 2 D–F) or the cis-to-trans direction (Fig. 2 G–I). To be directly comparable, the two datasets were scaled in a way that they refer to the same amount of isomerizing molecules and not the same amount of excited molecules. The scaling took into account the different pump-pulse energies used in the experiments (Materials and Methods), cross-sections (23,500 cm⁻¹M⁻¹ for trans at 380 nm vs. 2,000 cm⁻¹M⁻¹ for cis at 420 nm) (27), and isomerization quantum yields (8% for trans-to-cis switching and 62% for cis-to-trans switching) (46).

Selective isotope labeling can be used to disentangle the contribution of the peptide ligand from that of the protein. ¹³C¹⁵N-labeling of the protein backbone down-shifts the vibrational frequency of the amide I band by ≈25 cm⁻¹. Taking the double-difference spectra between the sample with isotope-labeled protein vs. that with nonlabeled protein cancels out the contribution of the peptide ligand, which is not labeled in either case. By doing so, we implicitly assume that the spectra of protein and ligand are additive and that coupling between them can be neglected. This idea is utilized in Fig. 2, showing the response with the nonlabeled protein in the left graphs and that with the ¹³C¹⁵N-labeled protein in the middle graphs. The transient IR responses of both isotopologs look quite similar, as the signal is dominated by the photoswitchable peptide, which is perturbed directly by the azobenzene moiety. The double-difference spectra, removing the contribution of the photoswitchable peptide ligand, are shown in the right graphs of Fig. 2, with some of the more prominent features highlighted in Fig. 3 A–D. Great care was taken that protein and peptide concentrations were exactly the same in both experiments. Furthermore, both experiments were performed right after each other without changing any setting of the laser setup.

Overall, the kinetics of these double-difference spectra are quite complex and cover many orders of magnitude in time (47). Furthermore, the responses for trans-to-cis (Figs. 2F and 3 A and C) vs. cis-to-trans switching (Figs. 2I and 3 B and D) are not mirror images of each other, which one might expect if the protein took the same pathway in the opposite direction. For example, the strongest band at 1,636 cm⁻¹ (marked as ∗1 in Figs. 2F and 3A) reveals the biggest step at around 1 ns in the trans-to-cis data, while the complementary feature in the cis-to-trans data (marked as ∗2 in Figs. 2I and 3B) develops in a very stretched manner, from ≈3 ns to ≈3 µs. Also worth noting is a transient band at 1,579 cm⁻¹ in the trans-to-cis data (marked as ∗3 in Figs. 2F and 3C), existing up to ≈100 ns, that has no complementary counterpart in the cis-to-trans data (Figs. 2I and 3D).

The red lines in Fig. 3 A–D are fits obtained from a time-scale analysis of the signals using a maximum entropy method (48),

S(\omega_i, t) = a_0(\omega_i) - \sum_k a(\omega_i, \tau_k)\, e^{-t/\tau_k}. [1]
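
As a minimal sketch of the model in Eq. 1 at a single probe frequency, the Python snippet below evaluates the multiexponential signal S(t) = a_0 − Σ_k a_k e^(−t/τ_k) on a fixed grid of time scales τ_k. The grid (10 points per decade from 1 ps to 100 µs) and the two-step amplitude pattern are hypothetical illustration values, not fit results from this study, and the maximum entropy regularization used to obtain the amplitudes is not reproduced here.

```python
import math

def multiexp_signal(t, a0, amplitudes, taus):
    """Evaluate S(t) = a0 - sum_k a_k * exp(-t / tau_k) for one probe frequency."""
    return a0 - sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, taus))

# Hypothetical logarithmic grid of time scales: 10 per decade, 1 ps to 100 µs (seconds).
taus = [1e-12 * 10 ** (k / 10) for k in range(81)]

# Hypothetical time-scale spectrum with two kinetic steps (assumed values).
amplitudes = [0.0] * len(taus)
amplitudes[30] = 1.0   # step at tau = 1 ns
amplitudes[60] = 0.5   # step at tau = 1 µs

# Choosing a0 as the sum of all amplitudes makes the signal start at zero.
a0 = sum(amplitudes)

for t in (0.0, 1e-9, 1e-6, 1e-3):
    print(f"t = {t:.0e} s  S = {multiexp_signal(t, a0, amplitudes, taus):.4f}")
```

Each nonzero entry of the amplitude list corresponds to one step in the kinetic trace, which is why a peak in a time-scale spectrum can be read as a distinct kinetic process.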

Here, ω_i denotes the probe frequency and S(ω_i, t) the signal, which is represented by a multiexponential function with time scales τ_k. The time-scale spectra a(ω_i, τ_k) are shown in Fig. 3 A–D as blue lines. Each of the kinetic processes discussed above shows up as a peak in these time-scale spectra, and the pattern of peaks is different for all of the examples shown in Fig. 3 A–D. Nevertheless, the dynamical content (49),

D(\tau_k) = \left[ \sum_i a^2(\omega_i, \tau_k) \right]^{1/2}, [2]

which averages over the complete dataset shown in SI Appendix, Fig. S4, seems to indicate a relatively small number of discrete time scales (Fig. 3 E and F). We attribute the first peak around 100 ps (labeled as ∗4 in Fig. 3E) to a "heat signal" originating from the vibrational energy released by the photoisomerization of the azobenzene moiety, an effect that is seen universally in this type of experiments (50, 51).

The transient spectra at the latest pump–probe delay time that is accessible to our transient experiment (i.e., 42 µs) are shown in Fig. 2 A–C in blue for trans-to-cis switching and in red for cis-to-trans switching. They are compared with a properly scaled trans-to-cis Fourier-transform infrared spectroscopy (FTIR) difference spectrum (black), which represents the response at effectively infinite time after photoswitching. The counterpart of the negative band at 1,600 cm⁻¹ (marked as ∗5 in Fig. 2C), which is present in the blue and black trans-to-cis responses, has not yet evolved in the red cis-to-trans spectrum at the delay time of 42 µs. We conclude from this observation that the cis-to-trans transition is not completely finished after 42 µs.

Fig. 2. Transient IR spectra of PDZ2 in the region of the amide I band. A–C compare transient data at long pump–probe delay times (averaged from 20 µs to 42 µs to increase signal-to-noise) for trans-to-cis (blue) and cis-to-trans (red) switching, together with a properly scaled trans-to-cis FTIR difference spectrum (black). D–I show the complete transient data for trans-to-cis (D–F) and cis-to-trans (G–I) switching. The left graphs show the data for the WT protein, middle graphs show that for the sample with the protein ¹³C¹⁵N-labeled (the peptide ligand contains naturally abundant ¹²C¹⁴N), and right graphs show the ¹³C¹⁵N-WT difference data. Red colors in D–I indicate positive-absorbance changes, and blue colors indicate negative-absorbance changes. The relative scaling of the datasets and the labeled features are discussed in Experimental.

MD Simulations. To aid the interpretation of the above experiments, we performed all-atom explicit-solvent MD simulations of the trans and cis equilibrium states as well as nonequilibrium MD simulations (52) of the ligand-induced conformational changes of PDZ2. Using the GROMACS v2016 software package (53) and the Amber99SB*ILDN force field (54–56), we collected in total 510 µs simulation time (Materials and Methods). For the structural characterization of the protein, we determined 56 Cα distances d_i,j between residues i and j that are not redundant (such as d_i,j and d_i,j±1) and whose ensemble average changes significantly (⟨Δd_i,j⟩ ≥ 0.5 Å) during the nonequilibrium simulations (SI Appendix, Fig. S5). To identify the essential coordinates of the system, we performed a principal component analysis (PCA) (57) on the normalized distances of all simulation data, followed by robust density-based clustering (58) and a recently proposed machine-learning approach (59) (see Materials and Methods and SI Appendix, Fig. S6 for details). While we used six dimensions for the clustering, we find that two Cα distances suffice to qualitatively characterize the conformational distribution of PDZ2: d20,71, accounting for the width of the binding pocket located between β2 and α2, as well as d4,55, representing the distance between the N-terminal region and the α1-β4 loop, which reflects the compactness of the amino (N)- and carboxy (C)-terminal regions (Fig. 1). Employing these coordinates, Fig. 4A shows the free-energy surface ΔG = −k_B T ln P(d20,71, d4,55), obtained from 5 × 10 µs-long trans equilibrium simulations describing the ligand-bound state of PDZ2. The free-energy landscape reveals four well-defined local minima, indicating metastable conformational states of the system. Density-based clustering identifies state 1 as close to the crystal structure (60), while state 2 indicates an opening of the binding pocket. Both states are mirrored by states 3 and 4, which are shifted to larger values of coordinate d4,55.

Upon switching the ligand from trans to cis configuration, PDZ2 undergoes a nonequilibrium time evolution until it relaxes
within a few microseconds into its cis equilibrium state, describing the perturbed protein–ligand complex. Performing 25 × 10 µs-long trans-to-cis nonequilibrium simulations, we took the last 7 µs of each trajectory to estimate the rather heterogeneous conformational distribution of the cis equilibrium state. When we compare the resulting free-energy landscapes of cis and trans, Fig. 4 A and B reveals that the accessible conformational space in cis is considerably increased, along with the occurrence of additional state 5 that reports on a further opening of the binding pocket. While states 2 and 5 largely overlap in this two-dimensional representation of the free energy, they are well separated when a third distance (e.g., d27,69) is invoked (SI Appendix, Fig. S6). Representing the populations of all states in trans and cis as a histogram, Fig. 4D demonstrates that the photoswitching of the ligand causes a notable (≲30%) shift of the state populations, mostly from state 1 to states 2 and 5.

Fig. 3. Transient ¹³C¹⁵N-WT difference data at 1,636 cm⁻¹ (A and B) and 1,579 cm⁻¹ (C and D) for trans-to-cis (Left) and cis-to-trans (Right) switching, highlighting features labeled as ∗1 to ∗3 in Fig. 2. Red lines are fits obtained from the time-scale analysis in Eq. 1, and blue lines represent the resulting time-scale spectra a(ω_i, τ_k). E and F show the corresponding dynamical content; the heat signal labeled as ∗4 is discussed in Experimental. G and H show the MD dynamical content, obtained from a time-scale analysis of the nonequilibrium time evolution of the mean Cα distances (SI Appendix, Fig. S5).

To illustrate the conformational changes associated with these states, Fig. 4E displays an overlay of minimum-energy structures of states 1 and 2 as well as the cis-specific state 5. We find that the opening of the binding pocket described by d20,71 mainly reflects a shift of the α2 helix down and away from the protein core. Interestingly, the structural rearrangement between main states 1 and 2 results in an overall rms displacement of only ≲1 Å and causes only few (∼5) contacts to change (SI Appendix, Fig. S7). This is in striking contrast to the cross-linked photoswitchable PDZ2 studied by Buchli et al. (30), where 34 contact changes were found for the trans-to-cis reaction (61). Furthermore, in contrast to Fig. 4 A and B, the cis and trans free-energy landscapes hardly overlapped in the cross-linked photoswitchable PDZ2 domain (31, 49). These findings indicate that ligand-switching is considerably less invasive than a cross-linked photoswitch and therefore better mimics the natural unbiased system.

Is the above-discussed population shift as well as the very occurrence of states an inherent property of the protein's rugged free-energy landscape (32, 33), or are these features rather induced by the ligand? Fig. 4C addresses this question by showing the free-energy landscape obtained from previously performed 6 × 1 µs-long simulations of PDZ2 without a ligand (61). While the state separation along coordinate d4,55 still exists, we find that states 1, 2, and 5 merge into a single energy minimum. It is centered at the position of state 2, but is wide enough to cover a large part of states 1 and 5. Similarly, states 3 and 4 form a weakly populated (2%) single minimum. This indicates that ligand-free PDZ2 provides the flexibility to access the entire free-energy landscape explored during binding and unbinding, while the interaction with the ligand appears to stabilize conformational states 1 and 4. Returning to the question at the beginning of the paragraph, we find that it is a bit of both, i.e., an inherent property of the protein's rugged free energy that is modified to a certain extent by the ligand.

Showing protein structures of the main states together with position densities of the ligand, Fig. 4F illustrates these interactions (see also SI Appendix, Fig. S8). For one, we notice that the opening and closing of the binding pocket (described by d20,71) is associated with the conventional binding of the ligand's C terminus in this pocket, which stabilizes closed state 1 in trans. In the open state 2, the probability to find the ligand in its binding mode is significantly decreased, pointing to a reduced ligand affinity of the protein. On the other hand, we find that the distinct conformations of the protein's termini described by d4,55 are a consequence of the formation of contacts with the ligand's N terminus in states 3 and 4, which are absent in states 1, 2, and 5. In particular, state 5 represents a situation where the hydrophobic photoswitch of the ligand forms a contact with a hydrophobic bulge at the protein surface around Ile20, which can be classified as unspecific binding of the ligand to the protein surface.

Adopting our trans-to-cis nonequilibrium simulations, we can describe the overall structural evolution of PDZ2 in terms of time-dependent expectation values of various observables. As an example, Fig. 5 A and B shows the time evolution of the two Cα distances d20,71 and d4,55 introduced above. Following trans-to-cis ligand-switching, it takes about 100 ns until the sub-picosecond photoisomerization of the photoswitch affects the protein's binding region (indicated by d20,71), which becomes wider as the ligand moves out. The flexible N-terminal region indicated by d4,55, on the other hand, undergoes conformational changes already within a few nanoseconds. The weak correlation between the two interresidue distances (i.e., ⟨d20,71 d4,55⟩ (⟨d20,71²⟩⟨d4,55²⟩)^(−1/2) ≲ 0.02 for all data), however, indicates that this early motion of the terminal region may not be directly related to the functional dynamics of PDZ2. Interestingly, the associated rmsd values of the two distances show quite similar behavior. Moreover, SI Appendix, Fig. S9 displays various ligand–protein distances and contact changes, which illustrate that the ligand leaves the binding pocket on time scales of 0.1–1 µs. Similar to the experimental analysis (cf. Eq. 2), we also calculated the dynamical content associated with all considered intraprotein Cα distances (Fig. 3 G and H). While MD and experimental results are seen to cover the same time scales, the peaks of the respective distributions differ clearly as they account for different physical observables. In principle, one would expect that the positions of the peaks coincide, even if the amplitudes
26034 | www.pnas.org/cgi/doi/10.1073/pnas.2012999117 Bozovic et al. Downloaded by guest on September 28, 2021 Downloaded by guest on September 28, 2021 eigenvectors state from jumps a tem calculate we memory- end, via this matrix To transition PDZ2 states. metastable of which between 63), jumps dynamics (62, less conformational (MSM) model the find- state these Markov describes rationalize a construct To we microseconds. ings, within states other the 5C time Fig. at (60), starts structure crystal the con- trans-to-cis initial to Choosing close states. metastable ditions protein’s the of simulation. ulations MD the that of but accuracy the observables, for physical much are too different comparison asking the the be for might to points Fixed due state. different specific a are to belonging snapshots simulation all in within atom ligand a find to C 0.4 the of probability minimal a with of (C (61) and of motion downward a (E by pocket PDZ2. ligand-binding of shift population ligand-induced 4. Fig. ooi tal. et Bozovic scales time exponential the any over onto sum of a value as written functions expectation be time-dependent can observable the dynamical that theory states MSM transitions, (62) interstate occurring rarely and tuations faster. 10 factor a even is bind- that back-rate the of of scale time transition states a from open–close on the occurs that pocket ing see We system. the 5E Fig. (75×1 MSM, scales nonequilibrium time our of shorter 25×10 bias toward the simulations reflects agreement which MD qualitative decade, only last but time the of in decades three first the for (using corresponding the predictions and MSM simulations MD nonequilibrium the from 5 Fig. sion, F ABCD E ti ntutv ocnie h eutn iedpnetpop- time-dependent resulting the consider to instructive is It suigatm-cl eaainbtenfs nrsaefluc- intrastate fast between separation time-scale a Assuming B ersnswal ouae (. populated weakly represents i.S10 Fig. 
Fig. 4. Identification of metastable conformational states. Free-energy landscapes (in units of $k_{\mathrm{B}}T$) obtained from the trans (A), cis (B), and ligand-free equilibrium simulations of PDZ2, plotted as a function of two essential interresidue distances. (D) Histogram of the state populations in the equilibrium simulations. Further panels compare the minimum-energy structures of the states and show structures of the states together with position densities of the ligand.

An MSM is described by a transition matrix $T$ containing the probabilities $T_{ij}$ that the system converts between states $i$ and $j$ within lag time $\tau_{\mathrm{lag}}$ (see Materials and Methods for technical details). From $T$ we determine its eigenvalues $\lambda_k$ and eigenvectors $\psi_k$, which yield the implied time scales $t_k = -\tau_{\mathrm{lag}} / \ln \lambda_k$. The time evolution of different observables, such as vibrational spectra and state populations (64), is a sum of exponential terms $e^{-t/t_k}$, each weighted by the projection of the observable onto the $k$th eigenvector of the transition matrix. The implied time scales of the MSM therefore govern the time evolution of the state populations.

Showing a network representation of the MSM, Fig. 5E illustrates the connectivity and transition times of the states. Starting at time $t = 0$ almost completely in state 1, Fig. 5 C and D compares the state populations obtained from the nonequilibrium MD simulations with the corresponding MSM predictions. We find excellent agreement.

To facilitate a comparison of the experimental and simulated time evolutions, we also ran an MSM simulation using trans equilibrium initial conditions, which is the starting point of the trans-to-cis experiments (Fig. 5F). Comparing to the experimental time traces (Fig. 3), we find that both the spectral and the population evolutions appear to be completed on a microsecond time scale. Moreover, the MSM populations exhibit various transient features on time scales of 10 to 100 ns, which are also present in the experimental signals.

Discussion and Conclusions

Combining transient IR spectroscopy and nonequilibrium MD simulations, we have described the ligand-induced conformational transition in the PDZ2 domain, which is thought to be responsible for allosteric communication. We have found that the free-energy landscape of PDZ2 can be described in terms of a few metastable states with well-defined structure (Fig. 4), although the structural changes upon ligand-switching are rather small. That is, the mean structures of the protein (the secondary and tertiary structure) are quite similar in the different states, and only modest (≲30%) shifts of the state populations are found (Fig. 4B). On average, the measurable structural change is therefore only on the order of ∼0.3 Å rms displacement. This is a significantly smaller conformational change than reported in ref. 30, where the photoswitch was covalently linked directly to the binding groove of the protein, which resulted in an rmsd of 0.9 Å, as determined by NMR spectroscopy. In light of this result, it is remarkable that we can observe such minor structural changes by transient IR spectroscopy (Fig. 2), underpinning the extraordinary sensitivity of the method. We currently cannot exclude further conformational changes of the protein upon complete removal of the ligand.

PNAS | October 20, 2020 | vol. 117 | no. 42 | 26035

BIOPHYSICS AND COMPUTATIONAL BIOLOGY

The study of these effects will require new concepts, both experimentally as well as computationally, as the expected time scales are very long (10 to 100 ms).

Fig. 5. Time evolution of various structural descriptors, following trans-to-cis ligand-switching of PDZ2. Shown are means (blue) and rmsd (orange) of Cα distances $d_{20,71}$ (A) and $d_{4,55}$ (B), as well as populations of conformational states (C, D, and F). For easier representation, all MD data were smoothed. Starting at time $t = 0$ almost completely in state 1, we compare results from the nonequilibrium MD simulations (C) with the corresponding predictions of an MSM (D). (E) Network representation of the MSM. The size of the states indicates their population; the thickness of the arrows and the numbers indicate the transition times (in microseconds). For clarity, we discard transitions that take longer than 2.5 µs. (F) MSM simulations of the trans-to-cis transition, using trans equilibrium initial conditions.

Using isotope labeling to discriminate the dynamics of protein and ligand, the resulting time-resolved double-difference IR spectra have revealed complex kinetics of the protein that cover many time scales (Fig. 2). The spectra for trans-to-cis and cis-to-trans ligand-switching are not mirror images of each other, and the trans-to-cis signals exhibit short-time transients that are not found for cis-to-trans. Moreover, the cis-to-trans transition does not seem to be finished within 42 µs (Fig. 2C). The overall slower response of the cis-to-trans transition reflects the general observation that enforced leaving of a well-defined (low-entropy) ligand-binding structure (here trans) occurs faster than starting in a conformationally disordered (high-entropy) state (here cis) and trying to find stabilizing interactions to end in a more organized structure (65).

More specifically, the trans-to-cis nonequilibrium simulations reveal that the ligand remains bound with its C terminus to the protein-binding site between β2 and α2 up to about 1 µs. In this way, it stabilizes the main bound-protein conformation (state 1). At longer times, it starts to move out from the binding pocket but remains nonspecifically bound to the protein surface. While diffusion on the surface may continue for long times after trans-to-cis switching, it affects the internal structure of the protein only slightly. Nevertheless, this diffusion will be the first rate-limiting step after cis-to-trans switching, which might be the reason that the ligand does not completely localize in the binding pocket within 42 µs.

The existence of well-defined metastable conformational states implies a time-scale separation between fast intrastate fluctuations and rarely occurring interstate transitions. This allowed us to construct an MSM, which illustrates the connectivity and transition times between the metastable states (Fig. 5D). In particular, the discrete time scales predicted by the MSM are directly reflected in the dynamical content calculated for experiments and MD simulations (Fig. 3 E–H), which both cover time scales from ∼1 ns to 10 µs. Reflecting different observables (transition dipole vs. Cα distances, respectively), the weights of the various peaks are different.

While ligand-switching was shown to cause a conformational transition of PDZ2 in terms of the mean structure, at the same time, it may also effect a change of the protein's fluctuations. Comparing the time evolution of the means of the distances and their rmsd, Fig. 5 A and B reveals that the two quantities correlate closely, a behavior that is found for all considered Cα distances (SI Appendix, Fig. S5). This finding reflects the fact that the Cα-distance distributions pertaining to the individual states are in most cases well separated (SI Appendix, Fig. S11), such that a transition between two states affects both mean and variance. Accounting for an entropic contribution of the conformational transition, a change in variance is often referred to as "dynamic allostery" (3, 4, 6). The above findings indicate that allosteric transitions may involve both conformational and dynamic changes in the case of the PDZ2 domain (8). The answer to what is the dominant effect will greatly depend on the system under consideration and on the applied experimental method. While the overall structural change (≲0.3-Å rms displacement) may be too small to be detected by structure analysis, NMR-relaxation methods can sensitively explore the structural flexibility of proteins. The IR spectrum of the amide I band, in contrast, is commonly thought of as a measure of structure (45), but dephasing due to fast fluctuation might also affect the IR line shape.

In conclusion, we have characterized the nonequilibrium allosteric transition in a joint experimental–theoretical approach. The protein per se was kept unmodified; hence, ligand-switching mimics very closely the naturally occurring allosteric perturbation caused by ligand (un)binding events. We employed a widely studied model system for this purpose, the PDZ2 domain, which is small enough to allow for a characterization of the process in atomistic detail by MD simulations, but we believe that the findings are of more general nature. That is, while the ligand-induced allosteric transition originates from a population shift between various metastable conformational states, the measurable mean structural change of the protein may be tiny and therefore difficult to observe (8). Moreover, we suggest that the separation between purely dynamically driven allostery and allostery upon a conformational change may not be as clear-cut as previously thought but rather that there may be an interplay between both that allows proteins to adapt their free-energy landscape to incoming signals. The photoswitching approach presented here is very versatile and allows us to shed light on the aspects of "time" and "speed" in allosteric communication.

Materials and Methods

Protein and Peptide Preparation. Expression of the wild-type (WT) PDZ2 domain from human phosphatase 1E (26) and of its isotope-labeled ($^{13}$C$^{15}$N) variant, as well as synthesis of the photoswitchable peptide ligand, were performed as described earlier (30, 39). The WT RA-GEF-2 sequence was modified in order to enable cross-linking of the photoswitch, while preserving residues that are important for regulation and binding. That is, amino acids at positions (−1) and (−6) were chosen as anchoring points for the photoswitch

and mutated into cysteine residues. Four N-terminal residues (RWAK) were added to the sequence in order to improve the water solubility and to facilitate the determination of the concentration of the peptide. The final sequence of the construct was RWAKSEAKECEQVSCV. The purity of all samples was confirmed by mass spectrometry analysis (SI Appendix, Fig. S1). All samples were dialyzed against 50 mM borate, 150 mM NaCl buffer (pH 8.5). For transient infrared measurements, samples were lyophilized and resuspended in deuterated water (D2O). Incubation of the samples in D2O overnight at room temperature before the measurements eliminated H/D exchange during experiments. The concentration of the samples was determined via the tyrosine absorption at 280 nm for the protein and 310 nm for the peptide and confirmed by amino acid analysis.

Determining the Binding Affinity. Isothermal titration calorimetry (ITC) measurements were performed on a MicroCal ITC200 (Malvern). In order to ensure that the values obtained for the trans and cis measurements were mutually comparable, the experiments were performed using the same stock solution of the peptide and protein for both measurements and under exactly the same experimental conditions. The experiment was performed in triplicate in order to ensure the reproducibility of the data. The sample cell was loaded with 250 µL of 80 µM PDZ2 domain solution, and the syringe was loaded with 40 µL of 800 µM photoswitchable peptide solution. For the experiment with cis, the syringe was constantly illuminated with a 370-nm continuous-wave (cw) laser (power ≈90 mW; CrystaLaser) (40), while for the trans measurement, the system was kept in the dark for the duration of the experiment. The results are shown in SI Appendix, Fig. S2.

As an alternative method to determine the binding affinity, we also used CD spectroscopy as well as fluorescence quenching. Both spectroscopic signals change upon the formation of a protein–ligand complex; hence, when measuring them in dependence of peptide and protein concentration, the binding affinity can be fitted assuming a bimolecular equilibrium. CD measurements were done on a Jasco model J810 spectropolarimeter in a 0.1-cm quartz cuvette, as described previously (40). The intrinsic tryptophan fluorescence quenching experiment was done on a Perkin-Elmer spectrofluorimeter, as described previously (40). In either case, the protein concentration was kept constant at 5 and […] µM, respectively, while the peptide concentrations were varied. SI Appendix, Fig. S3 shows the results for the CD spectroscopy and the tryptophan fluorescence quenching, while SI Appendix, Table S1 compares the binding affinities obtained from all different methods.

Transient IR Spectroscopy. Transient visible pump–IR probe spectra were recorded using two electronically synchronized Ti:Sapphire laser systems (43) running at 2.5 kHz. The wavelength of the pump laser was tuned to obtain 380-nm pump pulses (2.1 µJ) for the trans-to-cis and 420 nm (1.3 µJ) for the cis-to-trans experiment, respectively, via second harmonic generation in a BBO crystal. The beam diameter of the pump pulse at the sample position was ≈180 µm, employing a pulse duration of ≈200 ps (by extracting the light directly after the regenerative amplifier before the compressor) to minimize sample degradation during the measurements. Mid-IR probe pulses centered at ≈1,630 cm−1 (pulse duration: ≈100 fs; beam diameter on the sample: ≈150 µm) were obtained in an optical parametric amplifier (42), passed through a spectrograph, and detected with a 2×64 pixel MCT array detector with a spectral resolution of ≈2 cm−1 per pixel. Pump–probe spectra were acquired up to the maximum delay value of ≈42 µs with a time resolution of ≈200 ps. Normalization for noise suppression was performed as described in ref. 44.

The samples (≈700 µL) were pumped through a closed flow-cell system purged with N2. The system consisted of a sample cell with two CaF2 windows separated by a 50-µm Teflon spacer and a reservoir. The flow speed in the sample cell was optimized in order to minimize loss of sample at the largest pump–probe delay time (≈42 µs), on the one hand, but to have the sample exchanged essentially completely for the subsequent laser shot after 400 µs, on the other hand. The concentrations of the samples were set at 1.25 mM for the peptide and 1.5 mM for the protein; the slight excess of protein was needed to ensure that the peptide was fully saturated with protein in order to eliminate the response of free, photoswitchable peptide. As a reference, FTIR difference spectra have been taken in a Bruker Tensor 27 FTIR spectrometer, using the same sample conditions.

For the experiment with trans-to-cis switching, we relied on the thermal cis-to-trans back reaction. By comparing its rate with the isomerization probability induced by 380-nm pump light (determined by pump light power, total sample volume, absorption cross-sections (27), and isomerization quantum yield (46)), we estimated that the photoequilibrium in the total sample volume is 70/30% trans/cis during measurement. It furthermore helps that the absorption cross-section at 380 nm of the azobenzene moiety in the trans state is ≈20 times larger than that of the cis state (27), which leads us to conclude that >97% of the molecules in the trans-to-cis experiment undergo the desired isomerization direction. For the experiment with cis-to-trans switching, the sample could be actively switched back by illuminating the reservoir with an excess of light at 370 nm from a cw laser (150 mW; CrystaLaser).

MD Simulations. All MD simulations of PDZ2 were performed using the GROMACS v2016 software package (53) and the Amber99*ILDN force field (54–56). Force-field parameters of the azobenzene photoswitch were taken from ref. 39. Protein–ligand structures were solvated with approximately 8,000 TIP3P water molecules (66) in a dodecahedron box with a minimal image distance of 7 nm; 16 Na+ and 16 Cl− were added to yield a charge-neutral system with a salt concentration of 0.1 M. All bonds involving hydrogen atoms were constrained using the LINCS algorithm (67), allowing for a time step of 2 fs. Long-range electrostatic interactions were computed by the Particle Mesh Ewald method (68), whereas the short-range electrostatic interactions were treated explicitly with the Verlet cutoff scheme. The minimum cutoff distance for electrostatic and van der Waals interactions was set to 1.4 nm. A temperature of 300 K was maintained via the Bussi thermostat (69) (also known as velocity-rescale algorithm) with a coupling time constant of $\tau_T = 0.1$ ps. A pressure of $P = 1$ bar was controlled using the pressure coupling method of Berendsen et al. (70) with a coupling time constant of $\tau_P = 0.1$ ps.

The starting structure of the photoswitched ligand bound to PDZ2 was prepared previously (39) based on the crystal structure (Protein Data Bank ID code 3LNX) (60). Here, the azobenzene photoswitch was attached in trans conformation to the ligand at positions (−6) and (−1), which had been mutated to cysteines as in experiment to provide covalent connection points. Residues missing at the N terminus of the ligand were added (Protein and Peptide Preparation). Following NPT equilibration of the system in trans conformation for 10 ns, four statistically independent (i.e., with different initial velocity distributions) NVT runs of 100 ns each were performed. For one, we selected five randomly chosen snapshots from the end of these trajectories to perform five µs-long trans equilibrium simulations. Moreover, we selected 25 randomly chosen snapshots from each of the last 50 ns of these four NVT trajectories, yielding a total of 100 starting structures to perform trans-to-cis nonequilibrium simulations. Employing these initial conditions, which consist mostly of structures of metastable state 1 (for state definition, see Dimensionality Reduction and Clustering), trans-to-cis photoswitching was performed using a previously developed potential-energy surface-switching approach (52). All 100 trans-to-cis trajectories were simulated for a trajectory length of 1 µs; 25 of them were extended to a length of 10 µs.

Upon switching the ligand from the trans to the cis configuration, PDZ2 undergoes a nonequilibrium time evolution until it relaxes within a few microseconds into its cis equilibrium state, describing the unbound protein–ligand complex. Performing 25×10-µs-long trans-to-cis nonequilibrium simulations, we took the last 7 µs of each trajectory to estimate the rather heterogeneous conformational distribution of the cis state. To generate initial structures for the cis-to-trans nonequilibrium simulations, we took 100 randomly chosen snapshots at a simulation time around 3.0 µs from the 25 trans-to-cis trajectories. Following cis-to-trans photoswitching, 100 nonequilibrium simulations were run, of which 10 simulations were extended to a length of 8 µs.

The Gromacs tools gmx angle and gmx mindist were employed to compute backbone dihedral angles, interresidue Cα distances, and the number of contacts between various segments of PDZ2. Time-dependent distributions and mean values of these observables were calculated via an ensemble average over 100 nonequilibrium trajectories.

Dimensionality Reduction and Clustering. To choose suitable internal coordinates that account for the conformational transitions of the system (57), we determined 56 Cα distances $d_{i,j}$ between residues $i$ and $j$ that are not redundant (such as $d_{i,j}$ and $d_{i\pm 1,j\pm 1}$) and whose ensemble average changes significantly (≥0.5 Å) during the first microsecond of the trans-to-cis nonequilibrium simulations (SI Appendix, Fig. S5). Moreover, we considered all backbone dihedral angles that show a change ≳10° from their initial value during the nonequilibrium simulations. Since the interresidue Cα distances appear to provide more information, these coordinates were chosen for the subsequent PCA, which was performed on all data (57). To obtain an adequate relative weighting of short and long distances, the data were normalized (71). Diagonalizing the resulting covariance matrix, we obtain its eigenvectors (yielding the principal components [PCs]) and eigenvalues (reflecting the fluctuations of the PCs). The first two PCs cover 43% of the overall fluctuations, while six PCs yield

about 65%. Calculating the free-energy profiles pertaining to the PCs, we find that in particular PCs 1 to 4, 6, and 7 show multistate behavior reflecting metastable states.

Including these six PCs, we performed robust density-based clustering (58), which first computes a local free-energy estimate for every structure in the trajectory by counting all other structures inside a six-dimensional hypersphere of fixed radius $R$. Normalization of these population counts yields densities or sampling probabilities $P$, which give the free-energy estimate $\Delta G = -k_{\mathrm{B}}T \ln P$. Thus, the more structures are close to the given one, the lower the free-energy estimate. By reordering all structures from low to high free energy, finally, the minima of the free-energy landscape can be identified. By iteratively increasing a threshold energy, all structures with a free energy below that threshold that are closer than a certain lumping radius will be assigned to the same cluster, until all clusters meet at their energy barriers. In this way, all data points are assigned to a cluster as one branch of the iteratively created tree. For PDZ2, we used a hypersphere radius $R = 0.579$ that equaled the lumping radius employed in the last step.

SI Appendix, Fig. S6, Top shows the resulting total number of states obtained as a function of the minimal population $P_{\mathrm{min}}$ a state must contain. Here, we chose $P_{\mathrm{min}} = 50{,}000$, resulting in a clustering into 12 states. According to visual inspection of the resulting free-energy landscapes (SI Appendix, Fig. S6, Middle), these states accurately separate all density maxima of the system. Since the five lowest-populated states cover less than 5% of the total population, we lumped them to main states 1 to 7 as follows: (1, 9)→1, (2, 10)→2, (4, 12)→4, (5, 8, 11)→5. This is justified due to their geometric vicinity in the free-energy landscape (SI Appendix, Fig. S6, Middle), as well as due to their kinetic vicinity in the transition matrix. Following the calculation of the time-dependent state populations, in a last step, we lumped states (4, 7)→4 and states (5, 6)→5 for the sake of easy interpretability.

Finally, we employed a recently proposed machine-learning approach (59) to identify the internal coordinates that allow one to discuss the five main states of PDZ2 in a two-dimensional free-energy landscape. On the basis of the decision tree-based program XGBoost (72), we trained a model that determines the features of the molecular coordinates that are most important to discriminate given metastable states. Using an algorithm that exploits this feature importance via an iterative exclusion principle, we identified the essential internal coordinates, that is, the most important Cα distances of PDZ2. SI Appendix, Fig. S6, Bottom shows that three distances, $d_{20,71}$, $d_{4,55}$, and $d_{27,69}$, suffice to qualitatively distinguish the five main states of PDZ2. The XGBoost parameters are chosen as in ref. 59, including learning rate $\eta = 0.3$, maximum tree depth of 6, 10 training rounds, and 70 and 30% of the data used for training and validation, respectively.

MSM. On the basis of the above-defined seven metastable states, we constructed an MSM (62) of the trans-to-cis transition of PDZ2, using all (75×1 µs and 25×10 µs) trans-to-cis nonequilibrium trajectories. A general problem with the definition of metastable states is that, due to the inevitable restriction to a low-dimensional space combined with insufficient sampling, we often obtain a misclassification of sampled points in the transition regions, which causes intrastate fluctuations to be mistaken as interstate transitions. As a simple but effective remedy, we use dynamical coring, which requires that a trajectory must stay a minimum time $\tau_{\mathrm{cor}}$ in the new state for the transition to be counted (73, 74). A suitable quantity that reflects these spurious crossings is the probability $W_i(t)$ to stay in state $i$ for duration $t$ (without considering back-transitions). As shown in SI Appendix, Fig. S10, without coring, we observe a strong initial decay of $W_i(t)$ for all states, instead of the simple exponential decay we would expect for Markovian states. Applying coring with increasing coring times, this initial drop vanishes because fluctuations on time scales $t \lesssim \tau_{\mathrm{cor}}$ are removed. Here, we determined $\tau_{\mathrm{cor}} = 1$ ns as the shortest coring time that removes the spurious interstate transitions.

SI Appendix, Fig. S10 shows the resulting implied time scales and eigenvectors of the model. Using a lag time of 1 ns, we moreover show the time evolution of the state populations, assuming that we start completely in a specific state.

Data Availability. The experimental data used in this article have been deposited in Zenodo (https://zenodo.org: doi.org/10.5281/zenodo.3991616). MD data can be obtained from the authors upon request. MD data analysis tools including FastPCA, Robust Density-Based Clustering, and Essential Coordinates Learning are available at www.moldyn.uni-freiburg.de.

ACKNOWLEDGMENTS. We thank Rolf Pfister for the synthesis of the peptides and the Functional Genomics Center Zurich, especially Serge Chesnov and Birgit Roth, for their help with the mass spectrometry and amino acid analysis. We also thank Benjamin Lickert, Daniel Nagel, and Georg Diez for many enlightening discussions concerning the MD data analysis. The work has been supported by the Swiss National Science Foundation through the National Center of Competence and Research Molecular Ultrafast Spectroscopy and Technology (NCCR MUST) and Grant 200020B 188694/1, as well as by Deutsche Forschungsgemeinschaft (DFG) through Grant STO 247/10-2. We acknowledge support by the High Performance and Cloud Computing Group at the Zentrum für Datenverarbeitung of the University of Tübingen and the Rechenzentrum of the University of Freiburg, the state of Baden-Württemberg through Baden-Württemberg high performance computing and DFG Grants INST 37/935-1 FUGG (RV bw16I016) and INST 39/963-1 FUGG (RV bw18A004), the Black Forest Grid Initiative, and the Freiburg Institute for Advanced Studies of the Albert Ludwig University of Freiburg.
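The dynamical coring step described in the MSM paragraph can be sketched in a few lines. The Python function below is a minimal illustration, not the authors' implementation: it assumes a discrete state trajectory sampled at a fixed time step, expresses the coring time as a number of frames (`min_frames`, a name introduced here), and accepts a transition only if the trajectory then stays in the new state for at least that many consecutive frames. Treating the first frame as the initial core state is likewise an assumption of this sketch.

```python
import numpy as np


def core_trajectory(states, min_frames):
    """Apply a simple run-length variant of dynamical coring.

    A change of state is counted only if the trajectory stays in the new
    state for at least `min_frames` consecutive frames; shorter excursions
    are reassigned to the previously accepted state.
    """
    states = np.asarray(states)
    cored = np.empty_like(states)
    current = states[0]  # assumption: the first frame defines the initial state
    i, n = 0, len(states)
    while i < n:
        # Find the end of the run of identical states starting at frame i.
        j = i
        while j < n and states[j] == states[i]:
            j += 1
        # Accept the new state only if the run is long enough.
        if states[i] != current and (j - i) >= min_frames:
            current = states[i]
        cored[i:j] = current
        i = j
    return cored


# Hypothetical trajectory: the single-frame visit to state 2 is a spurious
# crossing, whereas the four-frame visit is a genuine transition.
raw = [1, 1, 1, 2, 1, 1, 2, 2, 2, 2, 1, 1, 1]
cored = core_trajectory(raw, min_frames=3)
```

With a frame spacing of, say, 0.1 ns (a hypothetical value), a coring time of 1 ns would correspond to `min_frames=10`. Note that in this sketch an excursion at the very end of the trajectory that is shorter than `min_frames` is also discarded.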

1. S. J. Wodak et al., Allostery in its many disguises: From theory to applications. Structure 27, 566–578 (2019).
2. J. P. Changeux, Allostery and the Monod-Wyman-Changeux model after 50 years. Annu. Rev. Biophys. 41, 103–133 (2012).
3. A. Cooper, D. T. F. Dryden, Allostery without conformational change. Eur. Biophys. J. 11, 103–109 (1984).
4. E. J. Fuentes, S. A. Gilmore, R. V. Mauldin, A. L. Lee, Evaluation of energetic and dynamic coupling networks in a PDZ domain protein. J. Mol. Biol. 364, 337–351 (2006).
5. I. Bahar, C. Chennubhotla, D. Tobi, Intrinsic dynamics of enzymes in the unbound state and relation to allosteric regulation. Curr. Opin. Struct. Biol. 17, 633–640 (2007).
6. C. M. Petit, J. Zhang, P. J. Sapienza, E. J. Fuentes, A. L. Lee, Hidden dynamic allostery in a PDZ domain. Proc. Natl. Acad. Sci. U.S.A. 106, 18249–18254 (2009).
7. T. C. B. McLeish, T. L. Rodgers, M. R. Wilson, Allostery without conformational change: Modelling protein dynamics at multiple scales. Phys. Biol. 10, 056004 (2013).
8. R. Nussinov, C. J. Tsai, Allostery without a conformational change? Revisiting the paradigm. Curr. Opin. Struct. Biol. 30, 17–24 (2015).
9. J. Guo, H. X. Zhou, Protein allostery and conformational dynamics. Chem. Rev. 116, 6503–6515 (2016).
10. D. Thirumalai, C. Hyeon, P. I. Zhuravlev, G. H. Lorimer, Symmetry, rigidity, and allosteric signaling: From monomeric proteins to molecular machines. Chem. Rev. 119, 6788–6821 (2019).
11. A. G. Palmer, NMR characterization of the dynamics of biomacromolecules. Chem. Rev. 104, 3623–3640 (2004).
12. A. Mittermaier, L. E. Kay, New tools provide new insights in NMR studies of protein dynamics. Science 312, 224–228 (2006).
13. D. Bourgeois, A. Royant, Advances in kinetic protein crystallography. Curr. Opin. Struct. Biol. 15, 538–547 (2005).
14. X. Deupi, B. K. Kobilka, Dynamics and function. Physiology 25, 293–303 (2010).
15. S. Brüschweiler et al., Direct observation of the dynamic process underlying allosteric signal transmission. J. Am. Chem. Soc. 131, 3063–3068 (2009).
16. C. Hyeon, D. Thirumalai, Mechanical unfolding of RNA hairpins. Proc. Natl. Acad. Sci. U.S.A. 102, 6789–6794 (2005).
17. F. Pontiggia et al., Free energy landscape of activation in a signalling protein at atomic resolution. Nat. Commun. 6, 7284 (2015).
18. C. A. Smith et al., Allosteric switch regulates protein-protein binding through collective motion. Proc. Natl. Acad. Sci. U.S.A. 113, 3269–3274 (2016).
19. E. J. Fuentes, C. J. Der, A. L. Lee, Ligand-dependent dynamics and intramolecular signaling in a PDZ domain. J. Mol. Biol. 335, 1105–1115 (2004).
20. E. Kim, M. Sheng, PDZ domain proteins of synapses. Nat. Rev. Neurosci. 5, 771–781 (2004).
21. M. Sheng, C. Sala, PDZ domains and the organization of supramolecular complexes. Annu. Rev. Neurosci. 24, 1–29 (2001).
22. S. W. Lockless, R. Ranganathan, Evolutionarily conserved pathways of energetic connectivity in protein families. Science 286, 295–299 (1999).
23. A. B. Law, E. J. Fuentes, A. L. Lee, Conservation of side-chain dynamics within a protein family. J. Am. Chem. Soc. 131, 6322–6323 (2009).
24. Y. Kong, M. Karplus, Signaling pathways of PDZ2 domain: A molecular dynamics interaction correlation analysis. Proteins 74, 145–154 (2009).
25. K. A. Reynolds, R. N. McLaughlin, R. Ranganathan, Hot spots for allosteric regulation on protein surfaces. Cell 147, 1564–1575 (2011).
26. G. Kozlov, D. Banville, K. Gehring, I. Ekiel, Solution structure of the PDZ2 domain from cytosolic human phosphatase hPTP1E complexed with a peptide reveals contribution of the beta 2-beta 3 loop to PDZ domain-ligand interactions. J. Mol. Biol. 320, 813–820 (2002).
27. Z. Zhang, D. C. Burns, J. R. Kumita, O. S. Smart, G. A. Woolley, A water-soluble azobenzene cross-linker for photocontrol of peptide conformation. Bioconjugate Chem. 14, 824–829 (2003).
28. T. T. Cao, H. W. Deacon, D. Reczek, A. Bretscher, M. Von Zastrow, A kinase-regulated PDZ-domain interaction controls endocytic sorting of the β2-adrenergic receptor. Nature 401, 286–290 (1999).
29. A. Toto, A. Mattei, P. Jemth, S. Gianni, Understanding the role of phosphorylation in the binding mechanism of a PDZ domain. Protein Eng. Des. Sel. 30, 1–5 (2017).
30. B. Buchli et al., Kinetic response of a photoperturbed allosteric protein. Proc. Natl. Acad. Sci. U.S.A. 110, 11725–11730 (2013).
31. S. Buchenberg, F. Sittel, G. Stock, Time-resolved observation of protein allosteric communication. Proc. Natl. Acad. Sci. U.S.A. 114, E6804–E6811 (2017).
32. H. Frauenfelder, S. G. Sligar, P. G. Wolynes, The energy landscapes and motions of proteins. Science 254, 1598–1603 (1991).

33. K. A. Dill, H. S. Chan, From Levinthal to pathways to funnels: The "new view" of protein folding kinetics. Nat. Struct. Biol. 4, 10–19 (1997).
34. R. G. Smock, L. M. Gierasch, Sending signals dynamically. Science 324, 198–203 (2009).
35. C. J. Tsai, R. Nussinov, A unified view of "how allostery works". PLoS Comput. Biol. 10, e1003394 (2014).
36. V. J. Hilser, J. O. Wrabl, H. N. Motlagh, Structural and energetic basis of allostery. Annu. Rev. Biophys. 41, 585–609 (2012).
37. C. N. Chi, A. Bach, K. Strømgaard, S. Gianni, P. Jemth, Ligand binding by PDZ domains. Biofactors 38, 338–348 (2012).
38. H. J. Lee, J. J. Zheng, PDZ domains and their binding partners: Structure, specificity, and modification. Cell Commun. Signal. 8, 8 (2010).
39. C. Zanobini et al., Azidohomoalanine: A minimally invasive, versatile, and sensitive infrared label in proteins to study ligand binding. J. Phys. Chem. B 122, 10118–10125 (2018).
40. B. Jankovic et al., Photocontrolling protein–peptide interactions: From minimal perturbation to complete unbinding. J. Am. Chem. Soc. 141, 10702–10710 (2019).
41. S. Gianni et al., The kinetics of PDZ domain-ligand interactions and implications for the binding mechanism. J. Biol. Chem. 280, 34805–34812 (2005).
42. P. Hamm, R. A. Kaindl, J. Stenger, Noise suppression in femtosecond mid-infrared light sources. Opt. Lett. 25, 1798–1800 (2000).
43. J. Bredenbeck, J. Helbing, P. Hamm, Continuous scanning from picoseconds to microseconds in time resolved linear and nonlinear spectroscopy. Rev. Sci. Instrum. 75, 4462–4466 (2004).
44. Y. Feng, I. Vinogradov, N. H. Ge, General noise suppression scheme with reference detection in heterodyne nonlinear spectroscopy. Opt. Express 25, 26262–26279 (2017).
45. A. Barth, C. Zscherp, What vibrations tell us about proteins. Q. Rev. Biophys. 35, 369–430 (2002).
46. V. Borisenko, G. A. Woolley, Reversibility of conformational switching in light-sensitive peptides. J. Photochem. Photobiol. A Chem. 173, 21–28 (2005).
47. J. Sabelko, J. Ervin, M. Gruebele, Observations of strange kinetics in protein folding. Proc. Natl. Acad. Sci. U.S.A. 96, 6031–6036 (1999).
48. V. A. Lórenz-Fonfría, H. Kandori, Transformation of time-resolved spectra to lifetime-resolved spectra by maximum entropy inversion of the Laplace transform. Appl. Spectrosc. 60, 407–417 (2006).
49. G. Stock, P. Hamm, A nonequilibrium approach to allosteric communication. Philos. Trans. R. Soc. B Biol. Sci. 373, 20170187 (2018).
50. P. Hamm, S. M. Ohline, W. Zinth, Vibrational cooling after ultrafast photoisomerization of azobenzene measured by femtosecond infrared spectroscopy. J. Chem. Phys. 106, 519–529 (1997).
51. T. Baumann et al., Site-resolved observation of vibrational energy transfer using a genetically encoded ultrafast heater. Angew. Chem. Int. Ed. Engl. 58, 2899–2903 (2019).
52. P. H. Nguyen, G. Stock, Nonequilibrium molecular dynamics simulation of a photoswitchable peptide. Chem. Phys. 323, 36–44 (2006).
53. M. J. Abraham et al., Gromacs: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1-2, 19–25 (2015).
54. V. Hornak et al., Comparison of multiple amber force fields and development of improved protein backbone parameters. Proteins Struct. Funct. Bioinforma. 65, 712–725 (2006).
55. R. B. Best, G. Hummer, Optimized molecular dynamics force fields applied to the helix-coil transition of polypeptides. J. Phys. Chem. B 113, 9004–9015 (2009).
56. K. Lindorff-Larsen et al., Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins Struct. Funct. Bioinforma. 78, 1950–1958 (2010).
57. F. Sittel, G. Stock, Perspective: Identification of collective variables and metastable states of protein dynamics. J. Chem. Phys. 149, 150901 (2018).
58. F. Sittel, G. Stock, Robust density-based clustering to identify metastable conformational states of proteins. J. Chem. Theor. Comput. 12, 2426–2435 (2016).
59. S. Brandt, F. Sittel, M. Ernst, G. Stock, Machine learning of biomolecular reaction coordinates. J. Phys. Chem. Lett. 9, 2144–2150 (2018).
60. J. Zhang et al., Crystallographic and nuclear magnetic resonance evaluation of the impact of peptide binding to the second PDZ domain of protein tyrosine phosphatase 1E. Biochemistry 49, 9280–9291 (2010).
61. S. Buchenberg, V. Knecht, R. Walser, P. Hamm, G. Stock, Long-range conformational transition of a photoswitchable allosteric protein: A molecular dynamics simulation study. J. Phys. Chem. B 118, 13468–13476 (2014).
62. G. R. Bowman, V. S. Pande, F. Noé, An Introduction to Markov State Models (Springer, 2013).
63. U. Sengupta, B. Strodel, Markov models for the elucidation of allosteric regulation. Phil. Trans. R. Soc. B 373, 20170178 (2018).
64. F. Noé et al., Dynamical fingerprints for probing individual relaxation processes in biomolecular dynamics with simulations and kinetic experiments. Proc. Natl. Acad. Sci. U.S.A. 108, 4822–4827 (2011).
65. S. Wolf et al., Estimation of protein-ligand unbinding kinetics using non-equilibrium targeted molecular dynamics simulations. J. Chem. Inf. Model. 59, 5135–5147 (2019).
66. W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, M. L. Klein, Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935 (1983).
67. B. Hess, P-LINCS: A parallel linear constraint solver for molecular simulation. J. Chem. Theor. Comput. 4, 116–122 (2008).
68. U. Essmann et al., A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593 (1995).
69. G. Bussi, D. Donadio, M. Parrinello, Canonical sampling through velocity rescaling. J. Chem. Phys. 126, 014101 (2007).
70. H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak, Molecular dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690 (1984).
71. M. Ernst, F. Sittel, G. Stock, Contact- and distance-based principal component analysis of protein dynamics. J. Chem. Phys. 143, 244114 (2015).
72. T. Chen, C. Guestrin, "XGBoost: A scalable tree boosting system" in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), vol. 13–17, pp. 785–794.
73. A. Jain, G. Stock, Hierarchical folding free energy landscape of HP35 revealed by most probable path clustering. J. Phys. Chem. B 118, 7750–7760 (2014).
74. D. Nagel, A. Weber, B. Lickert, G. Stock, Dynamical coring of Markov state models. J. Chem. Phys. 150, 094111 (2019).

BIOPHYSICS AND COMPUTATIONAL BIOLOGY