Downloaded by guest on September 30, 2021 aeo icvr nti eyatv rao eerhhsbeen has research of The area B active (18). 16), very formation this organiza- (19–28). (15, accelerating in autophagosome translation 13), discovery and of of (7, pace (17), regulation plasticity elucidated response (14), and cell recently chromatin formation their of tissues synapse tion by vertebrate sand- in underscored in of roles as ver- elastin perform case functions, condensates and the satile phase-separated dynamic, (12) in These and adhesive as (8). 11), (4, worm materials organelles that extracellular castle membraneless granules, of as including stress to precursor (10), and referred nucleoli condensates often as biomolecular are such of forma- bodies, and variety underpins intracellular intracellular a LLPS of both (1–9). tion in tem- organismal biomolecules and extracellular general of spatial a functional organization is achieve poral acids disordered to nucleic mechanism and intrinsically biophysical proteins, containing folded (IDRs), proteins regions (IDPs), proteins A condensates biomolecular impact dilute overall modest the a only in to LLPS. on lead interactions trends opposing protein–solvent The phase. favorable more phase dependence condensed but the this in protein interactions that interprotein aqueous enhanced suggests entails on theory depends Analytical potentials. general concentration. in rela- statistical which by common permittivity, signifi- modulated tive are by are interactions posited electrostatic interactions Protein–protein than phenylalanine weaker LLPS-driving cantly that sug- mutants gest tyrosine-to-phenylalanine and FtoA because on however, assembly experiments condensate necessary, in of role is phenylalanine’s of elucidation mutants especially Further impor- RtoK potentials. degree, for a statistical to found that, LLPS-driving also underscoring tant a LAF-1, simu- was has protein between mutant experiment P-granule Consistency RtoK and propensity. the lation LLPS over- that statistics the diminished including reproduce contact much trend, structures on experimental of protein based all mutant globular interactions folded (RtoK) contrast, among arginine-to-lysine In an IDR. Ddx4 and phenylalanine- a (FtoA), charge-scrambled, a to-alanine wild-type, the on data LLPS cation– π arginine/lysine-aromatic with hydrophobicity augmented one common a nor a scheme with Neither schemes potential. electrostatic interaction evalu- shared residue–residue We different LLPS. three sequence-dependent to ated acid case, amino test in a cation–π interactions hydrophobic, as electrostatic, Ddx4, of disor- helicase roles DEAD-box intrinsically assess of N-terminal (IDR) conducted region the dered we of proteins, simulations of by multiple-chain (LLPS) underlain separation coarse-grained condensates phase predictive biomolecular liquid–liquid for transferable, model a explicit-chain toward Endeavoring Jos by Edited Canada 0A4, a ua Das Suman proteins disordered intrinsically of separation phase sequence-dependent in interactions charge, of roles Comparative www.pnas.org/cgi/doi/10.1073/pnas.2008122117 eateto iceity nvriyo oot,Trno NMS18 aaa and Canada; 1A8, M5S ON Toronto, Toronto, of University Biochemistry, of Department neatoscnitnl cone o vial experimental available for accounted consistently interactions iudpaesprto LP)o nrnial disordered intrinsically of (LLPS) separation phase liquid rpneac frcn dacsdmntaeta liquid– that demonstrate advances recent of preponderance .Ouhc ieUiest,Hutn X n prvdOtbr2 00(eevdfrrve 6 2020) 26, April review for (received 2020 2, October approved and TX, Houston, University, Rice Onuchic, N. e ´ a iHunLin Yi-Hsuan , π rltditrcin r moidi classical in embodied are interactions -related | ebaeesorganelles membraneless a,b oetM Vernon M. Robert , | hs separation phase n aromatic and , b ui .Forman-Kay D. Julie , fhtrplmr swl sprietipc ftemperature of impact propensity pertinent LLPS as affect well residues (60) as aromatic heteropolymers hydrophobic or 61), of 53, 54), 48, residues] 51, (34, charge (4) (50, of charged pattern or sequence 60), the (39, and aromatic (54), theo- hydrophobic [number/fraction composition of different acid amino were how The address and to applied using 59). complementary are 57)—including (58, approaches (56, retical/computational representations residues struc- particle lower of at patchy or, groups 55) resolution, 53, 52, tural (47, residues either acid (51–54) for amino continuum account individual that or simulations (47–50) explicit-chain theory lattice coarse-grained field and (32–42), complex (43–46), approximation analytical more of simulations encompass levels into efforts various insights recent at elucida- theories These physical their condensates. as for vivo systems prerequisite in LLPS a vitro is in tion simpler tackle to were promis- approaches made rendering coarse-grained challenge, using 29–31), steps this theoretical initial Facing 19, ing impractical. (10, modeling processes complex, maintained atomistic nonequilibrium are acids nucleic by vivo and proteins often in still of condensates species is many Biomolecular involving condensates infancy. biomolecular its of in physicochemical the is ulse oebr2 2020. 2, November published First doi:10.1073/pnas.2008122117/-/DCSupplemental at online information supporting contains article This 1 the under Published Submission.y Direct PNAS a is article This interest.y competing no declare paper. authors analyzed y the The H.S.C. wrote and H.S.C. R.M.V., and S.D., Y.-H.L., Y.-H.L., S.D., S.D., research; and research; designed data; performed H.S.C. H.S.C. and and J.D.F.-K., R.M.V., R.M.V., Y.-H.L., Y.-H.L., S.D., contributions: Author owo orsodnemyb drse.Eal [email protected] Email: addressed. be may correspondence whom To rgesi aetwr rdciemdl o biomolecular for models buried predictive condensates. toward to made quantitative relative is and progress strengths conceptual fundamental con- hydrophobic ramifications contacts, nonpolar reduced these with residue– as highlighting such the solvent-exposed condensates, of largely in tacts’ aspects contacts physical residue elucidating than By weaker in struc- expected. is phenylalanine large protein reflected the residue of folded hydrophobic/aromatic well capability condensate-driving among the is although statistics tures, formation contact condensate established experi- of Critical reproduce facilitation that interactions importance. indicates behaviors various central phase mental how of of is assessment encoded sequences genetically dis- by cause governed biomolecular can are condensates condensates func- How of ease. biological dysregulation myriad whereas serve tions, organelles, includ- acids, membraneless nucleic ing and proteins of condensates Mesoscopic Significance hl xeietlpors a enteedu,ter for theory tremendous, been has progress experimental While π n hydrophobic and , b b,a oeua eiie optlfrSc hlrn oot,O M5G ON Toronto, Children, Sick for Hospital Medicine, Molecular PNAS n u u Chan Sun Hue and , | NSlicense.y PNAS oebr1,2020 17, November | a,1 . y o.117 vol. https://www.pnas.org/lookup/suppl/ π itrcin’notable -interactions’ | o 46 no. | 28795–28805

BIOPHYSICS AND COMPUTATIONAL BIOLOGY (20, 43, 51, 54), hydrostatic pressure (62–64), salt (41, 46), and relative to the WT (4, 65, 67). The CS, FtoA, and RtoK variants osmolyte (27, 63), offering physical insights into the LLPS behav- are useful probes for LLPS energetics. They were constructed iors of, for example, the DEAD-box RNA helicase Ddx4 (34, 65), specifically to study the experimental effects of sequence charge RNA-binding protein fused in sarcoma (FUS) (52), prion-like pattern (the arrangement of charges along CS sequence is less domains (60), and postsynaptic densities (64). blocky than that in WT while the amino acid composition is Developing LLPS models with transferable interaction poten- unchanged), the relative importance of aromatic/π-related vs. tials applicable to a wide range of amino acid sequences is hydrophobic/nonpolar interactions (all 14 Phe residues in WT essential for advancing fundamental physical understanding of Ddx4 IDR are mutated to Ala in FtoA), and the role of Arg vs natural biomolecular condensates and engineering of bioinspired Lys (all 24 Arg residues in WT IDR are mutated to Lys in RtoK) materials (66). In this endeavor, the rapidly expanding reper- on the LLPS of Ddx4. toire of experimental data offers critical assessment of theoret- ical and computational approaches. Building on aforementioned Assessing Biophysical Perspectives Embodied by Different Coarse- progress (34, 41, 52, 53), the present study evaluates a variety of Grained Interaction Schemes for Modeling Biomolecular Conden- interaction schemes for coarse-grained residue-based chain sim- sates. We consider the potential functions in the hydrophobicity- ulations of LLPS of IDPs or IDRs, including but not limited to scale (HPS) and the Kim–Hummer (KH) models in Dignon et al. schemes proposed in the literature (52). We do so by first com- (52) as well as the HPS potential with augmented cation–π terms paring their sequence-specific predictions against experiment on (72), all of which share the same bond energy , Ubond, for the RNA helicase Ddx4 for which extensive LLPS data on the chain connectivity and screened electrostatic term, Uel, for pairs wild-type (WT) and three mutant sequences are available to of charged residues, as described in SI Appendix, SI Text. We probe the contribution of hydrophobic, electrostatic (4), cation– focus first on the pairwise contact interactions between amino π, and possibly other π-related (4, 65, 67) interactions. We use acid residues, which correspond to the Uaa energies of either the these data to benchmark the relative strengths of different types HPS or KH model (excluding Ubond and Uel). of interaction in our model. Of particular interest are the aro- The HPS model assumes that the dominant driving force for matic (68) and other π-related (67) interactions, which have IDP LLPS is hydrophobicity as characterized by a scale for the significant impact on folded protein structure, conformational 20 amino acid residues. Pairwise contact energy is taken to be distribution of IDPs, and LLPS properties (4, 34, 39, 60, 69–73) the sum of the hydrophobicities of the two individual residues of but are often not adequately accounted for in model poten- the pair. The HPS model adopts the scale of Kapcha and Rossky, tials (67). Interestingly, a simple statistical potential based upon in which the hydrophobicity of a residue is a composite quan- folded protein structures (74, 75) accounts for the general trend tity based on a binary hydrophobicity scale of the atoms in the of LLPS properties of the four Ddx4 IDR sequences, including residue (76). that LLPS is more favored by arginine than lysine despite their In contrast, the KH model (79) relies on knowledge-based essential identical electric charges, but that a model potential potentials derived from contact statistics of folded protein struc- that relies solely on hydrophobicity (76) does not. This find- tures in the Protein Data Bank (PDB). As such, it assumes ing indicates that at the coarse-grained level of residue–residue that the driving forces for IDP LLPS are essentially identi- interactions, IDP/IDR LLPS is governed largely by similar cal to those for protein folding at a coarse-grained residue- forces—including the π-related ones—that drive protein fold- by-residue level, as obtained by Miyazawa and Jernigan (75), ing. Analogous agreement between statistical potential model without singling out a priori a particular interaction type as being prediction and experiment with respect to the arginine/lysine dominant. contrast is also found for the N-terminal RGG domain of The HPS model has been applied successfully to study the FUS P-granule RNA Ddx3 helicase LAF-1 (77). However, experimen- low-complexity domain (80), the RNA-binding protein TDP-43 tal data on tyrosine-to-phenylalanine mutants of LAF-1 and FUS (81), and the LAF-1 RGG domain as well as its charge shuffled (39) indicate that the contribution of the large aromatic residue variants (77). A temperature-dependent version of HPS (HSP-T) phenylalanine to LLPS is overestimated by statistical poten- (51) was also able to rationalize the experimental LLPS proper- tials, most likely because the interactions involving phenylalanine ties of artificial designed sequences (82). When both the HPS in the sequestered hydrophobic core of globular proteins are and KH models were applied to FUS and LAF-1, the predicted not sufficiently representative of more solvent-accessible LLPS- phase diagrams were qualitatively similar for a given sequence, driving interactions, pointing to a crucial aspect of LLPS energet- although they exhibited significantly different critical temper- ics toward which future investigations should be directed. To gain atures (52), which should be attributable to the difference in further insights into the electrostatic driving forces for LLPS, effective energy/temperature scale of the two models. Here we we have conducted explicit-water simulation and developed ana- conduct a systematic assessment of the two models’ underlying lytical theory which suggests, at variance with previous analyses biophysical assumptions by evaluating their ability to provide a (35, 37), that the physically expected dependence of effective consistent rationalization of the LLPS properties of a of IDR permittivity on IDR concentration may have a modest instead sequences. of drastic impact on LLPS propensity because of a tradeoff The scatterplot in Fig. 1A of HPS and KH energies indicates between solvent-mediated electrostatic interchain interactions that despite an overall correlation, there are significant outliers. and self-interactions. These findings and their ramifications are The most conspicuous outliers are interactions involving Arg discussed below. (red), which are much less favorable in HPS than in KH. By comparison, most of the interactions involving Pro, as depicted Results and Discussion by the 16 outlying blue circles as well as the single yellow and As described in Materials and Methods and SI Appendix, SI Text, single green circles to the left of the main trend, are consider- our coarse-grained protein chain model for IDP LLPS basically ably more favorable in HPS than in KH. Interactions involving follows the approaches in refs. 52, 53, which in turn are based on Phe (yellow) and Lys (green) essentially follow the main trend. a recently proposed efficient simulation protocol (78). In the first Those involving Phe are favorable to various degrees in both step of our analysis, we critically assess the models against the models. However, some interactions involving Lys are attractive experimental data on the Ddx4 IDR (SI Appendix, Fig. S1), which in HPS but repulsive in KH. For example, Lys–Lys interac- indicate that all three Ddx4 IDR mutants—the charge scrambled tion is attractive at ≈ −0.1 kcal mol−1 for HPS but is repulsive (CS), phenylalanine-to-alanine (FtoA), and arginine-to-lysine at ≈ +0.2 kcal mol−1 for KH. Fig. 1B underscores the difference (RtoK) variants—have significantly reduced LLPS propensities in interaction pattern of the two models for the WT Ddx4 IDR.

28796 | www.pnas.org/cgi/doi/10.1073/pnas.2008122117 Das et al. Downloaded by guest on September 30, 2021 Downloaded by guest on September 30, 2021 hw uvsqatf h ednyfragvnpi frsde oegg ncation–π in engage to residues of pair given a for C tendency the quantify curves shown cation–π a h omlvco ftepaeo h rmtcrn eemndfo h oiin fisfis he tm.Teylo ie aksailsprtosue to used separations spatial mark al. lines being et yellow line Das The blue the atoms. the three from with first frame, farthest its coordinate orbitals of local electronic positions a the the constitute of from ring determined surfaces aromatic ring the exterior aromatic of the the corner cation–π on of a the points from plane define are emanating the lines here of red vector dots and normal red green, the The blue, blue. The in ring. aromatic nitrogen the and red, in oxygen black, (Inset an separated below or residues above and structure given a in chains different on residues from cation– π collected alternate are An statistics (B) pair curves. Contact dashed (67). by set and nonredundant solid published by a represented from are Lys cation–π in six and those aforementioned here—unlike Arg ( curves and curve; plotted blue, (solid the and that favorable more green, significantly red, is as Arg–Tyr/Phe/Trp labeled are Trp and mol kcal 3.0 2. Fig. (52). model types ) of (vertical residues two KH between the Text), in SI those versus variable) (horizontal HPS 1. Fig. neato teghi setal nfr,irsetv fthe of cation–π irrespective the 1) uniform, scenarios: essentially alternate is two strength in consider explained interaction As we 2). Text, (Fig. Trp SI or Phe, Tyr, aromatic cation–π for model as specific of to set cation–π referred another KH, + study and HPS we HPS of 77), to distribution addition 73, schemes—in 72, conformational interaction 39, as (4, LLPS well and as IDP (69) structure folded below. explored be LLPS will sequence-dependent to as properties, lead in should predictions uniform differences different more These significantly is interactions. pattern repulsive HPS no attrac- the with both whereas with repulsions, heterogeneous and more tions clearly is pattern KH The positions from residue differently represent colored axes are the horizontal circles) whereas and black plot, vertical (yellow-filled 2D The Phe the positions and scales. in at circles), provided the pairs (green by residue Lys coded between circles), color energies (red are Arg Contact involving (B) contacts circles). of (blue Energies others types). acid amino 20 the α ABC eas fteiprac fcation– of importance the of Because –C ≥ Potential Energy α 0aioaisaogtesm hi.C chain. same the along acids amino 50 facnatbtena r a h o)adaPe(ttebto) ooso h hmclbnsidct ye fao novd ihcro in carbon with involved, atom of types indicate bonds chemical the of Colors bottom). the (at Phe a and top) the (at Arg an between contact a of ) itne eeacation–π a Here distance. ctepo f20piws otc nris(nuiso clmol kcal of units (in energies contact pairwise 210 of Scatterplot (A) potentials. coarse-grained residue-based acid amino two Comparing π cation– Possible (kcal/mol) lk otc dfie eo)dvddb h oa ubro eiu ar ihC with pairs residue of number total the by divided below) (defined contact -like −1 safnto frsdersdedsac o h eiu ar r–y,AgPe r–r,LsTr y–h,adLsTp hri y,Phe, Tyr, wherein Lys–Trp, and Lys–Phe, Lys–Tyr, Arg–Trp, Arg–Phe, Arg–Tyr, pairs residue the for distance residue–residue of a as sp lk otc.Teeylo ie ontrpeetceia bonds. chemical represent not do lines yellow These contact. -like ta umn h P oetaswt terms with potentials HPS the augment —that 2 abnao ln h omlo h rmtcrn naTr h,o r eiu.Ti rtro seepie ytemlclrdrawing molecular the by exemplified is criterion This residue. Trp or Phe, Tyr, a in ring aromatic the of normal the along atom carbon neatosbtenAgo y n the and Lys or Arg between interactions itne()Distance(Å) Distance (Å) π cation– model a and potential HPS coarse-grained the of Sum (A) potentials. interaction AB (a)( ar clrcdda in as coded (color pairs a lk otc srcgie fete y Zo nAgN1ntoe tmi ihn3.0 within is atom nitrogen NH1 Arg an or NZ Lys a either if recognized is contact -like ) i = i , j j eaae by separated otc nrisaesonaogietemdlptnil’rsetv oo scales. color respective potentials’ model the alongside shown are energies contact α d o oti h P oeta.(C potential. HPS the contain not A—do –C π α neatosi protein in interactions itne r iie no0.2- into divided are distances optduigasto ,4 ihrslto -a rti rsa tutrs(resolution structures crystal protein X-ray high-resolution 6,943 of set a using computed A) r ij = r 0 IAppendix, SI hr h enr–oe opnn fteptnili iiu (i minimum is potential the of component Lennard–Jones the where cπ 5 ) ij = 5ka mol kcal 1.85 10 i Arg , E j ij tutrs(7,w n hta r–rmtcpi sa least at a is within be pair to Arg–aromatic pair Lys–aromatic an a than that likely more find 75% we protein X-ray (67), high-resolution 6,943 also structures of may set a based, among is Indeed, potential and tions. KH Miyazawa Arg–π the that of which suggest those on 75) including (74, structures, Jernigan PDB dif- of cation–π to in LAF-1 statistics pointing Lys and and investigation Arg theoretical (39) of recent FUS roles RtoK ferent a of the as variants well of and as cases is (67) (77), sub- the IDR scheme Lys Ddx4 in latter to of as The mutant Arg propensity, 2B). that LLPS reduce showing (Fig. stitutions experiments Lys recent for by than motivated Arg anal- for earlier cation–π an stronger the by 2) and suggested (69), as ysis 2A), (Fig. pair cation–aromatic fthe of Lys (r is o ahbn h eaiefeunyi h ubro ntne of instances of number the is frequency relative the bin, each For bins. A 0 ˚ saetepiws oeta energies potential pairwise the are )s −1 hnLsTrPeTp(ahdcre ( curve; (dashed Lys–Tyr/Phe/Trp than ) omlzdC Normalized ) n 15 = neato rvddta h ari ptal eaae yagiven a by separated spatially is pair the that provided interaction 3 euneo TDx D [Ddx4 IDR Ddx4 WT of sequence 236 i , j α ≤ –C The n. α PNAS itne ihntenro ag ftebn hs the Thus, bin. the of range narrow the within distances 20 α –C i trcin r togrta Lys–π than stronger are attractions | 6 =j α oebr1,2020 17, November itnedpnetcnatfeunisfrthe for frequencies contact distance-dependent -ln (relative frequency) are models KH and HPS the in energies contact 10 0 2 4 6 8 5.0 neato teghi significantly is strength interaction neato ihauiom( uniform a with interaction U aa|HPS fayoeo h ons1.7 points the of one any of A Distance (Å) ˚ 7.5 neatos(3.Contact (83). interactions cπ | = 1.7Å (r N1 ≤ ) o.117 vol. ij or ) , 3.0 Å = 4]i h w potentials two the in (4)] j eesadfrlbl for labels for stand here 10.0 5ka mol kcal 0.65 U aa|KH oeta nwhich in potential | (r o 46 no. IAppendix, (SI ) sp 12.5 2 abn in carbons −1 −1 interac- C ≤ | nthe in ) .Note ). α cπ 1.8 28797 –C 15.0 ) ij A) = ˚ α A ˚

BIOPHYSICS AND COMPUTATIONAL BIOLOGY distance of ≤ 6.5 A(˚ SI Appendix, Fig. S2), a separation that is are assuringly in the same range as those deduced experimen- often taken as a criterion for residue–residue contact (74). On tally (65), the model temperature (in K) of our simulated phase top of that, given an Arg–aromatic and a Lys–aromatic pair sep- diagrams in Figs. 3B and 4 should not be compared directly with arated by the same Cα–Cα distance (Fig. 2C), the Arg–aromatic experimental temperature. This is not only because of uncertain- pair (solid curves) is more likely than the Lys–aromatic pair ties about the overall model energy scale but also because the (dashed curves) to adopt configurations consistent with a cation– models do not account for the temperature dependence of effec- π interaction. We emphasize, however, that although a signifi- tive residue–residue interactions (20, 43, 51). For simplicity, our cantly stronger Arg- than Lys-associated cation–π interaction is models include only temperature-independent energies as they explored here as an alternate scenario, it is probable, as argued are purposed mainly for comparing the LLPS propensities of dif- by Gallivan and Dougherty using a comparison between Lys-like ferent amino acid sequences on the same footing rather than for ammonium–benzene and Arg-like guanidinium–benzene inter- highly accurate prediction of LLPS behaviors of any individual actions, that the strengths of the pure cation–π parts of Arg– and sequence. Lys–aromatic interactions are similar despite the relative abun- Fig. 3 B, Left, provides the HPS phase diagrams at relative per- dance of Arg–aromatic contacts due to other factors (69) such as mittivity r = 80 (approximately equal to that of bulk water, as in π–π effects (67). ref. 52). The predicted behaviors of the CS and FtoA variants are consistent with experiments in that their LLPS propensities Hydrophobicity, Electrostatics, and Cation–π Interactions Are Insuffi- are reduced relative to WT (4, 65), but the predicted enhanced cient by Themselves to Rationalize Ddx4 LLPS Data in Their Entirety. LLPS propensity of RtoK is opposite to the experimental finding We begin our assessment of models by applying the HPS and of Vernon et al. that the LLPS propensity of RtoK is lower than HPS + cation–π potentials to simulate the phase diagrams of the that of WT (67). This shortcoming of the HPS model is a conse- four Ddx4 IDRs (SI Appendix, Fig. S1), the sequence patterns of quence of its assignment of much less favorable interactions for which are illustrated in Fig. 3A using a style employed previously, Arg than for Lys, as noted in Fig. 1A. e.g., in refs. 4, 34, 39. The phase diagrams presented here are Fig. 3 B, Right, provides the HPS + cation–π phase diagrams. coexistence curves. When the overall average IDR density lies They are computed for r = 80, 40, and 20 to gauge the effect of in between the left and right arms of the coexistence curve at a electrostatic interactions relative to other types of interactions. given temperature, the system phase separates into a dilute phase The r-dependent results are also preparatory for our subsequent and a condensed phase with IDR densities given by the low- investigation of the effect of IDR concentration-dependent per- and high-density values, respectively, of the coexistence curve at mittivity (35, 37) on predicted LLPS properties. All of the the given temperature. When the average IDR density is not in HPS + cation–π phase diagrams here still disagree with exper- the region underneath the coexistence curve, the system is not iment (65, 67) as they all predict a higher LLPS propensity phase separated (see, e.g., refs. 20, 35, 84 for introductory discus- for the RtoK variant than for WT. Apparently, the bias of the sions). Consistent with experiments (4, 65), the simulated phase HPS potential against Arg interactions is so strong that it can- diagrams (Fig. 3B) exhibit upper critical solution temperatures, not be overcome by additional Arg–aromatic interactions that which is a maximum temperature above which phase separation are reasonably more favorable than Lys–aromatic interactions does not occur (corresponding to the local maxima of the coexis- (Fig. 2B). The r = 80 results for both uniform and variable tence curves at IDR density ≈ 200 mg/mL in Fig. 3). We empha- cation–π strength exhibit an additional mismatch: contrary to size, however, that although the simulated critical temperatures experiments (4, 65), they predict similar LLPS propensities for


B 500 320 WT 300 CS 400 FtoA 280 RtoK 300

260 200

240 400

220 300

Model Temperature 200 200 0 250 500 750 0 500 1000 0 500 1000 0 500 1000 Density (mg/ml) Density (mg/ml)

Fig. 3. Simulated phase behaviors of Ddx4 IDR variants in a hydrophobicity-dominant potential augmented by cation–π interactions. (A) Sequence patterns of the WT and its charge-scrambled (CS), Phe to Ala (FtoA) and Arg to Lys (RtoK) variants. Select residue types are highlighted: Ala (orange), Asp and Glu (red), Phe (magenta), Lys (cyan), and Arg (dark blue); other residue types are not distinguished. (B) Simulated phase diagrams of WT, CS, FtoA, and RtoK Ddx4 IDR at various relative permittivities (r) as indicated, using the HPS model only (Left) and the HPS model augmented with cation–π interactions (Right) with either (Top) a uniform (cπ )ij as described in Fig. 2A or (Bottom) different (cπ )ij values for Arg and Lys as given in Fig. 2B.

28798 | www.pnas.org/cgi/doi/10.1073/pnas.2008122117 Das et al. Downloaded by guest on September 30, 2021 Downloaded by guest on September 30, 2021 stehg ereo naoaiiyi srbst r inter- 3B) Arg (Fig. to IDRs ascribes it Ddx4 unfavorability of of LLPS degree the high for the accounting is in HPS PDB. the of analyses bioinformatics importance the π all, After cation–π well. of as potentials knowledge-based ( permittivities relative different three at model KH the using computed were 4. Fig. a tal. et Das 67) 65, (4, experiments with the agreement for in largely trend cation–π overall augmented model—without cation–π Approximate + KH HPS an and HPS Provide the Potentials of Account Statistical Structure-Based of LLPS for account general. to adopted in themselves (76) IDRs by scale insufficient particular HPS—are the by by available accorded when of least is interactions—at trend these hydrophobicity that general cation–π follows It the data. and LLPS for Ddx4 charged, account hydrophobic, cannot account RtoK interactions, into when takes incomplete was which is experiment 3B perspective Fig. RtoK picture. the a the enters and such before WT FtoA) (34), of also performed trend possibly LLPS (and sequence-specific CS for account cation–π to and quate electrostatic only involving cation–π weak low- interactions electrostatic to this relatively strong in relative the by because variant overwhelmed signif- probably is FtoA a contribution (4), the of WT of trend of propensity experimental that LLPS the reproduce lower to icantly fail they case cation–π cation–π variable overwhelmed presumed unphysically the are con- by dielectric interactions this under electrostatic that suggesting dition, WT, and variant CS the P oe.I hsrset twudntb upiigthat surprising be not the would in it hydrophobicity respect, as thereof, this lack cation– such In or energetics model. of, HPS of dominance types the IDR particular prejudging and PDB folding without protein the both LLPS governing from effects key derived capture may potentials statistical this knowledge-based in examined IDRs Ddx4 KH the and for in S4, HPS work. provided and of are S3 calculation phase Figs. similar Appendix, condensed of the Examples (52). in models dynamics liquid- mean-square verify chain to Time-dependent used like config- been 5. has chain coordinates Fig. molecular simulated of in deviation our provided of are HPS- Illustrations urations and (52). and KH- FUS IDRs for Similar obtained LAF-1 (65). were densities temperature phase same condensed simulated the at mM 100 of value experimental at propensi- the to WT LLPS rable of For lower density 4). phase have condensed (Fig. simulated to WT potential than KH ties the by predicted –π sdsusdaoe ao as ftesotoigof shortcoming the of cause major a above, discussed As hssceso h Hmdlsget htempirical, that suggests model KH the of success This neatosi D LS(7 srcgie agl by largely recognized is (67) LLPS IDR in interactions iuae hs eair fDx D ainsuiga neato ceebsdlreyo D-eie ttsia oetas hs diagrams Phase potentials. statistical PDB-derived on largely based scheme interaction an using variants IDR Ddx4 of behaviors phase Simulated π  r n other and π austse;ie,altreDx D ainsare variants IDR Ddx4 three all i.e., tested; values r RltdDiigFre o d4LLPS. Ddx4 for Forces Driving -Related neatosi oddpoensrcue(9 and (69) structure protein folded in interactions iuto.Tkntgte,atog perspective a although together, Taken situation. loidct nadtoa imth nthis in mismatch; additional an indicate also π rltditrcin r eetdi these in reflected are interactions -related

hw htteHS+cation–π + HPS the that shows Model Temperature 200 300 400 neatos The interactions. 0 oes ietapiaino the of application direct models, Density (mg/ml) ∼  400 r ∼ 80 = 500 mg 500 neatoswsade- was interactions mL and em—ed oan to terms—leads mg  RtoK FtoA CS WT r −1 mL 20 = T 1000 ih[al = [NaCl] with ncnrs to contrast In 300 = −1 eut for results scompa- is 0 model, ,the K, Density (mg/ml) SI 500 r ). h aewt h vrg being at average favorable the with more same much the is Lys Asp, for −0. Lys, average Arg, to corresponding residues equal the charged is the Glu except and acids residues, amino the all for of one when energy model, HPS In interactions. the Lys-associated to relative interactions associated in)frefil,pst htAghstelathydrophobicity −4. least has the Lys simula- has whereas liquid Arg for that partial of posits potentials value atomic field, (optimized the force OPLS on tions) the based in scale, charges hydrophobicity Its actions. i.5. Fig. tmdltmeaue35K h oo cee r h aea hs in those as same the are schemes color The respectively. K. A–C, 325 temperature (D–F the model system. highlight at the to in differently conformations colored of are diversity chains Neighboring color. same the (B) (C 3A. snapshot. Fig. the in below the key the using by colored as provided are Same as 92) residues (91, acid scheme amino VMD the default wherein K, 375 temperature for potential KH the F E D C B A A hntecagdrsde r nldd h rn is trend the included, are residues charged the When 1276. 0 7) hsasgmn eut nhgl naoal Arg- unfavorable highly in results assignment This (76). C lutaiesasoso Ddx4 of snapshots Illustrative A E D,E n h etlathdohbci s with Asp is hydrophobic next-least the and +14.5, 1000 xetteclrshm a hw)i setal dnia othat to identical essentially is shown) (as scheme color the except ij DEF aeas Same ) (r 0 ) PNAS 0 sAg h vrg of average the Arg, is 1A) (Fig. n h othdohbci h with Phe is hydrophobic most the and +5.0,  R G A Density (mg/ml) r = | and −0. o–hs-eaae nphta model at snapshot non–phase-separated A (A) 40. HIKLM oebr1,2020 17, November B 500 0762 xetalrsde ln h aecanshare chain same the along residues all except K nuiso kcal of units in N1 Spaebhvossmltdusing simulated behaviors phase CS −0.0677 1000 F N hs-eaae snapshot phase-separated A ) | P o.117 vol. Q o r and Arg for i A R ntepairwise the in , mol | S E o 46 no. ij −1 T (r Others whereas , 0 VW ) −0.119 | over +7.5, 28799 Y j

BIOPHYSICS AND COMPUTATIONAL BIOLOGY for Lys. In contrast, for the KH model, the trend is opposite with accounting of the π-related energetics of biomolecular LLPS. Arg-associated interactions much more favorable: the corre- Physically, the general preference for Arg–π over Lys–π contacts sponding average is −0.123 for Arg and −0.041 for Lys when is underpinned in part by the Arg geometry which allows for charged residues are excluded in the averaging and −0.0990 for favorable guanidinium–aromatic coplanar packing (71), as dis- Arg and −0.0161 for Lys when charge residues are included. This cussed recently by Wang et al. (ref. 39 and references therein), trend echoes an earlier eigenvalue analysis of the Miyazawa– and that Arg–aromatic is less attenuated by the aqueous dielec- Jernigan energies (75) (which underlie the KH potential) indicat- tric medium because it is less dominated than Lys–aromatic ing that Arg has a significantly larger projection than Lys along interactions by electrostatics (83). Consistent with this perspec- the dominant eigenvector (85). tive, R to K mutants of FUS (39) and LAF-1 RGG (77) also Whereas correlation among hydrophobicity scales inferred exhibit substantially reduced LLPS propensity relative to WT. from different methods is limited (86–89) with significant vari- ations especially for the nonhydrophobic polar and charged Phenylalanine Interactions in Liquid Condensates, Expected to Be residues (87), the extremely low hydrophobicity assigned by HPS More Solvent-Exposed, Are Weaker than Statistical Estimates Based (52, 76) to Arg relative to Lys is unusual. For instance, Lys is on Mostly Core Phenylalanine Contacts in Folded Proteins. To assess substantially less hydrophobic than Arg in two of the three scales further the generality of the KH model, we apply it to simulate tabulated and compared in ref. 89. In a commonly utilized scale the phase behavior of WT LAF-1 and its RtoK and tryosine-to- based on the free energies of transfer of amino acid derivatives phenylalanine (YtoF) mutants (Fig. 6 and SI Appendix, Fig. S5). from water to octanol measured by Fauch`ere and Pliˇska (90) Simulation studies of LLPS of full-length and the RGG IDR (the second scale tabulated in ref. 89), Arg is only slightly less (52) of LAF-1, including the latter’s charge shuffled variants (77), hydrophobic (+5.72 kJ mol−1) than Lys (+5.61 kJ mol−1), and have been conducted extensively using the HPS (52, 77) and KH thus, essentially, Arg and Lys are deemed to possess equally low (52) models to gain valuable insights, but phase behaviors of the hydrophobicities. Accordingly, this scale affords a better correla- RtoK and YtoF LAF-1 RGG mutants have not been simulated tion with the Miyazawa–Jernigan energies (75) (figure 3b of ref. using these models. Recent experiments indicate that the RtoK 89) than that exhibited in Fig. 1A. mutant does not phase separate under the conditions tested, It is reasonable to expect the 210 (or more) residue–residue whereas the LLPS propensity of the YtoF mutant is reduced contact energy parameters in PDB structures-based potentials to relative to that of WT (77). As for the case of Ddx4 IDR, the KH- contain more comprehensive energetic information than merely predicted coexistence curve for RtoK LAF-1 RGG in Fig. 6 (pur- the hydrophobicities of the 20 types of amino acid residues. In ple curve) is consistent with the experimental trend as it has a far this regard, it is notable that a higher propensity for Arg than lower critical temperature than that of the WT (red curve). How- Lys to engage favorably with another residue appears to be a ever, KH posits a significantly higher LLPS propensity for YtoF robust feature of PDB statistics. It holds for the cation–aromatic (green dashed curve with a much higher critical temperature), pairs we analyze in SI Appendix, Fig. S2, for the KH potential, which is opposite to experimental observation. On closer inspec- and also for the original Miyazawa–Jernigan energies put forth tion, the basic feature that causes this shortcoming is that KH in 1985 (74). According to table V of ref. 74, the contact ener- deems Phe interactions more favorable than Tyr interactions. gies eij between Arg and aromatic or negatively charged residues According to KH, Eij (r0) for i = Phe averaged over j for all −1 are −3.54, −3.56, −2.75, −2.07, and −1.98kBT for Arg–Phe, amino acids is −0.453 kcal mol for Phe, which is significantly Arg–Trp, Arg–Tyr, Arg–Glu, and Arg–Asp, respectively (kB is more negative than the corresponding average Eij (r0) of −0.281 Boltzmann constant, and T is absolute temperature), whereas kcal mol−1 for Tyr. For this reason, KH is unlikely to repro- the corresponding contact energies are weaker for Lys at −2.83, duce the experimentally observed reduced LLPS propensities −2.49, −2.01, −1.60, and −1.32kBT for Lys–Phe, Lys–Trp, Lys– for any other YtoF mutant either, including the YtoF mutants Tyr, Lys–Glu, and Lys–Asp, respectively. All 20 Arg interactions of FUS (39). are more favorable than the corresponding Lys interactions. The This overestimation of the favorability of Phe-related interac- average eij over all Arg-associated pairs is −2.22kBT , which is tions in the LLPS context also causes the KH-predicted LLPS substantially more favorable than the corresponding average of propensity of the FtoA mutant of Ddx4 IDR (Fig. 4, green −1.4795kBT for the Lys-associated pairs. It is apparent from curves) to be substantially lower than that of the RtoK mutant the present application of KH to the Ddx4 IDRs that this fea- (purple curves). Experimentally, however, FtoA mutant LLPS ture is crucial, at least at a coarse-grained level, for an adequate is observed at ∼ 350 mg mL−1 protein concentration, but the

A WT RtoK YtoF B 300

200 WT RtoK YtoF

Model Temperature 100 0 200 400 600 Density (mg/ml)

Fig. 6. Simulated phase diagrams of LAF-1 IDRs computed by the KH model at r = 40 for the three sequences in SI Appendix, Fig. S5.(A) Sequence patterns of the WT and two variants. Highlighted residues are Phe (magenta); Lys (cyan); Arg (dark blue), as in Fig. 3A; and Tyr (pink); other residue types, including Ala, Asp, and Glu which were highlighted in Fig. 3A, are not distinguished here. (B) Phase diagrams for WT and mutant sequences are plotted in different colors as indicated.

28800 | www.pnas.org/cgi/doi/10.1073/pnas.2008122117 Das et al. Downloaded by guest on September 30, 2021 Downloaded by guest on September 30, 2021 a tal. et Das with KH, in Ala– strengths Ala–Ala, different interaction vastly Phe–Phe the and to contrast Phe, in Thus, 99)]. (98, oc PF)a 9 yMkwk ta.idct htthe are that minima indicate contact al. (99) all et ethylbenzene–ethylbenzene Makowski ethane–ethylbenzene and ethane–ethane, by (98), the at K mean energies of Phe, 298 free lowest potentials at and pairwise (PMFs) Ala of force simulations for intermolecu- water models potential) transferable lar as (3-point ethylbenzene explicit-TIP3P instance, sub- respectively, and For not interactions. ethane LLPS—is Ala-related using than to accessible stronger pertinent solvent stantially highly more a is in environment—which interaction sug- Phe-related evidence that computational regard, gests this hydrophobicity In common (97). by measurements quantified interactions bulk of from different those distinctly are 96)—that (95, ther- signatures modynamic properties—including have environments solvent-exposed is prominent. interactions more be folded-state-stabilizing to between expected and character in LLPS-driving 0.52, difference 0.74, the Phe’s 0.76, (94), are corre- Arg respectively) (the and 0.64, folding Lys, and Ala, upon of Tyr, 0.88 solvent for average fractions the to sponding on inaccessible of with area one proteins surface is folded its Phe in Since residues proteins. buried most folded upon based in are turn frequencies in contact KH which because potentials statistical mutants from FtoA derived to and is YtoF inability This of KH’s phase. behaviors understand condensed LLPS to capture the perspective least in a at even offers are observation solvent context to LLPS exposed the partially in because contacts protein residue–residue folded a from most of significantly core well-packed differ the stabilizing also of those can character contributions the phenylalanine, LLPS-driving as its such properties aromatic and context. interactions LLPS any Phe–Arg the over to in Tyr–Arg account for preference effects the for bonding degree hydrogen similar whether tigate ftepyia esn o hsbhvo sta h y hydroxyl the Tyr with the bond hydrogen that N a is Phe—makes in behavior One absent this 16 is state. for —which of reasons folded out physical the 14 the Sa3, destabilize of and substitutions Sa single-site not ribonuclease YtoF proteins are even globular However, gener- Tyr. LLPS the than is for hydrophobic to more Phe be hydrophobicities. to residues considered their ally acid by amino dominated necessarily of contributions getic to either. expected trend not YtoF is observed HPS the 3), insufficient capture (Fig. only experiments favorable RtoK not for is more account HPS to generally Thus, differ interactions. still Tyr-related Phe are than and interactions Tyr, Phe-related Ala, but for model HPS being less, the are in Phe energies and action Tyr, Ala, rank for −10.1 energies with free the line Fauch fer in in of instance, largely For scale scales. are water-to-octanol hydrophobicity energies common and in KH YtoF ordering for the observations mutants, experimental FtoA reproduce not do gies 0. + −0.0161 kcal KH average corresponding the iiso d4Fo n tKi poiet xeiet This experiment. to at opposite energy types, residue is KH RtoK the propen- and because LLPS is FtoA of Ddx4 ordering mg of rank 25 sities KH-predicted mg at the 400 observed words, to is other WT up of separate LLPS phase comparison, not does mutant RtoK  ned thsln enrcgie htpi neatosin interactions pair that recognized been long has it Indeed, hydrophobic both possesses that residue large relatively a For h bv osdrto nesoe neaanta h ener- the that again once underscores consideration above The ener- KH Phe-related and Tyr-, Ala-, these while Interestingly, tmo nAgrsde(3.I ol eelgtnn oinves- to enlightening be would It (93). residue Arg an of atom − ≈ mol kJ 1 −1 mol kcal −0.141, sfrlre hntecrepnigcag of change corresponding the than larger far is ) E 90=+0.0829 = 0990 −1 ij mol (r epciey ycmaio,teaeaeinter- average the comparison, By respectively. , −0.160 0 −1 −0.453 ) and −0.154, rmPet l (−0. Ala to Phe from Trwsntcniee nteestudies these in considered not was [Tyr kcal kcal E ij mol kcal (r mol −0.168 0 ` r n Pli and ere −1 ) mol o l vrgdoe l 20 all over averaged Ala for −1 smc esfvrbethan favorable less much is , −1 o h n h hnein change the and Phe for kcal rmAgt Lys. to Arg from 6 0 + 160 ˇ k 9) h trans- the (90), ska mol −1.76, −1 mL 5 +0.293 = .453 respectively, , −1 and −5.44, mL E 6) In (67). ) ij (r −1 0 = ) (in issgetta eraei effective in theo- decrease medium a that effective suggest with ries treatments lower Analytical have materials (100–102). the in Protein concentration—than phase. a IDR condensed dilute higher the to a in is affected materials there IDR is phase—where intervening the charges by two extent uniform the larger between LLPS: a upon interaction in nonuniform electrostatic certainly operate is environment to dielectric position-independent simplicity, a with for medium dielectric assumed, are inter- electrostatic IDRs, actions of Can LLPS Propensity of simulations LLPS implicit-solvent on Environ- Impact Its Dielectric Modest. but Be the Droplets Affect Condensed of Significantly ment Can to Concentration residues IDR acid LLPS. amino for potentials interaction among coarse-grained improve contacts features solvent-accessible overlooked hitherto knowledge, of this tackle With should experiment. investigations FtoA which the future context, with LLPS line the in in inter- be Phe in would and difference Ala between modest strength more action a suggest PMFs explicit-water −0.142, D tlwbtnta ihIRcnetain neetnl,the Interestingly, concentration. IDR (112, high of at salt solution not TIP3P but with for low observed at decrease is IDR effect to expected this known Here 113). is protein Permittivity the considered. on simulated 7A charges IDR 37), Fig. partial (35, in of the expectation effect from with is Consistent dielectric backbone. model the atomic to the simulated contribution in main in the difference 7A because The (Fig. S1). concentration Table IDR and S6,  Fig. simulated Text, of SI set extended an fraction the 7 A volume Fig. are of in IDR that plotted details are from values Methodological separately Lys, when in medium. treated Arg, applications provided of dielectric possible are charges for charges background chain neutralized chain are side side Glu the and which (101, constructed Asp, moment in artificially dipole (aDdx4) on performed system are Ddx4 also the permittivities are Simulations of Relative 103). boxes fluctuations chains. simulation by IDR in estimated water Ddx4 of WT model containing (111) (extended charge) SPC/E the point explicit-water or simple our (110) TIP3P for the either the used using and simulations are (108) (109) Simulations) forcefield Chemical amber99sb-ildn for Machine (Groningen (107). condensate a into partition preferentially molecules charged to and functional ions of various for be drivers as to e.g., their significance, expected condensates, interest are biomolecular of environments on been dielectric focus interior long have the settings For (105) biomolec- (106). related cellular and and (103), (104) solutions ular their folded of 101), properties (100, dielectric proteins The electro- LLPS. the for of forces account driving complete static more a analytical provide model with to information formulations to this combine not then effective will molecules, the We centration. how IDR atomic estimate explicit-water few to but use a LLPS we only Here chains involving LLPS. IDR simulation model of number to large needed a are computa- because are the costly simulations of extremely such models sites tionally but appropriate molecules, atomic protein to explicit-water and assigned water are using charges simulated partial wherein directly be can favor that 37). attractions (35, electrostatic phase condensed stronger the induces turn in lowers phase that condensed cooperative of a formation the in because manner LLPS polyampholytes enhances concentration r (φ) sotie in outlined As media aqueous polarizable in chains IDR of LLPS principle, In o d4adad4i elgbeecp tvr low very at except negligible is aDdx4 and Ddx4 for and −0.425, erae ihincreasing with decreases nrcn 5,5,5)adteaoecoarse-grained, above the and 55) 53, (52, recent In oeo h simulated the of Some Text. SI Appendix, SI PNAS | −0.756 oebr1,2020 17, November GROMACS Methods, and Materials φ (the kcal and oilsrt hi eedneon dependence their illustrate to  φ r r rvddin provided are s ∝ mol ,likely S6), Fig. Appendix, SI ocnrto eainand relation concentration φ −1 | o l ovn conditions solvent all for  epciey(2,these (52), respectively , r  o.117 vol. r eed nIRcon- IDR on depends ihicesn IDR increasing with  r hnbl water bulk than s  r | oee,the However, . o 46 no. IAppendix, SI |  r  and , 28801 r (φ)  r


Fig. 7. Effects of IDR concentration-dependent relative permittivity on phase behaviors. (A) Relative permittivity r(φ) values obtained by atomic simulations (symbols) using various explicit-water models (as indicated at the bottom) are shown as functions of Ddx4 volume fraction φ (φ = 1 corresponds to pure Ddx4). The blue curve is a theoretical fit of the SPC/E, [NaCl] = 100 mM explicit-water simulated data based on the Slab [Bragg and Pippard (114)] model (equation 34 of ref. 37), viz., 1/r(φ) = φ/p + (1 − φ)/w with the fitted p = 18.9 and w = 84.5 where p and w are the relative permittivity of pure protein and pure water, respectively. The black solid, dashed, and dashed-dotted lines are approximate linear models of r(φ) = pφ + w(1 − φ) with the same w but different p values as indicated (at the top right), resulting in dr(φ)/dφ slopes of −65.6, −83.9, and −42.2. (B and C) Theoretical phase diagrams of the four Ddx4 IDR variants were obtained by a RPA theory that incorporates an r(φ) linear in φ. Solid, dashed, and dashed-dotted curves correspond, as in A, to the three different p values used in the theory. The electrostatic contribution to the phase behaviors is calculated here using either (B) the expression for ˜ fel given in SI Appendix, SI Text, Eq. S51 [i.e., equation 68 of ref. 35 with its self-interaction term G2(k) excluded] or (C) the full expression for fel (equation 68 of ref. 35 or equivalently SI Appendix, SI Text, Eq. S2). Further details are provided in SI Appendix, SI Text.

r(φ) simulated with SPC/E water and 100 mM NaCl exhibits the final RPA + FH theories with IDR concentration-dependent nonlinear decrease with increasing φ, which is akin to that r(φ). Details of unit conversion between our explicit-chain predicted by the Bragg–Pippard (114) and Clausius–Mossotti simulation and our analytical RPA + FH formulation are in models, but the TIP3P-simulated r(φ) appears to be linear in φ, SI Appendix, SI Text. which is more in line with the Maxwell Garnett and Bruggeman In this connection, it is instructive to note that the correspond- models (37). ing averages of Eij (r0) [HPS] for the HPS model are −0.1214 We utilize the salient features of the coarse-grained KH chain for WT and CS, −0.1179 for FtoA, and −0.1294 for RtoK. In model for Ddx4 (Fig. 4) and the IDR concentration-dependent this case, the more favorable (more negative) average energy of permittivities from explicit-water simulations (Fig. 7A) to inform RtoK than WT underlies the mismatch of HPS prediction with an analytical theory for IDR LLPS, referred to as RPA + experiment seen in Fig. 3B. FH, that combines a random phase approximation (RPA) of Fig. 7 B and C show the phase diagrams of the four Ddx4 IDRs charge sequence-specific electrostatics and Flory–Huggins (FH) predicted using RPA + FH theories with three alternative IDR mean-field treatment for the other interactions (34, 35). An concentration-dependent r(φ) functions and KH-derived mean- in-depth analysis of our previous RPA formulation for IDR field FH parameters as prescribed above. In all cases considered, concentration-dependent r (35) indicates that only an r(φ) lin- the WT sequence (red curves) exhibit a higher propensity to ear in φ can be consistently treated by RPA (SI Appendix, SI LLPS than the three variants, indicating that this general agree- Text). In view of this recognition and considering the uncer- ment with the experimental trend seen in Fig. 4 holds up not tainties of simulated r(φ) for different water models (Fig. 7A), only under the simplifying assumption of a constant r but also three alternative linear forms of r(φ) (straight lines in Fig. 7A) when the dielectric effect of the IDRs is taken into account. As are used for the present RPA formulation to cover reasonable discussed in SI Appendix, SI Text, we have previously subtracted variations in r(φ). the self-energy term in the RPA formulation for numerical effi- The mean-field FH interaction parameters in our RPA + FH ciency because the term has no impact on the predicted phase models for the four Ddx4 IDRs are obtained from the four diagram when r is a constant independent of φ because the sequences’ average pairwise nonelectrostatic KH contact ener- self-energy contribution is identical for the dilute and condensed gies. For each of the 236-residue sequences, we calculate the phases. However, with an IDR concentration-dependent r(φ), average of the Eij (r0) [KH] quantity (Fig. 1A), for a given pair of as for the cases considered here, the self-energy—with the short- residue types, over all pairs of residues on the sequence, includ- distance cutoff of Coulomb interaction in the RPA formulation ing a residue with itself (236 × 237/2= 27,966 pairs total), except corresponding roughly to a finite Born radius (115)—is physi- those pairs involving two charged residues (Arg–Arg, Arg–Lys, cally relevant as it decreases with increasing r, and therefore, it Arg–Asp, Arg–Glu, Lys–Lys, Lys–Asp, Lys–Glu, Asp–Asp, Asp– affects the predicted LLPS properties as manifested by the dif- Glu, and Glu–Glu) because interactions of charged pairs are ference between Fig. 7 B and C. It follows that the self-energy accounted for by RPA separately. The resulting average ener- term quantifies a tendency for an individual polyampholyte chain −1 gies in units of kcal mol , −0.1047 for WT and CS, −0.0689 to prefer the dilute phase with a higher r—because of its more for FtoA, and −0.0924 for RtoK, are then input with an over- favorable electrostatic interactions with the more polarizable all multiplicative scaling factor into RPA + FH theories with environment—over the condensed phase with a lower r. This φ-independent r for three different fixed r= 80, 40, and 20. tendency disfavors LLPS. At the same time, the lower r in the The computed RPA + FH phase diagrams are then fitted to the condensed phase entails a stronger interchain attractive electro- corresponding phase diagrams simulated by coarse-grained KH static force that drives the association of polyampholyte chains. chain models in Fig. 4 to determine a single energy scaling factor Therefore, taken together, relative to the assumption of a con- from the best possible fit (SI Appendix, Fig. S7). The product of stant r, the overall impact of an IDR concentration-dependent this factor and the sequence-dependent averages of Eij (r0) [KH] r(φ) is expected to be modest because it likely entails a partial defined above is now used as the enthalpic FH χ parameters in tradeoff between these two opposing effects. This consideration

28802 | www.pnas.org/cgi/doi/10.1073/pnas.2008122117 Das et al. Downloaded by guest on September 30, 2021 Downloaded by guest on September 30, 2021 3 .Ce,X u .W,M hn,Paesprto ttesynapse. the at separation Phase phase Zhang, M. and Wu, coacervation H. Wu, of X. role Chen, X. The Jones, 13. P. J. Song, T. I. Wang, S. C. Stewart, J. R. 12. 0 .F aai .O e,A .Hmn .K oe,Booeua condensates: Biomolecular organelles. Rosen, membraneless of K. language M. molecular Hyman, The Shorter, A. J. A. Gomes, E. Lee, 11. O. H. Banani, F. S. 10. a tal. et Das aeas ihihe h eue lcrcpritvt niecon- inside permittivity electric We reduced proteins. the globular highlighted also of have interfaces the binding in or than cores rather sequestered interactions environments solvent-accessible such highly because in likely exist most potentials, estimated that statistical than from weaker significantly pheny- However, are of interactions binding. contributions lalanine and condensate-stabilizing the folding that protein found we study poten- statistical to PDB-derived developed common appears tials by IDRs captured of LLPS it partly for regard, be forces this to of In balance 60). the (67) that (39, reassuring propensities IDRs is other LLPS on of experiments analysis recent bioinformatics and with line in is by underpinned likely most favorable that is effect more an contacts, lysine-associated significantly over arginine-associated entails on experiment variants of arginine-to-lysine Rationalization themselves. by insufficient we cation–π are sequences, and account electrostatic, the hydrophobicity, of to that propensities find LLPS Aiming relative experiment. observed all and for simulation disagree- as between well we as variants, ment agreement two its from and knowledge RGG its essential LAF-1 and acquire WT IDR as well Ddx4 as WT model variants on three comparing data By IDRs. experimental interac- of against of LLPS predictions types for important the are to that as tions chain outlooks forces different protein embodying physical residue-based models system- the coarse-grained, by into evaluating condensates biomolecular insights atically of formation gained the drive have that we summary, In Conclusion in color same 7B). the crit- Fig. of lower curves have dashed-dotted curves than (dashed temperatures propensity ical LLPS in lower a decrease to sharper leads the a formulation appropriate self-energy, of physically the with curves for have 7C), dashed-dotted Fig. curves in color than same (dashed temperatures neglected critical is higher self-energy when the increasing propensity whereas with decrease picture, sharper physical a this with with fixed a Consistent for 4A. those Fig. than lower slightly 7 but comparable Fig. are in using for predicted accounted sities is fixed self-energy compara- a of temperatures), for effect critical those the to by ble characterized (as high LSpoeste rdce using predicted propensities LLPS 7 7C, Fig. Fig. in out borne is .N .Ndlk,J .Tyo,Bign ipyisadnuooy bratphase Aberrant neurology: and biophysics Bridging Taylor, P. J. Nedelsky, B. N. 9. Pom R. Sharpe, S. Muiznieks, D. L. phase- 8. of maturation and Formation Zeng Parker, R. M. Rosen, 7. K. M. Protter, W. S. D. Lin, Y. 6. Molliex A. 5. Nott J. T. 4. Kato M. 3. Li P. 2. Brangwynne P. C. 1. 0–1 (2020). 301–310 rniin ntesncsl omahsv system. adhesive (2017). worm 88–96 sandcastle the in transitions Chem. raieso ellrbiochemistry. cellular of Organizers disease. neurodegenerative in transitions proteins. extracellular other and (2018). elastin 4741–4753 of assembly in tion plasticity. and formation synapse understanding proteins. RNA-binding by droplets liquid separated fibrillization. pathological drives and assembly organelles. membraneless responsive environmentally hydrogels. within fibers dynamic form domains (2012). 336–340 483, dissolution/condensation. controlled hs rniin nteasml fmliaetsgaln proteins. signalling multivalent of assembly the in transitions Phase al., et 1572 (2019). 7115–7127 294, elfe omto fRAgaue:Lwcmlxt sequence complexity Low granules: RNA of formation Cell-free al., et eosiue otyatcdniya oeua ltomfor platform molecular a as density postsynaptic Reconstituted al., et tal. et hs rniino iodrdnaepoengenerates protein nuage disordered a of transition Phase al., et hs eaainb o opeiydmispooe tesgranule stress promotes domains complexity low by separation Phase , emiePgaue r iuddolt htlclz by localize that droplets liquid are granules P Germline al., et B  r r infiatylwr vrl they overall lower: significantly are (φ)s and  r s .W ely oeo iudlqi hs separa- phase liquid-liquid of Role Keeley, W. F. es, ` 40 = hnsl-nryi elce in neglected is self-energy When C. a.Rv o.Cl Biol. Cell Mol. Rev. Nat. Science π –π a.Rv Neurol. Rev. Nat. hntephysical the When 4B. Fig. in neatos hsperspective This interactions. 7913 (2009). 1729–1732 324, Cell Cell Cell φ 2–3 (2015). 123–133 163, o.Cell Mol. 5–6 (2012). 753–767 149,  1218 (2018). 1172–1187 174, r d.ClodItraeSci. Interface Colloid Adv. ed oahge LLPS higher a to leads o.Cell Mol. (φ)  r 7–8 (2019). 272–286 15, r relatively are (φ)s 8–9 (2017). 285–298 18, ihincreasing with LSpropen- LLPS B, 0–1 (2015). 208–219 60, 3–4 (2015). 936–947 57, a.Neurosci. Nat. .Ml Biol. Mol. J. interactions  r 80 =  Nature .Biol. J. r (φ) 239, 430, 23, in φ 5 .L inn .Zeg .Mta,Smlto ehd o iudlqi phase liquid-liquid liquid- studying for in challenges methods and Considerations Simulation Mittag, T. Gladfelter, Mittal, A. Alberti, J. S. Zheng, 26. W. Dignon, L. G. 25. biology. and phase physics polymer Bridging and separation: Phase Perry, preferences L. S. Conformational 24. Holehouse, S. A. protein Pappu, biological V. R. of Ruff, predictors M. First-generation K. Forman-Kay, 23. D. J. Vernon, phase M. R. sequence-dependent 22. for Theories Chan, Boeynaems S. S. H. 21. Forman-Kay, D. J. disease. and Lin, physiology Y.-H. cell in 20. condensation phase Liquid Brangwynne, P. C. Shin, Y. 19. Fujioka Y. 18. Wong E. L. 17. Kim H. T. 16. Tsang B. 15. Gibson A. B. 14. n opttoa eore rvddgnrul yCompute/Calcul by generously Discov- provided Canada H.S.C., H.S.C., to of resources RGPIN-2018-04351 to Canada. Council and computational NJT-155930 J.D.F.-K. Research to and and Wenwei RGPIN-2016-06718 Engineering MOP-84281 grants and Insti- and grants ery Canadian Schmit, Sciences by Research supported Health Jeremy Natural was work of Pappu, This tutes discussions. Rohit helpful for Mittal, Zheng Jeetain Holehouse, ACKNOWLEDGMENTS. Availability. Data (111) SPC/E RPA or analytical of with (110) in development provided conjunction the TIP3P are and in formulation with methodology (108) our and of GROMACS Details (109) using waters. forcefield IDRs amber99sb-ildn Ddx4 con- WT the are five permittivity on relative ducted concentration-dependent simulations IDR explicit-water estimating subsequent phys- The different for considered. embodying are functions energy perspectives IDP LLPS model ical con- simulated for discussed, produce condensed As 53 to slab-like (78). equilibration 52, data efficient initial for refs. an fol- allows in features molecules that figuration protocol IDP formulations simulation multiple dynamics The for Langevin LLPS. investigation in the present used largely the setup lows of modeling part first explicit-chain the coarse-grained implicit-water IDR-like Our novel Methods of and Materials design the acids advance biologi- will well. nucleic will as molecular directions materials and of fundamental these processes of in incorporation cal understanding ion Progress and our (43). small deepen 117), simulations of LLPS account 116, into and accurate (41, temperature-dependent an attractive effects (51), devising between interactions balance (53), effective proper interactions solvent-exposed a repulsive of phenylalanine, such description residues as hydrophobic/aromatic adequate large involving are to—an interactions include—and limited These for nonetheless. not remain investigated model further questions coarse-grained be the Many transferable to clarify LLPS. a represents to biomolecular toward also serves sequence-specific step it only principles, useful not general a study of present issues the aforementioned conden- told, biomolecular All given a sate. into preferential because molecules and certain biology reactions of biochemical in partition on importance impact functional potential its of of be should permittivity inter- concentration-dependent itself inter-IDR IDR by on of and effect between the self-energies tradeoff actions, a IDR of on on because consequences influence modest its overall be may effect’s propensity this LLPS Although phases. IDR densed iudpaesprto n imlclrcondensates. biomolecular and separation phase liquid proteins. disordered of separation from Insights sequences: Sci. complexity Interface low disordered simulations. intrinsically multiscale of behavior separation. phase Biol. Cell condensates. biomolecular of behaviors Science Nature responsiveness. cell B priming deadenylation. and translation of regulation lates formation. granule mRNA U.S.A. of Sci. control Acad. bidirectional through translation separation. 0–0 (2020). 301–305 578, af32(2017). eaaf4382 357, hshrgltdFR hs eaainmdl activity-dependent models separation phase FMRP Phosphoregulated al., et hsh-eedn hs eaaino MPadCPI1recapitu- CAPRIN1 and FMRP of separation phase Phospho-dependent al., et 2–3 (2018). 420–435 28, hs eaainognzstest fatpaooeformation. autophagosome of site the organizes separation Phase al., et Cell rpriepaesprto ftosga fetr ihvesicles with effectors signal two of separation phase Tripartite al., et 69 (2019). 86–97 39, tal. et rti hs eaain e hs ncl biology. cell in phase new A separation: phase Protein al., et 7–8 (2019). 470–484 179, PNAS 2842 (2019). 4218–4227 116, l td aaaeicue nteatceand article the in included are data study All ur pn tut Biol. Struct. Opin. Curr. raiaino hoai yitiscadrgltdphase regulated and intrinsic by chromatin of Organization , ur pn tut Biol. Struct. Opin. Curr. | etakRbr et ighkGsh Alex Ghsoh, Kingshuk Best, Robert thank We oebr1,2020 17, November IText . SI Appendix, SI a.Commun. Nat. ur pn hm Eng. Chem. Opin. Curr. Biochemistry 89 (2019). 88–96 58, 4 (2020). 848 11, –0(2019). 1–10 56, Science | o.117 vol. 4920 (2018). 2499–2508 57, Cell 2–2 (2019). 825–829 365, 29 (2019). 92–98 23, 1–3 (2019). 419–434 176, | ur pn Colloid Opin. Curr. o 46 no. IAppendix SI rc Natl. Proc. | Trends 28803 .

BIOPHYSICS AND COMPUTATIONAL BIOLOGY 27. H. Cinar et al., Temperature, hydrostatic pressure, and osmolyte effects on liquid- 60. E. W. Martin et al., Valence and patterning of aromatic residues determine the phase liquid phase separation in protein condensates: Physical chemistry and biological behavior of prion-like domains. Science 367, 694–699 (2020). implications. Chem. Eur. J. 25, 13049–13069 (2019). 61. M. K. Hazra, Y. Levy, Charge pattern affects the structure and dynamics of 28. J.-M. Choi, A. S. Holehouse, R. V. Pappu, Physical principles underlying the com- polyampholyte condensates. Phys. Chem. Chem. Phys. 22, 19368–19375 (2020). plex biology of intracellular phase transitions. Annu. Rev. Biophys. 49, 107–133 62. H. Cinar, S. Cinar, H. S. Chan, R. Winter, Pressure-induced dissolution and reentrant (2020). formation of condensed, liquid-liquid phase-separated elastomeric α-elastin. Chem. 29. P. A. Chong, J. D. Forman-Kay, Liquid-liquid phase separation in cellular signaling Eur. J. 24, 8286–8291 (2018). systems. Curr. Opin. Struct. Biol. 41, 180–186 (2016). 63. S. Cinar, H. Cinar, H. S. Chan, R. Winter, Pressure-sensitive and osmolyte-modulated 30. X.-H. Li, P. L. Chavali, R. Pancsa, S. Chavali, M. M. Babu, Function and regulation of liquid-liquid phase separation of eye-lens γ-crystallins. J. Am. Chem. Soc. 141, 7347– phase-separated biological condensates. Biochemistry 57, 2452–2461 (2018). 7354 (2019). 31. C. F. Lee, J. D. Wurtz, Novel physics arising from phase transitions in biology. J. Phys. 64. H. Cinar et al., Pressure sensitivity of SynGAP/PSD-95 condensates as a model for post- D Appl. Phys. 52, 023001 (2018). synaptic densities and its biophysical and neurological ramifications. Chem. Eur. J. 26, 32. A. A. Hyman, C. A. Weber, F. Julicher,¨ Liquid-liquid phase separation in biology. Annu. 11024–11031 (2020). Rev. Cell Dev. Biol. 30, 39–58 (2014). 65. J. P. Brady et al., Structural and hydrodynamic properties of an intrinsically disordered 33. C. P. Brangwynne, P. Tompa, R. V. Pappu, Polymer physics of intracellular phase region of a germ cell-specific protein on phase separation. Proc. Natl. Acad. Sci. U.S.A. transitions. Nat. Phys. 11, 899–904 (2015). 114, E8194–E8203 (2017). 34. Y.-H. Lin, J. D. Forman-Kay, H. S. Chan, Sequence-specific polyampholyte phase 66. S. Roberts et al., Complex microparticle architectures from stimuli-responsive separation in membraneless organelles. Phys. Rev. Lett. 117, 178101 (2016). intrinsically disordered proteins. Nat. Commun. 11, 1342 (2020). 35. Y.-H. Lin, J. Song, J. D. Forman-Kay, H. S. Chan, Random-phase-approximation the- 67. R. M. Vernon et al., Pi-Pi contacts are an overlooked protein feature relevant to phase ory for sequence-dependent, biologically functional liquid-liquid phase separation of separation. eLife 7, e31486 (2018). intrinsically disordered proteins. J. Mol. Liq. 228, 176–193 (2017). 68. L. M. Salonen, M. Ellermann, F. Diederich, Aromatic rings in chemical and biological 36. Y.-H. Lin, H. S. Chan, Phase separation and single-chain compactness of charged recognition: Energetics and structures. Angew. Chem. Int. Ed. 50, 4808–4842 (2011). disordered proteins are strongly correlated. Biophys. J. 112, 2043–2046 (2017). 69. J. P. Gallivan, D. A. Dougherty, Cation-π interactions in structural biology. Proc. Natl. 37. Y.-H. Lin, J. P. Brady, J. D. Forman-Kay, H. S. Chan, Charge pattern matching as a Acad. Sci. U.S.A. 96, 9459–9464 (1999). “fuzzy” mode of molecular recognition for the functional phase separations of 70. J. P. Gallivan, D. A. Dougherty, A computational study of cation-π interactions vs salt intrinsically disordered proteins. New J. Phys. 19, 115003 (2017). bridges in aqueous media: Implications for protein engineering. J. Am. Chem. Soc. 38. L.-W. Chang et al., Sequence and entropy-based control of complex coacervates. Nat. 122, 870–874 (2000). Commun. 8, 1273 (2017). 71. P. B. Crowley, A. Golovin, Cation-π interactions in protein-protein interfaces. Proteins 39. J. Wang et al., A molecular grammar governing the driving forces for phase 59, 231–239 (2005). separation of prion-like RNA binding proteins. Cell 174, 688–699 (2018). 72. J. Song, S. C. Ng, P. Tompa, K. A. W. Lee, H. S. Chan, Polycation-π interactions are 40. J. D. Schmit, J. J. Bouchard, E. W. Martin, T. Mittag, Protein network structure enables a driving force for molecular recognition by an intrinsically disordered oncoprotein switching between liquid and gel states. J. Am. Chem. Soc. 142, 874–883 (2020). family. PLoS Comput. Biol. 9, e1003239 (2013). 41. Y.-H. Lin, J. P. Brady, H. S. Chan, K. Ghosh, A unified analytical theory of heteropoly- 73. T. Chen, J. Song, H. S. Chan, Theoretical perspectives on nonnative interactions and mers for sequence-specific phase behaviors of polyelectrolytes and polyampholytes. intrinsic disorder in protein folding and binding. Curr. Opin. Struct. Biol. 30, 32–42 J. Chem. Phys. 152, 045102 (2020). (2015). 42. A. N. Amin, Y.-H. Lin, S. Das, H. S. Chan, Analytical theory for sequence-specific binary 74. S. Miyazawa, R. L. Jernigan, Estimation of effective interresidue contact energies from fuzzy complexes of charged intrinsically disordered proteins. J. Phys. Chem. B 124, protein crystal structures: Quasi-chemical approximation. Macromolecules 18, 534– 6709–6720 (2020). 552 (1985). 43. Y. Lin et al., Narrow equilibrium window for complex coacervation of tau and RNA 75. S. Miyazawa, R. L. Jernigan, Residue-residue potentials with a favourable contact pair under cellular conditions. eLife 8, e42571 (2019). term and an unfavourable high packing density term, for simulation and threading. 44. J. McCarty, K. T. Delaney, S. P. O. Danielsen, G. H. Fredrickson, J.-E. Shea, Complete J. Mol. Biol. 256, 623–644 (1996). phase diagram for liquid-liquid phase separation of intrinsically disordered proteins. 76. L. H. Kapcha, P. J. Rossky, A simple atomic-level hydrophobicity scale reveals protein J. Phys. Chem. Lett. 10, 1644–1652 (2019). interfacial structure. J. Mol. Biol. 426, 484–498 (2014). 45. S. P. O. Danielsen, J. McCarty, J.-E. Shea, K. T. Delaney, G. H. Fredrickson, Molecular 77. B. S. Schuster et al., Identifying sequence perturbations to an intrinsically disordered design of self-coacervation phenomena in block polyampholytes. Proc. Natl. Acad. protein that determine its phase-separation behavior. Proc. Natl. Acad. Sci. U.S.A. 117, Sci. U.S.A. 116, 8224–8232 (2019). 11421–11431 (2020). 46. S. P. O. Danielsen, J. McCarty, J.-E. Shea, K. T. Delaney, G. H. Fredrickson, Small ion 78. K. S. Silmore, M. P. Howard, A. Z. Panagiotopoulos, Vapor-liquid phase equilibrium effects on self-coacervation phenomena in block polyampholytes. J. Chem. Phys. 151, and surface tension of fully flexible Lennard-Jones chains. Mol. Phys. 115, 320–327 034904 (2019). (2017). 47. S. Das, A. Eisen, Y.-H. Lin, H. S. Chan, A lattice model of charge-pattern-dependent 79. Y. C. Kim, G. Hummer, Coarse-grained models for simulations of multiprotein polyampholyte phase separation. J. Phys. Chem. B 122, 5418–5431 (2018). complexes: Application to ubiquitin binding. J. Mol. Biol. 375, 1416–1433 (2008). 48. N. A. S. Robichaud, I. Saika-Voivod, S. Wallin, Phase behavior of blocky charge lattice 80. A. C. Murthy et al., Molecular interactions underlying liquid-liquid phase separation polymers: Crystals, liquids, sheets, filaments, and clusters. Phys. Rev. E 100, 052404 of the FUS low-complexity domain. Nat. Struct. Mol. Biol. 26, 637–648 (2019). (2019). 81. A. E. Conicella et al., TDP-43 α-helical structure tunes liquid-liquid phase separation 49. J.-M. Choi, F. Dar, R. V. Pappu, LASSI: A lattice model for simulating phase transitions and function. Proc. Natl. Acad. Sci. U.S.A. 117, 5883–5894 (2020). of multivalent proteins. PLoS Comput. Biol. 15, e1007028 (2019). 82. F. G. Quiroz, A. Chilkoti, Sequence heuristics to encode phase behaviour in intrinsically 50. D. Nilsson, A. Irback,¨ Finite-size scaling analysis of protein droplet formation. Phys. disordered protein polymers. Nat. Mater. 14, 1164–1171 (2015). Rev. E 101, 022413 (2020). 83. K. Kumar et al., Cation-π interactions in protein-ligand binding: Theory and data- 51. G. L. Dignon, W. Zheng, Y. C. Kim, J. Mittal, Temperature-controlled liquid-liquid mining reveal different roles for lysine and arginine. Chem. Sci. 9, 2655–2665 (2018). phase separation of disordered proteins. ACS Cent. Sci. 5, 821–830 (2019). 84. R. V. Pappu, X. Wang, A. Vitalis, S. L. Crick, A polymer physics perspective on driving 52. G. L. Dignon, W. Zheng, Y. C. Kim, R. B. Best, J. Mittal, Sequence of pro- forces and mechanisms for protein aggregation. Arch. Biochem. Biophys. 469, 132– tein phase behavior from a coarse-grained model. PLoS Comput. Biol. 14, e1005941 141 (2008). (2018). 85. H. S. Chan, Folding alphabets. Nat. Struct. Biol. 6, 994–996 (1999). 53. S. Das, A. N. Amin, Y.-H. Lin, H. S. Chan, Coarse-grained residue-based models of 86. D. R. DeVido, J. G. Dorsey, H. S. Chan, K. A. Dill, Oil/water partitioning has a different disordered protein condensates: Utility and limitations of simple charge pattern thermodynamic signature when the oil solvent chains are aligned than when they parameters. Phys. Chem. Chem. Phys. 20, 28558–28574 (2018). are amorphous. J. Phys. Chem. B 102, 7272–7279 (1998). 54. A. Statt, H. Casademunt, C. P. Brangwynne, A. Z. Panagiotopoulos, Model for disor- 87. P. A. Karplus, Hydrophobicity regained. Protein Sci. 6, 1302–1307 (1997). dered proteins with strongly sequence-dependent liquid phase behavior. J. Chem. 88. H. S. Chan, K. A. Dill, Solvation: How to obtain microscopic energies from partitioning Phys. 152, 075101 (2020). and solvation experiments. Annu. Rev. Biophys. Biomol. Struct. 26, 425–459 (1997). 55. G. L. Dignon, W. Zheng, R. B. Best, Y. C. Kim, J. Mittal, Relation between single- 89. H. S. Chan, “Amino acid sidechain hydrophobicity” in eLS, Encyclopedia of Life molecule properties and phase behavior of intrinsically disordered proteins. Proc. Sciences (John Wiley & Sons, Chichester, United Kingdom, 2002). Natl. Acad. Sci. U.S.A. 115, 9929–9934 (2018). 90. J. L. Fauchere,` V. Pliska,ˇ Hydrophobic parameters Π of amino-acid side chains from the 56. K. M. Ruff, T. S. Harmon, R. V. Pappu, CAMELOT: A machine learning approach for partitioning of N-acetyl-amino-acid amides. Eur. J. Med. Chem. 18, 369–375 (1983). coarse-grained simulations of aggregation of block-copolymeric protein sequences. 91. W. Humphrey, A. Dalke, K. Schulten, VMD - visual molecular dynamics. J. Mol. Graph. J. Chem. Phys. 143, 243123 (2015). 14, 33–38 (1996). 57. T. S. Harmon, A. S. Holehouse, R. V. Pappu, Differential solvation of intrinsically disor- 92. J. Hsin, A. Arkhipov, Y. Yin, J. E. Stone, K. Schulten, Using VMD: An introductory dered linkers drives the formation of spatially organized droplets in ternary systems tutorial. Curr. Protoc. Bioinform. 24, 5.7.1–5.7.48 (2008). of linear multivalent proteins. New J. Phys. 20, 045002 (2018). 93. C. N. Pace et al., Tyrosine hydrogen bonds make a large contribution to protein 58. V. Nguemaha, H.-X. Zhou, Liquid-liquid phase separation of patchy particles illumi- stability. J. Mol. Biol. 312, 393–404 (2001). nates diverse effects of regulatory components on protein droplet formation. Sci. 94. G. D. Rose, A. R. Geselowitz, G. J. Lesser, R. H. Lee, M. H. Zehfus, Hydrophobicity of Rep. 8, 6728 (2018). amino acid residues in globular proteins. Science 229, 834–838 (1985). 59. A. Ghosh, K. Mazarakos, H.-X. Zhou, Three archetypical classes of macromolecular 95. M. S. Moghaddam, S. Shimizu, H. S. Chan, Temperature dependence of three-body regulators of protein liquid-liquid phase separation. Proc. Natl. Acad. Sci. U.S.A. 116, hydrophobic interactions: Potential of mean force, enthalpy, entropy, heat capacity, 19474–19483 (2019). and nonadditivity. J. Am. Chem. Soc. 127, 303–316 (2005).

28804 | www.pnas.org/cgi/doi/10.1073/pnas.2008122117 Das et al. Downloaded by guest on September 30, 2021 Downloaded by guest on September 30, 2021 0.H yee,H-.Zo,Amto odtriedeeti osat nnonhomo- in constants dielectric determine to method A Zhou, H.-X. Nymeyer, H. 104. to how Schr and C. proteins Rudas, of T. “constants” dielectric 103. the are What Warshel, A. Schutz, N. C. 102. simu- from proteins of properties Dielectric Gunsteren, van F. W. Falta, M. Pitera, W. J. 101. Dwyer J. J. 100. 9 .Mkwk,E ooesi .Calwk,S lze,A io .A ceaa Simple Scheraga, A. H. Liwo, A. Oldziej, S. Czaplewski, C. Sobolewski, E. Makowski, M. 99. 7 .H od .T hmsn ifrne ewe aradbulk and pair between Differences Thompson, Makowski M. T. 98. P. Wood, associ- H. Hydrophobic R. Tieleman, P. 97. D. Chan, S. H. Moghaddam, S. M. MacCallum, L. J. 96. a tal. et Das eeu ytm:Apiaint ilgclmembranes. biological to (2008). Application systems: geneous resolution. mesoscopic (2006). the at Properties II. interface. water models? electrostatic validate temperature. and pH, (2001). ligands, solvent, of effects The lation: chains. penetration. side water hydrophobic different of Pairs of IV. water. B in force Chem. interaction chains Phys. the water. side for force acid mean mean in amino of of potentials of the chains for formulas side analytical potentials physics-based acid chains. the amino side of of hydrophobic interaction parameterization identical (2007). the of and for pairs Calculation force mean 3. of tials yrpoi interactions. (1990). hydrophobic U.S.A. Sci. Acad. Natl. of ation α hlcs trcdwtig n nhli arest rti folding. protein to barriers enthalpic and dewetting, steric -helices, ihaprn ilcrccntnsi h neiro rti reflect protein a of interior the in constants dielectric apparent High al., et 18–19 (2008). 11385–11395 112, ipepyisbsdaayia omlsfrtepoten- the for formulas analytical physics-based Simple al., et dr .Brsh .Senasr iuainsuiso h protein- the of studies Simulation Steinhauser, O. Boresch, S. oder, ¨ ipy.J. Biophys. 2661 (2007). 6206–6210 104, Proteins 6012 (2000). 1610–1620 79, rc al cd c.U.S.A. Sci. Acad. Natl. Proc. 0–1 (2001). 400–417 44, .Py.Ce.B Chem. Phys. J. ipy.J. Biophys. .Ce.Phys. Chem. J. ipy.J. Biophys. 2925–2931 111, 1185–1193 94, 946–949 87, 2546–2555 80, 234908 124, Proc. J. 1.W .Bag .B ipr,Tefr iernec fmacromolecules. of birefringence form The Pippard, B. A. Bragg, field-theory A L. solutions: W. ionic of 114. constant Dielectric Orland, H. Andelman, solutions. ionic D. aqueous pair Levy, of A. effective properties Dielectric in 113. Collie, H. term C. Ritson, missing M. The D. Hasted, Straatsma, B. P. J. T. Grigera, 112. R. J. Berendsen, C. J. H. 111. Comparison Klein, L. M. Impey, W. R. Madura, D. J. Chandrasekhar, J. Jorgensen, L. W. 110. team, Lindorff-Larsen K. development GROMACS 109. Hess, B. Lindahl, E. Spoel, der van separation. D. phase Abraham, liquid-liquid J. M. on based 108. cells synthetic Dynamic Martin, N. 107. 1.Z-.Wn,Futaini lcrlt ouin:Tesl energy. self The solutions: electrolyte in Fluctuation Wang, Z.-G. 115. Zein. I. solutions. protein of constant dielectric the on Studies Jr, Wyman, J. 106. Tros M. 105. 1.J .Shi,K .Dl,Tesaiiiso rti crystals. protein of coulombic stabilities of The Dill, study A. Carlo K. Schmit, Monte D. Panagiotopoulos, J. Z. 117. A. Kumar, K. S. Orkoulas, G. 116. Crystallogr. approach. II. and I Parts 251(2010). 021501 07(2010). 4027 polyelectrolytes. in criticality potentials. water. liquid simulating (1983). for functions potential simple of field. force protein 4 Accessed http://www.gromacs.org. (2018). 2019. 2016.5 November version Manual User GROMACS BioChem Chem. Commun. 4–7 (1931). 443–476 90, ioeodoinainldnmc fwtri iigcells. living in water of dynamics orientational Picosecond al., et 5326 (2019). 2553–2568 20, 0 (2017). 904 8, hs e.Lett. Rev. Phys. .Py.Chem. Phys. J. 6–6 (1953). 865–867 6, .Ce.Phys. Chem. J. PNAS mrvdsd-hi oso oetasfrteAbrff99SB Amber the for potentials torsion side-chain Improved al., et Proteins | 2967 (1987). 6269–6271 91, 281(2012). 227801 108, oebr1,2020 17, November –1(1948). 1–21 16, 9015 (2010). 1950–1958 78, hs e.Lett. Rev. Phys. 433(2003). 048303 90, | o.117 vol. .Py.Ce.B Chem. Phys. J. .Ce.Phys. Chem. J. | o 46 no. hs e.E Rev. Phys. 926–935 79, 4020– 114, | .Biol. J. 28805 Chem- Acta Nat. 81,