Stem Cell Differentiation As a Many-Body Problem
Bin Zhang (a,b) and Peter G. Wolynes (a,b,c,1)
Departments of (a) Chemistry and (c) Physics and Astronomy, and (b) Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
Contributed by Peter G. Wolynes, May 9, 2014 (sent for review March 25, 2014)

Stem cell differentiation has been viewed as coming from transitions between attractors on an epigenetic landscape that governs the dynamics of a regulatory network involving many genes. Rigorous definition of such a landscape is made possible by the realization that gene regulation is stochastic, owing to the small copy number of the transcription factors that regulate gene expression and because of the single-molecule nature of the gene itself. We develop an approximation that allows the quantitative construction of the epigenetic landscape for large realistic model networks. Applying this approach to the network for embryonic stem cell development explains many experimental observations, including the heterogeneous distribution of the transcription factor Nanog and its role in safeguarding the stem cell pluripotency, which can be understood by finding stable steady-state attractors and the most probable transition paths between those attractors. We also demonstrate that the switching rate between attractors can be significantly influenced by the gene expression noise arising from the fluctuations of DNA occupancy when binding to a specific DNA site is slow.

gene network | most probable path | master equation

Significance
A molecular understanding of stem cell differentiation requires the study of gene regulatory network dynamics that includes the statistics of synthesizing transcription factors, their degradation, and their binding to the DNA. Brute force simulation for complex large realistic networks can be computationally challenging. Here we develop a sound approximation method built upon theories originating from quantum statistical mechanics to study the network for embryonic stem cell development. Mechanistic insight is provided on the role of the master regulator Nanog in safeguarding stem cell pluripotency. We also demonstrate the significant influence of DNA-binding kinetics, an aspect that often has been overlooked, on the transition rate between gene network steady states.

Author contributions: B.Z. and P.G.W. designed research, performed research, analyzed data, and wrote the paper.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
1 To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1408561111/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1408561111 PNAS | July 15, 2014 | vol. 111 | no. 28 | 10185–10190

Understanding the underlying mechanisms of the differentiation of stem cells into many distinct cell types has long been a goal of developmental biology (1) and regenerative medicine (2). The epigenetic landscape, originally proposed by Waddington, has proved to be a useful metaphor for visualizing cellular dynamics (3). However, is it more than a metaphor? In this view, cell phenotypes are identified as attractors with well-defined patterns of robust gene expression, and differentiation occurs through transitions from the stem cell attractor to other attractors for the differentiated cells. It has become clear that for simple models, taking into account stochastic effects allows a well-defined landscape to be constructed (4–6). This generalized potential landscape governs much of the gene regulatory network dynamics, such that corrections to such a landscape picture can also be defined and formalized (7, 8). It remains a challenge to construct such landscapes for realistically large network models.

Describing the stochastic dynamics of gene networks must include the statistics of synthesizing transcription factors, their degradation, and their binding to genes on the DNA. These all play crucial roles in shaping the epigenetic landscape (9). Due to the complexity of the assembly of the machinery involved for protein synthesis, it has often been assumed that protein translation is much slower than the diffusion-limited DNA-binding process. Under this view, the ensemble of DNA occupancies, i.e., the set of transcription factors bound at a gene's regulatory elements, can be assumed to have achieved a quasi-equilibrium so that the expression dynamics of this gene can then be described as a birth–death process governed by an averaged protein production rate that depends nonlinearly on transcription factor concentrations (10). Although perhaps valid for some networks in prokaryotic cells, this assumption on timescales is probably not completely adequate for eukaryotic systems because the processes of chromatin decondensation and unwrapping of DNA from nucleosomes, both of which are required for proteins to bind, take time. The high-level architecture of chromatin can severely limit the access of transcription factors to the DNA, slowing down DNA-binding kinetics (11, 12) and resulting in gene expression noise deviating from Poisson statistics (13). Even when the transcription factors function as pioneers that can directly bind with the chromatin sites occupied by the nucleosome, slow DNA binding (14, 15) is still a good approximation to describe the effect of the progressive change of the chromatin structure and histone modification induced by the pioneer factors on gene regulation (16). As a result, DNA binding must be treated on equal footing together with protein synthesis and degradation to fully understand eukaryotic gene regulation (14–18).

By increasing the dimensionality of the problem, investigating the effects arising from slow DNA-binding kinetics on gene network dynamics becomes computationally challenging. Although being in some aspects conceptually transparent, where one can directly simulate the chemical reactions involved in a gene network using a Monte Carlo algorithm (19), such a procedure quickly becomes inefficient as the system complexity increases and is even impractical for studying rare, transient switching events between steady states that occur on long timescales. In recognizing the close analogy between gene networks and magnetic systems, Sasai and Wolynes suggested that analytical approaches originating from quantum statistical mechanics could be used to study the epigenetic landscape of networks and discussed the steady states of a very stylized model network with many attractors (4). Although their approach was efficient in identifying steady-state solutions, it remained to show how this approach can characterize transitions between different attractors. In the adiabatic limit where DNA binding is fast, analytical approaches to the transition process based on large deviation theory have proved successful in studying noise-induced transitions (7, 20–22). Here we show how to find equivalent approaches for large networks where DNA binding must be treated explicitly.

In this paper, we generalize a kinetic model originally proposed by Sasai and Wolynes (4) to explicitly model DNA binding along with protein synthesis and degradation in large gene networks with multifactorial and complex switches. An approximation method is further developed that allows the construction of the network's epigenetic landscape via the determination of not only steady-state solutions, but also most probable transition paths and stochastic switching rates between steady states. When applied to a model of the core transcriptional regulatory network of embryonic stem cells, these approximations turn out to explain the observed fluctuations in the expression of the master pluripotent regulator Nanog and the important role it has in safeguarding stem cell pluripotency against differentiation. We also demonstrate the striking effect that DNA-binding kinetics have on the switching rate between steady states, making clear the importance of explicitly modeling fluctuations of DNA occupancy in studying the fate of gene networks.

Theoretical Approaches for Gene Networks
Viewed as a molecular process, gene regulation is complicated, with many actors as well as attractors, and spans a large spatial and temporal domain. Several approximations are needed to make tractable mathematical models. We use an admittedly simplified model but one that contains many of the crucial features of the real embryonic stem cell network.

Constructing Schematic Kinetic Models for Gene Networks. Following Sasai and Wolynes (4), we model gene regulation at a level that includes explicit binding and unbinding of transcription factors to the DNA, protein translation, and protein degradation. Both the mRNA intermediary and the serial nature of macromolecular synthesis are neglected in the present model but are likely important. Fig. 1A illustrates how we can diagram a mutual repression motif between genes Cdx2 and Oct4. The gene Cdx2 can be transcribed to proteins at a rate g, which will change

[Figure 1 image. Panel A: mutual repression motif between the Oct4 and Cdx2 genes, showing each gene's regulatory element and the rate labels g, f, h, and k. Panel B: network diagram with nodes GCNF, GATA6, CDX2, KLF4, PBX1, NANOG, OCT4, SOX2, and Oct4-Sox2; legend entries: Activation, Repression, Pluripotency, Differentiation.]
Fig. 1. A schematic gene regulatory network for the embryonic stem cell. (A) Schematics