Motifs Emerge from Function in Model Gene Regulatory Networks
Total Page:16
File Type:pdf, Size:1020Kb
Motifs emerge from function in model gene regulatory networks Z. Burdaa, A. Krzywickib, O. C. Martinc,d,1, and M. Zagorskia aMarian Smoluchowski Institute of Physics and Mark Kac Complex Systems Research Centre, Jagellonian University, 4 Reymonta, 30-059, Krakow, Poland; bLaboratoire de Physique Théorique, Centre National de la Recherche Scientifique-Unité Mixte de Recherche 8627, Université de Paris-Sud, Orsay, F-91405, France; cLaboratoire de Génétique Végétale, Institut National de la Recherche Agronomique, Centre National de la Recherche Scientifique-Unité Mixte de Recherche 0320 / Unité Mixte de Recherche 8120, Gif-sur-Yvette, F-91190, France; and dLaboratoire de Physique Théorique et Modèles Statistiques, Centre National de la Recherche Scientifique-Unité Mixte de Recherche8626, Université de Paris-Sud, Orsay, F-91405, France Edited by Peter J. Bickel, University of California, Berkeley, CA, and approved August 19, 2011 (received for review June 10, 2011) Gene regulatory networks allow the control of gene expression overrepresentation. Unfortunately, the functional capabilities of patterns in living cells. The study of network topology has revealed GRN and the constraints they must satisfy (e.g., kinetic response that certain subgraphs of interactions or “motifs” appear at anom- characteristics or robustness to noise) are still poorly understood, alously high frequencies. We ask here whether this phenomenon so such questions cannot be addressed in a truly realistic frame- may emerge because of the functions carried out by these net- work. Instead, we will (i) work within a plausible model of tran- works. Given a framework for describing regulatory interactions scriptional regulation, (ii) impose functional constraints on the and dynamics, we consider in the space of all regulatory networks patterns of gene expression, and (iii) determine which motifs those that have prescribed functional capabilities. Markov Chain emerge when considering the space of all possible functional Monte Carlo sampling is then used to determine how these func- GRN. This particular task is related to previous work that used tional networks lead to specific motif statistics in the interactions. genetic algorithms or simulated annealing to design genetic net- In the case where the regulatory networks are constrained to works having given functional properties (12–14). Those studies exhibit multistability, we find a high frequency of gene pairs that found that the optimization procedures indeed led to particular are mutually inhibitory and self-activating. In contrast, networks architectures. Our approach differs by not relying on a design constrained to have periodic gene expression patterns (mimicking for instance the cell cycle) have a high frequency of bifan-like procedure: we want to get away from any dependence on the motifs involving four genes with at least one activating and one optimization algorithm and see how functional capability on its inhibitory interaction. own constrains the possible architectures. In this framework, two types of constraints will be applied: we will impose either a set of essential interactions ∣ genetic switch ∣ transcription factors steady-state expression patterns, or a time periodic pattern of expression motivated by previous studies of cell cycling. Interest- ’ volutionary forces have shaped living organisms since the ingly, we find very different motifs for these two types; in Alon s SCIENCES (15, 16) terminology, the first type leads to mutually inhibitory Ebeginning of life. It is thus not surprising that the framework APPLIED PHYSICAL for understanding features in today’s organisms is often based on pairs acting as bistable switches, while the second type leads to considering the history of these organisms: a remarkable feature bifan, diamond and four point cycle motifs. found today may be the trace of an evolutionary trajectory. But Our model of transcriptional regulation is simple enough to be one can also take a different perspective, one in which design used for illustration and, hopefully, for identification of generic constraints may play a significant role. The expectation is then features of genetic networks. Of course it is only a model and that the architecture of living organisms depends not just on their it does not include many known aspects of regulation such as origin, but also on their functional capabilities. In the present posttranslational modifications or chromatin remodelers; never- context, we are concerned with biological networks. Are the con- theless it is rooted in biophysical reality to avoid ad-hoc assump- straints associated with network functionality major determinants tions. The inclusion of inhibitory interactions in this framework is of network architecture? We address this question in this paper, a major qualitative advance with respect to our earlier work (17). albeit using a somewhat narrow notion of functionality based on Although technically simple, it gives access to questions arising the output patterns produced by networks. when networks have complex expression patterns, and in particu- Our focus here is on gene regulatory networks (GRN), the set lar it allows one to get insights into mechanisms of motifs emer- of interactions between genes. These interactions along with the gence. Remarkably enough, we find that the structure of a fairly gene expression machinery allow living cells to control their gene large genetic network is controlled by the cooperative action of expression patterns. In the last decade, genetic interactions have small units—motifs—although the selection pressure is exerted been measured, modified, engineered, etc., and so quite a lot is only on the network as a whole. known about how any given gene can affect another’s expression. We begin by describing our model and follow by examining its Furthermore, small gene networks have been designed to imple- properties, focusing especially on the kinds of motifs that emerge ment simple functions in vivo (1, 2), and much larger sets of interactions have been reconstructed in a number of organisms when imposing functional capabilities on the networks (in prac- (3–5). From these large networks it has been possible to show that tice, we impose gene expression patterns). Then the dependence several “motifs”—subgraphs with given interactions—arise far of the motifs on these imposed patterns is exhibited. We also more often than might be expected (6–9). One of the most stu- died motif is the so called Feed Forward Loop or FFL, a graph Author contributions: Z.B., A.K., O.C.M., and M.Z. designed research; A.K., O.C.M., and based on three genes where the first regulates the second, and M.Z. performed research; A.K., O.C.M., and M.Z. analyzed data; and A.K., O.C.M., and both the first and the second regulate the third. Another example M.Z. wrote the paper. is the bifan motif in which two genes control two others. Biolo- The authors declare no conflict of interest. gical functions have been proposed for these motifs (10, 11) This article is a PNAS Direct Submission. which give them some meaning, but one may ask whether other 1To whom correspondence should be addressed. E-mail: [email protected]. motifs could perform the same functions and what level of enrich- This article contains supporting information online at www.pnas.org/lookup/suppl/ ment might be expected if function were the sole cause of motif doi:10.1073/pnas.1109435108/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1109435108 PNAS ∣ October 18, 2011 ∣ vol. 108 ∣ no. 42 ∣ 17263–17268 Downloaded by guest on September 24, 2021 1 outline in the SI Text a way of taking into account possible coop- Recall from Eq. that Pij is the probability that the binding site erative behavior among different transcription factors. j of gene i’s regulatory region is bound by a TF. In the absence of inhibitory effects on this regulatory region, we take the transcrip- Model tion rate to be (up to an arbitrary scale) the Probability of Transcription Factor Binding. We start with N genes coding for tran- OCCupation or “POCC” (34) of the regulatory region. However, scription factors that may influence each other’s expression. To if one of the interactions j is inhibitory, we consider that the keep the model as realistic as possible, we include the known bio- presence of a TF of type j bound to its binding site will shut down physical determinants of transcriptional control. In particular, the transcription; in effect, inhibitors act as vetoes on transcrip- the binding of a transcription factor (TF) to a site is described tion in this picture. Then, denoting by Si the expression level of thermodynamically (18–20), that is through a free energy that gene i, we set depends on the mismatch of two character strings of length L, Y Y 1 − 1 − 1 − [3] one for the TF and one for the binding site. Up to an additive Si ¼½ ð PijÞ ð Pij0 Þ; ε j j0 constant, this free energy in units of kBT is taken to be dij where dij is the number of mismatches, T is the temperature, and kB is where j runs over activating interactions and j0 over inhibitory ’ ε Boltzmann s constant. The parameter is a penalty per mismatch interactions. (See SI Text for a detailed derivation.) which has been measured experimentally to be between one and The transcriptional dynamics is then defined as follows. Just three if each base pair of the DNA is represented by one char- like in many other modeling frameworks, we take time to be dis- – – 1 acter (21 23). Also, by comparing to the typical number of base crete (27 31); at each time step we first update the Pij in Eq. 3 pairs found for experimentally studied binding sites, one has (using nj ¼ nSj) and then update the Si in Eq. These updates 10 ≤ L ≤ 15. For all the work presented here, we use ε ¼ 2 and are deterministic, and in general the transcriptional trajectory L ¼ 12, but we have checked that our conclusions are not specific goes towards a fixed point (corresponding to steady-state expres- to these values.