MULTICELLULAR MATHEMATICAL MODELS OF SOMITOGENESIS
by
Mark Benjamin Campanelli
A dissertation submitted in partial fulfillment of the requirements for the degree
of
Doctor of Philosophy
in
Mathematics
MONTANA STATE UNIVERSITY Bozeman, Montana
August, 2009

© Copyright
by
Mark Benjamin Campanelli
2009
All Rights Reserved
APPROVAL
of a dissertation submitted by
Mark Benjamin Campanelli
This dissertation has been read by each member of the dissertation committee and has been found to be satisfactory regarding content, English usage, format, citations, bibliographic style, and consistency, and is ready for submission to the Division of Graduate Education.
Dr. Tomáš Gedeon
Approved for the Department of Mathematical Sciences
Dr. Kenneth Bowers
Approved for the Division of Graduate Education
Dr. Carl A. Fox
STATEMENT OF PERMISSION TO USE
In presenting this dissertation in partial fulfillment of the requirements for a doctoral degree at Montana State University, I agree that the Library shall make it available to borrowers under rules of the Library. I further agree that copying of this dissertation is allowable only for scholarly purposes, consistent with “fair use” as prescribed in the U.S. Copyright Law. Requests for extensive copying or reproduction of this dissertation should be referred to ProQuest Information and Learning, 300 North Zeeb Road, Ann Arbor, Michigan 48106, to whom I have granted “the exclusive right to reproduce and distribute my dissertation in and from microform along with the non-exclusive right to reproduce and distribute my abstract in any format in whole or in part.”
Mark Benjamin Campanelli
August, 2009
DEDICATION
I dedicate this dissertation to my family: To my wife Amber, for her enduring patience. To my daughter Ella, whose future I am trying to improve.
ACKNOWLEDGEMENTS
I would like to thank my advisor, Dr. Tomáš Gedeon, for all of his help and guidance during this research project. I would also like to thank Jesse Berwald for his good cheer and generosity concerning all things computational. Lastly, I would like to thank Dr. Konstantin Mischaikow’s group at Rutgers University, for computational time on the conley2 computer cluster.
TABLE OF CONTENTS
1. INTRODUCTION
   Biological Pattern Formation
   Developmental Biology and Somitogenesis
   Mathematical Insights into Somitogenesis
   Purpose and Scope of the Present Work

2. SURVEY OF EXISTING MATHEMATICAL MODELS
   Early Models: Pattern Formation and Morphogenesis
   Tissue-Based Reaction-Diffusion Models
   Cell-Based Models
   Phase Oscillators
   Ordinary Differential Equation (ODE) Models
   Delay Differential Equation (DDE) Models
   Modeling Scopes and Multiple Scales

3. A MULTI-STABLE PHASE OSCILLATOR MODEL OF SOMITOGENESIS
   Model Description
   Comparison to Existing Phase Oscillator Models
   Lewis’s Phase Oscillator Model
   Jaeger and Goodwin’s Cellular Oscillator Model
   Discussion

4. A DELAY DIFFERENTIAL EQUATION MODEL OF POSTERIOR CLOCK-WAVE FORMATION
   The Biological Components of the Clock
   The Clock
   The Control Protein
   The Coordinating Signal
   Modeling Posterior Clock-Wave Formation: Uncoupled Cells
   PSM Growth
   Model Variables
   The Control Protein
   The Intracellular Clock
   Clock-Gene Regulation by a Single Repressive Transcription Factor
   Modeling Posterior Clock-Wave Formation: Coupled Cells
   Intercellular Signaling
   Clock-Gene Regulation by Both Repressive and Activating Transcription Factors
   Interim Model Summary
   The Fast Dimerization Approximation
   Algebraic Solution of the Fast Dimerization
   An Iterative Numerical Scheme for Computing the Fast Dimerization
   Model Summary

5. MODEL VALIDATION: AN APPLICATION TO ZEBRAFISH SOMITOGENESIS
   Computational Considerations for Validation
   Clock-Wave Formation in Zebrafish
   Assignment of Model Components
   Model Validation Criteria
   Parameter Value and Range Selection
   Experimentally Determined Parameter Values
   Parameters Estimated from a Range of Values
   Parameters for Model Scenarios I–IV
   Parameter Estimation and Model Selection
   Stage One Validation
   Parameter Sensitivities
   Stage Two Validation
   Model Robustness
   Reproduction of Experiments
   The Mechanism of Gradient Controlled Oscillation Rate
   Comparison to Existing Zebrafish Models
   Applicability of the PCW Model
   Further Analyses and Future Directions

6. CONCLUSION
   Future Directions

REFERENCES CITED

APPENDICES
   APPENDIX A: Impossibility of Nontrivial Periodic Solutions in Lewis’s Uncoupled DDE Model without Delays
   APPENDIX B: Competitive Dimerization of Three Proteins
   APPENDIX C: Matlab Codes
LIST OF FIGURES

1 Formed somites in a zebrafish embryo.
2 Transverse schematic of the somitic mesoderm.
3 Formed and forming somites in a zebrafish embryo.
4 Multiple gene expression during zebrafish somitogenesis.
5 Maturity/susceptibility plots.
6 Phase portrait snapshots of the multi-stable phase oscillator.
7 Computed solutions of the multi-stable phase oscillator model.
8 Long-term computed solution behavior of the multi-stable phase oscillator model.
9 Reproduction of the spatiotemporal somitogenesis pattern.
10 Reproduction of the long-term somitogenesis pattern.
11 Asymptotic phase solutions for Lewis’s phase oscillator model.
12 Asymptotic somitogenesis pattern for Lewis’s phase oscillator model.
13 Total control protein gradients.
14 Binding site configurations.
15 Model selection.
16 Parameter sensitivities for model scenario III, part 1.
17 Parameter sensitivities for model scenario III, part 2.
18 Model III simulated clock-wave, no noise.
19 Model IV simulated clock-wave, no noise.
20 Model III simulated clock-wave, with noise.
21 Model III simulated clock-wave, no noise with longer gradient half-life.
22 Model III simulated clock-wave, no noise with shorter gradient half-life.
23 Model III simulated clock-wave, no noise with exponential gradient.
24 Model III simulated knockdown experiment, with noise.
25 Model III simulated clock-wave in a rectangular lattice of cells, with noise.
ABSTRACT
Somitogenesis is an important pattern formation process in the developmental biology of vertebrates. The phenomenon has received wide attention from experimental, theoretical, and computational biologists. Numerous mathematical models of the process have been proposed, with the clock and wavefront mechanism rising to prominence over the last ten years. This work presents two multicellular mathematical models of somitogenesis. The first is a phenomenological phase oscillator model that reproduces both the clock and wavefront aspects of somitogenesis, but lacks a biological basis. The second is a biologically informed delay differential equation model of the clock-wave that is produced by coordinated oscillatory gene expression across many cells. Careful and efficient model construction, parameter estimation, and model validation identify important nonlinear mechanisms in the genetic control circuit of the somitogenesis clock. In particular, a graded control protein combined with differential decay of clock protein monomers and dimers is found to be a key mechanism for slowing oscillations and generating experimentally observed waves of gene expression. This represents a mode of combinatorial control that has not been previously examined in somitogenesis, and warrants further experimental and theoretical investigation.
INTRODUCTION
Biological Pattern Formation
“How does the leopard get its spots and the zebra its stripes?” Many children
(and adults!) have wondered about such questions. The living world is replete with
patterns such as spots and stripes. Other examples of patterns include the trichome
distribution on plant surfaces, the efficient branching of tree roots, tree limbs, and the
airway passages of the lung, and certain multi-organism behaviors of insects. Nature
has devised some wonderfully useful patterns, but can we achieve some understanding
of how these patterns form? The answer, of course, is yes, and the tools of science and
mathematics can be employed to do so.
Organisms may form patterns at different stages of the life cycle, from earliest
development until death. For single organisms, pattern formation typically involves
some type of cellular differentiation. This differentiation can range, for example,
from a simple color change between otherwise indistinguishable epithelial cells, to
a substantial divergence of cell type and function (e.g., beta vs. acinar cells in the
pancreas).
Such differentiation is often visible at the macroscopic level, although many important biological patterns, such as the human backbone and ribcage, initially form in utero on a microscopic scale and are not directly observable, even in the adult organism. Furthermore, modern molecular biology has revealed a host of genetically orchestrated biochemical activities at the sub-cellular level that are involved in pattern formation. Such activities include epigenetic¹ intracellular and intercellular signaling and feedback mechanisms involving, for example, regulation of protein production and elimination.

¹Epigenesis is the process by which genetic information, as modified by environmental influences, is translated into the substance and behavior of an organism.
As mentioned above, an important pattern that has evolved in higher organisms is the repeated vertebrae and related structures of the spinal column in vertebrates (the Vertebrata subphylum of the Chordata phylum of the Animalia kingdom). Backbone development begins early in the developmental sequence of vertebrates, during embryogenesis. This robust process occurs under a variety of conditions, for example, in a cold-blooded zebrafish embryo in a lake or pond, in a warm-blooded chick developing inside an egg, or in a human fetus in the womb.
Developmental Biology and Somitogenesis
Spinal column formation in vertebrates proceeds through several stages during embryogenesis. The invention of the microscope enabled the discovery of a key early event, called somitogenesis. Somitogenesis is the process of somite formation, which occurs in the mesoderm, a tissue that forms just after gastrulation in the developing embryo, see Figure 1. Gastrulation leads to three relatively flat layers of tissue, called germ layers. The mesoderm is the middle germ layer, lying above the bottom endoderm layer and below the top ectoderm layer, see Figure 2. Initially each germ layer is structurally amorphous, yet the cells in each layer are already destined for different tissues in the growing organism [1].
Somitogenesis is a fundamental stage of cell differentiation in the mesoderm.
Somites are transient, repeated blocks of epithelialized cells² that eventually differentiate further into vertebrae, ribs, musculature, and dorsal dermis. Somites arise sequentially and in pairs from the mesoderm, in an anterior (head) to posterior (tail)
²Epithelialized cells are surrounded by a well-defined layer of border cells, called epithelial cells.
Figure 1: Side-view micrograph of recently formed somites in a zebrafish embryo. The anterior (head) is to the left, posterior (tail) is to the right. Taken from [2, Figure 2a] under the Creative Commons Attribution License.
Figure 2: Transverse schematic of the somitic mesoderm of a chick embryo. The (medial) midline of the embryo occurs to the left, and one of two (lateral) sides of the embryo is depicted to the right. Somites and other structures have already formed in the somitic mesoderm, which lies between the top (dorsal) ectoderm and bottom (ventral) endoderm. Taken from the public domain via the Wikipedia Commons (http://en.wikipedia.org/wiki/File:Gray19 with color.png).
Figure 3: Top-view micrograph of formed and forming somites in a zebrafish embryo. The mesoderm runs along most of the length of the embryo. Formed somites have dark bands of stable gene expression, numbered 1–10. Two to three bands of oscillatory gene expression (labeled (11)–(12)) move in waves from posterior (right) to anterior (left) across the presomitic mesoderm in the posterior-most part of the embryo. Expression in the tailbud (labeled (13)) oscillates steadily as the tail elongates, and is the source of new waves of expression that narrow as they travel from right to left. Taken from [4, Figure 1a] under the Creative Commons Attribution License.

fashion, see Figure 3. Each pair forms on either side of the notochord, the spinal precursor which forms along the embryonic midline. Amorphous mesoderm without formed somites is called presomitic mesoderm (PSM), while after somite formation the tissue is called somitic mesoderm [3], see Figure 2. This transition from presomitic to somitic mesoderm is first marked by a pre-pattern of bands of gene expression that help demarcate nascent somite borders, see Figures 3 and 4.
Different species of vertebrates have different numbers of vertebrae, and the PSM elongates during somitogenesis to accommodate the length of the particular organism’s trunk, e.g., a mouse as opposed to a snake. At the posterior end of the early embryo is the tailbud, a proliferative zone where immature cells are continually added to the posterior PSM. Cell division and rearrangement diminish considerably in the PSM, and cells’ positions relative to each other do not change considerably. A cell’s relative position within the PSM does change, however, as the tailbud grows away posteriorly and the oldest cells in the anterior PSM segment in groups to form somites. Somitogenesis stops when the anterior formation of somites has progressed posteriorly across the entire PSM, reaching the arresting growth in the tailbud [3, 5, 6, 7, 8, 9, 10, 11].
The morphological changes of somitogenesis can be more easily viewed in vivo in certain species. In particular, zebrafish (Danio rerio) has an exposed, translucent embryo, allowing both easy access and visualization with a light microscope, see
Figure 1. In the last several decades, modern microscopy, molecular biology, and
bioinformatics have enabled the determination of many of the underlying genetic
mechanisms of somitogenesis (compare the resolutions of Figures 3 and 4).
Importantly, such investigations of the spatiotemporal dynamics of gene expression during
somitogenesis can be complemented by quantitative mathematical modeling.
In a larger context, an ever increasing portion of the biological sciences is now
conducted with a quantitative mathematical modeling component traditionally
reserved for engineering and the “exact sciences” such as physics or chemistry. This has
led to cross-disciplinary fields in addition to mathematical biology, such as systems
biology and computational biology. Many investigations, including the present one,
draw upon all three disciplines.
Mathematical Insights into Somitogenesis
Can mathematics help uncover the biological mechanisms of pattern formation
during somitogenesis?
The spatiotemporal dynamics of somitogenesis have long been recognized as a
good candidate for mathematical modeling. Important features of these dynamics
are shared by many organisms, and consist of the following (recall Figure 3):
1. Steady, clock-like oscillatory gene expression in the growing tail of the embryo.
Figure 4: High resolution confocal micrographs of gene expression during zebrafish somitogenesis. The anterior direction is up. Individual cells are colored blue. The mRNA expression of three genes her1, her7, and DeltaC in the PSM are shown in green in the top three panels, respectively. Nuclear and cytosolic localization of her1 mRNA transcripts may be clearly seen in the bottom panel with higher resolution. myoD mRNA appears in red, and demarcates the neural tube and most recently formed somites. Taken from [4, Figure 3] under the Creative Commons Attribution License.
2. Waves of expression that emanate from the oscillations in the tail and sweep
anteriorly across the PSM.
3. A posteriorly traveling determination wavefront that follows behind the growth
of the tail and periodically arrests oscillations in the anteriorly traveling waves
of expression into fixed bands of expression.
In the 1970s, even before the genetics behind somitogenesis started to be uncovered, investigators such as Cooke and Zeeman [12] adapted the catastrophe theory of
Thom [13] to propose a theoretical framework for the periodic formation of somites
as a spatiotemporal sequence of “catastrophes”. The model mechanism was termed
the clock and wavefront. In more modern mathematical parlance, somite formation was proposed to be a periodic sequence of bifurcations of a dynamical system that switched cohorts of cells from an undifferentiated state in the PSM to a differentiated state in somites.
One shortcoming of the original clock and wavefront model of Cooke and Zeeman was its abstract disconnection from any experimentally observed biological mechanism. Interestingly, the model anticipated the future discovery (in the mid-1990s) of oscillatory gene expression in the PSM [14]. The 1980s saw other phenomenological
models applied to pattern formation, including somitogenesis. Most prominent among
these were the family of reaction-diffusion models of Meinhardt [15] that treated
tissues as a spatial continuum and thus employed partial differential equations (PDEs).
In the case of somitogenesis, these models also lacked a direct connection to any
experimentally observed biological mechanism.
Extensive scientific work in the first part of the 20th century led to the emergence
early in the second half of the century of the so-called central dogma of molecular
biology. The central dogma states that genetic information stored in a cell’s DNA is expressed as various proteins through intermediary, information carrying molecules called messenger RNA (mRNA). Further advances in the second half of the last century ultimately led to the Human Genome Project and the complete mapping of human DNA, as well as the DNA of many other model organisms. One of the resulting challenges of the 21st century is to understand the epigenome, that is, how genetic information in the DNA is ultimately expressed in individuals given the genetic control mechanisms and their past and present interaction with the environment.
Soon after the appearance of the central dogma, many of the resulting, qualitatively identified genetic control circuits were translated into quantitative mathematical models at the cellular level. For example, see [16, 17, 18, 19, 20], which discuss both steady state and oscillatory behaviors in cellular control systems with feedback.
In many cases, this process involved extending existing compartmental models of chemical reactions, both organic and inorganic, to mRNA and protein in cells. Examples include mass action kinetics, Michaelis-Menten enzyme kinetics [21], and the extension of Hill’s equation for cooperative binding of oxygen to hemoglobin [22] to the activation or repression of mRNA transcription from DNA by protein transcription factors. Consideration of statistical thermodynamics led to the approach of Shea and Ackers [23] for modeling the control of gene expression in prokaryotes.
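As a brief illustration of how Hill-type kinetics can be carried over to transcriptional control, the following sketch evaluates a commonly used normalized form of the Hill function for repression and for activation. The function names, the half-maximal level `k_half`, and the Hill coefficient `n` are illustrative assumptions, and are not taken from any of the cited models.

```python
def hill_repression(p, k_half, n):
    """Fraction of maximal transcription when a repressor is present at level p.

    k_half is the repressor level giving half-maximal transcription and
    n is the Hill coefficient (both hypothetical illustration parameters).
    """
    return 1.0 / (1.0 + (p / k_half) ** n)


def hill_activation(p, k_half, n):
    """Fraction of maximal transcription when an activator is present at level p."""
    x = (p / k_half) ** n
    return x / (1.0 + x)


# With no repressor, transcription is maximal; at p = k_half it is halved.
no_repressor = hill_repression(0.0, 10.0, 2)
half_repressed = hill_repression(10.0, 10.0, 2)
```

Larger values of `n` give a sharper, more switch-like response around `k_half`, which is one reason such terms appear in models of oscillatory feedback.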
A large class of cell-based mathematical models of somitogenesis follows in the footsteps of these initial modeling efforts [24]. Individual eukaryotic cells can be represented by systems of ordinary or delay differential equations (ODEs or DDEs), with state variables in the newest models representing experimentally observed mRNA and proteins [2, 25, 26, 27, 28, 29, 30]. Multicellular tissues and interactions between cells can be modeled by considering each cell as a subsystem in a larger coupled system of differential equations. Cells undergoing somitogenesis are essentially fixed relative to each other in the PSM, simplifying the cell-based approach by obviating the need to track cell migrations. Furthermore, non-diffusive, contact-dependent intercellular signaling mechanisms can be easily handled.
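To make the cell-based DDE approach concrete, the following is a minimal sketch of a single cell with delayed negative feedback, in which a protein represses transcription of its own mRNA. It is a generic two-variable loop integrated by forward Euler with zero history, and it is not the model or the solver used in this work. All parameter values and names (`a`, `b`, `c`, `k`, `p0`, `tau_m`, `tau_p`) are illustrative placeholders.

```python
def simulate_cell(t_end=200.0, dt=0.01, a=4.5, b=0.23, c=0.23,
                  k=33.0, p0=40.0, tau_m=12.0, tau_p=2.8):
    """Forward-Euler integration of a delayed self-repression loop.

    m is the mRNA level and p is the protein level. Both start at zero,
    and both are taken to be zero for all times before t = 0.
    """
    n_steps = int(round(t_end / dt))
    lag_m = int(round(tau_m / dt))  # transcriptional delay, in steps
    lag_p = int(round(tau_p / dt))  # translational delay, in steps
    m = [0.0] * (n_steps + 1)
    p = [0.0] * (n_steps + 1)
    for i in range(n_steps):
        # Delayed states, with zero history before the start of the run.
        p_delayed = p[i - lag_m] if i >= lag_m else 0.0
        m_delayed = m[i - lag_p] if i >= lag_p else 0.0
        # Hill-type repression of transcription, linear decay of mRNA.
        dm = k / (1.0 + (p_delayed / p0) ** 2) - c * m[i]
        # Translation from delayed mRNA, linear decay of protein.
        dp = a * m_delayed - b * p[i]
        m[i + 1] = m[i] + dt * dm
        p[i + 1] = p[i] + dt * dp
    return m, p


m, p = simulate_cell()
```

A multicellular version would hold one such pair of state lists per cell and add coupling terms between neighbors. A production implementation would more likely use a dedicated DDE solver than fixed-step Euler.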
The existence of fast and powerful numerical solvers allows simulation of the resulting models, even for large systems of equations. Such simulations can be used to validate the model against experiment, so that the model can ultimately guide the formulation and verification of scientific hypotheses about the underlying biological mechanisms.
There are, of course, challenges in mathematically modeling biological systems [31]. Like most biological systems, developmental systems usually display multiple time and/or spatial scales [24, 32, 33], and a careful accounting of such scales is necessary to attain the proper balance between simplicity and accuracy. Stochastic effects are often assumed to be “averaged out” in deterministic models, but this assumption may not be strictly valid at the cellular level and may depend on the chosen time or spatial scale [34]. Furthermore, experimental data used for model construction and validation may be highly qualitative and the complex biological interactions involved may be only partially known. Finally, many model parameters may be too hard and/or costly to measure, and therefore must be estimated. In spite of such challenges, mathematical models can provide considerable insight into the mechanisms of biological systems, including the process of somite formation.
Purpose and Scope of the Present Work
Given the above setting for somitogenesis modeling, this dissertation presents two multicellular, deterministic mathematical models of somitogenesis pattern formation.
These models are carefully constructed considering both existing mathematical models and the substantial body of experimental research into somitogenesis.
The first model represents cells in the PSM as simple, uncoupled phase oscillators.
Although this first model has limited foundation in biological experiment, it successfully captures the essential spatiotemporal dynamics of somitogenesis. As such, the model acts as a “proof of concept” of the clock and wavefront mechanism and offers modest improvements over similar existing models such as [35, 36].
The second model is a minimal, biologically-based model of a central feature of somitogenesis. As a partial implementation of the clock and wavefront mechanism, it reproduces the coordinated oscillatory gene expression that initiates the posterior-most waves of gene expression. The model is minimal in the sense that, given the experimental data, minimal biological circuitry is used to implement the clock mechanism. Following the work of Lewis [27] and Cinquin [30], the cell-based model employs a system of delay differential equations. The ability of the model to replicate somitogenesis experiments in zebrafish is examined, and the robustness of this model is examined with respect to both the estimated model parameters and heterogeneity in parameter values across the entire cell population.
There are several important implications of this modeling work on the underlying molecular biology of somitogenesis. First, a mathematical model is carefully constructed with regard to the important biological factors. Second, through extensive model simulation and computational analyses, several potential biological mechanisms are considered and eliminated, leaving only one minimal mechanism that successfully reproduces experimental observations of somitogenesis. In particular, both the number of binding sites for self-inhibiting transcription factors and differential decay rates of protein monomers and dimers are shown to be essential elements of the modeled system.
A key technical feat of this work is the computational component of the model simulation, analysis, and optimization. This involved novel use of existing computational tools, as well as significant algorithm development, computationally assisted analysis and optimization, and coding in Matlab and, to a lesser extent, Mathematica. Parallel computing was employed for more efficient parameter estimation using Monte Carlo simulations. Statistical analysis of the resulting large datasets was necessary for model selection and also generated useful parameter sensitivity information. Object-oriented programming tools were employed for accurate and reproducible randomization of large parameter sets used in testing model robustness to perturbation.
Before presenting the new models, a more thorough review of the existing mathematical models of somitogenesis will be presented in the next chapter. This will provide the necessary background and perspective for proper consideration of the new models presented subsequently.
SURVEY OF EXISTING MATHEMATICAL MODELS
Mathematical modeling of somitogenesis stretches back some thirty years. Not
surprisingly, the mathematical models have evolved alongside the growth in scientific
understanding of somitogenesis. Theoretical understanding has progressed in the
past half-century by advances in quantitative experimental molecular biology
complemented more recently by mathematical models and computational tools. This
chapter presents a brief history and review of the existing mathematical models of
somitogenesis, providing perspective and a foundation for the new models presented
in Chapters 3 and 4.
Early Models: Pattern Formation and Morphogenesis
Mathematical models for pattern formation predate applications to somitogenesis.
In 1952, the pioneering work by Turing [37] showed that a system of reaction-diffusion
(RD) equations with two chemical components could produce spatial patterns. Counterintuitively, the addition of diffusion to a spatially uniform and temporally stable system was shown to be capable of destabilizing the uniform distribution into a transient pattern (i.e., the Turing instability). Later RD models, such as the “local activation, long range inhibition” model of Gierer and Meinhardt [38], added nonlinearities
into the reaction terms which stabilized these patterns [39]. These models suggested
mechanisms of morphogenesis in developmental biology, where initially homogeneous
stem cell populations differentiate spontaneously into tissues with structure and
pattern.
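The diffusion-driven instability described above can be checked numerically for a linearized two-component system. The sketch below computes the largest growth rate of a spatial perturbation with wavenumber k from the eigenvalues of J − k²D, where J is the Jacobian of the reaction terms at the uniform steady state and D is the diagonal matrix of diffusion constants. The Jacobian entries and diffusion constants are hypothetical values chosen only for illustration.

```python
import cmath


def max_growth_rate(jac, diff, k):
    """Largest real part of the eigenvalues of J - k^2 D for a 2x2 system."""
    (a, b), (c, d) = jac
    du, dv = diff
    a_k = a - du * k ** 2
    d_k = d - dv * k ** 2
    trace = a_k + d_k
    det = a_k * d_k - b * c
    root = cmath.sqrt(trace ** 2 - 4.0 * det)
    return max(((trace + root) / 2.0).real, ((trace - root) / 2.0).real)


# Hypothetical linearization about a uniform steady state.
JAC = ((1.0, -1.0), (2.0, -1.5))
wavenumbers = [0.05 * i for i in range(41)]  # k from 0 to 2

# Without spatial variation (k = 0) the steady state is stable.
uniform = max_growth_rate(JAC, (1.0, 20.0), 0.0)
# Equal diffusion constants: every wavenumber decays.
equal = max(max_growth_rate(JAC, (1.0, 1.0), k) for k in wavenumbers)
# A much faster second component: some wavenumbers grow.
unequal = max(max_growth_rate(JAC, (1.0, 20.0), k) for k in wavenumbers)
```

In this example the uniform state is stable to spatially uniform perturbations and to all perturbations under equal diffusion, but a band of wavenumbers grows once the second component diffuses sufficiently faster than the first.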
Mathematical interest in morphogenesis continued through the second half of the
20th century. Of special note is Wolpert’s proposed mechanism, in the late 1960’s,
of so-called morphogen gradients, whereby a spatial gradient of a biomolecule confers positional information to developing tissues [40]. The existence of such gradients
was initially speculative, yet experimental evidence for them has since been found in
several systems (e.g., [41, 42, 43]). The theoretical framework of morphogen gradients remains influential to the present day [44], and has been extended to include temporal
aspects. Contemporaneously with Wolpert’s early work, morphogenesis theories by
Thom [13] were inspired by the emergence of catastrophe theory, which led, notably,
to one of the earliest and most enduring theories of somitogenesis by Cooke and
Zeeman.
In 1976, Cooke and Zeeman [12] postulated that somitogenesis could be explained
by a clock and wavefront mechanism. In this model, the susceptibility of cells in the
presomitic mesoderm (PSM) to form somites continually oscillates between
susceptible and insusceptible (the clock), while a determination wavefront sweeps posteriorly
across the PSM. The passing wavefront triggers cells to form somites, but does so
only when cells are susceptible, i.e., when their clocks are in the correct phase of
oscillation. Since adjacent cells are in phase, cohorts of cells are recruited in
succession to form somites. Mathematically, the theory supposes a series of bifurcations (or
“catastrophes”) that underlie the sequential formation of somites.
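The recruitment of cohorts by the clock and wavefront mechanism can be sketched with simple arithmetic. In the toy example below, a wavefront passes a line of cells at constant speed, all cells share one clock, and every cell passed during the same clock cycle joins the same somite. The function name, the cell spacing of one unit, and the values of `wave_speed` and `clock_period` are assumptions made for illustration, and are not part of the original model of Cooke and Zeeman.

```python
import math


def assign_somites(n_cells, wave_speed, clock_period):
    """Return the somite index of each cell, listed anterior to posterior."""
    somite_of_cell = []
    for x in range(n_cells):
        # Time at which the determination wavefront reaches cell x.
        t_pass = x / wave_speed
        # The cell waits for the next susceptible phase of the shared clock,
        # so all cells passed during one clock cycle segment together.
        somite_of_cell.append(math.floor(t_pass / clock_period))
    return somite_of_cell


# Wavefront speed of 0.5 cells per minute and a 10 minute clock period
# give somites that are 0.5 * 10 = 5 cells long.
somites = assign_somites(n_cells=20, wave_speed=0.5, clock_period=10.0)
```

In this sketch the somite length is the product of the wavefront speed and the clock period, so slowing the clock or speeding up the wavefront yields larger somites.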
Cooke and Zeeman proposed their model with minimal biochemical evidence for
either the clock or the wavefront. Because early heat shock experiments on developing
embryos caused a periodic disruption in somite formation whose timing agreed with
the known timing of the cell cycle, the clock was initially thought to be closely linked
to the cell cycle [45]. This led to a line of mathematical models based upon the
apparent cell cycle connection [46, 47, 48, 49, 50].
Mounting experimental evidence has since ruled out the cell cycle as the fundamental oscillator in the clock and wavefront mechanism. In 1997, Palmeirim and
coworkers [14] discovered a gene with oscillatory expression in the PSM of the chick
embryo, providing an alternative candidate for the clock [51]. Experimental work has
since identified multiple oscillatory genes in each of several model organisms, including
mouse and zebrafish [3]. It should be noted that gene expression does not oscillate synchronously throughout all the cells of the PSM. Instead, the oscillatory expres-
sion in cells is coordinated so that an anteriorly traveling clock-wave is oppositely
directed to the posterior movement of the determination wavefront. Interestingly,
the discovery of oscillatory gene expression in the PSM forged a closer mathematical
connection between somitogenesis and other biological rhythms, such as those studied
extensively by Goodwin [16, 18], Winfree [52], and Goldbeter [53].
With several variants proposed along the way (e.g., [54, 55]), the clock and wave-
front mechanism has become a prominent model of somitogenesis. For additional
reviews and comparisons, see [3, 6, 7, 46, 56, 57, 58, 59, 60]. More recent mathematical
models of the clock and wavefront mechanism are discussed below.
Tissue-Based Reaction-Diffusion Models
In the 1980s, a mathematically precise model of somite formation emerged from
the earlier pattern formation work by Gierer and Meinhardt [38]. The so-called Mein-
hardt models of somite formation [15, 61] rose to prominence despite the fact that
they were phenomenological in nature and lacked any direct link to experimentally
observed biological mechanisms. These models were extensions of the following non-
linear RD system of partial differential equations:
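\begin{align*}
\frac{\partial a}{\partial t} &= \rho \frac{a^2}{h} - \mu a + D_a \frac{\partial^2 a}{\partial x^2}, \\
\frac{\partial h}{\partial t} &= \rho a^2 - \nu h + D_h \frac{\partial^2 h}{\partial x^2},
\end{align*}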
where a(t, x) is the concentration of a supposed activator molecule that may diffuse
through the one-dimensional tissue with diffusion constant $D_a$, $h(t, x)$ is the concentration of a supposed inhibitor molecule with diffusion constant $D_h$, and Greek letters represent positive system parameters. The activator $a$ auto-catalyzes but is inhibited by $h$ (note the $\rho a^2 / h$ production term). The basic conditions for pattern formation in this system are that the activator diffuses more slowly than the inhibitor ($D_a < D_h$) and the activator decays more quickly than the inhibitor ($\mu > \nu$) [39, 62]. This
situation is termed “local self-enhancement and long-ranging inhibition” [62]. One of
the nicest features of the model variant used for describing somitogenesis is that it
naturally forms the observed somite polarity [62].
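As a small worked step that is not part of the cited presentations, the spatially uniform steady state of the RD system above can be found by setting the time derivatives and the diffusion terms to zero. The symbols $a^*$ and $h^*$ are notation introduced only for this sketch, and $a^* \neq 0$ is assumed:

```latex
\begin{align*}
\rho \frac{(a^*)^2}{h^*} = \mu a^*
  &\;\Longrightarrow\; h^* = \frac{\rho}{\mu}\, a^*, \\
\rho (a^*)^2 = \nu h^* = \frac{\nu \rho}{\mu}\, a^*
  &\;\Longrightarrow\; a^* = \frac{\nu}{\mu}, \qquad
  h^* = \frac{\rho\, \nu}{\mu^2}.
\end{align*}
```

Spatial patterns in systems of this kind are typically studied as departures from such a uniform state once diffusion is taken into account.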
The Meinhardt models are still employed today and can reproduce a wide range
of developmental patterns [62]. However, newer RD models for somitogenesis have
emerged that specifically incorporate the action of the experimentally observed clock
and wavefront in the formation of the somite pre-pattern1 in the anterior-most PSM.
Prominent examples are the recent models by Baker, Schnell, and Maini [57, 59, 63],
which are a partial reformulation of earlier somitogenesis models based on the cell
cycle [50]. These models are still largely phenomenological, especially with respect
to the clock. However, they have successfully reproduced experiments in which the
wavefront is perturbed [57].
Baker and coworkers have also recently formulated a biologically informed RD
model of the wavefront, which is understood to involve an anterior-posterior (AP)
gradient of Fibroblast Growth Factor (FGF) in the PSM [64]. Decreasing levels of
FGF are at least partially responsible for triggering somite formation [65]. However, they have not yet incorporated this model into the above model for pre-patterning. Still other recent RD models describe the segmentation of the tissue that occurs after pre-patterning, which involves cell rearrangement and changes in cell adhesion [49, 57, 58, 66, 67].

Footnote 1: The somite pre-pattern refers to the stable bands of high-low gene expression in the anterior-most PSM, which form before visible morphological segmentation occurs.
Cell-Based Models
In the clock and wavefront mechanism, modeling of the clock oscillations has been
dominated by cell-based models. This predominance can likely be traced back to
early ordinary differential equation (ODE) models such as those by Goodwin [16, 18] and Griffith [19, 20], where the discovery of intracellular genetic regulatory mecha-
nisms made compartmentalized, cell-based approaches for homeostasis and biological
rhythms a viable alternative to continuum, tissue-based approaches [17]. However,
phenomenological clock models continue to be proposed, most of which are premised
on cell-based phase oscillators where the clock does not have a specific biochemical
basis.
Phase Oscillators
In 1997, based on the discovery of oscillatory gene expression in chick by Palmeirim
and coworkers, Lewis [35] developed a clock and wavefront model that treated an
axial line of mesodermal cells as uncoupled phase oscillators. By prescribing an
anteriorly slowing frequency of phase oscillations in cells along the PSM tissue, the
model produced anteriorly traveling phase waves that emanated from a steady, clock-
like oscillation in the tailbud. The wavefront was associated with the frequency
of the phase oscillators decreasing to zero, and phases of successive blocks of cells
were thereby arrested anteriorly into an alternating high-low pattern. However, the model contained no direct biological mechanism for the clock, the wavefront, or their interaction.
More recent phase oscillator models have also been proposed. As a discrete reformulation of the continuum-based Flow-Distributed Oscillator models of Kaern and coworkers [68], Jaeger and Goodwin [36] developed a cell-based, uncoupled phase oscillator model that was similar to Lewis’s earlier model. However, the authors did not view their Cellular Oscillator model as an implementation of a clock and wavefront mechanism. The general setting of the model made it capable of producing multiple kinds of fixed stripe patterns by arresting a traveling spatiotemporal wave.
The most recent phase oscillator models have included intercellular coupling [69, 70]. Intercellular signaling pathways are known to be active in PSM cells during somitogenesis [71]. Using coupled systems of ordinary [69] or delay [70] differential equations, these models focus on phase coupling while abstracting the details of the oscillator mechanism in the cells. This can simplify the mathematical analyses, allowing easier examination of how coupling makes pattern formation robust to the effects of system noise.
Ordinary Differential Equation (ODE) Models
After the discovery of oscillatory expression of c-hairy1 in the PSM of chick in
1997, a plethora of other genes with oscillatory cellular expression were soon found in zebrafish, chick, and mouse [3, 5]. In all three model organisms, two dominant genetic motifs for the clock emerged from these investigations. The first was an intracellular self-repression loop of so-called basic Helix-Loop-Helix (bHLH) genes, and the second was the intercellular positive feedback mechanism of the Notch pathway. The various mathematical models of the clock have typically focused on one or both of these motifs.
One of the first mathematical models of a clock-gene was presented in 2002 by
Hirata and coworkers [26], who investigated oscillatory expression of the bHLH Hes1 protein in mouse. They established experimentally that the Hes1 protein acted as a self-repressing transcription factor by inhibiting hes1 mRNA production, and that sustained oscillations required fast decay of the Hes1 protein by ubiquitin-proteasome-mediated degradation. They complemented their experimental investigations with the following ODE model:
\begin{align}
\frac{dx}{dt} &= B y - C x - A x z, \tag{1} \\
\frac{dy}{dt} &= \frac{E}{1 + x^2} - D y, \tag{2} \\
\frac{dz}{dt} &= \frac{F}{1 + x^2} - G x - A x z, \tag{3}
\end{align}
where $x$ is the Hes1 protein concentration in the cell, $y$ is the hes1 mRNA concentration, and $z$ is a presumed “Hes1-interacting factor” that allows the system to sustain oscillations [26]. $A$–$G$ are positive parameters affecting production and decay rates. Note that the $x^2$ term in the denominator of the production terms for $y$ and $z$ represents some form of interaction of the Hes1 protein with itself, perhaps through homodimerization or cooperative binding at multiple DNA binding sites.
Since this first model was published, additional ODE models have been put forward. An early model of Cinquin [72] was one of the first to question whether oscillations were truly cell-autonomous as opposed to requiring intercellular signaling to be sustained. A more recent multicellular model by Tiedemann and coworkers [29] had some success incorporating a wavefront mechanism into the clock to arrest oscillations into a pattern. Like the above model by Hirata et al., this model introduced a third equation for separate tracking of the Hes1 protein in the cytosolic and nuclear compartments, which enabled the system to exhibit sustained oscillations. This model also incorporated a nonlinear, rate-limited protein decay mechanism in the nuclear
compartment only, while additional results with intercellular signaling were only pre-
liminary. Another recent model took the same subcellular compartment approach
in tracking the Hes1-related clock-gene Hes7 [73], and it was also shown that rate-
limiting decay mechanisms play a dual role with the number of repressor binding sites
in the generation of sustained oscillations.
Lastly, it should be noted that in the past five years considerable genetic complex-
ity has been uncovered in the mouse oscillator [3, 7, 74]. Oscillations are currently
believed to be coupled between three interacting signaling pathways (Notch, Wnt,
and FGF), each with multiple biomolecular agents. Goldbeter and Pourquié [75] have recently published a comprehensive ODE model of the oscillator, which for a single cell requires sixteen ODEs with another sixteen auxiliary algebraic equations and some 77 parameters!
Delay Differential Equation (DDE) Models
The requirement of the somewhat mysterious third state variable z for sustained
oscillations in the ODE model by Hirata et al. led to the introduction of delay
differential equation (DDE) models of the clock. These models introduced biologically
realistic transcription, translation, and transport delays into the production terms
of protein and mRNA, which allowed sustained oscillations in a system with only
these two dependent variables and a reduced number of model parameters. DDE models of negative feedback with time delay have arguably become the most prominent models of clock oscillations in somitogenesis.
Sustained Hes1 oscillations in mouse cells produced by a DDE model of gene
regulation were initially reported in 2003 by Jensen and coworkers [76] and also by
Monk [77]. In the same journal issue as Monk, Lewis [27] published a similar model
of delayed autoinhibition induced oscillations of homologous bHLH clock-genes in
zebrafish. The simplest version of Lewis’s model was the following system of DDEs:
\begin{align}
\frac{dp(t)}{dt} &= a\, m(t - T_p) - b\, p(t), \tag{4} \\
\frac{dm(t)}{dt} &= \frac{k}{1 + \left( \dfrac{p(t - T_m)}{p_0} \right)^2} - c\, m(t), \tag{5}
\end{align}
where p is the protein concentration in the cell produced with delay Tp > 0, and m
is the mRNA concentration produced with delay Tm > 0. The positive parameters
a, b, k, and c affect production and decay rates, and p0 is a critical concentration of the protein at which mRNA production is half its maximum value k.
Equations (4)–(5) are essentially the same as equations (1)–(2) above when z = 0 and Tp = Tm = 0. Dulac’s Criterion can be used to show that solutions to (4)–(5)
cannot be nontrivially periodic if Tp = Tm = 0 (see Appendix A). With sufficiently
long total delay Tp + Tm and sufficiently large decay constants b and c, this system exhibits sustained oscillations for a large range of the remaining parameter values a,
k, and p0; however, the period of oscillation is sensitive to the total delay [27].

Lewis extended this basic model with the addition of intercellular positive feedback on the clock-gene via Notch coupling, and showed that synchronization of two
coupled cells with different natural frequencies was possible. Lewis also considered the
role of a second self-repressing clock-gene that heterodimerizes with the first clock-
gene. Followup experimental and modeling work by Lewis and coworkers measured
certain key parameters of the model [4] and established that Notch signaling acts as
a coordinator of clock oscillations in zebrafish, but not as a fundamental driver of
oscillations in individual cells2 [2].
Footnote 2: Actually, a very simple DDE model for Notch coupled oscillations by Jiang and Lewis slightly predates this 2003 model [78].
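The delayed system (4)–(5) can be explored numerically with a fixed-step scheme that stores past values. The following is a minimal Python sketch and not the method of [27]: it uses forward Euler, holds the history constant at the initial values for t < 0, and is meant to be run with illustrative parameter values that are assumptions of this sketch rather than values taken from the text above. The function name `simulate_delayed_autoinhibition` is hypothetical.

```python
def simulate_delayed_autoinhibition(a, b, c, k, p0, Tp, Tm, t_end, dt,
                                    p_init=0.0, m_init=0.0):
    """Forward-Euler integration of the delayed system (4)-(5).

    Assumption of this sketch: for t < 0 the protein and mRNA levels are
    held constant at p_init and m_init (a constant history function).
    Returns two lists (protein, mRNA) sampled every dt from t = 0 to t_end.
    """
    lag_p = int(round(Tp / dt))  # protein production delay, in steps
    lag_m = int(round(Tm / dt))  # mRNA production delay, in steps
    n_steps = int(round(t_end / dt))

    p = [p_init]
    m = [m_init]
    for i in range(n_steps):
        # Look up the delayed states, falling back to the constant history.
        m_delayed = m[i - lag_p] if i >= lag_p else m_init
        p_delayed = p[i - lag_m] if i >= lag_m else p_init

        dp = a * m_delayed - b * p[i]                          # equation (4)
        dm = k / (1.0 + (p_delayed / p0) ** 2) - c * m[i]      # equation (5)

        p.append(p[i] + dt * dp)
        m.append(m[i] + dt * dm)
    return p, m


if __name__ == "__main__":
    # Illustrative values only (assumed; time in minutes, amounts per cell).
    protein, mrna = simulate_delayed_autoinhibition(
        a=4.5, b=0.23, c=0.23, k=33.0, p0=40.0,
        Tp=2.8, Tm=12.0, t_end=300.0, dt=0.01)
    print("final protein level:", protein[-1])
    print("final mRNA level:", mrna[-1])
```

Calling the same function with Tp = Tm = 0 integrates the undelayed system, which gives a simple way to compare the two regimes side by side.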
A 2007 paper by Cinquin [30] developed a related two clock protein model with intercellular activation that required thirteen differential equations for each cell. The model development required that numerous parameters be estimated, but was significant in that it extended Lewis’s two coupled cell model to a one-dimensional, anterior-to-posterior (AP) line of coupled cells. Furthermore, the model included an AP graded control protein that interacted with the clock-proteins via heterodimerization. The heterodimers repressed clock-protein production alongside the other clock protein dimers. The model is notable in that it generated spatiotemporal waves of expression of the clock-genes across the PSM. However, these waves did not arrest anteriorly. Prior to this, the only multicellular extension of Lewis’s model was an examination of lateral synchronization of oscillators [28, 79], which did not consider
axial control of the oscillation rate.
Along with Hes1, other clock-genes such as Hes7, Lfng, Axin2, Notch, and Wnt
oscillate in the PSM of mouse [3]. In parallel with the above DDE modeling devel-
opments in zebrafish, a series of DDE models for Hes and other oscillators in mouse has been published. These models have focused on various aspects of the oscillator such as the instability of the protein [80], the number of repressor binding sites [81],
the role of co-repressors [82], the instability of cell-autonomous oscillations [83], the interaction between multiple signaling pathways [84], and the interaction with the
determination wavefront [25].
A limited number of the above models have had stochastic components, specifi-
cally [27, 30, 83]. In [27], Lewis showed that a certain level of transcriptional noise
added to his deterministic model could help sustain oscillations in a parameter regime
that produced damped oscillations in the corresponding deterministic model, a phenomenon known as stochastic resonance. The proper accounting of internal and external noise during somitogenesis is an area of active interest [85].
Finally, the amount of mathematical analysis that has accompanied these models has been relatively limited. Even simple linear analyses, such as the computation of
Hopf bifurcations, are complicated by the presence of delays [56]. Some very recent progress has been made [86, 87, 88, 89, 90]. The work by Verdugo and Rand [88] is notable in that they were able to continue the periodic solution of the system (4)–(5) away from the Hopf bifurcation point via an asymptotic expansion using Lindstedt’s method, as well as find closed form expressions for the period and amplitude of the approximate solutions with respect to the parameters.
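To illustrate why delays complicate even the linear analysis, consider a sketch of the linearization of (4)–(5) about a steady state. The symbols $f$, $p^*$, $m^*$, the growth rate $\sigma$, and the amplitudes $P$ and $M$ are notation introduced here for illustration and do not appear in the cited works in this form. Writing $f(p) = k / (1 + (p/p_0)^2)$ and substituting perturbations $P e^{\sigma t}$ and $M e^{\sigma t}$ of $(p^*, m^*)$ gives:

```latex
\begin{align*}
\sigma P &= a\, M e^{-\sigma T_p} - b\, P, \\
\sigma M &= f'(p^*)\, P e^{-\sigma T_m} - c\, M, \\
\Longrightarrow\quad
(\sigma + b)(\sigma + c) &= a\, f'(p^*)\, e^{-\sigma (T_p + T_m)},
\qquad
f'(p^*) = -\frac{2 k\, p^* / p_0^2}{\left( 1 + (p^*/p_0)^2 \right)^2} < 0 .
\end{align*}
```

With $T_p = T_m = 0$ this is a quadratic in $\sigma$ with positive coefficients, so both roots have negative real part. With positive total delay the equation is transcendental, which is what makes locating a Hopf bifurcation (a pair of roots $\sigma = \pm i\omega$) harder than in the ODE case.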
Modeling Scopes and Multiple Scales
Biological systems are complex. Careful handling of this complexity is necessary when developing useful mathematical models of biological systems, such as the somitogenesis models discussed above. Two issues of particular importance are the choice of modeling scope and the consideration of multiple scales [32, 33, 56]. These issues are somewhat interrelated.
The complexity of mathematical models of a biological system typically grows as more information about the system becomes available. Unfortunately, the information about biological systems can be simultaneously expansive yet incomplete. Making smart modeling decisions in the face of this dichotomy requires careful consideration of the scientific questions being addressed.
The choice of modeling approach used to meet the scientific objectives may be divided (somewhat artificially) into top-down vs. bottom-up. Put simply, a bottom-up approach tries to identify all the pieces of a system and their interconnections, so that when put together the emergent properties are those observed experimentally. On the other hand, a top-down approach begins with large-scale experimentally observed phenomena, and tries to determine how the overall system function depends on the cooperative interaction of the principal subsystems. The inner workings of these
subsystems can remain very poorly understood, and, as such, may be treated as
“black boxes”.
There are trade-offs to each approach, and each approach (or a hybrid of the two)
is appropriate depending upon the circumstances [56]. For example, a prominent class
of somitogenesis models involves oscillations of gene expression on the cellular level.
In some models these oscillations are prescribed as phase oscillators, without regard
to the biochemical mechanism driving them, because the scientific focus is on the role
of oscillator slowing and/or coupling in pattern formation [70]. In other models, the
focus itself is on the detailed genetic circuits that drive oscillatory expression within
each cell [87]. The former model is simpler with respect to the myriad genes that are involved in the clock, while the latter model may identify specific experimental targets for testing hypotheses about the oscillator mechanism.
The presence of multiple scales is a main reason for the coexistence of top-down and bottom-up approaches in mathematical models of biological systems. Biological systems exist across a broad spatial spectrum: from populations to individuals to organs to tissues to cells to organelles to molecules to ions. There is also a broad range of timescales: from inter-generational evolution to seasonal growth cycles to circadian rhythms to neural impulses. Appropriate consideration of multiple scales typically offers significant opportunity for simplification of mathematical models in the face of biological complexity (e.g., Michaelis-Menten kinetics), while integration of models at different scales becomes an additional consideration [32, 33].
Finally, another complication in mathematically modeling biological systems is the existence of multiple model organisms used for studying phenomena in the life
sciences. For example, the prominent model organisms in somitogenesis are zebrafish, chick, and mouse, each with its own genetics, epigenetics, and evolutionary history.
Consideration of a given mathematical model requires some understanding of the model organism(s) to which the mathematical model is applicable. This is particularly critical when trying to draw conclusions about one organism (e.g., human) from another organism (e.g., mouse).
With these modeling issues in mind, the first model of this work is presented in the next chapter.
A MULTI-STABLE PHASE OSCILLATOR MODEL OF SOMITOGENESIS
As discussed in the previous chapter, the essential features of somitogenesis pat-
tern formation may be produced with a relatively simple, cell-based phase oscillator
model. The main drawback of such a model is that it typically does not incorporate
a biochemical mechanism, and thus provides only a phenomenological description of
the process. Nonetheless, such phase oscillator models offer a useful proof of concept
of the clock and wavefront mechanism of somite formation [35, 36], as well as for
investigations of oscillator synchronization [69].
In this chapter, a simple phase oscillator model is presented that extends the
early modeling work done by Lewis [35] and Jaeger and Goodwin [36], and offers
certain improvements on these models. Unlike a more recent phase oscillator model by
Riedel-Kruse and coworkers [69], which focused on oscillation synchronization through
coupling, the present model does not include intercellular coupling of phase oscillators.
Instead, the model is used to inform the development, in the next chapter, of a
biologically grounded model of the posterior formation of waves of gene expression in
the presomitic mesoderm (PSM). These waves of gene expression are a key component
(the clock) of the clock and wavefront mechanism.
Model Description
Experimental observations of somitogenesis in several model organisms have led
to the following understanding of the elongating embryo [3, 5, 11, 91, 92].
The posterior tailbud consists of a progenitor zone, where the majority of new cells are added to the tailbud. Rapid cell division in a region dorsal to this progenitor zone continually supplies new mesoderm-destined cells to the posterior-most tailbud.
Cell division and mixing continue as these cells move towards the PSM through the initiation zone of the anterior tailbud, where coherent steady or periodic expression of certain somitogenesis genes begins.
At any point in time, the presomitic mesoderm can be divided into posterior
PSM and anterior PSM. Cell division and rearrangement diminish considerably as cells exit the tailbud and initially traverse the posterior PSM. While in the posterior PSM, which is about two-thirds of the presomitic tissue, cells remain in an uncommitted state, capable of forming any part of a future somite. Once cells pass into the anterior PSM they commit to become a specific part of a nascent somite, which has not yet physically segmented. At any point in time, the anterior PSM contains approximately two nascent somites. Segmentation steadily converts the anterior-most
PSM tissue into somites (becoming somitic mesoderm) as new presomitic mesoderm is simultaneously added to the posterior-most PSM.
The clock and wavefront mechanism incorporates several experimental observations of somitogenesis [35, 63]. Cells in the initiation zone of the anterior tailbud have periodic clock-gene expression that oscillates in phase. After leaving the tailbud and entering the posterior PSM, cells’ oscillatory expression rate substantially decreases as they move toward the anterior PSM. The sequentially slowing oscillation rates across many cells produce anteriorly traveling waves of gene expression across the PSM tissue. As the determination wavefront passes posteriorly through the anterior PSM, oscillations arrest in blocks of cells in an alternating high-low expression pattern, where each block represents a nascent, polarized somite.
The present multi-stable phase oscillator (MPO) model represents a simple phenomenological implementation of the clock and wavefront mechanism. The clock is generated by prescribing a constant phase oscillation frequency in the tailbud. The wavefront mechanism moving posteriorly across the PSM slows oscillations into an apparent traveling phase wave (termed the clock-wave), and eventually arrests oscillations into constant phases that are monotonically increasing across groups of cells in a stepwise fashion (multi-stability). When these stepped phases are composed with an appropriate periodic function, an alternating high-low pattern in the anterior-most
PSM is realized.
The MPO model is constructed for a line of $K$ total cells along the medial anterior-to-posterior (AP) axis of the mesoderm and tailbud. Each cell is associated with a phase oscillator, with $\phi_k(t)$ denoting the phase of the $k$th cell, $1 \le k \le K$. The phase may be interpreted as the state of expression of a given cell’s clock-gene(s).
A key experimental observation is that the frequency of synchronized oscillations in the tailbud is equal to the somite formation rate in the anterior-most PSM, which is approximately constant over a significant portion of developmental time [5, 93].
To be as general as possible, time is normalized to the period of tailbud oscillation.
That is, one unit of time is equal to the period of oscillation in the tailbud, which is equal to the formation time of one somite in the anterior-most PSM. Likewise, space is normalized to the AP length of one somite, so that one spatial unit equals the AP length of a single somite.
Cell division and rearrangement decrease considerably after cells exit the initiation zone in the anterior-most tailbud [5, 11, 85, 91]. Thus, cells are assumed to remain stationary relative to each other as they pass through the PSM. Individual cells are assumed to exit the anterior-most tailbud and enter the posterior-most PSM sequentially at a constant rate, traverse the PSM, and ultimately be incorporated into somites forming in the anterior-most PSM. The time when the kth cell enters the
PSM, Tk, is given by the linear relationship
\begin{equation}
T_k = \frac{k - 1}{\lambda\, \mu}, \tag{6}
\end{equation}
where λ is the number of cells per AP somite length, and µ is the somite formation rate, assumed here to be the normalized constant µ = 1 somite per time unit. λ varies with the organism under consideration, and is assumed constant. Although experimental evidence suggests µ and λ may slowly change, especially towards the end of somite formation, the constant value assumptions should be a valid approximation for the majority of somites produced during somitogenesis [93].
To capture the somitogenesis patterning phenomenon, the dynamics of the kth
cell’s phase, φk(t), is given by the following differential equation:
\begin{equation}
\dot{\phi}_k(t) = 1 - s\left( t - T_k;\, \tau_{1/2} \right) \sin^2 (2\pi \phi_k), \tag{7}
\end{equation}
where s is the following continuously differentiable function that reflects the time the
cell has been in the PSM:
\[
s(t; \tau_{1/2}) =
\begin{cases}
0, & t \le 0, \\
\frac{1}{2}\, e^{2 (t - \tau_{1/2}) / t}, & 0 < t \le \tau_{1/2}, \\
1 - \frac{1}{2}\, e^{2 (t - \tau_{1/2}) / (t - 2 \tau_{1/2})}, & \tau_{1/2} \le t < 2 \tau_{1/2}, \\
1, & 2 \tau_{1/2} \le t,
\end{cases}
\]
where $\tau_{1/2} > 0$ is a half-life parameter. The function $s$ is sigmoidal, non-decreasing, concave up for $0 < t < \tau_{1/2}$, and concave down for $\tau_{1/2} < t < 2 \tau_{1/2}$. The function $s$ may be regarded as giving the maturity of a cell in the PSM, or, equivalently, as indicating a cell’s susceptibility to differentiate into part of a somite at a given time $t$. Zero represents fully immature/unsusceptible, while one represents fully mature/susceptible. Figure 5a shows $s$ with the half-life parameter $\tau_{1/2} = 3$.

The function $s$ represents the wavefront in the clock and wavefront mechanism.
Biochemical candidates for the wavefront include morphogen gradients, which are
spatially and/or temporally graded concentrations of one or more chemicals [43, 64].
[Figure 5 plots: (a) y = s(t; 3) versus t (unitless time); (b) y = s(t − T_k; 3), k = 1, ..., 60, versus k (cell #), at t = 6 and t = 8.]

Figure 5: Maturity/susceptibility plots. (a) A plot of $y = s(t; 3)$ showing the change in a cell’s maturity/susceptibility over time, with half-life $\tau_{1/2} = 3$. The cell is fully immature/unsusceptible ($y = 0$) for $t \le 0$ and fully mature/susceptible ($y = 1$) for $t \ge 6 = 2 \tau_{1/2}$. (b) Spatial profiles of maturities/susceptibilities of an anterior-to-posterior line of sixty cells at times $t = 6$ and $t = 8$. $k = 1$ is anterior.
Because cells exit the tailbud in a temporal order, the composition $s(t - T_k; \tau_{1/2})$ represents the maturity/susceptibility of the $k$th cell at time $t$. Figure 5b shows the resulting spatially graded profiles of the levels of $s(t - T_k; \tau_{1/2})$ across an AP line of sixty cells at two different times.
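The maturity function and the entry times of equation (6) are simple enough to evaluate directly. The Python sketch below is illustrative only and is not the Matlab code of Appendix C. It assumes the piecewise exponential form of s written in the docstring, which has the stated properties (continuously differentiable, non-decreasing, equal to one half at the half-life). The function names are hypothetical.

```python
import math


def maturity(t, tau_half):
    """Maturity/susceptibility s(t; tau_half) of a cell t time units after PSM entry.

    Assumed form: 0 for t <= 0; (1/2) exp(2 (t - tau) / t) on (0, tau];
    1 - (1/2) exp(2 (t - tau) / (t - 2 tau)) on [tau, 2 tau); 1 afterwards.
    """
    if t <= 0.0:
        return 0.0
    if t <= tau_half:
        return 0.5 * math.exp(2.0 * (t - tau_half) / t)
    if t < 2.0 * tau_half:
        return 1.0 - 0.5 * math.exp(2.0 * (t - tau_half) / (t - 2.0 * tau_half))
    return 1.0


def entry_time(k, cells_per_somite, somite_rate=1.0):
    """Time T_k at which the k-th cell (k = 1, 2, ...) enters the PSM, equation (6)."""
    return (k - 1) / (cells_per_somite * somite_rate)


def maturity_profile(t, n_cells, cells_per_somite, tau_half):
    """Spatial profile s(t - T_k; tau_half) for k = 1..n_cells (k = 1 is anterior)."""
    return [maturity(t - entry_time(k, cells_per_somite), tau_half)
            for k in range(1, n_cells + 1)]


if __name__ == "__main__":
    # Settings matching the description of Figure 5b: sixty cells, six per somite.
    for t in (6.0, 8.0):
        profile = maturity_profile(t, n_cells=60, cells_per_somite=6, tau_half=3.0)
        print("t =", t, "anterior-most:", profile[0], "posterior-most:", profile[-1])
```

Because s is non-decreasing in its time argument and T_k increases with k, each profile is non-increasing from anterior to posterior, which is the graded shape the wavefront is meant to represent.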
The function $s$ depends on time, which makes the differential equation (7) non-autonomous. However, for any $k$, $s(t - T_k)$ is constant outside the compact transition interval
\[
I_k := \left[ T_k,\; T_k + 2 \tau_{1/2} \right],
\]
allowing autonomous analysis of the dynamics outside this interval. Consider the $k$th cell, so that $s(t - T_k)$ is constant outside the transition interval $I_k$. For $t \le T_k$, $s(t - T_k) = 0$, and so $\dot{\phi}_k(t) = 1$ and the model reproduces the periodic oscillations in the tailbud (the initial clock mechanism). For $t \ge T_k + 2 \tau_{1/2}$, $s(t - T_k) = 1$ and so $\dot{\phi}_k(t) = 1 - \sin^2 (2\pi \phi_k)$ has semi-stable equilibria when $\sin^2 (2\pi \phi_k^*) = 1$, i.e., when
\begin{equation}
\phi_k^* = \pm \frac{1}{4},\; \pm \frac{3}{4},\; \pm \frac{5}{4},\; \ldots. \tag{8}
\end{equation}
See Figure 6.

[Figure 6 plot: y = dφ_1/dt versus φ_1, with τ_{1/2} = 3 and T_1 = 0; curves shown for t ≤ 0, t = 3, and t ≥ 6.]

Figure 6: Phase portrait snapshots for the multi-stable phase oscillator model of the first cell given by the non-autonomous differential equation (7), with $k = 1$, $\tau_{1/2} = 3$, and $T_1 = 0$. The middle (grey) curve at $t = 3$ is transient, but represents the slowing phase oscillations at the instant the cell is halfway to full maturity/susceptibility.
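The location and semi-stability of the equilibria in (8) follow from a one-line identity. The short derivation below is a sketch added for illustration; the integer index $n$ is notation introduced here:

```latex
\begin{align*}
\dot{\phi}_k = 1 - \sin^2 (2\pi \phi_k) = \cos^2 (2\pi \phi_k) \;\ge\; 0,
\qquad
\cos (2\pi \phi_k^*) = 0
\;\Longleftrightarrow\;
\phi_k^* = \frac{1}{4} + \frac{n}{2} = \frac{2n + 1}{4}, \quad n \in \mathbb{Z}.
\end{align*}
```

Since the right-hand side is non-negative on both sides of each $\phi_k^*$, a solution starting just below an equilibrium approaches it, while a solution starting just above moves away toward the next one. This is the sense in which each equilibrium is semi-stable.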
For clarity in the computation of equilibria and discussion of the resulting multi-
stability, a non-generic case has been considered. Note that this example can be made
generic, and hence more robust, by stretching the range of the function s by a small
amount. For example, making the range of $s$ to be $[0, 1 + \epsilon]$, for some $\epsilon > 0$, would suffice.

The positivity of $\dot{\phi}_k$ during the transition interval guarantees that all solutions eventually approach one of these phase angles monotonically from below. Which of these equilibria is approached depends on both the initial phase $\phi_k(0)$ and the slowing of phase oscillations during the transition interval, which itself depends on the choice of the half-life parameter $\tau_{1/2}$ and shape of the function $s$.

Initial conditions for each cell may be chosen so that all cells oscillate in phase before the first cell exits the tailbud at $T_1 = 0$. A convenient choice that properly arranges the cell phases in the first and subsequent somites is $\phi_k(0) = \frac{1}{4}$, $1 \le k \le K$. Furthermore, $\tau_{1/2}$ is chosen to correspond to the lifetime of a cell in the PSM for a given model organism, which is approximately $2 \tau_{1/2}$. Identical initial conditions and monotonicity of solutions, when combined with the ordering of the $T_k$, lead to the emanation of anteriorly traveling phase waves from synchronized tailbud oscillations. In addition, phase wave oscillations arrest blocks of cells in an alternating high-low pattern in the anterior PSM.
Figure 7 shows numerical solutions for the first six cells of the phase oscillator
model described above, where six cells per AP somite length is assumed ($\lambda = 6$) and the maturity/susceptibility half-life is three tailbud oscillation periods ($\tau_{1/2} = 3$). Two groups of three cells reach distinct limiting phase values, with each cell taking slightly longer than $2 \tau_{1/2} = 6$ oscillation periods to essentially reach full maturity/susceptibility. Specifically,
\[
\lim_{t \to \infty} \phi_1(t) = \lim_{t \to \infty} \phi_2(t) = \lim_{t \to \infty} \phi_3(t) = \frac{17}{4},
\]
and
\[
\lim_{t \to \infty} \phi_4(t) = \lim_{t \to \infty} \phi_5(t) = \lim_{t \to \infty} \phi_6(t) = \frac{19}{4}.
\]
The limiting phase difference between the two cohorts of cells represents the polarity between the anterior and posterior halves of the first somite. Figure 8 shows the long-term, grouped behavior of the phase solutions of sixty cells across the PSM. Solutions were computed using Matlab’s ode45 solver [94]. The code that generated the solutions and figures may be found in Appendix C.
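For readers without access to Appendix C, the phase dynamics of equation (7) can be sketched in a few lines of Python. This is an illustrative stand-in, not the author's Matlab code: it swaps ode45 for a fixed-step classical Runge-Kutta scheme, assumes the piecewise exponential form of s given in the docstring, and uses hypothetical function names. Whether it reproduces the reported limits 17/4 and 19/4 to plotting accuracy depends on those assumptions.

```python
import math


def maturity(t, tau_half):
    """Assumed maturity function s(t; tau_half): 0 for t <= 0, 1 for t >= 2 tau,
    with exponential ramps (1/2) exp(2 (t - tau) / t) and
    1 - (1/2) exp(2 (t - tau) / (t - 2 tau)) in between."""
    if t <= 0.0:
        return 0.0
    if t <= tau_half:
        return 0.5 * math.exp(2.0 * (t - tau_half) / t)
    if t < 2.0 * tau_half:
        return 1.0 - 0.5 * math.exp(2.0 * (t - tau_half) / (t - 2.0 * tau_half))
    return 1.0


def phase_rate(t, phi, t_entry, tau_half):
    """Right-hand side of equation (7) for a cell that enters the PSM at t_entry."""
    return 1.0 - maturity(t - t_entry, tau_half) * math.sin(2.0 * math.pi * phi) ** 2


def simulate_cell(t_entry, tau_half, t_end, dt=0.005, phi0=0.25):
    """Integrate one cell's phase from t = 0 to t_end with classical RK4."""
    n_steps = int(round(t_end / dt))
    t, phi = 0.0, phi0
    trajectory = [phi]
    for i in range(n_steps):
        k1 = phase_rate(t, phi, t_entry, tau_half)
        k2 = phase_rate(t + 0.5 * dt, phi + 0.5 * dt * k1, t_entry, tau_half)
        k3 = phase_rate(t + 0.5 * dt, phi + 0.5 * dt * k2, t_entry, tau_half)
        k4 = phase_rate(t + dt, phi + dt * k3, t_entry, tau_half)
        phi += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t = (i + 1) * dt
        trajectory.append(phi)
    return trajectory


def simulate_cells(n_cells=6, cells_per_somite=6, tau_half=3.0, t_end=20.0):
    """Phase trajectories for cells k = 1..n_cells with T_k from equation (6), mu = 1."""
    return [simulate_cell((k - 1) / cells_per_somite, tau_half, t_end)
            for k in range(1, n_cells + 1)]


if __name__ == "__main__":
    for k, traj in enumerate(simulate_cells(), start=1):
        print("cell", k, "phase at t = 20:", round(traj[-1], 4))
```

Two properties hold regardless of the exact shape of s, provided it stays between zero and one: every phase is non-decreasing in time, and at any fixed time a more posterior cell (larger k) has a phase at least as large as a more anterior one.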
[Figure 7 plot: y = φ_k(t) versus t (unitless time) for k = 1, ..., 6, with τ_{1/2} = 3 and T_k = (k − 1)/6.]

Figure 7: Computed solutions of the multi-stable phase oscillator model (7) for the first somite (six total cells) with initial condition $\phi_k(0) = \frac{1}{4}$, $k = 1, 2, \ldots, 6$. Groupings of three cells into constant limiting phases can be seen as cells sequentially reach full maturity/susceptibility. Specifically, $\lim_{t \to \infty} \phi_1(t) = \lim_{t \to \infty} \phi_2(t) = \lim_{t \to \infty} \phi_3(t) = \frac{17}{4}$ and $\lim_{t \to \infty} \phi_4(t) = \lim_{t \to \infty} \phi_5(t) = \lim_{t \to \infty} \phi_6(t) = \frac{19}{4}$.

[Figure 8 plot: y = φ_k(60.00), k = 1, 2, ..., 60, versus x (somite #).]

Figure 8: Long-term computed solution behavior of the multi-stable phase oscillator model (7) for ten somites (sixty total cells) with initial condition $\phi_k(0) = \frac{1}{4}$, for $k = 1, 2, \ldots, 60$. All cells have reached full maturity/susceptibility, and grouping into constant limiting phases is apparent.
The spatial formation of the somitogenesis pattern may be realized by composing an appropriate 1-periodic function with the phase solutions to the differential equation (7). A convenient choice for converting phase to expression level in the $k$th cell, $p_k(t)$, is
\begin{equation}
p_k(t) = \frac{1}{2} \left( 1 + \sin \left( 2\pi \phi_k(t) \right) \right). \tag{9}
\end{equation}
The expression levels pk(t) are normalized between zero and one. Figure 9 shows the development of the resulting somitogenesis pattern over two oscillation cycles in the tailbud. Figure 10 shows the computed approximation of the stable somitogenesis pattern that forms asymptotically.
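The phase-to-expression map in equation (9) can be sketched in a few lines of Python. This is only an illustration, not the Matlab code of Appendix C; the function name `expression_level` is hypothetical. It evaluates the map at the two limiting phases reported for the first somite, 17/4 and 19/4, which land on the high and low extremes of the normalized range.

```python
import math


def expression_level(phase):
    """Map a phase (measured in cycles) to a level in [0, 1] via equation (9)."""
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * phase))


# Limiting phases reported in the text for the first somite.
level_cells_1_to_3 = expression_level(17 / 4)  # phase is 1/4 modulo 1
level_cells_4_to_6 = expression_level(19 / 4)  # phase is 3/4 modulo 1

print(level_cells_1_to_3, level_cells_4_to_6)
```

Because the map is 1-periodic, only the fractional part of the limiting phase matters, which is why two cohorts half a cycle apart appear as opposite halves of a polarized somite.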
Comparison to Existing Phase Oscillator Models
Although the MPO model does not incorporate intercellular coupling, it still offers certain advantages over two existing, uncoupled phase oscillator models of somitogenesis. These are discussed in turn below.
Lewis’s Phase Oscillator Model
The 1997 paper by Palmeirim and coworkers [14] presented the first experimental evidence of a gene with oscillatory expression in the chick PSM that was independent of the cell-cycle. In a supplement to this paper, Lewis presented a phase oscillator model of somitogenesis [35]. The temporal rate of change in phase φ(x, t) of the cell
[Figure 9 plots: panels (a)–(h) show $y = p_k(t)$ at t = 6.00, 6.25, 6.50, 6.75, 7.00, 7.25, 7.50, and 7.75, respectively; horizontal axes: x (somite #); vertical axes: y.]
Figure 9: Reproduction of the spatiotemporal somitogenesis pattern across ten somite lengths by the MPO model, given by equation (9), where the phases $\phi_k(t)$ were computed numerically using equations (7) with initial conditions $\phi_k(0) = \frac{1}{4}$, for $k = 1, 2, \ldots, 60$. (a)–(h) show two oscillation cycles in the tailbud. Red cells are immature/insusceptible, green cells are mature/susceptible, and grey cells are in transition. Polarized somites, each with six cells, form anteriorly as more posterior cells continue to oscillate.
[Figure 10 plot: $y = p_k(60.00)$; horizontal axis: x (somite #); vertical axis: y.]
Figure 10: Reproduction of the long-term somitogenesis pattern across ten somite lengths by the MPO model, given by equation (9), where the phases $\phi_k(60)$ were computed numerically using equations (7) with initial conditions $\phi_k(0) = \frac{1}{4}$, for $k = 1, 2, \ldots, 60$. All cells are green, indicating full maturity/susceptibility. Polarized somites, each with six cells, have formed stably across the entire PSM.
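at position $x \le 0$ at time $t \ge 0$ was given by the following initial value problem¹:
\[ \frac{\partial \phi}{\partial t} = \frac{1}{1 + e^{(x+t)/2}}, \qquad \phi(x, 0) = 0, \]
which may be solved by simple integration to give the solution
\[ \phi(x, t) = \int_0^t \frac{1}{1 + e^{(x+s)/2}} \, ds = t + \ln \left[ \left( \frac{1 + e^{x/2}}{1 + e^{(x+t)/2}} \right)^{2} \right]. \]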
¹Lewis uses more negative values of the continuous position variable x to signify more posterior positions in the PSM, so that somites form from right to left. Non-dimensional spatial units are in somite lengths, and non-dimensional temporal units are in tailbud oscillation periods.
[Figure 11 plot: $y = \phi_\infty(x)$, Lewis Phase Oscillator; horizontal axis: x (somite #); vertical axis: y.]
Figure 11: Asymptotic phase solutions for Lewis’s phase oscillator model [35]. The posterior-most cell is at x = 0. Phases are monotonically distributed but not grouped into polarized somites.
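The asymptotic spatial phase pattern $\phi_\infty(x)$ produced by this model is given by
\begin{align*}
\phi_\infty(x) &:= \lim_{t\to\infty} \phi(x, t) \\
&= \lim_{t\to\infty} \left\{ t + \ln \left[ \left( \frac{1 + e^{x/2}}{1 + e^{(x+t)/2}} \right)^{2} \right] \right\} \\
&= \lim_{t\to\infty} \left\{ \ln e^{t} + \ln \left[ \left( \frac{1 + e^{x/2}}{1 + e^{(x+t)/2}} \right)^{2} \right] \right\} \\
&= \lim_{t\to\infty} \ln \left[ e^{t} \left( \frac{1 + e^{x/2}}{1 + e^{(x+t)/2}} \right)^{2} \right] \\
&= \ln \left[ \lim_{t\to\infty} e^{t} \left( \frac{1 + e^{x/2}}{1 + e^{(x+t)/2}} \right)^{2} \right] \\
&= \ln \left( e^{-x} + 2 e^{-x/2} + 1 \right).
\end{align*}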
See Figure 11 for a plot of φ∞(x).
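As a sanity check on the integration and the limit, the following Python sketch compares the closed-form solution of Lewis's initial value problem with a direct midpoint-rule quadrature, and compares the solution at a large time with the asymptotic formula. The function names and the sample values x = -3, t = 5, and t = 60 are illustrative assumptions, not taken from the dissertation.

```python
import math


def lewis_phase(x, t):
    """Closed form: phi(x, t) = t + 2 ln((1 + e^{x/2}) / (1 + e^{(x+t)/2}))."""
    return t + 2.0 * (math.log1p(math.exp(x / 2.0))
                      - math.log1p(math.exp((x + t) / 2.0)))


def lewis_phase_limit(x):
    """Asymptotic phase: phi_inf(x) = ln(e^{-x} + 2 e^{-x/2} + 1)."""
    return math.log(math.exp(-x) + 2.0 * math.exp(-x / 2.0) + 1.0)


def lewis_phase_quadrature(x, t, steps=20000):
    """Midpoint-rule approximation of the integral of 1/(1 + e^{(x+s)/2}) on [0, t]."""
    h = t / steps
    return h * sum(1.0 / (1.0 + math.exp((x + (i + 0.5) * h) / 2.0))
                   for i in range(steps))


# Closed form versus quadrature at a sample point.
print(lewis_phase(-3.0, 5.0), lewis_phase_quadrature(-3.0, 5.0))
# Large-time value versus the asymptotic formula.
print(lewis_phase(-3.0, 60.0), lewis_phase_limit(-3.0))
```

At x = 0 the asymptotic formula reduces to ln 4, which gives a quick hand check of the final line of the derivation.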
In Lewis’s model, the realization of the high-low expression of groups of cells is
given by the following function h(x, t):
h(x, t) = f(m(x, t)) w(z(φ(x, t))), (10)
where
\[ f(m) = \frac{1}{1 + e^{m/2}}, \qquad m(x, t) = x + t, \qquad w(z) = \frac{1}{1 + e^{10 z}}, \qquad z(\phi) = \cos (2\pi \phi). \]
In the product of the functions f ◦ m and w ◦ z in (10), the composition f ◦ m
eliminates expression in mature cells after a certain time², while w ◦ z acts to normalize
the expression pattern between zero and one. The asymptotic spatial pattern given
by w(z(φ∞(x))), shown in Figure 12, does not capture the high-low limiting phase behavior quite as well as the MPO model (compare to Figure 10).
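The composition in equation (10) can be sketched as follows. The Python function names and the sampling grid of positions are assumptions made for illustration. The asymptotic phase is taken to be ln(e^{-x} + 2e^{-x/2} + 1), as derived for Lewis's model above.

```python
import math


def phi_inf(x):
    """Asymptotic phase of Lewis's model at position x <= 0."""
    return math.log(math.exp(-x) + 2.0 * math.exp(-x / 2.0) + 1.0)


def f_of_m(m):
    """f(m) = 1 / (1 + e^{m/2}); switches expression off as m = x + t grows."""
    return 1.0 / (1.0 + math.exp(m / 2.0))


def z_of_phi(phi):
    """z(phi) = cos(2 pi phi)."""
    return math.cos(2.0 * math.pi * phi)


def w_of_z(z):
    """w(z) = 1 / (1 + e^{10 z}); a steep sigmoid with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(10.0 * z))


def lewis_expression(x, t, phi):
    """h(x, t) = f(m(x, t)) * w(z(phi)) from equation (10)."""
    return f_of_m(x + t) * w_of_z(z_of_phi(phi))


def asymptotic_pattern(x):
    """The pattern w(z(phi_inf(x))) plotted in Figure 12."""
    return w_of_z(z_of_phi(phi_inf(x)))


# Sample the asymptotic pattern across ten somite lengths (hypothetical grid).
positions = [-10.0 + 0.25 * i for i in range(41)]
pattern = [asymptotic_pattern(x) for x in positions]
```

Because the asymptotic phase varies continuously with x rather than settling onto a few discrete values, sampled cells can land on the steep part of w, which is consistent with the remark that some cells are not well polarized.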
Although the MPO model is not as easily solved analytically, it improves upon
Lewis’s model in several ways. First, the construction of the MPO model greatly simplifies the realization of normalized oscillatory expression (compare equation (9)
to equation (10)). Second, the asymptotic grouping of phases within cell cohorts in the
MPO model is more robust (compare Figures 8 and 11). Finally, the MPO model’s
biologically motivated maturation/susceptibility function s, with the half-life parameter $\tau_{1/2}$, makes the model readily tunable to different species.
²This transient expression in the paraxial mesoderm occurs for some genes, but not for others.
[Figure 12 plot: $y = w(z(\phi_\infty(x)))$, Lewis Phase Oscillator; horizontal axis: x (somite #); vertical axis: y.]
Figure 12: Asymptotic somitogenesis pattern for Lewis’s phase oscillator model [35]. The posterior-most cell is at x = 0. Note that some cells are not well polarized.
Jaeger and Goodwin’s Cellular Oscillator Model
In 2001, Jaeger and Goodwin presented “A Cellular Oscillator Model for Periodic
Pattern Formation” [36]. In some sense, their Cellular Oscillator (CO) model generalized Lewis’s earlier phase oscillator model discussed above. However, the authors did not recognize their model as an implementation of a clock and wavefront mechanism³.
Instead, the authors cast their model as a cell-based version of the fluid-based Flow
Distributed Oscillator model of Kaern and coworkers [68].
The MPO model shares several key features with the CO model. Both models use
an AP line of cells with synchronized, periodic oscillations in the tailbud (called the
initiation zone in the CO model). Both models also have sequentially slowing phase
oscillations that create traveling phase waves. The MPO model has the advantage that the phase waves are completely arrested, whereas the CO model’s waves are not [36, see p. 175].
³Cooke and Zeeman’s earliest version of the clock and wavefront model did not anticipate the anteriorly traveling waves of gene expression in the PSM [12, 51].
The construction and analysis of the MPO model makes its connection to the clock and wavefront mechanism more transparent than the CO model. The CO model does allow more complex striped patterns to be formed by non-autonomously altering the frequency of oscillation in the initiation zone and/or changing the rate at which cells exit this zone. Although not implemented above, such features could be added to the
MPO model without great difficulty.
Discussion
Admittedly, many of the choices in the construction of the MPO model are somewhat arbitrary. While phenomenologically sound, these choices are not necessarily the only or best ones that reproduce the key qualitative features of pattern formation in somitogenesis.
For example, the squared sine function in equation (7) is a convenient choice for producing the desired phase monotonicity and slowing conditions, as well as the distribution of phase equilibria. Other choices with similar qualitative shape are certainly possible, which is also the case for the maturity/susceptibility function s.
The combined roles of these two functions during the transition interval affect the specific formation of the transitional clock-wave in the PSM.
The MPO model is primarily used as a proof of concept for a more biologically grounded, and hence more scientifically useful, model developed in the next chapter. Thus, further analysis concerning these function choices is not carried out here.
Likewise, the non-generic semi-stability of the equilibria given in (8) could easily be
upgraded to more robust full stability, but little would be gained from the effort
given the purpose at hand.
Perhaps the best extension of the MPO model would be the addition of intercel-
lular coupling of the phase oscillators. This would allow the extension of the work
by Riedel-Kruse and coworkers [69], in which the effect of coupling on phase synchronization in the initiation zone of the tailbud was investigated. Extending the present model would allow a more complete analysis of the effect of coupling on clock-wave formation and the arresting of oscillations in the presence of noise. In fact, very recent work by Morelli and coworkers [70] has presented and analyzed such a model, which
incorporates biologically realistic delays in the intercellular coupling.
Finally, it has been tacitly assumed that coupling between cells is a coordina-
tion/synchronization mechanism operating solely to overcome system noise. There is
some evidence that, depending on the organism, the coupling may form a core com-
ponent of the oscillator mechanism itself [2, 3, 28, 69, 71, 74, 78, 83, 84, 85, 95, 96].
Thus, eliminating the coupling in some systems may dampen or abolish oscillations
altogether. This situation again reveals the limitations of phenomenological phase
oscillator models that oversimplify the underlying biological mechanism of action.
The next chapter offers a partial remedy of this problem by presenting a biologically
informed model of posterior clock-wave formation during somitogenesis.
A DELAY DIFFERENTIAL EQUATION MODEL OF POSTERIOR
CLOCK-WAVE FORMATION
A good mathematical model is like a good map: it has just enough detail
to help you get where you need to go.
Although the multi-stable phase oscillator model presented in the previous chapter reproduces the key spatiotemporal dynamics of somitogenesis patterning, it lacks a specific biological mechanism of action. There are no underlying genetics or biochemistry. This can potentially limit the model’s usefulness in formulating and checking scientific hypotheses concerning somitogenesis.
This chapter presents a biologically informed mathematical model for clock-wave formation in the posterior presomitic mesoderm (PSM), for which the function of many of the genetic and biochemical elements is reasonably well-established in several model organisms. Recall that the clock-wave is an experimentally observed spatiotemporal wave of gene expression in the PSM that arises from coordinated oscillations in gene expression at the cellular level. The clock-wave is understood to embody the clock in the clock and wavefront mechanism of somite formation. The present model reproduces the posterior portion of the waves of gene expression in the PSM that emanate from steady oscillations in the tailbud. The anterior fixation of the waves into bands of gene expression is not reproduced. Henceforth, the present model will be called the posterior clock-wave (PCW) model.
In several model organisms, the genetic and biomolecular mechanisms of experimentally observed gradients in the PSM are now being established [7, 43, 64]. These
spatiotemporal gradients move in accordance with the wavefront of determination
that sweeps posteriorly across the anterior PSM, triggering the successive formation of somites. Furthermore, several of these gradients have been directly implicated in the susceptibility of cells in the PSM to form somites [7, 97, 65, 98, 42]. Thus, these gradients are excellent candidates for the wavefront. As such, the wavefront can appropriately be termed a gradient-wavefront.
At this point in time, the biological elements that join the clock-wave and gradient-wavefront to fully realize somite formation are not well-understood [7, 99]. However, some mathematically informed hypotheses are beginning to emerge [25, 29, 63, 100,
101]. The PCW model does not incorporate any elements of the gradient-wavefront in the clock and wavefront mechanism, and thus cannot be expected to reproduce the entire process of somitogenesis. However, the PCW model does employ an experimentally observed posterior gradient of a so-called control protein, which controls the rate of oscillations by interacting with the clock protein via heterodimerization.
Given the situation of current knowledge, the present work focuses on the initial formation of the clock-wave in the posterior PSM. While keeping the model on a
firm biological foundation, an effort is made to incorporate a minimum of biological components of the cell-based clocks whose aggregate behavior generates the posterior clock-wave in the PSM. More specifically, a minimal genetic circuit is shown to control the rate of oscillatory gene expression across coupled cells in such a way that allows proper formation of the posterior clock-wave in the PSM.
Previous delay differential equation (DDE) models by Lewis and coworkers [2,
4, 27] and Cinquin [30] are the primary motivation for the PCW model presented below. Having only four differential equations per cell, the PCW model maintains the simplicity and clarity of Lewis’s coupled two-cell model, yet the PCW model can, like Cinquin’s model but with a much more transparent formulation, reproduce the dynamical waves of gene expression that form in the posterior PSM of zebrafish.
Furthermore, the new model suggests a novel biological mechanism underlying
how competitive dimerization controls the oscillation rate. While other models have
focused on saturating protein decay kinetics as an additional source of nonlinear-
ity [29, 73, 75], the PCW model suggests that differential decay between clock protein
monomer and dimer, including heterodimer with the control protein, is an important
mechanism for slowing oscillations and generating the clock-wave in the posterior
PSM.
The Biological Components of the Clock
The clock-wave arises from coordinated oscillatory gene-expression across cells
in the PSM. In various model organisms, experiments have revealed multiple genes
involved in oscillatory gene expression [3, 5, 74]. This has led many investigators to
distinguish components of the clock from outputs of the clock [14, 102]. Components
of the clock include genes required to generate the clock’s oscillations as well as genes
necessary to coordinate these oscillations. The expression of a component gene may or may not itself oscillate.
Outputs of the clock include genes downstream from the component genes. Outputs typically oscillate as they are driven passively by the components of the clock.
This distinction does not preclude outputs from being necessary for proper somite formation. Furthermore, intracellular and intercellular feedback loops in the genetic circuitry complicate the experimental identification of component vs. output [74].
With consideration of the above, the PCW model is composed of the interaction
between three key, experimentally identified components that generate the clock-
waveform. These cell-level components are: 1) a clock, 2) a control protein, and
3) a coordinating signal. Each component is discussed in turn below. Because the
present work was largely motivated by experiments and previous modeling work in
the zebrafish model organism, a general discussion of each component is first given,
followed by specific considerations of the component in zebrafish.
The Clock
Multiple oscillatory genes have been identified in the PSM of several model or-
ganisms, such as zebrafish, chick, and mouse [3, 5, 14, 103]. These genes typically
exhibit some redundancy in function, complicating the characterization of a central
pacemaker that drives the oscillatory expression of the remaining genes. Depending
upon the model organism, the pacemaker may be a cell-autonomous genetic oscilla-
tor requiring single or multiple genes, or, it may require an intercellular network of
genes [6].
In zebrafish, for example, expression of the basic helix-loop-helix (bHLH) her
(hairy and enhancer-of-split related) genes her1, her7, her11, her12, and her15 oscillates in the PSM. Some of these genes are expressed across the entire PSM (her1 and her7), while others are expressed only anteriorly (her11) or anteriorly and posteriorly with an interim space (her12 and her15) [5]. Gene knockout and protein morpholino knockdown experiments have suggested that there is at least some redundancy in the function of these genes, whose proteins are understood to act as transcriptional repressors after homo- or heterodimerization with other bHLH proteins [5]. The remaining bHLH genes her4, her6, her13.2 and hey1 are expressed in a non-oscillatory
fashion in the zebrafish PSM [5]. With the exception of the posteriorly-to-anteriorly
graded her13.2, these genes are expressed anteriorly in bands that indicate nascent
somites [5].
It should be noted that several of the bHLH genes are active in other developing
tissues in the zebrafish embryo [103]. Many also have homologues in higher organisms
such as mouse [103], where they are referred to as Hes genes. Several Hes genes have direct analogues in human. Furthermore, the bHLH structure is well-preserved evolutionarily from more primitive organisms, and was originally discovered in the developmental genes of the fruit fly Drosophila [104].
The three oscillatory genes her1, her7, and her12 have the most prominent effect upon posterior clock-wave formation in zebrafish [5, 105], and previous models have
considered one or two clock-genes (her1 and/or her7)[2, 4, 27]. The PCW model
assumes a cell-autonomous clock with a single clock-gene. Use of a single clock-gene
is justified on the grounds that her genes show some redundancy in zebrafish, and
that Hes7 in mouse oscillates independently of, for example, Hes1 [106], which appar-
ently cannot sustain independent oscillations [83]. Admittedly, this is a considerable
simplification of the genetic circuits of all model organisms. In the next chapter, this
minimal assumption of a single clock-gene is shown to be sufficient for generating a
biologically realistic clock-wave, which raises the interesting scientific question as to
the reason for the presence of multiple clock-genes across several model organisms.
Given the above considerations, the PCW model tracks both mRNA and protein
levels of a single clock-gene. The clock protein is assumed to form a homodimer
that represses its mRNA production after a delay. Three main sources of delay are
the transcription and post-transcriptional processing of mRNA, the translation de-
lay in protein production, and the transport of molecules between the nuclear and
cytosolic compartments within a cell or between the cytosolic compartments of adja-
cent cells. This system is capable of autonomous, sustained oscillatory gene expres-
sion [27, 76, 77, 80]. The relative amounts of clock protein monomers, homodimers,
and heterodimers with the control protein are explicitly tracked, allowing different
decay rates for each [107].
Control of clock mRNA transcription is modeled using the approach of Shea and
Ackers [23, 108, 109], which includes parameters such as DNA binding energies and
cooperativity between various protein transcription factors bound to the promoter
at multiple binding sites. This modeling formalism is a significant simplification
of the eukaryotic transcription process [34, 108], but represents an initial step towards
biological realism as compared to existing models such as [2, 27, 30]. The addi-
tional modeling details are particularly appropriate because of the presence of both
repressors and activators controlling the clock-gene, such that monotonicity of the
mRNA transcription rate with respect to the levels of transcription factors is no
longer ensured [110].
The Control Protein
Several spatiotemporally graded biomolecules have been identified in the PSM of
multiple model organisms. In zebrafish, the posteriorly-to-anteriorly graded Her13.2
protein interacts with at least one of the clock proteins (Her1) and affects the forma-
tion of the clock-wave [111, 112]. Even though a direct interaction of oscillatory and graded bHLH proteins has not been verified in chick or mouse, the her13.2 gene is homologous to Hes6 in mouse [111, 113], and several biomolecular gradients have been
identified in chick and mouse that could interact with one or more clock-genes [3, 7].
Motivated by experiments in zebrafish [30, 111, 112], we suppose that the graded
control protein interacts with the clock protein by heterodimerization. Because
Her13.2 protein has a truncated amino acid sequence normally used for DNA bind-
ing [111], it is assumed that neither control protein homodimers nor heterodimers
with clock protein can repress clock-gene transcription. This is in contrast to the
model of Cinquin [30], in which heterodimers may also repress transcription. In further
contrast to the model of Cinquin [30], because Her13.2 protein has a functional
dimerization domain [111], homodimerization of the control protein is allowed.
The control protein level is prescribed as an external control on the system, with
a maximum value in the tailbud that decreases anteriorly in the PSM. The graded
level of control protein, combined with competitive dimerization between control and
clock proteins, results in slowing oscillation rates in successively anterior cells and
the formation of a clock-wave. Furthermore, simulations show that the amount of
slowing is a key factor in clock-wave formation, which agrees with experimental in-
vestigations [4]. Thus, the control protein influences the rate of oscillation generated by clock protein self-repression through competitive dimerization, which is a control mechanism seen in related bHLH networks [113, 114].
The Coordinating Signal
In multiple model organisms, experiments have revealed components of the clock’s
genetic circuit that are responsible for coordinating oscillations of clock-gene ex-
pression between cells through intercellular signaling. In all model organisms this
intercellular coupling is achieved through the Notch signaling pathway [3, 5], with
more complex coupling pathways occurring in higher vertebrates such as chick and
mouse [3, 7, 74, 115]. This has led to considerable scientific debate concerning whether the coupling acts weakly to coordinate otherwise cell-autonomous oscillations, or, alternatively, whether the coupling plays a prominent role in generating oscillations [2, 72, 83, 96]. In the first case, a clock-gene’s expression continues to
oscillate in the absence of intercellular signaling but loses coherence with other cells
because of system noise. In the second case, inhibition of intercellular signaling
causes oscillations to cease altogether, perhaps after transient, incoherent damping.
Experiments precise enough to allow differentiation of these alternatives have been
difficult to conduct [99]. Furthermore, different scenarios could happen depending
on the model organism, e.g., amniotes such as chick and mouse vs. lower vertebrates
such as zebrafish.
In zebrafish, the role of intercellular coupling is currently thought to be one of
coordination, not generation, of oscillations [2, 5, 27, 28, 69, 78, 85, 95]. Constitutively
expressed Notch proteins are released from a cell’s membrane into the cytoplasm (as
Notch intracellular domain (NICD)) after extracellular binding with a Delta ligand
(also a protein) presented through the membrane of an adjacent cell. In zebrafish,
NICD has been shown to activate expression of bHLH genes such as her1 and her7,
and the Notch ligand DeltaC oscillates in phase with these two clock-genes in the
PSM [5]. The Notch ligand DeltaD does not oscillate, however, but it is necessary for coordinated oscillations and is constitutively expressed in the posterior PSM of zebrafish along with other required genes in the Notch pathway such as Su(H) [91,
116].
Informed by the oscillatory expression of DeltaC synchronized with clock-gene
expression in zebrafish, the PCW model incorporates a single coordinating-signal
gene and assumes that production of coordinating signal mRNA is repressed by clock
protein homodimer and that the clock mRNA is activated by signaling protein from
adjacent cells [27]. Because Notch signaling is non-diffusive and contact-dependent,
the effect of signaling is confined to nearest neighbors. This arrangement is very
similar to the one proposed in [27]. Following [2, 27], the effect of the coupling signal
on the clock is assumed to be weaker than the clock-gene’s self-repression. Thus,
the model anticipates that intercellular coupling coordinates oscillations instead of
helping to drive them. While likely true for zebrafish [2, 28, 69, 78, 85, 95], this may
not be a valid assumption in chick or mouse [3, 71, 74, 83, 84].
Modeling Posterior Clock-Wave Formation: Uncoupled Cells
The PCW model is premised on a cell-autonomous oscillator, with a gradient-controlled
oscillation rate and weak coupling for coordination of oscillations. Thus,
an uncoupled model is presented first, followed by an extension of the model that
adds intercellular coupling via a coordinating-signaling gene. The elements of the
uncoupled model and its construction are discussed next.
PSM Growth
As with the phase oscillator model in Chapter 3, a line of K total cells along the medial anterior-to-posterior (AP) axis of the mesoderm is considered. Cells are assumed to enter the posterior-most PSM from the initiation zone in the anterior-most tailbud at regularly spaced time intervals, without volume change or rearrangement [11].
$T_k$ denotes the time of entry of the $k$th cell, and is given by the linear relationship in equation (6) from Chapter 3, page 27, which is repeated here:
\[ T_k := \frac{k - 1}{\lambda \mu}, \tag{11} \]
where λ is the number of cells per AP somite length (assumed constant), and µ is
the somite formation rate in somites per minute, which is equal to the oscillation
frequency in the tailbud [5] (also assumed constant). These steady growth and oscil-
lation assumptions should be a reasonable approximation over a significant portion
of developmental time [93].
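A minimal sketch of the entry-time relationship (11) follows. The function name and the sample values are assumptions for illustration only: λ = 6 cells per somite length is borrowed from the Chapter 3 example, and μ = 1/30 somites per minute is a hypothetical rate chosen so that the arithmetic is easy to follow.

```python
def entry_time(k, cells_per_somite, somites_per_minute):
    """Time (in minutes) at which cell k enters the PSM, per equation (11)."""
    return (k - 1) / (cells_per_somite * somites_per_minute)


# Hypothetical values: six cells per somite length, one somite every 30 minutes.
lam = 6
mu = 1.0 / 30.0

entry_times = [entry_time(k, lam, mu) for k in range(1, 14)]
print(entry_times[0], entry_times[6], entry_times[12])
```

With these values, consecutive cells enter 5 minutes apart, so cells 1, 7, and 13 (the first cells of three consecutive somites) enter 30 minutes apart, one tailbud oscillation period per somite.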
Model Variables
On the cellular level, the dynamics of the oscillator are realized by a clock-gene and
a control protein that interacts with the clock protein by competitive dimerization.
For each cell, the model’s state variables are the amount of total clock protein and
mRNA, whereas the amount of total control protein is prescribed non-autonomously
as an external control on the system.
For the $k$th cell, let $c_k(t)$ represent the cytosolic copy number of clock mRNA as a function of time and define the nuclear copy number of total clock protein and total
control protein, respectively, as follows:
\begin{align}
\widehat{C}_k(t) &:= C_k(t) + 2\, C{:}C_k(t) + C{:}G_k(t), \tag{12} \\
\widehat{G}_k(t) &:= G_k(t) + 2\, G{:}G_k(t) + C{:}G_k(t), \tag{13}
\end{align}
where $C_k$, $C{:}C_k$, $G_k$, $G{:}G_k$, and $C{:}G_k$ are the nuclear copy number of clock protein monomer, clock protein homodimer, control protein monomer, control protein homodimer, and clock protein heterodimer with control protein, respectively.
The Control Protein
The amount of nuclear total control protein $\widehat{G}_k(t)$ is assumed to be controlled by an external system that is not explicitly modeled here. Before cell k leaves the tailbud and enters the posterior PSM ($t < T_k$), $\widehat{G}_k(t)$ is assumed to be maintained at a maximal constant $\widehat{G}^{\max}$. After entering the PSM ($t \ge T_k$), the total control protein is assumed to decrease monotonically to zero with half-life $\tau_{1/2}$. Altogether, $\widehat{G}_k(t)$ is given by
\[ \widehat{G}_k(t) = \widehat{G}^{\max} \cdot g(t - T_k; \tau_{1/2}), \tag{14} \]
with the continuously differentiable function g given by
\[ g(t; \tau_{1/2}) = \begin{cases} 1, & t \le 0, \\ 1 - \frac{1}{2}\, e^{2 (t - \tau_{1/2}) / t}, & 0 < t \le \tau_{1/2}, \\ \frac{1}{2}\, e^{2 (t - \tau_{1/2}) / (t - 2 \tau_{1/2})}, & \tau_{1/2} \le t < 2 \tau_{1/2}, \\ 0, & 2 \tau_{1/2} \le t, \end{cases} \tag{15} \]
where $\tau_{1/2} > 0$ is a half-life parameter. The function g is sigmoidal, non-increasing, concave down for $0 < t < \tau_{1/2}$, and concave up for $\tau_{1/2} < t < 2 \tau_{1/2}$. When viewed axially along an AP line of PSM cells, the resulting spatial profile agrees qualitatively with
those computed in [64]. See Figure 13.
Other gradient profiles are certainly possible, such as the following simple exponential profile [43]:
\[ g_{\exp}(t; \tau_{1/2}) = \begin{cases} 1, & t \le 0, \\ e^{-\frac{\ln 2}{\tau_{1/2}} t}, & 0 < t. \end{cases} \tag{16} \]
Note that gexp is continuous but not differentiable at t = 0, and therefore the resulting spatial profile is not smooth where cells exit the tailbud.
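The two gradient profiles can be compared with a short Python sketch. It assumes that equation (15) reads g = 1 − ½·exp(2(t − τ)/t) on (0, τ] and g = ½·exp(2(t − τ)/(t − 2τ)) on [τ, 2τ), with τ the half-life; this reading satisfies the stated properties (value ½ at τ, matching slopes at the joins, and an inflection at τ), but it is a reconstruction. The function names are hypothetical, and τ = 30 minutes is the value used in Figure 13.

```python
import math


def control_gradient(t, half_life):
    """Sigmoidal profile g(t; tau) under the assumed reading of equation (15)."""
    if t <= 0.0:
        return 1.0
    if t <= half_life:
        return 1.0 - 0.5 * math.exp(2.0 * (t - half_life) / t)
    if t < 2.0 * half_life:
        return 0.5 * math.exp(2.0 * (t - half_life) / (t - 2.0 * half_life))
    return 0.0


def exponential_gradient(t, half_life):
    """Exponential profile g_exp(t; tau) from equation (16)."""
    if t <= 0.0:
        return 1.0
    return math.exp(-math.log(2.0) * t / half_life)


tau = 30.0  # minutes, as in Figure 13
for t in (0.0, 15.0, 30.0, 45.0, 60.0):
    print(t, control_gradient(t, tau), exponential_gradient(t, tau))
```

Both profiles pass through one half at t = τ, but the sigmoidal profile reaches exactly zero at t = 2τ, whereas the exponential profile only approaches zero.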
The Intracellular Clock
Clock protein is produced as monomer at the ribosomes in the cytosol, but is active
as a homodimerized repressor in the nucleus. Clock mRNA is produced at the DNA in
the nucleus, but it is translated into protein in the cytosol. Following Lewis [27], the
units of protein and mRNA concentrations are copy number per nuclear and cytosolic
compartment, respectively.
[Figure 13 plots: (a) Temporal Gradient, Cells 1 and 10: normalized total control protein versus time (min.); (b) Spatial Gradient, Time = 84 min.: normalized total control protein versus cell # (1 is anterior).]
Figure 13: Total control protein gradients. The spatiotemporal gradient of the normalized total control protein, $\widehat{G}_k(t) / \widehat{G}^{\max} = g(t - T_k; 30)$, where t is time, $T_k$ is the time that cell k exits the tailbud, and $\tau_{1/2} = 30$ minutes is the half-life. (a) Spatiotemporal gradient as a function of time in cell one, $g(t - T_1; 30)$, and cell ten, $g(t - T_{10}; 30)$. Cell ten, which exits from the tailbud later than cell one ($T_1 < T_{10}$), maintains the high level of total control protein for a longer time. (b) Spatiotemporal gradient as a function of cell position given by $g(84 - T_k; 30)$, k = 1 to 50. As tailbud growth adds cells to the PSM, the tailbud-PSM boundary moves right, and the spatial gradient profile follows this moving boundary.
Competitive dimerization of the clock and control proteins is represented by the
following three possible bidirectional dimerization reactions:
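\[ 2\, C \underset{k^-_{C:C}}{\overset{k^+_{C:C}}{\rightleftharpoons}} C{:}C, \qquad 2\, G \underset{k^-_{G:G}}{\overset{k^+_{G:G}}{\rightleftharpoons}} G{:}G, \qquad \text{and} \qquad C + G \underset{k^-_{C:G}}{\overset{k^+_{C:G}}{\rightleftharpoons}} C{:}G, \tag{17} \]
where the forward and reverse (strictly) positive reaction rates are given above and below the arrows, respectively.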
Using mass action kinetics for the protein dimerization reactions (17), and consid-
ering the production and decay of all forms of the clock protein and of clock mRNA,
leads to the following system of delay differential equations that describe the dynamics
of the clock-gene in the kth cell:
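\begin{align}
\dot{C}_k(t) &= \alpha_C\, c_k(t - \tau_C) - \beta_C\, C_k(t) + 2\, k^-_{C:C}\, C{:}C_k(t) - 2\, k^+_{C:C}\, (C_k(t))^2 \notag \\
&\quad + k^-_{C:G}\, C{:}G_k(t) - k^+_{C:G}\, C_k(t)\, G_k(t), \tag{18} \\
\dot{C{:}C}_k(t) &= -\beta_{C:C}\, C{:}C_k(t) - k^-_{C:C}\, C{:}C_k(t) + k^+_{C:C}\, (C_k(t))^2, \tag{19} \\
\dot{G}_k(t) &= P_k(t) - \beta_G\, G_k(t) + 2\, k^-_{G:G}\, G{:}G_k(t) - 2\, k^+_{G:G}\, (G_k(t))^2 \notag \\
&\quad + k^-_{C:G}\, C{:}G_k(t) - k^+_{C:G}\, C_k(t)\, G_k(t), \tag{20} \\
\dot{G{:}G}_k(t) &= -\beta_{G:G}\, G{:}G_k(t) - k^-_{G:G}\, G{:}G_k(t) + k^+_{G:G}\, (G_k(t))^2, \tag{21} \\
\dot{C{:}G}_k(t) &= -\beta_{C:G}\, C{:}G_k(t) - k^-_{C:G}\, C{:}G_k(t) + k^+_{C:G}\, C_k(t)\, G_k(t), \tag{22} \\
\dot{c}_k(t) &= \gamma_c\, H_n(C{:}C_k(t - \tau_c)) - \delta_c\, c_k(t), \tag{23}
\end{align}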
where the function Hn represents the nonlinear effect of the repressive homodimer C:C on mRNA production and is described in detail below. Greek letters represent non-negative model parameters. The parameter αC is the production rate of protein
(as monomer) from mRNA occurring with delay τC. Parameters βC, βC:C, and βC:G are the linear decay rates of clock protein in monomer, homodimer, and heterodimer
forms, respectively. The parameter γc affects the mRNA production rate, which occurs with delay τc. The parameter δc is the linear mRNA decay rate. It is assumed that all forms of control protein may decay linearly with rates βG, βG:G, and βC:G. Furthermore, the production of control protein monomer, given by the function Pk(t), is assumed to be chosen such that the total control protein in cell k is prescribed by (14).
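Taking a time derivative of (12) and using equations (18), (19), and (22) gives:
\begin{align*}
\dot{\widehat{C}}_k(t) &= \dot{C}_k(t) + 2\, \dot{C{:}C}_k(t) + \dot{C{:}G}_k(t) \\
&= \alpha_C\, c_k(t - \tau_C) - \beta_C\, C_k(t) - 2\, \beta_{C:C}\, C{:}C_k(t) - \beta_{C:G}\, C{:}G_k(t).
\end{align*}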
Pre-appending this differential equation for total clock protein to the above system gives the following:
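\begin{align}
\dot{\widehat{C}}_k(t) &= \alpha_C\, c_k(t - \tau_C) - \beta_C\, C_k(t) - 2\, \beta_{C:C}\, C{:}C_k(t) - \beta_{C:G}\, C{:}G_k(t), \tag{24} \\
\dot{C}_k(t) &= \alpha_C\, c_k(t - \tau_C) - \beta_C\, C_k(t) + 2\, k^-_{C:C}\, C{:}C_k(t) - 2\, k^+_{C:C}\, (C_k(t))^2 \notag \\
&\quad + k^-_{C:G}\, C{:}G_k(t) - k^+_{C:G}\, C_k(t)\, G_k(t), \tag{25} \\
\dot{C{:}C}_k(t) &= -\beta_{C:C}\, C{:}C_k(t) - k^-_{C:C}\, C{:}C_k(t) + k^+_{C:C}\, (C_k(t))^2, \tag{26} \\
\dot{G}_k(t) &= P_k(t) - \beta_G\, G_k(t) + 2\, k^-_{G:G}\, G{:}G_k(t) - 2\, k^+_{G:G}\, (G_k(t))^2 \notag \\
&\quad + k^-_{C:G}\, C{:}G_k(t) - k^+_{C:G}\, C_k(t)\, G_k(t), \tag{27} \\
\dot{G{:}G}_k(t) &= -\beta_{G:G}\, G{:}G_k(t) - k^-_{G:G}\, G{:}G_k(t) + k^+_{G:G}\, (G_k(t))^2, \tag{28} \\
\dot{C{:}G}_k(t) &= -\beta_{C:G}\, C{:}G_k(t) - k^-_{C:G}\, C{:}G_k(t) + k^+_{C:G}\, C_k(t)\, G_k(t), \tag{29} \\
\dot{c}_k(t) &= \gamma_c\, H_n(C{:}C_k(t - \tau_c)) - \delta_c\, c_k(t). \tag{30}
\end{align}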
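Note that the special case where $\beta_C = \beta_{C:C} = \beta_{C:G} =: \beta_{\widehat{C}}$ gives linear decay of total clock protein, i.e., equation (24) becomes
\begin{align}
\dot{\widehat{C}}_k(t) &= \alpha_C\, c_k(t - \tau_C) - \beta_C\, C_k(t) - 2 \beta_{C:C}\, C{:}C_k(t) - \beta_{C:G}\, C{:}G_k(t) \notag \\
&= \alpha_C\, c_k(t - \tau_C) - \beta_{\widehat{C}} \left( C_k(t) + 2\, C{:}C_k(t) + C{:}G_k(t) \right) \notag \\
&= \alpha_C\, c_k(t - \tau_C) - \beta_{\widehat{C}}\, \widehat{C}_k(t). \tag{31}
\end{align}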
Differential decay of protein monomers and dimers has been shown to have important implications for system dynamics [107]; thus, nonlinear decay of total protein, in which βC, βC:C, and βC:G are not all equal, is considered during model validation in the next chapter.
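The cancellation of the dimerization terms in the total clock protein equation (24), and its reduction to linear decay in the equal-rate special case (31), can be checked numerically. The following is a minimal sketch only: the function names, dictionary keys, and all numerical values are hypothetical choices made for illustration and do not come from the model validation.

```python
# Illustrative check of equation (24); all names and values are hypothetical.

def monomer_dimer_rates(C, CC, CG, G, c_delayed, p):
    """Right-hand sides of (25), (26), and (29) for a single cell.

    c_delayed stands for c_k(t - tau_C), supplied directly as a number.
    """
    dC = (p["alpha_C"] * c_delayed - p["beta_C"] * C
          + 2 * p["km_CC"] * CC - 2 * p["kp_CC"] * C**2
          + p["km_CG"] * CG - p["kp_CG"] * C * G)
    dCC = -p["beta_CC"] * CC - p["km_CC"] * CC + p["kp_CC"] * C**2
    dCG = -p["beta_CG"] * CG - p["km_CG"] * CG + p["kp_CG"] * C * G
    return dC, dCC, dCG


def total_clock_rate(C, CC, CG, c_delayed, p):
    """Right-hand side of (24): production minus decay of each form."""
    return (p["alpha_C"] * c_delayed - p["beta_C"] * C
            - 2 * p["beta_CC"] * CC - p["beta_CG"] * CG)


# Arbitrary parameter values (assumptions, not fitted values).
params = {
    "alpha_C": 2.0, "beta_C": 0.3, "beta_CC": 0.1, "beta_CG": 0.2,
    "km_CC": 5.0, "kp_CC": 7.0, "km_CG": 11.0, "kp_CG": 13.0,
}
C, CC, CG, G, c_delayed = 3.0, 1.5, 0.5, 2.0, 4.0

dC, dCC, dCG = monomer_dimer_rates(C, CC, CG, G, c_delayed, params)
summed = dC + 2 * dCC + dCG            # derivative of C + 2 C:C + C:G
direct = total_clock_rate(C, CC, CG, c_delayed, params)
print(abs(summed - direct) < 1e-9)     # the association/dissociation terms cancel
```

Setting the three decay rates equal in `params` makes `total_clock_rate` depend on the state only through the total `C + 2 * CC + CG`, which is the content of (31).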
System (24)–(30) can be simplified considerably by making a fast dimerization
approximation [107, 109]. This approximation relies upon the assumption that the
dimerization reactions (17) occur on a much faster timescale than all other production
and decay processes [117]. The details of this approach are discussed further below.
Clock-Gene Regulation by a Single Repressive Transcription Factor
The transcription factor C:C is assumed to down-regulate clock mRNA production through one or more cis binding sites at the gene’s promoter^1, and the number of
such binding sites can be a key factor in the generation of sustained oscillations [81].
The binding of this transcription factor to DNA is assumed to be in thermodynamic
equilibrium, and the approach of Shea and Ackers [23, 108, 109] is used to derive the
function Hn in equation (30), where n = 1 or 2 is the number of binding sites. This function models the probability that the RNA polymerase II (RNAP-II) transcription complex is assembled on a gene’s promoter. Specifically,
\[ H_n(C{:}C) = \frac{Z_{ON}}{Z_{ON} + Z_{OFF_n}}, \]
where ZON and ZOFFn are sums of terms representing states where RNAP-II complex is bound or unbound, respectively, to the gene’s promoter. The ratio gives the probability that the promoter is in the RNAP-II bound state in which clock mRNA transcription occurs. In this case,
\[ Z_{ON} = \rho_P P \quad \text{and} \quad Z_{OFF_n} = 1 + \Psi_n(C{:}C), \]
so that
\[ H_n(C{:}C) = \frac{\rho_P P}{\rho_P P + (1 + \Psi_n(C{:}C))}. \tag{32} \]
The term in the numerator of (32) represents the single RNAP-II bound state in which mRNA transcription occurs, while the terms in the denominator represent all possible states of the promoter. The function Ψn depends upon the number n of cis binding sites for the repressor C:C and is described more fully next.

^1 A cis binding site is on the same side of the DNA as the gene’s promoter region.
The terms in the numerator and denominator of (32) are derived from the DNA
binding energies and concentrations of various biomolecules. For example, the binding
probability of a single RNAP-II complex to the gene’s promoter is represented by the
product of the binding affinity ρP for RNAP-II complex and the copy number P of the complex. The binding affinity has units of inverse copy number, and may be computed from the binding energy ∆GP < 0 of the RNAP-II complex using the thermodynamically derived formula $\rho_P := e^{\Delta G_P / (RT)}$, where R is the ideal gas constant and T is the temperature [23]. In the denominator of (32), the zero energy state with nothing bound to DNA (∆G0 = 0) is represented by the constant $1 = e^{0} = e^{\Delta G_0 / (RT)}$.
While binding energies have been measured in certain model prokaryotic systems [23, 108, 109, 118], to this author’s best knowledge they have not been measured for any of the eukaryotic cells in which somitogenesis is studied. Thus, binding affinity parameters must be estimated from a biologically realistic range during model validation, as discussed in the next chapter. Furthermore, the unitless parameter
\[ \rho := \rho_P P \]
is estimated instead of estimating both the binding affinity ρP for RNAP-II complex and the copy number P of the complex, which is assumed constant. The lumped parameter ρ incorporates the assembly of the RNAP-II complex, a process which, in eukaryotes, can involve several intermediate steps [119, see Fig. 6-16], all of which are assumed fast relative to other processes. This modeling formalism is a significant simplification of the eukaryotic transcription process [34], but it represents an appropriate step towards biological realism as compared to existing models [108].
Repression by the C:C dimer is assumed to be ideal, i.e., the binding of repressor and RNAP-II complex is assumed to be mutually exclusive. The function Ψn(C:C) represents the states of self-repression of the clock mRNA by clock protein homodimer, C:C, given n binding sites. See Figure 14 a–b. For one binding site,
\[ \Psi_1(C{:}C) := \rho_{C:C}\, C{:}C, \tag{33} \]
where ρC:C is the binding affinity for the C:C dimer to the single binding site, with units of inverse copy number. For two binding sites,
\[ \Psi_2(C{:}C) = \rho_{C:C\text{-}}\, C{:}C + \rho_{\text{-}C:C}\, C{:}C + \omega_{C:C}\, (\rho_{C:C\text{-}}\, C{:}C)(\rho_{\text{-}C:C}\, C{:}C), \tag{34} \]
where $\rho_{C:C\text{-}}$ and $\rho_{\text{-}C:C}$ are the binding affinities for the C:C homodimer to the first and second binding site, respectively. The unitless cooperativity $\omega_{C:C} \geq 0$ between two simultaneously bound C:C homodimers accounts for any additional energy released (or required) when two homodimers bind simultaneously. For simplicity, and in the absence of experimental data, each site is assumed equally likely to be bound by C:C dimer, so that
\[ \rho_{C:C\text{-}} = \rho_{\text{-}C:C} =: \rho_{C:C}, \]
and equation (34) reduces to
\[ \Psi_2(C{:}C) := 2\, (\rho_{C:C}\, C{:}C) + \omega_{C:C}\, (\rho_{C:C}\, C{:}C)^2. \tag{35} \]

Figure 14: Binding site configurations. Possible cis binding site configurations at the clock-gene promoter. P is RNAP-II transcription complex which binds to the blue region. C:C is the clock protein homodimer which prevents P from binding by binding to the red region(s). N is an activator which binds to the green region adjacent to the P and C:C binding sites. (a) One repressor binding site with no activation. (b) Two repressor binding sites with no activation. (c) One repressor binding site with activation. (d) Two repressor binding sites with activation.
Altogether, gene regulation with one or two binding sites for a single, ideal repressor is given by the function
\[ H_n(C{:}C) := \frac{\rho}{\rho + (1 + \Psi_n(C{:}C))}, \tag{36} \]
with n = 1 or 2, and the functions Ψ1 and Ψ2 given by equations (33) and (35), respectively. The function Hn can be viewed as a generalization of the monotonic Hill functions typically used to model mRNA transcription [110], such as in differential equation (5) of the Lewis model on page 20.
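The regulation function (36), with Ψ1 and Ψ2 from (33) and (35), is simple to evaluate directly. The sketch below is illustrative only: the function names `psi` and `H`, the default cooperativity of 1, and the sample values are assumptions made for this example rather than quantities taken from the model validation.

```python
# Illustrative evaluation of H_n from (36); names and values are hypothetical.

def psi(CC, n, rho_CC, omega_CC=1.0):
    """Repressor-bound promoter states: Psi_1 from (33), Psi_2 from (35)."""
    x = rho_CC * CC                      # unitless occupancy term rho_{C:C} * C:C
    if n == 1:
        return x
    if n == 2:
        return 2 * x + omega_CC * x**2
    raise ValueError("n must be 1 or 2")


def H(CC, n, rho, rho_CC, omega_CC=1.0):
    """Probability that RNAP-II complex is bound to the promoter, eq. (36)."""
    return rho / (rho + 1.0 + psi(CC, n, rho_CC, omega_CC))


# With no repressor present, H reduces to rho / (rho + 1) for either n.
unrepressed = H(0.0, 2, rho=0.5, rho_CC=0.1)

# H decreases as the homodimer copy number grows (sample values only).
profile = [H(cc, 2, rho=0.5, rho_CC=0.1, omega_CC=10.0) for cc in (0, 10, 100)]
print(unrepressed, profile)
```

With `omega_CC = 1` the two sites bind independently, and `1 + psi(CC, 2, ...)` factors as `(1 + rho_CC * CC) ** 2`; larger values of `omega_CC` sharpen the repression curve.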
Modeling Posterior Clock-Wave Formation: Coupled Cells
The model developed above does not include intercellular signaling. Coupling between cells is now added to produce the full PCW model.
Intercellular Signaling
The dynamics of the coordinating signal protein, Sk, and mRNA, sk, in the k-th cell are given by the following equations, which are respectively similar to (24) and (30):
\begin{align*}
\dot{S}_k(t) &= \alpha_S\, s_k(t - \tau_S) - \beta_S\, S_k(t), \\
\dot{s}_k(t) &= \gamma_s\, H_1(C{:}C_k(t - \tau_s)) - \delta_s\, s_k(t),
\end{align*}
with Greek letters representing the analogous parameters and H1 as given in equation (36). To the best of this author’s knowledge, there are no experimental data concerning the precise action of repressive clock protein dimers at the coordinating-signal gene’s promoter. Following [2, 27], one binding site is assumed. Note that the coordinating signal protein S acts in the nuclei of adjacent cells. Thus, τS includes the cell membrane exchange time for the coordinating signal protein, and is typically an order of magnitude larger than τC.
Clock-Gene Regulation by Both Repressive and Activating Transcription Factors
The coordinating signal protein up-regulates clock-gene expression in adjacent cells [120]. The combined effect of self-inhibition and activation from adjacent cells requires replacing the dynamics of the clock mRNA (equation (30)) with the following delay differential equation:
\[ \dot{c}_k(t) = \gamma_c\, F_n(C{:}C_k(t - \tau_c), N_k(t - \tau_c)) - \delta_c\, c_k(t), \]
where
\[ N_k(t) := \sum_{\{i \,:\, \text{cell } i \text{ adjacent to cell } k\}} S_i(t) \tag{37} \]
is the total signal from neighboring cells, and is called the signaling protein in cell k.
Following [2, 27], this summation is a simplification of the membrane-mediated signal
transduction of the Notch intercellular communication pathway [121]. In particular,
the possible nonlinear effects of membrane receptor saturation are not considered.
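The neighbor sum in (37) can be sketched as follows. This is an illustration under stated assumptions: the one-dimensional chain of cells, the helper names `chain_neighbors` and `neighbor_signal`, and the sample signal levels are all hypothetical, and the sum applies unchanged to any other adjacency structure.

```python
# Illustrative computation of N_k from (37); the geometry is an assumption.

def chain_neighbors(n_cells):
    """Adjacency lists for a hypothetical 1-D chain: cell k touches k-1 and k+1."""
    return [[i for i in (k - 1, k + 1) if 0 <= i < n_cells]
            for k in range(n_cells)]


def neighbor_signal(S, neighbors):
    """N_k = sum of S_i over cells i adjacent to cell k (no receptor saturation)."""
    return [sum(S[i] for i in neighbors[k]) for k in range(len(S))]


S = [1.0, 2.0, 4.0, 8.0]                 # sample signal protein levels S_k
N = neighbor_signal(S, chain_neighbors(len(S)))
print(N)
```

Because the sum is linear in the Si, doubling every neighbor's signal doubles Nk; a saturating receptor model would break this proportionality.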
The function Fn represents the combined effect of the repressor C:C and activator N on clock mRNA production. Again using the approach of Shea and Ackers [23, 108, 109], Fn is given by the following probability ratio:
\[ F_n(C{:}C, N) = \frac{Z_{ON}}{Z_{ON} + Z_{OFF_n}}, \]
where ZON and ZOFFn are sums of terms representing states where RNAP-II complex is bound or unbound, respectively, to the gene’s promoter. The positive integer n is the number of cis binding sites for the repressor, while the activator is assumed to have only one binding site. See Figure 14 c–d. As before, the binding of the repressor and RNAP-II complex is mutually exclusive; however, binding of repressor and activator is not assumed to be mutually exclusive. Transcription proceeds only in those states in which RNAP-II complex is bound, which may or may not occur cooperatively with activator.
Consideration of the two possible RNAP-II bound states gives