This position paper has not been peer reviewed or edited. It will be finalized, reviewed and edited after the Royal Society meeting on ‘Integrating Hebbian and homeostatic plasticity’ (April 2016).

Cooperation across timescales between Hebbian and homeostatic plasticity

Friedemann Zenke1 and Wulfram Gerstner2

April 13, 2016

1) Dept. of Applied Physics, Stanford University, Stanford, CA
2) Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne EPFL

Abstract

We review a body of theoretical and experimental research on the interactions of homeostatic and Hebbian plasticity, starting from a puzzling observation: While the homeostasis of synapses found in experiments is slow, homeostasis of synapses in most mathematical models is rapid, or even instantaneous. Even worse, most existing plasticity models cannot maintain stability in simulated networks with the slow homeostatic plasticity reported in experiments. To solve this paradox, we suggest that there are both fast and slow forms of homeostatic plasticity with distinct functional roles. While fast homeostatic control mechanisms interacting with Hebbian plasticity render synaptic dynamics intrinsically stable, slower forms of homeostatic plasticity are important for fine-tuning neural circuits. Taken together, we suggest that learning and memory rely on an intricate interplay of diverse plasticity mechanisms on different timescales which jointly ensure stability and plasticity of neural circuits.

Introduction

Homeostasis refers to a group of physiological processes at different spatial and temporal scales that help to maintain the body, its organs, the brain, or even individual neurons in the brain in a target regime where they function optimally. A well-known example is the homeostatic regulation of body temperature in mammals, maintained at about 37 degrees Celsius independent of weather conditions and air temperature. In neuroscience, homeostasis or homeostatic plasticity often refers to the homeostatic control of neural firing rates. In a classic experiment, cultured neurons that normally fire at, say, 5 Hz change their firing rate after a modulation of the chemical conditions in the culture, but eventually return to their target rate of 5 Hz during the following 24 hours (Turrigiano and Nelson, 2000). Thus, the experimentally best-studied form of synaptic homeostasis happens on a slow timescale of hours or days. This slow form of synaptic homeostasis manifests itself as a

rescaling of the strength of all synapses onto the same neuron by a fixed fraction, for instance 0.78, a phenomenon called “synaptic scaling” (Turrigiano et al., 1998).

Mathematical models of neural networks also make use of homeostatic regulation to avoid run-away growth of synaptic weights or over-excitation of large parts of a neural network. In many mathematical models, homeostasis of synaptic weights is idealized by an explicit normalization of weights: if the weight (or efficacy) of one synaptic connection increases, weights of other connections onto the same neuron are algorithmically decreased so as to keep the total input in the target regime (Miller and MacKay, 1994). Algorithmic normalization of synaptic strength can be interpreted as an extremely fast homeostatic regulation mechanism. At first glance, the result of multiplicative normalization (Miller and MacKay, 1994) is identical to “synaptic scaling” introduced above. Yet, it is fundamentally different from the experimentally studied form of synaptic homeostasis mentioned above, because of the vastly different timescales. While rescaling in the model is instantaneous, in biology the effects of synaptic scaling manifest themselves only after hours. Interestingly, researchers who have tested slow forms of homeostasis in mathematical models of plastic neural networks failed to stabilize the network activity.

Thus, we are confronted with a dilemma: Theorists need fast synaptic homeostasis while experimentalists have found homeostasis that is much slower. In this review we try to answer the following questions: Why do we need rapid forms of homeostatic synaptic plasticity? How fast does a rapid homeostatic mechanism have to be: hours, minutes, seconds or less? Furthermore, what are the functional consequences of fast homeostatic control? Can the combination of fast homeostasis with Hebbian learning lead to stable memory formation? And finally, if rapid homeostasis is a requirement, what is the role of slower forms of homeostatic plasticity?

Conceptually, automatic control of a technical system is designed to keep the system in the regime where it works well, which is called the “working point” or the “set point” of the system. Analogously, homeostatic regulation of synapses is important to maintain the neural networks of the brain in a regime where they function well. However, in specific situations our brain needs to change. For example, when we exit the subway at an unknown station, when we are confronted with a new theoretical concept, or when we learn how to play tennis, we need to memorize the novel environment, the new concept, or the unusual movement that we previously did not know or master. In these situations, the neural networks in the brain must be able to develop an additional desired “set point”, ideally without losing the older ones.

In neuroscience it is widely accepted that learning observed in humans or animals at the behavioral level corresponds, at the level of neural networks, to changes in the synaptic connections between neurons (Morris et al., 1986; Martin et al., 2000). Thus, learning requires synapses to change. A strong form of synaptic homeostasis, however, keeps neurons and synapses always in the “old” regime, i.e. the one before learning occurred. In other words, strong homeostasis prevents learning. A weak or slow form of homeostasis, on the other hand, may not be sufficient to control the changes induced by memory formation. Before we can solve the riddle of stability during

memory formation, and before we can answer the above questions, we need to classify known forms of synaptic plasticity.

Classification of Synaptic Plasticity

Synaptic plasticity can be classified using different criteria. We consider three of these: functional relevance, synaptic specificity, and timescale of effect duration.

(i) Classification by potential function: Memory formation or homeostasis

Functionally, synaptic plasticity can be classified as useful either for homeostatic regulation or for memory formation and action learning. Many experiments on synaptic plasticity were based on Hebb’s postulate which suggests that synaptic changes, caused by the joint activity of pre- and postsynaptic neurons, should be useful for the formation of memories (Hebb, 1949). Inspired by Hebb’s postulate, classical stimulation protocols for long-term potentiation (Bliss and Lomo, 1973; Malenka and Nicoll, 1999; Lisman, 2003), long-term depression (Lynch et al., 1977; Levy and Steward, 1983), or spike-timing dependent plasticity (Markram et al., 1997; Bi and Poo, 1998; Sjöström et al., 2001) combine the activation of a presynaptic neuron, or a presynaptic pathway, with an activation, depolarization, or chemical manipulation of the postsynaptic neuron, to induce synaptic changes.

In parallel, theoreticians have developed synaptic plasticity rules (Willshaw and Von Der Malsburg, 1976; Bienenstock et al., 1982; Kempter et al., 1999; Song et al., 2000; Pfister and Gerstner, 2006; Clopath and Gerstner, 2010; Brito and Gerstner, 2016; Shouval et al., 2002; Graupner and Brunel, 2012), in part inspired by experimental data. Generically, in plasticity rules of computational neuroscience the change of a synapse from a presynaptic neuron j to a postsynaptic neuron i is described as

$$\frac{d}{dt} w_{ij} = F(w_{ij}, \mathrm{post}_i, \mathrm{pre}_j) \qquad (1)$$

where wij is the momentary “weight” of a synaptic connection, posti describes the state of the postsynaptic neuron (e.g. its membrane potential, calcium concentration, spike times, or firing rate) and prej is the activity of the presynaptic neuron (Brown et al., 1991; Morrison et al., 2008; Gerstner et al., 2014).

A network of synaptic connections with weights consistent with Hebb’s postulate can form associative memory models (Willshaw et al., 1969; Little and Shaw, 1978; Hopfield, 1982; Amit et al., 1985; Amit and Brunel, 1997), also called attractor neural networks (Amit, 1989). However, in all the cited theoretical studies, learning is supposed to have happened somewhere in the past, while the retrieval of previously learned memories is studied under the assumption of fixed synaptic weights. The reason is that a simple implementation of Hebb’s postulate for memory formation is not sufficient to achieve stable learning (Rochester et al., 1956). Without clever homeostatic control of synaptic plasticity, neural networks with a Hebbian rule always move to an undesired state where all synapses go to their maximally allowed values and

all neurons fire beyond control — or to the other extreme where all activity dies out. Therefore, homeostasis, in the sense of keeping the network in a desired regime, is a second and necessary function of synaptic plasticity. Candidate plasticity mechanisms for the homeostatic regulation of network function by synaptic plasticity include slow synaptic scaling (Turrigiano et al., 1998; Turrigiano and Nelson, 2000), rapid renormalization of weights (Miller and MacKay, 1994), or other mechanisms discussed below (Zenke et al., 2013, 2015; Chistiakova et al., 2015); but also synaptic plasticity of inhibitory interneurons (Vogels et al., 2011, 2013) or intrinsic plasticity of neurons (Daoudal and Debanne, 2003).

(ii) Classification by specificity of synaptic changes: Homosynaptic or heterosynaptic

In some of the classic induction protocols of long-term potentiation and depression mentioned above, a presynaptic pathway (or presynaptic neuron) was stimulated together with an activation of the postsynaptic neuron. If a change was observed in those synapses that were stimulated during the induction protocol, the plasticity was classified as synapse-specific or “homosynaptic”. If a change was observed in other (unstimulated) synapses onto the same postsynaptic neuron, the plasticity was classified as unspecific or “heterosynaptic” (Lynch et al., 1977; Abraham and Goddard, 1983; Brown and Chattarji, 1994). Heterosynaptic plasticity has recently moved back into focus because of its potential role in the homeostatic regulation of plasticity (Royer and Paré, 2003; Chen et al., 2013; Chistiakova et al., 2014, 2015; Zenke et al., 2015).

In the framework of Eq. (1), the change of a synapse may depend on the activity of the presynaptic neuron, the state of the postsynaptic neuron — or on both — as well as on the momentary value of the synaptic weight. Since we do not know the function F on the right-hand side of Eq. (1), we can imagine that it may contain different combinations of pre- and postsynaptic variables, such as

$$F(w_{ij}, \mathrm{post}_i, \mathrm{pre}_j) = a_0(w_{ij}) + f_1(w_{ij}, \mathrm{post}_i) + f_2(w_{ij}, \mathrm{pre}_j) + a_1(w_{ij})\, H(\mathrm{post}_i, \mathrm{pre}_j) \qquad (2)$$

In this picture, the term a0(wij) would represent a drift of the synaptic strength that does not depend on the input or the state of the postsynaptic neuron, but only on the momentary value of the weight wij itself. More interesting is the term f1(wij, posti) which depends on the postsynaptic activity, but not on the presynaptic activity. Suppose the postsynaptic neuron went through a phase of rapid firing. Such a term will then induce changes that are not synapse-specific and would therefore be classified as heterosynaptic. Conversely, the term f2(wij, prej) depends on the activity of the presynaptic pathway, but not on the state of the postsynaptic neuron. In the following we will call such a term transmitter-induced. If a presynaptic neuron is weakly stimulated for some time, the change caused by the term f2(wij, prej) can be classified as homosynaptic, but not Hebbian. In Eq. (2) only the last term is able to account for changes that depend on the joint activity of pre- and postsynaptic neurons. This term is therefore homosynaptic and Hebbian, hence our choice of letter H. The multiplicative factor a1(wij) in front of H highlights that the amount of synaptic

plasticity may depend on the momentary state of the synaptic weight. For example, it might be difficult or impossible to further increase a synapse that is already very strong, simply because strong synapses require a big spine and therefore a lot of resources. The theoretical literature implements such resource limitations by keeping the weights wij below a maximum value w^max. Depending on the implementation details, the limit is called a “hard bound” or a “soft bound”.

As an aside, we note that the term “heterosynaptic plasticity” is sometimes also used for synaptic changes that are visible at the connection from a presynaptic neuron j to a postsynaptic neuron i, but induced by the activation of a third, typically modulatory neuron (Bailey et al., 2000). However, in the following we do not consider this possibility.

Equations (1) and (2), which describe the change of synaptic weights as a function of neuronal activity, are called learning rules in the theoretical literature. For the rest of the paper, Eqs. (1) and (2) provide the framework in which we analyze and compare different learning rules with each other.
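To make the classification concrete, the following is a minimal numerical sketch of the framework of Eqs. (1) and (2). The specific functional forms (a weight decay for a0, a threshold-gated heterosynaptic term for f1, a transmitter-induced term f2, and a soft-bounded Hebbian term) and all constants are illustrative assumptions, not fits to data.

```python
import numpy as np

def dw_dt(w, post, pre, w_max=1.0):
    """Toy instantiation of Eq. (2): F = a0 + f1 + f2 + a1 * H.
    All functional forms and constants are illustrative choices."""
    a0 = -0.001 * w                          # weight-dependent drift (slow decay)
    f1 = -0.01 * w * max(post - 10.0, 0.0)   # heterosynaptic: post activity only
    f2 = 0.001 * pre                         # transmitter-induced: pre activity only
    a1 = w_max - w                           # soft bound on the Hebbian term
    H = 0.005 * pre * post                   # Hebbian: joint pre/post activity
    return a0 + f1 + f2 + a1 * H

w, dt = 0.1, 0.1
for _ in range(1000):                        # Euler integration of Eq. (1)
    w += dt * dw_dt(w, post=12.0, pre=5.0)
    w = float(np.clip(w, 0.0, 1.0))          # hard bounds at 0 and w_max
print(f"final weight: {w:.3f}")
```

Here the prefactor a1(wij) = w^max − wij implements a “soft bound” that shrinks potentiation as the weight approaches its maximum, while the final clipping implements a “hard bound”.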

(iii) Classification by duration of effects: Short-term versus long-term plasticity versus consolidation

Synaptic changes induced by a sequence of four presynaptic spikes in rapid succession typically decay within a few hundred milliseconds (Abbott et al., 1997; Tsodyks and Markram, 1997; Tsodyks et al., 1998) and are called short-term plasticity. The rapid decay of plasticity outcomes implies that the changes are not useful for memory formation, but more likely involved in gain control (Abbott et al., 1997), which implies a homeostatic role. Synaptic changes induced by classic induction protocols (Bliss and Lomo, 1973; Malenka and Nicoll, 1999; Lisman, 2003), however, last for several hours and are therefore potentially useful for memory formation. We emphasize that the induction protocol itself often lasts less than a minute. In other words, induction of synaptic plasticity is fast, but the changes persist for a long time, and this is why the phenomenon is called long-term potentiation or long-term depression, which we will abbreviate in the following as LTP or LTD.

Under suitable conditions the changes induced by a protocol of LTP or LTD are further consolidated after about an hour (Frey and Morris, 1998; Reymann and Frey, 2007; Redondo and Morris, 2011). In the rest of the paper we mainly focus on the phase of plasticity induction and neglect consolidation and maintenance.

To summarize this part, we may state that the classification of synaptic plasticity is possible in many different ways. A certain aspect of plasticity (or a certain mathematical term in a model) can be treated at the same time as Hebbian, homosynaptic, and memory-building, while another term could be heterosynaptic and homeostatic, and a third term could be transmitter-induced, homosynaptic, but non-Hebbian. Each of the terms can lead to a change that persists for a long time or that decays within a few seconds. Moreover, each of the terms could be strong so that its contribution is easily verified in

experiments; or it could be weak so that only precise experiments with multiple repetitions would be able to isolate a given term. The problem of the effect size is intimately related to the question of the timescale for induction of plasticity, which is an important topic in itself to be discussed next.

Why do we need rapid forms of homeostatic synaptic plasticity?

With the definitions and distinctions made in the previous section, we are now ready to address the first of the questions posed in the introduction: Why do we need a fast form of homeostatic synaptic plasticity?

Intuitively, synaptic plasticity that is useful for memory formation must be sensitive to the present activation pattern of the pre- and postsynaptic neurons. Following Hebb’s idea of learning and cell assembly formation, the synaptic changes should make the same activation pattern more likely to re-appear in the future, to allow contents from memory to be retrieved. However, the re-appearance of the same pattern will induce further synaptic plasticity. This leads to a positive feedback loop that can quickly run out of control. Anybody who has sat in an audience when the positive feedback loop between the speaker’s microphone and the loudspeaker resulted in an unpleasant shriek knows what this means. Engineers have realized that positive feedback loops can only be stabilized by rapid control mechanisms. Similarly, the interaction between the firing activity in a neural network and synaptic plasticity for memory formation can lose stability, unless synaptic plasticity is controlled by rapid homeostasis.

While the plasticity rules derived from experiments look at most at one or two stimulation pathways at a time, neurons embedded in a network can receive inputs from thousands of different synapses. In this section, we will outline a simple, yet powerful framework which enables us to make quantitative predictions about stability in large networks of spiking neurons. Synapses in the network will exhibit different forms of plasticity which serve specific functions for both memory formation and homeostasis.

Before we can give a precise answer to the question of why we need a fast form of homeostatic synaptic plasticity, we have to consider what it means to be “fast” or “slow”. Hebbian synaptic plasticity manifests itself within tens of seconds or a few minutes after plasticity induction (Petersen et al., 1998; O’Connor et al., 2005). This sets a first timescale of plasticity induction. Intuitively, we will call a homeostatic mechanism fast if it happens on the same timescale as Hebbian plasticity induction. It will be considered slow if it is much slower than induction of LTP, i.e., if it manifests itself over hours or days.

In the previous paragraph, we used the term “timescale”. To make the intuitive notion of a timescale more precise, we need a short mathematical digression. The reader who is familiar with differential equations or dynamical systems can skip the next subsection.

Mathematical digression: The notion of a timescale

To formally introduce the notion of a timescale, we consider a linear system defined by the differential equation

$$\tau \frac{dy}{dt} = -(y - \theta) \qquad (3)$$

where θ and τ are two fixed positive parameters. The solution y(t) of the differential equation relaxes exponentially to the value y = θ, on the timescale τ. Practically speaking, that means that no matter how we initialize y(t) at time t = 0, if we wait for a time 3τ, the difference y(t) − θ will have decreased by 95%. The exponential evolution is typical for a linear system. Moreover, the equation has a nice scaling behavior: if we know the solution for a parameter setting τ = 1, we can get the solution for the parameter setting τ = 5 by multiplying all times by a factor of five. This is the reason why we speak of τ as the timescale of the system.

The fact that a constant value of y(t) = θ for all t is a solution means that θ is a fixed point of the system. To check this, remember that the derivative of a constant is zero (so that the left-hand side of Eq. (3) vanishes). And for y = θ the right-hand side is obviously zero, too. Moreover, this fixed point is stable because from any initial condition y(t) will converge towards θ. Let us now consider the case where τ is negative. In this case, y(t) = θ is still a fixed point, but if we start with a value y(t) > θ, then y(t) will explode exponentially fast to large positive values, while for an initial value y(t) < θ it will explode exponentially fast to large negative values. The timescale of explosion is again given by τ.

However, our intuition for timescales breaks down for nonlinear systems. Therefore, when we consider a nonlinear system τ dy/dt = f(y), it does not suffice for τ to be large to be able to talk about a slow or fast timescale. We have to be specific about the behavior of f in the regime that we are interested in. The mathematical trick to do this is to look for a fixed point of the equation, that is, a value of y with f(y) = 0. Suppose y = y0 is a fixed point. We then study the derivative df/dy at y0. Let us denote this derivative by f′. In the neighborhood of y0 (and only there!) the nonlinear equation is well approximated by the linear equation τ dy/dt = (y − y0) f′. Division by f′ brings f′ to the other side of the equation and enables us to identify the effective timescale τ̃ = −τ/f′. If you are in doubt, compare this result with Eq. (3).

Another example of a linear system is a low-pass filter. Suppose we pass our variable y(t) through a low-pass filter with time constant τd to yield

$$\bar{y}(t) = \int_{-\infty}^{t} \exp\left(-\frac{t - t'}{\tau_d}\right) y(t')\, dt' \,. \qquad (4)$$

By taking the derivative on both sides of Eq. (4), we find that the low-pass filter ȳ is the solution of the differential equation dȳ(t)/dt = −ȳ(t)/τd + y(t). This equation is similar to Eq. (3), if we replace the constant target value θ by a time-dependent target proportional to y(t).
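Both statements are easy to verify numerically. The sketch below (all parameter values arbitrary) integrates Eq. (3) with the Euler method and checks the 95% rule; the same update with a time-varying drive then acts as the low-pass filter of Eq. (4), written here in the unit-gain convention in which the filtered trace settles at the mean of its input.

```python
import numpy as np

tau, theta, dt = 2.0, 5.0, 1e-3           # arbitrary illustrative values
y = 0.0                                   # start far from the fixed point

for _ in range(int(3 * tau / dt)):        # integrate Eq. (3) for a time 3*tau
    y += dt * (-(y - theta) / tau)
print(f"distance left after 3*tau: {abs(y - theta) / theta:.1%}")   # ~5%

rng = np.random.default_rng(1)
signal = rng.normal(5.0, 2.0, size=200_000)   # noisy rate-like input
y_bar, tau_d, trace = 0.0, 0.5, []
for s in signal:
    y_bar += dt * (s - y_bar) / tau_d     # low-pass filter, cf. Eq. (4)
    trace.append(y_bar)
print(f"mean of filtered trace: {np.mean(trace[50_000:]):.2f} (input mean 5.0)")
```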

Figure 1: Most plasticity models can reproduce the notion of a plasticity threshold reported in experiments. (a) The change in synaptic efficacy in many plasticity models is a function of variables related to postsynaptic activation. (b) Schematic illustration of the action of the homeostatic moving threshold in the BCM model (Bienenstock et al., 1982). (c) Hypothetical evolution of synaptic efficacy as a function of time during an LTP induction experiment. The purple, linear time course is a common assumption underlying many plasticity models. However, since the time course during plasticity induction is typically not observable, many time courses and thus many plasticity models are compatible with the data.

The induction timescale of Hebbian plasticity

Because both learning rules and neurons are typically nonlinear, the mathematical definition of a timescale for Hebbian plasticity requires the existence of a fixed point. The reason is that for a nonlinear system we can define a timescale only in the vicinity of a fixed point, as we have seen above. Even if a plasticity rule in isolation looks linear, the combination of plasticity with the neuronal dynamics typically makes the network as a whole nonlinear. Consider for instance a single neuron i that is driven by N inputs arriving at synapses wij for 1 ≤ j ≤ N. In a rate model, the state of the postsynaptic neuron is characterized by its firing rate $y_i = g\left(\sum_{j=1}^{N} w_{ij} x_j\right)$, where xj is the firing rate of the presynaptic neuron with index j. The function g denotes the frequency-current relation of a single neuron and we assume that it is monotonically increasing, i.e., if we increase the input, the firing rate increases as well. In the theoretical literature g is sometimes called the gain function

of the neuron, hence our choice of letter g. Let us first study a simple Hebbian learning rule

$$\frac{dw_{ij}}{dt} = \eta\, y_i x_j \,. \qquad (5)$$

Note that this is a special instance within the general framework of Eqs. (1) and (2). To see this, consider a Hebbian term H(posti, prej) = yi xj and a1(wij) = η, and set all other terms in Eq. (2) to zero. This learning rule is linear in xj and linear in yi. However, if we insert $y_i = g\left(\sum_j w_{ij} x_j\right)$ into the learning rule, the learning dynamics become nonlinear in xj. Nonlinearity implies that we cannot define a timescale of plasticity, unless we find a fixed point where dwij/dt vanishes. Are there fixed points of the dynamics? There is a fixed point if the postsynaptic or the presynaptic rate is zero. However, if the neuron is embedded in a network, it is reasonable to assume that several presynaptic neurons including the presynaptic neuron j are active. Unless all weights wij are zero, the postsynaptic neuron is therefore also active, and the weight wij increases.

The argument can be formalized to show that wij = 0 is an unstable fixed point. If we increase wij by just a little, the output yi also increases, which increases wij even further, which closes the positive feedback loop. More generally, models of Hebbian plasticity that are useful for memory formation all have an unstable fixed point. One important role of homeostatic plasticity in network models with Hebbian plasticity is therefore to create (additional) stable fixed points for the learning dynamics, as we will see later on in this section.

The simple example above only has a trivial fixed point at zero activity (zero weights). Moreover, it is lacking the notion of LTD. Plausible plasticity models have additional stationary points defined by the plasticity threshold between LTP and LTD (Fig. 1a). Inspired by experimental data (Artola et al., 1990; Dudek and Bear, 1992; Sjöström et al., 2001), the transition from LTP to LTD depends in models on the state of the postsynaptic neuron, e.g., its membrane potential, calcium level, or inter-spike interval (Artola and Singer, 1993; Shouval et al., 2002; Pfister and Gerstner, 2006; Clopath and Gerstner, 2010; Graupner and Brunel, 2012). As a paradigmatic example, which stands for a plethora of different plasticity rules with a postsynaptic threshold, we consider the following rate-based nonlinear Hebbian rule

$$\frac{dw_{ij}}{dt} = \eta\, x_j y_i (y_i - \theta) \qquad (6)$$

with a positive constant η. Whenever the postsynaptic firing rate yi equals the value of θ, the weight wij does not change. Thus this learning rule has a fixed point at the threshold yi = θ. For yi larger than θ the synaptic weights increase, which corresponds to the induction of LTP in the model. For yi smaller than θ the synaptic weights decrease and LTD is induced.

Let us now embed this learning rule in a network. For the sake of simplicity we assume the gain function to be linear, $g\left(\sum_j w_{ij} x_j\right) = \sum_j w_{ij} x_j$. Moreover, we assume that (i) all weights wij have the same value wij = w and (ii) all N neurons in the network fire with the same firing rate

xj = 1/N, so that y = w. Inserting these assumptions into Eq. (6) yields

$$\frac{dw}{dt} = \eta\, w (w - \theta) \qquad (7)$$

which characterizes the change in neural activity y as governed by synaptic plasticity. We now linearize Expression (7) at the stationary point w = θ:

$$\frac{dw}{dt} \approx \eta\, \theta\, (w - \theta) \,.$$

This is a linear differential equation with a solution w(t) that, for w(t) > θ, explodes exponentially fast on the timescale τ = 1/(ηθ) (cf. the discussion of Eq. (3)). Thus, in this example, w = θ is an unstable fixed point and the linearization procedure has enabled us to identify the timescale of plasticity induction. In practical implementations of a plasticity model, the exponential growth of the synaptic weights wij would stop when they reach their maximal value w^max. However, if all synapses onto a postsynaptic neuron, or even all synapses in a neural network, sit at their upper bound, the network cannot function as a memory.

Note that the occurrence of the parameter θ in the timescale τ is a hallmark of the nonlinearity of the full system. Just like in the well-known Hodgkin-Huxley model, where the timescales of the activation and inactivation variables depend on the voltage, the effective timescale τ of a learning rule will depend on multiple factors such as the presynaptic activity, the slope of the gain function, or the threshold θ between LTD and LTP. Even though τ in the example above is not the same as the induction timescale of long-term plasticity, in experiments where only a single presynaptic pathway is stimulated, the two are related.

With our formalism we can also account for the strength of the recurrent feedback that is received by a neuron embedded in a network. Just repeat the above analysis under the assumption that the presynaptic and postsynaptic neurons are mutually connected, of the same type, and fire at the same rate (Zenke et al., 2013). Whatever you consider a reasonable scenario, the timescale τ characterizes how quickly a neuron or an entire recurrent network can generate positive feedback and is able to “run away”.

Instead of writing down a differential equation for the weights w, as in Eq. (7), we could have written the system in terms of y, too. A formulation in terms of the firing rates y highlights the fact that we have to think about run-away effects of synapses as being linked to run-away effects of neuronal activity. Note further that in realistic neuronal networks the presynaptic activity fluctuates and is different between one neuron and the next. Fluctuations give rise to a covariance matrix $C_{ij} = \frac{1}{n} \sum_{t=1}^{n} x_i(t)\, x_j(t)$, which may look complicated at first glance. However, due to the symmetry of C, there always exists a basis in which C is diagonal. When working in this basis, the plasticity equations decouple and take the shape of Eq. (7) with different values of η.

In summary, by using the formalism introduced in this subsection, we can combine a plausible

data-based plasticity model and a given network model to find the effective timescale τ for this compound system. In doing so we take, for instance, explicitly into account that even if the change of a single synapse within one second might be small, the combined effect of millions of synapses of thousands of recurrently connected neurons could still be large. By distilling all these factors into a single number τ, we can now return to the question of homeostatic plasticity. How can homeostatic control catch up with Hebbian plasticity that makes weights “explode” on the timescale τ?
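The timescale extracted by the linearization can be checked with a few lines of simulation (a sketch; η and θ are arbitrary illustrative values). Starting just above the unstable fixed point of Eq. (7), the deviation from θ should grow e-fold roughly every 1/(ηθ) seconds.

```python
import math

eta, theta, dt = 0.01, 5.0, 1e-4     # illustrative; predicts tau = 1/(eta*theta) = 20 s
w = theta * 1.001                    # start just above the unstable fixed point
dev0, t = w - theta, 0.0

while (w - theta) < math.e * dev0:   # wait until the deviation has grown e-fold
    w += dt * eta * w * (w - theta)  # Euler step of Eq. (7)
    t += dt

print(f"measured e-folding time: {t:.1f} s, predicted: {1 / (eta * theta):.1f} s")
```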

Stability through homeostatic plasticity

To stabilize Hebbian plasticity, theorists may select and mathematically formalize homeostatic mechanisms found in nature (Turrigiano, 2011). Two mathematical approaches to synaptic homeostasis are widespread: either the Hebbian plasticity term is modified by a sliding threshold, or the Hebbian plasticity term for memory formation is complemented by an independent term for rate control. Although other choices are possible, we will now focus on those two mechanisms. Can either of these mechanisms guarantee stability of Hebbian learning? Are there fundamental restrictions with respect to how fast homeostatic plasticity has to be?

Plasticity control through a sliding LTP-LTD threshold

Several models assume that the threshold that marks the boundary between LTP and LTD is not fixed but may slide on a slow timescale (Bienenstock et al., 1982; Cooper et al., 2004; Pfister and Gerstner, 2006). We refer to this model class as BCM models in the following, where BCM is the acronym for Bienenstock-Cooper-Munro. The basic idea of that model class is that a sliding threshold could induce a homeostatic mechanism to locally control the amount of LTP or LTD in each synapse. The question then arises whether the sliding threshold can indeed lead to stable memory formation without the need for a further fast homeostatic mechanism.

To formalize this idea, we use the low-pass filter ȳi of the postsynaptic firing rate (averaged over a time τd) as a sensor that drives homeostatic plasticity; see Eq. (4). Intuitively, a homeostatic sliding threshold θ(ȳ) increases the amount of LTD if the average postsynaptic activity ȳ exceeds a given homeostatic set point κ for extended periods of time (Fig. 1b). Conversely, when neuronal activity falls below this homeostatic activity target, the rate of LTP is increased at the expense of LTD. There are several different possible forms of such BCM models (Bienenstock et al., 1982; Cooper et al., 2004). The following is a simple, but representative example

$$\frac{dw_{ij}}{dt} = \eta\, x_j y_i \left(y_i - \theta(\bar{y}_i)\right) \qquad (8)$$

with a sliding threshold θ(ȳi). A common choice for the sliding threshold is θ(ȳi) = ȳi²/κ. Let us analyze this sliding threshold model. If we insert the expression for θ into Eq. (8) and choose a constant firing rate yi = ȳi = κ, we see that the weight wij does not change. Therefore

yi = κ is a fixed point of the dynamics. The stability of this fixed point depends on the timescale of the low-pass filter τd, which in turn determines the timescale of metaplastic changes in Eq. (8) (Abraham, 2008; Fig. 1b). It can be shown analytically that the system has a stable fixed point only if τd is chosen smaller than some critical value τd^crit (Cooper et al., 2004; Zenke et al., 2013). Thus, a slow homeostasis of the sliding threshold is, on its own, not sufficient to achieve stability.
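The role of τd can be made explicit in a two-variable toy system (a sketch in the spirit of the reduction that led to Eq. (7), not the full analysis of Cooper et al. (2004) or Zenke et al. (2013)): the activity y again stands in for the weight and the sliding threshold follows the low-pass filtered rate. For this reduced system, linearizing around the fixed point predicts stability only for τd below 1/(η x κ); all numbers are illustrative.

```python
eta, x, kappa, dt = 0.01, 1.0, 5.0, 1e-2     # illustrative values
# In this reduced model, linear stability analysis gives tau_d^crit = 1/(eta*x*kappa)
print(f"predicted critical tau_d: {1 / (eta * x * kappa):.0f} s")

for tau_d in (2.0, 200.0):                   # fast vs. slow sliding threshold
    y, y_bar = 6.0, 6.0                      # start above the set point kappa
    for _ in range(int(1000 / dt)):
        theta = y_bar**2 / kappa             # sliding threshold theta = ybar^2 / kappa
        y += dt * eta * x * y * (y - theta)  # Eq. (8), with y standing in for w
        y_bar += dt * (y - y_bar) / tau_d    # low-pass filter of the rate
        if not (0.0 <= y < 1e3):             # stop once the activity runs away
            break
    verdict = "stable near kappa" if abs(y - kappa) < 1.0 else "runaway"
    print(f"tau_d = {tau_d:5.0f} s -> final y = {y:10.2f} ({verdict})")
```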

Rate control through synaptic scaling

To introduce the second control mechanism, let us consider a neuron embedded in a large network of stochastically firing neurons. If many synaptic weights onto a given postsynaptic neuron increase, the postsynaptic firing rate y is expected to increase. If the firing rate remains high over some timescale τd, we may consider this as an indicator that the neuron has moved out of its target firing regime. Let us therefore use the low-pass filtered rate ȳ(t) (Eq. (4)) as a driver of homeostasis (van Rossum et al., 2000).

Let us focus on a postsynaptic neuron i with rate yi. We adjust the weight wij from a presynaptic neuron j to our postsynaptic neuron according to the rule

$$\frac{dw_{ij}}{dt} = \underbrace{\eta\, x_j y_i (y_i - \theta)}_{\text{Hebb}} + \underbrace{\tilde{\eta}\, (\kappa - \bar{y}_i)\, w_{ij}}_{\text{Homeostasis}} \qquad (9)$$

The first term on the right-hand side is the Hebbian plasticity term discussed earlier, now with a fixed threshold θ that marks the transition from LTP to LTD. The multiplicative factors in the second term are η̃ (a parameter) and the weight wij itself. The term κ inside the parenthesis of the second term corresponds to a target value of the postsynaptic activity and ȳi is a low-pass filtered version of the firing rate yi. A simple and biologically plausible way for a neuron to sense ȳi on-line is to read out the concentration of a slow activity-related intracellular factor. Three remarks are important. First, the homeostatic term corresponds to the heterosynaptic plasticity term f1(wij, posti) in Eq. (2). Second, the fact that the heterosynaptic term is proportional to the weight wij implies that, for ȳi > κ, larger weights are depressed more than smaller ones. Thus this heterosynaptic term implements synaptic scaling. Third, in the language of control theory the heterosynaptic term implements homeostasis via a proportional controller. More sophisticated forms of heterosynaptic plasticity like a proportional-integral controller are possible (van Rossum et al., 2000), but this generally does not change the core of our argument that we will develop in the following. How strong and how fast does the heterosynaptic term have to be to fulfill its homeostatic function? It can be shown analytically that the requirement of stability in the above system (Eq. (9)) puts tight constraints on the parameters τd and η̃ for a given value of κ (Zenke et al., 2013). First, η̃ cannot be significantly different from η. If η̃ is much larger than η, homeostatic control creates oscillations which are not observed in biological systems. On the other hand, if η̃ is much

smaller than η, the fixed point at κ loses stability altogether. Similarly, we can derive stability conditions for τd and show that stability is only guaranteed if τd is smaller than some critical value τd^crit (which explicitly depends on η). The mathematical analysis thus shows that synaptic scaling needs to be “fast”, or at least faster than some reference. To further answer the question of how fast homeostatic plasticity has to be, we therefore have to quantify the speed of “normal” Hebbian plasticity.

Thus, our analytical approach has allowed us to answer the first question that we posed in the introduction. We need homeostatic synaptic plasticity which is faster than some critical value in order to achieve stability of learning dynamics. This statement is true for both the sliding threshold mechanism and the rate-control framework. However, what does “fast homeostatic plasticity” mean quantitatively? Is this critical timescale on the order of minutes, hours or days? To answer this question we need to quantify parameters such as η with experimental data. In the following subsection we will address this question and translate the mathematical statements of this section into interpretable numbers. A toy simulation of the rate-control loop of Eq. (9) is sketched below, after the caption of Fig. 2.

Figure 2: The timescales of synaptic scaling or metaplasticity are faster in models than reported in experiments. Here we plot the homeostatic timescale of either synaptic scaling or homeostatic metaplasticity as used in influential modeling studies (black). For comparison we plot the typical readout time for experimental studies on synaptic scaling and metaplasticity (red). Publications suffixed with * describe network models, as opposed to the other studies which relied on single neurons. For reference we also indicate the typical timescales of short-term plasticity and long-term plasticity induction at the top. Note that homeostatic timescales in models fall in the same range as the induction timescale of Hebbian long-term plasticity, which does not seem to be the case for experimental results on homeostatic plasticity.
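The rate-control loop of Eq. (9) can be probed with the same toy reduction (y standing in for the weight; θ, κ and all other numbers are arbitrary illustrative values). A fast rate sensor yields a stable fixed point, whereas a slow sensor produces the growing oscillations described above.

```python
eta, x, theta, kappa, eta_h, dt = 0.01, 1.0, 8.0, 5.0, 0.05, 1e-2   # illustrative

for tau_d in (2.0, 200.0):                     # fast vs. slow rate sensor
    y, y_bar = 6.0, 6.0
    for _ in range(int(1000 / dt)):
        hebb = eta * x * y * (y - theta)       # Hebbian term of Eq. (9)
        scaling = eta_h * (kappa - y_bar) * y  # scaling term (w -> y in the toy)
        y += dt * (hebb + scaling)
        y_bar += dt * (y - y_bar) / tau_d      # low-pass filtered rate
        if not (0.0 <= y < 1e3):               # stop once the activity runs away
            break
    print(f"tau_d = {tau_d:5.0f} s -> final y = {y:10.2f}")

# Even in the stable case y settles near, not exactly at, kappa: the scaling
# term acts as a proportional controller and leaves a steady-state offset.
```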

How fast does a rapid homeostatic mechanism have to be?

Slow homeostatic control cannot stabilize the rapid positive feedback loop induced by Hebbian learning. However, in this statement, the meaning of “rapid” and “slow” is relative and based on a comparison of the timescale of homeostasis with that of induction of Hebbian plasticity. But what is the actual timescale of Hebbian plasticity?

To connect the framework of learning rules outlined in Eqs. (1)–(8) to data from plasticity experiments, we can rely on a wealth of data on synaptic plasticity. Experiments which used STDP protocols to study induction of Hebbian plasticity are particularly informative (Sjöström et al., 2001; Senn et al., 2001). Theorists have made great strides in developing models which quantitatively capture a wide range of these data (Senn et al., 2001; Pfister and Gerstner, 2006; Froemke and Dan, 2002; Clopath and Gerstner, 2010). The parameters of the resulting spike-based plasticity models capture the timescale of plasticity induction in STDP experiments. STDP models such as the one of Pfister and Gerstner (2006) give a particular instantiation of the Hebbian term in Eq. (9) and make it possible to extract its parameters directly from experimental data.

To bridge the conceptual gap between the spiking plasticity models and our simplified framework in the firing rate domain, we have to consider neural firing statistics. Because neuronal spiking in vivo is highly irregular, we make the assumption that firing statistics can be approximated by a Poisson process. If both pre- and postsynaptic neurons fire stochastically and Poisson-like, the STDP models can be rewritten as firing rate models (Pfister and Gerstner, 2006). In particular, modern STDP models (Senn et al., 2001; Pfister and Gerstner, 2006; Clopath and Gerstner, 2010) exhibit a transition between LTD and LTP which depends on the firing rate, consistent with experimental data (Sjöström et al., 2001). We can exploit the connection between data, STDP models, and the effective Hebbian rate model so as to determine a plausible timescale for the induction of Hebbian plasticity. (A short aside: In the context of STDP, it might seem paradoxical to assume Poisson firing, which explicitly gets rid of any temporal correlations between the input and output spike trains. However, as long as temporal correlations are as small as commonly seen in cortical activity or mimicked in simulations of balanced state networks (van Vreeswijk and Sompolinsky, 1996; Renart et al., 2010), the Poisson approximation gives accurate results.)

By plugging our estimate of the timescale of Hebbian plasticity into the analytical expressions for the parameters which characterize the critical limiting cases τd^crit and η̃^crit (Zenke et al., 2013), we obtain a quantitative answer for how rapid homeostatic mechanisms need to be. By analyzing the stability of a low-activity background state in a recurrent network model in which synaptic weights were evolving according to a plausible Hebbian plasticity model, we found that homeostatic control mechanisms need to act on a timescale in the range of seconds to minutes (Zenke et al., 2013). In the presence of network activity with temporal correlations, or to provide robustness against perturbations, the timescale of homeostasis must be even shorter.
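The sketch below carries out this program for the minimal triplet model. Under the Poisson assumption its expected drift reduces to dw/dt = x (A3+ τ+ τy y² − A2− τ− y), which has the threshold structure of Eq. (6); the recurrent toy reduction of Eq. (7) then yields an induction timescale. The parameter values are illustrative, of the order of the fits reported by Pfister and Gerstner (2006), and the choice to evaluate the timescale at rates near the threshold is our assumption.

```python
# One common rate reduction of the minimal triplet STDP model under Poisson
# pre- and postsynaptic firing (cf. Pfister and Gerstner, 2006):
#   dw/dt = x * (A3p * tau_p * tau_y * y**2  -  A2m * tau_m * y)
# Parameter values are illustrative, of the order of the published fits.
A3p, A2m = 6.5e-3, 7.1e-3                        # LTP / LTD amplitudes
tau_p, tau_m, tau_y = 16.8e-3, 33.7e-3, 114e-3   # plasticity time constants [s]

theta = (A2m * tau_m) / (A3p * tau_p * tau_y)    # rate threshold LTD -> LTP
print(f"threshold between LTD and LTP: {theta:.1f} Hz")

# Recurrent toy reduction (y = w, cf. Eq. (7)) with eta = x * A3p*tau_p*tau_y.
# Assumption (ours): pre- and postsynaptic rates sit near the threshold.
x = theta
eta = x * A3p * tau_p * tau_y
print(f"induction timescale 1/(eta*theta): {1 / (eta * theta):.0f} s "
      f"(~{1 / (eta * theta) / 60:.1f} min)")
```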

More specifically, and most importantly for the topic of this paper, the time constant τd of the low-pass filter in the rate-control mechanism of Eq. (9) has to be short; otherwise the induced phase delay destabilizes the system. Possible choices for τd include τd → 0, which means that ȳ = y. In other words, heterosynaptic plasticity needs to be nearly instantaneous to achieve its aim of controlling firing rates during ongoing Hebbian plasticity. Thus, if we want to achieve homeostasis by synaptic scaling, the synaptic scaling has to be extremely fast (Zenke et al., 2013).

Finally, this requirement cannot be loosened by more elaborate homeostatic control mechanisms which rely on complicated molecular signalling cascades. Homeostatic control mechanisms which require the buildup of secondary or tertiary messengers can be formalized in the language of control theory as integral controllers. Integral controllers can pick up small deviations from a target value and correct them. Therefore they are good for fine-tuning. However, they have to be “slow” (or, more precisely, slower than some critical value), otherwise they become unstable even in the absence of Hebbian plasticity (Harnack et al., 2015).

In summary, the application of our general analytical theory to data on STDP experiments gives an answer to the second question posed in the introduction: Rapid homeostatic control implies a reaction of the controller on the timescale of seconds to minutes. This answer is in good agreement with most existing simulations of plastic network models (Fig. 2) — in each of these, a rapid homeostatic control mechanism on a timescale of seconds to minutes was implemented to maintain stability (Pfister and Gerstner, 2006; Lazar et al., 2009; El Boustani et al., 2012; Clopath and Gerstner, 2010; Gjorgjieva et al., 2011; Litwin-Kumar and Doiron, 2014).

What are the functional consequences of rapid homeostatic control?

If we accept the notion that homeostatic plasticity acts on a short timescale, what are the functional consequences for plasticity and circuit dynamics? Neurons encode information in changes of their electrical activity levels. For instance, subsets of simple cells in the visual system fire spikes in response to specific edge-like features in the visual field (Hubel and Wiesel, 1962); cells in higher brain areas respond with high specificity to complex concepts and remain quiescent when the concept they are coding for is not brought to mind (Logothetis and Sheinberg, 1996; Hung et al., 2005; Quiroga et al., 2005); and finally, certain neurons respond selectively with elevated firing rates over extended periods during working memory tasks (Miyashita, 1988; Funahashi et al., 1989; Wilson et al., 1993). The ability of neurons to indicate selectively, through periods of strong activity, the presence of specific features in the input or specific concepts in working memory is an important condition for computation.

Homeostatic plasticity, however, by definition tries to keep neuronal activity constant. In particular, most models of homeostatic plasticity have a single homeostatic target or set point. That means that homeostatic plasticity tries to achieve some notion of constancy of one specific target variable over time. In many models this is the mean neuronal firing rate, but other target variables

such as the bulk synaptic conductance of all afferent synapses onto the same neuron are possible.

Is homeostatic control of activity compatible with the task of neurons to respond selectively to stimulation? If homeostatic control of firing rates is slow, neuronal firing can deviate substantially from the mean firing rate during short times and thus encode information (Fig. 3a,b). However, we already know that a slow homeostatic control mechanism cannot stabilize the ravaging effects of Hebbian plasticity. So what can we say about a rapid homeostatic mechanism? If the homeostatic mechanism acts on a short timescale (e.g. seconds to minutes), as would be required to stabilize Hebbian plasticity, neural codes based on neuronal activity become problematic (Fig. 3c) because homeostatic plasticity starts to suppress activity fluctuations which might be encoding important information. Even more alarmingly, certain forms of homeostatic plasticity, like sliding threshold models of homeostatic metaplasticity, not only suppress high activity periods, but also “unlearn” previously acquired selectivity and delete previously stored memories (Fig. 4a–d). Therefore rapid homeostatic mechanisms which enforce a single homeostatic set point are hardly desirable from a functional point of view.

Thus the requirement of fast homeostatic control over Hebbian plasticity poses a problem. It is important to appreciate that this problem arises from the combination of a single homeostatic target with the requirement to enforce the homeostatic constraint on a short timescale. However, there might be a simple solution to this conundrum. Suppose there are two (or more) homeostatic set points, or a target range (instead of a target point), implemented by multiple forms of rapid homeostatic control. For instance, one mechanism of homeostatic plasticity could activate above a certain activity threshold and ensure that neuronal activity does not exceed this threshold. Similarly, a second mechanism could activate below a low-activity threshold. The combined action of the two mechanisms forces neural activity to stay within an allowed range, but still permits substantial firing rate fluctuations inside that range (Fig. 3d; a code sketch of this scheme follows below). When such a homeostatic mechanism is paired with a form of Hebbian plasticity which has its plasticity threshold within the limits of the allowed activity regime, the neural activity of the compound system naturally becomes bistable. Within the allowed range no homeostatic plasticity is active, but Hebbian plasticity is intrinsically unstable. Thus any value within the homeostatically allowed region will lead to either LTP or LTD until the system reaches the limits, at which point homeostatic plasticity rapidly intervenes by undoing any excess LTP or LTD from there on. The compound system therefore exhibits two stable equilibrium points, one at low and one at high activity. If the high-activity fixed point corresponds to a memory retrieval state, it becomes irrelevant whether the memory is recalled every other minute or only once a year.

We can therefore answer the third of the questions raised in the introduction: The functional consequence of rapid homeostatic control with a single set point is that neurons lose the flexibility that is necessary for coding. The consequences are therefore undesirable. The proposed solution is to design rapid homeostatic control mechanisms that allow for several set points (bistability) or a target range of activity.

Figure 3: The effect of slow and rapid homeostasis on the neuronal code. (a) Fluctuating external world stimulus over time. (b) Neural activity A = w × input over time. A slow homeostatic mechanism adjusts w to push A towards a single homeostatic target (dashed line); homeostasis is modelled as $\tau_{\text{slow}} \frac{dw}{dt} = \text{Target} - A$. With a slow homeostatic mechanism, a neuron can track its input relatively accurately. (c) Same as b, but for a rapid homeostatic mechanism (τslow = 50 τfast). If the homeostatic timescale becomes comparable to the timescale of the stimulus, homeostasis starts to interfere with the neuron’s ability to track the stimulus. (d) Rapid homeostatic mechanisms enforcing an allowed range (limits indicated by dotted lines). Even though the neuron cannot capture all the diversity of the input, it can capture some of it. Here we modelled homeostasis as the following nonlinear extension of the above model: $\tau_{\text{fast}} \frac{dw}{dt} = f(\text{Target} - A)$ with f(x) = x for |x| > Limit and f(x) = 0 otherwise.
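The toy model from the caption of Fig. 3 can be reproduced in a few lines (a sketch; the stimulus and all constants are arbitrary choices). It compares how faithfully the activity A = w × input tracks the stimulus under slow homeostasis, fast homeostasis with a single target, and fast homeostasis with a target range.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, T = 0.01, 500.0
t = np.arange(0.0, T, dt)
# Slowly fluctuating stimulus (arbitrary construction: 50 s period plus noise)
inp = 1.0 + 0.5 * np.sin(2 * np.pi * t / 50.0) + 0.1 * rng.standard_normal(t.size)

target, limit = 1.0, 0.3
tau_fast, tau_slow = 1.0, 50.0            # tau_slow = 50 * tau_fast, as in Fig. 3c

def run(tau, ranged=False):
    w, act = 1.0, np.empty(t.size)
    for i, s in enumerate(inp):
        A = w * s                         # neural activity, as in the caption
        err = target - A
        if ranged and abs(err) <= limit:  # inside the allowed range: no control
            err = 0.0
        w += dt * err / tau               # tau * dw/dt = f(Target - A)
        act[i] = A
    return act

for label, tau, ranged in [("slow, single target", tau_slow, False),
                           ("fast, single target", tau_fast, False),
                           ("fast, target range ", tau_fast, True)]:
    A = run(tau, ranged)
    print(f"{label}: corr(A, input) = {np.corrcoef(A, inp)[0, 1]:.2f}")
```

The printed correlations should be highest for the slow controller, lowest for the fast single-target controller, and intermediate for the target range, mirroring panels b–d of Fig. 3.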
In the next sections we will argue that the well-orchestrated combination of

Hebbian and non-Hebbian plasticity mechanisms naturally gives rise to such bistable dynamics. We will explain why these dynamics are useful for learning and memory and why, despite their stability, they still need to be accompanied by additional slow homeostatic mechanisms.

Can the combination of Hebbian and non-Hebbian plasticity lead to stable memory formation?

We now discuss a new class of plasticity models which combines a plausible model of Hebbian plasticity with two additional homeostatic plasticity mechanisms (Zenke et al., 2015). For sensible combinations these compound models do not suffer from the run-away effects of purely Hebbian plasticity and exhibit intrinsic bistability instead (cf. Fig. 3d). The basic logic of bistable plasticity can be summarized as follows. At high activity levels a rapid form of heterosynaptic plasticity limits run-away LTP and creates synaptic competition. Similarly, at low activity levels an unspecific form of plasticity which only depends on presynaptic activity prevents run-away LTD. The well-orchestrated interplay between these adversarial plasticity mechanisms dynamically creates bistability of neuronal activity and prevents pathologic run-away effects. Our approach is general and any Hebbian plasticity model can be stabilized through the addition of two non-Hebbian forms of plasticity. For illustration purposes, we will now focus on the triplet STDP model for which biologically plausible sets of model parameters exist (Pfister and Gerstner, 2006). To prevent run-away LTP in this model, we use a form of weight-dependent, multiplicative heterosynaptic depression (Chen et al., 2013; Chistiakova et al., 2015; Zenke et al., 2015) which has also been interpreted as a rapid form of depotentiation (Zhou et al., 2003). To prevent run-away LTD, we introduce hypothetical “transmitter-induced” plasticity which depends on presynaptic activity only and slowly increases synaptic efficacy in the absence of postsynaptic spiking (Zenke et al., 2015). These three plasticity mechanisms work in symphony to generate two stable levels of neuronal activity. Let us consider the weight wij from a presynaptic neuron j to a postsynaptic neuron i. Although the full model is an STDP model, we now express its core ideas in terms of the presynaptic firing rate xj and the postsynaptic rate yi

$$\frac{dw_{ij}}{dt} = \underbrace{\delta \cdot x_j}_{\text{Transmitter-induced}} + \underbrace{\eta \cdot x_j y_i (y_i - \theta)}_{\text{Triplet model}} - \underbrace{\beta \cdot (w_{ij} - \tilde{w}_{ij})\, y_i^4}_{\text{Heterosynaptic}} \,. \qquad (10)$$

Here, δ and β are strength parameters for the two non-Hebbian components of the plasticity model; η is the strength parameter (learning rate) for Hebbian plasticity; and w̃ij serves as a reference weight that can be related to consolidation dynamics (Zenke et al., 2015; Ziegler et al., 2015). Note that, because the homeostatic mechanisms are “rapid”, no low-pass filtered variable ȳ appears in the expression.
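To see where the bistability comes from, one can apply the same recurrent toy reduction used for Eq. (7) (y stands in for the weight, the presynaptic rate is fixed at x = 1) and scan the drift of Eq. (10) for zero crossings. The constants below are illustrative choices, not the parameters of Zenke et al. (2015).

```python
import numpy as np

# Toy reduction of Eq. (10) with y = w and fixed presynaptic rate x = 1:
#   dy/dt = delta + eta * y * (y - theta) - beta * (y - y_ref) * y**4
delta, eta, theta, beta, y_ref = 0.1, 1.0, 1.0, 0.05, 0.0   # illustrative

def drift(y):
    return delta + eta * y * (y - theta) - beta * (y - y_ref) * y**4

ys = np.linspace(0.0, 4.0, 400_001)
f = drift(ys)
# Fixed points are sign changes of the drift; a fixed point is stable if the
# drift decreases through zero (positive -> negative).
for i in np.where(np.diff(np.sign(f)) != 0)[0]:
    kind = "stable" if f[i] > 0 > f[i + 1] else "unstable"
    print(f"fixed point at y ~ {ys[i]:.2f} ({kind})")
```

The drift crosses zero three times: a stable low-activity state where transmitter-induced potentiation balances LTD, an unstable threshold, and a stable high-activity state where heterosynaptic depression balances LTP. This is the bistability that appears as the two different outcomes for Neurons 1 and 2 in Fig. 4f–h.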

Figure 4: Intrinsically stabilized firing rates, synaptic weights and selectivity through rapid homeostatic mechanisms. (a) Schematic of the single neuron setup with two distinct input pathways. The “active” pathway consists of 40 Poisson neurons firing at either 2 or 10 Hz. The control pathway consists of 400 neurons all firing constantly at 2 Hz. All weights are initialized at the same values. (b) Population firing rates of the input populations averaged over 2 s bins. Firing rates in the active pathway (solid line) are transiently changed from 2 Hz to 10 Hz, whereas firing rates in the control pathway are constant at 2 Hz. (c) Output firing rates of a single postsynaptic neuron with triplet STDP (Pfister and Gerstner, 2006) at all input synapses, for two different time constants τd of the homeostatic sliding threshold (see Zenke et al. (2013); κ = 3 Hz). Purple: slow homeostatic sliding threshold, τd = 1 h; green: fast sliding threshold, τd = 10 s. Top and bottom show the same firing rate plot for different y-axis ranges. (d) Relative weight changes for 10 randomly chosen weights from each pathway for the slow (purple) and the fast (green) sliding threshold. Darker colors correspond to active pathway weights and lighter colors to the control pathway. (e) Simplified sketch of the setup with two identical postsynaptic neurons which receive the same input as described in a. All input synapses are plastic (see Zenke et al. (2015)). The only difference between the two neurons in this setup are the different initial conditions. For Neuron 2 the active pathway weights are initialized at a lower value than for Neuron 1, which could be thought of as the outcome of previous heterosynaptic depression. (f) Output firing rates of the two neurons over time. Neuron 1 (black) responds selectively to the elevated inputs in the active pathway (cf. b). Neuron 2 (orange) does not respond with elevated firing rates during stimulation. (g) Evolution of weights over time for Neuron 1. Active pathway weights are plotted in black and the control pathway in gray. For Neuron 1, the synapses in the active pathway initially undergo LTP during 10 Hz stimulation, but the weights quickly saturate. Synapses in the control pathway undergo heterosynaptic depression. (h) Same as g, but for Neuron 2. In contrast, Neuron 2 continues to fire at low rates and barely responds to the 10 Hz input in the active pathway. The active pathway weights are slightly depressed during initial stimulation.

Due to its rapid action and the high power of yi, the heterosynaptic term in Eq. (10) acts as a burst detector which dominates at high activity levels and prevents LTP run-away dynamics (Fig. 4f). For sensible choices of δ and β, neuronal firing rates remain in intermediate regimes (Fig. 4f) and synaptic weights in the model converge towards stable weights ŵ whose values depend on the activation history and allow the formation of long-term memories. Importantly, the model preserves the plasticity threshold between LTD and LTP of the original triplet STDP model. The triplet model together with the non-Hebbian plasticity mechanisms dynamically creates two stable equilibrium points. Overall, the synaptic weights converge rapidly towards one of two possible stable equilibrium states (Fig. 4g–h). First, there is a “selective” equilibrium state associated with high postsynaptic activity.
In this state some weights are strong while other weights onto the same postsynaptic neuron remain weak. Thus the neuron becomes selective to features in its input (Neuron 1 in Fig. 4f and g). Second, there is a low-activity fixed point without selectivity (Fig. 4f and h). Which fixed point a neuron converges to depends on the initial conditions and the details of the activation pattern (Fig. 4g–h). Once weights have converged to one of the respective stable states, weights keep fluctuating, but do not change on average. Moreover, since the homeostatic mechanism does not impose a single set point, activity patterns are not unlearned when a certain input is kept active for longer times (compare Fig. 4d with 4g).

There are several aspects worth noting about the model. First, heterosynaptic plasticity does not only stabilize Hebbian plasticity in the active pathway, it also introduces synaptic competition between the active and the control pathway (Fig. 4g). Unlike BCM-like models, in which heterosynaptic depression of the inactive pathway depends on intermediate periods of background activity in between stimuli (Cooper et al., 2004; Jedlicka et al., 2015), here the heterosynaptic depression happens simultaneously with LTP induction (Fig. 4g). Second, although the learning rule effectively implements a rapid rescaling of the synaptic weights, it is still a fully local learning rule which only relies on information which is locally available at the synapse (cf. Eq. (10)). Third, in general the reference weight w̃ is not fixed, but follows its own temporal dynamics on a slower timescale (≈ 20 min and more). Such complex synaptic dynamics are essential to capture experiments on synaptic consolidation (Frey and Morris, 1997; Redondo and Morris, 2011; Ziegler et al., 2015). Consolidation dynamics also play an important role in protecting memories from being overwritten by heterosynaptic plasticity. Finally, the stability properties of the learning rule Eq. (10) are not limited to simple feed-forward circuits, but generalize to more realistic scenarios. Specifically, the rule enables stable on-line learning and recall of cell assemblies in large spiking neural networks (Zenke et al., 2015).

In summary, orchestrating Hebbian and homeostatic plasticity on comparable timescales to dynamically create bistability can reconcile the experimentally observed fast induction of synaptic plasticity with stable synaptic dynamics and ensure stability of learning and memory at the single neuron level. We now explore potential problems that intrinsically bistable plasticity cannot solve.

Problems intrinsically bistable plasticity cannot solve

Consider a population of neurons with plastic synapses which follow intrinsically bistable plasticity dynamics such as the ones described in the last section. To encode and process information efficiently, neuronal populations need to create internal representations of the external world. Doing this efficiently requires the response to be sparse across the population. In other words, only a subset of neurons should respond to each stimulus. Moreover, different stimuli should evoke responses from different subsets of neurons within the population, to prevent all stimuli from looking "the same" to the neural circuit. Finally, individual neurons should respond sparsely over time. Imagine a neuron which is active for all possible stimuli: it would be as uninformative as one which never responds to any of the inputs. Therefore, to represent and process information in neural populations efficiently, different neurons in the population have to develop selectivity to different features.

Intrinsically bistable plasticity alone does not prevent neurons from responding at low firing rates to all stimuli (see, for example, Neuron 2 in Fig. 4f). Moreover, with similar initial conditions both Neuron 1 and Neuron 2 would have developed selectivity to the same input. Thus, in a large network in which all synapses are changed by the intrinsically bistable plasticity rule introduced above, all neurons could end up responding to the same feature. The question then arises whether it is possible to prevent such an undesired outcome. To successfully implement network functions such as these, several network parameters and properties of the learning rules themselves need to be tuned to and maintained in sensible parameter regimes. To achieve this, additional forms of homeostatic plasticity need to be in place. However, due to the intrinsic stability of the learning rule, these additional homeostatic mechanisms can now safely act on much longer timescales, similar to the ones observed in biology.

Thus we have answered the fourth question posed in the introduction: yes, the combination of Hebbian and rapid homeostatic plasticity can lead to stable memory formation, if homeostasis gives rise to intrinsic bistability. A combination of Hebbian plasticity with heterosynaptic LTD and transmitter-induced LTP gives rise to suitable bistable dynamics. However, on its own, intrinsic bistability of plasticity dynamics is not sufficient to induce complementary functionality across many neurons in a network. The question then arises whether slower forms of homeostasis could come into play in this context.
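Before turning to that question, the notion of sparseness used above can be made quantitative. The sketch below applies the classical Treves-Rolls sparseness measure to a hypothetical, randomly generated response matrix; the matrix and all numbers are purely illustrative:

import numpy as np

# Hypothetical response matrix R[i, s]: mean rate of neuron i to stimulus s.
rng = np.random.default_rng(0)
R = rng.gamma(shape=0.3, scale=5.0, size=(100, 50))   # skewed, sparse-ish rates

def sparseness(r):
    """Treves-Rolls sparseness of a nonnegative response vector.
    Close to 0 for a few strong responses, close to 1 for dense responses."""
    r = np.asarray(r, dtype=float)
    return r.mean() ** 2 / np.mean(r ** 2)

population = np.mean([sparseness(R[:, s]) for s in range(R.shape[1])])
lifetime = np.mean([sparseness(R[i, :]) for i in range(R.shape[0])])
print(f"population sparseness: {population:.2f}, lifetime sparseness: {lifetime:.2f}")

A plasticity rule that left all neurons weakly responsive to everything would drive both numbers towards one.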

What is the role of slower homeostatic plasticity mechanisms?

Diverse homeostatic mechanisms exist in the brain at different temporal and spatial scales (Marder and Goaillard, 2006; Turrigiano, 2012; Davis, 2013; Gjorgjieva et al., 2016). We have argued that rapid homeostatic mechanisms are important for stability, but what advantages do slow homeostatic mechanisms have, and what is their computational role?

An advantage of slow homeostatic processes is that they can integrate activity over long timescales to achieve precise regulation of neural target set points (Turrigiano and Nelson, 2000; van Rossum et al., 2000; Harnack et al., 2015). Longer integration times also make it possible to incorporate signals from other parts of a neural network which take time to be transmitted as diffusive factors (Turrigiano, 2012; Sweeney et al., 2015). Slower homeostasis thus seems well suited for control problems which either require fine-tuning or the more global homeostatic regulation of functions at the network level. There are at least two important dynamical network properties which are not directly controllable by the rapid homeostatic mechanisms in Eq. (10). First, temporal sparseness at the neuronal level: a neuron that never responds to any stimulus will remain unresponsive under bistable plasticity alone unless the LTP threshold is lowered. Second, to ensure spatial sparseness at the network level, lateral inhibition has to decorrelate neuronal responses. As excitatory synapses change during learning, the strength of lateral inhibition typically needs to be finely regulated.

The problem of temporal sparseness can be solved by any mechanism which ensures that a neuron which has been completely silent for a very long time eventually "gets a chance" to activate above the LTP threshold. This can be achieved either by lowering the threshold as in the BCM theory (Bienenstock et al., 1982; Cooper et al., 2004; Zenke et al., 2015) or by slowly increasing the gain of either the neuron itself or the excitatory synapses through other forms of slow homeostatic plasticity (Daoudal and Debanne, 2003; Turrigiano, 2011; Toyoizumi et al., 2014). Finally, similar homeostatic effects could be achieved by decreasing inhibitory synaptic input through the action of neuron-specific inhibitory plasticity (Vogels et al., 2013).

Decorrelating neural activity and enforcing sparse activity at the population level have typically been associated with lateral or recurrent inhibitory feedback in sparse learning paradigms (Olshausen and Field, 1996) and in models of associative memory (Willshaw et al., 1969; Little and Shaw, 1978; Hopfield, 1982; Amit et al., 1985; Amit and Brunel, 1997; Litwin-Kumar and Doiron, 2014; Zenke et al., 2015). However, achieving sensible global network activity levels through recurrent inhibitory feedback requires fine-tuning (Amit and Brunel, 1997; Brunel, 2000). Inhibitory synaptic plasticity (ISP) can achieve such fine-tuning and homeostatic functions in spiking neural networks. Specifically, Hebbian forms of ISP, for which synchronous pre- and postsynaptic activation leads to a strengthening of inhibitory synapses, have been shown to be a powerful mechanism to regulate neural firing rates, to establish and maintain excitatory-inhibitory balance, and to decorrelate neuronal firing in neural networks (Vogels et al., 2011; Zenke, 2014; Litwin-Kumar and Doiron, 2014). Existing Hebbian ISP models, however, increase inhibition between neurons which are frequently active together. This is useful to create excitatory-inhibitory balance and to prevent individual neurons from synchronizing. However, if neurons respond repeatedly together as part of the same cell assembly, these neurons will start to inhibit each other. Associative memory models require the opposite situation: neurons in the same assembly should not inhibit each other, while neurons of different cell assemblies should inhibit each other so as to prevent many cell assemblies from being activated simultaneously.
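The homeostatic core of the Hebbian ISP rules discussed above can be captured in a few lines. The sketch below is a rate-based caricature in the spirit of Vogels et al. (2011), not their spike-based rule; the fixed excitatory drive and all parameter values are illustrative assumptions:

# Rate-based caricature of Hebbian inhibitory synaptic plasticity: the
# inhibitory weight grows when the postsynaptic rate y exceeds the target
# rate rho0 and shrinks when y falls below it, pulling y towards rho0.
eta, rho0, dt = 0.01, 5.0, 0.1   # learning rate, target rate (Hz), time step (s)
g_exc, x_inh = 20.0, 2.0         # fixed excitatory drive (Hz), inhibitory input rate
w_inh = 0.0

for _ in range(5000):
    y = max(g_exc - w_inh * x_inh, 0.0)              # output rate under inhibition
    w_inh = max(w_inh + dt * eta * x_inh * (y - rho0), 0.0)

print(f"steady-state rate: {max(g_exc - w_inh * x_inh, 0.0):.2f} Hz (target {rho0} Hz)")

In the spike-based version the same computation is carried out by potentiation at pre-post coincidences against a small constant depression term per presynaptic spike, which is what makes the rule simultaneously Hebbian and homeostatic.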
To achieve such assembly-specific organization, inhibition needs to be tuned globally. Biological networks can achieve this type of global tuning through slow forms of homeostatic plasticity (Turrigiano, 2012) which rely on diffusive signals (Sweeney et al., 2015). These signals could either globally scale synaptic strength at inhibitory synapses directly, or indirectly influence the levels of inhibition by modulating inhibitory learning rules through a third factor (Zenke et al., 2015; Frémaux and Gerstner, 2016).

It is important to realize that not all homeostatic mechanisms have to be permanently active. For instance, once the inhibitory feedback within a model is tuned to the "sweet spot" at which the network can operate, moving-threshold mechanisms and ISP can safely be turned off while the network can still learn new stimuli (Zenke et al., 2015). It thus seems likely that in nature certain forms of homeostatic plasticity could be dormant most of the time and spring into action only to prepare a network during the initial phase of development (Turrigiano and Nelson, 2004) or when an extreme external manipulation changes the network dynamics (Turrigiano et al., 1998).

Thus we are able to answer the final question as follows: slow homeostatic mechanisms tune parameters of plasticity rules and neurons to enable efficient use of the available resources in networks. For example, for the sake of efficiency, no neuron should never be active; no neuron should always be active; and the number of neurons that respond to the exact same set of stimuli should stay limited. Taken together with the results from the previous sections, these insights suggest two distinct roles for homeostatic plasticity mechanisms on different timescales. First, homeostatic plasticity on short timescales stabilizes Hebbian plasticity and makes synapses onto the same neuron compete with each other. Heterosynaptic plasticity is likely to play a major role in these functionalities. Second, homeostatic mechanisms on slower timescales achieve fine-tuning of multiple network parameters. A slow shift of the threshold between LTD and LTP, the slow rescaling of all synaptic weights, or a slow regulation of neuronal parameters are good candidates for these functionalities. Some of these slow mechanisms could be important only in setting up the network initially or after a strong external perturbation to the circuit.

Discussion

We discuss the relation of the above modeling approach to other models and to experimental data.

Other intrinsically bistable plasticity models

The notion of intrinsic stability (or bistability) has come up recently in other studies. Toyoizumi et al. (2014) proposed the following plasticity rule which formally fulfills the requirements of an intrinsically stable rule:

τ_w dw/dt = [w_max − w]_+ [xy − θ]_+ − [w − w_min]_+ [θ − xy]_+        (11)

where the first term implements LTP, the second term implements LTD, and [·]_+ = max(·, 0) denotes rectification.
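A direct numerical integration makes the intrinsic bistability of Eq. (11) explicit. In this sketch the slow homeostatic factor of the full rule (see below) is omitted and all parameter values are illustrative:

import numpy as np

# Forward-Euler integration of Eq. (11). Depending on whether the Hebbian
# drive x*y lies above or below theta, the weight relaxes to wmax or wmin.
def relu(a):
    return np.maximum(a, 0.0)

tau_w, wmin, wmax, theta, dt = 100.0, 0.0, 1.0, 1.0, 0.1

def integrate(w, x, y, T=2000.0):
    for _ in range(int(T / dt)):
        dw = (relu(wmax - w) * relu(x * y - theta)
              - relu(w - wmin) * relu(theta - x * y)) / tau_w
        w += dt * dw
    return w

print(integrate(w=0.5, x=2.0, y=1.0))   # x*y > theta: w -> wmax = 1.0
print(integrate(w=0.5, x=0.5, y=1.0))   # x*y < theta: w -> wmin = 0.0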

The full rule is multiplicatively coupled with a slow homeostatic process which we have omitted in Eq. (11). However, even in the absence of any additional form of homeostatic plasticity, individual synaptic weights saturate at the soft bounds w_min and w_max, respectively. Similarly to Eq. (10), the model possesses a plasticity threshold θ. Here, however, the threshold is defined via the product of pre- and postsynaptic activity, similar to El Boustani et al. (2012), but in contrast to most data-driven models in which the threshold is defined on postsynaptic quantities only (Artola and Singer, 1993; Shouval et al., 2002; Pfister and Gerstner, 2006; Clopath and Gerstner, 2010; Graupner and Brunel, 2012).

Another example of rapid stabilization of an otherwise unstable learning rule was devised by Lim et al. (2015). Using an analytic framework, the authors inferred the following rate-based plasticity model directly from experimental data on excitatory-to-excitatory connections in infero-temporal cortex (ITC):

dw/dt ∝ x (y − θ) .

Because this model is unstable and leads to responses in a recurrent network model which are much smaller than their ITC measurements, the authors added a rapid homeostatic mechanism which maintains the average synaptic strength onto a neuron. This can be formalized as:

dw/dt ∝ (x − x̄)(y − θ) .        (12)

It is interesting to note that in this model weights grow at a rate proportional to x̄θ in the absence of activity, which opposes run-away LTD and is conceptually similar to Eq. (10). However, in contrast to Eq. (10), in this model run-away LTP is counterbalanced at the network level through recurrent inhibitory feedback, which pushes the activity y of most neurons below the plasticity threshold θ, causing them to undergo LTD. This suggests that under some circumstances the rapid modulation of excitatory plasticity through inhibitory feedback might be enough to avoid run-away dynamics.

While there are several different ways to stabilize Hebbian plasticity, a plethora of theoretical models seem to agree conceptually on two key factors. First, to achieve stability against run-away LTP, a model needs to ensure that negative feedback is rapid (Pfister and Gerstner, 2006; Lazar et al., 2009; El Boustani et al., 2012; Clopath and Gerstner, 2010; Gjorgjieva et al., 2011; Litwin-Kumar and Doiron, 2014). Second, the feedback needs to be strong enough to compensate for the leading power of Hebbian plasticity (Oja, 1982; Bienenstock et al., 1982; Cooper et al., 2004; Tetzlaff et al., 2012; Zenke et al., 2013; Toyoizumi et al., 2014).
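Returning to Eq. (12), the drift at zero activity is easy to verify numerically; the values below are arbitrary illustrations:

# No-activity drift of Eq. (12): with x = y = 0 the right-hand side becomes
# (0 - xbar)(0 - theta) = xbar * theta > 0, i.e. weights slowly grow and
# thereby oppose run-away LTD. All values are arbitrary.
xbar, theta = 2.0, 5.0

x, y = 0.0, 0.0
print((x - xbar) * (y - theta))   # 10.0 = xbar * theta: positive drift

x, y = 3.0, 4.0                   # active input, output pushed below threshold
print((x - xbar) * (y - theta))   # -1.0: LTD, as for most neurons under inhibition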

Different models capture the same plasticity experiments

A plausible plasticity model has to reproduce stable learning dynamics in simulations, and it has to capture a plethora of experimental data from rate-dependent (Bliss and Lomo, 1973), voltage-dependent (Artola et al., 1990) and spike-timing-dependent (Markram et al., 1997; Bi and Poo, 1998; Sjöström et al., 2001) plasticity experiments. One salient feature captured by most models (Artola and Singer, 1993; Shouval et al., 2002; Pfister and Gerstner, 2006; Clopath and Gerstner, 2010; Graupner and Brunel, 2012) is the notion of a plasticity threshold which correlates with postsynaptic voltage, calcium concentration or other neuronal variables related to postsynaptic activation (Fig. 1a). However, most existing models are purely Hebbian and do not include the notion of a rapid homeostatic mechanism. If rapid homeostatic mechanisms exist, as we argue here, then how can existing plasticity models without them quantitatively capture the experimental data?

There are presumably three main reasons for this. First, STDP experiments typically manipulate a single pathway, either by stimulating a presynaptic neuron or a bundle of presynaptic axons. Sometimes a designated control pathway (i.e. a second presynaptic neuron) is missing, or, if it is present, the effect size in the control pathway is considered weak. From a theoretical perspective, however, we indeed expect heterosynaptic effects caused by stimulation of one presynaptic pathway to be weak when measured at a single "control" synapse. Even if heterosynaptic plasticity is weak on a per-synapse basis, its net effect is strong because it spreads out over thousands of synapses. Therefore even weak heterosynaptic plasticity at other synapses could implement a form of rapid homeostatic control (Chen et al., 2013; Chistiakova et al., 2015; Zenke et al., 2015).

Second, in an STDP experiment with 60 repetitions of pre-post pairs, the total activation of the postsynaptic neuron is still in a reasonable regime. It is therefore unclear whether the "burst detector" for heterosynaptic plasticity would be triggered at all (Chen et al., 2013; Chistiakova et al., 2015; Zenke et al., 2015).

Third, experiments typically rely on repeated pre- and postsynaptic activation, and during the induction protocol synaptic efficacy changes are usually not observable. Plasticity models are thus fitted to pairs of initial and final synaptic strength. However, the unobserved intermediate synaptic dynamics could be qualitatively very different (Fig. 1c). These differences in the dynamics contain the answers to questions such as: Is the final synaptic strength stable, or would it increase further with additional pairings? Is there a threshold number of pairings that needs to be reached for an all-or-nothing effect? Because the detailed dynamics of synapses during induction are not known, different plasticity models make different assumptions about the saturation of weights. Based on these assumptions, most existing plasticity models can be coarsely classified into additive and multiplicative models. Additive plasticity models, for instance, assume a linear growth during plasticity induction (Fig. 1c; Pfister and Gerstner (2006); Clopath et al. (2010); Song et al. (2000)). In the absence of the hard upper bound which is usually present in additive models, the weight would simply keep growing. Although additive models capture plasticity experiments for a fixed number of pairings, in simulated networks with ongoing pre- and postsynaptic activity synaptic weights diverge and "get stuck" at the upper and lower bounds (Billings and van Rossum, 2009).

The equations of multiplicative models, on the other hand, depend explicitly on the synaptic weight, which typically makes the weights converge to stable intermediate values (Gütig et al., 2003; Morrison et al., 2007; Gilson et al., 2010) and, for ongoing spiking activity in network simulations, gives rise to unimodal weight distributions (Morrison et al., 2007). Both multiplicative and additive models generally do not lead to stability in network models (Morrison et al., 2007; Kunkel et al., 2011; Zenke et al., 2013) unless they are accompanied by additional homeostatic mechanisms. However, these mechanisms are typically added ad hoc and are not fitted to experimental data (e.g. Pfister and Gerstner (2006); Clopath et al. (2010)). Due to the limited amount of experimental data, it is possible to construct a set of different models which are all consistent with the existing experiments. The model discussed in this paper is based on the triplet STDP model (and is therefore consistent with existing STDP data), but includes additional forms of activity-dependent non-Hebbian plasticity which give rise to intrinsic bistability (and which are harder to measure experimentally).
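The additive/multiplicative distinction can be made concrete with a toy simulation. The update magnitudes, the event statistics, and the choice of a linear weight dependence (as in Gütig et al. (2003) with μ = 1) are illustrative assumptions, not a fit to any dataset:

import numpy as np

# Toy comparison of additive vs. weight-dependent STDP updates under random,
# uncorrelated potentiation/depression events. Additive updates perform a
# random walk that piles weights up at the hard bounds; weight-dependent
# updates shrink near the bounds and pull weights to intermediate values.
rng = np.random.default_rng(1)
n, steps, A = 1000, 5000, 0.01
w_add = np.full(n, 0.5)
w_mul = np.full(n, 0.5)

for _ in range(steps):
    pot = rng.random(n) < 0.5                                 # which synapses get potentiated
    w_add = np.clip(w_add + np.where(pot, A, -A), 0.0, 1.0)   # fixed step, hard bounds
    w_mul += np.where(pot, A * (1.0 - w_mul), -A * w_mul)     # step scales with w

at_bounds = np.mean((w_add < 0.01) | (w_add > 0.99))
print(f"additive: {100 * at_bounds:.0f}% of weights at the bounds (bimodal)")
print(f"weight-dependent: mean {w_mul.mean():.2f}, std {w_mul.std():.3f} (unimodal)")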

Can we constrain plasticity models by experiments?

There are multiple ways in which synaptic plasticity models could be better constrained through additional data. In the past, a large body of research has focused on homosynaptic associative plasticity, also called Hebbian plasticity, using pairing experiments with various protocols such as STDP. Here, we argue that heterosynaptic plasticity as well as transmitter-induced plasticity might be almost as important as Hebbian plasticity due to their possibly crucial role for network stability.

Heterosynaptic plasticity is a promising candidate to stabilize Hebbian plasticity models against run-away LTP (Chen et al., 2013; Zenke et al., 2013, 2015; Chistiakova et al., 2015; Jedlicka et al., 2015). While heterosynaptic plasticity has been observed in various experiments (Royer and Paré, 2003; Chistiakova et al., 2014), a conclusive picture and quantitative models are still missing. Is it possible to measure the timescale, frequency dependence, and weight dependence of neuron-wide heterosynaptic depression by manipulating the stimulation of the postsynaptic neuron? Another important question for the interpretation of heterosynaptic plasticity is whether it induces mostly synaptic depression similar or related to LTD, or whether it rather resets or prevents early LTP through depotentiation at the unstimulated pathway (Zhou et al., 2003).

Transmitter-induced plasticity is important in models and might be present in many experiments, even though it has not been reported as such. Here, transmitter-induced plasticity refers to potentially weak long-term potentiation that is caused by presynaptic firing in the absence of postsynaptic activity. Why is this form of plasticity potentially important? Suppose you have a network of neurons firing at low activity, so that any given neuron can be considered a weakly active postsynaptic neuron. Since low activity typically induces LTD, many plastic network simulations have the tendency to fall silent. To compensate for this, theorists either introduce lower bounds on synaptic weights or add weak LTP triggered by presynaptic activity (Zenke et al., 2015; Lim et al., 2015). How realistic are these assumptions? Direct experimental evidence for such terms would, for instance, be the growth of synaptic efficacy during low-activity "pre only" stimulation. It is possible that such a term would look like, and could easily be mistaken for, an unstable baseline (with positive drift) before the beginning of a plasticity induction experiment. Given the theoretical importance of such a term, even if it is very weak, it is worth explicit study.

Consolidation of synapses is summarized in the present model by a reference weight w˜ (Zenke et al., 2015; Ziegler et al., 2015). Simulations predict that synaptic consolidation renders synapses inert against heterosynaptic plasticity. Intuitively, the measured synaptic weights become sticky and are always attracted back to their momentary stable state, i.e. weak or strong. This prediction awaits future experimental clarification.

The path towards saturation of synaptic weights during a pairing experiment (Fig. 1c) is vital to building better plasticity models. Virtually any information which helps theorists constrain how the synaptic weight increases would be helpful. Importantly, this also includes information about conditions (or experimental protocols) which do not induce plasticity, despite the fact that the presynaptic or the postsynaptic neuron or both have been activated.
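To give a feeling for the magnitudes involved, a back-of-the-envelope calculation shows how even a weak transmitter-induced term would surface in a baseline recording; the per-spike increment and the input rate below are invented for illustration:

# Hypothetical transmitter-induced term: each presynaptic spike increments the
# weight by a fixed fraction delta, independent of postsynaptic activity.
delta = 1e-5          # assumed relative weight change per presynaptic spike
rate_hz = 2.0         # low-rate "pre only" stimulation
minutes = 20.0

spikes = rate_hz * 60.0 * minutes
print(f"expected baseline drift: +{100.0 * delta * spikes:.1f}% over {minutes:.0f} min")

A drift of a few percent over tens of minutes would be hard to distinguish from an unstable baseline, which is precisely why explicit tests are needed.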

Conclusion

One of the most striking differences between plasticity models and experimental data concerns the timescale. Hebbian plasticity can be induced within seconds to minutes (Bliss and Lomo, 1973; Artola et al., 1990; Markram et al., 1997; Bi and Poo, 1998). In simulated network models, such fast Hebbian plasticity leads to run-away activity within seconds unless it is complemented by rapid forms of homeostatic plasticity. Here, "rapid" means that homeostatic changes need to take effect on the timescale of seconds or at most a few minutes (Zenke et al., 2013). This, however, is much faster than most forms of homeostatic plasticity observed in experiments. One of the most extensively studied forms of homeostasis in experiments is synaptic scaling (Turrigiano et al., 1998), which proportionally scales up (down) synapses if the network activity is too low (high). However, even the fastest known forms of scaling take hours to days to cause measurable changes in synaptic weights (Fig. 2; Turrigiano and Nelson (2000); Ibata et al. (2008); Aoto et al. (2008)). This apparent difference between the timescale of homeostasis required for stability in models and the timescales found in experiments is a challenge for current theories (Wu and Yamaguchi, 2006; Chen et al., 2013; Zenke et al., 2013; Toyoizumi et al., 2014).

To reconcile plasticity models and stability in networks of simulated neurons, we need to reconsider models of Hebbian plasticity and how they are fitted to data. In most plasticity induction experiments, neither the time course of the manipulated synaptic weight nor changes of other synapses are observed during stimulation. Quantitative models of synaptic plasticity thus make minimal assumptions about these unobserved temporal dynamics and generally ignore heterosynaptic effects entirely. In other words, missing experimental data makes it possible to build different models which all capture the existing experimental data, but make different assumptions about the unobserved dynamics. Importantly, some of these models become intrinsically stable (Oja, 1982; Miller and MacKay, 1994; Chen et al., 2013) or even bistable (Toyoizumi et al., 2014; Zenke et al., 2015). In most situations these models can be interpreted as compound models consisting of Hebbian plasticity and forms of rapid homeostatic plasticity. Importantly, all of the plasticity models cited in this paragraph rely only on quantities which are locally known to the synapse, i.e. the pre- and postsynaptic activity as well as the synapse's own weight. Although such local forms of plasticity can solve the problem of stability at the neuronal level, in practice most network models require additional fine-tuning of parameters to achieve plausible activity levels across a network of neurons. This role can be fulfilled by slow homeostatic mechanisms which act on timescales of hours or days, consistent with experimental data on homeostatic plasticity.

In summary, based on theoretical arguments we suggest that Hebbian plasticity is intrinsically stabilized on extremely short timescales by fast homeostatic control, likely implemented by heterosynaptic plasticity, while slow forms of homeostatic plasticity set the stage for stable learning. This hypothesis will now have to stand the test of time. It will be an important challenge for the coming years to go beyond homosynaptic Hebbian plasticity and to gain a more complete understanding of the diverse interactions of Hebbian and homeostatic plasticity across timescales.

Acknowledgements. WG was supported for this work by the European Research Council under grant agreement number 268689 (MultiRules) and by the European Community's Seventh Framework Program under grant no. 604102 (Human Brain Project). FZ was supported by the SNSF (Swiss National Science Foundation).

References

Larry F. Abbott, J. A. Varela, K. Sen, and S. B. Nelson. Synaptic depression and cortical gain control. Science, 275(5297):220–224, January 1997.

W. C. Abraham and G. V. Goddard. Asymmetric relationships between homosynaptic long-term potentiation and heterosynaptic long-term depression. Nature, 305(5936):717–719, October 1983.

Wickliffe C. Abraham. Metaplasticity: Tuning synapses and networks for plasticity. Nat Rev Neurosci, 9(5):387, May 2008.

D. J. Amit. Modeling brain function. Cambridge University Press, Cambridge UK, 1989.

D. J. Amit and N. Brunel. Model of global spontaneous activity and local structured activity during delay periods in the cerebral cortex. Cerebral Cortex, 7:237–252, 1997.

Daniel J. Amit, Hanoch Gutfreund, and H. Sompolinsky. Storing Infinite Numbers of Patterns in a Spin-Glass Model of Neural Networks. Phys Rev Lett, 55(14):1530–1533, September 1985.

Jason Aoto, Christine I. Nam, Michael M. Poon, Pamela Ting, and Lu Chen. Synaptic Signaling by All-Trans Retinoic Acid in Homeostatic Synaptic Plasticity. Neuron, 60(2):308–320, October 2008.

A Artola and W Singer. Long-term depression of excitatory synaptic transmission and its relationship to long-term potentiation. Trends Neurosci, 16(11):480–487, November 1993.

A. Artola, S. Bröcher, and W. Singer. Different voltage-dependent thresholds for inducing long-term depression and long-term potentiation in slices of rat visual cortex. Nature, 347(6288):69–72, September 1990.

Craig H. Bailey, Maurizio Giustetto, Yan-You Huang, Robert D. Hawkins, and Eric R. Kandel. Is heterosynaptic modulation essential for stabilizing Hebbian plasticity and memory? Nat Rev Neurosci, 1(1):11–20, October 2000.

Guo-Qiang Bi and Mu-Ming Poo. Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength, and Postsynaptic Cell Type. J Neurosci, 18(24):10464–10472, December 1998.

EL Bienenstock, LN Cooper, and PW Munro. Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J Neurosci, 2(1):32–48, January 1982.

Guy Billings and Mark C. W. van Rossum. Memory Retention and Spike-Timing-Dependent Plasticity. J Neurophysiol, 101(6):2775–2788, June 2009.

T. V. P. Bliss and T. Lomo. Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. J. Physiol., 232:331–356, 1973.

Carlos S. N. Brito and Wulfram Gerstner. Nonlinear Hebbian learning as a unifying principle in receptive field formation. arXiv:1601.00701 [cs, q-bio], January 2016.

T. H. Brown, A. M. Zador, Z. F. Mainen, and B. J. Claiborne. Hebbian modifications in hippocampal neurons. In M. Baudry and J. L. Davis, editors, Long-term potentiation, pages 357–389. MIT Press, Cambridge, London, 1991.

Thomas H. Brown and Sumantra Chattarji. Hebbian Synaptic Plasticity: Evolution of the Contemporary Concept. In E. Domany, J. L. van Hemmen, and K. Schulten, editors, Models of Neural Networks, Physics of Neural Networks, pages 287–314. Springer New York, 1994.

Nicolas Brunel. Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons. J Comput Neurosci, 8(3):183–208, May 2000.

Jen-Yung Chen, Peter Lonjers, Christopher Lee, Marina Chistiakova, Maxim Volgushev, and Maxim Bazhenov. Heterosynaptic Plasticity Prevents Runaway Synaptic Dynamics. J Neurosci, 33(40):15915–15929, October 2013.

Marina Chistiakova, Nicholas M. Bannon, Maxim Bazhenov, and Maxim Volgushev. Heterosynaptic Plasticity: Multiple Mechanisms and Multiple Roles. Neuroscientist, 20(5):483–498, October 2014.

Marina Chistiakova, Nicholas M. Bannon, Jen-Yung Chen, Maxim Bazhenov, and Maxim Volgushev. Homeostatic role of heterosynaptic plasticity: models and experiments. Front Comput Neurosci, 9:89, 2015.

Brian R. Christie and Wickliffe C. Abraham. Priming of associative long-term depression in the dentate gyrus by theta frequency synaptic activity. Neuron, 9(1):79–84, July 1992.

C. Clopath, L. Büsing, E. Vasilaki, and W. Gerstner. Connectivity reflects coding: A model of voltage-based spike-timing-dependent-plasticity with homeostasis. Nature Neuroscience, 13:344–352, 2010.

Claudia Clopath and Wulfram Gerstner. Voltage and spike timing interact in STDP – a unified model. Front Synaptic Neurosci, 2:25, 2010.

Leon N Cooper, Nathan Intrator, Brian S Blais, and Harel Z Shouval. Theory of Cortical Plasticity. World Scientific, New Jersey, April 2004.

Gaël Daoudal and Dominique Debanne. Long-Term Plasticity of Intrinsic Excitability: Learning Rules and Mechanisms. Learn Mem, 10(6):456–465, November 2003.

Graeme W. Davis. Homeostatic Signaling and the Stabilization of Neural Function. Neuron, 80(3):718–728, October 2013.

S. M. Dudek and M. F. Bear. Homosynaptic Long-Term Depression in Area CA1 of Hippocampus and Effects of N-Methyl-D-Aspartate Receptor Blockade. PNAS, 89(10):4363–4367, May 1992.

Sami El Boustani, Pierre Yger, Yves Frégnac, and Alain Destexhe. Stable Learning in Stochastic Network States. J Neurosci, 32(1):194 –214, January 2012.

Uwe Frey and Richard G. M. Morris. Synaptic tagging and long-term potentiation. Nature, 385(6616):533–536, March 1997.

Uwe Frey and Richard G. M. Morris. Synaptic tagging: implications for late maintenance of hippocampal long-term potentiation. Trends Neurosci, 21(5):181–188, May 1998.

30 Robert C Froemke and Yang Dan. Spike-timing-dependent synaptic modification induced by natural spike trains. Nature, 416(6879):433–8, March 2002.

Nicolas Frémaux and Wulfram Gerstner. Neuromodulated Spike-Timing-Dependent Plasticity, and Theory of Three-Factor Learning Rules. Front Neural Circuits, 9:85, 2016.

S. Funahashi, C. J Bruce, and P. S Goldman-Rakic. Mnemonic Coding of Visual Space in the Monkey’s Dorsolateral Prefrontal Cortex. J Neurophysiol, 61(2):331–349, February 1989.

Wulfram Gerstner, Werner M Kistler, Richard Naud, and Liam Paninski. Neuronal dynamics: from single neurons to networks and models of cognition. Cambridge University Press, Cambridge, 2014.

Matthieu Gilson, Anthony Burkitt, David Grayden, Doreen Thomas, and J. van Hemmen. Representation of input structure in synaptic weights by spike-timing-dependent plasticity. Phys Rev E, 82(2):1–11, August 2010.

Julijana Gjorgjieva, Claudia Clopath, Juliette Audet, and Jean-Pascal Pfister. A triplet spike-timing-dependent plasticity model generalizes the Bienenstock–Cooper–Munro rule to higher-order spatiotemporal correlations. Proc Natl Acad Sci U S A, 108(48):19383–19388, November 2011.

Julijana Gjorgjieva, Guillaume Drion, and Eve Marder. Computational implications of biophysical diversity and multiple timescales in neurons and synapses for circuit performance. Current Opinion in Neurobiology, 37:44–52, April 2016.

Anubhuti Goel, Bin Jiang, Linda W. Xu, Lihua Song, Alfredo Kirkwood, and Hey-Kyoung Lee. Cross-modal regulation of synaptic AMPA receptors in primary sensory cortices by visual experience. Nat Neurosci, 9(8):1001–1003, August 2006.

Michael Graupner and Nicolas Brunel. Calcium-based plasticity model explains sensitivity of synaptic changes to spike pattern, rate, and dendritic location. Proc Natl Acad Sci U S A, 109(10):3991–3996, March 2012.

R. Gütig, R. Aharonov, S. Rotter, and Haim Sompolinsky. Learning Input Correlations through Nonlinear Temporally Asymmetric Hebbian Plasticity. J Neurosci, 23(9):3697–3714, May 2003.

Daniel Harnack, Miha Pelko, Antoine Chaillet, Yacine Chitour, and Mark C. W. van Rossum. Stability of Neuronal Networks with Homeostatic Regulation. PLoS Comput Biol, 11(7):e1004357, July 2015.

D. O. Hebb. The Organization of Behavior. Wiley, New York, 1949.

John J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci U S A, 79(8):2554, 1982.

Y Y Huang, A Colino, D K Selig, and R C Malenka. The influence of prior synaptic activity on the induction of long-term potentiation. Science, 255(5045):730–733, February 1992.

D. H. Hubel and T. N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J Physiol, 160(1):106–154, January 1962.

C. P. Hung, G. Kreiman, T. Poggio, and J. J. DiCarlo. Fast Readout of Object Identity from Macaque Inferior Temporal Cortex. Science, 310:863–866, 2005.

Keiji Ibata, Qian Sun, and Gina G. Turrigiano. Rapid Synaptic Scaling Induced by Changes in Postsynaptic Firing. Neuron, 57(6):819–826, March 2008.

Peter Jedlicka, Lubica Benuskova, and Wickliffe C. Abraham. A Voltage-Based STDP Rule Combined with Fast BCM-Like Metaplasticity Accounts for LTP and Concurrent "Heterosynaptic" LTD in the Dentate Gyrus In Vivo. PLoS Comput Biol, 11(11):e1004588, November 2015.

Richard Kempter, Wulfram Gerstner, and J. van Hemmen. Hebbian learning and spiking neurons. Phys Rev E, 59(4):4498–4514, April 1999.

Susanne Kunkel, Markus Diesmann, and Abigail Morrison. Limits to the development of feed-forward structures in large recurrent neuronal networks. Front Comput Neurosci, 4:160, 2011.

Andreea Lazar, Gordon Pipa, and Jochen Triesch. SORN: A Self-Organizing Recurrent Neural Network. Front Comput Neurosci, 3, October 2009.

W. B. Levy and O. Steward. Temporal contiguity requirements for long-term associative potentiation/depression in the hippocampus. Neuroscience, 8(4):791–797, April 1983.

Sukbin Lim, Jillian L. McKee, Luke Woloszyn, Yali Amit, David J. Freedman, David L. Sheinberg, and Nicolas Brunel. Inferring learning rules from distributions of firing rates in cortical neurons. Nat Neurosci, advance online publication, November 2015.

John Lisman. Long-term potentiation: outstanding questions and attempted synthesis. Phil Trans R Soc Lond B, 358(1432):829–842, April 2003.

W. A. Little and G. L. Shaw. Analytical study of the memory storage capacity of a neural network. Math. Biosc., 39:281–290, 1978.

Ashok Litwin-Kumar and Brent Doiron. Formation and maintenance of neuronal assemblies through synaptic plasticity. Nat Commun, 5, November 2014.

Nikos K. Logothetis and David L. Sheinberg. Visual Object Recognition. Annual Review of Neuroscience, 19(1):577–621, 1996.

G. S. Lynch, T. Dunwiddie, and V. Gribkoff. Heterosynaptic depression: a postsynaptic correlate of long-term potentiation. Nature, 266(5604):737–739, April 1977.

Robert C. Malenka and Roger A. Nicoll. Long-Term Potentiation–A Decade of Progress? Science, 285(5435):1870–1874, September 1999.

Eve Marder and Jean-Marc Goaillard. Variability, compensation and homeostasis in neuron and network function. Nat Rev Neurosci, 7(7):563–574, July 2006.

H. Markram, J. Lübke, M. Frotscher, and B. Sakmann. Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science, 275:213–215, 1997.

S. J. Martin, P. D. Grimwood, and R. G. M. Morris. Synaptic Plasticity and Memory: An Evaluation of the Hypothesis. Annu Rev Neurosci, 23(1):649–711, 2000.

K. D. Miller and D. J. C. MacKay. The role of constraints in Hebbian learning. Neural Computation, 6:100–126, 1994.

Yasushi Miyashita. Neural correlate of visual associative long-term memory in the primate temporal cortex. Nature, 335(27):817–820, October 1988.

R. G. M. Morris, E. Anderson, G. S. Lynch, and M. Baudry. Selective impairment of learning and blockade of long-term potentiation by an N-methyl-D-aspartate receptor antagonist, AP5. Nature, 319(6056):774–776, February 1986.

Abigail Morrison, Ad Aertsen, and Markus Diesmann. Spike-timing-dependent plasticity in balanced random networks. Neural Comput, 19(6):1437–67, June 2007.

Abigail Morrison, Markus Diesmann, and Wulfram Gerstner. Phenomenological models of synaptic plasticity based on spike timing. Biol Cybern, 98(6):459–478, June 2008.

Daniel H. O'Connor, Gayle M. Wittenberg, and Samuel S.-H. Wang. Graded bidirectional synaptic plasticity is composed of switch-like unitary events. PNAS, 102(27):9679–9684, July 2005.

Erkki Oja. Simplified neuron model as a principal component analyzer. J Math Biol, 15(3):267–273, 1982.

Bruno A. Olshausen and David J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607–609, June 1996.

Carl C. H. Petersen, Robert C. Malenka, Roger A. Nicoll, and John J. Hopfield. All-or-none potentiation at CA3-CA1 synapses. PNAS, 95(8):4732–4737, April 1998.

Jean-Pascal Pfister and Wulfram Gerstner. Triplets of Spikes in a Model of Spike Timing-Dependent Plasticity. J Neurosci, 26(38):9673–9682, September 2006.

R. Quian Quiroga, L. Reddy, G. Kreiman, C. Koch, and I. Fried. Invariant visual representation by single neurons in the human brain. Nature, 435(7045):1102–1107, June 2005.

Roger L. Redondo and Richard G. M. Morris. Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci, 12(1):17–30, January 2011.

Alfonso Renart, Jaime de la Rocha, Peter Bartho, Liad Hollender, Néstor Parga, Alex Reyes, and Kenneth D. Harris. The Asynchronous State in Cortical Circuits. Science, 327(5965):587–590, January 2010.

Klaus G. Reymann and Julietta U. Frey. The late maintenance of hippocampal LTP: Requirements, phases, ‘synaptic tagging’, ‘late-associativity’ and implications. Neuropharmacology, 52(1):24–40, January 2007.

N. Rochester, J. Holland, L. Haibt, and W. Duda. Tests on a cell assembly theory of the action of the brain, using a large digital computer. IEEE Trans Inf Theory, 2(3):80–93, September 1956.

Sébastien Royer and Denis Paré. Conservation of total synaptic weight through balanced synaptic depression and potentiation. Nature, 422(6931):518–522, April 2003.

Walter Senn, Henry Markram, and Misha Tsodyks. An Algorithm for Modifying Neurotransmitter Release Probability Based on Pre- and Postsynaptic Spike Timing. Neural Comput, 13(1):35–67, January 2001.

Harel Z. Shouval, Mark F. Bear, and Leon N. Cooper. A Unified Model of NMDA Receptor-Dependent Bidirectional Synaptic Plasticity. Proc Natl Acad Sci U S A, 99(16):10831–10836, August 2002.

Per Jesper Sjöström, Gina G Turrigiano, and Sacha B Nelson. Rate, Timing, and Cooperativity Jointly Determine Cortical Synaptic Plasticity. Neuron, 32(6):1149–1164, December 2001.

Sen Song, Kenneth D. Miller, and L. F. Abbott. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat Neurosci, 3(9):919–926, September 2000.

Yann Sweeney, Jeanette Hellgren Kotaleski, and Matthias H. Hennig. A Diffusive Homeostatic Signal Maintains Neural Heterogeneity and Responsiveness in Cortical Networks. PLoS Comput Biol, 11(7):e1004389, July 2015.

Christian Tetzlaff, Christoph Kolodziejski, Marc Timme, and Florentin Wörgötter. Analysis of synaptic scaling in combination with Hebbian plasticity in several simple networks. Front Comput Neurosci, 6:36, 2012.

Taro Toyoizumi, Megumi Kaneko, Michael P. Stryker, and Kenneth D. Miller. Modeling the Dynamic Interaction of Hebbian and Homeostatic Plasticity. Neuron, 84(2):497–510, October 2014.

Misha Tsodyks, Klaus Pawelzik, and Henry Markram. Neural Networks with Dynamic Synapses. Neural Comput, 10(4):821–835, 1998.

Misha V. Tsodyks and Henry Markram. The neural code between neocortical pyramidal neurons depends on neurotransmitter release probability. PNAS, 94(2):719–723, January 1997.

Gina Turrigiano. Too Many Cooks? Intrinsic and Synaptic Homeostatic Mechanisms in Cortical Circuit Refinement. Annu Rev Neurosci, 34(1):89–103, 2011.

Gina G. Turrigiano. Homeostatic Synaptic Plasticity: Local and Global Mechanisms for Stabilizing Neuronal Function. Cold Spring Harb Perspect Biol, 4(1):a005736, January 2012.

Gina G Turrigiano and Sacha B Nelson. Hebb and homeostasis in neuronal plasticity. Current Opinion in Neurobiology, 10(3):358–364, June 2000.

Gina G. Turrigiano and Sacha B. Nelson. Homeostatic plasticity in the developing nervous system. Nat Rev Neurosci, 5(2):97–107, February 2004.

Gina G. Turrigiano, Kenneth R. Leslie, Niraj S. Desai, Lana C. Rutherford, and Sacha B. Nelson. Activity-dependent scaling of quantal amplitude in neocortical neurons. Nature, 391(6670):892–896, February 1998.

M. C. W. van Rossum, G. Q. Bi, and G. G. Turrigiano. Stable Hebbian Learning from Spike Timing-Dependent Plasticity. J. Neurosci., 20(23):8812–8821, December 2000.

C. van Vreeswijk and H. Sompolinsky. Chaos in Neuronal Networks with Balanced Excitatory and Inhibitory Activity. Science, 274(5293):1724–1726, December 1996.

Tim P. Vogels, Henning Sprekeler, Friedemann Zenke, Claudia Clopath, and Wulfram Gerstner. Inhibitory Plasticity Balances Excitation and Inhibition in Sensory Pathways and Memory Networks. Science, 334(6062):1569–1573, December 2011.

Tim P. Vogels, Robert C. Froemke, Nicolas Doyon, Matthieu Gilson, Julie S. Haas, Robert Liu, Arianna Maffei, Paul Miller, Corette Wierenga, Melanie A. Woodin, Friedemann Zenke, and Henning Sprekeler. Inhibitory Synaptic Plasticity - Spike timing dependence and putative network function. Front Neural Circuits, 7(119), 2013.

Christoph von der Malsburg. Self-organization of orientation sensitive cells in the striate cortex. Kybernetik, 14(2):85–100, December 1973.

D. J. Willshaw, O. P. Bunemann, and H. C. Longuet-Higgins. Non-holographic associative memory. Nature, 222:960–962, 1969.

David J. Willshaw and Christoph von der Malsburg. How patterned neural connections can be set up by self-organization. Proceedings of the Royal Society of London B: Biological Sciences, 194(1117):431–445, 1976.

Fraser A. W. Wilson, Séamas P. Ó Scalaidhe, and Patricia S. Goldman-Rakic. Dissociation of Object and Spatial Processing Domains in Primate Prefrontal Cortex. Science, 260(5116):1955–1958, June 1993.

Zhihua Wu and Yoko Yamaguchi. Conserving total synaptic weight ensures one-trial sequence learning of place fields in the hippocampus. Neural Networks, 19(5):547–563, June 2006.

Friedemann Zenke. Memory formation and recall in recurrent spiking neural networks. PhD thesis, École polytechnique fédérale de Lausanne EPFL, Lausanne, Switzerland, 2014.

Friedemann Zenke, Guillaume Hennequin, and Wulfram Gerstner. Synaptic Plasticity in Neural Networks Needs Homeostasis with a Fast Rate Detector. PLoS Comput Biol, 9(11):e1003330, November 2013.

Friedemann Zenke, Everton J. Agnes, and Wulfram Gerstner. Diverse synaptic plasticity mechanisms orchestrated to form and retrieve memories in spiking neural networks. Nat Commun, 6, April 2015.

Qiang Zhou, Huizhong W. Tao, and Mu-ming Poo. Reversal and Stabilization of Synaptic Modifications in a Developing Visual System. Science, 300(5627):1953–1957, June 2003.

Lorric Ziegler, Friedemann Zenke, David B. Kastner, and Wulfram Gerstner. Synaptic Consolidation: From Synapses to Behavioral Modeling. J Neurosci, 35(3):1319–1334, January 2015.
