Abstract
Brain-inspired computing proposes a set of algorithmic principles that hold promise for advancing artificial intelligence. They endow systems with self-learning capabilities, efficient energy usage, and high storage capacity. A core concept that lies at the heart of brain computation is sequence learning and prediction. This form of computation is essential for almost all our daily tasks such as movement generation, perception, and language. Understanding how the brain performs such a computation is not only important to advance neuroscience, but also to pave the way to new technological brain-inspired applications. A previously developed spiking neural network implementation of sequence prediction and recall learns complex, high-order sequences in an unsupervised manner by means of local, biologically inspired plasticity rules. An emerging type of hardware that may efficiently run this type of algorithm is neuromorphic hardware. It emulates the way the brain processes information and maps neurons and synapses directly into a physical substrate. Memristive devices have been identified as potential synaptic elements in neuromorphic hardware. In particular, redox-induced resistive random access memory (ReRAM) devices stand out in many respects. They permit scalability, are energy efficient and fast, and can implement biological plasticity rules. In this work, we study the feasibility of using ReRAM devices as a replacement for the biological synapses in the sequence learning model. We implement and simulate the model, including the ReRAM plasticity, using the neural network simulator NEST. We investigate two types of ReRAM memristive devices: (i) a gradual, analog switching device, and (ii) an abrupt, binary switching device. We study the effect of different device properties on the performance characteristics of the sequence learning model and demonstrate that, in contrast to many other artificial neural networks, this architecture is resilient with respect to changes in the on-off ratio and the conductance resolution, device variability, and device failure.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
In many everyday tasks, such as learning, recognizing, or predicting objects in a noisy environment, the brain outperforms conventional computing systems and deep learning algorithms in many respects: it has a higher capacity to generalize, can learn from small numbers of training examples, is robust with respect to perturbations and failure, and is highly resource and energy efficient. To achieve this performance, it uses intricate biological mechanisms and principles. Understanding these principles is essential for driving new advances in neuroscience and for developing new real-world applications. For instance, it is known that biological neural networks are highly sparse in activity and connectivity and that they can self-organize in response to incoming sensory stimuli using unsupervised local learning rules. A number of biologically inspired algorithms relying on these principles have been developed for sequence prediction and replay (Lazar et al 2009, Hawkins and Ahmad 2016, Bouhadjar et al 2019, 2022), pattern recognition (Masquelier and Thorpe 2007, Payeur et al 2021), and decision making (Neftci and Averbeck 2019). The spiking temporal memory (spiking TM) network proposed by Bouhadjar et al (2022) learns high-order sequences in an unsupervised, continuous manner using local learning rules. Owing to its highly sparse activity and connectivity, it provides an energy-efficient sequence learning and prediction mechanism. After learning, the network successfully predicts and recalls complex sequences in a context-specific manner, and signals anomalies in the data.
The spiking TM algorithm was originally implemented using the neural network simulator NEST (Gewaltig and Diesmann 2007). While NEST provides a simulation platform optimized for running large-scale networks efficiently in a reproducible manner, it is executed on standard von-Neumann-type computers, i.e. on hardware that is not specifically optimized for neuromorphic computing. This results in performance limitations, as the simulation time and the energy dissipation become substantial for brain-scale neural networks (Kunkel et al 2014, Jordan et al 2018). For using spiking TM in edge-computing applications, more efficient hardware is therefore required. Neuromorphic hardware offers a potential solution to the high demands imposed by the natural-density connectivity of the brain and the resulting communication load. This is achieved through dedicated solutions and specific circuit blocks that emulate neuron and synapse functionalities (Burr et al 2016, Xia and Yang 2019, Marković et al 2020, Zhu et al 2020). The local learning rules and the sparse neuronal activation of the spiking TM model allow for an efficient mapping of the algorithm onto neuromorphic hardware.
Memristive devices have been suggested as components in such hardware (Yang et al 2013, Ielmini and Wong 2018, Yu 2018). They can be used to emulate certain synaptic functionalities using only a single device, replacing more complex complementary metal-oxide-semiconductor (CMOS) based circuits (Waser et al 2009, Dittmann and Strachan 2019), and can thus provide more energy-efficient computing in edge applications (Xia and Yang 2019). Their intrinsic dynamics capture characteristics similar to those of biological synapses, such as variability, weight dependence of the update, and non-volatility. A particular type of memristive device is the valence change memory (VCM) ReRAM device (Waser 2012b). The device conductivity can be strengthened (i.e. potentiated) or weakened (i.e. depressed) by means of an applied voltage pulse. Depending on the initial resistance range and the voltage pulse amplitude and width, a VCM ReRAM device can operate in two different modes, i.e. binary or analog (Cüppers et al 2019). In the analog mode, the applied pulses result in a gradual, monotonous change of the device conductance, for both potentiation and depression. This operation mode can be used to implement electrically adjustable resistors, for example in analog electronic systems, as well as spike-timing-dependent plasticity (STDP) type learning rules (Feldman 2012). It is, however, characterized by a limited conductance range, and the device switching characteristics may slowly drift away from the analog behavior toward a more abrupt conductivity change. In the binary mode, the conductivity can only be switched between two values, the low conductance state (LCS) and the high conductance state (HCS). The switching between these two states occurs abruptly. In previous works, the abrupt, binary switching was achieved using single program pulses with a sufficiently large amplitude (Cüppers et al 2019). In contrast, here we study the switching behavior of the device in response to a certain number of pulses of smaller amplitude. In response to these pulses, an internal state variable NVO gradually increases (Fleck et al 2016). Only when NVO exceeds a certain threshold value is a thermal runaway condition reached, resulting in an abrupt switching event. Due to intrinsic ReRAM device variabilities (Fantini et al 2013), the number of pulses needed to reach this thermal runaway condition shows a strong device-to-device and cycle-to-cycle variation. During depression, the switching is intrinsically more gradual, due to the lack of an internal runaway mechanism as present for the potentiation operation. Adding a series resistance (inside or outside the device) can provide such a runaway mechanism also in the RESET case, owing to a voltage divider effect (Hardtdegen et al 2018). Hence, in both cases, the switching behavior can be summarized as follows: at first, only a gradual change of the internal state variable NVO is observed, associated with only a minor change of the device conductivity, followed by a strong switching effect when the internal state variable reaches a certain threshold (Suri et al 2013, Doevenspeck et al 2018, Yu 2018, Zhao et al 2019). This operation mode is of particular interest for this study, as it is similar to the structural STDP plasticity discussed and implemented in the original spiking TM model (Bouhadjar et al 2022).
In this work, we investigate how the intrinsic potentiation and depression characteristics of memristive devices influence the learning of the model in (Bouhadjar et al 2022). To this end, we adapt the original neuroscientific synapse model to accommodate memristive-type potentiation and depression characteristics. The performance of the system is assessed by varying device characteristics such as conductance values and ranges, the granularity of the conductance change, and the device variability. We investigate these for both the analog and the binary operation modes.
2. Results
2.1. A model of a ReRAM synapse
In this section, we introduce our model of the ReRAM device and its control circuitry, and characterize the resulting model dynamics.
The conductance of ReRAM devices can be potentiated or depressed, mimicking the plasticity observed in biological synapses. While single memristive devices may readily emulate the inference function, they cannot on their own emulate plasticity rules such as STDP or homeostatic control. The change of the memristive conductivity depends on the momentary voltage difference between its two terminals, and the device has no memory of past spike events at either of its terminals nor of their relative timing. Hebbian learning such as STDP can therefore only be emulated with a memristive device by 'reshaping' the pre- and postsynaptic spike events into complex voltage pulses, so that the spike-time dependency is translated into a desired instantaneous voltage difference across the device (Zamarreño-Ramos et al 2011, Wang et al 2015). As a result, the learning rule is controlled outside the actual device (see figure 1). Instead of using complex voltage pulse shapes, it is more efficient to let a controller generate simple rectangular voltage pulses that effectuate the desired change of the device conductance in a more energy-efficient and more reliable way. The change of the device conductivity as a function of the number of applied voltage pulses can then be seen as an intrinsic plasticity curve of the device, where the actual pulse shape can be optimized toward desired potentiation and depression characteristics.
Previous studies suggested both physics-based and phenomenological models for VCM-type ReRAMs. Physics-based models such as the JART model (Bengel et al 2020) capture detailed physical characteristics and predict the specific experimental behavior of the devices. However, they require long simulation times and can lead to convergence issues. The more phenomenological models, on the other hand, give a high-level description of the operational characteristics, have good accuracy, are computationally less demanding, and can hence be combined with large-scale network models. In this study, we opt for a phenomenological model to implement both the analog and the binary ReRAM device.
The conductivity of the device (i.e. the synaptic weight) is either potentiated or depressed following learning rules similar to those outlined in the spiking TM model (Bouhadjar et al 2022). The learning rules are implemented by the control circuit (figure 1) as follows: the synapse is depressed slightly at every presynaptic spike and potentiated if a postsynaptic spike follows a presynaptic spike. In contrast to the original spiking TM model, synapses are potentiated by a fixed amount irrespective of the relative timing between the pre- and postsynaptic spikes. The potentiation is, however, enabled only if the time difference between pre- and postsynaptic spikes falls within a prescribed interval. This prevents synchronously firing neurons from connecting to each other and leads to improved training (Bouhadjar et al 2022). The control circuit further implements a homeostatic control mechanism (see section 2.2): if the neuronal firing rate exceeds a certain threshold, the potentiation is disabled and an additional depression update is applied instead.
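As an illustration of this control logic, the following Python sketch maps one presynaptic spike event to the pulse commands described above; all names and numbers are illustrative and are not taken from the original implementation.

```python
def plasticity_commands(dt_post_pre, window, dap_rate, dap_target):
    """Return the pulse commands issued by the control circuit at one presynaptic spike.

    dt_post_pre : lag of the following postsynaptic spike relative to the
                  presynaptic spike, or None if no postsynaptic spike followed.
    window      : (lag_min, lag_max), interval of lags for which potentiation
                  is enabled (hypothetical names; values are not from the paper).
    """
    commands = ["depress"]                 # slight depression at every presynaptic spike
    if dap_rate > dap_target:              # homeostatic control: activity too high
        commands.append("depress")         # potentiation disabled, extra depression instead
    elif dt_post_pre is not None and window[0] <= dt_post_pre <= window[1]:
        commands.append("potentiate")      # fixed-size potentiation, independent of exact lag
    return commands


# Example: a postsynaptic spike 2 ms after the presynaptic spike, low dAP rate
print(plasticity_commands(2.0, (0.5, 5.0), dap_rate=1.0, dap_target=1.8))
# -> ['depress', 'potentiate']
```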
In the analog mode, the increment in the conductivity Gij of the device connecting a presynaptic neuron j to a postsynaptic neuron i following a potentiation or a depression event is modeled as in (Gütig et al 2003, Fusi and Abbott 2007), but with an additional additive noise Xij (equation (1)). For each synapse and for each update, the noise is randomly and independently drawn from a normal distribution with zero mean and fixed standard deviation. The conductance Gij evolves between a lower bound Gmin and an upper bound Gmax and is clipped at these boundaries; the update is parameterized by potentiation and depression learning rates and by weight dependence exponents. The conductance changes linearly with the internal state variable NVO, so no explicit specification of the internal state variable is necessary. The initial conductance of every new device is drawn from a uniform distribution between a minimal and a maximal initial conductance.
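The update equation itself is not reproduced in this rendering. A plausible form, following the soft-bound model of Gütig et al (2003) and Fusi and Abbott (2007) with additive write noise (the symbols λ± and μ± are my own notation for the learning rates and weight dependence exponents), is:

```latex
\Delta G_{ij}^{+} = \lambda^{+}\,(G_{\mathrm{max}} - G_{ij})^{\mu^{+}} + X_{ij}
\quad \text{(potentiation)}, \qquad
\Delta G_{ij}^{-} = -\lambda^{-}\,(G_{ij} - G_{\mathrm{min}})^{\mu^{-}} + X_{ij}
\quad \text{(depression)}
```

For exponents equal to zero, the updates of this assumed form become weight independent.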
For the binary switching behavior, we use a model similar to the structural STDP model proposed by Bouhadjar et al (2022). The switching of the conductance between the LCS and the HCS is controlled by a permanence Pij, which plays the role of the internal state variable NVO. If the permanence is above the maturity threshold θP, the conductance Gij is set to Gmax, otherwise it is set to Gmin (equation (2)).
At each potentiation or depression step, the permanence Pij of the synapse is incremented by an amount analogous to the conductance increment of the analog synapse (equation (3)). The permanence evolves between a lower bound and an upper bound Pmax and is clipped at these boundaries. While the maximum permanence Pmax is identical for all synapses, the minimal permanences and the minimal conductances are uniformly distributed within fixed intervals.
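A compact sketch of both synapse models, under the same soft-bound assumption as above (function and parameter names are mine, not from the original code):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def update_analog(G, potentiate, lam_pot, lam_dep, mu_pot, mu_dep,
                  G_min, G_max, sigma_write):
    """Analog ReRAM synapse: soft-bound conductance update with write noise (sketch)."""
    noise = rng.normal(0.0, sigma_write)
    if potentiate:
        dG = lam_pot * (G_max - G) ** mu_pot + noise
    else:
        dG = -lam_dep * (G - G_min) ** mu_dep + noise
    return float(np.clip(G + dG, G_min, G_max))

def update_binary(P, potentiate, lam_pot, lam_dep, mu_pot, mu_dep,
                  P_min, P_max, theta_P, G_min, G_max, sigma_write):
    """Binary ReRAM synapse: the permanence P evolves like the analog conductance;
    the conductance snaps to G_max once P crosses the maturity threshold theta_P."""
    noise = rng.normal(0.0, sigma_write)
    if potentiate:
        dP = lam_pot * (P_max - P) ** mu_pot + noise
    else:
        dP = -lam_dep * (P - P_min) ** mu_dep + noise
    P = float(np.clip(P + dP, P_min, P_max))
    G = G_max if P >= theta_P else G_min
    return P, G
```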
In addition to the write noise introduced by the variable Xij, both the analog and the binary synapse models incorporate a read noise. At each presynaptic spike of neuron j, a noisy component Z is added to the synaptic current of neuron i, where Z is randomly and independently drawn from a normal distribution with zero mean and fixed standard deviation, and Vread is the applied read voltage. In the remainder of this article, the conductance incorporating both the read and the write noise is used; it is clipped at zero if it becomes negative. In sections 2.3.3 and 3, we motivate these different types of noise from both the hardware and the biological point of view.
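The corresponding current equation is not reproduced here; one plausible reading of the description above (my notation) is that, at a presynaptic spike of neuron j, the synapse contributes a current

```latex
I_{ij} = \left(G_{ij} + Z_{ij}\right) V_{\mathrm{read}}
```

i.e. the read noise acts as a momentary perturbation of the conductance seen by the read voltage, whereas the write noise Xij permanently alters the stored conductance.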
Figure 2 shows exemplary switching behavior of the analog and binary synapse models for a specific set of parameters using 100 consecutive potentiation (i.e. SET) and depression (i.e. RESET) updates. We choose different learning rates for the two types of devices such that they switch from the LCS to the HCS (and back) after about the same number of updates.
Learning in the spiking TM model is governed by a homeostatic form of STDP. Each presynaptic spike triggers a small decrease in the synaptic weight (depression). If this presynaptic spike is immediately followed by a postsynaptic spike, this weight decrease is overwritten by a larger weight increase (potentiation). This implementation ensures that synapses are potentiated only if a presynaptic spike is immediately followed by a postsynaptic spike. Presynaptic firing without subsequent postsynaptic firing weakens the synapse. While the potentiation is required to form sequence-specific subnetworks, the depression is important to prune unused connections and thereby helps to establish sparsity and context specificity. Under normal operation, a potentiation update is hence always accompanied by a small amount of depression (see figure S1 in the supplementary materials). In the case of the analog synapse, the total synaptic growth in the absence of noise is therefore governed by the combined effect of the potentiation and the accompanying depression update (equation (5)).
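Under the soft-bound form assumed in section 2.1, this net change per potentiation event (without noise) would read

```latex
\Delta G_{ij} = \lambda^{+}\,(G_{\mathrm{max}} - G_{ij})^{\mu^{+}}
              - \lambda^{-}\,(G_{ij} - G_{\mathrm{min}})^{\mu^{-}}
```

and setting this expression to zero yields the fixed point discussed next.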
The stationary solution of the device conductance (the fixed point), obtained by setting the net conductance change to zero, is always below the maximum conductance Gmax (see figure S1(b) in the supplementary materials). The permanence of the binary synapse is subject to this effect, too: after a number of potentiation steps, it reaches a value smaller than Pmax (see figure S1(c) in the supplementary materials). According to equation (2), the conductance can nevertheless still assume Gmax. Only if the depression is too strong may the device fail to reach the maturity threshold θP, and thus not switch to the HCS.
In the next sections, we evaluate how different characteristics of the analog and the binary switching dynamics, such as the weight dependence of the device update, the conductance range (Gmin, Gmax), the learning rates, and the write and read variability, affect the learning process of the spiking TM model.
2.2. A spiking neural network with ReRAM synapses is successful at sequence prediction
Sequence learning and prediction are principal computations performed by the brain and have a number of potential technological applications. The spiking temporal memory (spiking TM) model (Bouhadjar et al 2022) is a brain-inspired network performing this type of computation. In this section, we use the ReRAM device dynamics introduced above to replace the original synaptic model and evaluate the resulting network performance on a sequence prediction task.
We briefly describe here the main mechanisms and principles of the spiking TM model; for an in-depth analysis, we refer readers to (Bouhadjar et al 2022). The model is composed of populations of excitatory ('E') and inhibitory ('I') neurons, which are randomly and sparsely connected. Excitatory neurons are organized into M distinct subpopulations, where the neurons in each subpopulation represent a specific sequence element and exhibit a shared stimulus preference (figure 3(a)). Excitatory neurons are recurrently connected to the inhibitory neurons, implementing a winner-take-all (WTA) mechanism. We model neurons using leaky integrate-and-fire dynamics. Excitatory neurons are additionally equipped with nonlinear dendrites mimicking dendritic action potentials (dAPs). We model the dAPs as follows: if the dendritic current exceeds a threshold, it is instantly set and clamped to the dAP plateau current IdAP for a fixed duration. The dAP threshold is chosen such that the co-activation of γ presynaptic neurons reliably triggers a dAP in the target neuron (equation (6)).
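The threshold equation itself is not reproduced in this rendering; a minimal sketch, assuming the dendritic current contributed by γ coactive presynaptic neurons is the product of the conductance of a matured synapse (denoted Ŵ here, my notation) and the read voltage, is

```latex
\theta_{\mathrm{dAP}} \approx \gamma \, \hat{W} \, V_{\mathrm{read}}
```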
In the case of the analog synapse, the matured conductance entering this threshold condition is taken to be the steady-state conductance, and in the case of the binary synapse, it is taken to be Gmax. In addition to the dendritic input, the excitatory neurons receive inputs from external and inhibitory sources; inhibitory neurons receive only excitatory inputs. The synapses between excitatory neurons are plastic and evolve according to the analog or the binary ReRAM model described in section 2.1. A homeostatic component further controls the synaptic growth: if the dAP activity, i.e. the number of generated dAPs in a certain time window, is above a target value, the potentiation is disabled and a depression pulse is applied instead (see section 5).
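A minimal sketch of the dAP mechanism described above (state handling and names are mine):

```python
def effective_dendritic_current(I_dend, t, dap_state, theta_dAP, I_dAP, tau_dAP):
    """Clamp the dendritic current to the plateau value I_dAP for a duration
    tau_dAP once it crosses the threshold theta_dAP (sketch)."""
    if dap_state.get("end") is not None and t < dap_state["end"]:
        return I_dAP                       # ongoing dAP: current clamped to plateau
    if I_dend >= theta_dAP:                # threshold crossing triggers a new dAP
        dap_state["end"] = t + tau_dAP
        return I_dAP
    dap_state["end"] = None
    return I_dend


state = {"end": None}
print(effective_dendritic_current(I_dend=120.0, t=10.0, dap_state=state,
                                  theta_dAP=100.0, I_dAP=200.0, tau_dAP=60.0))  # -> 200.0
```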
During the learning process, the network is repeatedly presented with a given ensemble of sequences. Before learning, presenting a sequence element causes all neurons in the respective subpopulation to fire, except for the subpopulation representing the first sequence element, where only a random subset of neurons is activated. The repeated presentation of the sequences strengthens the connections between the subpopulations representing subsequently presented elements. After sufficient learning, the activation of a subpopulation by an external input causes a specific subset of neurons in the following subpopulation to generate dAPs, resulting in a long-lasting depolarization of the somata. Neurons that generate dAPs signal the anticipated sequence element and are thus referred to as predictive neurons. When receiving an external input, predictive neurons fire earlier than non-predictive neurons. If a certain subpopulation contains a sufficient number of predictive neurons, their advanced spikes initiate fast and strong inhibitory feedback to the entire subpopulation, ultimately suppressing the firing of the non-predictive neurons. The randomness in the connectivity, supplemented by the homeostatic control, enables the generation of sequence-specific sparse connectivity patterns between subsequently activated neuronal subpopulations (figures 3(a) and (b)). For each pair of sequence elements in a given sequence ensemble, there is a unique set of postsynaptic neurons generating dAPs. Consequently, after learning, in response to the presentation of a sequence element, the network predicts the next element in the sequence in a context-dependent manner by activating the dAPs of the corresponding subpopulation.
Here, we study the prediction performance of the network with either the binary or the analog ReRAM synapses (figure 4). We use the synaptic parameters fitted to the exemplary data discussed in section 2.1. To quantify the sequence prediction performance, we repetitively stimulate the network using the same set of sequences {A,D,B,E,I}, {F,D,B,E,C}, {H,L,J,K,D}, {G,L,J,K,E} and assess the prediction error by comparing the anticipated next sequence element with the correct one (Bouhadjar et al 2022). To ensure that the performance results are not specific to a single network, the evaluation is repeated for a number of randomly instantiated network realizations with different initial connectivities. After each new network instantiation, the initial prediction error is 1 (figure 4). With an increasing number of training episodes, the prediction error decreases to zero for both the binary and the analog synapses, as both networks learn the sequences and develop context-dependent pathways between successive sequence elements.
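As a sketch of this evaluation (names are illustrative; the exact error measure of Bouhadjar et al (2022) distinguishes false positives and false negatives in more detail), the error can be computed as the fraction of wrongly anticipated transitions:

```python
def prediction_error(sequences, predicted_next):
    """Fraction of transitions for which the anticipated next element is wrong.

    predicted_next[(s, i)] is the element anticipated by the network after
    presenting element i of sequence s, derived from the subpopulation that
    contains predictive (dAP-active) neurons.
    """
    errors, total = 0, 0
    for s, seq in enumerate(sequences):
        for i in range(len(seq) - 1):
            total += 1
            if predicted_next.get((s, i)) != seq[i + 1]:
                errors += 1
    return errors / total


sequences = [list("ADBEI"), list("FDBEC"), list("HLJKD"), list("GLJKE")]
# A perfectly trained network anticipates every transition correctly:
perfect = {(s, i): seq[i + 1]
           for s, seq in enumerate(sequences) for i in range(len(seq) - 1)}
print(prediction_error(sequences, perfect))  # -> 0.0
```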
2.3. Influence of device characteristics on prediction performance
ReRAM devices reported in the literature exhibit various non-idealities, including (1) limited precision, i.e. a limited number of synaptic levels; (2) a limited dynamic range; (3) dependence of the synaptic updates on the weight, also known as synaptic nonlinearity; and (4) device variability, including read and write variability (see Zhao et al 2020 for an overview). In this section, we study how these non-idealities affect the prediction performance of the spiking TM model.
2.3.1. High prediction performance obtained for a broad range of on-off ratios and learning rates
The dynamic range is defined as the on-off ratio between the maximum (Gmax) and the minimum conductance (Gmin). Most ReRAM devices exhibit an on-off ratio in the range of 2 to 10⁴ (Hong et al 2018). Between the minimum and the maximum conductance, the synapse can evolve with different potentiation and depression learning rates. Here, we investigate the influence of different on-off ratios and learning rates on the prediction performance. Note that the learning rates directly influence the number of synaptic levels: higher rates cause the conductance to transition more rapidly from the LCS to the HCS, resulting in fewer intermediate levels.
We first evaluate how the asymmetry between the potentiation and depression learning rates affects the prediction performance. To study this effect, we fix one of the learning rates and vary the other, with the weight dependence exponents set to zero. The prediction error remains high if the depression is too strong relative to the potentiation (see figure S2 in the supplementary materials). This is due to the plasticity dynamics of the spiking TM model: the potentiation operation is applied only when a postsynaptic spike follows a presynaptic spike, whereas the depression (RESET) operation is applied every time the presynaptic neuron generates a spike. Therefore, for effective synaptic growth, the potentiation needs to be stronger than the depression.
We next vary the on-off ratio between 5 and 40 by keeping Gmin fixed and varying Gmax. As Gmin is drawn from a uniform distribution, we compute the on-off ratio with respect to the mean of this distribution. According to equation (6), a change in Gmax is accompanied by a change in the dAP threshold. In addition, we vary the learning rate between 0.02 and 0.42 (figure 5). Parameters such as the read and write variability and the weight dependence exponents are taken from the exemplary data presented in section 2.1. We study the influence of the variability and of the weight dependence of the synaptic updates more systematically in the following sections. Successful learning is obtained for on-off ratios above 10 and 5 and for learning rates below 0.26 and 0.34 for the networks with analog and binary synapses, respectively (figures 5(a) and (c)). For larger learning rates, the prediction performance becomes less stable, with occasional failures for some network realizations. While decreasing the learning rate yields a minimal prediction error, the number of episodes to solution (i.e. the learning speed, see (Bouhadjar et al 2022)) increases, as either the conductances or the permanences need more learning steps to reach their maximum values (figures 5(b) and (d)). For our choice of parameters, learning in the network with binary synapses is slightly faster because, for identical learning rates, the number of update steps required to switch from the LCS to the HCS is lower for the binary device than for the analog device.
In general, the on-off ratio in the spiking TM network is constrained for the following reason: the transition of the network activity from being initially non-sparse to becoming sparse after learning requires small initial conductances, to avoid spurious activation of the dAPs, but high conductances after learning, to allow the sparse set of active neurons to generate dAPs reliably. If the on-off ratio is too small, this distinction between high and low conductances cannot be realized. Moreover, for successful learning, the network with analog synapses requires a higher minimal on-off ratio than the network with binary synapses. This is due to the effect described in section 2.1 below equation (5), which prevents the conductance from reaching Gmax and thereby reduces the effective on-off ratio. The learning mechanisms of the spiking TM model also limit the range of possible learning rates. Increasing the learning rate bears the risk that a large fraction of neurons reaches the dAP threshold at the same time. The WTA mechanism then selects all neurons that generate dAPs to become active. This leads to a loss of sparseness, which impairs the prediction performance. Decreasing the learning rate considerably is not ideal either, as the network would learn very slowly.
2.3.2. Resilience against weight-dependent updates
The conductance of realistic analog ReRAM devices grows or decays in a nonlinear manner as a function of the number of potentiation or depression update steps. The synapse model in section 2.1 captures this effect by the weight dependence exponents. During potentiation, the conductance tends to change rapidly at the beginning but saturates toward the end of the process (see figure 6(a)). Similar behavior is also observed during depression (RESET). The potentiation and depression updates have, however, different dependencies on the device conductance: for high conductances, the potentiation increments are much smaller than the depression decrements. This asymmetry can be further enhanced if the learning rates differ between the potentiation and depression operations. Similarly, it is reasonable to assume that for the binary synapses the evolution of the permanence exhibits a weight dependence and an asymmetry between the potentiation and depression dynamics (figure 6(d)).
Here, we assess the prediction performance as a function of the weight dependence exponents for both potentiation and depression (figure 6). For most exponent combinations studied here, the prediction error is low and varies only mildly with the exponents (figures 6(b) and (e)). For larger values of the potentiation exponent, the learning slows down, as it takes longer for either the conductance or the permanence to reach its maximum value (see figures 6(c) and (f)). Adjusting the depression exponent such that the depression becomes weaker compared to the potentiation makes learning faster again. In the binary case, the steady-state permanence may end up below the maturity threshold, such that the synapses can mature only due to the noise. The learning is therefore slowed down for large values of the potentiation exponent, or is even unsuccessful if the devices do not switch to the HCS. In the model, the maturity threshold θP could be adjusted to the steady-state permanence (similarly to adjusting the dAP threshold to the steady-state conductance in the analog synapse; see above). In this case, learning in the analog and the binary networks may be similarly fast. In the physical device, however, the maturity threshold can hardly be changed.
2.3.3. Resilience against read and write variability
The resistive switching (i.e. write) process of ReRAM devices involves the drift and diffusion of oxygen vacancies. This phenomenon is highly stochastic and shows considerable variation from device to device, and even from pulse to pulse within a single device (Zhao et al 2020). Furthermore, even when no switching occurs, the oxygen vacancies exhibit random microscopic displacements, resulting in read variability. In our work, we capture these effects by the read and write variability introduced in section 2.1. The influence of the read and write variability on the conductance curves is illustrated for both the analog and the binary synapses in figure 7. Across trials, the write variability results in different conductance trajectories, whereas the read variability causes only a jitter in the conductance curves.
To study how the variability influences the prediction performance, we assess the prediction error and the episodes to solution for different magnitudes of the read and write variability. Networks with analog and with binary synapses tolerate similar read and write noise levels, with the binary synapse being slightly more resilient to read noise (figures 7(c) and (g)). In both cases, the write noise is more detrimental, as it accumulates across learning episodes and can therefore have a stronger impact on the learning performance. The read noise tends to average out, as it is independent across learning episodes. Overall, increasing the read or write variability beyond what is tolerable leads to spurious activation of the dAPs, i.e. spurious predictions, and a decline in the prediction performance. The learning speed (episodes to solution) varies only slightly within the parameter region where learning is successful (figures 7(d) and (h)).
2.3.4. Robustness with respect to device failure
During operation, ReRAM devices risk failing by becoming trapped in the HCS, even when voltage pulses of appropriate magnitude are applied across them (Kumar et al 2017). To study how device failure affects the prediction performance, we first train the network until it reaches zero prediction error (after 150 episodes in figure 8). Then, the conductance of a random fraction of synapses is set to the HCS. We quantify the level of device failure by the ratio between the number of failed synapses and the total number of synapses. In the spiking TM model, a neuron may falsely generate a dAP if a sufficient number of its synapses are randomly switched to the HCS (this number can be approximated by the ratio between the dAP threshold and the maximum conductance Gmax). This may result in false positives and thus in an increase in the prediction error. This is confirmed by our results presented in figures 8(a) and (b): up to a certain fraction of failed devices, no impact on the prediction performance is observed; beyond this fraction, the performance of the network declines and does not recover.
In a second experiment, instead of switching a random selection of synapses to the HCS, we switch them to the LCS. For the different levels of device failure, the performance of the network initially declines: due to the failing synapses, which are stuck at the LCS, the neurons in certain subpopulations do not receive enough current and are thus not able to generate dAPs, i.e. to make predictions. After further training episodes, the prediction error converges back to zero, as the network relearns using other synapses (figures 8(c) and (d)). Beyond a certain fraction of failed devices, however, the performance does not recover, due to the absence of alternative connections for forming sequence-specific pathways.
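The two failure experiments can be emulated by clamping a random fraction of the trained conductances after training, as in the following sketch (array shapes and values are placeholders):

```python
import numpy as np

def inject_device_failure(G, failure_ratio, mode, G_min, G_max, rng):
    """Clamp a random fraction of synapses to the HCS or the LCS.

    G : matrix of trained conductances; failed devices no longer respond to
        programming pulses in the experiments described above.
    """
    G = G.copy()
    failed = rng.random(G.shape) < failure_ratio        # which devices fail
    G[failed] = G_max if mode == "stuck_at_HCS" else G_min
    return G, failed


rng = np.random.default_rng(0)
G_trained = rng.uniform(0.1, 1.0, size=(200, 200))      # placeholder trained conductances
G_hcs, _ = inject_device_failure(G_trained, 0.2, "stuck_at_HCS", 0.1, 1.0, rng)
G_lcs, _ = inject_device_failure(G_trained, 0.2, "stuck_at_LCS", 0.1, 1.0, rng)
```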
3. Discussion
3.1. Summary
In this work, we demonstrate that the learning rules of the spiking temporal memory (spiking TM) model proposed by Bouhadjar et al (2022) can be realized using memristive dynamics. We investigate this for a particular type of memristive device known as VCM ReRAM (Waser 2012a). We show that the spiking TM model retains high prediction performance for a broad range of on-off ratios and learning rates. The model is resilient to write and read variability as well as to the dependence of the synaptic updates on the weight. Moreover, our results show that the VCM-type ReRAM device can be operated in either the binary or the gradual switching regime without performance loss. We note only slight differences: the network with binary synapses is more resilient to read noise and requires a smaller synaptic on-off ratio, whereas the analog synapse is more robust to larger values of the weight dependence exponents. The ability of the network to retain successful performance with both types of synapses (for a broad range of parameters) is in line with the original spiking TM implementation (Bouhadjar et al 2022), which shows that the learning rule can be implemented either using structural plasticity, where the weight changes abruptly between two levels, or using a conventional form of STDP, where the weight changes gradually. This suggests that the intrinsic dynamics of the VCM ReRAM not only capture properties of biological synapses, such as the variability and the weight dependence of the synaptic updates, but can also implement known forms of plasticity from the neuroscientific literature. Our study thereby contributes to establishing a dynamical and functional correspondence between biological synapses and memristive devices.
3.2. Relationship to previous models
In artificial neural networks trained by gradient-based approaches, ReRAM non-idealities can severely undermine the overall performance (Fouda et al 2020). Due to the ReRAM variability, devices can hardly be programmed to a desired state, and the asymmetry in the conductance change can affect the propagation of the gradient and lead to performance loss. Correcting for these non-idealities can be costly and may require additional circuitry (Chen et al 2015, Agarwal et al 2016, Ambrogio et al 2018, Hong et al 2018, Yu 2018, Adnan et al 2021). Biological neuronal networks, by contrast, carry out accurate computations despite non-ideal synaptic characteristics such as variability. This suggests the existence of biological principles accommodating these non-idealities, which need to be understood and transferred in order to implement neuromorphic hardware successfully. The spiking TM and other brain-inspired self-organizing networks (Lazar et al 2009, Yi et al 2022) suggest a set of biological concepts that might be at the heart of the brain's processing capabilities. For instance, the highly sparse connectivity and activity of the spiking TM are observed in biological networks, and they are essential for increasing the capacity of the system and decreasing energy consumption.
There are a number of biologically motivated sequence learning models that are closely related to the spiking TM, such as the self-organizing recurrent neural network model (SORN, Lazar et al 2009). Recent work incorporated memristive dynamics into the synapses and neurons of the SORN model and showed that it retains successful performance (Payvand et al 2022). The authors studied the role of variability and showed that it can improve the prediction performance. However, the other memristive non-idealities were not studied systematically. It also remains to be investigated whether that model can learn high-order sequences similar to the ones presented in our work.
3.3. Outlook
Neuromorphic hardware that relies on components implemented in the analog domain is noisy and heterogeneous, similar to real brains (Zhu et al 2020). To date, it remains speculative how the brain achieves sensible and reliable behavior in the face of these imperfections. By using neuromorphic hardware as a test substrate, we expect to gain new insights into the neuronal principles that solve this issue. In this study, it is apparent that ReRAM devices share characteristics with biological synapses, including the weight dependence of the synaptic updates, a limited dynamic range, and variability. Through work on these neuromorphic systems, we can develop intuitions about how the biological synapse exploits these different characteristics. For instance, in biology, the write noise represents the variability in the synaptic weight change following a pre- and/or postsynaptic spike, such as the variability in postsynaptic receptor synthesis. The read noise, instead, refers to the momentary variability in the postsynaptic response amplitude, which is, for example, caused by variability in the amount of neurotransmitter released by individual presynaptic spikes. Note that the write noise accumulates over time, whereas the read noise affects the synaptic weight (conductance) only during the presynaptic spike. In neuroscience studies, synaptic stochasticity (including synaptic failure) is typically regarded as read noise, and the role of the write noise is often ignored. So far, it is not clear how these different characteristics contribute to the learning dynamics in the biological system. Neuromorphic hardware can provide an environment in which this question can be studied.
In this work, we show that the model is resilient to synaptic variability. Other works show that synaptic variability can even have a computational benefit (Dalgaty et al 2021). For example, in probabilistic computing frameworks, variability is considered a prerequisite for efficient probabilistic inference (Buesing et al 2011, Suri et al 2013, Maass 2014, Neftci et al 2016, Dutta et al 2022). It allows the system to explore the state space and to estimate how likely each solution is. Similarly, a recent extension of the spiking TM model shows that the model can learn to replay probabilistic sequences using noise (Bouhadjar et al 2023). That study demonstrated that, in a network context, variability can only serve exploration if it is locally correlated and hence not averaged out. Whether synaptic or memristive variability could contribute to such a form of noise remains the subject of future studies.
Ultimately, the goal is to implement the spiking TM model on a standalone neuromorphic chip. In this work, we only investigate how the intrinsic properties of the memristive device affect learning in the spiking TM model. A study by Siegel et al (2023a) provided a specific instance of an electronic circuit design implementing the different components of the spiking TM and showed in simulations that the system supports successful prediction performance. A follow-up study by Siegel et al (2023b) taped out a memristive synaptic array for a simplified spiking TM model and showed successful performance on a simple sequence learning problem. Future work needs to scale up the array size and the task difficulty.
4. Conclusion
Neuromorphic hardware centered around memristive devices is a potential hardware substrate for the efficient execution of machine learning algorithms. Memristive devices are, however, characterized by non-idealities such as variability, nonlinearity, and finite precision, which have been shown to degrade the performance of machine learning models. Addressing these non-idealities can be expensive and may necessitate the integration of additional circuitry.
This work demonstrates that a bio-inspired sequence learning model is robust with respect to the non-idealities exhibited by memristive devices (and biological synapses). The model learns complex sequences in an unsupervised manner using biologically inspired, local learning rules. It can be implemented with both analog and binary memristive synapses, and shows high performance even in the presence of memristive variability, nonlinearity, and finite precision. It thereby lays the foundation for novel algorithmic principles that can be implemented in edge applications.
5. Methods
Table 1 provides an overview of the network model, the training protocol, and the simulation details. Parameter values can be found in table 2. See (Bouhadjar et al 2022) for a detailed description of the model.
5.1. Model tables
Table 1. Description of the network model. Parameter values are given in table 2.
Summary | |
---|---|
Populations | Excitatory neurons, inhibitory neurons, external spike sources; the excitatory and inhibitory populations are each composed of M disjoint subpopulations |
Connectivity | See 'Connectivity' below |
Neuron model | See 'Neuron and synapse' below |
Synapse model | Exponential postsynaptic currents (PSCs) |
Plasticity | Homeostatic spike-timing dependent plasticity in excitatory-to-excitatory connections |
Populations | ||
---|---|---|
Name | Elements | Size |
Excitatory (E) neurons | ||
Inhibitory (I) neurons | ||
Excitatory neurons in subpopulation k, | ||
Inhibitory neurons in subpopulation k, | ||
External spike sources | M |
Connectivity | ||
---|---|---|
Source population | Target population | Pattern |
Random; fixed in-degrees , delays , synaptic time constants plastic weights (; ' connections') | ||
All-to-all; fixed delays , synaptic time constants , and weights (; ' connections') | ||
All-to-all; fixed delays , synaptic time constants , and weights (; ' connections') | ||
None (; ' connections') | ||
One-to-all; fixed delays , synaptic time constants , and weights (; ' connections') | ||
No self-connections ('autapses'), no multiple connections ('multapses') | ||
All unmentioned connections are absent |
Neuron and synapse | |
---|---|
Neuron | |
Type | Leaky integrate-and-fire (LIF) dynamics |
Description | Leaky integrate-and-fire dynamics of the membrane potential of neuron i, with spike emission and reset when the somatic spike threshold is crossed |
Synapse | |
Type | Exponential or alpha-shaped postsynaptic currents (PSCs) |
Description | |
Plasticity | |
---|---|
Type | Hebbian-type plasticity and dAP-rate homeostasis |
EE synapses | • Hebbian plasticity described in section 2.1 |
 | • homeostatic control: |
 | • if the dAP trace of the postsynaptic neuron exceeds the target dAP activity: a depression pulse is applied (see equation (1) or equation (3)) |
 | • otherwise: a potentiation pulse is applied (see equation (1) or equation (3)) |
 | with the dAP trace and the target dAP activity |
 | • dAP trace of postsynaptic neuron i: exponentially decaying trace with homeostasis time constant, incremented at the onset time of each dAP |
All other synapses | Non-plastic |
Input |
---|
• Repetitive stimulation of the network using the same set of sequences of ordered discrete items, with number of sequences S and length Ci of the ith sequence |
• Presentation of a sequence element modeled by a single spike generated by the corresponding external source xk |
• Inter-stimulus interval between subsequent sequence elements within a sequence si |
• Inter-sequence time interval between subsequent sequences si and si+1 |
• Example sequence sets: |
 • Sequence set: {{A,D,B,E,I}, {F,D,B,E,C}, {H,L,J,K,D}, {G,L,J,K,E}} |
Output |
• Somatic spike times |
• Dendritic currents |
Initial conditions and network realizations |
• Membrane potentials |
• Dendritic currents |
• External currents |
• Inhibitory currents |
• Excitatory currents |
• Synaptic permanences |
• Synaptic weights (analog synapse) |
• Synaptic weights (binary synapse) |
• Spike traces |
• dAP traces |
• Potential connectivity and initial permanences randomly and independently drawn for each network realization |
Simulation details |
• Network simulations performed in NEST (Gewaltig and Diesmann 2007) version 3.0 (Hahne et al 2021) |
• Definition of the excitatory neuron model using NESTML (Plotnikov et al 2016, Nagendra Babu et al 2021) |
• Synchronous update using exact integration of the system dynamics on a discrete-time grid (Rotter and Diesmann 1999) |
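As a minimal illustration of the simulation setup (the actual scripts use a custom NESTML-generated excitatory neuron model and plastic ReRAM-like synapses; the model names, weights, and delays below are placeholders), a NEST 3.x session follows the usual create-connect-simulate pattern:

```python
import nest

nest.ResetKernel()
nest.SetKernelStatus({"resolution": 0.1})   # discrete-time grid in ms (exact integration)

# Placeholder neuron models standing in for the NESTML-generated excitatory model
exc = nest.Create("iaf_psc_exp", 1800)      # excitatory population (size from table 2)
inh = nest.Create("iaf_psc_exp", 12)        # inhibitory population (size from table 2)

# Sparse random E->E connectivity with fixed in-degree, no autapses or multapses
nest.Connect(exc, exc,
             conn_spec={"rule": "fixed_indegree", "indegree": 450,
                        "allow_autapses": False, "allow_multapses": False},
             syn_spec={"synapse_model": "static_synapse",
                       "weight": 1.0, "delay": 2.0})     # placeholder values

nest.Simulate(1000.0)                        # simulate 1 s of biological time
```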
5.2. Model and simulation parameters
Table 2. Model and simulation parameters.
Name | Value | Description |
---|---|---|
Network | ||
1800 | Total number of excitatory neurons | |
12 | Total number of inhibitory neurons | |
M | 12 | Number of excitatory subpopulations ( = number of external spike sources) |
Number of excitatory neurons per subpopulation | ||
Number of inhibitory neurons per subpopulation | ||
ρ | 20 | (target) number of active neurons per subpopulation after learning = minimal number of coincident excitatory inputs required to trigger a spike in postsynaptic inhibitory neurons |
(Potential) Connectivity | ||
450 | Number of excitatory inputs per excitatory neuron ( in-degree) | |
p | Probability of potential (excitatory) connections | |
Number of inhibitory inputs per excitatory neuron ( in-degree) | ||
Number of excitatory inputs per inhibitory neuron ( in-degree) | ||
0 | Number of inhibitory inputs per inhibitory neuron ( in-degree) | |
Excitatory neurons | ||
Membrane time constant | ||
Absolute refractory period | ||
Membrane capacity | ||
Reset potential | ||
Somatic spike threshold | ||
IdAP | dAP current plateau amplitude | |
dAP duration | ||
see equation (6) | dAP threshold | |
Inhibitory neurons | ||
Membrane time constant | ||
Absolute refractory period | ||
Membrane capacity | ||
Reset potential | ||
Spike threshold | ||
Synapse | ||
Weight of IE connections (EPSP amplitude) | ||
Weight of IE connections (EPSC amplitude) | ||
Weight of EI connections (IPSP amplitude) | ||
Weight of EI connections (IPSC amplitude) | ||
Weight of EX connections (EPSP amplitude) | ||
Weight of EX connections (EPSC amplitude) | ||
Synaptic time constant of EE connections | ||
Synaptic time constant of IE connections | ||
Synaptic time constant of EI connections | ||
Synaptic time constant of EX connection | ||
Delay of EE connections (dendritic) | ||
Delay of IE connections | ||
Delay of EI connections | ||
Delay of EX connections | ||
Vread | Read voltage | |
Plasticity | ||
(analog synapse), (binary synapse) | Potentiation learning rate |
Depression rate | ||
β | Ratio between depression and potentiation learning rates | |
Homeostasis rate | ||
Weight dependence (potentiation) exponent (default parameter) | ||
Weight dependence (depression) exponent (default parameter) | ||
θP | 10 | Synapse maturity threshold |
Minimum permanence | ||
Minimum conductance | ||
Gmax | Maximum conductance | |
Minimal initial conductance | ||
Maximal initial conductance | ||
0 | Minimal initial permanence | |
8 | Maximal initial permanence | |
Read noise | ||
Write noise | ||
1.8 | Target dAP activity | |
Homeostasis time constant | ||
Minimum time lag between pairs of pre- and postsynaptic spikes at which synapses are potentiated | ||
Maximum time lag between pairs of pre- and postsynaptic spikes at which synapses are potentiated | ||
Input | ||
S | 4 | Number of sequences per set |
C | 5 | Number of characters per sequence |
A | 12 | Alphabet length |
Inter-stimulus interval | ||
Inter-sequence interval | ||
Simulation | ||
Time resolution | ||
K | {200, 400} | Number of training episodes |
Data availability statement
The data cannot be made publicly available upon publication because they are not available in a format that is sufficiently accessible or reusable by other researchers. The data that support the findings of this study are available upon reasonable request from the authors. The source code to reproduce the data is available at https://doi.org/10.5281/zenodo.6754964.
Funding
This Project was supported by the German Federal Ministry of Education and Research (BMBF) under Grant Numbers 16ME0398K and 16ME0399 (NEUROTEC), by the Helmholtz Association Initiative and Networking Fund under Project Number SO-092 (Advanced Computing Architectures, ACA), and by the European Union's Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 785907 (Human Brain Project SGA2) and No. 945539 (Human Brain Project SGA3).
Supplementary data (0.2 MB PDF)