Tunable synaptic working memory with volatile memristive devices

Saverio Ricci; David Kappel; Christian Tetzlaff; Daniele Ielmini; Erika Covi

doi:10.1088/2634-4386/ad01d6

1. Introduction

The short-term storage of information, one of the main properties of working memory (WM), is at the forefront of most cognitive abilities in humans and animals [1, 2]. Tasks that involve WM include visual processing, speech comprehension, and episodic planning; thus impairment of WM results in a partial or complete loss of these abilities [3, 4]. Several models have been suggested to explain the physiological mechanisms that underlie the WM in the brain. Today it is becoming increasingly evident that a mixture of specialized mechanisms enable an ensemble of versatile memory systems in the brain that cover memory duration from several seconds to minutes [4, 5].

One such mechanism for WM is based on the short-term dynamics of synapses, i.e., short-term plasticity (STP). Figure 1 illustrates the synaptic model of WM. When a new memory item is stored in the WM network, an ensemble of neurons that represents the item is activated and the synaptic strengths between these neurons are transiently potentiated. This transient increase in synaptic strength is caused by short-term mechanisms that put the synapse in a state of higher effectiveness [2, 5]. This state of high effectiveness helps the system to retrieve the stored memory item. The decay of the synaptic strengths and, thus, of the memory items makes the WM receptive to new content.

**Figure 1.** Conceptual illustration of information storage in working memory. The neural network can store and recall features of an item. The volatile nature of the synapses allows the memory of the stored features to fade in time. After depletion of the memory, the features of a different item can be stored without significant interference.

Various mechanisms for WM have also been introduced in modern machine learning (ML) to improve the performance in complex sequence processing tasks, including natural language [6–9]. Indeed, similar to WM, recurrent neural networks (RNNs) can perform associative memory and recall stored information thanks to internal feedback loops that ensure the persistence of data. In recent studies, recurrence was shown to be a crucial aspect in, e.g., grammar learning [10]. In contrast to the feedforward architectures that are often used in ML, RNNs have the desirable computational properties needed to represent long-term dependencies in sequential data, due to their computational load [11, 12].

Network architectures such as long short-term memory and gated recurrent units try to optimize the use of hardware resources while still exploiting recurrence [13]. However, these solutions are still expensive in terms of hardware area and number of operations. Bespoke hardware that optimizes resources in terms of power consumption, area, and computational workload could therefore facilitate the implementation of RNNs. At present, complementary metal–oxide–semiconductor (CMOS)-only solutions include standard digital hardware, e.g. graphics processing units and field-programmable gate arrays, as well as analog or mixed-signal application-specific integrated circuits (ASICs). Digital circuits usually imply a heavy computational workload and the need to wait for several clock cycles [14], while ASICs are more energy efficient because they adopt subthreshold circuits working asynchronously [15]. However, the temporal dynamics are usually implemented by charging or discharging a capacitor with a constant current [16], thus consuming power. Furthermore, when dealing with biologically relevant time scales, the size of the capacitors becomes non-negligible.

Alternative approaches with better scaling and lower energy consumption are actively sought to enable neuromorphic circuits with bio-realistic WM [17]. In this respect, a promising technology is represented by memristive devices [18], namely two-terminal devices able to reversibly change their conductance upon the application of proper electrical stimuli. Memristive devices show a broad range of attractive properties, including high scalability, high read/program speed, high energy efficiency, and programming voltages comparable with the power supply of typical neuromorphic chips [19, 20]. While non-volatile memristive devices have already shown promising results when used as non-volatile synapses, volatile memristive devices that can keep track of recent neural activity and implement synapses with biologically compatible time constants are still largely unexplored. Indeed, volatile properties have so far mainly been investigated in reservoir computing applications [21–23].

In this work, we use C/HfO$_2$/Ag memristive devices. These devices feature a high ON/OFF ratio of 10⁸ and tunable time constants in the range of milliseconds to seconds, as shown in [24, 25], which are two desirable features for our application. The switching mechanism relies on the formation and spontaneous dissolution of a silver conductive filament (CF) across the switching layer. The relaxation time, or retention time, i.e., the time it takes for the CF to dissolve, has been shown to depend on the diameter of the CF itself: the thicker the filament, the longer the retention time [26–28]. The advantages of this technology are manifold: the retention times are electrically tunable and the time information is physically located in the CF, thus consuming negligible power and area on the chip. Moreover, the probability of switching the device on is also electrically tunable, which adds a further degree of freedom in controlling the dynamics of the WM. These features are exploited to demonstrate the store and recall of patterns in small-scale WM hardware; the system is then scaled up in simulations to demonstrate the use of volatile memristive devices in a large-scale biologically inspired model of synaptic WM and in an associative symbolic WM.

2. Materials and methods

2.1. Device fabrication

The memristive devices used in this study are fabricated on top of foundry-based MOS transistors, in a one-transistor/one-resistor configuration, which allows the control of the compliance current $I_\mathrm{C}$. The bottom electrode (BE) consists of a 70 nm × 70 nm graphitic carbon pillar, which has already been demonstrated to be a good electrode material for both volatile and non-volatile devices thanks to its stability and inert behavior [25]. The oxide layer and the top electrode (TE) are fabricated by e-beam evaporation. The oxide is a 10 nm HfO$_2$ active layer and the TE is a 100 nm thick Ag layer. Both HfO$_2$ and Ag are deposited at room temperature and without breaking the vacuum (pressure $3\times10^{-6}$ mbar).

2.2. Electrical setup for device characterization

The electrical setup allows the device to be connected either to a semiconductor device parameter analyzer (DC characterization) or to a waveform generator and an oscilloscope (pulsed characterization) through a switch matrix. An example of this configuration has been shown in [25]. DC characterization is carried out using an HP 4156C semiconductor device parameter analyzer. A sweep from 0 V to 1.5 V is applied to the TE of the device while the BE is grounded. The characterization includes DC sweeps with different compliance currents ($I_\mathrm{C}$), as in figure S1(a). The pulsed characterization as well as the WM experiment are carried out using a TTI TGA12104 arbitrary waveform generator. The voltage waveforms are applied to the TE. To measure the current, a LeCroy Waverunner 640Zi oscilloscope is connected at the BE side, and the voltage drop across a 50 Ω series resistance is probed. MATLAB® software is used for the data analysis and the control of the instruments.

The temporal dynamics are studied by first applying a semi-triangular pulse with 10 ms pulse duration and 5 V amplitude to induce the filament formation and then monitoring the state of the filament with a constant −150 mV bias. To tune the temporal behavior, $I_\mathrm{C}$ is changed from 10 µA to 70 µA. For each $I_\mathrm{C}$, 100 experiments are carried out. To avoid possible interference due to the previous cycles, the value of $I_\mathrm{C}$ is changed in random order.

The switching probability (figure 2(D)) is studied by applying 100 pulses for each combination of voltage amplitude and pulse duration. The conditions are applied in random order. The impact of the number of pulses (figures 2(E) and (F)) is analyzed with the same methodology, selecting the order of the combinations of voltage amplitude and number of pulses randomly. Each combination is applied 100 times.

**Figure 2.** Ag-based volatile memristive device characterization. (A) Sketch of the one-transistor/one-resistor (1T1R) RRAM device together with its working principle. The memristive device (1R) is based on a W/C/10 nm HfO$_2$/Ag stack. The RRAM shows a volatile behavior, i.e. a set operation together with a spontaneous switch off. (B) Time characterization of the retention of the filament. After a 5 V amplitude triangular pulse to switch the cell on with $I_\mathrm{CC}$ = 20 µA, a constant reading voltage of −150 mV is applied to monitor the retention. (C) Retention time distributions at different compliance currents. The median value of the retention time increases with the compliance current. Inset: applied programming pulse. (D) Switching probability of the device for a single pulse as a function of the amplitude and the pulse duration of the programming pulse. Shorter pulses require higher voltage amplitudes to switch ON. (E) Probability of switching the RRAM to the LRS depending on the number of pulses and their amplitude (1 ms pulses). The circles are the experimental data while the solid lines are the fits. (F) Effect of the number of pulses and the voltage amplitude (1.6 V) on the switching of the device. The switching of the device is stochastic. Considering a group (burst) of pulses, the probability that the device is in the LRS inside the group increases.

For the WM store and recall experiment, each probability–frequency–pattern condition is repeated ten times. The voltage bias is applied only at the end of the experiments to check the retention capabilities.

2.3. Fitting of device features

Owing to the stochasticity of the switching mechanism, the probability $P_\mathrm{ON}$ of switching with a single pulse of given amplitude (figure 2(D)) is well described by the cumulative distribution function of a normal distribution and is therefore fit with:

$\begin{equation} P_\mathrm{ON}\left(V\right) = \frac{1}{2}\left(1+\mathrm{erf}\left(\frac{V-\mu}{\sigma\sqrt{2}}\right)\right) \end{equation} \tag{ 1 }$

where the mean µ and the standard deviation σ are fitting parameters that depend on the device and on the pulse duration. Table 1 collects the values for the device presented in figure 2(D). Each point in the figure corresponds to 100 measurements.

Table 1. Values of µ and σ calculated for different pulse time widths.

| Time width (ms) | Mean µ (V) | Standard deviation σ (V) |
|---|---|---|
| 0.05 | 2.31 | 0.38 |
| 0.10 | 2.11 | 0.33 |
| 0.15 | 1.86 | 0.30 |
| 0.50 | 1.73 | 0.22 |
| 1.00 | 1.21 | 0.16 |
| 2.00 | 0.61 | 0.15 |
| 5.00 | 0.59 | 0.11 |
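As a minimal sketch of how equation (1) and table 1 can be used together, the following Python snippet evaluates $P_\mathrm{ON}(V)$ for a given pulse width. The dictionary `TABLE_1` and the function name `p_on` are illustrative names and are not taken from the published code; the numerical values are copied from table 1.

```python
import math

# (mu, sigma) in volts, keyed by pulse width in ms; values copied from table 1.
TABLE_1 = {
    0.05: (2.31, 0.38),
    0.10: (2.11, 0.33),
    0.15: (1.86, 0.30),
    0.50: (1.73, 0.22),
    1.00: (1.21, 0.16),
    2.00: (0.61, 0.15),
    5.00: (0.59, 0.11),
}


def p_on(voltage, width_ms):
    """Single-pulse switching probability according to equation (1)."""
    mu, sigma = TABLE_1[width_ms]
    return 0.5 * (1.0 + math.erf((voltage - mu) / (sigma * math.sqrt(2.0))))
```

By construction, a pulse whose amplitude equals µ for its width switches the device with probability 0.5, and the probability grows monotonically with the amplitude.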

The switching probability, for a given amplitude, as a function of the number of pulses is fit with the mathematical model:

$\begin{equation} P_\mathrm{ON}\left(N\right) = 1 - \left(1 - P_\mathrm{ON}\left(1\right)\right)^N\quad \mathrm{where} \quad P_\mathrm{ON}\left(1\right) = P_\mathrm{ON}\left(V\right) \end{equation} \tag{ 2 }$
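Equation (2) has the form expected when each pulse of a burst switches the device independently with the single-pulse probability. Under that reading, a short Python sketch can evaluate the burst probability and invert it to estimate how many pulses are needed to reach a target probability. The function names `p_on_burst` and `pulses_needed` are hypothetical helpers introduced here for illustration.

```python
import math


def p_on_burst(p_single, n_pulses):
    """Equation (2): probability of at least one switching event in n pulses."""
    return 1.0 - (1.0 - p_single) ** n_pulses


def pulses_needed(p_single, target):
    """Smallest N with P_ON(N) >= target, obtained by inverting equation (2)."""
    if not 0.0 < p_single < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))
```

For example, `pulses_needed(0.05, 0.9)` estimates the burst length required to switch a device with 90% probability when a single pulse succeeds in 5% of the cases.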

In the WM store and recall experiments, unless otherwise noted, the following conditions apply. No read voltage is applied between spikes during the store and recall phases. A read voltage of −150 mV is applied at the end of the experiment. The current compliance is $I_\mathrm{C}$ = 17 µA, which gives an average retention time of 28 ms.

2.4. STP synapse model

To perform the computer simulations, we developed a simplified phenomenological model that qualitatively reproduces the behavior of the memristive devices. The model captures the switching probabilities and the variable retention time of the devices. Each synapse $i$ was modeled with a binary internal state variable $x_i$ that denotes either the low ($x_i = 1$) or the high resistance state ($x_i = 0$). The device resistance was $r_1$ and $r_0$ in the low and high resistance state, respectively. To model switching probabilities, we assigned a parameter $\rho_i \in [0,1]$ to every synapse. Upon arrival of a pre-synaptic input spike, $x_i$ was set to 1 with probability $\rho_i$.

To model the trial-to-trial variability of the retention times, we adopted a lognormal distribution for the retention times $t_\mathrm{ret}$, namely:

$\begin{equation} t_\mathrm{ret} \;\sim\; \mathrm{Lognormal}\left( t_\mathrm{ret} \;|\; \mu_i, \sigma_i \right) = \frac{1}{\sqrt{2\pi}\,\sigma_i \, t_\mathrm{ret}} \, \exp \left( -\frac{\left(\log\left(t_\mathrm{ret}\right) - \mu_i\right)^2}{2\sigma_i^2} \right). \end{equation} \tag{ 3 }$

The parameters $\mu_i$ and $\sigma_i$ in equation (3) were adjusted to fit the device properties. Supplementary figure S14 shows example model fits to experimental data for different compliance currents. In the simulations we used $\mu_i = 7.24$ and $\sigma_i = 0.82$, if not stated otherwise, which corresponds to a mean retention time of ${\sim}1.5$ s. Whenever a synapse was set to the low resistance state ($x_i = 1$), the retention time $t_\mathrm{ret}$ was drawn from equation (3). Once the simulation time exceeded $t_\mathrm{on}+t_\mathrm{ret}$, where $t_\mathrm{on}$ is the time at which the synapse was switched on, the synapse spontaneously returned to the high resistance state.
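The synapse model described above can be sketched in a few lines of plain Python. This is an illustrative re-implementation, not the NEST synapse model used for the simulations: the class name `VolatileSynapse` and its method names are hypothetical, time is assumed to be expressed in milliseconds (in which case $\mathrm{e}^{7.24} \approx 1394$ ms is the median of equation (3)), and spikes arriving while the synapse is already in the low resistance state are assumed to leave the retention time unchanged.

```python
import math
import random


class VolatileSynapse:
    """Binary volatile synapse: x = 1 (low resistance) or x = 0 (high resistance)."""

    def __init__(self, rho, mu=7.24, sigma=0.82, rng=None):
        self.rho = rho        # switching probability per pre-synaptic spike
        self.mu = mu          # lognormal parameters of equation (3)
        self.sigma = sigma
        self.rng = rng or random.Random()
        self.x = 0            # start in the high resistance state
        self.t_off = 0.0      # time at which the low resistance state expires

    def update(self, t, spike):
        """Advance to time t (assumed ms); spike flags a pre-synaptic spike at t."""
        if self.x == 1 and t >= self.t_off:
            self.x = 0        # spontaneous return to the high resistance state
        # Assumption: only a synapse in the high resistance state can be switched.
        if spike and self.x == 0 and self.rng.random() < self.rho:
            self.x = 1
            self.t_off = t + self.rng.lognormvariate(self.mu, self.sigma)
        return self.x
```

Calling `update` once per simulation step with the current time and a spike flag returns the binary state, from which the conductance ($1/r_1$ or $1/r_0$) can be selected.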

2.5. WM network model

To reproduce the WM model introduced in [5], we used a recurrent network of 8000 excitatory and 2000 inhibitory neurons. Connection probabilities between these neurons were as in [5]. Supplementary figure S13 illustrates the detailed network structure, connection probabilities and baseline synaptic conductances. Each memory item was represented by a population of 800 excitatory neurons, which were randomly chosen from the recurrent network for each of the five memory items. Synaptic weights between these neurons were five times stronger (0.5 mV) than between other neurons (0.1 mV). All inhibitory synapses were static (no STP) and had a strength of −0.2 mV. To store and recall the WM, 1000 input neurons were set up for each memory item. Input neurons were only connected to one of the memory item populations in the network. Input neurons fired at low baseline firing rates of 0.1 Hz. To store a memory item, the firing rates of the input neurons corresponding to that memory item were elevated ten-fold. During recall, all input neurons were set to unspecific elevated activation.

All neurons used the leaky integrate and fire (LIF) model with biologically plausible parameters [29]. The LIF neuron is a spiking point neuron model that transiently integrates synaptic inputs using a leaky membrane potential. The membrane potential u(t) follows the dynamics

$\begin{equation} \frac{\mathrm{d}u}{\mathrm{d}t} = -\frac{1}{\tau_\mathrm{m}}\left(u\left(t\right)-u_0\right) +i\left(t\right), \end{equation} \tag{ 4 }$

where $u_0$ is the membrane resting potential, $\tau_\mathrm{m}$ is the membrane time constant, and $i(t)$ is the input into the neuron, denoting the summed effect of afferent synapses. If the membrane potential crosses a firing threshold $\vartheta$ at time $t^f$, a spike is emitted and the membrane potential is reset to $u_r$ immediately afterwards:

$\begin{equation} u\left(t^f+\Delta t\right) = u_r\;. \end{equation} \tag{ 5 }$

After a spike, the neuron is inactive for a brief refractory time. The firing threshold was set to 20 mV and the refractory time to 2 ms. Membrane time constants $\tau_\mathrm{m}$ were 15 ms and 10 ms, and resting potentials $u_0$ were 16 mV and 13 mV, for excitatory and inhibitory neurons, respectively. Independent unit-variance Gaussian noise with mean 0.5775 mV was injected into each excitatory neuron, and with mean 0.5275 mV into each inhibitory neuron. As in [5], the mean of the Gaussian noise was precisely tuned to enable multistability in the network. Example network dynamics are shown in figure 4(D), for a store–recall experiment over 8 s.
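The dynamics of equations (4) and (5) can be illustrated with a forward-Euler sketch in Python. The defaults below use the excitatory-neuron parameters quoted above and the 1 ms time step of the simulations; the reset potential `u_reset` is not specified in the text and is set equal to the resting potential purely as an assumption, the function name `simulate_lif` is hypothetical, and the noise input is omitted. The actual simulations used the LIF implementation provided by NEST, which this sketch does not reproduce.

```python
def simulate_lif(inputs, tau_m=15.0, u0=16.0, theta=20.0, u_reset=16.0,
                 t_ref=2.0, dt=1.0):
    """Forward-Euler integration of equations (4) and (5).

    inputs: sequence with the input i(t) at each time step, in mV/ms.
    Returns the list of spike times in ms.
    """
    u = u0
    refractory_left = 0.0
    spikes = []
    for step, i_t in enumerate(inputs):
        if refractory_left > 0.0:
            refractory_left -= dt          # neuron is inactive after a spike
            continue
        u += dt * (-(u - u0) / tau_m + i_t)    # equation (4)
        if u >= theta:
            spikes.append(step * dt)
            u = u_reset                    # equation (5); u_reset is assumed
            refractory_left = t_ref
    return spikes
```

With a constant input, the membrane potential relaxes towards $u_0 + \tau_\mathrm{m} i$, so the neuron fires only if this value lies above the threshold.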

2.6. Associative symbolic WM model

For the associative WM model in figure 5, we developed a network architecture that was capable of storing bidirectional short-term associations between items of different categories. The associative memory model consisted of eight memory neurons that were connected through bidirectional associative connections, as illustrated in figure 5(A). The memory neurons encoded the three memory categories: shape, texture and color. Each memory neuron exclusively received input from one input neuron to trigger store and recall events. Memory neurons were implemented using the LIF neuron model, with a firing threshold of 1.4 V and a membrane time constant of 15 ms. Each group of neurons corresponding to one memory category was augmented with a single inhibitory neuron that provided lateral feedback to enforce mutually exclusive activation of one memory item at a time. Inhibitory neurons were LIF neurons with a threshold of 1.4 V and a membrane time constant of 10 ms. All excitatory connections had a strength of 1.5 V and inhibitory feedback was −1.5 V.

Model memristive devices were used inside the bidirectional connections to store the association between a specific pair of memory neurons (which we call here the specific neurons). Exactly one model device was used per association. Memristive devices were augmented with two auxiliary threshold gates to route the inputs and outputs during store and recall cycles. Gates were put in series with the devices, one before (input gate) and one after (output gate). Input gates received excitatory input from the two specific memory neurons the association corresponded to. In addition, input gates received inhibitory input from all other memory neurons. Output gates recursively connected back to specific memory neurons through excitatory connections. Storage of memory items was not dependent on the activity of the neurons and threshold gates (see figure 5), but was retained solely in the hidden state of the memristive devices. Model parameters of memristive devices were fit to experimental data (µ = 7.24, σ = 0.82, corresponding to ca. 1.4 s mean retention time). Switching probabilities ρ were 0.2. The synaptic conductance for the low resistance state was chosen to elicit a voltage pulse of 1.0 V in the post-synaptic neuron. The high resistance $r_0$ was set to be 20 times higher than the low resistance $r_1$.

2.7. Details to software simulations

Simulations of the WM model with volatile memristive devices were done in Python (3.8) using a custom implementation of the memristive synapse model that was developed for the NEST 2.14 simulation environment [30]. The simulation time step was 1 ms. Neuron dynamics were simulated using the current-based LIF model available in NEST. A custom implementation of the STP model outlined in section 2.4 was developed based on existing synapse models. Data analysis used the NumPy and Matplotlib Python packages in version 1.23.2 and 3.5.3, respectively. The code is available at the following link: https://gitlab.com/kappeld/stp-tunable-synaptic-working-memory.

3. Results

3.1. Volatile memristive device

The volatile synapse used in this work is a resistive switching C/HfO$_2$/Ag memristive device (see section 2 for details). After fabrication, the device is in its high-resistance state (HRS) and can be switched to the low-resistance state (LRS) by applying a quasi-static voltage sweep from 0 V to 1.5 V and back. Contrary to many filamentary devices, the proposed device does not require an electroforming operation (figure S1(a)), which simplifies the electrical operation within the neuromorphic circuit. The maximum current flowing through the device (i.e. the current compliance, $I_\mathrm{CC}$) is limited by applying a voltage to the gate of an NMOS transistor whose drain is connected to the bottom electrode of the memristive device (figure 2(A), inset). The switching to the LRS occurs because, as the voltage exceeds a given threshold voltage $V_\mathrm{T}$, Ag ions migrate across the oxide layer and form the CF, thus bringing the device into the LRS (set operation, figure 2(A)) [27, 31, 32]. However, the filament is sustained only in the presence of a voltage higher than a critical voltage referred to as the hold voltage $V_\mathrm{H}$. Below this value, the re-diffusion of the Ag atoms in the dielectric layer causes the self-disruption of the filament and, as a consequence, the device transitions back to the HRS (spontaneous recovery, figure 2(A)). These operations are fairly reproducible (figure S1(b)). Because the oxide/silver and oxide/carbon interfaces differ, switching is enabled only from the silver to the carbon, while no switching is reported from the carbon (see figure S2), since carbon is atomically stable and no other physical mechanisms are involved. The two main features that enable the use of the proposed memristive device as a volatile synapse in WM tasks are tunable volatility on biologically relevant time scales and controllable switching probability.

The retention time is shown in figure 2(B) and the median retention time increases exponentially with the current compliance, as illustrated in figure 2(C). Depending on $I_\mathrm{CC}$, the device can be tuned to remain in the LRS for a time ranging from milliseconds to seconds, which is the time scale of interest for our application. The intrinsic stochasticity of the filament formation at a microscopic level gives rise to high variability in the distributions of the retention times [33].

The switching probability ($P_\mathrm{ON}$) of the device can instead be controlled by changing either the width or the amplitude of the voltage pulse applied to the device, as shown in figure 2(D). It is worth noting that the device operates with voltages below 3 V, which are compatible with standard CMOS technology and therefore suitable for integration with neuromorphic circuits. Figure 2(E) shows the stochastic behavior of the device. Due to its stochastic nature, the threshold voltage of the device is slightly variable, which results in an electrically tunable switching probability: the higher the pulse voltage, the higher the switching probability [25]. Therefore, the desired switching probability of the device can be tuned by setting a suitable voltage amplitude. Moreover, when stimulated by a burst of identical pulses (figure S4), the switching probability of the device increases with the number of pulses, as demonstrated in figures 2(F) and S5. This effect can therefore be exploited to accelerate the storing or training phase in a network. The device proposed for the experiments also shows good stability in terms of cycle-to-cycle variability, although it requires an initial stabilization, or warm-up, of its properties (more details are provided in figure S3).

3.2. WM

3.2.1. Store and recall of features

The volatile behavior of the memristive device is used in a small-scale WM experiment to store and recall features, as depicted in figure 3(A). In our example, the network consists of five volatile synapses all connected to the same neuron (schematic in figure 3(B)). The retention time of the devices is set through the current compliance, which is obtained by applying a voltage to the gate of the transistors. The current flowing to the neuron is the sum of the currents flowing through the stimulated devices. The neuron is trained to recognize a color within a color stream. Each color is encoded by stimulating a different combination of three devices, as in figure 3(C). As a result, for each color only three devices out of five are switched on. At first, the target color is stored in the network by repeatedly stimulating the network with the combination of pulses that encodes the desired color. As a consequence, the corresponding stimulated devices are switched on and set to their LRS (more details about the impact of the different experimental parameters are provided in figure S6). Afterward, the network is fed with a stream of random stimuli, as shown in figure 3(D) (see also figures S7–S9 for more experimental traces). When the stored item is presented to the network, all the stimulated devices are in the LRS and therefore the current received by the neuron exceeds a threshold, thus triggering the firing of the neuron. The current threshold is set according to the current distributions shown in figure S10. After about 1 s without stimulation, all the devices switch off and the pattern is forgotten, as shown in figure S7. The correlation plot in figure 3(E) shows the difference between the expected and the measured current when a pattern is presented. The main contribution to the current is given by the synapses that are both stimulated and in the LRS, therefore three different current levels can be expected. The results show that an accuracy of more than 90% can be reached, where accuracy is defined as the percentage of correct classifications (stored/non-stored pattern) during a test sequence.
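The statement that three current levels are expected follows from counting: two patterns that each use three of five devices always share at least one device. The sketch below checks this by enumerating all such patterns in Python. It is an idealization that neglects the HRS current and assumes identical LRS currents, and it does not reproduce the specific color encoding of figure 3(C); the names `expected_level` and `recognized` are hypothetical.

```python
from itertools import combinations

N_DEVICES = 5
ACTIVE_PER_COLOR = 3

# Every way of stimulating three of the five devices (10 possible patterns).
patterns = [frozenset(c)
            for c in combinations(range(N_DEVICES), ACTIVE_PER_COLOR)]


def expected_level(stored, presented):
    """Number of devices that are both in the LRS (stored) and stimulated."""
    return len(stored & presented)


def recognized(stored, presented, threshold_level=ACTIVE_PER_COLOR):
    """Idealized neuron: fires only if all stimulated devices are in the LRS."""
    return expected_level(stored, presented) >= threshold_level


stored = patterns[0]
# Distinct current levels, in units of one LRS device current.
levels = sorted({expected_level(stored, p) for p in patterns})
```

In this idealized picture only the stored pattern reaches the highest level, which is why a single current threshold is sufficient for recognition as long as the stored devices remain in the LRS.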

**Figure 3.** WM store and recall experiment. (A) High-level sketch of the working memory. (B) Schematic of the WM implementation: five volatile 1T1R memristive devices are arranged in parallel configuration. The gate voltage is chosen to set $I_\mathrm{CC}$ = 17 µA, which corresponds to a retention time of 28 ms. (C) Color–pattern encoding. (D) Store and recall experiment. During the store phase, a single pattern is fed to the network. Top colored plot: input stimuli. For ease of visualization, each pattern is colored as the color it encodes. Black dots in the bottom part of the upper plot indicate the stored pattern. Bottom plot: measured current fed to the post-neuron. The current threshold for recognition is indicated as a dashed black horizontal line. The traces are cropped on the x-axis to better highlight the salient events. (E) Correlation plot between the expected and measured currents based on the difference between the presented and the stored pattern. Results obtained from ten different store and recall experiments with $P_\mathrm{ON}$ = 5% and stimulation frequency $f_\mathrm{stim}$ = 50 Hz. (F) Accuracy of the system in distinguishing the stored pattern under different stimulation and switching conditions. (G) Average current error, defined as the difference between the measured current and the expected current, during 100 patterns applied for the different conditions.

The retention time and the switching probability of each synaptic device are electrically tunable through the gate voltage of the transistor (which sets the current compliance) and the voltage amplitude of the pulse applied to the TE of the device, respectively. These features, together with the stimulation frequency of the network, determine the accuracy of the network in recognizing the color. As shown in figure 3(F), for low $P_\mathrm{ON}$ ($P_\mathrm{ON}$ = 5%), an increase of the spike rate leads to an improvement of the accuracy because the time between two pulses is shorter than the retention time of the device. Higher values of $P_\mathrm{ON}$ instead decrease the accuracy of the network because the probability that a non-active device is switched on is higher. Finally, the intrinsic volatility of the network results in a progressive forgetting of the stored element, which is needed in the WM to prevent the saturation of the network and, consequently, the inability to learn new experiences or recognize the ones already learned. Figure 3(G) gives an estimate of how long the network remembers by showing the average mismatch current, defined as the difference between the expected current and the measured current. The two main errors that can occur over time, i.e. the network either fails to recognize the correct color or recognizes the wrong one, depend on the combination of stimulation frequency and $P_\mathrm{ON}$. Indeed, for high $P_\mathrm{ON}$ and stimulation frequency, devices that were supposed to be in their HRS turn to their LRS, thus increasing the measured current. The opposite combination, i.e. low $P_\mathrm{ON}$ and stimulation frequency, results in the switching off of devices previously in the LRS, thus decreasing the measured current. The results suggest that a careful selection of $P_\mathrm{ON}$ based on the planned stimulation frequency is advisable to fine-tune the WM accuracy for a given task, as will be discussed further in section 4.

3.2.2. Biologically inspired model of synaptic WM

The experimental results on WM based on volatile memristive devices can be used to develop a WM model for the simulation of a larger-scale RNN. In [5], a model of WM in the brain is introduced that is based on STP combined with multi-stable network dynamics. We adapted this network model to implement the core dynamics of our volatile memristive synapses (figures 2 and 3) as a model of STP. Figure 4(A) illustrates our WM model, which is able to store a set of discrete patterns. For this, the network receives input from five input populations that represent the input patterns to be stored. A set of input neurons (illustrated by colored circles) is connected to the recurrent WM network (cloud) so that the memory items can be stored and reactivated. All neurons are spiking neurons, such that outputs are given by unitary events that are emitted when an internal membrane potential variable crosses the firing threshold (see section 2.5). The network has multi-stable dynamics in that neurons of a specific population excite each other and transiently strengthen synapses within the population through STP when activated. Inhibitory neurons are added to facilitate the network multi-stability (see section 2 for details). The experimental paradigm, the network architecture, and the synapse parameters are adjusted to the characteristics of our memristive devices. Retention time parameters are fit to the measured device characteristics.

**Figure 4.** Large-scale simulation of WM. (A) Illustration of the network model. Five different memory items (*A*, *B*, *C*, *D* and *E*) can be stored in a recurrent network of spiking neurons. Corresponding strongly connected populations within the network transiently store memory items after activation. (B) Network activity of the WM model. Black dots show individual spikes of input (top) and network (bottom) neurons. Multiple phases of store and recall are shown. Insets show average firing rates (spikes per second in Hz) over recall phases. Data obtained using a current compliance of 330 µA, corresponding to average retention times of 1.5 s, and a voltage amplitude of 0.5 V, corresponding to a switching probability of 5%, were used in this simulation. Network behavior using (C) different retention time distributions (change of compliance current) and (D) different switching probabilities (change of applied voltage amplitude).

Figure 4(B) shows typical behavior of the model over a simulation of several seconds. The spiking activity of the network and input neurons is shown. Pattern C is stored by strongly activating the corresponding input neurons. This triggers co-activation of the pool of WM neurons that encode pattern C. Through this activation, volatile memristive synapses are strengthened and cause a prominent response in WM neurons when a recall stimulus, which activates all input neurons at intermediate rates, is given after a timeout of 1 s. During this recall phase, neurons that encode pattern C show significantly increased activity. Two repeated recalls are shown and lead to reliable memory performance. In the timeout phase, WM neurons are almost perfectly silent, which is in line with the behavior of biological neurons. After the second recall phase, memory item C is forgotten and A can be stored without interference by strongly activating the pattern A input neurons. In a third recall period, A neurons but not C neurons are strongly activated.

The WM model requires time constants on the order of behaviorally relevant time scales ($\gt$ 500 ms to a few seconds) to function well. Figure 4(C) shows the memory recall performance of the network when different retention times are used. Recall performance is measured here as the signal-to-noise ratio (SNR) between the WM neurons. The SNR is computed as the mean population firing rate of the neurons specific to the memory item over the mean firing rate of non-specific neurons. Population firing rates were estimated over the total duration of the recall phase (figure 4(B) illustrates firing rate estimators). Retention times below 1 s lead to degraded performance because the firing activity of the neurons is significantly lower than the retention time. This result is in accordance with the experiments shown in figure 3. Retention times that are too short result in forgetting that is too fast, such that memory items cannot be reliably recalled. Switching probabilities also had to be finely tuned. Figure 4(D) shows the impact of switching probabilities on the recall performance. Switching probabilities of about 0.05 are found to work best for this memory store/recall protocol because they prevent an excessive activation of the volatile devices, which would prevent the correct storage of information, as also confirmed by the experimental results of figure 3. Switching probabilities that are too low prevent memory formation, whereas switching probabilities that are too high may result in unstable circuit dynamics. Thanks to the flexibility of our volatile memristive devices, these probabilities can be obtained with a suitable selection of the voltage amplitude and pulse time width, as shown in figure 2(D).
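
The SNR definition given above can be written down directly. The following is a minimal sketch, assuming spike counts per neuron over the recall phase are already available; the function name `recall_snr` and the dictionary-based input format are hypothetical and not taken from the simulation code used for figure 4.

```python
def recall_snr(spike_counts, item_neurons, recall_duration_s):
    """SNR = mean rate of item-specific neurons / mean rate of the others.

    spike_counts: dict mapping neuron id -> number of spikes emitted during
                  the recall phase (hypothetical input format).
    item_neurons: set of neuron ids that encode the stored memory item.
    recall_duration_s: total duration of the recall phase in seconds.
    """
    specific = [count / recall_duration_s
                for neuron, count in spike_counts.items()
                if neuron in item_neurons]
    non_specific = [count / recall_duration_s
                    for neuron, count in spike_counts.items()
                    if neuron not in item_neurons]
    mean_specific = sum(specific) / len(specific)
    mean_non_specific = sum(non_specific) / len(non_specific)
    if mean_non_specific == 0:
        # No background activity at all: the ratio is unbounded.
        return float("inf")
    return mean_specific / mean_non_specific


if __name__ == "__main__":
    counts = {0: 10, 1: 14, 2: 2, 3: 4}  # spikes during a 0.5 s recall phase
    print(recall_snr(counts, item_neurons={0, 1}, recall_duration_s=0.5))
```

In this example the item-specific neurons fire at 24 Hz on average and the others at 6 Hz, so the ratio is 4.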

3.2.3. Associative symbolic WM

Another important feature of WM is the ability to transiently form associations between properties, such as color or shape, to remember representations of real-world objects. Figure 5(A) illustrates this associative symbolic WM model. We investigated whether the model memristive devices can be used to form transient associations between symbolic items of different categories. We used a store-recall paradigm similar to that of figure 4, but here the memory item to store is given by an association between features that can be dynamically bound together. These features are thought to represent the state of physical objects that an associative memory network is able to perceive through a set of sensors. Concretely, we used features from three different stimulus categories: shape, texture and color (see figure 5). Objects are represented by jointly activating feature neurons in the memory network, e.g. a 'smooth, red cylinder' is represented by jointly activating the corresponding feature neurons in the network. STP model synapses as in figure 4 are used as short-term memory inside the bidirectional connections between the neurons.

**Figure 5.** Associative symbolic WM with volatile memristive devices. (A) Illustration of the network for associative symbolic WM. (B) Sequence of store and recall. Store/recall input (top row) and WM neuron output spikes (bottom row) are shown. Inserted pictograms represent the decoded objects and recall queries. Store/recall inputs had a one-to-one fixed connectivity to WM neurons. After storing an association it can be recalled by cuing the network with an arbitrary memory element. (C) Decoding error plotted as a function of time delay between store and first recall.

The network is able to form an autoassociative memory. After an association has been formed by presenting an object to the network, the features of that object can be recalled, after a brief time delay of length T_delay, by querying the network with only a subset of the features. Figure 5(B) shows one example store and recall sequence. After the object 'smooth, red cylinder' is stored, neurons representing its features can be re-activated by triggering only the 'cylinder' or 'smooth' neurons. After a new object ('smooth, blue cone') is stored, its representation can be retrieved by only activating the 'cone' neuron. The object representations are transiently stored and then slowly fade away. This is analyzed in figure 5(C), where we plot the decoding error as a function of T_delay. The memory can be reliably retrieved during a time window of 600 ms, corresponding in our devices to $I\mathrm{_{CC}}$ = 70 µA.
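
The store and cue-based recall sequence can be mimicked with a deliberately simplified toy, shown here only to make the idea of transient pairwise binding concrete. It is not the spiking network of figure 5: it assumes a fixed, deterministic retention window instead of stochastic device dynamics, and the class `TransientAssociativeMemory` and its methods are hypothetical names.

```python
from itertools import combinations


class TransientAssociativeMemory:
    """Toy autoassociative store with links that expire after a fixed time.

    Assumption (illustrative only): every pair of co-activated features is
    bound by one volatile link that is forgotten exactly retention_s seconds
    after the store event.
    """

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self.expiry = {}  # frozenset({a, b}) -> time at which the link fades

    def store(self, features, t):
        # Jointly activating the features binds every pair of them.
        for a, b in combinations(features, 2):
            self.expiry[frozenset((a, b))] = t + self.retention_s

    def recall(self, cue, t):
        # Return the cue plus every feature still linked to it at time t.
        recalled = {cue}
        for link, t_off in self.expiry.items():
            if cue in link and t < t_off:
                recalled |= link
        return recalled


if __name__ == "__main__":
    memory = TransientAssociativeMemory(retention_s=0.6)
    memory.store(["smooth", "red", "cylinder"], t=0.0)
    print(memory.recall("cylinder", t=0.3))  # association still present
    print(memory.recall("cylinder", t=1.0))  # association has faded
    memory.store(["smooth", "blue", "cone"], t=1.0)
    print(memory.recall("cone", t=1.2))
```

The 0.6 s retention value simply echoes the 600 ms window quoted above; in the actual model, retention is stochastic and set by the compliance current.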

4. Discussion

The emergence of a new class of volatile memristive devices featuring volatility in biologically-relevant time scales opens the possibility to implement in hardware neuromorphic systems that can inherently solve complex sequence processing tasks. The main advantage of the proposed technology lies in the storage of the information in the physical configuration of the nanoscale device, i.e. the CF inside the oxide layer of the memristive device. This way, a direct correlation of the retention time with the electrically tunable properties of the device is established.

The volatile properties of memristive devices have so far been exploited mainly in reservoir computing [21–23]. Other applications include selector devices and hardware security [34], while the exploration of the potential of volatile devices in systems requiring short-term memory is still in its infancy [35, 36]. Yet, the use of volatile devices in tasks such as WM is extremely advantageous from a hardware perspective. Indeed, the networks devoted to carrying out such tasks need the ability to forget the stored information over time; otherwise the network would quickly reach its maximum memory capacity and become unable to store new experiences. An implementation using non-volatile devices is thus feasible only together with the design of extra circuits that reset the devices, thus consuming extra area and power. Another solution to implement volatility is with capacitors. However, in addition to the much larger area that capacitors would require to implement the same time constants as the proposed volatile devices, the stochastic properties that contribute to the correct functioning of the network (see figure 4(D)) would have to be implemented by dedicated circuits.

In our work, we exploit the properties of Ag-based resistive switching devices to carry out WM tasks. We first characterize the device and assess its electrical properties, then we select the parameters to conduct a store and recall hardware experiment, where we demonstrate the ability of short-term storage of features and explore the performance of the system under a variety of stimulation conditions. We then use a biologically inspired synapse model that qualitatively reproduces short-term dynamics of STP with the properties of our memristive device [37, 38]. We demonstrate the functionality of this STP model by qualitatively reproducing the WM model of [5] in figure 4. This experiment demonstrates that STP based on volatile memristive device dynamics is able to install WM capability in a multi-stable network of spiking neurons. Memory items can be reliably retrieved seconds after storage and overwritten with new items on the same time scale. Furthermore, we demonstrate a symbolic associative WM model in figure 5, where we use the data of the device characterization to preserve the features of the device in terms of switching probability, retention time, and variability.

As shown in figure 3, the device parameters and the stimulation conditions are linked and therefore need to be matched in order to achieve successful network operation. During the experiments, the store phase exploits the properties of burst stimulation shown in figure 2(E), which in our case is beneficial because it shortens the store phase. Due to their stochasticity, the devices do not switch on together, as visible in figure 3(D); hence the duration of the store phase should take this aspect into consideration. In the case of two consecutive store phases, only the second element is actually safely stored in the network. Indeed, during the second store phase, the synapses common to both elements remain active, the ones that are no longer stimulated switch off, and the ones specific to the second element switch on. Changing $P\mathrm{_{ON}}$ has repercussions on the speed of the store process (figure S6), but also on the recall phase (figures 3(F) and (G)). Indeed, while an increase of $P\mathrm{_{ON}}$ is beneficial during store, it becomes detrimental during recall, since it results in the erroneous activation of one or more devices. A trade-off therefore exists between store speed and recall accuracy when setting the device's $P\mathrm{_{ON}}$.
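
To give a feel for how $P\mathrm{_{ON}}$ affects the required length of the store phase, the sketch below estimates how many pulses are needed before all devices of a pattern have switched on with a chosen confidence. It is a back-of-the-envelope model, not the analysis behind figure S6: it assumes independent pulses with a fixed switching probability, and it ignores both relaxation during the store phase and the burst effect of figure 2(E). The function names and the 90% confidence default are hypothetical.

```python
import math


def prob_all_on(p_on, n_pulses, n_devices):
    """Probability that every device switched on at least once in n_pulses.

    Assumes independent pulses and no relaxation during the store phase.
    """
    return (1.0 - (1.0 - p_on) ** n_pulses) ** n_devices


def pulses_for_store(p_on, n_devices, confidence=0.9):
    """Smallest pulse count for which all devices are ON with the given
    confidence. Requires 0 < p_on < 1 and 0 < confidence < 1."""
    per_device = confidence ** (1.0 / n_devices)
    n = math.ceil(math.log(1.0 - per_device) / math.log(1.0 - p_on))
    n = max(n, 1)
    # Guard against floating-point rounding in the closed-form estimate.
    while prob_all_on(p_on, n, n_devices) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    for p_on in (0.05, 0.2, 0.5):
        print(p_on, pulses_for_store(p_on, n_devices=4))
```

Under these assumptions, raising $P\mathrm{_{ON}}$ shortens the store phase sharply, which is one side of the trade-off; the recall-side cost is the erroneous activation discussed above.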

Another important factor that has to be considered when setting the device parameters is the expected stimulation rate. Each incoming pulse has a different effect on the memristive device, depending on whether the device is ON (in LRS) or OFF (in HRS). In the former case, the CF, i.e. the information stored in the memristive device, is refreshed. In the latter, the stimulation might activate OFF devices. As a consequence, there are two mechanisms that degrade the accuracy of the system: low stimulation rates may lead to the switching off of previously ON devices, while high rates may lead to erroneous classification due to the switching on of previously OFF devices (figures S11 and S12). A possible mitigation measure is tuning the mean retention time of the devices: low spike rates need longer retention times, whereas high spike rates might benefit from shorter retention times.
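
As a rough worked illustration of these two mechanisms, assume (this is not part of the device characterization) that pulses arrive periodically at rate $f$, that each pulse switches an OFF device on independently with probability $P\mathrm{_{ON}}$, and that an ON device relaxes after an exponentially distributed lifetime with mean $\tau$. Under these assumptions,

$$
P_{\mathrm{survive}} = \exp\!\left(-\frac{1}{f\,\tau}\right), \qquad P_{\mathrm{false}}(n) = 1 - \left(1 - P_{\mathrm{ON}}\right)^{n},
$$

where $P_{\mathrm{survive}}$ is the probability that an ON device is still ON when the next pulse arrives and $P_{\mathrm{false}}(n)$ is the probability that an OFF device has been switched on at least once within $n$ pulses (ignoring its subsequent relaxation). For instance, $f = 1$ Hz and $\tau = 1.5$ s give $P_{\mathrm{survive}} = \exp(-0.667) \approx 0.51$, so roughly half of the stored devices are lost per inter-pulse interval, whereas $P\mathrm{_{ON}} = 5\%$ and $n = 10$ give $P_{\mathrm{false}} = 1 - 0.95^{10} \approx 0.40$. Increasing $f$ improves the first quantity but increases $n$ in the second for a fixed observation window, which is consistent with matching the mean retention time to the expected spike rate.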

The proposed STP synapse model differs from the one in [37, 38] in one important respect: its high level of noise, which imitates a feature of biological synapses. Chemical synaptic transmission in biology is inherently unreliable. About half of synaptic transmissions are not detectable at the post-synaptic side at all, which makes synapses an abundant source of noise in the brain [39–41]. Given how costly synapses are in terms of energy consumption [42], this finding is surprising. Several authors have therefore suggested that the noise in synapses serves as a computational resource that allows the brain to solve complex tasks more efficiently [39, 43]. We show here that the noise in synaptic short-term dynamics, which mimics the behavior of synaptic facilitation in biological synapses, can be exploited to realize short-term memory on behaviorally relevant time scales of several seconds.

5. Conclusion

In summary, real-world applications require very different time constants. Technologies that enable the design of systems whose internal temporal dynamics can be tuned to match the real-world ones present appealing opportunities, especially in the context of power- and memory-limited edge computing. Our results show that the proposed Ag-based volatile memristive device features electrical tunability of its key parameters, i.e. retention time and switching probability, which allows the lifetime of WM to be adapted to the task-specific timescale needed, from 1 ms to 10 s.

Acknowledgment

The authors thank Matteo Farronato and Alessandro Milozzi for fruitful discussions on the volatile devices. The Authors would like to thank Polifab's staff Marco Asa, Andrea Scaccabarozzi, Claudio Somaschini, Chiara Nava, Stefano Fasoli, Stefano Bigoni, and Elisa Sogne for help in the fabrication process. Thanks to Luciano Feltri for the help in the setup optimization.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).

Funding

This work was partially supported by the European Research Council (ERC) through the European Union's Horizon Europe Research and Innovation Programme under Grant Agreement No. 101042585 and by the European Union's Horizon 2020 research and innovation program, Grant Agreement No. 824164. Views and opinions expressed are however those of the Authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. C T acknowledges funding by the German Research Foundation (CRC1286, Projects C1 and Z01). D K acknowledges funding from the German Ministry of Education and Research (BMBF) Project EVENTS (16ME0733). This work was partially performed in Polifab, the micro and nanofabrication facility of Politecnico di Milano.

Author contributions

E C and D K conceived the idea and designed the experiments together with D I and C T. S R fabricated and characterized the volatile devices, carried out the WM store and recall experiment, and analyzed the experimental data. D K ran the simulations of the large-scale simulation of WM and of the associative symbolic WM. The initial draft of the manuscript was written by E C and D K. All the authors discussed the results and provided feedback. E C, D I, and C T supervised the research.
