Putting the Neural back into Networks
Part 1: Why spikes?
11th January, 2020
Not long ago, one of the gods of modern machine learning made a slightly controversial statement. In the final slide of his ISSCC 2019 keynote [1], Yann LeCun [2, 3, 4] (that’s “Mr CNN” to you) said, in what was almost a throwaway remark, that he was skeptical about the usefulness of spiking neural networks.
He also threw some shade at a hardware field called “Neuromorphic Engineering”, asking why engineers would bother to build chips for algorithms that don’t work.
“Fair enough,” you might think. Except that we do have sophisticated working examples of spiking neural networks, performing incredibly complicated tasks from sensory processing to motor control, along with high-level planning and general intelligence. In fact, this approach is so adaptable that it can be scaled right down for simple sense-react tasks; can be applied in small mobile agents that need to interact with their environment; can autonomously learn complex tasks including games such as Starcraft and Go; and can even make fairly convincing presentations of high-level cognition and consciousness.
All this with incredible energy efficiency of around 4×10¹¹ synaptic operations per second per Watt (SynOps/s/W).
Dad-jokes aside, of course I’m referring to biological nervous systems.
One engineering lesson we can learn from biological neural systems is that communication is expensive. Neurons in the brain communicate with essentially binary “spikes” of electricity passing from one neuron to several partners via synapses. These spikes travel over long distances compared with the scale of single cells, and cost energy to generate and propagate. Reflecting this, neurons in the brain fire very sparsely, and connect to each other sparsely.
In contrast, the current breed of artificial neural networks (ANNs) connect densely between layers, and operate on a “frame-like” basis: all neurons compute and send their output on each time step, even when nothing much is changing in their input. Nevertheless, recent successes in training deep ANNs on very challenging problems show that the mathematical tools for building ANNs are very useful.
Ns, ANs and SNs
Standard artificial neurons (ANs) are small blobs of linear algebra that instantaneously transform some real-valued inputs into a real-valued output. A very common formulation is given by \(y = H \left( W \cdot x + b \right)\), with \(x\) the inputs, \(W\) the weights, \(y\) the output and \(b\) a bias input, and where \(H(\cdot)\) is a common transfer function such as \(\tanh\), a sigmoid or a rectified-linear function. The crucial thing to note is that ANs have no concept of time: all inputs are processed instantaneously.
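To make the "blob of linear algebra" concrete, here is a minimal sketch of a single artificial neuron in plain Python. The function names (`artificial_neuron`, `relu`) and the example numbers are illustrative assumptions, not anything taken from a particular library.

```python
import math


def relu(z):
    """Rectified-linear transfer function."""
    return max(0.0, z)


def artificial_neuron(x, w, b, transfer=math.tanh):
    """Compute y = H(w . x + b) for one neuron.

    x: list of real-valued inputs
    w: list of weights, one per input
    b: bias input
    transfer: the transfer function H (tanh by default)
    """
    pre_activation = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return transfer(pre_activation)


# Hypothetical example values, chosen so that w . x + b = 1.0
x = [0.5, -1.0, 2.0]
w = [1.0, 0.5, 0.25]
b = 0.5

y_tanh = artificial_neuron(x, w, b)                 # tanh(1.0)
y_relu = artificial_neuron(x, w, b, transfer=relu)  # relu(1.0)
```

The function keeps no state between calls, so the same input always produces the same output, no matter when or in what order it arrives. That is the "no concept of time" property in code form.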
Spiking neurons, on the other hand, are more like little clockwork devices that care implicitly about time, and mimic biological neurons in the simplest possible way. Shown above is a leaky integrate-and-fire (LIF) neuron. This neuron receives inputs as a series of spike trains \(S_i(t)\), via synapses which integrate the spikes and decay over time. Each neuron also has an internal state \(V_m(t)\), which integrates synaptic inputs. When the internal state crosses a threshold \(V_{th}\), a neuron emits a spike event \(S_o(t)\) and the neuron state \(V_m\) is decreased.
Formally, we have:
Synaptic inputs: \(\tau_{syn}\cdot dI_{syn}/dt + I_{syn} = S_i(t)\)
Neuron state: \(\tau_m \cdot dV_m/dt + V_m(t) = W \cdot I_{syn} + b\)
Spike production: if \(V_m(t) > V_{th}\), then \(S_o(t) = 1\) and \(V_m(t) \leftarrow V_m(t) - 1\)
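The three rules above can be sketched as a small simulation. This is only one possible discretisation, under some assumptions that are not from the equations themselves: a forward-Euler update with a fixed time step `dt`, time measured in milliseconds, each input spike treated as a unit-area pulse within its time bin, and made-up parameter values. The function name `simulate_lif` is hypothetical.

```python
def simulate_lif(spike_trains, weights, bias=0.0,
                 tau_syn=5.0, tau_m=20.0, v_th=1.0, dt=1.0):
    """Forward-Euler simulation of a single LIF neuron.

    spike_trains: one entry per time step, each a list of 0/1 input
                  spikes (one value per synapse)
    weights:      one weight per synapse (W in the equations)
    Returns (output spikes per time step, membrane state per time step).
    """
    n_syn = len(weights)
    i_syn = [0.0] * n_syn   # synaptic currents I_syn
    v_m = 0.0               # neuron state V_m
    out_spikes, v_trace = [], []

    for spikes in spike_trains:
        # Synapses: tau_syn * dI/dt + I = S_i(t).
        # Assumption: a spike is a unit-area pulse, so it adds 1 / tau_syn.
        for k in range(n_syn):
            i_syn[k] += -dt / tau_syn * i_syn[k] + spikes[k] / tau_syn

        # Neuron state: tau_m * dV/dt + V = W . I_syn + b
        drive = sum(w * i for w, i in zip(weights, i_syn)) + bias
        v_m += dt / tau_m * (drive - v_m)

        # Spike production: emit a spike and subtract 1 from the state
        if v_m > v_th:
            out_spikes.append(1)
            v_m -= 1.0
        else:
            out_spikes.append(0)
        v_trace.append(v_m)

    return out_spikes, v_trace


# Hypothetical inputs: two synapses, 200 time steps
steps = 200
silent = [[0, 0] for _ in range(steps)]
regular = [[1 if t % 2 == 0 else 0, 1 if t % 5 == 0 else 0]
           for t in range(steps)]
weights = [4.0, 4.0]

out_silent, _ = simulate_lif(silent, weights)
out_driven, v_driven = simulate_lif(regular, weights)
```

With no input spikes and no bias the neuron stays silent, while regular input drives it to fire. Unlike the artificial neuron, the output at each step depends on the history held in `i_syn` and `v_m`, not only on the current input.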
So we’ve got a spiking neuron. What can we do with it? In the next post, we’ll look at how to deal with the more complex dynamics of a spiking neuron during training.
Part 2: More spikes, more problems
TL;DR: Brains use spikes for communication. Spiking neurons are energy efficient because ultra sparse temporal processing. Spiking neurons are cool because they know about time. Spikes good.