IOP PUBLISHING JOURNAL OF NEURAL ENGINEERING

J. Neural Eng. 9 (2012) 026024 (14pp) doi:10.1088/1741-2560/9/2/026024

Parameter estimation in spiking neural networks: a reverse-engineering approach

H Rostro-Gonzalez1,2,3, B Cessac3 and T Vieville4

1 KIOS Research Centre and Holistic Electronics Research Lab., Department of Electrical and Computer Engineering, University of Cyprus, 75 Kallipoleos Avenue, PO Box 20537, 1678 Nicosia, Cyprus
2 Division de Ingenierías Campus Irapuato-Salamanca, Universidad de Guanajuato, Com. Palo Blanco s/n 36885, Salamanca Gto, Mexico
3 NeuroMathComp project team (INRIA, ENS Paris, UNSA LJAD), 2004 route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France
4 INRIA Cortex project team, 2004 route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France

E-mail: [email protected]

Received 23 October 2011
Accepted for publication 7 February 2012
Published 15 March 2012
Online at stacks.iop.org/JNE/9/026024

Abstract
This paper presents a reverse engineering approach for parameter estimation in spiking neural networks (SNNs). We consider the deterministic evolution of a time-discretized network of spiking neurons with delayed synaptic transmission, modeled as a neural network of the generalized integrate-and-fire type. Our approach aims at by-passing the fact that parameter estimation in SNNs with delays is a non-deterministic polynomial-time (NP)-hard problem. Here, the problem is reformulated as a linear programming (LP) problem, so that the solution can be computed in polynomial time. Besides, the LP formulation makes explicit the fact that the reverse engineering of a neural network can be performed from the observation of the spike times alone. Furthermore, we point out how the LP adjustment mechanism is local to each neuron and has the same structure as a ‘Hebbian’ rule. Finally, we present a generalization of this approach to the design of input–output (I/O) transformations as a practical method to ‘program’ a spiking network, i.e. find a set of parameters allowing us to exactly reproduce the network output, given an input. Numerical verifications and illustrations are provided.

(Some figures may appear in colour only in the online journal)

1. Introduction

Neuronal networks have tremendous computational capacity, but their biological complexity makes the exact reproduction of all the mechanisms involved in their dynamics essentially impossible, even at the numerical simulation level, as soon as the number of neurons becomes too large. One crucial issue is thus to be able to reproduce the ‘output’ of a neuronal network using approximated models that are easy to implement numerically. The issue addressed here is: ‘Can we program an integrate-and-fire network, i.e. tune its parameters, in order to exactly reproduce another network's output, on a bounded time horizon, given the input?’

1.1. Calculability power of neural network models

The main aspect we are interested in here is the calculability of neural network models. It is known that recurrent neural networks with high frequency rates are universal approximators (of any open dynamical system) [1], as multilayer feed-forward networks are [2]. This means that neural networks are not only able to approximate measurable functions on a compact domain, but also to simulate dynamical systems5, as originally stated (see, e.g., [1] for a detailed introduction on these notions).

5 As an example, see the very interesting papers of Albers and Sprott using this property to investigate the dynamical stability conjecture of Palis and Smale in the field of dynamical systems theory [3] or routes to chaos in high-dimensional systems [4].


Spiking neuron networks can also be universal approximators [5].

Theoretically, spiking neurons can perform very powerful computations with precise spike times. They are at least as computationally powerful as the sigmoidal neurons traditionally used in artificial neural networks [6, 7]. This result has been shown using a spike-response model (see [8] for a review) and considering piecewise linear approximations of the potential profiles. In this context, analogue inputs and outputs are encoded by temporal delays of spikes. The authors show that any feed-forward or recurrent (multi-layer) analogue neuronal network (e.g. McCulloch–Pitts) can be simulated arbitrarily closely by an insignificantly larger network of spiking neurons. This holds even in the presence of noise [6, 7]. These results strongly motivate the use of spiking neural networks, as studied here.

In a computational context, spiking neuron networks are mainly implemented through specific network architectures, such as echo state networks [9] and liquid state machines [10], that are called ‘reservoir computing’ (see [11] for a unification of reservoir computing methods at the experimental level). In this framework, the reservoir is a network model of neurons (with linear or sigmoid neurons, but more usually spiking neurons), with a random topology and sparse connectivity. The reservoir is a recurrent network, where the weights can be either fixed or driven by an unsupervised learning mechanism. In the case of spiking neurons (e.g. in the model of [12]), the learning mechanism is a form of synaptic plasticity, usually STDP (spike-timing-dependent plasticity), or a temporal Hebbian unsupervised learning rule, biologically inspired. The output layer of the network (the so-called readout neurons) is driven by a supervised learning rule, generated from any type of classifier or regressor, ranging from a least-mean-squares rule to sophisticated discriminant or regression algorithms. The ease of training and a guaranteed optimality guide the choice of method; it appears that simple methods yield good results [11]. This distinction between a readout layer and an internal reservoir is indeed induced by the fact that only the output of the network activity is constrained, whereas the internal state is not controlled.

1.2. Learning the parameters of a neural network model

In the biological context, learning is mainly related to synaptic plasticity [13, 14] and STDP (see e.g. [15] for a recent formalization), as far as spiking neuron networks are concerned. This unsupervised learning mechanism is known to reduce the variability of neuron responses [16] and is related to the maximization of information transmission [17] and of mutual information [18]. It also has other interesting computational properties, such as tuning neurons to react as soon as possible to the earliest spikes, or segregating the network response into two classes depending on the input to be discriminated, and more general structuring such as the emergence of orientation selectivity [19].

In this study, the point of view is quite different: we consider supervised learning, since ‘each spike matters’; that is, in the special case of a feed-forward sweep of visual activity in response to a brief visual presentation [19, 20], we want not only to statistically reproduce the spiking output, but to reproduce it exactly.

The motivation to explore this track is twofold. On one hand, we want to better understand what can be learned at a theoretical level by spiking neuron networks when tuning weights and delays. The key point is the non-learnability of spiking neurons [21]: it is proved that this problem is non-deterministic polynomial-time (NP)-complete when considering the estimation of both weights and delays. Here, we show that we can ‘elude’ this caveat and propose an alternative efficient estimation, inspired by biological models. However, it is important to mention that there are some interesting works that propose methods for the solution of NP-complete problems with non-deterministic approaches [22–25].

We also have to note that the same restriction applies not only to simulation but, as far as this model is biologically plausible, also holds at the biological level. It is thus an issue to wonder whether, in biological neuron networks, delays are really estimated during learning processes, or whether a weaker form of weight adaptation, as developed here, is at work.

On the other hand, the computational use of spiking neural networks in the framework of reservoir computing or beyond [26], at the application level, requires efficient tuning methods not only ‘on average’, but in the deterministic case. This is the reason why we must consider how to exactly generate a given spike train. Furthermore, as far as biological modeling is concerned, there are situations where the exact spike timing is to be taken into account [27, 28].

The paper is structured as follows: the spiking neuron model is presented in section 2, followed by the methods in section 3 and an application of this approach in section 4. In section 5, we present some numerical results, and we conclude in section 6.

2. Discretized integrate and fire neuron models

Let us consider a normalized and reduced ‘punctual conductance-based generalized integrate-and-fire’ (gIF) neuron model [29], as reviewed in [30]. The model is reduced in that both adaptive currents and nonlinear ionic currents no longer explicitly depend on the membrane potential, but on time and previous spikes only (see [31] for a development). Here, we follow [31–33] after [34] and review how to properly discretize a gIF model (see [31, 33] for a complete derivation of this model).

Thus, the dynamics of such a neuron model are given by the following equations and are represented in figure 2:

$$V_i[k] = \gamma_i \, V_i[k-1] \, (1 - Z_i[k-1]) + \sum_{j=1}^{N} \sum_{d=1}^{D} W_{ijd} \, Z_j[k-d] + I^{(\mathrm{ext})}_i[k], \qquad (1)$$

with

$$Z_i[k] = \begin{cases} 1 & \text{if } V_i[k] \geq \theta \text{ (firing)} \\ 0 & \text{otherwise.} \end{cases} \qquad (2)$$


Figure 1. Synaptic profile governing the inter-neural transmission delays on the synaptic connections of the network.

Equation (1) represents the membrane potential of the ith neuron at time k. γ ∈ [0, 1[ defines the leak rate. The firing state is given by the term Z_i[k] in equation (2). N and D are the number of neurons and the maximal inter-neural transmission delay, respectively. Hence, the size of the network is N × N for a fully connected topology. W is the matrix of synaptic weights, where each connection ij is modeled by an alpha profile (figure 1). Finally, I^(ext) represents an external stimulus.

When V_i[k] reaches the threshold θ, a spike occurs in Z_i[k] (equation (2)) and neuron i is reset through the term 1 − Z_i[k] (equation (1)).
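As an illustration, the discretized dynamics (1)–(2) can be simulated in a few lines. The following sketch is ours, not the authors' implementation; the function name, argument shapes and the choice θ = 1 with a silent network before k = 0 are illustrative assumptions:

```python
import numpy as np

def simulate(W, gamma, I_ext, T, theta=1.0):
    """Simulate equations (1)-(2). W: delayed weights, shape (N, N, D),
    with W[i, j, d-1] = W_ijd; gamma: (N,) leak rates; I_ext: (N, T)
    external input. The network is silent before k = 0 (V = 0, no spikes).
    Returns the raster Z and the membrane potentials V, both (N, T)."""
    N, _, D = W.shape
    Z = np.zeros((N, T))
    V = np.zeros((N, T))
    v = np.zeros(N)
    for k in range(T):
        # Delayed synaptic input: sum over presynaptic neurons j and delays d.
        syn = sum(W[:, :, d - 1] @ Z[:, k - d]
                  for d in range(1, min(D, k) + 1))
        # Leak and reset: the (1 - Z_i[k-1]) factor of equation (1).
        prev = Z[:, k - 1] if k > 0 else np.zeros(N)
        v = gamma * v * (1.0 - prev) + syn + I_ext[:, k]
        Z[:, k] = (v >= theta)           # firing state, equation (2)
        V[:, k] = v
    return Z, V
```

For instance, `Z, V = simulate(W, np.full(N, 0.95), I_ext, T)` produces a raster of the kind sketched in figure 2.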

3. Methods: parameter estimation in spiking neural networks

Our approach attempts to estimate the parameters of a neural network from the observation of its spiking dynamics (figure 2). These dynamics are parametrized by N × N × D values, which define the matrix of delayed synaptic weights W_{ijd}. Furthermore, we assume that if the neurons in the network (equation (1)) have fired at least once, the dependence on the initial condition is removed by the reset mechanism (equation (2)): the state of the network does not depend on the initial conditions as soon as the spikes are known. This can be written as

$$V_i[k] = 0 \quad \text{for } 0 \leq k < D.$$

With this assumption, equation (1) becomes

$$V_i[k] = \sum_{j=1}^{N} \sum_{d=1}^{D} W_{ijd} \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} Z_j[k-\tau-d] + I_{ik\tau}, \qquad (3)$$

writing $I_{ik\tau} = \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} I^{(\mathrm{ext})}_i[k-\tau]$ with

$$\tau_{ik} = k - \arg\min_{l>0} \{ Z_i[l-1] = 1 \}.$$

This derivation is easily obtained by induction from (1). Here, τ_{ik} is the delay from the last spiking time, i.e. the last membrane potential reset. If there is no spike, we simply set τ_{ik} = k.

From equation (3) we can clearly distinguish a linear system between spikes (denoted by Z) and membrane potentials (denoted by V). Let us now discuss how to retrieve the parameters from the observation of the neural dynamics. We propose several solutions depending on different assumptions.

3.1. Retrieving weights and delayed weights from the observation of spikes and membrane potentials

The approach in its most basic form assumes that the parameter estimation can be performed from the observation of both spikes and membrane potentials. We assume that equation (3) can be expressed as a linear system of the form

$$C_i \, w_i = d_i \qquad (4)$$

with

$$C_i = \begin{pmatrix} \cdots & \cdots & \cdots \\ \cdots & \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} Z_j[k-\tau-d] & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \in \mathbb{R}^{(T-D) \times ND},$$

$$d_i = (\cdots \;\; V_i[k] - I_{ik\tau} \;\; \cdots)^t \in \mathbb{R}^{T-D},$$

$$w_i = (\cdots \;\; W_{ijd} \;\; \cdots)^t \in \mathbb{R}^{ND},$$

writing $u^t$ for the transpose of $u$.

Figure 2. Schematic representation of a raster of N neurons observed during a time interval T after an initial condition interval D (in red).


Here, C_i is defined by the neuron's spike inputs, d_i is defined by the neuron's membrane potentials and membrane currents, and the network parameters are given by the weight vector w_i.

The weights are thus directly defined by a set of linear equalities for each neuron. Let us call this a linear (L) problem.

Definition (4) concerns only the weights of one neuron of index i. This estimation is local to each neuron and not global to the network. Furthermore, the weight estimation is given by the observation of the input spikes Z_j[k] and the output V_i[k]. These two characteristics correspond to usual Hebbian-like learning rules (see [13] for a discussion).

Given a general raster (i.e. assuming C_i is of rank min(T − D, ND)):

• The linear system in equation (4) always has a solution, in the general case, if

$$N > \frac{T-D}{D} = O\!\left(\frac{T}{D}\right) \;\;\Leftrightarrow\;\; D > \frac{T}{N+1} = O\!\left(\frac{T}{N}\right) \;\;\Leftrightarrow\;\; D\,(N+1) > T. \qquad (5)$$

This requires enough non-redundant neurons N or weight profile delays D, with respect to the observation time T. In this case, given any membrane potential and spike values, there are always weights able to map the spike inputs onto the desired potential outputs.

• On the contrary, if the system has more equations than unknowns (i.e. N × D ≤ T − D), there does not exist an explicit solution in the general case. However, there is always a solution if the potentials and spikes have been generated by a neural network model of the form (1).

If C_i is not of full rank, this may correspond to several cases, e.g.,

• redundant spike patterns: some neurons do not provide linearly independent spike trains;

• redundant or trivial spike trains: for instance, trains with many bursts (many Z_j[k] = 1), very sparse trains (many Z_j[k] = 0), or periodic spike trains.

Regarding the observation duration T, it has been demonstrated in [32, 33] that the dynamics of an integrate-and-fire neural network are generically periodic. This however depends on parameters such as the external current or the synaptic weights, and the periods can be larger than any accessible computational time.

In any case, several choices of weights w_i (in the general case, a D(N + 1) − T dimensional affine space) may lead to the same membrane potentials and spikes. The problem of retrieving weights from the observation of spikes and membrane potentials may then have many solutions.

The particular case where D = 1, i.e. when there is no delayed weight but a simple scalar weight value defining a connection strength, is included in this framework.
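As an illustration of this linear (L) problem, the following sketch (our own, with illustrative names) builds C_i and d_i from an observed raster and solves equation (4) in the least-squares sense; NumPy's `lstsq` is SVD-based, in the spirit of the GSL-based resolution reported in section 5.1. Columns are indexed by the pair (j, d) as j*D + (d − 1):

```python
import numpy as np

def tau_ik(Zi, k):
    """Steps since neuron i's last spike strictly before time k (k if none)."""
    for tau in range(k):
        if Zi[k - 1 - tau] == 1:
            return tau
    return k

def build_L_problem(Z, V_i, I_i, gamma, D, i, k_min=None):
    """Build C_i and d_i of equation (4). Rows are times k = k_min..T-1
    (k_min = D as in the text); columns are (j, d) pairs with d = 1..D."""
    N, T = Z.shape
    k_min = D if k_min is None else k_min
    C = np.zeros((T - k_min, N * D))
    d_vec = np.zeros(T - k_min)
    for row, k in enumerate(range(k_min, T)):
        t = tau_ik(Z[i], k)
        g = gamma ** np.arange(t + 1)                 # gamma^tau, tau = 0..tau_ik
        for j in range(N):
            for d in range(1, D + 1):
                idx = np.arange(k - d, k - d - t - 1, -1)   # times k - tau - d
                z = np.where(idx >= 0, Z[j, np.maximum(idx, 0)], 0.0)
                C[row, j * D + (d - 1)] = g @ z             # sum_tau gamma^tau Z_j
        # d_i entry: V_i[k] minus the integrated external current I_{ik tau}.
        d_vec[row] = V_i[k] - g @ I_i[k - t: k + 1][::-1]
    return C, d_vec

# SVD-based least-squares solution of C_i w_i = d_i for one neuron:
# w_i, *rest = np.linalg.lstsq(C, d_vec, rcond=None)
```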

3.2. Retrieving weights and delayed weights from the observation of spikes

In the more general case, we assume that only the spikes (and not the membrane potentials) can be observed during estimation. This corresponds to the usual situation when observing a spiking neural network.

Here, the value of V_i[k] is not known; however, its position with respect to the firing threshold is provided by equation (2). This allows us to establish the following expression:

$$Z_i[k] = 0 \;\Leftrightarrow\; V_i[k] < \theta \quad \text{and} \quad Z_i[k] = 1 \;\Leftrightarrow\; V_i[k] \geq \theta,$$

where θ = 1 for simplification. The above expression can now be written as an inequality:

$$e_{ik} = (2 Z_i[k] - 1)(V_i[k] - 1) \geq 0.$$

If this condition holds for all time indices k and all neuron indices i, then the spiking activity of the network exactly corresponds to the desired firing pattern.

Expanding (3) with the previous condition, we have

$$e_i = A_i \, w_i + b_i \geq 0, \qquad (6)$$

where

$$A_i = \begin{pmatrix} \cdots & \cdots & \cdots \\ \cdots & (2Z_i[k]-1) \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} Z_j[k-\tau-d] & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \in \mathbb{R}^{(T-D) \times ND},$$

$$b_i = (\cdots \;\; (2Z_i[k]-1)(I_{ik\tau} - 1) \;\; \cdots)^t \in \mathbb{R}^{T-D},$$

$$w_i = (\cdots \;\; W_{ijd} \;\; \cdots)^t \in \mathbb{R}^{ND},$$

$$e_i = (\cdots \;\; (2Z_i[k]-1)(V_i[k] - 1) \;\; \cdots)^t \in \mathbb{R}^{T-D};$$

thus, $A_i = D_i \, C_i$, where $D_i$ is the non-singular diagonal matrix in $\mathbb{R}^{(T-D)\times(T-D)}$ with $D_i^{kk} = 2 Z_i[k] - 1 \in \{-1, 1\}$.

The weights are now directly defined by a set of linear inequalities for each neuron. This is therefore a linear programming (LP) problem. See [35] for an introduction and [36] for the detailed method used here to implement the LP problem.

Furthermore, the same discussion about the dimension of the set of solutions applies to this new paradigm, except that we now have to consider a simplex of solutions instead of a simple affine subspace.

Let us now derive a bound for e_{ik}. Since 0 ≤ V_i[k] < 1 for sub-threshold values, and reset occurs as soon as V_i[k] ≥ 1, we have the following bounds for V_i[k]:

$$V_i^{\min} = \sum_{jd,\; W_{ijd} < 0} W_{ijd} \;\leq\; V_i[k] \;\leq\; V_i^{\max} = \sum_{jd,\; W_{ijd} > 0} W_{ijd}.$$

We must have $V_i^{\max} > 1$ for at least one spike to occur, while $V_i^{\min} \leq 0$ by construction. These bounds are attained in the high-activity regime when either all excitatory or all inhibitory neurons fire. From this derivation, $e^{\max} > 0$ with

$$e^{\max} = \max_i \left(1 - V_i^{\min},\; V_i^{\max} - 1\right), \qquad 0 < e_{ik} \leq e^{\max},$$

thus providing an explicit bound for e_{ik}. As a consequence, the present estimation problem reads

$$\max_{e_i, w_i} \sum_k e_{ik}, \quad \text{with} \quad 0 < e_{ik} \leq e^{\max} \quad \text{and} \quad e_i = A_i w_i + b_i, \qquad (7)$$

which is a standard bounded LP problem.


The key point is that the LP problem can be solved in polynomial time; it is thus not an NP-complete problem subject to the curse of combinatorial complexity. In practice, this LP problem can be solved using one of the several LP solution methods proposed in the literature (e.g. the simplex method, which has exponential worst-case complexity in principle but is in practice as fast as, if not faster than, polynomial-time methods).
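As an illustration, the bounded LP problem (7) can be written down directly for an off-the-shelf solver. The sketch below is ours and uses SciPy's `linprog` with the HiGHS backend rather than the GLPK simplex used in section 5; treating the bound e_max as a fixed constant and approximating the strict inequality 0 < e_{ik} by a small e_min > 0 are both assumptions of this sketch:

```python
import numpy as np
from scipy.optimize import linprog

def estimate_weights_lp(A, b, e_max=10.0, e_min=1e-6):
    """Solve problem (7): max sum_k e_k  s.t.  e = A w + b and
    e_min <= e_k <= e_max. A and b are built as in section 3.2
    (A_i = D_i C_i, with C_i from the previous sketch and
    D_i = diag(2 Z_i[k] - 1)). Returns the delayed-weight vector w_i."""
    m, n = A.shape                  # m = T - D inequalities, n = N*D weights
    # Decision vector x = [w, e]; maximizing sum(e) = minimizing -sum(e).
    c = np.concatenate([np.zeros(n), -np.ones(m)])
    # Equality constraints A w - e = -b link weights and margins.
    A_eq = np.hstack([A, -np.eye(m)])
    bounds = [(None, None)] * n + [(e_min, e_max)] * m
    res = linprog(c, A_eq=A_eq, b_eq=-b, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError("no feasible weights: " + res.message)
    return res.x[:n]
```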

3.3. About retrieving delays from the observation of spikes

Let us now discuss the key idea of the paper. In the previous derivations, we have considered delayed weights, i.e. a quantitative weight value W_{ijd} at each delay d ∈ {1, . . . , D}.

Another point of view is to consider a network with adjustable synaptic delays. Such an estimation problem may, e.g., correspond to the ‘simpler’ model

$$V_i[k] = \gamma_i \, V_i[k-1](1 - Z_i[k-1]) + \sum_{j=1}^{N} W_{ij} \, Z_j[k - d_{ij}] + I_{ik},$$

where now the weights W_{ij} and delays d_{ij} are to be estimated. As pointed out previously, the non-learnability of spiking neurons is known [21], i.e. the previous estimation is proved to be NP-complete. We have carefully checked in [21] that this result still applies to the present setup. This means that in order to ‘learn’ the proper parameters we have to ‘try all possible combinations of delays’. This is intuitively due to the fact that each delay has no ‘smooth’ effect on the dynamics but may change the whole dynamics in an unpredictable way.

We see here that the estimation problem for the delays d_{ij} seems incompatible with usable algorithms, as reviewed in the introduction.

We propose to elude this NP-complete problem by considering another estimation problem. Here, we do not estimate one delay (for each synapse) but consider connection weights at several delays and then estimate a weighting of their relative contributions. This means that we consider a weak delay estimation problem.

Obviously, the case where there is a weight W_{ij} with a corresponding delay d_{ij} ∈ {0, . . . , D} is a particular case of considering several delayed weights W_{ijd} (corresponding to having all weights equal to zero except at d_{ij}, i.e. W_{ijd} = W_{ij} if d = d_{ij}, and 0 otherwise). We thus do not restrain the neural network model by changing the position of the problem, but enlarge it. In fact, the present estimation provides a smooth approximation of the previous NP-complete problem.

We can easily conjecture that the same restriction applies to the case where both the observation of spikes and the membrane potentials are considered.

3.4. Methods: exact spike train simulation

Up to now, we have assumed that a raster Z_i[k], i ∈ {1, . . . , N}, k ∈ {1, . . . , T}, is generated by a network whose dynamics are defined by (1), with initial conditions Z_j[k] and V_j[k] = 0 for all j ∈ {1, . . . , N}, k ∈ {1, . . . , D}.

We have shown that there is always a solution in the general case if the observation period is small enough, i.e. T < O(N D). Let us now consider the case where T ≥ O(N D) and there is no solution in the general case. This occurs in particular when the raster has not been generated by a network of the form (1), e.g., when the raster is random.

What can we do then? For instance, what can be done when the raster is entirely random and is not generated by a network of type (1)?

The key idea, borrowed from the reservoir computing paradigm reviewed in the introduction, is to add a reservoir of ‘hidden neurons’, i.e. to consider not N but N + S neurons. The set of N ‘output’ neurons is going to reproduce the expected raster Z_i[k]. The set of S ‘hidden’ neurons increases the number of degrees of freedom in order to obtain T < O((N + S) D), thus making it possible to apply the previous algorithms to estimate the optimal delayed weights. Clearly, in the worst case, it seems that we have to add about S = O(T/D) hidden neurons. This is illustrated in figure 3.

In order to make this idea clear, let us consider a sparse trivial set of hidden neurons, as illustrated in figure 4: S = T/D + 1 hidden neurons of index i′ ∈ {0, . . . , S}, each firing once at t_{i′} = i′D, except the last one, which always fires (in order to maintain spiking activity). Thus,

$$Z_{i'}[k] = \delta(i'D - k), \quad 0 \leq i' < S, \qquad Z_S[k] = 1.$$

Let us choose

$$W_{SS1} > 1, \qquad W_{i'S1} = \frac{1-\gamma}{1-\gamma^{\,t_{i'}-1/2}}, \qquad W_{i'i'1} < -\frac{\gamma^{\,2 t_{i'}} - \gamma^{\,T}}{\gamma^{\,T}\,(1-\gamma^{\,t_{i'}})} < 0, \qquad W_{i'j'd} = 0 \;\text{otherwise},$$

with initial conditions Z_{i′}[k] = 0 for i′ ∈ {0, . . . , S − 1} and Z_S[k] = 1 for k ∈ {1, . . . , D}, while I_{i′k} = 0.

A straightforward derivation over equation (1) allows us to verify that this choice generates the specified Z_{i'}[k]. In words, as the reader can easily verify, it appears that

• the neuron of index S is always firing, since (through W_{SS1}) a positive internal loop maintains its activity;

• the neurons of index i′ ∈ {0, . . . , S − 1}, whose equation reads

$$V_{i'}[k] = \gamma \, V_{i'}[k-1](1 - Z_{i'}[k-1]) + W_{i'S1} + W_{i'i'1} \, Z_{i'}[k-1],$$

fire at t_{i'}, integrating the constant input W_{i'S1};

• the neurons of index i′ ∈ {0, . . . , S − 1}, after firing, are inhibited (through W_{i'i'1}) by a negative internal loop, thus reset to a negative value, so that i′ no longer fires before T. We thus generate Z_{i'}[k] as expected.

Alternatively, the use of the always-firing neuron of index S can be avoided by introducing a constant current I_{i'k} = W_{i'S1}.

However, without the firing neuron of index S or some input current, the sparse trivial raster cannot be generated, although T < O(N D). This is due to the fact that the activity is too sparse to be self-maintained: even if ‘a solution exists, in the general case, for a small observation period, i.e. T < O(N D)’, a set of singular cases, such as this one, has to be excluded.

Once the hidden neuron reservoir raster is generated, it is straightforward to generate the output neuron raster,


Figure 3. Schematic representation of a raster with N output neurons observed during a time interval T after an initial condition interval D, with an add-on of S hidden neurons.

Figure 4. Schematic representation of a sparse trivial set of hidden neurons, allowing the generation of any raster of length T .

considering that, writing Z̄_i[k] for the raster to be reproduced,

• there is no recurrent connection between the N output neurons, i.e. W_{ijd} = 0 for i ∈ {1, . . . , N}, j ∈ {1, . . . , N}, d ∈ {1, . . . , D};

• there is no backward connection from the N output neurons to the S hidden neurons, i.e. W_{i'jd} = 0 for i′ ∈ {0, . . . , S}, j ∈ {1, . . . , N}, d ∈ {1, . . . , D};

• but there are forward excitatory connections between the hidden and output neurons:

$$W_{ij'd} = (1 + \varepsilon) \, \bar{Z}_i[j'D + d] \quad \text{for some small } \varepsilon > 0,$$

yielding, from (1),

$$V_i[k] = \sum_{j'} \sum_{d=1}^{D} W_{ij'd} \, Z_{j'}[k-d] = \sum_{j'} \sum_{d=1}^{D} (1+\varepsilon) \, \bar{Z}_i[j'D + d] \, \delta(j'D - (k-d)) = (1+\varepsilon) \, \bar{Z}_i[k],$$

and, setting γ = 0 for the output neurons and I_{ik} = 0, we obtain Z_i[k] = Z̄_i[k], i.e. the generated spikes Z_i[k] correspond to the desired Z̄_i[k], as expected.

The previous construction allows us to state: given any raster of N neurons and an observation time T, there is always a network of size N + T/D + 1, with weights delayed up to D, which exactly simulates this raster. What do we learn from this fact?

This helps us to better understand how the reservoir computing paradigm works. Although it is not always possible to simulate an arbitrary raster plot using a ‘simple’ integrate-and-fire model such as the one defined in (1), adding hidden neurons allows one to embed the problem in a higher dimensional space where a solution can be found.
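The construction can be checked numerically. In the sketch below (ours), the hidden raster is written down directly, i.e. the pacemaker neuron and the inhibitory self-loops that make the hidden units generate these spikes themselves are omitted; the output layer, with γ = 0 and forward weights W_{ij'd} = (1 + ε) Z̄_i[j'D + d], then replays an arbitrary target raster exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, T = 5, 3, 30
S = T // D + 1
eps = 0.1
Zbar = (rng.random((N, T)) < 0.3).astype(float)   # arbitrary target raster

# Hidden raster: neuron i' fires once at time i'*D.
Zh = np.zeros((S, T))
for ip in range(S):
    if ip * D < T:
        Zh[ip, ip * D] = 1.0

# Forward weights W[i, j', d-1] = (1 + eps) * Zbar[i, j'*D + d].
W = np.zeros((N, S, D))
for jp in range(S):
    for d in range(1, D + 1):
        if jp * D + d < T:
            W[:, jp, d - 1] = (1 + eps) * Zbar[:, jp * D + d]

# Output layer with gamma = 0: V_i[k] = sum_{j',d} W[i,j',d] Zh[j', k-d],
# so V_i[k] = (1 + eps) * Zbar[i, k] and thresholding replays the target.
Z = np.zeros((N, T))
for k in range(T):
    V = sum(W[:, :, d - 1] @ Zh[:, k - d] for d in range(1, min(D, k) + 1))
    Z[:, k] = (V >= 1.0)

assert np.array_equal(Z[:, 1:], Zbar[:, 1:])      # exact replay for k >= 1
```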

This result is induced by the fact that learning the network weights is essentially a linear (L or LP) problem. With this interpretation, a neuron spiking sequence is a vector in this linear space, while a network raster is a vector set. Designing a ‘reservoir’ simply means choosing a set of neurons whose spiking activity spans the space of the expected raster. We are going to see in the following section that this point of view still holds in our framework when considering network inputs.

Figure 5. A raster plot showing ‘periodic’ behavior for a network of 20 fully connected neurons (70% excitatory and 30% inhibitory neurons). After estimation, master and servant generate exactly the same raster plot.

4. Application: input/output transfer identification

The main practical application of the previous algorithmic development is to ‘program’ a spiking network in order to generate a given spike train, or to realize a given I/O spike train function. In the present context, this means finding the ‘right’ or the ‘best’ spiking network parameters in order to map a given input set onto a desired output set.

Hence, from (3) we can establish an I/O system as follows:

$$V_i[k] = \underbrace{\sum_{j=1}^{N_o+S} \sum_{d=1}^{D} W_{ijd} \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} Z_j[k-\tau-d]}_{\text{output + hidden}} + \underbrace{\sum_{l=1}^{N_i} \sum_{d=1}^{D} W'_{ild} \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} Z'_l[k-\tau-d]}_{\text{input}} + \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} I_i[k-\tau], \qquad (8)$$

where N_i and N_o are the numbers of neurons in the input and output layers, respectively. The other terms in this equation have been defined in sections 2 and 3. In the same way as for equation (3), the parameter estimation for the system defined in equation (8) can be performed via an LP problem of the form

$$A_i W_i + B_i W'_i + c_i > 0,$$

where W and W′ are the parameters (delayed weights) to estimate and

$$A_i = \begin{pmatrix} \cdots & \cdots & \cdots \\ \cdots & (2Z_i[k]-1) \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} Z_j[k-\tau-d] & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \in \mathbb{R}^{N_S(T-D) \times (N_o+S)D},$$

$$B_i = \begin{pmatrix} \cdots & \cdots & \cdots \\ \cdots & (2Z_i[k]-1) \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} Z'_l[k-\tau-d] & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \in \mathbb{R}^{N_S(T-D) \times N_i D},$$

$$c_i = \left( \cdots \;\; (2Z_i[k]-1)\left( \sum_{\tau=0}^{\tau_{ik}} \gamma^{\tau} I_i[k-\tau] - 1 \right) \;\; \cdots \right)^t \in \mathbb{R}^{N_S(T-D)}.$$

The number of hidden neurons necessary to match any input to any output can be calculated from the number of unknowns in the LP system (the weights), the available number of equations (the spike constraints) and the number of constraints (since we constrain all blocks of N × D initial conditions to be the same).

To summarize, the system has solutions in the general case with N = N_i + N_o + S. The number of hidden neurons S is given by

$$S \geq \frac{T \times N_S}{D} + (N_S - 1) - N_i - N_o.$$

Here, the term N_S corresponds to the number of samples used to train the system to learn a given I/O function. Finally, the solution is obtained by the simplex method, as described previously.

5. Numerical results

In order to validate our approach, we have performed several numerical tests on artificial and biological data. The results are organized following the different paradigms presented in this paper. In all cases, both given and expected rasters are compared using the Victor–Purpura metric [37].
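For reference, the Victor–Purpura metric [37] used in all comparisons admits a standard dynamic-programming implementation. The sketch below is a common textbook version (our code, not the authors'); the cost q per unit of time shift is a free parameter:

```python
import numpy as np

def victor_purpura(t_a, t_b, q=1.0):
    """Minimal cost to morph spike-time list t_a into t_b:
    inserting/deleting a spike costs 1, shifting by dt costs q*|dt|."""
    na, nb = len(t_a), len(t_b)
    G = np.zeros((na + 1, nb + 1))
    G[:, 0] = np.arange(na + 1)          # delete all spikes of t_a
    G[0, :] = np.arange(nb + 1)          # insert all spikes of t_b
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            G[i, j] = min(G[i - 1, j] + 1,
                          G[i, j - 1] + 1,
                          G[i - 1, j - 1] + q * abs(t_a[i - 1] - t_b[j - 1]))
    return G[na, nb]

# The distance is 0 iff the two spike trains coincide exactly.
assert victor_purpura([2.0, 5.0], [2.0, 5.0]) == 0.0
```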

5.1. Retrieving weights from the observation of spikes and membrane potentials

In the first experiment, we consider the linear problem defined in (4) and make use of the singular value decomposition (SVD) mechanism [38] to compute the solution. For this, the well-established GSL6 library SVD implementation is used.

Figure 6. A raster plot showing ‘chaotic’ behavior for a network of 20 fully connected neurons (70% excitatory and 30% inhibitory neurons). Again, master and servant rasters are the same.

Figure 7. Example of rather complex ‘chaotic’ dynamics retrieved by the LP estimation defined in (7) using the master–servant paradigm (N = 50, T = 200 and D = 3).

Figure 8. Example of periodic dynamics retrieved by the LP estimation defined in (7) using the master–servant paradigm (N = 30, T = 300 and D = 3).

This allows us to find:

• if there are one or more solutions, the weights of minimal magnitude $|w_i|^2 = \sum_{jd} W_{ijd}^2$;

• if there does not exist an exact solution, the weights that minimize $\sum_k (\hat{V}_i[k] - V_i[k])^2$, where $\hat{V}_i[k]$ is the membrane potential predicted by the estimation.

6 http://www.gnu.org/software/gsl


Figure 9. Finding the expected dynamics from a raster with a uniform distribution. (a), (c), (e) and (g) correspond to different rasters with a Bernoulli distribution; in addition, (b), (d), (f) and (h) show the rasters calculated by the proposed methodology. The red lines correspond to the initial conditions (initial raster), the black ones are the spikes calculated by the method and the blue ones are the spikes in the hidden layer obtained with a Bernoulli distribution. We can also observe that the number of neurons in the hidden layer increases, one by one, between (b), (d), (f) and (h); this is because the observation time T is augmented by 4, as predicted. Here N = 5, γ = 0.95, D = 3; in (a) and (b) T = 15 with S = 0, in (c) and (d) T = 19 with S = 1, in (e) and (f) T = 23 with S = 2 and in (g) and (h) T = 27 with S = 3, where S corresponds to the number of neurons in the hidden layer, as detailed in the text.


Figure 10. Relationship between the number of hidden neurons S and the observation time T; here N = 10, T = 470, D = 5 and γ = 0.95 for this simulation. The right view is a zoom of the left view. This curve shows the required number of hidden neurons, using the proposed algorithm, in order to obtain an exact simulation. We observe that S = T/D − N, thus an almost maximal number of hidden neurons is required. This curve has been drawn from 45 000 independent randomly selected inputs.

Figure 11. Numerical results for a highly correlated Gibbs distribution. Here, the parameters are: r = −1, Ct = −0.5 and Ci = −1, N = 4, γ = 0.95, D = 3, T = 330 with S = 106. Red lines correspond to the initial conditions (initial raster), black ones are I/O spikes and blue ones are spikes in the hidden layer.

We have observed that there is a solution for any raster if the observation time is small enough, i.e. when D(N + 1) > T. We follow this track and consider a master/servant paradigm as follows.

(i) First, we generate a ‘master’ raster from randomly chosen weights.

(ii) Second, the raster is submitted to our estimation method (the ‘servant’) in order to compute the solution, while the master weights are hidden.

(iii) Finally, both rasters are compared using the Victor–Purpura metric [37].

The numerical results are shown in figures 5 and 6 for trivial and complex dynamics, respectively.
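A sketch of this master/servant protocol for the L problem, reusing `simulate` and `build_L_problem` from the earlier sketches (again our own illustrative code; the choice k_min = 1 slightly extends the rows k ≥ D of section 3.1 so that a free-running replay is exact from the start, up to numerical round-off at the firing threshold):

```python
import numpy as np

rng = np.random.default_rng(42)
N, D, T = 20, 3, 100
gamma = np.full(N, 0.95)
I_ext = rng.uniform(0.0, 0.3, (N, T))

# (i) Master: random delayed weights generate the reference raster.
W_master = rng.normal(0.0, 0.5, (N, N, D))
Z, V = simulate(W_master, gamma, I_ext, T)

# (ii) Servant: per-neuron estimation; the master weights stay hidden.
W_servant = np.zeros((N, N, D))
for i in range(N):
    C, d_vec = build_L_problem(Z, V[i], I_ext[i], gamma[0], D, i, k_min=1)
    w_i, *rest = np.linalg.lstsq(C, d_vec, rcond=None)
    W_servant[i] = w_i.reshape(N, D)

# (iii) Re-simulate and compare rasters (e.g. with victor_purpura per
# neuron); an exact estimation yields identical rasters.
Z2, _ = simulate(W_servant, gamma, I_ext, T)
print("exact replay:", np.array_equal(Z, Z2))
```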

5.2. Retrieving weights from the observation of spikes

In the second experiment, we considered the same master/servant paradigm for the parameter estimation. However, in this case we established numerical tests in order to validate our approach based on the LP problem. Here, the numerical solution was computed by the simplex method using the GLPK7 library.

7 http://www.gnu.org/software/glpk


Figure 12. Raster calculated by the proposed method from a biological data set, with N = 50, γ = 0.95, D = 3, T = 391 and S = 80.

Numerical results in figures 7 and 8 show the ability of our method to reproduce trivial and complex spiking dynamics.

5.3. Retrieving delayed weights from the observation of spikes, using hidden units

In this set of numerical experiments, we show how the proposed method can reproduce any raster by considering hidden neurons.

5.3.1. Considering a Bernoulli distribution. We start with a random input, drawn from a uniform Bernoulli distribution, which corresponds to an input with maximal entropy. Here, the master/servant paradigm is no longer used. Thus, the raster to be reproduced cannot be expected to verify the neural network dynamics constraints induced by (1), unless we add hidden neurons as proposed in section 3.4.

The numerical results of this experiment are shown in figure 9, where we present an exact reproduction of the spiking dynamics with different numbers of hidden neurons. The required number of hidden neurons depends on N, T and D, as described in section 3.4. The maximal number of hidden neurons is shown in figure 10 for a raster with maximal randomness (dynamics with a Bernoulli distribution).

5.3.2. Considering correlated distributions. We now consider a correlated random input, drawn from a Gibbs distribution [39, 40]. To make it simple, the raster input is drawn from a Gibbs distribution, i.e. a parametrized range-R Markov conditional probability of the form

$$P(\{Z_i[k], 1 \leq i \leq N\} \mid \{Z_i[k-l], 1 \leq i \leq N, 1 \leq l < R\}) = \frac{1}{\mathcal{Z}} \exp\left( \Phi_\lambda(\{Z_i[k-l], 1 \leq i \leq N, 0 \leq l < R\}) \right),$$

where $\Phi_\lambda()$ is the related Gibbs potential parametrized by λ and $\mathcal{Z}$ is a normalization constant.

This allows us to test our method on highly correlated rasters. We have chosen a potential of the form

$$\Phi_\lambda(Z|_{k=0}) = r \sum_{i=1}^{N} Z_i[0] + C_t \sum_{i=1}^{N} \prod_{l=0}^{R} Z_i[l] + C_i \prod_{i=1}^{N} Z_i[0],$$

thus with a term related to the firing rate r, a term related to temporal correlations C_t and a term related to inter-unit correlations C_i.

We obtain a less intuitive result in this case, as illustrated in figure 11: even strongly correlated (but non-periodic) rasters are reproduced only when using as many hidden neurons as in the non-correlated case. In fact, we have drawn the number S of hidden neurons against the observation period T, randomly selecting 45 000 inputs, and have obtained the same curve as in figure 10.

Since the raster has chaotic behavior, non-predictable changes occur in the general case, at any time. The raster must thus be generated with a maximal number of degrees of freedom.

5.3.3. Considering biological data. We also consider two examples of biological data8 [41–43]. The data are related to spike synchronization in a population of motor cortical neurons in the monkey, during preparation for movement, in a movement direction and reaction time paradigm (see figures 12 and 13). Raw data are presented trial by trial (without synchronization on the input) for different motion directions, and the input signal is not represented, since it is meaningless for our purpose. The original data resolution was 0.1 ms, while we have considered a 1 ms scale here.

5.4. Input/output estimation

In this section, we present numerical results on experiments with I/O matching. The aim is to find the parameters (delayed weights) for a transfer function and to demonstrate that the proposed methodology is able to learn certain functions in order to approximate I/O spiking dynamics.

8 These biological data were courtesy of Alexa Riehle's team at the Institut de Neurosciences Cognitives de la Mediterranee, Marseille, France.


Figure 13. Raster calculated by the proposed method from a biological data set, with N = 50, γ = 0.95, D = 5, T = 291 and S = 8.

Figure 14. Matching of I/O dynamics; the purple lines represent the input dynamics, and the black ones are the OR function of the inputs: if at least one neuron in the input layer fires a spike at t, then the output fires a spike at t + 1; finally, the red ones represent the initial conditions. Parameters in this setup are: Ni = 5, No = 1, D = 0, T = 100 and S = 6. Exact matching.


In figure 14, we present an I/O spike raster that has been estimated from an OR function. More specifically, this raster corresponds to both activities: on one side the input, represented in purple, and on the other side the output, represented by the black lines. Here, the output is generated by applying an OR function to all the spikes in the input layer. This is a basic example, but it allows us to show that the present approach is able to learn such a function from samples represented by spiking dynamics.
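The training data of figure 14 are easy to regenerate. The following sketch (ours, with illustrative parameters) draws N_i random input trains and derives the OR output one time step later:

```python
import numpy as np

rng = np.random.default_rng(7)
Ni, T = 5, 100
# Random input rasters (one row per input neuron).
Z_in = (rng.random((Ni, T)) < 0.15).astype(float)
# OR readout: the output fires at t + 1 iff at least one input fired at t.
Z_out = np.zeros((1, T))
Z_out[0, 1:] = (Z_in[:, :-1].sum(axis=0) >= 1)
# The (Z_in, Z_out) pair is then fed to the I/O estimation of section 4.
```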

6. Conclusion

Considering a deterministic time-discretized spiking network of neurons with connection weights having delays, we have investigated in detail to what extent it is possible to reverse-engineer the network parameters, i.e. the connection weights, in the exact case. Contrary to the known NP-hard problem which occurs when weights and delays are to be calculated, the present reformulation, expressed as a linear programming (LP) problem, provides an efficient resolution. We have extensively discussed the potential applications of such a mechanism, including regarding what is known as reservoir computing mechanisms, with or without full connectivity, etc.

At the simulation level, this is a concrete illustration that the proposed model can reproduce, over a finite horizon, any raster produced by more realistic models such as Hodgkin–Huxley.

At the computational level, we are here in front of a method which allows one to ‘program’ a spiking network, i.e. find a set of parameters allowing us to exactly reproduce the network output, given an input. Obviously, many computational modules where information is related to ‘times’ and ‘events’ are now easily programmable using the present method. A step further, if the computational requirement is to consider both ‘analogue’ and ‘event’ computations, the fact that we have studied both the unit analogue state and the unit event firing reverse-engineering problems (corresponding to the L and LP problems) suggests that we can easily generalize this method to networks where both ‘times’ and ‘values’ have to be taken into account. The present equations are to be slightly adapted, yielding a new LP problem with both equality and inequality constraints, but the method is there.

At the modeling level, the fact that we do not only statistically reproduce the spiking output, but reproduce it exactly, corresponds to the computational neuroscience paradigm where ‘each spike matters’ [19, 20]. The debate is far beyond the scope of this work, but it is interesting that, when considering natural images, the primary visual cortex activity seems to be very sparse and deterministic, contrary to what happens with unrealistic stimuli [44]. This means that it is not nonsense to address the problem of estimating a raster exactly.

As far as modeling is concerned, the most important message is in the delayed-weights design: the key point, in the present context, is not to have one weight, or one weight and delay, but several weights at different delays. We have seen that this increases the computational capability of the network. In other words, more than the connection's weight, the connection's profile matters.

Furthermore, we point out how the related LP adjustment mechanism is distributed and has the same structure as a ‘Hebbian’ rule. This means that the present learning mechanism corresponds to a local plasticity rule, adapting the unit weights from only the unit's input and output spike times. It has the same architecture as other spike-time-dependent plasticity mechanisms. However, the present rules are supervised learning mechanisms, whereas usual STDP rules are unsupervised ones, and the rule implementations are entirely different.

To what extent this LP algorithm can teach us something about other plasticity mechanisms is an interesting perspective of this work. Similarly, a better understanding of the dynamics of the generated networks is another issue to investigate, as pointed out previously.

We consider the present approach as preliminary and point out that it must be further investigated at three levels, as follows.

• Optimal number of hidden units. We now have a clear view of the role of these hidden units, used to span the linear space corresponding to the expected raster, as detailed in section 3.4. This opens a way not only to find a correct set of hidden units, but also to find a minimal set of hidden units. This problem is in general NP-hard, but efficient heuristics may be found by considering greedy algorithms. We have not further discussed this aspect in this paper, because quite different non-trivial algorithms have to be considered, with the open question of their practical algorithmic complexity; but this is ongoing work.

• Approximate raster matching. We have seen that, in the deterministic case using, e.g., an alignment metric, approximate matching is a much more challenging problem, since the distance to minimize is not differentiable, thus not usable without a combinatorial explosion of the search space. However, if we consider other metrics (see [26, 45] for a review), the situation may be easier to manage, and this is to be investigated further.

• Application to unsupervised or reinforcement learning. Although we have deliberately considered here the simplest paradigm of supervised learning in order to separate the different issues, it is clear that the present mechanism must be studied in a more general setting of, e.g., reinforcement learning [46], for both computational and modeling issues. Since the specification is based on a variational formulation, such a generalization, considering criteria related to other learning paradigms, seems possible to develop.

Although we are still far from solving these three issues, this study is complete in the sense that we do not only propose theory and experimentation, but also a truly usable piece of software9.

9 Source code available at http://enas.gforge.inria.fr/v1

Acknowledgments

This work has been supported by the ERC grant Nervi (number 227747), the ANR grant KEOPS and the European grants BrainScales and SCANDLE (231168). The authors also wish to thank the Council for Science and Technology (CONACYT) of Mexico for its support in the development of this work.

References

[1] Schafer A M and Zimmermann H G 2006 Recurrent neural networks are universal approximators Lecture Notes Comput. Sci. 4131 632–40
[2] Hornik K, Stinchcombe M and White H 1989 Multilayer feedforward networks are universal approximators Neural Netw. 2 359–66
[3] Albers D J and Sprott J C 2006 Structural stability and hyperbolicity violation in high-dimensional dynamical systems Nonlinearity 19 1801–47
[4] Albers D J and Sprott J C 2006 Routes to chaos in high-dimensional dynamical systems: a qualitative numerical study Physica D 223 194–207
[5] Maass W 2001 On the relevance of time in neural computation and learning Theor. Comput. Sci. 261 157–78
[6] Maass W 1997 Fast sigmoidal networks via spiking neurons Neural Comput. 9 279–304
[7] Maass W and Natschlager T 1997 Networks of spiking neurons can emulate arbitrary Hopfield nets in temporal coding Neural Syst. 8 355–72
[8] Maass W and Bishop C M 2003 Pulsed Neural Networks (Cambridge, MA: MIT Press)
[9] Jaeger H 2003 Adaptive nonlinear system identification with echo state networks NIPS 2002: Advances in Neural Information Processing Systems vol 15 ed S Becker, S Thrun and K Obermayer (Cambridge, MA: MIT Press) pp 593–600
[10] Maass W, Natschlager T and Markram H 2002 Real-time computing without stable states: a new framework for neural computation based on perturbations Neural Comput. 14 2531–60
[11] Verstraeten D, Schrauwen B, D'Haene M and Stroobandt D 2007 An experimental unification of reservoir computing methods Neural Netw. 20 391–403
[12] Paugam-Moisy H, Martinez R and Bengio S 2008 Delay learning and polychronization for reservoir computing Neurocomputing 71 1143–58
[13] Gerstner W and Kistler W M 2002 Mathematical formulations of Hebbian learning Biol. Cybern. 87 404–15
[14] Cooper L N, Intrator N, Blais B S and Shouval H Z 2004 Theory of Cortical Plasticity (Singapore: World Scientific)
[15] Toyoizumi T, Pfister J P, Aihara K and Gerstner W 2007 Optimality model of unsupervised spike-timing-dependent plasticity: synaptic memory and weight distribution Neural Comput. 19 639–71
[16] Bohte S M and Mozer M C 2007 Reducing the variability of neural responses: a computational theory of spike-timing-dependent plasticity Neural Comput. 19 371–403
[17] Toyoizumi T, Pfister J P, Aihara K and Gerstner W 2005 Generalized Bienenstock–Cooper–Munro rule for spiking neurons that maximizes information transmission Proc. Natl Acad. Sci. 102 5239–44
[18] Chechik G 2003 Spike-timing-dependent plasticity and relevant mutual information maximization Neural Comput. 15 1481–510
[19] Guyonneau R, VanRullen R and Thorpe S J 2005 Neurons tune to the earliest spikes through STDP Neural Comput. 17 859–79
[20] Delorme A, Perrinet L and Thorpe S 2001 Networks of integrate-and-fire neurons using rank order coding B: spike timing dependent plasticity and emergence of orientation selectivity Neurocomputing 38 539–45
[21] Sima J and Sgall J 2005 On the nonlearnability of a single spiking neuron Neural Comput. 17 2635–47
[22] Orchard G, Russell A, Mazurek K, Tenore F and Etienne-Cummings R 2008 Configuring silicon neural networks using genetic algorithms ISCAS 2008: IEEE Int. Symp. on Circuits and Systems pp 1048–51
[23] Leporati A, Zandron C, Ferretti C and Mauri G 2007 Solving numerical NP-complete problems with spiking neural P systems WMC8: Membrane Computing, Int. Workshop (Thessaloniki, Greece, 2007) (Berlin: Springer) (selected and invited papers)
[24] Paun G, Perez-Jimenez M J and Rozenberg G 2006 Spike trains in spiking neural P systems Int. J. Found. Comput. Sci. 17 975–1002
[25] Wang J, Hoogeboom H J, Pan L, Paun G and Perez-Jimenez M J 2010 Spiking neural P systems with weights Neural Comput. 22 2615–46
[26] Schrauwen B 2007 Towards applicable spiking neural networks PhD Thesis Universiteit Gent, Belgium
[27] Nemenman I, Lewen G D, Bialek W and de Ruyter van Steveninck R R 2006 Neural coding of a natural stimulus ensemble: information at sub-millisecond resolution PLoS Comput. Biol. 4 e1000025
[28] VanRullen R and Thorpe S 2001 Rate coding versus temporal order coding: what the retinal ganglion cells tell the visual cortex Neural Comput. 13 1255–83
[29] Destexhe A 1997 Conductance-based integrate-and-fire models Neural Comput. 9 503–14
[30] Rudolph M and Destexhe A 2006 Analytical integrate-and-fire neuron models with conductance-based dynamics for event-driven simulation strategies Neural Comput. 18 2146–210
[31] Cessac B, Rochel O and Vieville T 2008 Introducing numerical bounds to improve event-based neural network simulation Technical Report 6924 INRIA
[32] Cessac B 2008 A discrete time neural network model with spiking neurons: rigorous results on the spontaneous dynamics J. Math. Biol. 56 311–45
[33] Cessac B and Vieville T 2008 On dynamics of integrate-and-fire neural networks with adaptive conductances Front. Neurosci. 2
[34] Soula H and Chow C C 2007 Stochastic dynamics of finite-size spiking neural networks Neural Comput. 19 3262–92
[35] Darst R B 1990 Introduction to Linear Programming: Applications and Extensions (New York: Dekker)
[36] Bixby R E 1992 Implementing the simplex method: the initial basis ORSA J. Comput. 4 267–84
[37] Victor J D and Purpura K P 1996 Nature and precision of temporal coding in visual cortex: a metric-space analysis J. Neurophysiol. 76 1310–26
[38] Gantmacher F R 1977 Matrix Theory (New York: Chelsea)
[39] Chazottes J R, Floriani E and Lima R 1998 Relative entropy and identification of Gibbs measures in dynamical systems J. Stat. Phys. 90 697–725
[40] Cessac B, Rostro-Gonzalez H, Vasquez J C and Vieville T 2008 Statistics of spike trains, synaptic plasticity and Gibbs distributions Proc. NeuroComp 2008
[41] Riehle A, Grammont F, Diesmann M and Grun S 2000 Dynamical changes and temporal precision of synchronized spiking activity in monkey motor cortex during movement preparation J. Physiol. 94 569–82
[42] Grun S, Diesmann M, Grammont F, Riehle A and Aertsen A 1999 Detecting unitary events without discretization of time J. Neurosci. Methods 94 67–79
[43] Grammont F and Riehle A 2003 Spike synchronization and firing rate in a population of motor cortical neurons in relation to movement direction and reaction time Biol. Cybern. 88 360–73
[44] Baudot P 2007 Nature is the code: high temporal precision and low noise in V1 PhD Thesis University of Paris
[45] Cessac B, Rostro-Gonzalez H, Vasquez J C and Vieville T 2008 To which extent is the ‘neural code’ a metric? Proc. NeuroComp pp 302–6
[46] Sutton R S and Barto A G 1998 Reinforcement Learning: An Introduction (Cambridge, MA: MIT Press)
