CONTINUOUS TIME CORRELATION ANALYSIS TECHNIQUES FOR SPIKE TRAINS
By
IL PARK
A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
UNIVERSITY OF FLORIDA
2007
© 2007 Il Park
Memmings are Memmings,
computers are recursive,
brains are brains.
ACKNOWLEDGMENTS
I thank my adviser Dr. Jose C. Príncipe for all his great guidance, my committee
member Dr. John Harris for insightful suggestions, and Dr. Thomas B. DeMarse for his
knowledge and intuition on experiments. I thank my collaborators Antonio R. C. Paiva
and Karl Dockendorf for all the joyful discussions. I also thank Dongming Xu (dynamics),
Jian-Wu Xu (RKHS), Vaibhav Garg, Manu Rastogi, Savyasachi Singh (chess), Allen
Martins (pdf), Yiwen Wang and Aysegul Gunduz of CNEL, Jason T. Winters, Alex J.
Cadotte, Hany Elmariah (singing) and Nicky Grimes of the Neural Robotics and Neural
Computation Lab for their support and help. Last but not least, I thank my family and
friends for being there.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER
1 INTRODUCTION
  1.1 Motivation
    1.1.1 Why Do We Analyze Spike Trains?
    1.1.2 What Are Similar Spike Trains?
  1.2 Minimal Notation
2 CROSS INFORMATION POTENTIAL
  2.1 Smoothed Spike Train Representation
  2.2 L2 Metric
  2.3 Cauchy-Schwarz Dissimilarity
  2.4 Information Potential
  2.5 Discussion
    2.5.1 Comparison of Distances
    2.5.2 Robustness to Jitter in the Spike Timings
3 INSTANTANEOUS CROSS INFORMATION POTENTIAL
  3.1 Synchrony Detection Problem
  3.2 Instantaneous CIP
    3.2.1 Derivation from CIP
    3.2.2 Spatial Averaging
    3.2.3 Rescaling ICIP
  3.3 Analysis
    3.3.1 Sensitivity to Number of Neurons
  3.4 Results
    3.4.1 High-order Synchronized Spike Trains
    3.4.2 Mirollo-Strogatz Model
4 CONTINUOUS CROSS CORRELOGRAM
  4.1 Delay Estimation Problem
  4.2 Continuous Correlogram
  4.3 Algorithm
  4.4 Results
    4.4.1 Analysis
    4.4.2 Examples
  4.5 Discussion
5 CONCLUSION
  5.1 Summary of Contribution
  5.2 Potential Applications and Future Work
APPENDIX
A BACKGROUND
  A.1 Point Process
    A.1.1 An Alternative Representation of Poisson Process
    A.1.2 Filtered Poisson Process
  A.2 Mean Square Calculus
  A.3 Probability Density Estimation
  A.4 Information Theoretic Learning
  A.5 Reproducing Kernel Hilbert Space
B STATISTICAL PROOFS
C NOTATION
D SOURCE CODE
  D.1 CIP
  D.2 ICIP
  D.3 CCC
REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES

A-1 Various probability density estimation kernels
LIST OF FIGURES

2-1 L2 distance versus CS divergence
2-2 Distance difference of CS divergence for a synchronized or uncorrelated missing spike
2-3 Change in CIP versus jitter standard deviation in the synchronous spike timings
3-1 Spike train as a realization of point process and smoothed spike train
3-2 Variance in scaled CIP versus the number of spike trains used for spatial averaging in log scale
3-3 Analysis of ICIP as a function of synchrony
3-4 Evolution of synchrony in the spiking neural network
3-5 Zero-lag cross-correlation for comparison
4-1 Example of cross correlogram construction
4-2 Decomposition and shift of the multiset A
4-3 Effect of the length of spike train and strength of connectivity on precision of delay estimation
4-4 Effect of kernel size (bin size) of CCC (CCH) on the performance
4-5 Schematic diagram for the configuration of neurons
4-6 Comparison between CCC and CCH on synthesized data
4-7 Effect of length of spike trains on CCC and CCH
4-8 Correlograms for in vitro data
Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science
CONTINUOUS TIME CORRELATION ANALYSIS TECHNIQUES FOR SPIKE TRAINS
By
Il Park
May 2007
Chair: Jose Carlos Príncipe
Major: Electrical and Computer Engineering
Correlation is the most basic analysis tool for time series. To apply correlation
to trains of action potentials generated by neurons, the conventional method is to
discretize time. However, time binning is not optimal: time resolution is sacrificed,
and it introduces the notorious problem of bin size sensitivity. Since spike trains can
be considered as a realization of a point process, the signal has no amplitude and all
information is embedded in the times of occurrence. Instead of time binning, we propose a
set of methods based on kernel smoothing to analyze the correlations. Smoothing is done
in continuous time so we do not lose the exact time of spikes while enabling interaction
between spikes at a distance. We present three techniques derived from correlation: (1)
spike train similarity measure, (2) synchrony detection mechanism, and (3) continuous
cross correlogram.
CHAPTER 1
INTRODUCTION
1.1 Motivation
Signal processing tools such as adaptive filtering, least squares, detection theory,
clustering, and spectral analysis have brought engineers the power to analyze virtually
any signal. However, the application of such tools to the signal of the nervous system, the
spike train, has remained restricted. This is mainly because of the poor performance of
usual estimators for statistical variables such as mean and correlation function for point
process observations.
The foundation of signal processing tools is L2, the metric space of random
processes with finite second order moment, which is a well defined Hilbert space. The
metric (distance measure) on random processes provides a continuous spectrum of similar
signals, a friendly space analogous to Euclidean space. Also, the distance is
strongly related to correlation, which is the inner product in L2. While point processes can
theoretically be treated in the same way, the main problem is to estimate the process from
the observation. In contrast to analog and digital signals, the distance estimator between
two point process observations in the traditional sense yields natural numbers, which are
discrete rather than continuous, so the spectrum of signals is lost. The discrete metric makes
it inappropriate to directly apply signal processing tools to spike trains.
The neuroscience literature has used several approaches to overcome this difficulty.
The most widely used approach is to use time bins to convert the times of occurrence to
a sequence of binary amplitude or discrete time series. Recently, van Rossum proposed a
metric for spike trains [1], which is related to a non-Euclidean metric proposed by Victor
and coworkers which is an extension of the Levenshtein distance (also known as edit
distance in computer science) to continuous time [2]. Many neuroscientists were already
using the van Rossum distance by intuition in the form of correlation [3–6].
We mapped the spike trains to a realization of a random process in L2, so that
traditional signal processing techniques can be readily applied. We will analyze
the properties of the mapping and the metric induced by the mapping. One of the
advantages we gain from this approach is that by choosing the appropriate mapping, the
computational cost can be minimized while the time resolution remains continuous. We
will derive correlation based measures from this space and recover the power of signal
processing tools for spike trains. Specifically, we propose three techniques: (1) the cross
information potential (CIP), a similarity measure between spike trains based on
correlation, (2) the instantaneous cross information potential (ICIP), a measure of
instantaneous synchrony among spike trains, and (3) the continuous cross correlogram
(CCC), an extension of CIP to continuous time lags. All of the proposed techniques have
efficient computation mechanisms and are accompanied by statistical analysis.
1.1.1 Why Do We Analyze Spike Trains?
Neurons communicate mainly through series of action potentials, although
there is increasing evidence that field potentials are also essential in the brain [7].
Action potentials are generated by the complex dynamics of a neuron [8, 9], and have a
stereotypical shape that can propagate over long distances and resist noise
because of the all-or-none type of transmission. There is evidence that not only
the occurrence of an action potential carries information, but also that the duration of the
action potential is systematically modulated [10]; recently it was even shown that
subthreshold dendritic input can modulate synaptic terminals [11].
However, from the computational point of view, it is believed that the temporal
structure of the action potentials is more important than individual details of an action
potential. Experiments, mainly in sensory encoding, demonstrate precise timing (or
precise time to first spike) of action potentials ([12–14]; see [15] for a review and [16] for
arguments against it), which supports the idea of encoding information in spike times. The
precision of spike timings is below 100 µs in the auditory system [17] and on the order of
1 ms in other experiments [14].
The other reason spike trains are widely studied is that they are relatively easy
to record with high accuracy and precision. Extracellular electrode arrays permit
recording from a massive number of neurons simultaneously, in vivo and in vitro.
Many methods have been developed to analyze spike trains for various problems
including correlation analysis [18], connectivity estimation [19, 20], delay estimation [21],
system identification [22], clustering different spike patterns [4, 23], estimating entropy
[24–27], and neural decoding [28, 29]. We will tackle some of these problems with the
proposed techniques.
1.1.2 What Are Similar Spike Trains?
As mentioned in section 1.1.1, the spike times produced by neurons in response to
repeated stimuli often show precise timing with some error. The jitter error distribution
fits a Gaussian distribution [13]. Possible noise sources include thermal noise, ion
channels, probabilistic synapse activation, and spontaneous release of vesicles.
When the spike train is modeled by a Poisson process, the jitter noise restricts the
shape of the intensity function (instantaneous firing rate) over time. In other words, the
noise will limit the narrowness of a precisely timed spike. In addition, this implies that
spike trains with small timing differences should be treated as similar to each other,
thus having a small distance (or dissimilarity1).
We can exploit this and construct a probable intensity function from a spike train
by using the techniques of kernel density estimation. The kernel, which represents the
jitter timing distribution, will be placed where the spikes have actually occurred, and the
summation of all kernels will estimate the intensity function assuming a Poisson process.
Nawrot and coworkers have tried various kernels for single trial estimation of the intensity
1 Distance usually refers to a mathematical metric which satisfies positivity, reflexivity, definiteness, symmetry, and the triangle inequality. However, we will also refer to dissimilarity measures that lack the triangle inequality as a distance, informally and interchangeably with dissimilarity.
function from spike trains in a model, and concluded that the kernel size (bandwidth) is
more important than the shape of the kernel [30].
Another type of noise in spike trains is insertion or deletion of spikes. Although
spike trains of neurons conserve high precision of spike timings when they occur, there
is evidence that neurons often skip a few spikes [4, 31, 32]. When a spike is inserted or
removed from a spike train, the distance differs by the constant 1/2 in the van Rossum
distance.
In contrast, a correlation measure does not depend on signal power (or number of action
potentials), but only on the coincidental action potential pairs. In applications, such
as classification of spike trains with template matching, the correlation based distance
measure (Cauchy-Schwarz divergence) can perform better than van Rossum (L2) distance.
The concept of coincidental spikes leads to synchrony between spike trains. In
addition, there is strong evidence that neurons and dendrites work as coincidence
detectors and are sensitive to afferent synchrony [26, 33–36].
1.2 Minimal Notation
We introduce minimal mathematical notation. We assume that a number of spike
trains are observed and indexed. Each spike train is a finite set of spike timings at which
action potentials are detected. For the spike train indexed by i, individual timings are
denoted as t_m^i, where m is the index for spikes. The functional form of the i-th spike
train is defined as

    s_i(t) = \sum_{m=1}^{N_i} \delta(t - t_m^i),    (1–1)

where N_i is the number of spikes in the i-th spike train, and δ(·) is the Dirac delta function.
CHAPTER 2
CROSS INFORMATION POTENTIAL
2.1 Smoothed Spike Train Representation
Given a spike train s_i(t), we assume an inhomogeneous Poisson process and estimate
the intensity function by using a kernel. The kernel has to be non-negative valued and
have area 1, that is, it has to be a proper probability density function. Denote this kernel
as κ_pdf(t); then the estimated intensity function can be written as

    \lambda_i(t) = \sum_{m=1}^{N_i} \kappa_{pdf}(t - t_m^i).    (2–1)

This process can also be viewed as low pass filtering of the spike train to estimate
the post synaptic potential of synapses. In the point process literature, this is a special
case of a filtered point process, and in the engineering literature it is known as shot noise.1
The estimated intensity function is continuous if κ_pdf is continuous. Assuming continuous
κ_pdf, the mapping of equation (2–1) converts a spike train to a continuous signal that can
be interpreted with second order theory under a continuous metric. Note that the mapping
is one-to-one and onto: deconvolution of λ_i(t) with κ_pdf uniquely determines a spike train.
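For concreteness, the estimator of equation (2–1) can be sketched in a few lines of Python. This is only an illustration with a Gaussian κ_pdf and arbitrarily chosen spike times and kernel width, not the thesis code of appendix D:

```python
import numpy as np

def smoothed_spike_train(spike_times, t_grid, sigma=0.002):
    """Equation (2-1): sum one pdf kernel per spike.  Here kappa_pdf is a
    Gaussian with standard deviation sigma (in seconds), a valid pdf kernel."""
    diffs = t_grid[:, None] - np.asarray(spike_times)[None, :]
    kernels = np.exp(-diffs**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    return kernels.sum(axis=1)

t = np.linspace(0.0, 0.05, 5001)
lam = smoothed_spike_train([0.010, 0.025, 0.040], t)
# each kernel has unit area, so the estimate integrates to the spike count
print(lam.sum() * (t[1] - t[0]))  # close to 3
```

Since each kernel is a proper pdf, the estimated intensity integrates to the number of spikes, which is the defining property used throughout this chapter.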
2.2 L2 Metric
The smoothed spike train, or estimated intensity function, can be considered as a
signal in L2. The distance in L2 of two smoothed spike trains is

    \|\lambda_i - \lambda_j\|_2^2 = \int_{-\infty}^{\infty} (\lambda_i(t) - \lambda_j(t))^2 \, dt    (2–2a)
                                  = \int_{-\infty}^{\infty} \left( \lambda_i^2(t) - 2\lambda_i(t)\lambda_j(t) + \lambda_j^2(t) \right) dt.    (2–2b)
1 When the underlying process is a homogeneous Poisson process, the filtered point process is wide sense stationary (WSS) by Campbell's theorem (see appendix, theorem 3).
Using the definition of the estimator (2–1),

    \int_{-\infty}^{\infty} \lambda_i^2(t) \, dt = \int_{-\infty}^{\infty} \sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa_{pdf}(t - t_m^i) \kappa_{pdf}(t - t_n^i) \, dt    (2–3a)
                                                 = \sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa(t_m^i - t_n^i),    (2–3b)

and the cross term (inner product in L2) becomes

    \int_{-\infty}^{\infty} \lambda_i(t) \lambda_j(t) \, dt = \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \kappa(t_m^i - t_n^j),    (2–3c)

where \kappa(t) = \int_{-\infty}^{\infty} \kappa_{pdf}(s) \kappa_{pdf}(s + t) \, ds. κ is the kernel which computes the correlation.
If an exponential distribution is used, i.e.,

    \kappa_{pdf}(t) = \frac{1}{\tau} e^{-t/\tau} u(t),    (2–4)

where u(t) is the unit step function, then the L2 distance is proportional to the van Rossum
distance with factor 1/τ. In addition, the combined kernel κ(t) becomes a scaled Laplace
distribution kernel:

    \int_{-\infty}^{\infty} \lambda_i(t)\lambda_j(t)\,dt
      = \frac{1}{\tau^2} \int_{-\infty}^{\infty} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \exp\left(-\frac{t - t_m^i}{\tau}\right) u(t - t_m^i) \exp\left(-\frac{t - t_n^j}{\tau}\right) u(t - t_n^j) \, dt    (2–5)
      = \frac{1}{\tau^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \int_{-\infty}^{\infty} \exp\left(-\frac{2t - t_m^i - t_n^j}{\tau}\right) u(t - t_m^i)\, u(t - t_n^j) \, dt    (2–6)
      = \frac{1}{\tau^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \int_{\max(t_m^i,\, t_n^j)}^{\infty} \exp\left(-\frac{2t - t_m^i - t_n^j}{\tau}\right) dt    (2–7)
      = \frac{1}{\tau^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \left(-\frac{\tau}{2}\right) \exp\left(-\frac{2t - t_m^i - t_n^j}{\tau}\right) \Bigg|_{\max(t_m^i,\, t_n^j)}^{\infty}    (2–8)
      = \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \frac{1}{2\tau} \exp\left(-\frac{|t_m^i - t_n^j|}{\tau}\right).    (2–9)

Note that in terms of a linear filter, the causal exponential distribution corresponds to a
first-order infinite impulse response (IIR) filter with time constant τ and gain 1/τ.
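The closed form of equation (2–9) can be verified numerically. The sketch below is illustrative code with made-up spike times (not the thesis code of appendix D); it compares the double sum of Laplacian kernel evaluations against brute-force integration of the product of the two exponentially smoothed trains:

```python
import numpy as np

def inner_product_closed_form(st_i, st_j, tau):
    """Equation (2-9): sum of exp(-|dt|/tau) / (2*tau) over all spike pairs."""
    total = 0.0
    for tm in st_i:
        for tn in st_j:
            total += np.exp(-abs(tm - tn) / tau) / (2.0 * tau)
    return total

def inner_product_numeric(st_i, st_j, tau, t_max, n=400001):
    """Brute force: smooth both trains with the causal exponential kernel of
    equation (2-4) and integrate the product on a fine grid."""
    t = np.linspace(0.0, t_max, n)
    def lam(spikes):
        out = np.zeros_like(t)
        for s in spikes:
            out += np.where(t >= s, np.exp(-(t - s) / tau) / tau, 0.0)
        return out
    return np.sum(lam(st_i) * lam(st_j)) * (t[1] - t[0])

st_i, st_j, tau = [0.010, 0.030], [0.012, 0.045], 0.005
closed = inner_product_closed_form(st_i, st_j, tau)
numeric = inner_product_numeric(st_i, st_j, tau, t_max=0.2)
print(closed, numeric)  # the two values agree closely
```

The closed form requires only N_i × N_j kernel evaluations, which is the computational advantage exploited by the methods in this thesis.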
2.3 Cauchy-Schwarz Dissimilarity
An alternative dissimilarity measure that can be induced from the inner product of L2
is the Cauchy-Schwarz (CS) divergence. Recall the Cauchy-Schwarz inequality (see lemma 6):

    |\langle x | y \rangle| \le \|x\| \, \|y\|.

Since each side is positive when x and y are nonzero vectors, we can divide both sides:

    1 \le \frac{\|x\| \, \|y\|}{|\langle x | y \rangle|}.

By taking the logarithm,

    0 \le \log\left(\frac{\|x\| \, \|y\|}{|\langle x | y \rangle|}\right) \le \infty.

It can be proved that this quantity is positive, reflexive, and symmetric [37] if we exclude
0 from the space. However, the CS divergence does not satisfy the triangle inequality, thus
it is not a metric. By expanding the definition of the inner product and norm of the L2 space,
    d_{CS}(\lambda_i, \lambda_j) = \log \frac{\sqrt{\int_{-\infty}^{\infty} \lambda_i^2(t)\,dt \int_{-\infty}^{\infty} \lambda_j^2(t)\,dt}}{\int_{-\infty}^{\infty} \lambda_i(t)\lambda_j(t)\,dt}    (2–10a)
      = \log \frac{\sqrt{\sum_{m=1}^{N_i}\sum_{n=1}^{N_i} \kappa(t_m^i - t_n^i) \sum_{m=1}^{N_j}\sum_{n=1}^{N_j} \kappa(t_m^j - t_n^j)}}{\sum_{m=1}^{N_i}\sum_{n=1}^{N_j} \kappa(t_m^i - t_n^j)}    (2–10b)
      = \log \sqrt{\sum_{m=1}^{N_i}\sum_{n=1}^{N_i} \kappa(t_m^i - t_n^i) \sum_{m=1}^{N_j}\sum_{n=1}^{N_j} \kappa(t_m^j - t_n^j)} - \log \sum_{m=1}^{N_i}\sum_{n=1}^{N_j} \kappa(t_m^i - t_n^j),    (2–10c)

where d_{CS} denotes the CS divergence.
If the spike trains are homogeneous Poisson with firing rates λ_i and λ_j respectively,
the expected value of the norm of the estimated intensity function, E[λ_i^2(t)], is the second
order moment of the shot noise, which can be obtained from equation (A–8):

    E[\lambda_i^2(t)] = \lambda_i \int_{-\infty}^{\infty} \kappa^2(t) \, dt.    (2–11)

Therefore the first term in equation (2–10c) can be approximated as a constant. However,
depending on the correlation of the spike trains, the second term will vary. Since the
negative logarithm is a monotonically decreasing function, we take its argument, denote it
as V_ij, and define it as the cross information potential, for reasons that will be explained
in section 2.4:

    V_{ij} = \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \kappa(t_m^i - t_n^j).    (2–12)
This inner product term is essentially equivalent to correlation of smoothed spike trains.
CIP is inversely related to CS divergence, so it quantifies similarity between spike trains.
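A direct implementation of the CIP of equation (2–12) and the CS divergence of equation (2–10c) takes only a few lines; the sketch below is illustrative (not the thesis code of appendix D) and uses the Laplacian kernel obtained from the causal exponential κ_pdf, with arbitrarily chosen spike times:

```python
import numpy as np

def laplacian_kernel(dt, tau):
    """kappa(t) = exp(-|t|/tau) / (2*tau): the correlation kernel obtained
    from the causal exponential kappa_pdf of equation (2-4)."""
    return np.exp(-np.abs(dt) / tau) / (2.0 * tau)

def cip(st_i, st_j, tau):
    """Cross information potential, equation (2-12): a double sum of kernel
    evaluations over all spike pairs."""
    dt = np.subtract.outer(np.asarray(st_i), np.asarray(st_j))
    return laplacian_kernel(dt, tau).sum()

def cs_divergence(st_i, st_j, tau):
    """Cauchy-Schwarz divergence, equation (2-10c)."""
    vii = cip(st_i, st_i, tau)
    vjj = cip(st_j, st_j, tau)
    vij = cip(st_i, st_j, tau)
    return 0.5 * np.log(vii * vjj) - np.log(vij)

a = [0.003, 0.008]   # spikes at 3 ms and 8 ms
b = [0.006]          # one spike at 6 ms
print(cs_divergence(a, a, 0.001))  # identical trains give divergence 0
print(cs_divergence(a, b, 0.001))
```

As the code makes explicit, the divergence is symmetric in its arguments and vanishes when the two trains coincide, consistent with the properties cited from [37].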
2.4 Information Potential
Given a probability distribution, entropy quantifies its peakiness and is related
to the higher order moments that the variance cannot capture. Rényi's entropy is a
generalization of the classic Shannon entropy, and it is the basis of information theoretic
learning (see section A.4 for a summary of the information theoretic learning framework).
An inhomogeneous Poisson process can be represented by two separate random variables:
one for the number of spikes and the other for the temporal density (see section A.1.1).
The pdf for the temporal density is simply a normalized form of the intensity function
(equation (A–2)). This pdf does not carry the information of how active the process is,
that is, the firing rate.
The information potential of the density function estimated using a Parzen window with
κ_pdf for the i-th spike train has the following form (compare equation (A–16)):

    V_i = \frac{1}{N_i^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa(t_m^i - t_n^i),    (2–13)

where \kappa(t) = \int_{-\infty}^{\infty} \kappa_{pdf}(s)\kappa_{pdf}(s+t)\,ds is defined as before. This coincides with the
definition of the norm square of the smoothed spike train, equation (2–3b), normalized by the
Figure 2-1. L2 distance versus CS divergence. Spike trains from template 1 are generated
and the distance (or divergence) from each template is computed. Gaussian jitter with
0.7 ms standard deviation is added to the timings. Blue circles correspond to spike trains
with the same number of spikes, and red dots correspond to spike trains with missing
spikes. The kernel κ was Laplacian with time constant τ = 1 ms.
number of spikes. For a pair of spike trains, the cross information potential can be defined
as a similarity index between the corresponding pair of pdfs. Note that in terms of CS
divergence, the normalization by the number of spikes in the spike train cancels out.
2.5 Discussion
2.5.1 Comparison of Distances
As mentioned earlier in section 1.1.2, although neurons fire with high temporal
precision, they often miss spikes. In this case, L2 distance would deviate because of the
missing spike. CS divergence would be less sensitive because it will ignore missing spikes.
To demonstrate this, a simple classification task was performed (see figure 2-1). Two
template spike trains were prepared: template 1 with 2 spikes at 3 ms and 8 ms, and
template 2 with 1 spike at 6 ms. Then, we generated instances of template 1 by adding
Gaussian jitter to the timings (blue circles) and removing a spike (red dots).
For the no missing spike case, both L2 (94%) and CS divergence (100%) correctly
classified the instance as template 1 (they lie on the upper half). But for missing spikes
Figure 2-2. Increase or decrease in Cauchy-Schwarz (CS) divergence (dissimilarity) when
a spike is missing. (Left) When an uncorrelated spike is missing, the divergence decreases,
inversely related to the total number of spikes. (Right) If a correlated (perfectly
synchronized in this case) spike is missing, the divergence increases; the increase depends
on the number of synchronized spikes and is not greatly influenced by the total number of
spikes. In contrast, for the L2 distance the increase and decrease are constant (see text
for details).
case, L2 distance (51%) performed much worse than CS divergence (93%). The CS
divergence forms lines when one spike is missing because the distance (quantified as the
divergence) is the log of the kernel, which is a single Laplacian.
Suppose individual spikes are either well separated relative to the kernel size or
exactly synchronized, so that we can approximate the norm and inner product by spike
counts: the norm square of a spike train is its number of spikes, and the inner product
gives the number of synchronized spikes. This is equivalent to making the kernel size
infinitesimally small, so that the kernel approaches a Dirac delta function.
Let there be two spike trains A and T (for template) with N_A and N_T spikes
respectively, and N_AT synchronized spikes. The L2 distance between A and T is then
N_A + N_T − 2N_AT, and the CS divergence is log(N_A N_T / N_AT).

If we lose a spike that was not synchronous between A and T, the L2 distance will
decrease by the constant 1 (1/2 in the van Rossum distance), and for the CS
Figure 2-3. Change in CIP versus jitter standard deviation in the synchronous spike
timings. For the case with independent spike trains, the error bars for one standard
deviation are also shown. The kernel size is 2 ms (left) and 5 ms (right).
divergence the decrease is

    \log \frac{N_A N_T}{N_{AT}} - \log \frac{N_A (N_T - 1)}{N_{AT}} = \log \frac{N_A N_T}{N_{AT}} \cdot \frac{N_{AT}}{N_A (N_T - 1)}    (2–14)
                                                                    = \log \frac{N_T}{N_T - 1}.    (2–15)

Thus, if there are more spikes, the CS divergence decreases less for a missing
non-synchronous spike. (And if the last spike is lost, the CS divergence is no longer
defined.)

If a synchronized (correlated) spike is lost, N_T and N_AT are both reduced by 1. The L2
distance increases by 3, and for the CS divergence the increase is

    \log \frac{N_A (N_T - 1)}{N_{AT} - 1} - \log \frac{N_A N_T}{N_{AT}} = \log \frac{N_T - 1}{N_T} \cdot \frac{N_{AT}}{N_{AT} - 1}.    (2–16)

Therefore, if there are more synchronized spikes, the increase in divergence is smaller. See
figure 2-2 for an illustrative example.
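The counting arguments of equations (2–14) to (2–16) are easy to check with small numbers. The following is an illustrative calculation with arbitrarily chosen spike counts, not part of the thesis code:

```python
import math

def l2_delta(n_a, n_t, n_at):
    """Delta-kernel limit of the L2 distance: N_A + N_T - 2*N_AT."""
    return n_a + n_t - 2 * n_at

def cs_delta(n_a, n_t, n_at):
    """Delta-kernel limit of the CS divergence: log(N_A * N_T / N_AT)."""
    return math.log(n_a * n_t / n_at)

n_a, n_t, n_at = 10, 10, 5
# losing one non-synchronous spike from T: N_T -> N_T - 1
print(l2_delta(n_a, n_t, n_at) - l2_delta(n_a, n_t - 1, n_at))      # always 1
print(cs_delta(n_a, n_t, n_at) - cs_delta(n_a, n_t - 1, n_at))      # log(N_T/(N_T-1)), eq. (2-15)
# losing one synchronized spike: N_T -> N_T - 1 and N_AT -> N_AT - 1
print(cs_delta(n_a, n_t - 1, n_at - 1) - cs_delta(n_a, n_t, n_at))  # eq. (2-16)
```

With these counts the L2 change is the constant 1 regardless of the totals, while both CS changes shrink as the corresponding spike counts grow, matching the discussion above.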
2.5.2 Robustness to Jitter in the Spike Timings
CIP was analyzed when jitter is present in the spike timings. This was done with
a modified multiple interaction process (MIP) model [38, 39] where jitter, modeled as
i.i.d. Gaussian noise, was added to the individual spike timings. In the MIP model an
initial spike train is generated as a realization of a Poisson process. All spike trains are
derived from this one by copying spikes with a probability ε. The operation is performed
independently for each spike and for each spike train. The resulting spike trains are also
Poisson processes. If γ is the firing rate of the initial spike train, then the derived spike
trains will have firing rate εγ. Furthermore, it can be shown that ε is also the count
correlation coefficient [38]. A different interpretation of ε is that, given a spike in one spike
train, it quantifies the probability of a spike co-occurrence in another spike train.
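The MIP construction just described can be sketched in a few lines; the implementation below is my own illustration (function name and parameters are not from the thesis), including the Gaussian jitter used in this section:

```python
import numpy as np

def mip_spike_trains(rate, duration, n_trains, eps, jitter_sd=0.0, seed=None):
    """Multiple interaction process (MIP): draw an initial Poisson spike
    train with firing rate `rate`, then derive each output train by copying
    every spike independently with probability eps.  Optional i.i.d.
    Gaussian jitter with standard deviation jitter_sd models timing noise."""
    rng = np.random.default_rng(seed)
    n = rng.poisson(rate * duration)
    mother = np.sort(rng.uniform(0.0, duration, n))
    trains = []
    for _ in range(n_trains):
        keep = rng.random(mother.size) < eps
        st = mother[keep] + rng.normal(0.0, jitter_sd, int(keep.sum()))
        trains.append(np.sort(st))
    return trains

trains = mip_spike_trains(rate=40.0, duration=10.0, n_trains=2, eps=0.5, seed=0)
# each derived train has expected firing rate eps * rate = 20 spikes/s
print(len(trains[0]) / 10.0, len(trains[1]) / 10.0)
```

Because every derived train is an independent thinning of the same mother process, the co-occurring spikes across trains produce exactly the controlled synchrony level ε used in the simulations of this section.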
The effect was then studied in terms of the synchrony level and kernel size. Figure 2-3
shows the average CIP for 10 Monte Carlo runs of two spike trains, 10 seconds long, and
with a constant firing rate of 20 spikes/s. In the simulation, the synchrony level was varied
from 0 (independent) to 0.5 for kernel sizes of 2 ms and 5 ms. The jitter standard
deviation was varied from the ideal case (no jitter) to 15 ms.
As mentioned earlier, CIP measures the coincidence of the spike timings. As a
consequence, the presence of jitter in the spike timings decreases the expected values
of CIP (and time averaged ICIP). Nevertheless, the results in Fig. 2-3 support the
statement that the measure is indeed robust to large levels of jitter compared to the kernel
size, and is capable of detecting the existence of synchrony among neurons. Of course,
increasing the kernel size decreases the sensitivity of the measure for the same amount
of jitter. Furthermore, as in the previous example, it is also shown that small levels of
synchrony can be discriminated from the independent case as suggested by the error
bars in Figure 2-3. Finally, we remark that the difference in scale between the figures is
a consequence of the normalization of the kernel so that it is a valid pdf. This can be
compensated explicitly by scaling the CIP by τ . Simply note that the expressions provided
in the previous example for mean ICIP (and therefore CIP) as a function of the synchrony
level implicitly compensate for τ .
CHAPTER 3
INSTANTANEOUS CROSS INFORMATION POTENTIAL
3.1 Synchrony Detection Problem
Coincidental firing of different neurons has long been a focus of interest, from the
synfire chain [40], neural coding [31, 41], neural assemblies [3], and the binding
problem [42], to pulse coupled oscillators [43–47]. Analysis of synchrony has relied on
various methods, such as the cross-correlation [48], the joint peri-stimulus time histogram
(JPSTH) [49], unitary events [50], and the gravity transform [3], among many others.
Since CIP (or CS divergence) characterizes the similarity (or dissimilarity) of spike
trains with correlation of spike times, CIP can also be used as a synchrony measure.
However, CIP does not provide information about instantaneous synchrony. A sliding
window approach can be used, at the cost of temporal resolution, as in cross
correlation and the gravity transform.
3.2 Instantaneous CIP
3.2.1 Derivation from CIP
Let us break the integration range in the definition of the L2 inner product (equation
(2–3c)):

    V_{ij}(t) = \int_{-\infty}^{t} \lambda_i(\sigma) \lambda_j(\sigma) \, d\sigma.    (3–1)

Taking the derivative with respect to time yields the ICIP,

    v_{ij}(t) = \lambda_i(t) \lambda_j(t),    (3–2)
Figure 3-1. Spike train as a realization of a point process and smoothed spike train. (a)
Spike train of neuron i represented in the time domain as a sequence of impulses and (b)
its filtered counterpart using a causal decaying exponential.
by the fundamental theorem of calculus. Since the derivative provides the instantaneous
change of CIP at that time, ICIP quantifies instantaneous synchrony of the action
potential timings. If we use the exponential kernel for intensity estimation, ICIP can be
easily computed with two first-order IIR filters and a multiplication, therefore requiring
no memory beyond two state variables.
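The two-IIR computation can be sketched in discrete time as follows. This is an illustrative approximation on a sampling grid (spike times, τ, and step size are arbitrary choices of mine, not from appendix D):

```python
import numpy as np

def icip(st_i, st_j, tau, dt, duration):
    """Instantaneous CIP, equation (3-2): smooth each spike train with the
    causal exponential kernel realized as a first-order IIR filter (one
    state variable per train), then multiply the two intensities."""
    n = int(round(duration / dt))
    decay = np.exp(-dt / tau)
    spikes_i = np.zeros(n)
    spikes_j = np.zeros(n)
    spikes_i[np.rint(np.asarray(st_i) / dt).astype(int)] = 1.0
    spikes_j[np.rint(np.asarray(st_j) / dt).astype(int)] = 1.0
    lam_i = lam_j = 0.0
    out = np.empty(n)
    for k in range(n):
        lam_i = lam_i * decay + spikes_i[k] / tau  # IIR update, gain 1/tau
        lam_j = lam_j * decay + spikes_j[k] / tau
        out[k] = lam_i * lam_j
    return out

v = icip([0.010, 0.030], [0.011, 0.060], tau=0.005, dt=0.0005, duration=0.1)
# the near-coincident pair at 10 and 11 ms produces the largest peak
print(v.argmax() * 0.0005)
```

Only the two scalars lam_i and lam_j are carried between samples, which is the memoryless property noted above.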
3.2.2 Spatial Averaging
In the context of neural assemblies, ensembles of neurons work together with synchronous
spikes. Current multielectrode recording technology has enabled the analysis
of spike trains recorded simultaneously. It is possible to reduce the trial averaging by
combining the concept of neural assemblies and multiple spike trains recording. The
spatial averaging over the ensemble may provide high resolution of the events.
Consider a set of M spike trains. ICIP (and CIP) can be generalized to multiple spike
trains in a straightforward manner by averaging over all pairwise combinations. That
is, the ensemble-averaged ICIP is given by

v(t) = \frac{2}{M(M-1)} \sum_{i=1}^{M} \sum_{j=i+1}^{M} v_{ij}(t).   (3–3)
Analysis of the spatial averaging is presented in section 3.3.1.
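The pairwise averaging of equation (3–3) can be sketched as follows, under the assumption that a pairwise ICIP routine (such as the one implied by equation (3–2)) is available; the function and its signature are ours, not the thesis's.

```python
from itertools import combinations

def ensemble_icip(trains, pairwise_icip):
    """Ensemble-averaged ICIP (equation 3-3): average a pairwise ICIP
    series over all M(M-1)/2 unordered pairs of spike trains."""
    pairs = list(combinations(range(len(trains)), 2))
    total = None
    for i, j in pairs:
        v = pairwise_icip(trains[i], trains[j])
        total = v if total is None else [a + b for a, b in zip(total, v)]
    # dividing the pair-sum by the pair count reproduces the
    # 2 / (M (M - 1)) prefactor of equation (3-3)
    return [x / len(pairs) for x in total]
```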
3.2.3 Rescaling ICIP
When precise timing is modulated by fluctuations of the firing rate, the precision
of the timing may vary. In high firing rate regions, where spikes are dense, the
experimenter may want to attend only to more precise synchronizations. Adapting the
kernel size to the general firing rate trend can help in these cases.
The time rescaling theorem states that an inhomogeneous Poisson process can be
transformed into a homogeneous Poisson process [51, 52] by stretching time according
to the intensity function. However, transforming equation (2–1) into a constant firing
rate time scale depends on each spike train's individual intensity function, so the
transformed spike trains are no longer mutually synchronous. Thus, in order to quantify synchrony, the
Figure 3-2. Variance in scaled CIP versus the number of spike trains used for spatial averaging, in log scale. The analysis was performed for different levels of synchrony with constant τ = 2 ms (left), and for different values of the exponential decay parameter τ on independent spike trains (right). In both plots the theoretical value of CIP for independent spike trains is shown (dashed line).
correlation operation should be performed on the original time axis, but with the
smoothing done in the transformed space. A first order approximation of this can be
achieved by redefining the intensity estimator as

\lambda_i(t) = \frac{1}{\beta} \sum_{m=1}^{N_i} \exp\!\left( -\frac{f_i(t)\,(t - t_i^m)}{\beta} \right) u(t - t_i^m),   (3–4)

where f_i(t) is itself an estimate of the intensity function and β > 0 is a scaling
constant that specifies the value of τ when the firing rate is one. Therefore, at time t,
the effective time constant is approximately β / f_i(t). It may seem like an oxymoron
to estimate an intensity function using an estimate of the intensity function, but f(t)
is estimated with a broader kernel to capture the firing rate trend, while λ(t) uses a
small kernel size corresponding to the resolution of interest.
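A pointwise sketch of this rescaled estimator, with the broad-kernel trend f_i(t) passed in as a callable; the names are ours and the trend estimator itself is left abstract.

```python
import math

def rescaled_intensity(t, spikes, beta, rate_trend):
    """First-order time-rescaled intensity estimate (equation 3-4):
    each causal exponential decays at the locally appropriate speed,
    giving an effective time constant of roughly beta / rate_trend(t)."""
    f = rate_trend(t)  # broad-kernel firing-rate trend f_i(t)
    return sum(math.exp(-f * (t - tm) / beta)
               for tm in spikes if tm <= t) / beta
```

With a constant unit trend this reduces to the plain exponential estimator with τ = β; a higher local rate shrinks the effective time constant.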
3.3 Analysis
3.3.1 Sensitivity to Number of Neurons
We now analyze the effect of the number of spike trains used for spatial averaging.
This effect was studied with respect to two main factors: the synchrony level of the spike
trains and the exponential decay parameter τ . In the first case, a constant τ = 2ms was
used, while the latter case considered only independent spike trains. The results are shown
in Fig. 3-2 for the scaled CIP spatially averaged over all pair combinations of neurons.
The simulation was repeated for 200 Monte Carlo runs using 10 second long spike trains
obtained as homogeneous Poisson processes with firing rate of 20 spikes/s.
As illustrated in the figure, the variance in CIP decreases dramatically with the
increase in the number of spike trains employed in the analysis. Recall that the number of
pair combinations over which the averaging is performed increases with M(M − 1), where
M is the number of spike trains. As expected, this improvement is most pronounced in the
case of independent spike trains. In this situation, the variance decreases proportionally
to the number of averaged pairs of spike trains. This is shown by the dashed line in the
plots of Fig. 3-2. These results support the role and importance of ensemble averaging as a
principled method to reduce the variance of the CIP estimator.
3.4 Results
3.4.1 High-order Synchronized Spike Trains
Figure 3-3 shows the ICIP for different levels of synchrony over ten spike trains. The
synchrony was generated using the MIP model and modulated over time in one-second
intervals. The firing rate of the generated spike trains was constant and equal to
20 spikes/s for all spike trains. The figure shows the ICIP averaged at each time instant
over all pair combinations of spike trains. Because the spike trains have constant firing
rate, the time constant of the decaying exponential convolved with the spike trains was
constant and chosen to be τ = 2 ms. Also, in the bottom plot the average value of the
mean ICIP is shown. This was computed in 25 ms steps with a causal 250 ms long sliding
window. To establish the relevance of the measured values, the expected value and
this value plus two standard deviations, assuming independence between spike trains,
are also shown. The mean and standard deviation under the independence assumption are 1 and

\sqrt{\left( \frac{1}{2\tau\lambda} + 1 \right)^2 - 1},

respectively (see Appendix for details). The expected value of the ICIP when synchrony
Figure 3-3. Analysis of ICIP as a function of synchrony. (Top) Level of synchrony specified in the simulation of the spike trains. (Upper middle) Raster plot of firings. (Lower middle) Average ICIP across all neuron pair combinations. (Bottom) Time average of the ICIP in the upper plot, computed in steps of 25 ms with a causal rectangular window 250 ms long (dark gray). For reference, the expected value (dashed line) and this value plus two standard deviations (dotted line) for independent neurons are also displayed, together with the expected value during moments of synchronous activity (thick light gray line), as obtained analytically from the level of synchrony used in generating the dataset. Furthermore, the mean and standard deviation of the ensemble-averaged CIP scaled by T, measured from the data in one-second intervals, is also shown (black).
among spike trains exists is given by 1 + ε/(2τλ), with λ the firing rate of the two spike
trains, and is also shown in the plot for reference.
In the figure, it is noticeable that the estimated synchrony, as measured by ICIP,
increases with the specified synchrony level. Moreover, the averaged ICIP is very close
to the theoretical expected value and is typically below the expected maximum under the
independence assumption, given by the line indicating the mean plus two standard
deviations. The delayed increase in the averaged ICIP is a consequence of the causal
averaging of ICIP. It is equally remarkable that the (scaled) CIP matches precisely the
values expected analytically from ICIP.
3.4.2 Mirollo-Strogatz Model
In this example, we show that ICIP can quantify synchrony in a spiking neural
network of leaky integrate-and-fire (LIF) neurons designed according to [43]1 and
compare the result with an extended cross-correlation for multiple neurons. This is the
simplest pulse-coupled network proven to synchronize perfectly from almost any initial
condition (Fig. 3-4). The synchronization is essentially due to the leakiness and the
weak global coupling among the oscillatory neurons.
The raster plot of the network firing pattern is shown in Fig. 3-4. There are two main
observations: the progressive synchronization of the firings associated with the global
oscillatory behavior of the network, and the local grouping that tends to preserve local
synchronizations that either entrain the full network or wash out over time. As expected
from theoretical studies of the network behavior [43, 46], and as ICIP depicts precisely,
the synchronization increases monotonically, with a period of fast increase in the first
second followed by a plateau and a slower increase as time advances. Moreover, it is possible
1 The parameters for the simulation are: 100 neurons, resting and reset membrane potential −60 mV, threshold −45 mV, membrane capacitance 300 nF, membrane resistance 1 MΩ, current injection 50 nA, synaptic weight 100 nV, synaptic time constant 0.1 ms, and all-to-all excitatory connectivity.
Figure 3-4. Evolution of synchrony in the spiking neural network. (Top) Raster plot of the neuron firings. (Middle) ICIP over time. The inset highlights the merging of two synchronous groups. (Bottom) Information potential of the membrane potentials. This is a macroscopic variable describing the synchrony in the neurons' internal state.
Figure 3-5. Zero-lag cross-correlation computed over time using a sliding window 10 bins long, with bin size 1 ms (top) and 1.1 ms (bottom).
to observe in the first 1.5 s the formation of a second group of synchronized neurons which
slowly merges into the main group.
Since the model was simulated, we also have access to all the internal variables: the
membrane potential of individual neurons over time. Thus, we can compute the synchrony
of neurons in terms of membrane potential. Surprisingly, the information potential (IP)
of the membrane potentials reveals the same evolution as the envelope of ICIP, including
the plateau. The IP was computed according to (A–18) using a Gaussian kernel with
size 0.75 mV.2 The IP measures the synchrony of the neurons' internal state, which is only
available in simulated networks. Yet the results show that ICIP was able to successfully
and accurately extract such information from the observed spike trains.
For completeness, in Fig. 3-5 we also present the zero-lag cross-correlation over
time, averaged through all pairwise combinations of neurons. The cross-correlation was
computed with a sliding window 10 bins long, sliding 1 bin at a time. In the figure,
the result is shown for bin sizes of 1 ms and 1.1 ms. It is notable that although the
cross-correlation captures the general trends of synchrony, it masks the plateau and the
final synchrony level, and it is highly sensitive to the bin size, as shown in the figure,
unlike ICIP (data not shown). In other words, the results for the windowed
cross-correlation show the importance of working in "continuous" time, which is crucial
for robust synchrony estimation in the spike domain. Other methods relying on binning,
such as the ones mentioned earlier, also suffer from sensitivity to the bin size. For this
reason, these methods are limited and unable to achieve the same high temporal resolution
as ICIP. In addition, spike trains are generally non-stationary, contrary to what some
methods assume.
The conventional approach is to use a moving window analysis such that only piece-wise
2 The distance used in the Gaussian kernel was d(θ_i, θ_j) = min(|θ_i − θ_j|, 15 mV − |θ_i − θ_j|), where θ_i is the membrane potential of the i-th neuron. This wrap-around effect expresses the phase proximity of the neurons before and after firing.
stationarity is necessary. The information-theoretic framework of ICIP and CIP treats
non-stationarity implicitly, as a pdf estimation problem.
CHAPTER 4
CONTINUOUS CROSS CORRELOGRAM
4.1 Delay Estimation Problem
Precise time delays in spike transmission are considered one of the key features
enabling efficient computation in the cortex [15, 53]. For example, they are crucial for
coincidence detection in auditory signal processing [17]. One effective method for
estimating the delay is the cross correlogram [54]. The cross correlogram is a basic tool
for analyzing the temporal structure of signals. It is widely applied in neuroscience to
assess oscillation, propagation delay, effective connection strength, and the
spatiotemporal structure of a network [28].
However, estimating the cross-correlation of spike trains is non-trivial since they are
point processes: the signals carry no amplitude, only the time instants at which spikes
occur. A well known algorithm for estimating the correlogram from point processes
involves histogram construction with time interval bins [48]. The binning process
effectively transforms uncertainty in time into amplitude variability. This quantization
of time introduces binning error and leads to coarse time resolution. Furthermore, the
correlogram does not take advantage of the higher temporal resolution of spike times
provided by current recording methods.
This can be improved by using smoothing kernels to estimate the cross-correlation
function from finite samples. The resulting cross correlogram is continuous and provides
high temporal resolution in the region of a peak (see Fig. 4-1 for a comparison between
the histogram method and the kernel method). Here, we propose an efficient algorithm for
estimating the continuous correlogram of spike trains without time binning. The
continuous time resolution is achieved by computing only at the finite set of time lags
where the continuous cross correlogram can have a local maximum. The average time
complexity of the proposed algorithm is O(T log T), where T is the duration of the spike
trains. The proposed algorithm is not restricted to simultaneously recorded spike trains;
it also applies to PSTHs and other point processes in general.
Figure 4-1. Example of cross correlogram construction. A and C are two spike trains, each with 4 spikes. Except for the third spike in A, each spike in A invokes a spike in C with some small delay around 10 ms. B represents all the positive (black) and negative (gray) time differences between the spike trains. D shows the positions of the delays obtained in B. E is the histogram of D, which is the conventional cross correlogram with a bin size of 100 ms. F shows the continuous cross correlogram with a Laplacian kernel (solid) and a Gaussian kernel (dotted) with bandwidth 40 ms. Note that the Laplacian kernel is more sensitive to the exact delay.
4.2 Continuous Correlogram
Two simultaneously recorded instances of point processes are represented as sums of
Dirac delta functions at the firing times, s_i(t) and s_j(t):

s_i(t) = \sum_{m=1}^{N_i} \delta(t - t_i^m),   (4–1)

where N_i is the number of spikes and t_i^m are the times of the action potentials. The
cross-correlation function is defined as

Q^{\dagger}_{ij}(\Delta t) = E_t\left[ s_i(t)\, s_j(t + \Delta t) \right],   (4–2)
where E_t[·] denotes the expected value over time t. The cross-correlation can be
interpreted as a scaled conditional probability of the j-th neuron firing given that the
i-th neuron fired ∆t seconds earlier [55]. In a physiological context, the propagation
delay imposes a physical restriction on when an action potential can causally invoke
another action potential. This delay therefore appears in the cross correlogram as a
region of increased amplitude, and estimating the delay amounts to finding the lag at
which the cross correlogram has a maximum (inhibitory interactions, which appear as
troughs rather than peaks, are not considered here).
Smoothing a point process is superior to the histogram method for estimating the
intensity function [30], and especially its maxima [56]. Similarly, the cross-correlation
function can also be estimated better with smoothing; since the smoothing is done in
continuous time, the exact spike times are preserved while spikes at a distance can
still interact.
Instead of smoothing the histogram of time differences between two spike trains,
we first smooth the spike train to obtain a continuous signal [57]. We will show that
this is equivalent to smoothing the time differences with a different kernel. A causal
exponential decay was chosen as the smoothing kernel to achieve computational efficiency
(see section 4.3). Smoothed spike trains are represented as

q_i(t) = \sum_{m=1}^{N_i} \frac{1}{\tau}\, e^{-(t - t_i^m)/\tau}\, u(t - t_i^m),   (4–3)
where u(t) is the unit step function. The cross correlation function of the smoothed spike
trains is

Q^{*}_{ij}(\Delta t) = E_t\left[ q_i(t)\, q_j(t + \Delta t) \right].   (4–4)
Given a finite length of observation, the expectation in equation (4–4) can be
estimated from samples as,
Q^{*}_{ij}(\Delta t) = \frac{1}{T} \int_{0}^{\infty} q_i(t)\, q_j(t + \Delta t)\, dt,   (4–5)
where T is the length of the observation. After evaluation of the integral, the resulting
estimator becomes,
Q^{*}_{ij}(\Delta t) = \frac{1}{2\tau T} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} e^{-\left| t_i^m - t_j^n - \Delta t \right| / \tau},   (4–6)
which is equivalent to the kernel intensity estimation [58, 59] from time differences using a
Laplacian distribution kernel.
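Before introducing the fast algorithm of section 4.3, equation (4–6) can be evaluated directly at any set of lags. The following brute-force sketch (our naming, O(N_i N_j) per lag) is useful as a reference implementation.

```python
import math

def ccc_direct(spikes_i, spikes_j, tau, T, lags):
    """Direct evaluation of the smoothed cross-correlation estimator
    (equation 4-6): a Laplacian kernel placed on every pairwise spike
    time difference, normalized by 2*tau*T."""
    return [sum(math.exp(-abs(ti - tj - dt) / tau)
                for ti in spikes_i for tj in spikes_j) / (2.0 * tau * T)
            for dt in lags]
```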
The mean and variance of the estimator are analyzed by assuming the spike trains are
realizations of two independent homogeneous Poisson processes:

E\left[ Q^{*}_{ij}(\Delta t) \right] \simeq \lambda_A \lambda_B,   (4–7)

\operatorname{var}\left( Q^{*}_{ij}(\Delta t) \right) \simeq \frac{\lambda_A \lambda_B}{4\tau T},   (4–8)

where λ_A and λ_B denote the firing rates of the Poisson processes of which the i-th and
j-th spike trains, respectively, are realizations (see Appendix for the derivation). Note
that the variance decreases linearly as the duration of the spike trains grows. By
removing the mean and dividing by the standard deviation, we standardize the measure for
inter-experiment
Figure 4-2. Decomposition and shift of the multiset A.
comparison:

Q_{ij}(\Delta t) = \frac{\sqrt{4\tau T}\, \left( Q^{*}_{ij}(\Delta t) - \lambda_A \lambda_B \right)}{\sqrt{\lambda_A \lambda_B}}.   (4–9)
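The standardization in equation (4–9) is a one-liner once the rates are known; a sketch (function name ours), using the mean and variance from equations (4–7) and (4–8).

```python
import math

def standardize_ccc(q_star, rate_a, rate_b, tau, T):
    """Standardize Q* (equation 4-9): subtract the independent-Poisson
    mean lambda_A * lambda_B and divide by the standard deviation
    sqrt(lambda_A * lambda_B / (4 * tau * T)) from equations (4-7), (4-8)."""
    mean = rate_a * rate_b
    std = math.sqrt(mean / (4.0 * tau * T))
    return (q_star - mean) / std
```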
4.3 Algorithm
The algorithm divides the summation for the continuous cross correlogram into disjoint
regions and combines the results. We show that there are only finitely many possible
local maxima, and that by storing intermediate results for neighboring time lags, the
cross-correlation at each lag can be computed in constant time.
The essential quantity to be computed is the double summation

Q_{ij}(\Delta t) = \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} e^{-\left| t_i^m - t_j^n - \Delta t \right| / \tau}.   (4–10)
The basic idea for efficient computation is that a summation of exponentials evaluated
on a collection of points can be shifted with only one multiplication:
\sum_i e^{x_i + \delta} = \left( \sum_i e^{x_i} \right) e^{\delta}. Since a Laplacian
kernel is two exponentials stitched together, the two regions must be handled separately.
Define the multiset of all time differences between the two spike trains,

A = \{\, \theta \mid \theta = t_i^m - t_j^n,\; m = 1, \ldots, N_i,\; n = 1, \ldots, N_j \,\}.   (4–11)
Even though A is not strictly a set, since it may contain duplicates, we will abuse the set
notation for simplicity. Note that the cardinality of the multiset A is NiNj. Now equation
(4–10) can be rewritten as
Q_{ij}(\Delta t) = \sum_{\theta \in A} e^{-|\theta - \Delta t| / \tau}.   (4–12)
Now let us define a series of operations for a multiset B ⊂ ℝ and δ ∈ ℝ:

B^{+} = \{\, x \mid x \in B \text{ and } x \ge 0 \,\},   (non-negative lags)   (4–13a)

B^{-} = \{\, x \mid x \in B \text{ and } x < 0 \,\},   (negative lags)   (4–13b)

B_{\delta} = \{\, x \mid y \in B \text{ and } x = y - \delta \,\}.   (shift)   (4–13c)
Since B can be decomposed into the two disjoint sets B^{+} and B^{-}, equation (4–12)
can be rewritten and decomposed as

Q_{ij}(\Delta t) = \sum_{\theta \in A_{\Delta t}} e^{-|\theta|/\tau} = \sum_{\theta \in (A_{\Delta t})^{+}} e^{-|\theta|/\tau} + \sum_{\theta \in (A_{\Delta t})^{-}} e^{-|\theta|/\tau}   (4–14a)

= \sum_{\theta \in (A_{\Delta t})^{+}} e^{-\theta/\tau} + \sum_{\theta \in (A_{\Delta t})^{-}} e^{\theta/\tau}.   (4–14b)
For convenience, we define the summations

Q^{\pm}_{ij}(\Delta t) = \sum_{\theta \in (A_{\Delta t})^{\pm}} e^{\mp \theta/\tau}.   (4–15)
Let us order the multiset A in ascending order and denote its elements as
\theta_1 \le \theta_2 \le \ldots \le \theta_n \le \theta_{n+1} \le \ldots \le \theta_{N_i N_j}.
Observe that within an interval \Delta t \in (\theta_n, \theta_{n+1}], the multiset
((A_{\Delta t})^{\pm})_{-\Delta t} is always the same (see Fig. 4-2). In other words, if
\Delta t = \theta_{n+1}, then for a small change \delta \in [0, \theta_{n+1} - \theta_n),
the multisets do not change their membership, i.e.,
((A_{\Delta t})^{\pm})_{\delta} = (A_{\Delta t - \delta})^{\pm}. Therefore, an arbitrary
shift of Q^{\pm}_{ij} simplifies to a single multiplication by an exponential:
Q^{\pm}_{ij}(\Delta t - \delta) = \sum_{t \in (A_{\Delta t - \delta})^{\pm}} e^{\mp t/\tau} = \sum_{t \in ((A_{\Delta t})^{\pm})_{\delta}} e^{\mp t/\tau}   (4–16a)

= \sum_{t \in (A_{\Delta t})^{\pm}} e^{\mp (t - \delta)/\tau} = \sum_{t \in (A_{\Delta t})^{\pm}} e^{\mp t/\tau}\, e^{\pm \delta/\tau} = Q^{\pm}_{ij}(\Delta t)\, e^{\pm \delta/\tau}.   (4–16b)
Thus, local changes of Q_{ij} can be computed with a constant number of operations no
matter how large the set A is:

Q_{ij}(\Delta t - \delta) = Q^{+}_{ij}(\Delta t - \delta) + Q^{-}_{ij}(\Delta t - \delta)   (4–17a)

= Q^{+}_{ij}(\Delta t)\, e^{\delta/\tau} + Q^{-}_{ij}(\Delta t)\, e^{-\delta/\tau}.   (4–17b)
If there is a local maximum or minimum of Q_{ij}(\Delta t - \delta), it occurs where
dQ_{ij}(\Delta t - \delta)/d\delta = 0, that is, at

\delta^{*} = \frac{\tau}{2} \left( \ln Q^{-}_{ij}(\Delta t) - \ln Q^{+}_{ij}(\Delta t) \right).   (4–18)
Also, since the second derivative satisfies

\frac{d^2 Q_{ij}(\Delta t - \delta)}{d\delta^2} = \frac{1}{\tau^2} \left( Q^{+}_{ij}(\Delta t)\, e^{\delta/\tau} + Q^{-}_{ij}(\Delta t)\, e^{-\delta/\tau} \right) \ge 0,   (4–19)
Q_{ij}(\Delta t - \delta) is a convex function of \delta within this range. Thus, the
maximum always lies at one of the endpoints of the valid range; only a local minimum can
occur in between.
In principle, we would need to compute equation (4–10) for all ∆t ∈ [−T*, T*] to achieve
continuous resolution, where T* is the maximum time lag of interest. However, if we only
want the local minima and maxima, we just need to evaluate at all ∆t ∈ A and compute the
extrema using equations (4–17b) and (4–18). Therefore, if we compute Q^{\pm}_{ij}(\theta_n)
for all \theta_n \in A, we can compute \delta^{*} for every interval
(\theta_n, \theta_{n+1}] in which a local extremum exists. These quantities can be
computed with the following recursive formulae:
Q^{-}_{ij}(\theta_{n+1}) = Q^{-}_{ij}(\theta_n)\, e^{-(\theta_{n+1} - \theta_n)/\tau} + 1,   (4–20a)

Q^{+}_{ij}(\theta_{n+1}) = Q^{+}_{ij}(\theta_n)\, e^{(\theta_{n+1} - \theta_n)/\tau} - 1.   (4–20b)
In practice, due to the accumulation of numerical error, the following form is preferable
for Q^{+}_{ij}:

Q^{+}_{ij}(\theta_n) = \left( Q^{+}_{ij}(\theta_{n+1}) + 1 \right) e^{-(\theta_{n+1} - \theta_n)/\tau}.   (4–21)
Initial conditions for the recursions are Q^{-}_{ij}(\theta_1) = 1 and
Q^{+}_{ij}(\theta_N) = 0. The resulting pseudocode is listed in Algorithm 1.
Algorithm 1 Calculate Q_ij
Require: τ > 0, A ≠ ∅, N = |A|
Ensure: Q_ij(∆t) = Σ_{t∈A} e^{−|t−∆t|/τ}, ∀∆t ∈ A
1: A ⇐ sort(A)                          ▷ O(N log N)
2: Q−(1) ⇐ 1
3: Q+(N) ⇐ 0
4: for k = 1 to N − 1 do
5:   ed(k) ⇐ e^{−(A(k+1)−A(k))/τ}
6: end for
7: for k = 1 to N − 1 do
8:   Q−(k + 1) ⇐ 1 + Q−(k) · ed(k)
9:   Q+(N − k) ⇐ (Q+(N − k + 1) + 1) · ed(N − k)
10: end for
11: for k = 1 to N do
12:   Q_ij(A(k)) ⇐ Q+(k) + Q−(k)
13: end for
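A direct transcription of Algorithm 1 into Python (our port, using 0-based indexing), which can be checked against the brute-force double sum of equation (4–10).

```python
import math

def correlogram_at_lags(spikes_i, spikes_j, tau):
    """Algorithm 1: evaluate Q_ij(dt) = sum_{theta in A} exp(-|theta-dt|/tau)
    at every dt in A, the multiset of pairwise spike time differences.
    Sorting dominates, so the cost is O(N log N) with N = Ni * Nj."""
    A = sorted(ti - tj for ti in spikes_i for tj in spikes_j)
    N = len(A)
    ed = [math.exp(-(A[k + 1] - A[k]) / tau) for k in range(N - 1)]
    Qm = [0.0] * N          # Q^-: terms with theta <= current lag
    Qp = [0.0] * N          # Q^+: terms with theta >  current lag
    Qm[0] = 1.0
    for k in range(N - 1):
        Qm[k + 1] = 1.0 + Qm[k] * ed[k]                        # eq. (4-20a)
        Qp[N - 2 - k] = (Qp[N - 1 - k] + 1.0) * ed[N - 2 - k]  # eq. (4-21)
    return A, [qm + qp for qm, qp in zip(Qm, Qp)]
```

Local extrema between consecutive lags can then be located with equations (4–17b) and (4–18).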
The bottleneck in time complexity is the sorting of the multiset A, so the overall time
complexity is O(N_i N_j \log(N_i N_j)).1 Note that the time complexity of a
straightforward evaluation of equation (4–10) is O(N_i N_j) for each time lag ∆t.
Assuming a homogeneous Poisson process for each spike train, the average time complexity
becomes O(N* log N*), where N* = λ_A λ_B T, T is the length of the spike trains, and λ_A
and λ_B are the average firing rates of the Poisson processes. Note that the conventional
cross correlogram algorithm [48] has an average time complexity of O(N*).
4.4 Results
In this section, we analyze the statistical properties and demonstrate the usefulness
of the continuous cross correlogram (CCC) estimator compared to the cross correlation
histogram (CCH). The CCC is defined by the linear interpolation of equation (4–9)
1 It is possible to reduce the sorting cost to O(N_i N_j \log(\min(N_i, N_j))) by merge-sorting partially sorted lists. However, this is only a minor improvement in general.
Figure 4-3. Effect of the length of the spike trains and the strength of connectivity on the precision of delay estimation. (a) Effect of spike train length. (b) Effect of connection strength. The precision is estimated by the standard deviation over 1000 Monte Carlo runs with kernel size τ = 0.4 ms (or bin size h = 1.96 ms). A smaller standard deviation indicates higher temporal resolution.
between the possible maxima (but not the minima). In order to compare with CCC, CCH
is standardized in a similar way to equation (4–9) according to [60].
Since CCH is essentially equivalent to using a uniform distribution kernel (a boxcar
kernel) sampled at equally spaced intervals, as opposed to the Laplacian distribution
kernel used in CCC, we make a fair comparison by choosing the kernel size (bin size) of
both distributions to have the same standard deviation. To be specific, if the time bin
size of CCH is h, then we compare the result to CCC with a kernel size of
\tau = h / (2\sqrt{6}).
Since the histogram method is highly sensitive to the bin size, we used the optimal bin
size selection procedure for Poisson processes suggested by [61]. The method is designed
for estimating a firing rate or PSTH from measurements assuming a Poisson process.
However, since the time differences between two Poisson processes of finite length can
themselves be considered a realization of a Poisson process, the procedure can be applied
directly to the CCH.
Figure 4-4. Effect of the kernel size (bin size) of CCC (CCH) on performance. The connection strength was 5% and the spike trains are 10 seconds long, i.e., 5 spikes are correlated on average. (a) Sensitivity of CCC and CCH to the kernel size for a noise standard deviation of 0.25 ms. The horizontal dotted line indicates the performance when the optimal bin size is chosen for each set of simulated spike time differences. The median of the chosen optimal bin sizes (right) and the corresponding kernel size for CCC (left) are plotted as vertical dashed lines. Note that CCC is robust to the kernel size selection and performs better than CCH. (b) For different standard deviations of the jitter noise, the precision is plotted versus the kernel size τ. Note that the optimal kernel size increases as the jitter variance increases. For each point, 3000 Monte Carlo runs are used, and the actual delay is uniformly distributed from 3 ms to 4 ms to reduce the bias of CCH.
4.4.1 Analysis
For a pair of directly synapsing neurons, the delay from the generation of an action
potential in the presynaptic neuron to the generation of an action potential in the
postsynaptic neuron is not always precise. Various sources of noise, such as variability
in axonal conduction delay, presynaptic waveform, probability of presynaptic vesicle
release, and the threshold mechanism [62], affect the location, significance, and width
of the cross correlogram peak. Furthermore, if the neurons are embedded in a network,
multiple paths, common input sources, recurrent feedback, and local field potential
fluctuations can influence the cross correlogram.
In this section, we model the timing jitter with a Gaussian distribution and analyze
the statistical properties of CCC and CCH for time delay estimation. A pair of Poisson
spike trains with firing rate 10 spikes/s were correlated by copying a portion of the
spikes from one train to the other, shifted by the delay plus Gaussian jitter noise. The
fraction of spikes copied represents the effective synaptic connectivity.
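The generation model just described can be sketched as follows. The exact copy-and-jitter protocol in the thesis may differ in details (e.g., whether copied spikes are added to an independent background train), so treat the function name, signature, and that choice as our assumptions.

```python
import random

def correlated_poisson_pair(rate, T, strength, delay, jitter_sd, seed=0):
    """Generate a pair of Poisson spike trains (rate in spikes/s, duration
    T in s). A fraction `strength` of the spikes of train A is copied into
    train B, shifted by `delay` plus Gaussian jitter of s.d. `jitter_sd`;
    B also carries independent background spikes at the same rate."""
    rng = random.Random(seed)
    def poisson_train():
        out, t = [], 0.0
        while True:
            t += rng.expovariate(rate)   # exponential inter-spike interval
            if t >= T:
                return out
            out.append(t)
    a = poisson_train()
    b = poisson_train()                  # independent background in B
    for t in a:
        if rng.random() < strength:      # effective synaptic connectivity
            b.append(t + delay + rng.gauss(0.0, jitter_sd))
    return a, sorted(b)
```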
The total number of correlated spikes depends on two factors: the length of the spike
trains and the synaptic connectivity. In figure 4-3, the precision of CCC and CCH is
compared with respect to these factors. The precision is defined as the standard
deviation of the error in estimating the exact delay. The precision of both CCC and CCH
improves with a similar trend as the number of correlated spikes increases. CCC converges
to a precision lower than half the jitter noise standard deviation (500 µs).
The optimal kernel size (or bin size), which gives the best precision, depends on the
jitter noise level. In figure 4-4(a), CCC and CCH are compared across different kernel
sizes. In general, CCC performs better than CCH at its optimal bin size, and over most
of the bin sizes. As mentioned above, CCH is sensitive to the bin size, whereas the
precision of CCC is robust to the kernel size. Also note that the optimal kernel size for
CCC corresponds to the median of the selected variance-optimal bin sizes (vertical dashed
lines).
Figure 4-5. Schematic diagram of the configuration of the neurons (neurons A and B coupled by four synapses with delays of 12.3 ms, 13.7 ms, 4.3 ms, and 9.3 ms).
Increasing the jitter level worsens the best precision and increases the optimal kernel size
for CCC as shown in Fig. 4-4(b).
4.4.2 Examples
In this section, we demonstrate the power of CCC using two examples: the first uses
synthetic spike trains from a simple spiking neuronal network model, and the second uses
recordings from a cortical culture on a microelectrode array (MEA).
Two standard leaky integrate-and-fire neurons are configured with 4 synapses: two from
neuron A to neuron B, and two in the other direction, as illustrated in figure 4-5. The
individual synapses are static (no short- or long-term plasticity), have equal weights,
and generate EPSPs (excitatory postsynaptic potentials) with a time constant of 1 ms.
Each neuron is injected with positively biased Gaussian white noise current, so that it
fires with a mean rate of 35 spikes/s. The simulation step size is 0.1 ms.
As shown in figure 4-6, both CCH and CCC identify the delays imposed by the conduction
delay, the synaptic delay, and the delay in action potential generation caused by the
noisy fluctuation of the membrane potential. However, the time lag identified by CCC is
more accurate than that of CCH: the temporal precision of CCH is limited by both the bin
size and the jitter noise on the delay, whereas that of CCC is limited only by the
jitter. In other words, if there is no jitter, or a sufficient number of spike timings
share the exact delay, then CCC is capable of quantifying the delay with infinite
resolution.
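Given the lags and values produced by Algorithm 1, the delay estimate itself is just the best-scoring lag inside a physiologically plausible window; a trivial helper (ours, not from the thesis):

```python
def estimate_delay(lags, values, lo, hi):
    """Return the lag with the largest correlogram value within [lo, hi],
    a plausible window for causal synaptic delays."""
    window = [(v, dt) for dt, v in zip(lags, values) if lo <= dt <= hi]
    return max(window)[1]
```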
Figure 4-6. Comparison between CCC (continuous cross-correlogram) and CCH (cross-correlation histogram) on synthesized data.
Figure 4-7. Effect of the length of the spike trains. Comparison of the continuous cross correlogram (left) and the cross-correlation histogram (right) for different spike train lengths (2.5, 5, and 10 seconds). The estimated optimal bin size is 0.267 ms.
In figure 4-7, we illustrate how the performance of the methods depends on the length of
the spike trains. When the spike trains are only 2.5 seconds long, the CCC has
significantly lower time resolution in regions where no spikes had that time difference,
yet maintains high resolution at the highly correlated peaks. In contrast, the CCH is
uniformly sampled regardless of the amount of data. This non-uniform sampling gives CCC
a significant advantage when only a short segment of data is available.
To test the method further, spike trains recorded in vitro were used. We recorded
electrical activity from dissociated E-18 rat cortex cultured on a 60-channel
microelectrode array from MultiChannel Systems [63]. For a particular pair of electrodes,
specific delays were observed, as shown in Fig. 4-8. These delays occur rarely (3 to 5
times over 5 to 10 minutes of recording), yet their precision is below 2 ms, which makes
them significant in the CCC. The delays persisted for at least 2 days, and many more
interaction delays became observable as the culture matured. In the CCH analysis, by
contrast, it is almost impossible to detect the delays and their consistency.
Figure 4-8. CCC (top) and CCH (bottom) of 7 DIV (days in vitro) and 9 DIV cortical culture recordings. Spike trains from two adjacent electrodes (channels 20 and 21; firing rates 0.57/0.52 Hz at 7 DIV and 0.15/1.11 Hz at 9 DIV) are analyzed. On 7 DIV, CCC shows two significant peaks that are also observable on 9 DIV, and some non-significant spike time differences correspond to peaks on 9 DIV (marked with arrows). In contrast, in the CCH this structure is difficult to discern. The optimal bin size is 3.8 ms for the 7 DIV data and 3.3 ms for the 9 DIV data. The total recording time is 350 seconds for 7 DIV and 625 seconds for 9 DIV.
Note that the delays are much longer than the expected conduction time, which is
estimated to be on the order of 2 ms for a conduction speed of 100 µm/ms [64]. One
possible mechanism is a rarely activated chain of synaptic pathways from a common source
neuron with different delays. In contrast to a recent study [65], in which the delay
between two channels is estimated with a single approximate Gaussian distribution of
relatively large variance, we observe multiple delays between channels.
4.5 Discussion
We proposed an estimator of the cross correlogram from observations of a point
process and provided an efficient algorithm to compute it. The method exploits the fact
that there are more samples where the correlation is stronger; thus, computing the
continuous correlogram at the lags of the samples provides a non-uniform sampling that is
advantageous for estimating precise delays. Unfortunately, this non-uniform sampling is
disadvantageous for inhibitory relations, so only positively related delays can be accurately
estimated. To achieve computational efficiency, the algorithm is limited to the Laplacian
distribution as the kernel. However, it has been shown that the bandwidth (kernel size) is
more important than the shape of the kernel for the performance of intensity estimation [30].
The only free parameter is the kernel size, which determines the amount of smoothing.
Unlike the conventional histogram method, the proposed method is robust to the choice of
kernel size; however, the optimal kernel size depends on the noise level of the delay. In
a biological neuronal network, the noise level may depend on the path through which the
signal was transmitted, so each peak of the correlogram may have a different amount of
noise. We suggested using the optimal histogram bin size as a guideline for kernel size
selection.
The continuous cross correlogram can be viewed as a generalization of the cross
information potential where the correlation is interpreted as similarity (or dissimilarity)
between spike trains as we discussed in chapter 2. The proposed algorithm can be used
to find the similarity between two spike trains over continuous time lags. However, due to
the accumulation of numerical error, the algorithm has to be non-causal (see equation (4–21)).
This prevents the algorithm from being used as an online filter to detect certain spike train
patterns, although offline analysis can still be done.
The proposed algorithm is not limited to cross-correlations; it can be directly
applied to smooth any type of point process histogram, such as the PSTH. However, one
always has to be cautious when the underlying process is highly non-stationary, since
various non-stationarities can cause peaks in the correlogram [66].
CHAPTER 5
CONCLUSION
5.1 Summary of Contribution
The techniques presented here are based on smoothing spike trains with a continuous
kernel, which preserves the time resolution while yielding a continuous signal. We
demonstrated the usefulness of the Cauchy-Schwarz divergence as a metric for smoothed spike
trains when spikes may be missing. The CS divergence is related to the similarity measure
CIP, which is the inner product of the smoothed spike trains in L2. ICIP, the derivative
of CIP, was proposed as an instantaneous synchrony measure and extended to an ensemble
average. Finally, a time lag was incorporated into CIP to obtain a cross correlation function
of spike trains. All three algorithms can be computed efficiently, without approximation,
with complexity depending only on the number of spikes and independent of the sampling rate.
5.2 Potential Applications and Future Work
Given a similarity (or dissimilarity/divergence) measure with efficiently computable
closed form, the possibilities are endless. Clustering, classification, system identification,
and adaptive filtering can be applied to spike trains. We have some preliminary results
on stimulus-to-response mapping and on stimulus estimation from the response of a dissociated
cortical tissue culture, and we intend to apply the techniques to various experiments.
In neuroscience, connectivity estimation, delay estimation, and identification of
synchronous neural groups would be the most obvious applications. Correlating
synchrony with attention or behavior would also be interesting. From a more engineering
perspective, detection of seizures, building a liquid state machine from living tissue, and
the study of synchrony dynamics in pulse coupled oscillators seem promising. Finding
valid delay subnetworks [67] and polychronous groups of neurons [34] may also be possible
using CCC.
APPENDIX A
BACKGROUND
A.1 Point Process
A point process is a statistical random process in which events (points) are distributed
over a continuous space. Typically the magnitude of each event is ignored and only its
position (time) is described (otherwise it is called a marked point process). The distribution
of trees on a mountain, raindrops in space, earthquake occurrences over time, and action
potentials in a spike train are examples of point processes.
In this section, we introduce some notation and definitions for point processes.¹
A point process is built up from counting random variables, which map the sample space to a
natural number representing the number of events in a given region.
Definition 1 (Counting Process [51]). Let Ω be the sample space consisting of realizations
of points ω = {x_1, x_2, . . .} ∈ Ω. We define the counting process N(A : ω) as
N(A : \omega) = \sum_i I_A(x_i),
where I_A(x) denotes the set characteristic function of A,
I_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \notin A. \end{cases}
By taking the derivative of a realization of a counting process, a realization of a point
process is obtained. In this case, the realization of the point process consists of delta
functions at the locations of the events. Spike trains will be treated as realizations of a
point process for the rest of the thesis.
The simplest type of point process is the Poisson process. In a Poisson process,
each event is independent, and the probability of firing at a location (time) is completely
determined by the functional parameter Λ(t). When Λ(t) is differentiable, we call its
derivative λ(t) the intensity function.

¹ Some material in this section is adapted from Snyder [51].
Definition 2 (Temporal Poisson process). A temporal Poisson process for times t ≥ t0 is
a counting process N(t) : t ≥ t0 with the following properties:
1. Pr[N(t0) = 0] = 1;
2. for t0 ≤ s < t, the increment N(s, t) = N(t) − N(s) is Poisson distributed with
parameter Λ(t)− Λ(s),
\Pr[N(s, t) = n] = \frac{1}{n!} \left(\Lambda(t) - \Lambda(s)\right)^n e^{-(\Lambda(t) - \Lambda(s))},
where n is a nonnegative integer, and Λ(t) is a finite-valued, nonnegative, nondecreasing
function of t;
3. N(t) : t ≥ t0 has independent increments.
Proposition 1. Let [t_i, u_i), i = 1, 2, . . . , k, be disjoint intervals on [t_0, ∞). If
{N(t) : t ≥ t_0} is a temporal Poisson process, then the independence of increments implies
\Pr[N(t_1, u_1) = n_1,\, N(t_2, u_2) = n_2,\, \cdots,\, N(t_k, u_k) = n_k] = \prod_{i=1}^{k} \Pr[N(t_i, u_i) = n_i].
We assume that the spike trains are realizations of a Poisson process, so that the intensity
function λ(t) corresponds to the underlying (instantaneous) firing rate. This assumption
is based on statistics observed from in vivo systems and is frequently considered a good
approximation [68]. The simple formulation of the Poisson process enables analytical treatment
of the tools (some of which are presented in this appendix).
A.1.1 An Alternative Representation of Poisson Process
For any finite interval, there are only a finite number of events in a Poisson process. The
statistics in the interval can also be described by a combination of two random variables.
The first random variable represents the distribution of the number of events in the
interval, which follows the Poisson distribution
f(k, \lambda) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad \text{(A–1)}
where k is the number of events, λ = Λ(T) − Λ(0) is the expected number of events in the
interval of length T, and ! is the factorial operator. The second random variable X
represents the distribution of the finite events (points) over the interval. Its distribution
is obtained by normalizing the intensity function λ(t) over the interval to make it a
pdf [69]:
f_X(x) = \frac{\lambda(x)}{\int_0^T \lambda(t)\, dt}. \quad \text{(A–2)}
The equivalence can be shown from the joint distribution of the points by considering all
possible orderings (order statistics) [51, 52].
A.1.2 Filtered Poisson Process
A smoothed spike train is a form of shot noise, and if the underlying point process is
Poisson, we can obtain the moments analytically using characteristic functionals.
Theorem 2 (Characteristic functional for a filtered Poisson process [51]). Let a Poisson
process with intensity function λ(t) defined on t ≥ t_0 be filtered by a causal linear
filter with impulse response h(σ, τ; u), resulting in a continuous time signal y(t). The
characteristic functional of y(t), defined as
\phi_y(j\nu) = E\left[\exp\left(j \int_{t_0}^{T} y(\sigma)\, d\nu(\sigma)\right)\right], \quad \text{(A–3)}
has the evaluation
\phi_y(j\nu) = \exp \int_{t_0}^{T} \lambda(\tau)\, E\left[\exp\left(j \int_{\tau}^{T} h(\sigma, \tau; u)\, d\nu(\sigma)\right) - 1\right] d\tau, \quad \text{(A–5)}
where ν(·) is any function with
\int\!\!\int f(\alpha, \beta)\, d\nu(\alpha)\, d\nu(\beta) < \infty
and
f(\alpha, \beta) = \int_{t_0}^{\min(\alpha, \beta)} \lambda(\tau)\, E[h(\alpha, \tau; u)\, h(\beta, \tau; u)]\, d\tau + \int_{t_0}^{\alpha} \lambda(\tau)\, E[h(\alpha, \tau; u)]\, d\tau \int_{t_0}^{\beta} \lambda(\tau)\, E[h(\beta, \tau; u)]\, d\tau.
See [51], pages 219–220, for the proof.
We can choose the form of ν(·) to be
\nu(\sigma) = \begin{cases} 0, & t_0 \le \sigma < t, \\ \alpha, & t \le \sigma < T. \end{cases} \quad \text{(A–6)}
Then the characteristic function of y(t) becomes
M_{y(t)}(j\alpha) = \exp \int_{t_0}^{t} \lambda(\tau)\, E\left[e^{j\alpha h(t, \tau; u)} - 1\right] d\tau. \quad \text{(A–7)}
Therefore the n-th cumulant γ_n of y(t) can be derived as
\gamma_n = \int_{t_0}^{t} \lambda(\tau)\, E\left[h^n(t, \tau; u)\right] d\tau. \quad \text{(A–8)}
There are ways to obtain the actual pdf [70–72]; however, the closed form is highly
complicated.
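For a concrete instance of equation (A–8) (our own worked example, assuming a constant rate λ and a causal exponential impulse response h(t, τ) = e^{−(t−τ)/θ} with a hypothetical time constant θ), the cumulants evaluate to:

```latex
\gamma_n = \int_{t_0}^{t} \lambda \, e^{-n(t-\tau)/\theta} \, d\tau
         = \frac{\lambda \theta}{n}\left(1 - e^{-n(t-t_0)/\theta}\right)
         \;\approx\; \frac{\lambda \theta}{n}
         \quad \text{for } t - t_0 \gg \theta .
```

In particular, the mean of the smoothed train approaches λθ and its variance approaches λθ/2, both proportional to the firing rate.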
The following theorem supports the claim that the correlation function of the smoothed spike
train is meaningful under the assumption of Poisson spike trains.
Theorem 3 (Campbell's Theorem [51]). Shot noise of a homogeneous Poisson process is
wide sense stationary.
Furthermore, the power spectral density of the smoothed spike train is the same as that
obtained by exciting the system (filter h) with white Gaussian noise [51].
A.2 Mean Square Calculus
Statistical signal processing is based on the second order theory, or mean square calculus,
of random processes. In this section, we give a brief introduction to the theory.
First, we introduce L_2, the space of all random variables with a finite second order
moment:
L_2 = \{ X \mid E[|X|^2] < \infty \}. \quad \text{(A–9)}
It can be shown that L_2 is a Hilbert space [73]. In this space, the order of the limit and
expectation operators can be exchanged up to second order.
Definition 3 (Mean-square continuity). Let X(t) be a stochastic process defined on the
real line. X(t) is continuous in the mean square sense at t if
\lim_{h \to 0} E\left[|X(t + h) - X(t)|^2\right] = 0.
Definition 4 (Mean-square differentiability). The random process X(t) is mean-square
differentiable if the following limit exists:
\lim_{h \to 0} \frac{X(t + h) - X(t)}{h}.
Proposition 4. A random process with a well defined correlation function belongs to L_2.
Note that the mean square error is equivalent to the Euclidean distance:
\int (x(t) - y(t))^2\, dt = \int \left(x(t)^2 + y(t)^2\right) dt - 2 \int x(t)\, y(t)\, dt. \quad \text{(A–10)}
A.3 Probability Density Estimation
Estimating a probability density function (pdf) from a set of samples (observations)
has been one of the fundamental problems in statistics [74]. Parametric methods
assumes a distribution and fits the data to the distribution, which is usable only if the
assumed model is at least approximately correct. On the other hand, nonparametric
approach makes a milder assumption, usually in the form that the pdf is continuous. One
of the widely used nonparametric method is the histogram. The other is Parzen window,
or otherwise known as kernel density estimation [59]. These can be motivated from the
empirical cumulative distribution function F(x) [74]:
F(x) = \frac{\text{number of samples in } (x - h, x + h)}{\text{total number of samples}}. \quad \text{(A–11)}
Plugging equation (A–11) into the definition of the pdf,
f(x) = \frac{dF(x)}{dx} = \lim_{h \to 0} \frac{F(x + h) - F(x - h)}{2h}, \quad \text{(A–12)}
the estimate can be written in the following form:
\hat{f}(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right), \quad \text{(A–13)}
where N is the total number of samples, h is the bandwidth, and K is defined as
K(x) = \begin{cases} \frac{1}{2}, & -1 < x \le 1, \\ 0, & \text{otherwise}. \end{cases} \quad \text{(A–14)}
This is the histogram method if x is evaluated on every non-overlapping interval of size h.
Note that equation (A–14) is a uniform distribution. By allowing any pdf as the probability
density estimation kernel, we obtain kernel density estimation. It has been shown
that, among all nonnegative kernels with compact support, the Epanechnikov kernel is optimal
with respect to the asymptotic mean integrated squared error (AMISE); however, the Gaussian
and other kernels are widely used [74].
The free parameter h, the bandwidth, determines how smooth the estimate will
be and, in general, should depend on the number of samples in the region. For a fixed
bandwidth, minimizing the AMISE provides an optimal bandwidth which balances the bias and
variance of the estimate [74]. There have been a number of extensions to the fixed-bandwidth
kernel density estimation method [75]. The general idea is to decrease the bandwidth in
regions where there are more samples and to use a larger bandwidth where there are fewer samples.
Kernel          K(x)
Epanechnikov    (3/4)(1 − x²)
Uniform         1/2
Triangle        1 − |x|
Gaussian        (1/√(2π)) e^{−x²/2}
Laplacian       (1/√2) e^{−√2|x|}
Table A-1. Various probability density estimation kernels. The Gaussian and Laplacian kernels have infinite support; the other kernels have [−1, 1] as their support.
One of the drawbacks of kernel density estimation is boundary bias. When the
support of the pdf is finite, infinite support kernels will underestimate the density near
the boundary, and even finite support kernels will leak some of the density outside the
support of the pdf.
Since kernel density estimation provides a relatively accurate continuous pdf estimate
with a finite summation, a set of algorithms based on the pdf can be written in an efficient
manner. Information theoretic learning, a framework for signal processing with information
theoretic cost functions, combines Rényi's quadratic entropy with kernel density estimation
and nonparametrically estimates entropy without approximations [37, 76].
A.4 Information Theoretic Learning
CIP is strongly related to the information theoretic learning framework [77]. For a
random variable X with pdf f(x), Rényi's quadratic entropy is defined as [78]
H_{R2} = -\log \int_{-\infty}^{\infty} f^2(x)\, dx = -\log E[f(x)]. \quad \text{(A–15)}
The argument of the logarithm,
V_X = \int_{-\infty}^{\infty} f^2(x)\, dx = E[f(x)], \quad \text{(A–16)}
is called the information potential (IP) [77].
As mentioned in section A.3, Rényi's quadratic entropy can be estimated efficiently with
kernel density estimation. Let {x_i : i = 1, . . . , N} be a set of N i.i.d. samples of a
random variable X. Then the pdf of X can be approximated nonparametrically by
\hat{f}(x) = \frac{1}{N} \sum_{i=1}^{N} \kappa_{pdf}(x, x_i), \quad \text{(A–17)}
where κ_pdf(·, ·) is the kernel. Substituting this estimator into the definition of the
information potential, equation (A–16), yields
\hat{V}_X = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \int_{-\infty}^{+\infty} \kappa_{pdf}(x, x_i)\, \kappa_{pdf}(x, x_j)\, dx = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa(x_i, x_j), \quad \text{(A–18)}
where \kappa(x_i, x_j) = \int_{-\infty}^{+\infty} \kappa_{pdf}(x, x_i)\, \kappa_{pdf}(x, x_j)\, dx. Note that we are estimating the entropy of
a continuous random variable directly with sums of kernel evaluations, without any
approximation.
Let f_i(x) and f_j(x) be the pdfs of random variables X_i and X_j, defined on the same
probability space. A distance between the pdfs of the two random variables can be defined
in the space of the distributions with the Cauchy-Schwarz (CS) distance, I_CS,
I_{CS} = \log \frac{\sqrt{\left(\int f_i^2(t)\, dt\right)\left(\int f_j^2(t)\, dt\right)}}{\int f_i(t)\, f_j(t)\, dt} = \log \frac{\sqrt{V_i V_j}}{V_{ij}}, \quad \text{(A–19)}
where Vi is the information potential of the ith random variable [77]. It is important to
remark that ICS is in fact approximating the Kullback-Leibler divergence [79] between
the two pdfs; however, a significant advantage is the ease of computation of this measure
using the information potential. Notice also that in the argument of the logarithm
the numerator contains the normalizing terms. In other words, the behavior of ICS
is determined by the denominator term, Vij, appropriately called cross information
potential (CIP). Much like the IP, the CIP expresses a potential due to interactions
between particles, but from different random variables. Because the CIP negatively affects
the CS divergence, it is in effect measuring the similarity between the two distributions.
A.5 Reproducing Kernel Hilbert Space
In kernel methods, the concept of reproducing kernel Hilbert space (RKHS) is often
mentioned. The smoothing of a spike train can be seen as applying a kernel method, and
projecting the spike trains into an RKHS. This follows from the fact that the Laplacian
distribution is a positive definite kernel. Indeed, we are using a subspace of L_2 which
consists of the smoothed spike trains and their linear combinations, and which is also an
RKHS. Being an RKHS provides the kernel trick, so that the algorithms can be made efficient.
Definition 5 (Inner Product). The inner product on a vector space V over a field K is
a mapping 〈·|·〉 : V × V → K such that
(I1) ∀x ∈ V, 〈x|x〉 ≥ 0 and 〈x|x〉 = 0 ⟺ x = 0;
(I2) 〈x|y〉 = \overline{〈y|x〉};
(I3) ∀x, y, z ∈ V, ∀a, b ∈ K, 〈ax + by|z〉 = a〈x|z〉 + b〈y|z〉.
Definition 6 (Norm induced by inner product). For a K-vector space V equipped
with an inner product, the norm of a vector is defined as ‖x‖ = √〈x|x〉.
Proof.
‖λx‖ = √〈λx|λx〉 = √(|λ|² 〈x|x〉) = |λ| ‖x‖ (A–20)
‖x + y‖ ≤ ‖x‖ + ‖y‖, by Lemma 5 and the Cauchy-Schwarz inequality (A–21)
‖x‖ = 0 ⟺ x = 0, by (I1) in Definition 5 (A–22)
Lemma 5. ‖x + y‖2 = ‖x‖2 + 2Re〈x|y〉+ ‖y‖2 .
Proof. ‖x + y‖2 = 〈x + y|x + y〉 = ‖x‖2 + 〈y|x〉+ 〈x|y〉+ ‖y‖2
Lemma 6 (Cauchy-Schwarz inequality). |〈x|y〉| ≤ ‖x‖ ‖y‖ .
Proof. Suppose ‖y‖ ≠ 0. For λ ∈ K,
0 ≤ ‖x − λy‖² = 〈x − λy|x − λy〉 = 〈x|x〉 − \bar{λ}〈x|y〉 − λ〈y|x〉 + |λ|² 〈y|y〉.
Let λ = 〈x|y〉/〈y|y〉. Then
0 ≤ 〈x|x〉 − 〈y|x〉〈x|y〉/〈y|y〉 − 〈x|y〉〈y|x〉/〈y|y〉 + 〈x|y〉〈y|x〉/〈y|y〉,
which is equivalent to
〈x|y〉〈y|x〉 ≤ 〈x|x〉〈y|y〉.
Definition 7 (Cauchy Sequence). A sequence of elements x_n, indexed by n ∈ N, of a metric
space with metric d(·, ·) is a Cauchy sequence if for all ε > 0 there exists an N ∈ N such
that for all n, m ≥ N, d(x_n, x_m) < ε.
Definition 8 (Complete Metric Space). A metric space is complete if every Cauchy
sequence converges to a point in the space.
Definition 9 (Hilbert Space). A vector space V that is complete under the norm induced by
an inner product is a Hilbert space.
Definition 10 (Reproducing Kernel Hilbert Space (RKHS)). A Hilbert space H of functions
is an RKHS if there exists a kernel K : V × V → K such that K(·, y) ∈ H for all y and,
for all f ∈ H, the reproducing property holds:
f(y) = 〈f(·)|K(·, y)〉.
Remark 1 (Linear Operator View). An RKHS is a sub-Hilbert space of L_2. In particular,
the RKHS is the span of the kernel's eigenfunctions with nonzero eigenvalues. In other words,
we restrict the general Hilbert space L_2 to a smaller space where the reproducing property
holds. Also note that L_2 itself is not an RKHS.
Definition 11 (Positive Semi-definite Kernel). A positive semi-definite kernel is a
function on X × X with the following property: for every natural number n, for all
x_1, . . . , x_n in X, and for all α_1, . . . , α_n real or complex,
\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \bar{\alpha}_j K(x_i, x_j) \ge 0.
Theorem 7 (Moore-Aronszajn Theorem [80]). Given a symmetric positive semi-definite
kernel K, there exists a unique RKHS H with K as the reproducing kernel.
Proof. Let A be the set of all functionals of the form Φ_i(·) = K(·, i). Define linear
combinations of the functionals pointwise:
∀f, g : I → R, ∀a, b ∈ R, ∀x ∈ I, (af + bg)(x) = a f(x) + b g(x). (A–23)
Let B be the vector space spanned by A. Now define the inner product 〈·|·〉 and the
norm ‖f‖ = √〈f|f〉 on B by
〈f|g〉 = \left\langle \sum_{i \in I} \alpha_i^f \Phi_i(\cdot) \,\middle|\, \sum_{j \in I} \alpha_j^g \Phi_j(\cdot) \right\rangle (A–24)
= \sum_{i \in I} \alpha_i^f \sum_{j \in I} \alpha_j^g \langle \Phi_j(\cdot)|\Phi_i(\cdot)\rangle (A–25)
= \sum_{i \in I} \sum_{j \in I} \alpha_i^f \alpha_j^g K(i, j). (A–26)
To ensure that the inner product is well defined, two different representations of f, g ∈ B
must lead to the same inner product, which is straightforward to verify.
Let us complete the space by including all limits of Cauchy sequences {f_n : n ∈ N, f_n ∈ B},
and denote the result by H, which is a Hilbert space. Note that B is a dense subset of H.
The reproducing property of H is immediate:
\langle K_{S_i}(\cdot)|f\rangle = \left\langle K_{S_i}(\cdot) \,\middle|\, \sum_{l \in L_f} \alpha_l^f K_{S_{i_l}}(\cdot) \right\rangle (A–27)
= \sum_{l \in L_f} \alpha_l^f \langle K_{S_i}(\cdot)|K_{S_{i_l}}(\cdot)\rangle (A–28)
= \sum_{l \in L_f} \alpha_l^f K(S_i, S_{i_l}) (A–29)
= \sum_{l \in L_f} \alpha_l^f K(S_{i_l}, S_i) (A–30)
= f(S_i). (A–31)
APPENDIX B
STATISTICAL PROOFS
To assess the significance of the correlation, it is necessary to know the probability
distribution of the estimator under the null hypothesis (independent Poisson spike
trains). However, instead of calculating the complicated closed form of the distribution
of Q*_ij(∆t), we derive the mean and variance of the estimator Q*_ij(∆t) and assume
Gaussianity. For time binning with a sufficiently small bin size, Palm and coworkers
have derived the statistics of the histogram of a Poisson process [60]; their analysis can
easily be applied to CIP by multiplying by the normalization factor.
Let Ω(λ, T) be the class of all possible homogeneous Poisson spike trains of rate λ and
length T. The probability of a realization Ω_i = {t_1^i, t_2^i, . . . , t_{N_i}^i} of Ω(λ_A, T) is
P[\Omega = \Omega_i|\lambda_A, T] = P[N(T) = N_i,\, t_1 = t_1^i,\, t_2 = t_2^i,\, \ldots,\, t_{N_i} = t_{N_i}^i] \quad \text{(B–1)}
= P[N(T) = N_i]\, P[t_1 = t_1^i,\, \ldots,\, t_{N_i} = t_{N_i}^i \mid N = N_i] \quad \text{(B–2)}
= \frac{(\lambda_A T)^{N_i}}{N_i!}\, e^{-\lambda_A T} \prod_{m=1}^{N_i} P[t = t_m^i] \quad \text{(B–3)}
= \frac{(\lambda_A T)^{N_i}}{N_i!}\, e^{-\lambda_A T} \prod_{m=1}^{N_i} \frac{1}{T} = \frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T}. \quad \text{(B–4)}
The expected value of the estimator over all possible pairs of independent spike trains is
E_{ij}[Q^*_{ij}(\Delta t)] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} P[\Omega_i, \Omega_j|\lambda_A, \lambda_B, T]\, Q^*_{ij}(\Delta t)\, d\Omega_i\, d\Omega_j \quad \text{(B–5a)}
= \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty} \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} \frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T}\, \frac{\lambda_B^{N_j}}{N_j!}\, e^{-\lambda_B T}\, \frac{1}{2\tau T} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} e^{-\frac{|t_m^i - t_n^j - \Delta t|}{\tau}}\, dt_1^i\, dt_2^i \cdots dt_{N_i}^i\, dt_1^j\, dt_2^j \cdots dt_{N_j}^j \quad \text{(B–5b)}
= \frac{1}{2\tau T}\, e^{-(\lambda_A + \lambda_B)T} \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty} \frac{\lambda_A^{N_i}}{N_i!}\, \frac{\lambda_B^{N_j}}{N_j!}\, T^{N_i - 1}\, T^{N_j - 1} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \int_0^T\!\int_0^T e^{-\frac{|t_m^i - t_n^j - \Delta t|}{\tau}}\, dt_m^i\, dt_n^j \quad \text{(B–5c)}
= \frac{1}{2\tau T}\, e^{-(\lambda_A + \lambda_B)T} \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty} \frac{(\lambda_A T)^{N_i}}{N_i!}\, \frac{(\lambda_B T)^{N_j}}{N_j!}\, \frac{N_i N_j}{T^2} \int_0^T\!\int_0^T e^{-\frac{|t_m^i - t_n^j - \Delta t|}{\tau}}\, dt_m^i\, dt_n^j \quad \text{(B–5d)}
Let us evaluate the integral first. From the symmetry of t_m^i and t_n^j we can assume
∆t ≥ 0 without loss of generality:
\int_0^T\!\int_0^T e^{-\frac{|t_m^i - t_n^j - \Delta t|}{\tau}}\, dt_m^i\, dt_n^j \quad \text{(B–6a)}
= \int_0^{\Delta t}\!\int_0^T e^{\frac{t_m^i - t_n^j - \Delta t}{\tau}}\, dt_n^j\, dt_m^i + \int_{\Delta t}^{T}\!\int_0^{t_m^i - \Delta t} e^{-\frac{t_m^i - t_n^j - \Delta t}{\tau}}\, dt_n^j\, dt_m^i + \int_{\Delta t}^{T}\!\int_{t_m^i - \Delta t}^{T} e^{\frac{t_m^i - t_n^j - \Delta t}{\tau}}\, dt_n^j\, dt_m^i \quad \text{(B–6b)}
= -\tau \int_0^{\Delta t} \left(e^{\frac{t_m^i - T - \Delta t}{\tau}} - e^{\frac{t_m^i - \Delta t}{\tau}}\right) dt_m^i + \tau \int_{\Delta t}^{T} \left(1 - e^{-\frac{t_m^i - \Delta t}{\tau}}\right) dt_m^i - \tau \int_{\Delta t}^{T} \left(e^{\frac{t_m^i - T - \Delta t}{\tau}} - 1\right) dt_m^i \quad \text{(B–6c)}
= -\tau^2 \left(e^{-\frac{T}{\tau}} - e^{-\frac{T + \Delta t}{\tau}} - 1 + e^{-\frac{\Delta t}{\tau}}\right) + \tau(T - \Delta t) - \tau^2 \left(1 - e^{-\frac{T - \Delta t}{\tau}}\right) - \tau^2 \left(e^{-\frac{\Delta t}{\tau}} - e^{-\frac{T}{\tau}}\right) + \tau(T - \Delta t) \quad \text{(B–6d)}
= 2\tau(T - \Delta t) + \tau^2 \left(e^{-\frac{T - \Delta t}{\tau}} + e^{-\frac{T + \Delta t}{\tau}} - 2e^{-\frac{\Delta t}{\tau}}\right) \quad \text{(B–6e)}
= 2\tau(T - \Delta t) + O(\tau^2). \quad \text{(B–6f)}
Approximating λ_A = N_i/T and substituting the integral into equation (B–5d) gives
E_{ij}[Q^*_{ij}(\Delta t)] \simeq \lambda_A \lambda_B\, \frac{2\tau(T - \Delta t) + O(\tau^2)}{2\tau T}, \quad \text{(B–7)}
where O(τ²) collects the terms of order τ² or higher. Assuming τ ≪ 1 and ∆t ≪ T,
equation (B–7) can be approximated by λ_Aλ_B, which is the desired value.
Now let us evaluate the second moment of the estimator:
E_{ij}[Q^*_{ij}(\Delta t)^2] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} P[\Omega_i, \Omega_j|\lambda_A, \lambda_B, T]\, Q^*_{ij}(\Delta t)^2\, d\Omega_i\, d\Omega_j \quad \text{(B–8a)}
= \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty} \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} \frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T}\, \frac{\lambda_B^{N_j}}{N_j!}\, e^{-\lambda_B T}\, \frac{1}{4\tau^2 T^2} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r=1}^{N_i} \sum_{s=1}^{N_j} e^{-\frac{|t_p^i - t_q^j - \Delta t|}{\tau}}\, e^{-\frac{|t_r^i - t_s^j - \Delta t|}{\tau}}\, dt_1^i \cdots dt_{N_i}^i\, dt_1^j \cdots dt_{N_j}^j \quad \text{(B–8b)}
= \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty} \frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T}\, \frac{\lambda_B^{N_j}}{N_j!}\, e^{-\lambda_B T}\, \frac{1}{4\tau^2 T^2}\, T^{N_i} T^{N_j}\, \frac{1}{T^4} \int_0^T\!\!\int_0^T\!\!\int_0^T\!\!\int_0^T \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r=1}^{N_i} \sum_{s=1}^{N_j} e^{-\frac{|t_p^i - t_q^j - \Delta t|}{\tau}}\, e^{-\frac{|t_r^i - t_s^j - \Delta t|}{\tau}}\, dt_p^i\, dt_q^j\, dt_r^i\, dt_s^j \quad \text{(B–8c)}
Let us consider the integral part first:
\frac{1}{T^4} \int_0^T\!\!\int_0^T\!\!\int_0^T\!\!\int_0^T \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r=1}^{N_i} \sum_{s=1}^{N_j} e^{-\frac{|t_p^i - t_q^j - \Delta t|}{\tau}}\, e^{-\frac{|t_r^i - t_s^j - \Delta t|}{\tau}}\, dt_p^i\, dt_q^j\, dt_r^i\, dt_s^j \quad \text{(B–9a)}
= \frac{1}{T^4} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r \ne p} \sum_{s \ne q} \left[\int_0^T\!\!\int_0^T e^{-\frac{|t_p^i - t_q^j - \Delta t|}{\tau}}\, dt_p^i\, dt_q^j\right] \left[\int_0^T\!\!\int_0^T e^{-\frac{|t_r^i - t_s^j - \Delta t|}{\tau}}\, dt_r^i\, dt_s^j\right]
\; + \frac{1}{T^2} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \int_0^T\!\!\int_0^T e^{-\frac{2|t_p^i - t_q^j - \Delta t|}{\tau}}\, dt_p^i\, dt_q^j
\; + \frac{1}{T^3} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r \ne p} \int_0^T\!\!\int_0^T\!\!\int_0^T e^{-\frac{|t_p^i - t_q^j - \Delta t|}{\tau}}\, e^{-\frac{|t_r^i - t_q^j - \Delta t|}{\tau}}\, dt_p^i\, dt_q^j\, dt_r^i
\; + \frac{1}{T^3} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{s \ne q} \int_0^T\!\!\int_0^T\!\!\int_0^T e^{-\frac{|t_p^i - t_q^j - \Delta t|}{\tau}}\, e^{-\frac{|t_p^i - t_s^j - \Delta t|}{\tau}}\, dt_p^i\, dt_q^j\, dt_s^j \quad \text{(B–9b)}
= \frac{N_i N_j (N_i - 1)(N_j - 1)}{T^4} \left(2\tau(T - \Delta t) + O(\tau^2)\right)^2
\; + \frac{N_i N_j}{T^2} \left(\tau(T - \Delta t) + O(\tau^2)\right)
\; + \frac{N_i N_j (N_i - 1)}{T^3} \left(2\tau^2 T\left(2 + e^{-T/\tau}\right) + O(\tau^3)\right)
\; + \frac{N_i N_j (N_j - 1)}{T^3} \left(2\tau^2 T\left(2 + e^{-T/\tau}\right) + O(\tau^3)\right) \quad \text{(B–9c)}
By assuming τ ≪ 1 ≪ T, we can approximate e^{−T/τ} ≈ 0 and O(τ³) ≈ 0. We further
approximate λ_A = N_i/T and N_i − 1 ≈ N_i. These approximations lead to
E_{ij}[Q^*_{ij}(\Delta t)^2] \simeq \frac{1}{4\tau^2 T^2} \left[ (\lambda_A \lambda_B)^2 \left(2\tau(T - \Delta t)\right)^2 + (\lambda_A \lambda_B)\, \tau(T - \Delta t) + (\lambda_A \lambda_B)(\lambda_A + \lambda_B)\, 4\tau^2 T \right]
= (\lambda_A \lambda_B)^2 \left(\frac{T - \Delta t}{T}\right)^2 + \frac{\lambda_A \lambda_B}{4\tau T}\, \frac{T - \Delta t}{T} + (\lambda_A \lambda_B)(\lambda_A + \lambda_B)\, \frac{1}{T}. \quad \text{(B–10)}
Finally, the variance of the estimator is given by
E_{ij}\left[\left(Q^*_{ij}(\Delta t) - \lambda_A \lambda_B\right)^2\right] \simeq \frac{\lambda_A \lambda_B (T - \Delta t)}{4\tau T^2}. \quad \text{(B–11)}
APPENDIX C
NOTATION
Spaces
K generic field
N natural number
R real field
Rd d-dimensional Euclidean space
H Hilbert space [p. 59]
I index set space for spike trains [p. ??]
General notation
{t_m^i : m = 1, . . . , N_i} spike train as a set of spike timings [p. ??]
si(t) spike train as a function over time [p. ??]
h(t; τ) impulse response of a linear filter
qi(t) filtered (or smoothed) spike train
ĝ estimate of a general function g
λi(t) intensity function of a Poisson process [p. ??]
X,Y,Z random variables
T random variable of time
X(t),Y(t),Z(t) random processes
N(t, s),N(t) counting process
fX probability density function of X
κ generic kernel
κpdf pdf estimation kernel
κτ generic kernel with kernel size parameter τ
K CIP kernel, or reproducing kernel of a Hilbert space
Operators
EX [g(x)] expectation of g(x) over X
〈x|y〉 inner product [p. 58]
‖·‖ norm of a vector
|·| absolute value
x(t) ∗ y(t) convolution
APPENDIX D
SOURCE CODE
D.1 CIP
function V = cip(x, tau)
%V = cip(X, TAU)
% Return the Cross Information Potential.
% If more than two neurons are provided, average over all pair combinations.
%
% X: Data , organized as a cell array , with each cell containing an
% array of spike times (in seconds ).
% TAU: Kernel size (in seconds ).
/* vim: set ts=8 sts=4 sw=4: (modeline) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <mex.h>

void cip_func(int N, double *x[], int nSpikes[], double tau, double *v);

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxArray *sts;
    mxArray *stp;
    int nSpikeTrain;        /* number of spike trains */
    double **x;             /* array of vectors with the spike times (sec) */
    int *nSpikes;           /* array with the number of spikes per spike train */
    double tau;             /* exponential decay parameter */
    double *v;              /* output argument, CIP */
    int i;

    /* check input arguments */
    if (nrhs != 2)
        mexErrMsgTxt("2 inputs are required.");
    else if (nlhs > 1)
        mexErrMsgTxt("Too many output arguments");
    if (!mxIsDouble(prhs[1]))
        mexErrMsgTxt("TAU must be a scalar");

    /* get input arguments */
    sts = (mxArray *) prhs[0];
    if (mxGetClassID(sts) != mxCELL_CLASS)
        mexErrMsgTxt("X must be a cell array");
    nSpikeTrain = mxGetNumberOfElements(sts);
    if (nSpikeTrain < 2)
        mexErrMsgTxt("At least two spike trains are needed.");
    nSpikes = (int *) mxMalloc(sizeof(int) * nSpikeTrain);
    x = (double **) mxMalloc(sizeof(double *) * nSpikeTrain);
    for (i = 0; i < nSpikeTrain; i++) {
        stp = mxGetCell(sts, i);
        nSpikes[i] = mxGetNumberOfElements(stp);
        x[i] = mxGetPr(stp);
    }
    tau = mxGetPr(prhs[1])[0];

    /* allocate output */
    plhs[0] = mxCreateDoubleMatrix(1, 1, mxREAL);
    v = mxGetPr(plhs[0]);
    memset(v, 0, sizeof(double));

    /* compute CIP */
    cip_func(nSpikeTrain, x, nSpikes, tau, v);
}

/**
 * Compute CIP of a set of spike trains.
 *
 * @param N       number of spike trains
 * @param x       array of pointers to spike trains
 * @param nSpikes array with the length of each spike train
 * @param tau     decay time constant for the exponential function
 * @param v       computed CIP is stored here; must be preallocated
 * @author Antonio Paiva
 * @version $Id: cip.c 52 2007-01-03 16:55:26Z memming $
 */
void cip_func(int N, double *x[], int nSpikes[], double tau, double *v)
{
    int i, j;            /* counters for spike trains */
    int m, n;            /* counters for spike times */
    int lastStartIdx;    /* index to start computing the exponential */
    double maxT;         /* maximum range where the exponential is non-zero */
    double aux;          /* auxiliary variable: holds CIP for each
                          * pair combination */
    double tmp;

    maxT = tau * 100;
    *v = 0;
    for (i = 0; i < (N-1); i++) {
        for (j = (i+1); j < N; j++) {
            aux = 0;
            lastStartIdx = 0;
            for (m = 0; m < nSpikes[i]; m++) {
                for (n = lastStartIdx; n < nSpikes[j]; n++) {
                    tmp = x[j][n] - x[i][m];
                    if (tmp < -maxT) {
                        lastStartIdx++;
                        continue;
                    }
                    if (tmp <= maxT)
                        aux += exp(-((tmp < 0) ? (-tmp) : tmp) / tau);
                    else
                        break;
                }
            }
            aux /= (2*tau * (nSpikes[i] * nSpikes[j]));
            *v += aux;
        }
    }
    *v /= (N * (N-1) / 2);
}
D.2 ICIP
function [v] = offline_icip(st, T, DT , FR_TAU , BETA)
% [v] = offline_icip (st , T, DT , FR_TAU , BETA)
% Offline version of ICIP computation . For online evaluation directly use
% online_icip .
%
% Input
% st: cell array containing spike trains (seconds)
% T: total length of spike trains (seconds)
% DT: time step size (seconds)
% FR_TAU: the tau for firing rate estimation (1/ seconds)
% if FR_TAU is zero , it is constant tau mode
% BETA: kernel size for ICIP at average firing rate 1 Hz (1/ seconds)
% (BETA should be smaller than FR_TAU ^2 for accurate estimation )
% Output:
% v: ICIP over time
%
% You may want to use tr = 0:DT:(T-DT); which are the start time for each bin
%
% Copyright 2006 Antonio and Memming , CNEL , all rights reserved
% $Id: offline_icip .m 32 2006 -12 -09 18:05:57Z memming $
% Actual implementation is now in offline_icip.c
/* vim: set ts=8 sts =4 sw =4: (modeline) */
#include <math.h>
#include <string.h>
#include <mex.h>
/**
* Compute ICIP of a set of spike trains with changing TAU mode.
*
* @param N number of spike trains
* @param sts array to pointers for spike trains
* @param nsts array to length for spike trains
* @param v computed ICIP will be stored here , need to be preallocated
* @param T total time (sec)
* @param dt time bin size (sec)
* @param BETA the parameter $\beta$ of ICIP (sec)
* @param FR_TAU the time constant for firing rate estimation
70
* @author Memming Park
* @version $Id: offline_icip .c 51 2007 -01 -02 19:53:01Z memming $
*/
void offline_icip(int N, double *sts[], int nsts[], double *v, double T, double dt , double BETA , double FR_TAU)
double t; /* time */
int i; /* time index */
int j, k; /* spike train index */
int *idx; /* spike index per spike train */
int NPair;
double EXP_FR;
double ONE_OVER_FR_TAU;
double ONE_OVER_BETA;
double *q; /* charge */
double *f; /* firing rate */
int Nstep; /* number of time steps (bins) */
ONE_OVER_FR_TAU = 1 / FR_TAU;
ONE_OVER_BETA = 1 / BETA;
EXP_FR = exp(-dt / FR_TAU );
idx = (int *) malloc(sizeof(int) * N);
memset(idx , 0, sizeof(int) * N);
q = (double *) malloc(sizeof(double) * N);
memset(q, 0, sizeof(double) * N);
f = (double *) malloc(sizeof(double) * N);
memset(f, 0, sizeof(double) * N);
NPair = N * (N - 1) / 2;
Nstep = (int) ceil(T / dt);
for(t = dt, i = 0; i < Nstep; t += dt , i++)
for(j = 0; j < N; j++)
f[j] *= EXP_FR;
q[j] *= exp(-dt * f[j] * ONE_OVER_BETA );
while(idx[j] < nsts[j] && sts[j][idx[j]] <= t)
idx[j]++;
f[j] += ONE_OVER_FR_TAU;
q[j] += ONE_OVER_BETA;
for(j = 0; j < N; j++)
for(k = (j + 1); k < N; k++)
v[i] += q[j] * q[k];
v[i] /= NPair;
free(idx);
free(q);
free(f);
/**
* Compute ICIP of a set of spike trains with constant TAU mode.
*
* @param N number of spike trains
* @param sts array to pointers for spike trains
* @param nsts array to length for spike trains
* @param v computed ICIP will be stored here , need to be preallocated
* @param T total time (sec)
* @param dt time bin size (sec)
* @param TAU the time constant for the expnential (or Laplacian )
* @author Memming Park
* @version $Id: offline_icip .c 51 2007 -01 -02 19:53:01Z memming $
*/
void offline_icip_const_tau(int N, double *sts[], int nsts[], double *v, double T, double dt, double TAU)
double t; /* time */
int i; /* time index */
int j, k; /* spike train index */
int *idx; /* spike index per spike train */
int NPair;
double EXP_TAU;
double *q; /* charge */
double *ONE_OVER_TAU_F; /* (1 / (tau*firing rate )); constant */
int Nstep; /* number of time steps (bins) */
EXP_TAU = exp(-dt / TAU);
idx = (int *) malloc(sizeof(int) * N);
memset(idx , 0, sizeof(int) * N);
q = (double *) malloc(sizeof(double) * N);
memset(q, 0, sizeof(double) * N);
ONE_OVER_TAU_F = (double *) malloc(sizeof(double) * N);
memset(ONE_OVER_TAU_F , 0, sizeof(double) * N);
for(k = 0; k < N; k++)
ONE_OVER_TAU_F[k] = (1/ TAU) * (T / nsts[k]);
71
NPair = N * (N - 1) / 2;
Nstep = (int) ceil(T / dt);
for(t = dt, i = 0; i < Nstep; t += dt , i++)
for(j = 0; j < N; j++)
q[j] *= EXP_TAU;
while(idx[j] < nsts[j] && sts[j][idx[j]] <= t)
idx[j]++;
q[j] += ONE_OVER_TAU_F[j];
for(j = 0; j < N; j++)
for(k = (j + 1); k < N; k++)
v[i] += q[j] * q[k];
v[i] /= NPair;
free(idx);
free(q);
free(ONE_OVER_TAU_F );
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
  int nSpikeTrain;
  mxArray *sts;
  mxArray *stp;
  double **st;
  int i;
  int *nst;
  double T, DT, FR_TAU, BETA;
  double *v;

  if (nrhs != 5)
    mexErrMsgTxt("5 inputs are required.");
  else if (nlhs > 1)
    mexErrMsgTxt("Too many output arguments");
  sts = (mxArray *) prhs[0];
  if (mxGetClassID(sts) != mxCELL_CLASS)
    mexErrMsgTxt("The first argument should be a cell array");
  nSpikeTrain = mxGetNumberOfElements(sts);
  if (nSpikeTrain < 2)
    mexErrMsgTxt("At least two spike trains are required.");
  nst = (int *) mxMalloc(sizeof(int) * nSpikeTrain);
  st = (double **) mxMalloc(sizeof(double *) * nSpikeTrain);
  for (i = 0; i < nSpikeTrain; i++) {
    stp = mxGetCell(sts, i);
    nst[i] = mxGetNumberOfElements(stp);
    st[i] = mxGetPr(stp);
  }
  if (!mxIsDouble(prhs[1]))
    mexErrMsgTxt("The second argument should be a double");
  if (!mxIsDouble(prhs[2]))
    mexErrMsgTxt("The third argument should be a double");
  if (!mxIsDouble(prhs[3]))
    mexErrMsgTxt("The fourth argument should be a double");
  if (!mxIsDouble(prhs[4]))
    mexErrMsgTxt("The fifth argument should be a double");
  T = mxGetPr(prhs[1])[0];
  DT = mxGetPr(prhs[2])[0];
  FR_TAU = mxGetPr(prhs[3])[0];
  BETA = mxGetPr(prhs[4])[0];
  plhs[0] = mxCreateDoubleMatrix((int) ceil(T / DT), 1, mxREAL);
  v = mxGetPr(plhs[0]);
  memset(v, 0, sizeof(double) * (int) ceil(T / DT));
  if (FR_TAU == 0) /* constant tau mode */
    offline_icip_const_tau(nSpikeTrain, st, nst, v, T, DT, BETA);
  else
    offline_icip(nSpikeTrain, st, nst, v, T, DT, BETA, FR_TAU);
}
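The constant-tau branch is compact enough to restate outside MEX. The following Python re-implementation (ours, for illustration; not part of the thesis code, and assuming every train has at least one spike) smooths each train with a causal exponential of amplitude (T/n)/TAU and averages q_j*q_k over all distinct pairs at every bin, mirroring offline_icip_const_tau above:

```python
import math

def offline_icip_const_tau(spike_trains, T, dt, tau):
    """Pairwise cross information potential per time bin: each train is
    smoothed with a causal exponential of amplitude (T/n)/tau, then
    q[j]*q[k] is averaged over all distinct pairs (j, k)."""
    N = len(spike_trains)
    nstep = int(math.ceil(T / dt))
    decay = math.exp(-dt / tau)
    amp = [(T / len(st)) / tau for st in spike_trains]  # assumes non-empty trains
    q = [0.0] * N
    idx = [0] * N
    npair = N * (N - 1) // 2
    v = [0.0] * nstep
    t = dt
    for i in range(nstep):
        for j, st in enumerate(spike_trains):
            q[j] *= decay
            while idx[j] < len(st) and st[idx[j]] <= t:
                idx[j] += 1
                q[j] += amp[j]
        v[i] = sum(q[j] * q[k] for j in range(N)
                   for k in range(j + 1, N)) / npair
        t += dt
    return v
```

For identical trains the pairwise products reduce to q squared, so the output peaks just after coincident spikes and decays with time constant tau.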
D.3 CCC
function [Q, deltaT] = cipogram(st1, st2, tau, maxT, T, verbose)
% [Q, deltaT] = cipogram(st1, st2, tau, maxT, T, verbose)
%
% Input
%   st1, st2: spike trains with sorted spike timings
%   tau: time constant for the CIP kernel
%   maxT: correlogram is effective in the range [-maxT, maxT]
%   T: length of the spike trains in seconds
%   verbose: (optional, default 0) print detailed information; uses tic/toc
%
% Output
%   Q: cipogram
%   deltaT: sorted time differences at which Q is evaluated
%
% See also: cip_max_filter2, ncipogram
%
% Copyright 2006 Antonio and Memming, CNEL, all rights reserved
% $Id: cipogram.m 53 2007-01-14 23:24:21Z memming $

if nargin < 6
    verbose = 0;
end
N1 = length(st1);
N2 = length(st2);
Nij = N1 * N2;
if N1 == 0 || N2 == 0
    warning('cipogram:NODATA', 'At least one spike is required!');
    deltaT = []; Q = [];
    return;
end
maxTTT = abs(maxT) + tau * 10; % exp(-10) is effectively zero
% Rough estimate of the number of time differences required (assuming
% independence); this estimate is awful if the trains are strongly correlated.
eN = ceil((max(N1, N2))^2 * maxTTT * 2 / min(st1(end), st2(end)));
if verbose; fprintf('Expected time differences [%d] / [%d]\n', eN, Nij); end
deltaT = zeros(2 * eN, 1);
% Compute all the time differences
lastStartIdx = 1;
k = 1;
for n = 1:N1
    for m = lastStartIdx:N2
        timeDiff = st2(m) - st1(n);
        if timeDiff < -maxTTT
            lastStartIdx = lastStartIdx + 1;
            continue;
        end
        if timeDiff <= maxTTT
            deltaT(k) = timeDiff;
            k = k + 1;
        else % past the upper end of the window
            break;
        end
    end
end
deltaT = deltaT(1:(k-1));
N = length(deltaT);
if N < 2
    warning('cipogram:NODATA', 'At least two intervals are required');
    deltaT = []; Q = [];
    return;
end
if verbose
    fprintf('Actual number of time differences [%d]\nSorting...\n', N); tic;
end
deltaT = sort(deltaT, 1); % sort the time differences
if verbose; fprintf('Sorting finished [%f sec]\r', toc); end
% Forward/backward recursion: Qminus(k) sums exp(-(deltaT(k)-deltaT(j))/tau)
% over j <= k, and Qplus(k) sums exp(-(deltaT(j)-deltaT(k))/tau) over j > k.
Qplus = zeros(N, 1);
Qminus = zeros(N, 1);
Qminus(1) = 1;
Qplus(N) = 0;
EXP_DELTA = exp(-(diff(deltaT)) / tau);
for k = 1:(N-1)
    Qminus(k + 1) = 1 + Qminus(k) * EXP_DELTA(k);
    kk = N - k;
    Qplus(kk) = (Qplus(kk + 1) + 1) * EXP_DELTA(kk);
end
Q = Qminus + Qplus;
Q = Q / 2 / tau / T;
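The heart of cipogram is the Qminus/Qplus pass: rather than evaluating exp(-|d_k - d_j|/tau) for every pair of sorted time differences, which costs O(N^2), one forward and one backward recursion over the consecutive gaps yields all N kernel sums in O(N). A Python sketch of the same two-pass trick (illustrative helper name, not from the thesis code):

```python
import math

def laplacian_sums(d, tau):
    """Given sorted values d, return for each k the sum over all j of
    exp(-|d[k] - d[j]|/tau), via one forward and one backward pass (O(N))."""
    n = len(d)
    if n == 0:
        return []
    # decay[k] carries a value from position k to its right neighbor k+1
    decay = [math.exp(-(d[k + 1] - d[k]) / tau) for k in range(n - 1)]
    qminus = [0.0] * n  # sums over j <= k (includes the j == k term, 1)
    qplus = [0.0] * n   # sums over j > k
    qminus[0] = 1.0
    for k in range(n - 1):
        qminus[k + 1] = 1.0 + qminus[k] * decay[k]
        kk = n - 2 - k  # fill qplus from the right end inward
        qplus[kk] = (qplus[kk + 1] + 1.0) * decay[kk]
    return [a + b for a, b in zip(qminus, qplus)]
```

Because exp(-(a+b)/tau) = exp(-a/tau) * exp(-b/tau), each partial sum can be carried to the next point by a single multiplication, which is exactly what the MATLAB loop above exploits.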
function [Q, deltaT] = ncipogram(st1, st2, tau, maxT, T, verbose)
% [Q, deltaT] = ncipogram(st1, st2, tau, maxT, T, verbose)
% Normalized cipogram with 2nd-order statistics.
%
% Input
%   st1, st2: spike trains with sorted spike timings
%   tau: time constant for the CIP kernel
%   maxT: correlogram is effective in the range [-maxT, maxT]
%   T: length of the spike trains in seconds
%   verbose: (optional, default 0)
%
% Output
%   Q: normalized cipogram
%   deltaT: sorted time differences at which Q is evaluated
%
% Copyright 2006 Antonio and Memming, CNEL, all rights reserved
% $Id: ncipogram.m 59 2007-01-27 19:26:14Z memming $

[Q, deltaT] = cipogram(st1, st2, tau, maxT, T, verbose);
N1 = length(st1);
N2 = length(st2);
Nij = N1 * N2;
% Subtract the value expected under independence and rescale
Q = (Q * T - Nij / T) * 2 * sqrt(tau * T) / sqrt(Nij);
BIOGRAPHICAL SKETCH
Il Park was born on April 29, 1979 in Gosla, Germany. He attended Gyunggi Science
High School for two years and majored in computer science at KAIST (Korea Advanced
Institute of Science and Technology). From 2001 to 2003 he worked as a developer at
an internet security company. He has been working with Dr. Jose Príncipe in the
Computational NeuroEngineering Laboratory (CNEL) since 2006, and has been admitted
to the Ph.D. program in the Biomedical Engineering Department at the University of Florida.