
CONTINUOUS TIME CORRELATION ANALYSIS TECHNIQUES FOR SPIKE TRAINS

By

IL PARK

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2007


© 2007 Il Park


Memmings are Memmings,

computers are recursive,

brains are brains.


ACKNOWLEDGMENTS

I thank my adviser Dr. Jose C. Príncipe for all his great guidance, my committee

member Dr. John Harris for insightful suggestions, and Dr. Thomas B. DeMarse for his

knowledge and intuition on experiments. I thank my collaborators Antonio R. C. Paiva

and Karl Dockendorf for all the joyful discussions. I also thank Dongming Xu (dynamics),

Jian-Wu Xu (RKHS), Vaibhav Garg, Manu Rastogi, Savyasachi Singh (chess), Allen

Martins (pdf), Yiwen Wang and Aysegul Gunduz of CNEL, Jason T. Winters, Alex J.

Cadotte, Hany Elmariah (singing) and Nicky Grimes of the Neural Robotics and Neural

Computation Lab for their support and help. Last but not least, I thank my family and

friends for being there.


TABLE OF CONTENTS

page

ACKNOWLEDGMENTS . . . . . 4

LIST OF TABLES . . . . . 7

LIST OF FIGURES . . . . . 8

ABSTRACT . . . . . 9

CHAPTER

1 INTRODUCTION . . . . . 10
  1.1 Motivation . . . . . 10
    1.1.1 Why Do We Analyze Spike Trains? . . . . . 11
    1.1.2 What Are Similar Spike Trains? . . . . . 12
  1.2 Minimal Notation . . . . . 13

2 CROSS INFORMATION POTENTIAL . . . . . 14
  2.1 Smoothed Spike Train Representation . . . . . 14
  2.2 L2 Metric . . . . . 14
  2.3 Cauchy-Schwarz Dissimilarity . . . . . 16
  2.4 Information Potential . . . . . 17
  2.5 Discussion . . . . . 18
    2.5.1 Comparison of Distances . . . . . 18
    2.5.2 Robustness to Jitter in the Spike Timings . . . . . 20

3 INSTANTANEOUS CROSS INFORMATION POTENTIAL . . . . . 22
  3.1 Synchrony Detection Problem . . . . . 22
  3.2 Instantaneous CIP . . . . . 22
    3.2.1 Derivation from CIP . . . . . 22
    3.2.2 Spatial Averaging . . . . . 23
    3.2.3 Rescaling ICIP . . . . . 23
  3.3 Analysis . . . . . 24
    3.3.1 Sensitivity to Number of Neurons . . . . . 24
  3.4 Results . . . . . 25
    3.4.1 High-order Synchronized Spike Trains . . . . . 25
    3.4.2 Mirollo-Strogatz Model . . . . . 27

4 CONTINUOUS CROSS CORRELOGRAM . . . . . 32
  4.1 Delay Estimation Problem . . . . . 32
  4.2 Continuous Correlogram . . . . . 34
  4.3 Algorithm . . . . . 36
  4.4 Results . . . . . 39
    4.4.1 Analysis . . . . . 42
    4.4.2 Examples . . . . . 43
  4.5 Discussion . . . . . 47

5 CONCLUSION . . . . . 49
  5.1 Summary of Contribution . . . . . 49
  5.2 Potential Applications and Future Work . . . . . 49

APPENDIX

A BACKGROUND . . . . . 50
  A.1 Point Process . . . . . 50
    A.1.1 An Alternative Representation of Poisson Process . . . . . 51
    A.1.2 Filtered Poisson Process . . . . . 52
  A.2 Mean Square Calculus . . . . . 53
  A.3 Probability Density Estimation . . . . . 54
  A.4 Information Theoretic Learning . . . . . 56
  A.5 Reproducing Kernel Hilbert Space . . . . . 58

B STATISTICAL PROOFS . . . . . 62

C NOTATION . . . . . 67

D SOURCE CODE . . . . . 69
  D.1 CIP . . . . . 69
  D.2 ICIP . . . . . 70
  D.3 CCC . . . . . 73

REFERENCES . . . . . 75

BIOGRAPHICAL SKETCH . . . . . 81


LIST OF TABLES

Table page

A-1 Various probability density estimation kernels . . . . . . . . . . . . . . . . . . . 56


LIST OF FIGURES

Figure page

2-1 L2 distance versus CS divergence . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2-2 Distance difference of CS divergence for a synchronized or uncorrelated missing spike . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2-3 Change in CIP versus jitter standard deviation in the synchronous spike timings 20

3-1 Spike train as a realization of point process and smoothed spike train . . . . . . 22

3-2 Variance in scaled CIP versus the number of spike trains used for spatial averaging in log scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3-3 Analysis of ICIP as a function of synchrony . . . . . . . . . . . . . . . . . . . . 26

3-4 Evolution of synchrony in the spiking neural network . . . . . . . . . . . . . . . 28

3-5 Zero-lag cross-correlation for comparison . . . . . . . . . . . . . . . . . . . . . . 29

4-1 Example of cross correlogram construction . . . . . . . . . . . . . . . . . . . . . 33

4-2 Decomposition and shift of the multiset A. . . . . . . . . . . . . . . . . . . . . . 36

4-3 Effect of the length of spike train and strength of connectivity on precision of delay estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4-4 Effect of kernel size (bin size) of CCC (CCH) to the performance . . . . . . . . 41

4-5 Schematic diagram for the configuration of neurons. . . . . . . . . . . . . . . . . 43

4-6 Comparison between CCC and CCH on synthesized data. . . . . . . . . . . . . . 44

4-7 Effect of length of spike trains on CCC and CCH . . . . . . . . . . . . . . . . . 45

4-8 Correlograms for in vitro data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46


Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

CONTINUOUS TIME CORRELATION ANALYSIS TECHNIQUES FOR SPIKE TRAINS

By

Il Park

May 2007

Chair: Jose Carlos Príncipe
Major: Electrical and Computer Engineering

Correlation is the most basic analysis tool for time series. To apply correlation

to trains of action potentials generated by neurons, the conventional method is to

discretize time. However, time binning is not optimal: time resolution is sacrificed,

and it introduces the notorious problem of bin size sensitivity. Since spike trains can

be considered as a realization of a point process, the signal has no amplitude and all

information is embedded in the times of occurrence. Instead of time binning, we propose a

set of methods based on kernel smoothing to analyze the correlations. Smoothing is done

in continuous time so we do not lose the exact time of spikes while enabling interaction

between spikes at a distance. We present three techniques derived from correlation: (1)

a spike train similarity measure, (2) a synchrony detection mechanism, and (3) a continuous

cross correlogram.


CHAPTER 1
INTRODUCTION

1.1 Motivation

Signal processing tools such as adaptive filtering, least squares, detection theory,

clustering, and spectral analysis have brought engineers the power to analyze virtually

any signal. However, the application of such tools to the signal of the nervous system, the

spike train, has remained restricted. This is mainly because of the poor performance of

usual estimators for statistical variables such as mean and correlation function for point

process observations.

The foundation of signal processing tools is L2, the space of random
processes with finite second order moment, which is a well defined Hilbert space. The
metric (distance measure) on random processes provides a continuous spectrum of similar
signals, giving a friendly space analogous to Euclidean space. Also, the distance is
strongly related to correlation, which is the inner product in L2. While point processes can
be theoretically treated in the same way, the main problem is to estimate the process from
the observation. In contrast to analog and digital signals, the traditional distance estimator
between two point process observations takes values in the natural numbers, which are
discrete rather than continuous, so the continuous spectrum of signals is lost. The discrete
metric makes it inappropriate to directly apply signal processing tools to spike trains.

The neuroscience literature has used several approaches to overcome this difficulty.
The most widely used approach is to use time bins to convert the times of occurrence to
a sequence of binary amplitudes or a discrete time series. Recently, van Rossum proposed a
metric for spike trains [1], which is related to a non-Euclidean metric proposed by Victor
and coworkers, an extension of the Levenshtein distance (also known as edit
distance in computer science) to continuous time [2]. Many neuroscientists were already
using the van Rossum distance by intuition in the form of correlation [3–6].

We mapped the spike trains to a realization of a random process in L2, so that
traditional signal processing techniques can be readily applied. We will analyze
the properties of the mapping and the metric induced by the mapping. One of the
advantages we gain from this approach is that by choosing the appropriate mapping, the
computational cost can be minimized while the time resolution remains continuous. We
will derive correlation based measures from this space and recover the power of signal
processing tools for spike trains. Specifically, we propose three techniques: (1) the cross
information potential (CIP) as a similarity measure between spike trains based on
correlation, (2) the instantaneous cross information potential (ICIP) as a measure
of instantaneous synchrony among spike trains, and (3) the continuous cross correlogram
(CCC) as an extension of CIP to continuous time lags. All of the proposed techniques have
efficient computation mechanisms and are accompanied by statistical analysis.

1.1.1 Why Do We Analyze Spike Trains?

Neurons communicate mainly through a series of action potentials, although
there is increasing evidence that field potentials are also essential in the brain [7].
Action potentials are generated by the complex dynamics of a neuron [8, 9], and have a
stereotypical shape which can be propagated over a long distance and can resist noise
because of its all-or-none type of transmission. There is evidence that not only
the existence of an action potential carries information, but the duration of the action
potential is systematically modulated [10], and, as reported recently, even subthreshold
dendritic input can modulate synaptic terminals [11].

However, from the computational point of view, it is believed that the temporal

structure of the action potentials is more important than individual details of an action

potential. Experiments, mainly in sensory encoding, demonstrate precise timing (or
precise time to first spike) of action potentials ([12–14], see [15] for a review, and [16] for
arguments against it), which supports the idea of encoding information in spike times. The
precision of spike timings is less than 100 µs in the auditory system [17] and on the order of
1 ms in other experiments [14].


The other reason that spike trains are widely studied is that they are relatively easy
to record with high accuracy and precision. Extracellular electrode arrays permit
recording from a large number of neurons simultaneously in vivo and in vitro.

Many methods have been developed to analyze spike trains for various problems

including correlation analysis [18], connectivity estimation [19, 20], delay estimation [21],

system identification [22], clustering different spike patterns [4, 23], estimating entropy

[24–27], and neural decoding [28, 29]. We will tackle some of these problems with the

proposed techniques.

1.1.2 What Are Similar Spike Trains?

As mentioned in section 1.1.1, the spike times produced by neurons in response to
a repeated stimulus often show precise timing with some error. The jitter error distribution
fits a Gaussian distribution [13]. The possible noise sources are thermal noise, ion
channels, probabilistic synapse activation, and spontaneous release of vesicles.

When the spike train is modeled by a Poisson process, the jitter noise restricts the

shape of the intensity function (instantaneous firing rate) over time. In other words, the

noise will limit the narrowness of a precisely timed spike. In addition, this implies that
spike trains with small timing differences should be treated as similar to each other,
thus having a small distance (or dissimilarity¹).

We can exploit this and construct a probable intensity function from a spike train

by using the techniques of kernel density estimation. The kernel, which represents the

jitter timing distribution, will be placed where the spikes have actually occurred, and the

summation of all kernels will estimate the intensity function assuming a Poisson process.

Nawrot and coworkers have tried various kernels for single trial estimation of the intensity
function from spike trains in a model, and concluded that the kernel size (bandwidth) is
more important than the shape of the kernel [30].

¹ Distance usually refers to a mathematical metric which satisfies positivity, reflexivity, definiteness, symmetry and triangle inequality. However, we will also refer to dissimilarity measures that lack the triangle inequality as a distance informally and interchangeably with dissimilarity.

Another type of noise in spike trains is insertion or deletion of spikes. Although

spike trains of neurons conserve high precision of spike timings when they occur, there

is evidence that neurons often skip a few spikes [4, 31, 32]. When a spike is inserted or

removed from a spike train, the distance differs by the constant 1/2 in van Rossum distance.

In contrast, a correlation measure does not depend on signal power (or number of action

potentials), but only on the coincidental action potential pairs. In applications, such

as classification of spike trains with template matching, the correlation based distance

measure (Cauchy-Schwarz divergence) can perform better than van Rossum (L2) distance.

The concept of coincidental spikes leads to synchrony between spike trains. In
addition, there is strong evidence that neurons and dendrites work as coincidence
detectors and are sensitive to afferent synchrony [26, 33–36].

1.2 Minimal Notation

We introduce the minimal mathematical notation. We assume that a number of spike

trains are observed, and indexed. Each spike train is a finite set of spike timings where

the action potentials are detected. For the spike train indexed by $i$, individual timings are
denoted as $t^i_m$, where $m$ is the index for spikes. The functional form of the $i$-th spike train is
defined as

\[
s_i(t) = \sum_{m=1}^{N_i} \delta(t - t^i_m) \tag{1--1}
\]

where $N_i$ is the number of spikes in the $i$-th spike train, and $\delta(\cdot)$ is the Dirac delta function.


CHAPTER 2
CROSS INFORMATION POTENTIAL

2.1 Smoothed Spike Train Representation

Given a spike train $s_i(t)$, we assume an inhomogeneous Poisson process and estimate the
intensity function by using a kernel. The kernel has to be non-negative and have unit
area, that is, it has to be a proper probability density function. Denote this kernel as
$\kappa_{\mathrm{pdf}}(t)$; then the estimated intensity function can be written as

\[
\lambda_i(t) = \sum_{m=1}^{N_i} \kappa_{\mathrm{pdf}}(t - t^i_m). \tag{2--1}
\]

This process can also be viewed as low pass filtering of the spike trains to estimate
the post synaptic potential of synapses. In the point process literature, this is a special
case of a filtered point process, and in the engineering literature it is known as shot noise.¹
The estimated intensity function is continuous if $\kappa_{\mathrm{pdf}}$ is continuous. Assuming continuous
$\kappa_{\mathrm{pdf}}$, the mapping equation (2–1) converts a spike train to a continuous signal that can be
interpreted with the second order theory with a continuous metric. Note that the mapping
is one-to-one and onto: deconvolution of $\lambda_i(t)$ with $\kappa_{\mathrm{pdf}}$ uniquely determines a spike train.
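As a rough illustration of equation (2–1), the following Python sketch evaluates the estimated intensity function on a time grid. The Gaussian kernel, the function names, the spike times, and the grid are all assumptions made for this example; the text only requires that the kernel be a valid pdf.

```python
import numpy as np

def gaussian_pdf_kernel(t, sigma):
    """Gaussian pdf with standard deviation sigma (an assumed kernel choice)."""
    return np.exp(-0.5 * (t / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def estimate_intensity(spike_times, t_grid, sigma):
    """Evaluate eq. (2-1): one kernel centered on every spike, summed."""
    spike_times = np.asarray(spike_times, dtype=float)
    t_grid = np.asarray(t_grid, dtype=float)
    # lags[k, m] = t_grid[k] - spike_times[m]
    lags = t_grid[:, None] - spike_times[None, :]
    return gaussian_pdf_kernel(lags, sigma).sum(axis=1)

# Hypothetical spike train with three spikes (times in seconds).
spikes = [0.010, 0.012, 0.030]
t_grid = np.linspace(-0.02, 0.06, 8001)
lam = estimate_intensity(spikes, t_grid, sigma=0.002)

# Each kernel has unit area, so the estimate integrates to the spike count.
area = np.sum(lam) * (t_grid[1] - t_grid[0])
```

Because every kernel integrates to one, `area` should come out close to the number of spikes, which is a quick sanity check on any kernel choice.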

2.2 L2 Metric

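The smoothed spike train, or estimated intensity function, can be considered as a
signal in $L_2$. The squared distance in $L_2$ between two smoothed spike trains is

\begin{align}
\left\| \lambda_i(t) - \lambda_j(t) \right\|_2^2
&= \int_{-\infty}^{\infty} \left( \lambda_i(t) - \lambda_j(t) \right)^2 dt \tag{2--2a} \\
&= \int_{-\infty}^{\infty} \left( \lambda_i^2(t) - 2\lambda_i(t)\lambda_j(t) + \lambda_j^2(t) \right) dt. \tag{2--2b}
\end{align}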

¹ When the underlying process is a homogeneous Poisson process, the filtered point process is wide sense stationary (WSS) by Campbell's theorem (see appendix, theorem 3).


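Using the definition of the estimator (2–1),

\begin{align}
\int_{-\infty}^{\infty} \lambda_i^2(t)\,dt
&= \int_{-\infty}^{\infty} \sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa_{\mathrm{pdf}}(t - t^i_m)\,\kappa_{\mathrm{pdf}}(t - t^i_n)\,dt \tag{2--3a} \\
&= \sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa(t^i_m - t^i_n) \tag{2--3b}
\end{align}

and the cross term (inner product in $L_2$) becomes

\[
\int_{-\infty}^{\infty} \lambda_i(t)\lambda_j(t)\,dt = \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \kappa(t^i_m - t^j_n) \tag{2--3c}
\]

where $\kappa(t) = \int_{-\infty}^{\infty} \kappa_{\mathrm{pdf}}(s)\,\kappa_{\mathrm{pdf}}(s + t)\,ds$ is the kernel which computes the correlation.
If an exponential distribution is used, i.e.,

\[
\kappa_{\mathrm{pdf}}(t) = \frac{1}{\tau} e^{-t/\tau} u(t), \tag{2--4}
\]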

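where $u(t)$ is the unit step function, then the $L_2$ distance is proportional to the van Rossum
distance with factor $1/\tau$. In addition, the combined kernel $\kappa(t)$ becomes a scaled Laplace
distribution kernel:

\begin{align}
\int_{-\infty}^{\infty} \lambda_i(t)\lambda_j(t)\,dt
&= \frac{1}{\tau^2} \int_{-\infty}^{\infty} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j}
   \exp\!\left(-\frac{t - t^i_m}{\tau}\right) u(t - t^i_m)
   \exp\!\left(-\frac{t - t^j_n}{\tau}\right) u(t - t^j_n)\,dt \tag{2--5} \\
&= \frac{1}{\tau^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \int_{-\infty}^{\infty}
   \exp\!\left(-\frac{2t - t^i_m - t^j_n}{\tau}\right) u(t - t^i_m)\,u(t - t^j_n)\,dt \tag{2--6} \\
&= \frac{1}{\tau^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \int_{\max(t^i_m,\,t^j_n)}^{\infty}
   \exp\!\left(-\frac{2t - t^i_m - t^j_n}{\tau}\right) dt \tag{2--7} \\
&= \frac{1}{\tau^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \left(-\frac{\tau}{2}\right)
   \left. \exp\!\left(-\frac{2t - t^i_m - t^j_n}{\tau}\right) \right|_{\max(t^i_m,\,t^j_n)}^{\infty} \tag{2--8} \\
&= \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \frac{1}{2\tau} \exp\!\left(-\frac{|t^i_m - t^j_n|}{\tau}\right) \tag{2--9}
\end{align}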

Note that in terms of a linear filter, the causal exponential distribution corresponds to a
first-order infinite impulse response (IIR) filter with time constant $\tau$ and gain $1/\tau$.
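The closed form in equation (2–9) can be checked numerically. The Python sketch below computes the inner product as a double sum of Laplacian kernels and compares it against a brute-force Riemann sum of the product of two exponentially smoothed spike trains. The function names, spike times, time constant, and grid resolution are assumptions made for this illustration.

```python
import numpy as np

def laplacian_kernel(lag, tau):
    """Combined kernel of eq. (2-9): exp(-|lag| / tau) / (2 tau)."""
    return np.exp(-np.abs(lag) / tau) / (2.0 * tau)

def inner_product(spikes_i, spikes_j, tau):
    """L2 inner product of two smoothed spike trains, eq. (2-3c)."""
    a = np.asarray(spikes_i, dtype=float)
    b = np.asarray(spikes_j, dtype=float)
    return float(laplacian_kernel(a[:, None] - b[None, :], tau).sum())

def l2_distance_squared(spikes_i, spikes_j, tau):
    """Squared L2 distance, eq. (2-2b), using only spike time differences."""
    return (inner_product(spikes_i, spikes_i, tau)
            - 2.0 * inner_product(spikes_i, spikes_j, tau)
            + inner_product(spikes_j, spikes_j, tau))

def exp_smoothed(spikes, t, tau):
    """Spike train smoothed with the causal exponential kernel of eq. (2-4)."""
    lag = t[:, None] - np.asarray(spikes, dtype=float)[None, :]
    causal = np.exp(-np.maximum(lag, 0.0) / tau) / tau
    return np.where(lag >= 0.0, causal, 0.0).sum(axis=1)

# Hypothetical spike trains (seconds) and time constant.
tau = 0.005
s_i = [0.010, 0.020, 0.045]
s_j = [0.012, 0.040]

# Brute-force integral of lambda_i(t) * lambda_j(t) on a fine grid.
t = np.linspace(0.0, 0.2, 200001)
dt = t[1] - t[0]
numeric = float(np.sum(exp_smoothed(s_i, t, tau) * exp_smoothed(s_j, t, tau)) * dt)
closed = inner_product(s_i, s_j, tau)
```

The closed form needs only the pairwise spike time differences, so no time grid is required and the time resolution stays continuous, whereas the brute-force version depends on the grid step.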


2.3 Cauchy-Schwarz Dissimilarity

An alternative dissimilarity measure that can be induced from the inner product of $L_2$ is
the Cauchy-Schwarz (CS) divergence. Recall the Cauchy-Schwarz inequality (see lemma 6):

\[
|\langle x | y \rangle| \le \|x\| \, \|y\| .
\]

Since each quantity is positive if $x$ and $y$ are not zero vectors, and equality holds when
either of them is zero, we can divide both sides,

\[
1 \le \frac{\|x\| \, \|y\|}{|\langle x | y \rangle|} .
\]

By taking the logarithm,

\[
0 \le \log\left( \frac{\|x\| \, \|y\|}{|\langle x | y \rangle|} \right) \le \infty .
\]

It can be proved that this quantity is positive, reflexive, and symmetric [37] if we exclude
0 from the space. However, CS divergence does not satisfy the triangle inequality, thus it
is not a metric. By expanding the definition of inner product and norm of $L_2$ space,

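\begin{align}
d_{CS}(\lambda_i(t), \lambda_j(t))
&= \log \frac{\sqrt{\int_{-\infty}^{\infty} \lambda_i^2(t)\,dt \int_{-\infty}^{\infty} \lambda_j^2(t)\,dt}}
             {\int_{-\infty}^{\infty} \lambda_i(t)\lambda_j(t)\,dt} \tag{2--10a} \\
&= \log \frac{\sqrt{\sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa(t^i_m - t^i_n) \sum_{m=1}^{N_j} \sum_{n=1}^{N_j} \kappa(t^j_m - t^j_n)}}
             {\sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \kappa(t^i_m - t^j_n)} \tag{2--10b} \\
&= \log \sqrt{\sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa(t^i_m - t^i_n) \sum_{m=1}^{N_j} \sum_{n=1}^{N_j} \kappa(t^j_m - t^j_n)}
   - \log \left( \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \kappa(t^i_m - t^j_n) \right), \tag{2--10c}
\end{align}

where $d_{CS}$ denotes the CS divergence.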

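If the spike trains are homogeneous Poisson with firing rates $\lambda_i$ and $\lambda_j$ respectively,
the expected value of the squared estimated intensity function, $E[\lambda_i^2(t)]$, is the second order
moment of the shot noise, which can be obtained by equation (A–8),

\[
E\left[\lambda_i^2(t)\right] = \lambda_i \int_{-\infty}^{\infty} \kappa^2(t)\,dt, \tag{2--11}
\]

where $\lambda_i$ on the right-hand side denotes the firing rate.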

Therefore the first term in equation (2–10c) can be approximated as a constant. However,
depending on the correlation of the spike trains, the second term will vary. Since the
negative logarithm is a monotonically decreasing function, we take its argument, denote it
as $V_{ij}$, and define it as the cross information potential (CIP), for reasons that are explained in
section 2.4.

\[
V_{ij} = \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \kappa(t^i_m - t^j_n) \tag{2--12}
\]

This inner product term is essentially equivalent to the correlation of smoothed spike trains.
CIP is inversely related to CS divergence, so it quantifies similarity between spike trains.
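A minimal Python sketch of equations (2–10c) and (2–12) follows, assuming the Laplacian kernel of equation (2–9). The function names, the error handling for empty spike trains, and the example spike times are assumptions made for illustration.

```python
import numpy as np

def cip(spikes_i, spikes_j, tau):
    """Cross information potential V_ij of eq. (2-12), Laplacian kernel."""
    a = np.asarray(spikes_i, dtype=float)
    b = np.asarray(spikes_j, dtype=float)
    lags = a[:, None] - b[None, :]
    return float(np.sum(np.exp(-np.abs(lags) / tau) / (2.0 * tau)))

def cs_divergence(spikes_i, spikes_j, tau):
    """Cauchy-Schwarz divergence of eq. (2-10c)."""
    v_ij = cip(spikes_i, spikes_j, tau)
    if v_ij <= 0.0:
        # The divergence is undefined when a spike train is empty.
        raise ValueError("CS divergence needs at least one spike per train")
    v_ii = cip(spikes_i, spikes_i, tau)
    v_jj = cip(spikes_j, spikes_j, tau)
    return 0.5 * (np.log(v_ii) + np.log(v_jj)) - np.log(v_ij)

# Hypothetical spike trains (seconds): b is a slightly shifted copy of a,
# c is an unrelated single spike.
tau = 0.001
a = [0.003, 0.008]
b = [0.0035, 0.0085]
c = [0.006]

d_ab = cs_divergence(a, b, tau)
d_ac = cs_divergence(a, c, tau)
```

In this example the shifted copy `b` should end up closer to `a` than the unrelated train `c`, and the divergence of a train with itself is zero.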

2.4 Information Potential

Given a probability distribution, entropy quantifies the peakiness and is related
to the higher order moments that the variance cannot capture. Rényi's entropy is a
generalization of the classic Shannon's entropy. See section
A.4 for a summary of the information theoretic learning framework.

An inhomogeneous Poisson process can be represented as two separate random variables:

one for the number of spikes and the other for the temporal density (see section A.1.1).

The pdf for the temporal density is simply a normalized form of the intensity function

(equation (A–2)). This pdf does not have the information of how active the process is,

that is, the firing rate.

The information potential of the density function estimated using a Parzen window with $\kappa_{\mathrm{pdf}}$ for
the $i$-th spike train has the following form (compare equation (A–16)),

\[
V_i = \frac{1}{N_i^2} \sum_{m=1}^{N_i} \sum_{n=1}^{N_i} \kappa(t^i_m - t^i_n) \tag{2--13}
\]

where $\kappa(t) = \int_{-\infty}^{\infty} \kappa_{\mathrm{pdf}}(s)\,\kappa_{\mathrm{pdf}}(s + t)\,ds$ is defined as before. This coincides with the
definition of the norm square of the smoothed spike train, equation (2–3b), normalized by the
square of the number of spikes. For a pair of spike trains, the cross information potential can be defined
as a similarity index between the corresponding pair of pdfs. Note that in terms of CS
divergence, the normalization with the number of spikes in the spike train cancels out.

Figure 2-1. L2 distance versus CS divergence. Spike trains are generated from template 1 and their distance (or divergence) from each template is shown. Gaussian jitter with 0.7 ms standard deviation is added to the timings. Blue circles correspond to spike trains with the same number of spikes, and red dots correspond to spike trains with missing spikes. The kernel κ was Laplacian with time constant τ = 1 ms. (Panels: L2 distance and CS distance; axes: distance from template 1 versus distance from template 2.)

2.5 Discussion

2.5.1 Comparison of Distances

As mentioned earlier in section 1.1.2, although neurons fire with high temporal

precision, they often miss spikes. In this case, L2 distance would deviate because of the

missing spike. CS divergence would be less sensitive because it will ignore missing spikes.

To demonstrate this, a simple classification task was performed (see figure 2-1). Two

template spike trains were prepared: template 1 with 2 spikes at 3 ms and 8 ms, and

template 2 with 1 spike at 6 ms. Then, we generated instances of template 1 by putting

Gaussian jitter on timing (blue circles) and removing a spike (red dots).

For the case with no missing spike, both L2 distance (94%) and CS divergence (100%) correctly
classified the instances as template 1 (they lie in the upper half of each plot). But for the missing spike
case, L2 distance (51%) performed a lot worse than CS divergence (93%). The CS
divergence plot shows lines when one spike is missing because the distance (quantified as the
divergence) is then the logarithm of the kernel, which is a single Laplacian.

Figure 2-2. Increase or decrease in Cauchy-Schwarz (CS) divergence (dissimilarity) when a spike is missing. (Left) When an uncorrelated spike is missing, the divergence decreases by an amount inversely related to the total number of spikes. (Right) But if a correlated (perfectly synchronized in this case) spike is missing, the divergence increases proportional to the total number of synchronized spikes, and is not greatly influenced by the total number of spikes. In contrast, for L2 distance the increase and decrease are constant (see text for details). (Left panel: "losing one uncorrelated spike", decrease in CS distance versus number of total spikes. Right panel: "losing one correlated spike", increase in CS distance versus number of correlated spikes, for 30 and 60 spikes total.)
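The classification task can be sketched in Python as follows. The templates, jitter, and kernel follow the description in the text (template 1 with spikes at 3 ms and 8 ms, template 2 with one spike at 6 ms, 0.7 ms Gaussian jitter, Laplacian kernel with τ = 1 ms); the number of trials, the random seed, the choice of which spike to remove, and all function names are assumptions, so the resulting accuracies need not match the percentages quoted above.

```python
import numpy as np

TAU = 1.0  # kernel time constant in ms

def kernel_sum(a, b, tau=TAU):
    """Sum of Laplacian kernels over all spike pairs (L2 inner product)."""
    lags = np.asarray(a, dtype=float)[:, None] - np.asarray(b, dtype=float)[None, :]
    return float(np.sum(np.exp(-np.abs(lags) / tau) / (2.0 * tau)))

def l2_squared(a, b):
    """Squared L2 distance between two smoothed spike trains."""
    return kernel_sum(a, a) - 2.0 * kernel_sum(a, b) + kernel_sum(b, b)

def cs_divergence(a, b):
    """Cauchy-Schwarz divergence between two smoothed spike trains."""
    return 0.5 * np.log(kernel_sum(a, a) * kernel_sum(b, b)) - np.log(kernel_sum(a, b))

def accuracy(instances, templates, dist):
    """Fraction of instances whose nearest template is templates[0]."""
    hits = [np.argmin([dist(x, tpl) for tpl in templates]) == 0 for x in instances]
    return float(np.mean(hits))

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
templates = [np.array([3.0, 8.0]), np.array([6.0])]  # spike times in ms

# Jittered instances of template 1, with and without one spike removed.
full = [templates[0] + rng.normal(0.0, 0.7, size=2) for _ in range(1000)]
missing = [np.delete(x, rng.integers(2)) for x in full]

results = {
    (name, case): accuracy(data, templates, dist)
    for name, dist in [("L2", l2_squared), ("CS", cs_divergence)]
    for case, data in [("full", full), ("missing", missing)]
}
```

The dictionary `results` holds the fraction of instances assigned to template 1 for each measure and each case, which allows a qualitative comparison with figure 2-1.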

Suppose individual spikes are either well separated relative to the kernel size or exactly
synchronized, so that we can approximate the norm and inner product by the number
of spikes: the norm square of a spike train is the number of spikes, and the inner product gives
the number of synchronized spikes. This is equivalent to making the kernel size infinitely
small, so that it converges to a Dirac delta function.

Let there be two spike trains A and T (for template) with $N_A$ and $N_T$
spikes respectively, and $N_{AT}$ synchronized spikes. The L2 distance between A and T is
$N_A + N_T - 2N_{AT}$, and the CS divergence is $\log \frac{N_A N_T}{N_{AT}}$.

Figure 2-3. Change in CIP versus jitter standard deviation in the synchronous spike timings. For the case with independent spike trains, the error bars for one standard deviation are also shown. The kernel size is 2 ms (left) and 5 ms (right). (Axes: jitter standard deviation in ms versus CIP; curves correspond to synchrony levels 0, 0.1, 0.2, 0.3, 0.4, and 0.5.)

If we lose a spike that was not synchronous between A and T, the distance will
decrease by the constant 1 in L2 distance (1/2 in van Rossum distance), and for CS
divergence the decrease is

\begin{align}
\log \frac{N_A N_T}{N_{AT}} - \log \frac{N_A (N_T - 1)}{N_{AT}}
&= \log \left( \frac{N_A N_T}{N_{AT}} \cdot \frac{N_{AT}}{N_A (N_T - 1)} \right) \tag{2--14} \\
&= \log \frac{N_T}{N_T - 1}. \tag{2--15}
\end{align}

Thus, if there are more spikes, the CS divergence decreases less for a missing non-synchronous

spike. (And if the last spike is lost, the CS divergence is not defined anymore.)

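If a synchronized (correlated) spike is lost, $N_T$ and $N_{AT}$ are each reduced by 1. The L2 distance
$N_A + N_T - 2N_{AT}$ then increases by 1, and for CS divergence the increase is

\[
\log \frac{N_A (N_T - 1)}{N_{AT} - 1} - \log \frac{N_A N_T}{N_{AT}}
= \log \left( \frac{N_T - 1}{N_T} \cdot \frac{N_{AT}}{N_{AT} - 1} \right). \tag{2--16}
\]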

Therefore, if there are more synchronized spikes, the distance decreases more. See

figure 2-2 for an illustrative example.
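The count-based approximations used in this section are easy to evaluate directly. The Python sketch below uses the expressions for the L2 distance and CS divergence exactly as written in the text and compares the changes caused by losing one spike with equations (2–15) and (2–16). The spike counts and function names are hypothetical values chosen for illustration.

```python
import math

def l2_count(n_a, n_t, n_at):
    """Count approximation of the L2 distance: N_A + N_T - 2 N_AT."""
    return n_a + n_t - 2 * n_at

def cs_count(n_a, n_t, n_at):
    """Count approximation of the CS divergence: log(N_A N_T / N_AT)."""
    return math.log(n_a * n_t / n_at)

# Hypothetical counts: 30 spikes in each train, 10 of them synchronized.
n_a, n_t, n_at = 30, 30, 10

# Losing a spike of T that is not synchronized with A: N_T drops by one.
l2_drop = l2_count(n_a, n_t, n_at) - l2_count(n_a, n_t - 1, n_at)
cs_drop = cs_count(n_a, n_t, n_at) - cs_count(n_a, n_t - 1, n_at)

# Losing a synchronized spike: both N_T and N_AT drop by one.
l2_rise = l2_count(n_a, n_t - 1, n_at - 1) - l2_count(n_a, n_t, n_at)
cs_rise = cs_count(n_a, n_t - 1, n_at - 1) - cs_count(n_a, n_t, n_at)

# Closed forms from eqs. (2-15) and (2-16) for comparison.
cs_drop_closed = math.log(n_t / (n_t - 1))
cs_rise_closed = math.log((n_t - 1) / n_t * n_at / (n_at - 1))
```

The L2 changes do not depend on the counts, while the CS changes shrink as the relevant counts grow, which is the contrast drawn in figure 2-2.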

2.5.2 Robustness to Jitter in the Spike Timings

CIP was analyzed when jitter is present in the spike timings. This was done with
a modified multiple interaction process (MIP) model [38, 39] where jitter, modeled as
i.i.d. Gaussian noise, was added to the individual spike timings. In the MIP model an
initial spike train is generated as a realization of a Poisson process. All spike trains are
derived from this one by copying spikes with a probability ε. The operation is performed
independently for each spike and for each spike train. The resulting spike trains are also
Poisson processes. If γ is the firing rate of the initial spike train, then the derived spike
trains will have firing rate εγ. Furthermore, it can be shown that ε is also the count
correlation coefficient [38]. A different interpretation for ε is that, given a spike in a spike
train, it quantifies the probability of a spike co-occurrence in another spike train.
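The generation procedure described above can be sketched in Python as follows. This is an illustrative reading of the MIP model with jitter, not the code used for the reported simulations; the function name, its parameters, the restriction to ε > 0, and the example values are assumptions.

```python
import numpy as np

def mip_spike_trains(n_trains, rate, epsilon, duration, jitter_std=0.0, rng=None):
    """Generate spike trains from a MIP model with optional Gaussian jitter.

    rate is the firing rate of each derived train (spikes/s), so the initial
    train has rate gamma = rate / epsilon. Requires 0 < epsilon <= 1.
    """
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("epsilon must be in (0, 1]")
    rng = np.random.default_rng() if rng is None else rng
    gamma = rate / epsilon
    # Homogeneous Poisson process: Poisson count, then uniform spike times.
    n_mother = rng.poisson(gamma * duration)
    mother = np.sort(rng.uniform(0.0, duration, size=n_mother))
    trains = []
    for _ in range(n_trains):
        # Copy each spike of the initial train independently with prob. epsilon.
        keep = rng.random(n_mother) < epsilon
        times = mother[keep]
        if jitter_std > 0.0:
            # i.i.d. Gaussian jitter on every copied spike.
            times = times + rng.normal(0.0, jitter_std, size=times.size)
        trains.append(np.sort(times))
    return trains

# Hypothetical example: two trains at 20 spikes/s with synchrony level 0.5.
rng = np.random.default_rng(0)
clean = mip_spike_trains(2, rate=20.0, epsilon=0.5, duration=100.0, rng=rng)
jittered = mip_spike_trains(2, rate=20.0, epsilon=0.5, duration=100.0,
                            jitter_std=0.002, rng=rng)

# Without jitter, shared spikes have identical times in both trains.
shared_fraction = np.intersect1d(clean[0], clean[1]).size / clean[0].size
```

Without jitter, the fraction of spikes in one train that also occur in the other should be close to ε, in line with the co-occurrence interpretation given in the text. The independent case (ε = 0) is not covered by this sketch and would need separately generated Poisson trains.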

The effect was then studied in terms of the synchrony level and kernel size. Figure 2-3
shows the average CIP for 10 Monte Carlo runs of two spike trains, 10 seconds long,
with a constant firing rate of 20 spikes/s. In the simulation, the synchrony level was varied
from 0 (independent) to 0.5 for kernel sizes of 2 ms and 5 ms. The jitter standard
deviation was varied from the ideal case (no jitter) to 15 ms.

As mentioned earlier, CIP measures the coincidence of the spike timings. As a

consequence, the presence of jitter in the spike timings decreases the expected values

of CIP (and time averaged ICIP). Nevertheless, the results in Fig. 2-3 support the

statement that the measure is indeed robust to large levels of jitter compared to the kernel

size, and is capable of detecting the existence of synchrony among neurons. Of course,

increasing the kernel size decreases the sensitivity of the measure for the same amount

of jitter. Furthermore, as in the previous example, it is also shown that small levels of

synchrony can be discriminated from the independent case as suggested by the error

bars in Figure 2-3. Finally, we remark that the difference in scale between the figures is

a consequence of the normalization of the kernel so that it is a valid pdf. This can be

compensated explicitly by scaling the CIP by τ . Simply note that the expressions provided

in the previous example for mean ICIP (and therefore CIP) as a function of the synchrony

level implicitly compensate for τ .


CHAPTER 3
INSTANTANEOUS CROSS INFORMATION POTENTIAL

3.1 Synchrony Detection Problem

Coincidental firing of different neurons has been a focus of interest in topics ranging from
synfire chains [40], neural coding [31, 41], neural assemblies [3], and the binding problem [42] to pulse
coupled oscillators [43–47]. Analysis of synchrony has relied on various methods, such
as the cross-correlation [48], joint peri-stimulus time histogram (JPSTH) [49], unitary
events [50], and gravity transform [3], among many others.

Since CIP (or CS divergence) characterizes the similarity (or dissimilarity) of spike

trains with correlation of spike times, CIP can also be used as a synchrony measure.

However, CIP does not provide information about instantaneous synchrony. A sliding
window approach can be used at the cost of temporal resolution, as in cross
correlation and the gravity transform.

3.2 Instantaneous CIP

3.2.1 Derivation from CIP

Let us break the integral range from the definition of L2 inner product (equation

(2–3c)).

Vij(t) =

∫ t

−∞λi(σ)λj(σ)dσ. (3–1)

Taking the derivative on time yields ICIP,

vij(t) = λi(t)λj(t), (3–2)

(a) ti1

ti2

ti3

ti4

tiN

i

T0 Ttime

(b)

Figure 3-1. Spike train as a realization of point process and smoothed spike train. (a)Spike train of neuron i represented in the time domain as a sequence ofimpulses and (b) its filtered counterpart using a causal decaying exponential.

22

by the fundamental theorem of calculus. Since the derivative provides the instantaneous

change of CIP at that time, ICIP quantifies instantaneous synchrony of the action

potential timing. If we use the exponential kernel for intensity estimation, ICIP can be easily estimated with two first-order IIR filters and a multiplication, therefore requiring no buffer of past samples, only two state variables.
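As an illustration, the following Python sketch implements the two-filter idea on a regular time grid. It assumes the exponential kernel has the normalized form (1/τ)e^{-t/τ}u(t); the function name `icip_exponential` and the parameters `dt` and `duration` are hypothetical choices made for this example and do not come from the original implementation.

```python
import numpy as np

def icip_exponential(spikes_i, spikes_j, tau, dt, duration):
    """Sample v_ij(t) = lambda_i(t) * lambda_j(t) on a regular grid of step dt."""
    n_steps = int(round(duration / dt)) + 1
    times = np.arange(n_steps) * dt
    decay = np.exp(-dt / tau)  # per-step decay of each leaky integrator

    def drive(spikes):
        # A spike that falls between grid points is injected at the next grid
        # point, already attenuated by the time elapsed since the spike.
        spikes = np.asarray(spikes, dtype=float)
        idx = np.ceil(spikes / dt - 1e-9).astype(int)
        keep = (idx >= 0) & (idx < n_steps)
        out = np.zeros(n_steps)
        elapsed = np.maximum(times[idx[keep]] - spikes[keep], 0.0)
        np.add.at(out, idx[keep], np.exp(-elapsed / tau) / tau)
        return out

    drive_i, drive_j = drive(spikes_i), drive(spikes_j)
    state_i = state_j = 0.0  # the only two state variables
    v = np.empty(n_steps)
    for k in range(n_steps):
        state_i = state_i * decay + drive_i[k]  # first-order IIR for neuron i
        state_j = state_j * decay + drive_j[k]  # first-order IIR for neuron j
        v[k] = state_i * state_j                # instantaneous product
    return times, v
```

At every grid point the two states equal the exponentially smoothed spike trains, so the product is the sampled ICIP; only the grid spacing limits how finely it is observed.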

3.2.2 Spatial Averaging

In the context of neural assemblies, ensembles of neurons work together with synchronous

spikes. Current multielectrode recording technology has enabled the analysis of a number

of spike trains recorded simultaneously. It is possible to reduce the trial averaging by

combining the concept of neural assemblies and multiple spike trains recording. The

spatial averaging over the ensemble may provide high resolution of the events.

Consider a set of M spike trains. ICIP (and CIP) can be generalized to multiple spike

trains in a straightforward manner by averaging over all the pairwise combinations. That

is, the ensemble averaged ICIP is given by

$$v(t) = \frac{2}{M(M-1)} \sum_{i=1}^{M} \sum_{j=i+1}^{M} v_{ij}(t). \tag{3–3}$$

Analysis of the spatial averaging is presented in section 3.3.1.
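A minimal sketch of equation (3–3) follows, assuming the M intensity estimates have already been sampled on a common time grid and stored as the rows of an array. The function names are hypothetical. The first version uses the algebraic identity that the squared sum of the intensities equals the sum of squares plus twice the sum over distinct pairs, which avoids the explicit double loop; the second is the direct pairwise average, kept for comparison.

```python
import numpy as np
from itertools import combinations

def ensemble_icip(intensities):
    """Ensemble averaged ICIP from an (M, K) array of sampled intensities."""
    lam = np.asarray(intensities, dtype=float)
    m = lam.shape[0]
    if m < 2:
        raise ValueError("at least two spike trains are required")
    total = lam.sum(axis=0)
    # (sum_i lam_i)^2 = sum_i lam_i^2 + 2 * sum_{i<j} lam_i * lam_j
    return (total ** 2 - (lam ** 2).sum(axis=0)) / (m * (m - 1))

def ensemble_icip_pairwise(intensities):
    """Same quantity, written as the explicit average over all pairs."""
    lam = np.asarray(intensities, dtype=float)
    pairs = list(combinations(range(lam.shape[0]), 2))
    return sum(lam[i] * lam[j] for i, j in pairs) / len(pairs)
```

The first form costs O(M) operations per time sample instead of O(M²), which may matter for large ensembles.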

3.2.3 Rescaling ICIP

When precise timing is modulated with a fluctuation of the firing rate, the precision

of the timing may vary. In high firing rate regions, the experimenter would like to pay

more attention to more precise synchronizations, since the spikes are dense. Changing the

kernel size according to the general firing rate trend may help in these cases.

Figure 3-2. Variance in scaled CIP versus the number of spike trains used for spatial averaging in log scale. The analysis was performed for different levels of synchrony (0.0 to 0.5) and constant τ = 2 ms (left), and different values of the exponential decay parameter τ (1, 2, 5, 10, and 20 ms) on independent spike trains (right). In both plots the theoretical value of CIP for independent spike trains is shown (dashed line).

The time rescaling theorem states that an inhomogeneous Poisson process can be transformed into a homogeneous Poisson process [51, 52] by stretching the time according to the intensity function. Transformation of equation (2–1) into a constant firing rate time scale for different spike trains depends on the individual intensity functions, and therefore the transformed results are not synchronous. Thus, in order to quantify synchrony, the correlation operation should be performed in the original times, but with the smoothing in the transformed space. The first order approximation of this can be achieved by redefining the intensity estimator as

$$\lambda_i(t) = \frac{1}{\beta} \sum_{m=1}^{N_i} \exp\left(-\frac{f_i(t)}{\beta}\,(t - t_m^i)\right) u(t - t_m^i) \tag{3–4}$$

where $f_i(t)$ is also an estimate of the intensity function and β > 0 is a scaling constant which specifies the value of τ when the firing rate is one. Therefore, at time t, the effective time constant is approximately $\beta/\lambda_i(t)$. It may seem circular to estimate an intensity function using an estimate of the intensity function, but $f(t)$ is estimated with a broader kernel for the firing rate trend, and $\lambda(t)$ has a small kernel size that corresponds to the resolution of interest.
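The rescaled estimator of equation (3–4) can be sketched as follows. This example assumes the trend estimate f_i(t) is supplied by the caller, already sampled at the evaluation times; how that trend is obtained (for example, with a broad smoothing kernel) is left open. The function name `rescaled_intensity` is hypothetical.

```python
import numpy as np

def rescaled_intensity(spikes, times, trend, beta):
    """Evaluate equation (3-4) at `times`, given trend[k] = f_i(times[k])."""
    spikes = np.asarray(spikes, dtype=float)
    times = np.asarray(times, dtype=float)
    trend = np.asarray(trend, dtype=float)
    lag = times[:, None] - spikes[None, :]   # t - t_m for every (time, spike) pair
    causal = lag >= 0                        # unit step u(t - t_m)
    # The decay rate f_i(t) / beta varies with the evaluation time t.
    kernel = np.where(causal,
                      np.exp(-trend[:, None] * np.maximum(lag, 0.0) / beta),
                      0.0)
    return kernel.sum(axis=1) / beta
```

With a constant trend f, the expression reduces to a fixed exponential kernel with time constant β/f, which matches the effective time constant stated above.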

3.3 Analysis

3.3.1 Sensitivity to Number of Neurons

We now analyze the effect of the number of spike trains used for spatial averaging.

This effect was studied with respect to two main factors: the synchrony level of the spike


trains and the exponential decay parameter τ . In the first case, a constant τ = 2ms was

used, while the latter case considered only independent spike trains. The results are shown

in Fig. 3-2 for the scaled CIP spatially averaged over all pair combinations of neurons.

The simulation was repeated for 200 Monte Carlo runs using 10 second long spike trains

obtained as homogeneous Poisson processes with firing rate of 20 spikes/s.

As illustrated in the figure, the variance in CIP decreases dramatically with the

increase in the number of spike trains employed in the analysis. Recall that the number of

pair combinations over which the averaging is performed increases with M(M − 1), where

M is the number of spike trains. As expected, this improvement is most pronounced in the

case of independent spike trains. In this situation, the variance decreases in inverse proportion to the number of averaged pairs of spike trains. This is shown by the dashed line in the

plots of Fig. 3-2. These results support the role and importance of ensemble averaging as a

principled method to reduce the variance of the CIP estimator.

3.4 Results

3.4.1 High-order Synchronized Spike Trains

Figure 3-3 shows ICIP of different levels of synchrony over ten spike trains. The

synchrony was generated by using the MIP model, and its level was changed over time in intervals of 1 second. The firing rate of the generated spike trains was constant and equal to

20 spikes/s for all spike trains. The figure shows the ICIP averaged for each time instant

over all pair combinations of spike trains. Because the spike trains have constant firing

rate, the time constant of the decaying exponential convolved with the spike trains was

constant and chosen to be τ = 2 ms. Also, in the bottom plot the average value of the

mean ICIP is shown. This was computed in 25 ms steps with a causal 250 ms long sliding

window. To establish a relevance of the values measured, the expectation and this value

plus two standard deviations are also shown, assuming independence between spike trains.

The mean and standard deviation, assuming independence, are 1 and $\sqrt{\left(\frac{1}{2\tau\lambda} + 1\right)^{2} - 1}$, respectively (see Appendix for details). The expected value of the ICIP when synchrony among spike trains exists is given by $1 + \varepsilon/(2\tau\lambda)$, with λ the firing rate of the two spike trains, and is also shown in the plot for reference.

Figure 3-3. Analysis of ICIP as a function of synchrony. (Top) Level of synchrony specified in the simulation of the spike trains. (Upper middle) Raster plot of firings. (Lower middle) Average ICIP across all neuron pair combinations. (Bottom) Time average of ICIP in the upper plot computed in steps of 25 ms with a causal rectangular window 250 ms long (dark gray). For reference, the expected value (dashed line) and this value plus two standard deviations (dotted line) for independent neurons are also displayed, together with the expected value during moments of synchronous activity (thick light gray line), as obtained analytically from the level of synchrony used in the generation of the dataset. Furthermore, the mean and standard deviation of the ensemble averaged CIP scaled by T measured from data in one second intervals are also shown (black).
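For a sense of scale, the short sketch below evaluates the two reference expressions quoted in this section with the simulation settings τ = 2 ms and λ = 20 spikes/s. The synchrony level ε = 0.5 is an assumed illustrative value, and the function names are hypothetical.

```python
import math

def independent_icip_std(tau, rate):
    """Standard deviation of the scaled ICIP under independence."""
    return math.sqrt((1.0 / (2.0 * tau * rate) + 1.0) ** 2 - 1.0)

def synchronous_icip_mean(tau, rate, epsilon):
    """Expected scaled ICIP at synchrony level epsilon."""
    return 1.0 + epsilon / (2.0 * tau * rate)

tau, rate = 0.002, 20.0        # 2 ms kernel, 20 spikes/s
std_indep = independent_icip_std(tau, rate)             # about 13.46
mean_sync = synchronous_icip_mean(tau, rate, 0.5)       # approximately 7.25
print(std_indep, mean_sync)
```

Since 2τλ = 0.08 here, the instantaneous value is very noisy relative to its mean of 1, which is consistent with the need for spatial and temporal averaging discussed in this chapter.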

In the figure, the synchrony estimated by ICIP visibly increases with the specified level of synchrony.

Moreover, the averaged ICIP is very close to the theoretical expected value and is typically

below the expected maximum under an independence assumption as given by the line

indicating the mean plus two standard deviations. The delayed increase in the averaged

ICIP is a consequence of the causal averaging of ICIP. It is equally remarkable to verify

that (scaled) CIP matches precisely the expected values from ICIP as given analytically.

3.4.2 Mirollo-Strogatz Model

In this example, we show that ICIP can quantify synchrony in a spiking neural

network of leaky integrate-and-fire (LIF) neurons designed according to [43]¹ and

compare the result with extended cross-correlation for multiple neurons. This is the

simplest pulse coupled network that was proven to be perfectly synchronized from almost

any initial condition (Fig. 3-4). The synchronization is essentially due to leakiness and the

weak global coupling among the oscillatory neurons.

The raster plot of the network firing pattern is shown in Fig. 3-4. There are two main

observations: the progressive synchronization of the firings associated with the global

oscillatory behavior of the network, and the local grouping that tends to preserve local

synchronizations that either entrain the full network or wash out over time. As expected

from theoretical studies of the network behavior [43, 46] and which ICIP depicts precisely,

the synchronization is monotonically increasing, with a period of fast increase in the first

second followed by a plateau and slower increase as time advances. Moreover, it is possible to observe in the first 1.5 s the formation of a second group of synchronized neurons which slowly merges into the main group.

¹ The parameters for the simulation are: 100 neurons, resting and reset membrane potential -60 mV, threshold -45 mV, membrane capacitance 300 nF, membrane resistance 1 MΩ, current injection 50 nA, synaptic weight 100 nV, synaptic time constant 0.1 ms, and the topology was all to all excitatory connection.

Figure 3-4. Evolution of synchrony in the spiking neural network. (Top) Raster plot of the neuron firings. (Middle) ICIP over time. The inset highlights the merging of two synchronous groups. (Bottom) Information potential of the membrane potentials. This is a macroscopic variable describing the synchrony in the neurons' internal state.

Figure 3-5. Zero-lag cross-correlation computed over time using a sliding window 10 bins long, and bin size 1 ms (top) and 1.1 ms (bottom).

Since the model was simulated, we also have access to all the internal variables: the

membrane potential of individual neurons over time. Thus, we can compute the synchrony

of neurons in terms of membrane potential. Surprisingly, the information potential (IP)

of the membrane potentials reveals the same evolution as the envelope of ICIP, including

the plateau. The IP was computed according to (A–18) using a Gaussian kernel with size 0.75 mV.² The IP measures synchrony of the neurons' internal state, which is only available in simulated networks. Yet the results show that ICIP was able to successfully and accurately extract such information from the observed spike trains.
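The computation of the IP on membrane potentials might look like the sketch below. Equation (A–18) is not reproduced in this chapter, so the sketch assumes the common plug-in form of the information potential, namely the average of a Gaussian kernel over all pairs of samples; the actual normalization in (A–18) may differ. It uses the wrap-around distance described in the footnote to this paragraph, with a period of 15 mV (the gap between the reset potential and the threshold). Function names are hypothetical.

```python
import numpy as np

def wrapped_distance(theta, period=15.0):
    """Pairwise distance min(|a - b|, period - |a - b|), in mV."""
    theta = np.asarray(theta, dtype=float)
    diff = np.abs(theta[:, None] - theta[None, :])
    return np.minimum(diff, period - diff)

def information_potential(theta, sigma=0.75, period=15.0):
    """Assumed plug-in estimator: mean Gaussian kernel over all sample pairs."""
    d = wrapped_distance(theta, period)
    kernel = np.exp(-d ** 2 / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return kernel.mean()
```

Under this assumption, the value is largest when all membrane potentials coincide, and a neuron just below threshold counts as close to one that has just been reset.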

For completeness, in Fig. 3-5 we also present the zero-lag cross-correlation over

time, averaged through all pairwise combinations of neurons. The cross-correlation was

computed with a sliding window 10 bins long, sliding 1 bin at a time. In the figure,

the result is shown for a bin size of 1 ms and 1.1 ms. It is notable that although

cross-correlation captures the general trends of synchrony, it masks the plateau and

the final synchrony and it is highly sensitive to the bin size as shown in the figure, unlike

ICIP (data not shown). In other words, the results for the windowed cross-correlation

show the importance of working in “continuous” time which is crucial for robust

synchrony estimation in the spike domain. Other methods relying on binning also

suffer from sensitivity to bin size, such as the ones mentioned earlier. For this reason,

these methods are limited and unable to achieve the same high temporal resolution as

ICIP. In addition, spike trains are generally non-stationary, contrary to what some methods assume. The conventional approach is to use a moving window analysis such that only piece-wise stationarity is necessary. The information theoretic framework of ICIP, and CIP, treats the non-stationarity implicitly as a pdf estimation problem.

² The distance used in the Gaussian kernel was d(θi, θj) = min(|θi − θj|, 15 mV − |θi − θj|), where θi is the membrane potential of the ith neuron. This wrap-around effect expresses the phase proximity of the neurons before and after firing.


CHAPTER 4
CONTINUOUS CROSS CORRELOGRAM

4.1 Delay Estimation Problem

Precise time delay in transmission of a spike in the neural system is considered to

be one of the key features to allow efficient computation in cortex [15, 53]. For example,

it is crucial for coincidence detection in auditory signal processing [17]. One of the effective methods for estimating the delay is to use a cross correlogram [54]. The cross correlogram is a basic tool to analyze the temporal structure of signals. It is widely

applied in neuroscience to assess oscillation, propagation delay, effective connection

strength, and spatiotemporal structure of a network [28].

However, estimating the cross correlation of spike trains is non-trivial since they are

point processes, thus the signals do not have amplitude but only time instances when the

spikes occur. A well known algorithm for estimating the correlogram from point processes

involves histogram construction with time interval bins [48]. The binning process is

effectively transforming the uncertainty in time to amplitude variability. This quantization

of time introduces binning error and leads to coarse time resolution. Furthermore, the

correlogram does not take advantage of the higher temporal resolution of the spike times

provided by current recording methods.

This can be improved by using smoothing kernels to estimate the cross correlation

function from finite samples. The resulting cross correlogram is continuous and provides

high temporal resolution in the region where there is a peak (see Fig. 4-1 for comparison

between histogram method and kernel method). In this chapter, we propose an efficient

algorithm for estimating the continuous correlogram of spike trains without time binning.

The continuous time resolution is achieved by computing at finite time lags where the

continuous cross correlogram can have a local maximum. The time complexity of the

proposed algorithm is O(T log T ) on average where T is the duration of spike trains. The

application of the proposed algorithm is not restricted to simultaneously recorded spike trains, but extends to PSTH and to other point processes in general.

Figure 4-1. Example of cross correlogram construction. A and C are two spike trains each with 4 spikes. Except for the third spike in A, each spike in A invokes a spike in C with some small delay around 10 ms. B represents all the positive (black) and negative (gray) time differences between the spike trains. D shows the position of delays obtained in B. E is the histogram of D, which is the conventional cross correlogram with bin size of 100 ms. F shows the continuous cross correlogram with Laplacian kernel (solid) and Gaussian kernel (dotted) with bandwidth 40 ms. Note that the Laplacian kernel is more sensitive to the exact delay.

4.2 Continuous Correlogram

Two simultaneously recorded instances of point processes are represented as a sum of

Dirac delta functions at the time of firing event, si(t) and sj(t),

$$s_i(t) = \sum_{m=1}^{N_i} \delta(t - t_m^i), \tag{4–1}$$

where $N_i$ is the number of spikes and $t_m^i$ are the time instances of action potentials. The cross correlation function is defined as,

$$Q^{\dagger}_{ij}(\Delta t) = E_t\left[s_i(t)\,s_j(t + \Delta t)\right], \tag{4–2}$$

where $E_t[\cdot]$ denotes expected value over time t. The cross correlation can be interpreted

as scaled conditional probability of j-th neuron firing given i-th neuron fired ∆t seconds

before [55]. In a physiological context, there is a physical restriction of propagation delay

for an action potential to have a causal influence to invoke any other action potential.

Therefore, this delay would influence the cross correlogram as a form of increased

amplitude. Thus, estimating the delay involves finding the lag at which there is a

maximum in the cross correlogram (inhibitory interactions, which appear as troughs rather than peaks, are not considered in this chapter).

Smoothing a point process is superior to the histogram method for the estimation of

the intensity function [30], and especially the maxima [56]. Similarly, the cross correlation

function can also be estimated better with smoothing which is done in continuous time

so we do not lose the exact time of spikes while enabling interaction between spikes at a

distance.

Instead of smoothing the histogram of time differences between two spike trains,

we first smooth the spike train to obtain a continuous signal [57]. We will show that

this is equivalent to smoothing the time differences with a different kernel. A causal

exponential decay was chosen as the smoothing kernel to achieve computational efficiency


(see section 4.3). Smoothed spike trains are represented as,

$$q_i(t) = \sum_{m=1}^{N_i} \frac{1}{\tau}\, e^{-\frac{t - t_m^i}{\tau}}\, u(t - t_m^i), \tag{4–3}$$

where u(t) is the unit step function. The cross correlation function of the smoothed spike

trains is,

$$Q^{*}_{ij}(\Delta t) = E_t\left[q_i(t)\,q_j(t + \Delta t)\right]. \tag{4–4}$$

Given a finite length of observation, the expectation in equation (4–4) can be

estimated from samples as,

$$Q^{*}_{ij}(\Delta t) = \frac{1}{T} \int_{0}^{\infty} q_i(t)\,q_j(t + \Delta t)\,dt, \tag{4–5}$$

where T is the length of the observation. After evaluation of the integral, the resulting

estimator becomes,

$$Q^{*}_{ij}(\Delta t) = \frac{1}{2\tau T} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} e^{-\frac{|t_m^i - t_n^j - \Delta t|}{\tau}}, \tag{4–6}$$

which is equivalent to the kernel intensity estimation [58, 59] from time differences using a

Laplacian distribution kernel.
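As a point of reference, equation (4–6) can be evaluated directly at arbitrary lags, at a cost of O(NiNj) per lag. The sketch below follows the sign convention of equation (4–6) exactly as written; the function name `ccc_direct` and its argument names are hypothetical.

```python
import numpy as np

def ccc_direct(spikes_i, spikes_j, lags, tau, duration):
    """Direct evaluation of equation (4-6) at the requested lags."""
    # All pairwise time differences t_m^i - t_n^j, flattened into one vector.
    theta = np.subtract.outer(np.asarray(spikes_i, dtype=float),
                              np.asarray(spikes_j, dtype=float)).ravel()
    lags = np.atleast_1d(np.asarray(lags, dtype=float))
    # Laplacian kernel centered on each time difference.
    kernel = np.exp(-np.abs(theta[None, :] - lags[:, None]) / tau)
    return kernel.sum(axis=1) / (2.0 * tau * duration)
```

This brute-force form is mainly useful for checking faster implementations on small inputs.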

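The mean and variance of the estimator are analyzed by assuming the spike trains are realizations of two independent homogeneous Poisson processes.

$$E\left[Q^{*}_{ij}(\Delta t)\right] \simeq \lambda_A \lambda_B, \tag{4–7}$$

$$\mathrm{var}\left(Q^{*}_{ij}(\Delta t)\right) \simeq \frac{\lambda_A \lambda_B}{4\tau T}, \tag{4–8}$$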

where λA and λB denote the firing rate of the Poisson process of which i-th and j-th spike

train, respectively, is a realization (see Appendix for derivation). Note that the variance decreases in inverse proportion to the duration of the spike train. By removing the mean and dividing by the standard deviation, we standardize the measure for inter-experiment comparison:

$$Q_{ij}(\Delta t) = \frac{\sqrt{4\tau T}\left(Q^{*}_{ij}(\Delta t) - \lambda_A \lambda_B\right)}{\sqrt{\lambda_A \lambda_B}}. \tag{4–9}$$

Figure 4-2. Decomposition and shift of the multiset A. (The diagram shows the ordered elements …, θ_{n−1}, θ_n, θ_{n+1}, θ_{n+2}, … on a line, the lags ∆t and ∆t − δ, and the resulting partitions (A_∆t)⁻, (A_∆t)⁺ and (A_{∆t−δ})⁻, (A_{∆t−δ})⁺.)
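The standardization in equation (4–9) is a single affine rescaling, sketched below. In practice the rates λA and λB are unknown; estimating them as the spike count divided by the observation length is an assumption of this example, not something stated in the text. The function name is hypothetical.

```python
import numpy as np

def standardize_ccc(q, rate_a, rate_b, tau, duration):
    """Apply equation (4-9): subtract the mean and divide by the std."""
    product = rate_a * rate_b                    # mean under independence (4-7)
    return np.sqrt(4.0 * tau * duration) * (np.asarray(q) - product) / np.sqrt(product)

# Hypothetical usage with rates estimated from spike counts (an assumption):
# rate_a = len(spikes_i) / duration
# rate_b = len(spikes_j) / duration
```

A value of zero then corresponds to the level expected for independent Poisson trains, and a value of one to a deviation of one standard deviation according to equation (4–8).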

4.3 Algorithm

The algorithm divides the computation of the summation of continuous cross

correlogram into disjoint regions and combines the result. We show that there are only finitely many possible local maxima, and by storing the intermediate computation results for

neighboring time lags, the cross correlation of each lag can be computed in constant time.

The essential quantity to be computed is the following double summation,

$$Q_{ij}(\Delta t) = \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} e^{-\frac{|t_m^i - t_n^j - \Delta t|}{\tau}}. \tag{4–10}$$

The basic idea for efficient computing is that the summation of the exponential function computed on a collection of points can be shifted with only one multiplication, $\sum_i e^{x_i + \delta} = \left(\sum_i e^{x_i}\right) e^{\delta}$. Since a Laplacian kernel is two exponentials stitched together, we need to carefully take the regions into account.
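The shift identity, and the reason the regions matter, can be checked numerically. In the sketch below the points and the shift are arbitrary illustrative values; the identity holds for the Laplacian kernel only as long as no point changes sign under the shift.

```python
import numpy as np

def laplacian_sum(points, tau):
    """Sum of exp(-|x| / tau) over a collection of points."""
    return np.exp(-np.abs(np.asarray(points, dtype=float)) / tau).sum()

tau = 1.0
negative = np.array([-2.0, -1.0, -0.5])   # all points on the negative side
delta = 0.3                               # small enough that none crosses zero

# On the negative side the kernel is exp(x / tau), so shifting every point
# by delta multiplies the cached sum by exp(delta / tau).
shifted_directly = laplacian_sum(negative + delta, tau)
shifted_by_scaling = laplacian_sum(negative, tau) * np.exp(delta / tau)

# A shift that moves a point across zero breaks the one-multiplication rule.
crossing = laplacian_sum(negative + 0.8, tau)
crossing_by_scaling = laplacian_sum(negative, tau) * np.exp(0.8 / tau)
```

Here `shifted_directly` and `shifted_by_scaling` agree, while `crossing` and `crossing_by_scaling` do not, which is why the sum is split into a positive and a negative part.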

Define the multiset of all time differences between two spike trains,

$$A = \left\{\theta \mid \theta = t_m^i - t_n^j,\; m = 1, \ldots, N_i,\; n = 1, \ldots, N_j\right\}. \tag{4–11}$$

Even though A is not strictly a set, since it may contain duplicates, we will abuse the set

notation for simplicity. Note that the cardinality of the multiset A is $N_i N_j$. Now equation (4–10) can be rewritten as

$$Q_{ij}(\Delta t) = \sum_{\theta \in A} e^{-\frac{|\theta - \Delta t|}{\tau}}. \tag{4–12}$$

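Now let us define a series of operations for a multiset $B \subset \mathbb{R}$ and $\delta \in \mathbb{R}$,

$$B^{+} = \{x \mid x \in B \text{ and } x \geq 0\}, \quad \text{(non-negative lag)} \tag{4–13a}$$

$$B^{-} = \{x \mid x \in B \text{ and } x < 0\}, \quad \text{(negative lag)} \tag{4–13b}$$

$$B_{\delta} = \{x \mid y \in B \text{ and } x = y - \delta\}. \quad \text{(shift)} \tag{4–13c}$$

Since B can be decomposed into two disjoint sets $B^{+}$ and $B^{-}$, equation (4–12) can also be rewritten and decomposed,

$$Q_{ij}(\Delta t) = \sum_{\theta \in A_{\Delta t}} e^{-\frac{|\theta|}{\tau}} = \sum_{\theta \in (A_{\Delta t})^{+}} e^{-\frac{|\theta|}{\tau}} + \sum_{\theta \in (A_{\Delta t})^{-}} e^{-\frac{|\theta|}{\tau}} \tag{4–14a}$$

$$\phantom{Q_{ij}(\Delta t)} = \sum_{\theta \in (A_{\Delta t})^{+}} e^{-\frac{\theta}{\tau}} + \sum_{\theta \in (A_{\Delta t})^{-}} e^{\frac{\theta}{\tau}}. \tag{4–14b}$$

For convenience, we define the following summations

$$Q^{\pm}_{ij}(\Delta t) = \sum_{\theta \in (A_{\Delta t})^{\pm}} e^{\mp\frac{\theta}{\tau}}. \tag{4–15}$$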

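Let us order the multiset A in ascending order and denote the elements as $\theta_1 \leq \theta_2 \leq \ldots \leq \theta_n \leq \theta_{n+1} \leq \ldots \leq \theta_{N_i N_j}$. Observe that within an interval $\Delta t \in (\theta_n, \theta_{n+1}]$, the multiset $((A_{\Delta t})^{\pm})_{-\Delta t}$ is always the same (see Fig. 4-2). In other words, if $\Delta t = \theta_{n+1}$, for a small change $\delta \in [0, \theta_{n+1} - \theta_n)$, the multisets do not change their membership, i.e. $((A_{\Delta t})^{\pm})_{\delta} = (A_{(\Delta t - \delta)})^{\pm}$. Therefore, we can simplify an arbitrary shift of $Q^{\pm}_{ij}$ with single multiplication of an exponential as,

$$Q^{\pm}_{ij}(\Delta t - \delta) = \sum_{t \in (A_{\Delta t - \delta})^{\pm}} e^{\mp\frac{t}{\tau}} = \sum_{t \in ((A_{\Delta t})^{\pm})_{\delta}} e^{\mp\frac{t}{\tau}} \tag{4–16a}$$

$$\phantom{Q^{\pm}_{ij}(\Delta t - \delta)} = \sum_{t \in (A_{\Delta t})^{\pm}} e^{\mp\frac{t - \delta}{\tau}} = \sum_{t \in (A_{\Delta t})^{\pm}} e^{\mp\frac{t}{\tau}}\, e^{\pm\frac{\delta}{\tau}} = Q^{\pm}_{ij}(\Delta t)\, e^{\pm\frac{\delta}{\tau}}. \tag{4–16b}$$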

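Thus, local changes of $Q_{ij}$ can be computed by a constant number of operations no matter how large the set A is, so that

$$Q_{ij}(\Delta t - \delta) = Q^{+}_{ij}(\Delta t - \delta) + Q^{-}_{ij}(\Delta t - \delta) \tag{4–17a}$$

$$\phantom{Q_{ij}(\Delta t - \delta)} = Q^{+}_{ij}(\Delta t)\, e^{\frac{\delta}{\tau}} + Q^{-}_{ij}(\Delta t)\, e^{-\frac{\delta}{\tau}}. \tag{4–17b}$$

If there is a local maximum or minimum of $Q_{ij}(\Delta t - \delta)$, it would be where $\frac{dQ_{ij}(\Delta t - \delta)}{d\delta} = 0$, which is,

$$\delta^{*} = \frac{\tau}{2}\left(\ln\left(Q^{-}_{ij}(\Delta t)\right) - \ln\left(Q^{+}_{ij}(\Delta t)\right)\right). \tag{4–18}$$

Also note that since the second derivative,

$$\frac{d^{2} Q_{ij}(\Delta t - \delta)}{d\delta^{2}} = \frac{1}{\tau^{2}}\left(Q^{+}_{ij}(\Delta t)\, e^{\frac{\delta}{\tau}} + Q^{-}_{ij}(\Delta t)\, e^{-\frac{\delta}{\tau}}\right) \geq 0, \tag{4–19}$$

$Q_{ij}(\Delta t - \delta)$ is a convex function of δ within the range. Thus, the maximum of the function is always attained at one of the endpoints of its valid range; only a local minimum can lie in between.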

In principle, we need to compute equation (4–10) for all ∆t ∈ [−T ∗, T ∗] to achieve

continuous resolution, where T ∗ is the maximum time lag of interest. However, if we only

want all local minima and maxima, we just need to evaluate on all ∆t ∈ A, and compute

the minima and maxima using equation (4–17b) and equation (4–18). Therefore, if we

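compute the $Q^{\pm}_{ij}(\theta_n)$ for all $\theta_n \in A$, we can compute $\delta^{*}$ for all intervals $(\theta_n, \theta_{n+1}]$ if a local extremum exists. These can be computed using the following recursive formulae.

$$Q^{-}_{ij}(\theta_{n+1}) = Q^{-}_{ij}(\theta_n)\, e^{-\frac{\theta_{n+1} - \theta_n}{\tau}} + 1, \tag{4–20a}$$

$$Q^{+}_{ij}(\theta_{n+1}) = Q^{+}_{ij}(\theta_n)\, e^{\frac{\theta_{n+1} - \theta_n}{\tau}} - 1. \tag{4–20b}$$

In practice, due to accumulation of numerical error, the following form is preferable for $Q^{+}_{ij}$,

$$Q^{+}_{ij}(\theta_n) = \left(Q^{+}_{ij}(\theta_{n+1}) + 1\right) e^{-\frac{\theta_{n+1} - \theta_n}{\tau}}. \tag{4–21}$$

Initial conditions for the recursions are $Q^{-}_{ij}(\theta_1) = 1$ and $Q^{+}_{ij}(\theta_N) = 0$. The resulting pseudocode is listed as in Algorithm 1.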

Algorithm 1 Calculate Qij

Require: τ > 0, A ≠ ∅, N = |A|
Ensure: Qij(∆t) = Σ_{t∈A} e^(−|t − ∆t|/τ), ∀∆t ∈ A

    A ⇐ sort(A)                      {O(N log N)}
    Q−(1) ⇐ 1
    Q+(N) ⇐ 0
    for k = 1 to N − 1 do
        ed(k) ⇐ e^(−(A(k+1) − A(k))/τ)
    end for
    for k = 1 to N − 1 do
        Q−(k + 1) ⇐ 1 + Q−(k) · ed(k)
        Q+(N − k) ⇐ (Q+(N − k + 1) + 1) · ed(N − k)
    end for
    for k = 1 to N do
        Qij(A(k)) ⇐ Q+(k) + Q−(k)
    end for
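A direct Python transcription of Algorithm 1 might look like the sketch below. It assumes NumPy arrays of spike times and zero-based indexing; the function name `ccc_at_sample_lags` is hypothetical. The function returns the sorted lags together with the unnormalized sums of equation (4–10) evaluated at those lags.

```python
import numpy as np

def ccc_at_sample_lags(spikes_i, spikes_j, tau):
    """Evaluate Q_ij at every element of the multiset A, following Algorithm 1."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Multiset A of all time differences t_m^i - t_n^j, sorted ascending.
    a = np.sort(np.subtract.outer(np.asarray(spikes_i, dtype=float),
                                  np.asarray(spikes_j, dtype=float)).ravel())
    n = a.size
    if n == 0:
        raise ValueError("A must not be empty")
    ed = np.exp(-np.diff(a) / tau)      # decay between neighboring lags
    q_minus = np.empty(n)
    q_plus = np.empty(n)
    q_minus[0] = 1.0                    # initial condition Q-(theta_1) = 1
    q_plus[n - 1] = 0.0                 # initial condition Q+(theta_N) = 0
    for k in range(n - 1):
        # Forward recursion for Q- (equation 4-20a).
        q_minus[k + 1] = 1.0 + q_minus[k] * ed[k]
        # Backward recursion for Q+ (equation 4-21, the numerically stable form).
        q_plus[n - 2 - k] = (q_plus[n - 1 - k] + 1.0) * ed[n - 2 - k]
    return a, q_plus + q_minus
```

After the sort, the two recursions take a single linear pass, so each lag costs a constant number of operations. On small inputs the result can be compared against brute-force evaluation of the double sum.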

The bottleneck for time complexity is the sorting of the multiset A, thus the overall

time complexity is O(NiNj log(NiNj)).¹ Note that the time complexity of straightforward evaluation of equation (4–10) is O(NiNj) for each time lag ∆t. Assuming

homogeneous Poisson process for individual spike trains, the average time complexity

becomes O(N∗ log N∗) where N∗ = λAλBT , T is the length of spike train, and λA

represents the average firing rate for the Poisson process. Note that the conventional cross

correlogram algorithm [48] has the time complexity of O(N∗) on average.

4.4 Results

In this section, we analyze the statistical properties and demonstrate the usefulness

of the continuous cross correlogram (CCC) estimator compared to the cross correlation

histogram (CCH). The CCC is defined by the linear interpolation of equation (4–9) between the possible maxima (but not the minima). In order to compare with CCC, CCH is standardized in a similar way to equation (4–9) according to [60].

¹ It is possible to reduce the sorting to O(NiNj log(min(Ni, Nj))) using merge sorting of partially sorted lists. However, it is only a minor improvement in general.

Figure 4-3. Effect of the length of spike train and strength of connectivity on precision of delay estimation. (a) Effect of spike train length. (b) Effect of connection strength. The precision is estimated by the standard deviation in 1000 Monte Carlo runs with kernel size τ = 0.4 ms (or bin size h = 1.96 ms). The smaller standard deviation indicates higher temporal resolution.

Since CCH is essentially equivalent to using a uniform distribution kernel (or a boxcar

kernel) and sampling at equally spaced intervals as opposed to the Laplacian distribution

kernel used in CCC, in order to make a fair comparison, we choose the kernel size (bin

size) of both distributions to have the same standard deviation. To be specific, if the time bin size of CCH is h, then we compare the result to CCC with kernel size of $\tau = \frac{h}{2\sqrt{6}}$.
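The factor 2√6 follows from matching variances: a uniform distribution of width h has variance h²/12, and a Laplacian distribution with scale τ has variance 2τ², so setting 2τ² = h²/12 gives τ = h/(2√6). The small sketch below performs the conversion in both directions; the function names are hypothetical.

```python
import math

def kernel_size_from_bin(h):
    """Laplacian scale tau whose variance 2*tau^2 equals h^2/12."""
    return h / (2.0 * math.sqrt(6.0))

def bin_size_from_kernel(tau):
    """Bin width h whose uniform variance h^2/12 equals 2*tau^2."""
    return 2.0 * math.sqrt(6.0) * tau

tau = kernel_size_from_bin(1.96)   # bin size in ms, close to tau = 0.4 ms
```

This reproduces the pairing of τ = 0.4 ms with h = 1.96 ms quoted in the caption of Figure 4-3.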

Since the histogram method is highly sensitive to bin size, we used the procedure of

optimal bin size selection of Poisson processes suggested by [61]. The method is designed

for the estimation of firing rate or PSTH from a measurement assuming a Poisson process.

However, since the time difference between two Poisson processes of finite length can be

considered as a realization of a Poisson process, it is possible to apply the method directly to the CCH.

Figure 4-4. Effect of kernel size (bin size) of CCC (CCH) on the performance. (a) CCC vs CCH. (b) Optimal τ for CCC. The connection strength was 5% and the spike trains are 10 seconds long, i.e. 5 spikes are correlated on average. (a) Sensitivity of CCC and CCH on kernel size for noise standard deviation 0.25 ms. The horizontal dotted line indicates the performance when optimal bin size is chosen for each set of simulated spike time differences. The median of the optimal bin size chosen (right) and corresponding kernel size for CCC (left) are plotted as vertical dashed lines. Note that CCC is robust on kernel size selection and performs better than CCH. (b) For different standard deviations of jitter noises (0.2, 0.4, 0.6, 0.8, and 1 ms), the precision is plotted versus the kernel size τ. Note that the optimal kernel size increases as the jitter variance increases. For each point, 3000 Monte Carlo runs are used, and the actual delay is uniformly distributed from 3 ms to 4 ms to reduce the bias of CCH.

4.4.1 Analysis

For a pair of directly synapsing neurons, the delay from the generation of an action

potential of the presynaptic neuron to the generation of an action potential of the postsynaptic neuron is not always precise. Various sources of noise such as variability in axon conduction delay, presynaptic waveform, probability of presynaptic vesicle release, and threshold mechanism [62] affect the location, significance and width of the cross

correlogram peak. Furthermore, if the neurons are in a network, multiple paths, common

input sources, recurrent feedback and local field potential fluctuation can influence the

cross correlogram.

In this section, we model the timing jitter with a Gaussian distribution and analyze

the statistical properties of CCC and CCH on time delay estimation. A pair of Poisson

spike trains of firing rate 10 spikes/s were correlated by copying a portion of the spikes

from one to another and then shifting by the delay with the Gaussian jitter noise. The

fraction of spikes copied represents the effective synaptic connectivity.
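The procedure for generating a correlated pair can be sketched as follows. The text does not specify whether the copied spikes are added to an independent target train or replace part of it; this sketch assumes they are added, and it discards shifted spikes that fall outside the observation window. All names and parameters are hypothetical.

```python
import numpy as np

def correlated_poisson_pair(rate, duration, strength, delay, jitter_sd, rng):
    """Two Poisson trains; a fraction of spikes of A is copied into B with delay."""
    # Homogeneous Poisson process: Poisson count, then uniform spike times.
    n_a = rng.poisson(rate * duration)
    train_a = np.sort(rng.uniform(0.0, duration, n_a))
    n_b = rng.poisson(rate * duration)
    train_b = rng.uniform(0.0, duration, n_b)
    # Each spike of A is copied independently with probability `strength`.
    copied = train_a[rng.random(n_a) < strength]
    # Shift the copies by the delay plus Gaussian timing jitter.
    shifted = copied + delay + rng.normal(0.0, jitter_sd, copied.size)
    shifted = shifted[(shifted >= 0.0) & (shifted <= duration)]
    # Assumption: copies are added on top of B's own independent spikes.
    train_b = np.sort(np.concatenate([train_b, shifted]))
    return train_a, train_b

rng = np.random.default_rng(1)
a, b = correlated_poisson_pair(rate=10.0, duration=10.0, strength=0.05,
                               delay=0.0035, jitter_sd=0.00025, rng=rng)
```

With these illustrative settings, about five spikes are expected to be shared between the two trains, which mirrors the small number of correlated spikes considered in this analysis.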

The total number of correlated spikes depends on two factors: the length of the spike train and the synaptic connectivity. In figure 4-3, the precision of CCC and CCH is compared according to these factors. The precision is defined to be the standard deviation of the error in estimating the exact delay. The precision of both CCC and CCH improves with a similar trend as the number of correlated spikes increases. CCC converges to a precision lower than half the jitter noise standard deviation (500 µs).

The optimal kernel size (or bin size) which gives the best precision depends on the noise jitter level. In figure 4-4(a), CCC and CCH are compared across different kernel sizes. In general, CCC performs better than CCH, both at the optimal bin size and at most other bin sizes. As mentioned above, CCH is sensitive to bin size, but CCC is robust to the kernel size for precision performance. Also note that the optimal kernel size for CCC corresponds, under the equal-variance matching, to the median of the optimal bin sizes selected (vertical dashed lines). Increasing the jitter level worsens the best precision and increases the optimal kernel size for CCC as shown in Fig. 4-4(b).

Figure 4-5. Schematic diagram for the configuration of neurons. (Diagram labels: neurons A and B, with connection delays of 12.3 ms, 13.7 ms, 4.3 ms, and 9.3 ms.)

4.4.2 Examples

In this section, we demonstrate the power of CCC using two examples: the first

example uses synthetic spike trains from a simple spiking neuronal network model, and for

the second we use recordings from a cortical culture on a microelectrode array (MEA).

Two standard leaky-integrate-and-fire neurons are configured with 4 synapses, two

from neuron A to neuron B, and two for the other direction as illustrated in figure 4-5.

Individual synapses are static (no short/long-term plasticity), with equal weights and

generate EPSP (excitatory postsynaptic potential) with a time constant of 1 ms. Each

neuron is injected with positively biased Gaussian white noise current, so that they would

fire with mean firing rate of 35 spikes/s. The simulation step size is 0.1 ms.

As shown in figure 4-6, both CCH and CCC identify the delays imposed by the conduction delay, synaptic delay, and the delay for the generation of action potential by noisy fluctuation of membrane potential. However, the time lag identified by CCC is more accurate than that of CCH, since the temporal precision provided by CCH is limited by the bin size and the jitter noise on delay, but for CCC, it is only limited by the jitter. In other words, if there is no jitter, or a sufficient number of spike timings have the exact delay, then CCC is capable of quantifying the delay with infinite resolution.

Figure 4-6. Comparison between CCC and CCH on synthesized data. (Panels: continuous cross-correlogram and cross-correlation histogram; correlation versus time lag in ms.)

Figure 4-7. Effect of length of spike trains. Comparison of continuous cross correlogram (left) and cross correlation histogram (right) with different length of spike trains (2.5, 5, 10 seconds). Estimated optimal bin size is 0.267 ms.

In figure 4-7, we illustrate the difference in performance of the methods according

to the length of the spike trains. When the spike trains are only of length 2.5 seconds,

the CCC has significantly lower time resolution at lags where no spikes had that time difference, while maintaining the high resolution in highly correlated peaks. In contrast, the CCH is

uniformly sampled regardless of the amount of data. The non-uniform sampling gives

significant advantage to CCC when only a short segment of data is available.

To test the method further, spike trains recorded in vitro were used. We recorded

electrical activity from dissociated E-18 rat cortex cultured on a 60 channel microelectrode

array from MultiChannel Systems [63]. For a particular pair of electrodes, specific delays

were observed as shown in Fig. 4-8. Those delays are rarely observed (3 to 5 times through 5 to 10 minutes of recording); however, their timing varies by less than 2 ms, which makes them significant in CCC. The delays persisted at least 2 days, and many more interaction delays were observable as the culture matured. In the CCH analysis, by contrast, it is almost impossible to detect the delays and their consistency.

[Figure 4-8: CCC (top) and CCH (bottom); x-axis: time lag (ms), −80 to 80. Legend: 7 DIV [Ch 20, 0.57 Hz][Ch 21, 0.52 Hz]; 9 DIV [Ch 20, 0.15 Hz][Ch 21, 1.11 Hz].]

Figure 4-8. CCC (top) and CCH (bottom) of 7 DIV (days in vitro) and 9 DIV cortical culture recordings. Spike trains from two adjacent electrodes are analyzed. On 7 DIV, the CCC shows two significant peaks, which are also observable on 9 DIV, and some non-significant spike time differences correspond to peaks on 9 DIV (marked with arrows). In contrast, this structure is difficult to note in the CCH. The optimal bin size is 3.8 ms for 7 DIV and 3.3 ms for 9 DIV data. The total recording time is 350 seconds for 7 DIV and 625 seconds for 9 DIV.

Note that the delays are much longer than the expected conduction time, which is estimated to be on the order of 2 ms for a conduction speed of 100 µm/ms [64]. One possible mechanism would be rarely activated chains of synaptic pathways from a common source neuron with different delays. In contrast to a recent study [65], where the delay between two channels is estimated with a single approximated Gaussian distribution with relatively large variance, we observe multiple delays between channels.

4.5 Discussion

We proposed an estimator of the cross correlogram from an observation of a point process, and provided an efficient algorithm to compute it. The method utilizes the fact that there are more samples where the correlation is stronger. Thus, computing the continuous correlogram at the lags of the samples provides a non-uniform sampling that is advantageous for estimating the precise delay. Unfortunately, this non-uniform sampling is disadvantageous for inhibitory relations, since few samples fall at the lags where the correlation is suppressed; therefore only positively related delays can be accurately estimated. To achieve computational efficiency, the algorithm is limited to the use of the Laplacian distribution as the kernel. However, it has been shown that the bandwidth (kernel size) is more important than the shape of the kernel for the performance of intensity estimation [30].

The only free parameter is the kernel size, which determines the amount of smoothing. Unlike the conventionally used histogram method, the proposed method is robust to the choice of kernel size; however, the optimal kernel size depends on the noise level of the delay. In a biological neuronal network, the noise level may depend on the path along which the signal was transmitted. Therefore each peak of the correlogram may have a different amount of noise. We suggested using the optimal bin size for the histogram as a guideline for kernel size selection.

The continuous cross correlogram can be viewed as a generalization of the cross information potential, where the correlation is interpreted as similarity (or dissimilarity) between spike trains, as we discussed in chapter 2. The proposed algorithm can be used to find the similarity between two spike trains over continuous time lags. However, due to accumulation of numerical error, the algorithm has to be non-causal (see equation (4–21)). This prevents the algorithm from being used as an online filter to detect certain spike train patterns, although offline analysis can still be done.

The proposed algorithm is not limited to cross-correlations. It can be directly applied to smooth any type of point process histogram, such as the PSTH. However, one always has to be cautious when the underlying process is highly non-stationary, since various non-stationarities can cause peaks in the correlogram [66].

CHAPTER 5
CONCLUSION

5.1 Summary of Contribution

The techniques presented here are based on smoothing spike trains with a continuous kernel, which preserves the time resolution while yielding a continuous signal. We demonstrated the usefulness of the Cauchy-Schwarz (CS) divergence as a metric for smoothed spike trains when spikes can be missing. The CS divergence is related to the similarity measure CIP, which is the inner product of the smoothed spike trains in L2. ICIP, the derivative of CIP, is proposed as an instantaneous synchrony measure and extended to an ensemble average. Finally, a time lag is incorporated into CIP to obtain a cross correlation function of spike trains. All three algorithms can be computed efficiently, at a cost that depends only on the number of spikes, without approximation, and independently of the sampling rate.

5.2 Potential Applications and Future Work

Given a similarity (or dissimilarity/divergence) measure with an efficiently computable closed form, many applications become possible: clustering, classification, system identification, and adaptive filtering can all be applied to spike trains. We have some preliminary results on stimulus-to-response mapping and on stimulus estimation from the response in a dissociated cortical tissue culture, and we intend to apply the techniques to various experiments.

In neuroscience, connectivity estimation, delay estimation, and identification of synchronous neural groups would be the most obvious applications. The correlation between synchrony and attention or behavior would also be interesting to study. From a more engineering perspective, detection of seizures, building a liquid state machine from living tissue, and the study of synchrony dynamics in pulse coupled oscillators seem promising. Finding valid delay subnetworks [67] and polychronous groups of neurons [34] may also be possible by using CCC.

APPENDIX A
BACKGROUND

A.1 Point Process

A point process is a random process in which events (points) are distributed over a continuous space. Typically the magnitude of the event is ignored and only the position (time) is described (otherwise it is called a marked point process). The distribution of trees on a mountain, rain drops in space, earthquake instances over time, and action potentials in a spike train are examples of point processes.

In this section, we introduce some notation and definitions of point processes¹. A point process is built up from counting random variables, which map the sample space to a natural number that represents the number of events in a certain region of the space.

Definition 1 (Counting Process [51]). Let Ω be the sample space consisting of realizations of points $\omega = \{x_1, x_2, \ldots\} \in \Omega$. We define the counting process N(A : ω) as

$$N(A : \omega) = \sum_i I_A(x_i),$$

where $I_A(x)$ denotes the characteristic function of the set A,

$$I_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \notin A. \end{cases}$$

By taking the derivative of a realization of a counting process, a realization of a point process can be obtained. In this case, the realization of the point process consists of delta functions at the locations of the events. Spike trains will be treated as realizations of a point process for the rest of the thesis.

The simplest type of point process is the Poisson process. In a Poisson process, each event is independent and the probability of firing at a location (time) is completely determined by the functional parameter Λ(t). When Λ(t) is differentiable, we call the derivative λ(t) the intensity function.

1 Some material in this section is replicated from Snyder [51].

Definition 2 (Temporal Poisson process). A temporal Poisson process for times t ≥ t0 is a counting process {N(t) : t ≥ t0} with the following properties:

1. Pr[N(t0) = 0] = 1;

2. for t0 ≤ s < t, the increment N(s, t) = N(t) − N(s) is Poisson distributed with parameter Λ(t) − Λ(s),

$$\Pr[N(s,t) = n] = \frac{1}{n!}\,\bigl(\Lambda(t) - \Lambda(s)\bigr)^n\, e^{-(\Lambda(t) - \Lambda(s))},$$

where n is a nonnegative integer, and Λ(t) is a finite-valued, nonnegative, nondecreasing function of t;

3. {N(t) : t ≥ t0} has independent increments.
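As a small worked example of property 2, consider the special case of a constant intensity, so that Λ(t) − Λ(s) = λ(t − s). The rate of 10 Hz and the window of 0.1 s below are illustrative values chosen here, not values from the experiments.

```latex
% Assumed values: \lambda = 10\ \text{Hz},\ t - s = 0.1\ \text{s},
% so \Lambda(t) - \Lambda(s) = \lambda (t - s) = 1.
\begin{align*}
\Pr[N(s,t) = 0] &= \frac{1^0}{0!}\, e^{-1} = e^{-1} \approx 0.368, \\
\Pr[N(s,t) = 1] &= \frac{1^1}{1!}\, e^{-1} = e^{-1} \approx 0.368, \\
\Pr[N(s,t) = 2] &= \frac{1^2}{2!}\, e^{-1} = \tfrac{1}{2} e^{-1} \approx 0.184.
\end{align*}
```

In this example a 100 ms window is about as likely to contain no spike as exactly one spike.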

Proposition 1. Let $\{[t_i, u_i)\}_{i=1,2,\ldots,k}$ be disjoint intervals on [t0, ∞). If {N(t) : t ≥ t0} is a temporal Poisson process, then the independent increments property implies

$$\Pr[N(t_1,u_1) = n_1, N(t_2,u_2) = n_2, \cdots, N(t_k,u_k) = n_k] = \prod_{i=1}^{k} \Pr[N(t_i,u_i) = n_i].$$

We assume that the spike trains are realizations of a Poisson process. The intensity function λ(t) corresponds to the underlying (instantaneous) firing rate. This assumption is based on statistics observed from in vivo systems and is frequently considered a good approximation [68]. The simple formulation of the Poisson process enables analytical treatment of the tools (some of which is presented in the Appendix).

A.1.1 An Alternative Representation of Poisson Process

For any finite interval, there is only a finite number of events in a Poisson process. The statistics in the interval can also be described by a combination of two random variables. The first random variable represents the number of events in the interval, which follows the Poisson distribution f,

$$f(k, \lambda T) = \frac{(\lambda T)^k e^{-\lambda T}}{k!}, \tag{A–1}$$

where k is the number of events, $\lambda = \frac{\Lambda(T) - \Lambda(0)}{T}$ is the average intensity, T is the length of the interval, and ! is the factorial operator. (By Definition 2, the parameter of the count distribution is Λ(T) − Λ(0) = λT.) The second random variable X represents the distribution of the finite number of events (points) over the interval. This distribution is obtained by normalizing the intensity function λ(t) over the interval to make it a pdf [69]:

$$f_X(x) = \frac{\lambda(x)}{\int_0^T \lambda(t)\,dt}. \tag{A–2}$$

The equivalence can be shown from the joint distribution of the points by considering all possible orderings (order statistics) [51, 52].

A.1.2 Filtered Poisson Process

A smoothed spike train is a form of shot noise, and if the underlying point process is Poisson, we can obtain its moments analytically using the characteristic functional.

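Theorem 2 (Characteristic functional for a filtered Poisson process [51]). Let a Poisson process with intensity function λ(t) defined on t ≥ t0 be filtered by a causal linear filter with impulse response h(σ, τ; u), resulting in a continuous time signal y(t). The characteristic functional of y(t), defined as

$$\phi_y(j\nu) = E\left[\exp\left(j\int_{t_0}^{T} y(\sigma)\,d\nu(\sigma)\right)\right], \tag{A–3}$$

has the evaluation

$$\phi_y(j\nu) = \exp\left(\int_{t_0}^{T} \lambda(\tau)\, E\left[\exp\left(j\int_{\tau}^{T} h(\sigma,\tau;u)\,d\nu(\sigma)\right) - 1\right] d\tau\right), \tag{A–5}$$

where ν(·) is any function with

$$\int\!\!\int f(\alpha,\beta)\,d\nu(\alpha)\,d\nu(\beta) < \infty,$$

where

$$f(\alpha,\beta) = \int_{t_0}^{\min(\alpha,\beta)} \lambda(\tau)\,E\left[h(\alpha,\tau;u)\,h(\beta,\tau;u)\right] d\tau + \int_{t_0}^{\alpha} \lambda(\tau)\,E\left[h(\alpha,\tau;u)\right] d\tau \int_{t_0}^{\beta} \lambda(\tau)\,E\left[h(\beta,\tau;u)\right] d\tau.$$

See [51], pages 219-220, for the proof.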

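We can choose the form of ν(·) to be

$$\nu(\sigma) = \begin{cases} 0, & t_0 \le \sigma < t, \\ \alpha, & t \le \sigma < T. \end{cases} \tag{A–6}$$

Then, the characteristic function for y(t) becomes

$$M_{y(t)}(j\alpha) = \exp\left(\int_{t_0}^{t} \lambda(\tau)\,E\left[e^{j\alpha h(t,\tau;u)} - 1\right] d\tau\right). \tag{A–7}$$

Therefore the n-th cumulant $\gamma_n$ of y(t) can be derived as

$$\gamma_n = \int_{t_0}^{t} \lambda(\tau)\,E\left[h^n(t,\tau;u)\right] d\tau. \tag{A–8}$$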
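To illustrate (A–8), the following worked case assumes a homogeneous process, λ(τ) = λ, and a deterministic causal exponential filter. The filter form and the symbol $\tau_s$ for its time constant are assumptions introduced here to avoid a clash with the integration variable τ; they are not notation from the theorem.

```latex
% Assumption: h(t, \tau) = e^{-(t-\tau)/\tau_s} for \tau \le t, and \lambda(\tau) = \lambda.
\begin{align*}
\gamma_n &= \lambda \int_{t_0}^{t} e^{-n (t-\tau)/\tau_s}\, d\tau
          = \frac{\lambda \tau_s}{n} \left( 1 - e^{-n (t - t_0)/\tau_s} \right)
          \;\longrightarrow\; \frac{\lambda \tau_s}{n}
          \quad \text{as } t - t_0 \to \infty, \\
\text{mean: } \gamma_1 &\to \lambda \tau_s, \qquad
\text{variance: } \gamma_2 \to \frac{\lambda \tau_s}{2}.
\end{align*}
```

Under these assumptions the steady-state mean and variance of the smoothed spike train are both proportional to the firing rate and to the filter time constant.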

There are ways to obtain the actual pdf [70–72]; however, the closed form is highly complicated.

The following theorem supports the claim that the correlation function of the smoothed spike train is meaningful under the assumption of Poisson spike trains.

Theorem 3 (Campbell’s Theorem [51]). Shot noise of a homogeneous Poisson process is wide sense stationary.

Furthermore, the power spectral density of the smoothed spike train is the same as that obtained by exciting the system (filter h) with white Gaussian noise [51].

A.2 Mean Square Calculus

Statistical signal processing is based on the second order theory, or mean square calculus, of random processes. In this section, we give a brief introduction to the theory.

First, we introduce L2, the space of all random variables with a finite second order moment,

$$L_2 = \left\{ X \mid E\left[|X|^2\right] < \infty \right\}. \tag{A–9}$$

It can be shown that L2 is a Hilbert space [73]. In this space, the order of the limit and the expected value operator can be exchanged up to second order.

Definition 3 (Mean-square continuity). Let X(t) be a stochastic process defined on the real line. X(t) is continuous in the mean square sense at t if

$$\lim_{h \to 0} E\left[|X(t+h) - X(t)|^2\right] = 0.$$

Definition 4 (Mean-square differentiability). The random process X(t) is mean-square differentiable if the following limit exists in the mean square sense:

$$\lim_{h \to 0} \frac{X(t+h) - X(t)}{h}.$$

Proposition 4. A random process with a well defined correlation function belongs to L2.

Note that the mean square error is equivalent to the squared Euclidean distance,

$$\int (x(t) - y(t))^2\,dt = \int \left( x(t)^2 + y(t)^2 \right) dt - 2\int x(t)\,y(t)\,dt. \tag{A–10}$$

A.3 Probability Density Estimation

Estimating a probability density function (pdf) from a set of samples (observations) has been one of the fundamental problems in statistics [74]. Parametric methods assume a distribution and fit the data to it, which is usable only if the assumed model is at least approximately correct. Nonparametric approaches, on the other hand, make a milder assumption, usually that the pdf is continuous. One widely used nonparametric method is the histogram. Another is the Parzen window, otherwise known as kernel density estimation [59]. Both can be motivated from the empirical cumulative distribution function F(x) [74]:

$$F(x) = \frac{\text{number of samples in } (-\infty, x]}{\text{total number of samples}}. \tag{A–11}$$

Plugging equation (A–11) into the definition of the pdf,

$$f(x) = \frac{dF(x)}{dx} = \lim_{h \to 0} \frac{F(x+h) - F(x-h)}{2h}, \tag{A–12}$$

and keeping h finite, so that F(x + h) − F(x − h) is the fraction of samples within h of x, the estimate can be written in the following form:

$$\hat{f}(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right), \tag{A–13}$$

where N is the total number of samples, h is the bandwidth, and K is defined as

$$K(x) = \begin{cases} \frac{1}{2}, & \text{if } -1 < x \le 1, \\ 0, & \text{otherwise.} \end{cases} \tag{A–14}$$

This is the histogram method if x is evaluated once for every non-overlapping interval of width 2h. Note that equation (A–14) is a uniform distribution. By allowing any pdf as the kernel, we obtain kernel density estimation. It has been shown that among all nonnegative kernels with compact support, the Epanechnikov kernel is optimal with respect to the asymptotic mean integrated squared error (AMISE); however, the Gaussian kernel and other kernels are widely used [74].

The free parameter h, the bandwidth, determines how smooth the estimate will be, and in general depends on the number of samples in the region. When using a fixed bandwidth, AMISE provides an optimal bandwidth which balances the bias and variance of the estimate [74]. There have been a number of extensions to the fixed bandwidth kernel density estimation methods [75]. The general idea is to decrease the bandwidth in regions where there are more samples, and to use a larger bandwidth where there are fewer samples.

Kernel          K(x)
Epanechnikov    $\frac{3}{4}(1 - x^2)$
Uniform         $\frac{1}{2}$
Triangle        $1 - |x|$
Gaussian        $\frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$
Laplacian       $\frac{1}{2}\, e^{-|x|}$

Table A-1. Various probability density estimation kernels. The Gaussian and Laplacian kernels have infinite support, and the other kernels have [−1, 1] as the support.

One of the drawbacks of kernel density estimation is boundary bias. When the support of the pdf is finite, infinite support kernels will underestimate the density near the boundary, and even finite support kernels will leak some of the density outside the support.

Since kernel density estimation provides a relatively accurate continuous pdf estimate with a finite summation, algorithms that are based on the pdf can be written in an efficient manner. Information theoretic learning, a framework for signal processing with information theoretic cost functions, combines Rényi’s quadratic entropy with kernel density estimation, and nonparametrically estimates entropy without approximations [37, 76].

A.4 Information Theoretic Learning

CIP is strongly related to the information theoretic learning framework [77]. For a random variable X with pdf f(x), Rényi’s quadratic entropy is defined as [78]

$$H_{R_2} = -\log \int_{-\infty}^{\infty} f^2(x)\,dx = -\log E\left[f(x)\right]. \tag{A–15}$$

The argument of the logarithm,

$$V_X = \int_{-\infty}^{\infty} f^2(x)\,dx = E\left[f(x)\right], \tag{A–16}$$

is called the information potential (IP) [77].

As mentioned in A.3, Rényi’s quadratic entropy can be estimated efficiently with kernel density estimation. Let $\{x_i : i = 1, \ldots, N\}$ be a set of N i.i.d. samples of a random variable X. Then, the pdf of X can be approximated non-parametrically by

$$\hat{f}(x) = \frac{1}{N} \sum_{i=1}^{N} \kappa_{pdf}(x, x_i), \tag{A–17}$$

where $\kappa_{pdf}(\cdot,\cdot)$ is the kernel. Substituting this estimator into the definition of the information potential, equation (A–16), yields

$$\hat{V}_X = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \int_{-\infty}^{+\infty} \kappa_{pdf}(x, x_i)\,\kappa_{pdf}(x, x_j)\,dx = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa(x_i, x_j), \tag{A–18}$$

where $\kappa(x_i, x_j) = \int_{-\infty}^{+\infty} \kappa_{pdf}(x, x_i)\,\kappa_{pdf}(x, x_j)\,dx$. Note that we are estimating the entropy of a continuous random variable directly with sums of kernel evaluations and without any approximation.

Let $f_i(x)$ and $f_j(x)$ be the pdfs of random variables $X_i$ and $X_j$, defined on the same probability space. A distance between the pdfs of the two random variables can be defined in the space of distributions with the Cauchy-Schwarz (CS) distance, $I_{CS}$,

$$I_{CS} = \log \frac{\sqrt{\left(\int f_i^2(t)\,dt\right)\left(\int f_j^2(t)\,dt\right)}}{\int f_i(t)\,f_j(t)\,dt} = \log \frac{\sqrt{V_i V_j}}{V_{ij}}, \tag{A–19}$$

where $V_i$ is the information potential of the i-th random variable [77]. It is important to remark that $I_{CS}$ in fact approximates the Kullback-Leibler divergence [79] between the two pdfs; however, a significant advantage is the ease of computation of this measure using the information potential. Notice also that in the argument of the logarithm the numerator contains the normalizing terms. In other words, the behavior of $I_{CS}$ is determined by the denominator term, $V_{ij}$, appropriately called the cross information potential (CIP). Much like the IP, the CIP expresses a potential due to interactions between particles, but from different random variables. Because the CIP negatively affects the CS divergence, it is in effect measuring the similarity between the two distributions.


A.5 Reproducing Kernel Hilbert Space

In kernel methods, the concept of a reproducing kernel Hilbert space (RKHS) is often mentioned. The smoothing of a spike train can be seen as applying a kernel method, projecting the spike trains to an RKHS. This can be seen from the fact that the Laplacian distribution is a positive definite kernel. Indeed, we are using a subspace of L2 which consists of the smoothed spike trains and their linear combinations, and which is also an RKHS. Being an RKHS provides the kernel trick, so that the algorithms can be efficient.

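Definition 5 (Inner Product). The inner product of x, y ∈ V, where V is a vector space over the field K, is a mapping ⟨x|y⟩ : V × V → K such that

(I1) ∀x ∈ V, ⟨x|x⟩ ≥ 0 and ⟨x|x⟩ = 0 ⟺ x = 0

(I2) $\langle x|y\rangle = \overline{\langle y|x\rangle}$

(I3) ∀x, y, z ∈ V, ∀a, b ∈ K, ⟨ax + by|z⟩ = a⟨x|z⟩ + b⟨y|z⟩

Definition 6 (Norm induced by inner product). For a vector space V over the field K equipped with an inner product, the norm of a vector is defined as $\|x\| = \sqrt{\langle x|x\rangle}$.

Proof. The properties of a norm hold:

\begin{align}
\|\lambda x\| &= \sqrt{\langle \lambda x|\lambda x\rangle} = \sqrt{|\lambda|^2 \langle x|x\rangle} = |\lambda|\,\|x\| \tag{A–20}\\
\|x + y\| &\le \|x\| + \|y\| \quad \text{by lemma 5} \tag{A–21}\\
\|x\| = 0 &\iff x = 0 \quad \text{by (I1) in definition 5} \tag{A–22}
\end{align}

Lemma 5. $\|x + y\|^2 = \|x\|^2 + 2\,\mathrm{Re}\langle x|y\rangle + \|y\|^2$.

Proof. $\|x + y\|^2 = \langle x + y|x + y\rangle = \|x\|^2 + \langle y|x\rangle + \langle x|y\rangle + \|y\|^2$.

Lemma 6 (Cauchy-Schwarz inequality). $|\langle x|y\rangle| \le \|x\|\,\|y\|$.

Proof. Suppose ‖y‖ ≠ 0. For any λ ∈ K,

\begin{align*}
0 \le \|x - \lambda y\|^2 &= \langle x - \lambda y|x - \lambda y\rangle \\
&= \langle x|x\rangle - \bar{\lambda}\langle x|y\rangle - \lambda\langle y|x\rangle + |\lambda|^2 \langle y|y\rangle.
\end{align*}

Let λ = ⟨x|y⟩/⟨y|y⟩; then

$$0 \le \langle x|x\rangle - \langle y|x\rangle\langle x|y\rangle/\langle y|y\rangle - \langle x|y\rangle\langle y|x\rangle/\langle y|y\rangle + \langle x|y\rangle\langle y|x\rangle/\langle y|y\rangle,$$

which is equivalent to

$$\langle x|y\rangle\langle y|x\rangle \le \langle x|x\rangle\langle y|y\rangle.$$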

Definition 7 (Cauchy Sequence). A sequence of elements $x_n$ indexed by n ∈ N of a metric space with metric d(·, ·) is a Cauchy sequence if for all ε > 0 there exists an N ∈ N such that for all n, m ≥ N, $d(x_n, x_m) < \varepsilon$.

Definition 8 (Complete Metric Space). A metric space is complete if every Cauchy sequence converges to a point in the space.

Definition 9 (Hilbert Space). A vector space V that is complete under the norm induced by its inner product is a Hilbert space.

Definition 10 (Reproducing Kernel Hilbert Space (RKHS)). A Hilbert space H of functions on V is a reproducing kernel Hilbert space if there exists a kernel K : V × V → K such that, for all f ∈ H, the reproducing property holds:

f(y) = ⟨f(x)|K(x, y)⟩.

Remark 1 (Linear Operator View). An RKHS is a sub-Hilbert space of L2. In particular, the RKHS is the +1 eigenvector space of the kernel. In other words, we are restricting the general Hilbert space L2 to a smaller space where the reproducing property holds. Also note that L2 itself is not an RKHS.

Definition 11 (Positive Semi-definite Kernel). A positive semi-definite kernel is a function K on X × X with the following property: for every natural number n, for all $x_1, \ldots, x_n$ in X, and for all $\alpha_1, \ldots, \alpha_n$ in the real or complex field,

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \overline{\alpha_j}\, K(x_i, x_j) \ge 0.$$

Theorem 7 (Moore-Aronszajn Theorem [80]). Given a symmetric positive semi-definite kernel K, there exists a unique RKHS H with K as the reproducing kernel.

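Proof. Let A be the set of all functionals of the form $\Phi_i(\cdot) = K(\cdot, i)$. Define linear combinations of the functionals by

$$\forall f, g : I \to \mathbb{R},\ \forall a, b \in \mathbb{R},\ \forall x \in I, \quad (af + bg)(x) = a\,f(x) + b\,g(x). \tag{A–23}$$

Let B be the vector space spanned by A. Now let us define the inner product ⟨·|·⟩ and the norm $\|f\| = \sqrt{\langle f|f\rangle}$ on B:

\begin{align}
\forall f, g \in B, \quad \langle f|g\rangle &= \Bigl\langle \sum_{i \in I} \alpha^f_i \Phi_i(\cdot) \Bigm| \sum_{j \in I} \alpha^g_j \Phi_j(\cdot) \Bigr\rangle \tag{A–24}\\
&= \sum_{i \in I} \alpha^f_i \sum_{j \in I} \alpha^g_j \langle \Phi_j(\cdot)|\Phi_i(\cdot)\rangle \tag{A–25}\\
&= \sum_{i \in I} \sum_{j \in I} \alpha^f_i \alpha^g_j K(i, j) \tag{A–26}
\end{align}

To ensure that the inner product is well-defined, two different representations of f, g ∈ B should lead to the same inner product, which is obvious.

Let us complete the space by including all limits of Cauchy sequences $\{f_n \mid n \in \mathbb{N}, f_n \in B\}$ and denote the result as H, which is a Hilbert space. Note that B is a dense subset of H.

The reproducing property of $H_K$ is immediate:

\begin{align}
\langle K_{S_i}(\cdot), f\rangle &= \Bigl\langle K_{S_i}(\cdot), \sum_{l \in L_f} \alpha^f_l K_{S_{i_l}}(\cdot) \Bigr\rangle \tag{A–27}\\
&= \sum_{l \in L_f} \alpha^f_l \langle K_{S_i}(\cdot), K_{S_{i_l}}(\cdot)\rangle \tag{A–28}\\
&= \sum_{l \in L_f} \alpha^f_l K(S_i, S_{i_l}) \tag{A–29}\\
&= \sum_{l \in L_f} \alpha^f_l K(S_{i_l}, S_i) \tag{A–30}\\
&= f(S_i) \tag{A–31}
\end{align}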

APPENDIX B
STATISTICAL PROOFS

To assess the significance of the correlation, it is necessary to know the probability distribution of the estimator under the null hypothesis (independent Poisson spike trains). However, instead of calculating the complicated closed form of the distribution of $Q^*_{ij}(\Delta t)$, we derive the mean and variance of the estimator $Q^*_{ij}(\Delta t)$ and assume Gaussianity. For the time binning case with sufficiently small bin size, Palm and coworkers have derived statistics for the histogram of a Poisson process [60]. The analysis can easily be applied to CIP by multiplying by the normalization factor.

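Let Ω(λ, T) be the class of all possible homogeneous Poisson spike trains of rate λ and length T. The probability of having a realization $\Omega_i = \{t^i_1, t^i_2, \ldots, t^i_{N_i}\}$ of $\Omega(\lambda_A, T)$ is (boldface denotes the random spike times)

\begin{align}
P[\Omega = \Omega_i \mid \lambda_A, T] &= P[N(T) = N_i,\ \mathbf{t}^i_1 = t^i_1,\ \mathbf{t}^i_2 = t^i_2,\ \ldots,\ \mathbf{t}^i_{N_i} = t^i_{N_i}] \tag{B–1}\\
&= P[N(T) = N_i]\; P[\mathbf{t}^i_1 = t^i_1,\ \mathbf{t}^i_2 = t^i_2,\ \ldots,\ \mathbf{t}^i_{N_i} = t^i_{N_i} \mid N = N_i] \tag{B–2}\\
&= \frac{(\lambda_A T)^{N_i}}{N_i!}\, e^{-\lambda_A T} \prod_{m=1}^{N_i} P[\mathbf{t} = t^i_m] \tag{B–3}\\
&= \frac{(\lambda_A T)^{N_i}}{N_i!}\, e^{-\lambda_A T} \prod_{m=1}^{N_i} \frac{1}{T} = \frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T} \tag{B–4}
\end{align}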

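The expected value of the estimator over all possible pairs of independent spike trains is

\begin{align}
E_{ij}\left[Q^*_{ij}(\Delta t)\right]
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} P[\Omega_i, \Omega_j \mid \lambda_A, \lambda_B, T]\; Q^*_{ij}(\Delta t)\; d\Omega_i\, d\Omega_j \tag{B–5a}\\
&= \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty} \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}
\frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T}\, \frac{\lambda_B^{N_j}}{N_j!}\, e^{-\lambda_B T}\,
\frac{1}{2\tau T} \sum_{m=1}^{N_i} \sum_{n=1}^{N_j} e^{-\frac{|t^i_m - t^j_n - \Delta t|}{\tau}}\;
dt^i_1\, dt^i_2 \cdots dt^i_{N_i}\; dt^j_1\, dt^j_2 \cdots dt^j_{N_j} \tag{B–5b}\\
&= \frac{1}{2\tau T}\, e^{-(\lambda_A + \lambda_B) T} \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty}
\frac{\lambda_A^{N_i}}{N_i!}\, \frac{\lambda_B^{N_j}}{N_j!}\, T^{N_i - 1}\, T^{N_j - 1}
\sum_{m=1}^{N_i} \sum_{n=1}^{N_j} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-\frac{|t^i_m - t^j_n - \Delta t|}{\tau}}\, dt^i_m\, dt^j_n \tag{B–5c}\\
&= \frac{1}{2\tau T}\, e^{-(\lambda_A + \lambda_B) T} \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty}
\frac{(\lambda_A T)^{N_i}}{N_i!}\, \frac{(\lambda_B T)^{N_j}}{N_j!}\, \frac{N_i N_j}{T^2}
\int_0^T\!\int_0^T e^{-\frac{|t^i_m - t^j_n - \Delta t|}{\tau}}\, dt^i_m\, dt^j_n \tag{B–5d}
\end{align}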

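Let us evaluate the integral first. From the symmetry of $t^i_m$ and $t^j_n$ we can assume Δt ≥ 0 without loss of generality.

\begin{align}
&\int_0^T\!\int_0^T e^{-\frac{|t^i_m - t^j_n - \Delta t|}{\tau}}\, dt^i_m\, dt^j_n \tag{B–6a}\\
&= \int_0^{\Delta t}\!\int_0^T e^{\frac{t^i_m - t^j_n - \Delta t}{\tau}}\, dt^i_m\, dt^j_n
 + \int_{\Delta t}^T\!\int_0^{t^i_m - \Delta t} e^{-\frac{t^i_m - t^j_n - \Delta t}{\tau}}\, dt^i_m\, dt^j_n
 + \int_{\Delta t}^T\!\int_{t^i_m - \Delta t}^T e^{\frac{t^i_m - t^j_n - \Delta t}{\tau}}\, dt^i_m\, dt^j_n \tag{B–6b}\\
&= -\tau \int_0^{\Delta t} \left( e^{\frac{t^i_m - T - \Delta t}{\tau}} - e^{\frac{t^i_m - \Delta t}{\tau}} \right) dt^i_m
 + \tau \int_{\Delta t}^T \left( 1 - e^{-\frac{t^i_m - \Delta t}{\tau}} \right) dt^i_m
 - \tau \int_{\Delta t}^T \left( e^{\frac{t^i_m - T - \Delta t}{\tau}} - 1 \right) dt^i_m \tag{B–6c}\\
&= -\tau^2 \left( e^{\frac{\Delta t - T - \Delta t}{\tau}} - e^{\frac{0 - T - \Delta t}{\tau}} - e^{\frac{\Delta t - \Delta t}{\tau}} + e^{\frac{0 - \Delta t}{\tau}} \right)
 + \tau (T - \Delta t) - \tau^2 \left( -e^{-\frac{T - \Delta t}{\tau}} + e^{-\frac{\Delta t - \Delta t}{\tau}} \right) \notag\\
&\quad - \tau^2 \left( e^{\frac{T - T - \Delta t}{\tau}} - e^{\frac{\Delta t - T - \Delta t}{\tau}} \right) + \tau (T - \Delta t) \tag{B–6d}\\
&= 2\tau (T - \Delta t) + \tau^2 \left( e^{-\frac{T + \Delta t}{\tau}} + e^{-\frac{T - \Delta t}{\tau}} - 2 e^{-\frac{\Delta t}{\tau}} \right) \tag{B–6e}\\
&= 2\tau (T - \Delta t) + O(\tau^2). \tag{B–6f}
\end{align}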

Approximating $\lambda_A = \frac{N_i}{T}$, and substituting the integral into equation (B–5d) gives

$$E_{ij}\left[Q^*_{ij}(\Delta t)\right] \simeq \lambda_A \lambda_B\, \frac{2\tau (T - \Delta t) + O(\tau^2)}{2\tau T}, \tag{B–7}$$

where $O(\tau^2)$ denotes the terms of order $\tau^2$ or higher. Assuming τ ≪ 1 and Δt ≪ T, equation (B–7) can be approximated by $\lambda_A \lambda_B$, which is the desired value.

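Now let us evaluate the second moment of the estimator.

\begin{align}
E_{ij}\left[Q^*_{ij}(\Delta t)^2\right]
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} P[\Omega_i, \Omega_j \mid \lambda_A, \lambda_B, T]\; Q^*_{ij}(\Delta t)^2\; d\Omega_i\, d\Omega_j \tag{B–8a}\\
&= \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty} \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}
\frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T}\, \frac{\lambda_B^{N_j}}{N_j!}\, e^{-\lambda_B T}\,
\frac{1}{4\tau^2 T^2} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r=1}^{N_i} \sum_{s=1}^{N_j}
e^{-\frac{|t^i_p - t^j_q - \Delta t|}{\tau}}\, e^{-\frac{|t^i_r - t^j_s - \Delta t|}{\tau}}\;
dt^i_1\, dt^i_2 \cdots dt^i_{N_i}\; dt^j_1\, dt^j_2 \cdots dt^j_{N_j} \tag{B–8b}\\
&= \sum_{N_i=0}^{\infty} \sum_{N_j=0}^{\infty}
\frac{\lambda_A^{N_i}}{N_i!}\, e^{-\lambda_A T}\, \frac{\lambda_B^{N_j}}{N_j!}\, e^{-\lambda_B T}\,
\frac{1}{4\tau^2 T^2}\, T^{N_i}\, T^{N_j} \notag\\
&\quad \times \frac{1}{T^4} \int_0^T\!\int_0^T\!\int_0^T\!\int_0^T
\sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r=1}^{N_i} \sum_{s=1}^{N_j}
e^{-\frac{|t^i_p - t^j_q - \Delta t|}{\tau}}\, e^{-\frac{|t^i_r - t^j_s - \Delta t|}{\tau}}\;
dt^i_p\, dt^j_q\, dt^i_r\, dt^j_s \tag{B–8c}
\end{align}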

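Let us consider the integral part first.

\begin{align}
&\frac{1}{T^4} \int_0^T\!\int_0^T\!\int_0^T\!\int_0^T
\sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r=1}^{N_i} \sum_{s=1}^{N_j}
e^{-\frac{|t^i_p - t^j_q - \Delta t|}{\tau}}\, e^{-\frac{|t^i_r - t^j_s - \Delta t|}{\tau}}\;
dt^i_p\, dt^j_q\, dt^i_r\, dt^j_s \tag{B–9a}\\
&= \frac{1}{T^4} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r \ne p} \sum_{s \ne q}
\int_0^T\!\int_0^T e^{-\frac{|t^i_p - t^j_q - \Delta t|}{\tau}}\, dt^i_p\, dt^j_q
\int_0^T\!\int_0^T e^{-\frac{|t^i_r - t^j_s - \Delta t|}{\tau}}\, dt^i_r\, dt^j_s \notag\\
&\quad + \frac{1}{T^2} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j}
\int_0^T\!\int_0^T e^{-\frac{2 |t^i_p - t^j_q - \Delta t|}{\tau}}\, dt^i_p\, dt^j_q \notag\\
&\quad + \frac{1}{T^3} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{r \ne p}
\int_0^T\!\int_0^T\!\int_0^T e^{-\frac{|t^i_p - t^j_q - \Delta t|}{\tau}}\, e^{-\frac{|t^i_r - t^j_s - \Delta t|}{\tau}}\, dt^i_p\, dt^j_q\, dt^i_r \notag\\
&\quad + \frac{1}{T^3} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \sum_{s \ne q}
\int_0^T\!\int_0^T\!\int_0^T e^{-\frac{|t^i_p - t^j_q - \Delta t|}{\tau}}\, e^{-\frac{|t^i_r - t^j_s - \Delta t|}{\tau}}\, dt^i_p\, dt^j_q\, dt^j_s \tag{B–9b}\\
&= \frac{N_i N_j (N_i - 1)(N_j - 1)}{T^4} \left( 2\tau (T - \Delta t) + O(\tau^2) \right)^2
 + \frac{N_i N_j}{T^2} \left( \tau (T - \Delta t) + O(\tau^2) \right) \notag\\
&\quad + \frac{N_i N_j (N_i - 1)}{T^3} \left( \tau^2 \left( 2T \left( 2 + e^{-\frac{T}{\tau}} \right) \right) + O(\tau^3) \right)
 + \frac{N_i N_j (N_j - 1)}{T^3} \left( \tau^2 \left( 2T \left( 2 + e^{-\frac{T}{\tau}} \right) \right) + O(\tau^3) \right) \tag{B–9c}
\end{align}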

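By assuming τ ≪ 1 ≪ T, we can approximate $e^{-T/\tau} \simeq 0$ and $O(\tau^3) \simeq 0$. We further approximate $\lambda_A = \frac{N_i}{T}$ and $N_i - 1 \simeq N_i$. These approximations lead to

\begin{align}
E_{ij}\left[Q^*_{ij}(\Delta t)^2\right]
&\simeq \frac{1}{4\tau^2 T^2} \Bigl[ (\lambda_A \lambda_B)^2 \left( 2\tau (T - \Delta t) \right)^2
 + (\lambda_A \lambda_B) \left( \tau (T - \Delta t) \right)
 + (\lambda_A \lambda_B)(\lambda_A + \lambda_B) \left( 4\tau^2 T \right) \Bigr] \notag\\
&= (\lambda_A \lambda_B)^2 \left( \frac{T - \Delta t}{T} \right)^2
 + \frac{\lambda_A \lambda_B}{4\tau T}\, \frac{T - \Delta t}{T}
 + (\lambda_A \lambda_B)(\lambda_A + \lambda_B)\, \frac{1}{T}. \tag{B–10}
\end{align}

Finally, the variance of the estimator is given by

$$E_{ij}\left[ \left( Q^*_{ij}(\Delta t) - \lambda_A \lambda_B \right)^2 \right] \simeq \frac{\lambda_A \lambda_B (T - \Delta t)}{4\tau T^2}. \tag{B–11}$$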

APPENDIX C
NOTATION

Spaces

K                               generic field
N                               natural numbers
R                               real field
R^d                             d-dimensional Euclidean space
H                               Hilbert space [p. 59]
I                               index set space for spike trains [p. ??]

General notation

{t^i_m : m = 1, . . . , N_i}    spike train as a set of spike timings [p. ??]
s_i(t)                          spike train as a function over time [p. ??]
h(t; τ)                         impulse response of a linear filter
q_i(t)                          filtered (or smoothed) spike train
ĝ                               estimate of a general function g
λ_i(t)                          intensity function of a Poisson process [p. ??]
X, Y, Z                         random variables
T                               random variable of time
X(t), Y(t), Z(t)                random processes
N(t, s), N(t)                   counting process
f_X                             probability density function of X
κ                               generic kernel
κ_pdf                           pdf estimation kernel
κ_τ                             generic kernel with kernel size parameter τ
K                               CIP kernel, or reproducing kernel of a Hilbert space

Operators

E_X[g(x)]                       expectation of g(x) over X
⟨x|y⟩                           inner product [p. 58]
‖·‖                             norm of a vector
|·|                             absolute value
x(t) ∗ y(t)                     convolution

APPENDIX D
SOURCE CODE

D.1 CIP

function V = cip(x, tau)
%V = cip(X, TAU)
% Return the Cross Information Potential.
% If more than two neurons are provided, average through all pair combinations.
%
% X: Data, organized as a cell array, with each cell containing an
%    array of spike times (in seconds).
% TAU: Kernel size (in seconds).

/* vim: set ts=8 sts=4 sw=4: (modeline) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <mex.h>

void cip_func(int N, double *x[], int nSpikes[], double tau, double *v);

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxArray *sts;
    mxArray *stp;
    int nSpikeTrain;    /* number of spike trains */
    double **x;         /* array of vectors with the
                         * spike times (sec) */
    int *nSpikes;       /* array with the number of spikes
                         * per spike train */
    double tau;         /* exponential decay parameter */
    double *v;          /* output argument, CIP */
    int i;

    /*
     * check input arguments
     */
    if (nrhs != 2)
        mexErrMsgTxt("2 inputs are required.");
    else if (nlhs > 1)
        mexErrMsgTxt("Too many output arguments");

    if (!mxIsDouble(prhs[1]))
        mexErrMsgTxt("TAU must be a scalar");

    /*
     * get input arguments
     */
    sts = (mxArray *) prhs[0];
    if (mxGetClassID(sts) != mxCELL_CLASS)
        mexErrMsgTxt("X must be a cell array");

    nSpikeTrain = mxGetNumberOfElements(sts);
    if (nSpikeTrain < 2)
        mexErrMsgTxt("At least two spike trains are needed.");

    nSpikes = (int *) mxMalloc(sizeof(int) * nSpikeTrain);
    x = (double **) mxMalloc(sizeof(double *) * nSpikeTrain);
    for (i = 0; i < nSpikeTrain; i++) {
        stp = mxGetCell(sts, i);
        nSpikes[i] = mxGetNumberOfElements(stp);
        x[i] = mxGetPr(stp);
    }
    tau = mxGetPr(prhs[1])[0];

    /*
     * allocate output
     */
    plhs[0] = mxCreateDoubleMatrix(1, 1, mxREAL);
    v = mxGetPr(plhs[0]);
    memset(v, 0, sizeof(double));

    /*
     * compute CIP
     */
    cip_func(nSpikeTrain, x, nSpikes, tau, v);
}

/**
 * Compute CIP of a set of spike trains.
 *
 * @param N number of spike trains
 * @param x array to pointers for spike trains
 * @param nSpikes array to length for spike trains
 * @param tau decay time constant for the exponential function
 * @param v computed CIP will be stored here, need to be preallocated
 * @author Antonio Paiva
 * @version $Id: cip.c 52 2007-01-03 16:55:26Z memming $
 */
void cip_func(int N, double *x[], int nSpikes[], double tau, double *v)
{
    int i, j;           /* counters for spike trains */
    int m, n;           /* counters for spike times */
    int lastStartIdx;   /* index to start computing the exponential */
    double maxT;        /* maximum range of exponential to have non-zero value */
    double aux;         /* auxiliary variable: holds CIP for each
                         * pair combination */
    double tmp;

    maxT = tau * 100;
    *v = 0;
    for (i = 0; i < (N-1); i++) {
        for (j = (i+1); j < N; j++) {
            aux = 0;
            lastStartIdx = 0;
            /* spike times are assumed sorted, so spikes of train j that
             * are too early for spike m are also too early for m+1 */
            for (m = 0; m < nSpikes[i]; m++) {
                for (n = lastStartIdx; n < nSpikes[j]; n++) {
                    tmp = x[j][n] - x[i][m];
                    if (tmp < -maxT) {
                        lastStartIdx++;
                        continue;
                    }
                    if (tmp <= maxT)
                        aux += exp(-((tmp < 0) ? (-tmp) : tmp) / tau);
                    else
                        break;
                }
            }
            aux /= (2*tau * (nSpikes[i] * nSpikes[j]));
            *v += aux;
        }
    }
    /* average over all pair combinations */
    *v /= (N * (N-1) / 2);
}

D.2 ICIP

function [v] = offline_icip(st, T, DT, FR_TAU, BETA)
% [v] = offline_icip(st, T, DT, FR_TAU, BETA)
% Offline version of ICIP computation. For online evaluation directly use
% online_icip.
%
% Input
% st: cell array containing spike trains (seconds)
% T: total length of spike trains (seconds)
% DT: time step size (seconds)
% FR_TAU: the tau for firing rate estimation (1/seconds)
%         if FR_TAU is zero, it is constant tau mode
% BETA: kernel size for ICIP at average firing rate 1 Hz (1/seconds)
%       (BETA should be smaller than FR_TAU^2 for accurate estimation)
% Output:
% v: ICIP over time
%
% You may want to use tr = 0:DT:(T-DT); which are the start times for each bin
%
% Copyright 2006 Antonio and Memming, CNEL, all rights reserved
% $Id: offline_icip.m 32 2006-12-09 18:05:57Z memming $

% Actual implementation is now in offline_icip.c

/* vim: set ts=8 sts=4 sw=4: (modeline) */
#include <math.h>
#include <stdlib.h>     /* malloc, free */
#include <string.h>
#include <mex.h>

/**
 * Compute ICIP of a set of spike trains with changing TAU mode.
 *
 * @param N number of spike trains
 * @param sts array to pointers for spike trains
 * @param nsts array to length for spike trains
 * @param v computed ICIP will be stored here, need to be preallocated
 * @param T total time (sec)
 * @param dt time bin size (sec)
 * @param BETA the parameter $\beta$ of ICIP (sec)
 * @param FR_TAU the time constant for firing rate estimation
 * @author Memming Park
 * @version $Id: offline_icip.c 51 2007-01-02 19:53:01Z memming $
 */
void offline_icip(int N, double *sts[], int nsts[], double *v, double T, double dt, double BETA, double FR_TAU)
{
    double t;       /* time */
    int i;          /* time index */
    int j, k;       /* spike train index */
    int *idx;       /* spike index per spike train */
    int NPair;
    double EXP_FR;
    double ONE_OVER_FR_TAU;
    double ONE_OVER_BETA;
    double *q;      /* charge */
    double *f;      /* firing rate */
    int Nstep;      /* number of time steps (bins) */

    ONE_OVER_FR_TAU = 1 / FR_TAU;
    ONE_OVER_BETA = 1 / BETA;
    EXP_FR = exp(-dt / FR_TAU);

    idx = (int *) malloc(sizeof(int) * N);
    memset(idx, 0, sizeof(int) * N);
    q = (double *) malloc(sizeof(double) * N);
    memset(q, 0, sizeof(double) * N);
    f = (double *) malloc(sizeof(double) * N);
    memset(f, 0, sizeof(double) * N);

    NPair = N * (N - 1) / 2;
    Nstep = (int) ceil(T / dt);

    for (t = dt, i = 0; i < Nstep; t += dt, i++) {
        for (j = 0; j < N; j++) {
            /* decay the firing rate estimate and the charge */
            f[j] *= EXP_FR;
            q[j] *= exp(-dt * f[j] * ONE_OVER_BETA);
            /* add every spike that falls within this time step */
            while (idx[j] < nsts[j] && sts[j][idx[j]] <= t) {
                idx[j]++;
                f[j] += ONE_OVER_FR_TAU;
                q[j] += ONE_OVER_BETA;
            }
        }
        /* average the pairwise products; v must be zeroed by the caller */
        for (j = 0; j < N; j++) {
            for (k = (j + 1); k < N; k++) {
                v[i] += q[j] * q[k];
            }
        }
        v[i] /= NPair;
    }

    free(idx);
    free(q);
    free(f);
}

/**
 * Compute ICIP of a set of spike trains with constant TAU mode.
 *
 * @param N number of spike trains
 * @param sts array of pointers to spike trains
 * @param nsts array of lengths of the spike trains
 * @param v computed ICIP will be stored here, needs to be preallocated
 * @param T total time (sec)
 * @param dt time bin size (sec)
 * @param TAU the time constant for the exponential (or Laplacian)
 * @author Memming Park
 * @version $Id: offline_icip.c 51 2007-01-02 19:53:01Z memming $
 */
void offline_icip_const_tau(int N, double *sts[], int nsts[], double *v, double T, double dt, double TAU)
{
    double t; /* time */
    int i; /* time index */
    int j, k; /* spike train index */
    int *idx; /* spike index per spike train */
    int NPair;
    double EXP_TAU;
    double *q; /* charge */
    double *ONE_OVER_TAU_F; /* (1 / (tau * firing rate)); constant */
    int Nstep; /* number of time steps (bins) */

    EXP_TAU = exp(-dt / TAU);

    idx = (int *) malloc(sizeof(int) * N);
    memset(idx, 0, sizeof(int) * N);
    q = (double *) malloc(sizeof(double) * N);
    memset(q, 0, sizeof(double) * N);
    ONE_OVER_TAU_F = (double *) malloc(sizeof(double) * N);
    memset(ONE_OVER_TAU_F, 0, sizeof(double) * N);

    /* the mean firing rate of train k is nsts[k] / T */
    for (k = 0; k < N; k++) {
        ONE_OVER_TAU_F[k] = (1 / TAU) * (T / nsts[k]);
    }

    NPair = N * (N - 1) / 2;
    Nstep = (int) ceil(T / dt);

    for (t = dt, i = 0; i < Nstep; t += dt, i++) {
        for (j = 0; j < N; j++) {
            q[j] *= EXP_TAU;
            /* consume every spike that falls in this bin */
            while (idx[j] < nsts[j] && sts[j][idx[j]] <= t) {
                idx[j]++;
                q[j] += ONE_OVER_TAU_F[j];
            }
        }
        /* average the charge products over all pairs of spike trains */
        for (j = 0; j < N; j++) {
            for (k = (j + 1); k < N; k++) {
                v[i] += q[j] * q[k];
            }
        }
        v[i] /= NPair;
    }

    free(idx);
    free(q);
    free(ONE_OVER_TAU_F);
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    int nSpikeTrain;
    mxArray *sts;
    mxArray *stp;
    double **st;
    int i;
    int *nst;
    double T, DT, FR_TAU, BETA;
    double *v;

    if (nrhs != 5) {
        mexErrMsgTxt("5 inputs are required.");
    } else if (nlhs > 1) {
        mexErrMsgTxt("Too many output arguments");
    }

    sts = (mxArray *) prhs[0];
    if (mxGetClassID(sts) != mxCELL_CLASS) {
        mexErrMsgTxt("The first argument should be a cell array");
    }

    nSpikeTrain = (int) mxGetNumberOfElements(sts);
    if (nSpikeTrain < 2) {
        mexErrMsgTxt("At least two spike trains are required.");
    }

    nst = (int *) mxMalloc(sizeof(int) * nSpikeTrain);
    st = (double **) mxMalloc(sizeof(double *) * nSpikeTrain);
    for (i = 0; i < nSpikeTrain; i++) {
        stp = mxGetCell(sts, i);
        nst[i] = (int) mxGetNumberOfElements(stp);
        st[i] = mxGetPr(stp);
    }

    /* The conditions guarding the next four messages did not survive in
     * this listing; a plain mxIsDouble() check is assumed here. */
    if (!mxIsDouble(prhs[1])) {
        mexErrMsgTxt("The second argument should be a double");
    }
    if (!mxIsDouble(prhs[2])) {
        mexErrMsgTxt("The third argument should be a double");
    }
    if (!mxIsDouble(prhs[3])) {
        mexErrMsgTxt("The fourth argument should be a double");
    }
    if (!mxIsDouble(prhs[4])) {
        mexErrMsgTxt("The fifth argument should be a double");
    }

    T = mxGetPr(prhs[1])[0];
    DT = mxGetPr(prhs[2])[0];
    FR_TAU = mxGetPr(prhs[3])[0];
    BETA = mxGetPr(prhs[4])[0];

    plhs[0] = mxCreateDoubleMatrix((int) ceil(T / DT), 1, mxREAL);
    v = mxGetPr(plhs[0]);
    memset(v, 0, sizeof(double) * (int) ceil(T / DT));

    if (FR_TAU == 0) { /* Constant tau mode: BETA is used as the constant tau */
        offline_icip_const_tau(nSpikeTrain, st, nst, v, T, DT, BETA);
    } else {
        offline_icip(nSpikeTrain, st, nst, v, T, DT, BETA, FR_TAU);
    }
}


D.3 CCC

function [Q, deltaT] = cipogram(st1, st2, tau, maxT, T, verbose)
% [Q, deltaT] = cipogram(st1, st2, tau, maxT, T, verbose)
%
% Input
%   st1, st2: spike trains with sorted spike timings
%   tau: time constant for CIP kernel
%   maxT: correlogram range will be effective in [-maxT, maxT]
%   T: length of spike train in seconds
%   verbose: (optional/0) detailed info, uses tic, toc
%
% Output
%   Q: cipogram
%   deltaT: sorted time differences at which Q is evaluated
%
% See also: cip_max_filter2, ncipogram
%
% $Id: cipogram.m 53 2007-01-14 23:24:21Z memming $

if nargin < 6 % verbose is the sixth argument
    verbose = 0;
end

N1 = length(st1);
N2 = length(st2);
Nij = N1 * N2;
if N1 == 0 || N2 == 0
    warning('cipogram:NODATA', 'At least one spike is required!');
    deltaT = []; Q = [];
    return;
end

maxTTT = abs(maxT) + tau * 10; % exp(-10) is negligible

% rough estimate of # of time differences required (assuming independence)
% this estimate is awful if the spike trains are strongly correlated
eN = ceil((max(N1, N2))^2 * maxTTT * 2 / min(st1(end), st2(end)));
if verbose; fprintf('Expected time differences [%d] / [%d]\n', eN, Nij); end
deltaT = zeros(2 * eN, 1);

% Compute all the time differences
lastStartIdx = 1;
k = 1;
for n = 1:N1
    for m = lastStartIdx:N2
        timeDiff = st2(m) - st1(n);
        if timeDiff < -maxTTT
            % st2(m) is too early for this and every later st1 spike
            lastStartIdx = lastStartIdx + 1;
            continue;
        end
        if timeDiff <= maxTTT
            deltaT(k) = timeDiff;
            k = k + 1;
        else % this is the ending point
            break;
        end
    end
end
deltaT = deltaT(1:(k-1));
N = length(deltaT);
if N < 2
    warning('cipogram:NODATA', 'At least two intervals are required');
    deltaT = []; Q = [];
    return;
end

if verbose
    fprintf('Actual number of time differences [%d]\nSorting ...\n', N); tic;
end
deltaT = sort(deltaT, 1); % Sort the time differences
if verbose; fprintf('Sorting finished [%f sec]\r', toc); end

% Qminus(k): kernel sum over differences at or before deltaT(k) (self included)
% Qplus(k): kernel sum over differences after deltaT(k)
Qplus = zeros(N, 1);
Qminus = zeros(N, 1);
Qminus(1) = 1;
Qplus(N) = 0;
EXP_DELTA = exp(-(diff(deltaT)) / tau);
for k = 1:(N-1)
    Qminus(k + 1) = 1 + Qminus(k) * EXP_DELTA(k);
    kk = N - k;
    Qplus(kk) = (Qplus(kk+1) + 1) * EXP_DELTA(kk);
end
Q = Qminus + Qplus;
Q = Q / 2 / tau / T;


function [Q, deltaT] = ncipogram(st1, st2, tau, maxT, T, verbose)
% [Q, deltaT] = ncipogram(st1, st2, tau, maxT, T, verbose)
% Normalized cipogram with 2nd order statistics.
%
% Input
%   st1, st2: spike trains with sorted spike timings
%   tau: time constant for CIP kernel
%   maxT: correlogram range will be effective in [-maxT, maxT]
%   T: length of spike train in seconds
%   verbose: (optional/0)
%
% Output
%   Q: normalized cipogram
%   deltaT: sorted time differences at which Q is evaluated
%
% $Id: ncipogram.m 59 2007-01-27 19:26:14Z memming $

if nargin < 6 % verbose is optional, so give it a default before passing it on
    verbose = 0;
end

[Q, deltaT] = cipogram(st1, st2, tau, maxT, T, verbose);
N1 = length(st1);
N2 = length(st2);
Nij = N1 * N2;
Q = (Q * T - Nij / T) * 2 * sqrt(tau * T) / sqrt(Nij);


REFERENCES

[1] M. C. W. van Rossum, “A novel spike distance,” Neural Computation, vol. 13, no. 4,pp. 751–764, 2001.

[2] J. D. Victor, “Spike train metrics,” Current Opinion in Neurobiology, vol. 15, no. 5,pp. 585–592, Sept. 2005.

[3] G. L. Gerstein, D. H. Perkel, and J. E. Dayhoff, “Cooperative firing activity insimultaneously recorded populations of neurons: detection and measurement,”Journal of Neuroscience, vol. 5, no. 4, pp. 881–889, 1985.

[4] J.-M. Fellous, P. H. E. Tiesinga, P. J. Thomas, and T. J. Sejnowski, “Discoveringspike patterns in neuronal responses,” J. Neurosci., vol. 24, no. 12, pp. 2989–3001,Mar. 2004.

[5] S. Schreiber, J. M. Fellous, D. Whitmer, P. Tiesinga, and T. J. Sejnowski, “A newcorrelation-based measure of spike timing reliability,” Neurocomputing, vol. 52-54, pp.925–931, 2003.

[6] A. Carnell and D. Richardson, “Linear algebra for time series of spikes,” in ESANN,2005.

[7] G. Buzsaki and A. Draguhn, “Neuronal oscillations in cortical networks,” Science,vol. 304, no. 5679, pp. 1926–1929, June 2004.

[8] Z. F. Mainen, J. Joerges, J. R. Huguenard, and T. J. Sejnowski, “A model of spikeinitiation in neocortical pyramidal neurons,” Neuron, vol. 15, no. 6, pp. 1427–1439,Dec. 1995.

[9] Y. Shu, A. Duque, Y. Yu, B. Haider, and D. A. McCormick, “Properties of actionpotential initiation in neocortical pyramidal cells: evidence from whole cell axonrecordings (in press),” J. Neurophysiol., Aug. 2006.

[10] B. Hochner, M. Klein, S. Schacher, and E. R. Kandel, “Action-potential durationand the modulation of transmitter release from the sensory neurons of Aplysia inpresynaptic facilitation and behavioral sensitization,” Proc. Natl. Aca. Sci., vol. 83,pp. 8410–8414, 1986.

[11] H. Alle and J. R. P. Geiger, “Combined analog and action potential coding inhippocampal mossy fibers,” Science, vol. 311, no. 5765, pp. 1290–1293, Mar. 2006.

[12] A.-K. Warzecha, J. Kretzberg, and M. Egelhaaf, “Temporal precision of the encodingof motion information by visual interneurons,” Current Biology, vol. 8, no. 7, pp.359–368, Mar. 1998.

75

76

[13] Z. N. Aldworth, J. P. Miller, T. Gedeon, G. I. Cummins, and A. G. Dimitrov,“Dejittered spike-conditioned stimulus waveforms yield improved estimates ofneuronal feature selectivity and spike-timing precision of sensory interneurons,” J.Neurosci., vol. 25, no. 22, pp. 5323–5332, June 2005.

[14] Z. F. Mainen and T. J. Sejnowski, “Reliability of spike timing in neocorticalneurons,” Science, vol. 268, no. 5216, pp. 1503–1506, 1995.

[15] R. VanRullen, R. Guyonneau, and S. J. Thorpe, “Spike times make sense,” Trends inNeurosciences, vol. 28, no. 1, pp. 1–4, Jan 2005.

[16] M. N. Shadlen and W. T. Newsome, “Noise, neural codes and cortical organization,”Curr. Opin. Neurobiol., vol. 4, pp. 569–579, 1994.

[17] H. Agmon-Snir, C. E. Carr, and J. Rinzel, “The role of dendrites in auditorycoincidence detection,” Nature, vol. 393, no. 6682, pp. 268–72, May 1998.

[18] J. P. Donoghue, J. N. Sanes, N. G. Hatsopoulos, and G. Gaal, “Neural dischargeand local field potential oscillations in primate motor cortex during voluntarymovements,” J. Neurophysiol., vol. 79, no. 1, pp. 159–173, 1998.

[19] D. R. Brillinger, J. Hugh L. Bryant, and J. P. Segundo, “Identification of synapticinteractions,” Biological Cybernetics, vol. 22, pp. 213–228, 1976.

[20] R. Dahlhaus, M. Eichler, and J. Sandkuhler, “Identification of synaptic connections inneural ensembles by graphical models,” Journal of Neuroscience Methods, vol. 77, pp.93–107, 1997.

[21] G. Schneider, M. N. Havenith, and D. Nikolic, “Spatiotemporal structure in largeneuronal networks detected from cross-correlation,” Neural Computation, vol. 18, no.10, pp. 2387–2413, 2006.

[22] T. Berger, M. Baudry, R. Brinton, J.-S. Liaw, V. Marmarelis, A. Y. Park, B. Sheu,and A. Tanguay, “Brain-implantable biomimetic electronics as the next era in neuralprosthetics,” Proceedings of the IEEE, vol. 89, no. 7, pp. 993–1012, 2001.

[23] R. Gutig and H. Sompolinsky, “The tempotron: a neuron that learns spiketiming-based decisions,” Nat Neurosci, vol. 9, no. 3, pp. 420–428, Mar. 2006.

[24] A. Borst and F. E. Theunissen, “Information theory and neural coding,” NatNeurosci, vol. 2, no. 11, pp. 947–957, Nov. 1999.

[25] L. Paninski, “Estimating entropy on m bins given fewer than m samples,” Informa-tion Theory, IEEE Transactions on, vol. 50, no. 9, pp. 2200– 2203, 2004.

[26] W. Bialek and A. Zee, “Coding and computation with neural spike trains,” J. Stat.Phys., vol. 59, pp. 103–115, 1990.

77

[27] J. D. Victor, “Binless strategies for estimation of information from neural data,”Phys. Rev. E, vol. 66, no. 5, pp. 051903, Nov 2002.

[28] E. N. Brown, R. E. Kass, and P. P. Mitra, “Multiple neural spike train data analysis:state-of-the-art and future challenges,” Nature neuroscience, vol. 7, no. 5, pp. 456–61,may 2004.

[29] N. Masuda and K. Aihara, “Spatiotemporal spike encoding of a continuous externalsignal,” Neural Computation, vol. 14, pp. 15991628, 2002.

[30] M. Nawrot, A. Aertsen, and S. Rotter, “Single-trial estimation of neuronal firingrates: From single-neuron spike trains to population activity,” Journal of Neuro-science Methods, vol. 94, pp. 81–92, 1999.

[31] M. Bazhenov, M. Stopfer, M. Rabinovich, R. Huerta, H. D. Abarbanel, T. J.Sejnowski, and G. Laurent, “Model of transient oscillatory synchronization inthe locust antennal lobe,” Neuron, vol. 30, pp. 553–567, 2001.

[32] P. Reinagel and R. C. Reid, “Precise firing events are conserved across neurons,” J.Neurosci., vol. 22, no. 16, pp. 6837–6841, Aug 2002.

[33] R. Azouz and C. M. Gray, “Dynamic spike threshold reveals a mechanism forsynaptic coincidence detection in cortical neurons in vivo,” PNAS, vol. 97, no. 14, pp.8110–8115, July 2000.

[34] E. M. Izhikevich, “Polychronization: Computation with spikes,” Neural Comp., vol.18, no. 2, pp. 245–282, Feb. 2005.

[35] J. de la Rocha and N. Parga, “Short-term synaptic depression causes anon-monotonic response to correlated stimuli,” J. Neurosci., vol. 25, no. 37, pp.8416–8431, Sept. 2005.

[36] P. Konig, A. K. Engel, and W. Singer, “Integrator or coincidence detector? the roleof the cortical neuron revisited,” Trends in Neurosciences, vol. 19, no. 4, pp. 130–137,April 1996.

[37] D. Xu, Energy, entropy and information potential for neural computation, Ph.D.dissertation, University of Florida, May 1999.

[38] A. Kuhn, A. Aertsen, and S. Rotter, “Higher-order statistics of input ensemblesand the response of simple model neurons.,” Neural Computation, vol. 15, no. 1, pp.67–101, 2003.

[39] A. Kuhn, S. Rotter, and A. Aertsen, “Correlated input spike trains and their effectson the response of the leaky integrate-and-fire neuron,” Neurocomputing, vol. 44–46,pp. 121–126, June 2002.

78

[40] J. Hopfield and A. Herz, “Rapid local synchronization of action potentials: Towardcomputation with coupled integrate-and-fire neurons,” PNAS, vol. 92, no. 15, pp.6655–6662, July 1995.

[41] J. J. Hopfield and C. D. Brody, “What is a moment? transient synchrony as acollective mechanism for spatiotemporal integration,” PNAS, vol. 98, no. 3, pp.1282–1287, 2001.

[42] A. K. Engel and W. Singer, “Temporal binding and the neural correlates of sensoryawareness,” Trends in Cognitive Sciences, vol. 5, no. 1, pp. 16–25, 2001.

[43] R. E. Mirollo and S. H. Strogatz, “Synchronization of pulse-coupled biologicaloscillators,” SIAM Journal on Applied Mathematics, vol. 50, no. 6, pp. 1645–1662,Dec. 1990.

[44] D. Hansel and H. Sompolinsky, “Synchronization and computation in a chaotic neuralnetwork,” Phys. Rev. Lett., vol. 68, no. 5, pp. 718–721, Feb 1992.

[45] A. Zumdieck, M. Timme, T. Geisel, and F. Wolf, “Long chaotic transients in complexnetworks,” Phys. Rev. Lett., vol. 93, no. 24, pp. 244103, 2004.

[46] E. M. Izhikevich, “Weakly pulse-coupled oscillators, FM interactions, synchronization,and oscillatory associative memory,” IEEE Transactions on Neural Networks, vol. 10,no. 3, pp. 508–526, 1999.

[47] L. F. Abbott and C. van Vreeswijk, “Asynchronous states in networks ofpulse-coupled oscillators,” Phys. Rev. E, vol. 48, no. 2, pp. 1483–1489, Aug 1993.

[48] P. Dayan and L. F. Abbott, Theoretical Neuroscience: Computational and Mathemat-ical Modeling of Neural Systems, MIT Press, Cambridge, MA, USA, 2001.

[49] A. M. Aertsen, G. L. Gerstein, M. K. Habib, and G. Palm, “Dynamics of neuronalfiring correlation: modulation of “effective connectivity”,” Journal of Neurophysiology,vol. 61, no. 5, pp. 900–917, 1989.

[50] S. Grun, M. Diesmann, and A. Aertsen, “Unitary Events in multiple single-neuronactivity. I. detection and significance,” Neural Computation, vol. 14, no. 1, pp. 43–80,2002.

[51] D. L. Snyder and M. I. Miller, Random Point Processes in Time and Space,Springer-Verlag, 1991.

[52] D. J. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes,Springer, 1988.

[53] Z. F. Mainen and T. J. Sejnowski, “Reliability of spike timing in neocorticalneurons,” Science, vol. 268, no. 5216, pp. 1503–1506, 1995.

79

[54] D. H. Perkel, G. L. Gerstein, and G. P. Moore, “Neuronal spike trains and stochasticpoint processes: II. simultaneous spike trains,” Biophys J, vol. 7, no. 4, pp. 419–440,July 1967.

[55] C. K. Knox, “Cross-correlation functions for a neuronal model,” Biophysical Journal,vol. 14, no. 8, pp. 567–582, 1974.

[56] R. E. Kass, V. Ventura, and C. Cai, “Statistical smoothing of neuronal data,”Network: Comput. Neural Syst., vol. 14, pp. 5–15, 2003.

[57] W. Gerstner, “Coding properties of spiking neurons: reverse and cross-correlations,”Neural Networks, vol. 14, pp. 599–610, 2001.

[58] P. Diggle and J. S. Marron, “Equivalence of smoothing parameter selectors in densityand intensity estimation,” Journal of the American Statistical Association, vol. 83,no. 403, pp. 793–800, Sept. 1988.

[59] E. Parzen, “On the estimation of a probability density function and the mode,” TheAnnals of Mathematical Statistics, vol. 33, no. 2, pp. 1065–1076, Sept. 1962.

[60] G. Palm, A. M. H. J. Aertsen, and G. L. Gerstein, “On the significance of correlationsamong neuronal spike trains,” Biological Cybernetics, vol. 59, pp. 1–11, 1988.

[61] H. Shimazaki and S. Shinomoto, “A recipe for optimizing a time-histogram,” inNeural Information Processing Systems, 2006.

[62] B. L. Sabatini and W. G. Regehr, “Timing of synaptic transmission,” Annual Reviewof Physiology, vol. 61, pp. 521–542, 1999.

[63] S. M. Potter and T. B. DeMarse, “A new approach to neural cell culture forlong-term studies,” Journal of Neuroscience Methods, vol. 110, pp. 17–24, 2001.

[64] H. Kawaguchi and K. Fukunishi, “Dendrite classification in rat hippocampal neuronsaccording to signal propagation properties,” Experiments in Brain Research, vol. 122,pp. 378 – 392, 1998.

[65] J. le Feber, W. L. C. Rutten, J. Stegenga, P. S. Wolters, G. J. A. Ramakers, andJ. van Pelt, “Conditional firing probabilities in cultured neuronal networks: a stableunderlying structure in widely varying spontaneous activity patterns,” J. NeuralEng., vol. 4, pp. 54–67, 2007.

[66] C. D. Brody, “Correlations without synchrony,” Neural Computation, vol. 11, pp.1537–1551, 1999.

[67] D. Nikolic, “Non-parametric detection of temporal order across pairwisemeasurements of time delays,” J. Comput. Neurosci., vol. 22, pp. 5–19, 2007.

[68] F. Rieke, D. Warland, R. de Ruyter van Steveninck, and W. Bialek, Spikes: exploringthe neural code, MIT Press, Cambridge, MA, USA, 1999.

80

[69] P. Diggle, “A kernel method for smoothing point process data,” Applied Statistics,vol. 34, no. 2, pp. 138–147, 1985.

[70] E. N. Gilbert and H. O. Pollak, “Amplitude distribution of shot noise,” Bell Syst.Tech. J., vol. 39, pp. 333350, 1960.

[71] S. Yue and M. Hashino, “The general cumulants for a filtered point process,” AppliedMathematical Modelling, vol. 25, pp. 193–201, 2001.

[72] J. Michel, “A point process approach to filtered processes,” Methodol. Comput. Appl.Probab., vol. 6, pp. 423–440, 2004.

[73] E. Parzen, Time Series Analysis Papers, Holden-Day, 1967.

[74] J. S. Simonoff, Smoothing Methods in Statistics, Springer, 1996.

[75] I. S. Abramson, “On bandwidth variation in kernel estimates-a square root law,”Ann. Stat., vol. 10, no. 4, pp. 1217–1223, Dec 1982.

[76] D. Erdogmus and J. C. Prıncipe, “Generalized information potential criterion foradaptive system training,” IEEE Transactions on Neural Networks, vol. 13, no. 5, pp.1035–1044, Sept. 2002.

[77] J. C. Prıncipe, D. Xu, and J. W. Fisher, “Information theoretic learning,” inUnsupervised Adaptive Filtering, S. Haykin, Ed., vol. 2, pp. 265–319. John Wiley &Sons, 2000.

[78] A. Renyi, “On measures of entropy and information,” in Selected papers of AlfredRenyi, vol. 2, pp. 565–580. Akademiai Kiado, Budapest, Hungary, 1976.

[79] S. Kullback, Information Theory and Statistics, Dover Publications, New York, 1968.

[80] N. Aronszajn, “Theory of reproducing kernels,” Transactions of the AmericanMathematical Society, vol. 68, no. 3, pp. 337–404, May 1950.

BIOGRAPHICAL SKETCH

Il Park was born on April 29, 1979 in Gosla, Germany. He attended Gyunggi Science High School for 2 years. He majored in computer science at KAIST (Korea Advanced Institute of Science and Technology). He spent 2001-2003 at an internet security company as a developer. He has been working with Dr. Jose Príncipe in the Computational NeuroEngineering Laboratory (CNEL) since 2006. He has been admitted to the Ph.D. program of the Biomedical Engineering department at the University of Florida.
