

The Pennsylvania State University

The Graduate School

MODELING AND MATHEMATICAL ANALYSIS

OF NEURAL AVALANCHES

A Dissertation in

Mathematics

by

Anirban Das

© 2019 Anirban Das

Submitted in Partial Fulfillment

of the Requirements

for the Degree of

Doctor of Philosophy

August 2019


The dissertation of Anirban Das was reviewed and approved* by the following:

Manfred Denker
Visiting Professor of Mathematics
Dissertation Adviser
Chair of Committee

Anna Mazzucato
Professor of Mathematics

Carina Curto
Associate Professor of Mathematics

Guodong Pang
Associate Professor of Industrial and Manufacturing Engineering

Alexei Novikov
Head of Graduate Program, Mathematics

*Signatures are on file in the Graduate School.


Abstract

This dissertation centers around the use of mathematical tools, mainly stochastic analysis, to study various neural phenomena. The goal is not only to use mathematics to draw biological inferences, but also to bring some existing techniques and models in mathematical neuroscience into closer alignment with the realities of the brain.

The first part of the dissertation is a general introduction to the field of critical neural models and the challenges relating to them that we strive to solve. In the second part we discuss the Abelian distribution, which is popular in neuroscience. We give for the first time a closed-form expression for the asymptotic variance of the Abelian distribution, and also derive expressions for the variance in finite-size systems. It becomes clear that the variance of the Abelian distribution diverges faster than its mean as the parameters approach critical values. This makes neural data unsuitable for standard statistical analysis. We point out the reasons for this and develop a new technique, called the p-stable method, to manage such data.

In part three we take up a class of models called Self-Organized Criticality (SOC) models. These have been used by physicists for more than 30 years to study systems that appear to be poised at a critical state even though no external agent tunes the system to that state. Neural circuits are conceived to be such systems, and hence SOC theory has been applied as a theoretical basis for understanding them. However, classical SOC theory was not developed with the brain in mind, and many of its facets directly contradict chief tenets of neuroscience. We establish a novel version of SOC theory that is suited to applications in neuroscience. Here we give for the first time an analytic derivation of the distribution of neural avalanche sizes when there is input during avalanches.

Finally, in the fourth part we show how Markov processes in random Markov environments can have an extra maintenance state appended to them; the extra maintenance state serves to endow the joint process with a suitable stationary measure. We point out how this can be used to model a pair of neurons falling in and out of "synchrony", where occasional external spikes (due to sensory stimulus or signals from a different part of the brain) can be viewed as a maintenance state controlling ergodic properties. Product forms for stationary measures of Markov processes in a random environment have been studied before. Our outlook is novel because: (1) we not only allow the environment to affect the basic process and the basic process to affect the environment, but also allow the states of the basic and environment processes to change together; (2) there are almost no restrictions on the nature of the interaction between the environment and basic processes; we only need to add an extra state to the full space to allow the product form of the stationary measure to develop.


Contents

List of Figures

Acknowledgements

Part I : Introduction

Part II : The Abelian distribution and its properties

1 Background and organization

2 Variance of the Abelian distribution
   2.1 The variance of the Abelian distribution
   2.2 Stirling numbers
   2.3 Asymptotic behavior of the variance of the Abelian distribution
   2.4 Remarks
   2.5 Methods

3 Estimation of parameters for the Abelian distribution
   3.1 U statistics
   3.2 Adaptations to U statistics for analyzing heavy-tailed distributions
      3.2.1 Using weights that have p-stable distributions
      3.2.2 Almost sure central limit theorems in statistics
   3.3 Simulations for neural avalanche data
      3.3.1 Outline of the p-stable method in the context of the sample mean
      3.3.2 Discussion

Part III : External input and neuronal avalanches

4 Background and organization

5 Neuroscience interpretations
   5.1 Models
      5.1.1 The Branching Model (BM)
      5.1.2 The Levels Model (LM)
   5.2 Results and interpretation
      5.2.1 Dependence on extent of external input for LM
      5.2.2 Finite-size effects and numerical simulations for LM
      5.2.3 BM with input
   5.3 Conclusion
   5.4 Statistical analysis of data derived from simulation of BM

6 The Levels Model, a mathematical analysis
   6.1 The "(N, p) BB" space
   6.2 A technical model
   6.3 A model with moderate external input
      6.3.1 Asymptotics
      6.3.2 Cutoff at k > φ^{-2}
   6.4 Large and small input regimes
      6.4.1 Small input regime
      6.4.2 Large input regime

Part IV : Markov processes in random environments, and their relationship to neuroscience

7 Background and organization

8 Constructions of Markov processes in random environments, where the state spaces are finite
   8.1 Introduction
   8.2 A construction with a single transitional state
      8.2.1 A general result
      8.2.2 A minimal version of the single transitional state construction
   8.3 Application of the minimal construction to queueing theory
      8.3.1 A formal description
      8.3.2 A technical analysis
   8.4 A construction with multiple maintenance states
      8.4.1 Description of the refined construction on the example
      8.4.2 Analysis of transitional rates
   8.5 Closing comments

9 Extending to state spaces which are compact metric spaces
   9.1 Applying the construction to Markov chains in random environments
   9.2 δ-smooth workable structure
      9.2.1 Introduction
      9.2.2 Formalization

Part V : Conclusions and future work

References


List of Figures

3.1 CLT and p-stable methods (the p value used is 1.7) for calculating confidence intervals for α for three different values of xm. On the x-axes we indicate the method used to obtain the confidence interval for α. On the y-axes is shown the range of the 4% confidence interval obtained for each method. Red dots indicate the ends of the confidence intervals. The blue 5 symbol indicates that a lower bound for the confidence interval cannot be calculated using the method in question. To calculate confidence intervals we use 1000 instances of synthetic data. The points indicated by × show the sample mean calculated from 900000 instances of synthetic data. The inset in the leftmost panel shows the p-stable results for this case more prominently.

5.1 Schematic representation of the levels model without external input, for N = 6 neurons with M = 7 energy levels. The avalanche size is 4; the avalanche duration is 3.

5.2 Avalanche size distributions in the LM with various input strengths. The inset shows the corresponding duration distributions, which also change their exponent. Input strength φ and the power-law exponents of the lines are indicated in the legend. N = M = 10^5.

5.3 Avalanche size distributions in the branching model with various input strengths. Input strength φ is indicated in the legend. N = 10^5, n = 10. Distributions for φ > 0 are shifted such that they all coincide for s = 100.

5.4 Avalanche size distributions in the levels model and branching model with various input strengths. Input strength φ and the model type are indicated in the legend. The magenta line indicates the analytic prediction for the onset of the power law with exponent −1.25. For both models we take N = 10^5. To improve visibility, the distributions are shifted by multiplication with c_φ = 10^{−10φ+1}.

5.5 The power-law distribution is not a significantly better fit to the data in the predicted region than the log-normal distribution. The main plot is a log-log plot of the avalanche size (s) vs. the empirical probability of avalanche sizes (P_ob(s)) as observed in the data. The expected power-law region between 12 and 1250 is marked in blue. The top-right inset is a log-log plot of the probability density (P_LN(s)) of the best-fitted log-normal distribution between 12 and 1250. The bottom-left inset is a log-log plot of the probability density (P_PL(s)) of the best-fitted power-law distribution between 12 and 1250.


Acknowledgements

1. I would like to thank Prof. Manfred Denker for his great mathematical and statistical insight, encouragement and guidance. One could not hope to find a better adviser.

2. I would like to thank Prof. Yuri Suhov, Prof. Alexei Novikov, and Prof. Anna Mazzucato for educating me in probability and analysis, and for being available to discuss mathematics and other pertinent issues.

3. I would like to thank Prof. Guodong Pang and Prof. Aristotle Arapostathis for their active support and collaboration on projects relating to queueing theory, stochastic control, and process convergence.

4. I would like to thank Prof. Carina Curto and J-Prof. Anna Levina for teaching me about neuroscience and the place of mathematics in this subject.

5. I would like to thank my friends Matthew Chao and Anthony Deluca, with whom I could freely discuss mathematics and personal issues within and outside the department.

6. I would like to thank Agnieszka Zelerowicz, Shilpak Banerjee, David Hughes, Shreejit Bandyopadhyay, Sumita Garai and Ayush Khaitan for their friendship and support.

7. I would like to thank my friends outside the math department, Anish Dasgupta, Bikramjit Chatterjee, Tanusree Chatterjee, Shashank Pandey, and Manjari Mukhopadhyay, who kept me sane and functional.

8. My collaborator Anna Levina received funding from the Marie Curie Actions of the European Union's Seventh Framework Programme (FP7/2007-2013) under REA grant agreement no. [291734], and from a Sofja Kovalevskaja Award from the Alexander von Humboldt Foundation, endowed by the German Federal Ministry of Education and Research. The work in Parts II and III of this dissertation was done in collaboration with her. She and the Department of Mathematics, Penn State University funded my travels to Vienna and Tübingen to discuss research with her. I would like to acknowledge these contributions, and also state that the findings and conclusions do not necessarily agree with the views of the funding agencies.

9. Finally I would like to thank my parents for everything.


Part I :

Introduction


A variety of natural systems provide observations that follow power-law statistics, possibly with an exponential cutoff [50, 98, 5]. For example, a power-law distribution for activity propagation cascades (so-called neuronal avalanches) was reported in numerous neuronal systems in vitro and in vivo and at various spatial and temporal scales [5, 71, 84, 100, 96, 82]. In many cases, the appearance of power-law statistics is connected with closeness to the critical point of a second-order (continuous) phase transition. For the brain, the claim that power-law observations point to closeness to critical states was additionally supported by the observation of stable exponent relations [46], finite-size scaling [61], and shape collapse [46]. Models of criticality therefore began to be used for studying the brain. Additional reasoning for this comes from observations that criticality brings about optimal computational capabilities [62, 11], optimal transmission and storage of information [12], and sensitivity to sensory stimuli [95, 60]. In spite of these reasons for real systems to be close to criticality, there are many ongoing debates about the models and data-analysis techniques [83] used to demonstrate this fact [8, 4].

The concept of self-organized criticality (SOC) was proposed [2] as a unified mechanism for positioning and keeping systems close to criticality. SOC models have emerged as the flagship vehicle for modeling criticality as an operational state of the brain network because they eliminate the necessity to tune parameters. For a system consisting of many interacting non-linear units, the general theory prescribes the conditions necessary for exhibiting self-organized criticality: first, the system should obey local energy conservation rules [13]; second, the timescale of the external drive should be separated from the timescale of interactions. This implies that no external input is delivered to the system before it reaches a stable configuration. The intuition behind the timescale separation condition can be summarized as follows: consider a macroscopic scale at which the external energy is applied and a microscopic scale for activity propagation through the interacting units. When the two scales are comparable, the frequency of the drive becomes a factor that can be tuned by some moderating party. In the limit, as the frequency of macroscopic events goes to zero, global supervision ceases and a self-organized system emerges [86, 39, 40, 103].

Theoretical studies of avalanche-like propagation in networks were initiated before the first experimental results. The initial models were simple [1, 29, 109, 55, 44]; later, following the experimental findings, more biologically realistic models were developed [32, 66, 67, 91, 38]. A detailed review of studies of criticality in biological systems through experiments and models may be found in [79].

Most current models and evaluation techniques of experimental results rely on a binning procedure for the definition of avalanches. Namely, the total time of a real or digital experiment is split into short intervals (bins). Empty bins are considered to be pauses between avalanches, which span bins containing at least one active unit. A rare exception is the leaky integrate-and-fire neuronal model by Millman et al. [75], where the avalanche was defined by following the time course of activity propagation on the known network. However, it was recently demonstrated [72] that the classical binning procedure does not reveal any critical statistics for this model, and that neutral theory could explain the observed power laws without necessitating an SOC model. The usage of binning for the analysis of data from neuronal recordings [85, 5] implicitly relies on the assumption of timescale separation. Otherwise, the bin size would be a stiff [51] control parameter of the data analysis that can change the outcome of the evaluation. However, in neuronal systems inputs are constantly present and there is no chance for a strict separation of external input from the internal dynamics.

In Part III we investigate the effects of relaxing the timescale separation condition by implementing an input process during avalanches and observing the consequent avalanche size distribution. Our emphasis here is on models that, on the one hand, present signatures of criticality even when the drive is absent and, on the other hand, try to mechanistically explain the power-law generation in activity propagation. In avalanching models additional input to the system generally has two effects: first, the avalanches grow due to the input and follow-up firing; second, the avalanches are "glued together", namely, the input connects avalanches that would otherwise have occurred separately. For systems that are driven by a constant input, criticality can be defined via estimation of the branching ratio [105]. However, in this case typical binning-based avalanche analysis might not reveal a critical state, because both aforementioned effects are present simultaneously and separating the avalanches becomes impossible. A recent study [107] numerically demonstrated that adapting the binning to the firing rate reveals critical dynamics during task performance, when additional input on top of ongoing activity is expected. Here we investigate analytically how criticality can be preserved even if external drive is added to the system.

As a first step towards understanding timescale separation, we allow for external input during avalanches without compromising their separation. We develop an analytic treatment and provide numerical simulations for a simple neuronal model. We show that the power-law scaling is preserved; however, even moderate external input leads to a change in the slope of the avalanche size distribution. The same critical exponents persist throughout a range of input values; therefore the rate of input does not take on the role of a tuning parameter. The analytic results are reproduced when simulating more realistic branching models, where input is added at a constant rate.

One of the chief reasons analytic results like the ones we develop in Part III are useful in the study of avalanches is that data from simulations are difficult to analyze. In particular, it is hard to look at data generated by a power-law distribution and recognize that the underlying cause is a power law [22]. This is essentially because power-law distributions are heavy tailed. We discuss the problem of estimating parameters of heavy-tailed distributions in Part II by using a "resampling approach" [33]. We develop a method and apply it to power-law data from avalanche models with full timescale separation.

To understand the nature of this heavy-tailed behavior of distributions arising in neuroscience, we take up the case of the Abelian distribution, which arises in models of neural avalanches (see [44], [68], [69]). Chapter 2 in Part II is devoted to computing the variance of the Abelian distribution. For the variance of finite systems one may use results from quasi-binomial distribution theory [26]; we pursue an ad hoc approach which is clearer. To understand what happens to the variance as the system size (number of neurons examined) goes to infinity, we introduce Stirling numbers, develop some useful lemmas about them, and apply them to study the asymptotic variance. What we discover in Chapter 2 is that as the system size goes to infinity, the variance of the Abelian distribution explodes much faster than the mean. This means classical CLT-based methods for calculating confidence intervals for the parameters of the Abelian distribution from data will not work well. We develop a new technique, called the p-stable method, in Chapter 3. We review the theoretical background that underpins the validity of the p-stable method; in fact, we introduce the p-stable method in the general context of U-statistics. Next we apply the p-stable method to synthetic neural data and compare the estimates obtained with those obtained by classical CLT-based methods.

In Part IV we take up the case of Markov processes in random Markov environments. We show how to equip processes interacting freely with random environments with a particular stationary measure by making small adjustments. We demonstrate our work when the "target measure" has a product form [110]; however, the proposed construction will work for other types of measures. In Part III, we developed a probability space where the Abelian distribution appeared as the distribution of a random variable. We used a measure space called the (N, p) BB space and provided justification for why it was a correct space to consider in the context of modeling neural avalanches. We did not, however, justify why the uniform measure was the correct probability measure to equip the (N, p) BB space with (for the timescale-separated regime, justifications are given in [44]). All other measures on the (N, p) BB space that we introduced in Part III were motivated by introducing various types of external inputs during avalanches. In each circumstance we assumed that the uniform distribution was the correct measure to consider when there was no input. This brings us back to the question of why the uniform measure is a correct choice. In nature, most probability measures appear as stationary measures of Markov processes. We therefore try to find the measure as a stationary measure of suitable Markov processes. We first begin with understanding possible stationary measures for Markov processes in random environments. These are a rich class of processes that are useful in formalizing physical examples in neuroscience, such as memory [15].

Recently, Belopolskaya and Suhov (2015) studied Markov processes in a random environment, where the environment changes in a Markovian manner. They introduced constructions allowing the process to "interact with an environment". This was done in such a manner that the combined process had the product of the stationary measures of the individual processes as its stationary measure. In Chapter 8 a new construction is implemented, related to a product form for the stationary measure. This construction can be carried out with almost no conditions; however, it requires the use of an additional state, denoted by c. The extent to which the combined process uses state c indicates how far this process is from naturally having a product form for its stationary measure. To illustrate various aspects of the construction, we use an example from queueing theory, which is studied in detail. We observe how our construction works in this example, especially how the combined process uses state c. Physical interpretations lead to a refined construction, also carried out in a queueing-theory setting. In the refined construction the use of state c agrees with intuition, which results in possible applications to mathematical models used in the field of neurobiology. In Chapter 9 we extend the previous results to compact metric spaces.


Part II :

The Abelian distribution and its properties


Chapter 1

Background and organization

The Abelian distribution is a distribution that is important in models studying neural avalanches (see [44], [68], [69]). Neural avalanches were observed by [6], [5]. In the experiments, cultured slices from the brain were attached to multielectrode ensembles and LFP (Local Field Potential) signals were recorded. The data retrieved showed brief intervals of activity, when electrodes detected LFPs above the threshold. The periods between these short bursts of activity were marked by idleness. A sequence of such sustained activity was called an avalanche.

This part is organized into two main chapters and is intended to conduct a thorough examination of the Abelian distribution. Chapter 2 is devoted to computing the variance of the Abelian distribution. For the variance of finite systems one may use results from quasi-binomial distribution theory; we pursue an ad hoc approach which is clearer. To understand what happens to the variance as the system size (number of neurons examined) goes to infinity, we introduce Stirling numbers, develop some useful lemmas about them, and apply them to studying the asymptotic variance. What we discover in Chapter 2 is that as the system size goes to infinity, the variance of the Abelian distribution explodes much faster than the mean. This means classical CLT-based methods for calculating confidence intervals for the parameters of the Abelian distribution from data will not work well. We develop a new technique, called the p-stable method, in Chapter 3. We review the theoretical background that underpins the validity of the p-stable method; in fact, we introduce the p-stable method in the general context of U-statistics. Next we apply the p-stable method to synthetic neural data and compare the estimates obtained with those obtained by classical CLT-based methods. The work was done in collaboration with Dr. Manfred Denker, Dr. Anna Levina, and Dr. Lucia Tabacu.


Chapter 2

Variance of the Abelian distribution

2.1 The variance of the Abelian distribution

Definition 2.1.1 (Abelian distribution). The Abelian distribution $Z_{N,p}$ is a probability distribution on $\{1, 2, \ldots, N\}$ defined by the probability density
\[
P(Z_{N,p} = b) = C_{N,p}\binom{N-1}{b-1} p^{\,b-1}(1-bp)^{N-b-1}\, b^{\,b-2},
\]
where $C_{N,p}$ is the normalization constant defined by $C_{N,p} = \frac{1-Np}{1-(N-1)p}$. The parameter $N$ must be an integer, and the parameter $p$ lies in $(0, \tfrac{1}{N})$.

That this is indeed a distribution was proved in [68]; see also [69]. The $p$ in the Abelian distribution is often taken as $\frac{\alpha}{N}$, where $0 < \alpha < 1$. It was also proved in [68], [69] that:

Lemma 2.1.1. $E(Z_{N,\frac{\alpha}{N}}) = \frac{N}{N-(N-1)\alpha}$.
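As a quick numerical sanity check of Definition 2.1.1 and Lemma 2.1.1, here is a short R sketch (the helper name abelian_pmf and the parameter values are ours and purely illustrative): it evaluates the probability mass function in log space, verifies that it sums to one, and compares the empirical mean with the closed-form expression.

    # Evaluate the Abelian pmf of Definition 2.1.1 (computed in log space for stability)
    abelian_pmf <- function(b, N, p) {
      logC <- log(1 - N * p) - log(1 - (N - 1) * p)      # log C_{N,p}
      exp(logC + lchoose(N - 1, b - 1) + (b - 1) * log(p) +
            (N - b - 1) * log(1 - b * p) + (b - 2) * log(b))
    }
    N <- 100; alpha <- 0.8; p <- alpha / N
    probs <- abelian_pmf(1:N, N, p)
    sum(probs)                         # should be (numerically) 1
    sum((1:N) * probs)                 # mean of Z_{N, alpha/N} from the pmf
    N / (N - (N - 1) * alpha)          # Lemma 2.1.1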

In [36] we find the following distribution:

Definition 2.1.2 (Avalanche distribution). The Avalanche distribution $X_{N,p}$ is a probability distribution on $\{0, 1, 2, \ldots, N\}$ defined by the probability density
\[
P(X_{N,p} = b) = \binom{N}{b} p^{\,b}(1-(b+1)p)^{N-b}\,(b+1)^{b-1}.
\]
The parameter $N$ must be an integer, and the parameter $p$ lies in $(0, \tfrac{1}{N})$.

Lemma 2.1.2. $E(X_{N,p}) = \sum_{i=1}^{N}\frac{N!}{(N-i)!}\,p^{i}$.

Also define $Y_{N,p} = X_{N,p} + 1$. Thus
\[
P(Y_{N,p} = b) = P(X_{N,p} = b-1) = \binom{N}{b-1} p^{\,b-1}(1-bp)^{N-b+1}\, b^{\,b-2}.
\]
With these results at hand we are ready to find the variance of $Z_{N,p}$.


Theorem 2.1.3. The second moment of the Abelian distribution is as follows:
\[
E(Z_{N,p}^{\,2}) = \frac{C_{N,p}}{p}\Big[\frac{1}{1-Np} - 1 - \sum_{i=1}^{N-1}\frac{(N-1)!}{(N-1-i)!}\,p^{i}\Big]. \tag{2.1}
\]
And the variance of the distribution is
\[
V(Z_{N,p}) = \frac{C_{N,p}}{p}\Big[\frac{1}{1-Np} - 1 - \sum_{i=1}^{N-1}\frac{(N-1)!}{(N-1-i)!}\,p^{i}\Big] - \Big(\frac{N}{N-(N-1)\alpha}\Big)^{2}. \tag{2.2}
\]

Proof.
\begin{align*}
E(Y_{N,p}) &= \sum_{b=1}^{N+1} b^{\,b-1}\binom{N}{b-1} p^{\,b-1}(1-bp)^{N-b+1}\\
&= \sum_{b=1}^{N+1} b^{\,b-1}\binom{N}{b-1} p^{\,b-1}(1-bp)^{N-b} - \sum_{b=1}^{N+1} p\, b^{\,b}\binom{N}{b-1} p^{\,b-1}(1-bp)^{N-b}\\
&= \frac{1}{C_{N+1,p}}E(Z_{N+1,p}) - p\,\frac{1}{C_{N+1,p}}E(Z_{N+1,p}^{\,2}).
\end{align*}
We know $E(Z_{N+1,p})$ from Lemma 2.1.1, and using Lemma 2.1.2 we can compute $E(Y_{N,p})$. Using the above two facts one finds $E(Z_{N,p}^{\,2})$. Since $V(Z_{N,p}) = E(Z_{N,p}^{\,2}) - E(Z_{N,p})^{2}$, the variance too can be found from this.

It should be noted that since Abelian distributions fall in the family of quasi-binomial distributions of type 2 [26], Theorem 2.1.3 can also be established directly from results in the theory of quasi-binomial distributions.
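A minimal R check of (2.1), assuming the abelian_pmf helper from the earlier sketch; the function name second_moment_formula and the parameter values are ours.

    # Compare the closed-form second moment (2.1) with direct summation over the pmf
    second_moment_formula <- function(N, p) {
      C <- (1 - N * p) / (1 - (N - 1) * p)
      i <- 1:(N - 1)
      tail_sum <- sum(exp(lfactorial(N - 1) - lfactorial(N - 1 - i) + i * log(p)))
      (C / p) * (1 / (1 - N * p) - 1 - tail_sum)
    }
    N <- 100; alpha <- 0.8; p <- alpha / N
    direct <- sum((1:N)^2 * abelian_pmf(1:N, N, p))
    c(formula = second_moment_formula(N, p), direct = direct)   # the two values should agree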

2.2 Stirling numbers

Our chief goal for the rest of this section will be to find how the variance of the Abelian distribution behaves as $N$ tends to infinity. In order to do this we shall use the Stirling numbers of the first kind. The Stirling numbers were so named by N. Nielsen (1906) in honor of James Stirling, who introduced them in his Methodus Differentialis (1730) [102] without using any notation for them. The notation here is due to [88]. This section gives some definitions and results from [19]. We then proceed to state and prove a few lemmas of our own, 2.2.1, 2.2.2 and 2.2.3; these will be used in the section on the asymptotic behavior of the variance of the Abelian distribution.

We will use the notation $(x)_n$ for the polynomial $x(x-1)(x-2)\cdots(x-(n-1))$. This is called the factorial moment of order $n$.

The coefficients of such polynomials are called the Stirling numbers. Formally we have
\[
(x-r)_i = \sum_{j=0}^{i} s(i, j; r)\,x^{j}.
\]
Set $s(0, 0; r) = 1$. For $i \ge j \ge 0$, the $s(i, j; r)$ are called the non-centered Stirling numbers of the first kind. We will be chiefly interested in $r = 1$, when $(x-1)_i = \sum_{j=0}^{i} s(i, j; 1)x^{j}$. When $i \ge j > 0$, denote by $\tau^{i}_{j}$ the class of all possible subsets of $\{1, 2, 3, \ldots, i\}$ which are of cardinality $j$. The following can be found in Chapter 2 of [19]:
\[
|s(i, j; 1)| = (-1)^{i-j} s(i, j; 1). \tag{2.3}
\]
\[
(x+i)_i = \sum_{j=0}^{i} |s(i, j; 1)|\,x^{j}. \tag{2.4}
\]
\[
s(i, i; 1) = 1. \tag{2.5}
\]
\[
s(i, i-1; 1) = -\frac{i(i+1)}{2}. \tag{2.6}
\]
\[
|s(i, j; 1)| = i!\sum_{\{r_1, r_2, \ldots, r_j\}\in\tau^{i}_{j}}\frac{1}{r_1 r_2\cdots r_j}, \qquad\text{for } i \ge j > 0. \tag{2.7}
\]

Now we make some definitions and prove some results of interest to us.

Definition 2.2.1. Given a positive integer $i$, $P_i$ is the polynomial of degree $i$ $(i \ge 0)$ defined as $P_i(x) = \sum_{j=0}^{i} s(i+2, j; 1)x^{j}$, and $h_i$ is the polynomial of degree $i+2$ defined as $h_i(x) = x^{i+1}\big(\frac{(i+2)(i+3)}{2} - x\big)$.

Lemma 2.2.1. Let $i, N$ be positive integers with $N - 3 \ge i$. Then
\[
P_i(N) = (N-1)_{i+2} + h_i(N).
\]
Further, when $N - 3 \ge i \ge \sqrt{2N}$, $h_i(N) > 0$, and also $2N^{i+3} > P_i(N) > (N-1)_{i+2} \ge 0$.

Proof. The proof is a straightforward calculation.

Lemma 2.2.2. There exists a polynomial $f(x)$ of degree 4 such that for all integers $i, j$ with $i \ge j \ge 0$, we have $f(i) \ge 0$ and
\[
|s(i+2, j; 1)| \le |s(i, j; 1)|\,f(i). \tag{2.8}
\]

Proof. A proof is given in Section 2.5 (Methods).

Before ending the section we state one last technical lemma that will find use in the next section.

Lemma 2.2.3. When $0 \le i < \sqrt{2N}$, $\prod_{j=1}^{i}\big(1 + \frac{j}{N}\big) \le e^{2}$.

Proof. It is enough to show $\sum_{j=1}^{i}\ln\big(1 + \frac{j}{N}\big) \le 2$. This is exactly what we do:
\[
\sum_{j=1}^{i}\ln\Big(1 + \frac{j}{N}\Big) \le \sum_{j=1}^{i}\frac{j}{N} \quad\text{(using the fact that } \ln(1+x) \le x \text{ when } x > -1\text{)} = \frac{1}{2N}\,i(i+1) \le 2.
\]


2.3 Asymptotic behavior of the variance of the Abelian distribution

We will see how the variance of $Z_{N,p}$ behaves as $N$ tends to infinity; here we take $p = \frac{\alpha}{N}$. Here is the result.

Theorem 2.3.1. Let $Z_{N,p}$ be the Abelian distribution with parameters $p$ and $N$. For $0 < \alpha < 1$,
\[
\lim_{N\to+\infty} V(Z_{N,\frac{\alpha}{N}}) = \frac{\alpha}{(1-\alpha)^{3}}.
\]

Proof. Let $p = \frac{\alpha}{N}$. Restating (2.1) we get
\[
E(Z_{N,p}^{\,2}) = \frac{C_{N,p}}{p}\Big[\frac{1}{1-Np} - 1 - \sum_{i=1}^{N-1}(N-1)_i\,p^{i}\Big].
\]
The fact that $s(i, i; 1) = 1$ is used in the following calculations:
\begin{align*}
E(Z_{N,\frac{\alpha}{N}}^{\,2}) &= \frac{C_{N,p}}{p}\Big[\frac{1}{1-Np} - 1 - \sum_{i=1}^{N-1}p^{i}\sum_{j=0}^{i}s(i, j; 1)N^{j}\Big]\\
&= \frac{C_{N,p}}{p}\Big[\sum_{i=1}^{\infty}(Np)^{i} - \sum_{i=1}^{N-1}p^{i}\sum_{j=0}^{i}s(i, j; 1)N^{j}\Big]\\
&= \frac{C_{N,p}}{p}\Big[\sum_{i=N}^{\infty}(Np)^{i} - \sum_{i=1}^{N-1}p^{i}\sum_{j=0}^{i-1}s(i, j; 1)N^{j}\Big]\\
&= \frac{C_{N,p}}{p}\Big[\frac{\alpha^{N}}{1-\alpha} - \sum_{i=1}^{N-1}p^{i}\sum_{j=0}^{i-1}s(i, j; 1)N^{j}\Big].
\end{align*}
Hence we have
\[
E(Z_{N,\frac{\alpha}{N}}^{\,2}) = C_{N,p}\,[J_1 - J_2], \tag{2.9}
\]
where $J_1 = \frac{\alpha^{N}}{\frac{\alpha}{N}(1-\alpha)}$ and $J_2 = \frac{1}{p}\sum_{i=1}^{N-1}p^{i}\sum_{j=0}^{i-1}s(i, j; 1)N^{j}$.

We easily observe
\[
\lim_{N\to+\infty}C_{N,p} = \lim_{N\to+\infty}C_{N,\frac{\alpha}{N}} = 1, \qquad \lim_{N\to+\infty}J_1 = 0.
\]
Now we focus on $J_2$:
\begin{align*}
J_2 &= \frac{1}{p}\,p\sum_{i=0}^{N-2}p^{i}\sum_{j=0}^{i}s(i+1, j; 1)N^{j}\\
&= \sum_{i=0}^{N-2}p^{i}\sum_{j=0}^{i}s(i+1, j; 1)N^{j}\\
&= \sum_{i=0}^{N-2}p^{i}N^{i}s(i+1, i; 1) + \sum_{i=1}^{N-2}p^{i}\sum_{j=0}^{i-1}s(i+1, j; 1)N^{j}.
\end{align*}


So we have
\[
J_2 = J_3 + J_4, \tag{2.10}
\]
where $J_3 = \sum_{i=0}^{N-2}p^{i}N^{i}s(i+1, i; 1)$ and $J_4 = \sum_{i=1}^{N-2}p^{i}\sum_{j=0}^{i-1}s(i+1, j; 1)N^{j}$. We observe
\[
J_3 = \sum_{i=0}^{N-2}\alpha^{i}s(i+1, i; 1) = -\sum_{i=0}^{N-2}\alpha^{i}\,\frac{(i+1)(i+2)}{2}.
\]
This yields
\[
\lim_{N\to+\infty}J_3 = -\frac{1}{(1-\alpha)^{3}}.
\]
Further,
\begin{align*}
J_4 &= \sum_{i=1}^{N-2}p^{i}\sum_{j=0}^{i-1}s(i+1, j; 1)N^{j}\\
&= p\sum_{i=0}^{N-3}p^{i}\sum_{j=0}^{i}s(i+2, j; 1)N^{j}\\
&= p\sum_{i=0}^{\sqrt{2N}-1}p^{i}\sum_{j=0}^{i}s(i+2, j; 1)N^{j} + p\sum_{i=\sqrt{2N}}^{N-3}p^{i}\sum_{j=0}^{i}s(i+2, j; 1)N^{j}\\
&= J_5 + J_6,
\end{align*}
where $J_5 = p\sum_{i=0}^{\sqrt{2N}-1}p^{i}\sum_{j=0}^{i}s(i+2, j; 1)N^{j}$ and $J_6 = p\sum_{i=\sqrt{2N}}^{N-3}p^{i}\sum_{j=0}^{i}s(i+2, j; 1)N^{j}$.

We next show $\lim_{N\to+\infty}J_5 = 0$; the polynomial $f$ below is defined in Lemma 2.2.2, and we use Lemma 2.2.3:
\begin{align*}
|J_5| &= \Big|p\sum_{i=0}^{\sqrt{2N}-1}p^{i}\sum_{j=0}^{i}s(i+2, j; 1)N^{j}\Big|\\
&\le p\sum_{i=0}^{\sqrt{2N}-1}p^{i}\sum_{j=0}^{i}|s(i+2, j; 1)|\,N^{j}\\
&\le p\sum_{i=0}^{\sqrt{2N}-1}p^{i}\sum_{j=0}^{i}f(i)\,|s(i, j; 1)|\,N^{j}\\
&\le p\sum_{i=0}^{\sqrt{2N}-1}p^{i}f(i)\prod_{j=1}^{i}(N+j)\\
&\le \frac{\alpha}{N}\sum_{i=0}^{\sqrt{2N}-1}\alpha^{i}f(i)\prod_{j=1}^{i}\Big(1+\frac{j}{N}\Big) \quad\text{(putting } p = \tfrac{\alpha}{N}\text{)}\\
&\le \frac{\alpha}{N}\sum_{i=0}^{\sqrt{2N}-1}\alpha^{i}f(i)\,e^{2}.
\end{align*}


Since the last expression is an upper bound for $|J_5|$ and it approaches zero as $N$ approaches $\infty$, we are done.

Next we show $\lim_{N\to+\infty}J_6 = 0$; the polynomial $P_i$ below is defined in Definition 2.2.1:
\begin{align*}
|J_6| &= \Big|p\sum_{i=\sqrt{2N}}^{N-3}p^{i}P_i(N)\Big|\\
&= p\sum_{i=\sqrt{2N}}^{N-3}p^{i}P_i(N)\\
&\le p\sum_{i=\sqrt{2N}}^{N-3}p^{i}\,2N^{i+3}\\
&= \frac{\alpha}{N}\sum_{i=\sqrt{2N}}^{N-3}\alpha^{i}\,2N^{3} \quad\text{(putting } p = \tfrac{\alpha}{N}\text{)}\\
&= 2N^{2}\alpha\sum_{i=\sqrt{2N}}^{N-3}\alpha^{i} = 2N^{2}\alpha^{\sqrt{2N}+1}\,\frac{1-\alpha^{(N-3)-\sqrt{2N}}}{1-\alpha}.
\end{align*}
Since the last expression is an upper bound for $|J_6|$ and it approaches zero as $N$ approaches $\infty$, we are done.

From (2.9) and the estimates of $J_1$, $J_3$, $J_4$, we see
\[
\lim_{N\to+\infty}E(Z_{N,\frac{\alpha}{N}}^{\,2}) = (1-\alpha)^{-3}.
\]
Also
\[
\lim_{N\to+\infty}\big[E(Z_{N,\frac{\alpha}{N}})\big]^{2} = \lim_{N\to+\infty}\Big(\frac{1}{1-\frac{(N-1)\alpha}{N}}\Big)^{2} = \frac{1}{(1-\alpha)^{2}}.
\]
From this the theorem follows.

2.4 Remarks

The choices of $\alpha$ most interesting for studying the biological phenomenon are those values close to 1; these are the values for which the distribution behaves closest to a power law, as has been shown in [44]. Also, [66] shows these are the values of $\alpha$ the system settles to if one starts with dynamical synapses and a different suitable $\alpha$. This result shows that such systems have very high variance when we deal with a large number of neurons. This is consistent with a power-law distribution of exponent $-\frac{3}{2}$, as observed in the experiments of [6], [5]. An explicit expression for the variance, such as the one found here, often proves useful for inferring details about parameters from available data.
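One can also watch the variance approach the limit of Theorem 2.3.1 numerically; the following R lines (our own illustration, reusing the abelian_pmf sketch from Section 2.1, with arbitrary parameter choices) compute $V(Z_{N,\alpha/N})$ for growing $N$.

    # Variance of Z_{N, alpha/N} for increasing N, compared with alpha / (1 - alpha)^3
    variance_abelian <- function(N, alpha) {
      p <- alpha / N
      b <- 1:N
      probs <- abelian_pmf(b, N, p)
      sum(b^2 * probs) - sum(b * probs)^2
    }
    alpha <- 0.8
    sapply(c(100, 1000, 10000), variance_abelian, alpha = alpha)
    alpha / (1 - alpha)^3               # limiting value of Theorem 2.3.1 (= 100 for alpha = 0.8)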


2.5 Methods

Here we give proofs of certain lemmas from the main text.

Proof of Lemma 2.1.2. Let $(U_i)_{i=1}^{N}$ be i.i.d. random variables uniformly distributed on $[0, 1]$, where $N$ is a fixed positive integer and $p$ is a fixed number in $(0, \tfrac{1}{N})$. From these one may recursively construct the random sequence $(\varepsilon_{i,N})_{i=1}^{N}$ as follows:
\[
\varepsilon_{0,N} = 1, \qquad \varepsilon_{1,N} = \sum_{j=1}^{N}\mathbf 1_{[1-p,\,1]}(U_j),
\]
\[
\varepsilon_{k,N} = \sum_{j=1}^{N}\mathbf 1_{\big[1-p\sum_{i=0}^{k-1}\varepsilon_{i,N},\;1-p\sum_{i=0}^{k-2}\varepsilon_{i,N}\big]}(U_j), \qquad S_{N,p} = \sum_{i=1}^{N}\varepsilon_{i,N}.
\]
It was shown in [36] that $S_{N,p}$ has the same distribution as the Avalanche distribution. Thus it suffices to prove that $E(\varepsilon_{k,N}) = \frac{N!}{(N-k)!}\,p^{k}$ for all $k \ge 1$. We do so by induction. For $k = 1$,
\[
E(\varepsilon_{1,N}) = \sum_{i=1}^{N}P(U_i > 1-p) = \sum_{i=1}^{N}p = Np.
\]
Assume by the inductive hypothesis that the result holds for $k-1$; then for $k$,
\begin{align*}
E(\varepsilon_{k,N}) &= \sum_{i=1}^{N}P\Big(p\sum_{m=0}^{k-1}\varepsilon_{m,N} \ge 1-U_i \ge p\sum_{m=0}^{k-2}\varepsilon_{m,N}\Big)\\
&= \sum_{i=1}^{N}\sum_{j=1}^{N-1}j\,p\;P(\varepsilon_{k-1,N-1} = j)\\
&= Np\sum_{j=1}^{N-1}j\,P(\varepsilon_{k-1,N-1} = j) \quad\text{[conditioning on the value of }\varepsilon_{k-1,N}\text{]}\\
&= Np\,\frac{(N-1)!}{(N-k)!}\,p^{k-1} \quad\text{[by the inductive hypothesis]}\\
&= \frac{N!}{(N-k)!}\,p^{k}.
\end{align*}

Proof of Lemma 2.2.2. For the moment consider $i \ge j > 0$; the situation where $i \ge j = 0$ will be treated separately at the end. Using Equation (2.7), we get
\begin{align*}
|s(i+2, j; 1)| &= (i+2)!\sum_{\{r_1,\ldots,r_j\}\in\tau^{i}_{j}}\frac{1}{r_1 r_2\cdots r_j}\\
&\quad+(i+2)!\sum_{\{r_1,\ldots,r_{j-1}\}\in\tau^{i}_{j-1}}\frac{1}{(i+1)\,r_1\cdots r_{j-1}}\\
&\quad+(i+2)!\sum_{\{r_1,\ldots,r_{j-1}\}\in\tau^{i}_{j-1}}\frac{1}{(i+2)\,r_1\cdots r_{j-1}}\\
&\quad+(i+2)!\sum_{\{r_1,\ldots,r_{j-2}\}\in\tau^{i}_{j-2}}\frac{1}{(i+1)(i+2)\,r_1\cdots r_{j-2}}.
\end{align*}
Now for $i \ge j > 0$ consider the function $F_{i,j} : \tau^{i}_{j-1}\to\tau^{i}_{j}$ ($F_{i,j}$ is a function which takes sets to sets) defined as
\[
F_{i,j}(\{r_1, r_2, \ldots, r_{j-1}\}) = \{l, r_1, r_2, \ldots, r_{j-1}\},
\]
where $l$ is the least number in $\{1, 2, \ldots, i\}$ which is not in $\{r_1, r_2, \ldots, r_{j-1}\}$. For every $K\in\tau^{i}_{j}$, $|F_{i,j}^{-1}(K)| \le j \le i$. Also, for all $\{r_1, r_2, \ldots, r_{j-1}\}\in\tau^{i}_{j-1}$,
\[
\frac{1}{(i+1)\,r_1 r_2\cdots r_{j-1}} \le \frac{1}{\prod_{g\in F_{i,j}(\{r_1, r_2, \ldots, r_{j-1}\})}g}.
\]
It follows that
\begin{align*}
\sum_{\{r_1,\ldots,r_{j-1}\}\in\tau^{i}_{j-1}}\frac{1}{(i+1)\,r_1\cdots r_{j-1}} &\le \sum_{\{r_1,\ldots,r_{j-1}\}\in\tau^{i}_{j-1}}\frac{1}{\prod_{g\in F_{i,j}(\{r_1,\ldots,r_{j-1}\})}g}\\
&\le \sum_{\{r_1,\ldots,r_j\}\in\tau^{i}_{j}}\big|F_{i,j}^{-1}(\{r_1,\ldots,r_j\})\big|\,\frac{1}{r_1 r_2\cdots r_j}\\
&\le i\sum_{\{r_1,\ldots,r_j\}\in\tau^{i}_{j}}\frac{1}{r_1 r_2\cdots r_j}.
\end{align*}
Thus
\[
i\,\frac{(i+2)!}{i!}\,|s(i, j; 1)| \ge (i+2)!\sum_{\{r_1,\ldots,r_{j-1}\}\in\tau^{i}_{j-1}}\frac{1}{(i+1)\,r_1\cdots r_{j-1}}.
\]
Similarly,
\[
i\,\frac{(i+2)!}{i!}\,|s(i, j; 1)| \ge (i+2)!\sum_{\{r_1,\ldots,r_{j-1}\}\in\tau^{i}_{j-1}}\frac{1}{(i+2)\,r_1\cdots r_{j-1}}
\]
and
\[
i(i-1)\,\frac{(i+2)!}{i!}\,|s(i, j; 1)| \ge (i+2)!\sum_{\{r_1,\ldots,r_{j-2}\}\in\tau^{i}_{j-2}}\frac{1}{(i+1)(i+2)\,r_1\cdots r_{j-2}}.
\]
Using the above three bounds in the initial equation for $|s(i+2, j; 1)|$, we get
\begin{align}
|s(i+2, j; 1)| &\le \big((i+1)(i+2) + 2(i+1)(i+2)i + (i+1)(i+2)i(i-1)\big)\,|s(i, j; 1)| \tag{2.11}\\
&\le \big((i+1)(i+2) + 2(i+1)(i+2)i + (i+1)(i+2)i(i-1) + 4\big)\,|s(i, j; 1)|. \tag{2.12}
\end{align}
The polynomial $\big((x+1)(x+2) + 2(x+1)(x+2)x + (x+1)(x+2)x(x-1)\big) + 4$ is defined to be $f$; we have shown above that it satisfies the prescribed properties for $i \ge j > 0$.

When $i > j = 0$, $|s(i+2, 0; 1)| = (i+1)(i+2)\,|s(i, 0; 1)|$; when $i = j = 0$, $|s(2, 0; 1)| < f(0)\,|s(0, 0; 1)|$ by direct computation. So (2.12) still holds.


Chapter 3

Estimation of parameters for the Abelian distribution

3.1 U statistics

The theory of U-statistics is a common way of generating unbiased estimators of parameters from empirical data. It was first introduced in [56]; detailed expositions may be found in [34, 64, 45]. Let $\mathcal P$ be a family of probability distributions on $\mathbb R$, and let $\theta : \mathcal P \to \mathbb R$ be a function called the "estimable parameter". We are given an i.i.d. sequence of random variables $(X_1, X_2, \ldots)$, with each $X_i \sim P$, where $P$ is an unknown distribution in $\mathcal P$. For any $J = \{i_1, i_2, \ldots, i_c\} \subset \{1, 2, \ldots, m\}$, let $x = (x_1, x_2, \ldots, x_m)$ be any point in $\mathbb R^{m}$; then
\[
x_J := (x_{i_1}, x_{i_2}, \ldots, x_{i_c}).
\]
We design a symmetric function $h : \mathbb R^{m} \to \mathbb R$ called the kernel; for every $n \ge m$,
\[
U_{n,h}(X_1, X_2, \ldots, X_n) := \frac{1}{\binom{n}{m}}\sum_{\substack{J\subset\{1,2,\ldots,n\}\\ |J| = m}} h(X^{n}_{J}),
\]
where the random vector $X^{n}$ is equal to $(X_1, X_2, \ldots, X_n)$. The kernel is designed to ensure that for all $P \in \mathcal P$ and $n \ge m$,
\[
E_P[U_{n,h}] = \theta(P). \tag{3.1}
\]
Given $\theta : \mathcal P \to \mathbb R$, the smallest $m$ for which an $h$ satisfying (3.1) can be constructed is called the rank of $\theta$. Henceforth we always assume $h$ satisfies (3.1); statistics like $U_{n,h}$ are called U-statistics. It is known that if $\mathcal P$ is "large enough", the unbiased estimator with least variance is a U-statistic (Hodges-Lehmann theorem). We also assume that
\[
\int_{\mathbb R^{m}}|h(x_1, x_2, \ldots, x_m)|\,\prod_{i=1}^{m}dP(x_i) < \infty.
\]
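For concreteness, here is a small R illustration (ours, not from the text) of a U-statistic with $m = 2$ and kernel $h(x_1, x_2) = (x_1 - x_2)^2/2$, which is an unbiased estimator of the variance; it coincides with the usual unbiased sample variance.

    # U-statistic with kernel h(x1, x2) = (x1 - x2)^2 / 2 (unbiased estimator of the variance)
    u_statistic <- function(x, h) {
      J <- combn(length(x), 2)                         # all index subsets of size m = 2
      mean(apply(J, 2, function(idx) h(x[idx[1]], x[idx[2]])))
    }
    h_var <- function(x1, x2) (x1 - x2)^2 / 2
    x <- rnorm(200, mean = 3, sd = 2)
    c(U = u_statistic(x, h_var), sample_var = var(x))  # the two values agree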

U-statistics can be used to estimate confidence intervals for the parameter $\theta$; this is on account of a theorem due to Hoeffding. Before stating the theorem we introduce some notation. Let $J \subset \{1, 2, \ldots, n\}$, say $J = \{i_1, i_2, \ldots, i_{|J|}\}$; define
\[
x_J = (x_{i_1}, x_{i_2}, \ldots, x_{i_{|J|}}).
\]


Given a sequence of random variables ("observations") $X_1, X_2, \ldots$, for any integer $s$ the order statistic is $X^{s} = (X_{(1)}, X_{(2)}, \ldots, X_{(s)})$ (see [34], Chapter 1, p. 3). A few related auxiliary objects that frequently appear in U-statistics are
\begin{align*}
h_c(x_1, \ldots, x_c) &:= \int h(x_1, \ldots, x_c, y_1, y_2, \ldots, y_{m-c})\,P(dy_1)P(dy_2)\cdots P(dy_{m-c}), &&\forall c \in \{0, 1, \ldots, m\},\\
\tilde h_c(x) &:= \sum_{i=1}^{c}(-1)^{c-i}\sum_{\substack{J\subset\{1,2,\ldots,c\}\\ |J| = i}}h_i(x_J), &&\forall c \in \{0, 1, \ldots, m\}\ \&\ x\in\mathbb R^{c},\\
\zeta_c &:= \mathrm{Var}[h_c(X_1, \ldots, X_c)], \qquad \delta_c := \mathrm{Var}[\tilde h_c(X_1, \ldots, X_c)], &&\forall c \in \{0, 1, \ldots, m\}. \tag{3.2}
\end{align*}
From Jensen's inequality observe that $\zeta_1 \le \zeta_2 \le \cdots \le \zeta_m$; indeed, Theorem 1.2.3 of [34] says that under the assumption $\zeta_m < \infty$,
\[
\frac{\zeta_1}{1} \le \frac{\zeta_2}{2} \le \cdots \le \frac{\zeta_c}{c} \le \cdots \le \frac{\zeta_m}{m}, \qquad\text{and}\qquad n\,\mathrm{Var}[U_{n,h}] \downarrow \zeta_1 \text{ as } n\to\infty. \tag{3.3}
\]
A version of Hoeffding's theorem is (Theorem 1.3.1, [34]):

Theorem 3.1.1. If $\mathrm{Var}[h(X_1, X_2, \ldots, X_m)]$ is finite then
\[
\sqrt n\,(U_{n,h} - \theta) \Rightarrow \mathcal N(0, m\zeta_1).
\]

If some estimate of or bounds for $\zeta_1$ are available, then Theorem 3.1.1 can be used to find confidence intervals for $\theta$. See Example 1.3.2 of [34]. Define
\[
\mathcal A_n := \sigma(X_1, X_2, \ldots, X_n), \quad\text{for any integer } n,
\]
\[
\mathcal F^{n} := \sigma(X^{n}, X_{n+1}, \ldots, X_i, \ldots), \quad\text{for any integer } n, \qquad \mathcal F^{\infty} = \bigcap_{n=1}^{\infty}\mathcal F^{n}. \tag{3.4}
\]
From Lemma 1.1.1 and Theorem 1.1.2 of [34] we gather that:

Theorem 3.1.2. $U_n$ is a reverse martingale w.r.t. $\mathcal F^{n}$, viz. $E[U_n\,|\,\mathcal F^{n+1}] = U_{n+1}$ for all $n$. This is equivalent to $E[U_m\,|\,\mathcal F^{n}] = U_n$ for all $n, m$ satisfying $n \ge m$. Moreover, the martingale convergence theorem yields that there exists a random variable $U_\infty = E[U_m\,|\,\mathcal F^{\infty}]$ such that $U_n \to U_\infty$ both in $L^1$ and almost surely. Further, it can be said that $U_\infty = \theta$ a.s.

The kernel $h$ is degenerate w.r.t. $P$ if for all $x \in \mathbb R^{m-1}$ we have $\int h(x, y)\,P(dy) = 0$. For any kernel $h$ and any integer $m \ge c > 0$, $\tilde h_c$ is a degenerate kernel. A notable consequence of $h$ being a degenerate kernel is the following (for a proof see [34], Lemma 1.2.2):

Theorem 3.1.3. Let $h$ be degenerate; then $\binom{n}{m}U_{n,h}$ is a martingale w.r.t. $\mathcal A_n$.

Theorem 3.1.1 can often be strengthened to a great extent. We discuss two cases:

1. The case when $h$ is degenerate. This implies that $\theta = 0$ and $h_1 \equiv h_2 \equiv \cdots \equiv h_{m-1} \equiv 0$; consequently $\zeta_1 = \zeta_2 = \cdots = \zeta_{m-1} = 0$. In such a case we get from Example 2.2.7 of [34]:


Theorem 3.1.4. If $\mathrm{Var}[h(X_1, X_2, \ldots, X_m)]$ is finite then
\[
n^{m/2}\,U_{n,h} \Rightarrow \int_0^1\!\!\int_0^1\!\!\cdots\int_0^1 \psi(x_1, x_2, \ldots, x_m)\,dB_1\,dB_2\cdots dB_m,
\]
where $B_1, B_2, \ldots, B_m$ are some Brownian bridges, the convergence is in distribution, and
\[
\psi(x_1, x_2, \ldots, x_m) = h\big(F^{-1}(x_1), F^{-1}(x_2), \ldots, F^{-1}(x_m)\big).
\]

2. The case when $h$ is not degenerate, but there exists $c \le m$ such that $\zeta_1 = \zeta_2 = \cdots = \zeta_{c-1} = 0$ and $\zeta_c \ne 0$. From equation 1.2.7 of [34] we can gather that $h_1 \equiv h_2 \equiv \cdots \equiv h_{c-1} \equiv 0$. Using Theorem 1.2.4 of [34], Example 2.2.7 of [34], and the fact that $\tilde h_r$ is degenerate for all $1 \le r \le m$ (Lemma 1.2.1, [34]), we arrive at:

Theorem 3.1.5. If $\mathrm{Var}[h(X_1, X_2, \ldots, X_m)]$ is finite then
\[
n^{c/2}\,(U_{n,h} - \theta) \Rightarrow \binom{m}{c}\int_0^1\!\!\int_0^1\!\!\cdots\int_0^1 \psi_c(x_1, x_2, \ldots, x_c)\,dB_1\,dB_2\cdots dB_c,
\]
where $B_1, B_2, \ldots, B_c$ are some set of Brownian bridges, and
\[
\psi_c(x_1, x_2, \ldots, x_c) = h_c\big(F^{-1}(x_1), F^{-1}(x_2), \ldots, F^{-1}(x_c)\big).
\]

Remark. Central limit theorems like Theorem 3.1.1 can be used to find confidence intervals for estimates. The obstacle to using Theorem 3.1.4 and Theorem 3.1.5 for such a purpose comes from the fact that the distribution of the limiting random variable is not well understood. To overcome this, almost sure central limit theorems are used.

A second, more pressing deficiency of the present theory is that the finite variance condition is imposed in all the CLT-type theorems. There is a theory that treats this issue; it is connected to p-stable motions.

3.2 Adaptations to U statistics for analyzing heavy-tailed distributions

3.2.1 Using weights that have p-stable distributions

Here we outline how one weakens the finite variance criterion, which is unsuitable as detailed in Remark 3.1. Stable distributions [81] have long been of interest in mathematics and statistics; we briefly describe the properties (see [81] for full proofs) that will prove useful to us.

Definition 3.2.1. A non-degenerate distribution $\mathcal X$ on $\mathbb R$ is called stable if for each $n \in \mathbb N$ there exist $c_n > 0$ and $d_n \in \mathbb R$ such that for any $\{X_i\}_{i=1}^{n}$ i.i.d. with $X_i \sim \mathcal X$, we have $\sum_{i=1}^{n}X_i \sim c_n X + d_n$, where $X$ is a random variable with distribution $\mathcal X$.

Stable distributions form a four-parameter family of distributions; for example, it can be shown that $c_n = n^{\frac 1p}$ with $p \in (0, 2]$, and this quantity $p$ is called the stability parameter. Also, if $E[X] = 0$ then $d_n = 0$ for all $n$. Stable distributions are absolutely continuous with respect to Lebesgue measure. A useful property is:

Theorem 3.2.1. Let $X$ be a random variable with a stable distribution having stability parameter $p$; then if $0 < p < 2$, we have
\[
P(X > n) \sim C_1 n^{-p}, \qquad P(X < -n) \sim C_2 n^{-p}, \tag{3.5}
\]
where $C_1, C_2$ are parameters which depend on the particular $X$. Let $f$ be the density of the distribution of $X$; then
\[
f(x) \sim C_3 x^{-p-1}, \qquad f(-x) \sim C_4 x^{-p-1}, \tag{3.6}
\]
where $C_3, C_4$ are parameters which depend on the particular $X$.

This means that for $p > 1$ stable distributions have well-defined first moments. In such a case the mean is called the location parameter of the distribution.

Let $(\Omega, F, P)$ be a filtered probability space, with $F_t$, $t \in \mathbb R_+$, being the filtration. Then for any $0 < p < 2$, $M_t$ is a p-stable motion adapted to the filtration $F_t$ if it has stationary independent increments and for all $s > s' \ge 0$, $\lambda \in \mathbb R$ it satisfies
\[
E\Big[e^{i\lambda(M(s)-M(s'))}\,\Big|\,F_{s'}\Big] = e^{-(s-s')|\lambda|^{p}}.
\]
The theory of integration with respect to p-stable motion was developed in [89]. In the aforementioned study it was proven that, given a p-stable motion $M_t$, an agreeable definition of $\int_s^t Y_u\,dM_u$ can be developed when $Y$ is an $F_t$-adapted process satisfying, for all $t > 0$,
\[
P\Big[\int_0^t|Y_s|^{p}\,ds < \infty\Big] = 1.
\]

Remark. Here the increments are stable distributions with stability parameter $p$, location parameter (which is the mean) equal to 0, scale parameter equal to $t-s$ (the size of the increment), and skew parameter equal to 0. Setting $p = 2$ would yield Brownian motion.

Initially, to state results, we assume that $E[h(X_1, X_2, \ldots, X_m)] = \theta = 0$; later we will relax this. Say for some $r > 1$:
\[
E[|h(X_1, X_2, \ldots, X_m)|^{r}] < \infty; \tag{3.7}
\]
for $r < 2$ this is a weaker assumption than the finite variance condition of Section 3.1. Let $Y_1, Y_2, \ldots$ be a sequence of i.i.d. p-stable random variables which are independent of the sequence $X_1, X_2, \ldots$. Then from Theorem 3.3 of [33] one obtains:

Theorem 3.2.2. For $r > p$ we have
\[
\frac{1}{n^{\frac mp}}\sum_{i_1<i_2<\cdots<i_m\le n}h(X_{i_1}, X_{i_2}, \ldots, X_{i_m})\,Y_{i_1}Y_{i_2}\cdots Y_{i_m} \Rightarrow G_p, \tag{3.8}
\]
where $G_p$ is some distribution whose precise formulation may be presented in terms of the distribution function of $X_1$ and stochastic integrals against p-stable motions.

To exploit Theorem 3.2.2 we introduce the weighted U-statistic
\[
U^{W}_{n,h} := \frac{1}{\binom{n}{m}}\sum_{i_1<i_2<\cdots<i_m\le n}h(X_{i_1}, X_{i_2}, \ldots, X_{i_m})\,Y_{i_1}Y_{i_2}\cdots Y_{i_m}.
\]


Abusing notation, we shall refer to $U^{W}_{n,h}$ as $U_{n,h}$. From Theorem 3.2.2 it is clear that
\[
\frac{n^{m-\frac mp}}{m!}\,U_{n}(h) \Rightarrow G_p. \tag{3.9}
\]
If $\theta \ne 0$, we replace $h$ in (3.9) with $h - \theta$; then, provided one has a thorough understanding of $G_p$, confidence intervals for $\theta$ may be obtained. The issue of deriving the distribution of $G_p$ leads us to almost sure central limit theorems.
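Since base R has no sampler for general stable laws, the following sketch (our own; the helper name rstable_sym is not from the text) draws symmetric p-stable variates via the Chambers-Mallows-Stuck recipe for the symmetric (skew 0) case; it is reused in the Section 3.3.1 sketch below.

    # Symmetric p-stable variates (Chambers-Mallows-Stuck, skew parameter 0)
    rstable_sym <- function(n, p) {
      U <- runif(n, -pi / 2, pi / 2)
      W <- rexp(n)
      sin(p * U) / cos(U)^(1 / p) * (cos((1 - p) * U) / W)^((1 - p) / p)
    }
    Y <- rstable_sym(1e5, 1.7)
    quantile(abs(Y), c(0.5, 0.99, 0.999))   # heavy tail: extreme quantiles grow quickly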

3.2.2 Almost sure central limit theorems in statistics

The earliest example of an almost sure central limit theorem (ASCLT) goes back to Erdős and Hunt [42]. Take the example of i.i.d. random variables $X_1, X_2, \ldots$ which have common distribution $X$. Say $X$ has mean $\theta$ and variance 1, where $\theta$ is a parameter to be estimated. For all $n > 0$ set $\bar X_n := X_n - \theta$ and $S_n := \sum_{i=1}^{n}\bar X_i$. For any set $A \subset \mathbb R^{d}$ such that the boundary of $A$ has Lebesgue measure 0, the following is known from classical results:
\[
\frac{S_n}{\sqrt n} \Rightarrow \mathcal N(0, 1) \implies P\Big[\frac{S_n}{\sqrt n}\in A\Big] \to P[\mathcal N(0, 1)\in A], \tag{3.10}
\]
where $\Rightarrow$ represents weak convergence and $\mathcal N(0, 1)$ represents a random variable with standard normal distribution. This does not mean that
\[
\lim_{n\to\infty}\frac 1n\sum_{k=1}^{n}\mathbf 1_A\Big(\frac{S_k}{\sqrt k}\Big) = P[\mathcal N(0, 1)\in A] \quad\text{a.s.}
\]
Indeed, the following well-known result for one-dimensional random walks establishes the contrary:
\[
\liminf_{n\to\infty}\frac 1n\sum_{k=1}^{n}\mathbf 1\Big(\frac{S_k}{\sqrt k} > 0\Big) = 0 \quad\text{and}\quad \limsup_{n\to\infty}\frac 1n\sum_{k=1}^{n}\mathbf 1\Big(\frac{S_k}{\sqrt k} > 0\Big) = 1. \tag{3.11}
\]
The situation is somewhat redeemed by
\[
\lim_{n\to\infty}\frac{1}{\log n}\sum_{k=1}^{n}\frac 1k\,\mathbf 1_A\Big(\frac{S_k}{\sqrt k}\Big) = P[\mathcal N(0, 1)\in A] \quad\text{a.s.,} \tag{3.12}
\]
which was proven in [17, 92]. This is an example of an ASCLT; the result in [42] is a weaker version of (3.12). An account of such ASCLT theorems may be found in [10].

From (3.10), for any $\alpha \in (0, 1)$, we can find $n_\alpha > 0$ and $a_{\alpha,1}, a_{\alpha,2}$ such that for any fixed $n > n_\alpha$,
\[
P\Big(\sum_{i=1}^{n}\frac{X_i}{n} - \frac{a_{\alpha,1}}{\sqrt n} \le \theta \le \sum_{i=1}^{n}\frac{X_i}{n} - \frac{a_{\alpha,2}}{\sqrt n}\Big) > 1-\alpha. \tag{3.13}
\]
However, say we define $A_n$ to be the event that
\[
\frac{\Big|\Big\{n' : n' < n,\ \theta \notin \Big(\sum_{i=1}^{n'}\frac{X_i}{n'} - \frac{a_{\alpha,1}}{\sqrt{n'}},\ \sum_{i=1}^{n'}\frac{X_i}{n'} - \frac{a_{\alpha,2}}{\sqrt{n'}}\Big)\Big\}\Big|}{n} \ge \frac 12.
\]


Then $A_n$ occurs infinitely often. This shows that the data collected will very likely be "accurate" for $n$ samples, where $n$ is a particular fixed large number; however, the data collected has no hope of being "accurate" for $n$ samples for all $n$ large enough. Thus, if one wishes to use different large finite chunks from an infinite stream of data, one must use disjoint chunks. From (3.12) we see that a way around this is to weight the data collected at every epoch.

Now let us see how to apply the ASCLT to U-statistics; we will take $m = 1$. With $h$, $X_1, X_2, \ldots$, $Y_1, Y_2, \ldots$, and $U_{n,h}$ (by which we mean $U^{W}_{n,h}$) as in Section 3.2.1, the following is true for $m = 1$ ([57], Theorem 4.1):

Theorem 3.2.3. Let
\[
T_n(h) := \frac{1}{n^{m-1+\frac 1p}}\sum_{i_1<i_2<\cdots<i_m\le n}h(X_{i_1}, X_{i_2}, \ldots, X_{i_m})\,Y_{i_1}Y_{i_2}\cdots Y_{i_m},
\]
and suppose $T_n(h) \Rightarrow G_p$. If $m = 1$, then for all $A \subset \mathbb R^{d}$
\[
\lim_{n\to\infty}\frac{1}{\log n}\sum_{k=1}^{n}\frac 1k\,\mathbf 1_A\big(T_k(h)\big) = G_p(A) \quad\text{a.s.} \tag{3.14}
\]

Theorem 3.2.3 has an extension for $m > 1$; the idea, however, is to use Theorem 3.2.3 and Theorem 3.2.2 together, and for $m \ne 1$ the scalings in Theorems 3.2.3 and 3.2.2 are not compatible, so we do not state the general result.

One must keep in mind that the purpose of CLT-type theorems in statistics is to establish confidence intervals after a single-value estimate has been obtained. With this in mind, the outline of how to approach estimation problems under (3.7) is: first establish an approximation $\hat\theta$ of $\theta$ by standard SLLN-type results, then use Theorem 3.2.3 and Theorem 3.2.2 in tandem. Using $T_n(h - \hat\theta)$ in Theorem 3.2.3, obtain an estimate of $G_p$; finally, use (3.9) with $h$ replaced by $h - \hat\theta$ to get a confidence interval for $\theta$.

3.3 Simulations for neural avalanche data

Here we apply the methods of Section 3.2 to data for neural avalanches. The data appear as a power law with exponent $-1.5$ and a cut-off near the system size (the number of neurons being observed) ([5], [84]). Because of the cutoff, the distributions are in principle finite-variance distributions. However, if the cut-off values are large, the variance becomes extremely large and statistical estimation becomes complicated, as outlined in Remark 3.1.

The Abelian distribution $Z_{N,p}$ from Definition 2.1.1 is regarded [44] as the underlying distribution that generates the neural data. The parameter $N$ is taken to be the system size (where the cut-off occurs), and $\alpha$ is considered to be the connection strength, where $p = \frac{\alpha}{N}$. One may show that $\lim_{\alpha\to 1}\lim_{N\to\infty}Z_{N,p}$ converges to a distribution over $\mathbb N$ with a power-law tail of exponent $-1.5$. The value $\alpha = 1$ corresponds to the critical system. Usually the brain operates close to criticality.

On the one hand, non-critical systems have finite second moment, so CLT methods are valid for establishing confidence intervals; on the other hand, critical systems do not have a finite first moment. Thus it would seem that the p-stable method based on Theorem 3.2.3 and Theorem 3.2.2 is unnecessary for non-critical systems and unusable for critical ones. However, the brain works close to criticality, and from Theorem 2.3.1 and Lemma 2.1.1 it is clear that when $\alpha$ is close to 1, both the variance and the mean of $Z_{N,p}$ are very large, with the variance much larger than the mean. In this setting the treatment we developed for estimation under the finite-mean, infinite-variance criterion seems reasonable, and the simulations below show it to be so.

3.3.1 Outline of the p-stable method in the context of the sample mean

This is how we apply the p-stable method in the case when the U-statistic is the sample mean, viz. $h(x) = x$. First we spell out a crude algorithm to which Theorem 3.2.3 and Theorem 3.2.2 are applicable directly; then we tweak it slightly, for reasons given in due course. The data are $X_1, X_2, \ldots, X_n$, which are i.i.d.; we generate independent i.i.d. instances of p-stable random variables $Y_1, Y_2, \ldots, Y_n$. The stable distribution in question has location parameter equal to 0, skew parameter equal to 0, scale parameter equal to 1, and stability parameter equal to $p$. In case $E[|X_1|^{2}] = \infty$ and $r := \sup\{l : E[|X_1|^{l}] < \infty\}$ satisfies $1 < r < 2$, we choose $p$ very close to, but less than, $r$. On the other hand, if $E[|X_1|^{2}] < \infty$, the choice of $p$ is arbitrary, with $1 < p < 2$ being the only constraint.

i. The point estimate is calculated as
\[
\hat\mu = \sum_{i=1}^{n}\frac{X_i}{n}.
\]

ii. Now we calculate the quantities $\overline{XY}$ and $\overline{Y}$ from the data:
\[
\overline{XY} = \sum_{i=1}^{n}\frac{X_iY_i}{n}, \qquad \overline{Y} = \sum_{i=1}^{n}\frac{Y_i}{n}. \tag{3.15}
\]
We also calculate the quantity $U_{n,p}$ from the data:
\[
U_{n,p} = \frac 1n\sum_{i=1}^{n}(X_i - \hat\mu)Y_i. \tag{3.16}
\]
The convergence results are:
\[
n^{1-\frac 1p}\,U_{n,p} \Rightarrow G_p, \qquad n^{-\frac 1p}\sum_{i=1}^{n}(X_i - \hat\mu)Y_i \Rightarrow G_p. \tag{3.17}
\]
Let $ci$ be some prescribed confidence level. We are able to use almost sure central limit theorems to estimate the distribution of $G_p$; say we conclude that $L \le G_p \le U$ holds with probability $ci$. Then it follows that
\[
E_1 \le \mu \le E_2, \qquad E_1 = \frac{\overline{XY} - \frac{U}{n^{1-1/p}}}{\overline{Y}}, \qquad E_2 = \frac{\overline{XY} - \frac{L}{n^{1-1/p}}}{\overline{Y}}. \tag{3.18}
\]


Remark. We will actually have $Y_1, Y_2, \ldots, Y_n$ follow stable distributions with location parameter equal to 1 instead of 0 (everything else is as before). The reason such an accommodation is necessary is that $E_1 - E_2$, the length of the confidence interval, has the term $n^{1-1/p}\,\overline Y$ in the denominator. This random variable has the distribution of $Y_1$, and we want it to be bounded away from 0. At first glance it might seem that allowing $E[Y_i] = 1$ invalidates the convergence result of Theorem 3.2.2, but this is overcome as follows:
\[
\frac{1}{n^{\frac 1p}}\sum_{i=1}^{n}(X_i - \hat\mu)(Y_i - 1) \Rightarrow G_p,
\]
\[
\frac{1}{n^{\frac 1p}}\sum_{i=1}^{n}(X_i - \hat\mu)Y_i \;-\; \frac{1}{n^{\frac 1p}}\sum_{i=1}^{n}(X_i - \hat\mu) \Rightarrow G_p.
\]
Now, because there is an $r$ satisfying $p < r < 2$ with $E[|X_1|^{r}] < \infty$, we may conclude [20] that $\frac{1}{n^{1/p}}\sum_{i=1}^{n}(X_i - \hat\mu) \Rightarrow 0$.

If the data are from neural LFP avalanche recordings, we can assume that they follow the Abelian distribution with system size $N$. From Lemma 2.1.1 it is clear that $\lim_{N\to\infty}E[Z_{N,p}] = \frac{1}{1-\alpha}$, so from (3.18) it follows that
\[
1 - \frac{1}{E_1} \le \alpha \le 1 - \frac{1}{E_2}.
\]

[Figure 3.1 appears here: three panels (xm = 10^5, 6 × 10^5, 8 × 10^5) plotting, for each technique (CLT and p-stable), the range of the resulting confidence interval for α.]

Figure 3.1: CLT and p-stable methods (the p value used is 1.7) for calculating confidence intervals for α for three different values of xm. On the x-axes we indicate the method used to obtain the confidence interval for α. On the y-axes is shown the range of the 4% confidence interval obtained for each method. Red dots indicate the ends of the confidence intervals. The blue 5 symbol indicates that a lower bound for the confidence interval cannot be calculated using the method in question. To calculate confidence intervals we use 1000 instances of synthetic data. The points indicated by × show the sample mean calculated from 900000 instances of synthetic data. The inset in the leftmost panel shows the p-stable results for this case more prominently.

We summarize our findings in Figure 3.1. We use the programming language R to do the simulations; the data are synthetic, generated from 1.5 power laws with cut-off 10^6 (which is very reasonable for avalanche data), and the number of samples is 1000. The seed is 1990.
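The computation in (3.15)-(3.18) is straightforward once quantile estimates L and U for G_p are available. The following R sketch is an illustration added here (not the script used for Figure 3.1): it assumes the stabledist package for the p-stable variates, and the values of L and U are hypothetical placeholders, whereas in the text these quantiles are obtained via almost sure central limit theorems.

## Minimal sketch of the p-stable confidence interval of Eq. (3.18).
## Assumes the 'stabledist' package; L and U below are placeholder quantiles of G_p.
library(stabledist)

set.seed(1990)
p  <- 1.7                                  # stability parameter used in Fig. 3.1
xm <- 1e5                                  # power-law cut-off for one panel
n  <- 1000                                 # number of samples

## synthetic data: discrete 1.5 power law truncated at xm
k  <- 1:xm
X  <- sample(k, n, replace = TRUE, prob = k^(-1.5))

## auxiliary p-stable variates: location 1, skew 0, scale 1, stability p
Y  <- rstable(n, alpha = p, beta = 0, gamma = 1, delta = 1)

XYbar <- mean(X * Y)
Ybar  <- mean(Y)

## hypothetical quantiles L, U of G_p for the chosen confidence level
L <- -2.5; U <- 2.5

E1 <- (XYbar - U / n^(1 - 1/p)) / Ybar
E2 <- (XYbar - L / n^(1 - 1/p)) / Ybar
c(lower = E1, upper = E2)                  # confidence interval for the mean
c(alpha_lower = 1 - 1/E1, alpha_upper = 1 - 1/E2)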


3.3.2 Discussion

Let N denote the cut-off value; formally, the underlying distribution is a power law on [1, N]. It should be noted that the bigger the value of N, the larger the ratio of the variance to the mean. Let n denote the number of observations from which we make an inference. A crude schematic comparison of the p-stable and the CLT methods, based on the theoretical convergence results that underpin their existence, can be made as follows:

For p-stable: $n^{1-\frac1p}\,[\text{Some Statistic}] \Rightarrow \text{Some Distribution}$

For CLT: $n^{\frac12}\,[\text{Some Statistic}] \Rightarrow \text{Some Distribution}$

When there is a lot of data, i.e. n is high, we have $n^{\frac12} \gg n^{1-\frac1p}$ (since $p \in (1,2)$). So the CLT method works better in such a setting. When n is not very large, this "exponent argument" is no longer the critical factor. When we have a very sparse amount of data, the p-stable method works better, since the underlying convergence results require milder moment conditions. The trade-off between moment conditions and the "exponent argument" is again the crucial factor when it comes to choosing a value of p. Lower values of p are on one hand unsuitable because they have a lower exponent in the underlying convergence result. However, with lower values of p the p-stable method requires milder moment conditions to be valid.


Part III :

External input and neuronal avalanches


Chapter 4

Background and organization

A brief summary of the work: Power laws and other signs of complexity are often found in the natural world. To create a general framework explaining their emergence, a theory of self-organized criticality (SOC) was proposed and has subsequently been extensively developed over the last 30 years. It considers systems that reach and maintain closeness to criticality, in the form of a second-order phase transition, by their own dynamics. The two most central conditions for a system to be SOC are local energy preservation and separation of time scales. The latter implies that external input is delivered to the system on a much slower time-scale than the time-scale of the internal dynamics. Criticality was shown to be a good state for various computations, leading to the conjecture that the brain also operates close to the critical point and thus presents an example of an SOC system. However, for neuronal systems that are constantly bombarded by incoming signals, a separation-of-time-scales assumption is highly unnatural. Here we make a first step towards an understanding of the importance of time-scale separation by allowing for external input during the avalanche, without compromising the separation of avalanches. We develop an analytic treatment and provide numerical simulations of a simple neuronal model. We show that although a power-law scaling remains, even very moderate external input leads to a change in the power-law exponent and thus the universality class. Our analytic results derived for the simplified model are unchanged if we consider a branching network model. We obtain a new characteristic exponent of 1.25, whereas for the perfectly time-scale-separated system an exponent of 1.5 is expected. Our results indicate that additional drive can change the universality class of the system and lead to the 1.25 exponent that was also previously observed in cracking dynamics and models of amorphous plasticity. The work in this part was done in collaboration with Dr Anna Levina.

Organization: The work here can be divided into two parts. The first part focuses on the implications of our mathematical results for physical systems. Here we use our mathematical results to motivate simulations on more realistic established models. The mathematical results are stated and described only to the extent needed for explaining their implications for the phenomenon under consideration. We then seek to understand other SOC models from the perspective gained for neuronal avalanches. The second part concerns the rigorous mathematical results. We first introduce a robust rigorous analytic framework for describing neuronal avalanches. Although this framework simplifies a complex physical phenomenon, it


is analytically tractable while preserving the essential details of the neural phenomenon. That much of the predicted behavior is observed in more complex models shows that the model described here is not an oversimplification. The main mathematical tools used relate to discrete probability, although an interesting link to enumerative graph theory is identified.


Chapter 5

Neuroscience interpretations

Recently, many power-law distributions have been reported in various types of neuronal systems, varying from cortical slices from rats [5] to in vivo recordings in humans [84]. A prominent explanation for the abundance of observed power laws is the closeness of brain networks to critical states. Indeed, such a notion has been shown to bring about optimal computational capabilities [11], optimal transmission and storage of information [12], and sensitivity to sensory stimuli [60]. For the brain, which is constantly bombarded with different types of stimuli and permanently updates connection strengths, staying close to criticality requires a self-organization that has the critical state as an attractor.

Decades before the first neuronal experiments on criticality, the concept of Self-Organized Criticality (SOC) was proposed to explain the emergence of complexity [1]. Typically, an SOC system is a slowly driven, intermittent system with non-linear interactions. Accumulation of the slowly delivered external drive leads to cascades of relaxations, whose sizes, measured as the number of sites participating in a cascade, are distributed according to a power law. For neuronal systems, these cascades were termed "neuronal avalanches". Preceding the experimental results, the first models of SOC in neuronal networks appeared [44]. Later the model was extended to be truly self-organized [66], and many other models appeared as well [32, 75]. Separation of time scales, which ensures the absence of input during the avalanche, has long been considered to be an essential ingredient of SOC models [59, 39]. It implies that no external input is delivered to the system before it reaches the stable configuration and the avalanche is over. However, in neuronal systems inputs arrive constantly and there is no chance for a strict timescale separation.

We study how the relaxation of the time-scale separation condition influences the avalanche size distribution. The first step towards understanding the importance of time-scale separation is to allow for external input during the avalanche without compromising the separation of the avalanches. We develop an analytic treatment and provide numerical simulations of a simple neuronal model. We show that a power-law scaling is preserved; however, even very moderate external input leads to a change of the slope of the avalanche size distribution. Our analytic results derived for the simplified model are unchanged if we consider a branching network model. We obtain a new characteristic exponent of −1.25, whereas for the perfectly time-scale-separated system an exponent of −1.5 is expected. In contrast to our analytic conclusions, most results reported for spiking data (which has the least time-scale separation) provide power-law exponents of ≈ −2, pointing to the idea that the identification of avalanches in neuronal recordings is strongly influenced by the input.

5.1 Models

For our analytical and numerical investigations, we will use the following two models. The Branching Model (BM) [28, 53] is a standard model for studying abstract signal propagation that serves as a simplified model for neuronal avalanches. For our studies, we equip the standard BM with an additional input process during avalanches. Unfortunately, the BM does not allow for a complete analytic description for a finite network. To overcome this difficulty, we introduce a simpler Levels Model (LM). We carry out a rigorous mathematical study of the LM, and then check in simulations that results for the BM are the same as for the LM. In the limit of infinite system size, both the LM and the BM are well approximated by branching processes [70, 68].

5.1.1 The Branching Model (BM)

The Branching Model (BM) consists of N neurons connected as in the Erdos-Renyi random graph with probability of connection $p_{conn}$. It was first used for studying benefits of criticality in neural systems [60] and later was employed in many modeling investigations of neuronal avalanches [52, 95, 71, 65]. Every edge in the network is assigned a weight $p_{ij} = \sigma/(p_{conn}N)$; as a result the average sum of all outgoing weights equals σ. Each node denotes a neuron that can be in one of n states, and $c_i(t)$ denotes the state of the i-th node at time t: $c_i = 0$ indicates a resting state, $c_i = 1$ is the active state, and $c_i = 2, \ldots, n-1$ are the refractory states. All states except for the active state are attained due to deterministic events: if $0 < c_i(t) < n-1$ then $c_i(t+1) = c_i(t) + 1$, and if $c_i(t) = n-1$ then $c_i(t+1) = 0$.

For every node i, the excited state $c_i(t) = 1$ can be reached only from the resting state $c_i(t-1) = 0$ in one of the following circumstances: (1) if a neighbor j is active at time t−1, then with probability $p_{ji}$ node i will get activated at time t; (2) if there are active nodes in the network, node i can receive an external stimulus with probability φ/N. The condition on the activity in the network in (2) is a major difference to previously studied models [60]; it allows us to keep the avalanche separation intact while introducing external input during the avalanche. We initiate the network with one random node set to the active state and the remaining nodes in the resting state, and observe the activity propagation (avalanche). Then we record the distribution of activity propagation sizes (measured as the number of activations during the avalanche) and durations (measured as the number of time-steps taken until activity dies out).
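A minimal R sketch of the BM dynamics with input during avalanches, as described above, is given below. It is an illustration added here, not the simulation code used in this work; the function name, the update order and the small default parameter values are my own simplifications.

## Sketch of one BM avalanche with external input at rate phi/N per resting node.
simulate_bm_avalanche <- function(N = 500, p_conn = 0.1, sigma = 1,
                                  phi = 0.3, n_states = 10) {
  p_ij <- sigma / (p_conn * N)                     # weight of each existing edge
  adj  <- matrix(runif(N * N) < p_conn, N, N)      # Erdos-Renyi adjacency (directed)
  diag(adj) <- FALSE
  state <- integer(N)                              # 0 = rest, 1 = active, 2..n-1 = refractory
  state[sample.int(N, 1)] <- 1                     # one random initial activation
  size <- 1
  while (any(state == 1)) {
    active    <- which(state == 1)
    new_state <- ifelse(state > 0, (state + 1) %% n_states, 0)  # deterministic decay
    ## activation by active presynaptic neighbours
    for (j in which(state == 0)) {
      pre <- active[adj[active, j]]
      if (length(pre) > 0 && runif(1) < 1 - (1 - p_ij)^length(pre)) new_state[j] <- 1
    }
    ## external input, delivered only while there is activity in the network
    rest <- which(state == 0 & new_state == 0)
    ext  <- rest[runif(length(rest)) < phi / N]
    new_state[ext] <- 1
    size  <- size + sum(new_state == 1)
    state <- new_state
  }
  size
}

sizes <- replicate(1000, simulate_bm_avalanche())  # sample of avalanche sizes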

It was shown [60] that in the model without input, a network can exhibit different dynamical regimes depending on the value of the parameter σ (called the branching parameter): when σ < 1 the activity dies out exponentially fast, while for σ > 1 there is a possibility of indefinite activity propagation. In the critical regime, obtained for σ = 1, the activity propagation size s is distributed as a power law with exponent 1.5. However, until now it was not known what effect additional inputs have on these distributions.


Figure 5.1: Schematic representation of the levels model without external input, for N = 6 neurons with M = 7 energy levels. The avalanche size is 4, and the avalanche duration is 3.

5.1.2 The Levels Model (LM)

The Levels Model (LM) without input is inspired by the simple network model of perfect integrators [44]. The neuronal avalanches produced by the model were shown to exhibit critical, subcritical, and supracritical behavior depending on the control parameter, similar to the experimental observations in cortical slices and cultures [5]. Different modifications of the LM were extensively studied mathematically [69, 35, 36, 68]. The version we will use here was introduced in the context of dynamical systems to prove ergodicity of avalanche transformations [36]. The main difference between the original biophysical model [44] and the LM is that the latter does not allow self-connections. However, when parameters are rescaled to accommodate the changed connectivity, the distributions of avalanche sizes and durations are the same in both models.

The LM consists of a fully-connected network of N units; each unit i is described by its energy level $e_i \in \{1, \ldots, M\}$. Connections are defined such that receiving one input changes the energy level by 1. In the language of neuronal modeling, $e_i$ is the membrane potential and the connection strength is set equal to 1. If neuron j reaches the threshold level M, it fires a spike and then we reset it: $e_j \mapsto 1$. All neurons k that are connected to j such that $e_k < M$ are updated: $e_k \mapsto e_k + 1$. After firing a spike, a neuron is set to be refractory until the activity propagation is over. We initialize the model by randomly choosing the energy levels of all neurons from independent copies of a uniform distribution on [1, M].

After initialization, all neurons at the energy level M spike, followed by dissemination of energy. If as a result more neurons reach the level M, then they are in turn discharged, and so on, until the activity stops, which happens at the latest when all neurons have fired. This propagation of activity we call an avalanche, and the number of neurons fired is its size. The progress of an avalanche in a system with N = 6 and M = 7 is demonstrated in Fig. 5.1.

We introduce external input as follows: if o is the number of neurons fired in an avalanche, we additionally activate r among the remaining N − o neurons. Here r is a random number drawn from a binomial distribution B(o, φ). The parameter φ ∈ [0, 1) represents the rate of the external input, i.e., φ is the average number of inputs delivered during an avalanche of size 1. After these r additional firings more neurons may reach the energy level M, resulting in a second cascade of firings. The process will stop after a maximum of N discharges because no neuron is allowed to fire twice. We study the dependence of the avalanche size distribution on the strength of the input. We use $A_{N,\varphi}$ to denote the random variable that counts the avalanche size. When φ = 0, we have the no-external-input regime.


[Plot omitted (log-log avalanche size distributions with a duration-distribution inset); only the caption is retained.]

Figure 5.2: Avalanche size distributions in the LM with various input strengths. The inset shows the corresponding duration distributions, which also change their exponent. The input strength φ and the power-law exponents of the guide lines are indicated in the legend. N = M = 10^5.

Our model possesses an Abelian property, namely that it does not matter in which order the neurons are discharged; the size of the avalanche will be the same regardless. This allows us to introduce external input in such a simple form.
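A minimal R sketch of the LM with external input, as just described, is given below. It is an illustration added here rather than the simulation code used for Fig. 5.2; the function name and parameter defaults are my own choices (in particular a much smaller N than 10^5), and r is drawn from B(o, φ) as in the text.

## Sketch of one LM avalanche with external input proportional to the pre-avalanche.
simulate_lm_avalanche <- function(N = 1e4, M = N, phi = 0.3) {
  e <- sample.int(M, N, replace = TRUE)       # initial energy levels, uniform on 1..M
  fired <- rep(FALSE, N)
  repeat {                                    # original avalanche
    spiking <- which(!fired & e >= M)
    if (length(spiking) == 0) break
    fired[spiking] <- TRUE
    e[!fired] <- e[!fired] + length(spiking)  # each spike raises every other neuron by 1
  }
  o <- sum(fired)                             # size of the original avalanche
  r <- min(rbinom(1, o, phi), N - o)          # external activations, r ~ B(o, phi)
  if (r > 0) {
    silent <- which(!fired)
    extra  <- silent[sample.int(length(silent), r)]
    e[extra] <- M                             # forced to threshold
    repeat {                                  # after-shock cascade
      spiking <- which(!fired & e >= M)
      if (length(spiking) == 0) break
      fired[spiking] <- TRUE
      e[!fired] <- e[!fired] + length(spiking)
    }
  }
  sum(fired)                                  # total avalanche size A_{N, phi}
}

sizes <- replicate(5000, simulate_lm_avalanche())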

5.2 Results and interpretation

5.2.1 Dependence on extent of external input for LM

In the no-external-input regime, critical behavior is observed when M = N. In this case the avalanche size probability scales as a power law, i.e., $P(A_{N,0} = k) \sim C_1 k^{-1.5}$ [36]. For the rest of the article we consider M = N, which still serves as the critical value of the parameter in the "driven" case, with φ > 0.

We let o denote the size of the avalanche that would have been observed without external drive; then there will be on average o × φ inputs. When φ = o(1/N), we can show analytically that $P(A_{N,\varphi} = k) \sim C_5 k^{-1.5}$. This is the small input regime: the perturbation of the system is not strong enough to induce significant changes in the dynamics. This result demonstrates the stability of the classical models. At the other end of the spectrum, we could force a fraction of the neurons to fire as a result of external input. Thus φ = Θ(N), where Θ is taken as in the Bachmann-Landau notation∗. In such a case we can show that $A_{N,\varphi}$ converges in distribution to a normal variable as N → ∞. Essentially, the immense external input in this regime (named the large input regime) has reduced the neuronal activity to "noise".

The most interesting case is the moderate input regime, where φ = Θ(1). In this case we can mathematically derive (for detailed derivations see Chapter 6) the following result:
$$P(A_{N,\varphi} = k) \sim C_3\,k^{-1.25}, \quad k \to \infty. \tag{5.1}$$

∗ f(n) = Θ(g(n)) if ∃ k_1 > 0, ∃ k_2 > 0, ∃ n_0 such that ∀ n > n_0 we have $k_1\cdot g(n) \le f(n) \le k_2\cdot g(n)$.


We verified (5.1) by simulating a finite LM with N = 10^5 neurons and inputs of varying strength. As expected, the 1.5 power law is transformed by the input into the 1.25 power law (see Fig. 5.2). We also numerically test the avalanche duration distribution, i.e., the number of time-steps during an avalanche. Both observables deviate from the power law in the tail because of the finite system size and the restriction on double activation; besides this, the numerical simulations support the analytic results.

In the moderate input regime, in spite of the compromised time-scale separation, power-law scaling is preserved for both the avalanche size and duration distributions. However, the power-law exponent is changed. Surprisingly, as long as φ = Θ(1) the limiting exponent remains equal to 1.25. This means "φ" does not need to be externally tuned to achieve criticality.

5.2.2 Finite size effects and numerical simulations for LM

A scaling relationship given by Eq. 5.1 is valid for any given φ if N and k are both large enough, and k/N is small enough. To define a more precise parameter relationship that will allow us to test the results in simulations, we devise a sufficient but not necessary condition for Eq. 5.1 to hold. We require k to satisfy
$$N \ge k^2. \tag{5.2}$$
And we require φ to satisfy, for some positive δ,
$$e^{-(\varphi\log(N))^2} \le N^{-0.5-\delta}. \tag{5.3}$$

For any N and φ satisfying Eq. 5.3, if k is small, i.e., φ = Θ(1/k), we get $P^{E,med}_\varphi(A_{N,p} = k) \sim C_4 k^{-1.5}$, the same as for no-input systems. However, as k grows large we get $P^{E,med}_\varphi(A_{N,p} = k) \sim C_3 k^{-1.25}$, indicating multi-fractal behavior [54]. Using mean-field approximations we show that as long as $k \le \varphi^{-2}$, we have $\lim_{k\to\infty} P^{E,med}_\varphi(A_{N,p} = k) \sim C_4 k^{-1.5}$. The simple intuition behind this argument is that for very small avalanches there is substantial probability of not receiving any external inputs. Thus the 1.5 power law characteristic of traditional models within a strong separation-of-time-scales framework is still visible. (A small numerical illustration of this crossover is given below.)
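The following short R snippet (my own rough illustration, not part of the original analysis) estimates log-log slopes of the simulated LM size distribution below and above the predicted transition $k \approx \varphi^{-2}$; it assumes the vector `sizes` produced by the LM sketch in Section 5.1.2, which used φ = 0.3 (so φ^{-2} ≈ 11).

## Rough local-slope estimates of P(s) on either side of the predicted crossover.
local_slope <- function(s, lo, hi) {
  k    <- lo:hi
  pk   <- sapply(k, function(j) mean(s == j))   # empirical probabilities
  keep <- pk > 0
  coef(lm(log(pk[keep]) ~ log(k[keep])))[2]     # fitted log-log slope
}
local_slope(sizes, 2, 11)      # expected to be near -1.5
local_slope(sizes, 12, 200)    # expected to be near -1.25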

We simulate the LM for different input strengths and observe good agreement with our analytic results (Fig. 5.4, solid lines). The aberrant behavior for large avalanche sizes is due to the finite size of the system and the imposed condition that no avalanche can be larger than the system size. The theoretical prediction for the onset of the 1.25 power-law scaling is indicated by the magenta line; this too is in good agreement with the numerical observations.

5.2.3 BM with input

A BM without external input corresponds to the situation where φ = 0; in such a scenario the probability distribution of avalanches follows a 1.5 power law [28]. Here we discuss what changes in the avalanche size distribution upon adding a moderate input. A useful characteristic of the LM is that the avalanche can be separated into two stages, an original avalanche and the after-shock avalanche that is triggered by external inputs. Although this feature makes the LM analytically tractable, it also makes its construction seem contrived. In contrast, in the BM external input is added at a fixed rate during the avalanches, while keeping the separation between the avalanches intact.


[Plot omitted (log-log avalanche size distributions); only the caption is retained.]

Figure 5.3: Avalanche size distributions in the branching model with various input strengths. The input strength φ is indicated in the legend. N = 10^5, n = 10. Distributions for φ > 0 are shifted such that they all coincide for s = 100.

[Plot omitted (log-log avalanche size distributions for LM and BM); only the caption is retained.]

Figure 5.4: Avalanche size distributions in the levels model and branching model with various input strengths. The input strength φ and the model type are indicated in the legend. The magenta line indicates the analytic prediction for the onset of the power law with exponent −1.25. For both models we take N = 10^5. To improve visibility, the distributions are shifted by multiplication with $c_\varphi = 10^{-10\varphi+1}$.

In the moderate input regime, for any suitable strength of the external signal, the exponent changes from 1.5 to 1.25 (Fig. 5.3). For large avalanches, the finite-size effects observed previously in the LM are enhanced by the possibility for the system to get additional external input during the aftershock.

For the BM, the input is delivered at a constant rate and is thus proportional to the duration of the pre-avalanche, while in the LM the input is proportional to the size of the pre-avalanche. However, both systems show very similar avalanche size distributions for various input intensities (Fig. 5.4). Let $t_r$ denote the transition point between the power law with exponent 1.5 and the power law with exponent 1.25. It so happens that in both models $t_r$ is roughly the same; analytic arguments for the LM showed $t_r \approx \varphi^{-2}$.


5.3 Conclusion

We demonstrated that moderate input during the relaxation of the system changes the power-law slopes of duration and size distributions. This result is particularly relevant for neuronalsystems, where separation of input from activity is inconceivable.

Models exhibiting criticality are classified into universality classes based on power-law exponents. Quantitative characteristics of various "emergent properties" in critical systems belonging to the same universality class are found to be similar (see [21]). By introducing external input to models from the 1.5-exponent universality class, we have changed them into models with characteristic power-law exponent 1.25. Although a 1.25 exponent is rarer than the ubiquitous 1.5 exponent, the former has been observed in several models: for example, in models for slow crack growth in heterogeneous materials [14], driven elastic manifolds in disordered media [63], fracturing processes under annealed disorder [18], mesomodels of amorphous plasticity [101], and randomly growing networks [78].

Among the assortment of examples above, a curious one comes from the study of "shear plasticity of amorphous materials in two dimensions" (see [101]). The model proposed in [101] can be described very generally as follows: it requires that the globally applied external stress be less than or equal to the difference of the local yield stress and the local residual stress at each location; this balance is the fundamental equilibrium requirement for the system. When the system is in balance, disturbances called slips are introduced to instigate avalanches. These slips result in a change in the residual stress across the system. To restore the balance, the yield stress is randomly redrawn at every location in the medium; if this does not lead to the fundamental equilibrium being satisfied, then the yield stress is randomly redrawn again. The number of times the local yield stress is redrawn before equilibrium is reached gives the avalanche size. This avalanche size is distributed according to a power law with exponent 1.25.

There are classical models studying these intermittent slip avalanches (see [108, 90]); such models show avalanche sizes distributed according to a power law with exponent around 1.5. Here is a brief broad summary of the model in [108]. Here too there is a balancing inequality which is the fundamental equilibrium requirement for the system. The inequality is violated by introducing slips. However, here the equilibrium is not restored by overhauling the yield stress at all locations independently. Instead, the random element is a fluctuating deformation resistance, which is not redrawn independently at each stage of the avalanche but is a diffusion with memory.

From the present perspective, an explanation of this discrepancy is as follows. In the classical setting, avalanches occurred on such a short time-scale that swift small local noisy changes are all that transpire. The mismatch in time-scales for internal events and external agents is relaxed in the adapted model, so that the yield stress can be globally reset during cascades. Thus here again we can envisage a system where negating the separation-of-time-scales condition, a classic necessity of SOC theory, only leads to a change in the power-law exponent, and not a total breakdown of the scale-free property.

Since we have changed the power-law exponent by introducing external input in a biologically feasible way, one should be careful using universality classes while studying neuronal systems. The fact that the obtained probability distribution exhibits multi-fractal characteristics with a sharp cutoff indicates that care must be taken while fitting recorded data to power-law curves.

There are many open questions related to the present investigation. The most important one is: how does the full elimination of time-scale separation change the outcome? In the present contribution we did not allow avalanches to be mixed and run parallel to each other. With simultaneous avalanches, there is no clear understanding of how one should attribute each event to any particular avalanche. Information-theoretical measures were proposed to keep track of the spikes belonging to one or another avalanche [104]. So far the most established way to study a possible "melange of avalanches" [85, 105] is to use binning and identify empty bins to determine pauses between the avalanches. This procedure results in different power-law exponents for different bin sizes [5], unless the system exhibits a true time-scale separation [71]. However, gluing of avalanches together might lead to selecting a smaller bin size than is suggested by the activity propagation time-scale. Our results clearly indicate avalanches being glued together by the external input. Bin sizes suggested by the activity propagation time-scale will prove too short to discern this phenomenon. The under-sized bin duration will in turn result in cutting of avalanches into smaller pieces and increasing the power-law exponent. This might be a reason behind the observation of exponents above 1.5 and even around 2 for neuronal spiking data [46], and LFP in ex vivo turtle recordings [94]. Here we consider a fully connected system; it has been shown in [106] that network topology affects power laws, and models with weaker connections can produce distributions with exponents significantly larger than 1.25, even when time-scale separation is relaxed. All of this points to the fact that it is imprudent to fixate on models that produce correct power laws. Our result is a first step towards understanding the diversity of power-law exponents reported in the neuronal data.

5.4 Statistical analysis of data derived from simulation of BM

The mathematical analysis of the LM estimates that the 1.25 power-law behavior will be observed for avalanche sizes between $\varphi^{-2}$ and $\sqrt{N}$, where N is the system size and φ is the input intensity. The upper bound of $\sqrt{N}$ is not a mathematically tight bound, and as such we expect it can be extended to larger values. Indeed, when we look at the data collected from the simulations of the BM with system size $N = 10^5$ and $\varphi = 0.3$, the power law is observed between 12 and 1250. Note that here $0.3^{-2} = 11.11$ and $\sqrt{10^5} \approx 316$.

1. Goodness of fit. To test goodness of fit we pursue the approach outlined in [22]. For the sake of consistency we adopt the terminology used in [22], which is fairly standard. The empirical data set is taken from the data obtained during simulation of the BM with φ = 0.3 by selecting only avalanches of size between 12 and 1250. We fit it to the power-law model using the MLE approach outlined in [3]; the best-fit power law has exponent 1.232. We calculate the KS statistic between the empirical data and this hypothesized 1.232-exponent power-law model; the value obtained is 0.00202. Next, we generate a hundred instances of power-law distributed synthetic data sets with exponent parameter 1.232 and upper and lower bounds of 1250 and 12 respectively. We fit each synthetic data set individually to its own respective power-law model based on MLE methods. After this we calculate the KS statistic for each data set with respect to its own model. Finally


Figure 5.5: The power law distribution is not a significantly better fit to the data in the predicted region than the log-normal distribution. The main plot is a log-log plot of the avalanche size (s) vs the empirical probability of avalanche sizes ($P_{ob}(s)$) as observed in the data. The expected power-law region between 12 and 1250 is marked in blue. The top-right inset is a log-log plot of the probability density ($P_{LN}(s)$) of the best-fitted log-normal distribution between 12 and 1250. The bottom-left inset is a log-log plot of the probability density ($P_{PL}(s)$) of the best-fitted power-law distribution between 12 and 1250.


we count the fraction of the time the resulting statistic is larger than 0.00202; the value thus obtained is 0.26. This value is large enough to accept the hypothesis [22]. (A sketch of this bootstrap procedure is given after this list.)

2. Testing against an alternative hypothesis. We compare the fit of the best-fit power law with the fit of the best-fit truncated log-normal [30]. The empirical data is the same empirical data considered in the last item. We use the log-likelihood ratio test to determine which of the two hypotheses is better in light of the data. The log-likelihood of the best-fit power law is −818558.2, whereas the log-likelihood of the best-fit truncated log-normal is −818572. We find the power law to have the bigger log-likelihood, and it is hence the better fit.
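The bootstrap goodness-of-fit procedure in item 1 can be sketched in R as follows. This is my own simplified re-implementation of the approach of [22] (not the code used here); `sizes` stands for a vector of simulated BM avalanche sizes with φ = 0.3 (e.g. from the sketch in Section 5.1.1, run with a large N), and the window [12, 1250] is taken from the text.

## MLE fit of a bounded discrete power law, KS statistic, and bootstrap p-value.
fit_powerlaw <- function(x, kmin, kmax) {
  ks  <- kmin:kmax
  nll <- function(a) a * sum(log(x)) + length(x) * log(sum(ks^(-a)))
  optimize(nll, interval = c(1.01, 3))$minimum        # MLE of the exponent
}
ks_stat <- function(x, a, kmin, kmax) {
  ks        <- kmin:kmax
  cdf_model <- cumsum(ks^(-a)) / sum(ks^(-a))
  cdf_emp   <- ecdf(x)(ks)
  max(abs(cdf_emp - cdf_model))
}
rpl <- function(n, a, kmin, kmax) {                   # synthetic power-law data
  ks <- kmin:kmax
  sample(ks, n, replace = TRUE, prob = ks^(-a))
}

x      <- sizes[sizes >= 12 & sizes <= 1250]          # empirical data in the window
a_hat  <- fit_powerlaw(x, 12, 1250)
d_obs  <- ks_stat(x, a_hat, 12, 1250)
d_boot <- replicate(100, {
  y <- rpl(length(x), a_hat, 12, 1250)
  ks_stat(y, fit_powerlaw(y, 12, 1250), 12, 1250)
})
mean(d_boot > d_obs)                                  # bootstrap p-value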


Chapter 6

The Levels Model, a mathematical analysis

Here we give a mathematical description of what was referred to as the Levels Model (LM) in the last chapter. In the previous chapter the levels model was described in a broad, general fashion; special attention was paid to how it differed from the Branching Model. The LM described here has a rich mathematical structure, which renders it amenable to mathematical analysis. We introduce it as a set of finite configurations equipped with the coarse sigma algebra, called the (N, p) BB space. The classical LM without external input corresponds to the (N, p) BB space equipped with what we will call the uniform probability measure. To reconcile with the high-level description of the LM provided in the preceding chapters, we often remark on how the abstract definitions relate to various neural objects.

6.1 The “(N, p) BB” space

Here we introduce the (N, p) BB space; this will be the central object of study for this chapter.

Definition 6.1.1. Given positive integers N and M, with M > N, define $p = \frac1M$. The set (N, p) BB is a set of (0, 1) matrices. A (0, 1) matrix ω belongs to the set (N, p) BB if and only if for all $j \in \{1, 2, \ldots, N\}$, $\sum_{i=1}^{M} a_{i,j}(\omega) = 1$, where $a_{i,j}(\omega)$ denotes the (i, j)-th entry of ω.

Remark. i. The parameters N and M are freely chosen (but for the constraint M > N); p is derived from M. However, the name (N, p) BB bears the term p, and not M; this is in deference to classical considerations (see [28]). Another quantity that is used in relation to the (N, p) BB space is $\alpha = \frac{N}{M}$. Whenever we speak of an (N, p) BB, we assume we are speaking in terms of some N and p satisfying the conditions discussed here.

ii. The set (N, p) BB is finite and we can equip it with the coarse sigma algebra. The set (N, p) BB equipped with this sigma algebra is called the (N, p) space. The elements of the set (N, p) BB are referred to as configurations. Throughout the chapter, we reserve quantities like ω, ω′, etc. to denote configurations in the (N, p) BB space.

iii. For all $i \in \{1, 2, \ldots, M\}$ and for all $j \in \{1, 2, \ldots, N\}$, $a_{ij}$ is a map between the (N, p) BB space and the set $\{0, 1\}$. We will in the course of this chapter equip the (N, p) space with


various probability measures; in the presence of each probability measure $a_{ij}$ is a random variable. Therefore we call maps between the (N, p) space and $\mathbb{R}$ (like $a_{ij}$) universal random variables. Abusing notation, we use the term random variable in place of universal random variable, leaving the distinction to be understood by the reader.

iv. BB refers to "Balls and Baskets". This is because a configuration ω can be interpreted as an array of baskets: $a_{ij}(\omega) = 1$ means the basket placed at the (i, j)-th position of the grid contains a ball, and $a_{ij}(\omega) = 0$ means the basket placed at the (i, j)-th location of the array is empty. This intuition is not revisited in the article.

The motivation for constructing the (N, p) BB space comes from neuroscience. We think of a configuration ω as a record of the energy levels of N neurons at some moment of time. Each neuron occupies one of M energy levels; if $a_{ij}(\omega) = 1$, then neuron j is at the i-th energy level. Since for all $j \in \{1, 2, \ldots, N\}$ there is a unique i such that $a_{ij}(\omega) = 1$, we ensure that at any instant a neuron has one unique energy level. Formally, for $j \in \{1, 2, \ldots, N\}$, $E^{el}_j(\omega) = \inf\{i : a_{i,j}(\omega) = 1\}\ \big(= \sup\{i : a_{i,j}(\omega) = 1\}\big)$. $E^{el}_j(\omega)$ documents the energy level of the j-th neuron. There is a linear ordering of the M possible energy levels, which means $E^{el}_j(\omega) = M$ indicates that neuron j is at the highest energy level. Throughout the article, as we introduce various abstract artifacts, we will try to present a parallel commentary on their interpretation from the neuronal point of view.

For all $i \in \{1, 2, \ldots, M\}$, let $Y_i(\omega) := \sum_{j=1}^N a_{i,j}(\omega)$; $Y_i(\omega)$ accounts for the number of neurons at the energy level i. Define the random variable $A_{N,p}$ by
$$A_{N,p} := \inf\Big\{\, i \;\Big|\; i \ge 0,\ \sum_{j=M-i}^{M} Y_j \le i \,\Big\}.$$

$A_{N,p}$ is called the avalanche size. The motivation for considering such a random variable comes from biology: when neurons are at the highest energy level, they fire, thus spreading all their energy uniformly to the other neurons. Each neuron, on account of this internal energy being delivered, climbs up to the next higher energy level. Because of the one initial firing, other neurons may get energized to the highest level, thereby firing themselves. A series of such firings is called an avalanche. The random variable $A_{N,p}$ gives the avalanche size (the number of neurons involved in one consecutive sequence of firings). In this spirit we often say that a configuration ω has generated $A_{N,p}(\omega)$ firings. The following sets are constructed from a configuration ω:
$$Fire(\omega) := \big\{\, j \mid a_{i,j}(\omega) = 1 \text{ for some } i \ge M - A_{N,p}(\omega) \,\big\},$$
$$NFire(\omega) := \big\{\, j \mid a_{i,j}(\omega) = 0\ \ \forall\, i \ge M - A_{N,p}(\omega) \,\big\}.$$
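For concreteness, the following small R illustration (added here, not part of the dissertation) draws a configuration ω with each neuron's level chosen uniformly at random and computes $A_{N,p}(\omega)$, Fire(ω) and NFire(ω) directly from these definitions.

## Compute the avalanche size and Fire/NFire sets from a random configuration.
N <- 20; M <- 25
omega <- matrix(0L, nrow = M, ncol = N)
omega[cbind(sample.int(M, N, replace = TRUE), 1:N)] <- 1L  # one ball per column

Y <- rowSums(omega)                      # Y_i = number of neurons at level i
A <- 0                                   # A_{N,p}: smallest i with sum_{j=M-i}^{M} Y_j <= i
while (sum(Y[(M - A):M]) > A) A <- A + 1

E     <- apply(omega, 2, which.max)      # energy level E^el_j of each neuron
Fire  <- which(E >= M - A)
NFire <- which(E <  M - A)
c(avalanche_size = A, fired = length(Fire))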

We will equip the (N, p) BB space with various probability measures. Each such probability measure arises from biological motivations. The first and simplest is what we call the uniform measure; we denote it by P. It is defined as follows: let $\kappa^M$ be a uniformly distributed random variable taking values in $\{1, 2, \ldots, M\}$, and let $\kappa^M_j$, $j \in \mathbb{Z}_+$, be iid copies of $\kappa^M$ defined on some probability space κ. The map $CUf : \kappa \to (N,p)$ BB is defined by $a_{i,j}(CUf(\theta)) = \mathbf{1}_i\big(\kappa^M_j(\theta)\big)$, where $\mathbf{1}$ is the indicator function. P is the pushforward measure under CUf on (N, p) BB. Intuitively, with the uniform measure every neuron has equal probability of lying in any of the energy levels, and there is no correlation between the energy levels of different neurons. In [36] one finds
$$P(A_{N,p} = k) = \binom{N}{k}\,p^k\,(1-(k+1)p)^{N-k}\,(k+1)^{k-1}. \tag{6.1}$$

For the remainder of this article a random variable following such a distribution will be said tohave the Avalanche distribution. Before ending the section, we prove the following result thatenumerates the number of configurations satisfying a given property.

Theorem 6.1.1. Define the set of configurations $\mathfrak a(N, k, a, p)$ as
$$\mathfrak a(N, k, a, p) := \{\, \omega \mid \omega \in (N,p)\text{ BB},\ A_{N,p}(\omega) = k,\ Y_M(\omega) = a \,\}.$$
Then $|\mathfrak a(N,k,a,p)| = \binom{k-1}{a-1}\,k^{k-a}$.

Proof. Define the sets $V := \{R, 1, 2, 3, \cdots, k\}$ and $\mathfrak a(N, k, p) := \{\omega \mid \omega \in (N,p)\text{ BB},\ A_{N,p}(\omega) = k\}$. Let $T^l_V$ denote the set of labeled trees which have V as their set of vertices. We will define a function $\Psi : \mathfrak a(N,k,p) \to T^l_V$. For $\omega \in \mathfrak a(N,k,p)$, here is how we define $\Psi(\omega)$:

Since $|Fire(\omega)| = k$, we first introduce a ranking for the members of $Fire(\omega)$. Formally, $rank(\cdot, \omega)$ is a one-one function between $Fire(\omega)$ and $\{1, 2, \ldots, k\}$. Say $i \in Fire(\omega)$ and $E^{el}_i(\omega) = i'$; define $score(i) = k \times (M - i') + i$. The elements of $Fire(\omega)$ are ordered (ranked) linearly according to the inverse of their scores. This means that for any $i^* \in Fire(\omega)$, $rank(i^*, \omega) = |\{i \mid i \in Fire(\omega),\ score(i) \le score(i^*)\}|$. Now, for all $u \in V$ such that $E^{el}_u(\omega) = M$, attach (draw an edge between) u and R in $\Psi(\omega)$. Further, for all i, j such that for some r, $rank(i, \omega) = r$ and $E^{el}_j(\omega) = M - r$, attach i to j in $\Psi(\omega)$.

It is straightforward to prove that the graph $\Psi(\omega)$ is a tree, and that the map Ψ is both injective and surjective. We know from Cayley's theorem ([77]) that the number of labelled trees with $k+1$ vertices is $(k+1)^{k-1}$. So we have established that in an (N, p) BB space the number of configurations ω such that $A_{N,p}(\omega) = k$ is $(k+1)^{k-1}$. This can be used to prove (6.1).

Note that a configuration ω has the property that $A_{N,p}(\omega) = k$, $Y_M(\omega) = a$ if and only if R is joined to exactly a neighbors in $\Psi(\omega)$. The number of such configurations has been computed to be $\binom{k-1}{a-1}k^{k-a}$ (see [77]).
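As a sanity check on (6.1), the following R snippet (an added illustration) compares Monte Carlo frequencies of the avalanche size under the uniform measure with the closed-form expression, for a small N and M.

## Monte Carlo check of the avalanche distribution (6.1) under the uniform measure.
avalanche_size <- function(N, M) {
  E <- sample.int(M, N, replace = TRUE)   # uniform energy levels
  Y <- tabulate(E, nbins = M)
  A <- 0
  while (sum(Y[(M - A):M]) > A) A <- A + 1
  A
}

N <- 20; M <- 25; p <- 1 / M
sim    <- replicate(2e5, avalanche_size(N, M))
k      <- 0:6
theory <- choose(N, k) * p^k * (1 - (k + 1) * p)^(N - k) * (k + 1)^(k - 1)
rbind(empirical = sapply(k, function(j) mean(sim == j)), theory = theory)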

6.2 A technical model

This section introduces a simple probability measure on the (N, p) BB space; this new probability space will help with computations that arise in later sections. The results of this section are therefore not interesting by themselves, but will serve as tools in later efforts.

Suppose we start with an (N, p) BB space. There are N neurons, each lying in one of M (M > N) energy levels. We consider $p = \frac1M = \frac{\alpha}{N}$, $\alpha \le 1$. Previously we had a uniform measure P on this space, i.e. each neuron was placed independently with equal chance of being in one of the M energy levels. For λ an integer-valued parameter, we will construct a second probability measure on the (N, p) BB space. This measure, denoted by $P_\lambda$, is the pushforward measure of P by the map $T_\lambda : (N,p)\text{ BB} \to (N,p)\text{ BB}$, defined below.

Take a configuration ω. The configuration $T_\lambda(\omega)$ is defined as follows:

i. For j such that $E^{el}_j(\omega) \ge M - \lambda$,
$$a_{M,j}(T_\lambda(\omega)) = 1 \quad\&\quad a_{i,j}(T_\lambda(\omega)) = 0\ \ \forall\, i < M.$$

ii. For j such that $E^{el}_j(\omega) < M - \lambda$, define $Sh_{j,\lambda}(\omega) = E^{el}_j(\omega) + \lambda$ (we shall suppress the subscripts in $Sh_{j,\lambda}$ for convenience), and set
$$a_{Sh(\omega),\,j}(T_\lambda(\omega)) = 1 \quad\&\quad a_{i,j}(T_\lambda(\omega)) = 0\ \ \forall\, i \ne Sh(\omega).$$

Theorem 6.2.1. Let $A_{N,p}$ be the avalanche random variable on the (N, p) BB space, and let k, λ be non-negative integers satisfying $k + \lambda + 1 < M$. We have
$$P_\lambda(A_{N,p} = k) = (\lambda+1)\binom{N}{k}\,p^k\,(1-(k+1+\lambda)p)^{N-k}\,(k+1+\lambda)^{k-1},$$
$$P\big(A_{N,p}(T_\lambda(\omega)) = k,\ Y_M(\omega) = 0\big) = \lambda\binom{N}{k}\,p^k\,(1-(k+1+\lambda)p)^{N-k}\,(k+\lambda)^{k-1}.$$

Proof. Let $\mathfrak a(N, k, a, p, \lambda) = \{\omega \mid A_{N,p}(\omega) = k,\ Y_M(\omega) = a\} \cap \mathrm{Range}(T_\lambda)$. When $\omega \in \mathfrak a(N,k,a,p,\lambda)$, there are exactly $(\lambda+1)^a$ configurations ω′ such that $T_\lambda(\omega') = \omega$. Using Theorem 6.1.1, we get
$$P_\lambda\big(\mathfrak a(N,k,a,p,\lambda)\big) = \binom{N}{k} p^k (1-(k+\lambda+1)p)^{N-k}\binom{k-1}{a-1}k^{k-a}(\lambda+1)^a,$$
$$P_\lambda(A_{N,p} = k) = \sum_{a=1}^{k} \binom{N}{k} p^k (1-(k+\lambda+1)p)^{N-k}\binom{k-1}{a-1}k^{k-a}(\lambda+1)^a = (\lambda+1)\binom{N}{k} p^k (1-(k+1+\lambda)p)^{N-k}(k+1+\lambda)^{k-1}.$$

6.3 A model with moderate external input

We will construct yet another measure on the (N, p) BB space; we shall call it $P^{E,med}_\varphi$. For any real number φ satisfying $0 < \varphi \le 1$, we will define the random function $\tau_\varphi : (N,p)\text{ BB} \to (N,p)\text{ BB}$. For any $\omega \in (N,p)$ BB, $\tau_\varphi(\omega)$ is defined as follows:

i. Say $A_{N,p}(\omega) = o$. When $N - o \ge o\varphi$, we choose a subset of size $\lceil o\times\varphi\rceil$ from $NFire(\omega)$; the chosen set is denoted by $EFire(\omega)$. If $N - o < \lceil o\times\varphi\rceil$, define $EFire(\omega) = NFire(\omega)$.

ii. When $j \in EFire(\omega)$,
$$a_{M,j}(\tau_\varphi(\omega)) = 1 \quad\&\quad a_{i,j}(\tau_\varphi(\omega)) = 0\ \ \forall\, i < M.$$
When $j \notin EFire(\omega)$, set $a_{i,j}(\tau_\varphi(\omega)) = a_{i,j}(\omega)$ for all i.

$P^{E,med}_\varphi$ is the pushforward measure of P by $\tau_\varphi$. The intuition behind the definition of $\tau_\varphi$ is as follows: during the avalanche we want to introduce some external signals to the system. The number of these external signals is $|EFire(\omega)|$, and the neurons receiving external input are those whose indices lie in the set $EFire(\omega)$. The intricacy here is that the size of the set $EFire(\omega)$ is $\lceil o\times\varphi\rceil$; this means that the external input is proportional to the size of the original avalanche. The longer the avalanche, the more external stimuli are delivered during it.


Theorem 6.3.1. Let $A_{N,p}$ be the avalanche random variable on the (N, p) BB space, and let φ and $\tau_\varphi$ be as above. For any positive integers k, o satisfying $k \ge o + \lceil o\times\varphi\rceil$, and for $\bar p = \frac{p\times N}{N-o}$, we have
$$P\big(A_{N,p}(\tau_\varphi(\omega)) = k \mid A_{N,p}(\omega) = o\big) = \lceil o\times\varphi\rceil\,\binom{N-\lceil o\times\varphi\rceil-o}{\,k-\lceil o\times\varphi\rceil-o\,}\,\bar p^{\,k-\lceil o\times\varphi\rceil-o}\,\big(1-(k-o+1)\bar p\big)^{N-k}\,(k-o)^{\,k-1-\lceil o\times\varphi\rceil-o}.$$

Remark. We will study the regime $N, k \to \infty$ with $N \gg k$. We consider φ > 0 to be a constant, i.e., we consider $\varphi N \to \infty$. Hence many of the formulas will fail to hold when one directly sets φ = 0 and compares with results in the no-input regime, where we use the measure P.

6.3.1 Asymptotics

The main result here shows that $P^{E,med}_\varphi(A_{N,p} = k)$ becomes a power law as N tends to ∞. Since we are dealing with asymptotic behavior, we will for clarity replace $\lceil o\times\varphi\rceil$ with $o\times\varphi$. The introduction of this simplification has no bearing on the final result. We will first establish some lemmas.

Lemma 6.3.2. The following is true for N, k positive integers satisfying N > k and $p = \frac{1}{N+1}$:
$$\lim_{\substack{k\to\infty\\ N\ge k^2}} \frac{\binom{N}{k}\,p^k\,(1-(k+1)p)^{N-k}\,(k+1)^{k-1}}{k^{-1.5}} = \frac{1}{\sqrt{2\pi}},$$
$$\lim_{\substack{k\to\infty\\ N\ge k^2}} \frac{\binom{N}{k}\,p^k\,(1-(k+1)p)^{N-k}\,k^{k-1}}{k^{-1.5}} = \frac{1}{\sqrt{2\pi}\times e}.$$

Lemma 6.3.3. Let $k, o, o', g$ be positive integers satisfying $k = o + o' + g$ and $0 \le \frac{o'}{o} \le 1$. Then
$$\lim_{\substack{k\to\infty\\ \frac{g}{o'}\to\infty}} \frac{(k-o)^{o'}}{(k-o+1)^{o'}} = 1\cdot\Big(1 + O\big(\tfrac1g\big)\Big), \qquad
\lim_{\substack{k\to\infty\\ \frac{g}{o'}\to\lambda}} \frac{(k-o)^{o'}}{(k-o+1)^{o'}} = e^{-\frac{1}{1+\lambda}}\Big(1 + O\big(\tfrac1g\big)\Big).$$

Remark. Typically we will apply Lemma 6.3.3 with $o' = o\varphi$. The two parts tell us how to deal with the asymptotics in the respective cases where the original avalanche is very small compared to the whole avalanche, and when it is not. Continuing the remark following Theorem 6.3.1, observe that the setting $g = o' = 0$ is beyond the scope of Lemma 6.3.3; this setting becomes significant if we were to consider φ = 0.

Lemma 6.3.4. Let $N, k, o, o'$ be positive integers satisfying $k - o - o' \ge 0$, $o > 0$, and $\varphi o = o'$ with $0 < \varphi < 1$. Also set $\bar p := \frac{1}{N-o}$. Then
$$\lim_{\substack{o\to\infty\\ \frac{o^2}{N}\to 0}} \frac{\binom{N-o'-o}{k-o'-o}\,\bar p^{\,-o'}\,(k-o-o')!}{\binom{N-o}{k-o}\,(k-o)!} = 1. \tag{6.2}$$
Also,
$$\frac{(k-o)!}{(k-o-o')!\,(k-o+1)^{o'}} \le e^{-\frac{o'(o'+1)}{2(k-o+1)}}. \tag{6.3}$$
In addition to the conditions at the beginning of the lemma, if we assume $N = C\,o'^2$ for some constant C > 0, we get
$$1 \le \lim_{o\to\infty} \frac{\binom{N-o'-o}{k-o'-o}\,\bar p^{\,-o'}\,(k-o-o')!}{\binom{N-o}{k-o}\,(k-o)!} \le e^{\frac1C}. \tag{6.4}$$
Further, if in addition to the conditions at the beginning of the lemma we assume that $\sqrt{k}\,\log(k) \ge o$, then the following is true (asymptotics are taken in the sense $o\to\infty$):
$$\frac{(k-o)!}{(k-o-o')!\,(k-o+1)^{o'}} \sim e^{-\frac{o'^2}{2(k-o)}}\,e^{-\mathrm{Rem}}, \quad\text{where } |\mathrm{Rem}| = O\Big(\frac{\log(k)}{\sqrt{k}}\Big). \tag{6.5}$$

Proof. Let $(x)_n = x(x-1)\cdots(x-n+1)$ denote the falling factorial function. We can derive the following:
$$\frac{\binom{N-o'-o}{k-o'-o}\,\bar p^{\,-o'}\,(k-o-o')!}{\binom{N-o}{k-o}\,(k-o)!} = \frac{(N-o)^{o'}}{(N-o)_{o'}} \ge 1. \tag{6.6}$$
Now, notice that
$$\frac{(N-o)^{o'}}{(N-o)_{o'}} \le \frac{(N-o)^{o'}}{(N-o-o')^{o'}} \sim e^{\,o'^2 (N-o)^{-1}}. \tag{6.7}$$
The final bound in (6.7) follows from
$$\log\left[\frac{(N-o)^{o'}}{(N-o-o')^{o'}}\right] = -o'\log\Big[1 - \frac{o'}{N-o}\Big] = \frac{o'^2}{N-o} + \mathrm{Int}, \quad\text{where } |\mathrm{Int}| = O\Big(\frac{o^3}{(N-o)^2}\Big).$$
From (6.6) and (6.7) we can derive both (6.2) and (6.4).

Observe that $\frac{(k-o)!}{(k-o-o')!\,(k-o+1)^{o'}} = \frac{(k-o)_{o'}}{(k-o+1)^{o'}}$. The proof of (6.3) is an immediate consequence of
$$\log\left[\frac{(k-o)_{o'}}{(k-o+1)^{o'}}\right] = \sum_{i=1}^{o'} \log\Big(1 - \frac{i}{k-o+1}\Big) \le -\sum_{i=1}^{o'} \frac{i}{k-o+1} = -\frac{o'(o'+1)}{2(k-o+1)}.$$
To prove (6.5), observe that when $\sqrt{k}\,\log(k) \ge o$, there exists $0 \le \theta \le 1$ such that
$$\sum_{i=1}^{o'} \log\Big(1 - \frac{i}{k-o+1}\Big) = \sum_{i=1}^{o'}\left(-\frac{i}{k-o+1} - \frac12\Big(\frac{i}{k-o+1}\Big)^2\frac{1}{1-\theta\frac{i}{k-o+1}}\right)$$
$$= \sum_{i=1}^{o'}\Big(-\frac{i}{k-o+1}\Big) \pm O\Big(\frac{\log(k)}{\sqrt{k}}\Big) = -\frac{o'(o'+1)}{2(k-o+1)} \pm O\Big(\frac{\log(k)}{\sqrt{k}}\Big).$$


We say that $f(k) \preceq C$ if $\lim_{k\to\infty} f(k) \le C$.

Theorem 6.3.5. We assume that (N, φ, k) satisfies:

i. $\varphi > \frac{1}{\log(k)^{0.5}}$.

ii. $N \ge k^2$.

Then with $P^{E,med}_\varphi(A_{N,p} = k)$ as in Section 6.3 and $p = \frac{1}{N+1}$, there exist positive constants $D_1, D_2$ depending on φ such that
$$D_2 \;\preceq\; \frac{P^{E,med}_\varphi(A_{N,p} = k)}{k^{-1.25}} \;\preceq\; D_1. \tag{6.8}$$

Proof. In light of (6.2), one finds
$$P\big(A_{N,p}(\tau_\varphi(\omega)) = k \mid A_{N,p}(\omega) = o\big) = \binom{N-o}{k-o}\,\bar p^{\,k-o}\,(1-(k-o+1)\bar p)^{N-k}\,(k-o)^{k-1-o}\,(\varphi o)\,\frac{(k-o)!}{(k-o-o\varphi)!\,(k-o)^{o\varphi}}.$$
Now it can be shown that $P(A_{N,p} = o) \sim \frac{o^{-1.5}}{\sqrt{2\pi}}$. Thus,
$$P\big(A_{N,p}(\tau_\varphi(\omega)) = k,\ A_{N,p}(\omega) = o\big) = P\big(A_{N,p}(\tau_\varphi(\omega)) = k \mid A_{N,p}(\omega) = o\big)\,\frac{o^{-1.5}}{\sqrt{2\pi}}.$$
Using the Euler–Maclaurin formula for expressing sums as integrals, we get
$$P^{E,med}_\varphi(A_{N,p} = k) = \int_1^{\frac{k}{1+\varphi}} P\big(A_{N,p}(\tau_\varphi(\omega)) = k,\ A_{N,p}(\omega) = o\big)\,do + \varphi\frac{1}{\sqrt{2\pi}\,e^2}k^{-1.5} + (1+\varphi)^{1.5}k^{-1.5}\frac{e^{-\frac{k\varphi}{1+\varphi}}}{\sqrt{2\pi}\,e} + o(k^{-1.5})\,O(\varphi). \tag{6.9}$$
Now, using (6.5),
$$\int_1^{\frac{\sqrt k}{\log k}} P\big(A_{N,p}(\tau_\varphi(\omega)) = k,\ A_{N,p}(\omega) = o\big)\,do \sim \frac{\varphi}{2\pi e}\int_1^{\frac{\sqrt k}{\log k}} (k-o)^{-1.5}\,o^{-0.5}\,do \sim \varphi\left(\frac{k^{-1.25}}{(\log k)^{0.5}} - k^{-1.5}\right). \tag{6.10}$$
Observe that, using the fact that $\varphi > (\log k)^{-0.5}$, for some positive constant $C^{(1)}$ we have
$$\int_{\sqrt k\,\log k}^{\frac{k}{1+\varphi}} P\big(A_{N,p}(\tau_\varphi(\omega)) = k,\ A_{N,p}(\omega) = o\big)\,do \le C^{(1)}\varphi\int_{\sqrt k\,\log k}^{\frac{k}{1+\varphi}} (k-o)^{-1.5}\,o^{-0.5}\,e^{-\frac{(\varphi o)^2}{2(k-o)}}\,do$$
$$\le C^{(1)}\varphi\,e^{-(\varphi\log k)^2}\int_{\sqrt k\,\log k}^{\frac{k}{1+\varphi}} o^{-0.5}(k-o)^{-1.5}\,do \le \varphi\,C^{(1)}\,e^{-(\varphi\log k)^2}\,k^{-0.75} \sim o(k^{-1.25}).$$
Thus
$$\int_{\sqrt k\,\log k}^{\frac{k}{1+\varphi}} P\big(A_{N,p}(\tau_\varphi(\omega)) = k,\ A_{N,p}(\omega) = o\big)\,do = o(k^{-1.25}). \tag{6.11}$$
Now, define $A_k = \frac{\varphi^2}{2\big(k-\frac{\sqrt k}{\log k}\big)}$ and $B_k = \frac{\varphi^2}{2\big(k-\sqrt k\,\log k\big)}$. Using (6.5) we get
$$\frac{\varphi}{2\pi e}\big(k-\sqrt k\,\log k\big)^{-1.5} Q_b \;\ge\; \int_{\frac{\sqrt k}{\log k}}^{\sqrt k\,\log k} P\big(A_{N,p}(\tau_\varphi(\omega)) = k,\ A_{N,p}(\omega) = o\big)\,do \;\ge\; \frac{\varphi}{2\pi e}\Big(k-\frac{\sqrt k}{\log k}\Big)^{-1.5} Q_s, \tag{6.12, 6.13}$$
where
$$Q_s = \int_{\frac{\sqrt k}{\log k}}^{\sqrt k\,\log k} o^{-0.5}\,e^{-o^2 B_k}\,do \sim \frac{1}{2B_k^{0.25}}\int_{\frac{\varphi^2}{2}}^{\infty} t^{-0.75}e^{-t}\,dt, \qquad
Q_b = \int_{\frac{\sqrt k}{\log k}}^{\sqrt k\,\log k} o^{-0.5}\,e^{-o^2 A_k}\,do \sim \frac{1}{2A_k^{0.25}}\int_{\frac{\varphi^2}{2}}^{\infty} t^{-0.75}e^{-t}\,dt.$$
Thus we arrive at
$$\int_{\frac{\sqrt k}{\log k}}^{\sqrt k\,\log k} P\big(A_{N,p}(\tau_\varphi(\omega)) = k,\ A_{N,p}(\omega) = o\big)\,do \sim D_\varphi\,k^{-1.25}, \qquad D_\varphi = \frac{\sqrt{\varphi}}{2^{1.75}\pi e}\int_{\frac{\varphi^2}{2}}^{\infty} t^{-0.75}e^{-t}\,dt. \tag{6.14}$$
Using (6.9), (6.10), (6.11) and (6.14), one arrives at
$$P^{E,med}_\varphi(A_{N,p} = k) = D_\varphi\,k^{-1.25} + \varphi\left(\frac{k^{-1.25}}{(\log k)^{0.5}} - k^{-1.5}\right) + \varphi\frac{1}{\sqrt{2\pi}\,e^2}\,k^{-1.5} + o(k^{-1.5}).$$

Remark. The conditions enforced on (N, φ, k) in Theorem 6.3.5 are sufficient but not necessary. For example, the condition $N > k^2$ is used to ensure that terms like $e^{o\varphi(N-o)^{-1}}$ are equal to 1. The milder condition $N > o^2$ would suffice for this.

6.3.2 Cutoff at $k > \varphi^{-2}$

Consider φ small but fixed. We take $k = \varphi^{-\delta}$, i.e., $k^{-\frac1\delta} = \varphi$. A rough simplification of the proof of Theorem 6.3.5 shows that the term $P^{E,med}_\varphi(A_{N,p} = k)$ is computed as
$$P^{E,med}_\varphi(A_{N,p} = k) \sim \varphi\int_1^{\frac{k}{1+\varphi}}(k-o)^{-1.5}o^{-0.5}e^{-\frac{(\varphi o)^2}{2(k-o)}}\,do$$
$$\sim \varphi\left[\int_1^{\sqrt k}(k-o)^{-1.5}o^{-0.5}\,do + \int_{\sqrt k}^{\frac{k}{1+\varphi}}(k-o)^{-1.5}o^{-0.5}e^{-\frac{(\varphi o)^2}{2(k-o)}}\,do\right] \sim C\varphi\,k^{-1.25}. \tag{6.15}$$


The reason for breaking up the main integral in (6.15) is that when $k > e^{\frac1\varphi}$ and $o > \sqrt k$, one observes
$$e^{-\frac{(\varphi o)^2}{2(k-o)}} \sim 0. \tag{6.16}$$

With $k = \varphi^{-\delta}$, (6.16) is no longer true. To account for this we define $\delta' = \frac12 + \frac1\delta + \delta'''$, with any $\delta''' > 0$ satisfying $\frac1\delta \gg \delta'''$. We also need to ensure $\delta > 2$. Such a setup allows (6.15) to be replaced with
$$P^{E,med}_\varphi(A_{N,p} = k) \sim \varphi\int_1^{\frac{k}{1+\varphi}}(k-o)^{-1.5}o^{-0.5}e^{-\frac{(\varphi o)^2}{2(k-o)}}\,do$$
$$\sim \varphi\left[\int_1^{k^{\delta'}}(k-o)^{-1.5}o^{-0.5}\,do + \int_{k^{\delta'}}^{\frac{k}{1+\varphi}}(k-o)^{-1.5}o^{-0.5}e^{-\frac{(\varphi o)^2}{2(k-o)}}\,do\right] \sim \varphi\,k^{-1.25+\frac{1}{2\delta}} \sim k^{-1.25-\frac{1}{2\delta}}. \tag{6.17}$$
This rough calculation shows that δ = 2 is where the distribution departs from a 1.5 power law.

6.4 Large and small input regimes

Suppose we have an (N, p) BB space. With $\lambda \le N$ an integer parameter, we will define a new random variable $X_{N,p,\lambda}$. For a configuration ω, a second configuration ω′ is defined by the following procedure. When $j \le \lambda$,
$$a_{M,j}(\omega') = 1 \quad\&\quad a_{i,j}(\omega') = 0\ \ \forall\, i < M.$$
When $j > \lambda$, set $a_{i,j}(\omega') = a_{i,j}(\omega)$ for all i.

Define $X_{N,p,\lambda}(\omega) = A_{N,p}(\omega') - \lambda$. We can derive the following using Theorem 6.2.1:

Theorem 6.4.1. $P(X_{N,p,\lambda} = k) = \binom{N-\lambda}{k}\,p^k\,(\lambda+1)\,(k+\lambda+1)^{k-1}\,(1-(k+\lambda+1)p)^{N-\lambda-k}$.

6.4.1 Small input regime

Here we choose $\lambda = \lambda_0$, where $\lambda_0$ is some constant independent of N. Using Theorem 6.4.1 and Stirling's formula (as with the avalanche distribution) we show that as N and k grow to infinity with $\frac{k}{N} \to 0$, $P(X_{N,p,\lambda} = k) = \Theta(k^{-1.5})$.

6.4.2 Large input regime

Let us now put $\lambda = \lambda_N = \lambda_0\times N$, $\lambda_0 < 1$. We will also demand that $\alpha\times(1+\lambda_0) < 1$. This implies there is massive external input during firing that forces a proportion of the system to spontaneously fire. Now observe that $X_{N,p,\lambda_N}$ has the Quasi Binomial 1 distribution (see [25]) with $n = (1-\lambda_0)N$, $a = (1+\lambda_N)p$, $\theta = p$, $b = 1 - n\theta - a$. Again, it is known that as $n \to +\infty$ a Quasi Binomial 1 distribution approaches the Generalized Poisson distribution ([27]), which is a type of Lagrangian distribution ([23], [26]). It has further been established that Lagrangian random variables approach the standard normal distribution under certain conditions (see [24]). All of these together lead to the following theorem.

Theorem 6.4.2. $\dfrac{X_{N,\frac{\alpha}{N},\lambda_N} - \mu_N}{\sigma_N}$ converges in distribution to a standard normal variable as N goes to ∞, where
$$\mu_N = \frac{\alpha(1+\lambda_0 N)(1-\lambda_0)}{1-\alpha(1-\lambda_0)}, \qquad (\sigma_N)^2 = \frac{\alpha(1+\lambda_0 N)(1-\lambda_0)}{\big(1-\alpha(1-\lambda_0)\big)^3}.$$
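For Theorem 6.4.2, a quick simulation can make the normal limit visible. The R sketch below is an illustration added here, with arbitrarily chosen N, M and λ_0 satisfying α(1 + λ_0) < 1; it forces λ = λ_0 N neurons to the top level, computes $X_{N,p,\lambda}$, and compares the standardized values with normal quantiles.

## Simulation of the large input regime and a normal quantile plot.
x_large_input <- function(N, M, lambda) {
  E <- sample.int(M, N, replace = TRUE)
  E[1:lambda] <- M                      # forced external activations
  Y <- tabulate(E, nbins = M)
  A <- 0
  while (sum(Y[(M - A):M]) > A) A <- A + 1
  A - lambda                            # X_{N, p, lambda}
}
N <- 2000; M <- 4000; lambda <- 0.2 * N   # alpha = 0.5, lambda0 = 0.2
xs <- replicate(5000, x_large_input(N, M, lambda))
qqnorm(scale(xs)); qqline(scale(xs))      # approximate normality of the standardized values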


Part IV :

Markov processes in random environments, and their relationship to neuroscience


Chapter 7

Background and organization

In Part II, we focused on the Abelian and avalanche distributions. We treated them as probability distributions, and we sought to understand their properties. In Part III, we developed a probability space where the Abelian distribution appeared as the distribution of a random variable. We used a measure space called the (N, p) BB space and provided justification for why it was a correct space to consider in the context of modeling neural avalanches. We did not, however, justify why the uniform measure was the correct probability measure with which to equip the (N, p) BB space. All other measures on the (N, p) BB space that we introduced in Part III were motivated by trying to introduce various types of external inputs during avalanches. In each circumstance we assumed that the uniform distribution was the correct measure to consider when there was no input. This brings us back to the question of why the uniform measure is a correct choice. In nature most probability measures appear as stationary measures of Markov processes. We therefore try to find the measure as a stationary measure of suitable Markov processes. We begin by understanding possible stationary measures for Markov processes in random environments. These are a rich class of processes that are useful in formalizing physical examples, because they not only consider the evolution of a process but also assume that the governing laws for the dynamics of the process change in a stochastic manner. Chapter 8 mostly concerns understanding the interaction of a process and the random environment it occupies, in terms of the stationary measures they present. Chapter 9 generalizes the results of Chapter 8.

The link to neuroscience can be envisioned as follows. Consider a pair of neurons which have correlated firing patterns; that such correlated activity is present is not only intuitively reasonable but also experimentally established. Consider a single neuron in two possible states {0, 1}, and an environment which is in two possible states {0, 1}. The environment represents the state of a second neuron which influences the activity of the neuron we are studying. Here we model the temporal dynamics of the two-neuron system as a Markov process in a random environment. The state of the second neuron affects but does not fully determine the state of the first neuron. That the two neurons can both align with each other's states, and also deviate from each other's influence, is thought to be a strong reason behind the formation of memory. There are also external spikes in the form of external sensory inputs which can immediately re-configure the state of the two neurons. We will show that this external signal can, by the manner of its influence, control the ergodic properties of the stochastic process.


Chapter 8

Constructions of Markov processes in random environments, where the state spaces are finite

8.1 Introduction

This chapter concerns the dynamics of Markov chains in random environments [31]. There exists a substantial literature (see, e.g., Ref. [110] and the bibliography therein) where such processes are considered in an environment that is randomly chosen but kept fixed throughout the time dynamics. In Ref. [41], a particular construction was proposed where the environment influences the basic process, but remains unaffected by it, resulting in a product form of the stationary distribution. In [9] a 'combined' Markov process has been introduced, with the basic and environment processes influencing each other and the product form of the stationary distribution still preserved. In the current discussion we give a different construction of Markov models similar to [9]. Our construction can be applied under more general conditions, but with an added proviso: we have to affiliate an additional state to the combined process.

Consider sets $Z = \{z_1, z_2, \ldots, z_M\}$ and $X = \{x_1, x_2, \ldots, x_N\}$. We call Z the environment space, and X the collection of basic states. We are also given a family of linear dissipative N × N matrices $\{Q^z \in C_b(X) : z \in Z\}$ indexed by $z \in Z$, with entries $Q^z(x_j|x_i)$ specifying the jump intensity from $x_i$ to $x_j$. Dissipativity means $Q^z(x_j|x_i) = (-1)^{\mathbf{1}_j(i)}\,|Q^z(x_j|x_i)|$ and $\sum_{x'\in X} Q^z(x'|x) = 0$. Thus, for a fixed $z \in Z$, the matrix $Q^z$ yields the generator of a continuous-time Markov chain with values in X. It describes the dynamics of the basic state when the environment is fixed (see [16], also [43]). We call it a basic Markov chain or a basic process.

We assume that the basic process has a stationary measure $m_z(x)$, $x \in X$, depending on $z \in Z$, with $m_z(x) > 0$ for all x, z. Formally, for all $x' \in X$,
$$\sum_{x\in X} Q^z(x'|x)\,m_z(x) = 0. \tag{8.1}$$

Next suppose we have $\{A^x \in C_b(Z) : x \in X\}$, a family of M × M dissipative matrices indexed by $x \in X$, with elements $A^x(z_j|z_i)$ representing jump rates from $z_i$ to $z_j$. As earlier, we have $A^x(z_j|z_i) = (-1)^{\mathbf{1}_j(i)}\,|A^x(z_j|z_i)|$ and $\sum_{z'\in Z} A^x(z'|z) = 0$. Thus we have a family of Markov chains taking values in Z and describing a random evolution of the environment for a fixed basic state. We suppose that each of these processes has a stationary measure $\nu_x(z) > 0$. That is,
$$\sum_{z\in Z} A^x(z'|z)\,\nu_x(z) = 0. \tag{8.2}$$

With these at hand, we would like to construct a continuous-time Markov chain on X × Z (referred to as a combined Markov chain or process) in a meaningful way, so that both the state and the environment can change together, while we can still use a product measure as the stationary measure for the combined process. This means the combined chain will have $g(x, z) = m_z(x)\times\nu_x(z)$ as a stationary measure. Thus we are working with a random environment and a random basic process, interacting with each other. An advantage of preserving the product form of the invariant measure has been discussed in [48] while studying Jackson networks (see [58]). Ref. [41] also puts forward the significance of having a product form for the stationary distribution of the combined process.

Constructions leading to the above properties have been carried out in [9]. These constructions are based upon certain assumptions. First, Ref. [9] assumes that $\nu_x(z)$ is the same for all x. Next, it is assumed that the combined process can jump from a state (x, z) to a state (x′, z′) only when either x = x′ or z = z′. The construction presented here will not have these restrictions. However, as was said, we have to introduce a 'transitional' state c, representing an additional level the combined process may attain. We show that for any ε > 0 we can construct a combined process such that g(c) = ε. By assuming that m and ν are strictly positive we ensure that g is also positive valued.

Section 8.2.1 outlines the general features of our construction. Section 8.2.2 gives a specific construction with a minimality property. Section 8.3 studies how this construction works out for a specific example: Section 8.3.1 describes an example from queueing theory, and Section 8.3.2 analyses it in detail. Section 8.4.1 is dedicated to a refined construction developing the example of Section 8.3, and Section 8.4.2 analyses the combined process for this refined construction. At the beginning of Section 8.4 we sketch the physical intuition behind the theorems of Section 8.4.2.

8.2 A construction with a single transitional state

8.2.1 A general result

Let X, Z, Ax and Qz be as in the introduction. Define the state space Y = (X × Z) ∪ {c}. We call X × Z the natural space, and c a transition state. Given ε > 0, define the positive function g as

    g(p) = mz(x) × νx(z)   when p = (x, z), i.e., p is a part of the natural space,
    g(p) = ε               when p = c.     (8.3)

Given a function (x, z) ∈ X × Z ↦ τ(x, z), define a matrix R with entries R(x′, z′|x, z),

    R(x′, z′|x, z) = (−τ(x, z))^{1_z(z′) 1_x(x′)} |Ax(z′|z) Qz(x′|x)|.     (8.4)

Observe that (8.4) has been defined so as to ensure that R(x′, z′|x, z) is non-positive only when (x, z) = (x′, z′). In what follows we choose τ(x, z) = τε(x, z) and entries Rε(x′, z′|c), Rε(c|x, z) ≥ 0 and Rε(c|c) ≤ 0 such that the following relations (A), (B) hold true.

(A) Rε is dissipative, and ∀ p ∈ Y, ∑_{s∈Y} Rε(s|p) = 0 (we henceforth write Rε(x′, z′|x, z) to stress the importance of the value of ε).

(B) The function g from (8.3) yields a stationary measure for the Markov chain generated by Rε, i.e., ∀ s ∈ Y, ∑_{p∈Y} Rε(s|p) g(p) = 0.

The above would ensure (by the Hille–Yosida theorem) that Rε generates a Markov process on Y which has stationary measure g. The ingredients of the construction are as follows:

1. Given (xi, zj) ∈ X × Z, choose τε(xi, zj) > 0 large enough such that

    ∑_{(x,z)∈X×Z} Rε(xi, zj|x, z) g(x, z) ≤ 0     (8.5)

and

    ∑_{(x′,z′)∈X×Z} Rε(x′, z′|xi, zj) ≤ 0.     (8.6)

Such a choice of τε(xi, zj) is possible because Axi(zj|zj) < 0, Qzj(xi|xi) < 0 and g(xi, zj) > 0, ∀ i, j.

2. Next, we can choose Rε(xi, zj|c) ≥ 0 and Rε(c|xi, zj) ≥ 0 such that

    ∑_{(x,z)∈X×Z} Rε(xi, zj|x, z) g(x, z) + Rε(xi, zj|c) ε = 0     (8.7)

and

    ∑_{(x′,z′)∈X×Z} Rε(x′, z′|xi, zj) + Rε(c|xi, zj) = 0.     (8.8)

3. After that, we choose Rε(c|c) ≤ 0 such that

    ∑_{(x,z)∈X×Z} Rε(c|x, z) g(x, z) + Rε(c|c) ε = 0.     (8.9)

4. One may show that

    ∑_{(x′,z′)∈X×Z} Rε(x′, z′|c) + Rε(c|c) = 0.     (8.10)

As a result we obtain the following theorem.

Theorem 8.2.1. Let Zx(t) ∈ Z, Xz(t) ∈ X be two families of continuous-time Markov chains indexed by x ∈ X and z ∈ Z respectively, with generators Ax, Qz and stationary measures νx(z) and mz(x) respectively. Given (xi, zj), i = 1, . . . , N, j = 1, . . . , M, and ε > 0, there exist τε(xi, zj) and an additional state c such that (8.5)–(8.10) hold. Hence conditions (A), (B) are satisfied, Rε is the generator of a continuous-time Markov chain Y(t) ∈ Y, and the function g satisfying g(y) = mz(x) νx(z) for y = (x, z) and g(c) = ε is its stationary measure.

The process Y(t) is called the combined Markov chain.

Remark. 1. In the definition of Rε(x′, z′|x, z) (see (8.4)) one may use Ax′(z′|z) instead of Ax(z′|z) and/or Qz′(x′|x) instead of Qz(x′|x). The above arguments would still hold true.

2. Here is an intuitive way to make sense of the transition state c. Suppose that instead of having to satisfy both (A) and (B) we are interested in satisfying (A) only. We could then choose τε(xi, zj) > 0, ∀ i, j, so that

    ∑_{(x′,z′)∈X×Z} Rε(x′, z′|xi, zj) = 0.     (8.11)

In this case we would not have to work with (8.5) and (8.6), nor add the state c. Using (8.11) alone we can define a dissipative matrix R′ε on X × Z. Let Y′t be a Markov chain generated by R′ε, with a stationary measure g′. When we run the chain Y′t, it wanders into states that g does not favor (that is, states to which g assigns a small mass compared to g′). At this point the process Yt has the option of switching to c and emerging in a region more favorable for g.

3. The fact that ε > 0 can be chosen arbitrarily small comes from the fact that c can have an arbitrarily large jump rate, i.e., transitions through c can be made arbitrarily fast. It means that regardless of how large g(X × Z) is, one can always choose g(c) = ε. The way the combined process Yt uses c is a significant topic in this chapter.

4. A somewhat surprising feature of our construction is that it is based on the product of entries of the marginal generators, rather than on their sums as was the case in [9]. This represents both a novelty and a challenge for future work, as it re-ignites an old question of how to produce a new Markov chain from previously given ones in a meaningful manner. From this point of view, an extension of the current construction to more general classes of Markov processes, particularly diffusions, would be interesting.

8.2.2 A minimal version of the single transitional state construction

In (8.5) and (8.6) we set a criterion for choosing τε(xi, zj) > 0 that admits a wide range of legitimate choices (we simply say “choose τε(xi, zj) > 0 large enough”). From this point on we shall narrow this down by imposing a stronger condition upon τε(xi, zj). This yields a unique τε(xi, zj), providing a minimal choice among the possibilities allowed by (8.5) and (8.6). To this end, we define

    τε(xi, zj) = max{ [∑_{(x,z)∈X×Z, (x,z)≠(xi,zj)} |Ax(zj|z) Qz(xi|x)| g(x, z)] / [Axi(zj|zj) Qzj(xi|xi) g(xi, zj)] ,
                      [∑_{(x′,z′)∈X×Z, (x′,z′)≠(xi,zj)} |Axi(z′|zj) Qzj(x′|xi)|] / [Axi(zj|zj) Qzj(xi|xi)] }.     (8.12)

Note that τε(xi, zj) from (8.12) satisfies (8.5) and (8.6). It depends on g, which is the product measure we are targeting. Moreover, for each (x, z) ∈ X × Z, either the rate Rε(c|x, z) = 0 or Rε(x, z|c) = 0. Essentially, the choice of τε in (8.12) prevents the combined process from cycling through c. It partitions the natural space X × Z into three types of states: the first type are those states from which the combined process can jump to c, the second are the states to which jumps from c are possible, and the third (possibly empty) set consists of those states which allow no direct jumps to or from c.
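For concreteness, the following is a minimal numerical sketch of the single-transition-state construction with the minimal choice (8.12). It is not part of the original derivation: the function name combined_generator and the dense-matrix layout are illustrative assumptions, with Q[z] and A[x] stored as ordinary rate matrices whose (a, b) entry is the jump rate from the a-th to the b-th state.

```python
import numpy as np

def combined_generator(Q, A, m, nu, eps=1e-2):
    """Illustrative sketch (not from the text) of Section 8.2 with the minimal tau of (8.12).

    Q[z] : N x N generator of the basic chain for a fixed environment z
    A[x] : M x M generator of the environment chain for a fixed basic state x
    m[z], nu[x] : the corresponding stationary vectors
    Returns the (NM+1) x (NM+1) generator on (X x Z) u {c} together with the
    target stationary vector g, where g(x, z) = m_z(x) nu_x(z) and g(c) = eps.
    """
    N, M = Q[0].shape[0], A[0].shape[0]
    S = N * M
    idx = lambda x, z: x * M + z                        # flatten (x, z)
    g = np.array([m[z][x] * nu[x][z] for x in range(N) for z in range(M)])

    R = np.zeros((S + 1, S + 1))                        # R[p, s] = R_eps(s | p)
    for x in range(N):
        for z in range(M):
            p = idx(x, z)
            for xp in range(N):
                for zp in range(M):
                    if (xp, zp) != (x, z):              # off-diagonal rates as in (8.4)
                        R[p, idx(xp, zp)] = abs(A[x][z, zp] * Q[z][x, xp])
            diag = A[x][z, z] * Q[z][x, x]              # product of two negatives > 0
            col = sum(abs(A[xp][zp, z] * Q[zp][xp, x]) * g[idx(xp, zp)]
                      for xp in range(N) for zp in range(M) if (xp, zp) != (x, z))
            row = R[p, :S].sum()
            tau = max(col / (diag * g[p]), row / diag)  # minimal tau, (8.12)
            R[p, p] = -tau * diag
    for p in range(S):
        R[p, S] = -R[p, :S].sum()                       # rate to c, from (8.8)
        R[S, p] = -(g @ R[:S, p]) / eps                 # rate from c, from (8.7)
    R[S, S] = -(g @ R[:S, S]) / eps                     # from (8.9); (8.10) then holds
    return R, np.append(g, eps)

# sanity checks: zero row sums (dissipativity) and stationarity of g
# R, g_full = combined_generator(Q, A, m, nu)
# assert np.allclose(R.sum(axis=1), 0) and np.allclose(g_full @ R, 0)
```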

Remark. An alternative way of choosing τε > 0 such that properties (8.5) and (8.6) hold true is to put τε(xi, zj) = τ ∀ i, j, where the value τ is defined as the smallest positive number such that ∀ xi, zj,

    ∑_{(x,z)∈X×Z} (−τ)^{1_{zj}(z) 1_{xi}(x)} |Ax(zj|z) Qz(xi|x)| g(x, z) ≤ 0,   and
    ∑_{(x′,z′)∈X×Z} (−τ)^{1_{zj}(z′) 1_{xi}(x′)} |Axi(z′|zj) Qzj(x′|xi)| ≤ 0.

Such a choice is more faithful to the original families {Ax}, {Qz}, as it changes all speeds uniformly, so we call it a uniform choice. In this discussion the uniform choice is not used, for it does not partition X × Z into the three types described above.

8.3 Application of the minimal construction to queueing theory

We apply the minimal construction to an example. The example captures a phenomenon in queueing systems and exhibits a general interaction between the environment and basic processes in which the use of the transition state c is transparent.

8.3.1 A formal description

Here we set X = {0, 1, 2, . . . , N} and Z = {µ0, µ1, . . . , µM}, where µ0 = 1/2 and µj = 1/2 + j/M, j = 1, 2, . . . , M. Intuitively, a basic state represents the number of jobs in the queue, whereas the state of the environment represents the efficiency of the server. A higher value of µ leads to a higher output rate of the server. For a fixed µ ∈ Z,

    Qµ(i+l|i) = 0   for all l with |l| > 1,
    Qµ(i+1|i) = 1,   i = 0, 1, . . . , N−1,
    Qµ(i−1|i) = µ,   i = 1, 2, . . . , N,
    Qµ(i|i) = −(Qµ(i−1|i) + Qµ(i+1|i)) = −(µ+1),   i = 1, . . . , N−1,
    Qµ(0|0) = −1,   Qµ(N|N) = −µ.     (8.13)

Observe that mµ(i) = (1/η(µ)) (1/µ^i) is a stationary probability measure for this process, where η(µ) = ∑_{i=0}^{N} 1/µ^i is a normalizing constant. (It is the only stationary probability measure, as the basic process is irreducible.) Next, for a fixed i ∈ X \ {0},

    Ai(µ_{j+l}|µ_j) = 0   for all l with |l| > 1,
    Ai(µ_{j+1}|µ_j) = i,   j = 0, . . . , M−1,
    Ai(µ_{j−1}|µ_j) = 1,   j = 1, . . . , M,
    Ai(µ_j|µ_j) = −(Ai(µ_{j+1}|µ_j) + Ai(µ_{j−1}|µ_j)) = −(i+1),   j = 1, . . . , M−1,
    Ai(µ_0|µ_0) = −i,   Ai(µ_M|µ_M) = −1.     (8.14)

The Markov chain on Z with generator Ai has a unique stationary probability measure νi(µ_j) = i^j / σ(i), where σ(i) = ∑_{j=0}^{M} i^j. For i = 0 we set

    A0(µ_{j+l}|µ_j) = 0   for all l with |l| > 1,
    A0(µ_{j+1}|µ_j) = 1/M^2,   j = 0, . . . , M−1,
    A0(µ_{j−1}|µ_j) = 1,   j = 1, . . . , M,
    A0(µ_j|µ_j) = −1 − 1/M^2,   j = 1, . . . , M−1,
    A0(µ_0|µ_0) = −1/M^2,   A0(µ_M|µ_M) = −1.     (8.15)

The stationary measure for the generator A0 is ν0(µ_j) = 1/(M^{2j} σ(0)), where σ(0) = 1 + O(1/M^2). As before, we construct a dissipative operator Rε on Y = (X × Z) ∪ {c} by using (8.4):

    Rε(i′, µ′|i, µ) = (−τε(i, µ))^{1_µ(µ′) 1_i(i′)} |Ai(µ′|µ) Qµ′(i′|i)|   for (i, µ), (i′, µ′) ∈ X × Z.

Again choose τε(i, µ) minimally as in (8.12). Next, set

    g((i, µ)) = mµ(i) νi(µ), ∀ (i, µ) ∈ X × Z;   g(c) = ε.     (8.16)

Further define Rε(i′, µ′|c), Rε(c|i, µ) and Rε(c|c) by using (8.7), (8.8) and (8.9). From Theorem 8.2.1 we conclude that Rε is a dissipative operator; it is the generator of a Markov process on Y with g serving as its unique stationary measure. As M increases, we can see when the process jumps from the natural space X × Z to c. Indeed, M → ∞ means that the spacing between consecutive levels of the server output shrinks to zero. (Formally, the process on Z, for a fixed i, becomes like a diffusion.)

Remark. An intuitive picture of how the basic and environment processes interact is as follows:

1. For a fixed µ ∈ Z, the basic process is a continuous-time Markov chain on X. The state of the basic process captures the number of jobs in the queue. The arrival rate of jobs in the queue is 1, and the rate at which jobs are cleared from the queue is µ. We think of µ as the efficiency at which the server is operating.

2. For fixed i > 1, the server looks to increase the output performance captured by the value of µ. This is achieved by a drift towards higher values of µ, and the drift grows with i. The value of i captures the number of jobs in the queue: the more jobs there are, the more fervently the server tries to drive towards higher efficiency. When i = 0, the server tries to save power by developing a drift towards lower productivity.

3. State c can be interpreted as a maintenance state which the process attains in order to do repairs. Exactly when a repair is needed is evident from specifying the states from which jumps to c are possible. The nature of the performed repairs is revealed by identifying the states to which jumps from c are possible.

4. In its present form the construction works only when the numbers of basic and environment states are both finite. That is, we cannot directly put M = ∞ or N = ∞, because the algorithm for finding the elements of the matrix Rε must terminate. This makes it interesting to analyze the limits as M, N → ∞.
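The following sketch instantiates the generators (8.13)–(8.15) and the stationary measures mµ and νi numerically; its output could be fed to the hypothetical combined_generator sketch given after (8.12). The helper name queueing_example and the specific default values of N and M are illustrative assumptions, not part of the text.

```python
import numpy as np

def queueing_example(N=4, M=10):
    """Illustrative sketch: builds Q_mu (8.13), A_i (8.14)/(8.15) and their stationary vectors."""
    mus = np.array([0.5 + j / M for j in range(M + 1)])   # mu_0 = 1/2, ..., mu_M = 3/2
    Q, m = [], []
    for mu in mus:
        Qm = np.zeros((N + 1, N + 1))
        for i in range(N):
            Qm[i, i + 1] = 1.0           # arrival rate 1
            Qm[i + 1, i] = mu            # service rate mu
        np.fill_diagonal(Qm, -Qm.sum(axis=1))
        mm = mu ** (-np.arange(N + 1))   # m_mu(i) proportional to mu^{-i}
        Q.append(Qm); m.append(mm / mm.sum())
    A, nu = [], []
    for i in range(N + 1):
        up = i if i > 0 else 1.0 / M**2  # upward drift i for i >= 1, 1/M^2 for i = 0 (8.15)
        Ai = np.zeros((M + 1, M + 1))
        for j in range(M):
            Ai[j, j + 1] = up
            Ai[j + 1, j] = 1.0
        np.fill_diagonal(Ai, -Ai.sum(axis=1))
        ni = up ** np.arange(M + 1)      # nu_i(mu_j) proportional to up^j
        A.append(Ai); nu.append(ni / ni.sum())
    return Q, A, m, nu

# Q, A, m, nu = queueing_example()
# R, g = combined_generator(Q, A, m, nu, eps=1e-3)   # sketch from Section 8.2.2
```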

8.3.2 A technical analysis

Throughout the rest of Chapter 8, N is considered fixed, and we study what happens as M tends to infinity. Item 3 of the previous remark makes it important to delineate from which states in the natural space jumps to c are possible, and to which states jumps from c are possible. The main tool for studying this will be Theorem 8.3.1. The final conclusions are given in the remarks at the end of the subsection; before doing all this we make some technical considerations. Observe that ∀ i, j: νi(µ_{j+1}) = i νi(µ_j) and m_{µ_{j+1}}(i) = m_{µ_j}(i) (1 + o(1/M^{7/8})). Next:

    ν_{i+1}(µ_j) / ν_i(µ_j) = [(i+1)^j i / ((i+1)^{M+1} − 1)] / [i^j (i−1) / (i^{M+1} − 1)]
                            = [i^{M+2−j} / ((i+1)^{M+1−j} (i−1))] (1 + o(1/M)),

    ν_{i−1}(µ_j) / ν_i(µ_j) = [i^{M+1−j} (i−2) / (i−1)^{M+2−j}] (1 + o(1/M)),   ∀ i > 2.     (8.17)

The foregoing argument implies the following recurrence relations: when i ≠ 0, 1,

    g(i, µ_{j+1}) = i g(i, µ_j) (1 + o(1/M^{7/8})),   hence
    g(i, µ_{j+l}) = i^l g(i, µ_j) (1 + o(1/M^{1/9}))   when 0 < l < M^{3/4};
    e^{−C M^{1/8}} i^l g(i, µ_j) (1 + o(1/M^{1/9})) ≤ g(i, µ_{j+l}) ≤ e^{C M^{1/8}} i^l g(i, µ_j) (1 + o(1/M^{1/9})).     (8.18)

Here C is a positive constant, depending on N. Furthermore,

    g(i, µ_{j−1}) = (1/i) g(i, µ_j) (1 + o(1/M^{7/8})),
    g(i+1, µ_j) = g(i, µ_j) (1/µ_j) [i^{M+2−j} / ((i+1)^{M+1−j} (i−1))] (1 + o(1/M))   (when i ≥ 2),
    g(i−1, µ_j) = g(i, µ_j) µ_j [i^{M+1−j} (i−2) / ((i−1)^{M+1−j} (i−1))] (1 + o(1/M))   (when i ≥ 3).     (8.19)

Observe that the term denoted o(1/M) above can be chosen so that its decay as M → ∞ depends only on N and is uniform in i and j. (This fact will be used in subsequent calculations.) Formally, this is the statement that given ε > 0 there exists M′′ > 0 such that ∀ M′ > M′′ and ∀ i, j we have M′^{7/8} [g(i, µ_{j+1}) / (i g(i, µ_j)) − 1] ≤ ε. A similar uniform bound holds for (8.17), (8.18) and (8.19). The following theorem follows directly from the construction.

Theorem 8.3.1. A jump from (i, µ_j) to c is possible if and only if

    [∑_{(i′,µ_{j′})∈X×Z, (i′,µ_{j′})≠(i,µ_j)} |A_{i′}(µ_j|µ_{j′}) Q_{µ_j}(i|i′)| g(i′, µ_{j′})] / [A_i(µ_j|µ_j) Q_{µ_j}(i|i) g(i, µ_j)]
        ≥ [∑_{(i′,µ_{j′})∈X×Z, (i′,µ_{j′})≠(i,µ_j)} |A_i(µ_{j′}|µ_j) Q_{µ_{j′}}(i′|i)|] / [A_i(µ_j|µ_j) Q_{µ_j}(i|i)]

    ⇔ ∑_{(i′,µ_{j′})≠(i,µ_j)} |A_{i′}(µ_j|µ_{j′}) Q_{µ_j}(i|i′)| g(i′, µ_{j′}) ≥ ∑_{(i′,µ_{j′})≠(i,µ_j)} |A_i(µ_{j′}|µ_j) Q_{µ_{j′}}(i′|i)| g(i, µ_j)

    ⇔ τε(i, µ_j) = [∑_{(i′,µ_{j′})≠(i,µ_j)} |A_{i′}(µ_j|µ_{j′}) Q_{µ_j}(i|i′)| g(i′, µ_{j′})] / [A_i(µ_j|µ_j) Q_{µ_j}(i|i) g(i, µ_j)].     (8.20)

In what follows we characterize when (8.20) holds true for M large enough. We will use the abbreviation g(i, µ_j) = g(i, j). When i ≠ 0, 1, N and j ≠ 0, M, we have that

    ∑_{(i′,µ_{j′})≠(i,µ_j)} (−1)^{1_j(j′)} (−1)^{1_i(i′)} A_i(µ_{j′}|µ_j) Q_{µ_{j′}}(i′|i) g(i, j)
        = g(i, j) [2(µ_{j−1} + 1) + (i+1)(µ_j + 1) + 2i(µ_{j+1} + 1)]
        = g(i, j) [3µ_j + 3µ_j i + 3i + 3 + o(1/√M)].     (8.21)

The following calculation is valid for i ≠ 0, 1, 2 ((8.18) and (8.19) are instrumental here):

    ∑_{(i′,µ_{j′})≠(i,µ_j)} |A_{i′}(µ_j|µ_{j′}) Q_{µ_j}(i|i′)| g(i′, j′)
        = µ_j (g(i+1, j−1) + 2g(i+1, j) + g(i, j+1) + g(i+1, j+1))
          + µ_j i (g(i, j−1) + g(i+1, j−1) + g(i+1, j))
          + i (g(i−1, j−1) + g(i, j−1) + g(i−1, j))
          + 1 · (g(i, j+1) + g(i−1, j+1) + g(i−1, j) − g(i−1, j−1))
        ≥ (i−1) g(i−1, j) = g(i, j) [ (i/(i−1))^{M+1−j} µ_j (i−2) + o(1/√M) ].     (8.22)

Now recall that N is fixed and µ_j ≤ 3/2. As i ≠ 0, 1, 2, comparing (8.22) and (8.21) shows that for a given ε′ ∈ (0, 1) we can choose M large enough such that ∀ j ≤ ε′ × M and ∀ i > 1 the conditions in (8.20) are satisfied. (In fact, direct computations show the above fact to be true for i = 2 as well.)

Remark. When i ≠ 0, 1, the environment has a propensity to move towards higher efficiency. When j ≤ ε′ × M, the system achieves this by jumping to the transition state c. The minimality property in (8.12) ensures that the process cannot revert from c to a low-efficiency state; instead, it must progress to a higher efficiency of the server. It can also reject some jobs and go to a state with i ∈ {0, 1}; in the next section we refine the construction to prevent such a possibility.

To carry out the calculations for i = 0, observe that for j ≠ 0:

    ∑_{(i′,µ_{j′})≠(0,µ_j)} |A_0(µ_{j′}|µ_j) Q_{µ_{j′}}(i′|0)| g(0, j) = o(1/M^2) × 1/η(µ_j),     (8.23)

and

    ∑_{(i′,µ_{j′})≠(0,µ_j)} |A_{i′}(µ_j|µ_{j′}) Q_{µ_j}(0|i′)| g(i′, j′) ≥ A_1(µ_j|µ_{j−1}) Q_{µ_j}(0|1) g(1, j−1) = (µ_j/µ_{j−1}) (1/M) (1/η(µ_{j−1})).     (8.24)

Now, for M large enough, by virtue of (8.20) a jump from (0, µ_j) to c is possible when j ≠ 0. On the other hand, when j = 0, a comparison of the two terms in (8.20) reveals that the process can jump from c to (0, µ_0). Minimality then guarantees that a jump in the opposite direction is impossible.

Remark. When i = 0, the natural disposition of the process is to attain the lowest efficiency. If µ ≠ 1/2, the process jumps to c, from where, by minimality (8.12), it may go to µ = 1/2. Or it can create some jobs for itself and attain a state with i > 0. This is impractical but possible under the present construction. In the next section we make revisions to outlaw this.

8.4 A construction with multiple maintenance states

The remarks in the previous section necessitate a construction more suitable to the queueing-system philosophy. Specifically, the maintenance state in the single-state construction for the queueing example may create or delete jobs, which is undesirable. To this end we introduce here multiple transition states. We first describe the construction in the specific case of the queueing example; Section 8.4.2 contains theorems showing that in this multiple-maintenance-state framework the above-mentioned creation and deletion of jobs has “negligible” probability. This section also shows that not only is the construction robust enough to handle generalizations to multiple transition states, but such generalizations are often intuitively necessary.

Suppose the sets X, Z, the families Ai, Qµ and the functions mµ(i), νi(µ) are as in Section 8.3. Set Y = (X × Z) ∪ {c0, c1, c2, . . . , cN}. We will construct a dissipative matrix Rε on Y by using the families of operators Ai and Qµ, which will have g (defined below) as a stationary measure. We set

    g((i, µ)) = mµ(i) νi(µ), ∀ (i, µ) ∈ X × Z;   g(c) = ε, ∀ c ∈ {c0, c1, c2, . . . , cN}.     (8.25)

The construction below is similar to the one described in Section 8.2.1, except for the multiplicity of transition states. This can be understood as follows: we partition the natural space X × Z of Section 8.3.1 into N + 1 parts according to the number of jobs in the queue, and we make sure ci interacts only with elements of the i-th part. There might still be jumps between the ci’s, but the probability of such jumps is shown to be very small (see Theorems 8.4.1 and 8.4.2). Thus the maintenance states only alter the server efficiency, rarely creating or rejecting jobs. The construction can be applied in a variety of settings where there is a partition of the natural space as described above.

8.4.1 Description of the refined construction on the example

We construct Rε, a matrix of dimension |Y|×|Y|. We describe its entries in a sequence of steps.In order to ensure the matrix is dissipative, the row sums must be zero, and all non-diagonalentries non-negative. We use the notation introduced in Section 8.1.

1. Given (i, µ), (i′, µ′) ∈ X × Z, define

    Rε(i′, µ′|i, µ) = (−τε(i, µ))^{1_µ(µ′) 1_i(i′)} |Ai(µ′|µ) Qµ′(i′|i)|,     (8.26)

where

    τε(i, µ) = max{ [∑_{(i′,µ′)∈X×Z, (i′,µ′)≠(i,µ)} |A_{i′}(µ|µ′) Qµ(i|i′)| g(i′, µ′)] / [Ai(µ|µ) Qµ(i|i) g(i, µ)] ,
                    [∑_{(i′,µ′)∈X×Z, (i′,µ′)≠(i,µ)} |Ai(µ′|µ) Qµ′(i′|i)|] / [Ai(µ|µ) Qµ(i|i)] }.     (8.27)

2. Next, set Rε(cj|(i, µ)) = Rε((i, µ)|cj) = 0 for all (i, µ) ∈ X × Z and j ∈ X with j ≠ i. Furthermore, ∀ (i, µ) ∈ X × Z define

    Rε(ci|(i, µ)) = 0,   when τε(i, µ) = [∑_{(i′,µ′)≠(i,µ)} |Ai(µ′|µ) Qµ′(i′|i)|] / [Ai(µ|µ) Qµ(i|i)];
    Rε(ci|(i, µ)) = −∑_{(i′,µ′)∈X×Z} Rε((i′, µ′)|(i, µ)),   otherwise.

Note that the way in which we constructed τε(i, µ) in (8.27) ensures that Rε(ci|(i, µ)) ≥ 0.

3. Define

    Rε((i, µ)|ci) = 0,   when τε(i, µ) = [∑_{(i′,µ′)≠(i,µ)} |A_{i′}(µ|µ′) Qµ(i|i′)| g(i′, µ′)] / [Ai(µ|µ) Qµ(i|i) g(i, µ)];
    Rε((i, µ)|ci) = −[∑_{(i′,µ′)∈X×Z} Rε((i, µ)|(i′, µ′)) g(i′, µ′)] / ε,   otherwise.

Again, note that Rε((i, µ)|ci) ≥ 0. The choices made thus far guarantee that ∀ (i, µ) ∈ X × Z we have

    ∑_{(i′,µ′)∈X×Z} Rε((i′, µ′)|(i, µ)) + Rε(ci|(i, µ)) = 0,
    ∑_{(i′,µ′)∈X×Z} Rε((i, µ)|(i′, µ′)) g(i′, µ′) + Rε((i, µ)|ci) × ε = 0.     (8.28)

4. All that remains is to choose Rε(cj|ci), where i, j = 0, 1, . . . , N. For any i, j with |i − j| > 1, set Rε(cj|ci) = 0.

5. For |i − j| ≤ 1 we define Rε(cj|ci) in a recurrent manner. First, set

    V_r(N) = ∑_{µ∈Z} Rε(cN|(N, µ)) × g(N, µ),
    V_c(N) = ∑_{µ∈Z} Rε((N, µ)|cN) × ε,
    V_m(N) = max{V_r(N), V_c(N)}.     (8.29)

Next, choose Rε(cN|cN) = −V_m(N)/ε and

    Rε(cN−1|cN) = 0,   when V_c(N) ≥ V_r(N);   = (V_r(N) − V_c(N))/ε,   otherwise.

Also set

    Rε(cN|cN−1) = 0,   when V_c(N) ≤ V_r(N);   = (V_c(N) − V_r(N))/ε,   otherwise.

These choices ensure that

    Rε(cN−1|cN) ≥ 0,   Rε(cN|cN−1) ≥ 0,   Rε(cN|cN) < 0,   Rε(cN−1|cN) Rε(cN|cN−1) = 0,
    ∑_{µ∈Z} Rε((N, µ)|cN) + Rε(cN|cN) + Rε(cN−1|cN) = 0,
    ∑_{µ∈Z} Rε(cN|(N, µ)) g(N, µ) + ε × (Rε(cN|cN) + Rε(cN|cN−1)) = 0.     (8.30)

This indicates that either jumps from cN to cN−1 or jumps from cN−1 to cN are impossible.

6. We now define the quantities Rε(cl|cl′) with |l − l′| ≤ 1, l′ ≠ 0. The cases l′ = N and l = N have by this time been dealt with. We define the quantities Rε(cj|cj−1), Rε(cj−1|cj) and Rε(cj|cj) with j > 0, given that we have already defined Rε(cj|cj+1) and Rε(cj+1|cj). By recursion we will thus have defined Rε(ci|cj) for all i, j = 0, 1, . . . , N except for i = j = 0. Set

    V_r(j) = ∑_{µ∈Z} Rε(cj|(j, µ)) × g(j, µ) + ε × Rε(cj|cj+1),
    V_c(j) = ∑_{µ∈Z} Rε((j, µ)|cj) × ε + Rε(cj+1|cj) × ε,
    V_m(j) = max{V_r(j), V_c(j)}.     (8.31)

Choose Rε(cj|cj) = −V_m(j)/ε. Choose

    Rε(cj−1|cj) = 0,   when V_c(j) ≥ V_r(j);   = (V_r(j) − V_c(j))/ε,   otherwise.

Also select

    Rε(cj|cj−1) = 0,   when V_c(j) ≤ V_r(j);   = (V_c(j) − V_r(j))/ε,   otherwise.

Note that these choices ensure

    Rε(cj−1|cj) ≥ 0,   Rε(cj|cj−1) ≥ 0,   Rε(cj|cj) < 0,   Rε(cj−1|cj) Rε(cj|cj−1) = 0,
    ∑_{µ∈Z} Rε((j, µ)|cj) + Rε(cj|cj) + Rε(cj−1|cj) + Rε(cj+1|cj) = 0,
    ∑_{µ∈Z} Rε(cj|(j, µ)) g(j, µ) + ε (Rε(cj|cj) + Rε(cj|cj−1) + Rε(cj|cj+1)) = 0.     (8.32)

7. Finally, when j = 0, we define Rε(c0|c0) = −[∑_{µ∈Z} Rε((0, µ)|c0) + Rε(c1|c0)], which implies that ∑_{µ∈Z} Rε((0, µ)|c0) + Rε(c0|c0) + Rε(c1|c0) = 0. One can prove an assertion similar to Theorem 8.2.1 to get that

    ∑_{µ∈Z} Rε(c0|(0, µ)) g(0, µ) + ε (Rε(c0|c0) + Rε(c0|c1)) = 0.

Thus we have now defined a combined process on Y with g (as defined in (8.25)) as a stationary measure.
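As an illustration of steps 5–7, the following is a minimal sketch of the recursion that fills in the rates between the maintenance states, assuming the rates Rε(cj|(j, µ)) and Rε((j, µ)|cj) have already been computed as in steps 1–3. The helper name maintenance_chain_rates and the array layout are assumptions made for the sketch.

```python
import numpy as np

def maintenance_chain_rates(R_to_c, R_from_c, g, eps):
    """Sketch of steps 5-7 of Section 8.4.1 (hypothetical helper, not from the text).

    R_to_c[j]   : array over mu of the rates R_eps(c_j | (j, mu))
    R_from_c[j] : array over mu of the rates R_eps((j, mu) | c_j)
    g[j]        : array over mu of g(j, mu)
    Returns the (N+1) x (N+1) block C with C[l', l] = R_eps(c_l | c_{l'}),
    built recursively from j = N down to j = 0.
    """
    N = len(g) - 1
    C = np.zeros((N + 1, N + 1))
    Rc_up = 0.0      # R_eps(c_{j+1} | c_j); absent for j = N
    Rr_up = 0.0      # R_eps(c_j | c_{j+1}); absent for j = N
    for j in range(N, 0, -1):
        Vr = R_to_c[j] @ g[j] + eps * Rr_up            # (8.29) / (8.31)
        Vc = R_from_c[j].sum() * eps + Rc_up * eps
        Vm = max(Vr, Vc)
        C[j, j] = -Vm / eps
        C[j, j - 1] = 0.0 if Vc >= Vr else (Vr - Vc) / eps   # R_eps(c_{j-1} | c_j)
        C[j - 1, j] = 0.0 if Vc <= Vr else (Vc - Vr) / eps   # R_eps(c_j | c_{j-1})
        Rc_up, Rr_up = C[j - 1, j], C[j, j - 1]
    # step 7: j = 0 closes the remaining row sum
    C[0, 0] = -(R_from_c[0].sum() + C[0, 1])
    return C
```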

8.4.2 Analysis of transitional rates

In the above construction, each state ci can be connected to (j, µ) only if j = i. Consequently, the only way for jobs to be created or deleted during maintenance is by communication between the various ci’s. However, the probabilities Rε(cj|ci)/|Rε(ci|ci)| will be shown to be “small” in this section. We start our analysis with i = N, and then proceed to smaller i’s.

Theorem 8.4.1. For N ≥ 2,

    max[Rε(cN|cN−1), Rε(cN−1|cN)] = o(1/M^{1/9}) × min[|Rε(cN|cN)|, |Rε(cN−1|cN−1)|].

Proof. For simplicity we give the proof for N > 2 (see the Remark at the end). We attempt to estimate the rates Rε(cN−1|cN) and Rε(cN|cN−1). Note that

    max[Rε(cN|cN−1), Rε(cN−1|cN)] = |V_c(N) − V_r(N)|/ε,   min[Rε(cN|cN−1), Rε(cN−1|cN)] = 0.

So |V_c(N) − V_r(N)| (cf. (8.29)) is the main term that needs to be estimated. To this end, set

    τ1(N) = ∑_{µ_n, µ_j, i: i≠N} |A_N(µ_j|µ_n)| Q_{µ_j}(i|N) g(N, µ_n);   τ2(N) = ∑_{µ_n, µ_j, i: i≠N} |A_i(µ_j|µ_n)| Q_{µ_j}(N|i) g(i, µ_n).     (8.33)

Here g is as in (8.25). The purpose of (8.33) is clear from the following formulas:

    V_r(N) = −∑_{µ_n, i, µ_j} Rε((i, µ_j)|(N, µ_n)) × g(N, µ_n) = −∑_{µ_n, µ_j} Rε((N, µ_j)|(N, µ_n)) g(N, µ_n) − τ1(N),
    V_c(N) = −∑_{µ_n, i, µ_j} Rε((N, µ_j)|(i, µ_n)) × g(i, µ_n) = −∑_{µ_n, µ_j} Rε((N, µ_j)|(N, µ_n)) g(N, µ_n) − τ2(N).

The above equations follow from (8.29), the dissipativity of Rε and the fact that g is a stationary measure for Rε.

It follows that |V_r(N) − V_c(N)| = |τ1(N) − τ2(N)|. Next we reduce τ1(N) and τ2(N) to more comparable expressions.

In the argument below, we use (i) the dissipativity of Q_{µ_j}, (ii) the fact that |n − j| ≤ 1 implies m_{µ_n}(N) = m_{µ_j}(N)(1 + o(1/√M)), and (iii) the fact that νN is a stationary measure for AN:

    τ1(N) = ∑_{µ_n, µ_j, i: i≠N} |A_N(µ_j|µ_n)| Q_{µ_j}(i|N) g(N, µ_n) = ∑_{µ_n, µ_j} |A_N(µ_j|µ_n)| |Q_{µ_j}(N|N)| g(N, µ_n)
          = ∑_{µ_j} |Q_{µ_j}(N|N)| m_{µ_j}(N) ∑_{µ_n} |A_N(µ_j|µ_n)| νN(µ_n) (1 + o(1/√M))
          = 2 ∑_{µ_j} |Q_{µ_j}(N|N)| m_{µ_j}(N) |A_N(µ_j|µ_j)| νN(µ_j) (1 + o(1/√M))
          = 2 ∑_{µ_j, i: i≠N} Q_{µ_j}(N|i) m_{µ_j}(i) |A_N(µ_j|µ_j)| νN(µ_j) (1 + o(1/√M)).

Similarly,

    τ2(N) = ∑_{µ_n, µ_j, i: i≠N} |A_i(µ_j|µ_n)| Q_{µ_j}(N|i) g(i, µ_n)
          = ∑_{µ_n, µ_j, i: i≠N} Q_{µ_j}(N|i) m_{µ_j}(i) νi(µ_n) |A_i(µ_j|µ_n)| (1 + o(1/√M))
          = 2 ∑_{µ_j, i: i≠N} Q_{µ_j}(N|i) m_{µ_j}(i) |A_i(µ_j|µ_j)| νi(µ_j) (1 + o(1/√M)).

The rest of the proof is devoted to examining |τ1(N) − τ2(N)|. Observe that

    |τ1(N) − τ2(N)| = | 2 ∑_{µ_j, i: i≠N} Q_{µ_j}(N|i) m_{µ_j}(i) [ |A_N(µ_j|µ_j)| νN(µ_j) − |A_i(µ_j|µ_j)| νi(µ_j) ] (1 + o(1/√M)) |.     (8.34)

Using the fact that the only non-zero contributions come from i = N − 1, the last expression equals

    | 2 ∑_{µ_j} Q_{µ_j}(N|N−1) m_{µ_j}(N−1) ( |A_N(µ_j|µ_j)| νN(µ_j) − |A_{N−1}(µ_j|µ_j)| ν_{N−1}(µ_j) ) (1 + o(1/√M)) |.     (8.35)

By virtue of (8.17) and (8.18), this coincides with

    2 | ∑_{j=0}^{M−1} µ_j g(N, µ_j) (N + 1 − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2)) (1 + o(1/√M))
        + µ_M g(N, µ_M) (1 − N(N−2)/(N−1)^2) (1 + o(1/√M)) |

    = 2 | ∑_{j=0}^{M−1} µ_j g(N, µ_j) (N + 1 − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2)) (1 + o(1/√M))
          + µ_M g(N, µ_{M−1−M^{3/4}}) N^{M^{3/4}+1} (1/(N−1)^2) (1 + o(1/M^{1/9})) |.     (8.36)

The definitions below break the sum in (8.36) into two parts, which will be considered separately:

    U1 = ∑_{j=M−1−M^{3/4}}^{M−1} µ_j g(N, µ_j) (N + 1 − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2)),
    U2 = ∑_{j=0}^{M−1−M^{3/4}} µ_j g(N, µ_j) (N + 1 − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2)).

Formally, ∑_{j=0}^{M−1} µ_j g(N, µ_j) (N + 1 − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2)) = U1 + U2.

Using the formula for geometric progressions and the fact that g(i, µ_{j+l}) = i^l g(i, µ_j)(1 + o(1/M^{1/9})) when 0 < l < M^{3/4}, we obtain:

    U1 = ∑_{j=M−1−M^{3/4}}^{M−1} µ_j g(N, µ_j) ((N+1) − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2))
       = ∑_{j=0}^{M^{3/4}} µ_M g(N, µ_{M−1−M^{3/4}}) N^j ((N+1) − (N/(N−1))^{M^{3/4}+3} ((N−1)/N)^j (N−2)) (1 + o(1/M^{1/9}))
       = µ_M g(N, µ_{M−1−M^{3/4}}) ( ((N+1)/(N−1)) N^{M^{3/4}+1} − (N/(N−1))^{M^{3/4}+3} (N−1)^{M^{3/4}+1} ) (1 + o(1/M^{1/9}))
       = µ_M g(N, µ_{M−1−M^{3/4}}) N^{M^{3/4}+1} ( (N+1)/(N−1) − (N/(N−1))^2 ) (1 + o(1/M^{1/9}))
       = −µ_M g(N, µ_{M−1−M^{3/4}}) N^{M^{3/4}+1} (1/(N−1)^2) (1 + o(1/M^{1/9})).     (8.37)

In what follows we use the fact that

    e^{−C M^{1/8}} i^l g(i, µ_j)(1 + o(1/M^{1/9})) ≤ g(i, µ_{j+l}) ≤ e^{C M^{1/8}} i^l g(i, µ_j)(1 + o(1/M^{1/9})),

where C is a positive constant, depending on N. This yields

    |U2| = −∑_{j=0}^{M−1−M^{3/4}} µ_j g(N, µ_j) ((N+1) − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2))
         ≤ −∑_{j=0}^{M−1−M^{3/4}} e^{C M^{1/8}} µ_M g(N, µ_0) N^j ((N+1) − (N/(N−1))^{M+2} ((N−1)/N)^j (N−2)) (1 + o(1/M^{1/9}))
         = −e^{C M^{1/8}} µ_M g(N, µ_0) ( ((N+1)/(N−1)) N^{M−M^{3/4}} − (N/(N−1))^{M+2} (N−1)^{M−M^{3/4}} ) (1 + o(1/M^{1/9}))
         = e^{C M^{1/8}} µ_M g(N, µ_0) N^{M−M^{3/4}} ( (N+1)/(N−1) − (N/(N−1))^{M^{3/4}+2} ) (1 + o(1/M^{1/9}))
         ≤ e^{2C M^{1/8}} µ_M g(N, µ_{M−1−M^{3/4}}) ( (N+1)/(N−1) − (N/(N−1))^{M^{3/4}+2} ) (1 + o(1/M^{1/9})).     (8.38)

Equations (8.37) and (8.38) show that U1 + U2 = U1 (1 + o(1/M^{1/9})).

By using (8.36) and (8.37), we get

    |τ1(N) − τ2(N)| = o(1/M^{1/9}) × µ_M g(N, µ_M) × (1/(N−1)^2) (1 + o(1/M^{1/9})).

Also |Rε(cN|cN)| ≥ Rε((N, µ_M)|cN). A direct computation shows that Rε((N, µ_M)|cN) = [µ_M g(N, µ_M) / (ε (N−1)^2)] (1 + o(1/M^{1/9})), Rε(cN|(N, µ_M)) = 0, and Rε((N, µ_M)|(N, µ_M)) = (1 + o(1/M^{1/9})) × (−3µ_M).

This completes the proof of Theorem 8.4.1.

Remark. For N = 2, the above proof does not work, but a direct computation will yield thesame result.

Theorem 8.4.2. For N ≥ N′ ≥ 2, given that max[Rε(c_{N′}|c_{N′+1}), Rε(c_{N′+1}|c_{N′})] = o(1/M^{1/9}) |Rε(c_{N′}|c_{N′})|, we have

    max[Rε(c_{N′}|c_{N′−1}), Rε(c_{N′−1}|c_{N′})] = o(1/M^{1/9}) × min[|Rε(c_{N′}|c_{N′})|, |Rε(c_{N′−1}|c_{N′−1})|].

Proof. Similar to the proof of Theorem 8.4.1.

8.5 Closing comments

1. In Section 8.4, for any r > 0 (possibly depending on the parameter M), suppose we replace the family of operators Ai by A_i^r, defined as

    A_i^r(µ′|µ) = r × Ai(µ′|µ), ∀ i ∈ X, µ′, µ ∈ Z,

and carry out the construction by using A_i^r instead of Ai. All the ensuing results would still hold true. Multiplying everything by r represents a universal and uniform speeding up of the environment Markov chain. One might think that the asymptotic results emerging as M → ∞ hold only because the environment process is being equipped with a larger number of possible states between 1/2 and 3/2 without being given adequate speed to run through them, so that to compensate it needs c to make long jumps. However, this view is dispelled by the argument above.

2. The use of multiple maintenance states suggests an interesting direction. As was noted in Section 8.2.1, the transition state c has been introduced because we could not choose τε(x, z) > 0 such that

    ∑_{(x,z)∈X×Z} (−τε(xi, zj))^{1_{zj}(z) 1_{xi}(x)} |Ax(zj|z) Qz(xi|x)| g(x, z) = 0,
    ∑_{(x′,z′)∈X×Z} (−τε(xi, zj))^{1_{zj}(z′) 1_{xi}(x′)} |Axi(z′|zj) Qzj(x′|xi)| = 0.     (8.39)

Instead we selected τε(x, z) > 0 to satisfy (8.5) and (8.6). This forced us into using c to accommodate (A) and (B) of Section 8.2.1. Now assume that the natural space X × Z can be partitioned into parts {Bh}_{h=1}^{H} so that there exists τε(x, z) > 0 satisfying, for every Bh,

    ∑_{(xi,zj)∈Bh} ∑_{(x,z)∈X×Z} (−τε(xi, zj))^{1_{zj}(z) 1_{xi}(x)} |Ax(zj|z) Qz(xi|x)| g(x, z) = 0,
    ∑_{(xi,zj)∈Bh} ∑_{(x′,z′)∈X×Z} (−τε(xi, zj))^{1_{zj}(z′) 1_{xi}(x′)} |Axi(z′|zj) Qzj(x′|xi)| = 0.

In this case we can use a collection of maintenance states {ch}_{h=1}^{H} such that each ch interacts only with the partition element Bh, and there is no interaction between c_{h1} and c_{h2} for 1 ≤ h1 < h2 ≤ H. If we can choose a partition whose parts are small, we get a satisfying construction: whenever a maintenance state ch is attained, the ‘repair’ is not drastic, as the process returns to a state close to the one it occupied immediately before it jumped to ch (in the same Bh).

With this in mind, one may consider a pair of matrix families {Ax}, {Qz} for which such a partition of the natural space can be realized as representing a “nice” form of interaction between the environment and the basic states. In fact, these constructions can be generalized to the situation where X and Z are compact metric spaces. This is discussed in the next chapter, where the notion of a partition into “small” parts is formally understood as the requirement that each element of the partition has a small diameter.

Chapter 9

Extending to state spaces which are compact metric spaces

Definition 9.0.1. Let E be a compact metric space and m a finite non-negative Borel measure on E. Moreover, let µ(du′, u) be a family of non-negative Borel measures on E, indexed by u ∈ E, and let λ : E → (0,∞) be a continuous mapping of E into IR+. Call the triplet (m, µ(du′, u), λ) a workable structure on E if the following hold:

1. For any measurable set O, the mapping u → µ(O, u) is a continuous mapping of E into IR.

2. sup_{u∈E} µ(E, u) ≤ 1.

3. There exists {En}_{n∈N}, a countable subalgebra generating B(E) (the sigma-algebra of Borel measurable sets in E), such that ∫_E µ(En, u) m(du) ≤ C1 m(En), ∀ n ∈ N, where 0 < C1 < ∞ is a constant.

4. 0 < C_min < λ(u) < C_max < ∞, where C_min, C_max are constants.

Let E be a compact metric space with metric d; the diameter of E is defined as r_E = sup_{x,y∈E} d(x, y). Define the set E^c = E ∪ {c} and equip this set with the metric d_c defined as d_c(x, y) = d(x, y) for x, y ∈ E and d_c(c, y) = d_c(y, c) = r_E, ∀ y ∈ E.

Theorem 9.0.1. Consider a compact metric space E equipped with a workable structure (m, µ(du′, u), λ). Then for any ε > 0 there exist 0 < Cε < ∞, 1 < Gε < ∞ and a Borel probability measure µ̄(du′, c) on E^c, such that the linear operator Rε on C(E^c) defined as

    Rε(f)(u) = λ̄(u) ∫_{E^c} (f(u′) − f(u)) µ̄(du′, u),   with
    λ̄(u) := Gε λ(u), ∀ u ∈ E,   λ̄(c) := Cε,
    µ̄(O, u) := µ(O, u)/Gε, ∀ u ∈ E, ∀ O ∈ B(E),
    µ̄({c}, u) := 1 − µ(E, u)/Gε,     (9.1)

is the generator of a Feller process on E^c with mε as a stationary measure, where

    mε(O) := m(O), ∀ O ∈ B(E);   mε({c}) = ε.     (9.2)

Proof. Rε as defined in (9.1) is the generator of a Markov process on E^c (see Theorem 3.1, Chapter 8 of [43]). We will first construct Cε, Gε and µ̄(du′, c); then we will show that with our definitions of these quantities, ∫_{E^c} Rε(f)(u) mε(du) = 0, ∀ f ∈ C(E^c). This will prove the theorem. Note that since (m, µ(du′, u), λ) is a workable structure, we have 0 < Cmax, Cmin, C1 < ∞ and a subalgebra ϱ = {En}_{n∈N} such that

    ∀ u ∈ E,   Cmin < λ(u) < Cmax,     (9.3)
    ∫_E µ(En, u) m(du) ≤ C1 m(En), ∀ n ∈ N.     (9.4)

Cmin. For f ∈ C(E), define

f ∈ C(Ec) as f(u) = f(u),∀u ∈ E, f(c) = 0, R be a linear continuous function on C(E),defined as R(f)(u) = λ(u)×Gε

∫Ec

(f(u′)− f(u))µ(du′, u). Note µ(∗, u) is a probability measure

on Ec defined in terms of Gε and µ(∗, u) by (9.12).

∫u∈E

R(1En)(u)m(du) =

∫u∈E

λ(u)µ(En, u)m(du)−∫u∈E

λ(u)×Gε × 1En(u)m(du)

≤ Cmax

∫E

µ(En, u)m(du)−Gε × Cminm(En)

≤ CmaxC1m(En)−Gε × Cminm(En)

≤ 0.

Define ˆµ(c) : % → [0,∞) as ˆµ(c)(En) = −∫

u∈ER(1En)(u)m(du). We see that this is a non-

negative valued functions on sets, and since R is linear, we get ˆµ(c) as countably additive.

So we can extend it to a finite measure ˆµ(du, c) on E. Define µ(∗, c) =ˆµ(∗,c)ˆµ(E,c)

, and define

Cε =ˆµ(E,c)ε

. Observe with our choice of Gε, Cε, µ(∗, c), one gets ∀f ∈ C(E).∫u∈Ec

Rε(f)(u)mε(du) = 0

all that remains is to show the function g ∈ C(Ec), defined as g(u) = 0, u ∈ E. g(c) = 1,satisfies

∫u∈Ec

Rε(g)(u)mε(du) = 0. Define 1 ∈ C(Ec) as 1(u) = 1,∀u ∈ Ec, by construction

Rε(1)(u) = 0, ∀u ∈ Ec. Consider h ∈ C(E) as h(u) = 1, ∀u ∈ E. Observe that g = 1− h. Bylinearity of Rε we are done.

Remark. Note that Gε is independent of ε.
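To make the construction of Theorem 9.0.1 concrete, the following is a small numerical sketch on a discretized state space, where E is replaced by a finite grid, the kernel µ(du′, u) by a sub-stochastic matrix and m by a weight vector. The function name attach_transition_state, the grid and the sample kernel are illustrative assumptions, not objects from the text.

```python
import numpy as np

def attach_transition_state(mu, lam, m, C1, eps):
    """Illustrative discrete sketch of Theorem 9.0.1 (not from the text).

    mu  : n x n sub-stochastic kernel, mu[u, v] ~ mu(dv, u), with row sums <= 1
    lam : length-n vector of positive jump rates lambda(u)
    m   : length-n target stationary weights m(du)
    C1  : constant with (m @ mu)[v] <= C1 * m[v] for all v (condition 3)
    Returns the generator R_eps on E u {c} and the extended measure m_eps.
    """
    n = len(m)
    G = max(C1, 1.0) * lam.max() / lam.min()          # G_eps = C1 * Cmax / Cmin
    lam_bar = np.append(G * lam, 0.0)                 # lambda(c) is filled in below
    P = np.zeros((n + 1, n + 1))                      # jump kernel mu_bar on E u {c}
    P[:n, :n] = mu / G
    P[:n, n] = 1.0 - mu.sum(axis=1) / G               # mass redirected to c
    mu_hat = -(m * lam) @ mu + G * lam * m            # mu_hat(dv, c) >= 0 entrywise
    P[n, :n] = mu_hat / mu_hat.sum()                  # normalized jump law from c
    lam_bar[n] = mu_hat.sum() / eps                   # C_eps: c is left very fast
    R = lam_bar[:, None] * (P - np.eye(n + 1))        # generator of the jump process
    m_eps = np.append(m, eps)
    return R, m_eps

# stationarity check on a random example (all names below are illustrative):
# rng = np.random.default_rng(0); n = 6
# mu = rng.random((n, n)); mu /= (mu.sum(axis=1, keepdims=True) + 0.5)
# lam = 1.0 + rng.random(n); m = rng.random(n)
# C1 = max((m @ mu) / m)
# R, m_eps = attach_transition_state(mu, lam, m, C1, eps=1e-3)
# assert np.allclose(m_eps @ R, 0, atol=1e-8)
```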


9.1 Applying the construction to Markov chains in random environments

Let Z be a compact metric space with metric d_z; intuitively, Z is to be understood as the space of possible environments. Let X be another compact metric space, with metric d_x; intuitively, X is to be understood as the space of possible basic states. In this discussion we consider products of metric spaces, and we assume the product space to be equipped with the 1-product metric. We have a family of linear operators Qz on C(X), indexed by z ∈ Z. For each fixed z, Qz is defined as

    Qz(f)(x) = λ1(z, x) ∫_X (f(x′) − f(x)) µ1(dx′, x, z),     (9.5)

where µ1(∗, x, z) is a family of Borel probability measures on X indexed by (x, z), and λ1(x, z) is a continuous positive function on X × Z. As λ1(x, z) is continuous and positive, there exist Cmin, Cmax such that 0 < Cmin < λ1(x, z) < Cmax < ∞. We will insist on a few more conditions:

1. For a fixed z, there exists a Borel probability measure m1_z such that

    ∫_X Qz(f)(x) m1_z(dx) = 0, ∀ f ∈ C(X).     (9.6)

2. There exists a finite positive Borel measure m1(∗) on X such that m1_z is absolutely continuous with respect to m1(∗), ∀ z. So we have a density function h1 such that m1_z(E) = ∫_E h1(x, z) m1(dx), ∀ E ∈ B(X). Additionally, H1(z) = h1(·, z) is taken to be a continuous map of Z into L∞(X, m1) whose image does not contain the 0-function. This means that ∀ z ∈ Z, 0 < C4 < ||h1(·, z)||_{L∞(X,m1)} ≤ C3 < ∞.

3. µ1(∗, x, z) is absolutely continuous with respect to m1(∗), so there exists g1(x′, x, z) such that µ1(E, x, z) = ∫_E g1(x′, x, z) m1(dx′). Additionally, G1(x, z) = g1(·, x, z) is taken to be a continuous map of X × Z into L∞(X, m1) whose image does not contain the 0-function. This means that ∀ (x, z) ∈ X × Z, 0 < C2 < ||g1(·, x, z)||_{L∞(X,m1)} ≤ C1.

With these conditions in place, one observes that for any fixed z ∈ Z, Qz is the generator of a Markov process on X with stationary measure m1_z. This process can be thought of as capturing the evolution of the basic state when the environment is constant.

Remark. The existence of the canonical measure m1 is natural. The requirement that h1(x, z) and g1(x′, x, z) be bounded away from 0 is also reasonable. We would, however, have liked to consider in the general case that G1(x, z) = g1(·, x, z) is a continuous map of X × Z into L1(X, m1), and H1(z) = h1(·, z) a continuous map of Z into L1(X, m1).


Similarly, we have a family of linear operators Ax on C(Z), indexed by x ∈ X. For each fixed x, Ax is defined as

    Ax(f)(z) = λ2(x, z) ∫_Z (f(z′) − f(z)) µ2(dz′, x, z),     (9.7)

where µ2(∗, x, z) is a family of Borel probability measures on Z indexed by (x, z), and λ2(x, z) is a continuous positive function on X × Z. As λ2(x, z) is continuous and positive, there exist Dmin, Dmax such that 0 < Dmin < λ2(x, z) < Dmax < ∞. We will insist on a few more conditions:

1. For a fixed x, there exists a Borel probability measure m2_x such that

    ∫_Z Ax(f)(z) m2_x(dz) = 0, ∀ f ∈ C(Z).     (9.8)

2. There exists a finite positive Borel measure m2(∗) on Z such that m2_x is absolutely continuous with respect to m2(∗), ∀ x. So we have a density function h2 such that m2_x(F) = ∫_F h2(x, z) m2(dz), ∀ F ∈ B(Z). Additionally, H2(x) = h2(x, ·) is taken to be a continuous map of X into L∞(Z, m2) whose image does not contain the 0-function. This means that ∀ x ∈ X, 0 < D4 < ||h2(x, ·)||_{L∞(Z,m2)} ≤ D3 < ∞.

3. µ2(∗, x, z) is absolutely continuous with respect to m2(∗), so there exists g2(z′, z, x) such that µ2(F, x, z) = ∫_F g2(z′, z, x) m2(dz′). Additionally, G2(z, x) = g2(·, z, x) is taken to be a continuous map of Z × X into L1(Z, m2).

With these conditions in place, one observes that for any fixed x ∈ X, Ax is the generator of a Markov process on Z with stationary measure m2_x. This process can be thought of as capturing the evolution of the environment when the basic state is fixed.

Definition 9.1.1. With the above considerations, consider the metric space X × Z. For any E ∈ B(X), F ∈ B(Z) define the finite non-negative Borel measure M as

    M(E × F) = ∫_{E×F} h2(x, z) h1(x, z) m1(dx) m2(dz).

Also define the family of Borel probability measures Θ(dx′, dz′, x, z), indexed by (x, z) ∈ X × Z, as

    Θ(E, F, x, z) = ∫_{E×F} g1(x′, x, z) g2(z′, z, x) m1(dx′) m2(dz′).

Theorem 9.1.1. With the above definitions, the triplet (M, Θ, λ2(x, z) × λ1(x, z)) is a workable structure on X × Z.


Proof. All we need to show is that there exists ∞ > C > 0 such that for any F ∈ B(Z), E ∈ B(X) we have

    ∫_{X×Z} Θ(E, F, x, z) M(dx, dz) ≤ C × M(E × F).

Without loss of generality take Cmax = Dmax = 1. First, applying (9.6) with f = 1_E, we get that ∀ z ∈ Z,

    ∫_X ( ∫_E g1(x′, x, z) m1(dx′) ) h1(x, z) m1(dx) ≤ (1/Cmin) ∫_E h1(x, z) m1(dx).     (9.9)

And from (9.8) we can conclude that ∀ x ∈ X,

    ∫_Z ( ∫_F g2(z′, z, x) m2(dz′) ) h2(x, z) m2(dz) ≤ (1/Dmin) ∫_F h2(x, z) m2(dz).     (9.10)

    ∫_{X×Z} Θ(E, F, x, z) M(dx, dz)
        = ∫_{X×Z} h1(x, z) h2(x, z) ( ∫_E g1(x′, x, z) m1(dx′) × ∫_F g2(z′, z, x) m2(dz′) ) m1(dx) m2(dz)
        ≤ C1 C3 ∫_X ∫_E ( ∫_Z ( ∫_F g2(z′, z, x) m2(dz′) ) h2(x, z) m2(dz) ) m1(dx′) m1(dx)
        ≤ (C1 C3 / Dmin) ∫_X ∫_E ∫_F h2(x, z) m2(dz) m1(dx′) m1(dx)     (using (9.10))
        ≤ (C1 C3 D3 / Dmin) ∫_F ∫_X ∫_E m1(dx′) m1(dx) m2(dz)
        ≤ (C1 C3 D3 / (Dmin C2 C4)) ∫_F ( ∫_X ( ∫_E g1(x′, x, z) m1(dx′) ) h1(x, z) m1(dx) ) m2(dz)
        ≤ (C1 C3 D3 / (Dmin C2 C4 Cmin)) ∫_F ∫_E h1(x, z) m1(dx) m2(dz)     (using (9.9))
        ≤ (C1 C3 D3 / (Dmin C2 C4 Cmin D4)) ∫_F ∫_E h1(x, z) h2(x, z) m1(dx) m2(dz)
        ≤ (C1 C3 D3 / (Dmin C2 C4 Cmin D4)) M(E × F).

This concludes the proof.

So one can use (M, Θ, λ2(x, z) × λ1(z, x)) to construct a Markov chain on (X × Z) ∪ {c}; this should be thought of as describing how the system and the environment evolve together.


9.2 δ-smooth workable structure

9.2.1 Introduction

The notion of a workable structure says, essentially, that given a jump process Yt on a compact set E and a suitable measure m on E, one may use an additional state c which redirects sample paths of Yt so that the new Markov process has m as a stationary measure. This c may be thought of as a stochastic control on the stationary measure of the Markov process. Now, we ask how big an influence this c exerts. Because the process generated by Rε travels fast through c, the stationary measure assigns a very small mass to c; in this sense c has a very small role. However, what about the lengths of the jumps through c? We describe here a new construction in which the process is redirected from c to a point close to where it came to c from. To do this we introduce multiple c’s, and we require a stronger notion of a “suitable” measure m. This extra condition may be thought of as a criterion for how far away Yt is from having m as a stationary measure.

9.2.2 Formalization

Definition 9.2.1. Let E be a compact metric space and m a finite non-negative Borel measure on E. Let µ(du′, u) be a family of non-negative Borel measures on E, indexed by u ∈ E, and let λ : E → (0,∞) be a continuous mapping of E into IR+. Given δ > 0, call the triplet (m, µ(du′, u), λ) a δ-smooth workable structure on E if the following hold:

1. For any measurable set O, the mapping u → µ(O, u) is a continuous mapping of E into IR.

2. ∀ u ∈ E, µ(E, u) = 1.

3. There exists {En}_{n∈N}, a countable subalgebra generating B(E) (the sigma-algebra of Borel measurable sets in E), such that ∫_E µ(En, u) m(du) ≤ C1 m(En), ∀ n ∈ N, where 0 < C1 < ∞.

4. 0 < C_min < λ(u) < C_max < ∞.

5. There exists a class of subsets of E, namely {Pn}_{n=1}^{N}, such that

   (a) ∀ n, sup_{x,y∈Pn} d(x, y) ≤ δ;
   (b) ∪_{n=1}^{N} Pn = E, and each Ei defined above satisfies Ei ⊂ Pj for some j ∈ {1, 2, . . . , N};
   (c) ∀ i, j with 1 ≤ i < j ≤ N, and ∀ u ∈ E, m(Pi ∩ Pj) = µ(Pi ∩ Pj, u) = 0;
   (d) ∫_E λ(u) µ(Pn, u) m(du) = ∫_E λ(u) 1_{Pn}(u) m(du).

Remark. 1. Since the intersections of the various Pi’s have almost 0 volume with respect to all measures under consideration, we loosely call {Pn}_{n=1}^{N} a partition of E. Also, we will sometimes refer to (m, µ(du′, u), λ, {Pn}_{n=1}^{N}) as a δ-smooth workable structure on E, i.e., we assume the partition is supplied in the definition. We call such a partition a δ-partition of E.


2. Let Yt be the jump process generated by R(f)(u) = λ(u) ∫_E (f(u′) − f(u)) µ(du′, u). In order for Yt to have m as a stationary measure, we would need, ∀ f ∈ C(E),

    ∫_E R(f)(u) m(du) = 0.     (9.11)

With a δ-smooth workable structure in place, we at least have (9.11) holding true for f = 1_{Pn}. When δ is small, {1_{Pn}}_{n=1}^{N} is quite a wide and important class of functions.

Theorem 9.2.1. If for every δ > 0 the triplet (m, µ(du′, u), λ) is a δ-smooth workable structure on E, then m is a stationary measure for the jump process Yt generated by R(f)(u) = λ(u) ∫_E (f(u′) − f(u)) µ(du′, u), acting on C(E).

Proof. For any open ball B, cover it using a δ-partition with the properties (5)(a)–(d) of Definition 9.2.1. Pass δ to 0 and use the Dominated Convergence Theorem to get the result.

Given a metric space E, define E^c_N = E ∪ {c1, c2, . . . , cN}. As earlier, we can metrize E^c_N by using the metric d_c defined as d_c(x, y) = d(x, y), d_c(ci, y) = d_c(y, ci) = r_E, d_c(ci, cj) = 1 − 1_j(i), ∀ x, y ∈ E, ∀ i, j ∈ {1, 2, . . . , N}.

Theorem 9.2.2. Consider a compact metric space E equipped with a δ-smooth workable structure (m, µ(du′, u), λ, {Pn}_{n=1}^{N}). Then for any ε > 0 there exist 1 < G < ∞, probability measures µ̄(du′, ci) on E^c_N and 0 < Cε(i) < ∞, ∀ i ∈ {1, 2, 3, . . . , N}, such that the linear operator Rε on C(E^c_N) defined as

    Rε(f)(u) = λ̄(u) ∫_{E^c_N} (f(u′) − f(u)) µ̄(du′, u),
    λ̄(u) = G λ(u), ∀ u ∈ E;   λ̄(cj) = Cε(j), ∀ j,
    µ̄(O, u) = µ(O, u)/G, ∀ u ∈ E, ∀ O ∈ B(E),
    µ̄({cj}, u) = (1 − µ(E, u)/G) × 1_{Pj}(u),     (9.12)

is the generator of a Feller process on E^c_N which has mε as a stationary measure, where

    mε(O) = m(O), ∀ O ∈ B(E);   mε({cj}) = ε.     (9.13)

Furthermore, the support of the measure µ̄(du′, ci) lies in Pi.

Proof. Let ℰ = {En}_{n∈N} be an algebra with the properties guaranteed in (3) of Definition 9.2.1. Rε as defined in (9.12) is the generator of a Markov process on E^c_N (see Theorem 3.1, Chapter 8 of [43]). We will first construct Cε(j), G and µ̄(du′, cj); then we will show that with our definitions of these quantities, ∫_{E^c_N} Rε(f)(u) mε(du) = 0, ∀ f ∈ C(E^c_N). This will prove the theorem. Let Cmax, Cmin be the constants of (4) in Definition 9.2.1. Without loss of generality we may take C1 > 1. Define G = C1 × Cmax/Cmin. For f ∈ C(E), define f̄ ∈ C(E^c_N) as f̄(u) = f(u), ∀ u ∈ E, f̄(ci) = 0, ∀ i. Let R̄ be the linear continuous operator on C(E) defined as R̄(f)(u) = λ(u) × G ∫_{E^c_N} (f̄(u′) − f̄(u)) µ̄(du′, u); note that µ̄(∗, u) is a probability measure on E^c_N defined in terms of G and µ(∗, u). Define the class ϱ_j = {Ei | Ei ∈ ℰ, Ei ⊂ Pj}. Define µ̂(·, cj) : ϱ_j → [0,∞) by µ̂(En, cj) = −∫_{u∈E} R̄(1_{En})(u) m(du). By methods similar to the proof of Theorem 9.0.1, we observe that this is a non-negative set function, and since R̄ is linear, µ̂(·, cj) is countably additive. So we can extend it to a positive finite measure µ̂(du, cj) on Pj. Define µ̄(∗, cj) = µ̂(∗, cj)/µ̂(Pj, cj) and Cε(j) = µ̂(Pj, cj)/ε. We can do this ∀ j. Observe that with our choice of G, Cε(j) and µ̄(∗, cj), one gets, ∀ f ∈ C(E),

    ∫_{u∈E^c_N} Rε(f̄)(u) mε(du) = 0.

All that remains is to show that ∀ j the function gj ∈ C(E^c_N), defined as gj(u) = 0 for u ∈ E and gj(ci) = 1_i(j), satisfies ∫_{u∈E^c_N} Rε(gj)(u) mε(du) = 0. Observe that gj = 1_{Pj∪{cj}} − 1_{Pj}. We know that ∫_{u∈E^c_N} Rε(1_{Pj})(u) mε(du) = 0. Also, (5)(d) of Definition 9.2.1 yields

    ∫_{u∈E^c_N} Rε(1_{Pj∪{cj}})(u) mε(du) = 0.

This completes the proof.


Part V: Conclusions and future work

The results presented in the course of this dissertation expose some new directions of study.

• In Part III, we endeavored to study neural systems in which there can be external activity during avalanches. This external activity, however, was in the form of a Poisson process characterized by a single parameter φ. In most real-world situations the input signal is more complicated. However, there are experiments on stimulated cortical slices [95] carried out by varying the stimulation strength and detection algorithm; such methods may be used to test the results presented here. It would be best, however, if the results could be extended to signals whose rate varies over time. It should be noted that a true departure from the separation-of-time-scales regime requires that the input strength vary during avalanches.

• The Central Limit Theorem (CLT) is used as the basis for many statistical techniques. In principle, whenever there is a large noise-to-signal ratio, the p-stable method of Part II can serve as a ready replacement. Take, for example, time series analysis. In the simplest case, say, one asks the question: is an observed stream of data uncorrelated? A common way of answering this question is to check whether the sample autocorrelation function satisfies bounds prescribed by the CLT under the independence hypothesis. However, with strong noise this method becomes erroneous. In general, one would like to test the hypothesis that the time series X1, X2, . . . is an AR(l) process (AR stands for auto-regressive) [97]. Under the hypothesis, Xt and X_{t+l+1} will be a-causal (independent after factoring out the shared dependence on X_{t+1}, . . . , X_{t+l}). This hypothesis can be tested using the partial autocorrelation function, in a similar manner as with independence.

• The evidence of criticality which we have chosen to focus on is the emergence of power-law statistics. Now, power laws are indicative of criticality, but this does not mean that power laws cannot be produced without criticality. Examples of non-critical systems showing power laws may be found in [99, 80, 76]. For example, if one breaks an interval into two parts by choosing a random point, then breaks the two produced intervals each into two further intervals by choosing random points, and continues in this vein, eventually the distribution of the sizes of the parts becomes a power law. Another interesting procedure comes from [87]: here an exponentially growing process is stopped at a random time distributed as a negative exponential distribution, and the produced statistics are again power laws. So it is good practice to test systems for other features of criticality. One such feature is shape collapse. Suppose we are sampling avalanches from an underlying distribution. For any such avalanche ω, define

    fω(t) := number of spikes in ω during the t-th step.

Let D(ω) denote the duration of an avalanche. We define

    gT(t) := E(fω(t) | D = T),   hT(t) := gT(t/T).

If the underlying system from which the avalanches are sampled is critical, one finds that hT(t) is independent of T [93, 74]. This phenomenon is called shape collapse, and it has been found to exist in spontaneous systems [47]. It would be interesting to see if we could prove results relating to shape collapse for either the LM with input or the BM with input as introduced in Part III. Clues for how to do this can be found in recent results [49] studying completely time-scale separated systems. (A small numerical sketch of the shape-collapse diagnostic is given after this list.)

• Another way to test for critical systems is by “tuning the network through criticality”. Experimentally one can place the network in a super-critical state by making “the tissue hyper-excited” [7], and it is also possible to push the network to a sub-critical state by suppressing excitatory synaptic activity [73]. One then tries to see whether power laws occur in between these two regimes. A non-equilibrium version of the LM without input was studied in [37], where the authors show that the uniform measure is only suitable for studying sub-critical and critical systems. They further show that the system is always out of equilibrium in the super-critical state. These issues will persist when studying versions of the LM in which the time-scale separation hypothesis is weakened. Hence these kinds of tests for criticality are not suitable for our models.
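As referenced in the shape-collapse item above, the following is a minimal sketch of how the collapse diagnostic could be computed from sampled avalanches. The function name shape_collapse and the representation of an avalanche as an array of per-step spike counts are assumptions made for the sketch; the rescaling is interpreted as evaluating the mean profile gT on the normalized time grid t/T ∈ [0, 1].

```python
import numpy as np

def shape_collapse(avalanches, durations=None, grid=50):
    """Illustrative sketch: estimate rescaled mean temporal profiles h_T on a common grid.

    avalanches : list of 1-D arrays; avalanches[k][t] = number of spikes at step t
    durations  : which durations T to use (default: all observed T >= 2)
    Returns a dict T -> profile of length `grid`; if the system shows shape
    collapse, the returned profiles should approximately coincide.
    """
    by_T = {}
    for a in avalanches:
        by_T.setdefault(len(a), []).append(np.asarray(a, dtype=float))
    if durations is None:
        durations = sorted(T for T in by_T if T >= 2)
    s = np.linspace(0.0, 1.0, grid)                  # normalized time t / T
    profiles = {}
    for T in durations:
        g_T = np.mean(by_T[T], axis=0)               # g_T(t) = E(f_w(t) | D = T)
        profiles[T] = np.interp(s * (T - 1), np.arange(T), g_T)
    return profiles

# usage: the spread across T of profiles[T], relative to the mean profile,
# is small when the sampled avalanches exhibit shape collapse.
```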


References

[1] P. Bak. How nature works: The science of Self-Organized Criticality. Springer Verlag, 1999.

[2] P. Bak, C. Tang, and K. Wiesenfeld. Self-organized criticality: an explanation of 1/f noise. Physical Review Letters, 59:381–384, 1987.

[3] Heiko Bauke. Parameter estimation for power-law distributions by maximum likelihood methods. The European Physical Journal B, 58(2):167–173, 2007.

[4] C. Bedard, H. Kroeger, and A. Destexhe. Does the 1/f frequency-scaling of brain signals reflect self-organized critical states? Physical Review Letters, 97:118102, 2006.

[5] J. Beggs and D. Plenz. Neuronal avalanches in neocortical circuits. J. Neurosci., 23:11167–11177, 2003.

[6] J. Beggs and D. Plenz. Neuronal avalanches are diverse and precise activity patterns that are stable for many hours in cortical slice cultures. J. Neurosci., 24(22):5216–5229, 2004.

[7] John M. Beggs and Dietmar Plenz. Neuronal avalanches in neocortical circuits. The Journal of Neuroscience, 23(35):11167–11177, 2003.

[8] John M. Beggs and Nicholas Timme. Being critical of criticality in the brain. Frontiers in Physiology, 3:163, 2012.

[9] Y. Belopolskaya and Y. Suhov. Models of Markov processes with a random transition mechanism. arXiv preprint arXiv:1508.05598, 2015.

[10] I. Berkes. Results and problems related to the pointwise central limit theorem. In Asymptotic Methods in Probability and Statistics, pages 59–96. Elsevier, 1998.

[11] Nils Bertschinger and Thomas Natschlager. Real-time computation at the edge of chaos in recurrent neural networks. Neural Computation, 16(7):1413–1436, 2004.

[12] Joschka Boedecker, Oliver Obst, Joseph T. Lizier, N. Michael Mayer, and Minoru Asada. Information processing in echo state networks at the edge of chaos. Theory in Biosciences, 131(3):205–213, 2012.

[13] Juan A. Bonachela and Miguel A. Munoz. Self-organization without conservation: true or just apparent scale-invariance? J. Stat. Mech. Theory Exp., 2009(09):P09009, 2009.

[14] Daniel Bonamy, Stephane Santucci, and Laurent Ponson. Crackling dynamics in material failure as the signature of a self-organized dynamic phase transition. Physical Review Letters, 101(4):045501, 2008.

[15] Konstantin Borovkov, Geoffrey Decrouez, and Matthieu Gilson. On stationary distributions of stochastic neural networks. Journal of Applied Probability, 51(3):837–857, 2014.

[16] P. Bremaud. Markov chains: Gibbs fields, Monte Carlo simulation, and queues. Springer Science & Business Media, 2013.

[17] Gunnar A. Brosamler. An almost everywhere central limit theorem. In Mathematical Proceedings of the Cambridge Philosophical Society, volume 104, pages 561–574. Cambridge University Press, 1988.

[18] Guido Caldarelli, Francesco D. Di Tolla, and Alberto Petri. Self-organization and annealed disorder in a fracturing process. Physical Review Letters, 77(12):2503, 1996.

[19] Charalambos A. Charalambides. Combinatorial methods in discrete distributions. John Wiley & Sons, 2005.

[20] Bong Dae Choi and Soo Hak Sung. On Chung's strong law of large numbers in general Banach spaces. Bulletin of the Australian Mathematical Society, 37(1):93–100, 1988.

[21] Kim Christensen and Nicholas R. Moloney. Complexity and criticality, volume 1. World Scientific Publishing Company, 2005.

[22] Aaron Clauset, Cosma Rohilla Shalizi, and Mark E. J. Newman. Power-law distributions in empirical data. SIAM Review, 51(4):661–703, 2009.

[23] P. C. Consul and L. R. Shenton. Use of Lagrange expansion for generating discrete generalized probability distributions. SIAM Journal on Applied Mathematics, 23(2):239–248, 1972.

[24] P. C. Consul and L. R. Shenton. Some interesting properties of Lagrangian distributions. Communications in Statistics-Theory and Methods, 2(3):263–272, 1973.

[25] Prem C. Consul. A simple urn model dependent upon predetermined strategy. Sankhya: The Indian Journal of Statistics, Series B, pages 391–399, 1974.

[26] Prem C. Consul and Felix Famoye. Lagrangian probability distributions. Springer, 2006.

[27] Prem C. Consul and Gaurav C. Jain. A generalization of the Poisson distribution. Technometrics, 15(4):791–799, 1973.

[28] Mauro Copelli and Osame Kinouchi. Optimal dynamical range of excitable networks at criticality. Nature Physics, 2(5):348–351, 2006.

[29] A. Corral, C. J. Perez, A. Diaz-Guilera, and A. Arenas. Self-organized criticality and synchronization in a lattice model of integrate-and-fire oscillators. Physical Review Letters, 74:118–121, 1995.

[30] Alvaro Corral and Alvaro Gonzalez. Power-law distributions in geoscience revisited. arXiv preprint arXiv:1810.07868, 2018.

[31] Anirban Das. Constructions of Markov processes in random environments which lead to a product form of the stationary measure. Markov Processes and Related Fields, 23(2):211–231, 2017.

[32] L. de Arcangelis, C. Perrone-Capano, and H. J. Herrmann. Self-organized criticality model for brain plasticity. Physical Review Letters, 96:028107(4), 2006.

[33] Herold Dehling, Manfred Denker, and Wojbor A. Woyczynski. Resampling U-statistics using p-stable laws. Journal of Multivariate Analysis, 34(1):1–13, 1990.

[34] Manfred Denker. Asymptotic distribution theory in nonparametric statistics. Springer, 1985.

[35] Manfred Denker and Anna Levina. Avalanche dynamics. Stochastics and Dynamics, 16(02):1660005, 2016.

[36] Manfred Denker and Ana Rodrigues. Ergodicity of avalanche transformations. Dynamical Systems, 29(4):517–536, 2014.

[37] R. E. Lee DeVille and Charles S. Peskin. Synchrony and asynchrony in a fully stochastic neural network. Bulletin of Mathematical Biology, 70(6):1608–1633, 2008.

[38] Serena di Santo, Pablo Villegas, Raffaella Burioni, and Miguel A. Munoz. Landau–Ginzburg theory of cortex dynamics: Scale-free avalanches emerge at the edge of synchronization. Proc. Natl. Acad. Sci. USA, 115(7):E1356–E1365, 2018.

[39] R. Dickman, M. A. Munoz, A. Vespignani, and S. Zapperi. Paths to self-organized criticality. Brazilian Journal of Physics, 30:27–41, 03 2000.

[40] Ronald Dickman, Alessandro Vespignani, and Stefano Zapperi. Self-organized criticality as an absorbing-state phase transition. Physical Review E, 57(5):5095, 1998.

[41] A. Economou. Generalized product-form stationary distributions for Markov chains in random environments with queueing applications. Advances in Applied Probability, pages 185–211, 2005.

[42] P. Erdos, G. A. Hunt, et al. Changes of sign of sums of random variables. Pacific Journal of Mathematics, 3(4):673–687, 1953.

[43] S. N. Ethier and T. G. Kurtz. Markov processes: characterization and convergence. John Wiley & Sons, 2009.

[44] Christian W. Eurich, J. Michael Herrmann, and Udo A. Ernst. Finite-size effects of avalanche dynamics. Physical Review E, 66(6):066137, 2002.

[45] Donald Alexander Stuart Fraser. Nonparametric methods in statistics. 1956.

[46] Nir Friedman, Shinya Ito, Braden A. W. Brinkman, Masanori Shimono, R. E. Lee DeVille, Karin A. Dahmen, John M. Beggs, and Thomas C. Butler. Universal critical dynamics in high resolution neuronal avalanche data. Physical Review Letters, 108(20):208102, 2012.

[47] Nir Friedman, Shinya Ito, Braden A. W. Brinkman, Masanori Shimono, R. E. Lee DeVille, Karin A. Dahmen, John M. Beggs, and Thomas C. Butler. Universal critical dynamics in high resolution neuronal avalanche data. Physical Review Letters, 108(20):208102, 2012.

[48] M. Gannon, E. Pechersky, Y. Suhov, and A. Yambartsev. Random walks in a queueing network environment. Journal of Applied Probability, 53(2):448–462, 2016.

[49] James P. Gleeson and Rick Durrett. Temporal profiles of avalanches on networks. Nature Communications, 8(1):1227, 2017.

[50] B. Gutenberg and C. F. Richter. Earthquake magnitude, intensity, energy, and acceleration. Bulletin of the Seismological Society of America, 46(2), 1956.

[51] Ryan N. Gutenkunst, Joshua J. Waterfall, Fergal P. Casey, Kevin S. Brown, Christopher R. Myers, and James P. Sethna. Universally sloppy parameter sensitivities in systems biology models. PLOS Computational Biology, 3(10):e189, 2007.

[52] C. Haldeman and J. Beggs. Critical branching captures activity in living neural networks and maximizes the number of metastable states. Physical Review Letters, 94:058101, 2005.

[53] Clayton Haldeman and John M. Beggs. Critical branching captures activity in living neural networks and maximizes the number of metastable states. Physical Review Letters, 94:058101, Feb 2005.

[54] David Harte. Multifractals: theory and applications. Chapman and Hall/CRC, 2001.

[55] A. V. Herz and J. J. Hopfield. Earthquake cycles and neural reverberations: collective oscillations in systems with pulse-coupled threshold elements. Physical Review Letters, 75:1222–1225, 1995.

[56] Wassily Hoeffding. A class of statistics with asymptotically normal distribution. The Annals of Mathematical Statistics, pages 293–325, 1948.

[57] Hajo Holzmann, Susanne Koch, and Aleksey Min. Almost sure limit theorems for U-statistics. Statistics & Probability Letters, 69(3):261–269, 2004.

[58] J. R. Jackson. Networks of waiting lines. Operations Research, 5(4):518–521, 1957.

[59] H. J. Jensen. Self-Organized Criticality: Emergent complex behavior in physical and biological systems. Cambridge University Press, 1998.

[60] O. Kinouchi and M. Copelli. Optimal dynamical range of excitable networks at criticality.Nature Physics, 2:348–352, 2006.

[61] Andreas Klaus, Shan Yu, and Dietmar Plenz. Statistical analyses support power law distributions found in neuronal avalanches. PLOS One, 6(5):e19779, 2011.

[62] Chris G. Langton. Computation at the edge of chaos: phase transitions and emergent computation. Physica D, 42(1-3):12–37, 1990.

[63] Lasse Laurson, Stéphane Santucci, and Stefano Zapperi. Avalanches and clusters in planar crack front propagation. Physical Review E, 81(4):046116, 2010.

[64] A. J. Lee. U-Statistics: Theory and Practice. Marcel Dekker, New York, 1990.

[65] A. Levina, U. Ernst, and J. M. Herrmann. Criticality of avalanche dynamics in adaptive recurrent networks. Neurocomputing, 70:1877–1881, 2007.

[66] A. Levina, J. M. Herrmann, and T. Geisel. Dynamical synapses causing self-organized criticality in neural networks. Nature Physics, 3:857–860, 2007.

[67] A. Levina, J. M. Herrmann, and T. Geisel. Phase transitions towards criticality in a neural system with adaptive interactions. Physical Review Letters, 102(11):118110, 2009.

[68] Anna Levina. A mathematical approach to self-organized criticality in neural networks. Thesis, Universität Göttingen, 2008.

[69] Anna Levina and J. Michael Herrmann. The Abelian distribution. Stochastics and Dynamics, 14(03):1450001, 2014.

[70] Anna Levina, J. Michael Herrmann, and Manfred Denker. Critical branching processes in neural networks. Proc. Appl. Math. Mech., 7(1):1030701–1030702, 2007.

[71] Anna Levina and Viola Priesemann. Subsampling scaling. Nature Communications,8:15140, May 2017.

[72] Matteo Martinello, Jorge Hidalgo, Amos Maritan, Serena Di Santo, Dietmar Plenz, and Miguel A. Muñoz. Neutral theory and scale-free neural dynamics. Physical Review X, 7(4):1–11, 2017.

[73] Alberto Mazzoni, Frédéric D. Broccard, Elizabeth Garcia-Perez, Paolo Bonifazi, Maria Elisabetta Ruaro, and Vincent Torre. On the dynamics of the spontaneous activity in neuronal networks. PLOS One, 2(5):e439, 2007.

[74] Amit P. Mehta, Karin A. Dahmen, and Yehuda Ben-Zion. Universal mean moment rate profiles of earthquake ruptures. Physical Review E, 73(5):056104, 2006.

[75] Daniel Millman, Stefan Mihalas, Alfredo Kirkwood, and Ernst Niebur. Self-organized criticality occurs in non-conservative neuronal networks during 'up' states. Nature Physics, 6(10):801–805, 2010.

[76] Michael Mitzenmacher. A brief history of generative models for power law and lognormal distributions. Internet Mathematics, 1(2):226–251, 2004.

[77] John W. Moon. Various proofs of Cayley's formula for counting trees. In A Seminar on Graph Theory, pages 70–78, 1967.

[78] Stefano Mossa, Marc Barthélemy, Harry E. Stanley, and Luis A. Nunes Amaral. Truncation of power law behavior in “scale-free” network models due to information filtering. Physical Review Letters, 88(13):138701, 2002.

[79] Miguel A. Muñoz. Colloquium: Criticality and dynamical scaling in living systems. Reviews of Modern Physics, 90(3):031001, 2018.

[80] Mark E. J. Newman. Power laws, Pareto distributions and Zipf's law. Contemporary Physics, 46(5):323–351, 2005.

[81] John Nolan. Stable distributions: models for heavy-tailed data. Birkhäuser, New York, 2003.

[82] Thomas Petermann, Tara C. Thiagarajan, Mikhail A. Lebedev, Miguel A. L. Nicolelis, Dante R. Chialvo, and Dietmar Plenz. Spontaneous cortical activity in awake monkeys composed of neuronal avalanches. Proc. Natl. Acad. Sci. USA, 106(37):15921–15926, 2009.

[83] Viola Priesemann and Oren Shriki. Can a time varying external drive give rise to apparent criticality in neural systems? PLOS Computational Biology, 14(5):e1006081, 2018.

[84] Viola Priesemann, Mario Valderrama, Michael Wibral, and Michel Le Van Quyen. Neuronal avalanches differ from wakefulness to deep sleep – evidence from intracranial depth recordings in humans. PLOS Computational Biology, 9(3):e1002985, 2013.

[85] Viola Priesemann, Michael Wibral, Mario Valderrama, Robert Pröpper, Michel Le Van Quyen, Theo Geisel, Jochen Triesch, Danko Nikolić, and Matthias Hans Joachim Munk. Spike avalanches in vivo suggest a driven, slightly subcritical brain state. Frontiers in Systems Neuroscience, 8:108, 2014.

[86] Gunnar Pruessner. Self-organised criticality: theory, models and characterisation. Cambridge University Press, 2012.

[87] William J. Reed and Barry D. Hughes. From gene families and genera to incomes and internet file sizes: Why power laws are so common in nature. Physical Review E, 66(6):067103, 2002.

[88] J. Riordan. An introduction to combinatorial analysis. John Wiley and Sons, 1958.

[89] Jan Rosiński and W. A. Woyczynski. On Itô stochastic integration with respect to p-stable motion: inner clock, integrability of sample paths, double and multiple integrals. The Annals of Probability, pages 271–286, 1986.

[90] Oguz Umut Salman and Lev Truskinovsky. Minimal integer automaton behind crystal plasticity. Physical Review Letters, 106:175503, Apr 2011.

[91] Silvia Scarpetta and Antonio de Candia. Alternation of up and down states at a dynamical phase transition of a neural network with spatiotemporal attractors. Frontiers in Systems Neuroscience, 8:88, 2014.

[92] Peter Schatte. On strong versions of the central limit theorem. Mathematische Nachrichten, 137(1):249–256, 1988.

[93] James P. Sethna, Karin A. Dahmen, and Christopher R. Myers. Crackling noise. Nature,410(6825):242, 2001.

[94] Woodrow L. Shew, Wesley P. Clawson, Jeff Pobst, Yahya Karimipanah, Nathaniel C. Wright, and Ralf Wessel. Adaptation to sensory input tunes visual cortex to criticality. Nature Physics, 11(8):659, 2015.

[95] Woodrow L. Shew, Hongdian Yang, Thomas Petermann, Rajarshi Roy, and Dietmar Plenz. Neuronal avalanches imply maximum dynamic range in cortical networks at criticality. Journal of Neuroscience, 29(49):15595–15600, 2009.

[96] Oren Shriki, Jeff Alstott, Frederick Carver, Tom Holroyd, Richard N. A. Henson, Marie L. Smith, Richard Coppola, Edward Bullmore, and Dietmar Plenz. Neuronal avalanches in the resting MEG of the human brain. Journal of Neuroscience, 33(16):7079–7090, 2013.

[97] Robert H. Shumway and David S. Stoffer. Time series regression and exploratory data analysis. Time Series Analysis and Its Applications: With R Examples, pages 48–83, 2006.

[98] Didier Sornette. Critical phenomena in natural sciences: chaos, fractals, self-organization and disorder: concepts and tools. Springer Science & Business Media, 2006.

[99] Michael P.H. Stumpf and Mason A. Porter. Critical truths about power laws. Science,335(6069):665–666, 2012.

[100] Enzo Tagliazucchi, Pablo Balenzuela, Daniel Fraiman, and Dante R. Chialvo. Criticality in large-scale brain fMRI dynamics unveiled by a novel point process analysis. Frontiers in Physiology, 3:15, 2012.

[101] Mehdi Talamali, Viljo Petäjä, Damien Vandembroucq, and Stéphane Roux. Avalanches, precursors, and finite-size fluctuations in a mesoscopic model of amorphous plasticity. Physical Review E, 84:016115, July 2011.

[102] I. Tweddle. James Stirling's Methodus Differentialis: an annotated translation of Stirling's text. Springer Science & Business Media, 2012.

[103] Nicholas W. Watkins, Gunnar Pruessner, Sandra C. Chapman, Norma B. Crosby, and Henrik J. Jensen. 25 years of self-organized criticality: concepts and controversies. Space Science Reviews, 198(1-4):3–44, 2016.

[104] Rashid V. Williams-García, John M. Beggs, and Gerardo Ortiz. Unveiling causal activity of complex networks. EPL (Europhysics Letters), 119(1):18003, 2017.

[105] Jens Wilting and Viola Priesemann. Branching into the unknown: inferring collective dynamical states from subsampled systems. arXiv preprint arXiv:1608.07035, 2016.

[106] Mohammad Yaghoubi, Ty de Graaf, Javier G. Orlandi, Fernando Girotto, Michael A. Colicos, and Jörn Davidsen. Neuronal avalanche dynamics indicates different universality classes in neuronal cultures. Scientific Reports, 8(1):3417, 2018.

[107] Shan Yu, Tiago L. Ribeiro, Christian Meisel, Samantha Chou, Andrew Mitz, Richard Saunders, and Dietmar Plenz. Maintained avalanche dynamics during task-induced changes of neuronal activity in nonhuman primates. eLife, 6:e27119, 2017.

[108] Michael Zaiser and Nikos Nikitas. Slip avalanches in crystal plasticity: scaling of the avalanche cut-off. Journal of Statistical Mechanics: Theory and Experiment, 2007(04):P04013, 2007.

[109] Stefano Zapperi, Kent B. Lauritsen, and Harry E. Stanley. Self-organized branching processes: mean-field theory for avalanches. Physical Review Letters, 75(22):4071–4074, Nov 1995.

[110] O. Zeitouni. Lecture notes on random walks in random environments, 2001. Preprint available at http://www-ee.technion.ac.il/zeitouni/ps/notes1.ps.

VITA

Anirban Das

Anirban Das was born and brought up in Kolkata, India. He attended the Indian Institute of Technology, Kharagpur, for his undergraduate studies, where he earned both his B.Sc. and M.Sc. degrees in Mathematics and Computing. He then joined the Department of Mathematics at the Pennsylvania State University as a graduate student pursuing a Ph.D. in Mathematics. He specialized in probability theory, focusing in particular on its applications in neuroscience.