
Noise Resilience of an RGNG-based Grid Cell Model

Jochen Kerdels¹ and Gabriele Peters¹

¹ University of Hagen, Universitätsstrasse 1, D-58097 Hagen, Germany
{jochen.kerdels, gabriele.peters}@fernuni-hagen.de

Keywords: noise resilience, grid cell model, input space representation, recursive growing neural gas

Abstract: Grid cells are neurons in the entorhinal cortex of mammals that are known for their peculiar, grid-like firing patterns. We developed a generic computational model that describes the behavior of neurons with such firing patterns in terms of a competitive, self-organized learning process. Here we investigate how this process can cope with increasing amounts of noise in its input signal. We demonstrate that the firing patterns of simulated neurons are mostly unaffected with regard to their structure even if high levels of noise are present in the input. In contrast, the maximum activity of the corresponding neurons decreases significantly with increasing levels of noise. Based on these results we predict that real grid cells can retain their triangular firing patterns in the presence of noise, but may exhibit a noticeable decrease in their peak firing rates.

1 INTRODUCTION

Several regions of the mammalian brain contain neurons that exhibit peculiar, grid-like firing patterns (Fyhn et al., 2004; Hafting et al., 2005; Boccara et al., 2010; Killian et al., 2012; Yartsev et al., 2011; Domnisoru et al., 2013; Jacobs et al., 2013). The most common example of such neurons are the so-called grid cells found in the medial entorhinal cortex (MEC) of the rat (Fyhn et al., 2004; Hafting et al., 2005). The activity of these cells correlates with the animal's location in a periodic, triangular pattern that spans across the entire environment of the animal.

We developed a generic computational model that is able to describe the behavior of neurons with such grid-like firing patterns (Kerdels and Peters, 2013; Kerdels and Peters, 2015b; Kerdels, 2016). Here we investigate how this model reacts to random noise in its input signal as it would be expected to occur in natural neurobiological circuits where each neuron receives input from hundreds to thousands of other neurons (Koch, 2004)¹.

The next section summarizes our grid cell model briefly and highlights those mechanisms of the model that are especially relevant to the investigation of input signal noise. Subsequently, section 3 outlines how scalable amounts of noise are added to the input signal of our grid cell model. Section 4 then reports and analyses the results obtained from simulation runs

¹ p. 411ff.

Figure 1: Illustration of the RGNG-based neuron model. The top layer is represented by three units (red, green, blue) connected by dashed edges. The prototypes of the top layer units are themselves RGNGs. The units of these RGNGs are illustrated in the second layer by corresponding colors.

with varying levels of noise. Finally, we discuss our findings in section 5.

2 GRID CELL MODEL

We developed a generic neuron model that is able to describe, among others, the behavior of grid cells (Kerdels, 2016). At its core the model uses the recursive growing neural gas (RGNG) algorithm to describe the collective behavior of a group of neurons. The RGNG algorithm extends the regular growing neural gas (GNG) algorithm proposed by Fritzke (Fritzke, 1995) in a recursive fashion².

² A formal definition of the RGNG algorithm is provided in the appendix.


A regular GNG is an unsupervised learning algorithm that approximates the structure of its input space with a network of units where each unit represents a local region of input space while the network itself represents the input space topology. In a recursive GNG or RGNG the units can not only represent local input space regions but may also represent entire RGNGs themselves, resulting in a layered, hierarchical structure of interleaved input space representations.

In our grid cell model we use a single RGNG with two layers to model a group of neurons (Fig. 1). The units in the top layer (TL) represent the individual cells of the group. Each of these TL units contains an RGNG located in the bottom layer (BL). These BL-RGNGs can be interpreted as the dendritic trees of the neurons, each of which learns a separate representation of the entire input space. The units of the BL-RGNGs correspond to local subsections of the dendritic tree that recognize input patterns from specific, local regions of input space. In the model, these local regions are represented by so-called reference vectors or prototypes. Thus, each modelled neuron (TL unit) has a set of prototypes (BL units) that are arranged in a network (BL RGNG) that constitutes a piecewise approximation of the neuron's input space. The neurons (TL units) themselves are organized in a network (TL RGNG) as well, resulting in an interleaved arrangement of the individual input space approximations.

Please note that the term "neuron" is used differently here compared to its regular usage in a growing neural gas context. It is used synonymously only with the TL units of the model and not the BL units. Furthermore, the RGNG is inherently different from existing hierarchical versions of the growing neural gas like those proposed by, e.g., Doherty et al. (Doherty et al., 2005) or Podolak and Bartocha (Podolak and Bartocha, 2009). These approaches represent the input space in a hierarchical fashion with every unit of the hierarchical GNG corresponding to a single local region of input space. In our model, this holds true only for the BL units, whereas every TL unit represents the entire input space.

2.1 Learning

The two-layer RGNG used in the grid cell model has no explicit training phase and updates its input space approximation continuously. In a first step, each input ξ is processed by all neurons. To this end every neuron determines its best and second best matching units (BMUs), termed s1 and s2, respectively. Units s1 and s2 are those BL units whose prototypes s1.w and s2.w are closest to the input ξ according to a distance function D.

Figure 2: Geometric interpretation of ratio r, which is used as the basis for an approximation of the top layer unit's "activity".

Once determined, the BMUs of every neuron are adapted towards the input ξ and the corresponding BL networks are updated where necessary.

In a second step, the single neuron whose BMU was closest to the input and its direct neighboring neurons in the TL network are allowed to adapt towards the input a second time. This selective adaptation aligns the input space representations of the individual neurons and interleaves them evenly to cover the input space as well as possible (Kerdels, 2016).
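A schematic Python sketch of this two-step procedure is given below. It is a deliberately simplified illustration, not the reference implementation: only the adaptation of each neuron's BMU is shown, while edge aging, error accumulation, and unit insertion (all part of the full RGNG, see appendix) are omitted; the names `neurons`, `prototypes`, and `tl_neighbors` are ours.

```python
import numpy as np

def process_input(neurons, tl_neighbors, xi, theta1, theta2):
    """Schematic two-step update of a group of modelled neurons (TL units).

    neurons:       list of objects holding a `prototypes` array of shape (M2, n)
    tl_neighbors:  dict mapping a neuron index to the indices of its TL neighbors
    theta1/theta2: dicts with learning rates "eps_b" and "eps_n" (cf. Table 1)
    """
    bmu_dists = []
    # Step 1: every neuron adapts its own BMU towards the input (BL learning rate).
    for nrn in neurons:
        d = np.linalg.norm(nrn.prototypes - xi, axis=1)
        s1 = int(np.argmin(d))
        nrn.prototypes[s1] += theta2["eps_b"] * (xi - nrn.prototypes[s1])
        bmu_dists.append(np.linalg.norm(nrn.prototypes[s1] - xi))
    # Step 2: the neuron whose BMU was closest to the input, together with its
    # TL neighbors, adapts a second time using the top layer learning rates.
    winner = int(np.argmin(bmu_dists))
    for idx in [winner] + list(tl_neighbors.get(winner, [])):
        nrn = neurons[idx]
        d = np.linalg.norm(nrn.prototypes - xi, axis=1)
        s1 = int(np.argmin(d))
        rate = theta1["eps_b"] if idx == winner else theta1["eps_n"]
        nrn.prototypes[s1] += rate * (xi - nrn.prototypes[s1])
```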

2.2 Activity Approximation

The RGNG-based model describes a group of neurons for which we would like to derive their "activity" for any given input as a scalar that represents the momentary firing rate of the particular neuron. Yet, the RGNG algorithm itself does not provide a direct measure that could be used to this end. Therefore, we derive the activity a_u of a modelled neuron u based on the neuron's best and second best matching BL units s1 and s2 with respect to a given input ξ as:

a_u := exp( −(1 − r)² / (2σ²) ),

with σ = 0.2 and ratio r:

r := ( D(s2.w, ξ) − D(s1.w, ξ) ) / D(s1.w, s2.w),    s1, s2 ∈ u.w.U,

using a distance function D. Figure 2 provides a geometric interpretation of the ratio r. If input ξ is close to BMU s1 in relation to s2, ratio r becomes 1. If, on the other hand, input ξ has about the same distance to s1 as it has to s2, ratio r becomes 0.
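As a concrete illustration, the activity of a single modelled neuron can be computed from its BL prototypes in a few lines of Python; the Euclidean distance takes the role of the distance function D (the model fixes p = 2, see appendix), and the prototype array is an assumed data layout.

```python
import numpy as np

def activity(prototypes, xi, sigma=0.2):
    """Activity a_u of one modelled neuron (TL unit), following Sec. 2.2.

    prototypes: array of shape (M2, n) holding the neuron's BL prototype vectors
    xi:         input vector of dimension n
    """
    d = np.linalg.norm(prototypes - xi, axis=1)
    order = np.argsort(d)
    s1_w, s2_w = prototypes[order[0]], prototypes[order[1]]
    r = (d[order[1]] - d[order[0]]) / np.linalg.norm(s1_w - s2_w)
    return np.exp(-((1.0 - r) ** 2) / (2.0 * sigma ** 2))
```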

This measure of activity makes it possible to correlate the response of a neuron to a given input with further variables. An example of such a correlation is shown in figure 3b as a firing rate map, which correlates


Figure 3: (a) Example trace of rat movement within a rectangular, 1m×1m environment recorded for a duration of 10 minutes. Movement data published by Sargolini et al. (Sargolini et al., 2006). (b) Color-coded firing rate map of a simulated grid cell ranging from dark blue (no activity) to red (maximum activity).

the animal's location (Fig. 3a) with the activity of a simulated grid cell at that location. The firing rate maps resulting from our simulations are constructed according to the procedures described by Sargolini et al. (Sargolini et al., 2006), but using a 5×5 boxcar filter for smoothing instead of a Gaussian kernel, as introduced by Stensola et al. (Stensola et al., 2012). This conforms to the de facto standard of rate map construction in the grid cell literature. Each rate map integrates position and activity data over 30000 time steps corresponding to a single experimental trial with a duration of 10 minutes recorded at 50Hz.
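A minimal sketch of this rate map construction, assuming positions normalized to a 1m×1m box sampled at 50Hz and a grid of 50×50 spatial bins (the bin count is our assumption; the exact binning and occupancy normalization follow Sargolini et al. (2006)):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def rate_map(positions, activities, bins=50):
    """Occupancy-normalized activity map smoothed with a 5x5 boxcar filter.

    positions:  (T, 2) array of locations in [0, 1]^2
    activities: (T,)   array of simulated neuron activities
    """
    idx = np.clip((positions * bins).astype(int), 0, bins - 1)
    act_sum = np.zeros((bins, bins))
    occupancy = np.zeros((bins, bins))
    for (ix, iy), a in zip(idx, activities):
        act_sum[ix, iy] += a
        occupancy[ix, iy] += 1.0
    with np.errstate(invalid="ignore"):
        rm = act_sum / occupancy          # mean activity per visited bin
    rm = np.nan_to_num(rm)                # unvisited bins are set to zero
    return uniform_filter(rm, size=5)     # 5x5 boxcar smoothing
```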

3 INPUT SIGNAL

In general, the RGNG-based neuron model is independent of any specific type of input space and can process arbitrary input signals. However, only input spaces with certain characteristics will cause the modelled neurons to exhibit grid-like firing patterns with respect to a given variable, e.g., the animal's location. More precisely, grid-like firing patterns will only emerge if the input signals originate from a uniformly distributed, two-dimensional manifold in the input space that correlates with the respective external variable.

In the case of grid cells there are many conceivable input spaces that meet these requirements (Kerdels, 2016). For our experiments we chose an input space where the position of the animal is encoded by two vectors as shown in figure 4. The two-dimensional position is represented by the activity of two sets of cells that are each connected in a one-dimensional, periodic fashion like, e.g., a one-dimensional ring attractor network. Similar types of input signals for grid cell models were proposed in the literature by, e.g., Mhatre et al. (Mhatre et al., 2010) as well as Pilly and

Figure 4: The input signal for our experiments is derived by encoding a position (blue cross) in a periodic, two-dimensional input space (square) as the activity (black = high, white = low) of two sets of cells (vertical and horizontal groups of small boxes) that are each connected in a one-dimensional, periodic fashion, i.e., the activity wraps around at the borders of the resulting vectors.

Grossberg (Pilly and Grossberg, 2012).

For the experiments discussed below the input signal ξ := (vx, vy) is implemented as two concatenated 50-dimensional vectors vx and vy. To generate an input signal, a position (x, y) ∈ [0,1] × [0,1] is read from traces (Fig. 3a) of recorded rat movements published by Sargolini et al. (Sargolini et al., 2006) and mapped onto the corresponding elements of vx and vy as follows:

vx_i := max( 1 − |(i − ⌊dx + 0.5⌋) / s| , 1 − |(d + i − ⌊dx + 0.5⌋) / s| , 0 ),

vy_i := max( 1 − |(i − ⌊dy + 0.5⌋) / s| , 1 − |(d + i − ⌊dy + 0.5⌋) / s| , 0 ),

∀ i ∈ {0, …, d − 1},

with d = 50 and s = 8. The parameter s controls the slope of the activity peak, with higher values of s resulting in a broader peak.
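The following NumPy sketch implements this encoding; the function name and the return layout (concatenation of vx and vy) are our own choices for illustration.

```python
import numpy as np

def encode_position(x, y, d=50, s=8):
    """Encode a position (x, y) in [0, 1]^2 as two concatenated d-dimensional
    vectors with a periodic (wrap-around) activity peak each, following Sec. 3."""
    def encode(coord):
        c = np.floor(d * coord + 0.5)
        i = np.arange(d)
        return np.maximum.reduce([1.0 - np.abs((i - c) / s),
                                  1.0 - np.abs((d + i - c) / s),
                                  np.zeros(d)])
    return np.concatenate([encode(x), encode(y)])
```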

Each input vector ξ := (vx, vy) was then augmented by noise as follows:

vx_i := max[ min[ vx_i + ξn (2 Urnd − 1), 1 ], 0 ],

vy_i := max[ min[ vy_i + ξn (2 Urnd − 1), 1 ], 0 ],

∀ i ∈ {0, …, d − 1},


with maximum noise level ξn and uniform random values Urnd ∈ [0,1]. The rationale for this noise model is as follows. Each element of the input vector represents the normalized firing rate of an input neuron, where typical peak firing rates of neurons in the parahippocampal-hippocampal region range between 1Hz and 50Hz (Hafting et al., 2005; Sargolini et al., 2006; Boccara et al., 2010; Krupic et al., 2012). Some proportion of this firing rate is due to spontaneous activity of the corresponding neuron. According to Koch (Koch, 2004) this random activity can occur about once per second, i.e., at 1Hz. Hence, the proportion of noise in the normalized firing rate resulting from this spontaneous firing can be expected to lie between 1.0 and 0.02 given the peak firing rates stated above. As we have no empirical data on the distribution of peak firing rates in the input signal of grid cells we assume a uniform distribution. The parameter ξn controls the maximum noise level or, implicitly, the assumed minimal peak firing rate. For example, a maximum noise level of ξn = 0.1 corresponds to a minimal peak firing rate of 10Hz, and a level of ξn = 0.5 corresponds to a minimal peak firing rate of 2Hz.
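In code, the noise augmentation amounts to adding uniform noise of amplitude ξn to every element of the input vector and clamping the result to [0, 1]; a possible NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng()

def add_noise(xi, noise_level):
    """Perturb a normalized input vector with uniform noise of maximum level
    xi_n (noise_level) and clamp the result to [0, 1], as defined in Sec. 3."""
    noisy = xi + noise_level * (2.0 * rng.random(xi.shape) - 1.0)
    return np.clip(noisy, 0.0, 1.0)
```

Combined with the encoding sketch above, `add_noise(encode_position(0.3, 0.7), 0.5)` would, for example, produce an input corresponding to an assumed minimal peak firing rate of 2Hz.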

4 SIMULATION RESULTS

To investigate how the RGNG-based grid cell model reacts to noise in its input signal we ran a series of simulations with increasing levels ξn of noise (or, correspondingly, decreasing minimal peak firing rates). All simulation runs used the fixed set of parameters shown in table 1 (appendix). Both the number θ1.M = 100 of simulated grid cells and the number θ2.M = 20 of dendritic subsections per cell are chosen to lie within a biologically plausible range. The latter number can be estimated based on the morphological properties of MEC layer II stellate cells, which are presumed to be one type of principal neurons that exhibit grid-like firing patterns (Giocomo et al., 2007; Giocomo and Hasselmo, 2008). The dendritic tree of these neurons has about 7500 to 15000 spines, each hosting one or more synaptic connections (Lingenhöhl and Finch, 1991). In our experiments we assume that the input space of each grid cell is composed of the output of 100 input neurons and that each grid cell recognizes θ2.M = 20 different input patterns originating from that space. This parametrization results in 3.75 to 7.5 available spines for each input dimension in a prototype pattern. This is a conservative choice that provides some margin for possible variability in input dimension and number of prototypes. Estimating the number θ1.M of

Figure 5: Artificial rate maps (left), gridness distributions (middle), and activity function plots (right) of simulation runs with varying levels ξn of noise (rows) added to the inputs. All simulation runs used a fixed set of parameters (Table 1) and processed location inputs derived from movement data published by Sargolini et al. (Sargolini et al., 2006). Each artificial rate map was chosen randomly from the particular set of rate maps. Average maximum activity (MX) and average minimum activity (MN) across all rate maps stated above. Gridness threshold of 0.4 indicated by red marks. Values of ratio r at average maximum activity (MX) given in blue. Insets show magnified regions of the activity function where MX values are low.

grid cells in a grid cell group³ is more difficult. Stensola et al. (Stensola et al., 2012) estimate that there are up to 10 different, discrete groups of grid

³ Grid cells are organized in groups that share the same grid spacing and grid orientation (Stensola et al., 2012).


spacings present in the MEC. Thus, as an upper bound the number of grid cells per grid cell group can be estimated as one-tenth of the total number of grid cells in layer II of the rat MEC. This number is not exactly known. Based on empirical data it can be estimated to lie between 14200 (Gatome et al., 2010; Krupic et al., 2012) and 25000 (Hafting et al., 2005; Sargolini et al., 2006; Boccara et al., 2010) cells, resulting in an upper bound for the size θ1.M of a grid cell group of between 1420 and 2500 grid cells. In contrast, a lower bound for the group size cannot be estimated reliably as the actual number of observed grid cells per grid cell group is quite small (< 50) in individual animals (Stensola et al., 2012; Krupic et al., 2012). Grid cell groups may be more numerous and contain fewer cells than indicated by the upper bound. However, in the context of our RGNG-based model this uncertainty appears to be noncritical. As reported by Kerdels (Kerdels, 2016) the behavior of the simulated grid cells is not influenced by the particular size of the grid cell group in any significant way. Thus, we chose a value of θ1.M = 100 as a compromise between the possibility of a few large and many small grid cell groups. For a full, detailed derivation and characterization of the other parameters we refer to Kerdels (Kerdels, 2016).

For the simulation runs reported here we used a sequence of input locations taken from recorded movement data of rats running around in a square environment while chasing food pellets. To keep the simulation runs comparable, each run used the same trajectory. The data was published by Sargolini et al. (Sargolini et al., 2006) and is available for download⁴.

Figure 5 summarizes the results of these experiments. For each run (rows) the figure shows the firing rate map of one grid cell that was randomly chosen from the simulated grid cell group, the distribution of gridness scores⁵ of all simulated cells, and an activity function plot indicating the maximum activity exhibited by any of the simulated grid cells. The exemplary rate maps and gridness score distributions indicate that the RGNG-based model is able to tolerate increasing levels ξn ∈ {0.1, 0.3, 0.5, 0.7, 0.9} of noise without losing its ability to form the characteristic, triangular firing patterns of grid cells. However, with increasing noise level the maximum average activity (MX) present in the rate maps drops by two orders of magnitude. This decrease is clearly visible in the activity function plots shown in the right column of figure 5. Yet, the minimal difference between the

⁴ http://www.ntnu.edu/kavli/research/grid-cell-data

⁵ The gridness score ([−2, 2]) is a measure of how grid-like the firing pattern of a neuron is. Neurons with gridness scores greater than 0.4 are commonly identified as grid cells.

maximum average activity (MX) and the minimum average activity (MN) for any given noise level ξn is at least two orders of magnitude, i.e., each cell's activity may be normalized with respect to the particular noise level without much disturbance of the grid-like firing pattern.

The reduction of the maximum average activity with increasing noise levels can be explained by the ratio r used to derive each cell's activity (Fig. 2). The ratio r describes the relative distance of the current input ξ to the best matching unit s1 and the second best matching unit s2. If r is close to zero, the input has roughly the same distance to unit s1 and to unit s2, resulting in a low activity of the corresponding neuron. If, on the other hand, r is close to one, the input is close to the best matching unit s1, yielding a high activity of the simulated neuron. With increasing noise the probability that an input matches the prototype of the best matching unit very closely decreases substantially. Without noise all inputs originate from a lower-dimensional manifold in input space. Adding noise to these inputs moves the inputs away from this manifold in random directions, creating a local, high-dimensional region surrounding the low-dimensional manifold that is prone to effects that are commonly referred to as the curse of dimensionality. As a consequence, each BL prototype becomes surrounded by a kind of "dead zone" from which an input is unlikely to originate. These dead zones are indicated in the right column of figure 5 as light gray areas in the activity function plots. The values of the ratio r at the border of these zones (blue marks in Fig. 5) define the probable maximum activity of the corresponding neurons.
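The shrinking of the reachable range of r can be reproduced in a small Monte-Carlo experiment: perturbing inputs that would otherwise match a prototype exactly with uniform noise of increasing level makes the largest observed ratio r drop markedly. The setting below (two random prototypes in [0, 1]^100) is a toy illustration, not the simulation behind figure 5.

```python
import numpy as np

rng = np.random.default_rng(0)

def max_ratio_r(noise_level, dim=100, trials=10000):
    """Largest ratio r observed when inputs equal to prototype s1 are perturbed
    by uniform noise of the given level (toy setting, hypothetical prototypes)."""
    s1 = rng.random(dim)
    s2 = rng.random(dim)
    d12 = np.linalg.norm(s1 - s2)
    best = 0.0
    for _ in range(trials):
        xi = np.clip(s1 + noise_level * (2.0 * rng.random(dim) - 1.0), 0.0, 1.0)
        r = (np.linalg.norm(s2 - xi) - np.linalg.norm(s1 - xi)) / d12
        best = max(best, r)
    return best

for level in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"noise level {level}: max r ~ {max_ratio_r(level):.2f}")
```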

5 CONCLUSIONS

The simulation results presented here indicate that the prototype-based representation of input space utilized by our grid cell model shows a high resilience to noise present in its input signal. This means that even input signals with very low peak firing rates around 1Hz and a correspondingly large proportion of noise through spontaneous activity can be processed by our model. According to these results, we would expect that real grid cells

• can process inputs with low peak firing rates,

• may show a similar reduction in activity when the proportion of noise in their inputs is high, and

• would not suffer a degradation of their firing field geometry in such a case.


These hypotheses may be actively tested by experiments that introduce controlled noise to the input signal, e.g., by the use of optogenetics (Häusser, 2014). Alternatively, a passive comparison of the peak firing rates present in the input signal of a grid cell and the resulting peak firing rate of the grid cell itself may indicate a possible correlation. To the best of our knowledge no such experiments have been performed yet.

However, neurons are also known to have several strategies to directly compensate for noise in their input signal by, e.g., changing electrotonic properties of their cell membranes (Koch, 2004). Such adaptations could be represented in our grid cell model by normalizing ratio r with respect to the level of noise. How such a normalization could be implemented is one subject of our future research.

In addition to these neurobiological aspects the presented results also illustrate the general ability of prototype-based learning algorithms like the GNG or RGNG to learn the structure of an input space even in the presence of very high levels of noise, as shown in the last row of figure 5.

REFERENCES

Boccara, C. N., Sargolini, F., Thoresen, V. H., Solstad, T., Witter, M. P., Moser, E. I., and Moser, M.-B. (2010). Grid cells in pre- and parasubiculum. Nat Neurosci, 13(8):987–994.

Doherty, K., Adams, R., and Davey, N. (2005). Hierarchical growing neural gas. In Ribeiro, B., Albrecht, R., Dobnikar, A., Pearson, D., and Steele, N., editors, Adaptive and Natural Computing Algorithms, pages 140–143. Springer Vienna.

Domnisoru, C., Kinkhabwala, A. A., and Tank, D. W. (2013). Membrane potential dynamics of grid cells. Nature, 495(7440):199–204.

Fritzke, B. (1995). A growing neural gas network learns topologies. In Advances in Neural Information Processing Systems 7, pages 625–632. MIT Press.

Fyhn, M., Molden, S., Witter, M. P., Moser, E. I., and Moser, M.-B. (2004). Spatial representation in the entorhinal cortex. Science, 305(5688):1258–1264.

Gatome, C., Slomianka, L., Lipp, H., and Amrein, I. (2010). Number estimates of neuronal phenotypes in layer II of the medial entorhinal cortex of rat and mouse. Neuroscience, 170(1):156–165.

Giocomo, L. M. and Hasselmo, M. E. (2008). Time constants of h current in layer II stellate cells differ along the dorsal to ventral axis of medial entorhinal cortex. The Journal of Neuroscience, 28(38):9414–9425.

Giocomo, L. M., Zilli, E. A., Fransén, E., and Hasselmo, M. E. (2007). Temporal frequency of subthreshold oscillations scales with entorhinal grid cell field spacing. Science, 315(5819):1719–1722.

Hafting, T., Fyhn, M., Molden, S., Moser, M.-B., and Moser, E. I. (2005). Microstructure of a spatial map in the entorhinal cortex. Nature, 436(7052):801–806.

Häusser, M. (2014). Optogenetics: the age of light. Nat Meth, 11(10):1012–1014.

Jacobs, J., Weidemann, C. T., Miller, J. F., Solway, A., Burke, J. F., Wei, X.-X., Suthana, N., Sperling, M. R., Sharan, A. D., Fried, I., and Kahana, M. J. (2013). Direct recordings of grid-like neuronal activity in human spatial navigation. Nat Neurosci, 16(9):1188–1190.

Kerdels, J. (2016). A Computational Model of Grid Cells based on a Recursive Growing Neural Gas. PhD thesis, FernUniversität in Hagen, Hagen.

Kerdels, J. and Peters, G. (2013). A computational model of grid cells based on dendritic self-organized learning. In Proceedings of the International Conference on Neural Computation Theory and Applications.

Kerdels, J. and Peters, G. (2015a). Analysis of high-dimensional data using local input space histograms. Neurocomputing, 169:272–280.

Kerdels, J. and Peters, G. (2015b). A new view on grid cells beyond the cognitive map hypothesis. In 8th Conference on Artificial General Intelligence (AGI 2015).

Killian, N. J., Jutras, M. J., and Buffalo, E. A. (2012). A map of visual space in the primate entorhinal cortex. Nature, 491(7426):761–764.

Koch, C. (2004). Biophysics of Computation: Information Processing in Single Neurons. Computational Neuroscience Series. Oxford University Press, USA.

Krupic, J., Burgess, N., and O'Keefe, J. (2012). Neural representations of location composed of spatially periodic bands. Science, 337(6096):853–857.

Lingenhöhl, K. and Finch, D. (1991). Morphological characterization of rat entorhinal neurons in vivo: soma-dendritic structure and axonal domains. Experimental Brain Research, 84(1):57–74.

Mhatre, H., Gorchetchnikov, A., and Grossberg, S. (2010). Grid cell hexagonal patterns formed by fast self-organized learning within entorhinal cortex (published online 2010). Hippocampus, 22(2):320–334.

Pilly, P. K. and Grossberg, S. (2012). How do spatial learning and memory occur in the brain? Coordinated learning of entorhinal grid cells and hippocampal place cells. J. Cognitive Neuroscience, pages 1031–1054.

Podolak, I. and Bartocha, K. (2009). A hierarchical classifier with growing neural gas clustering. In Kolehmainen, M., Toivanen, P., and Beliczynski, B., editors, Adaptive and Natural Computing Algorithms, volume 5495 of Lecture Notes in Computer Science, pages 283–292. Springer Berlin Heidelberg.

Sargolini, F., Fyhn, M., Hafting, T., McNaughton, B. L., Witter, M. P., Moser, M.-B., and Moser, E. I. (2006). Conjunctive representation of position, direction, and velocity in entorhinal cortex. Science, 312(5774):758–762.

Stensola, H., Stensola, T., Solstad, T., Frøland, K., Moser, M.-B., and Moser, E. I. (2012). The entorhinal grid map is discretized. Nature, 492(7427):72–78.


Yartsev, M. M., Witter, M. P., and Ulanovsky, N. (2011). Grid cells without theta oscillations in the entorhinal cortex of bats. Nature, 479(7371):103–107.

APPENDIX

Recursive Growing Neural Gas

The recursive growing neural gas (RGNG) has essentially the same structure as the regular growing neural gas (GNG) proposed by Fritzke (Fritzke, 1995). Like a GNG an RGNG g can be described by a tuple⁶:

g := (U, C, θ) ∈ G,

with a set U of units, a set C of edges, and a set θ of parameters. Each unit u is described by a tuple:

u := (w, e) ∈ U,   w ∈ W := R^n ∪ G,   e ∈ R,

with the prototype w and the accumulated error e. Note that in contrast to the regular GNG the prototype w of an RGNG unit can either be an n-dimensional vector or another RGNG. Each edge c is described by a tuple:

c := (V, t) ∈ C,   V ⊆ U ∧ |V| = 2,   t ∈ N,

with the units v ∈ V connected by the edge and the age t of the edge. The direct neighborhood E_u of a unit u ∈ U is defined as:

E_u := { k | ∃(V, t) ∈ C, V = {u, k}, t ∈ N }.

The set θ of parameters consists of:

θ := {εb, εn, εr, λ, τ, α, β, M}.

Compared to the regular GNG the set of parameters has grown by θ.εr and θ.M. The former parameter is a third learning rate used in the adaptation function A (see below). The latter parameter is the maximum number of units in an RGNG. This number refers only to the number of "direct" units in a particular RGNG and does not include potential units present in RGNGs that are prototypes of these direct units.

Like its structure, the behavior of the RGNG is basically identical to that of a regular GNG. However, since the prototypes of the units can either be vectors or RGNGs themselves, the behavior is now defined by four functions. The distance function

D(x, y) : W × W → R

determines the distance either between two vectors, two RGNGs, or a vector and an RGNG. The interpolation function

I(x, y) : (R^n × R^n) ∪ (G × G) → W

⁶ The notation g.α is used to reference the element α within the tuple.

generates a new vector or new RGNG by interpolating between two vectors or two RGNGs, respectively. The adaptation function

A(x, ξ, r) : W × R^n × R → W

adapts either a vector or RGNG towards the input vector ξ by a given fraction r. Finally, the input function

F(g, ξ) : G × R^n → G × R

feeds an input vector ξ into the RGNG g and returns the modified RGNG as well as the distance between ξ and the best matching unit (BMU, see below) of g. The input function F contains the core of the RGNG's behavior and utilizes the other three functions, but is also used, in turn, by those functions, which introduces several recursive paths into the program flow.

F(g, ξ): The input function F is a generalized version of the original GNG algorithm that facilitates the use of prototypes other than vectors. In particular, it allows RGNGs themselves to be used as prototypes, resulting in a recursive structure. An input ξ ∈ R^n to the RGNG g is processed by the input function F as follows:

• Find the two units s1 and s2 with the smallest distance to the input ξ according to the distance function D:

s1 := argmin_{u ∈ g.U} D(u.w, ξ),

s2 := argmin_{u ∈ g.U \ {s1}} D(u.w, ξ).

• Increment the age of all edges connected to s1:

Δc.t = 1,   c ∈ g.C ∧ s1 ∈ c.V.

• If no edge between s1 and s2 exists, create one:

g.C ⇐ g.C ∪ {({s1, s2}, 0)}.

• Reset the age of the edge between s1 and s2 to zero:

c.t ⇐ 0,   c ∈ g.C ∧ s1, s2 ∈ c.V.

• Add the squared distance between ξ and the prototype of s1 to the accumulated error of s1:

Δs1.e = D(s1.w, ξ)².

• Adapt the prototype of s1 and all prototypes of its direct neighbors:

s1.w ⇐ A(s1.w, ξ, g.θ.εb),

sn.w ⇐ A(sn.w, ξ, g.θ.εn),   ∀ sn ∈ E_s1.


• Remove all edges with an age above a given threshold τ and remove all units that no longer have any edges connected to them:

g.C ⇐ g.C \ {c | c ∈ g.C ∧ c.t > g.θ.τ},
g.U ⇐ g.U \ {u | u ∈ g.U ∧ E_u = ∅}.

• If an integer multiple of g.θ.λ inputs has been presented to the RGNG g and |g.U| < g.θ.M, add a new unit u. The new unit is inserted "between" the unit j with the largest accumulated error and the unit k with the largest accumulated error among the direct neighbors of j. Thus, the prototype u.w of the new unit is initialized as:

u.w := I(j.w, k.w),   j = argmax_{l ∈ g.U} (l.e),   k = argmax_{l ∈ E_j} (l.e).

The existing edge between units j and k is removed and edges between units j and u as well as units u and k are added:

g.C ⇐ g.C \ {c | c ∈ g.C ∧ j, k ∈ c.V},
g.C ⇐ g.C ∪ {({j, u}, 0), ({u, k}, 0)}.

The accumulated errors of units j and k are decreased and the accumulated error u.e of the new unit is set to the decreased accumulated error of unit j:

Δj.e = −g.θ.α · j.e,   Δk.e = −g.θ.α · k.e,   u.e := j.e.

• Finally, decrease the accumulated error of all units:

Δu.e = −g.θ.β · u.e,   ∀ u ∈ g.U.

The function F returns the tuple (g, dmin) containing the now updated RGNG g and the distance dmin := D(s1.w, ξ) between the prototype of unit s1 and input ξ. Note that in contrast to the regular GNG there is no stopping criterion anymore, i.e., the RGNG operates explicitly in an online fashion by continuously integrating new inputs. To prevent unbounded growth of the RGNG the maximum number of units θ.M was introduced to the set of parameters.
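For readers who prefer code over set notation, the following condensed Python sketch implements the input function F for the non-recursive case of vector prototypes, i.e., one step of a plain GNG. Removal of isolated units is omitted to keep the index handling short, and all class, attribute, and parameter names are our own, not taken from the original implementation.

```python
import numpy as np

class Unit:
    def __init__(self, w):
        self.w = np.asarray(w, dtype=float)  # prototype (vector case only)
        self.e = 0.0                         # accumulated error

class VectorRGNG:
    """Input function F restricted to vector prototypes (a plain GNG step)."""

    def __init__(self, theta, init_prototypes):
        # theta: dict with keys eps_b, eps_n, lam, tau, alpha, beta, M
        self.theta = theta
        self.U = [Unit(w) for w in init_prototypes]
        self.C = {}                          # edge frozenset({i, j}) -> age t
        self.n_inputs = 0

    def neighbors(self, i):
        return [j for e in self.C for j in e if i in e and j != i]

    def F(self, xi):
        xi = np.asarray(xi, dtype=float)
        self.n_inputs += 1
        d = [np.linalg.norm(u.w - xi) for u in self.U]
        s1, s2 = (int(i) for i in np.argsort(d)[:2])
        # age all edges at s1, then (re)create the edge {s1, s2} with age 0
        for e in self.C:
            if s1 in e:
                self.C[e] += 1
        self.C[frozenset((s1, s2))] = 0
        # accumulate the squared error and adapt s1 and its neighbors towards xi
        self.U[s1].e += d[s1] ** 2
        self.U[s1].w += self.theta["eps_b"] * (xi - self.U[s1].w)
        for j in self.neighbors(s1):
            self.U[j].w += self.theta["eps_n"] * (xi - self.U[j].w)
        # drop edges older than tau (removal of isolated units omitted here)
        self.C = {e: t for e, t in self.C.items() if t <= self.theta["tau"]}
        # insert a new unit every lam inputs while fewer than M units exist
        if self.n_inputs % self.theta["lam"] == 0 and len(self.U) < self.theta["M"]:
            j = max(range(len(self.U)), key=lambda i: self.U[i].e)
            nb = self.neighbors(j)
            if nb:
                k = max(nb, key=lambda i: self.U[i].e)
                u = Unit((self.U[j].w + self.U[k].w) / 2.0)   # interpolation I_RR
                self.U[j].e *= (1.0 - self.theta["alpha"])
                self.U[k].e *= (1.0 - self.theta["alpha"])
                u.e = self.U[j].e
                self.U.append(u)
                new = len(self.U) - 1
                self.C.pop(frozenset((j, k)), None)
                self.C[frozenset((j, new))] = 0
                self.C[frozenset((new, k))] = 0
        # finally, decay the accumulated error of all units
        for u in self.U:
            u.e *= (1.0 - self.theta["beta"])
        return d[s1]                          # distance to the BMU (dmin)
```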

D(x, y): The distance function D determines the distance between two prototypes x and y. The calculation of the actual distance depends on whether x and y are both vectors, a combination of vector and RGNG, or both RGNGs:

D(x, y) :=
  D_RR(x, y)   if x, y ∈ R^n,
  D_GR(x, y)   if x ∈ G ∧ y ∈ R^n,
  D_RG(x, y)   if x ∈ R^n ∧ y ∈ G,
  D_GG(x, y)   if x, y ∈ G.

In case the arguments of D are both vectors, the Minkowski distance is used:

D_RR(x, y) := ( ∑_{i=1}^{n} |x_i − y_i|^p )^(1/p),   x = (x_1, …, x_n),   y = (y_1, …, y_n),   p ∈ N.

Using the Minkowski distance instead of the Euclidean distance allows the distance measure to be adjusted to certain types of inputs via the parameter p. For example, setting p to higher values results in an emphasis of large changes in individual dimensions of the input vector versus changes that are distributed over many dimensions (Kerdels and Peters, 2015a). However, in the case of modeling the behavior of grid cells the parameter is set to a fixed value of 2, which makes the Minkowski distance equivalent to the Euclidean distance. The latter is required in this context as only the Euclidean distance allows the GNG to form an induced Delaunay triangulation of its input space.
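A minimal NumPy version of D_RR; p = 2 recovers the Euclidean distance used throughout the grid cell model.

```python
import numpy as np

def minkowski_distance(x, y, p=2):
    """Minkowski distance between two vectors; p = 2 gives the Euclidean distance."""
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** p) ** (1.0 / p))
```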

In case the arguments of D are a combination of vector and RGNG, the vector is fed into the RGNG using function F and the returned minimum distance is taken as the distance value:

D_GR(x, y) := F(x, y).dmin,
D_RG(x, y) := D_GR(y, x).

In case the arguments of D are both RGNGs, the distance is defined to be the pairwise minimum distance between the prototypes of the RGNGs' units, i.e., the single linkage distance between the sets of units is used:

D_GG(x, y) := min_{u ∈ x.U, k ∈ y.U} D(u.w, k.w).

The latter case is used by the interpolation function if the recursive depth of an RGNG is at least 2. As the RGNG-based grid cell model has only a recursive depth of 1 (see next section), the case is considered for reasons of completeness rather than necessity. Alternative measures to consider could be, e.g., average or complete linkage.

I(x, y): The interpolation function I returns a new prototype that results from interpolating between the prototypes x and y. The type of interpolation depends on whether the arguments are both vectors or both RGNGs:

I(x, y) :=
  I_RR(x, y)   if x, y ∈ R^n,
  I_GG(x, y)   if x, y ∈ G.

In case the arguments of I are both vectors, the resulting prototype is the arithmetic mean of the arguments:

I_RR(x, y) := (x + y) / 2.


In case the arguments of I are both RGNGs, the resulting prototype is a new RGNG a. Assuming w.l.o.g. that |x.U| ≥ |y.U|, the components of the interpolated RGNG a are defined as follows:

a := I(x, y),

a.U := { (w, 0) | w = I(u.w, k.w), ∀ u ∈ x.U, k = argmin_{l ∈ y.U} D(u.w, l.w) },

a.C := { ({l, m}, 0) | ∃ c ∈ x.C ∧ u, k ∈ c.V ∧ l.w = I(u.w, ·) ∧ m.w = I(k.w, ·) },

a.θ := x.θ.

The resulting RGNG a has the same number of units as RGNG x. Each unit of a has a prototype that was interpolated between the prototype of the corresponding unit in x and the nearest prototype found in the units of y. The edges and parameters of a correspond to the edges and parameters of x.

A(x, ξ, r): The adaptation function A adapts a prototype x towards a vector ξ by a given fraction r. The type of adaptation depends on whether the given prototype is a vector or an RGNG:

A(x, ξ, r) :=
  A_R(x, ξ, r)   if x ∈ R^n,
  A_G(x, ξ, r)   if x ∈ G.

In case prototype x is a vector, the adaptation is performed as linear interpolation:

A_R(x, ξ, r) := (1 − r) x + r ξ.

In case prototype x is an RGNG, the adaptation is performed by feeding ξ into the RGNG. Importantly, the parameters εb and εn of the RGNG are temporarily changed to take the fraction r into account:

θ* := (r, r · x.θ.εr, x.θ.εr, x.θ.λ, x.θ.τ, x.θ.α, x.θ.β, x.θ.M),

x* := (x.U, x.C, θ*),

A_G(x, ξ, r) := F(x*, ξ).x.

Note that in this case the new parameter θ.εr is used to derive a temporary εn from the fraction r.

This concludes the formal definition of the RGNG algorithm.

Table 1: Parameters of the RGNG-based model used throughout all simulation runs. Parameters θ1 control the top layer RGNG while parameters θ2 control all bottom layer RGNGs of the model.

Parameter    θ1 (top layer)    θ2 (bottom layer)
εb           0.004             0.001
εn           0.004             0.00001
εr           0.01              0.01
λ            1000              1000
τ            300               300
α            0.5               0.5
β            0.0005            0.0005
M            100               20

Parameterization

Each layer of an RGNG requires its own set of parameters. In the case of our two-layered grid cell model we use the sets of parameters θ1 and θ2, respectively. Parameter set θ1 controls the main top layer RGNG while parameter set θ2 controls all bottom layer RGNGs. Table 1 summarizes the parameter values used for the simulation runs presented in this paper. For a detailed characterization of these parameters we refer to Kerdels (Kerdels, 2016).
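For convenience, the values of Table 1 can be written down as two plain parameter dictionaries; the key names are transliterations of the symbols used in the text, not identifiers from the original implementation.

```python
# Parameter sets from Table 1 (THETA_1: top layer RGNG, THETA_2: bottom layer RGNGs).
THETA_1 = dict(eps_b=0.004, eps_n=0.004, eps_r=0.01,
               lam=1000, tau=300, alpha=0.5, beta=0.0005, M=100)
THETA_2 = dict(eps_b=0.001, eps_n=0.00001, eps_r=0.01,
               lam=1000, tau=300, alpha=0.5, beta=0.0005, M=20)
```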