    Modelling Memory and Learning Consistently from Psychology to Physiology

    L. Andrew Coward
    College of Engineering and Computer Science, Australian National University, Canberra, ACT 0200, Australia
    [email protected]

    Abstract

    Natural selection pressures have resulted in the physical resources of the brain being organized into modules that perform different general types of information processes. Each module is made up of submodules performing different information processes of the general type, and each submodule is made up of yet more detailed modules. At the highest level, modules correspond with major anatomical structures like the cortex, hippocampus, basal ganglia, cerebellum etc. In the cortex, for example, the more detailed modules include areas, columns, neurons, and a series of neuron substructures down to molecules. Any one memory or learning phenomenon requires many information processes performed by many different anatomical structures. However, the modular structure makes it possible to describe a memory phenomenon at a high (psychological) level in terms of the information processes performed by the major anatomical structures. The same phenomenon can be described at each level in a hierarchy of more detailed descriptions, in terms of the information processes performed by anatomical substructures. At higher levels, descriptions are approximate but can be mapped to more detailed, more precise descriptions as required down to neuron levels and below. The total information content of a high level description is small enough that it can be fully understood. Small parts of a high level phenomenon, when described at a more detailed level, also have a small enough information content to be understood. The information processes and resultant hierarchy of descriptions therefore make it possible to understand cognitive phenomena like episodic, semantic or working memory in terms of neuron processes via a series of intermediate levels of description.

    1 Introduction

    Understanding how the human brain supports higher cognitive phenomena like memory and learning cannot rest on one set of models at the psychological level and another at the physiological level with no clear connection between them. A hierarchy of descriptions of the same phenomenon from psychological to physiological is needed, with clear mappings between the levels (Coward and Sun 2004; Sun et al. 2005). At any one level, it must be possible to describe how the situation observed at each point in time causes the situation at the next point in time; in other words, descriptions must be causal.

    The mapping between levels must be well understood. As in the physical sciences, the higher levels will be more approximate, but there must be clear understanding of when a more detailed level is necessary to achieve a given degree of quantitative accuracy (Coward and Sun 2007).

    Such a hierarchy of causal descriptions requires consistent information models for components at each level of description. The information models make it possible to map precisely from a causal description on one level of detail to descriptions on other levels of detail.

    For memory and learning, at the highest level a causal description in psychological terms is required. At a more detailed level, a causal description in terms of major anatomical structures, such as the cortex, thalamus, basal ganglia, cerebellum, etc., is needed that precisely maps into the psychological description. At an even more detailed level, a description in terms of components of the major anatomical structures (such as cortical columns and subcortical nuclei) must map into the higher level descriptions. At a yet more detailed level, a causal description in terms of neuron algorithms must map into the higher anatomical descriptions.

    Experience with the design of very complex electronic real-time systems indicates that a number of practical considerations place severe constraints on the architectures of such systems (Coward 2001). These practical considerations include resource limitations and the need to make changes and additions to some features without undesirable side effects on other features. “System architecture” means the way in which the information handling resources required to support system features are separated into subsystems. Each subsystem is individually customized for efficient performance of a different type of information recording, processing and/or communication, and there are limits on the type and degree of information exchange between subsystems. The separation between memory and processing and the sequential execution of information processes, often known as the von Neumann architecture, are important aspects of these architectural constraints.

    Although there are minimal direct resemblances between such electronic systems and brains, natural selection results in a number of practical considerations that influence brain architectures in analogous ways (Coward 2001, 2005). If the brains of two species can learn the same set of behaviours, but the brain architecture of one of these species requires fewer resources, then the species with the more efficient architecture will have a significant natural selection advantage. If one species can learn new behaviours with less interference to previously learned behaviours than another species, then again the more effective species will have natural selection advantages.

    It can be demonstrated theoretically (Coward 2001) that these and other practical considerations tend to constrain the architecture of any sufficiently complex learning system into some specific architectural forms, analogous with but qualitatively different from the von Neumann architecture. There is considerable evidence (Coward 2005) that the mammal brain has been constrained into these forms, known as the recommendation architecture.

    As a result of these architectural forms, different general information models can be assigned to different major anatomical structures including the cortex, hippocampal system, thalamus, basal ganglia, amygdala, hypothalamus and cerebellum. More specific information models that support the general models can be assigned to substructures of these structures, such as areas and columns in the cortex; CA fields, dentate gyrus and associated cortices in the hippocampal system; and striatum, substantia nigra, globus pallidus, nucleus accumbens, etc. in the basal ganglia. Yet more specific supporting information models can be assigned to neurons, synapses and ion channels.

    This hierarchy of consistent information models is the foundation for modelling memory phenomena consistently on a psychological level, on the level of major anatomical structures, on the level of more detailed structures and on the level of neuron physiology. Cognitive phenomena such as retrieval of episodic memories can be mapped into sequences of activities in major anatomical structures that result in the memory retrieval, an activity within one such structure (such as the cortex) can be mapped into sequences of activities in substructures of that structure (such as cortical columns), and an activity in a substructure can be mapped into a sequence of activities at a neuron level. If necessary, a neuron activity can be mapped into a sequence of chemical activities at the synapse level, etc. The end result is understanding of the psychological phenomenon in terms of neurons.

    2 The Recommendation Architecture Model

    The information models for the major subsystems in the recommendation architecture are illustrated in figure 1, along with the anatomical structures of the mammal brain that correspond with each subsystem (Coward 1990, 2001, 2005, 2009a; Coward and Gedeon 2009). The primary separation is between a subsystem called clustering (corresponding with the cortex) and a subsystem called competition (corresponding with the basal ganglia and thalamus). Clustering organizes the system resources used to define and detect conditions within the information available to the system. Competition interprets each current condition detection as a set of recommendations in favour of many different behaviours, each with an individual weight, and implements the behaviour with the largest total weight across all current condition detections. Reward feedback results in adjustments to the weights that recommended recently implemented behaviours. However, such reward feedback cannot change condition definitions, because such changes would interfere with all the other behaviours recommended by the same condition.
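
    As a rough illustration only, the sketch below expresses this competition information model in code. The class name, data structures and learning rate are assumptions introduced for the example, not taken from the chapter: each detected condition contributes weighted recommendations, the behaviour with the largest total weight is implemented, and reward feedback adjusts only the weights that recommended that behaviour while leaving condition definitions untouched.

```python
# Minimal sketch of the competition information model (all names and the
# learning rate are illustrative assumptions, not taken from the chapter).
# Each detected condition carries recommendation weights in favour of many
# behaviours; the behaviour with the largest total weight is implemented,
# and reward feedback adjusts only the weights that recommended it.
from collections import defaultdict

class Competition:
    def __init__(self, learning_rate=0.1):
        # weights[condition_id][behaviour] = recommendation weight
        self.weights = defaultdict(lambda: defaultdict(float))
        self.learning_rate = learning_rate

    def select_behaviour(self, detected_conditions):
        totals = defaultdict(float)
        for cond in detected_conditions:
            for behaviour, weight in self.weights[cond].items():
                totals[behaviour] += weight
        return max(totals, key=totals.get) if totals else None

    def reward(self, detected_conditions, behaviour, reward_value):
        # Reward changes only the recently active weights that recommended
        # the implemented behaviour; condition definitions are never changed.
        for cond in detected_conditions:
            self.weights[cond][behaviour] += self.learning_rate * reward_value
```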

    Clustering is organized as a modular hierarchy, with more detailed modules detecting sets of very similar conditions and higher level modules made up of groups of more detailed modules detecting larger sets of somewhat less similar conditions. The primary driving force that generates this hierarchy is the need to share the resources used to detect similar conditions. Individual modules detect conditions relevant to many different behaviours, and module outputs therefore have very complex behavioural meanings.

    Competition is organized as a component hierarchy. The use of reward feedback within competition means that component outputs can only have very simple behavioural meanings, and components must correspond with one individual behaviour or with one general type of behaviour.

    The information available to the cortex includes sensory inputs and inputs indicating internal activity of the brain (including the cortex). A condition is defined by a set of inputs and a specified state for each input. A condition is detected if a high proportion of the inputs that define it are in their specified state. Two conditions are similar if there is significant overlap in the information defining them and/or they often tend to be present at the same time (i.e. in the same system input states). This definition of similarity implies that two similar conditions will tend to have similar behavioural implications. Rather than connecting every individual condition detection to the component hierarchy, resource economy can be achieved by organizing conditions into groups on the basis of similarity. The conditions making up the group are recorded in a module, and the module generates an output to the component hierarchy only if a significant subset of the conditions in the group is present. If necessary, sufficiently different subsets can be indicated by different outputs. The group of similar conditions defines the receptive field of the module.
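
    The following sketch restates these definitions in code, under illustrative assumptions (the thresholds and the Jaccard-style similarity measure are choices made for the example, not the chapter's): a condition is a set of input/state pairs, two conditions are similar if their defining information overlaps, and a module outputs only when a significant subset of its recorded group of conditions is present.

```python
# Illustrative sketch of the definitions in the preceding paragraph. A
# condition is represented as a collection of (input, specified_state) pairs;
# thresholds and the similarity measure are assumptions chosen for illustration.

def condition_detected(condition, input_state, threshold=0.8):
    """A condition is detected if a high proportion of its inputs are in
    their specified state."""
    matches = sum(1 for inp, state in condition if input_state.get(inp) == state)
    return matches / len(condition) >= threshold

def condition_similarity(cond_a, cond_b):
    """Two conditions are similar if there is significant overlap in the
    information (inputs and states) defining them."""
    a, b = set(cond_a), set(cond_b)
    return len(a & b) / len(a | b)          # Jaccard overlap as one measure

def module_output(receptive_field, input_state, subset_threshold=0.5):
    """A module records a group of similar conditions (its receptive field)
    and generates an output only if a significant subset is present."""
    detected = sum(1 for c in receptive_field
                   if condition_detected(c, input_state))
    return detected / len(receptive_field) >= subset_threshold
```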

    Figure 1 The recommendation architecture mapped into mammal brain anatomy. The primary separation is between a modular hierarchy (called clustering and corresponding with the cortex) that defines and detects conditions in the information available to the brain, and a component hierarchy (called competition and corresponding with the thalamus and basal ganglia) that at each point in time receives some of the conditions detected by clustering, interprets each condition as a set of recommendations in favour of a range of different behaviours, each with an individual weight, and implements the most strongly recommended behaviour. Most conditions must be defined heuristically, and within clustering there is a subsystem (corresponding with the hippocampal system) that determines when and where new conditions will be recorded. Positive or negative reward feedback following a behaviour can adjust the recently active weights that recommended the behaviour, but cannot change condition definitions without severe interference with all the other behaviours recommended by the same conditions. Detections of conditions indicating that a reward is appropriate are received by the nucleus accumbens, which applies changes to weights in the basal ganglia. Most behaviours are implemented by release of information flows into, out of or within clustering, and a separate subsystem (corresponding with the thalamus) is required to efficiently manage such information releases. Conditions indicating the appropriateness of general types of behaviours (e.g. aggressive, fearful, food seeking, etc.) are provided to structures (corresponding with the amygdala and hypothalamus) that modulate the relative probability of such behaviour types. Frequently required sequences of behaviours that need to be executed rapidly and accurately are recorded and executed by the cerebellum.

    Conditions (and receptive fields) can be defined on many different levels of complexity, where the complexity of a condition is the total number of raw system inputs (including all duplicates) that contribute to the condition, either directly or via intermediate conditions. Receptive fields on different levels of complexity will tend to be effective in discriminating between different types of circumstances with different types of behavioural implications. For example, relatively simple receptive fields will be able to discriminate between different visual features (e.g. tail, wing and tooth). More complex receptive fields will be able to discriminate between different categories of visual object (e.g. cat, dog and bird). Yet more complex receptive fields will be able to discriminate between different types of groups of visual objects (e.g. cat chased by dog, cat confronting dog and cat avoiding dog). Simple receptive fields will be elements in the definitions of more complex receptive fields.
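
    As a small worked example of this definition, the hypothetical snippet below counts complexity recursively as the total number of raw inputs (duplicates included) contributing directly or via intermediate conditions; the nested-list representation is an assumption made for illustration.

```python
# Worked illustration of condition complexity as defined above: the total
# number of raw system inputs (counting duplicates) contributing directly or
# via intermediate conditions. The nested-list representation is an assumption.

def complexity(condition):
    if isinstance(condition, str):                 # a raw system input
        return 1
    return sum(complexity(part) for part in condition)

feature = ["pixel_1", "pixel_2", "pixel_3"]              # built from raw inputs
category = [feature, feature, ["pixel_2", "pixel_4"]]    # built from features
print(complexity(feature))   # 3
print(complexity(category))  # 8 (duplicated inputs are counted)
```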

    In a learning system, most conditions and receptive fields must be defined heuristically from system experience. Hence the information model for the cortex as a whole includes definition and detection of receptive fields at different levels of complexity within the information available to the brain.

    In the basal ganglia, high-level components correspond with general types of behaviour, and more detailed components within a high-level component correspond with more specific behaviours of the general type. A receptive field detection by the cortex is communicated to a large number of components in the basal ganglia. Each component interprets the detection as a recommendation in favour of its corresponding behaviour, with a weight that is different for each component. The information model for the basal ganglia is therefore interpretation of cortical receptive field detections as behavioural recommendations and implementation of the most strongly recommended behaviours.

    In general, a behaviour will be implemented by an information flow into, within or out of the cortex. There is therefore a requirement for a subsystem that provides coordinated management of all such information flows. This subsystem requires information on current cortical activity and information on currently selected behaviours. In the mammal brain, the thalamus corresponds with this subsystem. Thus the thalamus receives information from the cortex and gates information flows into the cortex (attention behaviours), out of the cortex (e.g. from the motor cortex to drive motor behaviours) and within the cortex (e.g. various cognitive behaviours).

    If a behaviour is implemented, it will have consequences. If these consequences are good, the probability of the behaviour being selected in the future can be increased; if the consequences are bad, the probability should be decreased. A reward subsystem is therefore required that receives indications of positive and negative consequences of behaviours and changes the recently active weights that led to the selection of recent behaviours. The nucleus accumbens within the basal ganglia corresponds with this subsystem. Because components in the component hierarchy correspond with behaviours, appropriate targeting of reward feedback is straightforward.

    Indications of the positive or negative consequences of behaviours will often be receptive field detections by the cortex. For example, the anterior cingulate cortex detects receptive fields that correlate with the presence of error conditions (Kiehl et al. 2000). The anterior cingulate cortex projects to the nucleus accumbens (Devinsky et al. 1995), where such receptive field detections are interpreted as recommendations in favour of negative rewards. If the recommendations are accepted (i.e. if there is adequate total weight into the nucleus accumbens), then the recently accepted behavioural weights elsewhere in the basal ganglia will be weakened.

    Any one receptive field detection recommends a wide range of different behaviours, and reward feedback follows an individual behaviour. Hence changes to receptive field definitions as a result of reward feedback would damage the integrity of the recommendation weights in favour of all the other behaviours. In general, any major change to a receptive field will damage the integrity of its associated behavioural meanings. As a result, receptive field changes must be strongly constrained. To a first approximation, a receptive field can be expanded slightly by addition of similar conditions, but previously added conditions cannot be changed or deleted. One exception to this rule is that if a condition is added and does not occur again within a significant period of time, it can probably be removed again. Another exception is that if a receptive field ceases to occur for some long period of time (perhaps because the source of its inputs has been damaged), it can probably be removed and its resources reassigned to another receptive field.
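
    A minimal sketch of these constraints, under assumed time units and window lengths, might look as follows: the field can only expand, and the two exceptions allow removal of a recently added condition that never recurs and reassignment of a field that has ceased to occur.

```python
# Sketch of the constraints on receptive field change described above:
# expansion only, with the two narrow exceptions (a recently added condition
# that never recurs, and a whole field that ceases to occur). Time units and
# window lengths are illustrative assumptions.

class ReceptiveField:
    def __init__(self):
        self.added_at = {}        # condition -> time it was added
        self.last_seen = {}       # condition -> time it was last detected

    def expand(self, condition, now):
        # The field may expand slightly by adding a similar condition;
        # previously added conditions are never edited.
        self.added_at.setdefault(condition, now)
        self.last_seen.setdefault(condition, now)

    def record_detection(self, condition, now):
        if condition in self.last_seen:
            self.last_seen[condition] = now

    def prune_unused_additions(self, now, window=100):
        # Exception 1: a condition added recently that has not occurred again
        # within a significant period can be removed.
        for cond in list(self.added_at):
            if self.last_seen[cond] == self.added_at[cond] and \
                    now - self.added_at[cond] > window:
                del self.added_at[cond], self.last_seen[cond]

    def reassignable(self, now, long_period=10000):
        # Exception 2: if the whole field has ceased to occur for a long
        # period, its resources could be reassigned to another receptive field.
        return bool(self.last_seen) and \
            all(now - t > long_period for t in self.last_seen.values())
```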

    An implication of this constraint on receptive field changes is that individual fields cannot be guided to correspond with unambiguous cognitive circumstances such as object categories. Any one receptive field may be detected within some instances of many different such object categories and will therefore have recommendation strengths in favour of behaviours appropriate to all of the categories (such as naming them). However, an array of such fields can be evolved so that it discriminates between such categories. In other words, the predominant recommendation strength across all the receptive fields detected within an instance of any category will be in favour of behaviours appropriate for that category, even though no one receptive field corresponds with any one category (Coward 2001, 2005).

    There is another important implication of the stability of receptive fields and their associated behavioural meanings. In response to an input state, a number of receptive fields will be detected, contributing their behavioural recommendation strengths. However, there are some potentially relevant recommendation strengths that are associated with receptive fields not actually being detected. For example, suppose there is a receptive field that is not currently being detected, but that has often been detected in the past when many of the currently detected receptive fields were also detected. Given this past consistency, there could be behavioural value in generating receptive field detections on the basis of frequent past simultaneous activity with currently detected receptive fields. Such value could be accessed by each receptive field also having recommendation strengths in favour of indirect activation of any receptive fields that have often been active in the past at the same time. Behavioural value could also exist for receptive field activations on the basis of recent simultaneous activity and also on the basis of simultaneous receptive field expansion. Furthermore, there could also be behavioural value in receptive field activations on the basis of past activity just after the activity of currently detected receptive fields, where the temporally correlated activity could be frequent, recent or receptive field expansion based.
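
    One way to picture this, purely as an illustration with assumed data structures, is to accumulate counts of past simultaneous and successive activity between receptive fields and to derive indirect-activation recommendation strengths from those counts; only the frequent-simultaneous and just-after cases are sketched.

```python
# Sketch of how recommendation strengths for indirect activation could be
# derived from past temporal correlations between receptive field detections.
# Only the "frequent past simultaneous activity" and "active just after"
# cases are shown; data structures and scoring are illustrative assumptions.
from collections import defaultdict

class CoactivityRecord:
    def __init__(self):
        self.simultaneous = defaultdict(int)   # (field, other_field) -> count
        self.successive = defaultdict(int)     # (earlier, later) -> count

    def record_timestep(self, previous_active, currently_active):
        for a in currently_active:
            for b in currently_active:
                if a != b:
                    self.simultaneous[(a, b)] += 1
            for p in previous_active:
                self.successive[(p, a)] += 1

    def indirect_activation_strengths(self, currently_active, all_fields):
        # Each detected field recommends indirect activation of fields that
        # were often active at the same time, or just afterwards, in the past.
        # These recommendations must compete with other behaviours.
        strengths = defaultdict(int)
        for candidate in set(all_fields) - set(currently_active):
            for active in currently_active:
                strengths[candidate] += self.simultaneous[(active, candidate)]
                strengths[candidate] += self.successive[(active, candidate)]
        return dict(strengths)
```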

    If unrestricted, such indirect activations would generate a huge amount of activity, and they must therefore be behaviours that compete with other behaviours for acceptance. These indirect activation mechanisms are the information basis for a wide range of memory phenomena.

    In order to achieve a high integrity behaviour, there must be an adequate range of recommendations available. To achieve an adequate range, there must be at least a minimum number of receptive field detections in response to every overall system input state. If this minimum is not reached, some receptive fields must expand in order to extend the range of available recommendations. However, to minimize changes to receptive fields, those requiring the least degree of expansion must be identified.

    There is therefore a requirement for a further major subsystem that determines whether receptive field expansions are required, and if so identifies the modules in which the least such expansion will be required and drives expansion in those modules. In the mammal brain, the hippocampal system corresponds with this resource management subsystem. The information model for the hippocampal system is therefore determination of when and where in the cortex receptive field expansions are appropriate and driving the required expansions.
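
    A hypothetical sketch of this resource-management decision is given below; it reuses the condition_detected helper from the earlier sketch, and the detection fraction is an assumed parameter. If too few modules reach detection for the current input state, the modules closest to their detection threshold (i.e. needing the least expansion) are selected for expansion.

```python
# Sketch of the resource-management role attributed to the hippocampal system:
# decide whether receptive field expansion is needed for the current input
# state and, if so, select the modules needing the least expansion (those with
# the highest sub-threshold condition detection). Reuses the hypothetical
# condition_detected helper defined earlier; thresholds are assumptions.

def select_modules_for_expansion(modules, input_state, minimum_detections,
                                 detection_fraction=0.5):
    """modules: dict of module name -> list of recorded conditions."""
    detected, partial = [], []
    for name, conditions in modules.items():
        hits = sum(1 for c in conditions if condition_detected(c, input_state))
        fraction = hits / len(conditions)
        if fraction >= detection_fraction:
            detected.append(name)
        else:
            partial.append((fraction, name))
    shortfall = minimum_detections - len(detected)
    if shortfall <= 0:
        return []      # enough recommendations available, no expansion needed
    # Modules closest to their detection threshold need the smallest expansion,
    # so existing behavioural meanings are disturbed least.
    partial.sort(reverse=True)
    return [name for _, name in partial[:shortfall]]
```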

    There is an additional requirement to record frequently used sequences of behaviours to ensure their rapid and accurate execution whenever required. In the mammal brain, the cerebellum corresponds with this subsystem.

    There is a requirement for a subsystem that modulates the relative probabilities of different general types of behaviours. Such general types could include aggressive, fearful, food-seeking, etc. There are two ways in which relative probabilities could be affected. One is to change the relative arousal of components in the component hierarchy corresponding with the behaviour types. The other is to change the probability of recommendations of the general type being generated. To economize on resources, most receptive field detections must be able to recommend any type of behaviour. However, if there are substantial behavioural advantages in separate receptive field definitions at some levels of complexity for different behavioural types, the advantages might outweigh the resource costs. Hence there could be some modules within the cortex that tend to recommend one general type of behaviour, and temporarily broadening their receptive fields would increase the probability of such behaviours being selected.

    Finally, there are two dynamic considerations. Efficient use of receptive field detection resources requires that the same set of resources must be able to detect receptive fields at one level of complexity simultaneously within multiple different sensory circumstances, keeping the different detections separate until it is appropriate to combine them in a controlled fashion. For example, if two dogs are chasing a squirrel, receptive fields must be detected within each dog and also the squirrel. There must be a set of receptive fields at a level of complexity that is effective for discriminating between objects. Resource economy dictates that each object will result in receptive field detections which are subsets of the same set, generally with some overlap between the subsets. Once receptive fields have been detected within all the individual objects, the detections must be combined to detect more complex receptive fields that can discriminate between different types of groups of objects. The problem is that the sets of simpler receptive fields must be kept completely separate (e.g. one activated set of receptive fields must not correspond with the head of a dog on the body of a squirrel) but must all be active at the same time for them to be combined to detect the more complex receptive fields (two dogs chasing a squirrel). As will be discussed in more detail later, one role of the gamma band frequency in the EEG is to maintain a separation between receptive fields detected within different objects but using the same neural resources.
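
    The separation requirement, though not the physiological mechanism, can be illustrated by a simple sketch in which each object's detections are bound to a different phase slot of a fast gamma-like cycle; all names and the number of slots are assumptions made for the example.

```python
# Highly simplified sketch of the separation requirement discussed above:
# receptive field detections belonging to different objects are kept in
# different phase slots of a fast (gamma-like) cycle, so the same module
# resources can hold several objects' detections at once without mixing them
# (no dog's head on a squirrel's body). This illustrates the information
# requirement only, not a physiological mechanism; all names are assumptions.

def assign_phase_slots(detections_per_object, slots_per_cycle=4):
    """detections_per_object: dict of object label -> set of detected modules.
    Returns a dict of phase slot -> (object label, frozen set of modules)."""
    if len(detections_per_object) > slots_per_cycle:
        raise ValueError("more simultaneous objects than available phase slots")
    return {slot: (label, frozenset(modules))
            for slot, (label, modules) in enumerate(detections_per_object.items())}

# Two dogs and a squirrel can share module resources (m3, m7, m9 appear twice)
# while each object's detections stay bound to their own slot until they are
# deliberately combined into more complex, group-level receptive fields.
slots = assign_phase_slots({
    "dog_1": {"m3", "m7", "m9"},
    "dog_2": {"m3", "m8", "m9"},
    "squirrel": {"m2", "m7"},
})
```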

    The second dynamic consideration is that because there is behavioural value in indirect activation of receptive fields on the basis of activity shortly after currently active fields, there is an analogous requirement for simultaneous activity in the same resources corresponding with a sequence of points in time. This simultaneous activity is required so that the appropriate links supporting future indirect activations can be established. As discussed later, one role of the theta band frequency in the EEG is to maintain separation between receptive field detections corresponding with different points in time.

    3 Review of Experimental Data Literature

    Tulving (1984, 1985) and later Schacter and Tulving (1994) developed an approach to classifying memory and learning phenomena. They suggested there were three criteria that could be used to identify a memory system in the brain that is separate from other memory systems. These criteria are the existence of a group of memory tasks which have some common characteristics, the existence of a list of features different from the list for any other system, and the existence of multiple dissociations between any two systems. A dissociation is a way in which a similar manipulation of tasks performed by different systems produces different effects on performance. An important category of dissociations is observations of patients with brain damage that affects one type of memory and not another.

    On the basis of a wide range of evidence, Schacter and Tulving (1994) proposed that there are five independent memory systems: semantic memory, episodic memory, priming memory, procedural memory and working memory. Semantic memory is the memory for facts and the meaning of words, without recall of the context in which those facts or words were learned. Episodic memory is the memory for events, including autobiographical memory for events with personal involvement. Semantic and episodic memories are together known as declarative memory because they are consciously accessible and can be described verbally. Priming memory is the ability to make use of experiences in the recent past to enhance behaviour, even when there is no conscious awareness or memory of those experiences. Procedural memory is the ability to learn skills, including motor skills. Working memory is the ability to maintain direct access to multiple objects, so that information derived from the objects is immediately available to influence behaviour.

    3.1 Semantic Memory

    Semantic memory is defined as the ability to recall a wide range of organized information including facts and word meanings (Tulving 1972). The typical experimental test of semantic memory is category verification, in which a subject is presented with a paired category name and instance name (e.g. mammal-monkey or mammal-pigeon) and asked to identify whether the pairing is correct. Identification speed is slightly slower for atypical instances (e.g. mammal-bat) than for typical or incorrect pairings (Rips et al. 1973).

    Functional neuroimaging indicates that semantic knowledge is encoded in many different cortical areas, especially the posterior temporal and frontal cortices, with the areas activated during semantic memory tasks generally being those also active during sensory or motor processing (Martin 2007). However, damage to the most anterior portions of the temporal cortices results in general loss of semantic memory capabilities, although this area does not show strong activation during semantic memory tasks (Rogers et al. 2006). There appears to be no consistent evidence for cortical area specialization by semantic domain (e.g. natural or man-made objects) or category (e.g. animals, fruit, tools and vehicles), although there is animal-specific activity in the left anteromedial temporal pole and tool-specific activity in the left posterior middle temporal gyrus which appears in a subset of experiments with lower statistical confidence (Devlin et al. 2002).

    3.2 Episodic Memory

    In the laboratory, episodic memory is measured by both recognition and recall experiments. In recognition experiments, subjects are shown a set of novel objects, later shown a mixture of further novel objects and objects from the earlier set, and asked to identify the objects seen before. With photographs, subjects have a remarkably high capability to identify previously perceived objects (Standing et al. 1970). In recall experiments, subjects are asked to describe one past event. To trigger the recall, subjects are given a word (Robinson 1976) or group of words (Crovitz and Schiffman 1974). Such experiments are often used to measure the past time period for which episodic memory retrieval has been degraded and the degree of degradation (Kensinger et al. 2001).

    Observation of the severity of retrograde amnesia in amnesic patients with damage to their hippocampal system indicates gradations between semantic and episodic memory (Nadel and Moscovitch 1997). In such patients, the most severe amnesia is for personal autobiographical memories. Amnesia for personal information, public events and persons is less severe, and amnesia for words and general facts is often minimal.

    Functional neuroimaging during episodic memory recall indicates strong activity in the prefrontal cortex (Fletcher et al. 1997) and also in visual and visual association areas (Addis et al. 2007). There is somewhat weaker activity in the hippocampal system (Fletcher et al. 1997). Strong cerebellar activity is also observed during episodic memory retrieval (Fliessbach et al. 2007).

    3.3 Procedural Memory

    Experimental tests of procedural memory are generally confined to simple learning tasks and are focussed on clarifying the distinction between procedural and other types of memory, in particular declarative memory (meaning both semantic and episodic). Typical investigations examine the ability of patients who have lost the ability to create new semantic and episodic memories to learn simple procedural skills. For example, a wide range of such amnesic patients showed the ability to learn to read words reflected in a mirror at the same rate as normal controls and to retain the skill for at least 3 months, despite failing to recall any familiarity with the task at the start of each session (Cohen and Squire 1981).

    For more complex skills, declarative knowledge speeds up the learning process (Mathews et al. 1989; Sun et al. 1996), and declarative memory is required for high levels of procedural skill performance (Mathews et al. 1989). However, there can be inconsistencies between procedural and declarative knowledge. When highly skilled subjects describe their skill, the descriptions often correspond with beginner methods rather than their actual methods, and generating a verbal description can result in reversion to the less effective beginner method (Bainbridge 1977; Berry 1987).

    There is a range of evidence, derived from the cognitive deficits associated with degeneration of the basal ganglia, indicating that this structure plays an important role in procedural learning. The symptoms of Parkinson’s disease include difficulty with voluntary movement, difficulty initiating movement and general slowness of movement. The observed physical deficit (Jankovic 2008) is degeneration of dopaminergic neurons in the substantia nigra pars compacta (SNc) nucleus of the basal ganglia. The major symptom of Huntington’s disease is the intrusion of irregular, unpredictable, purposeless, rapid movements that flow randomly from one body part to another (Berardelli et al. 1999). The observed physical deficit is loss of striatal cells that project into the indirect pathway (Starr et al. 2008).

    In addition, the cerebellum plays an important role in procedural learning (Torriero et al. 2007), although it also plays a role in a wide range of higher cognitive processes (Leiner et al. 1993) including semantic (Devlin et al. 2002), episodic and working memory (Cabeza et al. 2002).

    3.4 Working Memory

    Working memory refers to the number of different objects that can be maintained active in the brain at the same time. A typical working memory experiment is list recall. Subjects are shown a sequence of objects and immediately afterwards asked to list all the objects in any order. Normal subjects can fully recall sequences of seven (plus or minus two) random digits, but only four or five random words or letters (McCarthy and Warrington 1990). Miller (1956) suggested that the limit of seven plus or minus two is a fundamental information processing limit. Cowan (2000) argued that seven is an overestimate because subjects were able to rehearse and/or chunk items and proposed that the true information limit is close to four items, based on a wide range of observations. One key observation is that there are performance discontinuities around the number four, with errorless performance below four and sharp increases in errors above four (Mandler and Shebo 1982).

    In a variation of the list recall test, the number of items that subjects can report on longer lists is measured. Typically, there is enhanced recall for the first few items on the list and enhanced recall for the last few items (Baddeley 2000). A brief delay occupied with another task eliminates the recency effect but has much less influence on the primacy effect (Glanzer 1972). List recall capability is greater if there is a semantic connection between objects on the list. Recall of meaningful sentences can extend to 16 words or more, but recall for random words is limited to four or five (Baddeley et al. 1987).

    Neuroimaging indicates considerable overlap in the cortical areas active during working memory and declarative memory tasks. For example, Cabeza et al. (2002) used fMRI to compare brain activity during an episodic memory task (recalling whether a word was on a list of 40 words studied much earlier) with a working memory task (recalling whether a word was in a list of four words presented 15 s earlier). They found that the cerebellum and left dorsolateral cortex areas were active during both tasks, that bilateral anterior and ventrolateral cortex areas were more active during episodic retrieval, and that Broca’s area and bilateral posterior/dorsal areas were more active during working memory retrieval. A patient with damage to the left parietal lobe showed a deficit in working memory but unaffected declarative memory (Warrington and Shallice 1969; Shallice and Warrington 1970).

    3.5 Priming Memory

    Priming memory is a short-term effect of exposure to a stimulus on the response to a similar later stimulus. Priming memory decays rapidly over a period of minutes, then more slowly over periods of hours and days. One experimental test of priming is word stem completion, in which subjects are asked to complete each of a list of three-letter word stems with the first English word that comes to mind. A stem is the first three letters of a word, and in such experiments there are typically about ten possible completions. Previous study of the word increases the probability of the word being generated, provided the study is less than about 2 h prior to test (Graf et al. 1984). Amnesic patients with no ability to create new declarative memories have priming memory that is the same as, and decays at the same rate as, that of normal subjects (Graf et al. 1984).

    Another important priming memory experiment is tachistoscopic image recognition, in which a subject is shown a sequence of brief (

    (Kensinger et al. 2001), demonstrating a dissociation between episodic and semantic memory.

    Evidence for the separation between priming and procedural memories comes from the study of Huntington’s disease patients. Such patients are characterized by damage to the basal ganglia and exhibit severe deficits in motor skill learning, but their priming memory appears intact (Heindel et al. 1989).

    As mentioned earlier, patients have been observed to exhibit deficits in working memory with no apparent deficit in declarative memory (Warrington et al. 1971). In these patients, performance in an immediate memory span test, in which they were presented with strings of one to four digits, letters or words, revealed good recall for one-item strings but well below normal recall for strings with more than one item. However, recall of a short story was slightly better than for normal subjects.

    4 Other Modelling Approaches

    There is a very large number of attempts to model memory phenomena, ranging from attempts to prove that synaptic weight changes are always present during learning (Martin and Morris 2002) to high-level psychological models such as Baddeley’s working memory model, which uses subsystems like phonological memory and a central executive with no attempt to map into physiology. In general, previous models focus on one or two levels of description and do not present a consistent hierarchy of descriptions, with information models on each level, that can be mapped between all levels from psychology to physiology.

    The five system memory model proposed by Schacter and Tulving (1994) and described earlier is in fact a psychological level model with some implications for high level anatomy. As discussed earlier, there have been successful attempts to map the model into major anatomical structures using evidence from deficits resulting from local brain damage (Schacter and Tulving 1994) and from various imaging techniques including fMRI and PET. For example, Devlin et al. (2002) and Rogers et al. (2006) analysed the brain regions active during semantic memory processes. Kassubek et al. (2001) investigated brain regions active during procedural memory processes. Fletcher et al. (1997) and Addis et al. (2007) have investigated the brain regions active during episodic memory processes. These investigations demonstrate differences in the cortical areas active during different types of memory and are valuable high level descriptions, but do not provide information models for the processes that can be mapped into deeper level descriptions.

    Another type of approach has been the development of phenomenological models like Baddeley’s working memory model (1986) and the spreading activation model for semantic memory (Collins and Loftus 1975). These models propose information models for the phenomena at a high level but do not offer any mapping to more detailed levels. The ACT-R model (Anderson 1996) offers a detailed information model which can, for example, model working memory in a fair amount of quantitative detail (Lovett et al. 1999). However, although there have been attempts to map ACT-R to fMRI imaging, the ACT-R information models do not provide any mapping into information models for more detailed anatomical or physiological structures.

    Another extensive modelling approach to memory is that of Hasselmo and his collaborators. For example, they have proposed a model for the operation of working memory and episodic memory that uses reinforcement learning and attempts to account for a range of observations on rats (Zilli and Hasselmo 2007). This model postulates the existence of a number of buffers which can be in states reflecting current sensory inputs or a range of past sensory inputs that have been recorded. A key aspect of the model is the concept of actions which can be taken on the memory systems themselves in addition to actions on the external environment. There are some analogies between these proposed self actions and the indirect activation of receptive field information in the recommendation architecture. An implementation of their model has been described (Zilli and Hasselmo 2007), but the implementation does not provide mapping into plausible physiology. Furthermore, the model does not contain the recommendation architecture separation between clustering (i.e. condition definition and detection) and competition (i.e. reinforcement learning-based interpretation of conditions into behavioural recommendations). As a result, the model would have problems scaling up to learn complex combinations of behaviours.

    At the neurophysiological level there have been numerous proposals that synaptic plasticity supports memory (e.g. Martin and Morris 2002). However, although these proposals are based on experimental evidence that long-term changes to synaptic weights are associated with learning, they do not offer neuron level information models that can be mapped (through intermediate anatomical levels) into, for example, semantic and episodic memory.

    Another important set of models are those which attempt to understand the role of different EEG frequencies in memory. The beta frequency (12–30 Hz), gamma frequency (30–80 Hz) and the theta frequency (5–12 Hz) occur throughout the neocortex and hippocampus. These frequencies appear as modulations placed upon the firing of pyramidal neurons and are probably managed by interneuron activity (Whittington and Traub 2003). A number of proposals have been made that these frequencies play various roles in memory.

    For example, Hasselmo et al. (2002) proposed that different functions are supported in different phases of the theta frequency in the hippocampus. In one part of a theta cycle, associations between sensory events are learned. In the other part of the cycle, previously learned associations are retrieved. As a result, the theta rhythm plays an important role in the reversal of previously learned associations (e.g. when the physical location of a reward changes). This model has been simulated with a considerable degree of physiological detail (Cutsuridis et al. 2008, 2010; Cutsuridis and Wenneckers 2009). However, the model focusses on the hippocampus and does not address the role of the cortex in these memory functions, other than the general view that it is the long term storage location for declarative memory, with the hippocampus providing intermediate term storage. How declarative memories are transferred from hippocampus to cortex is not addressed.

    A number of workers have suggested that the gamma frequency binds together the activity of neurons representing the features of the stimulus that is the focus of attention (e.g. Engel and Singer 2001). To this proposal has been added the idea that the gamma frequency is important for the formation of declarative memories (Axmacher et al. 2006).

    There is a relationship between these proposals and the recommendation architecture requirement for separation between different populations of receptive field detections corresponding with different cognitive stimuli but within the same neural resources. This relationship will be discussed below. However, the various dynamic models do not provide causal descriptions of higher level cognitive phenomena which can be mapped into the dynamic models.

    Previously proposed memory models can model memory phenomena at one or two levels of detail, but do not provide a hierarchy of information models which can map consistently from psychological phenomena to neuron physiology.

    The primary focus of this chapter is on memory and learning phenomena. However, the recommendation architecture is a general cognitive architecture that can provide descriptions of a wide range of higher cognitive phenomena in terms of brain anatomy and physiology (Coward 2005). A comparison between the recommendation architecture approach and a wide range of alternative cognitive architectures including Haikonen’s neural architecture, Baars’ global workspace, virtual machine models, simulation models, kernel architectures and forward models has been performed (Coward and Gedeon 2009). This study concluded that the alternative approaches do not take adequate account of natural selection pressures on brain resources, and the mapping between higher cognition and detailed anatomy and physiology is more plausible for the recommendation architecture.

    5 Brain Anatomy and the Recommendation Architecture Model

    The recommendation architecture maps the information model for each major anatomical structure into more detailed models for its substructures, and so on all the way down to neuron physiology. The architecture explains how the more detailed models are implemented physiologically, and how the detailed models interact to support higher level models, up to descriptions of memory on a psychological level.

    5.1 Cortical Structure

    The cortex is a 3–4 mm thick sheet of tissue, with an area in adult humans of about 2,600 cm². The cortical sheet is organized into six layers, with the layers differing in cell type, size and density and in intralayer and interlayer connectivity. Six layers are generally prominent, but sublayers are often visible, and sometimes a major layer can be absent. The cortex is organized perpendicular to the sheet into groups of about 100 cells linked across the layers. These groups, called minicolumns, are produced by the cortical growth process, each minicolumn developing from a small set of progenitor cells. Cortical columns are much larger vertical structures, perhaps formed by binding together a number of minicolumns. Cortical columns vary from 300 to 600 µm in diameter and are distinguished by similarity of the receptive fields of all their principal neurons and by common short-range horizontal connections (Mountcastle 1997).

    The cortical sheet is separated into at least 50 areas (Brodmann 1908; Petrides and Pandya 1999, 2002; Morasan et al. 2001). These areas differ in the cell types, sizes and densities observed in each layer and sublayer, in the concentration of various neurochemicals such as the neurofilament protein that influences neuron size and shape, in the relative degree of myelinization of each layer and sublayer, in the interconnectivity with adjacent layers and in the interconnectivity with other areas.

    5.2 Cortical Information Models

    As discussed earlier, the information model for the cortex is definition and detection of receptive fields on many different levels of complexity within the information available to the brain. The number of detections must reach at least a minimum, but not an excessive, level in response to every overall brain input state. A receptive field is defined by a set of similar information conditions, with the receptive field being detected if a significant subset of the conditions is detected.

    5.2.1 Information Model for a Cortical Area

    The information model for an area is definition of receptive fields within one range of complexity and detection of the most significant receptive fields in each input state. These receptive fields must adequately address five requirements. The first requirement is that, in order to preserve previously acquired behavioural meanings, receptive fields can expand to detect additional, similar conditions but, with some tightly restricted exceptions, cannot change or discard previously added conditions. The second requirement is that, in response to any brain input state that results in inputs to the area, the number of receptive field detections must reach at least a minimum level. This second requirement ensures that, as discussed earlier, enough alternative behavioural recommendations are available to generate a high integrity behavioural selection. If detections are below the minimum, some receptive fields will expand until the minimum is reached.

    Figure 2 The concept of conditions and condition similarity. A visual domain of 30 x 30 pixels is defined. A visual object is present in the domain. A condition is defined by a set of elements (in this case pixels), with a state specified for each element of the set. Such conditions are illustrated in a and b. If a visual input results in a high proportion of the pixels in the set being in their specified state, the condition is present. Two conditions are similar if a high proportion of the elements and states that define them are the same and/or often occur in the same input states. Conditions I and II are very similar by this definition, since seven of their elements are the same, and all of the elements occur in the same input state. With this definition of similarity, similar conditions will tend to have similar behavioural implications (such as recommending saying “that is a bird”)

    The third requirement is that, in response to any brain input state that results in inputs to the area, the number of receptive field detections does not reach an excessive level. The fourth requirement is that, to conserve resources, two receptive fields should not be detected consistently in the same input states. In other words, receptive fields must be as orthogonal as possible, or as statistically independent as possible. The fifth requirement is that the set of receptive fields programmed in one area must be able to discriminate between different perceptual circumstances whenever such a difference implies that a different behavioural response is appropriate. The type of perceptual circumstance is different for different areas: the receptive fields in one area discriminate between visual features, in another between object categories, in yet another between different types of groups of objects, etc., as discussed earlier.

    Figure 3 The concept of receptive field expansion management. In the illustration there are six modules, each module having a receptive field defined by a set of similar conditions as described in figure 2. If a significant proportion of the conditions programmed in a module are detected, the module receptive field is detected and the module produces an output. The array of modules is required to detect at least two receptive fields in every input state. In scenario I, the visual input state results in detection of a significant proportion of conditions in modules b and d, corresponding with receptive field detection and resulting in outputs from those modules. The level of condition detection in other modules is relatively low. In scenario II, the visual input state is less familiar, and initially only module d has enough condition detection for receptive field detection. Because this is less than the minimum, the module with the highest degree of condition detection below the proportion for receptive field detection (i.e. module g) is selected, and additional conditions actually present in the current input are added to the module. Such additions expand the receptive field of the module and result in enough condition detections for receptive field detection and module output (scenario III). Selection of the module without receptive field detection but with the highest degree of condition detection is equivalent to selecting the module requiring the least receptive field expansion

    The conceptual process for definition of receptive fields can be understood by consideration of figures 2 and 3. In the figures, there is a visual input domain which is an array of pixels. Visual inputs appear within the domain. A condition is defined by a set of elements (in this case pixels) and a state for each element; the condition is detected if a high proportion of the elements are in their defined state. As illustrated in figure 2, two conditions are similar if their elements are the same and/or often occur in the same input states. The modules illustrated in figure 3 have receptive fields defined by groups of similar conditions, with the receptive field being detected if a high proportion of the conditions is detected. An array of modules such as the one illustrated in figure 3 detects different receptive fields, all in one range of complexity. Such an array is required to detect at least a minimum number of receptive fields in every input state. If the minimum is not achieved, modules that are detecting a significant number of conditions but less than the proportion for receptive field detection add conditions until the proportion for receptive field detection is reached. These conditions are selected (by a biased random process to be discussed later) from the large number of conditions that are present in the current input state. Receptive field expansions therefore occur in those modules for which the smallest expansion is required, causing the least damage to existing behavioural meanings of the receptive fields.

    Arrays of modules are arranged in a sequence, with the conditions detected in one array being combinations of the conditions detected in the preceding array. Hence there is a steady increase in receptive field complexity down the sequence, where complexity can be defined as the number of raw sensory inputs (in the conceptual example, pixels) that must be present in their appropriate state for the receptive field to be detected.

    Figure 4 Sequence of arrays of modules in which the conditions detected in one array are combinations of receptive field detections by modules in the preceding array. Few receptive fields will correspond with cognitively unambiguous circumstances, but at the appropriate level of receptive field complexity, the sets of receptive fields detected within instances of one category will be sufficiently different from the sets detected within instances of any other category that the currently detected set can effectively guide selection of a behaviour appropriate to the current category. In other words, the array of columns will be able to discriminate between different categories. Receptive fields on different levels of complexity will be able to discriminate effectively between different types of cognitive categories (features, objects, groups of objects, etc.). Multiple levels of receptive field complexity may be relevant to, for example, discrimination between different types of groups of objects

    It may be possible to describe receptive fields in areas close to sensory input in sensory terms [such as the class of visual shapes that contain the receptive field, see Tanaka (2003)]. However, for higher areas, receptive fields can best be understood simply as groups of receptive fields in other areas that have tended to be detected at similar times in the past. The receptive field in one area receives inputs from the group of receptive fields in other areas and is detected if a high proportion of the group is present.

    The mechanism for definition of receptive fields means that in general one receptive field will not correspond with one unambiguous cognitive circumstance such as an object category. In other words, no single receptive field will always be detected in all instances of one object category and in no instances of any other category. However, an array can be evolved to discriminate between types of cognitive circumstance, with arrays with different receptive field complexities being effective for discriminating between different types as illustrated in figure 4.

    Figure 5 An array of modules detecting receptive fields at a level of complexity effective for discriminating between different visual object categories like DOG and CAT. In I, the modules that detect their receptive fields in some instances of DOG are shaded. A different subset of these shaded modules will detect its receptive field in each actual instance of DOG, and each shaded module will have some recommendation strengths (in subcortical structures) in favour of DOG-appropriate behaviours such as saying “that is a dog”. In II, the modules in the same array that detect their receptive fields in some instances of CAT are shaded. There is some overlap between the CAT and DOG modules illustrated in I and II, reflecting the existence of some visual similarities between cats and dogs. Receptive fields detected in some instances of both CAT and DOG will have recommendation strengths in favour of both DOG- and CAT-appropriate behaviours. All the modules will also have recommendation strengths in favour of behaviours appropriate to many other categories of object. In III, the modules in the same array that detect their receptive fields in one actual CAT instance are shaded. Although some modules with DOG-appropriate recommendation strengths are active, the predominant recommendation strength across all receptive fields will be CAT-appropriate. The module array can thus discriminate between dogs and cats.

    The meaning of discrimination can be understood from figure 5. The modules in the array illustrated in the figure detect receptive fields within a range of complexity appropriate for discriminating between visual objects. Each receptive field corresponds with some visual circumstance which may occur in some instances of many different object categories. Thus, for example, one module may detect its receptive field in some instances of the category DOG, but also in some instances of the category CAT. In general, one receptive field will be detected in some instances of many different categories. There are no modules that detect their receptive fields in all DOG instances and in no instances of other categories, and individual modules are therefore cognitively ambiguous. However, the set of modules that detect their receptive fields in response to an actual CAT instance will have a predominant recommendation strength in favour of CAT-appropriate behaviours.
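    How a population of individually ambiguous modules can nevertheless discriminate can be illustrated by a short sketch in which recommendation strengths are summed across the currently detected modules. The module names, behaviours, and strengths below are invented for illustration; only the summation and selection logic reflects the description above.

# Hypothetical recommendation strengths held (in subcortical structures) by each module.
recommendation_strengths = {
    "m1": {"say_cat": 0.6, "say_dog": 0.3},
    "m2": {"say_cat": 0.7},
    "m3": {"say_dog": 0.8},
    "m4": {"say_cat": 0.5, "say_dog": 0.4},
}

detected = {"m1", "m2", "m4"}   # modules whose receptive fields fire on one CAT instance

totals = {}
for module in detected:
    for behaviour, strength in recommendation_strengths[module].items():
        totals[behaviour] = totals.get(behaviour, 0.0) + strength

selected = max(totals, key=totals.get)
print(totals)       # {'say_cat': 1.8, 'say_dog': 0.7}
print(selected)     # 'say_cat'

    Even though some of the detected modules also carry DOG-appropriate strengths, the predominant total favours the CAT-appropriate behaviour, which is the sense in which the array discriminates.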


    Module receptive field definition is not directly guided by the existence of cognitive categories, and an array might lack discrimination in some cases. A discrimination problem and the mechanism for resolving it are illustrated in figure 6. In the figure, because of a visual similarity an instance of a dog and an instance of a cat activate the same receptive fields. In other words, the array of receptive fields cannot discriminate between the two instances. The effect will be that the same behaviours will be recommended for both objects. Suppose that the dog was seen first, a DOG-appropriate behaviour was performed and rewarded, and the set of columns now has a strong overall recommendation strength in favour of repeating that behaviour in the future. When the cat is seen, there is strong recommendation strength in favour of a DOG-appropriate behaviour, but the behaviour is punished because it is incorrect. This strong recommendation followed by negative reward triggers expansion of receptive fields in modules with strong internal activity, even if the number of active modules already reaches the minimum required level. There is a reasonable probability that these modules will not detect their receptive fields in a repetition of the DOG instance, making it possible to discriminate between the instances in the future.
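    The repair rule can be sketched as follows. The column names, activity values, thresholds, and the list of conditions present in the current input are assumptions for the example; only the trigger (strong recommendation followed by negative reward) and the choice of the most internally active columns reflect the mechanism described above.

import random

# Hypothetical column state: internal activity level and recorded conditions.
columns = {
    "c1": {"activity": 0.9, "conditions": ["cond_a"]},
    "c2": {"activity": 0.4, "conditions": ["cond_b"]},
    "c3": {"activity": 0.7, "conditions": ["cond_c"]},
}
conditions_in_current_input = ["cond_x", "cond_y", "cond_z"]

def force_expansion_on_error(columns, recommendation_strength, reward,
                             strength_threshold=1.0, n_columns=2):
    # If a strongly recommended behaviour is punished, expand receptive fields
    # in the columns with the strongest internal activity, so a repeat of the
    # confused instance is likely to activate a different column set.
    if recommendation_strength >= strength_threshold and reward < 0:
        strongest = sorted(columns, key=lambda c: columns[c]["activity"],
                           reverse=True)[:n_columns]
        for name in strongest:
            # Biased random selection from conditions present in the current input.
            columns[name]["conditions"].append(
                random.choice(conditions_in_current_input))

force_expansion_on_error(columns, recommendation_strength=1.4, reward=-1.0)
print({name: col["conditions"] for name, col in columns.items()})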

    Figure 6 Managing discrimination problems. Initially, the visual similarities between the two pictures result in activation of the same sets of columns at the level of receptive field complexity that discriminates between object categories. Hence the same behaviour will be recommended in response to the two pictures (such as saying “cat”) even though one is a cat and the other is a dog. The contradictory consequence feedback in response to the same behaviour following activation of the same columns results in forced receptive field expansions which lead to different column sets in response to the two objects

    Because receptive fields expand over time, there is a requirement to limit the total number that is activated in response to one input state. This limitation is achieved by inhibition between receptive fields so that only the most strongly present are detected. The means by which this is achieved will be discussed after the information model for a pyramidal neuron has been described (figure 9).

5.2.2 Information Model for a Cortical Column

    The information model for a cortical column is definition and detection of a receptive field, including identification of the circumstances in which expansion of its receptive field could be appropriate. This information model can be understood by consideration of figure 7. The cortical column information model also includes indirect activation of a receptive field under special circumstances in which the field is not actually present in the current sensory input state. Such indirect activations are behaviours that must be adequately recommended by currently active receptive fields.

    Figure 7 Information model for a cortical column. A simple cortical column model made up of three layers of principal neurons is illustrated. The conditions detected by neurons in the top layer are combinations of receptive field detections in the areas that provide inputs to the column. The conditions detected by neurons in the second layer are combinations of receptive field detections by neurons in the top layer. The conditions detected by neurons in the bottom layer are combinations of receptive field detections by neurons in the middle layer. Outputs from the bottom layer are the column receptive field detections and go to other cortical areas and also to subcortical structures where they are interpreted as behavioural recommendations. Receptive field complexities therefore increase somewhat from the top to the bottom layer. If in response to some input state there is no activity in the bottom layer but significant activity in the middle layer, this indicates that there is some similarity between the current state and other states that have contained the column receptive field. Such middle layer activity indicates that if the receptive field of the column were expanded slightly, it would be detected in the current state. The strength of middle layer activity therefore indicates the degree to which receptive field expansion would be appropriate if required in response to the current input state. Outputs from the middle layer go to the resource manager, where they are used to determine the set of columns with the strongest degree of internal activity. If additional receptive field detections are required, then that set of columns receives inputs from the resource manager that drive receptive field expansion.

    In figure 7, there are three layers of principal neurons (generally pyramidal). The inputs that define the conditions detected in the top layer come from other areas of the cortex. The inputs that define the conditions detected in the middle layer come from the top layer, and the inputs that define conditions detected in the bottom layer come from the middle layer. The bottom layer provides outputs to other cortical areas and to subcortical structures where they are interpreted as behavioural recommendations. Note that this is a conceptual model; there could be more layers for various functional reasons (Coward 2005), and connectivity need not necessarily be sequential from top to bottom layer. From the point of view of the information model, the key point is that the conditions detected in the layer that provides outputs are more complex than the conditions detected in earlier layers. A situation could therefore arise in which there is no output from the column as a whole, but significant condition detection in earlier layers. The implication of such a situation is that the current input state contains many of the conditions that contribute to the definition of the column receptive field but not sufficient to trigger receptive field detection. Hence the degree of activity of neurons in the middle layer is a good indicator of the degree to which expansion of the column receptive field would be appropriate if required in response to the current input state.
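    The three-layer signal flow, and the use of middle-layer activity as an expansion-appropriateness signal, can be sketched as below. The layer groupings, thresholds, and input pattern are illustrative assumptions, not a claim about actual cortical connectivity.

def detections(previous_detections, condition_groups, fraction=0.5):
    # Each neuron detects its receptive field if at least `fraction` of the
    # detections in its condition group are present.
    out = []
    for group in condition_groups:
        hits = sum(previous_detections[i] for i in group)
        out.append(1 if hits / len(group) >= fraction else 0)
    return out

# Receptive field detections arriving from other cortical areas (assumed pattern).
area_inputs = [1, 1, 0, 1, 0, 0, 1, 0]

top_groups    = [(0, 1, 2), (1, 3, 4), (5, 6, 7), (2, 4, 6)]
middle_groups = [(0, 1), (1, 2), (2, 3)]
bottom_groups = [(0, 1, 2)]

top    = detections(area_inputs, top_groups)
middle = detections(top, middle_groups)
bottom = detections(middle, bottom_groups, fraction=0.7)

if sum(bottom) == 0 and sum(middle) > 0:
    # No column output, but significant middle-layer activity: the current input
    # is close to the column receptive field, so a small expansion would make it
    # detectable. Report the degree of activity to the resource manager.
    print("report to resource manager, middle-layer activity:", sum(middle))
else:
    print("column receptive field detections:", bottom)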

    The requirement is to determine if receptive field expansions are required (based on the overall degree of activity of columns across the area) and if so to determine the columns with the highest degree of internal activity and trigger receptive field expansions in those columns. Although in principle this requirement could be met by all-to-all connectivity between columns, it is more efficient and effective to use a resource manager (Coward 1990, 2005, 2009a). In figure 7, the output from the middle layer of the column to the resource manager carries information on the internal activity of the layer, and the inputs from the resource manager drive receptive field expansions as described below.

5.2.3 Information Model for a Pyramidal Neuron

    The information model for a pyramidal neuron is definition and detection of a receptive field within a column. There are a number of aspects to this information model. First, the separate information conditions that constitute the receptive field must be defined by groups of elements, with a condition being detected if enough of the elements are in the appropriate state. Second, the receptive field of the neuron must be detected and an output generated if a significant proportion of the conditions are detected. Third, it must be possible to expand the receptive field in appropriate circumstances by adding appropriate conditions. Fourth, it must be possible to limit the total activity across an array of columns, ultimately by limiting neuron activity in an appropriate fashion. Fifth, it must be possible to activate the neuron in some circumstances in the absence of its receptive field.

    The pyramidal neuron information model is conceptually illustrated in figure 8. In this information model, conditions correspond with groups of synapses on one branch of the apical dendrite. These synapses are connection points for axons coming from other pyramidal neurons. An incoming action potential causes a synapse to inject voltage potential into its branch. The magnitude of the injected potential is proportional to the weight of the synapse. In the conceptual figure, synaptic weights are in arbitrary units (w). As described in the next section, an injected potential increases rapidly after the arrival of the action potential and then decays more slowly. A significant proportion of these synapses must receive incoming action potentials for the total potential in the branch to reach a threshold enabling a potential contribution deeper into the dendrite, and thus raise the chance of the soma producing an output action potential indicating detection of its receptive field. Because of the decay in potentials, the inputs to one branch must occur within a relatively short time so that the potentials resulting from action potentials at different synapses add to each other. Synapses on one branch thus correspond with the elements defining one information condition, which is detected if a high proportion of the inputs are active.


    Figure 8 Information model for a pyramidal neuron. The neuron has a body (or soma) and two sets of dendritic trees. Inputs that define the conditions that in turn define the receptive field of the neuron synapse on different branches of the apical dendrite. If an action potential arrives at a synapse, it injects a voltage potential into the branch that is proportional to the weight of the synapse. If the total weight injected by action potentials arriving at all the synapses on a branch exceeds a threshold, the branch injects potential deeper into the dendrite. The group of synapses on a branch thus define an information condition. If the total potential injected into the dendrite reaches a high enough level (i.e. enough conditions are detected), the dendrite will inject potential into the soma. If the potential injected into the soma exceeds a threshold, the soma will generate an output action potential. The group of conditions defined on apical branches thus define the receptive field of the neuron. Inhibitory signals from local interneurons target the soma (or the proximal dendrites) and reduce the chance of the neuron producing an output. The basal dendrite can also inject potential to trigger an output action potential; the inputs to the basal dendrite come from sources that can generate such an output in the absence of the neuron receptive field

    The neuron will only produce an output if a number of branches inject potential deeper into the dendrite within a relatively short period of time. The group of conditions corresponding with branches thus define the receptive field of the neuron, which is detected if a significant proportion of the conditions are detected.

    A neuron can expand its receptive field by addition of new conditions. Such addition is achieved by means of provisional conditions such as the one illustrated in the lower left of figure 8. The total weight of all the synapses that define the elements of the condition is not sufficient for the branch to reach the threshold for potential injection deeper into the dendrite. However, there are additional synapses on the branch that come from the resource manager. If within a relatively short period of time, a high proportion of the condition defining elements are active and the inputs from the resource manager are also active, there will be enough potential in the branch to inject potential deeper into the dendrite. If a number of regular conditions are also being detected, then there will be enough potential to trigger firing of the neuron. If the neuron fires, a backpropagating action potential into the branch increases the weights of any synapses that have recently received an action potential from their source neuron. This is the long-term potentiation mechanism observed by Bi and Poo (1998). The effect is that the weights of the condition defining inputs are increased to the point that they can contribute to the firing of the neuron independent of the state of the inputs from the resource manager. Effectively, a new condition has been recorded.
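    The recording of a provisional condition can be sketched numerically. The weights, threshold, and potentiation increment below are arbitrary illustrative values; the weight increase simply stands in for the long-term potentiation mechanism referred to above.

branch_threshold = 4.0
condition_weights = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]   # provisional synapses, too weak alone
resource_manager_weight = 2.0

def branch_potential(active_inputs, rm_active):
    # Total potential injected into the branch by currently active synapses.
    p = sum(condition_weights[i] for i in active_inputs)
    if rm_active:
        p += resource_manager_weight
    return p

active_inputs = [0, 1, 2, 3, 4]   # most condition-defining inputs active together

# Without the resource manager input the branch stays below threshold.
print(branch_potential(active_inputs, rm_active=False) >= branch_threshold)   # False

# With the resource manager input the branch contributes; if the neuron then
# fires, a backpropagating action potential strengthens recently active synapses.
if branch_potential(active_inputs, rm_active=True) >= branch_threshold:
    for i in active_inputs:
        condition_weights[i] += 0.4    # assumed potentiation increment

# The condition can now contribute on its own: it has been recorded.
print(branch_potential(active_inputs, rm_active=False) >= branch_threshold)   # True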

    A key requirement is that new conditions on a neuron must be similar to previously recorded conditions on the neuron or on other neurons in the same layer of the same column. As discussed earlier, similarity means that the elements defining the condition are the same as and/or often active in the past at the same time as the elements defining other conditions. There are a number of factors that ensure this similarity. First, for a new condition to be recorded, conditions already programmed on the same neuron must be detected. Second, the inputs to a provisional condition are selected from those available in the neighbourhood. Third, the inputs to a provisional condition are selected from those often active in the past when the neuron is also active. This final factor is achieved utilizing REM sleep (Coward 1990 etc.). In REM sleep, there is a rerun of a selection of past experience. Provisional connectivity is established between neurons often active at the same time during this rerun.
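    The third factor, biasing provisional connectivity toward inputs that were often active at the same time as the target neuron during the rerun of past experience, can be sketched as a weighted random selection. The co-activity counts and the selection of three inputs are hypothetical; only the bias toward frequently co-active sources reflects the description above.

import random

# Hypothetical counts of how often each candidate source fired together with the
# target neuron during the rerun of past experience.
co_activity_counts = {"n1": 12, "n2": 1, "n3": 7, "n4": 0, "n5": 9}

def select_provisional_inputs(counts, k=3):
    # Biased random choice without replacement: sources that were more often
    # co-active with the target neuron are proportionally more likely to be chosen.
    candidates = dict(counts)
    chosen = []
    for _ in range(min(k, len(candidates))):
        names = list(candidates)
        weights = [candidates[n] + 1 for n in names]   # +1 keeps rare sources possible
        pick = random.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del candidates[pick]
    return chosen

print(select_provisional_inputs(co_activity_counts))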

    Limiting activity within a column and also across an array of columns is achieved using inhibitory interneurons with the connectivity illustrated in figure 9.

    As described earlier, there is behavioural value in the ability to indirectly activate columns on the basis of different types of temporally correlated past activity. Inputs to indirectly activate neurons in a column must be segregated from inputs defining receptive fields. Hence these inputs target a different dendrite system, the basal dendrites.

    The staged integration model described in this section is generally consistent with the ideas of Hausser and Mel (2003). However, the details of the staging could be different; the key requirement is to achieve appropriate receptive field expansions.

    Figure 9 Interneuron connectivity to limit column activity. Interneuron outputs inhibit their targets. Interneurons target pyramidal neurons in the same layer of the same column. Interneurons can receive inputs from pyramidals in the same layer of the same column or from the same layer of columns in the same area. Connectivity between columns can be economized by biasing it in favour of connectivity between interneurons and pyramidals often active at the same time. Connectivity within a column limits overall activity within the column. Connectivity between columns effectively results in a competition between columns, reducing the total number of active columns.
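    The competitive effect of this interneuron connectivity can be sketched very simply: each column's activity is reduced in proportion to the activity of the other columns, so only the most strongly active columns continue to fire. The column names, activity values, inhibition gain, and threshold are illustrative assumptions.

column_activity = {"c1": 0.9, "c2": 0.35, "c3": 0.8, "c4": 0.2}
inhibition_gain = 0.3       # strength of inhibition driven by other columns' activity
firing_threshold = 0.25

total = sum(column_activity.values())
net = {name: a - inhibition_gain * (total - a) for name, a in column_activity.items()}
surviving = {name: round(v, 2) for name, v in net.items() if v >= firing_threshold}
print(surviving)            # only the most strongly active columns remain active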


5.2.4 Pyramidal Neuron Dynamics

    As described earlier, a number of frequencies can be observed in the EEG, which reflects the firing patterns of pyramidal neurons. In the recommendation architecture model, these frequencies reflect different information processing functions by networks of pyramidal neurons. In general terms, one action potential generated by a neuron indicates a detection of its receptive field at one point in time. The degree to which the receptive field is present is reflected in the rate at which action potentials are generated, and such rates averaged across many neurons result in the beta band frequencies. As discussed earlier, effective use of cortical resources requires the ability to maintain separate but simultaneously active populations of receptive field detections within the same resources. The gamma band reflects an information process that maintains a separation between simultaneous receptive field detections within different sensory objects in the same neuron network. The theta band reflects an information process that maintains a separation between groups of receptive fields that are detected at a sequence of different points in time.

    Figure 10 Leaky integration and integration windows in a pyramidal neuron. For simplicity, staged integration is omitted. Action potential spikes arriving at different synapses each result in the injection of potential into the neuron. This injected potential rises over a period of a couple of milliseconds, then decays again with a decay constant of the order of 10 ms. It is assumed that the synaptic strengths are all the same. The postsynaptic potential resulting from different spikes adds linearly, and the total potential must exceed a threshold to result in (or contribute to) firing of the neuron. Because of the decay constant, two spikes must arrive close together in time to reinforce each other. Unless a number of spikes arrive within a fairly short period of time (of the order of the decay constant), the threshold will not be reached. This situation thus supports the concept of an integration window or period within which a minimum number of spikes must arrive for there to be a contribution to target neuron firing.

    The information model for the response of a pyramidal neuron to a sequence of action potentials is illustrated in figure 10. For simplicity, dendritic structure and staged integration are omitted from the figure, and it is assumed that the synaptic strengths are all the same. The potential injected by one action potential rises relatively rapidly after the arrival of the input action potential, peaks after 2–3 ms and then decays with a time constant of the order of 8 ms. This time constant implies that action potentials arriving significantly more than 8 ms apart do not reinforce each other. Hence if the threshold for a group of synapses to contribute to the firing of the neuron is several times the maximum potential injected by one spike, at least a minimum number of spikes must arrive within a period of the order of the time constant for the group of spikes to contribute to the firing of the neuron. The concept of an integration window is therefore useful, where at least a minimum number of spikes must arrive within the integration window to contribute to neuron firing. The integration window is not completely fixed; it depends both on the ratio of leaky integration peak to threshold and on how close together the spikes arrive, but simulations demonstrate that it is a viable concept (Coward 2004).
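    The leaky integration argument can be reproduced with a short numerical sketch. The rise time, decay constant, threshold, and spike times below are illustrative values in the spirit of figure 10 rather than fitted parameters, and the kernel is a simple assumed form.

import math

RISE_MS, DECAY_MS = 2.0, 8.0
THRESHOLD = 1.5        # roughly three times the single-spike peak (about 0.53 here)

def psp(t_since_spike):
    # Potential from one spike: rises over ~2 ms, decays with an ~8 ms constant.
    if t_since_spike < 0:
        return 0.0
    return (1 - math.exp(-t_since_spike / RISE_MS)) * math.exp(-t_since_spike / DECAY_MS)

def total_potential(spike_times_ms, t):
    # Potentials from different spikes add linearly (equal synaptic weights assumed).
    return sum(psp(t - s) for s in spike_times_ms)

clustered = [0.0, 1.5, 3.0, 4.0]     # four spikes inside one integration window
spread = [0.0, 12.0, 25.0, 40.0]     # four spikes too far apart to reinforce each other

for label, spikes in (("clustered", clustered), ("spread", spread)):
    peak = max(total_potential(spikes, t / 10) for t in range(600))
    print(label, "peak:", round(peak, 2), "reaches threshold:", peak >= THRESHOLD)

    Only the clustered spikes reach the threshold, which is the behaviour the integration window concept summarizes.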

    An important concept that makes the integration window functionally useful is frequency modulation as illustrated in figure 11. A frequency modulation signal applied to a neuron shifts each of its output spikes towards the nearest peak in the frequency modulation signal. Such a frequency modulation signal could be imposed by a regular sequence of spikes at the modulation frequency. Interneuron inputs generally inhibit the firing of their targets, but if an inhibitory spike arrives more than about 5 milliseconds before an excitatory spike, it adds to the excitatory effect (Gulledge and Stewart 2003). A regular stream of interneuron spikes will therefore tend to impose a frequency modulation on the outputs of pyramidal neurons.

    One functional value of frequency modulation can be understood from figure 12. If unmodulated inputs from some visual domain are provided to an array of columns, then the degree of receptive field detection will be much lower than if the inputs are modulated. Hence a domain in the visual field can be selected by placing a frequency modulation on all the inputs from that domain, and the effect will be preferential receptive field detection (and therefore behavioural recommendation generation) with respect to that domain. This is the information model for the attention function. Note that if the modulation is placed on inputs, the outputs of the neurons will also be frequency modulated, and the modulation will propagate through following layers of neurons.

    Figure 11 Concept of frequency modulation. In I, a neuron produces an output made up of a sequence of action potential spikes. For ease of explanation, the illustrated sequence is regular; the actual sequences produced by neurons are irregular but the concept applies in the same way. In II, a frequency modulation signal is applied to the neuron. The effect of the signal is to shift output spikes towards the nearest peak in the modulation signal


    Figure 12 Frequency modulation of inputs from a source. In I, three unmodulated input sources target a neuron. Within any integration window, there is only a total of about two spikes. If at least 4 spikes must be present to generate an output, there will be no outputs. In II, the inputs are modulated. This results in the same or fewer spikes in the integration window around the modulation minimum, but more around the maximum. There are sufficient around the maximum to fire the target neuron. Hence if a group of inputs from one source (such as a domain in the visual field) are frequency modulated, receptive fields will be detected only within the inputs from that domain

    Figure 13 Frequency modulation enabling simultaneous receptive field detections in multiple objects without interference. Inputs from different sources are given different phases of the same frequency modulation. As a result, spikes from the different sources tend to arrive in different integration windows, meaning that receptive fields present within different input sources can be detected without interference in the same neural resources, even in the same neuron.


    Another functional value can be seen from figure 13. If the interval between peaks in the modulation frequency is much larger than the integration window, then several integration windows can fit within one modulation cycle. If inputs from several different sources are modulated with the same frequency but at different phases, the spikes from the different sources will tend to arrive in different integration windows. Hence neuron receptive fields will be detected separately and independently within the different input sources. In other words, receptive fields within several different input sources (such as visual objects) can be detected simultaneously without crosstalk in the same neural resources.
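    A toy sketch of this phase separation is given below. It treats the integration window as a fixed 5 ms bin and assumes each source contributes four spikes per modulation peak; the period, window width, spike counts, and phases are illustrative values, not parameters of the model.

PERIOD_MS = 25.0        # modulation period, much longer than the integration window
WINDOW_MS = 5.0         # integration window width
SPIKES_NEEDED = 4       # spikes required within a window for receptive field detection

def modulated_spikes(phase_ms, n_cycles=4, spikes_per_peak=4):
    # Spikes from one source cluster around that source's modulation peaks.
    return [c * PERIOD_MS + phase_ms + j
            for c in range(n_cycles) for j in range(spikes_per_peak)]

def detecting_windows(spikes):
    # Integration windows (labelled by start time) that collect enough spikes.
    counts = {}
    for t in spikes:
        start = WINDOW_MS * int(t // WINDOW_MS)
        counts[start] = counts.get(start, 0) + 1
    return sorted(w for w, n in counts.items() if n >= SPIKES_NEEDED)

source_a = modulated_spikes(phase_ms=0.0)    # e.g. inputs from one visual object
source_b = modulated_spikes(phase_ms=10.0)   # second object, a different phase

print("A detected in windows starting at:", detecting_windows(source_a))
print("B detected in windows starting at:", detecting_windows(source_b))
# The two sources drive disjoint integration windows, so receptive fields within
# each object are detected in the same neural resources without crosstalk.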

    The principle of maintaining a separation between different populations of receptive field detections can be extended to make use of multiple different frequencies. For example, in order to keep separate the groups of receptive field detections corresponding with sensory circumstances at a sequence of different points in time, a different frequency could be used.

    Figure 14 Organization of the thalamus into nuclei with reciprocal connectivity between a nucleus and specific cortical areas. In addition, different additional cortical areas provide inputs to each nucleus but do not receive inputs in return. All connectivity between the thalamus and the cortex is excitatory

5.3 Structure of the Basal Ganglia and Thalamus

    The basal ganglia and thalamus are organized into nuclei: clusters of neurons separated by regions containing mainly axons. The major nuclei and connectivity of the basal ganglia and thalamus can be understood by consideration of figures 14–17.

    The thalamus is made up of a number of major nuclei as illustrated in figure 14. Each nucleus has strong reciprocal excitatory connectivity with a different cortical area, plus inputs from some other areas. Between each nucleus and the cortex, there is a nucleus called the thalamic reticular nucleus (TRN). As illustrated in figure 15, all axons between a thalamic nucleus and its associated cortices pass through a sector of the TRN, which regulates the information flow (Guillery et al. 1998).

    There are a number of separate nuclei that make up the basal ganglia. These nuclei and the major connectivity between them and with the thalamus and cortex are illustrated in figure 16.

    There are in fact several parallel paths from the cortex through the striatum, GPi/SNr and thalamus and back to the cortex (Alexander et al. 1986). Each path starts and ends in a different cortical area, and each path goes through different subnuclei of the striatum, GPi, SNr and thalamus. In each path, the striatum also receives cortical inputs from some additional areas, but there is no return connectivity from the path to those areas.

    Figure 15 All connectivity between a thalamic nucleus and the cortex passes through a sector of the TRN. The primary connection paths for one type of thalamic nucleus are illustrated (Guillery et al. 1998). The principal neurons of the thalamic nucleus, thalamocortical projection neurons, send excitatory connections to layer 4 of the cortex. Thalamocortical neurons receive excitatory inputs from pyramidal neurons in layers 5 and 6. Axons from layer 6 pyramidals and from thalamocortical neurons branch in the TRN to form synapses on the inhibitory interneurons that are the most common cells in the TRN, but the axons from layer 5 do not form such synapses. These inhibitory interneurons project to thalamocortical neurons. In addition, thalamocortical neurons receive inhibitory inputs from neurons in the basal ganglia

    Individual neurons in the striatum and in the globus pallidus correspond with specific aspects of motor activity, such as the direction of arm movement, but not with the underlying pattern of muscle activity associated with that movement (Crutcher and DeLong 1984; Mitchell et al. 1987).

    The nucleus accumbens is sometimes regarded as part of the basal ganglia. As illustrated in figure 18, it receives substantial inputs from the amygdala and from the orbitofrontal cortex, and its outputs target the GPi and SNr nuclei of the basal ganglia.

5.4 Information Models for the Thalamus and Basal Ganglia

    The general information model for these structures is interpretation of cortical receptive field detections as behavioural recommendations, and determination and implementation of the most strongly recommended behaviour. Reward feedback following a behaviour adjusts recently utilized recommendation weights. There are five components to this information model. The first is determination of raw total recommendation weights for each behaviour. The second is a competition between different behaviours to determine the most appropriate. The third is modulation to ensure that one and only one behaviour is selected. The fourth is implementation of the one selected behaviour. The fifth is utilizing reward feedback to modulate the probability of the same behaviour being selected in similar circumstances in the future.
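    The five components can be put together in a single short sketch. The receptive field names, behaviours, weights, and learning rate are invented for illustration; only the flow from raw weight totals through selection of a single behaviour to reward-driven adjustment of the recently used weights reflects the information model described above.

recommendation_weights = {
    # active cortical receptive field -> weight for each behaviour it recommends
    "rf1": {"reach": 0.5, "withdraw": 0.1},
    "rf2": {"reach": 0.3},
    "rf3": {"withdraw": 0.4, "vocalize": 0.2},
}

def select_and_learn(active_fields, reward_for, learning_rate=0.2):
    # 1. Raw totals for each behaviour from currently detected receptive fields.
    totals = {}
    for rf in active_fields:
        for behaviour, w in recommendation_weights[rf].items():
            totals[behaviour] = totals.get(behaviour, 0.0) + w
    # 2./3. Competition resolved so that one and only one behaviour wins.
    winner = max(totals, key=totals.get)
    # 4. Implementation (here simply reported).
    print("implement:", winner, totals)
    # 5. Reward feedback adjusts the recently utilized weights.
    reward = reward_for(winner)
    for rf in active_fields:
        if winner in recommendation_weights[rf]:
            recommendation_weights[rf][winner] += learning_rate * reward
    return winner

select_and_learn({"rf1", "rf2"}, reward_for=lambda b: 1.0 if b == "reach" else -1.0)
print(recommendation_weights["rf1"])   # weight for the rewarded behaviour has increased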

    Figure 16 The major nuclei and connectivity of the basal ganglia. Excitatory inputs from the cortex arrive at spiny projection neurons in the striatum. Each such neuron receives one or two inputs from a very large number of different cortical pyramidals. Striatal spiny projection neurons inhibit their targets, and there are two populations of these neurons. D1 population neurons project directly to the globus pallidus internal segment (GPi) and substantia nigra pars reticulata (SNr) and are excited by dopaminergic inputs from the substantia nigra pars compacta (SNc). D2 population neurons project indirectly to the GPi and SNr via two additional nuclei, the globus pallidus external segment (GPe) and the subthalamic nucleus (STN). As a result, the D2 population neurons ultimately excite the GPi and SNr. D2 population neurons are inhibited by dopaminergic inputs from the SNc. The GPi and SNr generate constant (tonic) inhibitory outputs to the thalamus. The direct path therefore tends to reduce inhibition of the thalamus while the indirect path tends to maintain this inhibition. As described earlier, the thalamus has reciprocal excitatory connectivity with the cortex. There is no direct return path from the basal ganglia to the cortex

5.4.1 Information Model for the Thalamus

    The thalamus implements selected behaviours. Such an implementation is generally the release of information into the cortex, between cortical areas, or out of the cortex. Principal cells in the thalamus excite different groups of cortical columns and correspond with the release of information from those groups. Thalamic principal cells are excited by a range of columns in their groups and elsewhere. This connectivity can be viewed as the cortical columns recommending the release of outputs from the group. However, the basal ganglia tonically inhibit the thalamic cells, and release will not occur unless the tonic inhibition is reduced. The release of information includes imposing frequency modulation with an appropriate phase. This frequency modulation could be the role of the TRN, with TRN interneurons inhibiting thalamocortical neuron activity out of phase with the targeted modulation peaks.

    Figure 17 Parallel paths linking basal ganglia, thalamus and cortex. In each path, one striatal subnucleus receives inputs from one primary cortical area and sends outputs to one pair of GPi and SNr subnuclei, which send outputs to one thalamic nucleus that connects reciprocally with the primary cortical area. In each path, a different group of cortical areas also provides inputs to the striatal subnucleus but does not connect reciprocally with the corresponding thalamic nucleus. Abbreviations follow Alexander et al. (1986)

    Figure 18 The primary connectivity of the nucleus accumbens. Following a behaviour, both the orbitofrontal cortex and amygdala provide information indicating the detection of circumstances in which reward feedback is appropriate. The nucleus accumbens interprets these inputs as recommendations in favour of adjustments of the weights in the GPi and SNr that resulted in the recent behaviour and applies the most strongly recommended adjustments. The weight adjustments therefore affect the probability of acceptance of similar behaviours in similar circumstances in the future

5.4.2 Information Model for the Striatum

    The information model for the striatum is that different striatal projection neurons correspond with different behaviours or types of behaviour. Each neuron receives inputs from many cortical pyramidals, which can be interpreted as raw recommendation weights in favour of the behaviour corresponding with the striatal neuron. For reasons discussed below, there is a pair of striatal projection neurons corresponding with each behaviour, with similar inputs, but with one neuron from the D1 population and the other from the D2 population.

    Competition within the striatum is implemented by the extensive local axon collaterals of striatal projection neurons, which mainly target other striatal projection neurons (Somogyi et al. 1981). In addition, there are populations of striatal interneurons that receive inputs from the cortex and target striatal projection neurons (Tepper and Bolam 2004).

5.4.3 Information Model for the GPi and SNr

    The information model for the GPi and SNr is that individual principal neurons correspond with behaviours or types of behaviour. A principal neuron produces steady (tonic) inhibitory output to thalamic principal neurons corresponding with the release of cortical outputs that implement the behaviour corresponding with that principal neuron. With this connectivity arrangement, if a principal neuron in the GPi and SNr is inhibited, the result will be implementation of its corresponding behaviour.
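    This disinhibition arrangement can be sketched as follows. The behaviour labels and activity levels are illustrative assumptions; the sketch only shows that removing the tonic GPi/SNr output for one behaviour releases the corresponding thalamic outputs and no others.

# GPi/SNr principal neurons produce tonic inhibition of the thalamic neurons
# that would release the cortical outputs for their corresponding behaviours.
gpi_snr_tonic = {"behaviour_A": 1.0, "behaviour_B": 1.0, "behaviour_C": 1.0}

def striatal_inhibition(selected_behaviour):
    # The direct (D1) pathway inhibits the GPi/SNr neuron for the selected behaviour.
    return {b: (1.0 if b == selected_behaviour else 0.0) for b in gpi_snr_tonic}

def thalamic_release(selected_behaviour):
    inhibition_of_gpi = striatal_inhibition(selected_behaviour)
    released = []
    for behaviour, tonic in gpi_snr_tonic.items():
        gpi_output = max(tonic - inhibition_of_gpi[behaviour], 0.0)
        if gpi_output == 0.0:
            # Tonic inhibition of the thalamus is removed only for this behaviour.
            released.append(behaviour)
    return released

print(thalamic_release("behaviour_B"))   # ['behaviour_B']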

    A GPi and SNr principal neuron corresponding with a part