Neural Networks for High-Level Intelligence and Cognition

Włodzisław Duch (Google: Duch)
Department of Informatics, Nicolaus Copernicus University, Toruń, Poland

IJCNN'2007, Orlando, Florida, August 14, 2007


TRANSCRIPT

Page 1

Neural Networks for High-Level Intelligence and Cognition

Włodzisław Duch (Google: Duch)

Department of Informatics, Nicolaus Copernicus University, Toruń, Poland

IJCNN’2007, Orlando, Florida, August 14, 2007

Page 2

Promise

1. Mind as a shadow of neurodynamics: a geometrical model of mind processes, with psychological spaces providing an inner perspective as an approximation to neurodynamics.

2. Intuition: learning from partial observations, solving problems without explicit reasoning (and without combinatorial complexity) in an intuitive way.

3. Neurocognitive linguistics: how to find neural pathways in the brain.

4. Creativity & word games.

Page 3

Motivation & possibilities

To reach human-level intelligence we need to go beyond pattern recognition, memory and control. How can this level be reached?

• Top-down style, inventing principles: laminar & complementary computing (Grossberg), chaotic attractors (Freeman & Kozma), AMD (John Weng), confabulation (Hecht-Nielsen), dynamic logic and mental fields (Perlovsky), mnemonic equations (Caianiello) ...

• Bottom-up style, systematic approximations, scaling up: neuromorphic systems, CCN (Izhikevich), Ccortex, O'Reilly ...

• Designs for artificial brains based on cognitive/affective architectures.

• Integration of perception, affect and cognition; large-scale semantic memory models; implementation of control/attention mechanisms.

Page 4

Exponential growth of power

From R. Kurzweil, The Law of Accelerating Returns.

By 2020 PCs will match the raw speed of brain operations!

Is the Singularity coming?

Page 5

Attractor neural networks and concept formation in psychological spaces: mind from brain?

Włodzisław Duch
Department of Informatics, Nicolaus Copernicus University, Toruń, Poland

www.phys.uni.torun.pl/~duch

Bioinspired Computational Models of Learning and Memory, Lejondal Castle, Sept. 2002

Page 6

Mind the Gap

The gap between neuroscience and psychology: cognitive science is at best an incoherent mixture of various branches.

Is a satisfactory understanding of the mind possible?

Roger Shepard, Toward a universal law of generalization for psychological science (Science, Sept. 1987): "What is required is not more data or more refined data but a different conception of the problem."

• Mind is what the brain does, a potentially conscious subset of brain processes.

How can the dynamics of the brain be approximated to get a satisfactory (geometric) picture of the mind?

Page 7

P-spaces

Psychological spaces:

K. Lewin, The Conceptual Representation and the Measurement of Psychological Forces (1938): cognitive dynamics as movement in phenomenological space.

George Kelly (1955), personal construct psychology (PCP): the geometry of psychological spaces as an alternative to logic.

A complete theory of cognition, action, learning and intention.

PCP network, society, journal, software ...

Page 8

Static Platonic model

Newton introduced space-time, the arena for physical events. Mind events need psychological spaces.

Goal: integrate neural and behavioral information in one model, creating a model of mental processes at an intermediate level between psychology and neuroscience.

Static version: short-term response properties of the brain, behavioral (sensorimotor) or memory-based (cognitive).

Applications: object recognition, psychophysics, category formation in low-dimensional psychological spaces, case-based reasoning.

Approach:
• simplify the neural dynamics, find invariants (attractors), characterize them in psychological spaces;
• use behavioral data and represent them in psychological space.

Page 9

Platonic mind model

Feature detectors/effectors: topographic maps.
Objects in long-term memory (parietal, temporal, frontal): local P-spaces.
Mind space (working memory, prefrontal, parietal): construction of mind-space features/objects using attention mechanisms.
Feelings: gradients in the global space.

Page 10

More neurodynamics

Amit group, 1997-2001: simplified spiking-neuron models of column activity during learning.

Formation of new attractors => formation of mind objects.

PDF: p(activity of columns | presented features)

Stage 1: single columns respond to some feature.
Stage 2: several columns respond to different features.
Stage 3: correlated activity of many columns appears.

Page 11

Category learning

A large field with many models. Classical experiments: Shepard, Hovland and Jenkins (1961), replicated by Nosofsky et al. (1994).

Problems of increasing complexity, with the results determined by logical rules. 3 binary-valued dimensions: shape (square/triangle), color (black/white), size (large/small); 4 objects in each of the two categories are presented during learning.

Type I - categorization using one dimension only.
Type II - two dimensions are relevant (the XOR problem).
Types III, IV and V - intermediate complexity between Types II and VI; all 3 dimensions relevant, of the "single dimension plus exception" type.
Type VI - the most complex: all 3 dimensions relevant, logic = enumerate the stimuli in each of the categories.

Difficulty (number of errors made): Type I < II < III ~ IV ~ V < VI.
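The structure of these six problems can be made concrete in a few lines; the 0/1 coding of the three features below is an assumed convention:

```python
# Shepard-Hovland-Jenkins category structures: 8 stimuli built from 3
# binary features, split into two categories of 4. Type I uses one
# feature, Type II is XOR on two, Type VI (parity) forces enumeration.
from itertools import product

stimuli = list(product([0, 1], repeat=3))  # (shape, color, size)

type_I = [s for s in stimuli if s[0] == 1]                 # one dimension
type_II = [s for s in stimuli if s[0] ^ s[1] == 1]         # XOR, size irrelevant
type_VI = [s for s in stimuli if s[0] ^ s[1] ^ s[2] == 1]  # all three relevant

for name, category in [("I", type_I), ("II", type_II), ("VI", type_VI)]:
    print(f"Type {name}: {category}")
```

Type II already defies any single-feature rule, which is why it is learned more slowly than Type I, yet faster than the parity structure of Type VI.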

Page 12

Canonical dynamics

What happens in the brain during category learning? Complex neurodynamics <=> the simplest, canonical dynamics. For every logical function one may write corresponding equations.

For XOR (Type II problems) the equations follow from gradient flow in a potential of the form:

V(x,y,z) = (1/4) xyz (x^2 + y^2 + z^2 - 3)

dx/dt = -∂V/∂x = -(1/4) yz (3x^2 + y^2 + z^2 - 3)
dy/dt = -∂V/∂y = -(1/4) xz (x^2 + 3y^2 + z^2 - 3)
dz/dt = -∂V/∂z = -(1/4) xy (x^2 + y^2 + 3z^2 - 3)

The corresponding feature space for the relevant dimensions A, B.

Page 13

Inverse base rates

Relative frequencies (base rates) of categories are used for classification: if in a list of diseases and symptoms the disease C, associated with the symptoms (PC, I), is 3 times more common than R, then the symptoms PC => C and I => C (the base rate effect).

Predictions contrary to the base rates: inverse base rate effects (Medin, Edelson 1988).

Although PC + I + PR => C (60% of answers), PC + PR => R (60% of answers).

Why? Psychological explanations are not convincing.

Are these effects due to the neurodynamics of learning?

I am not aware of any dynamical models of such effects.

Page 14

IBR explanation

Psychological explanation: J. Kruschke, Base Rates in Category Learning (1996). PR is attended to because it is a distinct symptom, although PC is more common.

Basins of attractors - neurodynamics; PDFs in the P-space {C, R, I, PC, PR}.

PR + PC activation leads more frequently to R because the basin of the attractor for R is deeper.

Construct the neurodynamics, get the PDFs. Unfortunately these processes live in 5 dimensions.

Prediction: weak effects due to the order and timing of presentation of (PC, PR) vs. (PR, PC), caused by trapping of the mind state by different attractors.

Page 15

Learning

Two points of view:

• Neurodynamics: I+PC is more frequent => stronger synaptic connections, larger and deeper basins of attractors.
  Psychology: the symptoms I and PC are typical for C because they appear more often.

• Neurodynamics: to avoid the attractor around I+PC leading to C, a deeper, more localized attractor around I+PR is created.
  Psychology: for the rare disease R the symptom I is misleading, so attention is shifted to PR, associated with R.

Page 16

Probing

Two points of view:

• Neurodynamics: activation by I leads to C because longer training on I+PC creates a larger common basin than I+PR.
  Psychology: I => C, in agreement with base rates; the more frequent stimuli I+PC are recalled more often.

• Neurodynamics: activation by I+PC+PR frequently leads to C, because I+PC puts the system in the middle of the large C basin, and even for PR the gradients still lead to C.
  Psychology: I+PC+PR => C because all symptoms are present and C is more frequent (base rates again).

• Neurodynamics: activation by PC+PR leads more frequently to R because the basin of the attractor for R is deeper, and the gradient at (PC, PR) leads to R.
  Psychology: PC+PR => R because PR is a distinct symptom, although PC is more common.

Page 17

Intuition

Intuition is also a concept difficult to grasp, but it is commonly believed to play an important role in business and other decision making: "knowing without being able to explain how we know".

Sinclair & Ashkanasy (2005): intuition is a "non-sequential information-processing mode, which comprises both cognitive and affective elements and results in direct knowing without any use of conscious reasoning".

The first tests of intuition were introduced by Westcott (1961); now 3 tests are used: the Rational-Experiential Inventory (REI), the Myers-Briggs Type Indicator (MBTI) and the Accumulated Clues Task (ACT).

Different intuition measures are not correlated, showing the problems in constructing a theoretical concept of intuition. Significant correlations were found between the REI intuition scale and some measures of creativity. Intuition may result from implicit learning of complex similarity-based evaluations that are difficult to express in a symbolic (logical) way.

Intuition in chess has been studied in detail.

Page 18

Intuitive thinking

Geometric representation of facts: + increasing, 0 constant, - decreasing.

Ohm's law V = I × R; Kirchhoff's law V = V1 + V2.

True: (I-, V-, R0), (I+, V+, R0); false: (I+, V-, R0).

5 laws: 3 Ohm's & 2 Kirchhoff's.

All laws of the form A = B + C, A = B × C, A^-1 = B^-1 + C^-1 have an identical geometric interpretation!

13 true and 14 false facts; a simple P-space, complex neurodynamics.

Question in qualitative physics: if R2 increases while R1 and Vt stay constant, what will happen to the current and to V1, V2?

Page 19

Intuitive reasoning

All 5 laws are simultaneously fulfilled, and all have the same representation:

F(V, R, I, V1, V2, R1, R2) = ∏_{i=1..5} F(Ai, Bi, Ci)

where each factor F(Ai, Bi, Ci) checks one law.

Question: if R2 = +, R1 = 0 and V = 0, what can be said about I, V1, V2? Find the missing values giving F(V=0, R, I, V1, V2, R1=0, R2=+) > 0. Suppose that some variable X = +; is it possible? Not if F(V=0, R, I, V1, V2, R1=0, R2=+) = 0, i.e. one law is not fulfilled. If nothing is known, 111 consistent combinations out of 2187 (5%) exist.

Intuitive reasoning, with no manipulation of symbols; heuristic: select the variable giving a unique answer. Soft constraints or semi-quantitative reasoning => small |FSM(X)| values.
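The counts quoted here are easy to reproduce: for positive quantities every law of the form A = B + C or A = B × C constrains the signs of the changes in exactly the same way, and the two-resistor circuit imposes 5 such laws on 7 variables. A minimal sketch, with the signs encoded as -1/0/+1 (my convention):

```python
from itertools import product

def consistent(a, b, c):
    # One qualitative law A = B + C (or A = B * C for positive quantities):
    # the sign of the change in A must match the sign of dB + dC; when dB
    # and dC have opposite signs the outcome is indeterminate (any sign).
    if b * c == -1:
        return True
    s = b + c
    return a == (0 if s == 0 else (1 if s > 0 else -1))

# 13 of the 27 sign triples satisfy a single law; the other 14 are false facts.
assert sum(consistent(a, b, c) for a, b, c in product((-1, 0, 1), repeat=3)) == 13

# Series circuit laws: V = V1+V2, R = R1+R2, V = I*R, V1 = I*R1, V2 = I*R2.
count = 0
for V, R, I, V1, V2, R1, R2 in product((-1, 0, 1), repeat=7):
    if (consistent(V, V1, V2) and consistent(R, R1, R2) and
            consistent(V, I, R) and consistent(V1, I, R1) and
            consistent(V2, I, R2)):
        count += 1

print(count)  # 111 consistent sign assignments out of 3**7 = 2187
```

Fixing known values such as R1 = 0, R2 = + simply restricts the loop, which is how the unique answers for the remaining variables can be read off.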

Page 20

Brains and understanding

General idea: when a text is read and analyzed, activation spreads through a semantic subnetwork; new words automatically assume the meanings that increase the overall activation, or the consistency of the interpretation. Many variants exist; all depend on the quality of the semantic network, and some include explicit competition among network nodes.

1. How to approximate this process in computer models?

2. How to use it for medical text understanding, correlating information from texts and genomic research?

3. How to build a practical system?

4. How to improve the training of MDs by understanding their learning processes?

Work at CCHMC, with John Pestian and Pawel Matykiewicz.
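A toy sketch of this spreading-activation process; the network, weights and decay factor below are invented for illustration:

```python
# Minimal spreading activation over a hand-made semantic network.
graph = {
    "doctor": {"patient": 0.8, "hospital": 0.6},
    "patient": {"symptom": 0.7, "doctor": 0.8},
    "hospital": {"ward": 0.5, "doctor": 0.6},
    "symptom": {"disease": 0.9, "patient": 0.7},
    "ward": {"hospital": 0.5},
    "disease": {"symptom": 0.9},
}

def spread(seeds, steps=3, decay=0.5):
    """Propagate activation from seed words; each hop is attenuated by decay."""
    act = dict.fromkeys(graph, 0.0)
    for word in seeds:
        act[word] = 1.0
    for _ in range(steps):
        new = dict(act)
        for node, nbrs in graph.items():
            for nbr, weight in nbrs.items():
                new[nbr] = max(new[nbr], act[node] * weight * decay)
        act = new
    return act

a = spread(["doctor", "symptom"])
# Words never mentioned in the text still receive activation:
print(sorted(a.items(), key=lambda kv: -kv[1]))
```

In a real system the network would come from a large semantic memory, and competition (inhibition) between alternative word senses would prune inconsistent interpretations.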

Page 21

Insights and brains

The activity of the brain has been investigated while people solved problems that required insight and problems that could be solved in a schematic, sequential way: E.M. Bowden, M. Jung-Beeman, J. Fleck, J. Kounios, "New approaches to demystifying insight", Trends in Cognitive Sciences 2005.

After solving a problem presented in a verbal way, subjects indicated themselves whether they had an insight or not.

Increased activity of the right-hemisphere anterior superior temporal gyrus (RH-aSTG) was observed during initial solving efforts and insights. About 300 ms before an insight a burst of gamma activity was observed, interpreted by the authors as "making connections across distantly related information during comprehension ... that allow them to see connections that previously eluded them".

Page 22

Insight interpreted

What really happens? My interpretation:

• LH-STG represents concepts, S = start, F = final;
• understanding/solving = a step-by-step transition from S to F;
• if no connection (transition) is found, this leads to an impasse;
• RH-STG 'sees' LH activity on a meta-level, clustering concepts into abstract categories (cosets, or constrained sets);
• a connection between S and F is found in the RH, leading to a feeling of vague understanding;
• a gamma burst increases the activity of LH representations for S, F and intermediate configurations;
• a stepwise transition between S and F is found;
• finding the solution is rewarded by emotions during the Aha! experience; these are necessary to increase plasticity and create permanent links.

Page 23

Memory & creativity

Creative brains accept more incoming stimuli from the surrounding environment (Carson 2003), with low levels of latent inhibition responsible for filtering out stimuli that were irrelevant in the past. "Zen mind, beginner's mind" (S. Suzuki) - learn to avoid habituation! The creative mind maintains complex representations of objects and situations.

The pairwise word-association technique may be used to probe whether a connection exists between the different configurations representing concepts in the brain.

A. Gruszka, E. Nęcka, Creativity Research Journal, 2002.

Words may be close (easy) or distant (difficult) to connect; priming words may be helpful or neutral; helpful words are related semantically or phonologically ("hogse" for "horse"); neutral words may be nonsensical or just not related to the presented pair.

Results for groups of people of low/high creativity are surprising ...

Presentation sequence: Word 1, priming word (0.2 s), Word 2.

Page 24

Creativity & associations

Hypothesis: creativity depends on associative memory, the ability to connect distant concepts together. Results: creativity is correlated with a greater ability to associate words and with susceptibility to priming; distal associations show longer latencies before a decision is made.

Neutral priming is strange!

• For close words and nonsensical priming words, creative people do worse than less creative ones; in all other cases they do better.

• For distant words, priming always increases the ability to find an association; the effect is strongest for creative people. Latency times follow the same strange pattern.

Conclusions of the authors: more synaptic connections => better associations => higher creativity.

The results for neutral priming remain puzzling.

Page 25

Words in the brain

The cell-assembly model of language has strong experimental support: F. Pulvermüller (2003), The Neuroscience of Language. On Brain Circuits of Words and Serial Order. Cambridge University Press.

Acoustic signal => phonemes => words => semantic concepts. Semantic activations are seen 90 ms after phonological ones in N200 ERPs.

Phonological density of a word = the number of words that sound similar to it, i.e. that create similar activations in phonological areas.

Semantic density of a word = the number of words with similar meanings, or a similar extended activation network.

Perception/action networks; results from ERP & fMRI.

Page 26

Words: simple model

Goals:
• make the simplest testable model of creativity;
• create interesting novel words that capture some features of products;
• understand new words that cannot be found in the dictionary.

The model is inspired by the putative brain processes at work when new words are being invented. Start from keywords priming the auditory cortex.

Phonemes (allophones) are resonances; ordered activation of phonemes will activate both known words and their combinations; context + inhibition in a winner-takes-most process leaves one or a few words.

Creativity = imagination (fluctuations) + filtering (competition).

Imagination: many chains of phonemes activate in parallel both word and non-word representations, depending on the strength of the synaptic connections. Filtering: associations, emotions, phonological/semantic density.
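The "imagination + filtering" recipe can be sketched directly; the keywords, the splicing rule and the pronounceability filter below are all invented stand-ins for the phoneme-level process:

```python
# Toy word inventor: splice prefixes of one keyword with suffixes of
# another (imagination), then keep pronounceable blends (filtering).
import re

keywords = ["creativity", "discovery", "imagination", "time", "portal"]

def blends(a, b):
    """All prefix-of-a + suffix-of-b splices, each piece >= 2 letters."""
    for i in range(2, len(a)):
        for j in range(1, len(b) - 1):
            yield a[:i] + b[j:]

def pronounceable(word):
    """Crude filter: moderate length and no run of 3+ consonants."""
    return 5 <= len(word) <= 12 and not re.search(r"[^aeiou]{3}", word)

new_words = set()
for a in keywords:
    for b in keywords:
        if a != b:
            new_words.update(w for w in blends(a, b) if pronounceable(w))

print(sorted(new_words)[:10])
```

Even this crude filter recovers blends of the kind shown on the next slide, such as "creatival" or "discoverity"; a serious model would filter by phonological and semantic density instead of a consonant-run rule.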

Page 27

Creating new words

A real letter from a friend: "I am looking for a word that would capture the following qualities: a portal to new worlds of imagination and creativity, a place where visitors embark on a journey discovering their inner selves, awakening the Peter Pan within. A place where we can travel through time and space (from the origin to the future and back); so, it's about time, about space, infinite possibilities. FAST!!! I need it sooooooooooooooooooooooon."

Some of the words generated:

creativital, creatival (creativity, portal) - used in creatival.com
creativery (creativity, discovery) - creativery.com (strategy + creativity)
discoverity = {disc, disco, discover, verity} (discovery, creativity, verity)
digventure = {dig, digital, venture, adventure} - still new!
imativity (imagination, creativity); infinitime (infinitive, time)
infinition (infinitive, imagination) - already a company name
journativity (journey, creativity)
learnativity - taken, see http://www.learnativity.com
portravel (portal, travel); sportal (space, sport, portal) - taken
timagination (time, imagination); timativity (time, creativity)
tivery (time, discovery); trime (travel, time)

Page 28

Word games

Word games were popular before computer games. They are essential to the development of analytical thinking. Until recently computers could not play such games.

The 20-question game may be the next great challenge for AI, because it is more realistic than the unrestricted Turing test; a World Championship with human and software players (in Singapore)?

Finding the most informative questions requires knowledge and creativity.
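One way to make "most informative question" concrete is a greedy entropy criterion; the mini knowledge base below is made up for illustration:

```python
# Greedy choice of the most informative yes/no question: ask about the
# feature whose answer leaves the fewest expected bits of uncertainty.
import math

# Hypothetical mini knowledge base: object -> features it has.
kb = {
    "salamander": {"animal", "amphibian", "spotted"},
    "frog": {"animal", "amphibian"},
    "cat": {"animal", "furry"},
    "quark": {"physics", "charged"},
    "electron": {"physics", "charged"},
}

def entropy_after(feature, candidates):
    """Expected bits still needed after asking 'does it have <feature>?'."""
    yes = [c for c in candidates if feature in kb[c]]
    no = [c for c in candidates if feature not in kb[c]]
    n = len(candidates)
    h = 0.0
    for part in (yes, no):
        if part:  # log2(len(part)) bits to single out one member of the part
            h += len(part) / n * math.log2(len(part))
    return h

candidates = list(kb)
features = sorted({f for fs in kb.values() for f in fs})
best = min(features, key=lambda f: entropy_after(f, candidates))
print(best)  # 'amphibian' -- a feature that splits the candidates evenly
```

A real player would also need noisy answers, a much larger semantic memory, and questions composed from several features, but the selection principle is the same.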

The performance of various models of semantic memory and episodic memory may be tested in this game in a realistic, difficult application.

Asking questions to understand precisely what the user has in mind is critical for search engines and many other applications.

Creating a large-scale semantic memory is a great challenge: ontologies, dictionaries (WordNet), encyclopedias, MindNet (Microsoft), collaborative projects like ConceptNet (MIT) ...

Page 29

[Diagram: semantic memory architecture. Components: humanized interface; applications (e.g. the 20-questions game); query; semantic memory store; parser; part-of-speech tagger & phrase extractor; online dictionaries; manual verification.]

Page 30

Puzzle generator

Semantic memory may be used to automatically invent a large number of word puzzles for an avatar to present. The application selects a random concept from all the concepts in the memory and searches for a minimal set of features necessary to define it uniquely; if many subsets are sufficient for a unique definition, one of them is selected at random.

It has charm, it has spin, and it has charge. What is it?

It is an amphibian, it is orange and it has black spots. What do you call this animal?

A salamander.

If you do not know, ask Google! The quark page comes out on top ...
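A few lines are enough to sketch this generator; the knowledge base here is a made-up stand-in for the real semantic memory:

```python
# Puzzle generator sketch: find a minimal feature subset that uniquely
# identifies a randomly chosen concept in a toy knowledge base.
import random
from itertools import combinations

kb = {
    "quark": {"has charm", "has spin", "has charge"},
    "electron": {"has spin", "has charge"},
    "photon": {"has spin"},
    "salamander": {"is an amphibian", "is orange", "has black spots"},
    "frog": {"is an amphibian"},
}

def minimal_clues(target):
    """Smallest set of the target's features shared by no other concept."""
    feats = sorted(kb[target])
    for k in range(1, len(feats) + 1):
        options = [c for c in combinations(feats, k)
                   if all(not set(c) <= kb[other]
                          for other in kb if other != target)]
        if options:
            return random.choice(options)  # any minimal subset will do
    return tuple(feats)  # fallback: nothing distinguishes the concept

concept = random.choice(list(kb))
clues = minimal_clues(concept)
print(f"It {', it '.join(clues)}. What is it?  ->  {concept}")
```

With a toy base a single clue often suffices; over a realistic semantic memory the minimal subsets grow, producing puzzles like the quark example above.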

Page 31

A few conclusions

Neurocognitive informatics: inspirations beyond the perceptron ... Sydney Lamb (Rice University) wrote a general book (1999) on the neural basis of language. How can practical large-scale algorithms be created?

Various approximations to knowledge representation in brain networks are being studied: the use of a priori knowledge based on reference vectors, the formation of graphs of consistent concepts in spreading-activation networks, and ontology- and semantics-based enhancements plus specific relations.

Clusterization/categorization quality has been used to discover which semantic types are useful (selecting categories of features), to expand and reduce the concept space, and to discover useful "pathways of the brain".

Can one identify specific clinotypes in summary discharges? Can they be used to improve the training of young MDs?

The sessions on Medical Text Analysis and the billing annotation challenge (April 1-5, 2007, IEEE CIDM, Honolulu) showed that human-level competence can be reached in some text analysis tasks!

Page 32

Thank you for lending your ears ...

Google: W. Duch => Papers

Duch W, Intuition, Insight, Imagination and Creativity, IEEE Computational Intelligence Magazine 2(3), August 2007, pp. 40-52.