
Ubiquitous Cognitive Computing: A Vector Symbolic Approach

BLERIM EMRULI, EISLAB, Luleå University of Technology

Outline

Context and motivation
Aims
Background (concepts and methods)
Summary of appended papers
Conclusions and future work


Conventional computing

1 + 2/3 = 1.6667
1010 XOR 1000 = 0010
1–64 bit variables

Cognitive computing

Concepts, relations, sequences, actions, perceptions, learning …

Some concepts are similar, others are not:

man ≅ woman
man ≇ lake

Cognitive computing (cont’d)

Bridging of dissimilar concepts:

man – fisherman – fish – lake
man – plumber – water – lake

Relations between concepts and sequences:

5 : 10 : 15 : 20
5 : 10 : 15 : 30
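How high-dimensional binary vectors support such similarity judgements can be sketched in a few lines of NumPy (an illustration, not code from the thesis; the concept names and the 90% bit overlap are assumptions):

```python
# Sketch: similar concepts share most of their bits, unrelated concepts
# match only about half of them. Names are purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # dimensionality, as on the 10,000-bit vector illustration

def random_vector():
    return rng.integers(0, 2, D, dtype=np.uint8)

def similarity(a, b):
    # Fraction of matching bits; ~0.5 for unrelated random vectors.
    return float(np.mean(a == b))

man = random_vector()
lake = random_vector()                        # unrelated concept
woman = man.copy()
woman[rng.choice(D, D // 10, replace=False)] ^= 1  # related: 90% shared bits

print(round(similarity(man, woman), 2))       # ≈ 0.9
print(round(similarity(man, lake), 2))        # ≈ 0.5
```

Because random high-dimensional vectors are nearly orthogonal, "chance level" is a sharp 0.5, so even moderate overlap stands out clearly.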


“…invisible, everywhere computing that does not live on a personal device of any sort, but is in the woodwork everywhere.”

– Mark Weiser (1994), widely considered to be the father of ubiquitous computing

Ubiquitous cognitive computing is cognitive computing for ubiquitous systems, i.e., systems that in principle can appear “everywhere and anywhere” as part of the physical infrastructure that surrounds us.


Intuition

[Figure: low-level processing (sensory integration) feeds high-level “symbol-like” representations, which in turn support high-level processing]

Aims

Investigate mathematical concepts and develop computational principles with cognitive qualities, which can enable digital systems to function more like brains in terms of:

learning/adaptation
generalization
association
prediction
…

Other desirable properties:

computationally lightweight
suitable for distributed and parallel computation
robust, degrading gracefully

Related approaches

service-oriented architecture (SOA)
traditional artificial intelligence techniques
cognitive approach (Giaffreda, 2013; Wu et al., 2014)

Geometric approach to cognition

What can we do with words of 1 kilobyte or more?

Pentti Kanerva started to explore this idea in the 1980s, from an engineering perspective with inspiration from biological neural circuits and human long-term memory.

Since the 1990s similar ideas have also been developed by Peter Gärdenfors, Professor at Lund University.

[Figure: a 10,000-dimensional binary vector, e.g. 1 0 1 0 1 0 1 0 1 …, with components indexed 1–10000]

Sparse Distributed Memory (SDM)

inspired by circuits in the brain
model of human long-term memory
associative memory

KEY IDEA: Similar or related concepts in memory correspond to nearby points in a high-dimensional space (Kanerva, 1988, 1993)
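The key idea can be illustrated with a toy SDM in Python (a minimal sketch after Kanerva's model, not the thesis implementation; all sizes are illustrative toy values):

```python
# Minimal sparse distributed memory sketch: random hard addresses,
# Hamming-radius activation, counter-based storage, majority read.
import numpy as np

rng = np.random.default_rng(1)
D, N, RADIUS = 256, 1000, 120   # dimensions, hard locations, activation radius

addresses = rng.integers(0, 2, (N, D), dtype=np.uint8)  # fixed random hard addresses
counters = np.zeros((N, D), dtype=np.int32)             # one counter row per location

def active(addr):
    # Activate locations whose hard address is within RADIUS Hamming distance.
    return np.sum(addresses != addr, axis=1) <= RADIUS

def write(addr, data):
    # Increment counters for 1-bits, decrement for 0-bits, at active locations.
    counters[active(addr)] += np.where(data == 1, 1, -1).astype(np.int32)

def read(addr):
    # Sum counters over active locations and threshold bitwise.
    return (counters[active(addr)].sum(axis=0) > 0).astype(np.uint8)

pattern = rng.integers(0, 2, D, dtype=np.uint8)
write(pattern, pattern)                        # autoassociative storage
noisy = pattern.copy()
noisy[rng.choice(D, 20, replace=False)] ^= 1   # corrupt 20 of 256 bits
print(np.mean(read(noisy) == pattern))         # fraction of bits recovered
```

Because many locations are activated for each address and their counters are pooled, the stored pattern is typically recovered exactly even from the corrupted cue: this is the "nearby points" property in action.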

SDM interpreted as computer memory

SDM interpreted as feedforward neural network

Vector symbolic architectures (VSAs)

Concepts and their interrelationships correspond to points in a high-dimensional space.

VSAs are able to represent concepts, relations, sequences…; to learn, generalize, associate…; and to perform analogy-making using vector representations based on sound mathematical concepts and principles (Plate, 1994).

Vector symbolic architectures (VSAs) (cont’d)

VSAs were developed to address some early criticisms of neural networks (Fodor and Pylyshyn, 1988) while retaining useful properties such as learning, generalization, pattern recognition, robustness and noise immunity (about 30% corruption is tolerable).

The VSA framework includes mathematical operators to construct, operate on, and query compositional structures.
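These operators can be sketched for binary spatter codes, one concrete VSA (an illustration, not the thesis code; the role and filler names are invented for the example):

```python
# Sketch of the two core VSA operators for binary spatter codes:
# binding = elementwise XOR, bundling = bitwise majority vote.
import numpy as np

rng = np.random.default_rng(2)
D = 10_000

def vec():
    return rng.integers(0, 2, D, dtype=np.uint8)

def bind(a, b):
    # XOR binding is self-inverse: bind(bind(a, b), b) == a.
    return a ^ b

def bundle(*vs):
    # Bitwise majority; use an odd number of vectors to avoid ties.
    return (np.sum(vs, axis=0) * 2 > len(vs)).astype(np.uint8)

def sim(a, b):
    return float(np.mean(a == b))

name, age, city = vec(), vec(), vec()       # roles (hypothetical)
alice, n42, oslo = vec(), vec(), vec()      # fillers (hypothetical)

# A compositional structure: a record of three role-filler pairs.
record = bundle(bind(name, alice), bind(age, n42), bind(city, oslo))

probe = bind(record, name)   # query: unbind the "name" role
print(sim(probe, alice))     # ≈ 0.75: well above chance
print(sim(probe, n42))       # ≈ 0.5: chance level
```

The record is a single vector of the same size as its parts, yet each filler can be recovered approximately by unbinding its role and cleaning up against known vectors.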

Analogy-making

Analogy-making is a central element of cognition that enables animals to identify and manage new information by generalizing past experiences, possibly from a few learned examples

Present theories of analogy-making usually divide this process into three or four stages (Eliasmith and Thagard, 2001)

My work is focused mainly on the challenging mapping stage

Analogical mapping

Analogical mapping is the process of mapping relations and concepts from one situation (a source), x, to another (a target), y; M : x → y
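For binary spatter codes, one way to realize M is a holistic mapping vector (a sketch under that assumption, not the papers' full model with SDM storage):

```python
# Sketch: with XOR binding, m = x ^ y maps x to y in one step, and
# applied to a related but novel source it degrades gracefully.
import numpy as np

rng = np.random.default_rng(3)
D = 10_000
x = rng.integers(0, 2, D, dtype=np.uint8)   # source representation
y = rng.integers(0, 2, D, dtype=np.uint8)   # target representation

m = x ^ y                        # mapping vector for M : x -> y
assert np.array_equal(m ^ x, y)  # exact recovery on the learned pair

# A novel source sharing 90% of its bits with x:
x_novel = x.copy()
x_novel[rng.choice(D, 1000, replace=False)] ^= 1

y_hat = m ^ x_novel
print(np.mean(y_hat == y))       # 0.9: output errors mirror input errors
```

Every bit flipped in the source flips exactly one bit in the output, so the approximation degrades linearly rather than catastrophically.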

Analogical mapping (cont’d)

The process of mapping relations and concepts that describe one situation (a source) to another (a target):

Circle is above the square
Square is below the circle
Novel “above–below” relations

Generalization via analogical mapping

(Neumann, 2001)


A difficult computational problem

If analogical mapping is treated as a graph comparison problem, it is computationally very challenging.

VSAs use compressive vector representations, not graphs. The ability to encode symbol-like approximate representations makes VSAs computationally feasible and psychologically plausible (Gentner and Forbus, 2011; Eliasmith, 2013).

Sum-up

I have adopted a vector-based geometric approach to cognitive computation because it appears to be sufficiently potent and suitable for implementation in resource-constrained devices.

A central part of the work deals with analogy-making and learning as key mechanisms enabling interoperability between heterogeneous systems, much like ontologies play a central role in service-oriented architecture and the semantic web.

Raad and Evermann (2014): Is Ontology Alignment like Analogy?

Thesis – Appended papers

A. Emruli, B. and Sandin, F. (2014): Analogical Mapping with Sparse Distributed Memory: A Simple Model that Learns to Generalize from Examples

B. Emruli, B., Gayler, R. W., and Sandin, F. (2013): Analogical Mapping and Inference with Binary Spatter Codes and Sparse Distributed Memory

C. Emruli, B., Sandin, F. and Delsing, J. (2014): Vector Space Architecture for Emergent Interoperability of Systems by Learning from Demonstration

D. Sandin, F., Emruli, B. and Sahlgren M. (2014): Random Indexing of Multi-dimensional Data


Emruli, B. and Sandin, F.
Cognitive Computation, 6(1):74–88, 2014

Q1: Is it possible to extend the sparse distributed memory model so that it can store multiple mapping examples of compositional structures and make correct analogies from novel inputs?

Paper A

Analogical mapping unit (AMU)

[Figure: the AMU, built around an SDM]

Results: size of the memory and generalization

[Figure: generalization performance versus memory size; minimum probability of error indicated]

Emruli, B., Gayler, R. W. and Sandin, F.
IJCNN 2013, Dallas, TX, Aug. 4–9, 2013

Paper B

Q2: If such an extended sparse distributed memory model is developed, can it learn and infer novel patterns in sequences such as those encountered in widely used intelligence tests like Raven’s Progressive Matrices?

Bidirectionality of mapping vectors

Bidirectionality problem
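The bidirectionality problem follows directly from the algebra of XOR binding, as a few lines of NumPy illustrate (a sketch, not the paper's experiment):

```python
# Sketch: an XOR mapping vector is its own inverse, so it inevitably
# maps in both directions; this is the commutativity the SDM must break.
import numpy as np

rng = np.random.default_rng(4)
D = 10_000
x = rng.integers(0, 2, D, dtype=np.uint8)
y = rng.integers(0, 2, D, dtype=np.uint8)

m = x ^ y                          # mapping vector intended as M : x -> y
print(np.array_equal(m ^ x, y))    # True: forward mapping
print(np.array_equal(m ^ y, x))    # True: backward mapping, wanted or not
```

A directed mapping x → y therefore cannot be represented by the bare mapping vector alone, which motivates storing mapping examples in the (non-commutative) address/content structure of the SDM.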

Raven's Progressive Matrices

Rasmussen R. and Eliasmith C., Topics in Cognitive Science, Vol. 3, No. 1, 2011

Learning mapping vectors

[Figure sequence: mapping vectors are learned and stored in the SDM]

Prediction

[Figure: the SDM is queried with a novel input to predict the corresponding output]

Results

Emruli, B., Sandin, F. and Delsing, J.
Biologically Inspired Cognitive Architectures, 9:33–45, 2014

Q3: Could extended sparse distributed memory and vector-symbolic methodologies such as those considered in Q1 and Q2 be used to address the problem of designing an architecture that enables heterogeneous IoT devices and systems to interoperate autonomously and adapt to instructions in dynamic environments?

Paper C

Communication architecture

No shared operational semantics (Sheth, 1999; Obrst, 2003; Baresi et al., 2013)

Automation system

Learning by demonstration

Interact with the four systems to achieve a particular goal

The instructions of Alice and Bob are the same

[Figure: Alice and Bob instructing the system]

Results

One instruction per day by Alice and Bob

Sandin, F., Emruli, B. and Sahlgren, M.
Knowledge and Information Systems, submitted

Paper D

Q4: Is it possible to extend the traditional method of random indexing to handle matrices and higher-order arrays in the form of N-way random indexing, so that more complex data streams and semantic relationships can be analyzed? What are the other implications of this extension?

Random indexing (RI)

Random indexing is (was) an approximate method for dimensionality reduction and semantic analysis of pairwise relationships

Main properties:

concepts and their interrelationships correspond to random points in a high-dimensional space
incremental coding/learning
lightweight, suitable for processing of streaming data
accuracy comparable to standard methods for dimensionality reduction
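These properties can be illustrated with a toy one-way random indexing run (a sketch with an invented mini-corpus; the window size and all parameters are arbitrary illustrative choices):

```python
# Sketch of classic (one-way) random indexing: sparse ternary random
# index vectors are accumulated into context vectors, incrementally.
import numpy as np

rng = np.random.default_rng(5)
D, NONZERO = 2000, 20   # reduced dimensionality; nonzeros per index vector

def index_vector():
    # Sparse ternary random label: a few +1/-1 entries, the rest zeros.
    v = np.zeros(D, dtype=np.int32)
    pos = rng.choice(D, NONZERO, replace=False)
    v[pos] = rng.choice([-1, 1], NONZERO)
    return v

corpus = "a cat drinks milk a dog drinks water the sun is hot".split()
vocab = sorted(set(corpus))
index = {w: index_vector() for w in vocab}            # fixed random labels
context = {w: np.zeros(D, dtype=np.int32) for w in vocab}

# Incremental learning: add each neighbour's index vector (window = 1).
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            context[w] += index[corpus[j]]

def cos(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# "cat" and "dog" appear in the same contexts; "cat" and "sun" do not.
print(cos(context["cat"], context["dog"]) > cos(context["cat"], context["sun"]))  # True
```

Note that the representation size is fixed in advance (D), independent of vocabulary size, and each token is processed once: exactly the lightweight, streaming-friendly behaviour listed above.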

Applications

natural language processing
search engines
pattern recognition (e.g., event detection in blogs)
graph searching (e.g., social network analysis)
other machine learning applications

Results: one-way versus two-way Random Indexing (RI)

Anecdote

“As an engineer, this can feel like a deal with the devil, as you have to accept error and uncertainty in your results. But the alternative is no results at all!”

Pete Warden, data scientist and a former Apple engineer

Results: two-way RI versus PCA

Gavagai AB: opinion mining (2012)

Contestant         Viewer votes   Gavagai forecast
Loreen             33 %           30 %
Danny Saucedo      22 %           22 %
Thorsten Flinck     8 %           12 %

Summary

The proposed AMU integrates the idea of mapping vectors with sparse distributed memory.
Demonstration of transparent learning and application of multiple analogical mappings.

The AMU solves a particular type of Raven’s matrix.
The SDM breaks the commutative (bidirectionality) property of the binary mapping vectors.

Summary (cont’d)

Outline of a communication architecture that enables system interoperability by learning, without reference to a shared operational semantics; a novel approach to a challenging problem.

Extension of random indexing (RI) to multiple dimensions in an approximately fixed-size representation; comparison of two-way RI with the traditional (one-way) RI and PCA.

Limitations

Hand-coding of the representations.

The examples addressed in Paper C are relatively simple; more complex examples and symbolic representation schemes are needed to further test the architecture. An attention mechanism needs to be developed, as does the extension to higher-order Markov chains.

In Paper D only one- and two-way RI are investigated, and the problems considered are relatively small in scale and not demonstrated on streaming data.

Future work

To apply the architecture outlined in Paper C in a “Living Lab” equipped with technology similar to that described in the hypothetical automation scenario

To improve and further investigate, both empirically and theoretically, the implications of the N-way RI (NRI) extension

Is the mathematical framework sufficiently general?

“A beloved child has many names.”

Holographic Reduced Representation (HRR) – 1994
Context-Dependent Thinning (CDT) – 2001
Vector Symbolic Architecture (VSA) – 2003
Hyperdimensional Computing (HC) – 2009
Analogical Mapping Unit (AMU) – 2013
Semantic Pointer Architecture (SPAUN) – 2013
Matrix Binding of Additive Terms (MBAT) – 2014

Key readings

Sparse Distributed Memory (Kanerva, 1988)
Conceptual Spaces (Gärdenfors, 2000)
Holographic Reduced Representation (Plate, 2003)
Geometry and Meaning (Widdows, 2004)
How to Build a Brain (Eliasmith, 2013)
The Geometry of Meaning (Gärdenfors, 2014)

Credits

Supervisors

JERKER DELSING FREDRIK SANDIN LENNART GUSTAFSSON

Coauthors

ROSS GAYLER MAGNUS SAHLGREN

Discussions and inspiration

ASAD KHAN PENTTI KANERVA BRUNO OLSHAUSEN CHRIS ELIASMITH

Financial support
STINT, ARROWHEAD PROJECT, NORDEAS NORRLANDSSTIFTELSE, AND THE WALLENBERG FOUNDATION

COLLEAGUES, FAMILY AND FRIENDS

THE END

… or perhaps the beginning