
Implementing HMAX with an Integrate-&-Fire Array Transceiver

Ralph Etienne-Cummings, Fope Folowosele, R. Jacob Vogelstein, Gert Cauwenberghs*

The Johns Hopkins University; *UC San Diego

Outline

Introduction
Neural Arrays
Our Integrate-&-Fire Array Transceivers
Visual Object Recognition Pathways
› Models – HMAX
HMAX with IFAT
Conclusion

Introduction

Object detection, recognition and tracking are computationally difficult tasks
Primates excel at these tasks
Engineered systems are unable to match their level of proficiency, flexibility and speed
Robots and other artificial systems are limited in their ability to interact with the environment

Big Picture

Our overall goal is to work towards developing a real-time autonomous intelligent system that can detect, recognize and track objects under various viewing conditions

• Detect: sense presence of object (Cross-Correlation)
• Recognize: identify and categorize object (Spiking HMAX)
• Track: monitor object movement (Neural Kalman)

The Approach

Emulate cortical functions of primates to design more intelligent artificial systems
› Mimic the visual information processing of the primate visual system
› Model computationally intensive algorithms in neural hardware

Potential Applications

Visual Prosthesis and Ocular Implants
Population Surveillance and Visual Search Engines
Research Tool for Neuroscientists

Techarena 2009; Future Predictions 2008; R. Friendman, Biomedical Computation Review 2009

Project Plan

Develop a spike-based processing platform on which we can demonstrate object detection, recognition and tracking
› Design the next-generation neural array transceiver
› Realize silicon facsimiles of cortical simple cells, complex cells, composite feature cells and MAX
› Implement spike-based classification
› Implement neural algorithms analogous to cross-correlation and Kalman filtering for object detection and tracking, respectively



Software vs. Hardware Models

Software models run slower than real time and are unable to interact with the environment
Silicon designs take a few months to be fabricated, after which they are constrained by limited flexibility

IBM 2004; Tenore 2008

Solution: Reconfigurable Models

Neural array transceivers are reconfigurable systems consisting of large arrays of silicon neurons
Useful for studying real-time operations of cortical, large-scale neural networks
› Able to leverage the known fundamental blocks such as the operation of neurons and synapses
› Flexible enough for testing out unknowns

Three categories: Digital, General Purpose, Application Specific

Application-Specific Neural Array Transceivers

Specific to particular neural processes such as
› Spatial frequency and orientation (Choi et al. 2005)
› Acoustic localization (Horiuchi & Hynna 2001)
› Retinotopic self-organization (Taba & Boahen 2006)
› Learning and memory (Arthur & Boahen 2004, 2006)

Digital Neural Array Transceivers

Utilize digital logic as an alternative approach to analog VLSI designs
› FPGA conductance-based neuron model (Graas et al. 2004)
› FPGA leaky integrate-and-fire neuron model (Pearson et al. 2005)
› DSP and FPGA populations of cortical cells for retinotopic maps (Shi et al. 2006)
› FPGA spike response neuron model (Ros et al. 2006)
› FPGA Izhikevich neural models (Cassidy & Andreou 2008)

General Purpose Neural Array Transceivers

More easily amenable to multiple tasks
› Integrate-and-fire cooperative-competitive ring of neurons (Chicca & Indiveri 2006)
› Integrate-and-fire with stop-learning neural array (Mitra & Indiveri 2008)
› Hodgkin-Huxley type neural array (Zou et al. 2006)
› Integrate-and-fire array transceiver (Goldberg et al. 2001; Vogelstein et al. 2004; Folowosele et al. 2008)

Why Integrate-and-Fire Array Transceiver?

Flexible
› No local or hardwired connectivity
Reprogrammable
› Virtual synaptic connections with programmable weight and equilibrium potential, allowing for any arbitrary connection topology
Expandable
› Multiple chips can be connected together



Integrate-and-Fire Array Transceiver (IFAT)

One of the earliest designs was by D.H. Goldberg et al. in 2001
The chip was designed in a 0.5-micron process on a 1.5 mm x 1.5 mm die
› 1024 integrate-and-fire neurons
› 128 probabilistic synapses with two sets of fixed parameters

D.H. Goldberg, Neural Networks, 2001

2nd Generation Integrate-and-Fire Array Transceiver (IFAT)

Each neuron implements a discrete-time model of a single-compartment neuron using a switched-capacitor architecture
Synapses have two internal parameters
› Synaptic weight
› Equilibrium potential
2400 neurons/chip
4,194,304 synapses

R.J. Vogelstein et al., IEEE Trans. Neural Networks 2007b
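As a rough illustration of the neuron model on this slide, the sketch below assumes that each synaptic event moves the membrane potential toward the synapse's equilibrium potential by a fraction set by the synaptic weight, and that the cell spikes and resets on crossing a threshold. The class name, update rule, threshold and reset values are all assumptions made for illustration; they are not the chip's circuit equations.

```python
from dataclasses import dataclass


@dataclass
class IFNeuron:
    """Hypothetical discrete-time single-compartment neuron (illustrative only)."""
    v: float = 0.0          # membrane potential, arbitrary units
    v_thresh: float = 1.0   # assumed spike threshold
    v_reset: float = 0.0    # assumed reset potential

    def receive(self, weight: float, e_syn: float) -> bool:
        """Apply one synaptic event; return True if the neuron spikes.

        weight (0..1) stands in for the synaptic weight and e_syn for the
        equilibrium potential named on the slide.
        """
        self.v += weight * (e_syn - self.v)   # step toward the equilibrium potential
        if self.v >= self.v_thresh:
            self.v = self.v_reset
            return True
        return False


neuron = IFNeuron()
# Excitatory events (equilibrium potential above threshold) charge the cell.
spikes = [neuron.receive(weight=0.3, e_syn=2.0) for _ in range(5)]
```

With these toy values the cell fires on every second event; an event whose equilibrium potential lies below the current membrane potential pulls the cell away from threshold instead, which is how the same synapse circuit can act as inhibition.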

IFAT Operation

Incoming and outgoing address events are communicated through the digital I/O port (DIO)
The MCU looks up the synaptic parameters (conductance and driving potential) and neuron address in RAM
It then provides the parameters (driving potential via the DAC) to the appropriate neuron on the I&F chip
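The lookup step described above can be pictured as a table keyed by the address of the neuron that spiked. The sketch below is a software analogy only: the dictionary layout, the `route_event` helper and the numeric values are hypothetical and do not reflect the actual RAM format or MCU firmware.

```python
# Hypothetical routing table standing in for the RAM: each presynaptic
# address maps to a list of (postsynaptic address, conductance, driving potential).
ram = {
    0: [(10, 0.5, 2.0), (11, 0.25, -1.0)],
    1: [(10, 0.5, 2.0)],
}


def route_event(pre_addr, deliver):
    """Look up every virtual synapse of pre_addr and hand it to deliver()."""
    for post_addr, conductance, driving_potential in ram.get(pre_addr, []):
        deliver(post_addr, conductance, driving_potential)


delivered = []
route_event(0, lambda post, g, e: delivered.append((post, g, e)))
```

Because connectivity lives entirely in the table, rewiring the network amounts to rewriting entries rather than changing hardware, which is the sense in which the synapses are virtual.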




Spike-Based CMOS Cameras: Octopus

[Figure: imaging concept (pixel circuit with signals labeled Ic, event, reset and Vdd_r) and sample image]

Other Approaches:
- W. Yang, “Oscillator in a Pixel,” 1994
- J. Harris, “Time to First Spike,” 2002
- A. Bermak, “Arbitrated Time to First Spike,” 2007

Culurciello, Etienne-Cummings & Boahen, 2001, 2003

IFAT Results

R.J. Vogelstein et al., NIPS, 2005

IFAT 3G: 3D Design in 150 nm CMOS

Tier A
• Address Event Representation (AER) communication circuits
• Receiver
• Transmitter

Tier B
• Synapse
• Bursting circuit
• Control circuit

Tier C
• Neuron
• Spike generating circuit

In collaboration with the Sensory Communication and Microsystems Lab


Visual Pathways

Primary visual cortex (V1) transmits information to two primary pathways
› Dorsal stream
› Ventral stream
The dorsal pathway is associated with motion
The ventral pathway mediates the visual identification of objects

T. Poggio, NIPS 2007; Wikipedia, The Free Encyclopedia

Object Recognition for Computer Vision

T. Poggio, NIPS 2007

Neurobiological Software Models

VisNet (Wallis & Rolls 1997)
› Homogeneous architecture for invariance and specificity
HMAX (Riesenhuber & Poggio 1999)
› Feature complexity and invariance alternately increased in different layers of a processing hierarchy
› Utilizes different computational mechanisms to attain invariance and specificity

VisNet

VisNet is a four-layer feedforward network
A series of hierarchical competitive networks with local graded inhibition
Convergent connections to each neuron from a topologically corresponding region of the preceding layer
Synaptic plasticity based on a modified Hebbian learning rule with a temporal trace of each cell’s previous activity

E. Rolls & T. Milward, Neural Computation 2000

HMAX

Summarizes and integrates a large amount of data from different levels of understanding (from biophysics to physiology to behavior)
Two main operations occur in the model
› Gaussian-like tuning operation in the S layers
› Nonlinear MAX-like operation in the C layers

M. Riesenhuber & T. Poggio, Nature Neuroscience 1999

An Implementation

Serre et al. 2007

System Layers

S1
› Corresponds to classical simple cells of Hubel and Wiesel found in V1
› Gaussian-like tuning to one of four possible orientations with different filter sizes
C1
› Corresponds to complex cells of Hubel and Wiesel
› MAX pooling operation of S1 cells with the same orientation and scale band
S2
› Pools over C1 units from a local spatial neighborhood
› Behaves as radial basis function units: Gaussian-like dependence on the Euclidean distance between a new input and a stored prototype
C2
› Global maximum over all scales and positions for each S2 type over the entire S2 lattice

Serre et al. 2007
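The S2 and C2 descriptions above can be sketched in a few lines. This is a minimal illustration, assuming C1 patches have already been flattened into equal-length lists; the function names, the `sigma` parameter and the toy values are hypothetical, and the code is not taken from the Serre et al. implementation.

```python
import math


def s2_response(c1_patch, prototype, sigma=1.0):
    """Radial-basis response: Gaussian of the Euclidean distance to a prototype."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(c1_patch, prototype))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))


def c2_response(c1_patches, prototype, sigma=1.0):
    """Global maximum of the S2 responses over every position and scale."""
    return max(s2_response(p, prototype, sigma) for p in c1_patches)


# Toy flattened C1 patches gathered from all positions and scale bands.
patches = [[0.0, 0.0], [1.0, 0.5], [0.2, 0.9]]
prototype = [1.0, 0.5]
best = c2_response(patches, prototype)
```

Each stored prototype yields one C2 value, so an image is summarized by a vector with one entry per prototype, independent of where in the image the best match occurred.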

Learning and Classification Stages

Learning
› During training, extract prototypes at the C1 level from target images across all orientations
Classification
› At runtime, extract C1 and C2 standard model features (SMFs) and pass them to a simple linear classifier

Serre et al. 2007

Scene Understanding System

Serre et al. 2007

Object Recognition in Clutter

C2 responses computed over a new input image and passed to a linear classifier

Superior to previous approaches on MIT-CBCL data sets

Comparable to previous approaches on CalTech5 data sets

Data Set                 Benchmark   C2 Features (Boost)   C2 Features (SVM)
CalTech5   Leaves        84.0        97.0                  95.9
CalTech5   Cars          84.8        99.7                  99.8
CalTech5   Faces         96.4        98.2                  98.1
CalTech5   Airplanes     94.0        96.7                  94.9
CalTech5   Motorcycles   95.0        98.0                  97.4
MIT-CBCL   Faces         90.4        95.9                  95.3
MIT-CBCL   Cars          75.4        95.1                  93.3

Serre et al. 2007

Summary

Benefits to using the fine information from low-level SMFs
› C1 SMFs superior for shape-based object recognition
Benefits to using the more invariant high-level SMFs
› C2 SMFs suitable for semisupervised recognition of objects in clutter
› C2 SMFs excel at recognition of texture-based objects which lack a geometric structure
Too slow for real-time applications


HMAX on IFAT

System receives its inputs from silicon retinas
Each simple cell receives inputs from four consecutive retinal cells
› Two with excitatory connections
› Two with inhibitory connections
Excitatory and inhibitory synaptic weights are balanced so that the simple cells do not respond to uniform light

R.J. Vogelstein et al., NIPS 2007
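The balancing argument above can be checked numerically with a rate-based sketch. It assumes equal-magnitude weights ordered as two excitatory followed by two inhibitory inputs, and a rectified output; the weight values, their ordering and the helper name are illustrative assumptions, not the programmed IFAT parameters.

```python
# Assumed weight pattern: two excitatory (+) and two inhibitory (-) synapses
# of equal magnitude, so the weights sum to zero.
WEIGHTS = (1.0, 1.0, -1.0, -1.0)


def simple_cell_drive(retina_row, start):
    """Net input to a simple cell from four consecutive retinal cells."""
    window = retina_row[start:start + 4]
    if len(window) != 4:
        raise ValueError("receptive field runs past the edge of the retina")
    drive = sum(w * r for w, r in zip(WEIGHTS, window))
    return max(drive, 0.0)  # a spiking cell cannot fire at a negative rate


uniform = [5.0] * 8
edge = [5.0, 5.0, 5.0, 5.0, 1.0, 1.0, 1.0, 1.0]  # light-to-dark step
uniform_responses = [simple_cell_drive(uniform, i) for i in range(5)]
edge_responses = [simple_cell_drive(edge, i) for i in range(5)]
```

Under uniform illumination every response is zero because the weights cancel, while the step input produces a response that peaks where the receptive field straddles the edge.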

C1, S2 and beyond

Implement C1, S2 and possibly C2 stages of the HMAX model

HMAX model provides a generic high-level computational function in a quantitative way

T. Serre, Dissertation 2006

Preliminary Results: S1 and C1 Stages

S1 neurons are oriented spatial filters that detect local changes in contrast
C1 neurons take the MAX of similarly oriented simple cells over a region of space
Each S1 cell integrates inputs from a 4x1 retinal receptive field
Each C1 cell integrates inputs from a 5x5 array of similarly oriented S1 cells

F. Folowosele et al., BioCAS 2008
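The 5x5 pooling step can be sketched as a MAX over tiles of an S1 response map. The sketch assumes non-overlapping tiles and a map stored as a list of rows; both choices, along with the function name and toy values, are assumptions for illustration rather than a description of the hardware mapping.

```python
def c1_pool(s1_map, block=5):
    """MAX-pool a 2-D map of S1 responses over non-overlapping block x block tiles."""
    rows, cols = len(s1_map), len(s1_map[0])
    pooled = []
    for r in range(0, rows - block + 1, block):
        pooled.append([
            max(s1_map[i][j]
                for i in range(r, r + block)
                for j in range(c, c + block))
            for c in range(0, cols - block + 1, block)
        ])
    return pooled


# Toy 10x10 S1 map with one strong response in each of two tiles.
s1 = [[0.0] * 10 for _ in range(10)]
s1[2][3] = 7.0
s1[8][6] = 4.0
c1 = c1_pool(s1)
```

Each C1 value reports the strongest S1 response inside its tile regardless of where in the tile it occurred, which is the source of the position tolerance attributed to complex cells.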

Canonical Models

Biologically plausible neural circuits for implementing both Gaussian-like and MAX-like operations

Kouh 2007

MAX Operation

Nonlinear saturating pooling function on a set of inputs, such that the output codes the amplitude of the largest input regardless of the strength and number of the other inputs
A set of input neurons {X} causes the output Z to generate spikes at a rate proportional to that of the input with the fastest firing rate

R.J. Vogelstein et al., NIPS 2007
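In rate terms, the behavior described above is an idealized relation Z = k · max(X). The sketch below states only that relation; the proportionality constant `k = 0.1`, the function name and the input rates are hypothetical, and the spiking circuit that realizes the operation is not modeled.

```python
def max_output_rate(input_rates, k=0.1):
    """Output spike rate: k times the fastest input rate (k is hypothetical)."""
    if not input_rates:
        raise ValueError("need at least one input rate")
    return k * max(input_rates)


rates = [12.0, 80.0, 35.0]                          # toy input firing rates in Hz
z = max_output_rate(rates)                          # set only by the 80 Hz input
z_more = max_output_rate(rates + [5.0, 5.0, 5.0])   # extra weak inputs change nothing
```

Dividing the measured output rate by the fastest input rate recovers k, which is the kind of ratio a hardware test can check for constancy across cells.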

Test 1: Test Images and Resulting Simple Cells

(A1-4) Generated test images

(B1-4) Horizontally-oriented simple cells that respond to light-to-dark transitions

(C1-4) Vertically-oriented simple cells that respond to dark-to-light transitions

F. Folowosele et al., ISCAS 2007

Test 1: MAX Network Computation Results

The ratio k obtained is approximately constant among all the simple cells, with a mean of 0.068 and a standard deviation of 0.0006

F. Folowosele et al., ISCAS 2007

Test 2: Test Images and Resulting Simple and Complex Cells

Checkerboard test image
The cells within each square of the overlaid checkerboard pattern represent the 5x5 array of simple cells which are pooled to form a complex cell
2400 simple cells
80 complex cells
MAX ratio: 0.1085 ± 0.02
After outliers are removed, MAX ratio: 0.1179 ± 0.01

F. Folowosele et al., BioCAS 2008

Future: Attention Modulated HMAX

Riesenhuber, 2004

Conclusion

General Purpose IF Array Transceivers
› Allow implementation of spike-based algorithms
› Digital implementations may end up being more effective than the mixed-signal version
Object Recognition
› HMAX provides a biologically plausible hierarchical model of V1 – PFC
› Can be shown to outperform some benchmarks
Implementation with IFAT
› Preliminary results on the early layers
› Future work must also include attention

Acknowledgments

Telluride Neuromorphic Engineering Workshop
UNCF-Merck Fellowship
National Science Foundation

References

R.R. Murphy and E. Rogers, “Cooperative assistance for remote robot supervision,” Presence: Teleoperators and Virtual Environments Journal, vol. 5, no. 2, pp. 224-240, 1996.
T. Serre, M. Kouh, C. Cadieu, U. Knoblich, G. Kreiman, and T. Poggio, “A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex,” AI Memo, MIT, Cambridge, 2005.
M. Riesenhuber and T. Poggio, “Computational models of object recognition in cortex: a review,” Technical Report, Artificial Intelligence Laboratory and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 2000b.
R.J. Vogelstein, U. Mallik, E. Culurciello, G. Cauwenberghs, and R. Etienne-Cummings, “A multichip neuromorphic system for spike-based visual information processing,” Neural Computation, vol. 19, pp. 2281-2300, 2007a.
D.H. Goldberg, G. Cauwenberghs, and A.G. Andreou, “Probabilistic synaptic weighting in a reconfigurable network of VLSI integrate-and-fire neurons,” Neural Networks, vol. 14, pp. 781-793, 2001.
T.Y.W. Choi, P.A. Merolla, J.V. Arthur, K.A. Boahen, and B.E. Shi, “Neuromorphic implementation of orientation hypercolumns,” IEEE ISCAS 2005.
R.J. Vogelstein, U. Mallik, J.T. Vogelstein, and G. Cauwenberghs, “Dynamically reconfigurable silicon array of spiking neurons with conductance-based synapses,” IEEE Transactions on Neural Networks, 2007b.
A. Cassidy, S. Denham, P. Kanold, and A.G. Andreou, “FPGA-based silicon spiking neural array,” IEEE BioCAS 2007.
B.E. Shi, E.K.C. Tsang, S.Y.M. Lam, and Y. Meng, “Expandable hardware for computing cortical maps,” IEEE ISCAS 2006.
D.H. Hubel and T.N. Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat's visual cortex,” Journal of Physiology, vol. 160, no. 1, 1962.
L.G. Ungerleider and J.V. Haxby, “What and where in the human brain,” Curr. Opin. Neurobiol., pp. 157-165, 1994.
E. Rolls and T. Milward, “A model of invariant object recognition in the visual system: Learning rules, activation functions, lateral inhibition, and information-based performance measures,” Neural Computation, vol. 12, pp. 2547-2572, 2000.
P. Merolla and K. Boahen, “A recurrent model of orientation maps with simple and complex cells,” Advances in Neural Information Processing Systems (NIPS) 16, S. Thrun and L. Saul, Eds., MIT Press, pp. 995-1002, 2004.
R.P.N. Rao, “Robust Kalman filters for prediction, recognition, and learning,” Technical Report 645, Computer Science Department, University of Rochester, 1996.
J. Licklider, “A duplex theory of pitch perception,” Cellular and Molecular Life Sciences (CMLS), vol. 7, no. 4, pp. 128-134, 1951.
J. Tapson, “Autocorrelation properties of single neurons,” Proceedings of the 1998 South African Symposium on Communication and Signal Processing, 1998.
J. Tapson, C. Jin, A. van Schaik, and R. Etienne-Cummings, “A First-Order Nonhomogeneous Markov Model for the Response of Spiking Neurons Stimulated by Small Phase-Continuous Signals,” Neural Computation, vol. 21, no. 6, pp. 1554-1588, June 2009.
T. Lacey, “Tutorial: The Kalman filter,” Lecture Notes, Department of Computer Science, Georgia Institute of Technology, 1998.
R. Linsker, “Neural network learning of optimal Kalman prediction and control,” Neural Networks, vol. 21, no. 9, pp. 1328-1343, 2008.
R.E. Kalman, “A new approach to linear filtering and prediction problems,” Transactions of the ASME–Journal of Basic Engineering (Series D), pp. 35-45, 1960.
S. Mihalas and E. Niebur, “A generalized linear integrate-and-fire neural model produces diverse spiking behaviors,” Neural Computation, 2008, in press.
C. Cadieu, M. Kouh, A. Pasupathy, C.E. Connor, M. Riesenhuber, and T. Poggio, “A model of V4 shape selectivity and invariance,” J. Neurophysiol., 2007.
