Implementing HMAX with an Integrate-&-Fire Array Transceiver
Ralph Etienne-Cummings, Fope Folowosele, R. Jacob Vogelstein, Gert Cauwenberghs*
The Johns Hopkins University, *UC San Diego
Outline
Introduction
Neural Arrays
Our Integrate-&-Fire Array Transceivers
Visual Object Recognition Pathways
› Models – HMAX
HMAX with IFAT
Conclusion
Introduction
Object detection, recognition and tracking are computationally difficult tasks
Primates excel at these tasks
Engineered systems are unable to match their level of proficiency, flexibility and speed
Robots and other artificial systems are limited in their ability to interact with the environment
Big Picture
Our overall goal is to work towards developing a real-time autonomous intelligent system that can detect, recognize and track objects under various viewing conditions
• Detect: sense presence of object (Cross-Correlation)
• Recognize: identify and categorize object (Spiking HMAX)
• Track: monitor object movement (Neural Kalman)
The Approach
Emulate cortical functions of primates to design more intelligent artificial systems
› Mimic the visual information processing of the primate’s visual system
› Model computationally-intensive algorithms in neural hardware
Potential Applications
Visual Prosthesis and Ocular Implants
Population Surveillance and Visual Search Engines
Research Tool for Neuroscientists
Techarena 2009; Future Predictions 2008; R. Friendman, Biomedical Computation Review 2009
Project Plan
Develop a spike-based processing platform on which we can demonstrate object detection, recognition and tracking
› Design the next generation neural array transceiver
› Realize silicon facsimiles of cortical simple cells, complex cells, composite feature cells and MAX
› Implement spike-based classification
› Implement neural algorithms analogous to cross-correlation and Kalman filtering for object detection and tracking, respectively
Software vs. Hardware Models
Software models run slower than real time and are unable to interact with the environment
Silicon designs take a few months to be fabricated, after which they are constrained by limited flexibility
IBM 2004; Tenore 2008
Solution: Reconfigurable Models
Neural array transceivers are reconfigurable systems consisting of large arrays of silicon neurons
Useful for studying real-time operations of cortical, large-scale neural networks
› Able to leverage the known fundamental blocks such as the operation of neurons and synapses
› Flexible enough for testing out unknowns
Categories: Digital, General Purpose, Application Specific
Application-Specific Neural Array Transceivers
Specific to particular neural processes such as
› Spatial frequency and orientation (Choi et al. 2005)
› Acoustic localization (Horiuchi & Hynna 2001)
› Retinotopic self-organization (Taba & Boahen 2006)
› Learning and memory (Arthur & Boahen 2004, 2006)
Digital Neural Array Transceivers
Utilize digital logic as an alternative approach to analog VLSI designs
› FPGA conductance-based neuron model (Graas et al. 2004)
› FPGA leaky integrate-and-fire neuron model (Pearson et al. 2005)
› DSP and FPGA populations of cortical cells for retinotopic maps (Shi et al. 2006)
› FPGA spike response neuron model (Ros et al. 2006)
› FPGA Izhikevich neural models (Cassidy & Andreou 2008)
General Purpose Neural Array Transceivers
More easily amenable to multiple tasks
› Integrate-and-fire cooperative-competitive ring of neurons (Chicca & Indiveri 2006)
› Integrate-and-fire with stop learning neural array (Mitra & Indiveri 2008)
› Hodgkin-Huxley type neural array (Zou et al. 2006)
› Integrate-and-fire array transceiver (Goldberg et al. 2001; Vogelstein et al. 2004; Folowosele et al. 2008)
Why Integrate-and-Fire Array Transceiver?
Flexible
› No local or hardwired connectivity
Reprogrammable
› Virtual synaptic connections with programmable weight and equilibrium potential, allowing for any arbitrary connection topology
Expandable
› Multiple chips can be connected together
Integrate-and-Fire Array Transceiver (IFAT)
One of the earliest designs was by D.H. Goldberg et al. in 2001
The chip was designed in a 0.5-micron process on a 1.5 mm x 1.5 mm die
› 1024 integrate-and-fire neurons
› 128 probabilistic synapses with two sets of fixed parameters
D.H. Goldberg, Neural Networks, 2001
2nd Generation Integrate-and-Fire Array Transceiver (IFAT)
Each neuron implements a discrete-time model of a single-compartment neuron using a switched-capacitor architecture
Synapses have two internal parameters
› Synaptic weight
› Equilibrium potential
2400 neurons/chip
4,194,304 synapses
R.J. Vogelstein et al., IEEE Trans. Neural Networks 2007a
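A minimal sketch of the per-event update this slide describes, assuming each incoming event moves the membrane potential toward the synapse's equilibrium potential by a fraction set by the synaptic weight. The class name, the normalized voltage scale, and the threshold and reset values are illustrative assumptions, not parameters of the chip.

```python
from dataclasses import dataclass


@dataclass
class IFNeuron:
    """Discrete-time, single-compartment integrate-and-fire neuron (illustrative)."""
    v: float = 0.0          # membrane potential, normalized units (assumed)
    v_thresh: float = 1.0   # spike threshold (assumed)
    v_reset: float = 0.0    # potential after a spike (assumed)

    def receive(self, weight: float, e_rev: float) -> bool:
        """Apply one synaptic event; return True if the neuron spikes.

        weight is the fraction (0..1) of the distance to the equilibrium
        potential e_rev that the membrane travels on this event.
        """
        self.v += weight * (e_rev - self.v)
        if self.v >= self.v_thresh:
            self.v = self.v_reset
            return True
        return False


neuron = IFNeuron()
# Excitatory events pull the membrane toward an equilibrium above threshold.
spikes = [neuron.receive(weight=0.25, e_rev=2.0) for _ in range(4)]
```

With these made-up values the neuron crosses threshold on the third event and resets; an equilibrium potential below the current membrane potential would pull it down instead, which is how the same update can act as inhibition.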
IFAT Operation
Incoming and outgoing address events are communicated through the digital I/O port (DIO)
The MCU looks up the synaptic parameters (conductance and driving potential) and neuron address in RAM
It then provides the parameters (driving potential via the DAC) to the appropriate neuron on the I&F chip
R.J. Vogelstein et al., IEEE Trans. Neural Networks 2007a
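The lookup step on this slide can be sketched in software as a table-driven event router. This is an illustration only: the table layout, the names synapse_ram and route_event, the addresses and the normalized threshold are all hypothetical stand-ins for the RAM, MCU and I&F chip.

```python
from collections import defaultdict

# Hypothetical synapse table standing in for the RAM: each presynaptic
# address maps to a list of (postsynaptic address, weight, driving potential).
synapse_ram = defaultdict(list)
synapse_ram[7].append((3, 0.25, 2.0))   # connection 7 -> 3, drives toward 2.0
synapse_ram[7].append((4, 0.5, 0.0))    # connection 7 -> 4, drives toward 0.0

membrane = defaultdict(float)           # membrane potential per neuron address
V_THRESH = 1.0                          # assumed normalized threshold


def route_event(pre_addr):
    """Look up every target of an incoming address event and update it.

    Returns the addresses of neurons that crossed threshold; these would be
    sent back out as new address events.
    """
    outgoing = []
    for post_addr, weight, e_drive in synapse_ram[pre_addr]:
        membrane[post_addr] += weight * (e_drive - membrane[post_addr])
        if membrane[post_addr] >= V_THRESH:
            membrane[post_addr] = 0.0
            outgoing.append(post_addr)
    return outgoing


events_out = [route_event(7) for _ in range(3)]
```

Because connectivity lives entirely in the table, rewiring the network means rewriting table entries rather than changing any neuron circuitry.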
Spike-Based CMOS Cameras: Octopus
[Figure: pixel circuit (signals Ic, event, reset, Vdd_r); panels: Imaging Concept, Sample Image]
Other Approaches:
- W. Yang, “Oscillator in a Pixel,” 1994
- J. Harris, “Time to First Spike,” 2002
- A. Bermak, “Arbitrated Time to First Spike,” 2007
Culurciello, Etienne-Cummings & Boahen, 2001, 2003
IFAT Results
R.J. Vogelstein et al., NIPS, 2005
IFAT 3G: 3D Design in 150 nm CMOS
Tier A
• Address Event Representation (AER) Communication Circuits
• Receiver
• Transmitter
Tier B
• Synapse
• Bursting Circuit
• Control Circuit
Tier C
• Neuron
• Spike Generating Circuit
In collaboration with the Sensory Communication and Microsystems Lab
Visual Pathways
Primary visual cortex (V1) transmits information to two primary pathways
› Dorsal stream
› Ventral stream
Dorsal pathway is associated with motion
Ventral pathway mediates the visual identification of objects
T. Poggio, NIPS, 2007; Wikipedia, The Free Encyclopedia
Object Recognition for Computer Vision
T. Poggio, NIPS 2007
Neurobiological Software Models
VisNet (Wallis & Rolls 1997)
› Homogeneous architecture for invariance and specificity
HMAX (Riesenhuber & Poggio 1999)
› Feature complexity and invariance alternately increased in different layers of a processing hierarchy
› Utilizes different computational mechanisms to attain invariance and specificity
VisNet
VisNet is a four-layer feedforward network
A series of hierarchical competitive networks with local graded inhibition
Convergent connections to each neuron from a topologically corresponding region of the preceding layer
Synaptic plasticity based on a modified Hebbian learning rule with a temporal trace of each cell’s previous activity
E. Rolls & T. Milward, Neural Computation 2000
HMAX
Summarizes and integrates a large amount of data from different levels of understanding (from biophysics to physiology to behavior)
Two main operations occur in the model
› Gaussian-like tuning operation in the S layers
› Nonlinear MAX-like operation in the C layers
M. Riesenhuber & T. Poggio, Nature Neuroscience 1999
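The two operations can be written compactly. The form below follows a common formulation in the HMAX literature and is a sketch, not necessarily the exact expression used in any particular implementation; the symbols x (afferent input vector), w (stored preferred pattern), σ (tuning width) and y (unit output) are notation assumed here.

```latex
% Gaussian-like tuning (S layers): response peaks when the input x matches the stored pattern w
y_{S} = \exp\!\left(-\frac{\lVert \mathbf{x} - \mathbf{w} \rVert^{2}}{2\sigma^{2}}\right)

% MAX-like pooling (C layers): response follows the strongest afferent
y_{C} = \max_{i}\, x_{i}
```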
An Implementation
Serre et al. 2007
System Layers
S1
› Corresponds to classical simple cells of Hubel and Wiesel found in V1
› Gaussian-like tuning to one of four possible orientations with different filter sizes
C1
› Corresponds to complex cells of Hubel and Wiesel
› MAX pooling operation of S1 cells with the same orientation and scale band
S2
› Pools over C1 units from a local spatial neighborhood
› Behaves as radial basis function units: Gaussian-like dependence on the Euclidean distance between a new input and a stored prototype
C2
› Global maximum over all scales and positions for each S2 type over the entire S2 lattice
Serre et al. 2007
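A small sketch of the S2 and C2 stages as described on this slide, assuming flattened C1 patches and a single stored prototype. The function names, the tuning width sigma and the toy values are assumptions made for illustration; this is not the Serre et al. code.

```python
import math


def s2_response(patch, prototype, sigma=1.0):
    """Gaussian-like (radial basis function) response of one S2 unit.

    patch and prototype are equal-length sequences of C1 values; sigma is
    an assumed tuning width.
    """
    dist_sq = sum((p - q) ** 2 for p, q in zip(patch, prototype))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))


def c2_response(c1_patches, prototype, sigma=1.0):
    """Global maximum of the S2 responses over all supplied patches."""
    return max(s2_response(patch, prototype, sigma) for patch in c1_patches)


# Toy C1 patches (flattened); the second one matches the prototype exactly.
patches = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
prototype = [1.0, 0.0, 1.0]
c2 = c2_response(patches, prototype)
```

The C2 value depends only on the best-matching patch, so it is unchanged when that patch appears at a different position or scale.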
Learning and Classification Stages
Learning
› During training, extract prototypes at the C1 level from target image across all orientations
Classification
› At runtime, extract C1 and C2 standard model features (SMFs) and pass them to a simple linear classifier
Serre et al. 2007
Scene Understanding System
Serre et al. 2007
Object Recognition in Clutter
C2 responses computed over a new input image and passed to a linear classifier
Superior to previous approaches on MIT-CBCL data sets
Comparable to previous approaches on CalTech5 data sets
Data Set                  Benchmark   C2 Features (Boost)   C2 Features (SVM)
Leaves (CalTech5)         84.0        97.0                  95.9
Cars (CalTech5)           84.8        99.7                  99.8
Faces (CalTech5)          96.4        98.2                  98.1
Airplanes (CalTech5)      94.0        96.7                  94.9
Motorcycles (CalTech5)    95.0        98.0                  97.4
Faces (MIT-CBCL)          90.4        95.9                  95.3
Cars (MIT-CBCL)           75.4        95.1                  93.3
Serre et al. 2007
Summary
Benefits to using the fine information from low-level SMFs
› C1 SMFs superior for shape-based object recognition
Benefits to using the more invariant high-level SMFs
› C2 SMFs suitable for semisupervised recognition of objects in clutter
› C2 SMFs excel at recognition of texture-based objects which lack a geometric structure
Too slow for real-time applications
HMAX on IFAT
System receives its inputs from silicon retinas
Each simple cell receives inputs from four consecutive retinal cells
› Two with excitatory connections
› Two with inhibitory connections
Excitatory and inhibitory synaptic weights are balanced so that the simple cells do not respond to uniform light
R.J. Vogelstein et al., NIPS 2007
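A rate-based sketch of why balanced weights cancel under uniform light. The unit weight magnitudes, the ordering (two excitatory followed by two inhibitory) and the input values are assumptions for illustration; the hardware works with spikes and programmable synaptic parameters rather than these numbers.

```python
# Assumed weights for the four consecutive retinal inputs: two excitatory (+)
# followed by two inhibitory (-), chosen so that they sum to zero.
S1_WEIGHTS = (1.0, 1.0, -1.0, -1.0)


def simple_cell_drive(retina_row, start):
    """Net synaptic drive to the simple cell whose field starts at `start`."""
    window = retina_row[start:start + len(S1_WEIGHTS)]
    return sum(w * x for w, x in zip(S1_WEIGHTS, window))


uniform = [5.0, 5.0, 5.0, 5.0]   # uniform illumination (arbitrary units)
edge = [9.0, 9.0, 1.0, 1.0]      # light-to-dark transition

drive_uniform = simple_cell_drive(uniform, 0)   # balanced weights cancel
drive_edge = simple_cell_drive(edge, 0)         # edge produces net excitation
```

Reversing the edge (dark-to-light) gives a net negative drive under these weights, so a cell with this assumed arrangement responds to only one contrast polarity.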
C1, S2 and beyond
Implement C1, S2 and possibly C2 stages of the HMAX model
HMAX model provides a generic high-level computational function in a quantitative way
T. Serre, Dissertation 2006
Preliminary Results: S1 and C1 Stages
S1 neurons are oriented spatial filters that detect local changes in contrast
C1 neurons take the MAX of similarly-oriented simple cells over a region of space
S1 cell integrates inputs from a 4x1 retinal receptive field
C1 cell integrates inputs from an array of 5x5 similarly-oriented S1 cells
F. Folowosele et al., BioCAS 2008
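The 5x5 pooling can be sketched on a grid of S1 responses as follows. Non-overlapping tiling, the function name c1_pool and the toy 10x10 map are assumptions of this sketch, not details taken from the hardware.

```python
def c1_pool(s1_map, block=5):
    """MAX-pool a 2-D grid of S1 responses over block x block regions.

    Regions are tiled without overlap, which is an assumption of this sketch.
    """
    rows, cols = len(s1_map), len(s1_map[0])
    pooled = []
    for r in range(0, rows - block + 1, block):
        pooled_row = []
        for c in range(0, cols - block + 1, block):
            region = [s1_map[i][j]
                      for i in range(r, r + block)
                      for j in range(c, c + block)]
            pooled_row.append(max(region))
        pooled.append(pooled_row)
    return pooled


# Toy 10x10 map of S1 responses with one strong response in each of two blocks.
s1 = [[0.0] * 10 for _ in range(10)]
s1[2][3] = 7.0   # falls in the top-left 5x5 block
s1[8][9] = 4.0   # falls in the bottom-right 5x5 block
c1 = c1_pool(s1)
```

Each C1 value reports the strongest S1 response in its region wherever it occurs inside the block, which is the source of the local position tolerance.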
Canonical Models
Biologically plausible neural circuits for implementing both Gaussian-like and MAX-like operations
Kouh 2007
MAX Operation
Nonlinear saturating pooling function on a set of inputs, such that the output codes the amplitude of the largest input regardless of the strength and number of the other inputs
Set of input neurons {X} causes the output Z to generate spikes at a rate proportional to the input with the fastest firing rate
R.J. Vogelstein et al., NIPS 2007
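An idealized, rate-level sketch of this behavior: the output rate is a constant fraction k of the fastest input rate and ignores everything else. The constant K and the firing rates below are made-up values for illustration; the sketch does not model the spiking circuit that produces this behavior.

```python
def max_unit_rate(input_rates, k):
    """Idealized output rate: proportional to the fastest input only."""
    return k * max(input_rates)


def max_ratio(output_rate, input_rates):
    """Ratio k between the output rate and the fastest input rate."""
    return output_rate / max(input_rates)


K = 0.125                                      # illustrative proportionality constant
few_inputs = [20.0, 80.0]                      # firing rates in Hz (made-up values)
many_inputs = [20.0, 80.0, 75.0, 60.0, 10.0]

z_few = max_unit_rate(few_inputs, K)
z_many = max_unit_rate(many_inputs, K)         # same output: only the 80 Hz input matters
```

In a measurement, max_ratio would be applied to recorded rates; a ratio that stays roughly constant across cells indicates that the output is tracking the largest input.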
Test 1: Test Images and Resulting Simple Cells
(A1-4) Generated test images
(B1-4) Horizontally-oriented simple cells that respond to light-to-dark transitions
(C1-4) Vertically-oriented simple cells that respond to dark-to-light transitions
F. Folowosele et al., ISCAS 2007
Test 1: MAX Network Computation Results
The ratio k obtained is approximately constant among all the simple cells, with a mean of 0.068 and a standard deviation of 0.0006
F. Folowosele et al., ISCAS 2007
Test 2: Test Images and Resulting Simple and Complex Cells
Checkerboard Test Image
The cells within each square of the overlaid checkerboard pattern represent the 5x5 array of simple cells which are pooled to form a complex cell
2400 simple cells
80 complex cells
MAX ratio: 0.1085 ± 0.02
After outliers are removed, MAX ratio: 0.1179 ± 0.01
F. Folowosele et al., BioCAS 2008
Future: Attention Modulated HMAX
Riesenhuber, 2004
Conclusion
General Purpose IF Array Transceivers
› Allow implementation of spike-based algorithms
› Digital implementations may end up being more effective than the mixed-signal version
Object Recognition
› HMAX provides a biologically plausible hierarchical model of V1 – PFC
› Can be shown to outperform some benchmarks
Implementation with IFAT
› Preliminary results on the early layers
› Future work must also include attention
Acknowledgments
Telluride Neuromorphic Engineering Workshop
UNCF-Merck Fellowship
National Science Foundation
References
R.R. Murphy and E. Rogers, “Cooperative assistance for remote robot supervision,” Presence: Teleoperators and Virtual Environments Journal, vol. 5, no. 2, pp. 224-240, 1996.
T. Serre, M. Kouh, C. Cadieu, U. Knoblich, G. Kreiman, and T. Poggio, “A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex,” AI Memo, MIT, Cambridge, 2005.
M. Riesenhuber and T. Poggio, “Computational models of object recognition in cortex: a review,” Technical Report, Artificial Intelligence Laboratory and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 2000b.
R.J. Vogelstein, U. Mallik, E. Culurciello, G. Cauwenberghs, R. Etienne-Cummings, “A multichip neuromorphic system for spike-based visual information processing,” Neural Computation, vol. 19, pp. 2281-2300, 2007a.
D.H. Goldberg, G. Cauwenberghs, and A.G. Andreou, “Probabilistic synaptic weighting in a reconfigurable network of VLSI integrate-and-fire neurons,” Neural Networks, vol. 14, pp. 781-793, 2001.
T.Y.W. Choi, P.A. Merolla, J.V. Arthur, K.A. Boahen, and B.E. Shi, “Neuromorphic implementation of orientation hypercolumns,” IEEE ISCAS 2005.
R.J. Vogelstein, U. Mallik, J.T. Vogelstein, G. Cauwenberghs, “Dynamically reconfigurable silicon array of spiking neurons with conductance-based synapses,” IEEE Transactions on Neural Networks, 2007b.
A. Cassidy, S. Denham, P. Kanold, and A.G. Andreou, “FPGA-based silicon spiking neural array,” IEEE BioCAS 2007.
B.E. Shi, E.K.C. Tsang, S.Y.M. Lam and Y. Meng, “Expandable hardware for computing cortical maps,” IEEE ISCAS 2006.
D.H. Hubel and T.N. Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat's visual cortex,” Journal of Physiology, vol. 160, no. 1, 1962.
L.G. Ungerleider and J.V. Haxby, “What and where in the human brain,” Curr. Opin. Neurobiol., pp. 157-165, 1994.
E. Rolls and T. Milward, “A model of invariant object recognition in the visual system: Learning rules, activation functions, lateral inhibition, and information-based performance measures,” Neural Computation, vol. 12, pp. 2547-2572, 2000.
P. Merolla and K. Boahen, “A recurrent model of orientation maps with simple and complex cells,” Advances in Neural Information Processing Systems (NIPS) 16, S. Thrun and L. Saul, Eds., MIT Press, pp. 995-1002, 2004.
R.P.N. Rao, “Robust Kalman filters for prediction, recognition, and learning,” Technical Report 645, Computer Science Department, University of Rochester, 1996.
J. Licklider, “A duplex theory of pitch perception,” Cellular and Molecular Life Sciences (CMLS), vol. 7, no. 4, pp. 128-134, 1951.
J. Tapson, “Autocorrelation properties of single neurons,” Proceedings of the 1998 South African Symposium on Communication and Signal Processing, 1998.
J. Tapson, C. Jin, A. van Schaik and R. Etienne-Cummings, “A First-Order Nonhomogeneous Markov Model for the Response of Spiking Neurons Stimulated by Small Phase-Continuous Signals,” Neural Computation, vol. 21, no. 6, pp. 1554-1588, June 2009.
T. Lacey, “Tutorial: The Kalman filter,” Lecture Notes, Department of Computer Science, Georgia Institute of Technology, 1998.
R. Linsker, “Neural network learning of optimal Kalman prediction and control,” Neural Networks, vol. 21, no. 9, pp. 1328-1343, 2008.
R.E. Kalman, “A new approach to linear filtering and prediction problems,” Transactions of the ASME–Journal of Basic Engineering (Series D), pp. 35-45, 1960.
S. Mihalas and E. Niebur, “A generalized linear integrate-and-fire neural model produces diverse spiking behaviors,” Neural Computation, 2008, in press.
C. Cadieu, M. Kouh, A. Pasupathy, C.E. Connor, M. Riesenhuber, T. Poggio, “A model of V4 shape selectivity and invariance,” J. Neurophysiol., 2007.