cortical receptive fields using deep autoencoders

Post on 30-Jan-2016




Cortical Receptive Fields Using Deep Autoencoders. Work done as a part of CS397. Ankit Awasthi (Y8084). Supervisor: Prof. H. Karnick.


  • Cortical Receptive Fields Using Deep Autoencoders

    Work done as a part of CS397

    Ankit Awasthi (Y8084)

    Supervisor: Prof. H. Karnick

  • The visual pathway

  • Some Terms

    The terms cell, neuron, neural unit, and unit of the neural network are used interchangeably, and all refer to a neuron in the visual cortex.

    The receptive field of a neuron is the region of space in which the presence of a stimulus changes the neuron's response.

  • Precortical stages

    Retinal cells and the cells in the lateral geniculate nucleus (LGN) have center-surround receptive fields: on-center with an off-surround, or vice versa.

  • Why focus on cortical perception??

    Most cells in the precortical stages are hard-coded to a large extent and may be innate.

    Cortical cells are mostly learned through postnatal visual stimulation.

    Hubel and Wiesel showed that sufficient visual deprivation during the so-called critical period produced irreversible damage in kittens.

  • What are these???

  • How did you do that??

    Surely you did not use only visual information.

    Processing in the later stages of the visual cortex has some top-down influence.

    Much of visual inference involves input from other modalities (say, facial emotion recognition).

    Thus we focus only on those stages of processing which use only visual information.

  • Neurological Findings (with electrodes in the cat's cortex!)

    The visual cortex consists of simple and complex cells.

    Simple cells can be characterized by a particular distribution of on and off areas.

    Complex cells could not be explained by a simple distribution of on and off areas.

    Receptive fields of simple cells should look like oriented edge detectors.

    Receptive fields of different cells may overlap.

  • Topographic Representation

    There is a systematic mapping of each structure to the next.

    The optic fibers from a part of the retina are connected to a small part of the LGN.

    A part of the LGN is similarly connected to a small part of the primary visual cortex.

    This topography continues in other cortical regions.

    Convergence at each stage.

    Larger receptive fields in later stages.

  • How do we learn these layers then??

  • Why Deep Learning??

    Brains have a deep architecture.

    Humans organize their ideas hierarchically, through composition of simpler ideas.

    Insufficiently deep architectures can be exponentially inefficient.

    Deep architectures facilitate feature and sub-feature sharing.

  • Neural Networks (~1985)

    Figure labels: input vector, hidden layers, outputs.

    Compare outputs with the correct answer to get an error signal.

    Back-propagate the error signal to get derivatives for learning.

  • Restricted Boltzmann Machines (RBM)

    We restrict the connectivity to make learning easier.

    Only one layer of hidden units.

    No connections between hidden units.

    Energy of a joint configuration is defined as

    (for binary visible units)

    (for real visible units)
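The energy expressions were images on the original slide and did not survive extraction. A possible reconstruction, assuming the standard forms given in Hinton's practical guide (listed in the references), is shown here. The symbol names are assumptions: $v_i$ and $h_j$ are visible and hidden unit states, $w_{ij}$ the weight between them, $a_i$ and $b_j$ the visible and hidden biases, and $\sigma_i$ the standard deviation of the Gaussian noise on visible unit $i$.

```latex
% Binary visible units
E(\mathbf{v}, \mathbf{h}) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i h_j w_{ij}

% Real-valued (Gaussian) visible units
E(\mathbf{v}, \mathbf{h}) = \sum_i \frac{(v_i - a_i)^2}{2\sigma_i^2} - \sum_j b_j h_j - \sum_{i,j} \frac{v_i}{\sigma_i} h_j w_{ij}
```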


  • Restricted Boltzmann Machines (RBM) (contd.)

    Probability of a configuration is defined as

    The hidden nodes are conditionally independent given the visible layer and vice versa

    Using the definition of the energy function and probability, the conditional probabilities come out to be as follows
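The formulas on this slide were images and are missing from the transcript. For binary units they presumably took the standard form below. This is a sketch under that assumption, with assumed symbol names: $E(\mathbf{v}, \mathbf{h})$ is the energy, $Z$ the partition function, $w_{ij}$ the weight between visible unit $i$ and hidden unit $j$, $a_i$ and $b_j$ the biases, and $\sigma(x) = 1/(1 + e^{-x})$ the logistic function.

```latex
P(\mathbf{v}, \mathbf{h}) = \frac{1}{Z} e^{-E(\mathbf{v}, \mathbf{h})},
\qquad
Z = \sum_{\mathbf{v}, \mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}

% Conditional independence gives factorized conditionals
P(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i v_i w_{ij}\Big)
\qquad
P(v_i = 1 \mid \mathbf{h}) = \sigma\Big(a_i + \sum_j h_j w_{ij}\Big)
```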

  • Maximum likelihood learning for an RBM

    Figure labels: i, j; t = 0, t = 1, t = 2, ..., t = infinity; "a fantasy".

    Start with a training vector on the visible units.

    Then alternate between updating all the hidden units in parallel and updating all the visible units in parallel.
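The alternating updates described on this slide can be sketched in a few lines of Python. The slide runs the chain to t = infinity. The sketch below instead stops after one full step (contrastive divergence, CD-1), a common shortcut swapped in here to keep the example short. The function names, toy data, learning rate, and layer sizes are all illustrative assumptions, not the author's code.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, b, rng):
    """Update all hidden units in parallel given the visible vector v."""
    probs = [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b))]
    states = [1 if rng.random() < p else 0 for p in probs]
    return probs, states


def sample_visible(h, W, a, rng):
    """Update all visible units in parallel given the hidden vector h."""
    probs = [sigmoid(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
             for i in range(len(a))]
    states = [1 if rng.random() < p else 0 for p in probs]
    return probs, states


def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 step: data (t = 0) -> hidden -> reconstruction (t = 1)."""
    ph0, h0 = sample_hidden(v0, W, b, rng)
    _, v1 = sample_visible(h0, W, a, rng)
    ph1, _ = sample_hidden(v1, W, b, rng)
    for i in range(len(a)):
        for j in range(len(b)):
            # <v_i h_j> at t = 0 minus <v_i h_j> at t = 1
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return v1


rng = random.Random(0)
n_visible, n_hidden = 6, 3  # toy sizes, chosen only for illustration
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
a = [0.0] * n_visible
b = [0.0] * n_hidden
data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]  # hypothetical training vectors
for epoch in range(200):
    for v in data:
        cd1_update(v, W, a, b, lr=0.1, rng=rng)
```

Running the chain for more steps before taking the second statistic moves the update closer to the maximum likelihood gradient, at a higher cost per update.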

  • Training a deep network

  • Deep Autoencoder (Hinton 2006)

  • Sparse DBNs (Lee et al. 2007)

    In order to have a sparse hidden layer, the average activation of each hidden unit over the training set is constrained to a certain small quantity.

    The optimization problem in the learning algorithm would look like
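The optimization problem itself was an image on the slide and is missing here. Assuming it follows the formulation in the sparse deep belief net paper by Lee, Ekanadham and Ng listed in the references, it would read roughly as below. The notation is an assumption: $m$ training vectors $\mathbf{v}^{(l)}$, $n$ hidden units, target activation $p$ (the "small quantity"), and regularization weight $\lambda$.

```latex
\min_{\{w_{ij},\, a_i,\, b_j\}} \;
  -\sum_{l=1}^{m} \log \sum_{\mathbf{h}} P\big(\mathbf{v}^{(l)}, \mathbf{h}^{(l)}\big)
  \;+\; \lambda \sum_{j=1}^{n} \left| \, p - \frac{1}{m} \sum_{l=1}^{m}
      \mathbb{E}\big[h_j^{(l)} \mid \mathbf{v}^{(l)}\big] \right|^2
```

The first term is the usual negative log-likelihood of the RBM. The second penalizes hidden units whose average activation over the training set deviates from $p$.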

  • Related Work

    In [4], the authors showed that the features learned by independent component analysis are oriented edge detectors.

    Lee et al. in [10] show that the second layer learned using sparse DBNs matches certain properties of cells in area V2 of the visual cortex.

    Bengio in [3] discusses ways of visualizing higher-layer features.

    Lee in [4] came up with convolutional DBNs, which incorporate weight sharing across the visual field and a probabilistic max-pooling operation.

  • Our experimental setting

    We trained sparse DBNs on 100,000 randomly sampled 14x14 patches of natural images.

    The images were preprocessed to have the same overall contrast and whitened as in [5].

    The first, second, and third hidden layers each have 200 units.
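The patch-sampling step of this setup can be sketched as follows. This is a minimal illustration, not the author's pipeline: the image is random stand-in data, the function names are made up for the example, and the per-patch normalization shown is only a simple substitute for the contrast equalization and whitening of [5], which are not reproduced here.

```python
import random


def sample_patches(image, patch_size, n_patches, rng):
    """Cut n_patches square patches at random positions of a 2-D image."""
    height, width = len(image), len(image[0])
    patches = []
    for _ in range(n_patches):
        top = rng.randrange(height - patch_size + 1)
        left = rng.randrange(width - patch_size + 1)
        patch = [image[r][left:left + patch_size]
                 for r in range(top, top + patch_size)]
        patches.append(patch)
    return patches


def normalize_contrast(patch):
    """Flatten a patch, subtract its mean, and scale it to unit variance."""
    flat = [x for row in patch for x in row]
    mean = sum(flat) / len(flat)
    centered = [x - mean for x in flat]
    std = (sum(x * x for x in centered) / len(flat)) ** 0.5
    return [x / std if std > 0 else 0.0 for x in centered]


rng = random.Random(0)
# Stand-in for a natural image: 64x64 random intensities (hypothetical data).
image = [[rng.random() for _ in range(64)] for _ in range(64)]
patches = sample_patches(image, patch_size=14, n_patches=100, rng=rng)
vectors = [normalize_contrast(p) for p in patches]
```

Each 14x14 patch becomes a 196-dimensional input vector for the first-layer RBM.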

  • Getting first layer hidden features

    To maximize the activation of the ith hidden unit, the input v should be

    Recall what was said about receptive fields of simple cells (oriented edge detectors)
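The expression for the maximizing input was an image and is missing. Under the usual assumption that the input is norm-constrained, $\|\mathbf{v}\|^2 \le 1$, and that the activation increases monotonically with $\sum_k w_{ik} v_k$, the result (as in Ng's sparse autoencoder lecture notes, listed in the references) would be as follows. Here $w_{ik}$ is taken to mean the weight between hidden unit $i$ and input pixel $k$, an indexing chosen for this sketch.

```latex
v_k = \frac{w_{ik}}{\sqrt{\sum_{k'} w_{ik'}^{2}}}
```

In other words, the preferred stimulus of a first-layer unit is its own weight vector rescaled to unit length, which is why plotting the weights directly shows the unit's receptive field.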

  • First Hidden Layer Features (with each epoch)

  • Effect of Sparsity

  • Higher Layer Features

    Projecting a higher layer's weights onto the responses of the previous layer is useless!

    Three different methods of projecting the hidden units onto the input space:

    Linear combination of previous-layer filters, Lee [2]

    Sampling from a hidden unit, Hinton [5], Bengio [3]

    Activation maximization, Bengio [3]

  • Linear Combination of Previous Layer Filters

    Only a few connections to the previous layer have weights of large magnitude (strongly positive or strongly negative).

    Some of the largest-magnitude connections are used for the linear combination.

    Overlooks the non-linearity in the network from one layer to the next.

    Simple and efficient.
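The linear-combination method can be sketched as below. It assumes that "largest weighted connections" means the connections with the largest absolute weight. The function name, the `top_k` parameter, and the toy filters are illustrative, not taken from the slides.

```python
def linear_combination_filter(first_layer_filters, second_layer_weights, top_k):
    """Visualize one second-layer unit as a weighted sum of first-layer filters.

    first_layer_filters: list of filters, each a flat list of pixel weights.
    second_layer_weights: the unit's weight to each first-layer unit.
    top_k: number of largest-magnitude connections to keep.
    """
    ranked = sorted(range(len(second_layer_weights)),
                    key=lambda j: abs(second_layer_weights[j]),
                    reverse=True)
    kept = ranked[:top_k]
    n_pixels = len(first_layer_filters[0])
    combined = [0.0] * n_pixels
    for j in kept:
        for p in range(n_pixels):
            combined[p] += second_layer_weights[j] * first_layer_filters[j][p]
    return combined


# Toy example: three 4-pixel filters and one second-layer unit.
filters = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0]]
weights = [2.0, -0.1, -3.0]
combined = linear_combination_filter(filters, weights, top_k=2)
```

The weak connection (weight -0.1) is dropped, so only the first and third filters contribute. No sigmoid is applied between layers, which is the non-linearity the slide says this method overlooks.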

  • Linear Combination of Previous Layer Filters (Results)

  • Sampling from Hidden Units

    A deep autoencoder built from RBMs is a generative model (top-down sampling), and any two adjacent layers form an RBM (Gibbs sampling).

    Clamp a particular hidden unit to 1 during Gibbs sampling, then do top-down sampling to the input layer.
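A rough sketch of clamped sampling for a two-hidden-layer stack follows. It makes several assumptions the slide does not spell out: binary units throughout, Gibbs sampling run only in the top RBM, and a final top-down pass that propagates probabilities rather than binary samples. All names, sizes, and weights are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def up_sample(lower, W, bias, rng):
    """Sample the upper layer given the lower layer (W is lower x upper)."""
    probs = [sigmoid(bias[j] + sum(lower[i] * W[i][j] for i in range(len(lower))))
             for j in range(len(bias))]
    return [1 if rng.random() < p else 0 for p in probs]


def down_probs(upper, W, bias):
    """Activation probabilities of the lower layer given the upper layer."""
    return [sigmoid(bias[i] + sum(upper[j] * W[i][j] for j in range(len(upper))))
            for i in range(len(bias))]


def sample_from_unit(unit, W1, a1, W2, a2, b2, n_gibbs, rng):
    """Clamp one second-layer unit to 1, run Gibbs sampling in the top RBM,
    then project top-down to the input layer."""
    h2 = [0] * len(b2)
    h2[unit] = 1
    for _ in range(n_gibbs):
        h1 = [1 if rng.random() < p else 0 for p in down_probs(h2, W2, a2)]
        h2 = up_sample(h1, W2, b2, rng)
        h2[unit] = 1  # keep the chosen unit clamped
    h1_probs = down_probs(h2, W2, a2)
    return down_probs(h1_probs, W1, a1)


rng = random.Random(0)
n_v, n_h1, n_h2 = 4, 3, 2  # toy layer sizes
W1 = [[rng.gauss(0.0, 1.0) for _ in range(n_h1)] for _ in range(n_v)]
W2 = [[rng.gauss(0.0, 1.0) for _ in range(n_h2)] for _ in range(n_h1)]
a1, a2, b2 = [0.0] * n_v, [0.0] * n_h1, [0.0] * n_h2
image = sample_from_unit(0, W1, a1, W2, a2, b2, n_gibbs=20, rng=rng)
```

Averaging the returned image over many runs gives a smoother picture of what the clamped unit tends to generate.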

  • Sampling from Hidden Units (Results)

  • Activation Maximization

    Intuition same as that for first layer features.

    Optimization problem is much more difficult.

    In general a non-convex problem.

    Solve for local optima from different random initializations, then take the average, the best one, etc.
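Activation maximization looks for a norm-bounded input that maximizes a chosen unit's activation. A minimal sketch for one second-layer unit of a sigmoid network is given below, using projected gradient ascent with random restarts and keeping the best result. The network shape, the norm radius, the step size, and every name in the code are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def activation_and_grad(x, W1, b1, w2, c2):
    """Activation of one second-layer unit and its gradient w.r.t. the input x.

    W1[i][k] connects input i to first-layer unit k; w2[k] connects
    first-layer unit k to the chosen second-layer unit.
    """
    h1 = [sigmoid(b1[k] + sum(x[i] * W1[i][k] for i in range(len(x))))
          for k in range(len(b1))]
    h2 = sigmoid(c2 + sum(w2[k] * h1[k] for k in range(len(h1))))
    grad = [h2 * (1 - h2) * sum(w2[k] * h1[k] * (1 - h1[k]) * W1[i][k]
                                for k in range(len(h1)))
            for i in range(len(x))]
    return h2, grad


def project(x, radius):
    """Rescale x to lie on the sphere of the given radius."""
    norm = math.sqrt(sum(v * v for v in x)) or 1.0
    return [radius * v / norm for v in x]


def maximize_activation(W1, b1, w2, c2, radius, steps, lr, restarts, rng):
    """Projected gradient ascent from several random starts; keep the best."""
    best_x, best_act = None, -1.0
    for _ in range(restarts):
        x = project([rng.gauss(0.0, 1.0) for _ in range(len(W1))], radius)
        for _ in range(steps):
            _, grad = activation_and_grad(x, W1, b1, w2, c2)
            x = project([v + lr * g for v, g in zip(x, grad)], radius)
        act, _ = activation_and_grad(x, W1, b1, w2, c2)
        if act > best_act:
            best_x, best_act = x, act
    return best_x, best_act


rng = random.Random(0)
n_in, n_h1 = 5, 4  # toy sizes
W1 = [[rng.gauss(0.0, 1.0) for _ in range(n_h1)] for _ in range(n_in)]
b1 = [0.0] * n_h1
w2 = [rng.gauss(0.0, 1.0) for _ in range(n_h1)]
best_x, best_act = maximize_activation(W1, b1, w2, 0.0, radius=1.0,
                                       steps=50, lr=0.5, restarts=3, rng=rng)
```

Because the problem is non-convex, different restarts can end at different local optima. Keeping the best is one of the options the slide mentions, and averaging the solutions is another.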

  • Activation Maximization (Results)

  • Analysis of Results

    As observed, the second layer features are able to capture combinations of edges or angled stimuli.

    The third layer features are very difficult to make sense of in terms of simple geometrical elements.

    No good characterization of these cells is available, so there is no basis for choosing between the different methods.

  • Larger Receptive Fields for Higher Layers

    We offer a simple solution to extend the size of the receptive fields for higher layers.

    Using the RBM trained on natural image patches, compute the response over the entire image with overlapping patches.

    Responses of some neighboring patches are taken as input for the next layer RBM.

    This is repeated for the whole network.

    This has not been investigated exhaustively.
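The scheme on this slide can be sketched as two steps: slide a trained RBM over the image, then concatenate the responses of neighboring patches. The stride, the size of the neighborhood, and all names below are assumptions, since the slide does not give them, and the weights are random stand-ins for a trained RBM.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_response(patch_vector, W, b):
    """Hidden-unit activation probabilities of an RBM for one flattened patch."""
    return [sigmoid(b[j] + sum(patch_vector[i] * W[i][j]
                               for i in range(len(patch_vector))))
            for j in range(len(b))]


def response_map(image, patch_size, stride, W, b):
    """Apply the RBM to every overlapping patch; returns a grid of responses."""
    height, width = len(image), len(image[0])
    grid = []
    for top in range(0, height - patch_size + 1, stride):
        row = []
        for left in range(0, width - patch_size + 1, stride):
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            row.append(hidden_response(patch, W, b))
        grid.append(row)
    return grid


def neighborhood_inputs(grid, block):
    """Concatenate the responses of block x block neighboring patches into
    one input vector for the next-layer RBM."""
    inputs = []
    for r in range(len(grid) - block + 1):
        for c in range(len(grid[0]) - block + 1):
            vec = []
            for dr in range(block):
                for dc in range(block):
                    vec.extend(grid[r + dr][c + dc])
            inputs.append(vec)
    return inputs


rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]  # toy 8x8 image
W = [[rng.gauss(0.0, 0.1) for _ in range(2)] for _ in range(16)]  # 4x4 patch, 2 hidden units
b = [0.0, 0.0]
grid = response_map(image, patch_size=4, stride=2, W=W, b=b)
next_inputs = neighborhood_inputs(grid, block=2)
```

With a 4x4 patch, a stride of 2, and a 2x2 neighborhood, each next-layer input covers a 6x6 region of the image, so the effective receptive field grows with depth.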

  • Results (linear combination)

    Figure panels: First Layer, Second Layer.

  • Conclusion and Future Work

    The learned features show similarities to cortical receptive fields.

    Support for deep learning methods as a computational model for cortical processing.

    Able to learn more complete parts of objects in the higher layers with bigger receptive fields.

    Future work would be to extend these ideas and establish the cognitive relevance of the computational models.

  • References

    Geoffrey E. Hinton, Yee-Whye Teh and Simon Osindero, A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, Volume 18, pages 1527-1554, 2006.

    D. H. Hubel and T. N. Wiesel, Receptive Fields, Binocular Interaction and Functional Architecture in the Cat's Visual Cortex, The Journal of Physiology, Vol. 160, No. 1, 1962.

    Michael J. Lyons, Shigeru Akamatsu, Miyuki Kamachi and Jiro Gyoba, Coding Facial Expressions with Gabor Wavelets, Proceedings, Third IEEE International Conference on Automatic Face and Gesture Recognition, pp. 200-205, April 14-16, 1998.

    J. H. van Hateren and A. van der Schaaf, Independent Component Filters of Natural Images Compared with Simple Cells in Primary Visual Cortex, Proceedings: Biological Sciences, vol. 265, pages 359-366, March 1998.

    Geoffrey E. Hinton (2010), A Practical Guide to Training Restricted Boltzmann Machines, Technical Report, Volume 1.

    Dumitru Erhan, Yoshua Bengio, Aaron Courville and Pascal Vincent (2010), Visualizing Higher-Layer Features of a Deep Network, Technical Report 1341.

  • References (contd.)

    Honglak Lee, Roger Grosse, Rajesh Ranganath and Andrew Y. Ng, Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations, ICML, 2009.

    Geoffrey E. Hinton, Learning Multiple Layers of Representation, Trends in Cognitive Sciences, Vol. 11, No. 10, 2007.

    Andrew Ng (2010), Sparse Autoencoder (lecture notes).

    Honglak Lee, Chaitanya Ekanadham and Andrew Y. Ng, Sparse Deep Belief Net Model for Visual Area V2, NIPS, 2007.

    Ruslan Salakhutdinov, Learning Deep Generative Models, PhD thesis, 2009.

    Yoshua Bengio, Learning Deep Architectures for AI, Foundations and Trends in Machine Learning, Vol. 2, No. 1, pages 1-127, 2009.

