self-organizing maps - tutorial

"Unsupervised learning": from theory to practice

Miguel Arturo Barreto Sánz


Post on 18-Nov-2014


Outline

● Introduction
    The unsupervised learning
● The Self-Organizing Map
    The biological inspiration
    The algorithm
    Characteristics
    Examples
● Practical examples using MATLAB

Introduction

Unsupervised learning is a way to form “natural groupings” or clusters of patterns.

Unsupervised learning seeks to determine how the data are organized.

It is distinguished from supervised learning in that the learner is given only unlabeled examples. Among neural network models, the Self-Organizing Map (SOM) is a commonly used unsupervised learning algorithm.

The SOM is a topographic organization in which nearby locations in the map represent inputs with similar properties.

The Self-Organizing Map: The biological inspiration

W. Penfield

Sensory information is processed in the neocortex by highly ordered neuronal networks.

• Tangential to the cortical surface, representations of the sensory periphery are organized into well-ordered maps.

• Taste maps in gustatory cortex (Accolla et al., 2007)

• Somatotopic maps in primary somatosensory cortex (Kaas, 1991).

Another prominent cortical map is the tonotopic organization of auditory cortex (Kalatsky et al., 2005).

The most intensely studied example is the primary visual cortex, which is arranged with superimposed maps of retinotopy, ocular dominance and orientation (Bonhoeffer and Grinvald, 1991).

Homunculus

“Somatosensory cortex dominated by the representation of teeth in the naked mole-rat brain”, Kenneth C. Catania and Michael S. Remple.

A remarkably high degree of organization is evident in the primary somatosensory cortex, in which a clear pattern of cytoarchitectonic units termed ‘barrels’ is observed, matching the arrangement of the whiskers on the snout of the mouse (Woolsey and Van der Loos, 1970).


Mapping functionally related sensory information onto nearby cortical regions is thought to minimize axonal wiring length and simplify the synaptic circuits underlying correlation-based associational plasticity.


The Self-Organizing Map

In a topology-preserving map, units located physically next to each other will respond to classes of input vectors that are likewise next to each other.

Although it is easy to visualize units next to each other in a two-dimensional array, it is not so easy to determine which classes of vectors are next to each other in a high-dimensional space.

High-dimensional input vectors are, in a sense, projected down onto the two-dimensional map in a way that maintains the natural order of the input vectors.

This dimensionality reduction can make it easy to visualize important relationships among the data that might otherwise go unnoticed.

Teuvo Kohonen

Page 11: Self-organizing maps -  Tutorial

10

The Self-Organizing Map

A SOM is formed of neurons located on a regular, usually 1- or 2-dimensional grid.

The neurons are connected to adjacent neurons by a neighborhood relation dictating the structure of the map.

In the 2-dimensional case the neurons of the map can be arranged either on a rectangular or a hexagonal lattice
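The difference between the two lattice types can be sketched in a few lines of Python. This is only an illustration: the function names `grid_positions` and `neighbors` are hypothetical, and the hexagonal lattice is modeled here by shifting odd rows half a step and placing rows sqrt(3)/2 apart, which is one common convention and not necessarily the one used in the slides.

```python
import math

def grid_positions(rows, cols, lattice="rect"):
    """Return the 2-D map coordinates of every unit, row by row."""
    positions = []
    for r in range(rows):
        for c in range(cols):
            if lattice == "hexa":
                # Odd rows are shifted half a step and rows lie sqrt(3)/2 apart,
                # so the six nearest units are all at distance 1.
                x = c + 0.5 * (r % 2)
                y = r * math.sqrt(3) / 2
            else:
                x, y = float(c), float(r)
            positions.append((x, y))
    return positions

def neighbors(positions, index, radius=1.0):
    """Indices of the units within `radius` of unit `index` (itself excluded)."""
    px, py = positions[index]
    return [
        i for i, (x, y) in enumerate(positions)
        if i != index and math.hypot(x - px, y - py) <= radius + 1e-9
    ]

rect = grid_positions(5, 5, "rect")
hexa = grid_positions(5, 5, "hexa")
center = 2 * 5 + 2  # unit in row 2, column 2
print(len(neighbors(rect, center)), len(neighbors(hexa, center)))
```

With these conventions an interior unit has four nearest neighbors on the rectangular lattice and six on the hexagonal one.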

(Figure: two lattices, each connected to the input, with units labeled 0, 1 and 2.)

The algorithm

The weights of the neurons are initialized.

t = 0
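The slide does not say how the weights are initialized. A minimal sketch, assuming uniform random values drawn inside the range of the data (the function name `init_weights` and the `seed` parameter are hypothetical):

```python
import random

def init_weights(n_units, data, seed=0):
    """One weight vector per unit, drawn uniformly inside the data's bounding box."""
    rng = random.Random(seed)
    dims = len(data[0])
    lows = [min(x[d] for x in data) for d in range(dims)]
    highs = [max(x[d] for x in data) for d in range(dims)]
    return [
        [rng.uniform(lows[d], highs[d]) for d in range(dims)]
        for _ in range(n_units)
    ]

data = [[0.0, 10.0], [1.0, 20.0], [0.5, 15.0]]
weights = init_weights(n_units=6, data=data)
t = 0  # training step counter, as on the slide
```

Other schemes exist, for example picking random training samples as the starting weights; the choice mainly affects how quickly the map unfolds.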


Example

The training uses competitive learning.

The neuron with weight vector most similar to the input is called the best matching unit (BMU).
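Finding the BMU can be sketched as follows, assuming that "most similar" means smallest Euclidean distance (a common choice, but an assumption here); the function name `best_matching_unit` is hypothetical.

```python
import math

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to input x (Euclidean)."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
print(best_matching_unit(weights, [0.9, 0.8]))  # prints 1
```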

The weights of the BMU and neurons close to it in the SOM lattice are adjusted towards the input vector.

The magnitude of the change decreases with time and with distance from the BMU.
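One way to write this update is w ← w + α(t) · h · (x − w), where α(t) is a learning rate that shrinks over time and h is a neighborhood factor that shrinks with lattice distance from the BMU. The sketch below assumes an exponential decay and a Gaussian neighborhood; the slides do not specify either, and the names `update`, `alpha0` and `sigma0` are hypothetical.

```python
import math

def update(weights, positions, x, bmu, t, n_steps, alpha0=0.5, sigma0=2.0):
    """Move every unit toward x, scaled by time decay and lattice distance to the BMU."""
    # Both the learning rate and the neighborhood radius shrink as t grows.
    decay = math.exp(-t / n_steps)
    alpha = alpha0 * decay
    sigma = sigma0 * decay
    for i, w in enumerate(weights):
        # Distance is measured on the map lattice, not in the input space.
        d = math.dist(positions[i], positions[bmu])
        # Gaussian neighborhood: 1 at the BMU, smaller farther away.
        h = math.exp(-(d * d) / (2 * sigma * sigma))
        for k in range(len(w)):
            w[k] += alpha * h * (x[k] - w[k])

positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # a 1-D map of three units
weights = [[0.0], [0.0], [0.0]]
update(weights, positions, x=[1.0], bmu=0, t=0, n_steps=100)
print(weights)
```

In this example the BMU moves halfway to the input, and the other two units move less the farther they sit from it on the lattice.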


Next example


Characteristics

Quality of life world map

Inputs: state of health, nutrition, educational services, etc.

(Figure: an input space with 3 dimensions (x, y, z) mapped to an output with 2 dimensions (x, y).)
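The projection from a 3-dimensional input to a 2-dimensional map can be sketched end to end. This is a toy implementation under stated assumptions, not the method used in the slides: a rectangular lattice, Euclidean BMU search, a Gaussian neighborhood, an exponential decay schedule, and hypothetical names (`train_som`, `project`, `bmu_index`) and synthetic data.

```python
import math
import random

def bmu_index(weights, x):
    """Index of the unit whose weights are closest to x (Euclidean distance)."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def train_som(data, rows, cols, n_steps=2000, alpha0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols rectangular map; returns (positions, weights)."""
    rng = random.Random(seed)
    dims = len(data[0])
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dims)] for _ in positions]
    for t in range(n_steps):
        x = rng.choice(data)                  # present one random sample
        decay = math.exp(-3.0 * t / n_steps)  # assumed decay schedule
        alpha, sigma = alpha0 * decay, sigma0 * decay
        b = bmu_index(weights, x)
        for i, w in enumerate(weights):
            d = math.dist(positions[i], positions[b])  # distance on the lattice
            h = math.exp(-(d * d) / (2.0 * sigma * sigma))
            for k in range(dims):
                w[k] += alpha * h * (x[k] - w[k])
    return positions, weights

def project(x, positions, weights):
    """Map an input vector to the (row, col) of its best matching unit."""
    return positions[bmu_index(weights, x)]

# Synthetic 3-D data: two groups of points in opposite corners of the unit cube.
rng = random.Random(1)
cluster_a = [[rng.uniform(0.0, 0.2) for _ in range(3)] for _ in range(20)]
cluster_b = [[rng.uniform(0.8, 1.0) for _ in range(3)] for _ in range(20)]
data = cluster_a + cluster_b

positions, weights = train_som(data, rows=4, cols=4)
coords = [project(x, positions, weights) for x in data]
print(coords[0], coords[-1])  # 2-D map coordinates of a point from each group
```

Each 3-dimensional point ends up represented by a pair of lattice coordinates, which is what makes the map usable as a 2-dimensional picture of the data.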


Visualization


Clusters of sites with similar characteristics

Which crops or varieties are likely to perform well, where and when?

Homologous places for Colombian coffee production: Brazil, Ecuador, East Africa, and New Guinea.

(Figure labels: soil, climate, genotype.)

The COCH project

For commercial (mass-production) crops such as rice and corn, the “when” and “where” are known.

For native crops (guanabana, lulo) or special types of crops (coffee varieties), this is not the case.

DAPA (Diversification Agriculture Project Alliance)

When and what must I cultivate?

Market demand

The challenges

1. Large datasets
2. Multivariate problem

(Figure: each 1 km × 1 km cell is 1 point; 1 336,025 points.)

Climate, management, variety, climate estimates, soil, etc.

Example: BIOCLIM is a bioclimatic prediction system that uses surrogate terms (bioclimatic parameters), derived from mean monthly climate estimates, to approximate energy and water balances at a given location.

B1. Annual Mean Temperature
B2. Mean Diurnal Range (Mean(period max - min))
B3. Isothermality (B2/B7)
B4. Temperature Seasonality (Coefficient of Variation)
B5. Max Temperature of Warmest Period
B6. Min Temperature of Coldest Period
B7. Temperature Annual Range (B5 - B6)
B8. Mean Temperature of Wettest Quarter
B9. Mean Temperature of Driest Quarter
B10. Mean Temperature of Warmest Quarter
B11. Mean Temperature of Coldest Quarter
B12. Annual Precipitation
B13. Precipitation of Wettest Period
B14. Precipitation of Driest Period
B15. Precipitation Seasonality (Coefficient of Variation)
B16. Precipitation of Wettest Quarter
B17. Precipitation of Driest Quarter
B18. Precipitation of Warmest Quarter
B19. Precipitation of Coldest Quarter
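A few of these parameters can be derived directly from monthly values, which shows how one location becomes a multivariate input vector. The sketch below is illustrative only: it assumes that a "period" is a month, approximates the monthly mean temperature as the midpoint of the monthly maximum and minimum, and uses a hypothetical function name and made-up data. It is not the BIOCLIM implementation.

```python
def bioclim_subset(tmax, tmin, precip):
    """Compute a subset of the listed parameters from 12 monthly values each."""
    # Assumption: monthly mean temperature is the midpoint of max and min.
    tmean = [(hi + lo) / 2 for hi, lo in zip(tmax, tmin)]
    b1 = sum(tmean) / 12                                   # annual mean temperature
    b2 = sum(hi - lo for hi, lo in zip(tmax, tmin)) / 12   # mean diurnal range
    b5 = max(tmax)                                         # max temperature of warmest period
    b6 = min(tmin)                                         # min temperature of coldest period
    b7 = b5 - b6                                           # temperature annual range
    b3 = b2 / b7                                           # isothermality
    return {
        "B1": b1, "B2": b2, "B3": b3, "B5": b5, "B6": b6, "B7": b7,
        "B12": sum(precip),   # annual precipitation
        "B13": max(precip),   # precipitation of wettest period
        "B14": min(precip),   # precipitation of driest period
    }

# Made-up monthly data for a single site (temperatures in °C, precipitation in mm).
tmax = [20, 22, 24, 26, 28, 30, 30, 28, 26, 24, 22, 20]
tmin = [10, 12, 14, 16, 18, 20, 20, 18, 16, 14, 12, 10]
precip = [100, 80, 60, 40, 20, 0, 0, 20, 40, 60, 80, 100]

site = bioclim_subset(tmax, tmin, precip)
print(site)
```

Repeating this for every grid point gives one vector per site, which is the kind of input the map is trained on.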


How to work? How to obtain prototypes, clustering and visualization at the same time?

Approach: unsupervised learning with self-organizing maps

Two flavors of SOMs

Growing hierarchical map: different representations at different levels

Self-organizing map: static map, just one representation

Similarity of sugarcane-growing environmental conditions (1999-2005) using self-organizing maps

The clusters found in the feature space in many cases are not the same as those found in geographic space.

Represent clusters of a multidimensional space: map multidimensional data onto a two-dimensional lattice of cells.

Self-Organizing Map (SOM)


Approaches

GHSOM

P1. Annual Mean Temperature
P2. Mean Diurnal Range (Mean(period max - min))
P3. Isothermality (P2/P7)
P4. Temperature Seasonality (Coefficient of Variation)
P5. Max Temperature of Warmest Period
P6. Min Temperature of Coldest Period
P7. Temperature Annual Range (P5 - P6)
P8. Mean Temperature of Wettest Quarter
P9. Mean Temperature of Driest Quarter
P10. Mean Temperature of Warmest Quarter
P11. Mean Temperature of Coldest Quarter
P12. Annual Precipitation
P13. Precipitation of Wettest Period
P14. Precipitation of Driest Period
P15. Precipitation Seasonality (Coefficient of Variation)
P16. Precipitation of Wettest Quarter
P17. Precipitation of Driest Quarter
P18. Precipitation of Warmest Quarter
P19. Precipitation of Coldest Quarter

GHSOM Component planes
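A component plane shows the value of one input variable across all units of a map. The sketch below extracts such planes from a single flat map, assuming the units are stored row by row; the function name `component_plane` and the toy weights are hypothetical. In a GHSOM each map in the hierarchy would presumably have its own set of planes.

```python
def component_plane(weights, rows, cols, k):
    """Grid (rows x cols) holding component k of each unit's weight vector."""
    return [[weights[r * cols + c][k] for c in range(cols)] for r in range(rows)]

# A hypothetical 2 x 3 map whose units hold two variables each.
weights = [
    [1.0, 10.0], [2.0, 20.0], [3.0, 30.0],
    [4.0, 40.0], [5.0, 50.0], [6.0, 60.0],
]
plane0 = component_plane(weights, rows=2, cols=3, k=0)
plane1 = component_plane(weights, rows=2, cols=3, k=1)
for row in plane1:
    print(row)
```

Plotting each plane as a colored grid, one per variable, lets the planes be compared side by side to see which variables vary together across the map.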

Thank you!