spatial lda

Upload: north-carolina-state-university

Post on 18-Jul-2015


TRANSCRIPT


Page 2: Spatial LDA

Spatial Latent Dirichlet Allocation

Authors: Xiaogang Wang and Eric Grimson

Review by: George Mathew (george2)


Page 7: Spatial LDA

Applications

• Text Mining
  • Identifying similar chapters in a book
• Computer Vision
  • Face recognition
• Colocation Mining
  • Identifying forest fires
• Music Search
  • Identifying the genre of a piece of music from a segment of the song


Page 14: Spatial LDA

LDA Overview

• A generative probabilistic model
• Represented in terms of words, documents, a corpus, and labels:
  • word - the primary unit of discrete data
  • document - a sequence of words
  • corpus - the collection of all documents
  • label (output) - the class of the document


Page 20: Spatial LDA

Wait a minute… so how are we going to perform computer vision applications using words and documents?

• Here a word is a visual word, which could be:
  • an image patch
  • a spatial or temporal interest point
  • a moving pixel, etc.
• The paper takes image classification as its computer vision example.


Page 24: Spatial LDA

Data Preprocessing

• The image is convolved with a bank of filters: 3 Gaussians, 4 Laplacians of Gaussians, and 4 first-order derivatives of Gaussians.
• A grid divides the image into local patches, and each patch is densely sampled for a local descriptor.
• The local descriptors of every patch in the entire image set are clustered using k-means and stored in an auxiliary data structure (let's call it a "Workbook").
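The preprocessing steps above can be sketched in Python. This is a minimal illustration, not the paper's implementation: the filter scales, the 8×8 grid, the mean-response patch descriptor, and the Workbook size of 10 are all assumed here.

```python
# Sketch of the preprocessing stage: filter bank -> patch descriptors -> k-means Workbook.
import numpy as np
from scipy import ndimage
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
image = rng.random((64, 64))  # toy grayscale image

# Filter bank: 3 Gaussians, 4 Laplacians of Gaussians, and 4 first-order
# derivatives of Gaussians (d/dx and d/dy at two scales). Scales are assumed.
responses = []
for s in (1, 2, 4):
    responses.append(ndimage.gaussian_filter(image, s))
for s in (1, 2, 4, 8):
    responses.append(ndimage.gaussian_laplace(image, s))
for s in (1, 2):
    responses.append(ndimage.gaussian_filter(image, s, order=(0, 1)))  # d/dx
    responses.append(ndimage.gaussian_filter(image, s, order=(1, 0)))  # d/dy
stack = np.stack(responses, axis=-1)  # an 11-dim response vector per pixel

# Divide the image into an 8x8 grid of patches; describe each patch by the
# mean filter response inside it (a simple stand-in for dense sampling).
patch = 8
descriptors = np.array([
    stack[i:i + patch, j:j + patch].reshape(-1, stack.shape[-1]).mean(axis=0)
    for i in range(0, 64, patch) for j in range(0, 64, patch)
])

# Cluster all patch descriptors with k-means to build the "Workbook";
# each patch is then represented by its nearest codeword index.
codebook, word_ids = kmeans2(descriptors, 10, minit='++')
```

Each image patch thereafter enters the model only as its `word_ids` index, i.e. as a visual word drawn from the Workbook.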


Page 33: Spatial LDA

Clustering using LDA

• Framework:
  • M documents (images)
  • Each document j has N_j words
  • w_ji is the observed value of word i in document j
  • All words are clustered into K topics
  • Each topic k is modeled as a multinomial distribution over the Workbook
  • α and β are Dirichlet prior hyperparameters
  • φ_k, π_j, and z_ji are hidden variables

Page 34: Spatial LDA

Clustering using LDA (contd)

• Generative algorithm:
  • For each topic k, a multinomial parameter φ_k is sampled from the Dirichlet prior: φ_k ~ Dir(β)
  • For each document j, a multinomial parameter π_j over the K topics is sampled from the Dirichlet prior: π_j ~ Dir(α)
  • For word i in document j, a topic label z_ji is sampled from the discrete distribution: z_ji ~ Discrete(π_j)
  • The value w_ji of word i in document j is sampled from the discrete distribution of topic z_ji: w_ji ~ Discrete(φ_z_ji)
• z_ji is then sampled through a Gibbs sampling procedure, where:
  • n^(k)_(-ji,w) is the number of words in the corpus with value w assigned to topic k, excluding word i of document j
  • n^(j)_(-ji,k) is the number of words in document j assigned to topic k, excluding word i of document j
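The Gibbs sampling procedure can be made concrete with a small collapsed sampler in plain NumPy. The toy corpus, K=2 topics, and the hyperparameter values are assumptions for illustration; words are just Workbook indices.

```python
# Minimal collapsed Gibbs sampler for standard LDA over a toy corpus.
import numpy as np

rng = np.random.default_rng(0)
docs = [[0, 0, 1, 1, 2], [2, 3, 3, 4, 4], [0, 1, 3, 4, 2]]  # M = 3 documents
W, K = 5, 2            # Workbook (vocabulary) size and number of topics
alpha, beta = 1.0, 0.1 # Dirichlet hyperparameters (assumed values)

# Count tables: n_kw[k, w] = words of value w assigned to topic k;
# n_jk[j, k] = words in document j assigned to topic k; n_k = totals per topic.
n_kw = np.zeros((K, W)); n_jk = np.zeros((len(docs), K)); n_k = np.zeros(K)
z = [[int(rng.integers(K)) for _ in d] for d in docs]  # random initial topics
for j, d in enumerate(docs):
    for i, w in enumerate(d):
        k = z[j][i]; n_kw[k, w] += 1; n_jk[j, k] += 1; n_k[k] += 1

for _ in range(200):  # Gibbs sweeps
    for j, d in enumerate(docs):
        for i, w in enumerate(d):
            k = z[j][i]  # remove word i of document j from the counts
            n_kw[k, w] -= 1; n_jk[j, k] -= 1; n_k[k] -= 1
            # p(z_ji = k | rest) ∝ (n^(k)_(-ji,w) + β) / (n^(k)_(-ji,·) + Wβ)
            #                     * (n^(j)_(-ji,k) + α)
            p = (n_kw[:, w] + beta) / (n_k + W * beta) * (n_jk[j] + alpha)
            k = int(rng.choice(K, p=p / p.sum()))
            z[j][i] = k; n_kw[k, w] += 1; n_jk[j, k] += 1; n_k[k] += 1
```

After the sweeps, `n_kw` and `n_jk` give the (unnormalized) topic-word and document-topic distributions.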


Page 37: Spatial LDA

What's the issue with LDA?

• The spatial and temporal components of the visual words are not considered, so co-occurrence information is not utilized.
• Consider a scene of an animal against a grass background. Since the whole image is one document and the animal is only a small part of it, the animal's patches would most likely be classified as grass.


Page 43: Spatial LDA

How can we resolve it?

• Use a grid layout on each image, and treat each region of the grid as a document.
• But how would you handle a patch that overlaps two regions?
• We could use overlapping regions as documents.
• But since several overlapping documents could contain the same patch, how would you decide which document it belongs to?
• So we represent each document (region) by a point, and assign a patch to the document whose point it is closest to.

Page 44: Spatial LDA

Clustering using Spatial LDA

• Framework:
  • Besides the parameters used in LDA, spatial information is also captured.
  • A hidden variable d_i indicates the document to which word i is assigned.
  • For each document j, (g^d_j, x^d_j, y^d_j) are the image index, x coordinate, and y coordinate of the document.
  • For each word i, (g_i, x_i, y_i) are the image index, x coordinate, and y coordinate of the word.
• Generative algorithm:
  • For each topic k, a multinomial parameter φ_k is sampled from the Dirichlet prior: φ_k ~ Dir(β)
  • For each document j, a multinomial parameter π_j over the K topics is sampled from the Dirichlet prior: π_j ~ Dir(α)
  • For each word i, a random variable d_i is sampled from the prior p(d_i | η), indicating the document of word i.

Page 45: Spatial LDA

Clustering using Spatial LDA (contd)

• Generative algorithm (contd):
  • The image index and location c_i of word i are sampled from the distribution p(c_i | c^d_(d_i), σ); a Gaussian kernel is chosen.
  • For word i in document d_i, a topic label z_i is sampled from the discrete distribution: z_i ~ Discrete(π_(d_i))
  • The value w_i of word i is sampled from the discrete distribution of topic z_i: w_i ~ Discrete(φ_(z_i))
• z_i is then sampled through a Gibbs sampling procedure, where:
  • n^(k)_(-i,w) is the number of words in the corpus with value w assigned to topic k, excluding word i
  • n^(j)_(-i,k) is the number of words in document j assigned to topic k, excluding word i
• The conditional distribution of d_i is derived similarly, with the Gaussian kernel weighting documents by their distance from word i.
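The role of the Gaussian kernel p(c_i | c^d_(d_i), σ) can be shown with a short NumPy sketch: the closer a word's patch is to a document's point, the more probable that document becomes. The document points, σ, and the patch location below are made-up toy values.

```python
# Spatial part of SLDA: Gaussian kernel over document points (toy values).
import numpy as np

sigma = 4.0
doc_points = np.array([[8.0, 8.0], [24.0, 8.0], [8.0, 24.0]])  # (x^d_j, y^d_j)
word_xy = np.array([10.0, 9.0])                                # (x_i, y_i)

# p(c_i | c^d_j, sigma) ∝ exp(-||word location - doc point||^2 / (2 sigma^2))
sq_dist = ((doc_points - word_xy) ** 2).sum(axis=1)
kernel = np.exp(-sq_dist / (2 * sigma ** 2))
p_doc = kernel / kernel.sum()  # probability of each document for this word
```

The nearest document point dominates, which is exactly the "assign a patch to the closest point" intuition from the earlier slide, made soft and probabilistic.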

Page 46: Spatial LDA

Results

          cows   cars   faces  bicycles
LDA(D)    0.376  0.555  0.717  0.556
SLDA(D)   0.566  0.684  0.697  0.566
LDA(FA)   0.558  0.396  0.586  0.529
SLDA(FA)  0.033  0.244  0.371  0.422


Page 52: Spatial LDA

What the paper missed

• Comparisons with other standard clustering methods could have been included to highlight the efficiency of the algorithm.
• For the given experimental data, some intuition about the selection of the input parameters α, β, and η could have been provided.
• For moving images, the temporal aspect is ignored; in future work it could be added as a parameter and the algorithm updated.
• A few advancements on the paper have since been made:
  • James Philbin, Josef Sivic, and Andrew Zisserman. Geometric Latent Dirichlet Allocation on a Matching Graph for Large-scale Image Datasets. International Journal of Computer Vision, 95(2):138–153, Nov 2011.


Page 56: Spatial LDA

Libraries for LDA

• R - “lda” - http://cran.r-project.org/web/packages/lda/lda.pdf

• Python - lda v1.0.2 - https://pypi.python.org/pypi/lda

• Java - GibbsLDA - http://gibbslda.sourceforge.net/


Page 58: Spatial LDA

References

• Xiaogang Wang and Eric Grimson. Spatial Latent Dirichlet Allocation. Advances in Neural Information Processing Systems 20 (NIPS 2007).
• D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993–1022, 2003.
• Diane J. Hu. Latent Dirichlet Allocation for Text, Images, and Music.