Latent Dirichlet Allocation
David M. Blei, Andrew Y. Ng & Michael I. Jordan
Presented by Tilaye Alemu & Anand Ramkissoon


TRANSCRIPT

Page 1: Latent Dirichlet Allocation

Latent Dirichlet Allocation

David M. Blei, Andrew Y. Ng & Michael I. Jordan

presented by Tilaye Alemu & Anand Ramkissoon

Page 2: Latent Dirichlet Allocation

Motivation for LDA

In lay terms:
- document modelling
- text classification
- collaborative filtering
- ...
...in the context of Information Retrieval

The principal focus in this paper is on document classification within a corpus

Page 3: Latent Dirichlet Allocation

Structure of this talk

Part 1: Theory
- background
- (some) other approaches

Part 2: Experimental results
- some details of usage
- wider applications

Page 4: Latent Dirichlet Allocation

LDA: conceptual features

- generative, probabilistic model for collections of discrete data
- three-level hierarchical Bayesian model
- mixture models
- efficient approximate inference techniques: variational methods
- EM algorithm for empirical Bayes parameter estimation

Page 5: Latent Dirichlet Allocation

How to classify text documents

Word (term) frequency: tf-idf
- term-by-document matrix
- discriminative sets of words
- fixed-length lists of numbers
- little statistical structure

Dimensionality reduction techniques: Latent Semantic Indexing (LSI)
- singular value decomposition
- not generative
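For concreteness, the tf-idf scheme reduces each document to a fixed-length vector of weighted term counts. A minimal sketch using scikit-learn (the toy corpus and variable names are illustrative):

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = ["the cat sat on the mat", "dogs and cats living together"]  # toy corpus
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(docs)          # documents-by-terms tf-idf matrix
    print(vectorizer.get_feature_names_out())   # the vocabulary (terms)
    print(X.toarray())                          # one fixed-length vector per document

Note that scikit-learn returns a documents-by-terms matrix, i.e. the transpose of the term-by-document matrix above.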

Page 6: Latent Dirichlet Allocation

How to classify text documents (cont'd)

probabilistic LSI (pLSI):
- each word is generated by one topic
- each document is generated by a mixture of topics
- a document is represented as a list of mixing proportions for topics

Drawbacks:
- no generative model for these mixing proportions
- the number of parameters grows linearly with the corpus
- overfitting
- how to classify documents outside the training set?
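In pLSI the joint probability of a document index d and a word w_n takes the form

    p(d, w_n) = p(d) \sum_{z} p(w_n \mid z) \, p(z \mid d)

so the mixing proportions p(z | d) are parameters indexed by training documents rather than generated from a model, which is the source of the drawbacks listed above.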

Page 7: Latent Dirichlet Allocation

A major simplifying assumption

- A document is a “bag of words”
- A corpus is a “bag of documents”
- order is unimportant: exchangeability
- de Finetti's representation theorem: any collection of exchangeable random variables has a representation as a (generally infinite) mixture distribution
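Written out, de Finetti's theorem says the joint distribution of an exchangeable sequence factors once we condition on an underlying parameter, here written θ:

    p(w_1, \ldots, w_N) = \int p(\theta) \left( \prod_{n=1}^{N} p(w_n \mid \theta) \right) d\theta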

Page 8: Latent Dirichlet Allocation

A note about exchangeability

Exchangeability does not mean that the random variables are iid; they are iid only when conditioned on an underlying latent parameter of a probability distribution.

Conditionally on that parameter, the joint distribution is simple and factored (as in the integrand above).

Page 9: Latent Dirichlet Allocation

Notation

- word: the basic unit of discrete data, an item from a vocabulary indexed by {1, ..., V}; each word is represented as a unit-basis V-vector
- document: a sequence of N words, w = (w_1, ..., w_N)
- corpus: a collection of M documents, D = (w_1, ..., w_M)

Each document is considered a random mixture over latent topics.

Each topic is considered a distribution over words.
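(In the unit-basis convention, if w is the j-th word of the vocabulary then its components are w^j = 1 and w^u = 0 for u ≠ j.)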

Page 10: Latent Dirichlet Allocation

LDA assumes a generative process for each document in the corpus
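As given in the paper, for each document w in the corpus D:

1. Choose N ~ Poisson(ξ)
2. Choose θ ~ Dir(α)
3. For each of the N words w_n:
   (a) choose a topic z_n ~ Multinomial(θ)
   (b) choose a word w_n from p(w_n | z_n, β), a multinomial conditioned on the topic z_n

A minimal sketch of this sampler (β is assumed to be a k×V topic-word matrix; the function and variable names are illustrative):

    import numpy as np

    def generate_document(alpha, beta, xi, rng=np.random.default_rng()):
        """Sample one document from the LDA generative process."""
        N = rng.poisson(xi)                          # document length N ~ Poisson(xi)
        theta = rng.dirichlet(alpha)                 # topic mixture theta ~ Dir(alpha)
        words = []
        for _ in range(N):
            z = rng.choice(len(alpha), p=theta)      # topic z_n ~ Multinomial(theta)
            w = rng.choice(beta.shape[1], p=beta[z]) # word w_n ~ p(w | z_n, beta)
            words.append(w)
        return words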

Page 11: Latent Dirichlet Allocation

Probability density for the Dirichlet random variable
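The k-dimensional topic-mixture variable θ lives on the (k−1)-simplex, and its Dirichlet density is

    p(\theta \mid \alpha) = \frac{\Gamma\!\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \, \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1}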

Page 12: Latent Dirichlet Allocation

Joint distribution of a Topic mixture
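Given the parameters α and β, the joint distribution of a topic mixture θ, a set of N topic assignments z, and a set of N words w is

    p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta) \, p(w_n \mid z_n, \beta)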

Page 13: Latent Dirichlet Allocation

Marginal distribution of a document
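Integrating over θ and summing over z gives the marginal distribution of a single document:

    p(\mathbf{w} \mid \alpha, \beta) = \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta) \, p(w_n \mid z_n, \beta) \right) d\theta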

Page 14: Latent Dirichlet Allocation

Probability of a corpus
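Taking the product of the documents' marginal probabilities gives the probability of the corpus:

    p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d) \, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d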

Page 15: Latent Dirichlet Allocation

Marginalizing over z yields the word distribution that drives the generative process.
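Summing the topic assignment out gives the per-document word distribution:

    p(w \mid \theta, \beta) = \sum_{z} p(w \mid z, \beta) \, p(z \mid \theta)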

Page 16: Latent Dirichlet Allocation

A unigram model
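Under the unigram model, used here for comparison, the words of every document are drawn independently from a single multinomial:

    p(\mathbf{w}) = \prod_{n=1}^{N} p(w_n)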

Page 17: Latent Dirichlet Allocation

Probabilistic Latent Semantic Indexing (pLSI)

Page 18: Latent Dirichlet Allocation

Inference from LDA
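The key inferential problem is computing the posterior distribution of the hidden variables given a document,

    p(\theta, \mathbf{z} \mid \mathbf{w}, \alpha, \beta) = \frac{p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)}{p(\mathbf{w} \mid \alpha, \beta)}

which is intractable to compute exactly because θ and β are coupled in the normalizing constant.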

Page 19: Latent Dirichlet Allocation

Variational Inference

Page 20: Latent Dirichlet Allocation

A family of distributions on latent variables

The Dirichlet parameter γ and the multinomial parameters φ are the free variational parameters
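The factorized variational family is

    q(\theta, \mathbf{z} \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_n \mid \phi_n)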

Page 21: Latent Dirichlet Allocation

The update equations

Minimize the Kullback-Leibler divergence between the variational distribution q and the true posterior
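The resulting coordinate-ascent updates are

    \phi_{ni} \propto \beta_{i w_n} \exp\!\left( \Psi(\gamma_i) \right), \qquad \gamma_i = \alpha_i + \sum_{n=1}^{N} \phi_{ni}

where Ψ is the digamma function and each φ_n is normalized to sum to one over the topics.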

Page 22: Latent Dirichlet Allocation

Variational Inference Algorithm
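A minimal sketch of the per-document coordinate-ascent loop described above, assuming β is a k×V topic-word matrix, alpha a length-k vector, and doc_words a list of word indices (function and variable names are illustrative):

    import numpy as np
    from scipy.special import digamma

    def variational_inference(doc_words, alpha, beta, max_iter=100, tol=1e-6):
        """Coordinate-ascent updates of gamma and phi for one document."""
        alpha = np.asarray(alpha, dtype=float)
        doc_words = np.asarray(doc_words)
        k, N = len(alpha), len(doc_words)
        phi = np.full((N, k), 1.0 / k)                # initialise phi_ni = 1/k
        gamma = alpha + N / k                         # initialise gamma_i = alpha_i + N/k
        for _ in range(max_iter):
            # phi update: phi_ni proportional to beta_{i, w_n} * exp(digamma(gamma_i))
            phi = beta[:, doc_words].T * np.exp(digamma(gamma))
            phi /= phi.sum(axis=1, keepdims=True)
            # gamma update: gamma_i = alpha_i + sum_n phi_ni
            new_gamma = alpha + phi.sum(axis=0)
            if np.max(np.abs(new_gamma - gamma)) < tol:
                return new_gamma, phi
            gamma = new_gamma
        return gamma, phi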