latent dirichlet allocation


Latent Dirichlet Allocation David M Blei, Andrew Y Ng & Michael I Jordan presented by Tilaye Alemu & Anand Ramkissoon


TRANSCRIPT

Page 1: Latent Dirichlet Allocation

Latent Dirichlet Allocation

David M Blei, Andrew Y Ng & Michael I Jordan

presented by Tilaye Alemu & Anand Ramkissoon

Page 2: Latent Dirichlet Allocation

Motivation for LDA

In lay terms:
  document modelling
  text classification
  collaborative filtering
  ...
...in the context of Information Retrieval

The principal focus in this paper is on document classification within a corpus

Page 3: Latent Dirichlet Allocation

Structure of this talk

Part 1: Theory
  Background
  (some) other approaches

Part 2: Experimental results
  some details of usage
  wider applications

Page 4: Latent Dirichlet Allocation

LDA: conceptual features

Generative
Probabilistic
Collections of discrete data
3-level hierarchical Bayesian model
mixture models
efficient approximate inference techniques
variational methods
EM algorithm for empirical Bayes parameter estimation

Page 5: Latent Dirichlet Allocation

How to classify text documents

Word (term) frequency
  tf-idf
  term-by-document matrix
  discriminative sets of words
  fixed-length lists of numbers
  little statistical structure

Dimensionality reduction techniques
  Latent Semantic Indexing
  Singular value decomposition
  not generative
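
A minimal sketch of the tf-idf term-by-document matrix named above, assuming the common tf × log(M/df) weighting (the slide does not say which variant is meant). The function name tfidf_matrix and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """docs: list of token lists.
    Returns (vocab, matrix) where matrix[t][d] = tf(t, d) * log(M / df(t))."""
    vocab = sorted({tok for doc in docs for tok in doc})
    counts = [Counter(doc) for doc in docs]
    n_docs = len(docs)
    matrix = []
    for term in vocab:
        # df: number of documents that contain the term at least once
        df = sum(1 for c in counts if term in c)
        idf = math.log(n_docs / df)
        matrix.append([c[term] * idf for c in counts])
    return vocab, matrix

docs = [["topic", "model", "topic"], ["model", "word"]]
vocab, matrix = tfidf_matrix(docs)
```

Each document becomes one fixed-length column of numbers. A term that occurs in every document (here "model") gets weight zero, which is what makes the remaining terms discriminative.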

Page 6: Latent Dirichlet Allocation

How to classify text documents (cont'd)

probabilistic LSI (PLSI)
  each word generated by one topic
  each document generated by a mixture of topics
  a document is represented as a list of mixing proportions for topics

Problems with PLSI
  No generative model for these numbers
  Number of parameters grows linearly with the corpus
  Overfitting
  Unclear how to classify documents outside the training set

Page 7: Latent Dirichlet Allocation

A major simplifying assumption

A document is a “bag of words”
A corpus is a “bag of documents”
  order is unimportant
  exchangeability
  de Finetti representation theorem: any collection of exchangeable random variables has a representation as a (generally infinite) mixture distribution

Page 8: Latent Dirichlet Allocation

A note about exchangeability

Exchangeability does not mean that the random variables are iid
They are iid only when conditioned on an underlying latent parameter of a probability distribution

Conditionally, the joint distribution is simple and factored
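
As a worked form of this statement, the Blei, Ng & Jordan paper applies the de Finetti representation to words w and topics z of one document as follows. The symbols θ (the latent parameter) and N (the number of words) follow the paper's notation and are an assumption with respect to this slide.

```latex
p(\mathbf{w}, \mathbf{z}) = \int p(\theta)
  \left( \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n) \right) d\theta
```

Inside the integral, for fixed θ, the joint is a plain product over words; the dependence between words arises only from integrating θ out.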

Page 9: Latent Dirichlet Allocation

Notation

word: unit of discrete data, an item from a vocabulary indexed {1,...,V}
  each word is a unit-basis V-vector

document: a sequence of N words, w = (w_1, ..., w_N)

corpus: a collection of M documents, D = (w_1, ..., w_M)

Each document is considered a random mixture over latent topics

Each topic is considered a distribution over words

Page 10: Latent Dirichlet Allocation

LDA assumes a generative process for each document in the corpus
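
The process described in the Blei, Ng & Jordan paper is: choose N ~ Poisson(ξ); choose θ ~ Dir(α); then for each of the N words choose a topic z_n ~ Multinomial(θ) and a word w_n from p(w_n | z_n, β). A minimal standard-library sketch of those steps follows. The helper names, the toy parameters, and the choice of Knuth's method for Poisson sampling are assumptions made for illustration, not part of the talk.

```python
import math
import random

def sample_dirichlet(alpha, rng):
    """Draw theta ~ Dir(alpha) by normalising independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def generate_document(alpha, beta, xi, rng):
    """alpha: k Dirichlet parameters; beta[i][v] = p(word v | topic i);
    xi: Poisson mean of the document length.
    Returns (theta, [(z_n, w_n), ...])."""
    n_words = sample_poisson(xi, rng)          # N ~ Poisson(xi)
    theta = sample_dirichlet(alpha, rng)       # theta ~ Dir(alpha)
    topics = range(len(alpha))
    vocab = range(len(beta[0]))
    doc = []
    for _ in range(n_words):
        z = rng.choices(topics, weights=theta)[0]    # z_n ~ Multinomial(theta)
        w = rng.choices(vocab, weights=beta[z])[0]   # w_n ~ p(w | z_n, beta)
        doc.append((z, w))
    return theta, doc

rng = random.Random(0)
beta = [[0.9, 0.1, 0.0],   # hypothetical topic 0 over a 3-word vocabulary
        [0.0, 0.1, 0.9]]   # hypothetical topic 1
theta, doc = generate_document([0.5, 0.5], beta, xi=8, rng=rng)
```

Only the word ids in doc would be observed in practice; theta and the topic assignments are the latent variables that inference has to recover.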

Page 11: Latent Dirichlet Allocation

Probability density for the Dirichlet random variable
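
For reference, the density of a k-dimensional Dirichlet variable θ on the (k-1)-simplex, with parameter α (every α_i > 0), as written in the Blei, Ng & Jordan paper. The index k for the number of topics follows the paper's notation and is an assumption with respect to this slide.

```latex
p(\theta \mid \alpha) =
  \frac{\Gamma\!\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)}
  \; \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1}
```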

Page 12: Latent Dirichlet Allocation

Joint distribution of a Topic mixture
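
In the paper's notation (assumed here: θ the topic mixture, z the N topic assignments, w the N words, α and β the model parameters), the joint distribution factorises as:

```latex
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) =
  p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
```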

Page 13: Latent Dirichlet Allocation

Marginal distribution of a document
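
Integrating over the topic mixture θ and summing over each topic assignment z_n gives the marginal of a single document. This is the form given in the paper; the symbols are the paper's and are assumed here.

```latex
p(\mathbf{w} \mid \alpha, \beta) =
  \int p(\theta \mid \alpha)
  \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \right) d\theta
```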

Page 14: Latent Dirichlet Allocation

Probability of a corpus
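
Taking the product of the per-document marginals over the M documents gives the corpus probability as stated in the paper. The document index d and the per-document length N_d are the paper's notation, assumed here.

```latex
p(D \mid \alpha, \beta) =
  \prod_{d=1}^{M} \int p(\theta_d \mid \alpha)
  \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d
```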

Page 15: Latent Dirichlet Allocation

Marginalize over z

The word distribution

The generative process
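
A sketch of what these three headings correspond to in the paper, under the assumption that the slide follows the paper's section on exchangeability: summing out z yields a word distribution given θ, and the document then arises as a continuous mixture over θ.

```latex
p(w \mid \theta, \beta) = \sum_{z} p(w \mid z, \beta)\, p(z \mid \theta)

p(\mathbf{w} \mid \alpha, \beta) =
  \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} p(w_n \mid \theta, \beta) \right) d\theta
```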

Page 16: Latent Dirichlet Allocation

a Unigram Model
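
For comparison with LDA, the unigram model in the paper draws every word of every document independently from one multinomial, so a document of N words has probability:

```latex
p(\mathbf{w}) = \prod_{n=1}^{N} p(w_n)
```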

Page 17: Latent Dirichlet Allocation

probabilistic Latent Semantic Indexing
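
The pLSI model, as summarised in the paper, treats a document label d and a word w_n as conditionally independent given an unobserved topic z. Here d is an index into the training set, which is an assumption of the model itself and the reason it does not extend to unseen documents.

```latex
p(d, w_n) = p(d) \sum_{z} p(w_n \mid z)\, p(z \mid d)
```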

Page 18: Latent Dirichlet Allocation

Inference from LDA
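
The inferential problem stated in the paper is to compute the posterior of the hidden variables given a document. In the paper's notation (assumed here):

```latex
p(\theta, \mathbf{z} \mid \mathbf{w}, \alpha, \beta) =
  \frac{p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)}{p(\mathbf{w} \mid \alpha, \beta)}
```

The denominator is the document marginal, which the paper notes is intractable to compute exactly because θ and β are coupled in the sum over latent topics.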

Page 19: Latent Dirichlet Allocation

Variational Inference

Page 20: Latent Dirichlet Allocation

A family of distributions on latent variables

The Dirichlet parameter γ and the multinomial parameters φ are the free variational parameters
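
In the paper this family is the fully factorised distribution below, in which the coupling between θ and the topic assignments z is dropped. N, the number of words in the document, is the paper's notation and is assumed here.

```latex
q(\theta, \mathbf{z} \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_n \mid \phi_n)
```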

Page 21: Latent Dirichlet Allocation

The update equations

Minimize the Kullback-Leibler divergence between the variational distribution and the true posterior
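
For reference, setting the derivatives of that KL divergence to zero gives the following pair of coupled updates in the paper. Ψ is the digamma function, i indexes topics and n indexes word positions; this notation is the paper's and is assumed here.

```latex
\phi_{ni} \propto \beta_{i w_n} \exp\{ \mathrm{E}_q[\log \theta_i \mid \gamma] \}

\gamma_i = \alpha_i + \sum_{n=1}^{N} \phi_{ni}

\mathrm{E}_q[\log \theta_i \mid \gamma] = \Psi(\gamma_i) - \Psi\!\left(\sum_{j=1}^{k} \gamma_j\right)
```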

Page 22: Latent Dirichlet Allocation

Variational Inference Algorithm
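
A minimal standard-library sketch of the per-document procedure in the paper: initialise φ_ni = 1/k and γ_i = α_i + N/k, then alternate the φ and γ updates until γ stops changing. The function names, the hand-rolled digamma approximation, the stopping rule on γ, and the toy parameters are all assumptions made for illustration; the sketch also assumes every observed word has non-zero probability under at least one topic.

```python
import math

def digamma(x):
    """Digamma for x > 0: recurrence psi(x) = psi(x + 1) - 1/x, then an
    asymptotic series once x >= 6."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result

def variational_inference(doc, alpha, beta, max_iter=100, tol=1e-6):
    """doc: list of word ids; alpha: k Dirichlet parameters;
    beta[i][v] = p(word v | topic i). Returns (gamma, phi)."""
    k, n = len(alpha), len(doc)
    phi = [[1.0 / k] * k for _ in range(n)]
    gamma = [alpha[i] + n / k for i in range(k)]
    for _ in range(max_iter):
        # The -digamma(sum(gamma)) term is the same for every topic, so it
        # cancels when each phi row is normalised.
        weight = [math.exp(digamma(g)) for g in gamma]
        for pos, word in enumerate(doc):
            row = [beta[i][word] * weight[i] for i in range(k)]
            total = sum(row)
            phi[pos] = [r / total for r in row]
        new_gamma = [alpha[i] + sum(phi[pos][i] for pos in range(n))
                     for i in range(k)]
        change = max(abs(a - b) for a, b in zip(new_gamma, gamma))
        gamma = new_gamma
        if change < tol:
            break
    return gamma, phi

beta = [[0.8, 0.1, 0.1],   # hypothetical topic 0
        [0.1, 0.1, 0.8]]   # hypothetical topic 1
gamma, phi = variational_inference([0, 0, 0, 2], [1.0, 1.0], beta)
```

Because each φ row sums to one, the entries of γ always sum to the sum of α plus the document length, which is a convenient sanity check on an implementation.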
