![Page 1: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/1.jpg)
Department of Computer Science
CSCI 5622: Machine Learning
Chenhao Tan
Lecture 20: Topic modeling and variational inference
Slides adapted from Jordan Boyd-Graber, Chris Ketelsen
1
![Page 2: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/2.jpg)
Administrivia
• Poster printing (stay tuned!)
• HW 5 (final homework) is due next Friday!
• Midpoint feedback
2
![Page 3: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/3.jpg)
Learning Objectives
• Learn about latent Dirichlet allocation
• Understand the intuition behind variational inference
3
![Page 4: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/4.jpg)
Topic models
• Discrete count data
4
![Page 5: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/5.jpg)
Topic models
5
• Suppose you have a huge number of documents
• Want to know what's going on
• Can't read them all (e.g. every New York Times article from the 90's)
• Topic models offer a way to get a corpus-level view of major themes
• Unsupervised
![Page 6: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/6.jpg)
Why should you care?
• Neat way to explore/understand corpus collections
  • E-discovery
  • Social media
  • Scientific data
• NLP Applications
  • Word sense disambiguation
  • Discourse segmentation
• Psychology: word meaning, polysemy
• A general way to model count data and a general inference algorithm
6
![Page 7: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/7.jpg)
Conceptual approach
• Input: a text corpus and number of topics K
• Output:
  • K topics, each topic is a list of words
  • Topic assignment for each document
7
Corpus (example headlines):
• Forget the Bootleg, Just Download the Movie Legally
• Multiplex Heralded As Linchpin To Growth
• The Shape of Cinema, Transformed At the Click of a Mouse
• A Peaceful Crew Puts Muppets Where Its Mouth Is
• Stock Trades: A Better Deal For Investors Isn't Simple
• The three big Internet portals begin to distinguish among themselves as shopping malls
• Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens
![Page 8: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/8.jpg)
Conceptual approach
8
• K topics, each topic is a list of words
TOPIC 1: computer, technology, system, service, site, phone, internet, machine
TOPIC 2: play, film, movie, theater, production, star, director, stage
TOPIC 3: sell, sale, store, product, business, advertising, market, consumer
![Page 9: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/9.jpg)
Conceptual approach
9
• Topic assignment for each document
Headlines:
• Forget the Bootleg, Just Download the Movie Legally
• Multiplex Heralded As Linchpin To Growth
• The Shape of Cinema, Transformed At the Click of a Mouse
• A Peaceful Crew Puts Muppets Where Its Mouth Is
• Stock Trades: A Better Deal For Investors Isn't Simple
• Internet portals begin to distinguish among themselves as shopping malls
• Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens
Topic labels: TOPIC 1 "TECHNOLOGY", TOPIC 2 "BUSINESS", TOPIC 3 "ENTERTAINMENT"
![Page 10: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/10.jpg)
Topics from Science
10
![Page 11: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/11.jpg)
Topic models
• Discrete count data
  • Gaussian distributions are not appropriate
11
![Page 12: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/12.jpg)
Generative model: Latent Dirichlet Allocation
• Generate a document, or a bag of words
• Blei, Ng, Jordan. Latent Dirichlet Allocation. JMLR, 2003.
12
![Page 13: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/13.jpg)
Generative model: Latent Dirichlet Allocation
• Generate a document, or a bag of words
• Multinomial distribution
  • Distribution over discrete outcomes
  • Represented by non-negative vector that sums to one
  • Picture representation
[Figure: probability simplex with corners (1,0,0), (0,1,0), (0,0,1) and example points (1/2,1/2,0), (1/3,1/3,1/3), (1/4,1/4,1/2)]
13
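To make the "non-negative vector that sums to one" idea concrete, here is a minimal sketch that checks whether a vector lies on the simplex and draws a bag of words from it. The three-word vocabulary and the helper names `is_distribution` and `draw_bag_of_words` are assumptions for illustration; they do not come from the slides.

```python
import math
import random
from collections import Counter

def is_distribution(p, tol=1e-9):
    """True if p is non-negative and sums to one (a point on the simplex)."""
    return all(x >= 0 for x in p) and math.isclose(sum(p), 1.0, abs_tol=tol)

def draw_bag_of_words(p, vocab, n_words, rng):
    """Draw n_words independent outcomes from p and return their counts."""
    words = rng.choices(vocab, weights=p, k=n_words)
    return Counter(words)

vocab = ["cat", "dog", "pig"]      # hypothetical three-word vocabulary
p = [1/4, 1/4, 1/2]                # one of the simplex points from the figure
rng = random.Random(0)             # fixed seed so the draw is repeatable
bag = draw_bag_of_words(p, vocab, 1000, rng)
```

Because word order is discarded, the `Counter` is exactly the "bag of words" view of the document.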
![Page 14: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/14.jpg)
Generative model: Latent Dirichlet Allocation
• Generate a document, or a bag of words
• Multinomial distribution
  • Distribution over discrete outcomes
  • Represented by non-negative vector that sums to one
  • Picture representation
  • Come from a Dirichlet distribution
[Figure: the same probability simplex as on the previous slide]
14
![Page 15: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/15.jpg)
Generative story
15
TOPIC 1: computer, technology, system, service, site, phone, internet, machine
TOPIC 2: play, film, movie, theater, production, star, director, stage
TOPIC 3: sell, sale, store, product, business, advertising, market, consumer
![Page 16: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/16.jpg)
Generative story
16
Headlines:
• Forget the Bootleg, Just Download the Movie Legally
• Multiplex Heralded As Linchpin To Growth
• The Shape of Cinema, Transformed At the Click of a Mouse
• A Peaceful Crew Puts Muppets Where Its Mouth Is
• Stock Trades: A Better Deal For Investors Isn't Simple
• The three big Internet portals begin to distinguish among themselves as shopping malls
• Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens
Topics: TOPIC 1, TOPIC 2, TOPIC 3
![Page 17: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/17.jpg)
Generative story
17
Hollywood studios are preparing to let people download and buy electronic copies of movies over the Internet, much as record labels now sell songs for 99 cents through Apple Computer's iTunes music store and other online services ...
• computer, technology, system, service, site, phone, internet, machine
• play, film, movie, theater, production, star, director, stage
• sell, sale, store, product, business, advertising, market, consumer
![Page 18: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/18.jpg)
![Page 19: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/19.jpg)
![Page 20: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/20.jpg)
![Page 21: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/21.jpg)
Missing component: how to generate a multinomial distribution
21
![Page 22: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/22.jpg)
Missing component: how to generate a multinomial distribution
22
What are Topic Models?
Dirichlet Distribution
Machine Learning: Jordan Boyd-Graber | Boulder Topic Models | 12 of 26
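One standard way to fill in the "missing component" is to draw the multinomial parameter from a Dirichlet distribution by normalizing independent Gamma draws. The sketch below uses only the Python standard library; the function name `sample_dirichlet` and the example parameter values are assumptions for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dir(alpha) by normalizing Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)
flat = sample_dirichlet([1.0, 1.0, 1.0], rng)     # alpha = 1: uniform over the simplex
sparse = sample_dirichlet([0.1, 0.1, 0.1], rng)   # alpha < 1: mass tends toward the corners
dense = sample_dirichlet([50.0, 50.0, 50.0], rng) # large alpha: close to (1/3, 1/3, 1/3)
```

Each returned vector is a valid multinomial parameter, so it can be used directly as a topic or a document's topic mixture.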
![Page 23: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/23.jpg)
Missing component: how to generate a multinomial distribution
23
![Page 24: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/24.jpg)
Conjugacy of Dirichlet and Multinomial
24
• If $\phi \sim \mathrm{Dir}(\alpha)$, $w \sim \mathrm{Mult}(\phi)$, and $n_k = |\{w_i : w_i = k\}|$, then

$$
\begin{aligned}
p(\phi \mid \alpha, w) &\propto p(w \mid \phi)\, p(\phi \mid \alpha) && (1) \\
&\propto \prod_k \phi_k^{n_k} \prod_k \phi_k^{\alpha_k - 1} && (2) \\
&\propto \prod_k \phi_k^{\alpha_k + n_k - 1} && (3)
\end{aligned}
$$

• Conjugacy: this posterior has the same form as the prior
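As a small worked instance of conjugacy, the posterior Dirichlet parameters are just the prior parameters plus the observed counts. The vocabulary, prior, and function name `dirichlet_posterior` below are assumptions chosen for illustration.

```python
from collections import Counter

def dirichlet_posterior(alpha, words, vocab):
    """Return the posterior Dirichlet parameters alpha_k + n_k after observing words."""
    counts = Counter(words)
    return [a + counts[v] for a, v in zip(alpha, vocab)]

vocab = ["cat", "dog", "pig"]           # hypothetical vocabulary
alpha = [1.0, 1.0, 1.0]                 # prior parameters
words = ["dog", "cat", "cat", "pig"]    # observed draws from the multinomial
posterior = dirichlet_posterior(alpha, words, vocab)   # [3.0, 2.0, 2.0]
```

No integration is needed: updating the belief about the multinomial is a matter of adding counts.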
![Page 25: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/25.jpg)
![Page 26: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/26.jpg)
Making the generative story formal
26
Generative Model Approach
[Figure: plate diagram of LDA with nodes α, θ_d, z_n, w_n, β_k, λ and plates N (word positions), M (documents), K (topics)]
• For each topic $k \in \{1, \dots, K\}$, draw a multinomial distribution $\beta_k$ from a Dirichlet distribution with parameter $\lambda$
• For each document $d \in \{1, \dots, M\}$, draw a multinomial distribution $\theta_d$ from a Dirichlet distribution with parameter $\alpha$
• For each word position $n \in \{1, \dots, N\}$, select a hidden topic $z_n$ from the multinomial distribution parameterized by $\theta_d$
• Choose the observed word $w_n$ from the distribution $\beta_{z_n}$
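The generative story can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the Dirichlet parameters are taken to be symmetric scalars (`alpha`, `lam`), every document has the same length `N`, and the names `sample_dirichlet` and `generate_corpus` are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw a probability vector from Dir(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(K, M, N, vocab, alpha, lam, rng):
    """Sample topics, then per-document topic mixtures, then (topic, word) pairs."""
    V = len(vocab)
    # One word distribution beta_k per topic, drawn from Dir(lam, ..., lam).
    beta = [sample_dirichlet([lam] * V, rng) for _ in range(K)]
    docs = []
    for _ in range(M):
        # One topic mixture theta_d per document, drawn from Dir(alpha, ..., alpha).
        theta = sample_dirichlet([alpha] * K, rng)
        doc = []
        for _ in range(N):
            z = rng.choices(range(K), weights=theta, k=1)[0]   # hidden topic z_n
            w = rng.choices(vocab, weights=beta[z], k=1)[0]    # observed word w_n
            doc.append((z, w))
        docs.append(doc)
    return beta, docs

vocab = ["cat", "dog", "hamburger", "iron", "pig"]
beta, docs = generate_corpus(K=3, M=4, N=6, vocab=vocab, alpha=0.5, lam=0.5,
                             rng=random.Random(0))
```

In real data only the words are observed; the topic indices returned here alongside each word are exactly the latent variables that inference has to recover.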
![Page 27: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/27.jpg)
![Page 28: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/28.jpg)
![Page 29: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/29.jpg)
![Page 30: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/30.jpg)
Topic models: What’s important
• Topic models (latent variables)
  • Topics to word types: multinomial distribution
  • Documents to topics: multinomial distribution
• Modeling & Algorithm
  • Model: story of how your data came to be
  • Latent variables: missing pieces of your story
  • Statistical inference: filling in those missing pieces
• We use latent Dirichlet allocation (LDA), a fully Bayesian version of pLSI, which is itself a probabilistic version of LSA
30
![Page 31: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/31.jpg)
31
Which variables are hidden?
![Page 32: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/32.jpg)
32
Size of Variable
![Page 33: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/33.jpg)
Joint distribution
33
![Page 34: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/34.jpg)
Joint distribution
34
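The slide body is not preserved in this transcript. As a sketch, the joint distribution implied by the generative story (topics $\beta_k$ with Dirichlet parameter $\lambda$, document mixtures $\theta_d$ with Dirichlet parameter $\alpha$, topic assignments $z_{dn}$, words $w_{dn}$) factorizes as follows; the double subscript $dn$ is notation assumed here to index word position $n$ in document $d$.

$$
p(\beta, \theta, z, w \mid \alpha, \lambda)
= \prod_{k=1}^{K} p(\beta_k \mid \lambda)
\prod_{d=1}^{M} \left[ p(\theta_d \mid \alpha)
\prod_{n=1}^{N} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid \beta_{z_{dn}}) \right]
$$

Each factor corresponds to one step of the generative story, so the joint is easy to evaluate even though the posterior over the hidden variables is not.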
![Page 35: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/35.jpg)
Posterior distribution
35
![Page 36: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/36.jpg)
Variational inference
36
![Page 37: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/37.jpg)
KL divergence and evidence lower bound
37
![Page 38: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/38.jpg)
KL divergence and evidence lower bound
38
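The slide body is not preserved in this transcript. A standard way to relate the two quantities in the title, writing $z$ for all hidden variables, $w$ for the observed words, and $q$ for any distribution over $z$ (generic notation assumed here), is:

$$
\begin{aligned}
\mathrm{KL}\big(q(z) \,\|\, p(z \mid w)\big)
&= \mathbb{E}_q[\log q(z)] - \mathbb{E}_q[\log p(z \mid w)] \\
&= \mathbb{E}_q[\log q(z)] - \mathbb{E}_q[\log p(w, z)] + \log p(w)
\end{aligned}
$$

$$
\log p(w) = \underbrace{\mathbb{E}_q[\log p(w, z)] - \mathbb{E}_q[\log q(z)]}_{\mathrm{ELBO}(q)} + \mathrm{KL}\big(q(z) \,\|\, p(z \mid w)\big)
$$

Since the KL divergence is non-negative and $\log p(w)$ does not depend on $q$, maximizing the ELBO over $q$ is the same as minimizing the KL divergence to the true posterior.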
![Page 39: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/39.jpg)
A different way to get ELBO
• Jensen’s inequality
39
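The slide body is not preserved in this transcript. The usual Jensen's-inequality route to the same bound, with $z$ the hidden variables, $w$ the observations, and $q$ any distribution over $z$ (generic notation assumed here), is:

$$
\log p(w) = \log \int p(w, z)\, dz
= \log \mathbb{E}_q\!\left[ \frac{p(w, z)}{q(z)} \right]
\ge \mathbb{E}_q\!\left[ \log \frac{p(w, z)}{q(z)} \right]
= \mathbb{E}_q[\log p(w, z)] - \mathbb{E}_q[\log q(z)]
$$

The inequality holds because $\log$ is concave; the right-hand side is the ELBO.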
![Page 40: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/40.jpg)
Evidence Lower Bound
40
![Page 41: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/41.jpg)
Evidence Lower Bound
41
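A tiny numeric check can make the bound tangible. The sketch below uses a made-up discrete model with three latent states; the joint probabilities are hypothetical numbers chosen for illustration, and `elbo` is an assumed helper name.

```python
import math

def elbo(joint, q):
    """ELBO = sum_z q(z) * (log p(x, z) - log q(z)) for a discrete latent z."""
    return sum(qz * (math.log(pz) - math.log(qz))
               for pz, qz in zip(joint, q) if qz > 0)

joint = [0.10, 0.30, 0.05]               # hypothetical p(x, z) for z = 0, 1, 2 at one fixed x
evidence = sum(joint)                    # p(x), obtained by summing out z
log_evidence = math.log(evidence)
posterior = [p / evidence for p in joint]

loose = elbo(joint, [1/3, 1/3, 1/3])     # an arbitrary q gives a strictly lower value
tight = elbo(joint, posterior)           # q equal to the posterior closes the gap
```

Here `loose` is below `log_evidence`, and `tight` matches it up to floating-point error, which is the sense in which the best `q` is the one closest to the posterior.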
![Page 42: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/42.jpg)
Variational inference
• Propose variational distribution q
• Find ELBO (evidence lower bound) using q
• Set derivatives to 0 and update variables
42
![Page 43: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/43.jpg)
Variational distribution for LDA
43
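The slide body is not preserved in this transcript. For reference, the per-document mean-field family used in the LDA paper cited earlier (Blei, Ng, Jordan, 2003) has the following form, where $\gamma$ and $\phi_n$ are free variational parameters:

$$
q(\theta, z \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_n \mid \phi_n)
$$

Here $q(\theta \mid \gamma)$ is a Dirichlet distribution and each $q(z_n \mid \phi_n)$ is a multinomial over the $K$ topics, so every word position gets its own topic distribution and the coupling between $\theta$ and $z$ is dropped.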
![Page 44: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/44.jpg)
Overall Algorithm
44
![Page 45: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/45.jpg)
Updates to Maximize ELBO
45
![Page 46: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/46.jpg)
Homework 5
• The original algorithm also updates alphas
  • Not required for the homework
46
![Page 47: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/47.jpg)
Evaluation
Example
• Three topics

$$
\beta =
\begin{bmatrix}
.26 & .185 & .185 & .185 & .185 \\
.185 & .185 & .26 & .185 & .185 \\
.185 & .185 & .185 & .26 & .185
\end{bmatrix}
\quad \text{(columns: cat, dog, hamburger, iron, pig)} \qquad (4)
$$

• Assume uniform $\gamma$: (2.0, 2.0, 2.0)
• Compute update for $\phi$

$$
\phi_{ni} \propto \beta_{iv} \exp\left( \Psi(\gamma_i) - \Psi\Big( \sum_j \gamma_j \Big) \right) \qquad (5)
$$

• For the first word (dog) in the document: dog cat cat pig
47
![Page 48: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/48.jpg)
48
Update $\phi$ for dog

$$
\beta =
\begin{bmatrix}
.26 & .185 & .185 & .185 & .185 \\
.185 & .185 & .26 & .185 & .185 \\
.185 & .185 & .185 & .26 & .185
\end{bmatrix}
\quad \text{(columns: cat, dog, hamburger, iron, pig)}
$$

$$
\phi_{ni} \propto \beta_{iv} \exp\left( \Psi(\gamma_i) - \Psi\Big( \sum_j \gamma_j \Big) \right)
$$

• $\gamma = (2.000, 2.000, 2.000)$
• $\phi(0) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• $\phi(1) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• $\phi(2) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• After normalization: {0.333, 0.333, 0.333}
![Page 49: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/49.jpg)
49
Update $\phi$ for pig

$$
\beta =
\begin{bmatrix}
.26 & .185 & .185 & .185 & .185 \\
.185 & .185 & .26 & .185 & .185 \\
.185 & .185 & .185 & .26 & .185
\end{bmatrix}
\quad \text{(columns: cat, dog, hamburger, iron, pig)}
$$

$$
\phi_{ni} \propto \beta_{iv} \exp\left( \Psi(\gamma_i) - \Psi\Big( \sum_j \gamma_j \Big) \right)
$$

• $\gamma = (2.000, 2.000, 2.000)$
• $\phi(0) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• $\phi(1) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• $\phi(2) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• After normalization: {0.333, 0.333, 0.333}
![Page 50: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/50.jpg)
Update $\phi$ for cat

$$
\beta =
\begin{bmatrix}
.26 & .185 & .185 & .185 & .185 \\
.185 & .185 & .26 & .185 & .185 \\
.185 & .185 & .185 & .26 & .185
\end{bmatrix}
\quad \text{(columns: cat, dog, hamburger, iron, pig)}
$$

$$
\phi_{ni} \propto \beta_{iv} \exp\left( \Psi(\gamma_i) - \Psi\Big( \sum_j \gamma_j \Big) \right)
$$

• $\gamma = (2.000, 2.000, 2.000)$
• $\phi(0) \propto 0.260 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.072$
• $\phi(1) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• $\phi(2) \propto 0.185 \times \exp(\Psi(2.000) - \Psi(2.000 + 2.000 + 2.000)) = 0.051$
• After normalization: {0.413, 0.294, 0.294}
50
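The per-word updates worked through on these slides can be reproduced with a short script. This is a sketch using only the standard library: since Python's `math` module has no digamma function, `digamma` below is a hand-written approximation (recurrence plus an asymptotic series), and `update_phi` is an assumed helper name.

```python
import math

def digamma(x):
    """Approximate the digamma function Psi(x) for x > 0."""
    result = 0.0
    while x < 6.0:                       # shift upward with Psi(x) = Psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)                 # asymptotic series, accurate for x >= 6
    return result + math.log(x) - 0.5 / x - inv2 * (1/12 - inv2 * (1/120 - inv2 / 252))

def update_phi(word, beta, gamma, vocab):
    """Return (unnormalized, normalized) topic weights for one word position."""
    v = vocab.index(word)
    psi_total = digamma(sum(gamma))
    unnorm = [beta[i][v] * math.exp(digamma(g) - psi_total)
              for i, g in enumerate(gamma)]
    total = sum(unnorm)
    return unnorm, [u / total for u in unnorm]

vocab = ["cat", "dog", "hamburger", "iron", "pig"]
beta = [[.26, .185, .185, .185, .185],
        [.185, .185, .26, .185, .185],
        [.185, .185, .185, .26, .185]]
gamma = [2.0, 2.0, 2.0]

unnorm_dog, phi_dog = update_phi("dog", beta, gamma, vocab)
unnorm_cat, phi_cat = update_phi("cat", beta, gamma, vocab)
unnorm_pig, phi_pig = update_phi("pig", beta, gamma, vocab)
```

With uniform $\gamma$ the exponential factor is the same for every topic, so after normalization only the $\beta$ column of the word matters; that is why "dog" and "pig" come out uniform while "cat" favors topic 0.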
![Page 51: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/51.jpg)
Update $\gamma$
• Document: dog cat cat pig
• Update equation

$$
\gamma_i = \alpha_i + \sum_n \phi_{ni} \qquad (6)
$$

• Assume $\alpha = (.1, .1, .1)$

| | topic 0 | topic 1 | topic 2 |
|---|---|---|---|
| dog | .333 | .333 | .333 |
| cat (×2) | .413 | .294 | .294 |
| pig | .333 | .333 | .333 |
| $\alpha$ | 0.1 | 0.1 | 0.1 |
| sum | 1.592 | 1.354 | 1.354 |

• Note: do not normalize!
51
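The column sums in the table follow directly from the update equation. Below is a minimal sketch that takes the rounded per-word values from the slide as given; `update_gamma` is an assumed helper name.

```python
def update_gamma(alpha, phis):
    """gamma_i = alpha_i + sum over word positions n of phi[n][i]; no normalization."""
    return [a + sum(phi[i] for phi in phis) for i, a in enumerate(alpha)]

alpha = [0.1, 0.1, 0.1]
phis = [
    [0.333, 0.333, 0.333],   # dog
    [0.413, 0.294, 0.294],   # cat
    [0.413, 0.294, 0.294],   # cat (the word appears twice in the document)
    [0.333, 0.333, 0.333],   # pig
]
gamma = update_gamma(alpha, phis)
rounded = [round(g, 3) for g in gamma]   # [1.592, 1.354, 1.354]
```

The result is deliberately left unnormalized because $\gamma$ is a Dirichlet parameter vector, not a probability vector.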
![Page 52: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/52.jpg)
Update $\beta$
• Count up all of the $\phi$ across all documents
• For each topic, divide by total
• Corresponds to maximum likelihood of expected counts
• Unlike Gibbs sampling, no Dirichlet prior
52
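The counting step can be sketched as follows, reusing the one-document example from the earlier slides. The function name `update_beta` and the data layout (one list of per-position topic weights per document) are assumptions for illustration.

```python
def update_beta(docs, phis, K, vocab):
    """Accumulate expected topic-word counts from phi, then normalize each topic row."""
    index = {w: v for v, w in enumerate(vocab)}
    counts = [[0.0] * len(vocab) for _ in range(K)]
    for doc, doc_phi in zip(docs, phis):
        for word, phi_n in zip(doc, doc_phi):
            for i in range(K):
                counts[i][index[word]] += phi_n[i]   # soft count of this word for topic i
    # Assumes every topic received some mass; otherwise the division would fail.
    return [[c / sum(row) for c in row] for row in counts]

vocab = ["cat", "dog", "hamburger", "iron", "pig"]
docs = [["dog", "cat", "cat", "pig"]]
phis = [[[0.333, 0.333, 0.333],
         [0.413, 0.294, 0.294],
         [0.413, 0.294, 0.294],
         [0.333, 0.333, 0.333]]]
beta = update_beta(docs, phis, K=3, vocab=vocab)
```

Because there is no Dirichlet prior in this update, words that never occur in the corpus ("hamburger" and "iron" here) end up with probability exactly zero in every topic.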
![Page 53: Department of Computer Science CSCI 5622: Machine Learning ... · •A general way to model count data and a general inference algorithm 6. Conceptual approach •Input: a text corpus](https://reader034.vdocuments.net/reader034/viewer/2022042418/5f35139c3d03222caf69687c/html5/thumbnails/53.jpg)
Recap
• Topic models: a neat way to model discrete count data
• Variational inference replaces intractable posterior inference with an optimization problem: maximizing the ELBO
53