Semi-Supervised Learning
William Cohen
Outline
• The general idea and an example (NELL)
• Some types of SSL
  – Margin-based: transductive SVM
    • Logistic regression with entropic regularization
  – Generative: seeded k-means
  – Nearest-neighbor like: graph-based SSL
INTRO TO SEMI-SUPERVISED LEARNING (SSL)
Semi-supervised learning
• Given:
  – A pool of labeled examples L
  – A (usually larger) pool of unlabeled examples U
• Option 1 for using L and U:
  – Ignore U and use supervised learning on L
• Option 2:
  – Ignore labels in L+U and use k-means, etc. to find clusters; then label each cluster using L
• Question:
  – Can you use both L and U to do better?
SSL is Somewhere Between Clustering and Supervised Learning
What is a natural grouping among these objects?
slides: Bhavana Dalvi
clustering is unconstrained and may not give you what you want
maybe this clustering is as good as the other
supervised learning with few labels is also unconstrained and may not give you what you want
This clustering isn’t consistent with the labels
|Predicted Green|/|U| ~= 50%
SSL in Action: The NELL System
TRANSDUCTIVE SVM
Two Kinds of Learning
• Inductive SSL:
  – Input: training set
    • (x1,y1),…,(xn,yn)
    • xn+1, xn+2,…,xn+m
  – Output: classifier
    • f(x) = y
  – Classifier can be run on any test example x
• Transductive SSL:
  – Input: training set
    • (x1,y1),…,(xn,yn)
    • xn+1, xn+2,…,xn+m
  – Output: classifier
    • f(xi) = y
  – Classifier is only defined for xi’s seen at training time
Standard SVM
Transductive SVM
Not a convex problem – need to do some sort of search to guess the labels for the unlabeled examples
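One way to picture the search is a tiny one-dimensional problem where every labeling of the unlabeled points can be tried, keeping the labeling whose separating threshold has the widest margin. The sketch below is a brute-force toy under those assumptions (hard margin, 1-D threshold classifier with negatives on the left; the function names are illustrative, not from the lecture). Practical transductive SVMs rely on heuristic search rather than enumeration.

```python
from itertools import product

def margin_1d(points, labels):
    """Half-width of the gap between classes for a 1-D threshold classifier.

    Returns None if the labeling is not separable with negatives on the left.
    """
    neg = [x for x, y in zip(points, labels) if y == -1]
    pos = [x for x, y in zip(points, labels) if y == +1]
    if not neg or not pos:
        return None  # require both classes to be present
    gap = min(pos) - max(neg)
    return gap / 2 if gap > 0 else None

def best_transductive_labeling(labeled, unlabeled):
    """Try every labeling of the unlabeled points; keep the widest margin."""
    xs = [x for x, _ in labeled] + list(unlabeled)
    ys_labeled = [y for _, y in labeled]
    best_guess, best_margin = None, -1.0
    for guess in product([-1, +1], repeat=len(unlabeled)):
        m = margin_1d(xs, ys_labeled + list(guess))
        if m is not None and m > best_margin:
            best_guess, best_margin = guess, m
    return best_guess, best_margin

labeled = [(0.0, -1), (10.0, +1)]
unlabeled = [1.0, 2.0, 3.0, 4.0, 9.0]
guess, margin = best_transductive_labeling(labeled, unlabeled)
```

With only the two labeled points, the widest-margin threshold would sit at 5.0; once the unlabeled points are taken into account, the chosen labeling puts the boundary in the empty region between 4.0 and 9.0.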
SSL using regularized SGD for logistic regression
1. P(y|x) = logistic(x · w)
2. Define loss function
3. Differentiate the function and use gradient descent to learn

$$L_{CLD}(w) \equiv \sum_i \log P(y_i \mid x_i, w) - \mu \lVert w \rVert_2^2$$
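The three steps can be sketched in a few lines of plain Python. This is a minimal illustration, assuming binary labels in {0, 1}, a constant bias feature, and hypothetical hyperparameter values (`mu`, `lr`, `epochs`); it treats the regularized log-likelihood as an objective to climb, and spreads the penalty gradient evenly over the examples, which is one common convention rather than something the slide specifies.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def sgd_logreg(data, mu=0.01, lr=0.1, epochs=200, seed=0):
    """SGD (ascent) on sum_i log P(y_i|x_i,w) - mu * ||w||^2, y in {0, 1}."""
    rng = random.Random(seed)
    n = len(data)
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            p = logistic(sum(wk * xk for wk, xk in zip(w, x)))
            # gradient of log P(y|x,w) is (y - p) * x; the penalty
            # gradient 2*mu*w is spread evenly over the n examples
            w = [wk + lr * ((y - p) * xk - 2.0 * mu * wk / n)
                 for wk, xk in zip(w, x)]
    return w

def predict(w, x):
    return logistic(sum(wk * xk for wk, xk in zip(w, x)))

# Toy data: the last feature is a constant 1.0 acting as a bias term.
data = [([0.0, 1.0], 0), ([1.0, 1.0], 0), ([3.0, 1.0], 1), ([4.0, 1.0], 1)]
w = sgd_logreg(data)
```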
SSL using regularized SGD for logistic regression
1. P(y|x) = logistic(x · w)
2. Define loss function
3. Differentiate the function and use gradient descent to learn

$$L_{CLD}(w) \equiv \sum_i \log P(y_i \mid x_i, w) \;-\; \mu \lVert w \rVert_2^2 \;-\; \underbrace{\Big(-\sum_j \sum_{y'} P(y' \mid x_j, w)\,\log P(y' \mid x_j, w)\Big)}_{\text{entropy of predictions on the unlabeled examples}}$$
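To make the role of the entropy term concrete, here is a small sketch that evaluates such an objective for a fixed weight vector. It assumes binary labels in {0, 1} and reads the objective as something to maximize, with the entropy on U subtracted so that confident predictions score higher; that sign convention and the helper names are assumptions for illustration.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a two-class prediction with P(y=1) = p."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def objective(w, labeled, unlabeled, mu=0.01):
    """Log-likelihood on L, minus mu*||w||^2, minus prediction entropy on U."""
    def score(x):
        return sum(wk * xk for wk, xk in zip(w, x))

    loglik = 0.0
    for x, y in labeled:  # y in {0, 1}
        p = logistic(score(x))
        loglik += math.log(p if y == 1 else 1.0 - p)
    penalty = mu * sum(wk * wk for wk in w)
    entropy = sum(binary_entropy(logistic(score(x))) for x in unlabeled)
    return loglik - penalty - entropy

labeled = [([-1.0, 1.0], 0), ([1.0, 1.0], 1)]
unlabeled_far = [[-3.0, 1.0], [3.0, 1.0]]    # far from the boundary x = 0
unlabeled_near = [[-0.1, 1.0], [0.1, 1.0]]   # close to the boundary
w = [2.0, 0.0]
```

For the same `w`, the objective is higher when the unlabeled points lie far from the decision boundary than when they lie close to it, which is the pressure that pushes the boundary into low-density regions.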
Logistic regression with entropic regularization
Again, not a convex problem: need to do some sort of search to guess the labels for the unlabeled examples
Low entropy example: very confident in one class
High entropy example: high probability of being in either class
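As a quick worked comparison (using natural logarithms and two illustrative probability vectors that are not from the slides), a confident two-class prediction has entropy close to zero, while an even split reaches the two-class maximum:

```latex
\begin{align*}
H(0.99,\,0.01) &= -0.99\ln 0.99 - 0.01\ln 0.01 \approx 0.0099 + 0.0461 \approx 0.056 \\
H(0.5,\,0.5)   &= -0.5\ln 0.5 - 0.5\ln 0.5 = \ln 2 \approx 0.693
\end{align*}
```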
SEMI-SUPERVISED K-MEANS AND MIXTURE MODELS
k-means
[Figures: K-means Clustering, Steps 1 to 5]
Seeded k-means
• k is the number of classes
• Clusters are initialized using the labeled “seed” data
• Run k-means as usual, except keep the seeds in the class they are known to belong to
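A minimal sketch of this idea in plain Python follows. It assumes Euclidean distance and a fixed number of iterations; the function name and the `keep_seeds` flag are illustrative. With `keep_seeds=False` the labeled data is used only to initialize the centroids; with `keep_seeds=True` the seeds also stay in their known class at every update.

```python
def dist2(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def seeded_kmeans(seeds, unlabeled, iters=10, keep_seeds=True):
    """seeds: dict class -> list of points; unlabeled: list of points.

    Returns (centroids, assignment), where assignment[i] is the class
    chosen for unlabeled[i].
    """
    classes = sorted(seeds)
    # k is the number of classes; centroids start at the seed means
    centroids = {c: mean(seeds[c]) for c in classes}
    assignment = []
    for _ in range(iters):
        # assignment step: nearest centroid for every unlabeled point
        assignment = [min(classes, key=lambda c: dist2(x, centroids[c]))
                      for x in unlabeled]
        # update step: seeds stay in their known class when keep_seeds is set
        for c in classes:
            members = [x for x, a in zip(unlabeled, assignment) if a == c]
            if keep_seeds:
                members = members + seeds[c]
            if members:
                centroids[c] = mean(members)
    return centroids, assignment

seeds = {"a": [[0.0, 0.0]], "b": [[10.0, 10.0]]}
unlabeled = [[1.0, 0.5], [0.5, 1.5], [9.0, 9.5], [8.5, 10.0], [4.0, 4.0]]
centroids, assignment = seeded_kmeans(seeds, unlabeled)
```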
Basu and Mooney ICML 2002
[Figure: comparison on the 20 Newsgroups dataset; curve annotations:]
• Unsupervised k-means
• Seeded k-means (constrained k-means)
• Just use labeled data to initialize the clusters
• Some old baseline I’m not going to even talk about
Outline
• The general idea and an example (NELL)
• Some types of SSL
  – Margin-based: transductive SVM
    • Logistic regression with entropic regularization
  – Generative: seeded k-means
    • Some recent extensions…
  – Nearest-neighbor like: graph-based SSL
Seeded k-means for hierarchical classification tasks
Simple extension:
1. Don’t assign to one of K classes: instead make a decision about every class in the ontology
   • example → {1,…,K} becomes example → 00010001 (bit vector with one bit for each category)
2. Pick “closest” bit vector consistent with constraints
   • this is an (ontology-sized) optimization problem that you solve independently for each example
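The per-example optimization can be sketched by brute force for a tiny ontology. Everything specific here is an assumption for illustration: the four categories, the subset and mutual-exclusion constraints, the per-category scores, and the choice of squared distance as the notion of “closest”. A real ontology would need a smarter solver than enumerating all 2^K bit vectors.

```python
from itertools import product

def consistent(bits, cats, subset_of, exclusive):
    on = {c for c, b in zip(cats, bits) if b}
    # a category that is on requires its parent to be on
    if any(child in on and parent not in on for child, parent in subset_of):
        return False
    # mutually exclusive categories cannot both be on
    return not any(a in on and b in on for a, b in exclusive)

def closest_bit_vector(scores, cats, subset_of, exclusive):
    """Brute force over all 2^K bit vectors; fine only for a tiny ontology."""
    best, best_cost = None, float("inf")
    for bits in product([0, 1], repeat=len(cats)):
        if not consistent(bits, cats, subset_of, exclusive):
            continue
        cost = sum((b - scores[c]) ** 2 for c, b in zip(cats, bits))
        if cost < best_cost:
            best, best_cost = bits, cost
    return best

# Hypothetical ontology and per-category similarity scores for one example.
cats = ["food", "fruit", "organization", "company"]
subset_of = [("fruit", "food"), ("company", "organization")]
exclusive = [("food", "organization")]
scores = {"food": 0.4, "fruit": 0.3, "organization": 0.7, "company": 0.8}
bits = closest_bit_vector(scores, cats, subset_of, exclusive)
```

Here the example ends up with the `organization` and `company` bits set, since that is the consistent vector nearest to its scores.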
Seeded k-means
• k is the number of classes
• Clusters are initialized using the labeled “seed” data
• Run k-means as usual, except keep the seeds in the classes they are known to belong to
• Assign each example to the best consistent set of categories from the ontology
Automatic Gloss Finding for a Knowledge Base
• Glosses: Natural language definitions of named entities. E.g. “Microsoft” is an American multinational corporation headquartered in Redmond that develops, manufactures, licenses, supports and sells computer software, consumer electronics and personal computers and services ...
  – Input: Knowledge Base, i.e. a set of concepts (e.g. company) and entities belonging to those concepts (e.g. Microsoft), and a set of potential glosses.
  – Output: Candidate glosses matched to relevant entities in the KB. “Microsoft is an American multinational corporation headquartered in Redmond …” is mapped to entity “Microsoft” of type “Company”.
[Automatic Gloss Finding for a Knowledge Base using Ontological Constraints, Bhavana Dalvi Mishra, Einat Minkov, Partha Pratim Talukdar, and William W. Cohen, 2014, Under submission]
[Figures: Example: Gloss finding]
Training a clustering model
Labeled: Unambiguous
Unlabeled: Ambiguous
Fruit Company
[Figures: GLOFIN: Clustering glosses]
GLOFIN on NELL Dataset
[Bar chart: Precision, Recall, and F1 (axis 0 to 80) for SVM, Label Propagation, and GLOFIN]
275 categories, 247K candidate glosses, #train=20K, #test=227K
Harmonic fields – Ghahramani, Lafferty and Zhu
Observed label
• Idea: construct a graph connecting the most similar examples (k-NN graph)
• Intuition: nearby points should have similar labels
  – labels should “propagate” through the graph
• Formalization: try to minimize “energy” defined as:
In this example y is a length-10 vector
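The energy formula itself is not present in this transcript. The form used in the harmonic-fields work of Zhu, Ghahramani and Lafferty is a quadratic penalty on label differences across edges, given here as an assumption about what the slide showed, with $w_{ij}$ the edge weight between examples $i$ and $j$ and the entries of $y$ at observed nodes held fixed:

```latex
E(y) = \frac{1}{2} \sum_{i,j} w_{ij}\,(y_i - y_j)^2
```

Strongly connected nodes with very different values contribute the most energy, so minimizing $E$ pushes neighboring labels toward each other.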
• Result 1: at the minimal energy state, each node’s value is a weighted average of its neighbors’ values:
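A short derivation of this result, assuming the quadratic energy $E(y) = \frac{1}{2}\sum_{i,j} w_{ij}(y_i - y_j)^2$ with symmetric weights (the energy form is an assumption, since the slide's formula is not in this transcript): differentiate with respect to an unlabeled node's value and set the derivative to zero.

```latex
\begin{align*}
\frac{\partial E}{\partial y_i} &= 2 \sum_j w_{ij}\,(y_i - y_j) = 0 \\
\Rightarrow\quad y_i &= \frac{\sum_j w_{ij}\, y_j}{\sum_j w_{ij}}
\end{align*}
```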
“Harmonic field” LP algorithm
• Result 2: you can reach the minimal energy state with a simple iterative algorithm:
  – Step 1: For each seed example (xi,yi):
    • Let V0(i,c) = [| yi = c |]
  – Step 2: for t = 1,…,T (T is about 5):
    • Let Vt+1(i,c) = weighted average of Vt(j,c) for all j that are linked to i, and renormalize
    • For seeds, reset Vt+1(i,c) = [| yi = c |]

$$V^{t+1}(i,c) = \frac{1}{Z} \sum_j w_{i,j}\, V^{t}(j,c)$$
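The two steps can be sketched directly in Python. This is an illustration under stated assumptions: the graph is a dict of neighbor weights, non-seed nodes start at zero, and Z is taken to be the constant that makes each node's class scores sum to one (the slide does not spell out what Z normalizes over). The chain graph and seed choice are made up for the example.

```python
def label_propagation(weights, seeds, classes, T=5):
    """weights: dict node -> dict neighbor -> w_ij; seeds: dict node -> class.

    Returns V with V[i][c] the score of class c at node i after T rounds.
    """
    def clamp(i):
        return {c: 1.0 if seeds[i] == c else 0.0 for c in classes}

    # Step 1: seeds get an indicator vector, every other node starts at zero
    V = {i: clamp(i) if i in seeds else {c: 0.0 for c in classes}
         for i in weights}
    for _ in range(T):
        new_V = {}
        for i, nbrs in weights.items():
            if i in seeds:
                new_V[i] = clamp(i)  # reset seeds to their known label
                continue
            # weighted sum of the neighbors' previous-round values
            raw = {c: sum(w * V[j][c] for j, w in nbrs.items())
                   for c in classes}
            Z = sum(raw.values())
            new_V[i] = {c: raw[c] / Z if Z > 0 else 0.0 for c in classes}
        V = new_V
    return V

# Chain graph a - b - c - d with unit weights; a and d are seeds.
weights = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 1.0},
           "c": {"b": 1.0, "d": 1.0}, "d": {"c": 1.0}}
V = label_propagation(weights, {"a": "pos", "d": "neg"}, ["pos", "neg"])
```

After a few rounds, node `b` leans toward the label of its seed neighbor `a`, and node `c` toward that of `d`.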
This family of techniques is called “Label propagation”
This experiment points out some of the issues with LP:
1. What distance metric do you use?
2. What energy function do you minimize?
3. What is the right value for K in your K-NN graph? Is a K-NN graph right?
4. If you have lots of data, how expensive is it to build the graph?
NELL: Uses Co-EM ~= HF
Extract cities:
• Examples: Paris, Pittsburgh, Seattle, Cupertino, San Francisco, Austin, Berlin, denial, anxiety, selfishness
• Features: “mayor of arg1”, “live in arg1”, “arg1 is home of”, “traits such as arg1”
Semi-Supervised Bootstrapped Learning via Label Propagation
[Figure: graph linking noun phrases (Paris, Pittsburgh, Seattle, San Francisco, Austin, anxiety, denial, selfishness, arrogance) to contexts (“mayor of arg1”, “live in arg1”, “arg1 is home of”, “traits such as arg1”)]
• Nodes “near” seeds vs. nodes “far from” seeds
• Information from other categories tells you “how far” (when to stop propagating)
Difference: graph construction is not instance-to-instance but instance-to-feature
[Figure: noun-phrase nodes Paris, San Francisco, Austin, Pittsburgh, Seattle, anxiety, denial, selfishness]
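An instance-to-feature graph can be propagated over by alternating between the two sides. The sketch below is a simplified averaging scheme, not NELL's actual procedure: the edge list is invented for illustration (the real co-occurrences in the figure are not recoverable from this transcript), unlabeled instances start at a neutral 0.5, and a score near 1.0 is read as “city”.

```python
def propagate_bipartite(edges, seeds, rounds=3):
    """edges: list of (instance, feature) co-occurrence pairs.
    seeds: dict instance -> score in [0, 1] (1.0 = known city).

    Alternates: feature <- mean of its instances,
                instance <- mean of its features (seeds stay fixed).
    """
    inst_to_feat, feat_to_inst = {}, {}
    for inst, feat in edges:
        inst_to_feat.setdefault(inst, []).append(feat)
        feat_to_inst.setdefault(feat, []).append(inst)
    inst_score = {i: seeds.get(i, 0.5) for i in inst_to_feat}
    feat_score = {}
    for _ in range(rounds):
        for f, insts in feat_to_inst.items():
            feat_score[f] = sum(inst_score[i] for i in insts) / len(insts)
        for i, feats in inst_to_feat.items():
            if i in seeds:
                continue  # seeds keep their known score
            inst_score[i] = sum(feat_score[f] for f in feats) / len(feats)
    return inst_score, feat_score

# Hypothetical co-occurrence edges between noun phrases and contexts.
edges = [
    ("Paris", "mayor of arg1"), ("Paris", "live in arg1"),
    ("Pittsburgh", "mayor of arg1"), ("Pittsburgh", "live in arg1"),
    ("Seattle", "live in arg1"), ("Seattle", "arg1 is home of"),
    ("denial", "live in arg1"), ("denial", "traits such as arg1"),
    ("anxiety", "traits such as arg1"), ("selfishness", "traits such as arg1"),
]
seeds = {"Paris": 1.0, "anxiety": 0.0}  # one city seed, one non-city seed
inst_score, feat_score = propagate_bipartite(edges, seeds)
```

In this toy graph “denial” shares a context with the cities (“live in arg1”) and another with the traits, so it ends up between the two groups; the non-city seed is what keeps the city label from spreading through it unchecked.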
Some other general issues with SSL
• How much unlabeled data do you want?
  – Suppose you’re optimizing J = JL(L) + JU(U)
  – If |U| >> |L| does JU dominate J?
    • If so you’re basically just clustering
  – Often we need to balance JL and JU
• Besides L, what other information about the task is useful (or necessary)?
  – Common choice: relative frequency of classes
  – Various ways of incorporating this into the optimization problem
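The balance question can be made concrete with a small numeric sketch. The per-example averaging and the weight `lam` below are one possible way to balance the two terms, chosen for illustration and not prescribed by the lecture; the loss values are made up.

```python
def combined_objective(labeled_losses, unlabeled_losses, lam=1.0, normalize=True):
    """Combine per-example losses on L and U into one number J.

    With normalize=False this is the plain sum J = J_L + J_U; with
    normalize=True each part is averaged first and J_U is scaled by lam.
    Returns (J, J_L, J_U) as they enter the total.
    """
    j_l, j_u = sum(labeled_losses), sum(unlabeled_losses)
    if normalize:
        j_l /= max(len(labeled_losses), 1)
        j_u /= max(len(unlabeled_losses), 1)
        j_u *= lam
    return j_l + j_u, j_l, j_u

labeled_losses = [0.7] * 10        # |L| = 10
unlabeled_losses = [0.7] * 10000   # |U| = 10000
raw, raw_l, raw_u = combined_objective(labeled_losses, unlabeled_losses,
                                       normalize=False)
bal, bal_l, bal_u = combined_objective(labeled_losses, unlabeled_losses, lam=0.5)
```

In the plain sum the unlabeled term accounts for more than 99% of J, so the labels barely matter; after averaging and weighting, the two terms are of comparable size.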
Key and not-so-key points
• The general idea: what is SSL and when do you want to use it?
  – NELL as an example of SSL
• Different SSL methods:
  – margin-based approach: start with a supervised learner
    • transductive SVM: what’s optimized and why
    • logistic reg with entropic regularization
  – k-means versus seeded k-means: start with clustering
    • The core algorithm and what
    • Extension to hierarchical case and GLOFIN
  – nearest-neighbor like: graph-based SSL and LP
    • The HF algorithm and the energy function being minimized