![Page 1: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/1.jpg)
Paired Sampling in Density-Sensitive Active Learning
Pinar Donmez joint work with Jaime G. Carbonell
Language Technologies Institute School of Computer Science Carnegie Mellon University
![Page 2: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/2.jpg)
Outline
Problem setting
Motivation
Our approach
Experiments
Conclusion
![Page 3: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/3.jpg)
Setting
X: feature space; label set Y = {-1, +1}
Data D ~ X × Y
D = T ∪ U
  T: training set
  U: unlabeled set
  T is small initially, U is large
Active Learning: choose the most informative samples to label
Goal: high performance with the fewest labeling requests
![Page 4: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/4.jpg)
Motivation
Optimize the decision boundary placement
  Sampling disproportionately on one side may not be optimal
  Maximize the likelihood of straddling the boundary with paired samples
Three factors affect sampling
  Local density
  Conditional entropy maximization
  Utility score
![Page 5: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/5.jpg)
Illustrative Example
Left figure (paired sampling)
  significant shift in the current hypothesis
  large reduction in version space
Right figure (single point sampling)
  small shift in the current hypothesis
  small reduction in version space
![Page 6: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/6.jpg)
Density-Sensitive Distance
Cluster Hypothesis: the decision boundary should NOT cut clusters
  squeeze distances in high-density regions
  increase distances in low-density regions
Solution: Density-Sensitive Distance
  find the weakest link along each path in a graph G
a better way to avoid outliers (i.e. a very short edge in a long path)
Chapelle & Zien (2005)
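The "weakest link" idea can be sketched as a minimax path distance: among all paths between two points, take the path whose longest edge is shortest, and use that edge length as the distance. This is only an illustration on the complete Euclidean graph; the function name `minimax_path_distances` is hypothetical, and the exact edge weighting used by Chapelle & Zien (2005) is not reproduced here.

```python
import numpy as np

def minimax_path_distances(X):
    """All-pairs 'weakest link' distances on the complete Euclidean graph.

    The cost of a path is its longest edge; the distance between two points
    is the smallest such cost over all paths (assumed simplification).
    """
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))      # plain Euclidean edge lengths
    for k in range(len(X)):
        # Floyd-Warshall style update: going through k costs the larger
        # of the two bottlenecks i->k and k->j.
        D = np.minimum(D, np.maximum(D[:, [k]], D[[k], :]))
    return D
```

Points connected by a chain of short hops (a dense region) end up close to each other, while crossing a sparse gap costs the full width of the gap.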
![Page 7: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/7.jpg)
Density-Sensitive Distance
Apply MDS (Multi-dimensional Scaling) to the density-sensitive distance matrix to obtain a Euclidean embedding
Find the eigenvalues and eigenvectors of the centered distance matrix
Pick the first p eigenvectors s.t.
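A minimal sketch of classical MDS, assuming the standard double-centering construction. The rule the slide uses to choose p is not recoverable, so p is passed in explicitly here; `classical_mds` is a hypothetical helper name.

```python
import numpy as np

def classical_mds(D, p):
    """Embed n points in p dimensions from an n x n distance matrix D."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * H @ (D ** 2) @ H                # doubly centered squared distances
    evals, evecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:p]        # indices of the p largest
    lam = np.clip(evals[order], 0.0, None)     # guard against small negatives
    return evecs[:, order] * np.sqrt(lam)      # rows are embedded points
```

When D holds true Euclidean distances between points in p dimensions, the embedding reproduces those distances up to rotation and translation.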
![Page 8: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/8.jpg)
Active Sampling Procedure
Given a training set T in MDS space
1. Train a logistic regression classifier on T
2. For all pairs of unlabeled points, compute the pairwise score S
3. Choose the pair with the maximum score
4. Repeat 1-3
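The loop above might be sketched as follows. Everything here is an assumption made for illustration: the small gradient-descent logistic regression, the helper names, and especially `placeholder_scores`, which stands in for the scoring function S (distance times summed entropy) and is not the authors' formula. The sketch also assumes that the chosen pair is labeled and added to T before the next round.

```python
import numpy as np

def fit_logreg(X, y, lr=0.5, steps=500, l2=1e-2):
    """Gradient-descent logistic regression; y holds labels in {-1, +1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])    # append a bias column
    t = (y + 1) / 2.0                            # map {-1, +1} to {0, 1}
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * (Xb.T @ (p - t) / len(t) + l2 * w)
    return w

def prob_positive(w, X):
    """P(y = +1 | x) under the fitted weights."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def placeholder_scores(X, p):
    """Hypothetical stand-in for S: pairwise distance times summed entropy."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    S = D * (H[:, None] + H[None, :])
    np.fill_diagonal(S, -np.inf)                 # a point cannot pair with itself
    return S

def active_pair_sampling(X, oracle_labels, labeled, iterations):
    """Run the four-step loop; returns labeled indices and final weights."""
    labeled = list(labeled)
    for _ in range(iterations):
        w = fit_logreg(X[labeled], oracle_labels[labeled])           # step 1
        pool = np.array([i for i in range(len(X)) if i not in labeled])
        S = placeholder_scores(X[pool], prob_positive(w, X[pool]))   # step 2
        a, b = np.unravel_index(np.argmax(S), S.shape)               # step 3
        labeled += [int(pool[a]), int(pool[b])]  # query both labels, grow T
    return labeled, fit_logreg(X[labeled], oracle_labels[labeled])
```

Each iteration costs two labeling requests, so starting from 2 labeled points, 3 iterations yield a training set of 8 points.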
![Page 9: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/9.jpg)
Details of the Scoring Function S
Two components of S
1. Likelihood of a pair having opposite labels (straddling the decision boundary)
2. Utility of the pair
By the cluster assumption, the decision boundary should not cut clusters => points in different clusters are likely to have different labels
In the transformed space, points in different clusters have low similarity (large distance)
Thus, we can estimate the likelihood of opposite labels from the pairwise distance
![Page 10: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/10.jpg)
An Analysis Justifying our Claim
Pairwise distances are divided into bins
Pairs are assigned to bins according to their distances
For each bin, the relative frequency of pairs with opposite class labels is computed
This graph (empirically) shows that the likelihood of two points having opposite labels increases monotonically with the pairwise distance between them.
* This graph is plotted on the g50c dataset.
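The binning analysis can be sketched as below, assuming equal-width bins over the observed distance range; the bin layout used for the original plot is not stated, and `opposite_label_rate_by_distance` is a hypothetical name.

```python
import numpy as np

def opposite_label_rate_by_distance(D, y, n_bins=10):
    """Per distance bin, the fraction of pairs whose labels differ.

    D is an n x n distance matrix, y the n class labels.
    Returns (bin_edges, rate); rate is NaN for empty bins.
    """
    i, j = np.triu_indices(len(y), k=1)          # each unordered pair once
    d = D[i, j]
    opposite = (y[i] != y[j]).astype(float)
    edges = np.linspace(d.min(), d.max(), n_bins + 1)
    which = np.clip(np.digitize(d, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(which, minlength=n_bins)
    hits = np.bincount(which, weights=opposite, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        rate = hits / counts
    return edges, rate
```

Plotting `rate` against the bin centers gives the kind of curve the slide describes.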
![Page 11: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/11.jpg)
Utility Function
Two components
  Local density, which depends on
    the number of close neighbors
    their proximity
  Conditional Entropy
    For binary problems
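As a reminder, the standard conditional entropy of a binary label can be written as below. The shorthand p = P(y = +1 | x) is introduced here for illustration, and the slide's own notation may differ.

```latex
H(Y \mid x) = -\sum_{y \in \{-1,+1\}} P(y \mid x)\,\log P(y \mid x)
            = -p \log p - (1 - p)\log(1 - p),
\qquad p = P(y = +1 \mid x)
```

The entropy is largest at p = 1/2, where the classifier is most uncertain, and falls to zero as p approaches 0 or 1.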
![Page 12: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/12.jpg)
Uncertainty-Weighted Density
captures
  the density of a given point
  the information content of its neighbors
novelty: each neighbor’s contribution is weighted by its uncertainty
  reduces the effect of highly certain neighbors
  dense points with highly uncertain neighbors become important
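One possible reading of this idea, sketched under stated assumptions: proximity is a Gaussian kernel on squared distance, uncertainty is the binary entropy of the classifier's posterior, and the neighborhood is the k nearest points. None of these choices is confirmed by the slides, and the function names are hypothetical.

```python
import numpy as np

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) label."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def uncertainty_weighted_density(X, p_pos, k=5):
    """Sum over each point's k nearest neighbors of proximity * uncertainty.

    X     : (n, d) points in the embedded space
    p_pos : (n,) posterior P(y = +1 | x) for every point
    """
    diff = X[:, None, :] - X[None, :, :]
    sq = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)                 # exclude the point itself
    nbrs = np.argsort(sq, axis=1)[:, :k]         # indices of the k nearest
    prox = np.exp(-np.take_along_axis(sq, nbrs, axis=1))
    H = binary_entropy(np.asarray(p_pos, dtype=float))
    return (prox * H[nbrs]).sum(axis=1)
```

Under this sketch, a point in a dense region scores high only if its neighbors are still uncertain; confidently classified neighbors contribute little.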
![Page 13: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/13.jpg)
Utility Function
utility of a pair is
  regularize
  information content (entropy) of the pair
  proximity-weighted information content of neighbors
![Page 14: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/14.jpg)
Experimental Data
pair with maximum score selected
Six binary datasets
![Page 15: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/15.jpg)
Experiment Setting
For each data set
  start with 2 labeled data points (1 +, 1 -)
  run each method for 20 iterations
  results averaged over 10 runs
Baselines
  Uncertainty Sampling
  Density-only Sampling
  Representative Sampling (Xu et al. 2003)
  Random Sampling
![Page 16: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/16.jpg)
Results
![Page 17: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/17.jpg)
Results
![Page 18: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/18.jpg)
Conclusion
Our contributions:
  combine uncertainty, density, and dissimilarity across the decision boundary
  proximity-weighted conditional entropy selection is effective for active learning
Results show
  our method significantly outperforms baselines in error reduction
  fewer labeling requests than others to achieve the same performance
![Page 19: Paired Sampling in Density-Sensitive Active Learning](https://reader033.vdocuments.net/reader033/viewer/2022051218/568158bd550346895dc60426/html5/thumbnails/19.jpg)
Thank You!