Zohreh Karimi, Semi-Supervised Classification (2)

TRANSCRIPT

  • Slide 1
  • Semi-Supervised Classification (2). Reference: Xiaojin Zhu and Andrew B. Goldberg, Introduction to Semi-Supervised Learning, University of Wisconsin, Madison, 2009.
  • Slide 2
  • Outline: mixture models and EM, co-training, graph-based methods, semi-supervised SVMs.
  • Slide 3
  • Co-Training: motivating example, named-entity classification; decide whether a named entity belongs to the Location class.
  • Slide 4
  • Co-Training: named-entity classification example, continued (the entity's own text as one source of evidence for the Location class).
  • Slide 5
  • Co-Training: named-entity classification example, continued (the surrounding context as another source of evidence).
  • Slide 6
  • Co-Training: train one classifier per view on the labeled data; each classifier picks the unlabeled instances it is most confident about, labels them, and adds them to the other classifier's training set; repeat.
  • Slide 7
  • Co-Training assumes each instance has two views, each of which is sufficient on its own to train a good classifier, and the views are conditionally independent given the class.
  • Slide 8
  • Examples of natural two-view problems: web-page classification (the words on the page vs. the hyperlink anchor text pointing to it) and speech phoneme classification (the audio vs. the video of the speaker). A runnable sketch of the co-training loop follows below.
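To make the loop on slides 6-8 concrete, here is a minimal co-training sketch in Python. It is not from the slides: scikit-learn's LogisticRegression stands in for the two base learners, and X1, X2, rounds, and k are illustrative names and defaults of this sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_training(X1, X2, y, labeled, unlabeled, rounds=10, k=5):
    """Minimal co-training over two feature views X1 and X2.

    Only indices in `labeled` have trusted labels in y; indices in
    `unlabeled` are pseudo-labeled as the two classifiers teach
    each other their most confident predictions.
    """
    y = y.copy()
    pool = list(unlabeled)
    L1, L2 = list(labeled), list(labeled)   # per-view labeled index sets
    clf1 = LogisticRegression(max_iter=1000)
    clf2 = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        if not pool:
            break
        clf1.fit(X1[L1], y[L1])
        clf2.fit(X2[L2], y[L2])
        # Each view labels its k most confident pool items for the other.
        for clf, X, other in ((clf1, X1, L2), (clf2, X2, L1)):
            if not pool:
                break
            proba = clf.predict_proba(X[pool])
            top = np.argsort(proba.max(axis=1))[-k:]
            idx = [pool[t] for t in top]
            y[idx] = clf.classes_[proba[top].argmax(axis=1)]
            other.extend(idx)
            pool = [p for p in pool if p not in idx]
    return clf1, clf2
```

Each round the two classifiers teach each other through their confident predictions, which is exactly the exchange sketched on slides 3-6.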
  • Slide 9
  • Multiview Learning (1): a loss function c(x, y, f(x)) measures the cost of predicting f(x) when the true label is y. The squared loss is c(x, y, f(x)) = (y - f(x))^2; the 0/1 loss is c(x, y, f(x)) = 0 if y = f(x), and 1 otherwise. Losses may be asymmetric, e.g. c(x, y = healthy, f(x) = diseased) = 1 while c(x, y = diseased, f(x) = healthy) = 100.
  • Slide 10
  • Multiview Learning (2): learn one predictor per view, each trained on the labeled data, while encouraging the predictors to agree with one another on the unlabeled data.
  • Slide 11
  • Multiview Learning (3): with k views, each predictor f_v carries its own individual regularized risk, and a semi-supervised regularizer penalizes pairwise disagreement among the k predictors on the unlabeled instances; the full objective is written out below, after the next slide.
  • Slide 12
  • Multiview Learning (4): the labeled part of each view's term is that view's empirical risk; the agreement term is what couples the views through the unlabeled data.
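The objective referenced on slide 11 appears to have been lost in extraction; reconstructed as a sketch from the Zhu and Goldberg book this deck follows, with k views, l labeled and u unlabeled instances, and view indices s, t:

```latex
\min_{f_1,\dots,f_k}\;
\sum_{v=1}^{k}\Bigl[\,\sum_{i=1}^{l} c\bigl(x_i, y_i, f_v(x_i)\bigr)
  + \lambda_1\,\Omega(f_v)\Bigr]
\;+\; \lambda_2 \sum_{s<t}\;\sum_{i=l+1}^{l+u}
  \bigl(f_s(x_i) - f_t(x_i)\bigr)^2
```

The first bracket is the individual regularized risk of view v; the last sum is the semi-supervised (agreement) regularizer, with λ1 and λ2 trading off per-view smoothness against cross-view agreement.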
  • Slide 13
  • Outline (revisited): mixture models and EM, co-training; next, graph-based methods, then semi-supervised SVMs.
  • Slide 14
  • Graph-Based Methods (1): build a similarity graph over all labeled and unlabeled instances, e.g. a kNN graph (connect each instance to its k nearest neighbors) or an εNN graph (connect instances closer than ε).
  • Slide 15
  • Graph-Based Methods (2): weight each edge by the similarity of its endpoints, commonly with Gaussian weights w_ij = exp(-‖x_i - x_j‖^2 / (2σ^2)); a construction sketch follows below.
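A minimal numpy sketch of that construction; the fully connected variant and the name gaussian_weights are this sketch's choices, not the slides':

```python
import numpy as np

def gaussian_weights(X, sigma=1.0):
    """Fully connected similarity graph with Gaussian edge weights."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-edges
    return W
```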
  • Slide 16
  • Regularization: graph-based methods estimate a label function f that (1) has small loss on the labeled vertices, and (2) is smooth over the whole graph (the regularization framework); individual methods are special cases of this graph-based regularization, differing in the loss function and the regularizer.
  • Slide 17
  • Mincut (1): treat the positive labeled instances as sources and the negative labeled instances as sinks; find the minimum edge cut that separates sources from sinks, and label every node according to its side of the cut.
  • Slide 18
  • Mincut (2): worked example on a small five-node graph (figure).
  • Slide 19
  • Mincut (3): mincut is a regularized risk problem whose cost function puts infinite loss on violating a given label and whose regularizer is the size of the cut: min over f(x) ∈ {0, 1} of ∞ · Σ_{i=1..l} (y_i - f(x_i))^2 + Σ_{i,j} w_ij (f(x_i) - f(x_j))^2. A runnable sketch follows below.
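A small sketch of mincut labeling via max-flow, assuming the networkx library (whose minimum_cut routine is applied here to an undirected graph); the helper name mincut_ssl and the toy graph, loosely echoing the five-node figure on slide 18, are illustrative:

```python
import networkx as nx

def mincut_ssl(weights, pos, neg):
    """Label graph nodes by an s-t minimum cut.

    weights: dict mapping node pairs (i, j) to edge weights w_ij.
    pos, neg: node ids labeled positive / negative.
    """
    G = nx.Graph()
    for (i, j), w in weights.items():
        G.add_edge(i, j, capacity=w)
    # Infinite-capacity ties to auxiliary source/sink enforce the labels.
    for i in pos:
        G.add_edge('s', i, capacity=float('inf'))
    for j in neg:
        G.add_edge(j, 't', capacity=float('inf'))
    _, (src_side, sink_side) = nx.minimum_cut(G, 's', 't')
    labels = {v: 1 for v in src_side if v != 's'}
    labels.update({v: 0 for v in sink_side if v != 't'})
    return labels

# Two clusters joined by one weak edge; the cut severs the 3-4 edge,
# so nodes 1-3 come out positive and nodes 4-5 negative.
w = {(1, 2): 1.0, (1, 3): 1.0, (2, 3): 1.0,
     (3, 4): 0.1, (4, 5): 1.0}
print(mincut_ssl(w, pos=[1], neg=[5]))
```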
  • Slide 20
  • Harmonic Function (1): relax the labels to real values; a harmonic function f matches the given labels on the labeled vertices and, at every unlabeled vertex, equals the weighted average of its neighbors' values.
  • Slide 21
  • Harmonic Function (2): equivalently, f minimizes the energy Σ_{i,j} w_ij (f(x_i) - f(x_j))^2 subject to f(x_i) = y_i on the labeled vertices; f(x) can also be interpreted through a random walk on the graph that stops when it first hits a labeled vertex.
  • Slide 22
  • Harmonic Function (3): the unnormalized graph Laplacian matrix is L = D - W, where W is an (l + u) × (l + u) weight matrix whose (i, j)-th element is the edge weight w_ij, and D is the diagonal degree matrix with D_ii = Σ_j w_ij.
  • Slide 23
  • Harmonic Function (4): the energy equals f⊤Lf, so partitioning the unnormalized graph Laplacian matrix into labeled and unlabeled blocks gives the closed-form solution f_u = -L_uu^{-1} L_ul y_l (see the sketch below).
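A minimal numpy sketch of this closed form. The convention that labeled nodes occupy the first l rows and columns of W is this sketch's assumption, not something stated on the slides:

```python
import numpy as np

def harmonic_solution(W, y_l):
    """Closed-form harmonic function values for the unlabeled nodes.

    W:   (n, n) symmetric weight matrix, labeled nodes first.
    y_l: (l,) labels (e.g. 0/1) of the first l nodes.
    Assumes every unlabeled node connects, directly or through other
    unlabeled nodes, to a labeled one, so L_uu is invertible.
    """
    l = len(y_l)
    D = np.diag(W.sum(axis=1))
    L = D - W                                # unnormalized graph Laplacian
    L_uu, L_ul = L[l:, l:], L[l:, :l]
    return -np.linalg.solve(L_uu, L_ul @ y_l)  # f_u = -L_uu^{-1} L_ul y_l
```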
  • Slide 24
  • Manifold Regularization (1): the harmonic function is transductive, defined only on the given l + u vertices, and it clamps f(x_i) = y_i exactly, so it can neither absorb label noise nor predict on unseen test instances.
  • Slide 25
  • Manifold Regularization (2): manifold regularization is inductive: it learns a function f defined on the whole input space, trading off the loss on labeled data against the smoothness of f over the graph.
  • Slide 26
  • Manifold Regularization (3): the smoothness term may use the normalized graph Laplacian D^(-1/2) L D^(-1/2) instead of the unnormalized Laplacian L; the objective is sketched below.
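The formula on slide 26 appears to have been lost in extraction. As a sketch based on the Zhu and Goldberg book this deck follows, the manifold-regularization objective has the form

```latex
\min_{f}\; \sum_{i=1}^{l} c\bigl(x_i, y_i, f(x_i)\bigr)
  \;+\; \lambda_1 \lVert f \rVert^2
  \;+\; \lambda_2\, \mathbf{f}^{\top} L\, \mathbf{f},
\qquad
\mathbf{f} = \bigl(f(x_1), \dots, f(x_{l+u})\bigr)^{\top}
```

where ‖f‖ is the function-space norm that makes the solution inductive, and L may be replaced by the normalized Laplacian mentioned above.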
  • Slide 27
  • Spectral Graph Theory (1): the behavior of graph-based regularizers is best understood through the spectrum (eigenvalues and eigenvectors) of the graph Laplacian.
  • Slide 28
  • Spectral Graph Theory (2): the Laplacian decomposes as L = Σ_i λ_i φ_i φ_i⊤, with sorted eigenvalues 0 = λ_1 ≤ λ_2 ≤ … ≤ λ_{l+u} and orthonormal eigenvectors φ_i.
  • Slide 29
  • Spectral Graph Theory (3): a smaller eigenvalue corresponds to a smoother eigenvector over the graph. The graph has k connected components if and only if λ_1 = … = λ_k = 0; the corresponding eigenvectors are constant on individual connected components, and zero elsewhere. (A numeric check follows below.)
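A quick numpy check of the connected-components fact, on a made-up graph of two disjoint triangles (so exactly k = 2 zero eigenvalues are expected):

```python
import numpy as np

# Two disconnected triangles: the Laplacian should have exactly
# two zero eigenvalues, one per connected component.
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0
L = np.diag(W.sum(axis=1)) - W        # unnormalized Laplacian
eigvals = np.linalg.eigvalsh(L)       # ascending order
print(np.round(eigvals, 6))           # [0. 0. 3. 3. 3. 3.]
```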
  • Slide 30
  • Graph Spectrum: figure showing the eigenvalues and eigenvectors of an example graph.
  • Slide 31
  • Spectral Graph Theory (4): expand f in the eigenbasis as f = Σ_i a_i φ_i; the regularization term becomes f⊤Lf = Σ_i a_i^2 λ_i, so it penalizes weight a_i placed on eigenvectors φ_i with large λ_i, i.e. the non-smooth ones.
  • Slide 32
  • Spectral Graph Theory (5): on a graph with k connected components the first k eigenvalues vanish, so the regularization term is zero for any f that is constant on each component, whatever those constants are.
  • Slide 33
  • Spectral Graph Theory (6): in the ideal case each class sits in its own component; the regularizer then does most of the work, and a few labels suffice to fix the constant on each component.
  • Slide 34
  • Outline (revisited): mixture models and EM, co-training, graph-based methods; next, semi-supervised SVMs.
  • Slide 35
  • Support vector machines are built around the notion of margin; the geometric margin of an instance is its distance to the decision boundary.
  • Slide 36
  • Support Vector Machines: find the linear decision boundary f(x) = w⊤x + b that separates the two classes with the largest margin.
  • Slide 37
  • Support Vector Machines: the signed geometric margin of (x_i, y_i) is y_i (w⊤x_i + b) / ‖w‖; the classifier's margin is the distance from the decision boundary to the closest labeled instance, and the maximum-margin hyperplane is unique.
  • Slide 38
  • Non-Separable Case (1): when the classes overlap, no hyperplane separates them perfectly; introduce slack variables ξ_i ≥ 0 that let individual instances violate the margin.
  • Slide 39
  • Non-Separable Case (2): instances with ξ_i = 0 are correctly classified and outside the margin; instances with 0 < ξ_i ≤ 1 lie inside the margin, but on the correct side of the decision boundary; instances with ξ_i > 1 lie on the wrong side of the decision boundary and are misclassified.
  • Slide 40
  • Non-Separable Case (3): the soft-margin SVM solves min over w, b, ξ of Σ_i ξ_i + λ‖w‖^2, subject to y_i (w⊤x_i + b) ≥ 1 - ξ_i and ξ_i ≥ 0.
  • Slide 41
  • Non-Separable Case (4): equivalently, an unconstrained regularized risk with the hinge loss, min over w, b of Σ_i max(1 - y_i (w⊤x_i + b), 0) + λ‖w‖^2 (see the snippet below).
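A small numpy illustration of the hinge loss as a function of the margin value y·f(x); the sample values are made up:

```python
import numpy as np

def hinge(yf):
    """Hinge loss as a function of the margin value y * f(x)."""
    return np.maximum(1 - yf, 0)

# Zero once an instance clears the margin (y f(x) >= 1); grows
# linearly as the instance moves toward and past the boundary.
print(hinge(np.array([2.0, 1.0, 0.5, 0.0, -1.0])))  # [0. 0. 0.5 1. 2.]
```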
  • Slide 42
  • S3VM (1): semi-supervised SVMs (S3VMs) extend the margin idea to unlabeled data, preferring decision boundaries that pass through low-density regions, far from both labeled and unlabeled instances.
  • Slide 43
  • S3VM (2): unlabeled instances incur the hat loss max(1 - |w⊤x + b|, 0), which pushes them outside the margin; without a balancing constraint there is a degenerate solution in which the majority (or even all) of the unlabeled instances are predicted in only one of the classes, so S3VMs constrain the predicted class proportions to match those of the labeled data.
  • Slide 44
  • S3VM (3): the hinge loss is a convex function, but the hat loss is not, so the S3VM objective function (written out below) is non-convex; research on S3VMs has focused on how to efficiently find a near-optimal solution.
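For reference, a sketch of the S3VM objective from the Zhu and Goldberg book this deck follows: the first term is the hinge loss on the l labeled instances, the last term is the hat loss on the u unlabeled instances, and the hat loss is what makes the problem non-convex.

```latex
\min_{w,\,b}\;
\sum_{i=1}^{l} \max\bigl(1 - y_i (w^{\top} x_i + b),\; 0\bigr)
+ \lambda_1 \lVert w \rVert^2
+ \lambda_2 \sum_{j=l+1}^{l+u} \max\bigl(1 - \lvert w^{\top} x_j + b \rvert,\; 0\bigr)
```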
  • Slide 45
  • Logistic Regression: SVM and S3VM are non-probabilistic models; logistic regression is a probabilistic model, trained by maximizing the conditional log likelihood Σ_i log p(y_i | x_i, w), with a Gaussian distribution as the prior on w.
  • Slide 46
  • Logistic Regression: the resulting MAP objective is the logistic loss Σ_i log(1 + exp(-y_i (w⊤x_i + b))) plus the regularizer λ‖w‖^2 contributed by the Gaussian prior.
  • Slide 47
  • Logistic regression
  • Slide 48
  • Entropy Regularizer: logistic regression plus an entropy regularizer for semi-supervised learning. Intuition: if the two classes are well separated, then the classification of any unlabeled instance should be confident: it either clearly belongs to the positive class or to the negative class. Equivalently, the posterior probability p(y|x) should be either close to 1 or close to 0, i.e. its entropy should be low.
  • Slide 49
  • Semi-Supervised Logistic Regression: add the entropy regularizer Σ_{j=l+1..l+u} H(p(y | x_j, w)) to the logistic-regression objective, where H(p) = -p log p - (1 - p) log(1 - p); minimizing it drives the unlabeled posteriors away from 1/2.
  • Slide 50
  • Entropy Regularizer
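A gradient-descent sketch of entropy-regularized logistic regression in plain numpy. Nothing here is prescribed by the slides: the function names, the omission of a bias term, and the hyperparameter defaults are all choices of this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def semi_supervised_logreg(X_l, y_l, X_u, lam=0.1, gamma=0.5,
                           lr=0.1, steps=2000):
    """Logistic regression with an entropy regularizer on unlabeled data.

    X_l, y_l: labeled data with y in {0, 1}.  X_u: unlabeled data.
    lam weights the L2 term, gamma the entropy term. No bias term,
    for brevity; append a constant feature to X to add one.
    """
    n, d = X_l.shape
    w = np.zeros(d)
    for _ in range(steps):
        p_l = sigmoid(X_l @ w)
        grad = X_l.T @ (p_l - y_l) / n + lam * w       # logistic loss + L2
        p_u = np.clip(sigmoid(X_u @ w), 1e-6, 1 - 1e-6)
        # Gradient of H(p) = -p log p - (1-p) log(1-p) w.r.t. w:
        # dH/dp = log((1-p)/p) and dp/dw = p (1-p) x.
        grad += gamma * X_u.T @ (np.log((1 - p_u) / p_u)
                                 * p_u * (1 - p_u)) / len(X_u)
        w -= lr * grad
    return w  # classify a new x via sigmoid(x @ w) > 0.5
```

Minimizing the added term pushes p(y|x) toward 0 or 1 on the unlabeled pool, which is exactly the low-density intuition from slide 48.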
  • Slide 51
  • S3VM vs. Entropy Regularization: both implement the same low-density intuition, S3VM through the hat loss on unlabeled instances and entropy regularization through low-entropy posteriors p(y|x).