MURI Meeting July 2002: Gert Lanckriet ([email protected])


DESCRIPTION

Convex Optimization in Machine Learning. MURI Meeting, July 2002. Gert Lanckriet ([email protected]), L. El Ghaoui, M. Jordan, C. Bhattacharyya, N. Cristianini, P. Bartlett, U.C. Berkeley. Covers convex optimization in machine learning and advanced convex optimization (SDP). PowerPoint (PPT) presentation.

TRANSCRIPT

  • MURI Meeting July 2002

    Gert Lanckriet ([email protected]), L. El Ghaoui, M. Jordan, C. Bhattacharyya, N. Cristianini, P. Bartlett. U.C. Berkeley

    Convex Optimization in Machine Learning

  • Convex Optimization in Machine Learning

  • Advanced Convex Optimization in Machine Learning: problem classes LP, QP, QCQP, SOCP, SDP (standard forms are written out after the list below)

  • Advanced Convex Optimization in Machine Learning

  • Linear Programming (LP)

  • Quadratic Programming (QP)

  • Quadratically Constrained Quadratic Programming (QCQP)

  • Second Order Cone Programming (SOCP)

  • Semi-Definite Programming (SDP)
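For reference (not on the original slides), these five problem classes have the standard forms below, each class containing the one before it (LP ⊆ QP ⊆ QCQP ⊆ SOCP ⊆ SDP):

    \begin{align*}
    \text{LP:}   &\quad \min_x \; c^T x \;\; \text{s.t.}\;\; Ax \le b \\
    \text{QP:}   &\quad \min_x \; \tfrac{1}{2} x^T P x + c^T x \;\; \text{s.t.}\;\; Ax \le b, \quad P \succeq 0 \\
    \text{QCQP:} &\quad \min_x \; \tfrac{1}{2} x^T P_0 x + c_0^T x \;\; \text{s.t.}\;\; \tfrac{1}{2} x^T P_i x + c_i^T x + d_i \le 0, \quad P_i \succeq 0 \\
    \text{SOCP:} &\quad \min_x \; c^T x \;\; \text{s.t.}\;\; \lVert A_i x + b_i \rVert_2 \le c_i^T x + d_i \\
    \text{SDP:}  &\quad \min_x \; c^T x \;\; \text{s.t.}\;\; F_0 + \textstyle\sum_i x_i F_i \succeq 0
    \end{align*}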

  • Advanced Convex Optimization in Machine Learning

  • MPM (Minimax Probability Machine): Problem Sketch (1). a^T z = b is the decision hyperplane; the full formulation is written out after slide (3) of this group.

  • MPM: Problem Sketch (2)

  • MPM: Problem Sketch (3)
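In the notation of the accompanying MPM work (class-conditional means x̄, ȳ and covariance matrices Σ_x, Σ_y, with the infimum taken over all distributions having these first two moments), the minimax problem sketched on these three slides is to pick the hyperplane a^T z = b that maximizes the worst-case probability α of classifying both classes correctly:

    \begin{align*}
    \max_{\alpha,\, a \ne 0,\, b} \;\; & \alpha \\
    \text{s.t.}\;\; & \inf_{x \sim (\bar{x}, \Sigma_x)} \Pr\{ a^T x \ge b \} \ge \alpha, \\
    & \inf_{y \sim (\bar{y}, \Sigma_y)} \Pr\{ a^T y \le b \} \ge \alpha .
    \end{align*}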

  • MPM: Main Result (1): the Marshall & Olkin / Popescu & Bertsimas bound (stated after slide (5) of this group)

  • MPM: Main Result (2)

  • MPM: Main Result (3): Lemma

  • MPM: Main Result (4): the Lemma turns the probabilistic constraint into a deterministic constraint

  • MPM: Main Result (5)
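The bound referred to above (Marshall & Olkin, in the form given by Popescu & Bertsimas) states that, for a convex set S and any distribution with mean x̄ and covariance Σ,

    \sup_{x \sim (\bar{x}, \Sigma)} \Pr\{ x \in S \} \;=\; \frac{1}{1 + d^2},
    \qquad
    d^2 \;=\; \inf_{x \in S} \, (x - \bar{x})^T \Sigma^{-1} (x - \bar{x}).

Applied to the half-spaces defined by a^T z = b, it turns each probabilistic constraint into the deterministic constraint of slide (4):

    a^T \bar{x} - b \;\ge\; \kappa(\alpha) \sqrt{a^T \Sigma_x a},
    \qquad
    b - a^T \bar{y} \;\ge\; \kappa(\alpha) \sqrt{a^T \Sigma_y a},
    \qquad
    \kappa(\alpha) = \sqrt{\frac{\alpha}{1 - \alpha}} .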

  • MPM: Geometric Interpretation

  • MPM: Link with FDA (Fisher Discriminant Analysis) (1); the two criteria are compared after slide (3) of this group

  • MPM: Link with FDA (2)

  • MPM: Link with FDA (3)
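For comparison (reconstructed from the accompanying paper rather than from the slide titles alone): the MPM chooses a to maximize the worst-case margin κ, while Fisher discriminant analysis maximizes the usual Rayleigh-quotient criterion,

    \kappa_{\mathrm{MPM}}(a) \;=\; \frac{a^T (\bar{x} - \bar{y})}{\sqrt{a^T \Sigma_x a} + \sqrt{a^T \Sigma_y a}},
    \qquad
    \kappa_{\mathrm{FDA}}(a) \;=\; \frac{a^T (\bar{x} - \bar{y})}{\sqrt{a^T (\Sigma_x + \Sigma_y)\, a}} .

Both ratios are invariant to rescaling of a, and when Σ_x = Σ_y they differ only by a constant factor, so the two methods then select the same discriminant direction.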

  • Robustness to Estimation Errors: Robust MPM (R-MPM)

  • Robust MPM (R-MPM)

  • MPM: Convex optimization to solve the problem. Linear classifier; nonlinear classifier by kernelizing. The convex optimization is a Second Order Cone Program (SOCP), competitive with the Quadratic Program (QP) of SVMs (Lemma). A solver sketch follows this slide.
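Concretely, eliminating b from the two deterministic constraints reduces the MPM to the SOCP min_a ( sqrt(a^T Σ_x a) + sqrt(a^T Σ_y a) ) subject to a^T(x̄ - ȳ) = 1; the optimal κ* is the reciprocal of the optimal value, and α = κ*^2 / (1 + κ*^2). Below is a minimal sketch of that SOCP for the linear classifier, using plug-in sample moments and the cvxpy modeling library (the function name mpm_fit and the ridge term are illustrative choices, not part of the slides):

    import numpy as np
    import cvxpy as cp

    def mpm_fit(X, Y, ridge=1e-6):
        """Linear MPM via the SOCP; X and Y hold the two classes' points as rows."""
        xbar, ybar = X.mean(axis=0), Y.mean(axis=0)
        # plug-in covariance estimates, with a small ridge for numerical stability
        Sx = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
        Sy = np.cov(Y, rowvar=False) + ridge * np.eye(Y.shape[1])
        Lx, Ly = np.linalg.cholesky(Sx), np.linalg.cholesky(Sy)   # Sx = Lx Lx^T

        a = cp.Variable(X.shape[1])
        # min ||Sx^{1/2} a|| + ||Sy^{1/2} a||  s.t.  a^T (xbar - ybar) = 1
        prob = cp.Problem(cp.Minimize(cp.norm(Lx.T @ a) + cp.norm(Ly.T @ a)),
                          [a @ (xbar - ybar) == 1])
        prob.solve()

        a_opt = a.value
        kappa = 1.0 / prob.value                 # optimal worst-case margin kappa*
        alpha = kappa**2 / (1.0 + kappa**2)      # worst-case lower bound on accuracy
        b = a_opt @ xbar - kappa * np.sqrt(a_opt @ Sx @ a_opt)   # hyperplane offset
        return a_opt, b, alpha                   # classify z with class X if a^T z >= b

Kernelizing, as on this slide, amounts to writing a as a linear combination of the mapped training points and replacing the means and covariances by their Gram-matrix counterparts, which leaves the problem an SOCP of the same form.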

  • MPM: Empirical results. α = 1 - β and TSA (test-set accuracy) of the MPM, compared to BPB (best performance in Breiman's report, Arcing classifiers, 1996) and SVMs; averages over 50 random partitions into 90% training and 10% test sets. Performance is comparable with the existing literature and with SVMs. α = 1 - β is indeed smaller than the test-set accuracy in all cases, consistent with β being a worst-case bound on the probability of misclassification. Kernelizing leads to more powerful decision boundaries (α for the linear decision boundary is smaller than α for the nonlinear decision boundary with a Gaussian kernel).

  • Conclusions

  • Future directions

  • Advanced Convex Optimization in Machine Learning

  • The idea (1)

  • The idea (2)

  • The idea (3)

  • The idea (4)

  • The idea (5)

  • Hard margin SVM classifiers (1)

  • Hard margin SVM classifiers (2)

  • Hard margin SVM classifiers (3)

  • Hard margin SVM classifiers (4)

  • Hard margin SVM classifiers (5): SDP!

  • Hard margin SVM classifiers (6): optimization, learning the kernel matrix!

  • Hard margin SVM classifiers (7): learning the kernel matrix with a labelled training set and an unlabelled test set

  • Hard margin SVM classifiers (8)

  • Hard margin SVM classifiers (9)

  • Hard margin SVM classifiers (10)

  • Hard margin SVM classifiers (11): learning the kernel matrix with SDP! A sketch of this step follows this slide.
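A minimal sketch of the kernel-learning step for the hard margin case, under the assumption (one of the settings considered in this line of work) that K is a nonnegative combination of fixed kernel matrices with a trace constraint. The hard-margin dual value ω(K_tr) = max { 2 α^T e - α^T diag(y) K_tr diag(y) α : α ≥ 0, α^T y = 0 } is minimized over the kernel weights; folding the inner maximization into its Schur-complement (matrix-fractional) epigraph is what makes this a semidefinite program. The helper name learn_kernel_weights, the cvxpy modeling layer, and the small ridge are illustrative assumptions, not from the slides:

    import numpy as np
    import cvxpy as cp

    def learn_kernel_weights(kernels, y_tr, c=1.0, ridge=1e-6):
        """Pick K = sum_i mu_i * K_i (mu_i >= 0, trace(K) = c) minimizing the
        hard-margin SVM dual omega(K_tr) computed on the training block."""
        y = np.asarray(y_tr, dtype=float)             # labels in {-1, +1}
        n_tr = y.size
        Yd = np.diag(y)
        e = np.ones(n_tr)
        # G_i = diag(y) K_i[train, train] diag(y); the trace constraint uses the full kernels
        Gs = [Yd @ K[:n_tr, :n_tr] @ Yd for K in kernels]
        traces = np.array([np.trace(K) for K in kernels])

        mu = cp.Variable(len(kernels), nonneg=True)   # kernel combination weights
        nu = cp.Variable(n_tr, nonneg=True)           # multipliers for alpha >= 0
        lam = cp.Variable()                           # multiplier for alpha^T y = 0
        G = sum(mu[i] * Gs[i] for i in range(len(kernels))) + ridge * np.eye(n_tr)

        # omega(K_tr) = min over nu >= 0, lam of (e + nu + lam*y)^T G^{-1} (e + nu + lam*y);
        # matrix_frac expresses this epigraph via a Schur complement, i.e. an SDP.
        prob = cp.Problem(cp.Minimize(cp.matrix_frac(e + nu + lam * y, G)),
                          [traces @ mu == c])
        prob.solve()                                  # needs an SDP-capable solver (e.g. SCS)
        return mu.value

The learned weights then assemble the full (training plus test) kernel K = Σ_i μ_i K_i, and a standard hard margin SVM trained on its training block classifies the unlabelled test points, matching the transductive setup of slide (7).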

  • Empirical results: hard margin SVMs

  • Conclusions and future directions

  • See also