
TRANSCRIPT

  • Actively Transfer Domain Knowledge
    Xiaoxiao Shi (Sun Yat-sen University), Wei Fan (IBM T. J. Watson Research Center), Jiangtao Ren (Sun Yat-sen University)

    "Transfer when you can; otherwise ask, and don't stretch it."

  • Standard Supervised Learning: train a classifier on labeled New York Times articles and test it on unlabeled New York Times articles. Accuracy: 85.5%.

  • In Reality: labeled New York Times training data are insufficient, and accuracy drops to 47.3%. How can we improve the performance?

  • Solution I, Active Learning: ask a domain expert to label selected New York Times instances. Accuracy recovers to 83.4%, but each query incurs labeling cost.

  • Solution II, Transfer Learning: train a transfer classifier on labeled out-of-domain data (Reuters) and apply it to the unlabeled in-domain test set (New York Times). There is no guarantee transfer learning helps: when the domains differ significantly, accuracy can drop (e.g., from 82.6% to 43.5%).

  • Motivation. Active learning: labeling cost. Transfer learning: domain-difference risk.

    Both have disadvantages, so which should we choose?

  • Proposed Solution (AcTraK): train a transfer classifier from the out-of-domain labeled training data (Reuters) together with the labeled in-domain training data. For each unlabeled in-domain instance, a decision function judges the transfer classifier's prediction: if it is reliable, the instance is labeled by the classifier; if it is unreliable, an active learner asks the domain expert for the label. Newly labeled instances are added to the in-domain training data.
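The loop on this slide can be sketched as follows. All helper names are hypothetical stand-ins (the transfer classifier and the decision rule are detailed on later slides); this is only a sketch of the control flow, not the paper's implementation.

```python
# Sketch of the AcTraK framework: label each unlabeled in-domain instance
# either by the transfer classifier (when reliable) or by the domain expert.

def actrak(unlabeled, transfer_clf, in_domain_clf, decide_query, ask_expert):
    """transfer_clf / in_domain_clf return P(+|x); decide_query returns True
    when the transfer prediction is unreliable and the expert should be asked."""
    labeled = []
    for x in unlabeled:
        t_x = transfer_clf(x)       # prediction using out-of-domain knowledge
        ml_x = in_domain_clf(x)     # prediction from labeled in-domain data
        if decide_query(t_x, ml_x, len(labeled)):
            y = ask_expert(x)       # unreliable: pay the labeling cost
        else:
            y = "+" if t_x >= 0.5 else "-"  # reliable: label by the classifier
        labeled.append((x, y))
        # (in the full method, the in-domain classifier is retrained on `labeled`)
    return labeled
```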

  • Transfer Classifier. Train Mo on the out-of-domain labeled dataset. For each out-of-domain label, collect the (few) labeled in-domain examples that Mo assigns that label, e.g. L+ = { (x, y=+/-) | Mo(x) = L+ } (the true in-domain label may be either + or -), and train mapping classifiers ML+ and ML- on these sets. To classify an in-domain unlabeled instance X: first classify X by the out-of-domain model Mo to get P(L+|X, Mo) and P(L-|X, Mo); then classify X by the mapping classifiers to get P(+|X, ML+) and P(+|X, ML-). The probability for X to be + is

    T(X) = P(+|X) = P(L+|X, Mo) P(+|X, ML+) + P(L-|X, Mo) P(+|X, ML-)
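The combination above can be sketched directly from the probability identity; function and argument names are illustrative, not from the paper's code.

```python
# T(X) = P(L+|X,Mo) * P(+|X,ML+) + P(L-|X,Mo) * P(+|X,ML-)

def transfer_probability(p_Lpos, p_pos_given_MLpos, p_pos_given_MLneg):
    """p_Lpos:            P(L+|X, Mo), out-of-domain model's probability for L+
    p_pos_given_MLpos: P(+|X, ML+), mapping classifier trained on Mo's L+ set
    p_pos_given_MLneg: P(+|X, ML-), mapping classifier trained on Mo's L- set
    """
    p_Lneg = 1.0 - p_Lpos  # binary out-of-domain labels: P(L-|X,Mo) = 1 - P(L+|X,Mo)
    return p_Lpos * p_pos_given_MLpos + p_Lneg * p_pos_given_MLneg

# Example: Mo thinks X looks like an out-of-domain positive (0.8), and the
# mapping classifiers translate that into in-domain evidence.
t_x = transfer_probability(0.8, 0.9, 0.2)  # 0.8*0.9 + 0.2*0.2 = 0.76
```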


  • When the prediction of the transfer classifier is unreliable, ask the domain expert instead.

    Decision Function: ask the domain expert, rather than the transfer classifier, to label the instance when a) the predictions conflict, b) confidence is low, or c) there are few labeled in-domain examples.

  • Decision Function: combine a) conflict between the transfer classifier's prediction T(x) and the in-domain classifier's prediction ML(x), b) the confidence of T(x), and c) the size of the labeled in-domain set. Draw a random number R in [0,1]; AcTraK asks the domain expert to label the instance with a probability determined by these three factors (the exact formula is given in the paper), and otherwise labels it by the transfer classifier.
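A minimal sketch of this randomized query rule, assuming a surrogate query probability: the paper's exact formula is not reproduced here, so `query_probability` below is a hypothetical stand-in that only illustrates the three ingredients (conflict, confidence, labeled-set size).

```python
import random

def query_probability(t_x, ml_x, n_labeled):
    """Surrogate (not the paper's formula) combining the three criteria."""
    conflict = 1.0 if (t_x >= 0.5) != (ml_x >= 0.5) else 0.0  # a) predictions disagree
    low_confidence = 1.0 - abs(t_x - 0.5) * 2.0               # b) T(x) close to 0.5
    small_sample = 1.0 / (1.0 + n_labeled)                    # c) few in-domain labels
    return max(conflict, low_confidence, small_sample)

def label_instance(t_x, ml_x, n_labeled, rng=random.random):
    """Return ('expert', None) or ('classifier', predicted_label)."""
    if rng() < query_probability(t_x, ml_x, n_labeled):  # R in [0,1]
        return ("expert", None)                          # unreliable: ask the expert
    return ("classifier", "+" if t_x >= 0.5 else "-")    # reliable: trust T(x)
```

With confident, agreeing predictions and many labels, the query probability is small, so the transfer classifier does the labeling; a conflict forces an expert query.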

  • Properties
    - It can reduce domain-difference risk: according to Theorem 2, the expected error is bounded.
    - It can reduce labeling cost: according to Theorem 3, the query probability is bounded.

  • Theorems: a bound on the expected error of the transfer classifier, and a bound on the query probability (the formulas appear as equations on the slide).

  • Experiments setup

    Data sets:
    - Synthetic data sets
    - Remote Sensing: training data collected from regions with a specific ground surface condition; test data collected from a new region
    - Text classification: same top-level classification problems, with different sub-fields in the training and test sets (20 Newsgroup)

    Comparable models:
    - Inductive learning: AdaBoost, SVM
    - Transfer learning: TrAdaBoost (ICML'07)
    - Active learning: ERS (ICML'01)

  • Experiments on Synthetic Datasets. In-domain: 2 labeled training examples plus the test set; out-of-domain: 4 labeled training examples.

  • Experiments on Real-World Datasets. Evaluation metrics: accuracy, for the comparison with transfer learning; and IEA (Integral Evaluation on Accuracy), for the comparison with active learning.
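The slide does not reproduce the IEA definition; the paper defines IEA(AcTraK, ERS, 250) precisely. As a rough, assumed illustration only: comparing two active learners by the area under their accuracy-versus-number-of-queries curves up to a query budget (here 250) can be sketched with trapezoidal integration. The function name and the area-based form are assumptions, not the paper's definition.

```python
def area_under_accuracy(queries, accuracies):
    """Trapezoidal area under an accuracy-vs-#queries curve.

    queries:    increasing query counts, e.g. [0, 50, 100, ...]
    accuracies: accuracy observed after that many expert queries
    """
    area = 0.0
    for i in range(1, len(queries)):
        width = queries[i] - queries[i - 1]
        area += width * (accuracies[i] + accuracies[i - 1]) / 2.0
    return area
```

The two learners' areas over the same budget can then be compared; how exactly IEA combines them is specified in the paper.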

  • Results: 1. comparison with the transfer learner (TrAdaBoost); 2. comparison with the active learner (ERS, on 20 Newsgroup).

    Accuracy comparison on Remote Sensing (Landmine), six datasets:

    Dataset    SVM     TrAdaBoost   AcTraK
    1          0.57    0.8976       0.9449
    2          0.57    0.8604       0.9448
    3          0.57    0.905        0.9449
    4          0.57    0.8842       0.9449
    5          0.57    0.907        0.9449
    6          0.57    0.9476       0.9449

    Accuracy comparison on 20 Newsgroup, six datasets:

    Dataset    SVM     TrAdaBoost   AcTraK
    1          0.600   0.723        0.754
    2          0.591   0.674        0.706
    3          0.536   0.744        0.809
    4          0.527   0.573        0.780
    5          0.491   0.772        0.821
    6          0.576   0.713        0.751

    IEA(AcTraK, ERS, 250) on the six datasets: 0.91, 1.83, 0.21, 0.88, 0.35, 0.84

  • Conclusions. Actively Transfer Domain Knowledge (AcTraK):
    - Reduces domain-difference risk: transfers only useful knowledge (Theorem 2).
    - Reduces labeling cost: queries domain experts only when necessary (Theorem 3).

