Bridged Refinement for Transfer Learning

XING Dikan, DAI Wenyua, XUE Gui-Rong, YU Yong

Shanghai Jiao Tong University
{xiaobao,dwyak,grxue,yyu}@apex.sjtu.edu.cn

Outline

• Motivation
• Problem
• Solution
  – Assumption
  – Method
  – Improvement and Final Solution
• Experiment
• Conclusion

Motivation

• Email spam filtering: deciding whether a given mail is spam or not.

– Training Data

– Test Data

[Figure: mailboxes of owners A, B, C, D, …, Z, Y, with topics pop music, basketball, football, and classic music]

Motivation

• New events always occur.
  – News in 2006: commercial or politics?
  – News in 2007: commercial or politics?
• Solution?
  – Labeling new data again and again: costly
• Therefore, we try to utilize the old labeled data while taking the shift of distribution into consideration. [Transfer useful information]

Problem

• We want to solve a classification problem.
• The set of target categories is fixed.
• Main difference from traditional classification:
  – The training data and test data are governed by two slightly different distributions.
• We do not need labeled data from the new test distribution.

Illustrative Example

[Figure: mails plotted by topic (sports, music); +: normal mail, -: spam mail]

Assumption

• P(c|d) does not change: P_train(c|d) = P_test(c|d), since
  – The set of target categories is fixed.
  – Each target category is definite.
• P(c|d_i) ~ P(c|d_j) when d_i ~ d_j, where ~ means “similar” or “close to each other”.
• Consistency
  – Mutual Reinforcement Principle

Method: Refinement

• UConf_c: scores of a base classifier, coarse-grained (Unrefined Confidence score of category c).
• M: adjacency matrix. M_ij = 1 if d_i is a neighbor of d_j (then row L1 normalized).
• RConf_c: Refined Confidence score of category c.
• The mutual reinforcement principle yields:
  RConf_c = α M RConf_c + (1 - α) UConf_c
  where α is a trade-off coefficient.
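
The refinement equation can be read as a fixed-point iteration. The following is a minimal sketch under assumptions not stated on the slide: NumPy is used, the toy matrix and scores are hypothetical, and the names `refine`, `tol`, and `max_iter` are illustrative only.

```python
import numpy as np

def refine(M, uconf, alpha=0.7, tol=1e-10, max_iter=1000):
    """Iterate RConf <- alpha * M @ RConf + (1 - alpha) * UConf until it stabilizes."""
    rconf = uconf.astype(float)  # start from the unrefined scores (astype copies)
    for _ in range(max_iter):
        nxt = alpha * (M @ rconf) + (1.0 - alpha) * uconf
        if np.max(np.abs(nxt - rconf)) < tol:
            return nxt
        rconf = nxt
    return rconf

# Toy example (hypothetical): 3 documents, each a neighbor of the other two.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
M = A / A.sum(axis=1, keepdims=True)   # row L1 normalization
uconf = np.array([0.9, 0.8, 0.1])      # hypothetical base scores for one category c
rconf = refine(M, uconf, alpha=0.7)
```

With α close to 1 the neighbors dominate the refined score; with α close to 0 the base classifier's score is kept almost unchanged.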

Method: Refinement

• Refinement can be regarded as reaching consistency under the mixture distribution.

• Why not try to reach consistency under the distribution of the test data?

Illustrative Example

[Figure: scatter plots of + (normal mail) and - (spam mail) points, continued over two slides]

Method: Bridged Refinement

• Bridged Refinement
  – Refine towards the mixture distribution.
  – Refine towards the target distribution.
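
The two steps can be sketched as two successive refinements. This is a sketch under assumptions: base-classifier scores are taken to be available for every document in the mixture (training plus test), the second step starts from the first step's refined scores restricted to the test documents, and the helper names, the toy matrices, and `test_idx` are hypothetical.

```python
import numpy as np

def row_normalize(A):
    """Row L1 normalization; all-zero rows are left as zeros (an assumption)."""
    sums = A.sum(axis=1, keepdims=True)
    return A / np.where(sums == 0, 1.0, sums)

def refine(M, uconf, alpha, n_iter=200):
    """Iterate RConf <- alpha * M @ RConf + (1 - alpha) * UConf."""
    rconf = uconf.copy()
    for _ in range(n_iter):
        rconf = alpha * (M @ rconf) + (1.0 - alpha) * uconf
    return rconf

def bridged_refinement(M_mix, M_test, uconf_mix, test_idx, alpha=0.7):
    # Step 1: refine towards the mixture distribution (training + test documents).
    rconf_mix = refine(M_mix, uconf_mix, alpha)
    # Step 2: refine towards the test distribution, starting from step 1's scores.
    return refine(M_test, rconf_mix[test_idx], alpha)

# Toy data (hypothetical): documents 0-1 are training, 2-3 are test documents.
test_idx = np.array([2, 3])
M_mix = row_normalize(np.ones((4, 4)) - np.eye(4))   # neighbor graph over all documents
M_test = row_normalize(np.array([[0.0, 1.0],
                                 [1.0, 0.0]]))        # neighbor graph over test documents only
uconf_mix = np.array([[0.9, 0.1],                     # one column per category
                      [0.8, 0.2],
                      [0.6, 0.4],
                      [0.3, 0.7]])
rconf_test = bridged_refinement(M_mix, M_test, uconf_mix, test_idx)
labels = rconf_test.argmax(axis=1)                    # predicted category per test document
```

Only the starting scores differ between refining directly on the test documents and the second step here: the bridged variant starts from scores already smoothed over the mixture.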

Experiment

• Data set
• Base classifiers
• Different refinement styles
• Performance
• Parameter sensitivity

Experiment: Data set

• Source
  – SRAA
    • Simulated autos (simauto)
    • Simulated aviation (simaviation)
    • Real autos (realauto)
    • Real aviation (realaviation)
  – 20 Newsgroups
    • Top-level categories: rec, talk, sci, comp
  – Reuters-21578
    • Top-level categories: org, places, people

Experiment: Data set

• Re-construction
  – 11 data sets

[Figure: sub-categories A1, A2, B1, B2 arranged by class (Positive +, Negative -) and by split (Training Data, Test Data)]
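
One way to read the re-construction figure is sketched below. The figure itself does not survive in the transcript, so everything specific here is an assumption: that A1 and B1 form the training data while A2 and B2 form the test data, and the particular 20 Newsgroups sub-categories assigned to each group. The names `SPLIT` and `build_task` are hypothetical.

```python
# Hypothetical grouping: which sub-categories play the roles of A1, A2, B1, B2
# is an assumption made for illustration, not taken from the slides.
SPLIT = {
    "A1": ["rec.autos", "rec.motorcycles"],               # positive class
    "A2": ["rec.sport.baseball", "rec.sport.hockey"],     # positive class
    "B1": ["talk.politics.guns", "talk.politics.misc"],   # negative class
    "B2": ["talk.politics.mideast", "talk.religion.misc"],  # negative class
}

def build_task(docs_by_subcategory):
    """Return (train, test) lists of (document, label) pairs."""
    def labeled(group, label):
        return [(doc, label)
                for sub in SPLIT[group]
                for doc in docs_by_subcategory.get(sub, [])]
    # Assumed split: train on A1 vs. B1, test on A2 vs. B2.
    train = labeled("A1", +1) + labeled("B1", -1)
    test = labeled("A2", +1) + labeled("B2", -1)
    return train, test

# Toy corpus (hypothetical documents).
docs = {
    "rec.autos": ["car review"],
    "rec.sport.hockey": ["game recap"],
    "talk.politics.guns": ["policy debate"],
    "talk.religion.misc": ["sermon notes"],
}
train, test = build_task(docs)
```

Under this reading the labels are the same in training and test data, while the documents come from different sub-categories, which is what shifts the distribution.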

Experiment: Base classifier

• Supervised
  – Generative model: Naïve Bayes classifier
  – Discriminative model: Support vector machines
• Semi-supervised
  – Transductive support vector machines

Experiment: Refinement Style

• No refinement (Base)
• One step
  – Refine directly on the test distribution (Test)
  – Refine on the mixture distribution only (Mix)
• Two steps
  – Bridged Refinement (Bridged)

Performance: On SVM

[Figure: SVM performance chart with series Base, Test, Mix, Bridged]

• Test (2nd), Mix (3rd) vs. Base (1st)
• Test (2nd) vs. Bridged (1st):
  – Different starting point

Performance: NB and TSVM

Parameter: K

Whether d_i is regarded as a neighbor of d_j is decided by checking whether d_i is in d_j's k-nearest-neighbor set.
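
The neighbor rule above can be turned into the matrix M as follows. This is a sketch under assumptions: cosine similarity is used as the closeness measure (the slide does not name one), a document is not counted as its own neighbor, and rows with no entries are left as zeros. The function name `knn_adjacency` and the toy vectors are hypothetical.

```python
import numpy as np

def knn_adjacency(X, k):
    """M[i, j] = 1 if d_i is among the k nearest neighbors of d_j, then row L1 normalized."""
    n = X.shape[0]
    # Cosine similarity between all pairs of documents (assumed similarity measure).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)            # a document is not its own neighbor
    M = np.zeros((n, n))
    for j in range(n):
        nearest = np.argsort(-sim[j])[:k]     # indices of d_j's k nearest neighbors
        M[nearest, j] = 1.0                   # d_i in kNN(d_j)  =>  M[i, j] = 1
    row_sums = M.sum(axis=1, keepdims=True)
    return M / np.where(row_sums == 0, 1.0, row_sums)   # all-zero rows stay zero

# Toy feature vectors (hypothetical): documents 0 and 1 are close, as are 2 and 3.
X = np.array([[1.0, 0.0],
              [0.9, 0.1],
              [0.0, 1.0],
              [0.1, 0.9]])
M = knn_adjacency(X, k=1)
```

Because the k-nearest-neighbor relation is not symmetric, M is generally not symmetric either; a larger k gives a denser M and smoother refined scores.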

Parameter: α

[Figure: error rate vs. different values of α]

Convergence

The refinement formula can be solved in closed form or iteratively.
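
Rearranging RConf_c = α M RConf_c + (1 - α) UConf_c gives (I - α M) RConf_c = (1 - α) UConf_c. For a row-normalized M and 0 ≤ α < 1 the matrix I - α M is invertible, so both routes reach the same solution. The sketch below compares them on a hypothetical toy matrix; the function names are illustrative and NumPy is an assumed dependency.

```python
import numpy as np

def refine_closed_form(M, uconf, alpha):
    """Solve (I - alpha * M) RConf = (1 - alpha) UConf directly."""
    n = M.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * M, (1.0 - alpha) * uconf)

def refine_iterative(M, uconf, alpha, n_iter=500):
    """Repeat RConf <- alpha * M @ RConf + (1 - alpha) * UConf."""
    rconf = uconf.copy()
    for _ in range(n_iter):
        rconf = alpha * (M @ rconf) + (1.0 - alpha) * uconf
    return rconf

# Toy row-normalized matrix and scores (hypothetical values).
M = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
uconf = np.array([0.9, 0.2, 0.4])
closed = refine_closed_form(M, uconf, alpha=0.7)
iterated = refine_iterative(M, uconf, alpha=0.7)
```

The direct solve costs a dense linear system in the number of documents; the iteration only needs matrix-vector products, which suits a sparse M.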

Conclusion

• Task: Transfer useful information from the training data to the same classification task on the test data, where the training and test data are governed by two different distributions.

• Approach: Take the mixture distribution as a bridge and make a two-step refinement.

Thank you

Please ask in slow and simple English

Backup 1: Transductive

• The boundary after either step of refinement is never actually calculated explicitly. It is implicit in the refined labels of the data points.

• It is drawn explicitly in the examples only for clearer illustration.

Backup 2: n-step

• One important problem left unsolved by us:
  – How to describe a distribution λ D_train + (1 - λ) D_test?
  – One solution is sampling in a generative manner. But this makes the result depend on each random number picked in the generative process, which may make the solution unstable and hard to reproduce.
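
The sampling idea and its dependence on random numbers can be illustrated as follows. This is only a sketch of one possible generative scheme, not the authors' procedure: each sampled document is drawn from the training set with probability λ and from the test set otherwise. The function name `sample_mixture` and the toy document lists are hypothetical.

```python
import numpy as np

def sample_mixture(train_docs, test_docs, lam, n, seed):
    """Draw n documents from lam * D_train + (1 - lam) * D_test."""
    rng = np.random.default_rng(seed)
    sample = []
    for _ in range(n):
        # Pick the source distribution first, then a document from it.
        pool = train_docs if rng.random() < lam else test_docs
        sample.append(pool[rng.integers(len(pool))])
    return sample

train_docs = ["train-0", "train-1", "train-2"]   # hypothetical documents
test_docs = ["test-0", "test-1"]
run_a = sample_mixture(train_docs, test_docs, lam=0.5, n=10, seed=0)
run_b = sample_mixture(train_docs, test_docs, lam=0.5, n=10, seed=0)
```

Fixing the seed makes a single run repeatable, but different seeds give different samples and hence different neighbor graphs, which is the instability noted above.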

Backup 3: Why the mutual reinforcement principle?

• If d_j has a high confidence of being in category c, then d_i, a neighbor of d_j, should also receive a high confidence score.