vowpal wabbit


Upload: sergey-makarevich

Post on 11-Aug-2015


TRANSCRIPT

Page 1: Vowpal wabbit

VOWPAL WABBIT

Paul Mineiro

Open Data Science Conference

Boston 2015

@opendatasci

Page 2: Vowpal wabbit

Vowpal Wabbit

Paul Mineiro

Microsoft

Page 3: Vowpal wabbit

Vowpal Wabbit: What Is It?

Machine Learning Toolkit and Research Vehicle

Open Source
https://github.com/JohnLangford/vowpal_wabbit/

Commercially Supported
http://azure.microsoft.com/en-us/services/machine-learning/

Currently sponsored by Microsoft Research
Formerly sponsored by Yahoo! Research

Page 4: Vowpal wabbit

Vowpal Wabbit: What Is It?

Machine Learning Toolkit and Research Vehicle

Open Source
https://github.com/JohnLangford/vowpal_wabbit/

Commercially Supported
http://azure.microsoft.com/en-us/services/machine-learning/

Currently sponsored by Microsoft Research
Formerly sponsored by Yahoo! Research

It’s a Mindset!

Page 5: Vowpal wabbit
Page 6: Vowpal wabbit

Iterate quickly

Smash giant data sets

Go beyond classification

Page 7: Vowpal wabbit

Iterate quickly

Page 8: Vowpal wabbit

Sub-Linear Debugging

Key Technology: Online Learning

Key Concept: Progressive Validation Loss

Goal: Rapid Interactive Experimentation

Page 9: Vowpal wabbit

Sub-Linear Debugging

Key Technology: Online Learning

Key Concept: Progressive Validation Loss

Goal: Rapid Interactive Experimentation

Latency kills productivity.
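Progressive validation loss can be illustrated with a few lines of online learning. The sketch below is not vw itself: it is a plain-Python logistic regression trained by SGD on a made-up data stream, and the function name, learning rate, and synthetic data are all assumptions chosen for illustration. The point it shows is the ordering: predict on each example before learning from it, so the running average loss behaves like a held-out estimate that is available after the first few examples rather than after a full pass.

```python
import math
import random


def progressive_validation(stream, dim, lr=0.5):
    """Online logistic regression; returns (weights, running average loss)."""
    w = [0.0] * dim
    total = 0.0
    history = []
    for t, (x, y) in enumerate(stream, start=1):
        # 1. Predict with the current weights, before the label is used.
        z = sum(wi * xi for wi, xi in zip(w, x))
        z = max(-30.0, min(30.0, z))  # keep exp() well-behaved
        p = 1.0 / (1.0 + math.exp(-z))
        # 2. Score that prediction: the example is still unseen here.
        p_safe = min(max(p, 1e-12), 1.0 - 1e-12)
        total += -(y * math.log(p_safe) + (1 - y) * math.log(1.0 - p_safe))
        history.append(total / t)
        # 3. Only now take the gradient step on this example.
        for i, xi in enumerate(x):
            w[i] -= lr * (p - y) * xi
    return w, history


# Synthetic, linearly separable stream (an assumption for the demo).
random.seed(0)
true_w = [2.0, -1.0, 0.5]
stream = []
for _ in range(2000):
    x = [random.uniform(-1, 1), random.uniform(-1, 1), 1.0]
    y = 1 if sum(t * xi for t, xi in zip(true_w, x)) > 0 else 0
    stream.append((x, y))

w, history = progressive_validation(stream, dim=3)
print(f"avg loss after 10 examples: {history[9]:.3f}")
print(f"avg loss after 2000 examples: {history[-1]:.3f}")
```

Because `history` is filled in as the stream is consumed, an experiment that is clearly going badly can be stopped early, which is the "sub-linear" part of the idea.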

Page 10: Vowpal wabbit

Sub-Linear Debugging: Pitfalls

Bias-Variance Tradeoffs (“Learning Curves Cross”)

Lower Bias: model class matches target better.

Lower Variance: fit less sensitive to training set.

Ideal: push on both.

Usually: pushing on just one, e.g.,

New features: lowering bias, increasing variance.

Regularizing: lowering variance.
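For squared loss, the two quantities named on this slide appear explicitly in the standard bias-variance decomposition. The notation is an assumption, not taken from the talk: $f$ is the target function, $\hat f_D$ is the model fit on training set $D$, and $\varepsilon$ is zero-mean label noise with variance $\sigma^2$.

```latex
y = f(x) + \varepsilon, \qquad
\mathbb{E}_{D,\varepsilon}\big[(y - \hat f_D(x))^2\big]
  = \underbrace{\big(\mathbb{E}_D[\hat f_D(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\big[(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)])^2\big]}_{\text{variance}}
  + \sigma^2
```

Adding features typically shrinks the first term and grows the second, while regularizing shrinks the second, usually at some cost to the first. Since the variance term falls as the training set grows, a comparison made on a small prefix of the data can reverse on the full set, which is one way learning curves come to cross.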

Page 11: Vowpal wabbit

Smash giant data sets

Page 12: Vowpal wabbit

There’s no data like more data

Page 13: Vowpal wabbit

Subject to the Bayes limit, larger training sets admit beneficial tradeoffs of bias for variance, potentially resulting in substantially lower generalization error.

Page 14: Vowpal wabbit

There’s no data like more data

Page 15: Vowpal wabbit

Smash giant data sets

Strategy 1: Multinode

Page 16: Vowpal wabbit

Multinode Training

Start cluster spanning daemon

Start (many) vw and point them at the daemon

Two strategies available:

  iterative (SGD + Averaging)
  L-BFGS

Both might work poorly for non-convex problems

Page 17: Vowpal wabbit

Multinode Training

Start cluster spanning daemon

Start (many) vw and point them at the daemon

Two strategies available:

  iterative (SGD + Averaging)
  L-BFGS

Both might work poorly for non-convex problems such as matrix factorization
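The "SGD + Averaging" strategy can be sketched in a single process. This is an illustration only, not vw's cluster code: the "nodes" are simulated by a loop, and the function names, shard count, learning rate, and synthetic regression data are assumptions. Each simulated node runs one SGD pass over its own shard starting from the shared weights, then the weight vectors are averaged and the next round begins.

```python
import random


def sgd_pass(w, shard, lr):
    """One pass of squared-loss SGD over a shard; returns a new weight list."""
    w = list(w)
    for x, y in shard:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            w[i] -= lr * err * xi
    return w


def average(models):
    """Coordinate-wise mean of several weight vectors."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]


def train_averaged(shards, dim, rounds=5, lr=0.1):
    w = [0.0] * dim
    for _ in range(rounds):
        # On a real cluster each of these passes would run on its own node.
        local_models = [sgd_pass(w, shard, lr) for shard in shards]
        w = average(local_models)
    return w


# Noise-free synthetic linear data (an assumption for the demo).
random.seed(0)
true_w = [1.5, -2.0]
data = []
for _ in range(800):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

shards = [data[k::4] for k in range(4)]  # four simulated nodes
avg_w = train_averaged(shards, dim=2)
print([round(v, 3) for v in avg_w])
```

Averaging is safe here because squared loss with a linear model is convex, so the mean of several good solutions is also a good solution. For a non-convex objective that guarantee is lost, which is the caveat the slide raises.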

Page 18: Vowpal wabbit

Smash giant data sets

Strategy 2: Multicore

Page 19: Vowpal wabbit

Multicore Training

Start several vw in daemon mode

Shared (lock-free!) state

Send data to children via netcat

Page 20: Vowpal wabbit

Multicore Training

Start several vw in daemon mode

Shared (lock-free!) state

Send data to children via netcat

… and then hope for the best.
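The idea of lock-free shared state can be sketched with threads standing in for vw's child processes. This is an assumption-laden illustration, not how vw is implemented: the worker function, thread count, learning rate, and synthetic data are invented for the example. Every worker reads and writes the same weight list with no lock, so updates may interleave or occasionally overwrite each other.

```python
import random
import threading


def worker(w, shard, lr):
    """Squared-loss SGD that updates the shared list `w` in place, unlocked."""
    for x, y in shard:
        # Other threads may change `w` between this read and the writes below.
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            w[i] -= lr * err * xi


# Noise-free synthetic linear data (an assumption for the demo).
random.seed(0)
true_w = [1.5, -2.0]
data = []
for _ in range(2000):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

shared_w = [0.0, 0.0]                    # the single shared model
shards = [data[k::4] for k in range(4)]  # one shard per worker
threads = [
    threading.Thread(target=worker, args=(shared_w, shard, 0.1))
    for shard in shards
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print([round(v, 3) for v in shared_w])
```

The exact result can vary from run to run because nothing orders the updates. On an easy convex problem like this one the occasional lost update does little harm, which is the optimistic case behind "hope for the best".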

Page 21: Vowpal wabbit

Go beyond classification

Page 22: Vowpal wabbit

Structured Prediction

Exploration Learning

Go beyond classification

Page 23: Vowpal wabbit

Structured Prediction

Go beyond classification

Page 24: Vowpal wabbit

Structured Prediction: What Is It?

Linear Dynamics               Non-linear Dynamics
Equilibrium Thermodynamics    Non-equilibrium Thermodynamics
Classification                Structured Prediction

Page 25: Vowpal wabbit

Structured Prediction: What Is It?

Shit we understood first      Everything else

Linear Dynamics               Non-linear Dynamics
Equilibrium Thermodynamics    Non-equilibrium Thermodynamics
Classification                Structured Prediction

Page 26: Vowpal wabbit

Structured Prediction: Examples

Task: Image Segmentation

Task: Machine Translation
Input: Ces deux principes se tiennent à la croisée de la philosophie, de la politique, de l’économie, de la sociologie et du droit.
Output: Both principles lie at the crossroads of philosophy, politics, economics, sociology, and law.

Task: Syntactic Analysis
Input: The monster ate a big sandwich.
Output: The monster ate a big sandwich.

Page 27: Vowpal wabbit

Structured Prediction Haiku

A joint prediction
Across a single input
Loss measured jointly

Hal Daumé III

Page 28: Vowpal wabbit

Structured Prediction via Reduction

(Imperatively) Define Search Space:

  Process your input
  Make calls to predict
  Inform vw about losses experienced

Testing uses exactly same code as training

Page 29: Vowpal wabbit

Example: Entity and Relation Extraction

James Earl Ray pleaded guilty in Memphis, Tenn. to the assassination of civil rights leader Martin Luther King Junior.

Page 30: Vowpal wabbit

Example: Entity and Relation Extraction

James Earl Ray pleaded guilty in Memphis, Tenn. to the assassination of civil rights leader Martin Luther King Junior.

James Earl Ray: Person
Memphis, Tenn.: Location
Martin Luther King Junior: Person

Page 31: Vowpal wabbit

Example: Entity and Relation Extraction

James Earl Ray pleaded guilty in Memphis, Tenn. to the assassination of civil rights leader Martin Luther King Junior.

James Earl Ray: Person
Memphis, Tenn.: Location
Martin Luther King Junior: Person

kill (James Earl Ray, Martin Luther King Junior)

Page 32: Vowpal wabbit

ER Search Space Pseudocode

preds = {}
foreach pos in input:   // left to right
    thispred = predict(input, pos, preds, 'entity')
    preds = preds ∪ {(pos, thispred)}
    if (label) loss(label, thispred, 'entity')
foreach pair in zip(preds, preds):
    thispred = predict(input, pair, preds, 'relation')
    preds = preds ∪ {(pair, thispred)}
    if (label) loss(label, thispred, 'relation')

Page 33: Vowpal wabbit

ER Search Space Pseudocode

preds = {}
foreach pos in input:   // left to right
    thispred = predict(input, pos, preds, 'entity')
    preds = preds ∪ {(pos, thispred)}
    if (label) loss(label, thispred, 'entity')
foreach pair in zip(preds, preds):
    thispred = predict(input, pair, preds, 'relation')
    preds = preds ∪ {(pair, thispred)}
    if (label) loss(label, thispred, 'relation')

Predict entities

Page 34: Vowpal wabbit

ER Search Space Pseudocode

preds = {}
foreach pos in input:   // left to right
    thispred = predict(input, pos, preds, 'entity')
    preds = preds ∪ {(pos, thispred)}
    if (label) loss(label, thispred, 'entity')
foreach pair in zip(preds, preds):
    thispred = predict(input, pair, preds, 'relation')
    preds = preds ∪ {(pair, thispred)}
    if (label) loss(label, thispred, 'relation')

Predict entities

Predict relations

Page 35: Vowpal wabbit

ER Search Space Pseudocode

preds = {}
foreach pos in input:   // left to right
    thispred = predict(input, pos, preds, 'entity')
    preds = preds ∪ {(pos, thispred)}
    if (label) loss(label, thispred, 'entity')
foreach pair in zip(preds, preds):
    thispred = predict(input, pair, preds, 'relation')
    preds = preds ∪ {(pair, thispred)}
    if (label) loss(label, thispred, 'relation')

Predict entities

Predict relations

Enforce constraints here
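The pseudocode above can be turned into a small runnable sketch. It does not use vw's search interface: `run`, `toy_predict`, `record_loss`, the `ALLOWED` table, and the `live_in` relation label are all hypothetical names invented here, and the "policy" is a lookup table rather than a learned model. What the sketch does show is the shape of the search space: entities are predicted left to right, relations are predicted over pairs with the label set restricted by the entity types already chosen, and the same function runs with or without gold labels.

```python
from itertools import combinations

ENTITY_TYPES = ["Person", "Location", "Other"]

# Hypothetical constraint table: relation labels allowed for a pair of types.
ALLOWED = {
    ("Person", "Person"): ["kill", "none"],
    ("Person", "Location"): ["live_in", "none"],
}


def run(mentions, predict, loss, gold=None):
    """Walk the search space once; identical code path for train and test."""
    preds = {}
    for pos in range(len(mentions)):  # left to right: predict entities
        label = predict(mentions, pos, preds, "entity", ENTITY_TYPES)
        preds[pos] = label
        if gold is not None:
            loss(gold[pos], label, "entity")
    for pair in combinations(range(len(mentions)), 2):  # predict relations
        i, j = pair
        # Enforce constraints: only offer relations valid for these types.
        allowed = ALLOWED.get((preds[i], preds[j]), ["none"])
        label = predict(mentions, pair, preds, "relation", allowed)
        preds[pair] = label
        if gold is not None:
            loss(gold.get(pair, "none"), label, "relation")
    return preds


# Stand-in policy and loss recorder (assumptions, not learned components).
LEXICON = {
    "James Earl Ray": "Person",
    "Memphis, Tenn.": "Location",
    "Martin Luther King Junior": "Person",
}


def toy_predict(mentions, where, preds, task, allowed):
    if task == "entity":
        return LEXICON.get(mentions[where], "Other")
    return "kill" if "kill" in allowed else "none"


losses = []


def record_loss(gold_label, predicted, task):
    losses.append(0 if gold_label == predicted else 1)


mentions = ["James Earl Ray", "Memphis, Tenn.", "Martin Luther King Junior"]
gold = {0: "Person", 1: "Location", 2: "Person", (0, 2): "kill"}

train_preds = run(mentions, toy_predict, record_loss, gold)  # with labels
test_preds = run(mentions, toy_predict, record_loss)         # without labels
print(test_preds[(0, 2)], sum(losses))
```

Passing `gold=None` is the only difference between the training-style and testing-style calls, which mirrors the claim that testing uses exactly the same code as training.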

Page 36: Vowpal wabbit
Page 37: Vowpal wabbit

Play with it.
https://github.com/JohnLangford/vowpal_wabbit

Ask questions.
https://groups.yahoo.com/neo/groups/vowpal_wabbit/info

Have fun.

Page 38: Vowpal wabbit

FIN