neural optimizer search with reinforcement learning2017/11/09  · 1(op 1);u 2(op 2)) irwan bello,...

20
Neural Optimizer Search with Reinforcement Learning Irwan Bello 1 Barret Zoph 1 Vijay Vasudevan 1 Quoc V. Le 1 1 Google Brain ICLR, 2017/ Presenter: Anant Kharkar Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation) Neural Optimizer Search with Reinforcement Learning ICLR, 2017/ Presenter: Anant Kharkar 1 / 20

Upload: others

Post on 06-Aug-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Neural Optimizer Search with Reinforcement Learning

Irwan Bello1 Barret Zoph1 Vijay Vasudevan1 Quoc V. Le1

1Google Brain

ICLR, 2017/ Presenter: Anant Kharkar

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 1

/ 20

Page 2: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Outline

1 IntroductionMotivationApproach

2 MethodsDomain-Specific LanguageController RNN

3 ExperimentsOptimizer DiscoveryTransfer Experiment

4 Summary

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 2

/ 20

Page 3: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Outline

1 IntroductionMotivationApproach

2 MethodsDomain-Specific LanguageController RNN

3 ExperimentsOptimizer DiscoveryTransfer Experiment

4 Summary

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 3

/ 20

Page 4: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Motivation

Classical optimizers:

SGD

SGD w/Momentum

Adam

RMSProp

Combination of stochastic methods and heuristic approximations

Want to automate process of generating update rulesProduce equation, not just numerical updates

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 4

/ 20

Page 5: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Outline

1 IntroductionMotivationApproach

2 MethodsDomain-Specific LanguageController RNN

3 ExperimentsOptimizer DiscoveryTransfer Experiment

4 Summary

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 5

/ 20

Page 6: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Approach

RNN controller produces update rule string

Controller updated based on performance of optimizer

RL approach to training

How to generate update rules? First define space of update rules

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 6

/ 20

Page 7: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Previous Work

LSTM for numerical updates (Andrychowicz et al., 2016)

Equations are more transferrable

Genetic programming for update equations (Orchard & Wang, 2016)

Slow and needs heuristics

Neural Architecture Search (Zoph & Le, 2017) - seen earlier

RNN produces network architecture

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 7

/ 20

Page 8: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Outline

1 IntroductionMotivationApproach

2 MethodsDomain-Specific LanguageController RNN

3 ExperimentsOptimizer DiscoveryTransfer Experiment

4 Summary

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 8

/ 20

Page 9: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Domain-Specific Language

Each optimizer has computational graph - binary expression tree

Components:

2 operands

Unary function for each operand

Binary function to combine

∆w = λ ∗ b(u1(op1), u2(op2))

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 9

/ 20

Page 10: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Outline

1 IntroductionMotivationApproach

2 MethodsDomain-Specific LanguageController RNN

3 ExperimentsOptimizer DiscoveryTransfer Experiment

4 Summary

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 10

/ 20

Page 11: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Controller RNN

Trained with Adam

Objective function: J(θ) = E∆∼pθ(.)[R(∆)]Optimize reward (accuracy of target model)

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 11

/ 20

Page 12: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Search Space

Operands

Gradients: g , g2, g3, sign(g)Moving averages: m, v , y , sign(m)Weights: 10−4w , 10−3w , 10−2w , 10−1wADAM, RMSProp, 1, small noise

Unary Functions

x ,−x , ex , log |x |, clip, drop, signBinary Functions

x + y , x − y , x ∗ y , xy+ε

Optimizers tested on 3x3 ConvNet (32 filters) for 5 epochs

Favors early progress

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 12

/ 20

Page 13: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Outline

1 IntroductionMotivationApproach

2 MethodsDomain-Specific LanguageController RNN

3 ExperimentsOptimizer DiscoveryTransfer Experiment

4 Summary

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 13

/ 20

Page 14: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Optimizer Discovery

Recurring element:esign(g)∗sign(m) ∗ g

If sign(g) agrees with running average, scale e - g keeps decreasing

Else scale 1e - gradient direction changed

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 14

/ 20

Page 15: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Outline

1 IntroductionMotivationApproach

2 MethodsDomain-Specific LanguageController RNN

3 ExperimentsOptimizer DiscoveryTransfer Experiment

4 Summary

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 15

/ 20

Page 16: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Rosenbrock Function

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 16

/ 20

Page 17: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

CIFAR-10

Wide ResNet (Zagoruyko & Komodakis, 2016) - 300 epochs

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 17

/ 20

Page 18: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

CIFAR-10

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 18

/ 20

Page 19: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Neural Machine Translation

Completely different model & task: WMT 2014 English → German taskGNMT model - 8 LSTM layers

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 19

/ 20

Page 20: Neural Optimizer Search with Reinforcement Learning2017/11/09  · 1(op 1);u 2(op 2)) Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural

Summary

RNN generates optimizer equations

Train RNN via RL setup

Optimizers tested on small ConvNet

New optimizers on par with state of the art

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le (Northwestern University and Intel Corporation)Neural Optimizer Search with Reinforcement LearningICLR, 2017/ Presenter: Anant Kharkar 20

/ 20