019 20160907 decoupled neural interfaces using synthetic gradients

Post on 16-Feb-2017

Decoupled Neural Interfaces using Synthetic Gradients

Tran Quoc Hoan

@k09ht haduonght.wordpress.com/

Paper Alert 2016-09-09, Hasegawa lab., Tokyo

The University of Tokyo

Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu

https://arxiv.org/abs/1608.05343

Findings

• Modelling error gradients: using modelled synthetic gradients in place of the true back-propagated error gradients decouples subgraphs, so that they can be updated independently and asynchronously

• Speeds up training and saves memory for RNNs

Neural network and the problem of locking

• Gradients are back-propagated sequentially, layer by layer

• Layer 1 must wait for the forward and backward computations of layers 2 and 3 before it can update

• Layer 1 is locked: it is coupled to the rest of the network

This is time-consuming for complex networks and for large networks distributed over multiple machines

Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
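
The locking of layer 1 can be read off the chain rule. As a sketch, assume a three-layer network with activations h_1 = f_1(x; θ_1), h_2 = f_2(h_1; θ_2), h_3 = f_3(h_2; θ_3) and a loss L(h_3). The symbols f_i, θ_i and L are notation introduced here for illustration, not taken from the slides.

```latex
\frac{\partial L}{\partial \theta_1}
  = \frac{\partial L}{\partial h_3}\,
    \frac{\partial h_3}{\partial h_2}\,
    \frac{\partial h_2}{\partial h_1}\,
    \frac{\partial h_1}{\partial \theta_1}
```

Every factor except the last depends on layers 2 and 3, so θ_1 cannot be updated until their forward and backward passes have finished.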

Idea: Synthetic Gradient

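Predict the error gradient instead of waiting for back-propagation:

• h_i: output of layer i, passed to the gradient estimator

• δ̂_i: synthetic (predicted) gradient for layer i

• Update: layer i is updated with δ̂_i

• Train estimator: the estimator is trained to match the true gradient once it becomes available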

Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients

Idea: Synthetic Gradient

M_i: a small, simple neural network (the synthetic gradient model for layer i)

Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
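
The update and estimator-training steps can be sketched in a few lines of plain Python. This is a minimal illustration under strong assumptions, not the authors' implementation: the network is a scalar two-layer model y = w2 * (w1 * x), the loss is squared error, and the estimator M_1 is a hypothetical linear model δ̂ = a * h + b. All names (`synthetic_gradient`, `train_step`, `train`, `a`, `b`, `lr_m`) are made up for this example.

```python
import random


def synthetic_gradient(h, a, b):
    """Hypothetical estimator M_1: predicts dL/dh from h alone (linear model)."""
    return a * h + b


def train_step(params, x, t, lr=0.05, lr_m=0.05):
    """One update of the scalar network y = w2 * (w1 * x) with target t."""
    w1, w2, a, b = params

    # Layer 1: forward pass, then an immediate update from the synthetic
    # gradient. It does not wait for layer 2 or for the loss.
    h = w1 * x
    delta_hat = synthetic_gradient(h, a, b)
    new_w1 = w1 - lr * delta_hat * x      # dh/dw1 = x

    # Layer 2: ordinary forward/backward pass, loss L = 0.5 * (y - t)^2.
    y = w2 * h
    err = y - t
    new_w2 = w2 - lr * err * h            # dL/dw2 = err * h
    delta_true = err * w2                 # true dL/dh, only known at this point

    # Train the estimator: regress its prediction onto the true gradient.
    diff = delta_hat - delta_true
    new_a = a - lr_m * diff * h
    new_b = b - lr_m * diff

    return (new_w1, new_w2, new_a, new_b), 0.5 * err * err


def train(steps=2000, seed=0):
    """Toy loop on the assumed task t = 2 * x; returns final params and loss."""
    rng = random.Random(seed)
    params = (0.5, 0.5, 0.0, 0.0)
    loss = None
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        params, loss = train_step(params, x, t=2.0 * x)
    return params, loss
```

In this sequential toy, layer 1's update never reads `err`, which is the point of the decoupling. In a real asynchronous setting the two halves of `train_step` would run on different workers, which this sketch does not attempt to show.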

Synthetic Gradient for RNN

Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients

Synthetic gradients are used only at the truncation points of the RNN
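
As a sketch of what this means, write L_τ for the loss at step τ, h_T for the hidden state at the truncation boundary T, and θ for the RNN parameters. This notation follows my reading of the paper (arXiv:1608.05343) and should be treated as an assumption. The gradient over the whole future then splits into a part computed by truncated back-propagation through time and a part replaced by a prediction:

```latex
\sum_{\tau=t}^{\infty} \frac{\partial L_\tau}{\partial \theta}
  = \sum_{\tau=t}^{T} \frac{\partial L_\tau}{\partial \theta}
    + \delta_T \, \frac{\partial h_T}{\partial \theta},
\qquad
\delta_T = \sum_{\tau=T+1}^{\infty} \frac{\partial L_\tau}{\partial h_T}
  \approx \hat{\delta}_T = M(h_T)
```

Here M is the synthetic gradient model and δ̂_T its prediction at the boundary. Only the first sum needs back-propagation through the unrolled steps.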

Experiments

Q: What hardware setup is needed to see this improvement (especially the one used at DeepMind)? Does it work on my GPU clusters?
