Long Short Term Memory Networks
Brian Cheung
LSTM
● Proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997
● Originally trained with an approximate error gradient, combining Real Time Recurrent Learning and truncated backpropagation through time
● Used for processing long-range contextual information
Hochreiter & Schmidhuber 1997
LSTMs reduce the vanishing gradient problem
[Figure: Standard Recurrent Network vs. LSTM Network, shaded by sensitivity to the first input]
● The darker the shade, the greater the sensitivity
● In the standard recurrent network, the sensitivity decays exponentially over time as new inputs overwrite the activations of the hidden units and the network 'forgets' the first input
Graves et al 2013
LSTMs reduce the vanishing gradient problem
● Memory cells and gating units allow information to be stored for long periods of time.
● Memory cells are additive in time
  ○ Gradients are also additive in time, which alleviates the vanishing gradient
Graves et al 2013
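The contrast between a multiplicative hidden state and an additive memory cell can be illustrated numerically. The sketch below is a toy scalar example, not taken from the slides: the recurrent weight `w`, the fixed pre-activations, and the constant forget gate value are all assumptions chosen for illustration, and the gate values are treated as constants.

```python
import numpy as np

def rnn_sensitivity(w, pre_activations):
    """d h_t / d h_0 for a scalar RNN h_t = tanh(a_t), a_t = w * h_{t-1} + x_t.
    Each step multiplies the gradient by w * tanh'(a_t)."""
    a = np.asarray(pre_activations, dtype=float)
    return np.cumprod(w * (1.0 - np.tanh(a) ** 2))

def cell_sensitivity(forget_gates):
    """d c_t / d c_0 for a scalar cell c_t = f_t * c_{t-1} + i_t * g_t,
    with gate values held constant: each step multiplies by f_t only."""
    return np.cumprod(np.asarray(forget_gates, dtype=float))

steps = 50
rnn = rnn_sensitivity(w=0.9, pre_activations=np.full(steps, 0.5))
cell = cell_sensitivity(np.full(steps, 0.99))
print(f"RNN sensitivity after {steps} steps:  {rnn[-1]:.3e}")
print(f"Cell sensitivity after {steps} steps: {cell[-1]:.3e}")
```

With a forget gate close to 1, the cell path scales the gradient by a factor close to 1 at every step, whereas the tanh recurrence shrinks it at every step.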
Backpropagation through time
● Derivative of objective function, O, with respect to linear activations, a
● Gradient descent weight update
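The equations for these two bullets did not survive extraction. In notation commonly used for backpropagation through time they would read roughly as follows. The symbols $\delta_j^t$, $w_{ij}$, $z_i^t$ (the value fed through weight $w_{ij}$ into unit $j$ at time $t$, so that $a_j^t = \sum_i w_{ij} z_i^t$) and the learning rate $\alpha$ are assumed here, not taken from the slides:

```latex
\delta_j^t \equiv \frac{\partial O}{\partial a_j^t},
\qquad
\frac{\partial O}{\partial w_{ij}} = \sum_{t=1}^{T} \delta_j^t \, z_i^t,
\qquad
\Delta w_{ij} = -\alpha \, \frac{\partial O}{\partial w_{ij}}
```

Because the weights are shared across time steps, the gradient for each weight sums the contributions from every step.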
Anatomy of an LSTM node
● x (Input): input to the LSTM node at a particular time point (character, phoneme, word)
● ι (Input Gate): determines whether the input will be stored in the memory cell
● φ (Forget Gate): determines whether the current contents of memory will be forgotten (erased)
● ω (Output Gate): determines whether the current memory contents will be output
● c (Cell State): memory cell
● h (Hidden State): output of the LSTM node
● g: input nonlinearity of the memory cell (e.g. tanh, sigmoid)
[Figure: diagram of an LSTM node with input X, input gate, forget gate, output gate, cell state, and hidden state]
LSTM Forward Propagation
[Figure: LSTM node diagram spanning time steps t-1 and t, showing the input X, the input, forget, and output gates, the cell state, and the hidden state]
1. Input Gate
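The equation on this slide did not survive extraction. A sketch of one common formulation without peephole connections, using the slide symbols $x$, $\iota$, and $h$; the weight matrices $W$, the bias $b$, and the logistic sigmoid $\sigma$ are assumed notation:

```latex
\iota_t = \sigma\left(W_{x\iota}\, x_t + W_{h\iota}\, h_{t-1} + b_\iota\right)
```

Each component of $\iota_t$ lies in $(0, 1)$: values near 1 let the new input into the memory cell, values near 0 block it.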
2. Forget Gate
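The equation on this slide did not survive extraction. A sketch of one common formulation without peephole connections, using the slide symbols $x$, $\phi$, and $h$; the weight matrices $W$, the bias $b$, and the logistic sigmoid $\sigma$ are assumed notation:

```latex
\phi_t = \sigma\left(W_{x\phi}\, x_t + W_{h\phi}\, h_{t-1} + b_\phi\right)
```

Values of $\phi_t$ near 1 keep the previous cell contents, values near 0 erase them.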
3. Cell State
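The diagram labels for this step (`*`, `g`, `+`, `*`) suggest an update of the following form. This is a sketch of one common formulation, with $\odot$ denoting elementwise multiplication; the weight matrices $W$ and bias $b_c$ are assumed notation:

```latex
c_t = \phi_t \odot c_{t-1} + \iota_t \odot g\left(W_{xc}\, x_t + W_{hc}\, h_{t-1} + b_c\right)
```

The first term carries the old memory forward, scaled by the forget gate; the second adds the new candidate, scaled by the input gate. This sum is what makes the memory cell additive in time.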
4. Output Gate
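The equation on this slide did not survive extraction. A sketch of one common formulation without peephole connections, using the slide symbols $x$, $\omega$, and $h$; the weight matrices $W$, the bias $b$, and the logistic sigmoid $\sigma$ are assumed notation:

```latex
\omega_t = \sigma\left(W_{x\omega}\, x_t + W_{h\omega}\, h_{t-1} + b_\omega\right)
```

In peephole variants the output gate also reads the updated cell state $c_t$, which is one reason it is usually computed after the cell update.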
5. Hidden State
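A sketch of this step in one common formulation, with $\odot$ denoting elementwise multiplication. The tanh squashing of the cell state is an assumption; the slide text does not say which function, if any, is applied to $c_t$ before gating:

```latex
h_t = \omega_t \odot \tanh\left(c_t\right)
```

The output gate decides how much of the (squashed) memory contents is exposed as the node's output.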
6. Output
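The six forward steps can be put together in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the implementation behind the slides: it uses no peephole connections, takes g = tanh, applies tanh to the cell state before the output gate, and uses a linear readout. The parameter names (`Wxi`, `Whi`, `Why`, ...) and the helper `init_params` are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_params(n_in, n_hid, n_out, seed=0):
    """Hypothetical parameter layout: one (Wx, Wh, b) triple per gate/cell input."""
    rng = np.random.default_rng(seed)
    p = {}
    for k in ("i", "f", "c", "o"):  # input gate, forget gate, cell input, output gate
        p["Wx" + k] = rng.normal(0.0, 0.1, (n_hid, n_in))
        p["Wh" + k] = rng.normal(0.0, 0.1, (n_hid, n_hid))
        p["b" + k] = np.zeros(n_hid)
    p["Why"] = rng.normal(0.0, 0.1, (n_out, n_hid))
    p["by"] = np.zeros(n_out)
    return p

def lstm_step(p, x, h_prev, c_prev):
    """One forward time step, in the order of the six stages on the slides."""
    def pre(k):
        # Pre-activation for gate/cell input k from the current input and previous hidden state.
        return p["Wx" + k] @ x + p["Wh" + k] @ h_prev + p["b" + k]
    i = sigmoid(pre("i"))                    # 1. input gate
    f = sigmoid(pre("f"))                    # 2. forget gate
    c = f * c_prev + i * np.tanh(pre("c"))   # 3. cell state (assumes g = tanh)
    o = sigmoid(pre("o"))                    # 4. output gate
    h = o * np.tanh(c)                       # 5. hidden state
    y = p["Why"] @ h + p["by"]               # 6. output (assumed linear readout)
    return y, h, c

params = init_params(n_in=3, n_hid=4, n_out=2)
h, c = np.zeros(4), np.zeros(4)
for x in np.eye(3):  # three one-hot inputs, e.g. characters
    y, h, c = lstm_step(params, x, h, c)
print(y.shape, h.shape, c.shape)
```

The hidden state and cell state returned by one call are passed into the next, which is how information from earlier inputs reaches later time steps.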
LSTM Backpropagation
1. Hidden State
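The equation on this slide did not survive extraction. A sketch of the hidden-state gradient, assuming a forward pass without peephole connections in which $h_t$ feeds a linear output $y_t = W_{hy} h_t + b_y$ and the gate and cell-input pre-activations $a_k^{t+1} = W_{xk} x_{t+1} + W_{hk} h_t + b_k$ of the next step. The names $\epsilon_h^t$, $\delta_k^t \equiv \partial O / \partial a_k^t$, and the weight matrices are assumed notation:

```latex
\epsilon_h^t \equiv \frac{\partial O}{\partial h_t}
  = W_{hy}^{\top} \frac{\partial O}{\partial y_t}
  + \sum_{k \in \{\iota, \phi, c, \omega\}} W_{hk}^{\top}\, \delta_k^{t+1}
```

The first term comes from the output at time $t$, the second from every unit at time $t+1$ that reads $h_t$.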
2. Output Gate
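A sketch of the output-gate gradient, assuming the forward step $h_t = \omega_t \odot \tanh(c_t)$ with $\omega_t = \sigma(a_\omega^t)$. Here $\epsilon_h^t \equiv \partial O / \partial h_t$ and $\delta_\omega^t \equiv \partial O / \partial a_\omega^t$ are assumed notation:

```latex
\delta_\omega^t = \epsilon_h^t \odot \tanh\left(c_t\right) \odot \omega_t \odot \left(1 - \omega_t\right)
```

The factor $\omega_t \odot (1 - \omega_t)$ is the derivative of the logistic sigmoid.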
3. Cell State
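A sketch of the cell-state gradient, assuming the forward steps $c_t = \phi_t \odot c_{t-1} + \iota_t \odot g(a_c^t)$ and $h_t = \omega_t \odot \tanh(c_t)$ without peephole connections. Here $\epsilon_h^t \equiv \partial O / \partial h_t$, $\epsilon_c^t \equiv \partial O / \partial c_t$, and $\delta_c^t \equiv \partial O / \partial a_c^t$ are assumed notation:

```latex
\epsilon_c^t = \epsilon_h^t \odot \omega_t \odot \left(1 - \tanh^2\left(c_t\right)\right)
             + \phi_{t+1} \odot \epsilon_c^{t+1},
\qquad
\delta_c^t = \epsilon_c^t \odot \iota_t \odot g'\left(a_c^t\right)
```

The second term of $\epsilon_c^t$ is the additive path through time: the gradient from step $t+1$ is scaled only by the forget gate, not by a squashing-function derivative.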
4. Forget Gate
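A sketch of the forget-gate gradient, assuming the forward step $c_t = \phi_t \odot c_{t-1} + \iota_t \odot g(a_c^t)$ with $\phi_t = \sigma(a_\phi^t)$. Here $\epsilon_c^t \equiv \partial O / \partial c_t$ and $\delta_\phi^t \equiv \partial O / \partial a_\phi^t$ are assumed notation:

```latex
\delta_\phi^t = \epsilon_c^t \odot c_{t-1} \odot \phi_t \odot \left(1 - \phi_t\right)
```

The forget gate multiplies the previous cell state, so $c_{t-1}$ appears as the local derivative.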
5. Input Gate
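A sketch of the input-gate gradient, assuming the forward step $c_t = \phi_t \odot c_{t-1} + \iota_t \odot g(a_c^t)$ with $\iota_t = \sigma(a_\iota^t)$. Here $\epsilon_c^t \equiv \partial O / \partial c_t$ and $\delta_\iota^t \equiv \partial O / \partial a_\iota^t$ are assumed notation:

```latex
\delta_\iota^t = \epsilon_c^t \odot g\left(a_c^t\right) \odot \iota_t \odot \left(1 - \iota_t\right)
```

The input gate multiplies the candidate $g(a_c^t)$, so the candidate appears as the local derivative.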
6. Input
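The backward steps can be checked against finite differences. The sketch below is not the implementation behind the slides: it assumes no peephole connections, g = tanh, and tanh applied to the cell state before the output gate, and the parameter names (`Wxi`, `Whi`, ...) are hypothetical. The hidden-state gradient from step 1 is passed in as `dh`, and the toy objective is O = sum(h_t).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

GATES = ("i", "f", "c", "o")  # input gate, forget gate, cell input, output gate

def forward(p, x, h_prev, c_prev):
    a = {k: p["Wx" + k] @ x + p["Wh" + k] @ h_prev + p["b" + k] for k in GATES}
    i, f, o = sigmoid(a["i"]), sigmoid(a["f"]), sigmoid(a["o"])
    g = np.tanh(a["c"])
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c, (i, f, o, g, c, c_prev)

def backward(p, cache, dh, dc_next):
    """dh: dO/dh_t (from the output and from step t+1).
    dc_next: the part of dO/dc_t arriving from step t+1."""
    i, f, o, g, c, c_prev = cache
    tc = np.tanh(c)
    d = {}
    d["o"] = dh * tc * o * (1 - o)          # 2. output gate
    dc = dh * o * (1 - tc ** 2) + dc_next   # 3. cell state
    d["c"] = dc * i * (1 - g ** 2)          #    cell input pre-activation
    d["f"] = dc * c_prev * f * (1 - f)      # 4. forget gate
    d["i"] = dc * g * i * (1 - i)           # 5. input gate
    dx = sum(p["Wx" + k].T @ d[k] for k in GATES)       # 6. input
    dh_prev = sum(p["Wh" + k].T @ d[k] for k in GATES)  # passed to step t-1
    dc_prev = dc * f                                     # passed to step t-1
    return dx, dh_prev, dc_prev

# Random parameters and inputs for the check.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
p = {}
for k in GATES:
    p["Wx" + k] = rng.normal(size=(n_hid, n_in))
    p["Wh" + k] = rng.normal(size=(n_hid, n_hid))
    p["b" + k] = rng.normal(size=n_hid)
x, h0, c0 = rng.normal(size=n_in), rng.normal(size=n_hid), rng.normal(size=n_hid)

h, c, cache = forward(p, x, h0, c0)
dx, dh_prev, dc_prev = backward(p, cache, dh=np.ones(n_hid), dc_next=np.zeros(n_hid))

# Central finite differences of O = sum(h_t) with respect to each input component.
eps = 1e-6
dx_num = np.zeros(n_in)
for j in range(n_in):
    e = np.zeros(n_in)
    e[j] = eps
    o_plus = forward(p, x + e, h0, c0)[0].sum()
    o_minus = forward(p, x - e, h0, c0)[0].sum()
    dx_num[j] = (o_plus - o_minus) / (2 * eps)
max_err = np.max(np.abs(dx - dx_num))
print("max abs difference:", max_err)
```

In a full backward pass over a sequence, `dh_prev` and `dc_prev` from step t would be fed into step t-1 as part of `dh` and as `dc_next`.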
LSTM Applications
● Image captioning (Vinyals et al 2014; Donahue et al 2014; Kiros et al 2014)
● Handwriting synthesis (Graves 2014)
● Speech recognition (Graves et al 2013)
● Neural Machine Translation (Sutskever et al 2014)
● Neural Turing Machine (Graves et al 2014)