Generating Sequences with Recurrent Neural Networks - Graves, Alex, 2013
Yuning Mao
Based on original paper & slides
Generation and Prediction
• Obvious way to generate a sequence: repeatedly predict what will happen next
• Best to split into smallest chunks possible: more flexible, fewer parameters
The Role of Memory
• Need to remember the past to predict the future
• Having a longer memory has several advantages:
• can store and generate longer range patterns
• especially ‘disconnected’ patterns like balanced quotes and brackets
• more robust to ‘mistakes’
Basic Architecture
• Deep recurrent LSTM net with skip connections
• Inputs arrive one at a time, outputs determine predictive distribution over next input
• Train by minimizing log-loss
• Generate by sampling from the output distribution and feeding the sample back in as the next input (sketch below)
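As a concrete illustration, here is a minimal PyTorch sketch of such a network; the layer sizes, names, and the single-output-layer reading of the skip connections are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SkipLSTM(nn.Module):
    """Stacked LSTM with skip connections: the input feeds every hidden
    layer, and every hidden layer feeds the output layer."""
    def __init__(self, n_in, n_hidden, n_layers, n_out):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.LSTM(n_in if i == 0 else n_in + n_hidden, n_hidden,
                    batch_first=True)
            for i in range(n_layers)
        )
        self.out = nn.Linear(n_layers * n_hidden, n_out)

    def forward(self, x):                    # x: (batch, seq, n_in)
        hiddens, below = [], None
        for i, lstm in enumerate(self.layers):
            inp = x if i == 0 else torch.cat([x, below], dim=-1)
            below, _ = lstm(inp)
            hiddens.append(below)
        return self.out(torch.cat(hiddens, dim=-1))  # logits over next input

# Training minimizes log-loss; for discrete inputs that is cross-entropy:
#   loss = nn.CrossEntropyLoss()(logits.reshape(-1, n_out), targets.reshape(-1))
```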
Text Generation
• Task: generate text sequences one character at a time
• Data: raw Wikipedia text from the Hutter Prize challenge (100 MB)
• 205 one-hot inputs (one per character), 205-way softmax output layer
• Split into length-100 sequences, with no network resets in between (sampling sketch below)
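A hypothetical character-level sampling loop for this setup, assuming the `SkipLSTM` sketch above plus `char_to_id`/`id_to_char` lookup tables (illustrative names, not from the paper):

```python
import torch
import torch.nn.functional as F

def sample_text(model, char_to_id, id_to_char, prime="The ", length=200):
    """Generate text one character at a time by sampling the softmax
    and feeding the sample back in as the next input."""
    model.eval()
    ids = [char_to_id[c] for c in prime]
    x = F.one_hot(torch.tensor(ids), num_classes=205).float().unsqueeze(0)
    out = list(prime)
    with torch.no_grad():
        for _ in range(length):
            logits = model(x)[:, -1]              # distribution over next char
            nxt = torch.multinomial(F.softmax(logits, dim=-1), 1).item()
            out.append(id_to_char[nxt])
            one_hot = F.one_hot(torch.tensor([[nxt]]), 205).float()
            x = torch.cat([x, one_hot], dim=1)    # append sample to the input
    return "".join(out)
```

Sampling (rather than taking the argmax) is what keeps the generated text varied; the loop recomputes the whole prefix each step, which is wasteful but keeps the sketch short.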
Network Architecture
Compression Results
Real Wiki data
Generated Wiki data
Handwriting Generation
• Task: generate pen trajectories by predicting one (x,y) point at a time
• Data: IAM Online Handwriting Database, 10K training sequences, many writers, unconstrained styles, captured from a whiteboard
• How to predict real-valued coordinates?
Recurrent Mixture Density Networks
• Suitably squashed output units parameterize a mixture distribution (usually Gaussian)
• Not just fitting Gaussians to data: every output distribution conditioned on all inputs so far
• For prediction, the number of mixture components is the number of choices for what comes next (loss sketch below)
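To make the training objective concrete, here is a sketch of the mixture density negative log-likelihood in the one-dimensional case; the paper's handwriting model uses bivariate Gaussians with a correlation term (next slide), but the structure of the loss is the same:

```python
import math
import torch

def mdn_nll_1d(pi_hat, mu, sigma_hat, target):
    """pi_hat, mu, sigma_hat: (batch, K) raw network outputs for K
    components; target: (batch, 1) observed next value."""
    log_pi = torch.log_softmax(pi_hat, dim=-1)   # mixture weights
    sigma = torch.exp(sigma_hat)                 # std devs must be positive
    log_norm = (-0.5 * ((target - mu) / sigma) ** 2
                - torch.log(sigma) - 0.5 * math.log(2 * math.pi))
    # -log sum_k pi_k * N(target | mu_k, sigma_k), computed stably
    return -torch.logsumexp(log_pi + log_norm, dim=-1).mean()
```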
Network Details
• 3 inputs: Δx, Δy, pen up/down
• 121 output units
• 20 two-dimensional Gaussians for (x, y) = 40 means (linear) + 40 std. devs (exp) + 20 correlations (tanh) + 20 weights (softmax)
• 1 sigmoid for pen up/down (parameter-splitting sketch below)
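A sketch of how those 121 raw outputs could be sliced and squashed into valid parameters; the ordering of the slices is an assumption, but the counts and squashing functions follow the slide:

```python
import torch

def split_params(y):                     # y: (batch, 121) raw outputs
    e = torch.sigmoid(y[:, 0:1])                     # 1  pen up/down
    pi = torch.softmax(y[:, 1:21], dim=-1)           # 20 mixture weights
    mu = y[:, 21:61].view(-1, 20, 2)                 # 40 means (linear)
    sigma = torch.exp(y[:, 61:101]).view(-1, 20, 2)  # 40 std devs (exp > 0)
    rho = torch.tanh(y[:, 101:121])                  # 20 correlations in (-1, 1)
    return e, pi, mu, sigma, rho
```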
Output Density
Handwriting Synthesis
• Want to tell the network what to write without losing the distribution over how it writes
• Can do this by conditioning the predictions on a text sequence
• Problem: alignment between text and writing unknown
• Solution: before each prediction, let the network decide where it is in the text sequence (soft-window sketch below)
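The paper implements this with a soft window: a mixture of K Gaussians over character positions whose centres can only move forward, so the alignment is learned and monotonic. A sketch, with tensor shapes and variable names assumed:

```python
import torch

def soft_window(alpha_hat, beta_hat, kappa_hat, kappa_prev, chars):
    """chars: (U, C) one-hot text; the *_hat arguments are (K,) raw
    network outputs emitted at each pen-prediction step."""
    alpha = torch.exp(alpha_hat)               # importance of each Gaussian
    beta = torch.exp(beta_hat)                 # (inverse) width of each Gaussian
    kappa = kappa_prev + torch.exp(kappa_hat)  # centres only move forward
    u = torch.arange(chars.shape[0], dtype=torch.float32)
    # phi[u] = sum_k alpha_k * exp(-beta_k * (kappa_k - u)^2)
    phi = (alpha[:, None]
           * torch.exp(-beta[:, None] * (kappa[:, None] - u) ** 2)).sum(dim=0)
    w = phi @ chars   # (C,) window vector, fed to the network as extra input
    return w, kappa
```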
Network Architecture
Unbiased Sampling
Biased Sampling
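Biased sampling trades diversity for legibility. Following the paper's probability bias b >= 0, the standard deviations are shrunk and the mixture weights sharpened before sampling, with b = 0 recovering unbiased sampling; a sketch, assuming raw outputs `pi_hat` and `sigma_hat` as above:

```python
import torch

def biased_params(pi_hat, sigma_hat, b=1.0):
    """Apply probability bias b: larger b concentrates samples on
    high-probability strokes, producing neater handwriting."""
    sigma = torch.exp(sigma_hat - b)                # shrink every std dev
    pi = torch.softmax(pi_hat * (1.0 + b), dim=-1)  # sharpen component weights
    return pi, sigma
```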