deep reinforcement learning

Deep RL Abecon 20.05.2016

Upload: leszek-rybicki

Post on 16-Apr-2017

192 views

Category:

Science

1 download

Report

Download

Embed Size (px):

TRANSCRIPT

Deep RLAbecon 20.05.2016

RL-type Problems

• game of chess, GO, Space Invaders

• balancing a unicycle

• investing in stock market

• running a business

• making fast food

• life…!

Markov Decision Process

• S - set of states

• A - set of actions (or actions for state)

• P(s, s’ | a) - state change

• R(s, s’ | a) - reward

• ∈ [0, 1] - discount factor

Maximize the total discounted reward:

The GOAL

: discount factor0 instant gratification

…

patience1

Value Functions

http://cs.stanford.edu/people/karpathy/reinforcejs/gridworld_dp.html

SARSA

http://cs.stanford.edu/people/karpathy/reinforcejs/gridworld_td.html

Q-Learning

http://cs.stanford.edu/people/karpathy/reinforcejs/gridworld_td.html

Reinforce.jshttp://cs.stanford.edu/people/karpathy/reinforcejs/

// create an environment objectvar env = {};env.getNumStates = function() { return 8; }env.getMaxNumActions = function() { return 4; }

// create the DQN agentvar spec = { alpha: 0.01 } agent = new RL.DQNAgent(env, spec);

setInterval(function(){ // start the learning loop var action = agent.act(s); // s is an array of length 8 agent.learn(reward);}, 0);

http://cs.stanford.edu/people/karpathy/reinforcejs/

2013: Deep RL

http://arxiv.org/abs/1312.5602

2014: Google buys DeepMind

2015: AlphaGO

Deep Q-Learning1. Do a feedforward pass for the current state s to get predicted Q-values

for all actions.

2. Do a feedforward pass for the next state s’ and calculate maximum overall network outputs max a’ Q(s’, a’).

3. Set Q-value target for action to r + γmax a’ Q(s’, a’) (use the max calculated in step 2). For all other actions, set the Q-value target to the same as originally returned from step 1, making the error 0 for those outputs.

4. Update the weights using backpropagation.

http://www.nervanasys.com/demystifying-deep-reinforcement-learning/

Deep Q-Learning

http://www.nervanasys.com/demystifying-deep-reinforcement-learning/

https://www.youtube.com/watch?v=32y3_iyHpBc

http://gabrielecirulli.github.io/2048/

https://www.youtube.com/watch?v=32y3_iyHpBc

http://www.youtube.com/watch?v=hUNTeZJSUWE

Asynchronous Gradient Descent

http://arxiv.org/abs/1602.01783

http://www.youtube.com/watch?v=cFWL_y9BVaQ

http://www.rethinkrobotics.com/baxter/

http://www.youtube.com/watch?v=NHWjlCaIrQo

Intro to Deep Reinforcement Learning

Deep Learning for Reinforcement Learning in · PDF fileDeep Learning for Reinforcement Learning in ... Deep Learning for Reinforcement Learning in Pacman Deep Learning für ... Während

Deep Reinforcement Learning of Video Games...the eld of machine learning and reinforcement learning, in particular deep reinforcement learning. 1.2.1 Architectural neural network design

Deep Learning and Reinforcement Learning Workflows in A.I. · Deep Learning and Reinforcement Learning Workflows in A.I. Jon Cherrie Software Engineering Manager Deep Learning Toolbox

Deep Reinforcement Learning: Q-Learning - Svetlana …slazebni.cs.illinois.edu/spring17/lec17_rl.pdf · Deep Reinforcement Learning: Q-Learning Garima Lalwani Karan Ganju Unnat Jain

Hierarchical Deep Reinforcement Learning: Integrating ... · Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction ... expected future intrinsic (controller)

Exploration in Deep Reinforcement Learning · 2017-11-10 · Exploration in Deep Reinforcement Learning Exploration in Deep Reinforcement Learning Vorgelegte Bachelor-Thesis von Markus

From Reinforcement Learning to Deep Reinforcement …fagostin/assets/files/...Keywords: Machine learning · Reinforcement learning Deep learning · Deep reinforcement learning 1 Introduction

Deep Learning with Reinforcement Learning · about statistics, information theory, machine learning, reinforcement learning and ﬁnally deep learning. Finally, an honest and special

Tutorial: Deep Reinforcement Learning · Outline Introduction to Deep Learning Introduction to Reinforcement Learning Value-Based Deep RL Policy-Based Deep RL Model-Based Deep RL

Deep Reinforcement Learning in a Handful of Trials …papers.nips.cc/paper/7725-deep-reinforcement-learning-in...Deep Reinforcement Learning in a Handful of Trials using Probabilistic

10703 Deep Reinforcement Learning and Controlrsalakhu/10703/Lecture_Exploration.pdf · 10703 Deep Reinforcement Learning and Control Russ Salakhutdinov Machine Learning Department

Deep Reinforcement Learning An Introduction

Deep Learning for Reinforcement Learning in Pacman · Deep Learning for Reinforcement Learning in Pacman Deep Learning für Reinforcement Learning in Pacman Vorgelegte Bachelor-Thesis

(Deep) Reinforcement Learning - Computer Vision Lab. …cvlab.postech.ac.kr/.../cse703r_2016s/Deep_Reinforcement_Learning… · (Deep) Reinforcement Learning Kee-Eung Kim School of

Tutorial: Deep Reinforcement Learning · 2018-09-10 · Tutorial: Deep Reinforcement Learning David Silver, Google DeepMind. Outline Introduction to Deep Learning Introduction to

DEEP FEATURE EXTRACTION FOR SAMPLE-EFFICIENT REINFORCEMENT … · Deep Reinforcement Learning Il deep reinforcement learning (DRL) `e una branca dell’apprendimento per rinforzo

Learning about (Deep) Reinforcement Learningsrome.github.io/files/dataphillytalk/data_philly_04022017.pdf · Learning about (Deep) Reinforcement Learning SSccootttt RRoommee romesco

Reinforcement Learning and Deep Reinforcement Learningcse.ucdenver.edu/.../Class-22-Reinforcement-learning-DL.pdf · 2018. 11. 28. · Outlines 1 Principles of Reinforcement Learning

Deep Reinforcement Learning and Control

Reinforcement and deep reinforcement learning for wireless

Tutorial: Deep Reinforcement Learning - Machine Learning ...hunch.net/~beygel/deep_rl_tutorial.pdfTutorial: Deep Reinforcement Learning - Machine Learning

Human-level control through deep reinforcement … control through deep reinforcement learning ... We demon- stratethatthedeepQ ... Human-level control through deep reinforcement learning

Reinforcement Learning - Deep Reinforcement Learningmmartin/URL/Lecture4.pdfReinforcement Learning Deep Reinforcement Learning Mario Martin CS-UPC May 17, 2018 ... Deep Q-Network algorithm

Deep Reinforcement Learning from Human Preferencespapers.nips.cc/...deep-reinforcement-learning-from-human-preferences.pdf · Deep Reinforcement Learning from Human Preferences Paul

Deep Reinforcement Learning with Double Q-Learning

Introduction to Deep Reinforcement Learningwnzhang.net/teaching/cs420/slides/13-deep-rl.pdf · 2020-06-08 · Deep Reinforcement Learning •Deep Reinforcement Learning •leverages

Tutorial: Deep Reinforcement Learning

10703 Deep Reinforcement Learning

Introduction to Deep Reinforcement Learning · PDF fileIntroduction to Deep Reinforcement Learning Shenglin Zhao ...

Hierarchical Deep Reinforcement Learning: Integrating ...papers.nips.cc/paper/6233-hierarchical-deep-reinforcement-learning... · work that integrates deep reinforcement learning

Hangzhou Deep Learning Meetup-Deep Reinforcement Learning

A Deep Reinforcement Learning Chatbot

Thinking While Moving: Deep Reinforcement Learning in ... While...Thinking While Moving: Deep Reinforcement Learning in Concurrent Environments. Thinking While Moving: Deep Reinforcement

Survey of Deep Reinforcement Learning for Motion Planning ... · Reinforcement Learning Autonomous vehicles Fig. 1: Web of Science topic search for ”Deep Reinforcement Learning”