neurogammon -...

Neurogammon CJ Bell Matthew Maas Brian Suchland Joe Cartano

Upload: others

Post on 29-Oct-2019

6 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Page 1: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

NeurogammonCJ Bell

Matthew MaasBrian Suchland

Joe Cartano

Page 2: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

Backgammon• A zero-sum board game between two players• Players roll dice and choose which checkers to

move• Players can also choose to use the doubling-

cube

Page 3: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

Backgammon (continued)• An excellent candidate for an AI program• BUT, the game involves a large element of

chance• Traditional search methods are inefficient• Expert human players rely on judgment, not

search.

Page 4: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

Neurogammon• Developed by Gerald Tesauro of IBM• Relies on neural-networks instead of search

Page 5: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

Implementation

• Six neural-networks for six different stages of the game. (289-?-?)

• One additional neural-network to determine whether to use the doubling-cube. (best setup: 243-24-9)

Page 6: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

Training

• Input: Initial board position and transition to next position

• The first six networks trained on a set of expert’s games, where each move was rated from -100 (worst) to 100.

Page 7: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

Training

• The seventh network trained on a separate set of expert games

• 3000 positions covering 64 games (225 set aside for testing)

• Each position was categorized from 1 to 9 by an expert, indicating whether it was a good time to use the doubling-cube.

• The 9 outputs were summed

Page 8: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

First Computer Olympiad

• Held in 1989• Pitted the six premier computer

backgammon programs of the time against each other in a round-robin tournament

• The first serious test of Neurogammon’sabilities

• All five other programs relied on traditional, human-defined board evaluations

Page 9: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

Results of the First Computer Olympiad

COMPUTER OPPONENT RESULTS(FIRST TO 11 POINTS)

Saitek Backgammon 12-9, won by Neurogammon

Mephisto Backgammon 12-5, won by Neurogammon

Backbrain 11-4, won by Neurogammon

AI Backgammon 16-1, won by Neurogammon

Video Gammon 12-7, won by Neurogammon

Page 10: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

TD-GammonVersion Training

GamesOpponents Results

TD-Gammon 0.0 300,000 Computer Programs

Tied for Best

TD-Gammon 1.0 300,000 Various Human Experts

-13 Points / 51 Games

TD-Gammon 2.0 800,000 Various Human Experts

-7 Points / 38 Games

TD-Gammon 2.1 1,500,000 Robertie(Grandmaster)

-1 Point / 40 Games

TD-Gammon 3.0 1,500,000 Kazaros(Grandmaster)

+6 Points / 20 Games

Page 11: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

TD-Gammon is Used to Reevaluate Board Positions

White has just rolled two 4’s, giving it 4 moves of 4 spaces each

Page 12: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

The Traditional Move

The traditionally accepted move in this situation is 8-4, 8-4, 11-7, 11-7

Page 13: Neurogammon - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf · Backgammon • A zero-sum board game between two players • Players

TD-Gammon’s Move

TD-Gammon’s move in this situation is 8-4, 8-4, 21-17, 21-17

0(+* 1+2.)*1+3#4’* - courses.cs.washington.educourses.cs.washington.edu/courses/cse351/12sp/lecture-slides/04-floats.pdf · !"#$%&’#()*+,*-.’/#"0(+"* f%3&%’%"(

courses.cs.washington.educourses.cs.washington.edu/courses/cse454/10au/student... · Web viewBook verification and deselecting: This second page would be an intermediate page, containing

What is ICT for Development - courses.cs.washington.educourses.cs.washington.edu/courses/csep590/04au/lectures/slides/... · What is ICT for Development? ... curriculum with rural

Computer Animation - courses.cs.washington.educourses.cs.washington.edu/.../animation/Computer_Animation.pdfBasic Animation •What’s the simplest kind of animation you can think

Relation Extraction II - courses.cs.washington.educourses.cs.washington.edu/courses/.../13wi/slides/...slide adapted from Rion Snow . Hypernyms via distant supervision ... Google was

Local Search and Optimization - courses.cs.washington.educourses.cs.washington.edu/courses/csep573/11wi/... · •Analysis –Say each search has probability p of success •E.g.,

Digital Sound Generation - courses.cs.washington.educourses.cs.washington.edu/courses/cse490s/11au/Readings/Digital_Sound_Generation_1.pdfDigital Sound Generation Beat Frei, 10-07-19,

LASSO Regression - courses.cs.washington.educourses.cs.washington.edu/.../slides/LARS-fusedlasso-annotated.pdfAxel Gandy LASSO and related algorithms 34 LARS – Illustration for p=2

courses.cs.washington.educourses.cs.washington.edu/courses/cse458/08au/... · Animation 3 Table of Contents 1 Animation Basics . . . . . . . . . . . . . . . . . . . . . . . . .

Slides by Alex Mariakakis - courses.cs.washington.educourses.cs.washington.edu/.../14sp/sections/section10-designpatter… · Design Patterns 2 . List of Design Patterns We discussed

CSEP 590 Data Compression - University of …courses.cs.washington.edu/courses/csep590a/07au/lectures/...CSEP 590 - Lecture 1 - Autumn 2007 7 Logistics • I will be gone the week

RNA Secondary Structure - courses.cs.washington.educourses.cs.washington.edu/courses/cse417/06wi/slides/06dp-rna.pdf · •E.g. “riboswitches”: thousands in bacteria. DNA structure:

Pipelining vs. Parallel processingcourses.cs.washington.edu/courses/cse378/07au/lectures/L...2 Pipelining vs. Parallel processing In both cases, multiple “things” processed by

NURBS Modeling - courses.cs.washington.educourses.cs.washington.edu/courses/cse459/06wi/help/...NURBS Modeling 6 Table of Contents Control multi-knots and CV hardness . . . . . .

Introduction to Data Programming - courses.cs.washington.educourses.cs.washington.edu/courses/cse140/14wi/lectures/00-introduction.pdf · programming language –…and you will gain

courses.cs.washington.educourses.cs.washington.edu/courses/csep561/08au/slides/p561.3.ink.pdfInternetworks Set of interconnected networks, e.g., the Internet Scale and heterogeneity

CSE 473: Artificial Intelligence - courses.cs.washington.educourses.cs.washington.edu/courses/cse473/14sp/slides/22-BN-inference.pdfCSE 473: Artificial Intelligence Bayesian Networks:

Reflection in Java - courses.cs.washington.educourses.cs.washington.edu/.../reflection.pdf · Programming Reflection To program with reflection, we must put on our meta-thinking caps

Image Stitching - courses.cs.washington.educourses.cs.washington.edu/courses/cse576/16sp/Slides/10_ImageStitching.pdfImage Stitching Ali Farhadi CSE 576 Several slides from Rick Szeliski,

Card 3000 - courses.cs.washington.educourses.cs.washington.edu/.../final-presentation.pdfStarbucks - 10/20 $1.23 $38.56 $700.89 $2.75 $4.00 RECENT TRANSACTIONS McDonald's rispy Kreme

cse403 testing part 2 - courses.cs.washington.educourses.cs.washington.edu/courses/cse403/12au/... · Announcements • Deliverables for 11/19 release (due 11:59PM) • Documents

Visual Schedule Finder - courses.cs.washington.educourses.cs.washington.edu/.../projects/srs/VisualScheduleFinder_SR… · The software we will produce is a Web application called

Building Java Programs - courses.cs.washington.educourses.cs.washington.edu/courses/cse142/08au/lectures/2008-10-13... · Building Java Programs Chapter 3 ... Java class libraries:

cse403 scale - courses.cs.washington.educourses.cs.washington.edu/courses/cse403/12au/lectures/cse403_s… · Scaling up through caching database server/ database web servers application

Players dribble around area, on coaches command players

Ray Tracing - courses.cs.washington.educourses.cs.washington.edu/courses/cse557/15au/lectures/ray-tracin… · Whitted ray-tracing algorithm In 1980, Turner Whitted introduced ray

Testing and Debugging - courses.cs.washington.educourses.cs.washington.edu/.../ppt/LogicAnalyzers.pdf · Logic analyzers 20 Semiconductor scaling confounds testing Testing is a key

Python Evaluation Rules - courses.cs.washington.educourses.cs.washington.edu/courses/cse140/13wi/eval_rules.pdf · 2016-08-02 · 3.2.1 Rules for Evaluation To evaluate a compound

The Google File System - courses.cs.washington.educourses.cs.washington.edu/courses/cse490h/08au/lectures/490h-gfs.pdf · Seagate Barracuda 7200.11 ... transfer: transferring data

Chapter 4: Greedy Algorithms - courses.cs.washington.educourses.cs.washington.edu/courses/cse417/14sp/slides/04greed.pdf · Interval Scheduling: Greedy Algorithms! Greedy template

Replication - courses.cs.washington.educourses.cs.washington.edu/courses/csep545/12wi/slides/10_Replication.pdfthe same effect as a serial execution on a one-copy database. • Readset

CSE 454 Final Presentation - courses.cs.washington.educourses.cs.washington.edu/courses/cse454/10au/.../presentation.pdf · CSE 454: Advanced Internet and Web Services Autumn 2010

courses.cs.washington.educourses.cs.washington.edu/.../04sp/pdfs/lectures/L4-MicroBlaze.pdf · Simple MicroBlaze System Block Diagram External to FPGA MicroBlaze LM B_BRAM IF_CN TLR

Computers - courses.cs.washington.educourses.cs.washington.edu/.../13/CSE120-L13-computers_17sp_ink.pdf · Computers museumin Seattle managed to get three. And one of them was Jobs’

A Survey on Virtualization Technologiescourses.cs.washington.edu/courses/cse451/07au/lectures/18-virt.pdf · Virtualization: What is it, really? Real vs. Virtual Similar essence,