
Artificial Intelligence 64 (1993) 169-180, Elsevier

ARTINT 1099

Book Review

Norvig's Paradigms of Artificial Intelligence Programming:* an instructor's perspective

James H. Martin

Computer Science Department and Institute of Cognitive Science, University of Colorado, Boulder, CO 80309-0430, USA

Received March 1993; revised May 1993

1. Introduction

The publication of a promising new textbook is always a source of excitement for educators. This is especially true when the text promises to break new ground in both content and style. Peter Norvig's Paradigms of AI Programming: Case Studies in Common Lisp is such a book. This book should be on the desk of everyone who needs to program in Common Lisp, build AI systems, or teach these topics. Be forewarned, however: this is not a book for the faint of heart.

The primary goal of this text is to convey the programming knowledge and skills necessary to build realistic AI systems. This goal is addressed by focusing on three related areas: the programming language Common Lisp, AI programming techniques, and AI content material. These areas are explored through the use of a series of case studies. Each case study is a reconstruction of some well-known AI system in Common Lisp. Students learn by dealing with a working system, evaluating the system on test cases, and walking through a series of revisions designed to both improve the system and further investigate the problems. Features of Common Lisp and a variety of AI techniques are introduced and explained along the way.

Correspondence to: J.H. Martin, Computer Science Department and Institute of Cognitive Science, University of Colorado, Boulder, CO 80309-0430, USA.

* (Morgan Kaufmann, San Mateo, CA, 1992); xxviii + 946 pages, $44.95.

0004-3702/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved

The following summary section will give a largely uncritical overview of the contents of this text. The remaining sections will then present a series of perspectives on the book. These perspectives are based upon having taught a graduate AI Programming course using Paradigms as the primary text for the past three years.

2. Summary

Paradigms' 25 chapters are divided into 5 major parts: "Introduction to Common Lisp", "Early AI Programs", "Tools and Techniques", "Advanced AI Programs", and "The Rest of Lisp". There are no purely theoretical chapters in this book; when a chapter introduces an idea, an implementation of that idea immediately follows. Therefore, each chapter introduces a significant number of new Lisp features and, more significantly, builds upon code introduced earlier in the text.

2.1. Part 1: Introduction to Common Lisp

The first chapter of Paradigms is designed to give the reader the minimum Lisp background necessary to get into the main part of the book. In Chapter 2, Norvig introduces the reader to the case study technique used throughout the rest of the book. This chapter develops two solutions to the problem of writing a program to generate random English sentences. Norvig first runs through a solution that encodes a grammar representation procedurally. A critique of this code is then used to motivate the idea of a declarative representation. This then leads to a program that manipulates an explicitly represented grammar.
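The declarative approach described above can be illustrated with a minimal sketch. It is written in Python rather than Common Lisp, and it is not Norvig's code: the grammar contents and the names `GRAMMAR` and `generate` are assumptions made for illustration. The point is only that the grammar lives in a data structure, and a single small function interprets it.

```python
import random

# Hypothetical toy grammar (an assumption for illustration): each category
# maps to a list of alternative expansions, and each expansion is a list of
# categories or literal words.
GRAMMAR = {
    "sentence": [["noun-phrase", "verb-phrase"]],
    "noun-phrase": [["article", "noun"]],
    "verb-phrase": [["verb", "noun-phrase"]],
    "article": [["the"], ["a"]],
    "noun": [["man"], ["ball"], ["woman"], ["table"]],
    "verb": [["hit"], ["took"], ["saw"], ["liked"]],
}


def generate(symbol, grammar=GRAMMAR, rng=random):
    """Return a list of words produced by randomly expanding `symbol`."""
    if symbol not in grammar:
        # Anything that is not a category is treated as a literal word.
        return [symbol]
    expansion = rng.choice(grammar[symbol])
    words = []
    for part in expansion:
        words.extend(generate(part, grammar, rng))
    return words
```

Calling `" ".join(generate("sentence"))` yields a random five-word sentence. Changing the language generated means editing `GRAMMAR` only, which is the advantage the critique of the procedural version is meant to motivate.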

Chapter 3 follows with an extremely fast tour of a large chunk of Common Lisp. Here Norvig deviates from the overall style of the book to simply present the major features of the language one by one. As such, it is best used as a resource that can be referred to as needed, rather than as an integrated part of teaching from this book.

2.2. Part 2: Early AI programs

In Part 2, the case-study methodology begins in earnest. This part presents reconstructions of four early AI programs: GPS, ELIZA, STUDENT, and MACSYMA. It also includes a chapter on software tools that are used repeatedly throughout the book.


GPS: the general problem solver

Chapter 4 serves both as an introduction to GPS or STRIPS-style linear planning and as a lesson in AI-style experimental methodology. The chapter presents three versions of GPS. Each new version is built in response to unanticipated problems encountered in the testing of a previous version. The most interesting complications arise from a variety of interacting sub-goal problems.

ELIZA: dialog with a machine

Chapter 5 uses the ELIZA program as a vehicle to introduce the notion of pattern matching. The pattern matching code in this chapter is used in a variety of forms throughout the rest of the book.
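As a rough indication of what ELIZA-style pattern matching involves, here is a minimal Python sketch. It is not the book's matcher: the function name `match` and the conventions that `?x` binds one word and `?*x` binds a run of zero or more words are assumptions chosen for illustration.

```python
def match(pattern, words, bindings=None):
    """Match a list of pattern tokens against a list of words.

    Returns a dict of variable bindings on success, or None on failure.
    Assumed conventions: "?x" matches exactly one word, "?*x" matches a
    segment of zero or more words, and anything else must match literally.
    """
    bindings = dict(bindings or {})
    if not pattern:
        return bindings if not words else None
    head, rest = pattern[0], pattern[1:]
    if head.startswith("?*"):
        # Segment variable: try every possible split, shortest first.
        for i in range(len(words) + 1):
            segment = words[:i]
            if head in bindings and bindings[head] != segment:
                continue
            result = match(rest, words[i:], {**bindings, head: segment})
            if result is not None:
                return result
        return None
    if not words:
        return None
    if head.startswith("?"):
        # Single-word variable: a repeated variable must bind consistently.
        if head in bindings and bindings[head] != [words[0]]:
            return None
        return match(rest, words[1:], {**bindings, head: [words[0]]})
    if head == words[0]:
        return match(rest, words[1:], bindings)
    return None
```

For example, matching `"i need ?*x".split()` against `"i need a vacation".split()` binds `?*x` to `["a", "vacation"]`; an ELIZA-like program would then substitute that binding into a canned response template.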

Building software tools

Chapter 6 investigates the notion of procedural abstraction by creating a series of software tools. Among the tools developed are a rule-based interpreter, a pattern matcher, and finally a set of searching tools. The search section includes a fairly extensive discussion of blind, heuristic and optimal search strategies. The GPS system from Chapter 4 is then reconstructed using these search tools as an abstraction layer.

STUDENT: solving word algebra problems

Chapter 7 continues the pattern-matching discussion begun in Chapter 5. Here Norvig reconstructs Bobrow's thesis program, STUDENT, using the pattern matching tool from the previous chapter.

Symbolic mathematics: a simplification program

A more in-depth exploration of pattern-matching is conducted in Chapter 8. Here a subset of the MACSYMA system is implemented. Among the features that are implemented are algebraic simplification, differentiation, and integration.

2.3. Part 3: Tools and techniques

Part 3 departs somewhat from the case-study approach to delve into a variety of efficiency techniques, programming styles and knowledge representation issues. Norvig maintains continuity with the earlier parts of the book by exploring these issues in the context of the previously implemented systems.

Efficiency issues

Chapter 9 begins by exploring four fairly general efficiency techniques: memoization, rule compilation, indexing, and delays. Norvig then presents a set of profiling tools to be used in instrumenting code to decide where to focus one's efficiency efforts. Finally the simplification code from Chapter 8 is instrumented and altered using these techniques to produce a new more efficient version.
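Of the four techniques, memoization is the easiest to show in a few lines. The following Python sketch is an illustration of the general idea, not a transcription of the book's Lisp; the names `memoize` and `fib` and the `cache` attribute are assumptions introduced here.

```python
def memoize(fn):
    """Return a version of `fn` that caches results by argument tuple."""
    cache = {}

    def memoized(*args):
        # Compute each distinct call once, then serve it from the table.
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]

    memoized.cache = cache  # exposed only so the table can be inspected
    return memoized


@memoize
def fib(n):
    # The recursive calls go through the memoized name `fib`, so the
    # exponential call tree collapses to one computation per argument.
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

With the decorator in place, `fib(30)` performs 31 distinct computations instead of over a million recursive calls. Python's standard library offers the same facility as `functools.lru_cache`.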

Low-level efficiency issues

As the title implies, Chapter 10 covers a range of low-level efficiency hacks specific to Common Lisp. Among the issues covered are declarations, the costs of consing, and the costs of complex argument lists.

Logic programming

Chapter 11 introduces Prolog from a pragmatic programming language point of view. Norvig focuses on what he considers three important ideas in the language: a uniform database of rules and assertions, unification of logic variables and automatic backtracking. As one might guess by this point in the book, Norvig immediately proceeds to implement what is essentially a complete Prolog system. Returning to the theme of efficiency, Chapter 12 presents a Prolog compiler based on the Warren abstract machine model.

Object-oriented programming

The eclectic nature of Part 3 continues with a discussion of object-oriented programming. Chapter 13 begins with a brief introduction to the object-oriented paradigm; this is followed by an implementation of an extremely simple object system using closures. This simple system is used to illustrate the notions of objects, generic functions, classes, delegation and inheritance. Norvig then turns his attention to presenting the language features of the Common Lisp Object System (CLOS). Here Norvig avoids the temptation to present his own full-blown implementation of CLOS. Chapter 13 concludes with a reconstruction of the search abstractions from Chapter 6 using CLOS. This reconstruction is followed by a useful discussion of the relative merits of the procedural and object-oriented versions.
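The idea of building objects out of closures can be sketched briefly. The Python below is an illustrative assumption, not the book's code: the bank-account example and the names `new_account` and `send` are hypothetical. An object is simply a function that closes over its private state and maps a message name to a method.

```python
def new_account(name, balance=0.0):
    """Create an account 'object' as a closure over its private state."""

    def withdraw(amount):
        nonlocal balance
        if amount > balance:
            return "insufficient funds"
        balance -= amount
        return balance

    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance

    methods = {
        "withdraw": withdraw,
        "deposit": deposit,
        "balance": lambda: balance,
        "name": lambda: name,
    }

    def dispatch(message):
        # The object itself: given a message name, return the method.
        return methods[message]

    return dispatch


def send(obj, message, *args):
    """Generic entry point: look up the method for `message` and call it."""
    return obj(message)(*args)
```

For instance, after `acct = new_account("A. User", 100.0)`, the call `send(acct, "deposit", 50.0)` returns `150.0`. The state is reachable only through the messages the closure chooses to answer, which is what makes this a (very small) object system.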

Knowledge representation and reasoning

In Chapter 14 Norvig turns his attention to the issue of knowledge representation. The Prolog implementation from Chapter 11 is used as a substrate upon which a robust knowledge representation language is built. Among the issues discussed are expressiveness, completeness, possible worlds, and the issue of notation. Although the substrate used here is Prolog, the issues discussed are clearly motivated by current directions being taken in the standardization of knowledge representation systems.

2.4. Part 4: Advanced AI programs

Part 4 returns to the case-study style set forth in Part 2. The AI programs and theoretical issues discussed in this part are in general more in-depth and more interesting than those in Part 2. The programming itself is far more complex and is based upon the many abstractions introduced earlier.

Symbolic mathematics with canonical forms

Chapter 15 revisits the realm of symbolic mathematics. In this version Norvig abandons the notion of general purpose pattern matching in favor of special purpose functions centered around the idea of canonical simplification used in MACSYMA.

Expert systems

Chapter 16 produces impressive reconstructions of EMYCIN and MYCIN. Included in this chapter are discussions and implementations of backward chaining, contexts, certainty factors, an explanation facility and a user interface.

Line-diagram labeling by constraint satisfaction

Chapter 17 uses the topic of high-level vision as a means to introduce symbolic constraint satisfaction. A nearly complete implementation of Waltz's line-labeling work is presented.

Search and the game of Othello

Norvig addresses the topic of game-playing search in Chapter 18. The chapter begins with a discussion of the usual topics of representation, evaluation functions, minimax search, and alpha-beta pruning in the context of the game Othello. Norvig then moves on to an implementation of an Othello player that is an amalgam of Rosenbloom's IAGO and Lee and Mahajan's Bill programs.
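For readers unfamiliar with the search techniques named here, the following Python sketch shows minimax with alpha-beta pruning on an abstract game tree. It is a generic textbook formulation, not the book's Othello code: the tree encoding (nested lists with numeric leaves standing in for evaluation-function values) and the function name `alphabeta` are assumptions for illustration.

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    """Return the minimax value of `node`, pruning with alpha-beta bounds.

    Assumed encoding: a leaf is a number (its static evaluation); an
    interior node is a list of child nodes, with players alternating.
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizer already has a better option elsewhere
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizer already has a better option elsewhere
    return value
```

On the tree `[[3, 5], [2, 9], [0, 7]]` the result is `3`, and the leaves `9` and `7` are never examined: once a minimizing node is known to be worth at most 2 (or 0), it cannot beat the 3 already guaranteed to the maximizer.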

Introduction to natural language

Chapter 19 represents something of a stylistic departure for Norvig. Rather than reimplement a particular system, Norvig simply presents the basics of natural language processing using phrase structure grammars. This chapter includes discussions of parsing, ambiguity, chart-parsing, semantic interpretation and parsing with preferences in the context of building a natural language interface to a CD player.

Unification grammars

Chapter 20 continues the discussion of natural language but shifts its focus from simple parsing to parsing using unification grammars. Using the framework of definite clause grammars, Norvig explores the topics of parsing as deduction, quantifier scope, long distance dependencies and semantic interpretation.


A grammar of English

Chapter 21 uses the mechanisms developed in Chapter 20 to develop an impressive grammar that covers all the major syntactic constructions of English.

2.5. Part 5: The rest of Lisp

As its title suggests, Part 5 has something of a kitchen-sink feel to it. It includes chapters on Scheme, a Scheme compiler, miscellaneous features of ANSI Common Lisp not yet covered, and a troubleshooting chapter.

Scheme: an uncommon Lisp

Having covered Common Lisp, CLOS, and Prolog, Norvig turns to Scheme in Chapters 22 and 23. In Chapter 22, Norvig introduces the language itself by walking through the development of a complete Scheme interpreter.

Compiling Lisp

Chapter 22 is simply a precursor to the far more interesting Chapter 23. Here Norvig develops a complete Scheme compiler, a peephole optimizer and a machine simulator to run the resulting compiled code.

ANSI Common Lisp

Chapter 24 is a roundup of various Common Lisp language features that for some reason didn't show up in the previous 800 pages. Included among these are packages, condition handling, pretty printing, series, the loop macro, and advanced macro hacking.

Troubleshooting

The final chapter deals with what to do when things aren't working the way they should. Written in the style of a troubleshooting guide, the chapter covers such topics as variables that refuse to change their value, common syntax errors with various special forms, and problems with macros and packages. This chapter also includes a succinct style guide. Like Chapter 3, this chapter is intended as a resource, and readers should not wait until they get to the end of the book before availing themselves of it.

3. General comments

A 1000-page textbook containing 10,000 lines of Common Lisp code is bound to have a few problems. The following comments are a grab-bag of impressions gained from teaching using this book that roughly parallel Section 2. Section 4 will provide several more focused perspectives on the text.

Chapters 1 and 2 are a nice introduction to both Common Lisp and the overall style of the text. Students new to Lisp should find these chapters fairly easy going. Chapter 3 deviates from the overall pedagogical style of the book and is clearly an attempt to make the book more palatable to those readers who simply want to be told what the language features are and how they work. While it serves that purpose well, an instructor would be well-advised to skip it altogether while referring students back to it as necessary.

The lead-off chapter of Part 2 on GPS represents a paradigm of the case-study approach that Norvig is using. It gives students a good feel for the exploratory problem solving style that is such a major part of AI. The only weakness in this chapter is that it plays somewhat fast and loose with the details of GPS and STRIPS. In the interest of presenting a small neat implementation, Norvig largely ignores the issue of how these systems heuristically choose among appropriate operators.

The remainder of Part 2 is dominated by the idea of pattern matching. While this is an important and powerful idea, the ELIZA-STUDENT-MACSYMA sequence is a bit repetitious. The STUDENT chapter could have been safely omitted with little loss in either AI or programming content.

The chapters that make up Part 3 are widely varying in content and complexity. This part of the book has two distinct themes running through it. The first theme contains issues relating to programming language theory, efficiency, and implementation. This theme runs through all the chapters except the knowledge representation one. The second theme concerns knowledge representation. This second theme runs through the first Prolog chapter, the object-oriented programming chapter and the knowledge representation chapter. The knowledge representation chapter contains information critical to understanding AI. The programming language issues, while interesting, may be safely skipped if there isn't time for them.

Norvig really hits his stride with the material in Part 4. The applications are interesting, and the programming is quite challenging. The MYCIN, Waltz line-labeling, Othello and first natural language processing chapters should all be of interest to a wide audience. However, as with Part 3, instructors will find that it is unlikely that all students will be interested in, or prepared for, all the material in this part of the book. The MACSYMA chapter will likely only appeal to those with a strong interest in mathematics.

Similarly, the last two of the three natural language chapters will only appeal to those with some computational linguistics background. Unfortunately, these last two chapters suffer from another problem that limits their effectiveness. The unification grammar apparatus and the extensive English grammar are not used to implement an application. The average student will likely find them to be interesting but largely unmotivated artifacts.

Part 5 really only contains two chapters that will make it into a lesson plan: the Scheme and Scheme compiler chapters. As with the Prolog compiler chapter, these chapters will appeal mostly to those interested in programming languages.

4. Perspectives on this book

As a teaching resource Paradigms can be viewed from a number of different perspectives. While no single perspective does it justice, individually they provide a way to compare this book to others on the market; taken together they give some feel for the scope of the book.

4.1. Lisp

This section will consider Paradigms from the perspective of a modern Lisp programming text. Viewed from this perspective, I believe Paradigms goes far beyond any textbook I've seen on the topic. The remainder of this section will discuss some of the aspects of the book that an instructor should be aware of before using it.

Common Lisp supports many programming styles. Norvig's style is heavily functional and applicative. It's clearly not for everyone. Moreover, at times the ratio of code to explanatory text can be rather high. As a result an instructor may at times find that a five-line function will be completely obvious to half the class and completely obscure to the rest.

Going hand in hand with Norvig's functional style is his extensive use of procedural abstractions introduced earlier in the book. While this is generally a good thing, the number of layers of abstraction by the end of the book is quite large. It can be quite difficult at times to manage the inherent complexity of this style. Specifically, the natural language systems at the end of the book are not for those short of human or machine memory.

A second potential trouble spot concerns Norvig's scant coverage of object-oriented programming. Once introduced in Chapter 13, CLOS disappears, never to be seen again. Norvig's treatment of object-oriented techniques in Chapter 13 conveys the idea that they should be considered as tools to be used where appropriate, not as a religious movement. Nevertheless, it would have been nice if one of the advanced programs in Part 4 had been implemented using CLOS.

Finally, despite its rather impressive implementations, this book is still mainly concerned with the problem of medium scale software development by a single programmer. There is little direct discussion of the problems of programming in the large. In particular, Norvig's coverage of packages is lacking.

Other Lisp texts

A novice programmer needing to learn Common Lisp would be better advised to first turn to Touretzky's Common Lisp: A Gentle Introduction to Symbolic Computation [8] before attempting Norvig's text.

Wilensky's Common LISPcraft [9] is obviously geared more towards the computer science student who is a competent programmer and simply needs to learn Lisp for some reason. Norvig's book is overkill for this purpose. However, the well-prepared, confident student gets nearly all of Wilensky's basic Lisp material, quite a bit more about Common Lisp and a considerable amount of AI for nearly the same price.

Winston and Horn's Lisp [11] text is quite similar to Paradigms in that both present an extensive amount of advanced Lisp and AI material. The main difference lies in the breadth and depth of material presented in Norvig's text. For the most part, the programs in Paradigms move well beyond the toy stage toward realistic application programs. Winston and Horn's text does however have the added advantage of being coordinated with Winston's widely used text [10].

Any discussion of books covering Common Lisp must include a reference to Steele's Common Lisp: The Language [7]. Paradigms manages to introduce and use a significant fraction of the contents of Common Lisp. While it doesn't obviate the need to have Steele around someplace, you can get quite a lot done just using the features introduced and explained in Paradigms. The students in my classes have only rarely had to refer to Steele to solve their problems.

Finally, for the instructor who wants to cover more object-oriented material, Sonya Keene's Object-Oriented Programming in Common Lisp [4] would be an essential supplement.

4.2. Artificial intelligence

For a book that is primarily a programming textbook, Paradigms holds up remarkably well when viewed as an AI textbook. Many of the chapters have substantive AI discussions along with their implementations. Most notably, the material covered in the chapters on game playing, symbolic mathematics, knowledge representation, and natural language processing is arguably more comprehensive and reflective of modern practice than the equivalent treatments in either Winston's third edition Artificial Intelligence [10] or Rich and Knight's Artificial Intelligence [6].

The only missing element in Paradigms is unfortunately a rather major one: there is no treatment of learning. It's hard to criticize a nearly 1000-page book for not having enough in it, but on the other hand what's another 30 or 40 pages? A chapter reconstructing some well known learning system would have been a straightforward addition given the machinery already developed in the book. Material on learning could, of course, be added with a set of extra readings. Implementation of ideas presented in a well written journal article would be quite reasonable given the machinery Norvig provides.

4.3. AI programming

The area of AI programming is a fairly specialized one and has produced few texts. Artificial Intelligence Programming by Charniak, Riesbeck, McDermott and Meehan (hereafter AIP) [2] has been an important text since its first edition was published in 1980. It provides a long introduction to both basic and advanced features of Lisp, followed by chapters presenting implementations of a number of AI techniques including discrimination nets, deductive retrieval, production systems, frame systems, chronological backtracking and reason maintenance. These chapters each begin with a brief description of a technique and the kinds of problems it is intended to address; they then present Common Lisp code that implements the idea and demonstrate it with several small examples.

There are two notable differences between Norvig's book and AIP. The first difference involves the way in which code is conveyed to the reader. The emphasis in AIP is on the presentation and explication of a single good implementation of a technique, rather than on the iterative development and refinement of a solution as in Paradigms. The second difference involves the use of the implemented technique to solve some interesting problem. Unlike Paradigms, AIP does not present implementations of realistic uses for the given techniques. Ironically, the first edition of AIP did include two final chapters which were more in the case-study style. These chapters introduced and outlined a specification of Meehan's Talespin story-generating program. This third part was omitted from the second edition of the book.

4.4. Pedagogical style

Year after year, programming books are published that equate learning to program with learning the features of a particular language. Readers are led through chapters on data types, iteration constructs, conditionals, etc. Students can usually tell which language feature is appropriate to use in solving their problems based on where they are in the book, a crutch that is of little use when the course is over. This is particularly problematic when one is trying to teach techniques rather than language features. It is critical that students learn where techniques are appropriate as well as the technique itself.

Fortunately, Norvig's book can be seen as part of a recent trend away from this style. Several newer texts and teaching styles emphasize the idea that programming is not just coding and that programming languages must be learned in the situated context of actually writing interesting complex programs. The recent textbook Designing Pascal Solutions: A Case Study Approach by Clancy and Linn [3] is an excellent example of this trend and provides a clear motivation for the case-study approach to teaching programming.

As with any trend, this one has predecessors to thank. In the preface to Paradigms, Norvig acknowledges a book that is clearly one of the forerunners of this style: Kernighan and Plauger's Software Tools [5]. In Software Tools readers are quickly immersed in the real task of building some interesting piece of code to solve some realistic problem. Readers are expected to learn to write well only after having done considerable reading. Norvig's book owes another obvious debt to Abelson and Sussman's Structure and Interpretation of Computer Programs [1]. Norvig's emphasis on data and procedural abstraction is clearly reminiscent of Abelson and Sussman's book.

5. Conclusions

As noted earlier, I have successfully taught a graduate class on AI programming three times using this book. Paradigms has been well-received by the students in each offering. The book does, however, pose some challenges to an instructor, primarily having to do with gauging the abilities and interests of the students. The ideal student for this class and textbook has had an introductory AI course which includes substantive Lisp programming, a course on modern programming language theory, and a compiler class. The choice of path through Norvig's text is primarily dictated by which of these areas are missing from the background of most of the students.

References

[1] H. Abelson, G. Sussman and J. Sussman, Structure and Interpretation of Computer Programs (MIT Press, Cambridge, MA, 1985).

[2] E. Charniak, C. Riesbeck, D. McDermott and J. Meehan, Artificial Intelligence Programming (Lawrence Erlbaum, Hillsdale, NJ, 2nd ed., 1987).

[3] M. Clancy and M. Linn, Designing Pascal Solutions: A Case Study Approach (Computer Science Press, Rockville, MD, 1992).

[4] S. Keene, Object-Oriented Programming in Common Lisp (Addison Wesley, Reading, MA, 1988).

[5] B. Kernighan and P.J. Plauger, Software Tools (Addison Wesley, Reading, MA, 1976).

[6] E. Rich and K. Knight, Artificial Intelligence (McGraw Hill, New York, 2nd ed., 1991).

[7] G. Steele, Common Lisp: The Language (Digital Press, Bedford, MA, 2nd ed., 1990).

[8] D.S. Touretzky, Common Lisp: A Gentle Introduction to Symbolic Computation (Benjamin Cummings, Menlo Park, CA, 1990).

[9] R. Wilensky, Common LISPcraft (Norton, New York, NY, 1986).

[10] P. Winston, Artificial Intelligence (Addison Wesley, Reading, MA, 3rd ed., 1992).

[11] P. Winston and B. Horn, Lisp (Addison Wesley, Reading, MA, 3rd ed., 1989).
