
Mathematical logic and formalized theories A Survey of Basic Concepts and Results

ROBERT ROGERS Professor of Philosophy, University of Colorado

NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM · LONDON
AMERICAN ELSEVIER PUBLISHING COMPANY, INC. - NEW YORK

© NORTH-HOLLAND PUBLISHING COMPANY - 1971

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.

Library of Congress Catalog Card Number: 78-146195 ISBN North-Holland: 07204 2051 2 ISBN American Elsevier: 0444 10083 0

Publishers:

NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM· LONDON

Sole distributors for the U.S.A. and Canada:

AMERICAN ELSEVIER PUBLISHING COMPANY, INC. 52 VANDERBILT AVENUE NEW YORK, N.Y. 1001 ~

1st edition 1971 2nd printing 1974

Printed in the Netherlands

For Gus

PREFACE

This book is intended as a survey - primarily for people who are not professional logicians - of the basic concepts and results of mathematical logic and the study of formalized theories. It is not a textbook, complete with proofs and exercises. A considerable number of concepts are defined in an exact way, and numerous results and methods are carefully stated. Because it is basically a survey rather than a textbook in the usual sense, however, many important results are stated without proof, though for many results proofs are given. This makes the book noticeably easier to read (I hope) than an ordinary textbook on mathematical logic. My priorities in writing the book have been readability and precision. Holding to these priorities, I have attempted to give a representative and reasonably up-to-date picture of the fundamentals. Proofs which discourage all but the experts, however, are generally omitted.

The book (and the subject!) demands a certain maturity in symbolic thinking. No logic or mathematics is presupposed, however. It is hoped that philosophers, as well as mathematicians, who have a genuine interest in logic without being professional logicians will be able to read it without too great difficulty. The most difficult material has been put off until the last chapter. Chapters I and VI are easiest to read; Chapters II, IV, V and VII are somewhat more difficult; and Chapters III and VIII will probably strike the reader as the most difficult chapters. The reader may find that he is unable to follow certain points on first reading. Quite often, he should be able to proceed, however, and can return to these points later.

Expressions which are being defined are put in italics. Also, variables are italicized, except for syntactical meta-variables, which are put in boldface.


I am especially indebted to the textbooks on mathematical logic by E. Mendelson and A. Church. I have also profited from helpful suggestions from North-Holland Publishing Company's reader; from discussions with my colleague Professor Donald Monk of the University of Colorado mathematics department, who read the next-to-final draft of my manuscript; and from detailed comments on an early draft of the manuscript, coming from my former classmate Professor Richard Montague of the UCLA philosophy department. To them, my thanks and gratitude. Many thanks are also due to Mrs. Kathi George, for her help with the proofreading; to my typist Mrs. Eloise Pearson, for her patience with the rather difficult manuscript; and to my wife Marilyn for her patience with me while I was writing it.

Robert Rogers Boulder, Colorado September, 1970

CONTENTS

I The Sentential Logic
1.1. Introduction
1.2. Sentential Connectives
1.3. The Sentential Logic P. Symbols and Formulas
1.4. Tautologies
1.5. Axiom Schemata of P. Rules of Inference and Theorems
1.6. Metamathematical Properties of P

II The First-Order Predicate Logic: I
2.1. The First-Order Predicate Logic F1. Symbols, Quantifiers and Formulas
2.2. Interpretations. Truth and Validity
2.3. Axiom Schemata of F1. Rules of Inference and Theorems. Consistency of F1
2.4. The Deduction Theorem

III The First-Order Predicate Logic: II
3.1. Elementary Theories
3.2. Completeness Theorems
3.3. Further Corollaries. Decision Problem
3.4. The First-Order Predicate Logic with Identity
3.5. The First-Order Predicate Logic with Identity and Operation Symbols

IV The Second-Order Predicate Logic. Theory of Definition
4.1. Introduction
4.2. The Second-Order Predicate Logic F2
4.3. Second-Order Theories
4.4. Theory of Definition

V The Natural Numbers
5.1. Introduction
5.2. Elementary Arithmetic: The Theory N
5.3. The Metamathematics of N
5.4. Second-Order Arithmetic: The Theory N2
5.5. The Metamathematics of N2

VI The Real Numbers
6.1. The Theory R
6.2. The Metamathematics of R and of Elementary Algebra
6.3. Second-Order Real Number Theory: The Theory R2
6.4. The Metamathematics of R2

VII Axiomatic Set Theory
7.1. Paradoxes
7.2. The Zermelo-Fraenkel Axioms
7.3. The Axiom of Choice
7.4. The Metamathematics of ZF
7.5. Strengthened Forms of ZF

VIII Incompleteness. Undecidability
8.1. Introduction
8.2. Recursive Functions and Relations. Representability
8.3. Arithmetization
8.4. Gödel's First Incompleteness Theorem
8.5. Gödel's Second Incompleteness Theorem
8.6. Tarski's Theorem
8.7. Decision Problem. Church's Thesis. Recursively Enumerable Sets
8.8. Undecidability

Author index
Subject index

CHAPTER I

THE SENTENTIAL LOGIC

1.1. Introduction

Within the first four chapters of this book we shall be concerned with a formal presentation of various branches of mathematical logic. In this chapter we shall be concerned with the most elementary branch of mathematical logic; viz., the sentential logic, or the propositional calculus. This branch of logic has to do with the logical properties of the various forms of sentential composition, by means of which sentences can be joined together so as to result in compound sentences. We shall be especially concerned here with the problem of distinguishing among sentences in general those sentences which are true solely by virtue of the logical properties of the sentential connectives; viz., the so-called class of tautologies. These are the sentences which are true, as we say, 'solely by virtue of the meanings of the sentential connectives themselves'. These sentences form the most fundamental class of logical truths.

Our approach to the sentential logic - and to the various more advanced parts of logic taken up in Chapters II-IV - will be in part syntactical, and in part semantical. Within syntax, we attend only to various of the typographical, or structural, features of the expressions with which we are concerned. Here no meaning or interpretation is presupposed; symbols and expressions in general are regarded as uninterpreted. Within semantics, however, we attend not only to structural features of expressions, but also to interpretations. Thus, within semantics symbols and expressions are interpreted, and certain expressions are said to be true, and others false, once given certain interpretations of those expressions.


In our approach to each of the various areas of logic, and thus to the sentential logic in particular, we shall proceed by developing a certain formal system of logic (or a whole type of formal systems of logic). This will be done in each case in a certain order. First, we shall take up a certain part of the syntax of that system of logic. Here we characterize principally the symbols and formulas of that system of logic in an exact way. In particular, in characterizing, or distinguishing, certain of the expressions of that system as formulas, no reference is made to any interpretation of those expressions. The second step in setting forth a logical system will consist in providing the semantics of that system. Here we first specify in an exact fashion just how the expressions of that system are to be interpreted; then define a number of important semantic concepts, and establish a number of basic results concerning those concepts. Most importantly, we here define the fundamental concept of a logically valid formula within that system. In the case of the sentential logic, the logically valid formulas are the tautologies. Finally, we return to the syntactical approach, and attempt to characterize syntactically this class of valid formulas, which we have just defined semantically. We attempt to do this by laying down certain formulas as axioms - that is, as formulas accepted without proof. We then specify certain rules of inference, and define as theorems those formulas within the system which can be derived from those axioms by means of those rules of inference. The attempt is to do all this in such a way that the theorems of the particular system will coincide with the valid formulas of that system. In the case of the sentential logic this turns out to be possible. Here the class of valid formulas can be successfully characterized by syntactical means. This also remains true for that branch of logic taken up in Chapters II and III; viz., the first-order predicate logic. It turns out, however, no longer to be possible with respect to the logic of Chapter IV; viz., the second-order predicate logic. Here the syntactical approach falls short of the semantical approach; that is, here the class of logically valid formulas can be characterized only semantically.

The elements of the sentential logic were first studied by certain of the early Stoics of ancient times, and a number of minor contributions to the sentential logic come from the medieval period. Its study in a serious way, however, dates only from the second half of the nineteenth century. Most important in the whole history of this logic is Gottlob Frege (1848-1925), who has been called the greatest logician of modern times. The first formulation of the sentential logic as a formal system appeared in Frege's Begriffsschrift of 1879. Other important figures in the history of this logic include G. Boole (1815-1864), E. Schröder (1841-1902), the American philosopher and logician C.S. Peirce (1839-1914), and E. Post. 1

1.2. Sentential Connectives

Consider the sentence, 'Today is Monday, and tomorrow will be Tuesday.' It is as obvious as can be that this sentence implies the sentence, 'Today is Monday,' in the sense that it is impossible for the former of these two sentences to be true without the latter being true. We may say that this implication holds by virtue of the very nature of sentential conjunction. Sentential conjunction is one of the topics studied within the logic of sentences. It is there assigned a precise analysis, as follows: A compound sentence of the form

A and B

is called a conjunction, with A and B as its conjuncts. The conjunction of A and B is regarded as true just in that case when the sentence A and the sentence B are both true. That case is one of a total of four possible cases: A and B both true; A true, B false; A false, B true; and A and B both false. Only in the first of these four cases is the conjunction

A and B

true. All of this can be said very simply by making use of so-called truth-tables, which are schematic diagrams of a sort. The truth-table for the sentential connective 'and' is as follows, where the letters 'T' and 'F' stand for the two truth-values, truth and falsity:

A   B   A and B

T   T   T
T   F   F
F   T   F
F   F   F

1 For detailed notes on the history of the sentential logic, see A. Church 1956, section 29.

For example, consider the sentences 'Caesar was a Roman,' 'Shakespeare was an Englishman,' and 'Beethoven was an Italian.' The first two of these sentences are true, and the third is false. Therefore, the conjunction 'Caesar was a Roman and Shakespeare was an Englishman' is a true sentence, while the conjunction 'Caesar was a Roman and Beethoven was an Italian' is a false sentence.

This use of the connective 'and' accords reasonably well with the way in which the word 'and' is used in ordinary informal discourse. It differs from that usage principally in the fact that in the logic of sentences any two sentences can be joined together by that connective. In particular, it is not required that the conjuncts A and B be related to one another in what they are about; that is, in their subject matter. Thus, for example, the two sentences 'Today is Monday' and '2 + 2 = 4' can be joined together to give the compound sentence 'Today is Monday and 2 + 2 = 4.' In ordinary discourse, this sentence would, perhaps, never be used, since there is no "connection" between the subject matters of its two conjuncts. But in sentential logic no such "connection" is required, either here or in the case of any of the remaining sentential connectives. By not requiring any such "connection", the logic of sentences becomes far simpler than it would otherwise be.

A second connective - a singulary connective, rather than a binary one, as is the connective 'and' - is the connective for negation; viz., 'not.' The truth-table for this connective is as follows:

A   not A

T   F
F   T

Thus, the negation of a sentence counts as true when that sentence itself is false, and as false when that sentence itself is true.

A compound sentence of the form

A or B

is called a disjunction, with A and B as its disjuncts. The connective for disjunction is here understood in the so-called inclusive sense: a disjunction counts as true not only in those cases where one disjunct is false and the other true, but also in the case where both disjuncts are true. A disjunction counts as false, then, only when neither of its disjuncts is true. The truth-table for 'or', therefore, is as follows:

A   B   A or B

T   T   T
T   F   T
F   T   T
F   F   F

For example, the disjunction 'Caesar was a Roman or Shakespeare was an Englishman' is a true sentence, as is the disjunction 'Caesar was a Roman or Beethoven was an Italian.' If we take the sentence 'Beethoven was an Italian' for both the left and the right disjuncts, however, we obtain a false sentence; viz., 'Beethoven was an Italian or Beethoven was an Italian.'

We turn now to the connective for the conditional; viz., 'if ... then.' The logician's definition of this connective is admittedly a bit peculiar, at variance with the ordinary usage, or usages, of the expression 'if ... then.' Here, as with the case for 'and' and 'or', it is not required that the sentences joined by this connective have anything in common in their subject matters. Any two sentences can be joined by means of this connective, and the result will itself always count as a sentence. In a sentence of the form

if A then B,

the sentence A is called the antecedent, and the sentence B is called the consequent. A sentence of this form counts as false only when its antecedent is true and its consequent is false. Thus, e.g., we have the following sentences as true: 'If three is less than four, then five is less than six'; 'If five is less than four, then five is less than six'; 'If five is less than four, then Beethoven was an Italian.' The sentence 'If five is less than six, then Beethoven was an Italian,' however, is a false sentence.

The truth-table for 'if ... then,' therefore, is as follows:

A   B   if A then B

T   T   T
T   F   F
F   T   T
F   F   T

One might protest that this truth-table hardly conveys what we ordinarily mean by the expression 'logically implies.' And of course it does not, nor are we here trying to pretend that it does. The analysis of formal (or logical) implication is, to be sure, a fundamental task of deductive logic. That analysis, however, which we shall consider later, can be given only after logic has been developed up to a certain point, and is certainly not given by the truth-table for 'if ... then.' For one reason, when we say that one sentence implies another, we thereby mention these two sentences, rather than use them. What we would use in saying this would not be these sentences themselves, but expressions for referring to them; for example, names for them (such as the result of putting quotation marks around these sentences). The sentential connectives, however, such as 'if ... then,' stand between sentences themselves, rather than between names of sentences. For this reason alone, within the sentential logic we could not replace the expression 'if ... then' by the term 'implies.'

Still, rather than simply replacing the expression 'if ... then' by the term 'implies,' one might propose saying that if a conditional

if A then B

is true, then the sentence A implies the sentence B. For example, since the sentence 'If snow is white, then grass is green' is true, then the sentence 'Snow is white' implies the sentence 'Grass is green.' But to do this would be to use the term 'implies' in a far weaker sense than it is ordinarily used. We shall not use the term 'implies' in this book in this way at all, but only in a much stronger sense, which will be much closer to the ordinary sense of the term 'implies,' or 'logically implies.' Some logicians do use the expression 'materially implies' in this weak sense, however, thus saying that 'Snow is white' materially implies 'Grass is green,' though it does not logically imply 'Grass is green.' We shall not follow this practice here, since it tends to invite confusion to say, for example, that 'Snow is white' implies 'Grass is green' in any sense of the term 'implies.' Rather than follow this practice, we shall say simply that the conditional 'If snow is white, then grass is green' is true.

It has to be admitted that the logician's use of the expression 'if ... then' does represent a departure to some extent from the ordinary use, or uses, of this expression. Ordinarily, this expression is not used truth-functionally, as it is in logic, as examples readily reveal. The logician is not attempting, however, to stay as close as possible to ordinary usage, but is prepared to depart from the ordinary usage of a term - to a certain extent, at any rate - if he has to do this in order to devise some concept which suits his purposes better than any already existing concept or usage. And what suits the logician's purposes best is that practice in which (a) where the antecedent of a conditional is true, we identify the truth-value of the conditional with the truth-value of its consequent and (b) where the antecedent is false, we regard the conditional as true.

Finally, we have the connective for the biconditional; viz., 'if and only if.' The biconditional

A if and only if B


is considered true when the sentences A and B have the same truth-value; otherwise, false. Its truth-table, then, is as follows:

A   B   A if and only if B

T   T   T
T   F   F
F   T   F
F   F   T

Remarks precisely similar to those we made in connection with 'if ... then' apply to 'if and only if.' To say that the biconditional between two sentences is true is not at all to assert that these two sentences are logically equivalent. The concept of logical equivalence is not here under consideration, but will be defined only at a later point.
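The five truth-tables of this section can be sketched in Python roughly as follows. The function names (neg, conj, disj, cond, bicond) are illustrative choices made for this sketch and are not notation used in the book; Python's True and False stand in for the truth-values T and F.

```python
from itertools import product

# Truth-functions for the five connectives of this section.
# The names neg, conj, disj, cond and bicond are illustrative only.
def neg(a):
    return not a

def conj(a, b):
    return a and b

def disj(a, b):          # inclusive 'or'
    return a or b

def cond(a, b):          # false only when a is true and b is false
    return (not a) or b

def bicond(a, b):        # true when a and b have the same truth-value
    return a == b

def column(connective):
    """Values of a binary connective for the rows TT, TF, FT, FF, in that order."""
    return [connective(a, b) for a, b in product([True, False], repeat=2)]

def show(values):
    return " ".join("T" if v else "F" for v in values)

for name, connective in [("A and B", conj), ("A or B", disj),
                         ("if A then B", cond), ("A if and only if B", bicond)]:
    print(f"{name:20}{show(column(connective))}")

# Two of the conditionals discussed in the text: 'five is less than four' is
# false, 'five is less than six' is true, 'Beethoven was an Italian' is false.
beethoven_was_italian = False
print(cond(5 < 4, beethoven_was_italian))   # false antecedent
print(cond(5 < 6, beethoven_was_italian))   # true antecedent, false consequent
```

Each printed column can be compared row by row with the corresponding truth-table given above.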

It is important to see that these sentential connectives are not all independent of one another. Once given certain of them, the remaining ones could be introduced in terms of these given ones. Thus, consider the connectives 'not' and 'if ... then.' If we wished, we could regard all sentences of the form

A or B

as merely abbreviations for sentences of the form

if not A then B;

for, as the truth-tables show for given sentences A and B, a sentence of the first form is true if and only if the corresponding sentence of the second form is true. Similarly, we could regard all sentences of the form

A and B

as merely abbreviations for sentences of the form

not (if A then not B);

and all sentences of the form

A if and only if B


as abbreviations for sentences of the form

if A then B, and if B then A.

Thus, we could in principle either dispense with the connectives 'and,' 'or' and 'if and only if' altogether, or regard sentences containing them only as definitional abbreviations of other sentences. Similarly, the same thing can be done if we start with 'not' and 'and,' or with 'not' and 'or'; and the reader should show that this is the case before continuing. Indeed, consider the following two truth-tables:

A   B   neither A nor B   not both A and B

T   T   F                 F
T   F   F                 T
F   T   F                 T
F   F   T                 T

It is known that all of our sentential connectives can be introduced in terms of either one of these two connectives (H.M. Sheffer, 1913). The reader might consider how this could be done, starting with the definitions of

not A

as

neither A nor A,

and

not both A and A.
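A reader who wishes to check such definitions mechanically might proceed along the following lines. This is only a sketch: the Python names are invented for the purpose, and the particular definitions of 'or' and 'and' in terms of 'neither ... nor' and 'not both' are one possible choice, not necessarily the one the reader will arrive at. Two definitions are accepted as interchangeable when they agree on all four rows of the truth-table.

```python
from itertools import product

def neg(a):
    return not a

def cond(a, b):
    return (not a) or b

def nor(a, b):           # neither A nor B
    return not (a or b)

def nand(a, b):          # not both A and B
    return not (a and b)

ROWS = list(product([True, False], repeat=2))

def same_table(f, g):
    """True when the binary truth-functions f and g agree on every row."""
    return all(f(a, b) == g(a, b) for a, b in ROWS)

# The abbreviations given in the text, in terms of 'not' and 'if ... then'.
def or_from_cond(a, b):
    return cond(neg(a), b)                    # if not A then B

def and_from_cond(a, b):
    return neg(cond(a, neg(b)))               # not (if A then not B)

def iff_from_cond(a, b):
    return and_from_cond(cond(a, b), cond(b, a))

# One possible set of definitions in terms of 'neither ... nor' alone.
def not_nor(a):
    return nor(a, a)

def or_nor(a, b):
    return nor(nor(a, b), nor(a, b))

def and_nor(a, b):
    return nor(nor(a, a), nor(b, b))

# One possible set of definitions in terms of 'not both' alone.
def not_nand(a):
    return nand(a, a)

def and_nand(a, b):
    return nand(nand(a, b), nand(a, b))

def or_nand(a, b):
    return nand(nand(a, a), nand(b, b))

print(same_table(or_from_cond, lambda a, b: a or b))
print(same_table(and_nor, lambda a, b: a and b))
print(same_table(or_nand, lambda a, b: a or b))
```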

Each of the connectives we have considered is a truth-functional connective, giving rise to truth-functional contexts. That is, every application of these connectives to given sentences gives rise to a context, or compound sentence, whose truth-value is dependent solely upon the truth-values of those given sentences. In particular, the truth-value of the compound sentence is not otherwise dependent upon its meaning, or upon the meaning of its constituent sentences. We can, then, replace any sentence that occurs within any of these contexts by any other sentence having the same truth-value without changing the truth-value of that context. Stated exactly: For any formulas A, B, C and D, if D results from C by replacing one or more occurrences of A in C by occurrences of B, then if A and B have the same truth-value, then C and D have the same truth-value. This is the Replacement Principle for the sentential logic. For example, if within a compound sentence

A and B

we replace B by any other sentence having the same truth-value as B, the truth-value of the resultant sentence will be the same as the truth-value of the original sentence. Ordinary discourse contains a number of types of sentential context which are not truth-functional. For example, the expressions 'believes that ...' and 'said that ...' both give rise to sentences whose truth-values are not truth-functionally dependent upon the truth-values of the sentences occurring within them, after the word 'that.' Thus, e.g., though the sentence 'Aristotle believed that the world is round' is true, when we replace its true sub-sentence 'the world is round' by the true sentence 'the earth is not at the center of the universe' the result is a sentence which is false; i.e., of a different truth-value from the truth-value of the sentence we started with. The belief context, then, is not a truth-functional context.
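The contrast between the two kinds of context can be illustrated by a toy model in Python. Everything in it is an assumption made for the sake of illustration: the dictionary TRUTH simply records the truth-values the text ascribes to a few sentences, and the set ARISTOTLE_BELIEVED is a crude stand-in for a belief context, chosen to match the example above.

```python
# Truth-values of some sample sentences, as ascribed in the text.
TRUTH = {
    "Caesar was a Roman": True,
    "the world is round": True,
    "the earth is not at the center of the universe": True,
}

# A toy model of a belief context: the set of sentences Aristotle believed.
# This set is an assumption made only to mirror the example in the text.
ARISTOTLE_BELIEVED = {"the world is round"}

def and_context(sentence):
    """Truth-value of 'Caesar was a Roman and <sentence>'.

    Only the truth-value of the inserted sentence is consulted."""
    return TRUTH["Caesar was a Roman"] and TRUTH[sentence]

def belief_context(sentence):
    """Truth-value of 'Aristotle believed that <sentence>' in the toy model.

    Here the sentence itself matters, not merely its truth-value."""
    return sentence in ARISTOTLE_BELIEVED

a = "the world is round"
b = "the earth is not at the center of the universe"

print(TRUTH[a] == TRUTH[b])                    # the two sentences agree in truth-value
print(and_context(a) == and_context(b))        # replacement preserves the conjunction's value
print(belief_context(a) == belief_context(b))  # replacement changes the belief sentence's value
```

In the truth-functional context the Replacement Principle holds by construction, since the function never looks beyond truth-values; the belief context fails it because it inspects the sentence itself.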

Philosophers sometimes speak of the truth-value of a sentence as its extension. We have just seen that all contexts within the sentential logic are truth-functional contexts. For this reason, these contexts are often called extensional contexts and the Replacement Principle is referred to as a principle of extensionality for the sentential logic. Further, the sentential logic itself is said to be an extensional logic, in the sense that all of its contexts are extensional. A logic which contained belief contexts, on the other hand, would be a non-extensional logic in this sense. And modal logic, in which the concepts of necessity and possibility are studied, is another example of a non-extensional logic. For though 'Two plus two equals four' and 'Snow is white,' for example, are both true, when we replace the former by the latter in the true sentence 'Necessarily, two plus two equals four,' the result is a sentence which is false: namely, the sentence 'Necessarily, snow is white.' Though non-extensional logics are often philosophically important, for the purposes of orthodox mathematics it is not necessary to take up the study of such logics, and all of the logical systems which we shall consider in this book are extensional in some appropriate sense.

As one final introductory observation, we remark that in addition to the standard two-valued approach to the sentential logic, in which the only recognized sentential values are truth and falsity, logicians have also studied many-valued approaches to the sentential logic, in which three or more sentential values are recognized. And in addition there is the intuitionistic approach to the sentential logic, which departs from the orthodox sentential logic in not accepting without restriction the law of excluded middle, according to which every sentence is either true or false. But we shall not here be further concerned with these alternatives to the orthodox sentential logic.

1.3. The Sentential Logic P. Symbols and Formulas

We turn now to the consideration of a particular system of sentential logic, which we shall call P. We shall first specify the symbols and formulas of P, and then define the notion of a tautology of P.

The symbols of P are the following symbols:

(1) the sentential connectives for negation, conditional, disjunction, conjunction and biconditional; viz.,

~ ⊃ ∨ ∧ ≡

These symbols will be used in place of the familiar English words 'not,' 'if ... then,' 'or,' 'and' and 'if and only if,' which were used as sentential connectives in the preceding section.

(2) left and right parentheses; viz.,

( )

(3) an infinite list of sentential letters; viz.,

p q r s p1 q1 r1 s1 ...

The symbols under (1) and (2) are the so-called logical constants of P. And an expression of P is any (finitely long) string of symbols of P.

A formula of P is any expression within P which is either (a) a sentential letter of P, or (b) the negation of a formula of P, or (c) the conditional between two formulas of P, or (d) the disjunction of two formulas of P, or (e) the conjunction of two formulas of P, or (f) the biconditional between two formulas of P. That is:

(a) each sentential letter of P is a formula of P;

and if A and B are formulas of P, then:

(b) ~A is a formula of P;
(c) (A ⊃ B) is a formula of P;
(d) (A ∨ B) is a formula of P;
(e) (A ∧ B) is a formula of P;
(f) (A ≡ B) is a formula of P;
(g) all formulas of P are provided for by conditions (a)-(f).

In particular, an atomic formula of P is any sentential letter of P; i.e., any formula provided by condition (a).

For example, the following expression is a formula of P:

~((p ⊃ q) ∨ (q ∧ r)).

For, by (a), the letters 'p', 'q' and 'r' are formulas of P; thus, by (c), '(p ⊃ q)' is a formula of P, and, by (e), '(q ∧ r)' is a formula of P; thus, by (d), '((p ⊃ q) ∨ (q ∧ r))' is a formula of P; thus, by (b), the above expression itself is a formula.
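The inductive definition of a formula lends itself to a recursive data type. The following Python sketch is one way of mirroring clauses (a)-(f); the class names and the ASCII spellings '->', 'v', '&' and '<->' (standing in for the conditional, disjunction, conjunction and biconditional signs of P) are assumptions of the sketch, not part of P itself.

```python
from dataclasses import dataclass
from typing import Union

# Clause (a): a sentential letter is a formula.
@dataclass(frozen=True)
class Letter:
    name: str

# Clause (b): the negation of a formula is a formula.
@dataclass(frozen=True)
class Not:
    body: "Formula"

# Clauses (c)-(f): a binary connective placed between two formulas.
@dataclass(frozen=True)
class Binary:
    op: str            # one of '->', 'v', '&', '<->' (ASCII stand-ins)
    left: "Formula"
    right: "Formula"

Formula = Union[Letter, Not, Binary]

def official(f):
    """Write a formula in official notation, with every pair of parentheses."""
    if isinstance(f, Letter):
        return f.name
    if isinstance(f, Not):
        return "~" + official(f.body)
    return f"({official(f.left)} {f.op} {official(f.right)})"

# The example of the text, built up in the same order as the argument above.
p, q, r = Letter("p"), Letter("q"), Letter("r")
p_then_q = Binary("->", p, q)             # by (c)
q_and_r = Binary("&", q, r)               # by (e)
disjunction = Binary("v", p_then_q, q_and_r)   # by (d)
example = Not(disjunction)                # by (b)

print(official(example))
```

Because values of these classes can only be built by the constructors, clause (g) is respected automatically: nothing counts as a formula unless it arises from clauses (a)-(f).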

In presenting examples of formulas of P, we shall subsequently usually omit the outermost pair of parentheses, since this will lead to no ambiguity. In "official" notation, however, no such omissions are allowed; they are permitted only in informal contexts. And, again only within informal notation, we shall also usually draw upon one further generally accepted convention for cutting down on the number of parentheses appearing in formulas. This is the following convention:

(a) '~' ties more closely than either '∨' or '∧';
(b) both '∨' and '∧' tie more closely than '⊃'; and
(c) '⊃' ties more closely than '≡'.


Thus, for example, using these two conventions, the formulas (in official notation)

(a) ((p ∧ q) ⊃ r),
(b) ((p ⊃ q) ≡ r),
(c) (((p ⊃ q) ∨ ~q) ≡ (~p ∧ q)),

can be written as

(a') p ∧ q ⊃ r,
(b') p ⊃ q ≡ r,
(c') (p ⊃ q) ∨ ~q ≡ ~p ∧ q.

Notice that, in contrast with (a), however, the formula

(d) (p ∧ (q ⊃ r))

is correctly abbreviated only as

(d') p ∧ (q ⊃ r),

and not as in (a').

It is very important to keep distinct the different roles played by, first, the sentential letters of P ('p', 'q', 'r', etc.) and, second, the variables 'A', 'B', 'C', 'A1', etc. Sentential letters occur within the system P. The bold roman capitals 'A', 'B', 'C', etc., on the other hand, do not themselves occur within P, but (throughout our discussion of P) only in the language in which we talk about P. They are, that is, meta-variables, and they play a very different role from that of the sentential letters. Within the language in which we discuss P, these variables range over the expressions of P. The difference between these meta-variables and the letters of P can be easily seen by considering an example. The expression '(p ⊃ q)' is a formula within P. The expression '(A ⊃ B)', on the other hand, does not itself occur within P, but only within the meta-language of P, in which we discuss P, and which is here the English language supplemented with a number of technical symbols. Meta-variables are used in the meta-language of P in order to discuss the expressions of P in a general way. We have just used them, for example, in specifying the formulas of P, and we shall presently use them in discussing the tautologies of P. In addition to containing the syntactical meta-variables, the meta-language of P contains the symbols '~', '⊃', '∨', '∧' and '≡'. These symbols, of course, appear within P itself as sentential connectives. Within the meta-language of P, however, these symbols are used not in conjunction with sentential letters, but in conjunction with the syntactical meta-variables. Thus, within the meta-language of P these symbols have an entirely different function from the one they have within P itself. Within P, they are used for the purpose of constructing compound formulas; within the meta-language of P, they are used for the purpose of constructing meta-linguistic expressions by means of which we refer to formulas, and the expressions in general, of P.

As one further observation on the meta-language of P, consider the meta-linguistic expression '(A ⊃ B)'. This expression is an example of a schema. This schema and various other schemata were used in defining the class of formulas of P. The notion of a schema can be defined in a general way, paralleling the way in which we defined the notion of a formula. Thus, each of the syntactic meta-variables 'A', 'B', 'C', etc., is a schema; putting the symbol '~' in front of a schema results in a schema; putting '⊃' between any two schemata and enclosing the result in parentheses results in a schema; and similarly for the remaining symbols '∨', '∧' and '≡'. In writing out schemata subsequently, we shall often draw upon the conventions for omitting parentheses which apply to formulas.
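The conventions for omitting parentheses can be made concrete by a small printing routine. In the following Python sketch a formula (or schema) is represented as a nested tuple, and the connectives are spelled in ASCII as '~', 'v', '&', '->' and '<->'; both choices are assumptions of the sketch. The text's convention does not say how to group two connectives of equal rank, so the sketch cautiously keeps the parentheses in that case.

```python
# A formula is a letter such as 'p', a negation ('~', A), or a triple (op, A, B).
# Lower rank means the connective ties more closely.
RANK = {"v": 1, "&": 1, "->": 2, "<->": 3}

def abbreviated(f, parent_rank=None):
    """Write f with the outermost parentheses, and those made redundant
    by the tying conventions, omitted."""
    if isinstance(f, str):
        return f
    if f[0] == "~":
        # '~' ties most closely of all, so a compound body keeps its parentheses.
        return "~" + abbreviated(f[1], 0)
    op, left, right = f
    text = f"{abbreviated(left, RANK[op])} {op} {abbreviated(right, RANK[op])}"
    # Parentheses may be dropped only when this connective ties strictly
    # more closely than the connective immediately governing it.
    if parent_rank is not None and RANK[op] >= parent_rank:
        return "(" + text + ")"
    return text

formula_a = ("->", ("&", "p", "q"), "r")
formula_b = ("<->", ("->", "p", "q"), "r")
formula_c = ("<->", ("v", ("->", "p", "q"), ("~", "q")), ("&", ("~", "p"), "q"))
formula_d = ("&", "p", ("->", "q", "r"))

for f in (formula_a, formula_b, formula_c, formula_d):
    print(abbreviated(f))
```

The four formulas are the examples (a)-(d) of the preceding section, and the routine reproduces their abbreviations (a')-(d') in the ASCII spelling; in particular, (d) keeps its inner parentheses.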

1.4. Tautologies

Now that we have defined the class of formulas of P in an exact way, we are able to turn to the semantics of P. By drawing upon the truth-tables from the preceding section, we are able to state in a general way the conditions under which a formula of P is true. Let A be any formula of P. Consider any possible assignment of truth-values to the sentential letters that appear within A. Then the truth-value of A itself for that assignment is uniquely determined by the truth-tables for whatever sentential connectives appear within A. Let us illustrate this. Let A be the formula 'p ⊃ p ∨ q'. We construct a truth-table for this formula, in terms of the elementary truth-tables for '⊃' and '∨', as follows:

p   q   p   ⊃   p   ∨   q

T   T   T   T   T   T   T
T   F   T   T   T   T   F
F   T   F   T   F   T   T
F   F   F   T   F   F   F

In constructing this truth-table, since the formula 'p ⊃ p ∨ q' is itself a conditional, we first determine the value of its antecedent and consequent for the case where 'p' and 'q' are both true. Then in terms of these values, we determine the value of the conditional itself for this case. Having done this, we then repeat the procedure for each of the remaining three cases. It should be clear from this particular case how the truth-table for any formula A whatsoever is to be constructed. In general, if A contains n distinct sentential letters, there will be 2^n different possible assignments of truth-values to these letters. Thus, the complete truth-table for A will contain 2^n rows.
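The procedure just described can be sketched as a short Python program. The nested-tuple representation of formulas and the ASCII spellings '~', '->', 'v', '&' and '<->' for the connectives are assumptions of the sketch, as are the function names.

```python
from itertools import product

# A formula is a letter such as 'p', a negation ('~', A), or a triple (op, A, B).

def letters(f):
    """The distinct sentential letters of f, in order of first appearance."""
    if isinstance(f, str):
        return [f]
    found = []
    for part in f[1:]:
        for letter in letters(part):
            if letter not in found:
                found.append(letter)
    return found

def value(f, assignment):
    """Truth-value of f under an assignment of truth-values to its letters."""
    if isinstance(f, str):
        return assignment[f]
    if f[0] == "~":
        return not value(f[1], assignment)
    a, b = value(f[1], assignment), value(f[2], assignment)
    if f[0] == "->":
        return (not a) or b
    if f[0] == "v":
        return a or b
    if f[0] == "&":
        return a and b
    return a == b                      # '<->'

def truth_table(f):
    """Return the letters of f and one (assignment values, value of f) pair per row."""
    names = letters(f)
    rows = []
    for values in product([True, False], repeat=len(names)):
        rows.append((values, value(f, dict(zip(names, values)))))
    return names, rows

example = ("->", "p", ("v", "p", "q"))     # p -> p v q
names, rows = truth_table(example)
print(" ".join(names), "| value")
for values, result in rows:
    cells = " ".join("T" if v else "F" for v in values)
    print(cells, "|", "T" if result else "F")
```

A formula with n distinct letters yields 2^n rows, since `product` enumerates every assignment of the two truth-values to the n letters.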

We are, to be sure, here assuming without argument that each formula of P can be taken in only one way, and thus gives rise to only one truth-table. That is, for example, if A is a conditional B ⊃ C, then it is not also a conditional B₁ ⊃ C₁, where B₁ and C₁ are distinct from B and C respectively; nor is it a negation, disjunction, etc. This assumption can be proved; but we here omit the proof.

As the truth-table for 'p ⊃ p ∨ q' shows, this formula is true for all possible assignments of truth-values to the letters appearing within it. Now any formula of P which is true under all possible assignments of truth-values to the sentential letters appearing within it is a tautology, and is said to be tautologically valid. The formula 'p ⊃ p ∨ q', then, is a tautology; and the schema A ⊃ A ∨ B is a tautological schema, in the sense that for all formulas A and B, the formula A ⊃ A ∨ B is a tautology. And all such formulas can be shown to be tautologies by appeal to their truth-tables.

The tautologies of P are those formulas of P which are true regardless of the truth or falsity of the sentential letters that appear within them. They are, in this sense, true in 'all possible cases.' Intuitively, a tautology is any formula which is true solely by virtue of the meanings of the sentential connectives that appear within it. The definition we have given above of a tautology can be regarded as an exact analysis, with respect to P, of this intuitive concept. And this will be characteristic of many of the concepts which are exactly defined in this book: in a great many cases, these concepts will represent exact analyses, with respect to some formal system or theory, of some concept which exists at the informal and intuitive level. To that extent, at any rate, our concern here will be with a kind of concept clarification, or explication.²

As further examples of tautological schemata, we have the following schemata:

Law of Excluded Middle              A ∨ ~A
Law of Non-Contradiction            ~(A ∧ ~A)
Law of Double Negation              A ≡ ~~A
Commutativity of Disjunction        A ∨ B ≡ B ∨ A
Commutativity of Conjunction        A ∧ B ≡ B ∧ A
Associativity of Disjunction        A ∨ (B ∨ C) ≡ (A ∨ B) ∨ C
Associativity of Conjunction        A ∧ (B ∧ C) ≡ (A ∧ B) ∧ C
De Morgan's Laws                    ~(A ∧ B) ≡ ~A ∨ ~B
                                    ~(A ∨ B) ≡ ~A ∧ ~B
Law of Contraposition               A ⊃ B ≡ ~B ⊃ ~A
Distribution Laws                   A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)
                                    A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C)
Falsity of Conditional              ~(A ⊃ B) ≡ A ∧ ~B
Law of Detachment (Modus Ponens)    A ∧ (A ⊃ B) ⊃ B
Modus Tollens                       ~B ∧ (A ⊃ B) ⊃ ~A
Hypothetical Syllogism              (A ⊃ B) ∧ (B ⊃ C) ⊃ (A ⊃ C)
Disjunctive Syllogism               (A ∨ B) ∧ ~A ⊃ B
Law of Absurdity                    (A ⊃ B ∧ ~B) ⊃ ~A
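That each of the eighteen schemata listed is tautological can be checked mechanically. The following Python sketch is an illustration under a stated assumption: since the truth-value of any instance depends only on the truth-values of the formulas put for A, B and C, it is enough to run through the eight combinations of truth-values for A, B and C. The names `imp`, `iff`, `SCHEMATA` and `is_tautological` are hypothetical.

```python
from itertools import product

def imp(a, b):
    # Conditional: false only when a is true and b is false.
    return (not a) or b

def iff(a, b):
    # Biconditional: true when both sides have the same truth-value.
    return a == b

SCHEMATA = [
    ("Law of Excluded Middle", lambda A, B, C: A or (not A)),
    ("Law of Non-Contradiction", lambda A, B, C: not (A and (not A))),
    ("Law of Double Negation", lambda A, B, C: iff(A, not (not A))),
    ("Commutativity of Disjunction", lambda A, B, C: iff(A or B, B or A)),
    ("Commutativity of Conjunction", lambda A, B, C: iff(A and B, B and A)),
    ("Associativity of Disjunction", lambda A, B, C: iff(A or (B or C), (A or B) or C)),
    ("Associativity of Conjunction", lambda A, B, C: iff(A and (B and C), (A and B) and C)),
    ("De Morgan (conjunction)", lambda A, B, C: iff(not (A and B), (not A) or (not B))),
    ("De Morgan (disjunction)", lambda A, B, C: iff(not (A or B), (not A) and (not B))),
    ("Law of Contraposition", lambda A, B, C: iff(imp(A, B), imp(not B, not A))),
    ("Distribution (and over or)", lambda A, B, C: iff(A and (B or C), (A and B) or (A and C))),
    ("Distribution (or over and)", lambda A, B, C: iff(A or (B and C), (A or B) and (A or C))),
    ("Falsity of Conditional", lambda A, B, C: iff(not imp(A, B), A and (not B))),
    ("Law of Detachment", lambda A, B, C: imp(A and imp(A, B), B)),
    ("Modus Tollens", lambda A, B, C: imp((not B) and imp(A, B), not A)),
    ("Hypothetical Syllogism", lambda A, B, C: imp(imp(A, B) and imp(B, C), imp(A, C))),
    ("Disjunctive Syllogism", lambda A, B, C: imp((A or B) and (not A), B)),
    ("Law of Absurdity", lambda A, B, C: imp(imp(A, B and (not B)), not A)),
]

def is_tautological(schema):
    # True when the schema comes out true for every combination of truth-values.
    return all(schema(A, B, C) for A, B, C in product([True, False], repeat=3))

results = {name: is_tautological(schema) for name, schema in SCHEMATA}
for name, ok in results.items():
    print(f"{name}: {'tautological' if ok else 'NOT tautological'}")
```

For contrast, a schema such as (A ⊃ B) ⊃ (B ⊃ A) fails this check, since it is false when A is false and B is true.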

² For a discussion of this concept of clarification, or explication, see R. Carnap 1956, pp. 7 ff. See also W.V. Quine 1960, section 53; and W.V. Quine 1953, section 5.


Three further tautological schemata are worth special note; viz.:

A ⊃ (B ⊃ A)

~A ⊃ (A ⊃ B)

(A ⊃ B) ∨ (B ⊃ A)

The first two of these three schemata are sometimes referred to as 'paradoxes of material implication.' And, indeed, if we were to read the symbol '⊃' as 'implies' (or 'materially implies'), then all three of these schemata would seem to be paradoxical; not in the sense that they were contradictory, but in the sense that they were highly counter-intuitive. For then the first of these schemata would apparently say that any true sentence is implied by any sentence whatsoever; the second, that a false sentence implies any sentence whatsoever; and the third, that for any two sentences, at least one implies the other. The paradoxical appearance largely disappears, however, if we read the symbol '⊃', not as 'implies', but only as 'if ... then'. The first of these schemata then simply expresses the fact that a conditional is true if its consequent is true; the second, that a conditional is true if its antecedent is false; and the third, that for any two sentences at least one of the conditionals between them is true.

There are many other types of tautology in addition to those mentioned above, of course; indeed, infinitely many others. A reasonably complete list of the different types of tautology most often used in reasoning, however, would contain at most a few dozen.

It is clear that the construction of truth-tables provides us with a perfectly general test for determining whether a formula of P is a tautology. For any formula A, if A receives the value truth in each of the rows in its truth-table, then A is a tautology; if A receives the value falsity in at least one of these rows, then A is not a tautology. We shall speak of the truth-table test as a mechanical test, or an effective test. The concept of an effective test or procedure is an intuitive concept which can be given an exact analysis within mathematical contexts, and we shall consider such an analysis (in terms of recursive functions) in Chapter VIII. Until we reach that chapter, we shall use only the intuitive concept of the mechanical, or the effective. By way of explanation of that concept, perhaps a number of illustrations and informal remarks will suffice. Familiar mathematics provides us with a large number of effective procedures. For example, the procedures for determining the sum and product of any two numbers, the procedure for extracting square roots, and the procedure for solving quadratic equations, are effective procedures. Such effective procedures are often called algorithms. These procedures are effective or algorithmic in the sense that they provide us with instructions for ascertaining something or other in a systematic, step-by-step manner. Any concept which is defined in such a way that there is an effective procedure for determining whether that concept applies in any particular case is called an effectively defined concept. Thus, the concepts of a formula of P, or of a tautology of P, are effectively defined concepts. Where there exists no general procedure for determining whether a concept applies, on the other hand, ingenuity is required. We shall from this point forward repeatedly draw upon these informal concepts of an effective procedure and an effectively defined concept.
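The truth-table test can be written out as a step-by-step procedure. The Python sketch below is an illustration, not the author's formulation; it assumes a hypothetical representation in which a sentential letter is a string and a compound formula is a tuple such as `("imp", A, B)`, with the tags `"not"`, `"imp"`, `"or"`, `"and"` and `"iff"` chosen here for convenience.

```python
from itertools import product

def letters(formula):
    """Collect the sentential letters that appear within a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(letters(part) for part in formula[1:]))

def value(formula, assignment):
    """Truth-value of a formula under an assignment, by the elementary truth-tables."""
    if isinstance(formula, str):
        return assignment[formula]
    op, *parts = formula
    vals = [value(part, assignment) for part in parts]
    if op == "not":
        return not vals[0]
    if op == "imp":
        return (not vals[0]) or vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "and":
        return vals[0] and vals[1]
    if op == "iff":
        return vals[0] == vals[1]
    raise ValueError(f"unknown connective: {op}")

def tautology_test(formula):
    """Return (True, None) for a tautology, else (False, a falsifying assignment)."""
    names = sorted(letters(formula))
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if not value(formula, assignment):
            return False, assignment
    return True, None

print(tautology_test(("imp", "p", ("or", "p", "q"))))
print(tautology_test(("imp", ("or", "p", "q"), "p")))
```

The procedure always terminates, because a formula contains only finitely many letters and therefore has only finitely many rows to inspect. That is what makes the test effective in the informal sense described above.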

In addition to the concept of tautology, a number of further semantical concepts can readily be defined with respect to P.

A formula A is tautologically inconsistent if and only if A is false on every assignment of truth-values to the sentential letters appearing within it. As a special case of tautologically inconsistent formulas we have those formulas that are contradictions; that is, formulas of the form A ∧ ~A. A formula A tautologically implies a formula B if and only if, for every assignment of truth-values to the sentential letters in A and B, if A is true on that assignment then B is true on that assignment also. In this case we say that B is a tautological consequence of A. More generally, let Γ be any class of formulas of P, and A any formula of P. Then Γ tautologically implies A (and A is a tautological consequence of Γ) if and only if, for every assignment of truth-values to the sentential letters appearing either in A or in any of the formulas of Γ, if all of the formulas of Γ are true on that assignment, then A is true on that assignment also. And two formulas A and B are tautologically equivalent if and only if, for every assignment of truth-values to the sentential letters in A and B, either A and B are both true on that assignment or both false on that assignment.
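These definitions translate almost word for word into a short program. The following Python sketch is illustrative only: formulas are modelled as Python functions from an assignment to a truth-value, Γ is restricted to a finite list, and the function names are hypothetical.

```python
from itertools import product

def assignments(letters):
    """Yield every assignment of truth-values to the given sentential letters."""
    for values in product([True, False], repeat=len(letters)):
        yield dict(zip(letters, values))

def tautologically_inconsistent(letters, a):
    # False on every assignment.
    return not any(a(v) for v in assignments(letters))

def tautologically_implies(letters, gamma, a):
    # Whenever all formulas of gamma are true on an assignment, a is true on it too.
    return all(a(v) for v in assignments(letters) if all(g(v) for g in gamma))

def tautologically_equivalent(letters, a, b):
    # Both true or both false on every assignment.
    return all(a(v) == b(v) for v in assignments(letters))

p = lambda v: v["p"]
q = lambda v: v["q"]
p_imp_q = lambda v: (not v["p"]) or v["q"]
notq_imp_notp = lambda v: v["q"] or (not v["p"])   # ~q ⊃ ~p
contradiction = lambda v: v["p"] and not v["p"]    # p ∧ ~p

print(tautologically_inconsistent(["p"], contradiction))
print(tautologically_implies(["p", "q"], [p, p_imp_q], q))
print(tautologically_equivalent(["p", "q"], p_imp_q, notq_imp_notp))
```

When the list given for Γ is empty, the inner condition holds on every assignment, so the test reduces to asking whether the formula is true on every assignment.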

The reader should have no difficulty in seeing that the following results readily follow from these definitions (where A and B are any formulas of P):

(1) A tautologically implies B if and only if the conditional A ⊃ B is a tautology.

(2) A and B are tautologically equivalent if and only if the biconditional A ≡ B is a tautology.

(3) If Γ is a class of tautologies, and A is a tautological consequence of Γ, then A is a tautology.

(4) If A tautologically implies B, and B tautologically implies C, then A tautologically implies C. Thus, tautological implication is transitive.

(5) Any two tautologies are tautologically equivalent.

(6) A tautology is tautologically implied by any formula (or class of formulas) whatsoever; and a tautologically inconsistent formula tautologically implies any formula whatsoever.

We now show

(7) A formula A is a tautology if and only if it is tautologically implied by the empty class of formulas (that is, the class which has no formulas in it at all).

We have one half of this by (6), for if A is a tautology, then it is tautologically implied by the empty class. To establish the other half, suppose now that A is tautologically implied by the empty class. Then every assignment of truth-values to the sentential letters appearing in A and in the formulas of the empty class which makes all of the formulas in the empty class true makes A true. But since there are no formulas in the empty class, every such assignment vacuously makes those formulas all true. Thus, A is true on every such assignment of truth-values, and is therefore a tautology. Thus, (7) is proved.

Tautological implication is a special form of logical implication; viz., logical implication by virtue of the meanings of the sentential connectives. We shall define a general concept of logical implication in Chapter II. Tautological implication will there be formally subsumed under logical implication. And similarly for tautological equivalence and logical equivalence.


In our discussion of the symbol for the conditional, we noted that the logician's usage of 'if ... then' is at variance with the ordinary usage, or usages, of 'if ... then.' In part this variance consists in the fact that the logician's conditional is true if either its antecedent is false or its consequent is true. These features of the conditional, recall, have often been referred to as 'the paradoxes of material implication.' Now, similarly, it has often been maintained that the logician's concept of logical implication is at variance with the ordinary concept (or concepts) of logical implication. Since tautological implication is a species of logical implication, it follows from result (6) above that on the logician's concept of logical implication, a tautologically inconsistent formula implies any formula whatsoever, and a tautology is implied by any formula whatsoever. This is the analogue to (and results from) the fact that a conditional is true if either its antecedent is false or its consequent is true. Thus, for example, 'Snow is white and snow is not white' logically implies 'Grass is green' (as is often said, Anything follows from a contradiction); and 'Logic is difficult or logic is not difficult' is logically implied by 'God is dead.' At first glance, at any rate, these results seem to be opposed to our intuitions concerning logical implication. Whether they are in fact so opposed, and are 'paradoxes' in this sense, is a question that has been much debated in the literature.³ The position (a popular one) which seems most plausible to the present writer is that there is admittedly some variance between the logician's concept of logical implication and the ordinary concept, or concepts, of logical implication; but that this variance is relatively harmless, and that it could be eliminated (and even then only partially) only at the cost of considerable complexity in theory. We have here simply one of those many points at which precise formulation leads to departure from informal usage.

³ For a recent discussion of these matters, and further references, see G.E. Hughes and M.J. Cresswell 1968, Appendix 2.


1.5. Axiom Schemata of P. Rules of Inference and Theorems

We now return to the syntax of P, and define first the class of axioms of P. What we want to do is to distinguish a class of formulas of P from which all of the tautologies of P (and no other formulas) can be derived, by means of applying certain rules of inference which we shall subsequently specify. Once we succeed in doing this we will have two ways of showing that a given formula A is a tautology; viz., by means of the truth-table test, and by means of deriving A from the axioms of P. Though the truth-table test in principle suffices, in practice it becomes lengthy and cumbersome when a large number of sentential letters appear in A; in this case, the procedure of deriving A from the axioms of P is to be preferred.

The procedure which we shall follow is well known. It consists in laying down several schemata, with the understanding that each of the infinitely many formulas of P which are of the form of any one of these schemata is to count as an axiom of P. It is easy to see, by consulting the truth-tables for '~' and '⊃', that each of these axioms is a tautology. Our axiom schemata are as follows,⁴ where A, B and C are formulas of P:

(a) A ⊃ (B ⊃ A)
(b) (A ⊃ (B ⊃ C)) ⊃ ((A ⊃ B) ⊃ (A ⊃ C))
(c) (~B ⊃ ~A) ⊃ ((~B ⊃ A) ⊃ B)

These axiom schemata themselves occur, of course, not within P itself, but within the meta-language of P. According to the first of these schemata, the following formulas (in informal notation) are examples of axioms of P:

p ⊃ (q ⊃ p)
p ⊃ (p ⊃ p)
(p ⊃ q) ⊃ (q ⊃ (p ⊃ q))
~p ⊃ ((q ⊃ p) ⊃ ~p)

As an example of an axiom provided by axiom schema (b), we have the formula

(p ⊃ ((p ⊃ q) ⊃ r)) ⊃ ((p ⊃ (p ⊃ q)) ⊃ (p ⊃ r));

⁴ These axiom schemata appear in E. Mendelson 1964, Chapter 1.


and as an example provided by axiom schema (c), we have the formula

It will be noticed that the only sentential connectives which occur within the above axiom schemata are the connectives for negation and for the conditional. As for the remaining connectives, we must either add further axiom schemata in which they appear, or in some way correlate them with the connectives for negation and the conditional. We shall here choose the latter course. We shall say that a formula A is definitionally equivalent to another formula B if and only if there are formulas A₁, B₁, A₂ and B₂ such that A and B are alike except that A contains an occurrence of A₁ at some place where B contains an occurrence of B₁, and either

(a) A₁ is A₂ ∨ B₂ and B₁ is ~A₂ ⊃ B₂, or
(b) A₁ is A₂ ∧ B₂ and B₁ is ~(A₂ ⊃ ~B₂), or
(c) A₁ is A₂ ≡ B₂ and B₁ is (A₂ ⊃ B₂) ∧ (B₂ ⊃ A₂).

Thus, for example, by (a) the following two formulas of P are definitionally equivalent:

p ∨ q
~p ⊃ q;

as are the two formulas

(p ≡ q) ∧ ((q ∨ r) ∨ p)
(p ≡ q) ∧ ((~q ⊃ r) ∨ p).

In the first of these examples, A₁ is the whole formula 'p ∨ q' itself, and B₁ is the formula '~p ⊃ q'; in the second example, A₁ is the formula '(q ∨ r)', and B₁ is the formula '(~q ⊃ r)'.

It should be noticed that the above definition of 'definitionally equivalent' proceeds in accordance with the interpretations of the sentential connectives which are provided by the truth-tables for these connectives. Any two formulas of P which are definitionally equivalent in the above sense are equivalent by the truth-table test; that is, for every assignment of truth-values to the sentential letters appearing within these formulas, these two formulas take on the same truth-value.
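The agreement between definitional equivalence and the truth-tables can be illustrated with a small Python sketch. It rests on assumptions that go beyond the text: formulas are represented as nested tuples with hypothetical tags, and the function `unfold` applies clauses (a), (b) and (c) repeatedly, at every place where they apply, whereas the definition above concerns a single replacement. The result of `unfold` is therefore reached from the original formula by a chain of definitionally equivalent formulas.

```python
from itertools import product

def unfold(f):
    """Rewrite every ∨, ∧ and ≡ in terms of ~ and ⊃, following clauses (a), (b), (c)."""
    if isinstance(f, str):
        return f
    op, *parts = f
    parts = [unfold(part) for part in parts]
    if op == "or":    # clause (a): A2 ∨ B2 becomes ~A2 ⊃ B2
        return ("imp", ("not", parts[0]), parts[1])
    if op == "and":   # clause (b): A2 ∧ B2 becomes ~(A2 ⊃ ~B2)
        return ("not", ("imp", parts[0], ("not", parts[1])))
    if op == "iff":   # clause (c), followed by clause (b) for the resulting ∧
        a, b = parts
        return unfold(("and", ("imp", a, b), ("imp", b, a)))
    return (op, *parts)

def value(f, v):
    """Truth-value of a formula under the assignment v."""
    if isinstance(f, str):
        return v[f]
    op, *parts = f
    vals = [value(part, v) for part in parts]
    if op == "not":
        return not vals[0]
    if op == "imp":
        return (not vals[0]) or vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "and":
        return vals[0] and vals[1]
    return vals[0] == vals[1]   # "iff"

def same_truth_table(f, g, letters):
    return all(
        value(f, dict(zip(letters, vs))) == value(g, dict(zip(letters, vs)))
        for vs in product([True, False], repeat=len(letters))
    )

# The second example from the text: (p ≡ q) ∧ ((q ∨ r) ∨ p)
f = ("and", ("iff", "p", "q"), ("or", ("or", "q", "r"), "p"))
g = unfold(f)
print(g)
print(same_truth_table(f, g, ["p", "q", "r"]))
```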


In order now to derive theorems from the axioms of P, we need some rules of inference, which permit us to infer formulas from other formulas. In particular, in order to derive all of the tautologies of P from the axioms of P, we need only two rules of inference. First, we shall use a rule of Definitional Interchange. This rule is as follows:

(a) If A and B are definitionally equivalent to each other, then from A one may infer B, and vice versa. Here the expression 'definitionally equivalent' is meant in the sense of the preceding section.

Second, we shall use the well-known rule of inference, Modus Ponens:

(b) From A and A ⊃ B, one may infer B.

We now define a theorem of P as any formula which is derivable from the axioms of P by means of finitely many applications of these two rules of inference, (a) and (b). More exactly, a theorem of P is any formula A of P which is the last formula in some finite sequence of formulas of P, where each formula in this sequence is either an axiom of P, or obtainable from earlier formulas in this sequence by one application of either rule (a) or rule (b). Such a sequence will be called a proof of A.

Let us now consider an example of a proof of a theorem of P; viz., of the theorem '(p ⊃ p)'.

1. (p ⊃ (p ⊃ p))                                          Axiom schema (a)
2. (p ⊃ ((p ⊃ p) ⊃ p))                                    Axiom schema (a)
3. ((p ⊃ ((p ⊃ p) ⊃ p)) ⊃ ((p ⊃ (p ⊃ p)) ⊃ (p ⊃ p)))      Axiom schema (b)
4. ((p ⊃ (p ⊃ p)) ⊃ (p ⊃ p))                              2, 3, rule (b)
5. (p ⊃ p)                                                1, 4, rule (b)

Each of the formulas in this sequence of five formulas is either an axiom of P, or obtainable from earlier formulas in this sequence by one of the rules of inference of P. This sequence is, then, clearly a proof of the last formula in this sequence; viz., '(p ⊃ p)'.
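That a sequence of formulas is a proof is itself something that can be checked step by step. The Python sketch below is a minimal illustration under stated assumptions: formulas are nested tuples, the strings 'A', 'B', 'C' inside `SCHEMATA` play the part of meta-variables, and only rule (b) is implemented, since the five-step proof of '(p ⊃ p)' makes no use of Definitional Interchange. All function names are hypothetical.

```python
def imp(a, b):
    return ("imp", a, b)

def neg(a):
    return ("not", a)

# Axiom schemata (a), (b), (c); the strings "A", "B", "C" are meta-variables.
SCHEMATA = {
    "a": imp("A", imp("B", "A")),
    "b": imp(imp("A", imp("B", "C")), imp(imp("A", "B"), imp("A", "C"))),
    "c": imp(imp(neg("B"), neg("A")), imp(imp(neg("B"), "A"), "B")),
}

def match(pattern, formula, binding):
    """Does the formula have the form of the pattern, with meta-variables read uniformly?"""
    if isinstance(pattern, str):          # a meta-variable
        if pattern in binding:
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    return (isinstance(formula, tuple)
            and len(formula) == len(pattern)
            and formula[0] == pattern[0]
            and all(match(p, f, binding) for p, f in zip(pattern[1:], formula[1:])))

def is_instance(schema_name, formula):
    return match(SCHEMATA[schema_name], formula, {})

def check_proof(steps):
    """Return the numbers of the steps that fail; an empty list means every step checks."""
    bad = []
    for n, (formula, reason) in enumerate(steps, start=1):
        if reason[0] == "axiom":
            ok = is_instance(reason[1], formula)
        else:   # ("mp", i, j): step i is A and step j is A ⊃ B, both earlier than step n
            _, i, j = reason
            ok = (1 <= i < n and 1 <= j < n
                  and steps[j - 1][0] == imp(steps[i - 1][0], formula))
        if not ok:
            bad.append(n)
    return bad

pp = imp("p", "p")
s1 = imp("p", pp)                 # step 1
s2 = imp("p", imp(pp, "p"))       # step 2
PROOF = [
    (s1, ("axiom", "a")),
    (s2, ("axiom", "a")),
    (imp(s2, imp(s1, pp)), ("axiom", "b")),
    (imp(s1, pp), ("mp", 2, 3)),
    (pp, ("mp", 1, 4)),
]
print(check_proof(PROOF))
```

The checker only verifies a proof that has been supplied to it. Finding a proof of a given theorem is a separate task, which this sketch does not attempt.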

A sound rule of inference is defined as any rule of inference which, when applied to true formulas, permits us to infer only true formulas. By consulting the truth-tables for the sentential connectives, the reader will see that rules (a) and (b) are both sound rules of inference. There are, of course, a great number of sound rules of inference; strictly speaking, an infinite number. Certain of these rules of inference depend for their soundness solely upon the logical properties of the truth-functional connectives. Consider, for example, the tautological schema

(A ⊃ B) ⊃ (~B ⊃ ~A).

Corresponding to this schema, there is the sound rule of inference

From A ⊃ B and ~B, one may infer ~A.

The soundness of this rule of inference consists in the fact that for any two formulas A and B, whenever A ⊃ B and ~B are true, then ~A is true also. And this fact itself depends solely upon the logical properties of the sentential connectives involved; viz., '~' and '⊃'. Similarly, to every tautological schema of conditional form there corresponds a sound rule of inference.

Though there are a great number of rules of inference that are sound on truth-functional grounds alone, then, it is to be noticed that within our system P, we use only the rule of Definitional Interchange and the rule Modus Ponens as our primitive rules of inference. In the derivation of theorems from axioms, we are able to dispense with any further primitive rules of inference.

In practice, however, it is very convenient to be able to draw upon additional sound rules of inference, such as the rule which permits us to infer ~A from A ⊃ B and ~B. This rule can be given an effective proof, in the sense that we can show effectively how to replace any inference within P in which it is used by an inference in which only the primitive rules of inference of P are used. Consider an inference in which this rule is used:

1. A ⊃ B
2. ~B
3. ~A

This inference of ~A from A ⊃ B and ~B can be replaced by an inference in which only primitive rules are used. For any formulas A and B, it is known that the formula (A ⊃ B) ⊃ (~B ⊃ ~A) will be derivable from the axioms of P. Thus, in our inference we first derive the formula (A ⊃ B) ⊃ (~B ⊃ ~A). We then add the step A ⊃ B, and by Modus Ponens conclude ~B ⊃ ~A. Finally, we add the step ~B, and conclude ~A by Modus Ponens.

Any rule of inference which we have proved in the above sense can be used as a derived rule of inference within P. Derived rules of inference serve as shortcuts. They permit us to derive formulas from other formulas in fewer steps than would be needed if we were to use only primitive rules. Now to every tautological schema of conditional form there corresponds a sound rule of inference. Further, it is known that there is an effective procedure for deriving any tautology of P from the axioms of P. It follows that to every tautological schema of conditional form there corresponds a rule of inference which can be proved as a derived rule of P.

1.6. Metamathematical Properties of P

Logicians have studied the sentential logic very extensively. We are not here primarily interested in the sentential logic for its own sake, however, but rather as a step toward a more comprehensive logic; viz., the first-order predicate logic. For that reason, we shall now bring our consideration of the sentential logic to a close by introducing three very important syntactical concepts, in terms of which we can draw attention to some basic features of the system P. Let Γ be any (non-empty) set of formulas of P. Then, we say that Γ is consistent if and only if there is no formula A of P such that both A and ~A are derivable from Γ. We say that Γ is complete if and only if every tautology of P is derivable from Γ. And Γ is a decidable set of formulas if and only if there is an effective procedure for determining whether or not any formula A of P is included in Γ.

It is very easy to show that the set of axioms for P which we have presented above is consistent. First, as we have already pointed out, each of these axioms is a tautology, as can readily be seen by considering the truth-tables for the sentential connectives which appear within the various axiom schemata of P. Second, the two primitive rules of inference within P when applied to tautologies permit us to infer only tautologies. This, too, can readily be seen by considering the truth-tables for the sentential connectives which appear within those rules of inference. Thus, for Modus Ponens, suppose that both A and A ⊃ B are tautologies. Then B must be a tautology also. For if it were not a tautology, there would be some assignment of truth-values to the sentential letters appearing within it which would make B false. But then, since A is a tautology, on that assignment of truth-values the antecedent of A ⊃ B would be true and its consequent false. By the truth-table for the conditional, the formula A ⊃ B would then be false, and thus not a tautology, contrary to our assumption. By similar reasoning, one can show that the rule of Definitional Interchange when applied to tautologies permits us to infer only tautologies. Thus, we can conclude that all of the theorems of P are tautologies. But no tautology can be the negation of any other tautology; for if it were, it would be false for every assignment of truth-values to its sentential letters, and thus not a tautology. There is, then, no formula A of P such that both A and ~A are theorems of P. Thus, our set of axioms for P is consistent.

As we have already remarked, our set of axioms is known to be complete. (We here omit the proof of this.⁵) From this fact, together with the fact that all theorems of P are tautologies, it follows that the class of formulas which are theorems of P is identical with the class of tautologies of P. And from this it follows that the class of theorems of P is decidable; that is, that there is a mechanical procedure for determining whether an arbitrary formula of P is a theorem of P.⁶ All that we need to do in order to determine whether any given formula of P is a theorem of P is to construct the truth-table for that formula. If this truth-table shows that formula to be a tautology, then it is a theorem of P; if not, then it is not a theorem of P. More generally, because a formula of P is a theorem of P if and only if it is a tautology, it follows that whatever general results hold true for tautologies also hold for theorems. Thus, for example, in the list of results about tautologies which appears on page 19, if we everywhere replace the word 'tautology' by the phrase 'theorem of P,' the results will all be true statements. In particular, we have the important result that if Γ is a class of theorems of P and A is a tautological consequence of Γ, then A is a theorem of P.

⁵ See E. Mendelson 1964, pp. 36-37.

⁶ The decidability of the class of theorems of sentential logic was first established by Post, 1921.

CHAPTER II

THE FIRST-ORDER PREDICATE LOGIC: I

The logic we are now about to consider is the so-called first-order predicate logic, or the functional calculus of first order. This logic contains the sentential logic within it as a proper part, in the sense that all reasoning that can be carried out within the sentential logic can also be carried out within the first-order predicate logic, but not vice versa. Within the first-order predicate logic, the logical structure of formulas and of arguments can be presented in considerably greater detail than can be done within the sentential logic itself. Thus, for example, within the sentential logic the difference in logical structure between the sentences 'Seven is greater than six' and 'All men are mortal' cannot be exhibited in any way. Clearly, any system of logic which is satisfactory, however, must enable us to exhibit this difference. As we shall see, the first-order predicate logic permits us to do this. As for the logical structure of arguments, consider the following argument:

All men are mortal,
All Greeks are men,
Therefore, all Greeks are mortal.

This argument is certainly of valid form. Within the limitations of the sentential logic, however, we cannot express its form other than as follows:

p, q, Therefore, r.


Clearly, we need to be able to express the form of this argument in greater detail than this if we are to account for its validity. Once we have the first-order predicate logic at our disposal, we shall be able to do this in a way which is thoroughly satisfactory for the purpose of establishing the validity of this argument. Indeed, we shall be able to express in satisfactory fashion the logical forms of a great variety of sentences and arguments.

The first-order predicate logic as a formal system made its first appearance (in effect, though not perfectly explicitly) in Frege's Begriffsschrift (1879). In addition to Frege, other important figures in the early history of the first-order predicate logic include G. Peano (1858-1932), C.S. Peirce, Bertrand Russell and A.N. Whitehead (in Principia Mathematica), T. Skolem, and D. Hilbert and W. Ackermann (in their Grundzüge der theoretischen Logik, 1928).¹

2.1. The First-Order Predicate Logic F¹. Symbols, Quantifiers and Formulas

Just as there is more than one way to set up the sentential logic, so too there is more than one way to set up the first-order predicate logic. We proceed now to one formulation of that logic, which we shall call F¹. Rather than being the name of only one system of first-order predicate logic, however, 'F¹' will stand ambiguously for a number of first-order predicate logics. These various logical systems will all closely resemble one another, and for all of our purposes we can treat them together. They will differ only in which symbols they contain. We shall, however, refer to them collectively as F¹.

The symbols of each of the systems F¹ include all of the symbols under (1) and (2) from the following list. In addition, any particular one of the systems F¹ may or may not include various of the symbols under (3); and each of the systems F¹ includes one or more of the symbols under (4).

¹ For detailed notes on the history of the first-order predicate logic, see A. Church 1956, section 49.

(1) The connectives of P together with the parentheses, and one new symbol; viz., '∃'. That is, here we have the following symbols:

~  ⊃  ∨  ∧  ≡  (  )  ∃

These symbols are the logical constants of F¹. All other constants of F¹ are non-logical constants.

(2) An infinite list of individual variables; viz.,

x  y  z  x₁  y₁  z₁  x₂  y₂  z₂  ...

Whenever we need to speak of the n-th individual variable of F¹, the above is to be taken as the ordering of the individual variables.

(3) An infinite list of individual constants, which we need not specify here.

(4) For each positive integer n, an infinite list of n-ary predicate constants; viz., an infinite list of singulary predicate constants

P¹  Q¹  R¹  P¹₁  Q¹₁  R¹₁  ... ,

an infinite list of binary predicate constants

P²  Q²  R²  P²₁  Q²₁  R²₁  ... ,

and so on. (We shall use these particular predicate constants only in order to illustrate formulas in this chapter and the following one. In later chapters we shall introduce new predicate constants.)

We shall subsequently specify in an exact way how the symbols of F¹ are to be interpreted. Let it suffice for the moment simply to say that (a) the individual variables are to be thought of as ranging over some arbitrary (non-empty) domain of entities; (b) that the individual constants are to be thought of as standing for certain particular entities in that domain; and (c) that the predicate constants are to be thought of as standing for particular properties of and relations among the entities in that domain. In particular, the n-ary predicate constants are to stand for n-ary relations; that is, relations with n terms.

A great variety of mathematical theories can be developed within the first-order predicate logic. (Often symbols for identity and operations must be added. We shall consider these additions subsequently.) Any formulation of the first-order predicate logic which does not include predicate variables, such as the formulation F¹, is often referred to as a simple applied first-order predicate logic. And any mathematical theory stated within a simple applied first-order predicate logic (possibly with identity and operation symbols added) is often referred to as an elementary theory, or a theory with standard formalization. The general properties of elementary theories have been much studied, and a number of very important results concerning all such theories have been discovered. We shall consider a number of elementary mathematical theories in later chapters.

As before, we shall use the bold face roman capitals A, B, C, etc., as meta-variables, now understood as ranging over the expressions of F¹. And within the meta-language of F¹ we now also include a second infinite list of variables; viz., the bold face roman small letters:

a  b  c  a₁  b₁  c₁  a₂  b₂  ... ,

which shall range over the individual variables and constants of F¹. And we also include the bold face letters 'f', 'f₁', 'f₂', ..., which shall range over the predicate constants of F¹.

The concept of a quantifier is one of the central concepts within the syntax of the predicate logic. There are within F¹ two kinds of quantifiers: universal quantifiers and existential quantifiers. A universal quantifier of F¹ is any expression of the form (a), where a is any individual variable of F¹; and an existential quantifier of F¹ is any expression of the form (∃a), where a is any individual variable of F¹. Thus, there are infinitely many universal and existential quantifiers within F¹. For example, '(x)' and '(y)' are universal quantifiers, while '(∃x)' and '(∃y)' are existential quantifiers. The universal quantifier has the following meaning: a formula (a)A is true if and only if everything within the range of the variable a satisfies the formula A. For example, the formula '(x)P¹x' is true if and only if everything within the range of the variable 'x' satisfies the formula 'P¹x'. And the existential quantifier has the following meaning: a formula (∃a)A is true if and only if something within the range of the variable a satisfies the formula A. But these are only informal remarks. We shall subsequently give a precise interpretation of the quantifiers.

Let us now define the class of formulas of F1 in an exact way. As in the case of P, we shall here proceed recursively, by listing all possible cases.

A formula of F1 is any expression within F1 (i.e., any finitely long string of symbols of F1) which is either (a) an n-ary predicate constant of F1, followed by n occurrences of individual variables and/or individual constants of F1; or (b) the negation of a formula of F1; or (c) the conditional, or disjunction, or conjunction, or biconditional between any two formulas of F1; or (d) the result of putting a quantifier before a formula of F1. Thus:

(a) if f is an n-ary predicate constant of F1, and a₁, a₂, ..., aₙ are individual variables and/or individual constants, then f a₁ a₂ ... aₙ is a formula of F1;

(b) if A is a formula of F1, then ~A is a formula of F1;

(c) if A and B are formulas of F1, then (A ⊃ B), (A ∨ B), (A ∧ B) and (A ≡ B) are formulas of F1;

(d) if A is a formula of F1, and a is an individual variable of F1, then (a)A and (∃a)A are formulas of F1.

In particular, an atomic formula of F1 is any formula of F1 of type (a). Any formula of the form (a)A is called a universal generalization, and any formula of the form (∃a)A is called an existential generalization.

By way of illustration of various of the special cases within this definition, consider the following expression:

(1) (x₁)(((x)P¹x ∧ (∃y)(z)Q²y z) ⊃ Q¹x₁).

By (a), 'P¹x', 'Q¹x₁' and 'Q²y z' are formulas. Thus, by (d), '(x)P¹x' and '(z)Q²y z' are formulas; and by (d) again, '(∃y)(z)Q²y z' is a formula. By (c), then,

((x)P¹x ∧ (∃y)(z)Q²y z)

is a formula; and by (c) again,

(((x)P¹x ∧ (∃y)(z)Q²y z) ⊃ Q¹x₁)

is a formula. Finally, then, by (d) the expression (1) itself is a formula.
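The recursive definition of formula can be mirrored by a recursive check. The following is only a minimal sketch, assuming a hypothetical encoding that is not part of the text: formulas are nested Python tuples, the predicate constants are named `P1`, `Q1` and `Q2` with the arities shown, and individual variables are strings beginning with 'x', 'y' or 'z'.

```python
# Hypothetical encoding (not from the text): formulas are nested tuples, e.g.
#   ("atom", "P1", ("x",))   ("not", A)   ("imp", A, B)   ("all", "x", A)
ARITY = {"P1": 1, "Q1": 1, "Q2": 2}     # assumed predicate constants and arities
BINARY = {"imp", "or", "and", "iff"}    # the four connectives of clause (c)


def is_variable(name):
    """Assumed convention: variables are strings starting with x, y or z."""
    return isinstance(name, str) and name[:1] in ("x", "y", "z")


def is_formula(e):
    """Return True if e is built up by clauses (a)-(d) of the definition."""
    if not isinstance(e, tuple) or not e or not isinstance(e[0], str):
        return False
    tag = e[0]
    if tag == "atom":                   # clause (a)
        return (len(e) == 3
                and isinstance(e[1], str) and e[1] in ARITY
                and isinstance(e[2], tuple)
                and len(e[2]) == ARITY[e[1]]
                and all(isinstance(t, str) for t in e[2]))
    if tag == "not":                    # clause (b)
        return len(e) == 2 and is_formula(e[1])
    if tag in BINARY:                   # clause (c)
        return len(e) == 3 and is_formula(e[1]) and is_formula(e[2])
    if tag in ("all", "ex"):            # clause (d)
        return len(e) == 3 and is_variable(e[1]) and is_formula(e[2])
    return False


# Expression (1) of the text in this encoding.
formula_1 = ("all", "x1",
             ("imp",
              ("and",
               ("all", "x", ("atom", "P1", ("x",))),
               ("ex", "y", ("all", "z", ("atom", "Q2", ("y", "z"))))),
              ("atom", "Q1", ("x1",))))

print(is_formula(formula_1))                  # True
print(is_formula(("atom", "Q2", ("y",))))     # False: 'Q2' needs two arguments
```

The check succeeds on (1) for the same reason the step-by-step argument does: each sub-expression falls under exactly one of the four clauses.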

We shall from this point on usually draw upon the conventions for omitting parentheses from formulas and schemata which were introduced in our discussion of the sentential logic. Further, we shall omit superscripts on predicate constants; for the required superscript in each case is evident if individual variables or constants appear in the argument places.

Consider now the algebraic formula 'x < 3'. This formula is true for certain values of 'x' and false for others. We shall say that the occurrence of 'x' in this formula is a free occurrence. Consider however the formula 'for some x, x < 3'. Here there is a qualification on the occurrence of 'x' in 'x < 3', with the result that this formula has one determinate truth-value; viz., truth. We shall say that the occurrence of 'x' in 'x < 3' in this formula is a bound occurrence (as is the occurrence of 'x' in 'for some x'). Let us now distinguish free and bound occurrences of variables in an exact way. A particular occurrence of a variable a within a formula A is a bound occurrence of a in A if and only if it occurs within some part of A which is a formula of the form (a)B or of the form (∃a)B. Otherwise, that occurrence is a free occurrence of a in A. For example, within the formula '(x)P x y ∧ Q x' the first two occurrences of 'x' are bound occurrences, while the third is a free occurrence; and the only occurrence of 'y' within that formula is a free occurrence.

If a formula (a)B, or (∃a)B, occurs within a formula A (or is the formula A itself), then the scope in A of that particular occurrence of the quantifier (a), or (∃a), is the formula (a)B, or (∃a)B, itself. Thus, within the formula '(x)P x y ∧ Q x', the scope of the quantifier '(x)' is the formula '(x)P x y'. The final occurrence of 'x' in this formula lies outside the scope of that quantifier.

Finally, a sentence, or a closed formula, of F1 is any formula of F1 which contains no free occurrences of variables. All other formulas of F1 are called open formulas.
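The distinction between free and bound occurrences can be computed recursively. The sketch below is illustrative only and rests on assumptions not in the text: formulas are encoded as nested tuples, every argument of a predicate is taken to be a variable (individual constants are left out), and the function returns the set of variables having at least one free occurrence rather than tracking occurrences one by one.

```python
BINARY = {"imp", "or", "and", "iff"}


def free_vars(f):
    """Variables with at least one free occurrence in f (tuple encoding assumed)."""
    tag = f[0]
    if tag == "atom":
        return set(f[2])                    # every occurrence in an atom is free
    if tag == "not":
        return free_vars(f[1])
    if tag in BINARY:
        return free_vars(f[1]) | free_vars(f[2])
    if tag in ("all", "ex"):
        # Occurrences of f[1] inside the scope (a)B or (∃a)B are bound.
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"not a formula: {f!r}")


def is_sentence(f):
    """A sentence (closed formula) has no free occurrences of variables."""
    return not free_vars(f)


# '(x)P x y ∧ Q x': the last 'x' and the only 'y' are free.
example = ("and",
           ("all", "x", ("atom", "P", ("x", "y"))),
           ("atom", "Q", ("x",)))

print(sorted(free_vars(example)))                         # ['x', 'y']
print(is_sentence(example))                               # False
print(is_sentence(("all", "x", ("atom", "Q", ("x",)))))   # True
```

Note that 'x' is reported as free in the example even though two of its occurrences are bound; a variable may have both kinds of occurrence in one formula.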


2.2. Interpretations. Truth and Validity

The concept of a formula of F1 is a syntactical concept, because this concept was defined in such a way as to make reference only to the forms of expressions, quite apart from their meanings. Having defined the concept of formula, we are now in a position to turn to the semantics of F1. What distinguishes semantics from syntax is that within semantics we consider expressions together with their meanings, or interpretations, as well as their forms, and not merely expressions and their forms. Whatever concepts concerning some particular language are defined so as to make reference to expressions of that language together with the meanings of those expressions are semantical concepts. One of the principal tasks of this chapter is to define in an exact way the concepts of logical validity and logical implication with respect to the first-order predicate logic F1. These concepts will be defined as semantical concepts.

Within the semantics of F1 singulary (i.e., one-place) predicate constants will be assigned classes of individuals taken from some (non-empty) domain of individuals, over which the individual variables are said to range. For every n greater than 1, the n-ary predicate constants will be assigned n-ary relations among the individuals in this domain of individuals. Thus, we shall assign binary relations to binary (i.e., two-place) predicate constants; ternary relations to ternary (i.e., three-place) predicate constants; and so on. Binary relations are familiar from algebra, as well as from everyday experience; e.g., the relations of being greater than, being less than, being to the left of, being father of. Examples of ternary and quaternary (i.e., four-place) relations from plane geometry are, respectively, the relation of point x lying between points y and z, and the relation of point x being the same distance from point y as point z is from point w. If now we presuppose the notion of an ordered n-tuple, we can define the concept of an n-ary relation in a very general way; viz., an n-ary relation is any class of ordered n-tuples whatsoever. (We shall, in Chapter VII, show how the notion of an ordered n-tuple in turn can be defined in terms of the notion of set.) Thus, the less than relation among the integers can be defined as the class of all ordered pairs (x, y) such that x and y are integers, and x is less than y. And the geometric relation of point x lying between points y and z can be defined as the class of all ordered triples (x, y, z) which are such that x, y and z are points on a line, and x lies between y and z. Both of these relations so understood will be classes with infinitely many ordered tuples as elements.

For purposes of generality, we shall subsequently think of classes of individuals as singulary relations among individuals.

Let us now proceed to definitions, considering first the definition of the semantical concept of an interpretation.² Roughly, an interpretation of F1 consists of a domain of entities over which the individual variables of F1 range, together with assignments of appropriate entities defined with respect to this domain for each of the non-logical constants of F1. More exactly, an interpretation I of F1 consists of:

(a) a non-empty domain D, over which the individual variables of F1 range;

(b) for each individual constant (if any) of F1, an assignment to that constant of some individual from the domain D; and

(c) for each n-ary predicate constant of F1, an assignment to that constant of some n-ary relation among the individuals of D.

It should be noticed that we are not here defining the concept of an interpretation merely of a formula of F1, but rather the concept of an interpretation of F1 itself - in which, of course, every formula of F1 receives an interpretation.

We need next the concept of an arbitrary infinite sequence of individuals within D. The variables of F1 are arranged in an infinite sequence. Thus, each infinite sequence of individuals within D correlates with each of the variables of F1 some individual within D. Thus, in particular, given any formula A and infinite sequence of individuals S = (b₁, b₂, ...), each of the free variables within A has some individual within D correlated with it by S.

² Throughout the rest of this chapter we follow rather closely E. Mendelson 1964, Chapter 2, with some minor changes. For an alternative semantical and syntactical approach to the first-order predicate logic, see R.C. Lyndon 1966, pp. 13-19, 43-48. For a definitive treatment of the theory of formal inference for the first-order predicate logic, see R. Montague and L. Henkin 1956.

Let us use the symbol

S*(a)

to stand for the individual in D which is assigned to a by I if a is an individual constant, and the individual in D which is correlated with a by S if a is an individual variable.

We now define what it means to say that a given sequence S satisfies a formula A with respect to an interpretation I. We do this recursively, in a manner paralleling the general definition of formula. In stating this definition, we proceed, of course, in accordance with the intended interpretations of the sentential connectives and quantifiers.

Let A be any formula of F1, I an interpretation of F1, and S any infinite sequence of individuals from the domain D of I.

(a) If A is an atomic formula f a₁ ... aₙ, then S satisfies A (with respect to I) if and only if the n individuals S*(a₁) ... S*(aₙ) are related by the relation which I assigns to f. That is, if and only if the ordered n-tuple (S*(a₁), ..., S*(aₙ)) is an element of the class of ordered n-tuples that I assigns to f.

(b) If A is ~B, for some formula B, then S satisfies A if and only if S does not satisfy B.

(c) If A is B ⊃ C, for some formulas B and C, then S satisfies A if and only if either S does not satisfy B or S satisfies C.

(d) If A is B ∨ C, for some formulas B and C, then S satisfies A if and only if S satisfies either B or C (or both).

(e) If A is B ∧ C, for some formulas B and C, then S satisfies A if and only if S satisfies both B and C.

(f) If A is B ≡ C, for some formulas B and C, then S satisfies A if and only if either S satisfies both B and C, or S satisfies neither B nor C.

Before we add two final clauses to this definition, which cover quantification on formulas, we introduce the notion of an a-variant of an infinite sequence S. An infinite sequence of individuals in D is an a-variant of some given infinite sequence of individuals in D if and only if a is an individual variable, and either (a) these two sequences are identical with one another, or (b) they differ only in the individual that they correlate with the variable a (being alike in every other respect). Thus, as a special case, every sequence is an a-variant of itself. We now add two final clauses, stated in terms of this concept of an a-variant.

(g) If A is (a)B, for some individual variable a and formula B, then S satisfies A if and only if every a-variant of S satisfies B.

(h) If A is (∃a)B, for some individual variable a and formula B, then S satisfies A if and only if some a-variant of S satisfies B.

The reader will probably appreciate a few illustrations of this definition. Consider the formula 'P x'. Suppose that some interpretation I assigns the domain of positive integers as the range of the variables of F1, and assigns the class of prime numbers to the predicate 'P'. Then, since 'x' is the first variable of F1, by clause (a) any sequence S which had a prime number as its first term would satisfy the formula 'P x'. Consider now the formula '(∃x)P x'. By clause (h), an arbitrary sequence S satisfies this formula if and only if some 'x'-variant of S - thus, some sequence differing from S at most in its first term - satisfies 'P x'. Now any sequence which has a prime number as its first term and is otherwise identical with S will satisfy 'P x'. Thus, since there is such a sequence, S itself satisfies '(∃x)P x'. Because S was an arbitrary sequence, all sequences satisfy '(∃x)P x' (with respect to the above interpretation I). Indeed, it is easy to see that, in general, if any sequence satisfies a formula with no free variables, with respect to some interpretation I, then every sequence satisfies that formula with respect to I. With respect to a given interpretation, then, a sentence is either satisfied by all sequences or by none. This is not true for formulas in general, of course.

As one further example, consider the formula '(x)P x'. A sequence S satisfies '(x)P x' if and only if every 'x'-variant of S satisfies 'P x'. Thus, if there is any 'x'-variant of S at all which fails to satisfy 'P x', then S will fail to satisfy '(x)P x'. But any 'x'-variant which has some number which is not prime as its first term will fail to satisfy 'P x'. Thus, S itself does not satisfy '(x)P x'. That is, no sequence satisfies '(x)P x' (with respect to the above interpretation I).
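Clauses (a)-(h) translate almost line for line into a recursive function. What follows is a minimal sketch under several assumptions that depart from the text: the domain is the finite set 1, ..., 10 instead of all positive integers (so that the quantifier clauses can be checked by enumeration), a Python dict from variables to individuals stands in for an infinite sequence, and formulas use a hypothetical nested-tuple encoding.

```python
def value_of(s, a, interp):
    """S*(a): what I assigns to a if a is a constant, else what s correlates with a."""
    constants = interp.get("constants", {})
    return constants[a] if a in constants else s[a]


def satisfies(s, f, interp):
    """Does s satisfy f with respect to interp?  One branch per clause (a)-(h)."""
    tag = f[0]
    if tag == "atom":                                   # clause (a)
        args = tuple(value_of(s, a, interp) for a in f[2])
        return args in interp["relations"][f[1]]
    if tag == "not":                                    # clause (b)
        return not satisfies(s, f[1], interp)
    if tag == "imp":                                    # clause (c)
        return (not satisfies(s, f[1], interp)) or satisfies(s, f[2], interp)
    if tag == "or":                                     # clause (d)
        return satisfies(s, f[1], interp) or satisfies(s, f[2], interp)
    if tag == "and":                                    # clause (e)
        return satisfies(s, f[1], interp) and satisfies(s, f[2], interp)
    if tag == "iff":                                    # clause (f)
        return satisfies(s, f[1], interp) == satisfies(s, f[2], interp)
    if tag in ("all", "ex"):
        # The a-variants of s: like s except possibly at the variable f[1].
        variants = ({**s, f[1]: d} for d in interp["domain"])
        if tag == "all":                                # clause (g)
            return all(satisfies(v, f[2], interp) for v in variants)
        return any(satisfies(v, f[2], interp) for v in variants)   # clause (h)
    raise ValueError(f"not a formula: {f!r}")


# Assumed finite stand-in for the interpretation of the text.
interp = {"domain": range(1, 11),
          "relations": {"P": {(2,), (3,), (5,), (7,)}}}   # primes up to 10
Px = ("atom", "P", ("x",))

print(satisfies({"x": 3}, Px, interp))                 # True
print(satisfies({"x": 4}, Px, interp))                 # False
print(satisfies({"x": 4}, ("ex", "x", Px), interp))    # True
print(satisfies({"x": 3}, ("all", "x", Px), interp))   # False
```

As in the text, whether the sentences '(∃x)P x' and '(x)P x' are satisfied does not depend on the value the sequence gives to 'x', while for the open formula 'P x' it does.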


In terms of the concepts which we now have at hand, we are able to define a variety of further semantical concepts, many of which correspond in some measure to some important intuitive concept (indeed, possibly in some cases to more than one such concept). Naturally, each of the concepts that we define will be relativized to the formulas of F1.

First, the concept of truth. Rather than define a concept simply of truth, we shall define a concept of truth under an interpretation. And we shall define this concept, and all further semantical concepts, not just for sentences - that is, formulas with no free variables - but for formulas in general.

Let the (universal) closure of a formula A be the result of prefixing A with universal quantifiers on each of its free individual variables. (These variables are to be taken in increasing order. And if A contains no free variables, then A is to count as its own closure.) We shall define truth under an interpretation in such a way as to make a formula with free variables true if and only if its universal closure is true. Thus, for example, 'P x' will be true under a given interpretation if and only if '(x)P x' is true under that interpretation.
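Forming a universal closure is a purely syntactical operation, and a short sketch may make it concrete. The code below assumes a hypothetical nested-tuple encoding of formulas, a fixed ordering `ORDER` of three variables, and the convention that the earliest free variable receives the outermost quantifier; the text fixes only that the variables are taken in increasing order, so that last choice is an assumption.

```python
BINARY = {"imp", "or", "and", "iff"}
ORDER = ["x", "y", "z"]          # assumed fixed ordering of the variables


def free_vars(f):
    """Variables with at least one free occurrence in f."""
    tag = f[0]
    if tag == "atom":
        return set(f[2])
    if tag == "not":
        return free_vars(f[1])
    if tag in BINARY:
        return free_vars(f[1]) | free_vars(f[2])
    if tag in ("all", "ex"):
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"not a formula: {f!r}")


def closure(f):
    """Prefix f with universal quantifiers on each of its free variables."""
    # Wrap the latest variable first, so that the earliest ends up outermost.
    for v in sorted(free_vars(f), key=ORDER.index, reverse=True):
        f = ("all", v, f)
    return f


Qxy = ("atom", "Q", ("x", "y"))
print(closure(Qxy))
# ('all', 'x', ('all', 'y', ('atom', 'Q', ('x', 'y'))))

sentence = ("all", "x", ("atom", "P", ("x",)))
print(closure(sentence) == sentence)     # True: a sentence is its own closure
```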

Given these remarks, the definition is almost obvious. A formula A of F1 is true under an interpretation I if and only if A is satisfied by every infinite sequence S of individuals in the domain of I. Further, we shall say that a formula A is false under an interpretation I if and only if A is satisfied by no sequence of individuals in the domain of I.

The reader should have no difficulty in seeing that the following results are immediate consequences of these definitions.

(1) A formula A is true under I if and only if its universal closure is true under I.

(2) (a)A is true under I if and only if ~(∃a)~A is true under I; and similarly for ~(a)~A and (∃a)A.

(3) If A and A ⊃ B are both true under I, then B is true under I.

(4) A formula A is false under an interpretation I if and only if ~A is true under I. And A is true under I if and only if ~A is false under I.

(5) No formula is both true and false under the same interpretation I.

(6) If A is a sentence, then either A or ~A is true under I; thus, A is either true or false under I.

Notice, however, that it is not the case that if A is any formula whatsoever, then either A or ~A is true under I. Consider again the open formula 'P x'. Under any interpretation which assigns the positive integers as domain, and assigns the class of prime numbers to 'P', this open formula is neither true nor false. This is because an open formula is true if and only if its universal closure is true, and false if and only if the universal closure of its negation is true. Neither '(x)P x' nor '(x)~P x' is true under such an interpretation, since it is neither the case that all positive integers are primes, nor the case that none of them are primes.
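The three possibilities (true, false, neither) can be checked mechanically once the domain is finite. The sketch below is illustrative and assumes a finite domain 1, ..., 10 in place of the positive integers, dicts in place of infinite sequences, a hypothetical tuple encoding of formulas, and an evaluator that covers only the connectives used here. It relies on the standard fact that satisfaction depends only on the values given to the variables occurring in the formula, so enumerating those values suffices.

```python
from itertools import product


def satisfies(s, f, interp):
    """Satisfaction for atoms, ~, ⊃ and the two quantifiers (tuple encoding assumed)."""
    tag = f[0]
    if tag == "atom":
        return tuple(s[a] for a in f[2]) in interp["relations"][f[1]]
    if tag == "not":
        return not satisfies(s, f[1], interp)
    if tag == "imp":
        return (not satisfies(s, f[1], interp)) or satisfies(s, f[2], interp)
    if tag in ("all", "ex"):
        results = (satisfies({**s, f[1]: d}, f[2], interp) for d in interp["domain"])
        return all(results) if tag == "all" else any(results)
    raise ValueError("connective not covered in this sketch")


def status(f, variables, interp):
    """'true' if every assignment satisfies f, 'false' if none does, else 'neither'."""
    dom = list(interp["domain"])
    results = [satisfies(dict(zip(variables, vals)), f, interp)
               for vals in product(dom, repeat=len(variables))]
    if all(results):
        return "true"
    if not any(results):
        return "false"
    return "neither"


interp = {"domain": range(1, 11),
          "relations": {"P": {(2,), (3,), (5,), (7,)}}}   # primes up to 10
Px = ("atom", "P", ("x",))

print(status(Px, ["x"], interp))                   # neither
print(status(("not", Px), ["x"], interp))          # neither
print(status(("all", "x", Px), ["x"], interp))     # false
print(status(("ex", "x", Px), ["x"], interp))      # true
```

The two sentences come out true or false, in line with result (6), while the open formula 'P x' and its negation are each satisfied by some assignments and not by others.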

We introduce now the central concept of a model. A model of a class of formulas Γ (or of a formula A) is simply any interpretation under which all of the formulas of Γ (or A) are true. And if I is a model of Γ (or of A), then we say that Γ (or A) holds in I. Further, we shall say that Γ (or A) is semantically consistent if and only if Γ (or A) has a model.

A formula A is valid within F1 - or logically valid, or a logical truth, within F1 - if and only if A is true under every interpretation. Here we have a precise analysis (restricted to formulas of F1) of Leibniz' famous informal concept of 'being true in all possible worlds.' Intuitively, a valid formula is one that is true by virtue of logical considerations alone; or one that is true under all logically possible conditions.

We shall present illustrations of the logically valid formulas of F1 later, when we come to a consideration of the axioms and theorems of F1.

A formula A is (logically) inconsistent if and only if A is false under every interpretation. This will be the case, obviously, if and only if ~A is logically valid. And a formula A is satisfiable if and only if, for at least one interpretation, A is satisfied by at least one sequence. Clearly, any formula A will be valid if and only if ~A is not satisfiable.

A formula A is a (logical) consequence of a class of formulas Γ if and only if, for every interpretation I, any sequence which satisfies all of the formulas in Γ also satisfies A. We say that a formula B (logically) implies A if and only if A is a logical consequence of the class which contains B as its only element. And two formulas are said to be (logically) equivalent if and only if they logically imply each other.

The semantical concepts introduced in our discussion of the sentential logic can easily be shown to be special cases of the above concepts, and are readily transferred to the predicate logic. Thus, for example, the tautologies of F1 are a special case of the logically valid formulas of F1, and tautological implication is a special case of logical implication. As we shall see later, however, the semantical concepts for the predicate logic differ in an important respect from the semantical concepts of the sentential logic. These latter concepts are effectively defined concepts (for all formulas A and finite classes of formulas Γ). The former concepts, however, are not effectively defined concepts. That is, there are no effective procedures for determining in every case whether the semantical concepts for the predicate logic apply to that case. Thus, for example, the concept of valid formula of F1 is not an effectively defined concept, though the concept of a tautology of F1 is an effectively defined concept. Truth-tables provide us with an effective test for determining whether or not any given formula A is a tautology; but there is no corresponding effective test for determining whether or not A is valid. In a great many cases we can determine, of course, whether or not A is valid; but it is known that there is no effective procedure for determining this in every case.

Corresponding to the list of results following from the semantical concepts which we defined in discussing the sentential logic (page 19), we now have the following list of more comprehensive results for formulas of F1 which we trust the reader will readily see to follow from the above definitions of semantical concepts for F1:

(1) A logically implies B if and only if the conditional A ⊃ B is valid.

(2) A and B are logically equivalent if and only if the biconditional A ≡ B is valid.

(3) If Γ is a class of valid formulas, and A is a logical consequence of Γ, then A is valid.

(4) If A implies B, and B implies C, then A implies C. That is, logical implication is transitive.

(5) Any two valid formulas are logically equivalent.

(6) A valid formula is implied by any formula (or class of formulas) whatsoever; and an inconsistent formula implies any formula whatsoever.

(7) A is valid if and only if A is a logical consequence of the empty class of formulas.

Further:

(8) A is valid if and only if its closure is valid; and A has a model if and only if its closure has a model.

(9) For any formula A and any class of formulas Γ, if A is a consequence of Γ, then A holds in every model of Γ. In particular, for every formula B, if B implies A, then A holds in every model of B.

These semantical concepts which we have defined for the predicate logic F1 are quite straightforward when applied to sentences, but we have to be careful in applying them to open formulas, which have free variables. We have already noted that it does not hold in general that an open formula is either true or false under a given interpretation - since an open formula A is regarded as true only if its closure is true, and false only if the closure of ~A is true. Several further points of this sort need to be noted explicitly. Let A be a sentence. Then, a formula B holds in every model of A if and only if A implies B. However, if A is an open formula, this result does not hold in general. Thus, if A is 'P x', and B is '(x)P x', then B holds in every model of A - that is, if A is true under any interpretation I, then B is true under I - but B is not implied by A. For let I assign the positive integers as domain, and assign the class of prime numbers to 'P'. Then any sequence S which has the integer 3 as its first term will satisfy 'P x', but will not satisfy '(x)P x'. Thus, by our definition of implication, 'P x' does not imply '(x)P x'.
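The divergence between holding in every model of A and being implied by A can be seen by brute force over a very small family of interpretations. This is a sketch only, and not a proof of the general claim: it assumes a two-element domain, enumerates the four possible extensions of 'P' over it, uses a dict fixing only 'x' in place of an infinite sequence, and encodes formulas as hypothetical nested tuples.

```python
from itertools import combinations

DOMAIN = (0, 1)      # assumed two-element domain


def satisfies(s, f, interp):
    """Satisfaction for atoms, ~, ⊃ and the two quantifiers (tuple encoding assumed)."""
    tag = f[0]
    if tag == "atom":
        return tuple(s[a] for a in f[2]) in interp["relations"][f[1]]
    if tag == "not":
        return not satisfies(s, f[1], interp)
    if tag == "imp":
        return (not satisfies(s, f[1], interp)) or satisfies(s, f[2], interp)
    if tag in ("all", "ex"):
        results = (satisfies({**s, f[1]: d}, f[2], interp) for d in interp["domain"])
        return all(results) if tag == "all" else any(results)
    raise ValueError("connective not covered in this sketch")


def true_under(f, interp):
    # Both formulas below have at most 'x' free, so a sequence need only fix 'x'.
    return all(satisfies({"x": d}, f, interp) for d in interp["domain"])


# Every interpretation of 'P' over DOMAIN: one for each subset of the domain.
interps = [{"domain": DOMAIN, "relations": {"P": {(d,) for d in subset}}}
           for r in range(len(DOMAIN) + 1)
           for subset in combinations(DOMAIN, r)]

A = ("atom", "P", ("x",))       # 'P x'
B = ("all", "x", A)             # '(x)P x'

# B holds in every model of A (among the interpretations enumerated).
holds_in_every_model = all(true_under(B, i) for i in interps if true_under(A, i))

# A implies B: every sequence satisfying A also satisfies B.
implied = all(satisfies({"x": d}, B, i)
              for i in interps for d in DOMAIN
              if satisfies({"x": d}, A, i))

print(holds_in_every_model)     # True
print(implied)                  # False
```

The second check fails for the same reason as in the text: where 'P' is assigned only one of the two individuals, the sequence correlating that individual with 'x' satisfies 'P x' but not '(x)P x'.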

There is, then, in the case of open formulas at times a divergence between being logically implied by a formula A and holding in every model of A. However, it is clear that the following does hold in general: Let Γ be any set of formulas, and Γ' be the set of the closures of those formulas. Then, a formula A holds in all models of Γ if and only if A is a logical consequence of Γ'. And similarly, Γ will have a model if and only if Γ' is satisfiable. Where all of the formulas in Γ are already closed, of course, Γ' is identical with Γ, and in that case A is a consequence of Γ if and only if A holds in every model of Γ; and Γ has a model if and only if Γ is satisfiable.

We now introduce a symbol which is widely used in logical writings, viz., the symbol '⊨', defining it as follows: for any class of formulas Γ and any formula A,

Γ ⊨ A

if and only if A holds in every model of Γ. As we have just remarked, where Γ is any class of closed formulas, this is equivalent to A's being a consequence of Γ.

We shall also use

B ⊨ A

to mean that A holds in every model of B;

Δ ⊨ Γ

where Δ is a class of formulas, to mean that each formula in Γ holds in every model of Δ; and

⊨ A

to mean that A holds in every model of the empty set of formulas. The following results are relatively obvious, and are easily stated with the help of our new notation.

(1) If Δ ⊨ Γ and Γ ⊨ A, then Δ ⊨ A.

(2) If Γ ⊨ A, and every formula in Γ is in Δ, then Δ ⊨ A.

(3) If Γ ⊨ A, and Γ ⊨ A ⊃ B, then Γ ⊨ B.

(4) If Γ ⊨ A, then Γ ⊨ A', where A' is the closure of A.

(5) If A is valid, and Γ is any class of formulas, then Γ ⊨ A.

(6) A is valid if and only if ⊨ A.

We now conclude this section by restating, with the help of our new notation, two results earlier mentioned:

(7) For any formula A and any class of formulas Γ, if A is a consequence of Γ, then Γ ⊨ A.

(8) For any formula A and any class of closed formulas Γ, A is a consequence of Γ if and only if Γ ⊨ A.

As we shall see subsequently, once we have defined the notion of being derivable from, Γ ⊨ A if and only if A is derivable from Γ, for every class of formulas Γ and every formula A.

2.3. Axiom Schemata of F1. Rules of Inference and Theorems. Consistency of F1

We now return to the syntax of F1, and consider a set of axiom schemata and rules of inference for F1. What we would like to do is to lay down axioms and rules from which one could derive all of the valid formulas of F1 as theorems of F1, and no other formulas. It turns out that this is possible. We shall here use the following set of axiom schemata, which includes the axiom schemata of P. Let A, B and C be formulas of F1.

(a) A ⊃ (B ⊃ A)

(b) (A ⊃ (B ⊃ C)) ⊃ ((A ⊃ B) ⊃ (A ⊃ C))

(c) (~B ⊃ ~A) ⊃ ((~B ⊃ A) ⊃ B)

(d) (a)(A ⊃ B) ⊃ (A ⊃ (a)B), where a is an individual variable that has no free occurrences in A.

(e) (a)A ⊃ B, where a is an individual variable, and B differs from A at most in having free occurrences of some individual variable (or occurrences of some individual constant) b where A has free occurrences of a.

As examples of axioms of F1 we have the following formulas: by schema (a),

(x)P x ⊃ (P y ⊃ (x)P x);

by schema (d),

(x)(P y ⊃ Q x y) ⊃ (P y ⊃ (x)Q x y);

and by schema (e),

(x)P x ⊃ P y, and

(y)(Q x y ⊃ (z)Q y z) ⊃ (Q x x ⊃ (z)Q x z).

The reader should be able to show that these axioms are all true under every interpretation and thus valid. The restrictions on axiom schemata (d) and (e) are necessary, however, for without them we would have axioms which were not true under certain interpretations. Thus, if we were to drop the restriction on axiom schema (d), we would have

(x)(P x ⊃ P x) ⊃ (P x ⊃ (x)P x)

as an axiom. But consider any interpretation I which assigns to 'P' the class of everything in its domain except for some entity a. Then the antecedent of this formula will be true under I, but its consequent will not be true under I; thus the formula itself will not be true under I. As for axiom schema (e), without its restriction the formula

(x)~(y)P x y ⊃ ~(y)P y y

would be an axiom. But this formula is false under any interpretation whose domain contains at least two individuals, and where the identity relation is assigned to 'P'.
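Both counterexamples can be checked by enumeration on a two-element domain. The sketch below assumes a hypothetical tuple encoding of formulas, a domain {0, 1}, and dicts in place of infinite sequences; for the first formula 'P' is assigned everything except the individual 1, and for the second 'P' is assigned the identity relation.

```python
def satisfies(s, f, interp):
    """Satisfaction for atoms, ~, ⊃ and the two quantifiers (tuple encoding assumed)."""
    tag = f[0]
    if tag == "atom":
        return tuple(s[a] for a in f[2]) in interp["relations"][f[1]]
    if tag == "not":
        return not satisfies(s, f[1], interp)
    if tag == "imp":
        return (not satisfies(s, f[1], interp)) or satisfies(s, f[2], interp)
    if tag in ("all", "ex"):
        results = (satisfies({**s, f[1]: d}, f[2], interp) for d in interp["domain"])
        return all(results) if tag == "all" else any(results)
    raise ValueError("connective not covered in this sketch")


DOMAIN = (0, 1)

# Schema (d) without its restriction: (x)(P x ⊃ P x) ⊃ (P x ⊃ (x)P x)
Px = ("atom", "P", ("x",))
bad_d = ("imp", ("all", "x", ("imp", Px, Px)),
                ("imp", Px, ("all", "x", Px)))
interp_d = {"domain": DOMAIN, "relations": {"P": {(0,)}}}   # everything except 1

print([satisfies({"x": d}, bad_d, interp_d) for d in DOMAIN])
# [False, True]: not satisfied by every sequence, hence not true under I

# Schema (e) without its restriction: (x)~(y)P x y ⊃ ~(y)P y y
bad_e = ("imp",
         ("all", "x", ("not", ("all", "y", ("atom", "P", ("x", "y"))))),
         ("not", ("all", "y", ("atom", "P", ("y", "y")))))
interp_e = {"domain": DOMAIN, "relations": {"P": {(0, 0), (1, 1)}}}   # identity

print(any(satisfies({"x": a, "y": b}, bad_e, interp_e)
          for a in DOMAIN for b in DOMAIN))
# False: satisfied by no sequence, hence false under I
```

The first formula fails exactly for the sequence that correlates with 'x' an individual to which 'P' applies; the second, being a sentence, is satisfied by no sequence at all.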

In order to state the primitive rules of inference of F1, we shall draw again upon the notion of definitional equivalence, as we did in the case of P. Now, however, we need to add an extra clause to the definition of 'definitionally equivalent,' to cover the universal and existential quantifiers. A formula A is now definitionally equivalent to another formula B if and only if there are formulas A₁, B₁, A₂, and B₂ such that A and B are alike except that A contains an occurrence of A₁ at some place where B contains an occurrence of B₁, and either

(a) A₁ is A₂ ∨ B₂ and B₁ is ~A₂ ⊃ B₂, or

(b) A₁ is A₂ ∧ B₂ and B₁ is ~(A₂ ⊃ ~B₂), or

(c) A₁ is A₂ ≡ B₂ and B₁ is (A₂ ⊃ B₂) ∧ (B₂ ⊃ A₂), or

(d) A₁ is (∃a)A₂ and B₁ is ~(a)~A₂.

The added clause (d), it should be noted, is in accord with the intended meanings of the universal and existential quantifiers.
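Applying clauses (a)-(d) repeatedly, from left to right, rewrites any formula into one built only from atomic formulas, negation, the conditional and universal quantifiers. The following sketch shows that repeated rewriting; the nested-tuple encoding and the function name `expand` are assumptions made for illustration, and each single replacement it performs corresponds to one use of a clause of the definition.

```python
def expand(f):
    """Eliminate ∨, ∧, ≡ and (∃a) by repeated use of clauses (a)-(d)."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        return ("not", expand(f[1]))
    if tag == "imp":
        return ("imp", expand(f[1]), expand(f[2]))
    if tag == "all":
        return ("all", f[1], expand(f[2]))
    if tag == "or":      # clause (a): A ∨ B becomes ~A ⊃ B
        return ("imp", ("not", expand(f[1])), expand(f[2]))
    if tag == "and":     # clause (b): A ∧ B becomes ~(A ⊃ ~B)
        return ("not", ("imp", expand(f[1]), ("not", expand(f[2]))))
    if tag == "iff":     # clause (c): A ≡ B becomes (A ⊃ B) ∧ (B ⊃ A), then (b)
        return expand(("and", ("imp", f[1], f[2]), ("imp", f[2], f[1])))
    if tag == "ex":      # clause (d): (∃a)A becomes ~(a)~A
        return ("not", ("all", f[1], ("not", expand(f[2]))))
    raise ValueError(f"not a formula: {f!r}")


Px = ("atom", "P", ("x",))
Qx = ("atom", "Q", ("x",))

print(expand(("ex", "x", Px)))
# ('not', ('all', 'x', ('not', ('atom', 'P', ('x',)))))
print(expand(("or", Px, Qx)))
# ('imp', ('not', ('atom', 'P', ('x',))), ('atom', 'Q', ('x',)))
```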

We now state the primitive rules of inference of F1:

(a) If A and B are definitionally equivalent to each other, then from A one may infer B, and vice versa.

(b) From A and A ⊃ B, one may infer B.

(c) From A, if a is an individual variable, one may infer (a)A.

Rule (a) is the rule of Definitional Interchange; rule (b) is the rule Modus Ponens; and rule (c) is the rule of Generalization.

These rules are all sound rules of inference, in the sense that for every interpretation I, when these rules are applied to formulas which are true under I, the formulas they permit us to infer are true under I. It is to be noticed in particular that this is true of rule (c), the rule of Generalization.

From the fact that our rules are sound, it follows that they are validity-preserving; that is, when applied to valid formulas, they lead only to valid formulas.

A derivation of A from a class of hypotheses Γ within F1 is any finite sequence of formulas of F1, where A is the last formula in the sequence, and where each formula in this sequence is either (a) an axiom of F1, (b) an hypothesis within the class Γ, or (c) obtainable from earlier formulas in this sequence by a single application of one of the three primitive rules of inference of F1 to those formulas. And a formula A is derivable from Γ if and only if there is a derivation of A from Γ. We introduce the familiar symbol '⊢' in the following sense:

Γ ⊢ A

if and only if A is derivable from Γ. Similarly,

B₁, ..., Bₙ ⊢ A

if and only if A is derivable from B₁ ... Bₙ as hypotheses; and

⊢ A

if and only if A is derivable from the empty class of formulas. And we define a theorem of F1 as any formula of F1 which is derivable from the empty set of formulas. Thus, A is a theorem of F1 - i.e., ⊢ A - if and only if there is a finite sequence of formulas such that A is the last formula in this sequence, and every formula in this sequence is either an axiom or obtainable from earlier formulas by applying one of the rules of inference to those formulas. Such a derivation is often called a proof. Thus, a formula A is a theorem of F1 if and only if there is a proof of A within F1.

We shall not here derive any theorems from the axioms of F1. Rather, we merely list a number of theorem schemata as important examples. All formulas of F1 which are covered by these schemata are theorems of F1.

(a)A ≡ ~(∃a)~A
(a)(b)A ≡ (b)(a)A
(∃a)(∃b)A ≡ (∃b)(∃a)A
(a)(A ∧ B) ≡ (a)A ∧ (a)B
(∃a)(A ∨ B) ≡ (∃a)A ∨ (∃a)B
(a)(A ⊃ B) ≡ ~(∃a)(A ∧ ~B)
(a)(A ⊃ B) ≡ (∃a)A ⊃ B, if a is not free in B
(a)(A ∨ B) ≡ (a)A ∨ B, if a is not free in B
(∃a)(A ⊃ B) ≡ (a)A ⊃ B, if a is not free in B
(∃a)(A ∨ B) ≡ (∃a)A ∨ B, if a is not free in B
(a)(A ⊃ B) ⊃ ((a)A ⊃ (a)B)
(a)(A ⊃ B) ⊃ ((∃a)A ⊃ (∃a)B)
((a)~A ∨ (a)B) ⊃ (a)(A ⊃ B)
((a)A ∨ (a)B) ⊃ (a)(A ∨ B)
(∃a)(A ∧ B) ⊃ (∃a)A ∧ (∃a)B
(∃a)(b)A ⊃ (b)(∃a)A

It is easy to see that

(1) If Γ ⊢ A, then Γ ⊨ A.

For suppose that Γ ⊢ A, and that some interpretation I is a model of Γ. Then all of the formulas in Γ are true under I. All of the axioms of F1, being valid, are true under I also. Now since Γ ⊢ A, A is derivable from various of the axioms of F1 and the formulas of Γ. Since our rules of inference are sound, A will therefore be true under I. Thus, Γ ⊨ A.

It follows as a corollary from (1) that

(2) If ⊢ A, then ⊨ A.

That is, every theorem of F1 is valid. We say that a set of formulas Γ is syntactically consistent - or consistent - if and only if no formula A ∧ ~A is derivable from Γ. As a further corollary from (1) we have the very important result

(3) If a set of formulas Γ has a model, then Γ is consistent.

That is, semantic consistency implies syntactic consistency. For suppose that I is a model of Γ, and that Γ ⊢ A ∧ ~A. Then, by (1), I is a model of A ∧ ~A. But this is impossible. Thus, if Γ has a model, Γ is consistent.

The above result (1) is one of the most important results concerning the system F1, and is often referred to as the soundness theorem for the first-order predicate logic. As we shall see in Chapter III, its converse holds also, as do the converses of (2) and (3). The converse of (3), indeed, is the principal metatheorem concerning the first-order predicate logic.

Since the theorems of F1 are all valid, it follows that F1 is consistent in the sense that there is no formula A such that both A and ~A are theorems of F1. For no formula can be such that both it and its negation are valid. It is possible to give a syntactical proof of the consistency of F1, however, in which we make no reference to semantical concepts. Such a proof is to be welcomed, since it has fewer presuppositions than a semantical proof - making no reference to interpretations, domains of entities, infinite sequences, and so on. One such proof is as follows.

For each formula A of F¹, let T(A) - the transform of A - be the expression which results from deleting all quantifiers from A, and then replacing each of the atomic formulas in the resulting expression by the sentential letter 'p'. T(A) will then be a formula in the sentential logic P. Now for every axiom A of F¹ provided by axiom schemata (a)-(c), T(A) will clearly be a tautology of P. If A is an axiom provided by axiom schema (d), then T(A) will be a formula of the form (A ⊃ B) ⊃ (A ⊃ B), and thus will be a tautology. Similarly, if A is an axiom by axiom schema (e), then T(A) will be a formula of the form A ⊃ A, and thus will be a tautology. Thus, for every axiom A of F¹, T(A) will be a tautology. Now if T(A) and T(A ⊃ B) are tautologies, then T(B) will be a tautology; and if T(A) is a tautology, then T((a)A) will be a tautology - indeed, it will be identical with the tautology T(A). Similarly, if B results from A by definitional interchange, and T(A) is a tautology, then T(B) will be a tautology. Thus, if we apply our primitive rules of inference to formulas whose transforms are tautologies, we infer only formulas whose transforms are tautologies. Thus, the transforms of all theorems of F¹ are tautologies. Now T(~A) is ~T(A), and a formula of P and its negation cannot both be tautologies. It follows immediately that there is no formula A such that both A and ~A are theorems of F¹. That is, F¹ is consistent.
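The transform used in this consistency proof is easy to compute mechanically. The following Python sketch is an illustration only: the nested-tuple encoding of formulas and the names `transform`, `value` and `transform_is_tautology` are assumptions made here, not notation from the text.

```python
# Formulas are nested tuples (a hypothetical encoding):
#   ("atom", "P", ("x",))    the atomic formula Px
#   ("not", A)               ~A
#   ("imp", A, B)            A ⊃ B
#   ("and", A, B)            A ∧ B
#   ("all", "x", A)          (x)A
#   ("exists", "x", A)       (∃x)A

def transform(f):
    """Delete all quantifiers and replace every atomic formula by the letter 'p'."""
    tag = f[0]
    if tag == "atom":
        return "p"
    if tag in ("all", "exists"):
        return transform(f[2])
    if tag == "not":
        return ("not", transform(f[1]))
    return (tag, transform(f[1]), transform(f[2]))

def value(f, p):
    """Truth value of a transformed formula when the letter 'p' has the value p."""
    if f == "p":
        return p
    if f[0] == "not":
        return not value(f[1], p)
    if f[0] == "imp":
        return (not value(f[1], p)) or value(f[2], p)
    if f[0] == "and":
        return value(f[1], p) and value(f[2], p)
    raise ValueError(f"unknown connective: {f[0]}")

def transform_is_tautology(f):
    """Only one sentential letter occurs, so two truth-table rows suffice."""
    t = transform(f)
    return all(value(t, p) for p in (True, False))

Px = ("atom", "P", ("x",))
Py = ("atom", "P", ("y",))
instance_of_e = ("imp", ("all", "x", Px), Py)   # (x)Px ⊃ Py
negated = ("not", instance_of_e)
```

Here `transform_is_tautology(instance_of_e)` holds while `transform_is_tautology(negated)` fails, as the argument requires. The test is a necessary condition for theoremhood only: the transform of 'Px ⊃ (x)Px' is also a tautology, although that formula is not a theorem.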

2.4. The Deduction Theorem

In mathematical as well as in ordinary discourse one often establishes a conclusion of conditional form by first assuming its antecedent, and then deriving its consequent from this assumption. Reasoning of this sort can also be carried out within F¹, provided we adhere to certain restrictions. Let

Γ, A ⊢ B

mean that B is derivable from the class of formulas which contains the formulas in Γ, together with the formula A. Then, where A is a closed formula, it can be shown that if A ⊢ B, then ⊢ A ⊃ B. More generally, if Γ, A ⊢ B, then Γ ⊢ A ⊃ B. Where A is an open formula, however, this result does not hold in general. Thus, though '(x)Px' is derivable from 'Px', we do not have ⊢ Px ⊃ (x)Px. For as we have noticed earlier, 'Px ⊃ (x)Px' is not valid, and thus is not a theorem of F¹.

We now need to show how to provide for inferences of the above sort for formulas in general.

Let B₁, ..., Bₙ be the formulas appearing in some n-line derivation from a class of formulas Γ. And let A be any formula in Γ. Then we say that Bᵢ depends upon A in this derivation if and only if,

(a) Bᵢ is A itself; or
(b) Bᵢ is obtainable from earlier formulas in the sequence by application of one of the primitive rules of inference, where at least one of these earlier formulas depends upon A.

Let us illustrate this. Consider the following derivation of (∃a)B from the hypotheses A and (a)A ⊃ (∃a)B:

B₁   A                  Hypothesis
B₂   (a)A               B₁, Generalization
B₃   (a)A ⊃ (∃a)B       Hypothesis
B₄   (∃a)B              B₂, B₃, Modus Ponens

Here B₁ depends upon A; B₂ depends upon A; B₃ depends upon (a)A ⊃ (∃a)B; and B₄ depends upon both A and (a)A ⊃ (∃a)B.
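The definition of dependence is recursive on the lines of a derivation, so it can be computed line by line. The sketch below is a minimal illustration under assumed conventions: each line is a triple of a formula (kept as a plain string), a rule name, and the 0-based indices of the earlier lines used. The function name `dependencies` is hypothetical.

```python
# The four-line derivation above, in the assumed triple format.
derivation = [
    ("A",             "Hypothesis",     []),
    ("(a)A",          "Generalization", [0]),
    ("(a)A ⊃ (∃a)B",  "Hypothesis",     []),
    ("(∃a)B",         "Modus Ponens",   [1, 2]),
]
hypotheses = {"A", "(a)A ⊃ (∃a)B"}

def dependencies(derivation, hypotheses):
    """For each line, return the set of hypotheses that the line depends upon."""
    deps = []
    for formula, _rule, premises in derivation:
        d = set()
        if formula in hypotheses:      # clause (a): the line is the hypothesis itself
            d.add(formula)
        for i in premises:             # clause (b): inherited from the earlier lines used
            d |= deps[i]
        deps.append(d)
    return deps

deps = dependencies(derivation, hypotheses)
```

With this input, `deps[3]` contains both hypotheses, matching the statement that the last line depends upon both A and (a)A ⊃ (∃a)B.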

We now state the Deduction Theorem for F¹: Assume that there is a derivation of B from the formulas in Γ together with A, in which no application of the rule of Generalization to a formula which depends upon A has as its quantified variable a variable free in A. Then, Γ ⊢ A ⊃ B.

As illustration, consider the following derivation (rather, the following schema of an abbreviated derivation):

(1)  (a)(A ⊃ B)     Hypothesis
(2)  (a)A           Hypothesis
(3)  A ⊃ B          (1), axiom schema (e), Modus Ponens
(4)  A              (2), axiom schema (e), Modus Ponens
(5)  B              (3), (4), Modus Ponens
(6)  (a)B           (5), Generalization

Thus, (a)(A ⊃ B), (a)A ⊢ (a)B.

Line (5) depends upon lines (1) and (2), in which a does not appear freely. Thus, by the Deduction Theorem, (a)(A ⊃ B) ⊢ (a)A ⊃ (a)B. Consider now, however, the closely related derivation:

(1)  (a)(A ⊃ B)     Hypothesis
(2)  A              Hypothesis
(3)  A ⊃ B          (1), axiom schema (e), Modus Ponens
(4)  B              (2), (3), Modus Ponens
(5)  (a)B           (4), Generalization

Thus, (a)(A ⊃ B), A ⊢ (a)B. Now line (4) depends upon lines (1) and (2). The Deduction Theorem will let us conclude (a)(A ⊃ B) ⊢ A ⊃ (a)B, therefore, only if a does not appear freely in A.
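The restriction in the Deduction Theorem can be checked mechanically on a given derivation. The following sketch uses concrete instances of the two schemata above, taking A as 'Px', B as 'Qx' and a as 'x'. These instances, the four-place line format and the name `may_discharge` are assumptions for illustration, and the free variables of the hypothesis are supplied by hand rather than computed.

```python
# Each line: (formula, rule, indices of earlier lines used, generalized variable or None).
first = [
    ("(x)(Px ⊃ Qx)", "Hypothesis",                  [],     None),
    ("(x)Px",        "Hypothesis",                  [],     None),
    ("Px ⊃ Qx",      "Axiom schema (e), MP",        [0],    None),
    ("Px",           "Axiom schema (e), MP",        [1],    None),
    ("Qx",           "Modus Ponens",                [2, 3], None),
    ("(x)Qx",        "Generalization",              [4],    "x"),
]
second = [
    ("(x)(Px ⊃ Qx)", "Hypothesis",                  [],     None),
    ("Px",           "Hypothesis",                  [],     None),
    ("Px ⊃ Qx",      "Axiom schema (e), MP",        [0],    None),
    ("Qx",           "Modus Ponens",                [1, 2], None),
    ("(x)Qx",        "Generalization",              [3],    "x"),
]

def may_discharge(derivation, hypothesis, free_vars):
    """True if no Generalization on a variable free in `hypothesis`
    is applied to a line that depends upon `hypothesis`."""
    depends = []
    for formula, rule, premises, var in derivation:
        premise_depends = any(depends[i] for i in premises)
        if rule == "Generalization" and var in free_vars and premise_depends:
            return False
        depends.append(formula == hypothesis or premise_depends)
    return True

ok_first = may_discharge(first, "(x)Px", free_vars=set())
ok_second = may_discharge(second, "Px", free_vars={"x"})
```

The first call succeeds because 'x' is not free in '(x)Px'. The second fails because line (5) generalizes on 'x', which is free in 'Px', and is applied to a line depending upon 'Px'.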


The Deduction Theorem is not a rule of inference, to be applied within a derivation just as primitive and derived rules of inference are applied. Rather than being used in the actual construction of a derivation, it is used to show that certain derivations exist. To illustrate its use, let us suppose that we want to show that all formulas of the form

(a)(A ⊃ B) ⊃ ((a)(B ⊃ C) ⊃ (a)(A ⊃ C))

are theorems of F¹. We may do this by arguing as follows:

By axiom schema (e), (a)(A ⊃ B) ⊢ A ⊃ B.
By axiom schema (e), (a)(B ⊃ C) ⊢ B ⊃ C.
By sentential logic, (a)(A ⊃ B), (a)(B ⊃ C) ⊢ A ⊃ C.
By the rule of Generalization, (a)(A ⊃ B), (a)(B ⊃ C) ⊢ (a)(A ⊃ C).
Hence, by the Deduction Theorem, (a)(A ⊃ B) ⊢ (a)(B ⊃ C) ⊃ (a)(A ⊃ C).
Finally, by the Deduction Theorem again, ⊢ (a)(A ⊃ B) ⊃ ((a)(B ⊃ C) ⊃ (a)(A ⊃ C)).

We have not here, of course, actually derived any theorems from the axioms of F¹. What we have done, rather, is to show that all formulas of a certain form are in fact derivable from those axioms; that is, provable. Now the Deduction Theorem can be proved in an effective fashion: that is, in such a way as to show how, once given the derivation mentioned in its hypothesis, actually to construct the derivation mentioned in its conclusion. Thus if one wished actually to construct a proof of some formula of the above form, the above argument, together with the proof of the Deduction Theorem, would effectively show one how to do this. Once one is able to show as above, however, that certain formulas are in fact provable, or derivable from certain hypotheses, it is often unnecessary - and indeed rather tedious - to construct actual proofs or derivations. It is in establishing that certain formulas are in fact provable, or derivable from certain hypotheses, that the Deduction Theorem is invaluable.

The following two useful corollaries readily follow from the Deduction Theorem. They are easier to formulate than the Deduction Theorem, and they cover most cases in which one would use the Deduction Theorem. In particular, the first one covers each of the above illustrations.

(1) If there is a derivation of B from the formulas of Γ together with A, in which no application of the rule of Generalization has as its quantified variable a variable free in A, then Γ ⊢ A ⊃ B.

(2) If A is a closed formula, then if Γ, A ⊢ B, then Γ ⊢ A ⊃ B.

In connection with (2), we remark that it can be shown that the Deduction Theorem holds without exception within the sentential logic P. There, for any formulas A and B, and any class of formulas Γ, if Γ, A ⊢ B, then Γ ⊢ A ⊃ B.
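For the sentential logic, the semantic counterpart of this remark can be checked by truth tables. The sketch below does not construct derivations; it tests truth-table consequence, which is a stand-in assumed here for derivability in P. The tuple encoding and the names `letters`, `value` and `entails` are hypothetical.

```python
from itertools import product

# Formulas of P as nested tuples (a hypothetical encoding):
#   "p", "q", ...                 sentential letters
#   ("not", A), ("imp", A, B), ("and", A, B)

def letters(f):
    """The set of sentential letters occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(letters(g) for g in f[1:]))

def value(f, v):
    """Truth value of f under the valuation v (a dict from letters to booleans)."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not value(f[1], v)
    if f[0] == "imp":
        return (not value(f[1], v)) or value(f[2], v)
    if f[0] == "and":
        return value(f[1], v) and value(f[2], v)
    raise ValueError(f"unknown connective: {f[0]}")

def entails(gamma, b):
    """True if every valuation satisfying all of gamma also satisfies b."""
    names = sorted(set().union(letters(b), *(letters(g) for g in gamma)))
    for row in product([True, False], repeat=len(names)):
        v = dict(zip(names, row))
        if all(value(g, v) for g in gamma) and not value(b, v):
            return False
    return True

gamma = [("imp", "p", "q")]
with_hypothesis = entails(gamma + ["p"], "q")          # Γ, A yields B
discharged = entails(gamma, ("imp", "p", "q"))         # Γ yields A ⊃ B
```

Both values come out true here, and for any choice of Γ, A and B the two calls agree, which is the truth-table form of the unrestricted theorem.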

We now close this chapter by establishing two very elementary but very important results. Each of these results states that given a derivation of a certain sort, there exists a derivation of another sort; and the proofs of these results show us how to construct these latter derivations once given the former derivations. First, if a contradiction is derivable from a given sentence A, then ~A is a theorem. This is proved as follows:

(1)  A ⊢ B ∧ ~B
(2)  ⊢ A ⊃ B ∧ ~B       (1), corollary 2
(3)  ⊢ ~(B ∧ ~B)        By sentential logic
(4)  ⊢ ~A               (2), (3), sentential logic

More generally, if Γ, A ⊢ B ∧ ~B, then Γ ⊢ ~A. This is the familiar Principle of Reductio Ad Absurdum: to refute a given hypothesis it suffices to derive a contradiction from that hypothesis. Using the Deduction Theorem, we see that this principle holds for F¹ as well as for P.

Second, if a contradiction is derivable from a set of formulas Γ, then any formula B whatsoever is derivable from Γ. The proof:

(1)  Γ ⊢ A ∧ ~A
(2)  Γ ⊢ A ∧ ~A ⊃ B     By sentential logic
(3)  Γ ⊢ B              (1), (2), Modus Ponens

Thus if there is some formula B that is not derivable from Γ, then Γ is consistent.
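Both closing results have direct analogues in a proof assistant. The following Lean 4 sketch states them for arbitrary propositions. It relies on Lean's own logic rather than on the system F¹, so it illustrates only the shape of the two arguments; the names `reductio` and `explosion` are chosen here for illustration.

```lean
-- If A yields a contradiction B ∧ ¬B, then ¬A holds (Reductio Ad Absurdum).
theorem reductio (A B : Prop) (h : A → B ∧ ¬B) : ¬A :=
  fun a => (h a).2 (h a).1

-- From a contradiction A ∧ ¬A, any proposition B whatsoever follows.
theorem explosion (A B : Prop) (h : A ∧ ¬A) : B :=
  absurd h.1 h.2
```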

CHAPTER III

THE FIRST-ORDER PREDICATE LOGIC: II

3.1. Elementary Theories

We have now defined the basic semantical and syntactical concepts pertaining to the first-order predicate logic F¹, and have considered a number of elementary results concerning these concepts. Before going on to further results of a deeper nature, however, it will be convenient to turn to the notion of an elementary theory, and to define a number of concepts pertaining to elementary theories. 1

An elementary theory, recall, is any theory developed within a first-order predicate logic in which there are no predicate variables, such as the predicate logic F¹. (There may or may not be operation symbols, and a symbol for identity. We shall consider these possible additions to F¹ later.) Elementary theories are to be contrasted with (a) first-order theories in which predicate variables appear, and (b) second-order theories - or n-th-order theories, for n greater than 2 - in which the underlying logic is some second-order, or higher-order, logic.

In this book we shall not consider theories of type (a). That is, all first-order logics and theories here considered will be elementary logics and theories. We shall define the notion of a second-order theory in Chapter IV, and shall later consider examples of both elementary and second-order theories.

1 For a consideration of various of the general properties of elementary theories, see the first essay in the monograph A. Tarski, A. Mostowski and R.M. Robinson 1953.


We say that any theory stated within some precisely formulated system of logic is a formalized theory. It is only formalized theories that we are concerned with in this book. In actual practice, to be sure, mathematicians do not usually present their theories as formalized theories, but present them in ways which only approximate the explicitness and exactness of formalized theories. In particular, it is not customary to give a precise statement of the underlying logic and rules of inference. Rather, in ordinary mathematical practice one usually does little more than (a) list the undefined terms of the theory being presented, and then (b) state the axioms of that theory, using for this purpose these undefined terms, together with a certain body of informal terminology that will presumably be readily understood by the reader. One then proceeds to derive theorems from these axioms, using whatever forms of reasoning seem sound. This method of presenting a logical or mathematical theory has been traditionally known as the axiomatic method, in contrast with the method of formalization, which is often referred to as the logistic method. Historically, the axiomatic method (in most of its essentials, anyway) dates from Euclid; it was brought to near perfection at the end of the nineteenth century, especially by Hilbert. The logistic method, on the other hand, is a product of modern times. Its first appearance is in Frege's Begriffsschrift of 1879, in which Frege presents a complete formalization of the sentential logic. Prominent names in the subsequent history of the logistic method include those of Peano, Russell, Hilbert and Gödel.

We have taken the position that many of the concepts exactly defined with respect to the sentential and predicate logics correspond to concepts which exist at the intuitive and informal level, and that these exactly defined concepts can be thought of as explications of these intuitive concepts. That is, these exactly defined concepts in some measure resemble these intuitive concepts, while differing from them in ways which make them more useful for the logician's purposes. The same can be said of formalized theories. In many cases - though by no means necessarily and in all cases - formalized theories correspond to theories which exist at the more or less informal or semi-formal level. This is true, for example, of formalizations of the foundations of number theory, analysis and set theory, as well as of geometry. With respect to formalized theories one can ask a number of very pointed metamathematical questions, concerning completeness and decidability, for example, which seem impossible to ask of the informal counterparts of these theories. There is no need to pretend, of course, that mathematicians should work only with formalized theories. For most mathematical purposes the degree of exactness and explicitness that formalization requires is not only not necessary, but would prove cumbersome and distracting. Still, for certain purposes - certain metamathematical purposes, most obviously - an exactly defined theory is a prerequisite. When one turns from a study of the particular subject matter of a theory - real numbers, functions, groups, etc. - to a study of that theory itself, one simply finds that formalization is a necessity if that study is to be carried beyond a certain point.

Throughout the rest of this chapter we shall use the term 'theory' in the sense of 'elementary theory developed within F¹.'

The symbol 'F¹', recall, stands for not just one system of first-order logic, but a whole class of such logics, which differ just in which non-logical constants they contain. Each of these logics may or may not contain individual constants; but each must contain at least one predicate constant. Now every theory is developed within some one of the systems F¹, which we shall speak of as its underlying logic. The non-logical constants of a theory T are those which appear in its underlying logic; similarly for its formulas. Thus, now that the various systems F¹ have been exactly defined, once we specify the non-logical constants of a particular theory T, its non-logical constants and formulas are precisely fixed.

The axioms of a theory T include (a) the axioms of its underlying logic, called its logical axioms, together with (b) its particular subject-matter or proper axioms, called its non-logical axioms. The theorems of a theory, of course, are those formulas which are derivable from its axioms. These include, as special cases, all of the axioms, and all of the logical theorems.

A theory can now be defined as a certain ordered pair, consisting of two things: (a) a class of non-logical constants, and (b) a class of non-logical axioms, which contain no non-logical constants other than those in the class mentioned in (a). Thus, two theories T and T' are identical if and only if they are identical in their non-logical constants and non-logical axioms.
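As a data structure, this definition is simply a pair of sets with equality taken componentwise. The sketch below is a minimal illustration that can represent only finite classes; the class `Theory`, the helper `theory`, and the sample axioms about '<' are assumptions made here, and axioms are kept as uninterpreted strings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Theory:
    constants: frozenset   # class of non-logical constants
    axioms: frozenset      # class of non-logical axioms (formulas kept as strings)

def theory(constants, axioms):
    """Build a Theory from any iterables of constants and axioms."""
    return Theory(frozenset(constants), frozenset(axioms))

irreflexive = "(x)~(x < x)"
transitive = "(x)(y)(z)((x < y ∧ y < z) ⊃ x < z)"

T1 = theory(["<"], [irreflexive, transitive])
T2 = theory(["<"], [transitive, irreflexive])   # same classes, listed in another order
T3 = theory(["<"], [irreflexive])               # one axiom fewer
```

Here `T1 == T2` holds, since the order of listing is immaterial, while `T1 == T3` fails because the classes of non-logical axioms differ.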

A model of a theory T is any interpretation I under which all of the axioms of T are true. Because of the soundness theorem for first-order logic it follows that all of the theorems of T will be true under I.

If there is an effective procedure for determining whether any given formula of T is a non-logical axiom of T, then T is called an axiomatic theory. Thus, we have an axiomatic theory if we simply list the non-logical axioms, or present them through axiom schemata. Though this is the familiar, classical manner of presenting the axioms of a theory, it is not the only possible way. One might characterize the non-logical axioms of a theory T semantically; for example, as all sentences of T which are true under a certain interpretation. And there are still further possible characterizations of the axioms of a theory (for example, as in the proof of Lindenbaum's lemma, which appears later in this chapter). Whenever we characterize the axioms of a theory other than by listing them, or presenting them through schemata, two questions arise. First, is T an axiomatic theory; that is, are its non-logical axioms effectively characterizable? As an example of a theory whose non-logical axioms were initially characterized semantically and were later shown to be effectively characterizable, we have the theory of elementary algebra of Chapter VI. Second, one can ask whether T is (effectively) axiomatizable, in the sense that there is some axiomatic theory T' whose formulas and theorems are identical with those of T. A theory whose axioms are initially characterized semantically may or may not be axiomatizable. As the most famous example of such a theory which is not axiomatizable, we have the theory whose axioms are all those sentences of arithmetic which are true under the usual interpretation of their symbols. We shall consider this theory in Chapter V.

If a theory is axiomatic it is, of course, axiomatizable. However, the converse does not hold in general: there are axiomatizable theories T which are not axiomatic. Thus, let the axioms of T be the theorems of some axiomatic theory T' which is such that there is no effective procedure for determining in general whether a formula is a theorem of T'.

We can ask not only whether a theory T is axiomatizable, but in particular whether it is finitely axiomatizable, in the sense that there is some axiomatic theory T' whose formulas and theorems are identical with those of T, where the number of non-logical axioms of T' is finite. It is known that there are axiomatizable theories which are not finitely axiomatizable; as examples, the axiomatic theories N, R and ZF of Chapters V, VI and VII, respectively.

A theory T' is an extension of a theory T - and T is a subtheory of T' - if and only if the non-logical constants and theorems of T are included among the non-logical constants and theorems of T'. (Notice that on this definition each theory is considered an extension of itself.)

If T' is an extension of T, and T and T' have precisely the same symbols, then T' is a simple extension of T. If every formula of T which is a theorem of T' is also a theorem of T, then T' is a conservative extension of T. That is, within T' no new theorems containing only the non-logical constants of T are provable. Finally, two theories are equivalent if and only if each is an extension of the other. Thus, if two theories T and T' are equivalent, they have the same non-logical constants and the same theorems, and are simple conservative extensions of each other. If they differ at all, then, they differ only in their axioms - though their different axioms lead to the same theorems.

A theory T is consistent if and only if there is no formula A of T such that both A and ~A are theorems of T, and complete if and only if, for every sentence A of T, either A or ~A is a theorem of T. And a theory T is decidable if and only if there is an effective procedure for determining, for every formula A of T, whether A is a theorem of T.

We say that a set of axioms for a theory T is consistent if and only if T itself is consistent; and similarly for completeness.

In addition to asking whether a given set of axioms is consistent or complete, one can ask whether it is an independent set of axioms. In general, a set of formulas Γ is an independent set of formulas if and only if no formula in Γ is derivable from the remaining formulas in Γ; and a particular formula A is independent of a class of formulas Γ if and only if it is not derivable from those formulas.

A familiar way of showing that a particular theory is consistent, of course, is to show that its axioms have a model. The reasoning which is presupposed by this procedure is set out in results (1) and (3) on pages 46 and 47. For suppose that some interpretation I is a model of the axioms of T. Thus, by (1) - i.e., the soundness theorem - I is a model of the set of theorems of T. Thus, by (3), it follows that the set of theorems of T is consistent. By similar reasoning, one can show that a formula A is independent of a given set of formulas Γ if one can show that Γ has a model in which A is not true.
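The model-theoretic test for independence can be tried out on small finite domains. In the sketch below the set Γ consists of reflexivity and symmetry for a single binary predicate R, and the formula A is transitivity. This example, the hard-coding of the three conditions as Python functions instead of formulas, and the name `find_model` are all assumptions for illustration. A failed finite search would settle nothing about derivability.

```python
from itertools import product

def reflexive(dom, R):
    return all((x, x) in R for x in dom)

def symmetric(dom, R):
    return all((y, x) in R for (x, y) in R)

def transitive(dom, R):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)

def find_model(dom, axioms, formula):
    """Search every interpretation of R over dom for a model of the axioms
    in which the given formula is not true. Return the relation, or None."""
    dom = list(dom)
    pairs = list(product(dom, repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        if all(ax(dom, R) for ax in axioms) and not formula(dom, R):
            return R
    return None

witness = find_model(range(3), [reflexive, symmetric], transitive)
none_on_two = find_model(range(2), [reflexive, symmetric], transitive)
```

A three-element witness exists, so transitivity is independent of reflexivity and symmetry. No two-element witness exists, which shows that the size of the domain searched matters.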

The above concepts of consistency, completeness and independence were all defined as syntactical concepts, in terms of the concept of derivability. We turn now to a number of important semantical concepts pertaining to theories, defined in terms of the concept of a model of a theory.

Two models of a theory are said to be isomorphic when they have the same structure. The notion of being alike in structure can be defined in an exact way as follows. Interpretations I and I' are isomorphic models of a theory T if and only if:

(a) I and I' are models of T;

(b) there is a one-to-one correspondence G between the domains of I and I', associating with each element x of the domain of I exactly one element G(x) of the domain of I';

(c) for each individual constant a in T, G(I(a)) = I'(a); i.e., the individual that I assigns to a is correlated by G with the individual that I' assigns to a;

(d) for each n-ary predicate constant f in T, and for all individuals x₁, ..., xₙ in the domain of I, the n-tuple of individuals <x₁, ..., xₙ> is an element of I(f) if and only if the n-tuple of individuals <G(x₁), ..., G(xₙ)> is an element of I'(f); that is, the relation assigned to f by I is preserved under this one-to-one correspondence G.
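For interpretations with small finite domains, conditions (b)-(d) can be tested by brute force over all one-to-one correspondences. In this sketch an interpretation is a dictionary with keys "domain", "const" and "pred"; that layout, the function names and the sample interpretations are assumptions for illustration. Condition (a), that both interpretations are models of T, is taken for granted rather than checked.

```python
from itertools import permutations

def is_isomorphism(G, I, J):
    """Check conditions (b)-(d) for a candidate correspondence G (a dict)."""
    # (b) G maps the domain of I one-to-one onto the domain of J.
    if set(G) != I["domain"] or set(G.values()) != J["domain"]:
        return False
    if len(set(G.values())) != len(G):
        return False
    # (c) individual constants correspond.
    if any(G[I["const"][c]] != J["const"][c] for c in I["const"]):
        return False
    # (d) since G is one-to-one and onto, the relation is preserved in both
    # directions exactly when the image of I(f) under G equals J(f).
    for f, tuples in I["pred"].items():
        image = {tuple(G[x] for x in t) for t in tuples}
        if image != J["pred"][f]:
            return False
    return True

def isomorphic(I, J):
    """Try every one-to-one correspondence between the two finite domains."""
    if len(I["domain"]) != len(J["domain"]):
        return False
    dom_i = list(I["domain"])
    return any(is_isomorphism(dict(zip(dom_i, perm)), I, J)
               for perm in permutations(J["domain"]))

I1 = {"domain": {0, 1, 2}, "const": {"a": 0}, "pred": {"R": {(0, 1), (1, 2)}}}
I2 = {"domain": {"x", "y", "z"}, "const": {"a": "x"},
      "pred": {"R": {("x", "y"), ("y", "z")}}}
I3 = {"domain": {"x", "y", "z"}, "const": {"a": "x"},
      "pred": {"R": {("x", "y"), ("x", "z")}}}
```

Here `isomorphic(I1, I2)` holds, while `isomorphic(I1, I3)` fails although the domains have the same number of elements.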


If two models of a given theory are isomorphic, then, by condition (b) it follows that they have the same number of elements in their domains. And what conditions (c) and (d) say is that whenever any relation in either one of these models relates certain individuals, then the corresponding relation in the other model relates the corresponding individuals in that other model. Speaking intuitively, we may say that these two models are copies of one another. It is this intuitive notion of a copy which receives precise analysis in the formal concept of isomorphic models.

It is easy to show that if I and I' are isomorphic models of some theory T, then a formula A of T is true under I if and only if it is true under I'.

Now if a consistent theory T is such that all of its models are isomorphic, then T is said to be categorical. The theory T would then characterize its models up to isomorphism. Though it would admit of more than one model, all of its models would be alike in structure; it could then be said to capture that one structure.

Let it be said here and now, however: as we shall see from results of the next section of this chapter, no theory stated within F¹ is categorical. Once we add a theory of identity to the first-order predicate logic, it becomes possible to state categorical elementary theories all of whose models have n elements, for some finite number n. However, no elementary theory - with or without identity - admitting a model with an infinite domain is categorical. As we shall see, there are many interesting and important categorical second-order theories developed within a second-order logic. Once we have the second-order logic at our disposal, we shall be able to state theories which characterize interpretations with an infinite domain up to isomorphism. So long as we are limited to elementary theories, however, this is impossible.

We shall return to the concept of categoricity, and a certain relativization of this concept, when we come to a consideration of the first-order predicate logic with equality. There we shall be able to mention a number of important results of a positive nature.

For a particular theory, one may single out one model of that theory as its intended interpretation. This will be the interpretation that one normally has in mind in considering that theory. Thus, for example, arithmetic theories have intended interpretations in terms of the natural numbers and the familiar operations of addition and multiplication of natural numbers. When we have singled out some one interpretation of an elementary theory as its intended interpretation, or intended model, we refer to that model of that theory, and all models isomorphic to it, as standard models of that theory. All other models of that theory are non-standard models of that theory.

A theory for which an intended interpretation has been specified is said to be sound if and only if all of its theorems are true under that interpretation. If a theory is sound, it follows, of course, that it is consistent.

In order to discuss subsequently the modeling of theories - elementary theories, or higher order theories - in any general way, we shall need the notion of a cardinal number, together with a few basic facts concerning cardinal numbers. The notion of cardinal number can be defined within set theory, but for our purposes we shall simply take it as a primitive concept. Intuitively, the cardinal number of a set is the number of elements in that set. We postulate that every set has a unique cardinal number, and that any two sets between which there is a one-to-one correspondence have the same cardinal number. Every set is either finite or infinite. It is finite if it can be put into one-to-one correspondence with the set of all natural numbers (that is, the numbers 0, 1, 2, ...) less than or equal to n, for some natural number n. Otherwise, it is infinite. The finite cardinals can be taken to be the natural numbers themselves. The cardinal number of the infinite set of all natural numbers is represented by the symbol aleph null: ℵ₀. This is the smallest of the transfinite cardinals. Every set which is of this cardinality is said to be denumerably infinite (or countably infinite, or of denumerable power). Thus the set of all natural numbers is denumerable. All infinite sets which cannot be put into one-to-one correspondence with the set of all natural numbers are said to be non-denumerably infinite, or of non-denumerable cardinality; e.g., the set of all real numbers. Every cardinal number has a successor. Further, for every cardinal number n, finite or infinite, n is less than the cardinal number 2ⁿ.
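For finite sets, the inequality between n and 2ⁿ can be checked by brute force using the diagonal construction familiar from Cantor's argument: no function from a set to its class of subsets takes the set of all x not belonging to f(x) as a value. The sketch below covers small finite sets only and so illustrates, but does not establish, the claim for infinite cardinals. The function names are hypothetical.

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def diagonal(s, f):
    """The set of all x in s with x not in f(x)."""
    return frozenset(x for x in s if x not in f[x])

def no_function_is_onto(s):
    """Check every function f from s to its subsets: the diagonal set is never a value."""
    s = list(s)
    subsets = powerset(s)
    for images in product(subsets, repeat=len(s)):
        f = dict(zip(s, images))
        if diagonal(s, f) in f.values():
            return False
    return True

size_of_powerset = len(powerset(range(3)))
checked = no_function_is_onto(range(3))
```

For a three-element set there are 8 subsets and 512 functions to examine, and each of them misses its own diagonal set.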

There is, then, no greatest cardinal number. There is a whole infinite increasing sequence of transfinite cardinal numbers. The first of these is, as we have remarked, the cardinal number of all denumerably infinite sets; viz., ℵ₀. Just which one of these is the cardinal number of the continuum, that is, of the set of all real numbers, is a famous unanswered problem in set theory.

The theories which we are considering in this book all have a finite or denumerable infinity of non-logical constants. In addition to theories of this sort, logicians also study so-called generalized theories, in which non-denumerably many non-logical constants appear. The principal results that hold for theories of denumerable non-logical vocabulary also hold for the most part for generalized theories when suitably modified. We shall not, however, here concern ourselves with such theories.

3.2. Completeness Theorems

The most important result of a general nature pertaining to the first-order predicate logic is the result that

(1) Every consistent set of formulas of the first-order predicate logic has a denumerable model; that is, a model with a denumerably infinite domain of individuals.

This result was first proved by Kurt Gödel in 1930. 2 It is generally referred to as the completeness theorem for the first-order predicate logic. It has a number of important corollaries, which we shall consider later. We here merely mention two of these corollaries, each of which is often referred to as a completeness theorem for the predicate logic of first order:

(2) If Γ ⊨ A, then Γ ⊢ A. That is, if A holds in every model of Γ, then A is derivable from Γ.

(3) If ⊨ A, then ⊢ A. That is, if A is a valid formula, then A is a theorem.

2 K. Gödel 1930. An English translation of this important paper appears in J. van Heijenoort 1967. Gödel's proof appears in A. Church 1956, section 44; section 45 contains an independent proof of the same result by L. Henkin in 1949. For a simpler version of Henkin's proof, see B. Mates 1965, pp. 136-141. In addition to the proofs by Gödel and Henkin, there are a number of other proofs of this theorem, including algebraic and topological proofs.

For completeness proofs which effectively show how to find a proof of a formula A if A is valid, see Kalish and Montague 1964, Chapter V; Jeffrey 1967, Chapter 8.

Gödel's result, together with its various corollaries, is one of the most important results in the whole field of mathematical logic. It is important both for mathematical logic itself and for philosophy. Thus, taken as a result in logic, it is one of the bases for the model theoretic approach to the study of formalized theories. Within this approach one studies in a general way questions of completeness, incompleteness, categoricity and related properties of mathematical theories. We shall be considering some of the results in the area of model theory in later chapters. Further, the completeness theorem is of great importance with respect to the application of logic to certain other areas of mathematics, and vice versa. Thus, for example, in his discussion of the completeness result the contemporary logician A. Mostowski writes, 'One often speaks of the relevance of mathematical logic to algebra; it is chiefly the completeness theorem that allows one to connect these two disciplines in such a way that they can deeply influence one another.' 3

Philosophically, the completeness theorem, together with its above two corollaries, is important concerning the question of the relations between syntax and semantics. We have already encountered the converses of the above three results in Chapter II (viz., (3), (1) and (2) respectively, pages 46, 47). When we take these three results together with their converses, we see that certain very important equivalences hold between certain semantical concepts and their syntactical counterparts. In particular, by (1) and its converse, a set of formulas in the first-order logic is semantically consistent if and only if it is syntactically consistent. By (2) and its converse, a formula A holds in all models of a set of formulas Γ if and only if it is derivable from Γ. And by (3) and its converse, a formula is valid if and only if it is a theorem. In the case of the predicate logic of first order, then, these semantical and syntactical concepts coincide in their extension: wherever a concept of the first sort applies, the corresponding concept of the second sort applies, and vice versa. Because of the second half of this equivalence, we say that the first-order predicate logic is sound; because of its first half, we say that it is complete.

3 A. Mostowski 1966, p. 61.

It is worth pointing out here and now that these equivalences do not hold for systems of higher-order predicate logic. Such systems are sound, but (in an important sense) not complete. We shall consider this state of affairs for the second-order predicate logic in particular in more detail in Chapter IV.

We turn now to a proof of Gödel's principal result, stated in the following form:

Every consistent first-order theory has a denumerable model.

The proof that we shall consider is not Gödel's original proof, but a simpler proof due to L. Henkin. 4 We shall present all of the principal points in this proof, without however going into all of its details. It will not be necessary for any subsequent purposes to master this proof, and if the reader prefers he may skip lightly over whatever details do not intrigue him. 5

We introduce the expression

⊢_T A

to mean that A is a theorem of T. We divide the proof into four parts.

Part I. First, we establish a very simple lemma; viz.,

Lemma 1. If a closed formula ~A of a theory T is not a theorem of T, then the theory T' which results from adding A as an axiom to T is consistent.

Suppose that T' is inconsistent. By the final result of Chapter II, it follows that ⊢_T' ~A. Now, by our second corollary to the Deduction Theorem (page 51), it follows that ⊢_T A ⊃ ~A. From this, by the sentential logic, it follows that ⊢_T ~A. But this contradicts our assumption that ~A is not a theorem of T. Thus, T' is consistent. Similarly, if A is not a theorem of T, then the result of adding ~A is consistent.

4 L. Henkin 1949. We here follow a simplified version of Henkin's proof.
5 For all the details, see for example E. Mendelson 1964, pp. 62-68.

Next, we introduce a second lemma, which was first proved by A. Lindenbaum, and is known as Lindenbaum's lemma.

Lemma 2. If T is a consistent elementary theory, then T has a complete, consistent, simple extension.

Let A₁, A₂, ... be an enumeration of all sentences of T. We then define an infinite sequence of theories as follows. Let T₀ be T itself. For any theory Tₙ, if ~Aₙ₊₁ is not a theorem of Tₙ, then let Tₙ₊₁ be the theory which results from adding Aₙ₊₁ to Tₙ as a new axiom. Otherwise, let Tₙ₊₁ be Tₙ itself. Thus, for example, if not ⊢_T₀ ~A₁, then T₁ is T₀ with A₁ added as a new axiom; otherwise, T₁ is T₀. Now let T' be the theory which has as its axioms all of the axioms of these theories Tₙ. It is obvious that T' is a simple extension of T; that is, that T' is an extension of T in which no new non-logical constants have been added to T. A simple argument by induction, together with Lemma 1, shows that T' is consistent, since T is consistent. And since every sentence of T appears in the above enumeration, it is easy to see that T' is complete. Thus T' is a complete, consistent, simple extension of T.
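The step-by-step construction of T' can be imitated in a setting where theoremhood is decidable. The following sketch works in the sentential logic over three letters and uses truth tables as a stand-in for derivability. That substitution, the finite vocabulary `LETTERS`, and the function names are assumptions made for illustration. Enumerating the letters alone suffices here, because settling every letter settles every formula built from them.

```python
from itertools import product

LETTERS = ["p", "q", "r"]   # hypothetical finite vocabulary

def value(f, v):
    """Truth value of a formula (a letter, ("not", A) or ("imp", A, B)) under v."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not value(f[1], v)
    if f[0] == "imp":
        return (not value(f[1], v)) or value(f[2], v)
    raise ValueError(f"unknown connective: {f[0]}")

def models(axioms):
    """All valuations of LETTERS under which every axiom is true."""
    found = []
    for row in product([True, False], repeat=len(LETTERS)):
        v = dict(zip(LETTERS, row))
        if all(value(a, v) for a in axioms):
            found.append(v)
    return found

def proves(axioms, f):
    """Stand-in for derivability: f is true in every model of the axioms."""
    return all(value(f, v) for v in models(axioms))

def lindenbaum(axioms, sentences):
    """Add each sentence in turn unless its negation is already a theorem."""
    t = list(axioms)
    for a in sentences:
        if not proves(t, ("not", a)):
            t.append(a)
    return t

extension = lindenbaum([("imp", "p", "q")], LETTERS)
extension2 = lindenbaum([("not", "p")], LETTERS)
```

Each resulting axiom set has exactly one model, so it is consistent and decides every letter. In the second run 'p' is not added, since its negation is already a theorem.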

The reader must be careful not to misunderstand this lemma. We have not shown that every consistent theory has an effectively defined complete, consistent, simple extension. Indeed, as we shall see in later chapters, this result does not hold in general. The theory T' defined above was not defined in an effective fashion, but in terms of the notion of provable within a given theory, which is not in general an effective notion. In some cases, a given consistent theory will have an effectively defined complete, consistent, simple extension; but in other cases not.

Part II. We now show that every consistent theory T* has a complete, consistent extension with a certain important additional property, which we shall call 'the ω-property.'

Let T_0 be the theory which results from adding denumerably many new individual constants to T*. The axioms of T_0 are those of T*, together with further logical axioms containing the new constants. Since T* is consistent, T_0 will be consistent also. For any derivation of a contradiction within T_0 could be converted into a derivation of a contradiction within T*, simply by replacing all occurrences of distinct new individual constants by occurrences of distinct new variables not occurring within this derivation.

Introduction of some new notation is helpful at this point. Let A be any formula, a be any individual variable, and b be any individual constant. Then the expression

A a/b

shall stand for the result of replacing all free occurrences of a in A by occurrences of b. For example, let A be 'P x ⊃ (∃x)P x', a be 'x' and b be 'a'. Then A a/b is 'P a ⊃ (∃x)P x'.
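The substitution operation just described can be sketched as a short recursive function. This is an illustration only: the tuple encoding of formulas and the name `substitute` are assumptions made for the example, not notation from the text.

```python
# Assumed encoding: ("pred", "P", ("x",)), ("not", A), ("imp", A, B),
# ("and", A, B), ("or", A, B), ("forall", "x", A), ("exists", "x", A)
def substitute(formula, var, const):
    """Replace the free occurrences of var in formula by const."""
    tag = formula[0]
    if tag == "pred":
        name, args = formula[1], formula[2]
        return ("pred", name, tuple(const if t == var else t for t in args))
    if tag == "not":
        return ("not", substitute(formula[1], var, const))
    if tag in ("imp", "and", "or"):
        return (tag,
                substitute(formula[1], var, const),
                substitute(formula[2], var, const))
    if tag in ("forall", "exists"):
        bound, body = formula[1], formula[2]
        if bound == var:
            # Every occurrence of var inside this quantifier is bound: leave it alone.
            return formula
        return (tag, bound, substitute(body, var, const))
    raise ValueError(tag)

# The example from the text: A is 'P x ⊃ (∃x)P x', a is 'x', b is 'a'.
A = ("imp", ("pred", "P", ("x",)), ("exists", "x", ("pred", "P", ("x",))))
result = substitute(A, "x", "a")
```

Only the first occurrence of 'x' is replaced, since the occurrences inside the existential quantifier are bound.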

Consider now an enumeration of all sentences of T_0 which are existential generalizations: (∃a_1)A_1, (∃a_2)A_2, .... Choose a denumerable sequence b_1, b_2, ... of the new individual constants which were added to T*, such that for all m < n, b_n is distinct from b_m, and does not appear in A_m or in A_n. We now define a new sequence of sentences of T_0:

C_1    (∃a_1)A_1 ⊃ A_1 a_1/b_1
C_2    (∃a_2)A_2 ⊃ A_2 a_2/b_2

In terms of this new sequence of sentences, we define an infinite sequence of theories. For each n ≥ 1, let T_n be the theory which results from adding C_1, ..., C_n to T_0. Then let T_ω be the theory which results from adding all of these sentences C_n to T_0. Using the Deduction Theorem, it can readily be shown (we here omit the proof) that each of these theories T_n is consistent, and thus that T_ω is consistent.

Now T_ω is a consistent, simple extension of T_0. By lemma 2, T_ω has a complete, consistent, simple extension. Let us call this extension T_ω'. T_ω' is, then, of course, a complete, consistent, simple extension of T_0. In addition, it contains each of the sentences C_n as axioms. This implies that, for any sentence (∃a_n)A_n, if (∃a_n)A_n is a theorem of T_ω', then A_n a_n/b_n is also a theorem of T_ω'. Thus, in addition to being complete and consistent, T_ω' has the property that if any sentence which is an existential generalization is a theorem of T_ω', then some sentence which expresses some special case of that generalization will also be a theorem of T_ω'. Let us refer to this property of theories as the ω-property. The presence of these sentences C_n as axioms of T_ω' will be crucial in the construction of the denumerable model for T*. The reader will notice that in defining these sentences C_n we have already made use of the fact that new constants have been added to T*, resulting in T_0. And we shall again make use of this fact in defining the denumerable model of T*, for in that model we shall take the domain of individuals to be the individual constants of T_0. By virtue of the fact that we have added these new constants to T_0, we are assured that T_0 contains denumerably many individual constants, regardless of whether or not T* itself does.

Part III. We now list a number of elementary results which hold true of T_ω', which we have just seen to be complete, consistent, and to possess the ω-property. The reader should have no difficulty in establishing these results for himself. Let T be a complete and consistent theory, and let A and B be any sentences of T.

(1) Either A or ∼A is a theorem of T, but not both. That is, ⊢_T A if and only if not ⊢_T ∼A.

(2) ⊢_T A ∨ B if and only if either ⊢_T A or ⊢_T B.

(3) ⊢_T A ∧ B if and only if both ⊢_T A and ⊢_T B.

(4) ⊢_T A ⊃ B if and only if either not ⊢_T A or ⊢_T B.

(5) ⊢_T A ≡ B if and only if both ⊢_T A and ⊢_T B, or neither ⊢_T A nor ⊢_T B.

Suppose now that in addition to being complete and consistent, T has the ω-property. Let (a)A and (∃a)A be any sentences of T which are universal or existential generalizations. Then the following two additional results hold for T.


(6) ⊢_T (a)A if and only if, for every individual constant b of T, ⊢_T A a/b.

(7) ⊢_T (∃a)A if and only if, for some individual constant b of T, ⊢_T A a/b.

The proof of (6) is as follows. The implication from left to right is obvious, given axiom schema (e) of F1. Suppose now that for every constant b of T, ⊢_T A a/b, but that it is not the case that ⊢_T (a)A. Then, by the fact that T is complete, it follows that ⊢_T ∼(a)A; and thus, by the logic of quantification, that ⊢_T (∃a)∼A. Now because T has the ω-property, it follows that for some individual constant b, ⊢_T ∼A a/b. But since for every constant b, ⊢_T A a/b, it follows that T is inconsistent. Since this contradicts our assumption that T is consistent, we conclude that ⊢_T (a)A. The proof of (7) is similar.

Now since T_ω' has the ω-property in addition to being complete and consistent, results (1) through (7) hold true of T_ω'. These results are all drawn upon in the next and final part of the proof, in which we define a denumerable model for T_ω'.

Part IV. We now show that T_ω' has a denumerable model.

Let I be an interpretation of the underlying logic of the theory T_ω'. The non-logical constants of T_ω', recall, are those of the theory T_0, which resulted from adding denumerably many new individual constants to our original theory T*. Let the domain of individuals of I be the denumerably many individual constants of T_ω'. Let I assign to each individual constant a of T_ω' that constant itself. To each n-ary predicate constant f of T_ω', let I assign the class of all those n-tuples of individual constants ⟨a_1, ..., a_n⟩ of T_ω' which are such that the atomic formula f a_1 ... a_n is a theorem of T_ω'. (Notice that we are not in general here defining the assignments to the predicate constants of T_ω' in an effective fashion; for the concept of being a theorem of T_ω' will not in general be an effectively defined concept. Thus the interpretation I is not in general an effectively defined interpretation.)

We now show that, for every sentence A of T_ω', A is true under I if and only if A is a theorem of T_ω'. The proof proceeds by induction on the number of sentential connectives and quantifiers in A. Thus, we show first (I) that this result holds for all sentences containing no connectives or quantifiers; and then (II) that it holds for all sentences containing n + 1 connectives and quantifiers, on the hypothesis that it holds for all sentences containing n or fewer connectives and quantifiers.

(I) Where A is an atomic sentence, this result follows immediately from the definition of I.

(II) Let A be a sentence containing n + 1 connectives and quantifiers. Then A falls under one of the following cases.

(a) A is ∼B, for some sentence B. Then B obviously contains n connectives and quantifiers.

∼B is true under I if and only if B is not true under I; if and only if not ⊢_{T_ω'} B (by the hypothesis of the induction); if and only if ⊢_{T_ω'} ∼B (by result (1), Part III).

(b) A is either B ∨ C, B ∧ C, B ⊃ C or B ≡ C, where B and C are sentences. Here the argument is similar to the argument in (a), except that results (2)-(5), Part III, apply.

(c) A is (a)B, for some variable a and formula B. Since every individual in the domain is assigned to some individual constant (in particular, to itself),

(a)B is true under I if and only if, for every individual constant b, B a/b is true under I; if and only if, for every individual constant b, ⊢_{T_ω'} B a/b (by the hypothesis of the induction); if and only if ⊢_{T_ω'} (a)B (by result (6), Part III).

(d) A is (∃a)B, for some variable a and formula B. Here the proof is similar to that in (c), except that result (7), Part III, applies.

From (I) and (II) it follows that a sentence A of T_ω' is true under I if and only if ⊢_{T_ω'} A. Now since a formula is true under I if and only if its universal closure (which is a sentence) is true under I, and is a theorem of T_ω' if and only if its universal closure is a theorem of T_ω', it follows that every formula which is a theorem of T_ω' is true under I, and thus that I is a model of T_ω'. But since T_ω' is an extension of T*, it follows that I is a model of T*. And this completes the proof of the completeness theorem.


It follows immediately, of course, that every consistent set of formulas Γ of the first-order predicate logic has a denumerable model. For let Γ' be the theory which has as its non-logical axioms those formulas of Γ which are not axioms of logic. Suppose now that Γ is consistent. Then Γ' is consistent, and thus has a denumerable model, which will clearly be a model of Γ.

The two corollaries mentioned at the beginning of this section (page 61) can now be easily established. Suppose, first, that some formula A holds in every model of some set of formulas Γ. Let Γ' be the theory which has Γ as its axioms, and let A' be the universal closure of A. If A' is not a theorem of Γ', then by lemma 1 the result of adding ∼A' to the axioms of Γ' is a consistent theory. By the completeness theorem, this theory has a model. Now ∼A' is clearly true in this model. But A' is also true in this model, which leads to a contradiction. Thus, A' is a theorem of Γ'; and consequently A is derivable from Γ, which proves the first of our two corollaries. The second of the two corollaries follows immediately from the first, since it is merely the special case in which Γ is the empty set of formulas.

3.3. Further Corollaries. Decision Problem

The completeness theorem leads to a variety of corollaries in addition to the two we have just considered. One of the most important of these is the famous Löwenheim-Skolem theorem (1920):

(3) If a set of formulas of the first-order predicate logic has any model at all, then it has a denumerable model; that is, a model with a denumerably infinite domain.

This result follows immediately from the completeness theorem, together with the result from Chapter II that if a set of formulas has a model then it is consistent. In view of the Löwenheim-Skolem theorem, it follows that any set of first-order formulas which admits a non-denumerable model also admits a denumerable model. Thus, no such set of formulas can be categorical.

Now suppose that some interpretation I is a denumerable model for some set of formulas Γ. It is easy to show that Γ will then have models of every non-denumerably infinite cardinality. For any non-denumerably infinite cardinal α, let I' be an interpretation of cardinality α, with the domain of I included in that of I'. Let a be any individual in the domain of I, and then in I' interpret the predicate constants of Γ so that those individuals in the domain of I' which are not in the domain of I are treated as though they were the individual a. That is, let D be the domain of I, and D' be the domain of I'. Let f be a predicate constant of Γ, and let I(f) and I'(f) be its interpretations in I and I'. Then for all x_1, ..., x_n in D', I'(f) holds for ⟨x_1, ..., x_n⟩ if and only if I(f) holds for ⟨y_1, ..., y_n⟩, where y_i = x_i if x_i is in D, and otherwise y_i = a. Further, in I' interpret the individual constants of Γ as in I. Then it can be shown that I' is a model of Γ. Thus, as a further corollary of the completeness theorem we have
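The way I'(f) is obtained from I(f) can be made concrete on a small scale. The following sketch uses finite domains purely for illustration (in the argument above D is denumerable and D' is non-denumerable); the function name `extend_relation` and the sample relation are assumptions made for the example.

```python
from itertools import product

def extend_relation(relation, arity, domain, larger_domain, anchor):
    """Build I'(f): a tuple over the larger domain belongs to it exactly when
    the tuple obtained by replacing each new individual by `anchor` belongs to I(f)."""
    def collapse(x):
        return x if x in domain else anchor

    return {
        xs
        for xs in product(larger_domain, repeat=arity)
        if tuple(collapse(x) for x in xs) in relation
    }

D = {0, 1}                      # domain of I
less = {(0, 1)}                 # I(f) for a binary predicate constant f
D_prime = {0, 1, "u", "v"}      # domain of I', which includes D
less_prime = extend_relation(less, 2, D, D_prime, anchor=0)
```

Here the new individuals "u" and "v" behave exactly as the individual 0 does, and restricting `less_prime` to D gives back the original relation.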

(4) Any consistent set of formulas of the first-order predicate logic has models of every infinite cardinality.

It follows from (4), of course, that no consistent set of first-order formulas is categorical. There are, then, no categorical elementary theories (unless, as we shall see, the theory of identity is added to the first-order logic, in which case there are categorical elementary theories with finite domains, but none with infinite domains).

If a theory is consistent, then, it admits models of denumerably infinite cardinality, and models of all higher cardinalities. Will it necessarily admit models of lower cardinality; that is, finite models? No. There are consistent theories all of whose models are infinite; for example, the usual elementary theories of arithmetic and algebra. There are many familiar formulas which serve as axioms of infinity, in the sense that any theory which contains any of those formulas as theorems admits only models of infinite cardinality. The reader should be able to see that the following formula is an axiom of infinity in this sense:

(x)∼P x x ∧ ((x)(y)(z)(P x y ∧ P y z ⊃ P x z) ∧ (x)(∃y)P x y).
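For small finite domains the claim can be checked by exhaustive search. The sketch below tries every binary relation on domains of one, two and three elements and looks for one that is irreflexive, transitive and such that every element bears the relation to something. It is a check of small cases under assumed names (`satisfies_axiom`, `has_model_of_size`), not a proof of the general claim.

```python
from itertools import product

def satisfies_axiom(domain, P):
    """True when P makes the three conjuncts of the formula true over the domain."""
    irreflexive = all((x, x) not in P for x in domain)
    transitive = all(
        (x, z) in P
        for x, y, z in product(domain, repeat=3)
        if (x, y) in P and (y, z) in P
    )
    serial = all(any((x, y) in P for y in domain) for x in domain)
    return irreflexive and transitive and serial

def has_model_of_size(n):
    """Search all 2**(n*n) binary relations on a domain of n elements."""
    domain = range(n)
    pairs = list(product(domain, repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        P = {pair for pair, bit in zip(pairs, bits) if bit}
        if satisfies_axiom(domain, P):
            return True
    return False

sizes_with_models = [n for n in range(1, 4) if has_model_of_size(n)]
```

The search finds no model of any of these sizes. The general reason is that, starting from any element and repeatedly passing to an element to which it bears P, irreflexivity and transitivity prevent any element from recurring.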

Finally, as one further corollary to the completeness theorem, we mention the so-called compactness theorem:


(5) A formula A holds in every model of a set of formulas Γ if and only if A holds in every model of some finite subset of Γ. Similarly, a set of formulas Γ has a model if and only if every finite subset of Γ has a model.

This corollary follows from the first of the above corollaries and its converse (viz., the soundness theorem for the first-order predicate logic), together with the fact that a contradiction is derivable from a set of formulas Γ if and only if that contradiction is derivable from some finite subset of Γ, since every derivation has only finitely many steps.

From the fact that every valid formula of the first-order predicate logic is a theorem of the first-order predicate logic, it does not follow that there is an effective procedure for determining whether any arbitrary formula of that logic is valid. In fact, it is known that there is no effective procedure for determining whether an arbitrary first-order formula is valid, or (as we now know, equivalently) a theorem of the first-order logic; that is, the first-order predicate logic (unlike the sentential logic) is undecidable (Alonzo Church, 1936).6 In at least this sense, then, we can say that, contrary to many popular conceptions of logic, logic is not all 'mere calculation,' or 'mere computation.' No machine or computer of any sort could determine, for each formula of the first-order predicate logic, whether that formula is valid.

Since a formula is valid if and only if its universal closure is valid, it follows that there is no effective procedure for determining the validity of arbitrary closed formulas of the first-order logic. And from this it follows that there is no such procedure for determining whether an arbitrary closed formula of that logic has a model (or, equivalently, is consistent); for a closed formula has a model if and only if its negation is not valid. And, finally, since a formula has a model if and only if its universal closure does, this result holds for formulas in general.

6 A. Church 1936a. To be sure, Church's result that the set of valid formulas of the first-order predicate logic is not effectively decidable has been established only if one grants Church's thesis, which states that a set is effectively decidable only if it is general recursive. This topic will be considered further in Chapter VIII.


Certain classes of valid formulas of the first-order logic are decidable, however. Clearly the class of all tautologies is decidable, by means of appeal to truth-tables. Further, it is known that a formula of first-order logic in which there are no quantifiers is valid if and only if it is a tautology. Thus, the class of all quantifier-free valid formulas is decidable (being a sub-class of the class of tautologies). More interestingly, the class of all valid first-order formulas in which all predicate constants have only one argument is decidable (Behmann, 1922). (It is known, however, that the class of all valid formulas in which all predicate constants have at most two arguments is not decidable; Kalmár, 1936.) Further, let us say that a formula A is in prenex normal form if and only if either it has no quantifiers within it, or it consists of a string of quantifiers followed by a formula with no quantifiers within it. It is known that every formula A is equivalent to some formula B which is in prenex normal form (in the sense that the biconditional A ≡ B is a theorem). Thus, in searching for decision procedures we may restrict our attention to formulas in prenex normal form. Now, there are effective procedures for determining the validity of all first-order formulas in prenex normal form whose quantifiers are either (a) all universal quantifiers, or (b) all existential quantifiers, or (c) such that no existential quantifier precedes any universal quantifier. And there are other decidable classes of valid formulas within first-order logic.7 In spite of these many partial results, however, first-order logic as a whole is not decidable.
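The appeal to truth-tables mentioned above for the class of tautologies can be sketched as follows. The encoding of formulas (sentence letters as strings, compound formulas as tuples) and the function names are assumptions made for this illustration.

```python
from itertools import product

# Assumed encoding: "p" is a sentence letter; compound formulas are
# ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B), ("iff", A, B).
def letters_of(formula):
    """Collect the sentence letters occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(letters_of(part) for part in formula[1:]))

def value(formula, valuation):
    """Truth-value of the formula under one row of the truth-table."""
    if isinstance(formula, str):
        return valuation[formula]
    op = formula[0]
    if op == "not":
        return not value(formula[1], valuation)
    a, b = value(formula[1], valuation), value(formula[2], valuation)
    return {"and": a and b, "or": a or b, "imp": (not a) or b, "iff": a == b}[op]

def is_tautology(formula):
    """Check every row of the truth-table; there are finitely many rows."""
    letters = sorted(letters_of(formula))
    return all(
        value(formula, dict(zip(letters, row)))
        for row in product([False, True], repeat=len(letters))
    )

peirce = ("imp", ("imp", ("imp", "p", "q"), "p"), "p")
```

The procedure always terminates because a formula with k letters has only 2^k rows to inspect; nothing comparable is available for arbitrary quantified formulas.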

3.4. The First-Order Predicate Logic With Identity

We return now to the particular system of first-order predicate logic F1, and to a consideration of ways in which that system can be extended.

7 For a number of solutions to the decision problem in special cases, see A. Church 1956, section 46. For a decision procedure for the class of all valid first-order formulas in which all predicate constants have only one argument, see D. Kalish and R. Montague 1964, pp. 124-126.

Though a number of important branches of mathematics can be developed within F1, for developing certain other first-order branches of mathematics it is convenient and sometimes necessary to have a symbol for identity included among the primitive constants of our predicate logic, together with axioms or rules of inference governing that symbol. And at times it is very convenient to have operation symbols included among the symbols of our logic. Let us now indicate one way (among several ways) in which these additions could be made to F1, considering first the addition of a symbol standing for identity, together with axioms governing this symbol.

Let us take the familiar symbol '=' as our binary predicate constant for identity. (We here require retroactively that this symbol be not already present within F1.) Let us call the result of adding this symbol to F1, together with adding the axioms governing this symbol which we shall list below, FI - a system of first-order predicate logic with identity.

As in the case of F1, the symbol 'FI' stands not just for one system of logic, but a whole variety of systems, which differ just in which non-logical constants they contain. The number of individual constants in any particular system FI may be either none at all, or a finite or countably infinite number. Since the sign for identity is a predicate constant, no further predicate constants are required. Thus, the number of additional predicate constants in any particular system FI may be either none at all, or any finite or countably infinite number.

It is not customary to place the argument expressions appearing with the binary predicate symbol '=' after the occurrence of this symbol itself; rather, it is customary to place one of them immediately to the left of the occurrence of this symbol, and to place the other immediately to the right, as in 'x = y', for example. This practice will be followed here, but this will imply (unless we make appropriate changes in our definition of 'formula of FI') that we recognize again a distinction between official notation and informal notation. Within official notation expressions of the form a = b do not appear; rather, in their place we have only expressions of the form = a b. The definitions of 'formula' for F1 and FI define the formulas of those languages as these formulas appear within official notation. In order to keep things simple, some uniform procedure for constructing formulas is adhered to within this definition. When actually writing out examples of formulas of these languages, however, it is most convenient to permit oneself to draw upon whatever customary practices may prevail.

As a further departure from official notation, expressions of the form ∼(a = b) will often be written in the form a ≠ b.

The definition of 'formula of FI' is obtained from the definition of 'formula of F1' simply by replacing all occurrences of 'F1' therein by occurrences of 'FI'.

If FI is to be a system of logic, rather than a formalization of a non-logical theory of identity, all of the axioms of FI must be true under all interpretations of FI. In order to secure this result, let us require that every interpretation of FI assign the identity relation to the symbol '='. This calls for only a minor revision in the definition of 'interpretation of F1' (page 35). In particular, an interpretation I of FI consists of:

(a) a non-empty domain D, over which the variables of FI range;

(b) for each individual constant (if any) of FI, an assignment to that constant of some element from the domain D;

(c) for each n-ary predicate constant of FI, except the binary predicate constant '=', an assignment to that constant of some n-ary relation among the elements of D; and

(d) an assignment to the predicate constant '=' of the identity relation among the elements of D.

The identity sign is now a special kind of predicate constant. Whereas among the various interpretations of FI there are many different relations assigned to the remaining predicate constants of FI, no relation other than the identity relation (with respect to the domain of the interpretation in question) is ever assigned to the identity sign. For this reason, the identity sign is now to be included among the logical constants of FI.

To specify the axioms of FI, we first take the axiom schemata of F1, now understanding the metavariables appearing within these schemata to range over expressions of FI of the appropriate type; add the axiom

x = x; (Reflexive law of identity)

and then add the schema

a = b ⊃ (A ⊃ B), (Substitutivity of identity)

where a and b are individual constants or variables of FI, A and B are formulas of FI, and B is obtained from A by replacing one particular occurrence of a by an occurrence of b, where this occurrence of a does not lie within the scope of any quantifier containing either a or b.

The rules of inference of FI are those of F1.

The above schema is a Law of Extensionality for the predicate logic with identity. It assures us that all contexts within this logic are extensional with respect to identity, in the sense that if two individual variables or constants a and b stand for the same individual, these symbols may replace each other in any context without changing the truth-value of that context (provided, of course, that the qualification on a and b appearing in the formulation of the schema is met).

As examples of axioms provided by this schema, we have the following formulas:

x = y ⊃ (P x ⊃ P y),

x = y ⊃ ((z)P x z ⊃ (z)P y z).

The familiar laws concerning identity can easily be derived from these axioms for FI. These include principally

x = y ⊃ y = x (Commutative law of identity)

and

x = y ⊃ (y = z ⊃ x = z) (Transitive law of identity)
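One way in which these two laws might be derived is sketched below; this is offered as an illustration and is not necessarily the derivation the reader would find elsewhere. In step 1 the schema of substitutivity is applied with a as 'x', b as 'y', and A as 'x = x', replacing the first occurrence of 'x'; in step 4 it is applied with a as 'y', b as 'x', and A as 'y = z'.

```latex
\begin{align*}
&1.\quad x = y \supset (x = x \supset y = x) && \text{substitutivity of identity} \\
&2.\quad x = x && \text{reflexive law of identity} \\
&3.\quad x = y \supset y = x && \text{from 1 and 2 by the sentential logic} \\
&4.\quad y = x \supset (y = z \supset x = z) && \text{substitutivity of identity} \\
&5.\quad x = y \supset (y = z \supset x = z) && \text{from 3 and 4 by the sentential logic}
\end{align*}
```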

The definition of all remaining syntactical and semantical terms earlier stated with respect to F1 can be carried over to FI without change, except for replacing all occurrences therein of 'F1' by occurrences of 'FI'. Under these definitions it is easy to show that all of the axioms and theorems of FI are valid within FI. Furthermore, the completeness theorem is known to hold also (Gödel, 1930), in the following form:8

8 This can be proved by a modification in the proof of completeness for the predicate logic without identity. See E. Mendelson 1964, p. 80.


Every consistent set of formulas of the first-order predicate logic with identity has a model whose domain is either finite or denumerable.

We turn now to the topic of categoricity and elementary theories with identity.

In our discussion of elementary theories we remarked that though no consistent elementary theory without identity is categorical, there are finite elementary theories with identity which are categorical. Thus, consider the theory of simple ordering of order two, which has FI as its underlying logic, the binary predicate constant 'R' as its sole non-logical constant, and the following five formulas as its non-logical axioms:

(1) (∃x)(∃y)(x ≠ y ∧ (z)(z = x ∨ z = y))
(2) (x)R x x
(3) (x)(y)(R x y ∧ R y x ⊃ x = y)
(4) (x)(y)(z)(R x y ∧ R y z ⊃ R x z)
(5) (x)(y)(x ≠ y ⊃ R x y ∨ R y x)

Axiom (1) states that there are exactly two elements in the domain; and axioms (2)-(5) state respectively that the relation R is reflexive, antisymmetric, transitive and connected. It is clear that all of the models of this theory have exactly two elements; and in fact all of those models are isomorphic. Thus this theory is categorical, as is the theory of simple ordering of order n, for every positive integer n. Thus, for all positive integers n, there are categorical elementary theories with identity of cardinality n.
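The claim that all models of this theory are isomorphic can be checked directly for a fixed two-element domain. The sketch below enumerates every binary relation on the domain {0, 1}, keeps those satisfying axioms (2)-(5), and then verifies that interchanging the two elements carries one surviving relation onto the other. The names used are assumptions made for the illustration.

```python
from itertools import product

def is_simple_ordering(domain, R):
    """Axioms (2)-(5): reflexive, antisymmetric, transitive and connected."""
    reflexive = all((x, x) in R for x in domain)
    antisymmetric = all(
        x == y
        for x, y in product(domain, repeat=2)
        if (x, y) in R and (y, x) in R
    )
    transitive = all(
        (x, z) in R
        for x, y, z in product(domain, repeat=3)
        if (x, y) in R and (y, z) in R
    )
    connected = all(
        (x, y) in R or (y, x) in R
        for x, y in product(domain, repeat=2)
        if x != y
    )
    return reflexive and antisymmetric and transitive and connected

domain = (0, 1)  # axiom (1) forces exactly two elements
pairs = list(product(domain, repeat=2))
orderings = []
for bits in product([False, True], repeat=len(pairs)):
    R = {pair for pair, bit in zip(pairs, bits) if bit}
    if is_simple_ordering(domain, R):
        orderings.append(R)

# The mapping that interchanges 0 and 1, applied to the first ordering found.
swap = {0: 1, 1: 0}
image = {(swap[x], swap[y]) for x, y in orderings[0]}
```

Exactly two relations survive, one placing 0 first and one placing 1 first, and the interchange of the two elements is an isomorphism between them.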

Suppose now that T is a categorical elementary theory with identity. Let A be a sentence of T. Then, since all of the models of T are isomorphic, A has the same truth-value in every model of T. Thus, either A holds in all models of T, or ∼A holds in all models of T. Thus, by the completeness theorem for elementary logic with identity, either A or ∼A will be a theorem of T. Thus, T will be complete. Every categorical elementary theory with identity, that is, is complete. In particular, then, the above theories of simple ordering of order n are complete. Unfortunately, of course, this way of proving completeness of elementary theories with identity is restricted to finite theories, since only such theories are categorical.


Since no elementary theory (with or without identity) which admits models of infinite cardinality is categorical, the unrestricted concept of categoricity is not very useful for elementary theories. More useful is a restricted form of this concept, defined with respect to elementary theories with identity; viz., the concept of being categorical in power. We say (Łoś, 1954; Vaught, 1954), that an elementary theory T with identity is categorical in power m if and only if (1) T has at least one model of cardinality m, and (2) all models of T which are of cardinality m are isomorphic.

As we remarked in connection with the theory of simple ordering of order n, for every positive integer n there are theories which are categorical in power n. One of the deepest results in model theory concerns categoricity in infinite powers. According to that result, if an elementary theory with identity is categorical in any non-denumerably infinite cardinal, then it is categorical in every non-denumerably infinite cardinal (Morley, 1962). This is the principal result which makes possible the following four-fold classification of elementary theories with identity which admit models with infinite domains.

(1) theories which are categorical only in the denumerable cardinal;

(2) theories which are categorical in all non-denumerable cardinals, but not in the denumerable cardinal;

(3) theories which are categorical in all infinite cardinals; and

(4) theories which are categorical in no infinite cardinal.

The following are simple examples of theories of these types.9

Type (1). Let T be a theory whose only non-logical constant is the singulary predicate constant 'P'. As non-logical axioms, for each positive integer n take an axiom stating that there are at least n individuals in the set assigned to 'P', and at least n individuals not in that set.

Type (2). Let T be a theory whose non-logical constants are an infinite sequence of individual constants a_1, a_2, .... As non-logical axioms, take the axioms a_m ≠ a_n, for all m ≠ n.

Type (3). Let T be the theory which has no non-logical constants and no non-logical axioms. That is, let T be that particular FI itself which contains no non-logical constants.

Type (4). Let T be the theory with the singulary predicate constant 'P' as its sole non-logical constant, and let T have no non-logical axioms.

9 For additional examples, see Mendelson 1964, pp. 91-92; A. Mostowski 1966, pp. 122-124. And further examples will appear in later chapters of this book.

As a more interesting, well-known example of type (1) - that is, of theories which are categorical only in the denumerable power - we have the elementary theory of densely ordered sets with neither first nor last elements. This elementary theory with identity has the binary predicate constant '<' as its sole non-logical constant. Its non-logical axioms are the following axioms.

(1) (x)(∃y)(∃z)(y < x ∧ x < z)
(2) (x)(y)(z)(x < y ∧ y < z ⊃ x < z)
(3) (x)(y)(x = y ⊃ ∼x < y)
(4) (x)(y)((x < y ∨ y < x) ∨ x = y)
(5) (x)(y)(x < y ⊃ (∃z)(x < z ∧ z < y))

Any denumerable model of this theory will be a linearly ordered set which is dense - that is, between any two distinct elements x and y there will be a third element z - and has no first or last elements. It was proved by Cantor that all such ordered sets are isomorphic. Thus, this theory is categorical in the denumerable power. All of its denumerable models are isomorphic with the model which consists of the rational numbers under their natural ordering. It is known, however, that it is not categorical in any non-denumerable power.

In terms of the concept of categoricity in power we are able to state a very useful criterion for completeness; viz., Vaught's criterion (1954). Let T be a first-order theory with identity which is categorical in some infinite power m, and which admits no finite models. Then T is complete.

The proof of this criterion is very simple. Suppose that T is incomplete. Then there is some sentence A of T such that neither A nor ∼A is a theorem of T. Thus it follows (by lemma 1, p. 62) that both T ∪ {A} and T ∪ {∼A} - that is, the two theories which result from adding A and ∼A to the axioms of T - are consistent. Therefore, these two theories will both have a model (by the completeness theorem). But since T itself has no finite models, these models will be infinite. Now, since any first-order theory with identity which admits infinite models at all admits models of every infinite cardinality, these two theories will both have a model of the infinite cardinality m. Let these models be M and M', respectively. These two models M and M' will be models of T. Since A holds in M and ∼A holds in M', M and M' cannot be isomorphic. But this contradicts the hypothesis that T is categorical in power m.

As an illustration of the use of Vaught's criterion, notice that we are now able to conclude that the elementary theory of densely ordered sets is complete. This follows from Vaught's criterion, together with the fact that this theory is categorical in the denumerable power and admits no finite models.

3.5. The First-Order Predicate Logic With Identity and Operation Symbols

We pass now to a consideration of how operation symbols could be added to FI. We could, of course, also add operation symbols directly to F1. These symbols will be interpreted as standing for operations, where an n + 1-ary relation R is an n-ary operation with respect to a domain D if and only if, for each n-tuple ⟨x_1, x_2, ..., x_n⟩ of individuals in D there is exactly one individual y in D such that the n + 1-tuple ⟨x_1, ..., x_n, y⟩ is an element of R.
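The condition under which a relation counts as an operation can be tested mechanically on a finite domain. The following sketch is an illustration only; the function name `is_operation` and the two sample relations are assumptions made for the example.

```python
from itertools import product

def is_operation(relation, arity, domain):
    """True when each n-tuple of arguments from the domain is paired in the
    (n + 1)-ary relation with exactly one value from the domain."""
    for args in product(domain, repeat=arity):
        values = [y for y in domain if args + (y,) in relation]
        if len(values) != 1:
            return False
    return True

D = range(3)
# A ternary relation that is a binary operation: addition modulo 3.
addition_mod_3 = {(x, y, (x + y) % 3) for x in D for y in D}
# A binary relation that is not a singulary operation: 'less than'.
less_than = {(x, y) for x in D for y in D if x < y}
```

The first relation passes the test, while the second fails it: 0 is less than two individuals of the domain, and 2 is less than none.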

Theoretically, it is not necessary ever to use operation symbols. N-place operation symbols can always be dispensed with in favor of (n + 1)-ary predicate constants. Thus, for example, instead of using the familiar binary operation symbol '+' for addition, and then writing 'x + y = z', we might introduce the ternary predicate constant 'Σ' by the following definition:

Σxyz ≡ x + y = z,

and then write 'Σxyz'. Still, when operation symbols are dispensed with in favor of predicate constants, sentences that are comparatively brief in length become replaced by much more lengthy sentences, which are often difficult to recognize as translations of perfectly familiar sentences. For this reason, we shall now add operation symbols to the list of symbols of FI. Let us call the resulting system (rather, systems), FO. FO, then, is a first-order predicate logic with identity and operation symbols.
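How quickly sentences lengthen can be seen by carrying out the replacement mechanically. The sketch below is only one possible translation scheme, under stated assumptions: terms are nested tuples in prefix form, the ternary predicate constant for addition is written 'Sum' (a hypothetical name), and each compound subterm receives a fresh existentially quantified variable u1, u2, and so on.

```python
from itertools import count

PREDICATE_FOR = {"+": "Sum"}   # hypothetical pairing of '+' with a ternary predicate constant

def eliminate(term, atoms, fresh):
    """Return a variable naming `term`, appending the predicate atoms that define it."""
    if isinstance(term, str):          # variables and constants stay as they are
        return term
    symbol, *args = term
    names = [eliminate(a, atoms, fresh) for a in args]
    result = f"u{next(fresh)}"         # fresh variable for the value of this subterm
    atoms.append((PREDICATE_FOR[symbol], *names, result))
    return result

def translate_equation(lhs, rhs):
    """Rewrite the equation lhs = rhs without operation symbols."""
    atoms, fresh = [], count(1)
    left = eliminate(lhs, atoms, fresh)
    right = eliminate(rhs, atoms, fresh)
    bound = [atom[-1] for atom in atoms]
    body = " ∧ ".join([" ".join(a) for a in atoms] + [f"{left} = {right}"])
    return "".join(f"(∃{v})" for v in bound) + f"({body})"

# '(x + y) + z = w' in prefix form
translated = translate_equation(("+", ("+", "x", "y"), "z"), "w")
```

For the equation '(x + y) + z = w' this yields '(∃u1)(∃u2)(Sum x y u1 ∧ Sum u1 z u2 ∧ u2 = w)', which is already harder to recognize than the original.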

The symbols of FO are the symbols of FI, together with any number (possibly none) of n-ary operation symbols (or functors), for each positive integer n (except, of course, that there must be at least one operation symbol of some sort).

We now need a word to stand for the various expressions of FO which come to designate individuals when FO is interpreted. The most familiar word for this kind of expression is the word 'term.' We now define 'term' recursively. A term of FO is any expression which is either (a) an individual variable of FO, or (b) an individual constant of FO, or (c) an n-ary operation symbol of FO, followed by n occurrences of terms of FO. (Of course, we might have introduced the word 'term' in connection with F1 and FI also, to stand for individual variables and constants.)

As an example of a term within any system FO in which the familiar operation sign for addition appears, we have (within informal notation) the expression '(x + y) + z'. By (a), 'x', 'y', and 'z' are terms of FO; thus by (c), the expression '(x + y)' is a term of FO; and by (c) again, '(x + y) + z' is a term of FO.
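The recursive definition translates directly into a recursive check. This is a minimal sketch under an assumed encoding that is not the book's: a variable or constant is a string, and a compound term is a tuple consisting of an operation symbol followed by its argument terms. The particular sets of variables, constants, and arities are illustrative.

```python
# Hypothetical vocabulary for one system FO.
VARIABLES = {"x", "y", "z"}
CONSTANTS = {"e"}
ARITY = {"+": 2}          # n-ary operation symbols and their arities

def is_term(expr):
    # Clauses (a) and (b): individual variables and constants are terms.
    if isinstance(expr, str):
        return expr in VARIABLES or expr in CONSTANTS
    # Clause (c): an n-ary operation symbol followed by n terms.
    if isinstance(expr, tuple) and expr and expr[0] in ARITY:
        args = expr[1:]
        return len(args) == ARITY[expr[0]] and all(is_term(a) for a in args)
    return False

example = ("+", ("+", "x", "y"), "z")   # '(x + y) + z' in prefix form
```

`is_term(example)` succeeds by two applications of clause (c), mirroring the argument in the text, whereas `is_term(("+", "x"))` fails because the binary symbol is followed by only one term.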

Now that we have a new kind of expression that comes to designate individuals when FO is interpreted, we extend the range of the bold face small roman letters 'a', 'b', 'c', etc., to all terms of FO (rather than just to the individual variables and constants of FO).

We next need to define 'formula', 'interpretation', 'satisfies', and 'axiom', all with respect to FO.

We obtain the definition of 'formula of FO' by an obvious modification of the definition of 'formula of F1' (page 32), so as to provide for the fact that we now have a new kind of expression designating individuals.

To obtain the definition of an interpretation of FO, we simply add to the definition of an interpretation of FI (page 73) one further clause covering operation symbols (first replacing in that definition all occurrences of 'FI' by occurrences of 'FO'):

(e) for each n-ary operation symbol of FO, an assignment to that operation symbol of some n-ary operation over the elements of D.

Next, we need to define the notion of an infinite sequence of individuals satisfying a formula of FO. In order to do this, we first define the notion of a value of a term a with respect to a given interpretation I and a given infinite sequence S of individuals in the domain of I.

(a) If a is an individual constant, then the value of a is the individual assigned to a by the interpretation I;

(b) If a is an individual variable, then the value of a is the individual correlated with a by the infinite sequence S;

(c) If a is an n-ary operation symbol o followed by n terms b1, b2, ..., bn, then the value of a is the result of applying the operation assigned to o by I to the ordered n-tuple of the values of b1, b2, ..., bn (taken in that order).

Let us use again the symbol

S^I(a)

from Chapter 2 to stand now for the value of the term a with respect to the interpretation I and the infinite sequence S.
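Clauses (a) to (c) amount to a recursive evaluation procedure. The sketch below assumes the same tuple encoding of terms as a prefix-form tree, represents an interpretation as a dictionary with hypothetical keys `constants` and `operations`, and uses a finite dictionary from variables to individuals as a stand-in for the infinite sequence S.

```python
def term_value(term, interp, seq):
    """Value of `term` under interpretation `interp` and variable assignment `seq`."""
    if isinstance(term, str):
        if term in interp["constants"]:
            return interp["constants"][term]      # clause (a): individual constant
        return seq[term]                          # clause (b): individual variable
    symbol, *args = term
    operation = interp["operations"][symbol]
    # Clause (c): apply the assigned operation to the values of the arguments, in order.
    return operation(*(term_value(a, interp, seq) for a in args))

# An illustrative interpretation over the integers.
integers = {"constants": {"e": 0}, "operations": {"+": lambda m, n: m + n}}
S = {"x": 2, "y": 3, "z": 4}
value = term_value(("+", ("+", "x", "y"), "e"), integers, S)
```

With x and y correlated with 2 and 3 and 'e' assigned zero, the term '(x + y) + e' takes the value 5.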

To obtain now the definition of the expression 'the infinite sequence S satisfies the formula A of FO under the interpretation I', we simply replace all occurrences of 'FI' by occurrences of 'FO' in the definition of the notion of satisfaction defined with respect to FI (pages 36-37). In terms of this definition, the definition of all remaining semantical terms proceeds as in the case of FI.

Finally, we turn to the characterization of the axioms of FO. Here all that is needed is slight changes in the characterizations of the axioms of F1 and FI. First, we now understand the metavariables appearing in these characterizations to range over expressions of FO of appropriate type. Then, because of the presence in FO of terms other than individual variables and constants, we need to extend the restriction on axiom schema (e) (page 43). For FO, this schema is stated as follows:

(e1) (a)A ⊃ B, where a is an individual variable, B differs from A at most in having occurrences of some term b where A has free occurrences of a, and none of these occurrences of b in B lie within the scope of any quantifier containing any individual variable occurring in b.

Further, the obvious corresponding change needs to be made in the formulation of the axiom schema (f) for identity (page 74): i.e., in place of axiom schema (f), we take the schema:

(f1) a = b ⊃ (A ⊃ B), where a and b are terms of FO, A and B are formulas of FO, and B is obtained from A by replacing one occurrence of a by an occurrence of b, where this particular occurrence of a does not lie within the scope of any quantifier containing any variable occurring within either a or b.

The rules of inference of FO are those of FI. And the definition of all remaining syntactical terms now proceeds as in the case of FI. Further, the completeness theorem holds for FO in the form in which it holds for FI.

We shall now close our treatment of the first-order predicate logic with identity and operation symbols by giving a simple but very important example of a theory developed within this logic; viz., elementary group theory. Group theory has a number of equivalent axiomatizations; the following axiomatization is well known. The non-logical constants of this theory are (1) a binary operation symbol '∘'; (2) a singulary operation symbol '⁻¹', used as a superscript; and (3) an individual constant 'e'. The non-logical axioms of this theory are the following axioms:

(1) (x)(y)(z)(x ∘ (y ∘ z) = (x ∘ y) ∘ z)
(2) (x)(x ∘ e = x)
(3) (x)(x ∘ x⁻¹ = e)

The first of these axioms states that the binary operation ∘ is associative; the second, that the element e is a right-hand identity element with respect to the operation ∘; and the third, that the singulary operation ⁻¹ is an inverse operation with respect to the operation ∘. As a model of this theory, we have the interpretation which takes the integers as its domain; which assigns the addition operation to the symbol '∘', and assigns the operation of taking the negative of a number to the symbol '⁻¹'; and assigns the integer zero to the symbol 'e'. The reader should have no difficulty in thinking of further algebraic models.

Group theory admits both finite and infinite models. Indeed, for every positive integer n, there are groups of cardinality n. As for categoricity in power, elementary group theory is not categorical in any infinite power. It is, then, another example of theories of type (4) (page 76).
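For finite domains, the claim that an interpretation is a model of the three group axioms can be verified by exhaustive search. The following sketch is an illustration rather than a proof of the general claim: it checks the axioms for the integers modulo n under addition, for n from 1 to 7. The function names are hypothetical.

```python
def is_group_model(domain, op, inv, e):
    """Check axioms (1)-(3) by exhaustive search over a finite domain."""
    associative = all(op(x, op(y, z)) == op(op(x, y), z)
                      for x in domain for y in domain for z in domain)
    right_identity = all(op(x, e) == x for x in domain)
    right_inverse = all(op(x, inv(x)) == e for x in domain)
    return associative and right_identity and right_inverse

def cyclic_model(n):
    """Integers modulo n under addition: an interpretation of cardinality n."""
    domain = range(n)
    return domain, (lambda x, y: (x + y) % n), (lambda x: (-x) % n), 0

results = [is_group_model(*cyclic_model(n)) for n in range(1, 8)]
```

Every entry of `results` is true. By contrast, multiplication modulo 3 with 1 assigned to 'e' is not a model, since axiom (3) fails for the element 0.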

Finally, if we add a fourth axiom to the effect that the operation ∘ is commutative, viz.,

(4) (x)(y)(x ∘ y = y ∘ x),

we obtain commutative (or Abelian) group theory. We shall consider this theory, as well as group theory itself, later in Chapter VIII, in connection with the decision problem.
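That axiom (4) is a genuine addition, and not a consequence of axioms (1) to (3), can be illustrated with a finite model of the first three axioms in which the fourth fails. The sketch below uses the six permutations of a three-element set under composition; the encoding of permutations as tuples is an assumption made for the example.

```python
from itertools import permutations, product

S3 = list(permutations(range(3)))      # each permutation p is a tuple with p[i] the image of i

def compose(p, q):
    """The permutation 'p after q'."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    """The permutation sending p[i] back to i."""
    return tuple(sorted(range(3), key=lambda i: p[i]))

e = (0, 1, 2)                          # the identity permutation

axioms_1_to_3 = (
    all(compose(x, compose(y, z)) == compose(compose(x, y), z)
        for x, y, z in product(S3, repeat=3))
    and all(compose(x, e) == x for x in S3)
    and all(compose(x, inverse(x)) == e for x in S3))
axiom_4 = all(compose(x, y) == compose(y, x) for x, y in product(S3, repeat=2))
```

Here `axioms_1_to_3` is true and `axiom_4` is false, so this interpretation is a model of group theory but not of Abelian group theory.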

CHAPTER IV

THE SECOND-ORDER PREDICATE LOGIC. THEORY OF DEFINITION

4.1. Introduction

Within the first-order predicate logic, understood as elementary logic, the only variables that appear are individual variables. (Within non-elementary forms of first-order logic predicate variables also appear, but only as free variables. As before, we shall continue to consider first-order logic only in the sense of elementary logic.) The distinctive feature of the second-order predicate logic, or the functional calculus of second order, is that within that logic there are not only individual variables but predicate variables as well, with both individual and predicate variables being quantifiable. 1 Due to the presence of quantified predicate variables, the second-order predicate logic has, in a certain sense, more expressive power than the first-order predicate logic. By way of illustration of this increase in expressive power, suppose that we let our individual variables range over the integers. Then, within the limits of the first-order logic there is no way to say that for every two properties of integers, there is a property of integers which applies just to those integers to which the first of these two properties, but not the second, applies. Within the second-order logic, however, we can say this with the help of quantified predicate variables as follows:

1 For more extensive treatments of the second-order predicate logic, see D. Hilbert and W. Ackermann 1950, Chapter IV; A. Church 1956, Chapter V; J.W. Robbin 1969, Chapter 6. See also R. Montague 1965.


(F)(G)(∃H)(x)(Hx ≡ Fx ∧ ∼Gx).

As another example, consider the expression

(F)(∃G)(x)(y)(Fxy ≡ Gyx).

As this expression is interpreted within the second-order logic, it states that every binary relation among individuals has a converse. So long as we restrict the range of individual variables to individuals, it is not possible to express this thought within the first-order logic.
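What the two second-order sentences assert can be made concrete on a small finite domain, where the quantifiers over properties and relations range over finitely many sets. The sketch below is an illustration under assumptions: it uses a three-element domain instead of the integers, and it lets the predicate variables range over all subsets, which is one particular choice of range.

```python
from itertools import chain, combinations, product

def all_subsets(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

D = {0, 1, 2}
properties = all_subsets(D)                        # every property (class) of individuals
relations = all_subsets(product(D, repeat=2))      # every binary relation on D

# (F)(G)(∃H)(x)(Hx ≡ Fx ∧ ∼Gx)
difference_exists = all(
    any(all((x in H) == (x in F and x not in G) for x in D) for H in properties)
    for F in properties for G in properties)

# (F)(∃G)(x)(y)(Fxy ≡ Gyx)
converse_exists = all(
    any(all(((x, y) in F) == ((y, x) in G) for x in D for y in D) for G in relations)
    for F in relations)
```

Both checks succeed: the outer `all` plays the part of a universal predicate quantifier and the inner `any` that of an existential one.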

As we shall see illustrated in Chapters V and VI, with the help of the logic of second order we are able to state a number of very interesting second-order theories; in particular, the second-order arithmetic of natural numbers, and the second-order algebra of real numbers. These second-order formulations are on a number of counts superior to the corresponding first-order theories of arithmetic and algebra. For one thing, these second-order theories contain only finitely many axioms of a specifically mathematical nature, while the corresponding first-order theories require an infinite number of such axioms. Further, as we shall see, these second-order theories possess the very important property of being categorical, in the sense that all of their principal models are alike in structure; the corresponding first-order theories, however, are not categorical. And these important differences between first and second-order theories hold not just for arithmetic and algebra, but various other mathematical theories as well, including various theories of geometry.

One further respect in which second-order theories are superior to first-order theories is that, by virtue of the increased expressive power of second-order logic, many concepts which are taken as primitive, or undefined, in first-order theories can be introduced by definition in the corresponding second-order theories. Thus, for example, in first-order arithmetic one customarily takes symbols for the operations of successor, addition and multiplication as primitive. If one wished, in second-order arithmetic one could take only the first of these symbols as primitive, and then introduce the remaining two by definition.


As against these various advantages of second-order theories over first-order theories, there is one big disadvantage, however: second-order logic is in an important sense incomplete. That is, there is no consistent and effectively defined set of axioms from which all of the valid formulas of the second-order logic are derivable. Due in part to this reason, second-order theories have not in the past received as much attention from logicians as first-order theories. At present, however, there is an increasing interest in such theories, and in the second-order logic itself.

Most mathematical theories employ not only primitive symbols, but in addition symbols which are defined in terms of other symbols. In order to avoid circularity, contradiction and other unwanted results, definitions introducing defined symbols must satisfy certain conditions. We shall conclude this chapter with a consideration of these conditions, and of ways in which they can be met.

4.2. The Second-Order Predicate Logic F2

We now present a system of second-order predicate logic, F2. Rather, as before, we shall present a whole class of such systems, which shall differ only in which non-logical constants they contain. And when we do not define a particular syntactical or semantical term, its definition should be obvious from our presentation of the first-order predicate logic.

The symbols of F2 are taken from the following list of symbols.

(1) the logical constants of the first-order predicate logic with identity FI. That is, the symbols

∼ ⊃ ∨ ∧ ≡ ( ) ∃ =

(2) an infinite list of individual variables; viz.,

x y z x1 y1 z1 x2 y2 z2 ...

(3) for each positive integer n, an infinite list of n-ary predicate variables; viz., an infinite list of singulary predicate variables:

F¹ G¹ H¹ F¹₁ G¹₁ H¹₁ F¹₂ ... ;

an infinite list of binary predicate variables:

F² G² H² F²₁ G²₁ H²₁ F²₂ ... ;

and so on.

(4) an infinite list of individual constants, which we need not specify here.

(5) for every positive integer n, an infinite list of n-ary predicate constants.

(6) for every positive integer n, an infinite list of n-ary operation constants.

Each of the systems F2 contains all of the symbols under (1), (2), and (3). In addition, each of these systems may or may not include various of the non-logical constants under (4), (5) and (6).

We are here including the usual identity symbol among the logical constants for F2. This symbol will function within F2 as a symbol for identity. Thus, we are here considering only second-order predicate logic with identity. To obtain a treatment of second-order logic without identity, one has only to make a few obvious deletions in the syntax and semantics of F2.

Within F2 we permit both individual and predicate variables to appear within quantifiers. A quantifier of F2, then, is any expression of F2 of the form (v) or (∃v), where v is any variable of F2.

The terms of F2 are characterized as expected; viz., as the individual variables and constants of F2, together with the results of putting some n-ary operation symbol in front of a string of n terms.

A formula of F2 is any expression of F2 which is either:

(a) an n-ary predicate variable or predicate constant of F2, followed by n occurrences of terms of F2; or

(b) the negation of a formula of F2; or

(c) the conditional, or disjunction, or conjunction, or biconditional between any two formulas of F2; or

(d) the result of putting a quantifier before a formula of F2.

The definitions of 'free occurrence of a variable,' 'bound occurrence of a variable,' and 'scope' can be carried over directly from F1, except that now they apply to all variables of F2. And a sentence, or closed formula, of F2 is any formula of F2 in which all occurrences of variables are bound occurrences. All other formulas are open formulas.

As in the preceding chapter, we shall usually resort to informal notation when presenting formulas and formula schemata. As examples of formulas of F2 , we have the following expressions:

(x)(x = x)

(x)Fx

(∃F)(x)Fx

(G)((∃F)Fxy ⊃ (z)(Gz ⊃ Fxz))

The first and third of these examples are closed formulas, while the second and fourth are open formulas.
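The classification of these four examples can be reproduced by computing, for each formula, the set of variables that have a free occurrence in it. The sketch below assumes a hypothetical encoding of formulas as nested tuples and covers only the connectives that appear in the examples; individual and predicate variables are treated alike, as the definitions require.

```python
def free_vars(f):
    """Set of variables (individual or predicate) with a free occurrence in `f`."""
    kind = f[0]
    if kind == "atom":                     # ("atom", predicate, term, ...)
        # '=' is a constant; every other symbol in this sketch is a variable.
        return {s for s in f[1:] if s != "="}
    if kind == "not":
        return free_vars(f[1])
    if kind == "implies":
        return free_vars(f[1]) | free_vars(f[2])
    if kind in ("all", "some"):            # ("all", variable, body)
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown formula kind: {kind}")

examples = [
    ("all", "x", ("atom", "=", "x", "x")),
    ("all", "x", ("atom", "F", "x")),
    ("some", "F", ("all", "x", ("atom", "F", "x"))),
    ("all", "G", ("implies",
        ("some", "F", ("atom", "F", "x", "y")),
        ("all", "z", ("implies", ("atom", "G", "z"), ("atom", "F", "x", "z"))))),
]
closed = [not free_vars(f) for f in examples]
```

The second example is open because the predicate variable F is free in it. The fourth is open because x and y are free, and so is the occurrence of F in the consequent, which lies outside the scope of the quantifier (∃F).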

Rather than following the procedure of earlier chapters and turning at this point to the semantics of F2, we shall now continue with the syntax of F2. As we shall later see, the semantics of second-order logic, unlike that of the sentential and first-order logic, presupposes a characterization of its axioms and rules of inference at a certain point.

Let A, B and C be formulas of F2; let v be any variable of F2; and let a and b be terms of F2. Then the axioms of F2 are those provided by the following schemata.

(a) A ⊃ (B ⊃ A)

(b) (A ⊃ (B ⊃ C)) ⊃ ((A ⊃ B) ⊃ (A ⊃ C))

(c) (∼B ⊃ ∼A) ⊃ ((∼B ⊃ A) ⊃ B)

(d) (v)(A ⊃ B) ⊃ (A ⊃ (v)B), where v has no free occurrences in A

(e) (v)A ⊃ B, where if v is an individual variable, then B differs from A at most in having occurrences of some term b where A has free occurrences of v, and none of these occurrences of b in B lie within the scope of any quantifier containing any individual variable occurring in b; and if v is an n-ary predicate variable, then B differs from A at most in having free occurrences of some n-ary predicate variable, or occurrences of some n-ary predicate constant, where A has free occurrences of v.

(f) (∃f)(a1)(a2) ... (an)(f a1 a2 ... an ≡ A), where n is any positive integer, f is any n-ary predicate variable, a1, a2, ..., an are individual variables, and A is any formula of F2 containing no free occurrences of the variable f.

(g) a = b ≡ (f)(fa ≡ fb), where a and b are terms, and f is any singulary predicate variable.

Axiom schemata (a)-(e) are obvious modifications of axiom schemata for the first-order predicate logic FO. Axiom schema (f), a comprehension axiom schema, says in effect that every formula A defines a relation among individuals (where n = 1, a class of individuals). With this schema, to establish the existence of a relation (or class), one has only to define that relation (or class). As illustrations of how this axiom schema is used, consider the following axioms provided by this schema:

(a) (∃F)(x)(Fx ≡ x ≠ x)
(b) (∃F)(x)(Fx ≡ x = x)
(c) (∃F)(x)(Fx ≡ ∼Gx)
(d) (∃F)(x)(Fx ≡ Gx ∨ Hx)
(e) (∃F)(x)(Fx ≡ Gx ∧ Hx)
(f) (∃F)(x)(y)(Fxy ≡ Gyx)

Axioms (a) and (b) assert the existence of a null (empty) class and a universal class (i.e., domain of individuals), respectively. And the universal closures of axioms (c) - (f) assert, respectively, that every class has a complement; that for any two classes, a union and an intersection exist; and that every relation has a converse.

Axiom schema (g), the schema for the identity symbol, differs from the schema for identity in the case of FO in that it is a biconditional rather than a conditional. This schema embodies Leibniz's well-known Principle of the Identity of Indiscernibles, according to which any two individuals that have all their properties in common are identical with one another. Because this schema is a biconditional and not merely a conditional, it permits us (once given that F2 is an extensional system of logic) to replace any simple context in which the identity symbol appears by an equivalent expression in which this symbol does not appear. It has the force, then, of a definition. We are able to state a definition-type axiom schema for identity in F2 only because of the presence in F2 of quantifiable predicate variables. Thus no such axiom schema could be stated within the first-order predicate logic.

It should come as no surprise that, given this axiom schema for identity, we can derive as a theorem and as a theorem schema of F2 the axiom and axiom schema for identity of FO; viz.,

x = x,

and

a = b ⊃ (A ⊃ B),

where a and b are terms of F2 , A and B are formulas of F2 , and B is obtained from A by replacing one (or more) occurrences of a by occurrences of b, where these occurrences of a do not lie within the scope of any quantifier containing any variable occurring within either a or b.
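The right-hand side of the identity schema, (f)(fa ≡ fb), says that a and b belong to exactly the same classes. How closely this tracks genuine identity depends on how many classes the singulary predicate variables range over, as the following sketch on a three-element domain suggests. The two collections of classes, `full` and `sparse`, are illustrative assumptions.

```python
from itertools import chain, combinations

def powerset(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def indiscernible(a, b, classes):
    """(f)(fa ≡ fb): a and b belong to exactly the same classes."""
    return all((a in c) == (b in c) for c in classes)

D = [0, 1, 2]
full = powerset(D)                                   # every class of individuals
sparse = [frozenset(), frozenset({0}), frozenset({1, 2}), frozenset(D)]

# With every class available, indiscernibility coincides with identity.
leibniz_holds = all(indiscernible(a, b, full) == (a == b) for a in D for b in D)
# With too few classes, 1 and 2 cannot be told apart although they are distinct.
collapsed = indiscernible(1, 2, sparse)
```

Both `leibniz_holds` and `collapsed` come out true. When the classes are too few, the relation picked out by the right-hand side of schema (g) is coarser than identity.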

As the primitive rules of inference of F2, we take those of F1 (pages 44-45), with two modifications. First, the rule of generalization is now stated as follows:

(c) From A, if v is any variable, one may infer (v)A.

Second, clause (d) of the definition of 'definitionally equivalent' (page 44) is now to be understood as applying to variables in general.

A theorem of F2 is any formula that is derivable from the axioms by means of the rules of inference. A class of formulas Γ is (syntactically) inconsistent if and only if there is some formula B of F2 such that both B and ∼B are derivable within F2 from Γ; otherwise, Γ is (syntactically) consistent. By virtue of our choice of axioms and rules, every theorem of FO is a theorem of F2 (for all systems FO and F2 such that F2 contains all of the symbols of FO). And further, it is not difficult to prove syntactically that F2 is consistent, in the sense that there is no formula A such that both A and ∼A are theorems of F2.

The deduction theorem holds for F2. However, it is known that the second-order logic is incomplete, in the sense that there is no effectively defined set of axioms and rules from which can be derived all formulas that are true in all those interpretations which we shall define as the principal interpretations of second-order logic. 2 The system F2, then, is incomplete with respect to principal interpretations. As we shall see, however, it is complete with respect to a certain wider class of interpretations.

2 This result follows from the main results of Gödel 1931.


Because F2 is incomplete in the above sense, one might consider the addition of further axioms (though, as we have remarked, this could never result in a system of second-order axioms that was consistent and complete with respect to all principal interpretations, provided that these axioms are effectively characterized). We shall not here add any further axioms, but merely refer the interested reader to the literature. 3

We turn now to the semantics of F2. An interpretation I of F2 consists of

(a) a non-empty domain D, over which the individual variables of F2 range;

(b) for each positive integer n, a non-empty domain of n-ary relations among the members of D, over which the n-ary predicate variables of F2 range;

(c) for each individual constant (if any) of F2, an assignment to that constant of some element from the domain D;

(d) for each n-ary predicate constant (if any) of F2, an assignment to that constant of some relation from the above domain of n-ary relations;

(e) for each n-ary operation symbol (if any) of F2, an assignment to that symbol of some n-ary operation among the individuals of D (included in the above domain of (n + 1)-ary relations).

An interpretation of the first-order predicate logic contains only one domain, over which the individual variables range. Interpretations of second-order predicate logic, however, contain infinitely many domains, one for each of the infinitely many types of variable appearing in the second-order logic. Suppose now that a given interpretation of F2 is such that, for every n, its domain of n-ary relations contains all of the n-ary relations among the members of its domain D. Further, suppose that that interpretation assigns to the predicate constant '=' the identity relation among the individuals of D. Then that interpretation is called a principal interpretation of F2. In addition to these principal interpretations, there are other non-principal interpretations, in which either not all of the domains of n-ary relations are complete in the above sense, or some relation other than the identity relation is assigned to the symbol '='.

3 See D. Hilbert and W. Ackermann 1950, Chapter IV, section 1; A. Church 1956, sections 56, 57; J.W. Robbin 1969, p. 145.

We need now the concept of an evaluation of the variables of F2 with respect to a given interpretation I of F2. Let this be a function E which assigns to each individual variable a of F2 a member of the domain D of I; and to each n-ary predicate variable f of F2 some n-ary relation from the domain of n-ary relations of I. With respect to a given interpretation I and evaluation E, each term a takes on a certain value. We shall use the symbol

E^I(a)

to stand for the value of a under E (and I); and we shall use the symbol

E^I(f)

to stand for the n-ary relation assigned to f by I if f is an n-ary predicate constant, and to stand for the n-ary relation assigned to f by E if f is an n-ary predicate variable.

Now let A be a formula of F2, I an interpretation of F2, and E an evaluation of F2 with respect to I.

(a) If A is an atomic formula f a1 ... an, then E satisfies A (with respect to I) if and only if the n individuals E^I(a1), ..., E^I(an) are related by the relation E^I(f).

(b) If A is a negation, conditional, disjunction, conjunction or biconditional, then the conditions that E satisfy A are the usual truth-functional conditions.

(c) If A is a universal generalization (v)B, then E satisfies A if and only if every v-variant of E satisfies B.

(d) If A is an existential generalization (∃v)B, then E satisfies A if and only if some v-variant of E satisfies B.
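For a finite interpretation, clauses (a) to (d) can be turned into a recursive procedure. The sketch below makes several assumptions that go beyond the text: formulas are nested tuples, an evaluation is a dictionary, a v-variant of E is taken to be an evaluation that agrees with E except possibly at v, and predicate constants and operation symbols are left out for brevity.

```python
from itertools import chain, combinations

def powerset(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

BINARY = {
    "implies": lambda p, q: (not p) or q,
    "or": lambda p, q: p or q,
    "and": lambda p, q: p and q,
    "iff": lambda p, q: p == q,
}

def satisfies(E, f, D, relations, pred_vars):
    """Does evaluation E satisfy formula f?

    D is the domain of individuals, relations[n] the domain of n-ary relations,
    and pred_vars maps each predicate variable to its arity.
    """
    kind = f[0]
    if kind == "atom":                              # clause (a)
        _, pred, *args = f
        return tuple(E[a] for a in args) in E[pred]
    if kind == "not":                               # clause (b)
        return not satisfies(E, f[1], D, relations, pred_vars)
    if kind in BINARY:                              # clause (b), binary connectives
        return BINARY[kind](satisfies(E, f[1], D, relations, pred_vars),
                            satisfies(E, f[2], D, relations, pred_vars))
    if kind in ("all", "some"):                     # clauses (c) and (d)
        _, v, body = f
        rng = relations[pred_vars[v]] if v in pred_vars else D
        # Each {**E, v: val} is a v-variant of E.
        results = (satisfies({**E, v: val}, body, D, relations, pred_vars)
                   for val in rng)
        return all(results) if kind == "all" else any(results)
    raise ValueError(f"unknown formula kind: {kind}")

D = [0, 1]
relations = {1: powerset([(d,) for d in D])}        # all classes, as sets of 1-tuples
pred_vars = {"F": 1}
everything = frozenset({(0,), (1,)})

closed_example = satisfies({}, ("some", "F", ("all", "x", ("atom", "F", "x"))),
                           D, relations, pred_vars)
open_true = satisfies({"F": everything}, ("all", "x", ("atom", "F", "x")),
                      D, relations, pred_vars)
open_false = satisfies({"F": frozenset()}, ("all", "x", ("atom", "F", "x")),
                       D, relations, pred_vars)
```

The closed formula (∃F)(x)Fx is satisfied by every evaluation, while the open formula (x)Fx is satisfied by an evaluation that assigns the whole domain to F and not by one that assigns the empty class.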

We are now in a position to distinguish, among the interpretations of F2, those that are sound. (It is at this point that we presuppose a formulation of the axioms and rules of F2 .)

Let I be an interpretation of F2 such that for every evaluation E with respect to I,

(a) each axiom of F2 is satisfied by E, and

(b) whenever a formula A is obtainable from given formulas by one application of a primitive inference rule of F2, then A is satisfied by E if those given formulas are satisfied by E.

Then we shall say that that interpretation I is a sound interpretation of F2. Not all of the interpretations of F2 are sound. From this point on we shall be concerned only with sound interpretations. All principal interpretations are sound, as is easy to show. We shall call those non-principal interpretations which are sound, secondary interpretations. There are interpretations of F2 which are secondary in this sense. In fact, it is known that there are secondary interpretations in which all domains are denumerably infinite. In a principal interpretation whose domain of individuals is denumerably infinite, the domains over which the predicate variables range are non-denumerably infinite. This is not always the case with respect to secondary interpretations, however.
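That not every interpretation is sound can be seen from a single instance of the comprehension schema. The sketch below checks only the axiom (∃F)(x)(Fx ≡ ∼Gx) against every evaluation of G, for two illustrative domains of classes over a two-element domain of individuals. Passing this check is a necessary condition for soundness, not a sufficient one, since soundness concerns all axioms and rules.

```python
def satisfies_complement_axiom(D, classes):
    """Is (∃F)(x)(Fx ≡ ∼Gx) satisfied by every evaluation of G in `classes`?"""
    return all(
        any(all((x in F) == (x not in G) for x in D) for F in classes)
        for G in classes)

D = {0, 1}
full = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
partial = [frozenset(), frozenset({0})]             # lacks the complements of its members
```

With `full` as the domain of classes the axiom is satisfied by every evaluation. With `partial` it is not: when G is evaluated as the empty class, no available class contains both individuals. An interpretation with `partial` as its domain of classes is therefore not sound.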

A formula is true under a given interpretation if and only if it is satisfied by every evaluation with respect to that interpretation.

We have distinguished among the class of sound interpretations a special kind of interpretation; viz., the principal interpretations. In terms of this distinction, a number of important semantic concepts can now be defined and interrelated. A formula of F2 is valid (or logically valid) if and only if it is true under all principal interpretations of F2 ; and a formula of F2 is satisfiable if and only if it is satisfied by at least one evaluation in some principal interpretation. From these definitions it readily follows that a formula is valid if and only if its negation is not satisfiable. Distinguishing now from these concepts, a formula is secondarily valid if and only if it is true under all sound interpretations; that is, true under all principal and under all secondary interpretations. And a formula is secondarily satisfiable if and only if it is satisfied by at least one evaluation in some sound interpretation; that is, by at least one evaluation in an interpretation which is either principal or secondary. It follows that a formula is secondarily valid if and only if its negation is not secondarily satisfiable.

How now are validity and satisfiability related to secondary validity and secondary satisfiability? It is immediately clear that on these definitions all satisfiable formulas are secondarily satisfiable, and that all secondarily valid formulas are valid. It is known, however, that the converses of these results do not hold. Not every formula that is secondarily satisfiable is satisfiable; that is, a formula can be satisfiable in some sound interpretation without being satisfiable in any principal interpretation. In this case, that formula is satisfiable only under secondary interpretations. Similarly, not every formula that is valid is secondarily valid; that is, a formula can be true under every principal interpretation, though not true under every sound interpretation. In this case, that formula fails to be true under some secondary interpretation.

A model of a set of formulas Γ of F2 is any sound interpretation of F2 under which all of the formulas of Γ are true. If the interpretation is a principal interpretation, we call the model a principal model; otherwise, a secondary model.

Let us now use the expression

Γ ⊨ A

to mean that A holds in all models of Γ; and thus the expression

⊨ A

to mean that A is secondarily valid.

By virtue of the definition of 'sound interpretation,' it is trivially true that all theorems of F2 are secondarily valid, and thus also valid. That is, if ⊢ A, then ⊨ A. Further, if Γ ⊢ A, then Γ ⊨ A. In this sense, the axiomatic basis for F2 is sound.

The situation with respect to completeness now is as follows. The principal completeness theorem for the first-order predicate logic F1 is that every consistent set of formulas is true under some interpretation. The direct analogue to this for F2 would be that every consistent set of formulas of F2 has a principal model. This analogue, however, does not hold (as follows from Gödel's results of 1931). In this sense, the second-order logic is incomplete. The following weak completeness theorem, however, has been proved by L. Henkin: 4

4 L. Henkin 1950. Henkin's completeness proof appears also in A. Church 1956, section 54; and is sketched in J.W. Robbin 1969, section 47.


(1) Every (syntactically) consistent set of formulas Γ of the second-order predicate logic has a model; that is, the formulas of Γ are all true under some sound interpretation of the second-order logic.

Indeed, Henkin's proof, which resembles his completeness proof for the first-order predicate logic considered in Chapter III, shows that if Γ has any models at all, it will have models in which all domains are denumerably infinite. As Henkin remarks, this is a second-order generalization of the Löwenheim-Skolem theorem for first-order logic. For systems of second-order logic with identity, such as F2, if Γ has any models at all it will have models whose domains are all either finite or denumerably infinite.

In Chapter III we drew attention to a number of corollaries that followed from the completeness theorem for the first-order predicate logic. Their (weak) analogues for the second-order logic follow from the above (weak) completeness theorem. Thus, for example,

(2) If Γ ⊨ A, then Γ ⊢ A. That is, if A holds in every model (principal or secondary) of Γ, then A is derivable from Γ.

(3) If ⊨ A, then ⊢ A. That is, if A holds in all sound interpretations (principal or secondary) of F2, then A is a theorem of F2.

These results do not hold, however, if we understand the symbol '⊨' only in terms of principal interpretations and principal models. That is, within F2 there are formulas A and sets of formulas Γ such that though A holds in all principal models of Γ, A is not derivable from Γ. Because of the weak completeness of F2, we see that this must be because A fails to hold in some secondary model of Γ. We shall see this state of affairs illustrated in Chapter V, which concerns the arithmetic of the natural numbers. Similarly, there are formulas of F2 which are valid, but are not theorems of F2. Again, because of the weak completeness of F2, we see that this must be because these formulas fail to hold in all secondary interpretations of F2.

Syntactical procedures, then, do not have the adequacy within second-order logic that they have within first-order logic. If the concept of second-order logical truth is understood in the sense of truth in all principal second-order interpretations, then second-order logical truth cannot be characterized through the syntactical devices of effectively defined axioms and rules of inference. Nor can the concept of a formula A's logically following from a class of formulas Γ be syntactically characterized in the case of second-order logic, if this concept is understood in the sense of A's holding in all principal models of Γ. All of this seems somewhat out of step with various rather widely accepted notions of what the essence of logic is supposed to be. What it implies is that semantic procedures have a priority and indispensability in the case of second-order logic which they do not have in the case of first-order and sentential logic.

Because the first-order predicate logic is undecidable, it follows a fortiori that the second-order logic is undecidable, in the sense that the class of valid formulas of second-order logic is not an effectively defined class. (Nor is the class of theorems of F2 a decidable class. See Chapter VIII, p. 222.) It has been shown, however, that the class of second-order valid formulas in which all predicate variables and constants are singulary is a decidable class, as in the case of the first-order predicate logic.

Before going on to a consideration of second-order theories, it should be pointed out that the first- and second-order predicate logics can be regarded as only the first two steps in a possible infinite hierarchy of logics. Thus, logicians have defined a third-order logic; a fourth-order logic; indeed, for every positive integer n, an nth-order logic. The third-order logic results from introducing predicate variables and constants which take as argument expressions the variables and non-logical constants of the second-order logic. These new variables can appear within quantifiers. In general, the (n + 1)st-order logic results from the nth-order logic by adding predicate variables and constants which take as argument expressions the variables and non-logical constants appearing within the nth-order logic, where these new variables can appear within quantifiers. The union of the whole infinite hierarchy of such logics is the so-called (simple) theory of types. The expressive power of any logic of order n is, of course, greater than that of any lower-order logic. For example, if a suitable axiom set of either first- or second-order for arithmetic is added to a suitable fourth-order logic, we can develop mathematical analysis within the resulting system, by virtue of being able to define the rational, real and complex numbers in terms of the natural numbers, together with the higher-order classes and relations provided by the fourth-order logic itself.

4.3. Second-Order Theories

A second-order theory, obviously, is any theory whose underlying logic is a second-order logic. Some of the terminology earlier defined for elementary theories can be carried over to second-order theories without change. Principally because within second-order logic we have distinguished two kinds of interpretations and models (principal and secondary), however, some new distinctions and new terminology are called for. The reader is hereby advised, however, that the terminology used to mark these distinctions varies from writer to writer.

We have been using the word 'theorem' in the syntactic sense of 'formula derivable from axioms by rules of inference.' We shall continue to use this term in this sense. Because of the soundness and weak completeness theorems for second-order logic, the theorems of a second-order theory coincide with those formulas of that theory which hold in all of its models. We need, however, a term to stand for a formula of a theory if and only if that formula holds in all principal models of the theory. Now the term 'theorem' is, indeed, itself sometimes used for this purpose. We are not here following that usage; let us, however, for this purpose speak of a theorem in the semantic sense, or a semantic theorem. We shall be using the unmodified term 'theorem,' then, in the sense of 'theorem in the syntactic sense,' or 'syntactic theorem.' The semantic theorems of a second-order theory will then be determined solely by its non-logical axioms. The (syntactic) theorems of a second-order theory, however, are determined by both its logical and non-logical axioms. Since the choice of logical axioms for any particular formulation of second-order logic is somewhat arbitrary, the class of theorems of a second-order theory is a somewhat 'accidentally' defined class, unlike the in general more comprehensive class of semantic theorems of a second-order theory.

The only sense of completeness for second-order theories which is of importance is a certain semantic sense of completeness. Let us say that a second-order theory T is complete in the semantic sense, or semantically complete, if and only if every sentence A of T is such that either A or ~A is a semantic theorem of T; that is, either A holds in all principal models of T, or ~A holds in all principal models of T. As we shall see, there are a number of important second-order theories which are complete in this sense, including second-order arithmetic and the second-order algebra of real numbers.

Turning now to the concepts of isomorphic models and categoricity, these will here be defined for second-order theories only with respect to principal models.

The definition of 'isomorphic models' parallels that for F1, with an added clause guaranteeing that operations are preserved under isomorphism. (This added clause applies also to elementary theories with operation symbols.) Two principal interpretations I and I' of F2 are isomorphic models of a theory T if and only if:

(a) I and I' are principal models of T;

(b) there is a one-to-one correspondence G between the individual domains of I and I', associating with each element x of the domain of I an element G(x) of the domain of I';

(c) for each individual constant a in T, G(I(a)) = I'(a);

(d) for each n-ary predicate constant f in T, and for all individuals x1, ..., xn in the domain of I, the n-tuple of individuals (x1, ..., xn) is an element of I(f) if and only if the n-tuple of individuals (G(x1), ..., G(xn)) is an element of I'(f); and

(e) for each n-ary operation symbol o in T, and for all individuals x1, ..., xn in the domain of I, G(I(o)(x1, ..., xn)) = I'(o)(G(x1), ..., G(xn)). That is, to the result of applying the operation assigned to o by the first model to the individuals x1, ..., xn, the correspondence G correlates the result of applying the operation assigned to o by the second model to the n individuals G(x1), ..., G(xn).
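For readers who find a computational picture helpful, clauses (b) through (e) can be checked mechanically when both individual domains are finite. The following Python sketch is only an illustration under that assumption: the dictionary layout, the function name is_isomorphism, and the two small interpretations are hypothetical and not part of the text. Clause (a), that both interpretations are principal models of T, is not checked.

```python
from itertools import product

def is_isomorphism(G, I, J):
    """Check clauses (b)-(e) for two finite interpretations I and J.

    G is a dict from the domain of I to the domain of J. Each interpretation
    is a dict with keys "domain", "constants" (name -> element),
    "predicates" (name -> (arity, set of tuples)) and
    "operations" (name -> (arity, function)).
    """
    dom_I, dom_J = I["domain"], J["domain"]
    # (b) G must be a one-to-one correspondence between the two domains.
    if set(G) != set(dom_I) or set(G.values()) != set(dom_J):
        return False
    if len(set(G.values())) != len(G):
        return False
    # (c) Individual constants are carried to individual constants.
    for a, value in I["constants"].items():
        if G[value] != J["constants"][a]:
            return False
    # (d) An n-tuple is in I(f) exactly when its image is in J(f).
    for f, (n, extension) in I["predicates"].items():
        extension_J = J["predicates"][f][1]
        for xs in product(dom_I, repeat=n):
            if (xs in extension) != (tuple(G[x] for x in xs) in extension_J):
                return False
    # (e) Operations are preserved under G.
    for o, (n, op) in I["operations"].items():
        op_J = J["operations"][o][1]
        for xs in product(dom_I, repeat=n):
            if G[op(*xs)] != op_J(*(G[x] for x in xs)):
                return False
    return True

# Two hypothetical two-element interpretations: arithmetic mod 2 and parities.
I = {"domain": {0, 1}, "constants": {"e": 0},
     "predicates": {"P": (1, {(1,)})},
     "operations": {"+": (2, lambda x, y: (x + y) % 2)}}
J = {"domain": {"even", "odd"}, "constants": {"e": "even"},
     "predicates": {"P": (1, {("odd",)})},
     "operations": {"+": (2, lambda x, y: "even" if x == y else "odd")}}

G = {0: "even", 1: "odd"}          # satisfies clauses (b)-(e)
G_swapped = {0: "odd", 1: "even"}  # violates clause (c)
```

Here is_isomorphism(G, I, J) holds, while the swapped correspondence fails at the clause for individual constants.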


A second-order theory T now is categorical if and only if it is consistent and all of its principal models are isomorphic. As important examples of such theories, second-order arithmetic and algebra are categorical in this sense.

If a second-order theory is categorical, then all of its principal models are alike in structure. A sentence of a categorical second-order theory T, then, either holds in all of the principal models of T, or in none. Thus, a categorical second-order theory is complete in the semantical sense. Because second-order arithmetic and algebra are categorical, then, they are semantically complete.

For a given second-order theory T one may single out some one interpretation of T as the intended interpretation, or intended model, of T. As in the case of first-order theories, one then often speaks of all models of T which are isomorphic with that model as standard models of T; and of all remaining models as non-standard models.

4.4. Theory of Definition

Within the development of a formalized theory, it is often very convenient to introduce new symbols by means of definition. In order to be formally satisfactory, definitions must meet a number of requirements. We shall now conclude this chapter by presenting the usual requirements, and ways of meeting these requirements, that logicians impose upon definitions. 5

One requirement that we want to impose upon definitions is that they be stated in such a way as to permit us always to eliminate the expressions being defined from any simple context in which they appear. This requirement includes within it the popular intuitive requirement that definitions not be circular, and is known as the requirement of eliminability. It is because of this requirement that we are able to say that definitions, though very helpful and indeed often indispensable in practice - in order to keep our notation down to manageable size - are never necessary in principle. Whatever can be said with the help of definitions that meet the requirement of eliminability can be said without the help of those definitions. Whatever can be said by drawing upon both primitive and defined terminology, that is, can also be said by drawing upon only primitive terminology. If this requirement were not met by a particular definition, we would have to say that that definition was really introducing not a defined term but a new primitive term; and thus that it was not really a definition at all.

5 For an advanced treatment of the problem of definability, see Tarski's 'Some Methodological Investigations on the Definability of Concepts' (1935), which is paper X in A. Tarski 1956. For additional illustrations of the concepts and results of the following discussion, see P. Suppes 1957, Chapter 8.

A second requirement that we want to impose upon definitions is that they not permit us to prove any formulas not containing the terms being defined by these definitions, unless these formulas can be proved without the help of these definitions. A definition must not increase the number of assertions within a theory which are expressed within notation not containing the term being defined, that is. Were a definition to do this, it would really be an axiom with creative power. This second requirement is the so-called requirement of non-creativity. It follows from this requirement that if a given theory is consistent, then the result of adding definitions to that theory is always itself a consistent theory. The addition of definitions meeting this requirement to a theory cannot result in contradictions within the resulting theory unless those contradictions are already present in the original theory. For if the resulting theory is contradictory, then every formula within that theory is provable as a theorem. Unless the original theory is already contradictory, then, this will represent an obvious increase in the number of theorems in primitive notation.

We shall here consider definitions as non-creative axioms. Thus, the non-logical axioms of a theory are to be thought of as being of two sorts: (1) the creative axioms, and (2) the non-creative axioms (or definitional axioms). Further, we shall say that when a definition, or a series of definitions, is added to a theory T, the result is a second theory, T1. T1, we shall say, is a definitional extension of T. The non-logical constants of T1 are those of T, together with the constant, or constants, being introduced by definition; and the axioms of T1 are those of T, together with the definition, or definitions, being added as non-creative axioms. In general, whenever we add definitional axioms to a theory Tn, the result is a new theory Tn+1, which is a definitional extension of Tn. And since these axioms will have to meet the requirement of non-creativity, this extension Tn+1 will be a conservative extension of Tn, in the sense that every theorem of Tn+1 which involves only constants of Tn is already a theorem of Tn.

We are now in a position to state the above two requirements in exact form. 6 Let T be any arbitrary theory; let k be any non-logical constant not belonging to T; let A be any formula containing k, and no other non-logical constants other than those appearing in T; and let T1 be the theory which results by adding A to the axioms of T. Then, our first requirement is stated as follows:

A, considered as a definition of k relative to T, meets the requirement of eliminability if and only if for every formula B of T1, there is a formula C of T1 such that (a) C does not contain k, and (b) all closures of B ≡ C are theorems of T1.

If A meets the requirement of eliminability, then, every formula B in T1 which contains k is provably equivalent in T1 to some formula C of T1 which does not contain k. By means of C, then, we can eliminate k from B.

Our second requirement is stated as follows:

A, considered as a definition of k relative to T, meets the requirement of non-creativity if and only if every theorem of T1 which does not contain k is a theorem of T.

We need now to lay down rules for constructing definitions, such that all definitions constructed in accordance with these rules will meet the above two requirements. The rules we shall present are standard. We shall need a rule for each type of symbol for which we intend to permit definitions; viz., a rule for predicate constants, a rule for operation symbols, and a rule for individual constants.

Consider a definition which is to be added to a theory T. If that definition is introducing an n-ary predicate constant, it is to be of the following form:

(a1)(a2) ... (an)(f a1 ... an ≡ A),

where (a) f is the n-ary predicate constant, (b) a1, ..., an are distinct individual variables, and (c) A is a formula appearing within the theory T, in which no variables other than a1, ..., an appear freely. By hypothesis, the predicate constant f is a new symbol, and thus does not appear within T; therefore, it will not appear within the formula A.

6 These two requirements were first exactly stated by the Polish logician S. Lesniewski.

If a definition introduces an n-ary operation symbol, it is to have one of the following two forms:

(1) (a1) ... (an)(b)(o a1 ... an = b ≡ A),

where (a) o is the n-ary operation symbol being defined, (b) a1, ..., an, b are distinct individual variables, (c) A is a formula within T, in which no variables other than a1, ..., an, b appear freely, and (d) the uniqueness formula

(a1)(a2) ... (an)(∃c)(b)(b = c ≡ A),

where c is an individual variable distinct from b, a1, ..., an, is provable within T;

(2) (a1) ... (an)(o a1 ... an = b),

where (a) o is the n-ary operation symbol, (b) a1, ..., an are distinct individual variables, and (c) b is a term within T, in which no variables other than a1, ..., an appear freely.

Notice that in case (1), where we define an operation symbol by means of a biconditional, there appears a certain uniqueness formula. The importance of this formula needs to be stressed. According to this formula, for any given values of the variables a1, ..., an, the defining condition A is satisfied by exactly one value of b, neither more nor less. If this formula is not provable in T, then the requirement of non-creativity will be violated. Consider a 'definition' for which the uniqueness formula is not provable in T; e.g., the following 'definition' of the algebraic binary operation symbol '∘':

(x)(y)(z)(x ∘ y = z ≡ x < z ∧ y < z).


It follows from this 'definition,' for example, that

2 ∘ 3 = 4,

and that

2 ∘ 3 = 5,

and thus, by the logic of identity, that

4 = 5,

which contradicts the fact that

4 ≠ 5.

What has happened here is that for given values of 'x' and 'y', there exists more than one value of 'z' which satisfies the defining condition A; viz., 'x < z ∧ y < z'. As a result, adding this definition to any consistent algebraic theory T would result in a contradictory theory T1. Because T1 would be contradictory, every formula of T1 would be a theorem of T1, and the requirement of non-creativity would obviously be violated. For a definition of type (1) to be satisfactory, then, we must be able to prove within T that for any given values of the variables a1, ..., an, there exists at most one value of the variable b which satisfies A.
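The failure of uniqueness can be made concrete by listing, over a small range of natural numbers, the values that satisfy a defining condition. The Python sketch below is illustrative only: the helper witnesses, the search range, and the contrasting condition z = x + y are assumptions introduced here, and a finite search of this kind is not a proof of the uniqueness formula within T.

```python
def witnesses(condition, args, candidates):
    """Return every candidate b for which condition(*args, b) holds."""
    return [b for b in candidates if condition(*args, b)]

def faulty_condition(x, y, z):
    # Defining condition of the faulty 'definition': x < z and y < z.
    return x < z and y < z

def sum_condition(x, y, z):
    # A contrasting condition (an illustration, not from the text): z = x + y.
    return z == x + y

search_range = range(10)  # arbitrary finite range of natural numbers

faulty = witnesses(faulty_condition, (2, 3), search_range)
proper = witnesses(sum_condition, (2, 3), search_range)
```

For the arguments 2 and 3 the faulty condition is satisfied by both 4 and 5 (among others), whereas the contrasting condition is satisfied by 5 alone within the range searched.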

We must also be able to prove within T that, for any given values of a1, ..., an, there exists at least one value of b which satisfies A. If this is not provable within T, once again the requirement of non-creativity will be violated. For once we add our definition to T, within T1 we will be able to prove as follows that there exists at least one such value of b satisfying A. By the logic of quantification, from our definition the formula

o a1 ... an = o a1 ... an ≡ B,

readily follows, where B results from A by replacing all free occurrences of b by occurrences of the term o a1 ... an. By the logic of identity and sentential logic, B follows. But, again by the logic of quantification,

(a1) ... (an)(∃b)A

then follows. Now this is a formula within T, according to which for any given values of a1, ..., an there exists at least one value of b which satisfies A. Thus, unless this formula is already provable in T, the addition of our definition to T will increase the number of theorems of T, and thereby violate the requirement of non-creativity. But this formula will be provable in T if the above uniqueness formula is provable in T.

In case (2), where we use an identity formula rather than a biconditional, the uniqueness condition would be stated by the formula

(a1) ... (an)(∃c)(d)(d = c ≡ d = b),

where c and d are distinct from each other and from a1, ..., an. This formula will always be a theorem within the logic of identity, however, and therefore it would be superfluous for us to require explicitly that it be provable within T. In this second case, uniqueness is automatically provided for.

Finally, if our defining formula is introducing an individual constant, it is to have one of the following two forms:

(1) (b)(c = b ≡ A),

where (a) c is the individual constant being defined, (b) b is an individual variable, (c) A is a formula appearing within T, in which no variable other than b appears freely, and (d) the uniqueness formula

(∃d)(b)(b = d ≡ A),

where d is an individual variable distinct from b, is provable within T;

(2) c = b,

where (a) c is the individual constant being defined, and (b) b is a term within T in which no variables appear freely.

Again, in case (1) there appears a uniqueness formula which guarantees that the defining condition A is satisfied by exactly one individual. This formula must be provable in T if the definition introducing c is to be added to T. And, again, in case (2) the uniqueness formula which applies will be a theorem of logic; it is therefore unnecessary for us to require explicitly that this formula be provable within T.

A few illustrations of these definitions will prove helpful. As an example of a definition introducing a binary predicate constant, we have the following formula, stated within a language for the arithmetic of the non-negative integers (and using informal notation):

(x)(y)(x > y ≡ (∃z)(z ≠ 0 ∧ x = y + z)).

As an example of a biconditional introducing a singulary operation symbol, we have the formula

(x)(y)(x² = y ≡ y = x · x),

which introduces the symbol for the operation of squaring. This same symbol can be defined through an identity formula; viz., the formula

(x)(x² = x · x).

As a biconditional introducing an individual constant, we have the formula

(x)(2 = x ≡ x = 1 + 1).

And this constant can also be defined through the identity formula

2 = 1 + 1.
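As a rough sanity check on the first two illustrations, one can evaluate their defining conditions over a small initial segment of the non-negative integers. The Python sketch below assumes an arbitrary bound LIMIT and hypothetical helper names. It tests only finitely many cases and so establishes nothing about provability within a theory.

```python
LIMIT = 20  # arbitrary bound on the integers examined

def greater_by_definition(x, y):
    # Definiens of '>': there is a z with z != 0 and x = y + z.
    # Over the non-negative integers any such z is at most x.
    return any(z != 0 and x == y + z for z in range(x + 1))

# The definiens agrees with the usual 'greater than' on every pair examined.
agrees = all(
    greater_by_definition(x, y) == (x > y)
    for x in range(LIMIT)
    for y in range(LIMIT)
)

def square_witnesses(x):
    # Values y satisfying the defining condition y = x * x for squaring.
    return [y for y in range(LIMIT * LIMIT) if y == x * x]

# For each x examined, exactly one y satisfies the defining condition.
unique = all(len(square_witnesses(x)) == 1 for x in range(LIMIT))
```

Both agrees and unique come out true on this range, which is what the uniqueness formula for the biconditional definition of squaring would lead one to expect.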

In some circumstances, one is not able to define a particular operation symbol, or individual constant, by means of an identity, but has to use a biconditional-type definition. This is necessary whenever T contains no operation symbols, or individual constants. Thus, for example, suppose that T is an algebraic theory containing no individual constants. Then clearly one cannot introduce symbols for the numbers zero and one by identity-type definitions. By means of the following biconditional-type definitions, however, such symbols could be introduced:

(x)(x = 0 ≡ (y)(y + x = y)),

(x)(x = 1 ≡ (y)(y · x = y)).

Further examples of definitions will appear in subsequent chapters.

Given the axioms of a theory T, one can ask whether any particular one of these axioms is independent of the remaining axioms, in the sense that it cannot be derived as a theorem within the theory whose non-logical axioms are these remaining axioms. As we have earlier pointed out, one way of showing that a particular axiom A is independent of the remaining axioms of T is to give two interpretations, one which is a model of T, and one which is a model of the remaining axioms of T, but in which the axiom A does not hold. Now a similar situation holds with respect to definability. Given the non-logical constants of a theory T, one can ask whether any particular one of those constants can be defined in terms of the remaining constants. Let T be a theory, and let k be one of the non-logical constants of T. Let T - k be the theory whose non-logical constants are those of T, except for k; and whose theorems are the theorems of T, except for those theorems in which k appears. Then we shall say that k is definable within T if and only if within T there is a theorem which is a formula in accordance with the above rules for being a definition of k with respect to the theory T - k.

One can show, then, that k is definable within T by producing such a theorem of T as just described. How could we show that k is not definable within T? We have an answer to this problem in Padoa's Principle (A. Padoa, 1900). According to this principle, to show that k is not definable in T it suffices to give two models of T, which differ only in what they assign to the constant k. The intuitive argument in support of Padoa's principle is as follows. If k is definable within T, then the above described theorem exists within T. Such a theorem will be (the closure of) an identity or a biconditional, in which the constant k does not appear to the right of the identity or biconditional symbol. Thus any two models of T which are alike in every respect except possibly what they assign to k, will have to be alike in what they assign to k also; otherwise this theorem would not hold in both of these two models. It follows, then, that if we define two models of T which differ only in how they interpret k, we have thereby shown that k is not definable within T.

CHAPTER V

THE NATURAL NUMBERS

5.1. Introduction

In this chapter and the following chapter we shall present and discuss various approaches to the two most important kinds of numbers in all of mathematics; viz., the natural numbers and the real numbers. By the term 'natural numbers' we here mean the numbers:

0, 1, 2, 3, 4, ...

Thus, we are here including the number zero among the natural numbers, as the first of these numbers. This is in accordance with the usage of most logicians, though many mathematicians do not include the number zero among the natural numbers, but take the number one to be the first of these numbers.

The natural numbers are the numbers we use for counting things, where the things being counted are finitely many in number. Familiar facts about those numbers include the fact that there is a first natural number; the fact that each natural number has a successor, and except for the case of 0, each natural number is itself the successor of some other natural number; and the fact that each natural number can be reached by starting with the number 0, then adding 1 to it, then adding 1 to the number thereby obtained, then adding 1 to the number obtained by this last addition, and so on, for some finite number of additions of 1. Not quite so familiar, to the average man at any rate, is the fact that there is no last natural number - 'infinity' - which we reach after we have added 1 in this way many, many times. To startle the mind uninitiated to mathematics it is not necessary to go any further in mathematics than to a consideration of certain elementary facts about the natural numbers, such as the fact that there is no last one of them. There are, of course, infinitely many natural numbers, though each of them is finite. This is itself a surprising fact, for one might suppose at first that there could be only finitely many finite numbers. As a further surprise for the beginner, there is the 'peculiar' fact that there are just as many even (or odd) natural numbers as there are even and odd numbers taken together. One can readily prove that this is so as follows: pair off each natural number with its double, so that there is a one-to-one correspondence between the natural numbers n and their doubles 2n. Given any natural number, we find its mate by taking the double of that number; and given any double of a natural number, its mate is that natural number itself. Because there is a one-to-one correspondence between the whole set of natural numbers and the set of even natural numbers, we may say that there are just as many numbers in the latter set as there are in the former. When we are dealing with an infinite class of things, as we are in the case of the natural numbers, the principle that the whole is greater than any of its parts does not always hold true.
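The pairing of each natural number with its double can be written out as a pair of mutually inverse functions. The short Python sketch below is merely an illustration. The names double and halve are chosen here for convenience, and the check covers only finitely many numbers, whereas the argument in the text covers all of them.

```python
def double(n):
    """Mate of the natural number n among the even natural numbers."""
    return 2 * n

def halve(m):
    """Mate of the even natural number m among all natural numbers."""
    if m % 2 != 0:
        raise ValueError("only even natural numbers have a mate")
    return m // 2

# The first few pairs (n, 2n) of the correspondence.
pairs = [(n, double(n)) for n in range(5)]

# Going from a number to its mate and back returns the original number.
round_trip_ok = all(halve(double(n)) == n for n in range(1000))
```

Since each function undoes the other, no natural number and no even natural number is left without a mate, and none is used twice.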

We can think of the natural numbers as located on a half-line. The number zero is located at the origin of the half-line; the number one is located at one unit of length to the right of the origin; and, in general, the number n is located at that point on the half-line that is n units of length to the right of the origin. Each particular natural number, then, will be located on the half-line at a point which is at some finite distance from the origin. Though there are infinitely many such numbers and points, none of them is at an 'infinite' distance from the origin; rather, each is at some finite distance from the origin. And, of course, for every natural number, there are infinitely many other natural numbers that are located on the half-line at some greater distance from the origin than it is.

The set of natural numbers, taken as ordered by its less than relation, is an example of an infinite discrete series. Each natural number has an immediate successor; and, except for zero, each natural number has an immediate predecessor. The set of all integers - the positive integers, negative integers, and zero - taken as ordered by its less than relation, is also a discrete series, in which every element has both an immediate predecessor and an immediate successor. In contrast with these series, the set of all rational numbers, and the set of all real numbers, taken as ordered by their less than relations, are not discrete series. Rather, they are examples of dense series, where a dense series is one such that between any two of its elements there is another element. No rational number, then, nor any real number, has an immediate predecessor or an immediate successor.
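For the rational numbers, one convenient way to exhibit an element lying between two given elements is to take their average. The Python sketch below uses the standard fractions module for exact arithmetic. The choice of the midpoint, and the particular fractions used, are assumptions made for illustration, since the text requires only that some intermediate element exist.

```python
from fractions import Fraction

def between(p, q):
    """Return a rational strictly between the distinct rationals p and q."""
    return (p + q) / 2

p, q = Fraction(1, 3), Fraction(1, 2)
m = between(p, q)  # the midpoint of 1/3 and 1/2

# Repeating the step yields rationals ever closer to p, none equal to p,
# which illustrates why p has no immediate successor.
steps = []
upper = q
for _ in range(5):
    upper = between(p, upper)
    steps.append(upper)
```

Each element of steps lies strictly between p and the element before it, so no candidate for an immediate successor of p survives.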

As a further introductory observation, we remind the reader again that not all infinite sets are of the same size. There are infinite sets of different sizes, and they can be compared in magnitude. In particular, the set of all natural numbers is of the smallest size among infinite sets; no infinite set is of smaller size than the set of natural numbers. This set is said to be denumerably infinite. The set of all integers, and the set of all rational numbers, are also denumerably infinite in size, in the sense that the members of each of these sets can be put into one-to-one correspondence with the natural numbers. The set of all real numbers, however, is larger than these sets; there are more real numbers, that is, than there are natural numbers. The set of all real numbers, therefore, is said to be non-denumerably infinite in size.
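To make the claim about the integers concrete, here is one possible one-to-one correspondence between the natural numbers and the integers, sketched in Python. The particular zigzag enumeration (0, -1, 1, -2, 2, and so on) and the function names are choices made here for illustration. The text asserts only that some such correspondence exists.

```python
def nat_to_int(n):
    """Send 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ..."""
    if n % 2 == 0:
        return n // 2
    return -((n + 1) // 2)

def int_to_nat(z):
    """Inverse of nat_to_int."""
    if z >= 0:
        return 2 * z
    return -2 * z - 1

first_integers = [nat_to_int(n) for n in range(7)]
```

Because each function reverses the other, every integer is the mate of exactly one natural number under this enumeration.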

5.2. Elementary Arithmetic: The Theory N

We now present a certain well-known elementary axiomatic theory for the arithmetic of the natural numbers. Let us call the particular theory we are about to consider the theory N, or first-order Peano arithmetic.1 Its underlying logic is an elementary predicate logic with identity and operation symbols FO, in which the non-logical constants are the following four constants:

0 S + ·

1 For a development of this theory and proofs of its principal metamathematical properties, see E. Mendelson 1964, Chapter 3; and J.R. Shoenfield 1967, Chapter 8.

The first of these constants is an individual constant; the second is a singulary operation symbol; and the third and fourth are binary operation symbols. The non-logical axioms of N are the following axioms:

1. (x)(y)(Sx = Sy → x = y)
2. (x)(0 ≠ Sx)

In place of a third axiom, we have the following axiom schema:

3. If A is any formula of N, a is an individual variable of N, A1 results from A by replacing all free occurrences of a in A by occurrences of '0', and A2 results from A by replacing all free occurrences of a in A by occurrences of Sa, then the closure of

(A1 ∧ (a)(A → A2)) → (a)A

is an axiom of N.

4. (x)(x + 0 = x)
5. (x)(y)(x + Sy = S(x + y))
6. (x)(x · 0 = 0)
7. (x)(y)(x · Sy = (x · y) + x)

Now that we have specified its non-logical constants and axioms, and its underlying logic, the syntax of theory N is completely determined. In particular, the class of formulas of N and the class of theorems of N are fixed.

On the intended interpretation of N, (a) the individual domain is the domain of natural numbers, (b) the individual constant '0' designates the natural number zero, (c) the singulary operation symbol 'S' designates the operation the (immediate) successor of, and (d) the binary operation symbols '+' and '·' designate the binary operations of addition and multiplication respectively.

On the intended interpretation of N, axiom 1, a kind of law of subtraction, states that if the successors of x and y are identical, then x and y are themselves identical. The converse is, of course, also true; but we do not need to add it as an extra axiom, since it is a theorem of the logic of identity (following almost immediately from our axiom schema for identity, in Chapter III, taking A to be 'Sx = Sx', B to be 'Sx = Sy', a to be 'x', and b to be 'y'). Axiom 2 simply states that zero is not the successor of any natural number. Our third axiom, of course, is really an infinite number of axioms, one for each formula of N. This is the axiom schema for mathematical induction (or weak induction). What this schema says is that for each condition statable within N, if (a) 0 satisfies that condition, and (b) whenever any natural number satisfies that condition then its immediate successor does, then all natural numbers satisfy that condition. We shall illustrate the use of this induction schema later.

Axioms 4 and 5 are recursive axioms for addition. Axiom 4 gives us the result of adding the first natural number to some number x; and axiom 5 gives us the result of adding the successor of some number y to x, in terms of the result of adding y itself to x. Similarly, axioms 6 and 7 are recursive axioms for multiplication: axiom 6 gives us the result of multiplying some number x by the first natural number, while axiom 7 gives us the result of multiplying x by the successor of some natural number y, in terms of the result of multiplying x by that number y itself. Because of the recursive nature of these axioms, they should remind the reader both of the recursive syntactical definitions of 'formula' from Chapters I and II, and of the schema for mathematical induction. In each of these instances, we are concerned (a) with a 'first' case, and (b) with the transition from any arbitrary case to the next case.

As an illustration of the recursive nature of these four axioms, consider how the two axioms on addition permit us to determine the value of the sum of the numbers three and two. For the sake of this illustration, let us use the following definitions:

2 = SS0
3 = SSS0
5 = SSSSS0


We now have the following line of reasoning:

(1) 3 + 2 = 3 + 2
(2)       = 3 + SS0
(3)       = S(3 + S0)
(4)       = SS(3 + 0)
(5)       = SS3
(6)       = SSSSS0
(7)       = 5

In the transitions from steps (2) and (3) to steps (3) and (4), respectively, we use axiom 5, which permits us to compute the value of adding the successor of some number to a given number, in terms of the value of adding that number itself to that given number. And in the transition from step (4) to step (5), we use axiom 4, which gives us the value of adding zero to any number, in terms of that number itself. As a result of applying these two axioms, and using the definitions of the numerals involved, we are able to compute the value of the sum 3 + 2, and to express that value in such a way as not to draw upon the symbol for addition (or the symbol for successor).
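The way axioms 4 through 7 drive such a computation can be mimicked by a pair of recursive functions. In the Python sketch below, numerals are represented as strings consisting of occurrences of 'S' followed by '0'. That representation, and the names S, add, mul and numeral, are assumptions made for illustration. The functions rewrite terms in the manner of the axioms but are of course not derivations within N.

```python
ZERO = "0"

def S(n):
    """Successor: prefix one more 'S' to the numeral n."""
    return "S" + n

def add(x, y):
    if y == ZERO:
        return x                  # axiom 4: x + 0 = x
    return S(add(x, y[1:]))       # axiom 5: x + Sy = S(x + y)

def mul(x, y):
    if y == ZERO:
        return ZERO               # axiom 6: x . 0 = 0
    return add(mul(x, y[1:]), x)  # axiom 7: x . Sy = (x . y) + x

def numeral(k):
    """The numeral for k: k occurrences of 'S' followed by '0'."""
    return "S" * k + ZERO

two, three = numeral(2), numeral(3)
sum_3_2 = add(three, two)      # follows steps (2) through (6) of the text
product_3_2 = mul(three, two)
```

The value of sum_3_2 is the numeral SSSSS0, in agreement with step (6) of the computation above, and product_3_2 is the numeral for six.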

These two familiar operations of addition and multiplication for natural numbers are elementary examples of so-called 'primitive recursive functions,' as is the successor operation itself. These operations lie at the foundations of the modern theory of recursive functions.

In terms of the constants '0' and 'S', we can readily define the familiar Arabic numerals. Thus, we have the following infinite list of definitions that can be added to N:

1 = S0, 2 = SS0, 3 = SSS0,

and so on. And definitions of the relations less than, less than or equal to, greater than and greater than or equal to are obvious. Further, the property of being prime and the relation of one number being divisible by another are readily definable. Not at all obvious, on the other hand, is the fact - first proved by Gödel in 1931 - that definitions of power (x = y^n) and factorial (x = y!) can also be stated within N. Indeed, Gödel has shown that N is sufficiently comprehensive so that recursive functions in general can be defined within it.
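For concreteness, here is one common way in which the 'obvious' definitions mentioned above might be written out. This is a sketch only: the particular formulations, and the predicate name Prime, are assumptions made for illustration and are not necessarily the ones the author has in mind.

```latex
\begin{align*}
x \le y &\;\equiv\; (\exists z)(x + z = y) \\
x < y &\;\equiv\; (\exists z)(x + Sz = y) \\
x \ge y &\;\equiv\; y \le x, \qquad x > y \;\equiv\; y < x \\
x \mid y &\;\equiv\; (\exists z)(x \cdot z = y) \\
\mathrm{Prime}(x) &\;\equiv\; S0 < x \,\wedge\, (y)\bigl(y \mid x \supset (y = S0 \vee y = x)\bigr)
\end{align*}
```

Each right-hand side uses only the constants '0' and 'S', the symbols for addition and multiplication, identity, and quantifiers over individuals, so each can be stated within the vocabulary of N.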

Symbols for the operations of subtraction and division, of course, do not appear within N, for the simple reason that these operations (unlike addition and multiplication) cannot always be performed within N.

It is known that all of the standard theorems within elementary arithmetic are derivable within N. In all but comparatively trivial cases, some application of mathematical induction is drawn upon. Because mathematical induction is of such great importance, an example or two of its use is here in order.

Consider the familiar associative law of addition:

(1) x + (y + z) = (x + y) + z

Using an appropriate case of the induction schema, we can easily show within N that this law holds in general. To do this, we prove this law for the first case (where z = 0); and then show that if this law holds for any number z, then it holds for the successor of z. The case where z = 0 is the case:

x + (y + 0) = (x + y) + 0.

The (informal) proof for this case is as follows:

(2) x + (y + 0) = x + y          Axiom 4
                = (x + y) + 0    Axiom 4

Assume now that this law holds for some number z, and then show that it follows from that hypothesis that it holds for the successor of z:

(3) x + (y + Sz) = x + S(y + z)      Axiom 5
                 = S(x + (y + z))    Axiom 5
                 = S((x + y) + z)    By hypothesis
                 = (x + y) + Sz      Axiom 5.


We now turn to the induction schema

A_1 ∧ (α)(A ⊃ A_2) ⊃ (α)A

to conclude that the associative law of addition holds in general. To do this, we take as α the variable 'z', and as the formulas A, A_1 and A_2 the following formulas:

(A)   x + (y + z) = (x + y) + z,
(A_1) x + (y + 0) = (x + y) + 0, and
(A_2) x + (y + Sz) = (x + y) + Sz.

The formula A_1 is proved above in (2), and the formula (α)(A ⊃ A_2) is easily proved. For to prove A ⊃ A_2, by the Deduction Theorem it suffices to derive A_2 from A as hypothesis, which we have done in (3). The variable α may then be generalized upon, by the Generalization rule of inference of the underlying logic. The induction schema then permits us (first dropping off the closure quantifiers on 'x' and 'y' by quantificational logic) by using Modus Ponens to conclude the formula (α)A; that is,

(z)(x + (y + z) = (x + y) + z).

By two further applications of the Generalization rule, we obtain the generalized form of the associative law of addition:

(x)(y)(z)(x + (y + z) = (x + y) + z).

(Throughout this proof, notice, we have drawn repeatedly upon the logic of identity.)

It would be a serious mistake to suppose that the principle of mathematical induction is confined in its application to arguments that appear within arithmetic, or even within mathematics generally. This principle applies wherever we have any progression whatsoever, where a progression is any sequence of the form

x_0, x_1, x_2, ..., x_n, ...

in which there is a first term, a successor to each term (and thus there is no last term), no repetitions, and every term can be reached from the first term in a finite number of steps. The sequence of ascending natural numbers is itself a progression, and all other progressions have the same logical structure as it does. Wherever we have a progression, and are able to show both (a) that its first term has some property F, and (b) that whenever any given term of that progression has the property F then so does the successor of that term, we may conclude - by the principle of mathematical induction - that all of the terms of that progression have the property F. In order for the induction principle to apply to some totality of things, then, what is required is not that these things be mathematical in nature, but only that that totality have a certain structure; viz., that of a progression. The induction principle may thus be said to be the supreme principle of all reasoning in which progressions make an appearance, whatever the subject matter of that reasoning. Indeed, because of its wide applicability and fundamental position in certain types of reasoning, the induction principle might fairly be said to be a principle of general logic, rather than merely a principle of mathematics.

Further mathematical examples of progressions are easy to find. Thus, we have the sequence of increasing even numbers, and the sequence of increasing odd numbers as progressions. For every positive integer n, the sequence of increasing natural numbers divisible by n is a progression; as is the sequence of increasing prime numbers. Further, the infinite sequence of rational numbers

1/2, 1/4, 1/8, 1/16, ...

is a progression.

For an example of reasoning by mathematical induction drawn from outside of mathematics, we may turn to the semantics of F1. In Chapter II we maintained that it could easily be shown by mathematical induction that all of the theorems of F1 are valid. In order to show this, it is convenient to draw not upon the principle of induction as expressed by the axiom schema of N, but upon an equivalent form of the principle of induction; viz., the so-called principle of strong induction. On this principle, in order to conclude that all natural numbers satisfy some condition A, we need to show that whenever all numbers less than n satisfy that condition then n too satisfies that condition. Now, to each of the axioms of F1 associate the number 0. To each theorem of F1, associate the number of applications of primitive rules of inference which appear within the shortest possible derivation of that theorem from the axioms of F1. Every theorem of F1 (the axioms of F1 included) then has a unique natural number assigned to it. We will be able to conclude that all of the theorems of F1 are valid, then, if we are able to show that for every n, all theorems associated with the number n are valid. This we can do by forming a progression of classes of theorems, in which the nth member of that progression is the class of all theorems associated with the number n. We then perform an induction on this progression. Assume, therefore, that all theorems associated with numbers less than n are valid, and then show that all theorems associated with n are valid. If n = 0, then the theorems associated with n are just the axioms of F1. These can readily be shown to be valid. (Here our hypothesis becomes vacuous, and is not used.) If n is greater than 0, will any theorem A associated with n fail to be valid? No, for as can be seen from their derivations, all such theorems are obtainable by making one application of a primitive rule of inference of F1 upon one or two formulas occurring earlier in these derivations. These earlier formulas will themselves be theorems, and will of course be associated with numbers less than n. By hypothesis, then, they will be valid. Since A is obtained from them by one application of a primitive rule of F1, A will itself be valid, for it can easily be shown that each of the primitive rules of F1, when applied to valid formulas, leads only to other valid formulas. Therefore, by our second principle of induction, for every n, all theorems associated with n are valid; that is, all theorems of F1 are valid.

Further examples of the use of mathematical induction appear at various points throughout the book; in particular, in the proof of Gödel's Completeness Theorem in Chapter III.

5.3. The Metamathematics of N

Let us now run over briefly the principal metamathematical properties of the theory N. Some of these we have already mentioned in an earlier chapter as illustrations of concepts there being defined.

It is intuitively clear that all of the axioms of N are true under its intended interpretation. From the soundness of the underlying logic, it follows that all of the theorems of N are true under that interpretation, and thus that in this sense N is sound, and therefore consistent. This proof that N is consistent, however, is semantical in nature, and certainly is not what is called a 'constructive consistency proof.' We shall in the following discussion simply assume that N is consistent, and then list that assumption wherever it is needed.

The theory N is not finitely axiomatizable (Ryll-Nardzewski, 1953). That is, there is no effectively defined finite set of formulas of N from which can be derived precisely the theorems of N (supposing that N is consistent). As for completeness, Gödel's famous incompleteness theorem of 1931 rules this out once and for all.² If N is consistent, then there are sentences of N which are true under the intended interpretation of N, but are not provable in N. More generally, let T be any consistent and effectively defined theory (in the sense that its axioms are effectively characterizable) in which the addition and multiplication of the natural numbers can be developed. Then T will be incomplete in the sense that not every sentence of T which is true (under the normal interpretation of arithmetic symbols) will be a theorem of T. Thus, any consistent and effectively defined extension of N will be incomplete in this sense, where an extension of a theory T, recall, is any theory which contains among its theorems all of the theorems of T.³ N is in this sense essentially incomplete. This result is one of the most fundamental results in all of logic, or metamathematics, and is very far-reaching for one's whole conception of the nature of mathematics, implying as it does that the concept of truth in mathematics is not identifiable with the concept of being provable within some one consistent and all-encompassing system of mathematics. We shall consider this incompleteness result of Gödel at greater length (and with more precision) in Chapter VIII.

2 K. Gödel 1931.

3 More strictly speaking, we can say that Gödel proved this result for all effectively defined extensions of N once we assume the thesis (Church's thesis) that an effectively defined set is a recursive set. A similar remark holds for other results concerning effectivity. Church's thesis will be considered in Chapter VIII.

In order to show the further result that all consistent and effectively defined theories containing elementary arithmetic are incomplete in the sense that there will be sentences A of those theories such that neither A nor ~A is provable within those theories, Gödel needed a further hypothesis to the effect that those theories are also 'ω-consistent.' Shortly thereafter, however, J.B. Rosser showed that this result could be proved without this further hypothesis.

Though N, then, is incomplete, the following weak completeness theorem is known to hold for N: Every true sentence of N which contains no quantifiers is a theorem of N. Such sentences, of course, contain no variables at all.

In 1936 Alonzo Church showed that the theory N is undecidable, in the sense that the set of theorems of N is an undecidable set. That is, there is no effective procedure for determining whether any arbitrary formula of N is a theorem of N. And in the same year Rosser showed that every consistent extension of N is undecidable, and thus that in this sense N is essentially undecidable.

Finally, since no elementary theory is categorical, it follows that N is not categorical. It admits models of every infinite cardinality. More specifically, however, it is easy to show, assuming that N is consistent, that it is not categorical in the denumerable cardinal (Skolem, 1934). Consider the theory N' which results from N by adding the individual constant 'a', together with the following infinite sequence of axioms, one for each natural number:

a ≠ 0, a ≠ S0, a ≠ SS0, ...

Now suppose that the axioms of N are consistent. Then, any finite set of axioms of N' will be consistent, since it will contain only finitely many of the axioms of N, together with only finitely many of our added axioms, and these latter axioms can all be made true by assigning to 'a' some number not mentioned in any of them. Now, if all finite subsets of an infinite set of formulas are consistent, the whole infinite set is consistent, for clearly any contradiction derivable from an infinite set of formulas will be derivable from some finite subset of that infinite set, since every derivation has only finitely many steps. The infinite set of axioms of N', therefore, will be consistent if N is consistent. Now, by Gödel's Completeness Theorem, any consistent set of first-order formulas has a model. But the intended interpretation of N in the natural numbers cannot be a model of N'. For any model J of N' must contain the entities 0, S0, SS0, etc., and an entity a which is distinct from each of these entities. In addition to a, J will have to contain infinitely many further entities distinct from 0, S0, SS0, etc. Thus, J will have to contain Sa, SSa, etc. And a, not being the zero element in J, will itself be a successor. Further there will be the entity 2a, and its infinite strings of predecessors and successors; and similarly for na, for every positive integer n. J, then, will be a model of N', but will not be isomorphic with the intended model of N in the natural numbers. Because the axioms of N are included among those of N', however, J will also be a model of N. Now J can be assumed to be of denumerable cardinality. The theory N, therefore, is not categorical in power ℵ_0. Nor is it categorical in any infinite cardinal. This, of course, follows from the fact that N is incomplete; for, as we showed in Chapter III, any first-order theory with identity which is categorical in any infinite cardinality m and admits no finite models, is complete. More specifically, however, it is known (Ehrenfeucht, 1958) that for every infinite cardinal, there are at least 2^ℵ_0 (i.e., continuum many) non-isomorphic models of N of that cardinality. Those models of N that are isomorphic with the intended model of N are its standard models. All other models are non-standard (or unintended) models of N, and the study of those models constitutes non-standard arithmetic.
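The one step of the argument above that is purely computational is the claim that any finite selection of the added axioms can be made true in the natural numbers. The following minimal sketch illustrates just that step; the function names `witness_for` and `satisfies` are hypothetical, and each added axiom 'a ≠ n' is represented simply by the number n that it excludes.

```python
def witness_for(excluded):
    """Choose a natural number for 'a' that differs from every excluded n.

    `excluded` is a finite collection of natural numbers, one for each
    added axiom 'a != n' that appears in the finite subset considered.
    """
    return max(excluded, default=-1) + 1

def satisfies(a, excluded):
    """Check that the value a makes every axiom 'a != n' true."""
    return all(a != n for n in excluded)

finite_subset = {0, 1, 5, 12}
a = witness_for(finite_subset)
print(a, satisfies(a, finite_subset))  # 13 True
```

No single natural number works for all of the added axioms at once, which is why the model of the whole of N' has to contain an element beyond 0, S0, SS0, and so on.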


Before turning to a second-order formulation of arithmetic, let us consider briefly several further elementary theories of an arithmetic nature. Consider the theory of elementary addition of natural numbers. This theory has as its non-logical constants those of N, except for the binary operation symbol for multiplication. These constants are interpreted as in N. The axioms of this theory are those of its sentences which are true under this interpretation. This theory is known to be axiomatizable and decidable (Presburger, 1929). That is, there exists an effectively defined set of sentences of this theory from which all of its true sentences are derivable; and there exists an effectively defined procedure for determining whether a given sentence of this theory is true. And these same results hold for the theory of elementary multiplication of natural numbers (Skolem). Consider, however, that theory whose axioms are all true sentences from the addition and multiplication of natural numbers; that is, all true sentences of N. This theory is sometimes called Skolem's arithmetic. From the very definition of truth it follows immediately that this theory is complete and consistent. From our discussion of N - which is an axiomatic subtheory of Skolem's arithmetic - however, it is clear that Skolem's arithmetic is neither axiomatizable nor decidable. Here, then, is an example of a theory whose axioms are not effectively characterizable; together with two interesting fragments of that theory whose axioms are effectively characterizable - though in the description of these fragments we neither explicitly list these axioms nor present them through axiom schemata.

5.4. Second-Order Arithmetic: The Theory N2

We now introduce one of the most famous axiom sets in all of mathematics; viz., Peano's axioms for arithmetic.⁴ These axioms will be added to a second-order predicate logic F2. Let us call the resulting axiomatic theory the theory N2, or second-order Peano arithmetic. The non-logical constants of N2 are (a) the individual constant '0', together with (b) the singulary operation symbol 'S'. The non-logical axioms of N2 are the following:

1. (x)(y)(Sx = Sy ⊃ x = y),
2. (x)(0 ≠ Sx),
3. (F)(F0 ∧ (x)(Fx ⊃ FSx) ⊃ (x)Fx).

4 The axioms of the theory N are also often referred to as Peano's axioms. More accurately, however, they are the first-order form of Peano's axioms, together with the recursive equations for addition and multiplication.

Peano's first statement of these axioms appears in his Arithmetices Principia of 1889, and they appear again in Formulaire de Mathématiques, vol. II, in 1898. Peano himself points out that these axioms already appear in Dedekind's Was sind und was sollen die Zahlen? of 1888. And they were partially anticipated in 1881 in a paper by C.S. Peirce (in which the recursive equations for addition and multiplication first appear). The principle of induction itself is much older; it was apparently first explicitly stated and used by Pascal in 1654, and by Fermat in 1659.

On the intended interpretation of N2, (a) the domain of individuals is the domain of natural numbers, and for each n the domain over which the n-ary predicate variables range is the domain of all n-ary relations among the natural numbers; (b) the individual constant '0' designates the natural number zero; and (c) the singulary operation symbol 'S' designates the operation immediate successor of.

Within the theory N there were no predicate variables. For that reason, the principle of mathematical induction there had to appear as an axiom schema, giving rise to infinitely many particular axioms. Within N2, on the other hand, n-ary predicate variables appear for every n, and we are therefore here able to present the principle of induction as a single axiom, in which a bound singulary predicate variable occurs in initial position. Thus this axiom permits us to refer to all sets of natural numbers, whereas the axiom schema of N enables us to refer only to those sets of natural numbers that are definable within the system N. There are, indeed, denumerably many such definable sets; however, there are non-denumerably many sets of natural numbers altogether.⁵

On the intended interpretation, the principle of induction as it appears in our third axiom has the following meaning: Every set of natural numbers which contains zero and the successor of each of its elements contains all natural numbers. There are no natural numbers other than those that can be obtained by starting with zero and then repeatedly applying the successor operation.

While discussing the theory N, we remarked that the principle of induction can be stated in more than one form. The form we have presented here as the third of Peano's axioms is known as the principle of weak induction. The alternative form that we earlier mentioned and illustrated is known as the principle of strong induction. Within second-order arithmetic it would be stated as follows (once we have introduced the usual symbol for the less than relation):

(F)((x)((y)(y < x ⊃ Fy) ⊃ Fx) ⊃ (x)Fx).

That is, any set which contains any given natural number whenever it contains all natural numbers less than that number, contains all natural numbers. Very closely related is the so-called minimum principle (or the principle of the least integer, or the well-ordering principle for the natural numbers). This principle states that every non-empty set of natural numbers contains a smallest number. This principle holds true for the natural numbers, though not for the integers, rational or real numbers. Within second-order arithmetic it would be stated as follows (once having introduced the usual symbol for the less than or equal to relation):

(F)((∃x)Fx ⊃ (∃x)(Fx ∧ (y)(Fy ⊃ x ≤ y))).

These three principles are equivalent to one another in the following sense: from any one of them, taken together with axioms 1 and 2 of N2, the remaining two can be derived. Thus, if we wished, we might use either the principle of strong induction or the minimum principle as our third axiom, in place of the principle of weak induction.

5 For a development of a certain fragment of second-order arithmetic within a first-order theory, see J.R. Shoenfield 1967, pp. 227-233.

The recursive axioms of N for addition and multiplication do not appear as axioms within N2. Within N2 we are able to define and prove the existence of a unique addition operation and a unique multiplication operation, and to derive as theorems of N2 the recursive axioms of N. Indeed, within N2 we are able to define, and prove the existence of, recursive functions in general. As illustrations of how this can be done, we present the following two possible definitions of the familiar symbols for addition and multiplication.⁶

x + y = z ≡ (F)(F 0 x ∧ (x_1)(y_1)(F x_1 y_1 ⊃ F Sx_1 Sy_1) ⊃ F y z)

x · y = z ≡ (F)(F 0 0 ∧ (x_1)(y_1)(F x_1 y_1 ⊃ (z_1)(z_1 = x + y_1 ⊃ F Sx_1 z_1)) ⊃ F y z)

These definitions illustrate the fact that N2 is a stronger system of arithmetic than N. Within N variables range over only the natural numbers themselves, while in N2 we have in addition variables ranging over all sets of natural numbers, variables ranging over all binary relations among natural numbers (and thus over all singulary operations upon natural numbers); and, in general, variables ranging over all n-ary relations among natural numbers, for each positive integer n. And all of these variables can appear within quantifiers. Thus, here we are able to state and prove theorems of the form 'For every n-ary relation ... ,' and of the form 'There exists an n-ary relation such that .... ' Formulas of this sort cannot even be stated within the limited vocabulary of elementary arithmetic.
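The definition of addition given above says that x + y = z just in case the pair (y, z) belongs to every binary relation F that contains (0, x) and is closed under taking successors in both places. Equivalently, (y, z) belongs to the least such relation. The sketch below computes that least relation on a finite initial segment of the natural numbers and reads sums off from it. It is an illustration under stated assumptions only: the cut-off `bound` and the function names are introduced here for the purpose, and Python's `+ 1` stands in for the successor symbol 'S'.

```python
def least_relation_for_addition(x, bound):
    """Least F with F(0, x) and F(x1, y1) -> F(Sx1, Sy1), truncated at `bound`.

    Pairs whose second component would exceed `bound` are left out; the
    definition in the text has no such cut-off, which is needed here only
    to keep the relation finite.
    """
    F = {(0, x)}
    frontier = [(0, x)]
    while frontier:
        x1, y1 = frontier.pop()
        nxt = (x1 + 1, y1 + 1)      # apply the closure condition once
        if nxt[1] <= bound and nxt not in F:
            F.add(nxt)
            frontier.append(nxt)
    return F

def sum_by_definition(x, y, bound=50):
    """Return the unique z with F(y, z) in the least relation, if it lies in range."""
    F = least_relation_for_addition(x, bound)
    matches = [z for (y0, z) in F if y0 == y]
    return matches[0] if len(matches) == 1 else None

print(sum_by_definition(3, 2))  # 5
```

The quantifier over all relations F is what takes this definition beyond the vocabulary of N; the computation above sidesteps it by constructing the smallest relation directly.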

If we were to add the three axioms of N2 to some predicate logic of higher order than second-order logic, or to some system of set theory, then we would be able to develop further areas of mathematics within the resultant theories. In particular, if we were to add these axioms to a fourth-order applied logic, within the resultant theory we would be able to introduce by definition the rational, real and complex numbers, and would thereby obtain a formalization of mathematical analysis, in which our only non-logical axioms (aside from definitions) would be Peano's axioms for arithmetic.

6 A. Church 1956, p. 322.

5.5. The Metamathematics of N2

By means of interpretations it is an easy matter to show that each of the axioms of N2 is independent of the remaining axioms. We invite the reader to try to show this, with the hint that in showing the independence of axioms 1 and 2 he work with interpretations with finite domains.
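Readers who would like to check a candidate interpretation mechanically may find the following sketch useful. It tests the three axioms of N2 by brute force in a finite interpretation in which the singulary predicate variable ranges over all subsets of the domain. The two interpretations at the end are one possible choice, offered as an assumption rather than as the intended solution, and all function names are hypothetical.

```python
from itertools import combinations

def subsets(domain):
    """Yield every subset of a finite domain."""
    for r in range(len(domain) + 1):
        for combo in combinations(domain, r):
            yield set(combo)

def axiom1(domain, S):
    """(x)(y)(Sx = Sy > x = y): the successor map is one-to-one."""
    return all(x == y for x in domain for y in domain if S[x] == S[y])

def axiom2(domain, S, zero):
    """(x)(0 != Sx): zero is not a successor."""
    return all(S[x] != zero for x in domain)

def axiom3(domain, S, zero):
    """Induction: every set containing zero and closed under S is the whole domain."""
    for F in subsets(domain):
        closed = all(S[x] in F for x in F)
        if zero in F and closed and F != set(domain):
            return False
    return True

def check(domain, S, zero=0):
    return (axiom1(domain, S), axiom2(domain, S, zero), axiom3(domain, S, zero))

# One element, which is its own successor: only axiom 2 fails.
print(check([0], {0: 0}))             # (True, False, True)
# Two elements, both with successor 1: only axiom 1 fails.
print(check([0, 1], {0: 1, 1: 1}))    # (False, True, True)
```

No finite interpretation can serve for axiom 3, since axioms 1 and 2 together already force the domain to be infinite; for that case an infinite interpretation is needed and a brute-force check of this kind no longer applies.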

Like N, the theory N2 is incomplete, and the set of its theorems is an undecidable set (assuming that N2 is consistent).⁷ The principal difference between N2 and N is that N2 is categorical, in the sense that all of its principal models are isomorphic. All of these models are progressions, and are thus isomorphic with the intended interpretation in the natural numbers. For this reason we may say that Peano's axioms characterize the system of natural numbers up to an isomorphism. To be sure, they do not single out any one system; they admit infinitely many principal models (assuming their consistency). All of these models, however, are isomorphic to one another.

From the fact that N2 is categorical it follows that N2 is complete in the semantic sense. Every sentence of N2 is either true under all principal models of N2, or false under all principal models of N2. Thus, for every sentence A of N2, either A or ~A is a semantic theorem of the theory N2. N2 is in this sense, then, a complete system of arithmetic, though incomplete in another sense; viz., in the syntactic sense.

Notice now that with these facts about N2 at our disposal we are able to establish the incompleteness of second-order logic with respect to principal interpretations; and thus, given Henkin's weak completeness proof, establish the existence of secondary interpretations of second-order logic. The argument is as follows. Because N2 is incomplete in the syntactic sense, there will be a sentence A of N2 such that neither A nor ~A is derivable from the axioms of N2. However, because N2 is categorical, either A or ~A holds in all principal models of N2. Let it be A. Consider, then, the conditional sentence whose antecedent is the conjunction of the finitely many axioms of N2, and whose consequent is A. This conditional will be a valid sentence of F2, in the sense that it holds in all principal interpretations of F2. It will not be a theorem of F2, however; for if it were, A would be a theorem of N2, which by hypothesis it is not. Thus, F2 is incomplete. And similarly for any effectively defined formalization of second-order logic.

7 For proofs of the results of this section, see J.W. Robbin 1969, pp. 145-164.

Consider further, now, this sentence A. Because it is not derivable from the axioms of N2, it follows from Henkin's weak completeness proof for second-order logic that A fails to hold in some model of N2. By hypothesis, A holds in all principal models of N2. It follows that it fails in at least one secondary model of N2, and thus that secondary models of N2 exist.

Alternatively, we can show the existence of secondary models of N2 by arguing as we did in showing the existence of non-standard models of the first-order theory N. Thus, to the non-logical constants of N2 add the individual constant 'a', and to the axioms of N2 add the infinitely many further axioms

a ≠ n,

one such axiom for each natural number n. Assuming that the axioms of N2 are consistent, this resulting infinite set of axioms is consistent. It follows from Henkin's weak completeness proof for second-order logic that this set of axioms has a model, which is clearly also a model of N2. But this model cannot be isomorphic with the intended model of N2. Thus, again, since all principal models of N2 are isomorphic with its intended model, N2 admits a secondary model. In this argument, notice, we do not use the fact that N2 is incomplete in the syntactic sense.

As we have remarked, every model of N2 contains a zero element (0), the successor of that element (S0), the successor of this successor (SS0), etc.; and these elements are all distinct from one another. Furthermore, it is known that every secondary model of N2 contains further elements in its domain of individuals. It follows from this, then, that no secondary model can contain, within the domain over which the singulary predicate variables of N2 range, the set which contains 0, S0, SS0, and no further elements. For, by the principle of induction (axiom 3 of N2) every set which contains 0 and is closed under the successor operation contains all elements in the domain of individuals. Thus, we have the interesting fact that no secondary model of N2 can contain among its sets of individuals precisely the set of natural numbers.⁸

In conclusion, we may now draw up the following four-fold classification of the sentences of N2. Every sentence of N2 is of exactly one of these four types (assuming the consistency of N2).

(1) Those sentences of N2 that are true in all models (principal and secondary) of Peano's axioms. These sentences are all theorems of N2.

(2) Those sentences of N2 that are false in all models (principal and secondary) of N2. These sentences are all refutable sentences of N2.

(3) Those sentences of N2 that are true in all principal models of N2, but false in some secondary model of N2. Included among these sentences are Gödel's famous 'true but unprovable (within N2)' sentences.

(4) Those sentences of N2 that are false in all principal models of N2, but true in some secondary model of N2. These sentences include Gödel's 'false but not refutable (within N2)' sentences.

8 For further discussion of unintended models of arithmetic, see L. Henkin 1950; and J.G. Kemeny 1958.

CHAPTER VI

THE REAL NUMBERS

The theory of real numbers occupies a position of the very greatest importance within contemporary mathematics. This is so principally because of two reasons. First, all of mathematical analysis rests upon the theory of real numbers as its foundation. And second, most mathematical theories can be shown to have models within the theory of real numbers. This fact implies in particular that most of modern mathematics can be shown to be consistent if real number theory is consistent, by means of 'relative-consistency' proofs.

A real number can be defined as a number denoted by an infinite non-terminating decimal; that is, by an expression of the form ±K_1 K_2 ... K_n . a_1 a_2 ... a_n ..., where each K and each a is a digit and, except in the case of the number 0, there is no n such that all digits to the right of the digit a_n are the digit '0'. The real numbers include the rational numbers m/n (where m and n are integers, and n is not 0) and the irrational numbers. Rational numbers are those real numbers that are denoted by repeating infinite decimals (e.g., '0.3333 ...'), and irrational numbers are those real numbers denoted by non-repeating infinite decimals. Every educated person is familiar with the rational numbers. Not everyone, however, is very familiar with the concept of an irrational number. And yet, the recognition within the Western world that some numbers are irrational goes back to the time of the Pythagoreans, probably to some time during the fifth century B.C. The Pythagoreans discovered the fact that the square root of the number two is not a rational number, together with its geometrical counterpart, the fact that the hypotenuse of a right-angled triangle with two sides of unit length is not commensurate in length with those other two sides. That is, there is no line segment s, no matter how small, such that the length of the hypotenuse of such a triangle is exactly m times the length of the segment s, while the length of the other two sides of that triangle is exactly n times the length of the segment s, where m and n are positive integers. The discovery of this fact led to a kind of crisis within the Pythagorean school, and it has been conjectured that it was this crisis that led to the method of derivation of theorems from axioms.

Mathematicians, then, have been familiar since ancient times with the fact that some numbers are irrational. And since the time of the 'discovery' of the calculus by Newton and Leibniz in the seventeenth century, they have employed, in a more or less geometrically intuitive way, many concepts which were later shown to presuppose some precise account of the nature of real numbers. Yet it was not until the nineteenth century that mathematicians took up in earnest the task of constructing such an account, and it was not until the end of that century that that task, together with the task of defining those many mathematical concepts which presuppose the concept of real number, was completed in a way which subsequent mathematicians were able to accept.

In preceding chapters we have remarked that the set of all real numbers is non-denumerable; that is, that there are more real numbers than there are natural numbers (or, equivalently, than there are positive integers). This is an appropriate point at which to establish this fact. The non-denumerability of the real numbers was first proved by the great nineteenth-century mathematician Georg Cantor, by means of a very striking so-called diagonal proof.

Cantor's proof is as follows. Suppose that there were an enumeration of all real numbers which are greater than 0 and less than or equal to 1. Let us represent these real numbers r_n by infinite non-terminating decimal fractions. (Thus, for example, rather than use '.50000...' (= 1/2), we use '.49999...'. Without this practice, we would have two different decimals denoting the same real number.) We may illustrate this enumeration by the following diagram:


r_0:  . a_00  a_01  a_02  a_03  ...
            ↘
r_1:  . a_10  a_11  a_12  a_13  ...
                  ↘
r_2:  . a_20  a_21  a_22  a_23  ...
                        ↘
r_3:  . a_30  a_31  a_32  a_33  ...
                              ↘

where each of the a's is one of the digits '0' through '9'. We now reduce the supposition that there is such an enumeration to absurdity by defining a real number c which is greater than 0 and less than or equal to 1, though it is not in our enumeration. This is the real number represented by the decimal

.c_0 c_1 c_2 c_3 ...

where c_k is the 'cyclic sequent' of the digit a_kk; that is, the successor of the digit a_kk if a_kk is any digit from '0' through '8', and '0' if a_kk is '9'. Thus if a_kk is '0', then c_k is '1'; if a_kk is '1', then c_k is '2'; ... ; and if a_kk is '9', then c_k is '0'. Thus, c is represented by a simple variation on the diagonal decimal .a_00 a_11 a_22 a_33 ..., which is shown by the arrows.

Now it is easy to see that for every n ≥ 0, c will differ from the real number r_n in at least the nth place in the decimal representations of c and r_n. In other words, the supposition that for some number k, c is r_k contradicts the definition of c. Thus, the supposition that there could be an enumeration of all real numbers greater than 0 and less than or equal to 1 leads to a contradiction, and is therefore false. Obviously, then, there can be no enumeration of all real numbers altogether. In other words, there are non-denumerably many real numbers.
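The diagonal construction can be tried out on a finite corner of a supposed enumeration. The following Python sketch is offered only as an illustration: the function names and the four sample rows are our own assumptions, and a finite table of course establishes nothing about an infinite enumeration. It merely shows that the digit c_k produced by the 'cyclic sequent' rule differs from the diagonal digit a_kk in every row considered.

```python
def cyclic_successor(digit: int) -> int:
    """Return the next digit, with '9' wrapping around to '0'."""
    return (digit + 1) % 10


def diagonal_digits(rows: list[list[int]]) -> list[int]:
    """Build c_0, c_1, ... from the diagonal a_00, a_11, ... of a square table."""
    return [cyclic_successor(rows[k][k]) for k in range(len(rows))]


# Hypothetical first four rows of an enumeration, four digits each.
rows = [
    [4, 9, 9, 9],  # r_0 = .4999...
    [3, 3, 3, 3],  # r_1 = .3333...
    [1, 4, 1, 5],  # r_2 = .1415...
    [9, 9, 9, 9],  # r_3 = .9999...
]

c = diagonal_digits(rows)
print(c)  # [5, 4, 2, 0]

# c differs from r_n in the nth place, for every row n in the table.
for n, row in enumerate(rows):
    assert c[n] != row[n]
```

Whatever rows are supplied, the final loop succeeds, since a digit never equals its own cyclic successor.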

6.1. The Theory R

Let us consider now an elementary axiomatic theory of the addition and multiplication of the real numbers.¹ Let us refer to this theory as the theory R. The underlying logic of the theory R is a first-order predicate logic FO with identity and operation symbols. The non-logical constants of R are the following seven symbols:

0   1   +   ·   −   ⁻¹   ≤

The first and second of these symbols are individual constants, the third and fourth are binary operation symbols, the fifth and sixth are singulary operation symbols, and the seventh is a binary predicate symbol.

The non-logical axioms of R are the following sixteen axioms, together with one axiom schema:

1. (x)(y)(z)(x + (y + z) = (x + y) + z)
2. (x)(y)(x + y = y + x)
3. (x)(x + 0 = x)
4. (x)(x + −x = 0)
5. (x)(y)(z)(x · (y · z) = (x · y) · z)
6. (x)(y)(x · y = y · x)
7. (x)(x · 1 = x)
8. (x)(x ≠ 0 ⊃ x · x⁻¹ = 1)
9. (x)(y)(z)(x · (y + z) = (x · y) + (x · z))
10. 0 ≠ 1
11. 0⁻¹ = 0
12. (x)(0 ≤ x ∨ 0 ≤ −x)
13. (x)(x ≠ 0 ⊃ ∼(0 ≤ x) ∨ ∼(0 ≤ −x))
14. (x)(y)((0 ≤ x ∧ 0 ≤ y) ⊃ (0 ≤ x + y))
15. (x)(y)((0 ≤ x ∧ 0 ≤ y) ⊃ (0 ≤ x · y))
16. (x)(y)(x ≤ y ≡ 0 ≤ y + −x)

¹ This particular formulation of the elementary theory of real numbers (with axiom 11 not included) appears in D. Kalish and R. Montague 1964, pp. 280, 286. It is there extended and developed up through the basic concepts of the differential and integral calculus. The underlying logic used, however, unlike FO, contains a theory of descriptions within it.


The axiom schema within R is the following schema:

17. ((∃a)A ∧ (∃b)(a)(A ⊃ a ≤ b)) ⊃ (∃c)((a)(A ⊃ a ≤ c) ∧ (b)((a)(A ⊃ a ≤ b) ⊃ c ≤ b)),

where A is any formula of R, and a, b and c are any individual variables of R. For each of the infinitely many instances of this schema, the closure of that instance is an axiom of R.

On the intended interpretation of R, (a) the domain of individuals is the domain of real numbers (i.e., the positive and negative real numbers, together with zero); (b) the individual constant '0' designates the real number zero, while the individual constant '1' designates the real number one; (c) the binary operation symbol '+' designates the addition operation, and the binary operation symbol '·' designates the multiplication operation; (d) the singulary operation symbol '−' designates the operation of forming the negative of a number (i.e., '−x' means 'the negative of x' or 'minus x'), and the singulary operation symbol '⁻¹' (which is used as a superscript) designates the reciprocal operation (thus 'x⁻¹' designates the number 1/x; the term '0⁻¹' is here regarded as meaningful, and is taken to stand for the number 0); and (e) the binary predicate symbol '≤' designates the relation less than or equal to.

The familiar symbols for subtraction and division are readily definable in terms of the symbols under (c). We shall not add them at this point, however, but at a later point.

Axioms 1 through 11 taken together define the algebraic concept of a (commutative) field. By virtue of these axioms, the familiar associative, commutative and distributive laws concerning addition and multiplication hold true within any model of R, and (by axioms 4 and 8) for every element within the domain of any model of R, other than zero in the case of multiplication, the additive and multiplicative inverses of that element (which make subtraction and division possible) are in the domain of that model. Axioms 1 through 16 taken together define the concept of a (commutative) ordered field, in which the less than or equal to relation is an ordering relation. In addition to the system of real numbers, the system of rational numbers, too, is a (commutative) ordered field. Axioms 1 through 16, then, hold in both of these number systems, and are thus by themselves incapable of distinguishing between them.
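That the rational numbers satisfy axioms 1 through 16 can be spot-checked mechanically. The sketch below is only an illustration under our own assumptions: the helper names (inv, implies, check_axioms) and the sample of six rationals are hypothetical, and testing a finite sample is no substitute for a proof. It uses Python's exact Fraction arithmetic and treats the reciprocal of 0 as 0, as axiom 11 stipulates.

```python
from fractions import Fraction
from itertools import product


def inv(x: Fraction) -> Fraction:
    """Reciprocal, with 0 sent to 0 as axiom 11 stipulates."""
    return Fraction(0) if x == 0 else 1 / x


def implies(p: bool, q: bool) -> bool:
    """Material implication."""
    return (not p) or q


def check_axioms(sample: list[Fraction]) -> dict[int, bool]:
    """Test axioms 1-16 on every tuple drawn from the (finite) sample."""
    zero, one = Fraction(0), Fraction(1)
    s1 = sample
    s2 = list(product(sample, repeat=2))
    s3 = list(product(sample, repeat=3))
    return {
        1: all(x + (y + z) == (x + y) + z for x, y, z in s3),
        2: all(x + y == y + x for x, y in s2),
        3: all(x + zero == x for x in s1),
        4: all(x + -x == zero for x in s1),
        5: all(x * (y * z) == (x * y) * z for x, y, z in s3),
        6: all(x * y == y * x for x, y in s2),
        7: all(x * one == x for x in s1),
        8: all(implies(x != zero, x * inv(x) == one) for x in s1),
        9: all(x * (y + z) == (x * y) + (x * z) for x, y, z in s3),
        10: zero != one,
        11: inv(zero) == zero,
        12: all(zero <= x or zero <= -x for x in s1),
        13: all(implies(x != zero, not (zero <= x) or not (zero <= -x)) for x in s1),
        14: all(implies(zero <= x and zero <= y, zero <= x + y) for x, y in s2),
        15: all(implies(zero <= x and zero <= y, zero <= x * y) for x, y in s2),
        16: all((x <= y) == (zero <= y + -x) for x, y in s2),
    }


sample = [Fraction(-2), Fraction(-1, 2), Fraction(0),
          Fraction(1, 3), Fraction(1), Fraction(3, 2)]
results = check_axioms(sample)
print(all(results.values()))  # True
```

Exact rational arithmetic is used here rather than floating point, since rounding would make the associative and distributive laws fail on some samples.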

It is the axiom schema of R that distinguishes the system of real numbers from the system of rational numbers, for though this schema holds within the former of these two systems it fails to hold within the latter of them. When we add this axiom schema to the other sixteen axioms of R, we obtain a set of axioms for a complete ordered field (or a continuously ordered field, or a real closed field). The system of real numbers is a complete ordered field, then, while the system of rational numbers is not. Under the intended interpretation of R, what this axiom schema in effect says is this: Consider any non-empty set of real numbers that is definable by some formula A of R. If there is a real number y such that every element x of this set is less than or equal to y, then there is a real number z such that every element x of this set is less than or equal to z, and z is the smallest number with this property. In other words, for every non-empty set of real numbers for which a defining condition can be stated within R, if this set is bounded above, then there is a least upper bound of this set. This schema is called the Continuity schema. It is this schema which expresses the fact that the field of real numbers is continuously ordered. As an illustration of its force, consider the set of all rational numbers whose squares are less than or equal to two. Though this set is clearly bounded from above, it has no least upper bound within the rationals. Such a bound would have to be the positive square root of two, and we know that there is no rational number that is the square root of two. Thus, the Continuity schema does not in general hold within the system of rational numbers. However, it does hold within the system of real numbers, and the particular least upper bound it provides for in this case is the positive real square root of two (which is, of course, a least upper bound both for the set of all rationals whose squares are less than or equal to two, and the set of all real numbers whose squares are less than or equal to two). The following formula is an axiom of R provided by the Continuity schema:


((∃x)(x · x ≤ 1 + 1) ∧ (∃y)(x)((x · x ≤ 1 + 1) ⊃ x ≤ y)) ⊃ (∃z)((x)((x · x ≤ 1 + 1) ⊃ x ≤ z) ∧ (y)((x)((x · x ≤ 1 + 1) ⊃ x ≤ y) ⊃ z ≤ y)).

This formula in effect states that on the hypothesis (which in this case is clearly true) that the class of real numbers whose squares are less than or equal to two is non-empty and bounded from above, there is a real number z which is a least upper bound for this class. This real number z is, of course, the positive square root of two.
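The failure of this instance of the Continuity schema within the rationals can be made concrete. The following sketch rests on our own assumptions: the function names are hypothetical, and the averaging step (a Newton iteration for the square root of two) is a technique we have chosen for illustration, not one taken from the text. Given any rational upper bound y of the set of rationals whose squares are at most two, it produces a strictly smaller rational upper bound, so that no rational upper bound can be the least.

```python
from fractions import Fraction


def is_upper_bound(y: Fraction) -> bool:
    """True when y bounds every rational x with x*x <= 2 from above."""
    return y > 0 and y * y >= 2


def smaller_upper_bound(y: Fraction) -> Fraction:
    """Given a rational upper bound y, return a strictly smaller one."""
    # Averaging y with 2/y keeps the square above 2 while decreasing y.
    return (y + 2 / y) / 2


y = Fraction(3, 2)
for _ in range(4):
    smaller = smaller_upper_bound(y)
    # Each step yields another rational upper bound below the previous one.
    assert is_upper_bound(smaller) and smaller < y
    y = smaller
print(y)
```

The descent never terminates within the rationals; in the real numbers the successive bounds approach the positive square root of two, which is the least upper bound that the schema provides.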

It is, to be sure, true that R contains no constants which designate particular classes of real numbers; for example, no constant that designates the class of all real numbers whose squares are less than or equal to two. Nevertheless, within R there appear formulas that determine particular classes of real numbers. Every formula A of R which contains one free variable determines a class of real numbers; viz., the class of all those real numbers that satisfy that formula. By using formulas of this sort, we are in a sense able within R to refer to each of denumerably many classes of real numbers. In particular, by using the formula 'x · x ≤ 1 + 1' we are able in effect to speak of the class of all real numbers whose squares are less than or equal to two, and to state that there is a real number that is a least upper bound for this class.

6.2. The Metamathematics of R and of Elementary Algebra

Let us use the expression 'elementary algebra' to refer to any elementary theory of the addition and multiplication of the real numbers, in which the axioms of that theory are defined as all sentences of that theory which are true under the usual interpretation of the symbols of that theory. Let us now consider the principal metamathematical features of any such theory, together with the metamathematical features of the axiomatic theory R as well.

The elementary algebra of real numbers differs from the elementary arithmetic of the natural numbers in several important respects. First of all, elementary algebra is decidable (Tarski, 1930).² That is, there is an effective procedure for determining whether any arbitrary sentence from the elementary algebra of real numbers is true or not (under the intended interpretation of that sentence). Second (as indeed follows from this first result), elementary algebra is axiomatizable; it is known, however, to be not finitely axiomatizable. There are, then, consistent and effectively defined axiom sets for elementary algebra from which all true sentences of elementary algebra are derivable. One such complete axiom set (several, indeed) appears in Tarski's work cited above,³ and the axioms of the theory R are another complete set of axioms. Thus here we have one of those infrequent cases in mathematics where the concept of truth is characterizable in syntactical terms, and coincides with 'theoremhood'. Moreover, one is able to prove the consistency of these axiom sets for elementary algebra in a 'constructive' way, which, speaking loosely, 'leaves no room for doubt' as to whether these axiom sets are consistent. This results from the fact that elementary algebra is decidable; for Tarski's decision method provides us with an effective procedure for demonstrating, with respect to each of these axiom sets, that a certain sentence (e.g., '0 ≠ 0') is not derivable from that axiom set, and thus that that axiom set is consistent. Because of Gödel's results of 1931 concerning incompleteness (page 117), and the Church-Rosser result of 1936 concerning undecidability (page 118), Tarski's results for elementary algebra are stronger than any possible corresponding results for Skolem's arithmetic; that is, the elementary theory whose axioms are all true sentences of arithmetic.

The reader might wonder how it is possible that these positive results can hold for elementary algebra but not for elementary arithmetic. After all, one might protest, the natural numbers are included among the real numbers. Indeed, within elementary algebra one can construct names for each of the natural numbers.

² This result is due to Tarski, and appears in A. Tarski 1951. The main results of this work were found in 1930.

³ A. Tarski 1951, p. 49.


This is of course true. Nevertheless, within elementary algebra we are not able to refer in any way to the set of all natural numbers. There is no formula within R which has one free variable and is satisfied by a real number x if and only if x is a real number which is a natural number. For this reason, we are not able to make all of the assertions here that we are able to make within elementary arithmetic. For example, within elementary algebra we are not able to say that a certain condition is not satisfiable within the natural numbers, because of our inability to refer to just those numbers. It is this fact which blocks the immediate transfer of Gödel's result and the Church-Rosser result to elementary algebra.

As for categoricity, since every elementary theory admitting any infinite models at all admits models of every infinite cardinality, the theory R is not categorical. In particular, by the Löwenheim-Skolem theorem the axioms of R have a model within the denumerable domain of positive integers, though the intended domain of real numbers itself is of non-denumerable cardinality. As for categoricity in power, it is known that R is not categorical in any infinite power.

As Tarski points out, his positive results above hold not only for the elementary algebra of real numbers, but can be extended to various other elementary theories as well. The algebras of complex numbers, quaternions and n-dimensional vectors rest upon the real numbers, and the elementary theories of these algebraic systems are decidable and axiomatizable. And because of the possibility of mapping geometry into real number theory (through analytic geometry), Tarski's results can be extended to various elementary geometrical theories; for example, to elementary n-dimensional Euclidean geometry, and to various theories of elementary non-Euclidean geometry and projective geometry.⁴

⁴ A. Tarski 1951, pp. 55-56, contains a complete axiom set for elementary two-dimensional Euclidean geometry, which can readily be adapted to elementary geometry of any number of dimensions. This axiom set uses two non-logical constants: a ternary predicate and a quaternary predicate. The individual variables range over the domain of points in the Euclidean plane; the ternary predicate designates the betweenness relation 'y lies between x and z,' and the quaternary predicate designates the equidistance relation 'the distance between x and y is equal to the distance between x1 and y1.' For a later and improved version of this axiom set, see A. Tarski 1959.


Let us now consider several extensions of R, which we shall here call R1, R2 and R3.⁵ The non-logical constants of R1 are those of R, together with the singulary predicate 'I'. The non-logical axioms of R1 are those of R, together with the following three axioms:

18. I 0
19. (x)(I x ⊃ I x + 1 ∧ I x + −1)
20. (x)(y)(((I x ∧ I y) ∧ (x ≤ y ∧ y ≤ x + 1)) ⊃ (y = x ∨ y = x + 1)).

On the intended interpretation, the symbol 'I' denotes the property of being an integer. On this interpretation, axiom 18 states that zero is an integer; axiom 19, that the successor and predecessor of an integer are themselves integers; and axiom 20, that for any integer x and its successor y there is no integer that lies between x and y.

Let us now pass to a definitional extension of R1, by introducing definitions for the familiar symbols for the less than relation, and for the binary operations of subtraction and division. The non-logical constants of our new theory R2 are those of R1 together with these three constants, and the non-logical axioms of R2 are those of R1 together with the following three definitional axioms:

21. (x)(y)(x < y ≡ x ≤ y ∧ x ≠ y)
22. (x)(y)(x − y = x + −y)
23. (x)(y)(x/y = x · y⁻¹)

Finally, we pass to the definitional extension R3 of R2 by adding the following two definitions of the symbols 'N' and 'R':

24. (x)(N x ≡ I x ∧ 0 ≤ x)
25. (x)(y)(z)(R x ≡ (∃y)(∃z)((I y ∧ I z) ∧ (0 < z ∧ x = y/z)))

⁵ D. Kalish and R. Montague 1964, pp. 285 ff. The particular extension R3 is there called the theory of real numbers, and is due to Montague.


Under the intended interpretation, the symbols 'N' and 'R' turn out to denote the property of being a natural number and the property of being a rational number, respectively.

The theory R1 is an elementary theory of real numbers in which we are able to distinguish the integers (and consequently the natural numbers and rational numbers) among the real numbers, as we are not able to within the theory R. Because the elementary theory of natural numbers can therefore be developed within R1, it follows that R1, and R2 and R3, are neither decidable nor complete.

6.3. Second-Order Real Number Theory: The Theory R²

Let us now turn to a second-order axiomatic approach to the theory of real numbers. Here, in addition to variables ranging over real numbers, we have variables ranging over classes of real numbers and relations among real numbers. Second-order real number theory is, consequently, a considerably more powerful theory than any elementary theory of real numbers, just as second-order arithmetic is considerably more powerful than any elementary theory of arithmetic.

The particular axiom set that we are about to consider is due to Tarski, and appears in his Introduction to Logic (German edition, 1937; first English edition, 1941).⁶ Let us call the particular second-order theory to which it gives rise the theory R². The underlying logic of this theory is a second-order predicate logic F2, which contains all n-ary predicate variables. The non-logical constants of R² are the following five constants:

<   +   ·   0   1

⁶ A. Tarski 1965, pp. 217-218. Included in this work is another second-order axiom set for real number theory, which is considerably smaller than the axiom set of R², though more difficult to work with. As a further axiom set for second-order real number theory, we might simply take the axioms of R, and replace the axiom schema therein by its formulation as a second-order axiom. And there are categorical second-order axiom sets for the integers, rationals, and complex numbers as well.


The first of these constants is a binary predicate constant, the second and third are binary operation symbols, and the fourth and fifth are individual constants. The non-logical axioms of R² are the following sixteen axioms:

1. (x)(y)(x ≠ y ⊃ x < y ∨ y < x)
2. (x)(y)(x < y ⊃ ∼(y < x))
3. (x)(y)(z)(x < y ∧ y < z ⊃ x < z)
4. (F)(G)((x)(y)(F x ∧ G y ⊃ x < y) ⊃ (∃z)(x)(y)((F x ∧ G y ∧ x ≠ z ∧ y ≠ z) ⊃ (x < z ∧ z < y)))
5. (x)(y)(x + y = y + x)
6. (x)(y)(z)(x + (y + z) = (x + y) + z)
7. (x)(y)(∃z)(x = y + z)
8. (x)(y)(z)(y < z ⊃ x + y < x + z)
9. (x)(x + 0 = x)
10. (x)(y)(x · y = y · x)
11. (x)(y)(z)(x · (y · z) = (x · y) · z)
12. (x)(y)(y ≠ 0 ⊃ (∃z)(x = y · z))
13. (x)(y)(z)((0 < x ∧ y < z) ⊃ (x · y < x · z))
14. (x)(y)(z)(x · (y + z) = (x · y) + (x · z))
15. (x)(x · 1 = x)
16. 0 ≠ 1

(In his presentation of this axiom set Tarski includes the primitive symbol 'N' ('real number') and four additional axioms, to the effect that 0 and 1 are real numbers, and that the sum and product of any two real numbers are themselves real numbers. Because we here intend to restrict the domain of the intended interpretation of R² to the domain of real numbers, there is no need for us to include these axioms and the symbol 'N' in the formulation of R².)

On the intended interpretation of R², (a) the domain of individuals is the domain of real numbers; (b) the binary predicate constant '<' designates the less than relation among the real numbers; (c) the binary operation symbols '+' and '·' designate the operations of addition and multiplication of real numbers, respectively; and (d) the individual constants '0' and '1' designate the real numbers zero and one, respectively.

With the exception of axiom 4, the axioms of R² are all satisfied within the system of rational numbers. It is this axiom that excludes this system of numbers, and thereby makes the axiom set of R² a suitable axiom set for the theory of real numbers. On the intended interpretation of R², axiom 4 (the only axiom in which predicate variables appear) takes on the following meaning: Let F and G be any two sets of real numbers such that, for every x and y, if x is an element of F and y is an element of G, then x is less than y. Then there will be a real number z such that: for any x and y, if x is an element of F and y is an element of G and z is distinct from both x and y, then z will be greater than x and less than y; that is, z will lie between x and y.

Axiom 4 is a form of the Axiom of Continuity, which is due to the great German mathematician R. Dedekind (1831-1916). Dedekind's formulation of this axiom (which is often referred to as "Dedekind's axiom") first appeared in his classic Stetigkeit und irrationale Zahlen of 1872, in the following form (in effect):

Let F and G be any two non-empty sets of real numbers such that every real number is an element of either F or G, and for every x and y, if x is an element of F and y is an element of G, then x is less than y. Then there is a real number z such that all real numbers less than z are in F, and all real numbers greater than z are in G.

This axiom assures us that the set of real numbers is continuously ordered by the less than relation, in the sense that if we break this set into two parts F and G with the above properties, there will be a real number at the point of that "break," or "cut." The set of rational numbers, it is important to see, is not continuously ordered in this sense. For example, let F be the class of all negative rational numbers together with zero and all positive rational numbers whose squares are less than two, and let G be the class of all remaining rational numbers. Then F and G satisfy the hypothesis of the Axiom of Continuity. However, there is no rational number z such as is described by this axiom, for such a number would have to be the positive square root of two, which we know is not a rational number, but an irrational real number. At the break in the set of rational numbers into these two sets F and G, then, there is a "gap." The Axiom of Continuity (in either of the above two forms) assures us that there are no such gaps in the set of real numbers.
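The "gap" in the rationals at this cut can be exhibited by computation. In the sketch below, which is only an illustration under our own assumptions, the names in_F, in_G, step and witness_against are hypothetical, and the map q ↦ (2q + 2)/(q + 2) is a device we have chosen because it moves a positive rational toward the square root of two without crossing it. For any rational z, the sketch returns a rational w showing that z does not sit at the cut: if z is in F, then w is a larger element of F; if z is in G, then w is a smaller element of G.

```python
from fractions import Fraction


def in_F(q: Fraction) -> bool:
    """F: negative rationals, zero, and positive rationals with square < 2."""
    return q <= 0 or q * q < 2


def in_G(q: Fraction) -> bool:
    """G: all remaining rationals."""
    return not in_F(q)


def step(q: Fraction) -> Fraction:
    """Move a positive rational toward sqrt(2) while staying on the same side."""
    return (2 * q + 2) / (q + 2)


def witness_against(z: Fraction) -> Fraction:
    """Return a rational on the same side of the cut as z, but nearer the cut."""
    if in_F(z):
        # A larger element of F: so not everything above z lies in G.
        return step(z) if z > 0 else Fraction(1)
    # A smaller element of G: so not everything below z lies in F.
    return step(z)


for z in [Fraction(-3), Fraction(0), Fraction(7, 5), Fraction(3, 2), Fraction(10)]:
    w = witness_against(z)
    if in_F(z):
        assert in_F(w) and w > z
    else:
        assert in_G(w) and w < z
```

The identity (step(q))² − 2 = 2(q² − 2)/(q + 2)² shows why step never crosses the cut: the sign of q² − 2 is preserved.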

Dedekind's axiom was in fact first stated by Dedekind in the following geometrical form: If all the points of a line are separated into two classes such that every point of the first class is to the left of every point of the second class, there exists one and only one point which produces this division of all the points into two classes and divides the line into two parts in this way. This axiom serves as a Continuity axiom in geometry. It guarantees a one-to-one correspondence between the points on a line and the set of all real numbers, and thus makes analytic geometry possible. Dedekind's axiom is known to be equivalent (within the context of Hilbert's axioms for Euclidean geometry) to Hilbert's final two axioms of continuity; viz., the Axiom of Archimedes, and Hilbert's Axiom of Completeness. It could therefore be used in place of these two axioms.

6.4. The Metamathematics of R²

Let us now briefly run through the principal metamathematical properties of R².

The theory R² is categorical, in the sense that all of its principal models are isomorphic. They are, indeed, all complete ordered fields. The axioms of R², then, completely determine the structure of their principal models, just as the axioms of N2 completely determine the structure of their principal models. Such is the power of many second-order theories. Of course, R² (if consistent) admits secondary models, and these models are different in structure from the principal models of R². In particular, there are secondary models of R² which have the denumerable domain of positive integers as their domain of individuals, though the principal models of R² all have non-denumerable domains.

Because R² is categorical, it is complete in the semantic sense. Every sentence of R² is such that either it or its negation is a semantic theorem of R². Because the elementary arithmetic of natural numbers can be developed within R², however, R² is incomplete in the syntactic sense (assuming that R² is consistent). Not all of the true sentences of R², then, are (syntactic) theorems of R². And, again because elementary arithmetic can be developed within R², R² is an undecidable theory.

As Tarski points out in his presentation of this theory, the axiom set of R² is not a set of independent axioms. In particular, axioms 5, 9 and 11 are derivable from the remaining axioms. Furthermore, the non-logical constants '<', '0' and '1' are definable in terms of the remaining non-logical constants. This fact, in turn, makes for additional possible reductions in the number of axioms. However, if these reductions in axioms and non-logical constants were to be made, the resulting theory would be less convenient to work with than is R² itself.

Axiom 16 is independent of the remaining axioms. This fact comes out very dramatically when we see that the axioms of R² without this axiom have a model whose domain consists of a single element, as the reader should be able to show for himself very easily.
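The one-element model can be checked by brute force. The sketch below is a minimal illustration under our own assumptions: the structure (a single element denoted by both '0' and '1', with '<' interpreted as the empty relation) and all of the Python names are hypothetical choices, and the predicate variables F and G of axiom 4 are taken to range over all subsets of the domain. It evaluates each of the sixteen axioms in this structure.

```python
from itertools import combinations

# Hypothetical one-element structure: every name below is our own choice.
D = [0]
ZERO = ONE = 0  # '0' and '1' denote the same element


def add(x, y):
    return 0  # the only possible sum


def mul(x, y):
    return 0  # the only possible product


def lt(x, y):
    return False  # '<' is interpreted as the empty relation


def implies(p, q):
    return (not p) or q


# All subsets of the domain, as values for the predicate variables F and G.
SUBSETS = [frozenset(c) for r in range(len(D) + 1) for c in combinations(D, r)]


def axiom4():
    return all(
        implies(
            all(implies(x in F and y in G, lt(x, y)) for x in D for y in D),
            any(
                all(implies(x in F and y in G and x != z and y != z,
                            lt(x, z) and lt(z, y))
                    for x in D for y in D)
                for z in D),
        )
        for F in SUBSETS for G in SUBSETS
    )


AXIOMS = {
    1: lambda: all(implies(x != y, lt(x, y) or lt(y, x)) for x in D for y in D),
    2: lambda: all(implies(lt(x, y), not lt(y, x)) for x in D for y in D),
    3: lambda: all(implies(lt(x, y) and lt(y, z), lt(x, z))
                   for x in D for y in D for z in D),
    4: axiom4,
    5: lambda: all(add(x, y) == add(y, x) for x in D for y in D),
    6: lambda: all(add(x, add(y, z)) == add(add(x, y), z)
                   for x in D for y in D for z in D),
    7: lambda: all(any(x == add(y, z) for z in D) for x in D for y in D),
    8: lambda: all(implies(lt(y, z), lt(add(x, y), add(x, z)))
                   for x in D for y in D for z in D),
    9: lambda: all(add(x, ZERO) == x for x in D),
    10: lambda: all(mul(x, y) == mul(y, x) for x in D for y in D),
    11: lambda: all(mul(x, mul(y, z)) == mul(mul(x, y), z)
                    for x in D for y in D for z in D),
    12: lambda: all(implies(y != ZERO, any(x == mul(y, z) for z in D))
                    for x in D for y in D),
    13: lambda: all(implies(lt(ZERO, x) and lt(y, z), lt(mul(x, y), mul(x, z)))
                    for x in D for y in D for z in D),
    14: lambda: all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
                    for x in D for y in D for z in D),
    15: lambda: all(mul(x, ONE) == x for x in D),
    16: lambda: ZERO != ONE,
}

results = {n: check() for n, check in AXIOMS.items()}
print([n for n, ok in results.items() if not ok])  # [16]
```

Only axiom 16 fails, so it cannot be derived from the other fifteen.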

CHAPTER VII

AXIOMATIC SET THEORY

A knowledge of the fundamental features of set theory is of the greatest importance for an understanding of the foundations of logic and of mathematics, and indeed of the nature of mathematics as a whole. We have, of course, in earlier chapters drawn repeatedly upon a number of set-theoretical concepts. Thus, for example, in Chapters II and III the definitions of interpretations and models drew upon the concepts of particular sets of individuals, and, in general, particular sets of ordered n-tuples of individuals. Further, in Chapter IV we considered a formalization of the second-order logic, which includes bound variables that range over domains of sets of individuals and domains of sets of ordered n-tuples. Turning to mathematical theories, we have considered the formalization of two second-order theories of arithmetic and real number theory. In each of these theories we treat not only of the basic entities (natural numbers and real numbers), but also of all sets of these entities, and in general, of all sets of ordered n-tuples of these entities.¹

It is known that all of classical mathematics - indeed, that practically all of contemporary mathematics itself - can be developed within set theory. All of the basic concepts within classical mathematics can be defined in terms of set theoretical concepts; and all of the basic principles concerning these concepts can then be established within set theory with the help of these definitions. Many of the traditional concepts of mathematics, which prior to the rise of set theory were left at a rather vague and inexact level - e.g., the concepts of natural number, relation, function, finite and infinite - come to receive exact analyses within set theory, as well as considerable generalization. Quite clearly, the development of set theory has brought a good deal of conceptual clarity and precision to mathematical thinking. Set theory is especially important for an understanding of twentieth-century mathematics. The generality and abstractness, and indeed the very rapid growth, of twentieth-century mathematics have to be understood largely in terms of the extent to which this mathematics is permeated by the concepts and methods of set theory. Modern algebra and topology are especially important and obvious illustrations of this pervasive characteristic of modern mathematics.

¹ In this chapter, as throughout the text, we use the words 'set' and 'class' interchangeably.

In this chapter we shall first present a number of well-known paradoxes, which indicate the need for taking precautionary measures in the construction of an axiomatic set theory, so as to exclude paradoxes from that theory. We shall then present and discuss the axioms of the Zermelo-Fraenkel set theory, one of the best known of contemporary set theories. And in our discussion, we shall present certain of the basic metamathematical features of this particular set theory.

7.1. Paradoxes

The development of set theory as a distinct branch of mathematics begins with Georg Cantor (1845-1918), in the early 1870's. Cantor took up an intuitive approach to set theory, rather than an axiomatic approach. That is, unlike Euclid, say, he did not first lay down a number of axioms, and then go on to derive theorems from those axioms. Rather, he proceeded in accordance with considerations which seemed intuitively sound, without explicitly setting forth those considerations. In this, his procedure resembles that of the ordinary mathematician, who argues in accordance with the laws of logic, but rarely presents an explicit statement of those laws. The ordinary mathematician presupposes logical principles without explicitly stating them; Cantor presupposed both logical and set theoretical principles, without explicitly stating either kind. Indeed, the average 'working' mathematician today (non-logician that he is!) still prefers to leave set theory at a more or less informal and intuitive level. For him, set theory is a tool, which he is far more interested in using than in studying in an exact way for its own sake.²

By the end of the nineteenth century Cantor had developed set theory to a point where it constituted a special branch of mathematics, and during the 1890's his set theory, which had at first met with little favor, came to enjoy considerable popularity. But at the same time there were signs that not all was well. In 1895 Cantor himself discovered an antinomy, or contradiction, within the theory of well-ordered sets. In 1897 C. Burali-Forti rediscovered the same contradiction, which later came to bear his name. Neither Cantor nor Burali-Forti proposed a way of avoiding the contradiction, however. Things became much more serious when Bertrand Russell in 1902 discovered his famous paradox, which lay not at some fairly remote corner of set theory, as did the Burali-Forti paradox, but at the very foundations of set theory. Set theory was once again suspect, and even an object of ridicule. As Poincaré, who had always been very sceptical of set theory, delightfully put it, set theory was no longer barren, for it now had given birth: to a contradiction!

In 1926 F. P. Ramsey (1903-1930) proposed a distinction of the paradoxes known at that time into two types: logical, or mathematical paradoxes, and epistemological, or semantical paradoxes. Ramsey argued that paradoxes of the latter sort, by virtue of making reference to language (meaning, truth, definability) cannot be stated within mathematics, in which there is no reference to such matters, and thus that there is no need to consider them at all in attempting to devise ways of avoiding paradox within mathematics. As we shall see, this reasoning is not as conclusive as Ramsey apparently thought, but his classification is helpful and has been widely used. Let us turn now to two examples of each of Ramsey's two types of paradox.³

² For an intuitive approach to set theory, see P. Suppes 1957, Part II.

The most famous of the logical, or mathematical, paradoxes is Russell's paradox. This paradox proceeds as follows: First we define a class K, say, as the class of all those classes that are not elements of themselves. The class of dogs, for example, is not itself a dog, and thus is not an element of itself; by the definition of K, then, the class of dogs is an element of K. And so are most, if not all, of the classes that first come to mind. Now we ask whether K itself is an element of K. We see immediately that K is an element of K if and only if it is not an element of K. It follows by the sentential logic - since all formulas (A ≡ ∼A) ⊃ (A ∧ ∼A) are tautologies - that the class K both is and is not an element of itself. But this is a contradiction, or a paradox.
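The appeal to sentential logic in this argument can be checked mechanically. The following is a minimal sketch in Python of a truth-table test for the formula (A ≡ ∼A) ⊃ (A ∧ ∼A); the names implies, is_tautology and russell_step are our own illustrative choices and are no part of the text.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'p ⊃ q' is false only when p is true and q is false."""
    return (not p) or q


def is_tautology(formula, num_vars):
    """Return True if the formula is true under every assignment of truth values."""
    return all(formula(*values)
               for values in product([False, True], repeat=num_vars))


def russell_step(a):
    # (A ≡ ∼A) ⊃ (A ∧ ∼A); '==' on truth values plays the role of '≡'.
    return implies(a == (not a), a and (not a))


print(is_tautology(russell_step, 1))
```

The test succeeds only because the antecedent A ≡ ∼A is false under both assignments. The paradox arises because the definition of K seems to force us to accept that antecedent.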

Russell insisted that there was nothing within either traditional logic or intuitive set theory which would exclude the very simple reasoning which appears within the argument leading up to this paradoxical conclusion. Intuitively, it seems perfectly sound to suppose that any condition involving just one free variable determines a set; viz., the set of all those entities satisfying that condition. But this supposition leads directly to Russell's paradox, in which non-selfmembership appears as our condition. Russell concluded that our intuitions are not a completely reliable guarantee of correctness in reasoning, and that in order to avoid his paradox we would have to impose restrictions upon set-theoretical reasoning which went further than any restrictions known at that time. Moreover, he showed that by making a slight change in the formulation of his paradox it became, not a paradox in set theory, but a paradox within logic itself. Thus, if we let K be the property of being a property which does not apply to itself, and then ask whether K applies to itself or not, we see that K does apply to K if and only if K does not apply to K. But this leads immediately to a contradiction. Not only the relatively new Cantor set theory was contradictory, then; traditional logic itself,

3 For an extended consideration of the principal paradoxes of logic and set theory, see E. Beth 1959, Part VI.


as old as Aristotle, contained a contradiction inherent within it. Logic, too, called for restrictions beyond any known at that time.

A second logical, or mathematical, paradox is Cantor's paradox, which was discovered by Cantor in 1899. Cantor had proved ('Cantor's theorem') that the set of all subsets of any set X is of greater cardinality than X itself; that is, for every set X there are more subsets of X than there are elements of X. He then considered the universal set U and concluded that the set of all its subsets must have more elements than U itself has. But since U has every set (and thus every subset of U) as an element, this is a contradiction.
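For finite sets the content of Cantor's theorem can be verified by brute force. The sketch below, in Python, assumes a three-element set X and tries every function f from X to the set of its subsets; in each case the 'diagonal' set of those x that are not elements of f(x) fails to be a value of f. The helper names power_set and diagonal_set are hypothetical, and the check illustrates the theorem without proving it.

```python
from itertools import combinations, product


def power_set(elements):
    """All subsets of a finite collection, as frozensets."""
    elements = list(elements)
    return [frozenset(c)
            for size in range(len(elements) + 1)
            for c in combinations(elements, size)]


def diagonal_set(f, domain):
    """The set of those x in the domain that are not elements of f(x)."""
    return frozenset(x for x in domain if x not in f[x])


X = [0, 1, 2]
subsets = power_set(X)

# Try every one of the 8**3 functions from X to the set of subsets of X.
always_missed = all(
    diagonal_set(dict(zip(X, images)), X) not in images
    for images in product(subsets, repeat=len(X))
)
print(len(subsets), always_missed)
```

Since no function from X to its subsets reaches its own diagonal set, none of them is onto. The paradox appears only when the same reasoning is applied to a supposed universal set.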

As a well-known example of the semantical paradoxes, we have the paradox of the liar. This paradox, in one form or another, goes back to ancient times. In one of its forms it proceeds as follows: Consider a man who says, 'I am lying,' and then says nothing further. If this man is telling the truth, then (as he says) he is lying; if, however, he is lying, then he is telling the truth in saying so. It follows by the sentential logic that he is both lying and telling the truth, which is a contradiction. This paradox is a semantical paradox, because it makes reference to certain uttered words, which are said to express a lie. For another form of the paradox, take the sentence, 'This sentence is not a true sentence.' Evidently this sentence is true if and only if it is not true. True, this paradox does indeed at first seem frivolous, and not worthy of serious consideration. Yet it is as genuine a paradox as any other, and must be taken seriously if any paradoxes are taken seriously. Indeed, Alfred Tarski devotes the first section of his famous statement of the semantic conception of truth (Wahrheitsbegriff, 1936) to an examination of one form of this paradox, and then proposes the semantic conception of truth as a way of avoiding it within formalized languages.

As a second example of a semantic paradox, we have the Richard paradox (J. Richard, 1905), which strongly resembles Cantor's diagonal proof (pages 128-129) of the non-denumerability of all real numbers greater than 0 and less than or equal to 1. Consider all those English phrases which denote precisely one real number greater than 0 and less than or equal to 1 (e.g., the phrases 'point five,' and 'the square of point seven five'). There are only denumerably many such phrases and thus they can be enumerated. Consider then some enumeration E of these English phrases. We now characterize a certain real number r by the following phrase:

(A) that real number greater than 0 and less than or equal to 1 such that in its decimal representation the nth digit after the decimal point is the cyclic sequent of the nth digit in the decimal representation of the number denoted by the nth phrase in the enumeration E.

This number r will differ from each of the numbers denoted by the phrases in the enumeration E, and thus is not a real number greater than 0 and less than or equal to 1 which is uniquely characterizable by an English phrase. But this contradicts the fact that we have just uniquely characterized r by an English phrase - viz., by the phrase (A)!
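The diagonal construction used in phrase (A) can be sketched in Python for a finite initial segment of an enumeration. We assume here that the 'cyclic sequent' of a digit is the next digit in the cycle 0, 1, ..., 9, 0; that reading, the function names, and the digit rows below are assumptions made purely for illustration, and the rows are not claimed to be the expansions of any particular phrases.

```python
def cyclic_sequent(digit):
    """Assumed reading: 0 -> 1, 1 -> 2, ..., 8 -> 9, 9 -> 0."""
    return (digit + 1) % 10


def diagonal_digits(expansions):
    """The nth digit of the result is the cyclic sequent of the
    nth digit of the nth expansion (counting from 0 in the code)."""
    return [cyclic_sequent(expansions[n][n]) for n in range(len(expansions))]


# Hypothetical leading digits of the numbers denoted by the first four phrases of E.
E = [
    [4, 9, 9, 9],
    [5, 6, 2, 4],
    [3, 3, 3, 3],
    [1, 4, 1, 9],
]
r = diagonal_digits(E)
print(r)
```

By construction r differs from the nth row in its nth digit, which is all that the argument requires.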

So much for examples. In each of these four paradoxes we have a formula with one free variable. These formulas are as follows:

(1) x is an element of K if and only if x is not an element of x;
(2) the cardinal of the power set of x is greater than the cardinal of x;
(3) x is not a true sentence;
(4) x is a real number greater than 0 and less than or equal to 1 which is not uniquely characterizable by an English phrase.

We now substitute respectively the following expressions for 'x' in these formulas:

(1) K;
(2) the universal set;
(3) this sentence;
(4) that real number greater than 0 and less than or equal to 1 such that in its decimal representation the nth digit after the decimal point is the cyclic sequent of the nth digit in the decimal representation of the number denoted by the nth phrase in the enumeration E.

In each case, the substitution is specially chosen so as to result in a sentence which leads to contradiction - that is, to a sentence of the form A and not A. In each of these paradoxes there is an element of self-reference, as there is in Cantor's characterization of


the real number c so long as we suppose that Cantor's enumeration is an enumeration of all real numbers greater than 0 and less than or equal to 1. Definitions which define entities in terms of totalities that presuppose those entities are known as impredicative definitions. In each of our four paradoxes there is some impredicative definition or condition. Not all use of impredicative definitions and conditions leads to contradictions, however. As we shall see, for example, the axioms for Zermelo-Fraenkel set theory contain two axiom schemas; in these schemas we can take for the condition A conditions which are impredicative. These particular uses of impredicativity (presumably) do not lead to contradiction.

Both in Cantor's proof and in the above paradoxes, then, a contradiction makes its appearance. In the case of Cantor's proof, however, we easily avoid the contradiction - viz., that the diagonally defined real number c is and is not included in Cantor's enumeration - by dropping the initial assumption that that number does appear in Cantor's enumeration. In the case of the above paradoxes, however, it is not immediately apparent just what to do in order to avoid contradiction. In particular, notice that we cannot escape the Richard paradox simply by dropping the assumption that the denoting phrases in question can be enumerated, since if it makes any sense to speak of the totality of denoting phrases it is clear that those phrases can be enumerated, because there are only denumerably many of them. If and when satisfactory solutions to these paradoxes are worked out and generally accepted, they may no longer seem paradoxical to us, any more than Cantor's conclusion that there are non-denumerably many real numbers seems paradoxical - once we become accustomed to it!

We shall later see how the Zermelo-Fraenkel set theory avoids the two logical paradoxes we have just presented. As for the semantical paradoxes, we have remarked that it is not completely convincing to argue that since these paradoxes make reference to semantical considerations, while no such reference appears within

4 See W.V.O. Quine 196G.


mathematics, it follows that these paradoxes cannot appear within mathematics, in particular, within set theory. All that follows is that they can make no direct appearance within mathematics (here excluding metamathematics). This fact, however, does not itself exclude the possibility that they make some indirect appearance there. For there may be some isomorphism, not at first evident, between semantic concepts and mathematical concepts, which would permit a translation of these paradoxes into paradoxical statements within mathematics. Until such a possibility is ruled out, we cannot conclude, then, that the semantical paradoxes are of no concern to the mathematician in general, or to the set theoretician in particular.

7.2. The Zermelo-Fraenkel Axioms

One of the very earliest axiomatic approaches to set theory (preceded only by the work of Frege and Russell) is the Zermelo set theory of 1908, by Ernst Zermelo (1871-1953). This theory was subsequently improved and extended, principally by A.A. Fraenkel (1922), T. Skolem (1922) and Zermelo himself (1930). The resultant theory is generally referred to as the Zermelo-Fraenkel set theory. 5

Zermelo was, of course, in 1908 well aware of the numerous paradoxes which had been brought to light by that time, and his axiom system was especially designed so as to prevent these paradoxes from appearing within his theory. In particular, the so-called Aussonderungsaxiom is so stated as to block the derivation within this theory of the known paradoxes in their familiar forms.

The Zermelo-Fraenkel set theory - which we shall refer to as ZF - can be presented in a number of ways. In the particular

5 For a very helpful presentation and consideration of the Zermelo-Fraenkel axioms see A.A. Fraenkel and Y. Bar-Hillel 1958, Chapter II. For a compact yet lively development of the outlines of set theory from the Zermelo-Fraenkel axioms, see P. Halmos 1960. For a more detailed and extended treatment, see P. Suppes 1960.


approach that we shall use here, the underlying logic is a first-order predicate logic FO with identity and operation symbols. The only non-logical constant which appears at the start is the binary predicate constant '∈'. Further constants are then introduced by definition. As for the domain, we shall not here attempt to specify in an exact way some one intended domain; such a specification would call for the notion of ordinal number, which we have not introduced. Let us merely say that the intended domain is some non-empty domain of sets. 6 All of the entities within this domain are to be sets; there are no entities here which are not sets. (Zermelo-Fraenkel set theory can also be developed so as to admit entities that are not sets; i.e., so-called individuals. See, for example, P. Suppes, Axiomatic Set Theory.) And the intended interpretation of the non-logical constant '∈' is, of course, is an element of; thus, the formula 'x ∈ y' is to be read: x is an element of y.

In presenting ZF, we shall from the start resort to such informal departures from official notation as have already been used in earlier chapters; for example, omitting certain parentheses, and placing operation symbols within terms in accordance with customary practice, rather than necessarily immediately in front of their argument expressions. And, as a further departure from official procedure, we shall not explicitly label the various theories which are implicit within the theory ZF itself, which is strictly

6 For a discussion of models of Zermelo-Fraenkel set theory and related theories, see A.A. Fraenkel and Y. Bar-Hillel 1958, pp. 329-332. See also R. Montague 1965.

To those readers who have the concept of an ordinal number, and of an inaccessible ordinal, we offer the following well-known characterization of the intended, or standard, models of ZF. Let α be an ordinal number. Then by transfinite induction we define a function T from ordinal numbers to sets as follows:

T(0) = the empty set;
T(α + 1) = the set of all subsets of T(α);
if λ is a limit number, then T(λ) is the union of all sets T(μ), for all μ < λ.

Now let α be any inaccessible ordinal (greater than ω). For a model of ZF, let the domain be T(α), and to '∈' assign the membership relation restricted to T(α). Any model isomorphic to this model will count as an intended (or standard) model of ZF. Of course, as we choose different inaccessible ordinals α, we get models of different cardinality.


speaking the final theory in a sequence of theories, each of which results from the theory preceding it by extending that theory through adding to it various new constants and creative or non-creative axioms. It is this resultant theory that is here taken as the theory ZF. What are usually referred to as the Zermelo-Fraenkel axioms, however, is not the totality of axioms of ZF, but rather the totality of creative axioms of ZF; that is, those axioms of ZF that are not definitions. We shall, then, number the creative axioms of ZF separately from the definitions of ZF.

Some of the axioms of ZF will be stated with the help of notation introduced by definition. If we wished, we could eliminate all defined notation within these axioms by means of appeal to definitions introduced prior to the introduction of those axioms themselves. We could then introduce all of the creative axioms first, and introduce the definitional axioms only later.

I. The Axiom of Extension

As the first axiom of ZF we have the axiom of extension:

(x)(y)((z)(z ∈ x ≡ z ∈ y) ⊃ x = y).

This axiom states that if the elements of x are identical with the elements of y, then x and y are the same set. The converse of this axiom is, of course, a theorem of the logic of identity - which is here part of our underlying logic. Thus, we have the result that any two sets x and y are identical if and only if they have the same elements.

II. The Axiom Schema of Separation (Aussonderungsaxiom)

As the second axiom of ZF, we have the axiom (schema) which is most characteristic of the Zermelo-Fraenkel axioms; viz., the axiom schema of separation (called by Zermelo the Aussonderungsaxiom). This axiom is not an axiom, but an axiom schema, formulated not within ZF itself but within the metalanguage of ZF. Let A be any formula of ZF in which the variable 'y' does not occur freely. Then the closures of all formulas of ZF provided by the following schema are axioms of ZF:

(x)(∃y)(z)(z ∈ y ≡ z ∈ x ∧ A).


The axiom schema of separation as first stated by Zermelo in 1908 did not appear in just this form, which draws upon a precise notion of formula. Rather, it was formulated in terms of the concept of a 'definite' sentence: For any sentence that is definite for all members of some set x, there exists the set y which contains as its elements just those members of x for which this sentence is true. The present form of the axiom schema of separation is due to Skolem (1922), who proposed replacing the vague concept of a 'definite' sentence by an exactly defined concept of formula.

This axiom schema calls for a number of comments. First, it clearly provides for an infinite number of axioms for ZF. What it says, in effect, is that for any set x and for any condition A statable within ZF, there will exist a subset y of x which contains as its elements just those elements of x that satisfy the condition A. For example, consider the formula '(∃x₁)(x₁ ∈ z).' For this choice of A, our schema provides the following as an axiom of ZF:

(x)(∃y)(z)(z ∈ y ≡ z ∈ x ∧ (∃x₁)(x₁ ∈ z)).

What this axiom says is that for every set x, there exists that subset y of x which contains as its elements just the non-empty elements of x.
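For finite sets, the effect of this instance of the schema can be imitated in Python, with frozenset values standing in for sets. This is only a sketch under that assumption: the function separation and the particular set x are our own illustrations, and a Python predicate takes the place of the formula A.

```python
def separation(x, condition):
    """The subset of x whose elements satisfy the condition (finite sets only)."""
    return frozenset(z for z in x if condition(z))


empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
x = frozenset({empty, one, two})

# The condition '(∃x₁)(x₁ ∈ z)': z has at least one element.
non_empty_elements = separation(x, lambda z: len(z) > 0)
print(len(non_empty_elements))
```

Whatever condition is supplied, the result is drawn from x and is therefore a subset of x. The clause 'z ∈ x' in the schema guarantees the same thing.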

As a second observation on this axiom schema, let us consider how it serves to exclude the known logical paradoxes. Consider what would happen if we were to omit from the statement of this schema the expression 'z ∈ x' (together with the quantifier on 'x'). Our schema would then be the following:

(∃y)(z)(z ∈ y ≡ A).

This schema leads directly to a contradiction, once we choose the formula '∼(z ∈ z)' as our formula A. For then we have the formula:

(∃y)(z)(z ∈ y ≡ ∼(z ∈ z)),

which implies the formula:

(∃y)(y ∈ y ≡ ∼(y ∈ y)),

which implies:

(∃y)∼(y ∈ y ≡ y ∈ y).

But the following is a theorem of logic:

∼(∃y)∼(y ∈ y ≡ y ∈ y).

Thus the axiom schema of separation without the clause 'z ∈ x' leads to a contradiction; in particular, to essentially Russell's paradox.

The axiom schema without the clause 'z ∈ x' asserts the existence of a set of all sets which satisfy any condition A statable within ZF. As we have seen, this leads to contradiction. In order to block this contradiction, what Zermelo did was to assert the existence not of the set of all sets which satisfy the condition A, but only of the set of all sets which are elements of some given set and satisfy the condition A. If we assert the existence of the set of all sets which satisfy A, for every choice of A, then we are postulating the existence of sets which are 'too big,' so to speak. The reasoning lying behind the axiom schema of separation, in which the phrase 'z ∈ x' occurs, is that we must in some way restrict the size of the set we are to postulate in connection with the condition A. The restriction which this axiom schema embodies is that the set postulated be a subset of some set x already known to exist. Thus the condition that z be an element of x is added. Once this condition is added, the derivation of Russell's paradox along the familiar lines becomes impossible.

Within ZF, then, when given a certain condition A we cannot always immediately conclude that there exists a set which contains as its elements just those sets that satisfy A. Such a set indeed exists for certain conditions A, but not for others. If we are to prove the existence of such a set by means of the axiom schema of separation, we must first show that that set is a subset of some set which we have already proved to exist.

We have just seen that the schema '(∃y)(z)(z ∈ y ≡ A)' leads to contradiction. This schema, however, is a form of the comprehension axiom schema which appears within the second-order logic F2. The reader will find it instructive to state for himself just why


essentially the above contradiction cannot be derived along the same lines within second-order logic from the comprehension axiom schema.

In contrast with the axiom schema of separation, the next three axioms - axioms III, IV and V - are alike in that they represent special cases in which, for each of certain conditions A, we can directly conclude that there exists a set which contains as its elements just those sets that satisfy A.

III. The Axiom of Pairing

As the third axiom of ZF we have the axiom of pairing:

(x)(y)(∃z)(x₁)(x₁ ∈ z ≡ x₁ = x ∨ x₁ = y).

This axiom states that given any two sets x and y, there exists a set z which contains as its elements just x and y, and no other sets. There is, then, at least one such set. The axiom of extension, on the other hand, assures us that there is at most one such set, for any two sets that contained as their elements just x and y would be identical with each other. These two axioms taken together, then, assure us that for any two sets x and y there exists a unique set containing x and y and no other sets. We are, therefore - in accordance with the requirements of the theory of definition as presented in Chapter IV - now entitled to introduce by definition a term designating the unordered pair of x and y. We shall do this by defining the customary notation '{,}' as a binary operation symbol, as follows (following the customary practice of omitting initial universal quantifiers).

D1. {x,y} = z ≡ (x₁)(x₁ ∈ z ≡ x₁ = x ∨ x₁ = y).

By specifying '{x,y}' for 'z', and using the identity '{x,y} = {x,y}', we obtain immediately the theorem:

x₁ ∈ {x,y} ≡ x₁ = x ∨ x₁ = y.

The usual notation for the unit set of x can now be introduced by the following definition, here regarded as a definition of a


singulary operation symbol:

D2. {x}={x,x}.

The unit set of x - i.e., {x} - must not be confused with x itself. To see the difference, let x be the set of natural numbers. Then x has infinitely many elements; {x}, however, has only one element.

We have in earlier chapters repeatedly taken an n-ary relation to be any set of ordered n-tuples. We are now in a position to introduce a set-theoretical definition of ordered pairs, and in general of ordered n-tuples, and thereby bring the notion of a relation completely within set theory. For all mathematical purposes, the only requirement that we need to impose on any definition of ordered pairs is the requirement that two ordered pairs ⟨x,y⟩ and ⟨u,v⟩ are identical if and only if x = u and y = v. The following definition meets this requirement. 7

D3. ⟨x,y⟩ = {{x},{x,y}}.

To see that this definition meets our requirement, suppose that ⟨x,y⟩ = ⟨u,v⟩. Then, by the definition, {{x},{x,y}} = {{u},{u,v}}. Thus either (a) {x} = {u} and {x,y} = {u,v}, or (b) {x} = {u,v} and {x,y} = {u}. In case (a), clearly x = u and y = v. In case (b), x = u and x = v, and x = u and y = u, and hence x = u and y = v (here x = y = u = v). Thus, in either case, x = u and y = v. Finally, if x = u and y = v, then ⟨x,y⟩ = ⟨u,v⟩ simply by virtue of the logic of identity.

On this definition, the ordered pair of x and y is a certain unordered pair; viz., the unordered pair whose elements are the unit set of x, and the unordered pair of x and y. It is immediately apparent that this definition serves to distinguish the order of the elements of an ordered pair. The unordered pair {x,y} is identical with the unordered pair {y,x}, for every x and y. But the ordered pair ⟨x,y⟩ - i.e., {{x},{x,y}} - is identical with the ordered pair ⟨y,x⟩ - i.e., {{y},{x,y}} - clearly, only if x is identical with y.

7 This definition of ordered pairs in terms of sets is due to Kuratowski (1921). The earliest such definition is due to Wiener (1914).


Where x and y are not identical, ⟨x,y⟩ is not identical with ⟨y,x⟩. An ordered triple may now be introduced as a certain ordered pair; viz., ⟨x,y,z⟩ = ⟨⟨x,y⟩,z⟩. And so on for ordered n-tuples in general. For every n, then, an n-ary relation is a certain binary relation.
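The definitions of unordered pair, unit set and ordered pair translate directly into Python if frozenset values are taken to stand in for sets. The sketch below rests on that assumption, with function names of our own choosing. It checks the requirement on ordered pairs exhaustively over a small universe of three sets, which illustrates the argument given above without replacing it.

```python
from itertools import product


def unordered_pair(x, y):
    """D1: the set whose elements are just x and y."""
    return frozenset({x, y})


def unit_set(x):
    """D2: {x} = {x,x}."""
    return unordered_pair(x, x)


def ordered_pair(x, y):
    """D3: the ordered pair of x and y is {{x},{x,y}}."""
    return unordered_pair(unit_set(x), unordered_pair(x, y))


def ordered_triple(x, y, z):
    """An ordered triple as the ordered pair of an ordered pair and z."""
    return ordered_pair(ordered_pair(x, y), z)


empty = frozenset()
universe = [empty, unit_set(empty), unordered_pair(empty, unit_set(empty))]

# Two ordered pairs are identical if and only if their components agree in order.
requirement_holds = all(
    (ordered_pair(x, y) == ordered_pair(u, v)) == (x == u and y == v)
    for x, y, u, v in product(universe, repeat=4)
)
print(requirement_holds)
```

When x and y coincide the ordered pair collapses to {{x}}, which is case (b) of the argument above.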

Once we have the general concept of relation within set theory, we are able to introduce further important concepts in terms of this concept. Thus, a function can be defined as a relation which is a many-one relation; i.e., a function is any relation R such that

(x)(y)(z)(x R y ∧ x R z ⊃ y = z).

A one-to-one correspondence, in turn, is a function which always correlates distinct elements with distinct elements.
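For a finite relation these two conditions are easily tested. In the following Python sketch a relation is represented as a set of Python tuples rather than as a set of set-theoretical ordered pairs; that simplification, like the names is_function and is_one_to_one, is an assumption made for the sake of the illustration.

```python
def is_function(relation):
    """Many-one condition: x R y and x R z together imply y = z."""
    return all(y == z
               for (x1, y) in relation
               for (x2, z) in relation
               if x1 == x2)


def is_one_to_one(relation):
    """A function which correlates distinct elements with distinct elements."""
    return is_function(relation) and all(
        x1 == x2
        for (x1, y) in relation
        for (x2, z) in relation
        if y == z)


squares = {(-1, 1), (1, 1), (2, 4)}      # a function, but not one-to-one
less_than = {(0, 1), (0, 2), (1, 2)}     # a relation which is not a function
doubling = {(0, 0), (1, 2), (2, 4)}      # a one-to-one correspondence

print(is_function(squares), is_function(less_than), is_one_to_one(doubling))
```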

As convenient definitions of further familiar notation, we also introduce at this point the following definitions:

D4. x ∉ y ≡ ∼(x ∈ y).
D5. x ⊆ y ≡ (z)(z ∈ x ⊃ z ∈ y).

Definition 5 defines the notion of subset: x is a subset of y if and only if whatever is an element of x is an element of y. In particular, then, every set is a subset of itself. If x is a subset of y without being identical with y, then we say that x is a proper subset of y. Thus we have the definition:

D6. x ⊂ y ≡ x ⊆ y ∧ x ≠ y.

The subset relation must not be confused with the membership relation. Thus, as an illustration of the difference, notice that every set is a subset of itself; within ZF, however, we can show (as we shall see) that no set is a member of itself.

IV. The Axiom of Unions (Sum Set Axiom)

The fourth axiom of ZF is the axiom of unions (or the sum set axiom):

(x)(∃y)(z)(z ∈ y ≡ (∃x₁)(z ∈ x₁ ∧ x₁ ∈ x)).

This axiom tells us that for every set x there exists a set y which contains as its elements just the elements of the members of x.


Again, the axiom of extension guarantees that there is at most one such set. These two axioms together, then, guarantee the existence, for any set x, of the union of x. We may thus introduce a term designating this union, which we now do by means of the following definition of the singulary operation symbol '⋃':

D7. ⋃x = y ≡ (z)(z ∈ y ≡ (∃x₁)(z ∈ x₁ ∧ x₁ ∈ x)).

From this definition there follows the theorem:

z ∈ ⋃x ≡ (∃x₁)(z ∈ x₁ ∧ x₁ ∈ x).

Whereas the symbol '⋃' has been introduced in defining a term designating the union of a set of sets, it is convenient to introduce a special term designating the union of two particular sets x and y. The customary term for this purpose is 'x ∪ y', which we now introduce by definition as follows:

D8. x ∪ y = z ≡ (x₁)(x₁ ∈ z ≡ x₁ ∈ x ∨ x₁ ∈ y).

Using the pairing axiom, the axiom of unions, and the axiom of extension we are able to prove the theorem that for any two sets x and y there exists exactly one set which contains as its elements just those sets that are either elements of x or elements of y. It is this theorem that permits us to speak of the union of x and y, and serves to justify D8.
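The route to this theorem is to form the unordered pair {x,y} and then take its union. A minimal Python sketch of that route follows, again with frozenset values standing in for sets and with small integers standing in for arbitrary elements. Both are assumptions of the illustration, as are the function names.

```python
def unordered_pair(x, y):
    """The set whose elements are just x and y."""
    return frozenset({x, y})


def big_union(x):
    """D7: the set of the elements of the members of x."""
    return frozenset(z for member in x for z in member)


def union(x, y):
    """The union of x and y, obtained as the union of the pair {x,y}."""
    return big_union(unordered_pair(x, y))


x = frozenset({1, 2})
y = frozenset({2, 3})
print(sorted(union(x, y)))
```

The result agrees with the condition of D8: an element belongs to the union of x and y just in case it belongs to x or to y.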

Now that we have a symbol for the union of two sets, terms designating unordered triples of sets, quadruples, etc., are easily introduced:

D9. {x,y,z} = {x,y} ∪ {z}
{x,y,z,x₁} = {x,y,z} ∪ {x₁}

Closely related to the operation of union there is the operation of intersection. The intersection of a (non-empty) set of sets is that set which contains as its elements just those sets that are elements of each of the members of this set of sets; and the intersection of two particular sets x and y is that set which contains as its elements just those sets that are elements of both x and y. Terms designating these intersections can easily be defined and justified by uniqueness theorems. We shall here define only the usual term for the intersection of a pair of sets, as follows:

D10. x ∩ y = z ≡ (x₁)(x₁ ∈ z ≡ x₁ ∈ x ∧ x₁ ∈ y).

To justify introducing this definition, we turn to the axiom schema of separation. By this schema (together with the logic of quantification) the following is a theorem of ZF:

(x)(y)(∃z)(x₁)(x₁ ∈ z ≡ x₁ ∈ x ∧ x₁ ∈ y).

This theorem, together with the axiom of extension, guarantees the existence of the intersection of x and y, and thus justifies our introducing by definition a term designating that intersection. By similar reasoning, for any two sets x and y there exists their difference, or the relative complement of y in x; that is, the set which contains the elements of x minus the elements of y. This justifies the following definition:

D11. x − y = z ≡ (x₁)(x₁ ∈ z ≡ x₁ ∈ x ∧ x₁ ∉ y).

V. The Power Set Axiom (Axiom of Powers)

The fifth axiom of ZF is the power set axiom (or the axiom of powers):

(x)(∃y)(z)(z ∈ y ≡ z ⊆ x).

That is, for every set x there is a set y which contains as its elements just the subsets of x. Again, the axiom of extension assures us that there is at most one such set; thus there is exactly one such set. This axiom is called the power set axiom because of the fact that if a set x has n elements, then its power set y - that is, the set of all subsets of x - has 2ⁿ elements; that is, the number of elements of y is 2 raised to the power n.
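The count of 2 raised to the power n can be confirmed for small finite sets. The Python sketch below builds the power set by listing subsets of each size; the function name power_set and the use of frozenset values and integer elements are assumptions of the illustration.

```python
from itertools import combinations


def power_set(x):
    """The set of all subsets of a finite set x."""
    members = list(x)
    return frozenset(frozenset(c)
                     for size in range(len(members) + 1)
                     for c in combinations(members, size))


# Compare the number of subsets with 2 raised to the power n, for n = 0, ..., 5.
counts_agree = all(len(power_set(frozenset(range(n)))) == 2 ** n
                   for n in range(6))
print(counts_agree)
```

Note that the power set of the empty set has one element, namely the empty set itself.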

The definition introducing the notation for the power set operation is the following:

D12. 𝒫x = y ≡ (z)(z ∈ y ≡ z ⊆ x).

There follows the theorem:

z ∈ 𝒫x ≡ z ⊆ x.


VI. The Axiom of Infinity

None of the axioms introduced so far postulates the existence of any sets in an outright fashion. The axiom of extension merely states a condition under which two sets are identical with one another; and the remaining axioms are all of the hypothetical form: if some set x (or sets x and y) exists, then there exists such and such a set z. We come now to an axiom of a different nature, which is not of this hypothetical form. This is the axiom of infinity, which postulates the existence of at least one set satisfying a certain condition. Any set satisfying that condition will have at least a denumerable infinity of elements. Thus, the axiom of infinity assures us that within the domain of ZF there exists at least one infinite set. The axiom of infinity is not needed, notice, to prove that there exists an infinite number of sets within the domain of ZF. This can easily be proved using only a few axioms of ZF, not including the axiom of infinity (as can be seen from the following paragraphs). What the axiom of infinity is needed for is to show that within the domain of ZF there exists at least one set which contains an infinite number of elements.

There are a number of different formulas that could serve as axioms of infinity within ZF. We shall here mention two examples of such formulas. First, however, let us introduce a symbol designating the null set; that is, that set which contains no elements. To prove that there is such a set, consider the following formula:

(x)(∃y)(z)(z ∈ y ≡ z ∈ x ∧ z ≠ z).

This formula is an axiom of ZF, by virtue of the axiom schema of separation, taking 'z ≠ z' for A. Now the formula

(∃x)(∃y)(z)(z ∈ y ≡ z ∈ x ∧ z ≠ z)

follows from this axiom by the logic of quantification, and is thus a theorem of ZF. But this formula implies that there is a set y which contains no elements z whatsoever, because of the impossibility of satisfying the condition 'z ≠ z'. By the axiom of extension there is at most one such set; thus there is exactly one such set. We may therefore introduce a symbol designating that set by the following definition:

D13. ∅ = y ≡ (z)(z ∈ y ≡ z ≠ z).

We may now state the axiom of infinity, as the sixth axiom of ZF, in the following form:

(∃x)(∅ ∈ x ∧ (y)(y ∈ x ⊃ y ∪ {y} ∈ x)).

That is, there is at least one set x which contains ∅ as an element, and for each of its elements y contains also the union of that element and its own unit set. Any set which satisfies this condition will have infinitely many elements. For it will contain at least the following sets as elements:

∅, ∅ ∪ {∅}, ∅ ∪ {∅} ∪ {∅ ∪ {∅}}, ...

Now these sets are the following sets:

∅, {∅}, {∅,{∅}}, ...

The sets in this infinite progression are all distinct from one another, as can easily be shown by induction. In particular, notice that {∅} is distinct from ∅, since {∅} has an element, while ∅ does not. Any set x which contains each of these sets as elements (possibly together with other sets as well) will, then, be an infinite set.

Alternatively, we might have chosen the following formula as our axiom of infinity:

(∃x)(∅ ∈ x ∧ (y)(y ∈ x ⊃ {y} ∈ x)).

Any set x satisfying this axiom will also contain an infinity of elements; viz., ∅, {∅}, {{∅}}, etc. Clearly these sets are all distinct from one another.

We are now in a position to develop the arithmetic of natural numbers within ZF. Let us indicate in part how this can be done. First, consider the above infinite progression:

∅, {∅}, {∅,{∅}}, ...

Following J. von Neumann (1923), we can define the natural number 0 to be the first term of this progression; and, in general,


the natural number n to be the n+ 1 st term of this progression. (Notice that in these definitions each natural number turns out to be the set all natural numbers less than it.) From axioms already introduced, it can be shown that there exists exactly one set which contains as elements each of the terms of this progression, and no further elements; viz., the smallest set which contains each of these terms. Let us call this set w. This set w can then be taken as the set of all natural numbers. Next, we can define the successor operation S for sets in general as follows:

Sx = x ∪ {x}.

From these definitions and our axioms we can then derive the following formulas as theorems of ZF:

(1) 0 ∈ ω,
(2) (x)(x ∈ ω ⊃ Sx ∈ ω),
(3) (x)(x ∈ ω ⊃ Sx ≠ 0),
(4) (x)(y)((x ∈ ω ∧ y ∈ ω ∧ Sx = Sy) ⊃ x = y),
(5) (y)((y ⊆ ω ∧ 0 ∈ y ∧ (x)(x ∈ y ⊃ Sx ∈ y)) ⊃ y = ω).

These formulas will doubtless remind the reader of Peano's axioms for the natural numbers, from Chapter V. For the development of arithmetic within ZF, what principally remains is to define the operations of addition and multiplication within ZF in such a way as to permit us to show that these operations satisfy the usual properties on them. This can be done. Once it is done, all of familiar arithmetic can then be developed within ZF. Furthermore, on this basis one can then introduce the usual systems of integers, rational numbers, real numbers and complex numbers, by defining these numbers in terms of the natural numbers and sets in ways which have been known since the end of the nineteenth century. This brings classical analysis within Zermelo-Fraenkel set theory.
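As a rough illustration, the successor operation and some of the theorems above can be checked on a finite initial segment of the progression. This is a sketch under assumptions: frozensets stand in for sets, the names numeral, pred and add are introduced here only for illustration, and the recursion used for addition (x + 0 = x, x + Sy = S(x + y)) is the familiar Peano-style recursion, not a definition quoted from the text.

```python
ZERO = frozenset()


def S(x):
    """Successor: Sx = x ∪ {x}."""
    return x | frozenset([x])


def numeral(n):
    """The set that serves as the natural number n."""
    x = ZERO
    for _ in range(n):
        x = S(x)
    return x


N = 8
segment = [numeral(n) for n in range(N)]

# Each natural number is the set of all natural numbers less than it.
for n, x in enumerate(segment):
    assert x == frozenset(segment[:n])

# Theorem (3) on this segment: no successor is 0.
assert all(S(x) != ZERO for x in segment)

# Theorem (4) on this segment: S is one-to-one.
assert all(x == y for x in segment for y in segment if S(x) == S(y))


def pred(y):
    """Predecessor of a non-zero numeral: its element with the most elements."""
    return max(y, key=len)


def add(x, y):
    """Addition by recursion on y: x + 0 = x, x + Sy = S(x + y)."""
    return x if y == ZERO else S(add(x, pred(y)))


print(add(numeral(2), numeral(3)) == numeral(5))   # True
```

A finite check of this kind proves nothing about ω as a whole, of course; it only shows how the definitions behave on the first few numerals.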

VII. The Axiom Schema of Replacement

Three axioms remain. One of these, the famous Axiom of Choice, we shall reserve for special consideration in the next section of this chapter. The remaining two axioms were not included among Zermelo's original axioms (often referred to as the axioms of Zermelo set theory), but are later additions. The first, the axiom schema of replacement, is usually credited to Fraenkel (1922), though it was also formulated independently by Skolem at the same time. The second, the axiom of regularity, is due in the form used here to Zermelo himself (1930), though an equivalent but more complicated form of this axiom had been earlier stated by von Neumann (1929).

The axiom schema of replacement (or the axiom schema of substitution), our seventh axiom, is stated in the meta-language of ZF as follows. Let A be any formula of ZF containing 'x' and 'y' (but neither 'x₁' nor 'y₁') as free variables, and let A′ differ from A just in containing free occurrences of 'z' wherever A contains free occurrences of 'y' (and at no other places). Then the closures of all instances of the following schema are axioms of ZF:

(x₁)((x)(y)(z)((x ∈ x₁ ∧ A ∧ A′) ⊃ y = z) ⊃ (∃y₁)(y)(y ∈ y₁ ≡ (∃x)(x ∈ x₁ ∧ A))).

What this axiom schema says is this: Let A be any formula of ZF which is functional for all elements of some set x₁, in that for every element of x₁ there is at most one set y which satisfies A. Then there will exist a set y₁ whose elements are just those values which A associates with the elements of x₁. Thus, this axiom schema postulates the existence of a set y₁, which it defines in terms of some given set x₁ and a condition A. This set y₁ is a certain transformation of the set x₁. If we start with x₁, and replace each of the elements of x₁ by something else (determined by A), the result will be itself a set; viz., y₁.

This axiom schema is used within the general theory of transfinite induction and within the general theory of ordinal numbers. When added to the other axioms of ZF, it serves to establish the existence of sets of very large cardinality indeed, and for this reason is sometimes regarded as a kind of axiom of infinity. Further, when this schema is added to the other axioms of ZF the axiom schema of separation may then be omitted, since it is an almost immediate consequence of the axiom schema of replacement. Simply take A to be 'x = y ∧ B', where B is any formula of ZF in which 'y₁' does not occur freely. The antecedent of the axiom schema of replacement is then obviously true, and the axiom schema of separation readily follows from the consequent. Furthermore, the axiom of pairing may also be omitted, since it turns out to be derivable from the power set axiom and the axiom schema of replacement.
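The passage from replacement to separation can be mimicked on finite collections. The sketch below is only an analogy, under stated assumptions: Python integers stand in for sets, a condition A is modelled as a two-place Boolean function, and the parameter candidates is a hypothetical finite stand-in for the domain over which y ranges. None of these names comes from the text.

```python
def replacement(x1, A, candidates):
    """Return y1 = {y : (∃x)(x ∈ x1 ∧ A(x, y))}, provided A is functional on x1."""
    image = set()
    for x in x1:
        values = [y for y in candidates if A(x, y)]
        if len(values) > 1:
            # The antecedent of the schema fails: A is not functional on x1.
            raise ValueError("A is not functional on x1")
        image.update(values)
    return frozenset(image)


def separation(x1, B):
    """Obtain {x ∈ x1 : B(x)} from replacement by taking A to be 'x = y ∧ B'."""
    return replacement(x1, lambda x, y: x == y and B(x), candidates=x1)


x1 = frozenset(range(6))

doubles = replacement(x1, lambda x, y: y == 2 * x, candidates=range(20))
evens = separation(x1, lambda x: x % 2 == 0)

print(sorted(doubles))   # [0, 2, 4, 6, 8, 10]
print(sorted(evens))     # [0, 2, 4]
```

Note that the condition 'x = y ∧ B' is automatically functional, since it can hold of at most one y for each x. This is why the antecedent of the schema holds without further argument in this case.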

The axiom schema of replacement is in fact considerably stronger than the axiom schema of separation. Indeed, for the purposes of 'conventional' mathematics this added strength is not needed; here the axiom schema of separation is sufficient. It is for this reason that this latter axiom schema, though not independent of the remaining axioms of ZF, is often listed (as above) as a separate axiom.

One very interesting metamathematical result concerning the axiom schema of replacement is the following result (Montague, 1956): The theory ZF is not a finite extension of ZF without the axiom schema of replacement. That is, starting with all of the axioms of ZF except those provided by the axiom schema of replacement, no addition of only finitely many axioms will result in a theory equivalent to ZF. No finite number of axioms, then, can do the work of the axiom schema of replacement within ZF.

VIII. The Axiom of Regularity (Axiom der Fundierung)

The eighth axiom of ZF is the axiom of regularity (Zermelo's Axiom der Fundierung):

(x)((∃y)(y ∈ x) ⊃ (∃y)(y ∈ x ∧ ∼(∃z)(z ∈ y ∧ z ∈ x))).

What this axiom says is that every non-empty set x has an element y which has no elements in common with x. That is, there exists no non-empty set x such that every element of x has an element which is itself an element of x. In addition to being otherwise useful within set theory, this axiom is important because with its help we can establish the result that there is no infinitely descending chain of sets in the domain of ZF; that is, no infinite sequence of sets xₙ (n ≥ 1) such that x₂ is an element of x₁, x₃ is an element of x₂, ..., xₙ₊₁ is an element of xₙ, etc. Every descending chain of sets in ZF is finite and ends with the null set.


Further, using this axiom we can easily derive the result that within the domain of ZF there is no set x such that x is an element of x. For consider any set x. By the axiom of regularity, since {x} is not empty it must contain an element which has no elements in common with {x}. Since its only element is x, it follows that x has no elements in common with {x}; and thus that x is not an element of x. Notice now that the fact that no set is an element of itself leads directly to an alternative proof of the non-existence within ZF of two troublesome old friends; viz., Cantor's universal set and Russell's set of all sets which are not elements of themselves. The most obvious proofs of the non-existence of these two sets are the arguments which lead to Cantor's paradox and Russell's paradox, respectively; for those arguments show that supposing that these sets exist leads to contradiction. These arguments can be reproduced within ZF (and within Zermelo's set theory) without using the axiom of regularity. By using the axiom of regularity, however, we can show the non-existence of these two sets by a different route. If Russell's set existed, it would be the universal set; but there is no universal set, since if there were it would be an element of itself.

By an argument analogous to the argument showing that no set is an element of itself, we can show that there are no two sets x and y such that x is an element of y and y is an element of x; nor any three sets x, y and z such that x is an element of y, y is an element of z, and z is an element of x; and so on. Infinitely descending chains and finitely long membership cycles (including self-membership) seem counterintuitive to many logicians. That is, it seems contrary to their intuitive notion of set that there be any such chains and cycles of sets. Still, for the purposes of developing mathematics within set theory it is not absolutely necessary that we exclude the possibility of such chains and cycles; and there are set theories within which they appear (for example, W.V. Quine's Mathematical Logic).

More generally, the axiom of regularity implies the important result that every set in the domain of ZF can be obtained by starting with the null set and then applying the power set and union operations some finite or transfinite number of times. The ordinal number of times that these operations must be applied in order to reach any particular set x is called the rank of x. The axiom of regularity, then, implies that every set within the domain of ZF has a rank, or is well-founded. This result considerably clarifies our conception of the nature of the domain of ZF.
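For sets built up from the null set in finitely many steps, both the rank and the element promised by the axiom of regularity can be computed directly. The following sketch makes two assumptions: it uses frozensets as stand-ins for such sets, and it uses the common recursive definition of rank (the least number exceeding the ranks of all elements), which may differ by one from a literal count of applications of the operations. The function names are chosen here for illustration.

```python
def rank(x):
    """Rank of a finitely built set: least number exceeding the ranks of its elements."""
    return max((rank(y) + 1 for y in x), default=0)


def regularity_witness(x):
    """Return an element y of the non-empty set x which has no elements in common with x."""
    y = min(x, key=rank)      # an element of least rank
    assert not (y & x)        # its elements have lower rank, so none lies in x
    return y


empty = frozenset()
one = frozenset([empty])              # {∅}
two = frozenset([empty, one])         # {∅, {∅}}
x = frozenset([one, two, frozenset([two])])

print(rank(empty), rank(one), rank(two), rank(x))   # 0 1 2 4
print(regularity_witness(x) == one)                 # True
```

The design choice worth noting is that an element of least rank always serves as a witness. Every element of y has rank strictly below that of y, and so cannot be among the elements of x.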

7.3. The Axiom of Choice

The Axiom of Choice is one of the philosophically most interesting axioms within all of mathematics.⁸ Ever since it was first explicitly stated as an axiom, by Zermelo in 1904, mathematicians have been in disagreement as to whether this axiom should be accepted within mathematics, though it has doubtless been much more widely accepted than rejected. Acceptance or rejection of the Axiom of Choice reflects underlying philosophic conceptions about the nature of mathematics and mathematical existence.

The Axiom of Choice, our ninth axiom, in Zermelo's formulation of 1904 states the following:

If x is a set of non-empty, pairwise disjoint sets, then there is at least one set y which has exactly one element in common with each of the elements of x. Symbolically:

(x)((y)(z)((y ∈ x ∧ z ∈ x ∧ y ≠ z) ⊃ (y ≠ ∅ ∧ y ∩ z = ∅)) ⊃ (∃y)(z)(z ∈ x ⊃ (∃x₁)(y ∩ z = {x₁}))).

Two useful alternative formulations of the Axiom of Choice, equivalent to the above formulation, may be stated informally as follows (recalling that a relation is any set of ordered pairs, and that a function is a relation which assigns exactly one value to each of its argument values):

(1) For every set x there exists a function f such that the domain of f is the set of non-empty subsets of x, and such that the value of f for each of these non-empty subsets y is itself an element of y.

⁸ For a detailed and informal discussion of the Axiom of Choice, see A.A. Fraenkel and Y. Bar-Hillel 1958, pp. 44-80.


Such a function is called a choice function on x. The Axiom of Choice in this form, then, asserts that for every set x there exists a choice function.

(2) For every relation R there exists a function f which is contained in that relation (that is, f is a subset of R), and has the same domain as that relation. To each of the entities x to which R assigns one or more entities y, then, f assigns exactly one of those entities y.
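For finite sets, functions of the kinds described in (1) and (2) can be written down outright, with no appeal to the axiom, because a definite rule of selection (here, "take the least") is available. The sketch below illustrates only what the two formulations assert; the function names are introduced for illustration, and the elements are assumed to be mutually comparable so that a least one exists. The axiom itself matters precisely where no such rule is at hand.

```python
from itertools import combinations


def nonempty_subsets(x):
    """All non-empty subsets of the finite set x."""
    items = sorted(x)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def choice_function(x):
    """Formulation (1): assign to each non-empty subset y of x an element of y."""
    return {y: min(y) for y in nonempty_subsets(x)}


def uniformize(R):
    """Formulation (2): a function f contained in R with the same domain as R."""
    f = {}
    for a, b in sorted(R):
        f.setdefault(a, b)    # keep the first value met for each argument
    return f


f = choice_function({1, 2, 3})
assert all(f[y] in y for y in f)          # each value is an element of its argument
print(len(f))                             # 7

R = {(1, 'a'), (1, 'b'), (2, 'c')}
g = uniformize(R)
print(g)                                  # {1: 'a', 2: 'c'}
```

Replacing min by any other definite selection would yield a different but equally good choice function. This already hints at the non-uniqueness discussed in the next paragraph.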

The Axiom of Choice is another axiom of hypothetical, or conditional, form. In this respect it differs from the axiom of infinity and resembles the axiom of pairing, the axiom of union, and the power set axiom. All four of these axioms are of the form: given such and such a set x (or a set x and a set y), there exists a set z, defined in terms of x (or x and y). In the case of the axioms of pairing, union and power set, however, this set z is uniquely determined: there is exactly one such set z. The Axiom of Choice, on the other hand, when applied to some appropriate set x assures us of the existence of some set y without, however, uniquely defining y by means of some condition on its members. That is, the Axiom of Choice guarantees only that there is at least one set y which has certain properties, once given an appropriate set x; there may (and ordinarily will), however, exist a number of such sets y.

The Axiom of Choice is known to be equivalent to a number of other important principles, or theorems, within mathematics; in the sense that once granted the remaining axioms of set theory (e.g., those of ZF, or of the von Neumann-Bernays-Gödel set theory), we can derive each of these principles and theorems from the Axiom of Choice, and the Axiom of Choice from each of these principles and theorems.⁹ We shall here mention two such equivalences. First, the Axiom of Choice is equivalent to the principle of the comparability of sets (the trichotomy law), which states that for any two sets, either one of these sets is larger than the other, or they are of the same size. One might think that this principle would follow immediately from the very definitions of 'larger than' and 'of same size' as regards sets, but this is not so. The notion of size in set theory is defined in terms of the notion of a one-to-one correspondence, where a one-to-one correspondence between two sets x and y is a function which associates with each element of x exactly one element of y, and conversely. One set x is said to be larger than another set y if and only if there is a one-to-one correspondence between y and some subset of x, but no one-to-one correspondence between x and some subset of y; and two sets x and y are said to be of equal size (or of equal cardinality) if and only if there is a one-to-one correspondence between x and y. Surprising as it may seem, it does not follow from these definitions alone (which are due to Cantor), however, that for any two sets x and y either x is larger than y, or y is larger than x, or x and y are of equal size. This result - the comparability of sets principle - can be derived from the definitions of 'larger than' and 'of equal size' if we use also the Axiom of Choice, but it cannot be derived from those definitions alone. And, further, the Axiom of Choice can itself be derived from the comparability principle. The Axiom of Choice and the comparability principle are, therefore, equivalent.

⁹ See P. Suppes 1960, Chapter 8. H. Rubin and J. Rubin 1963 contains over 100 equivalents to the Axiom of Choice! Comparatively few of these, of course, are as mathematically important as the Axiom of Choice itself.

The Axiom of Choice is equivalent also to the famous well-ordering theorem, which was first proved by Zermelo. Indeed, Zermelo's proof in 1904 of the well-ordering theorem contains the first (important) appearance of the Axiom of Choice, though this axiom had been implicitly presupposed numerous times by earlier mathematicians. The well-ordering theorem states that every set can be well-ordered. In order to define the concept of well-ordering, let us first define the concepts of asymmetric, transitive and connected. A relation R is asymmetric in a set X₁ if and only if for every x and y in X₁, if x R y then not y R x. R is transitive in X₁ if and only if for every x, y and z in X₁, if x R y and y R z then x R z. And R is connected in X₁ if and only if for every x and y in X₁, if x ≠ y then x R y or y R x. Now, to say that a set X is well-ordered by a relation R is to say (a) that R is asymmetric, transitive and connected in X, and (b) that every non-empty subset of X has a smallest element in terms of R (that is, every such subset will have an element y such that for every element z of that subset, if y ≠ z then y R z). As a simple example, the set of all positive integers is well-ordered by the ordinary less than relation. As a more interesting example, the set of positive integers is also well-ordered by the following relation R: if x is odd and y is even, then x R y; if x and y are both odd or both even, then x R y if and only if x is less than y. This ordering well-orders the positive integers as follows, with the odd integers all preceding the even integers:

1, 3, 5, 7, ...; 2, 4, 6, 8, ....
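The relation R just described is easily expressed as a sorting key, which gives a quick way to see the ordering on an initial stretch of the positive integers. This is a small sketch; the names key, R and least are chosen here for illustration.

```python
def key(x):
    """Odd integers (False) sort before even integers (True); ties broken by size."""
    return (x % 2 == 0, x)


def R(x, y):
    """x R y in the ordering that places all odd integers before all even ones."""
    return key(x) < key(y)


def least(subset):
    """The smallest element of a non-empty finite subset, in terms of R."""
    return min(subset, key=key)


print(sorted(range(1, 11), key=key))   # [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
print(least({4, 8, 9, 12}))            # 9
```

Under R the integer 2 has infinitely many predecessors (all of the odd integers), which is what makes this ordering different in type from the ordinary less than relation.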

The well-ordering theorem, now, states that for every set X there is a relation R such that X is well-ordered by R. The theorem does not tell us how to define an R, once given a set X, such that R well-orders X; rather, it merely tells us that for every X there is a relation R which well-orders X. This leads to an interesting state of affairs when we take the set of all real numbers as our set X; for though the well-ordering theorem states that there is a relation which well-orders this particular set, no one has ever defined any such relation, and at least some mathematicians find it hard to believe that there is any such relation. This is to be contrasted with the state of affairs concerning the set of rational numbers. It is very easy to well-order the set of rational numbers (Cantor), and there is no need to use the Axiom of Choice in order to do this. To well-order the set of non-negative rational numbers, one can proceed as follows: first, take all non-negative rationals the sum of whose numerator and denominator is 1 (viz., the single rational 0/1); then, take all non-negative rationals the sum of whose numerator and denominator is 2, taking them in order of increasing numerators (viz., the rational 1/1, omitting the rational 0/2 as being identical with a rational already introduced); then, take all non-negative rationals the sum of whose numerator and denominator is 3, in order of increasing numerators (viz., the rationals 1/2, 2/1); and so on. This procedure gives rise to an infinite progression r₁, r₂, r₃, ..., within which each non-negative rational number appears. The infinite progression r₁, -r₂, r₂, -r₃, r₃, ..., then, is clearly a well-ordering of all rational numbers; indeed, a well-ordering of the simplest possible type.¹⁰
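The procedure just described is entirely mechanical, and can be sketched as a pair of generators. The names nonneg_rationals and all_rationals are labels chosen here; the code follows the steps of the text (fixed sum of numerator and denominator, increasing numerators, repetitions omitted).

```python
from fractions import Fraction
from itertools import count, islice


def nonneg_rationals():
    """Yield r1, r2, r3, ...: each non-negative rational exactly once."""
    seen = set()
    for total in count(1):               # numerator + denominator = total
        for num in range(0, total):      # increasing numerators; denominator >= 1
            q = Fraction(num, total - num)
            if q not in seen:            # omit rationals already introduced
                seen.add(q)
                yield q


def all_rationals():
    """Yield r1, -r2, r2, -r3, r3, ...: each rational exactly once."""
    for r in nonneg_rationals():
        if r != 0:
            yield -r
        yield r


print([str(q) for q in islice(nonneg_rationals(), 6)])
# ['0', '1', '1/2', '2', '1/3', '3']
print([str(q) for q in islice(all_rationals(), 7)])
# ['0', '-1', '1', '-1/2', '1/2', '-2', '2']
```

Since every rational is reached after finitely many steps, each one receives a definite position in the progression, and no appeal to the Axiom of Choice is involved.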

In addition to the well-ordering theorem itself, certain other results implied by the Axiom of Choice have a somewhat counterintuitive appearance, and anyone who was interested in building a case against the Axiom of Choice would be very apt to draw attention to these peculiar results as part of his case. We here refer to only one of these peculiar results; viz., the Banach-Tarski theorem (1924). It follows as a special case from that theorem that a sphere of fixed radius can be broken up into finitely many parts which can be reassembled so as to form two spheres of the same radius as the original sphere! One might think that such a state of affairs is a logical impossibility, and that a formal contradiction could be derived from the Banach-Tarski theorem. But it is known that no formal contradiction is derivable from this theorem, for the reason that no contradiction is derivable from the Axiom of Choice itself (in the sense that that axiom is consistent relative to the remaining axioms of set theory).

The Axiom of Choice has important applications within almost all areas of mathematics; in particular, within set theory itself, and within algebra, analysis and topology. Within set theory, for example, the Axiom of Choice is used in the study of ordinal and cardinal numbers, as in proving that every set is equivalent in size to some unique cardinal number. And we have already mentioned the fact that the Axiom of Choice is equivalent to two important principles within set theory; viz., the principle of the comparability of sets, and the well-ordering theorem. As one further example of the use of the Axiom of Choice within set theory, consider how one might define the concepts of the finite and the infinite. One natural way would be to define a finite set as any set which has exactly n elements, for some natural number n; and then to define an infinite set as any set which is not finite. This definition is perfectly satisfactory, once given the concept of a natural number.

¹⁰ Notice that this particular well-ordering shows that the set of all rationals is denumerable.


It is possible, however, to define the finite and the infinite in such a way as not to presuppose the concept of number. Thus, we can define a finite set as any set which cannot be put into one-to-one correspondence with any of its proper subsets. We then define an infinite set as any set which is not finite; that is, any set which can be put into one-to-one correspondence with some proper subset of itself (Dedekind, 1888). Thus, for example, the set of all natural numbers is infinite in this sense, because it can be put into one-to-one correspondence with the set of all even natural numbers, simply by pairing each natural number with its double. Now this second set of definitions is just as satisfactory as the first set of definitions. However, somewhat surprisingly, it requires the Axiom of Choice to show that these two sets of definitions are equivalent; in particular, to show that if a set is finite in the latter sense, then it is finite in the former sense; or, equivalently, that if a set is infinite in the former sense, then it is infinite in the latter sense.

In addition, the Axiom of Choice (or some equivalent) is often drawn upon within the semantical parts of metamathematics, as in the study of languages with non-denumerably many symbols. It is within analysis and the theory of real functions, and within topology, however, that the Axiom of Choice has its greatest number of applications.

The chief significance of the Axiom of Choice for the philosophy of mathematics lies in the question as to whether that axiom should be accepted within mathematics, and the question as to how we are to decide whether it should be accepted or not. Of course, if either the axiom itself or its negation were derivable from the remaining axioms of set theory, these questions would disappear - so long, that is, as we continued to accept these remaining axioms. But it is known (as we shall see) that neither the Axiom of Choice nor its negation is provable within set theory (within Zermelo-Fraenkel set theory, or von Neumann-Bernays-Gödel set theory, or Principia Mathematica, for example). The question as to whether or not to accept the axiom must, therefore, be answered by some other means.

As is well-known, there is a more-or-less definite group of mathematicians who refuse to accept the Axiom of Choice; viz., the intuitionists. The intuitionists reject this axiom because of their conception of the nature of mathematical existence. What the intuitionists refuse to accept in the Axiom of Choice is its so-called 'non-constructive' nature. The axiom states that for any set x of non-empty and pairwise disjoint sets there exists some set y which contains as its elements one element from each of these elements of x. It does not, however, tell us how to define some such set y once given any such set x. On the intuitionistic conception of mathematical existence, one is not permitted to say that there exists some mathematical entity satisfying such and such a condition unless one either gives an example of such an entity, or shows how to find such an example by following instructions which are essentially mechanical in nature. The Axiom of Choice clearly provides neither examples nor instructions. Within intuitionistic mathematics, therefore, one must reject the axiom, together with all of its equivalents. Only a very few mathematicians would be willing to accept the intuitionistic restrictions upon mathematics, however. For most mathematicians, the decision either for or against accepting the Axiom of Choice will have to be made on grounds other than the non-constructive character of the axiom.

The mathematical "behavior" of the Axiom of Choice, so to speak, is sufficient to convince some mathematicians that the axiom should not be accepted. As we have already observed, some mathematicians (in addition to the intuitionists, that is) are very doubtful that every set can be well-ordered; in particular, that the set of real numbers can be well-ordered. These mathematicians must reject (or in some way restrict) the Axiom of Choice, because of the equivalence of that axiom to the well-ordering theorem. Further, as has also been pointed out, some of the results which follow from this axiom are curious, to say the least; for example, the Banach-Tarski theorem. But arguments against the Axiom of Choice of this sort, which proceed in terms of the (real or alleged) intuitive implausibility of certain equivalents to, and consequences of, that axiom are at present far less than conclusive. Curiosity-wise, there is something to be said both for accepting and for rejecting the axiom. It is doubtless true that a greater number of mathematicians today accept the axiom than reject it - principally because of its usefulness, and often indispensability, for obtaining many desired results. Still, many mathematicians prefer to avoid the axiom wherever possible; and much work has been done in the search for proofs not using this axiom for theorems which were originally proved only with its help.

Finally, it should be pointed out that rather than assert the Axiom of Choice in full generality, one might assert it only for all denumerable sets (for every denumerable set x of non-empty, non-overlapping sets, ... ). This countable Axiom of Choice would probably be sufficient for all of the purposes of classical analysis.

7.4. The Metamathematics of ZF

We have now presented the axioms of ZF, together with a few basic definitions, as a formalized theory. The formulas and theorems of ZF as developed up to this point, then, have been specified in an exact way. We now turn to a survey of the principal metamathematical results concerning the system ZF.¹¹

ZF is not finitely axiomatizable. That is, no finite set of formulas of ZF leads to precisely the theorems of ZF (supposing that ZF is consistent). Because the elementary arithmetic of the natural numbers can be developed within ZF, it follows from Gödel's principal result of 1931 (together with a certain improvement of this result by Rosser in 1936) that if ZF is consistent then ZF is incomplete. Again, because arithmetic can be developed within ZF, it follows from the Church-Rosser result of 1936 that if ZF is consistent then the set of theorems of ZF is an undecidable set. Further, since ZF is an elementary theory which has no finite models, if ZF is consistent it is not categorical. If consistent, it admits models of every transfinite cardinality. In particular, by the Löwenheim-Skolem theorem, if ZF is satisfiable at all it has a denumerable model in the positive integers. Of course, under such a model the formulas of ZF receive an interpretation very different from their intuitive interpretation in terms of sets. Under such a model, formulas of the form (a)A are interpreted as 'for every positive integer x ...'; and formulas of the form (∃a)A are interpreted as 'there exists a positive integer x such that ...'. Formulas which on the intuitive interpretation define particular sets now are interpreted so as to define particular positive integers. And the primitive symbol '∈' is no longer interpreted as 'is an element of'; rather it is assigned some number-theoretic relation as its interpretation.

¹¹ See here especially P. Cohen 1966; and J.R. Shoenfield 1967, Ch. 9. For a more informal discussion, see A. Mostowski 1966, Lectures IX, XV.

In fact, not only does ZF have a denumerable model in the integers, it has a denumerable set-theoretical model! It follows from the so-called downward Löwenheim-Skolem theorem (which is a variant of the Löwenheim-Skolem theorem as we have presented it) that if a first-order theory has a model M at all, then it has a denumerable model M₁ such that (a) the domain of M₁ is a subset of the domain of M, and (b) the relations in the model M₁ are obtained by restricting those in the model M to the domain of M₁. Thus, if ZF has any set-theoretical model M at all, it has a denumerable set-theoretical model M₁, in which the atomic formulas x ∈ y take on the meaning: x is an element of y, where x and y are sets in the domain of M₁. Now since in ZF we can prove the existence of sets of non-denumerable cardinality (e.g., the set of all real numbers), it is most surprising that ZF should have a denumerable model. Yet though this situation is paradoxical in the loose sense of being highly unexpected, it is not paradoxical in the strict sense of leading to contradiction. Indeed, the explanation of how such a situation can obtain is relatively simple. A set is non-denumerable just in case there is no one-to-one correspondence between it and the set of positive integers. Now each of the sets in the model M₁ is in fact denumerable. Yet some of these sets are of non-denumerable cardinality within M₁, in the sense that no one-to-one correspondence between any one of them and the set of positive integers exists within the model M₁! Thus we see that a set which is absolutely denumerable may, within certain restricted contexts, be in fact non-denumerable. Within those contexts, and with those limited means, its elements cannot be enumerated, that is.

By means of defining appropriate models it is possible to show that various of the axioms of ZF are independent of the remaining axioms, in the sense that they cannot be derived from the remaining axioms - always supposing those remaining axioms to be consistent, of course. (1) Thus, it is a relatively easy matter to define a model for all of the axioms of ZF except the axiom of infinity, in which all the sets that appear are of finite cardinality. Since the axiom of infinity is false in any such model, it follows that it is not derivable from these remaining axioms. It is thus indispensable within ZF if we are to be assured that the domain of ZF contains sets of infinite cardinality. (2) Further, it is not difficult to define a model for all of the axioms of ZF except the power set axiom, in which all of the infinite sets that appear are of denumerable cardinality. Since the application of the power set axiom to a set of denumerable cardinality leads to a set of non-denumerable cardinality, the power set axiom must be false in this model, and is thus not derivable from the remaining axioms of ZF. Within ZF this axiom thus plays an indispensable role in the proof that there exist sets of non-denumerable cardinality. (3) It is possible to define a model for all of the axioms of ZF except the axiom of regularity, in which certain sets turn out to be elements of themselves. Since this state of affairs is excluded by the axiom of regularity, this axiom is false in this model, and thus not derivable from the remaining axioms of ZF.
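One well-known model of the kind mentioned under (1) is Ackermann's interpretation of membership among the natural numbers, in which n counts as an element of m just in case the binary expansion of m has a 1 in position n. The text does not say which model it has in mind, so what follows is offered only as one possible illustration; the function names are chosen here. It also gives a concrete instance of '∈' being assigned a number-theoretic relation as its interpretation.

```python
def mem(n, m):
    """Ackermann's membership: n 'is an element of' m iff bit n of m is 1."""
    return (m >> n) & 1 == 1


def elements(m):
    """The numbers that count as elements of m (always finitely many)."""
    return [n for n in range(m.bit_length()) if mem(n, m)]


def code(ns):
    """The number whose elements are exactly the numbers in ns."""
    return sum(1 << n for n in set(ns))


def pair(a, b):
    """The number playing the role of {a, b}."""
    return code([a, b])


def union(a):
    """The number whose elements are the elements of the elements of a."""
    return code([n for e in elements(a) for n in elements(e)])


def power(a):
    """The number whose elements are the codes of all subsets of a."""
    els = elements(a)
    subsets = [code([e for i, e in enumerate(els) if mask >> i & 1])
               for mask in range(1 << len(els))]
    return code(subsets)


# Extensionality on an initial segment: same elements, same number.
assert all(a == b for a in range(64) for b in range(64)
           if elements(a) == elements(b))

print(elements(pair(1, 2)))   # [1, 2]
print(elements(union(6)))     # [0, 1]
print(elements(power(3)))     # [0, 1, 2, 3]
```

Here 0 plays the role of the null set, and every number has only finitely many elements. No number, then, can satisfy the condition laid down in the axiom of infinity, while pairs, unions and power sets are always available.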

A more interesting (and much more difficult) case concerns the Axiom of Choice. By means of defining a certain highly intricate model, Paul Cohen has recently shown that this axiom is not derivable from the remaining axioms of ZF.¹² Cohen's model is 'standard,' in the special sense that within it 'x ∈ y' is interpreted as 'x is an element of y.' Cohen first shows this model to be a model of all of the axioms of ZF, without the Axiom of Choice. He then shows that in this model the set of all real numbers is not well-ordered. It follows that the Axiom of Choice is false in this model, and thus, that it is not derivable from the remaining axioms of ZF.

¹² See P. Cohen 1966, Ch. IV, section 9. Cohen's results on the independence of the Axiom of Choice and the Continuum Hypothesis first appeared in P. Cohen 1963, 1964.

At the same time Cohen showed that the Continuum Hypothesis is not derivable from the axioms of ZF (the Axiom of Choice included). This hypothesis was first advanced by Cantor, in 1878. It concerns the cardinality of the continuum; that is, it concerns the question, 'How many real numbers are there?' or 'How many points are there on a line?' It is known that the cardinality of the set of all real numbers is equal to the cardinality of the set of all subsets of positive integers; viz., 2~o (where ~o is the cardinality of the set of all positive integers). Now, by a famous theorem known as Cantor's theorem we know that for every set x, the cardinality of the set of all subsets of x (i.e., of the power set of x) is greater than the cardinality of x. Thus, there are more real numbers than there are positive integers: 2~o > ~o. Cantor's conjecture - the so-called Continuum Hypothesis - is to the effect that 2~o = ~I' where ~ 1 is the next largest cardinal after ~o. Equivalently, Cantor's hypothesis is that there is no set whose cardinal is greater than that of the set of all positive integers but less than that of the set of all real numbers; or, that every infinite subset of the continuum has the cardinality either of the set of positive integers or of the continuum itself. To this day no one has been able to prove or refute this hypothesis. As we shall see, Gi::idel has shown that within ZF the Continuum Hypothesis cannot be disproved. And Cohen has now shown that within ZF the Continuum Hypothesis cannot be proved. He did this by constructing a model (,standard' in the above sense) in which the axioms of ZF are all true, while the Continuum Hypothesis itself is false in that model. 13 This result is, indeed, the most outstanding result of recent years within the foundations of mathematics. And the methods used by Cohen to obtain these two results - centered in his very important new concept of 'forcing' - are not limited to

13 See P. Cohen 1966, Ch. IV, section 8. The hypothesis that $2^{\aleph_0} = \aleph_\tau$ turns out to be consistent with the axioms of ZF for a very wide variety of values of $\tau$.


these particular results, but are of general significance for proving independence results of a very wide range.

Cantor's theorem, which we have just mentioned, can easily be proved. Because of its importance for the matters at hand, we now prove it. The proof resembles the reasoning leading to Russell's paradox. Let x be any set, finite or infinite. Suppose that there is a one-to-one correspondence f between x and its power set $\mathcal{P}(x)$. We now define a certain subset of x which we shall call C. Let C contain just those elements y of x which are not members of f(y); that is, not members of those subsets of x with which they are correlated by this correspondence f. Thus,

(A) $y \in C \equiv y \notin f(y)$.

Now C is a subset of x, and thus by our supposition is f(a), for some a in x. Thus, from (A) there follows

(B) $y \in f(a) \equiv y \notin f(y)$.

The reader can hardly fail to see what is about to happen! Taking y as a, we obtain

(C) $a \in f(a) \equiv a \notin f(a)$.

But (C) obviously leads to a contradiction. Thus our assumption that there is a one-to-one correspondence between x and $\mathcal{P}(x)$ must be false. Thus x and $\mathcal{P}(x)$ are not of equal cardinality. Now the cardinality of $\mathcal{P}(x)$ is clearly no smaller than that of x, since we can get a one-to-one correspondence between x and a subset of $\mathcal{P}(x)$ by correlating each element of x with its own unit set, which will appear in $\mathcal{P}(x)$. Thus (by the comparability of sets principle), the cardinality of $\mathcal{P}(x)$ is larger than that of x, and Cantor's theorem is proved.
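The diagonal construction in this proof can be checked mechanically on a small finite set. The following Python sketch is only an illustration, not part of the proof: the helper names (`power_set`, `diagonal_set`, `some_map_hits_its_diagonal`) are hypothetical, and the check runs over every map f from x into its power set, not only one-to-one correspondences, since the argument never uses that f is one-to-one.

```python
from itertools import combinations, product

def power_set(x):
    """All subsets of the finite set x, as frozensets."""
    elems = sorted(x)
    return [frozenset(c)
            for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def diagonal_set(x, f):
    """The set C of the proof: those y in x with y not a member of f(y)."""
    return frozenset(y for y in x if y not in f[y])

def some_map_hits_its_diagonal(x):
    """Return True if some map f from x to its power set has C among its values."""
    elems = sorted(x)
    subsets = power_set(x)
    # Enumerate every assignment of a subset of x to each element of x.
    for images in product(subsets, repeat=len(elems)):
        f = dict(zip(elems, images))
        if diagonal_set(x, f) in f.values():
            return True
    return False

print(some_map_hits_its_diagonal({0, 1, 2}))  # prints False
```

For a three-element set there are 8 subsets and 512 maps; in none of them is C a value of f, which is exactly what statements (A) to (C) predict.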

Closely related to these results on independence are certain results on relative consistency. A relative consistency proof for a particular axiom of ZF is a proof that if the system of axioms of ZF without that axiom is consistent, then the result of adding that axiom (i.e., ZF itself) is also consistent. Thus, the axiom of regularity has been shown by von Neumann to be consistent


relative to the remaining axioms of ZF: if ZF without the axiom of regularity is consistent, then it remains consistent when that axiom is added. von Neumann proved this result by defining what is known as an inner model. That is, given any model M for the axioms of ZF without the axiom of regularity, he showed how to define in terms of M a model M' for these axioms taken together with the axiom of regularity; that is, for ZF itself. The model M' is a certain portion of the model M; the sets of M' are various of the sets of M. In particular, they are the well-founded sets of M; that is, those sets of M that can be built up by starting with the null set and then applying the power set and union operations some finite or transfinite number of times. Thus, we may conclude that if no contradiction is derivable from ZF without the axiom of regularity, then no contradiction is derivable from ZF with the axiom of regularity. Actually, von Neumann proved this result not for ZF but for a closely related set theory; his proof can be modified so as to apply to ZF itself, however.

The method of proving relative consistency through inner models is essentially the method used in the classic proofs of the consistency of various non-Euclidean geometries relative to Euclidean geometry. In the case of these latter proofs, one shows that if Euclidean geometry has a model, then a certain portion of that model is itself a model of some non-Euclidean geometry; e.g., the geometry of the Euclidean sphere presents us with a model for Riemannian plane geometry. In the case of proving the relative consistency of various set-theoretical axioms through inner models, one shows that if a certain group of set-theoretical axioms has a model, then a certain portion of this model is itself a model for the result of adding one or more axioms to this particular group of axioms. In the geometric case, we show that if a certain theory is consistent, then an alternative (given the familiar interpretation, that is) theory is consistent; in the case of set theory, we show that if a certain theory is consistent, then the result of making certain additions to that theory is consistent.

The most outstanding result among relative consistency proofs in set theory is due to Gödel (1938, 1940), and concerns the Axiom of Choice and Cantor's Generalized Continuum Hypothesis. This is the hypothesis that for every infinite set x there is no set which is of greater power than x, but of less power than the set of all subsets of x; i.e., the cardinal number of the set of all subsets of x is the next larger cardinal after that of x itself. The Continuum Hypothesis itself is thus the special case where x is a denumerably infinite set. Now Gödel has shown, by means of defining a certain very interesting inner model, that the Axiom of Choice (AC) and the Generalized Continuum Hypothesis (GCH) are consistent relative to the remaining axioms of set theory. 14 We shall now give a brief and informal account of the basic steps in Gödel's argument as it applies to ZF (1938). For simplicity of presentation, let us agree to use the symbol 'ZF' throughout to stand not for the whole system of Zermelo-Fraenkel axioms, but only for the system of all of these axioms except for the Axiom of Choice.

First, Gödel defines a certain domain of sets, which he calls constructible sets. Unfortunately, the exact characterization of these sets draws upon the notion of ordinal number (as does most consideration of the modeling of set theory), and we prefer not to introduce that notion here. Roughly speaking, however, what Gödel does is to define a transfinite sequence of sets, starting with the null set; and then, at each subsequent point in this sequence, defining the set which appears at that point either as (a) the set of all set-theoretically definable subsets of the set which appears at the immediately preceding point in this sequence of sets, or as (b) the union of all sets which appear at earlier points in this sequence. Thus, we start with the null set, and from then on at each point introduce a new set whose elements are 'constructed' out of sets already obtained. 15 Any set which appears as an element of any one of the sets in this transfinite sequence of sets is called a constructible set, and Gödel refers to the domain of all constructible sets as the domain L.

Gödel's definition of constructible sets is an example of a so-called 'predicative definition,' in that at each point in the sequence the set that is introduced is defined in terms of sets that appear earlier in the sequence, and never in terms of any totality that presupposes that set.

From Gödel's definition it follows immediately that all constructible sets are well-founded; that is, that they are among those sets obtained by starting with the null set and then applying the power set and union operations some finite or transfinite number of times.

Gödel's second step is to consider the relativization of each formula of ZF to the domain of constructible sets L. For any formula, the relativization of that formula to L is simply the formula that results from replacing in that formula each subformula of the form $(a)A$ by a formula of the form $(a)(a \in L \supset A)$; and replacing each subformula of the form $(\exists a)A$ by a formula of the form $(\exists a)(a \in L \wedge A)$.

Third, Gödel now shows that the relativization to L of each of the axioms of ZF (thus not including AC) is itself a theorem of ZF. It follows that the relativization to L of each of the theorems of ZF is itself a theorem of ZF. What this means intuitively is that given any model of ZF, the constructible sets within that model themselves give rise to a model of ZF. To obtain this inner model, simply take these constructible sets as the domain, and to the primitive symbol '$\in$' assign the membership relation confined to those sets.

14 K. Gödel 1938, 1940. The argument in the 1938 paper applies to various systems of set theory, including Zermelo-Fraenkel set theory; the 1940 booklet proceeds in terms of the von Neumann-Bernays-Gödel set theory. For an informal exposition and discussion of Gödel's results, see A. Mostowski 1966, Lecture IX.

15 Again for those readers who have the concept of ordinal number, the exact definition is as follows: Let A be any formula of ZF containing k + 1 free variables, and let K be any set. Then for any choice of k values in K for any k of these k + 1 free variables, there is a set B of elements of K which satisfy A. Let D(K) be the set of all those subsets B of K which are definable in this way; i.e., by way of all such formulas A. D(K), that is, is the set of all set-theoretically definable subsets of K. We now define by transfinite induction a function T from ordinal numbers to sets as follows:

$T(0)$ = the empty set
$T(\alpha + 1) = D(T(\alpha))$
If $\lambda$ is a limit number, then $T(\lambda)$ is the union of all $T(\beta)$, for all $\beta < \lambda$.

A set is now said to be constructible if and only if it belongs to $T(\alpha)$, for some ordinal $\alpha$.
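The relativization of a formula to L, described in Gödel's second step above, is a purely syntactic transformation, and it may help to see it written out as a small recursive procedure. The Python sketch below is an illustration under stated assumptions: the tuple encoding of formulas and the tag `"L"` (standing for the ZF formula which expresses that a set is constructible) are hypothetical conventions of this sketch, not notation from the text.

```python
# Hypothetical tuple encoding of formulas (an assumption of this sketch):
#   ("in", a, b)          atomic formula: a is a member of b
#   ("L", a)              stands for the formula "a is constructible"
#   ("not", A)            negation
#   ("and", A, B)         conjunction
#   ("implies", A, B)     conditional
#   ("all", a, A)         universal quantification (a)A
#   ("ex", a, A)          existential quantification (Ea)A

def relativize(formula):
    """Return the relativization of `formula` to L."""
    op = formula[0]
    if op in ("in", "L"):
        # Atomic formulas contain no quantifiers and are left unchanged.
        return formula
    if op == "not":
        return ("not", relativize(formula[1]))
    if op in ("and", "implies"):
        return (op, relativize(formula[1]), relativize(formula[2]))
    if op == "all":
        _, var, body = formula
        # (a)A becomes (a)(a in L implies A')
        return ("all", var, ("implies", ("L", var), relativize(body)))
    if op == "ex":
        _, var, body = formula
        # (Ea)A becomes (Ea)(a in L and A')
        return ("ex", var, ("and", ("L", var), relativize(body)))
    raise ValueError(f"unknown connective: {op!r}")

# Example: "every set is a member of some set".
sentence = ("all", "x", ("ex", "y", ("in", "x", "y")))
print(relativize(sentence))
```

Each quantifier is restricted to L and everything else is copied unchanged, so the relativized sentence says of the constructible sets what the original sentence says of all sets.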


Gödel's final move is to show that the relativizations of AC and GCH to L are each theorems of ZF. He shows this in two steps. First, he shows how to formulate in ZF a sentence to the effect that all sets are constructible (the so-called Axiom of Constructibility), and then shows that the relativization of this sentence to L is itself a theorem of ZF. Thus, the Axiom of Constructibility is true in the above inner model (this, by the way, is not trivially true). Second, Gödel shows that the Axiom of Constructibility itself implies both AC and GCH; that is, that the conditional from this axiom to the conjunction of AC and GCH is a theorem of ZF. That AC follows from the Axiom of Constructibility is easily seen, since this axiom guarantees a well-ordering of all sets. The proof that GCH follows from this axiom is the most difficult part of Gödel's whole argument. It follows that both AC and GCH are true in our inner model of ZF. Thus, if ZF admits any model at all then ZF + AC + GCH has a model in the constructible sets contained within that model. If ZF is consistent, therefore, ZF + AC + GCH is consistent. 16

When we add Cohen's above results to Gödel's results, we see that the Axiom of Choice is both consistent with, and independent of, the remaining axioms of ZF (now using 'ZF' to stand for the totality of the Zermelo-Fraenkel axioms, AC included), in the sense that it can neither be proved nor refuted from those remaining axioms; and, further, that the same is true of both the Continuum Hypothesis and the Generalized Continuum Hypothesis. In addition, it is known that the Axiom of Constructibility is itself both consistent with and independent of ZF + GCH: after having added GCH to ZF we are still not committed either way on the assumption that all sets are constructible in the above sense. Finally, it should be added that it is known that GCH implies AC. Thus, the Axiom of Constructibility implies GCH, which in turn implies AC; neither of these implications holds in the reverse direction, however.

16 Gödel's consistency proof is in fact a constructive proof, in that it provides an effective procedure for deriving a contradiction within ZF, once given the derivation of a contradiction in ZF + AC + GCH.


The above results are valid not only for the Zermelo-Fraenkel set theory, but for certain other set theories as well; in particular, they hold true for the von Neumann-Bernays-Gödel set theory.

Now that we know that the Continuum Hypothesis (in both its simple and generalized forms) is independent of the remaining axioms of set theory, we are faced with the question as to whether or not it should be accepted as an independent axiom. The situation here is the same as that we encountered in the case of the Axiom of Choice; however, the case for accepting the Continuum Hypothesis seems to be considerably weaker than is the case for accepting the Axiom of Choice. The Axiom of Choice has a high intuitive plausibility, which the Continuum Hypothesis does not: presumably no one regards the Continuum Hypothesis as obviously true! The Axiom of Choice has very great importance in applications within both classical and modern mathematics, while the Continuum Hypothesis is of comparatively minor importance in applications. As we have already remarked, certain of the consequences of the Axiom of Choice seem peculiar (at any rate, they are highly unexpected). And the same is true for the Continuum Hypothesis; indeed, perhaps certain of the results here are more truly implausible than are any that follow from the Axiom of Choice. In his very important paper on the Continuum Hypothesis (1947), Gödel speaks of these results as 'highly implausible,' and writes that 'it is very suspicious that, as against the numerous plausible propositions which imply the negation of the continuum hypothesis, not one plausible proposition is known which would imply the continuum hypothesis.' 17 Indeed, Gödel himself puts a Platonistic (realistic) interpretation on the Continuum Hypothesis, and it is his opinion that this hypothesis is false, and that someday it will be shown to be false through the discovery of new set-theoretical axioms, whose discovery will come about as a result of the study of the Continuum Hypothesis itself. Of course, since the Axiom of Constructibility implies the Continuum Hypothesis, Gödel holds that this axiom is false also. That is, he rejects the view that all sets are constructible.

Gödel's presupposition is that there exists some 'well-determined reality' - viz., the domain of all sets - in terms of which

17 K. Gödel 1947, section 4.


the Continuum Hypothesis (and every other sentence in set theory) has a determinate truth-value, quite apart from whether that truth-value can be ascertained by means of the axioms of contemporary set theory. One might, however, reject this presupposition, and maintain that there is no one determinate totality of all sets. Rather, there is the possibility of positing a number of different totalities of 'all sets'; in certain of these the Continuum Hypothesis is true (as in Gödel's model of constructible sets), while in the remaining ones it is false. Whether the Continuum Hypothesis is itself true or false is, then, perhaps only a matter as to which of these different totalities best serves the overall purposes which mathematics as a whole at any one time imposes upon set theory.

7.5. Strengthened Forms of ZF

We shall now close this discussion of Zermelo-Fraenkel set theory by mentioning several ways in which this theory can be extended so as to result in more powerful theories. First, we can turn to Gödel's incompleteness theorems (which we shall consider in Chapter VIII), according to which every consistent and effectively defined theory containing elementary arithmetic must be incomplete. In particular, Gödel has shown that no such theory can contain as a theorem any sentence from some infinite class of sentences each of which expresses the consistency of that theory. These sentences that Gödel considered express the consistency of a particular theory by stating that there is some formula of that theory which is not provable within that theory. Since elementary arithmetic can readily be developed within ZF, it follows that if ZF is consistent then no sentence of this particular sort is provable within ZF. Clearly, however, if we are to take ZF seriously at all we must suppose that it is consistent. We may, therefore, in good conscience add to ZF, as a new axiom, some sentence of the above sort to the effect that ZF is consistent. (In his incompleteness proofs, Gödel has shown that such a sentence of consistency can always be expressed as a sentence within


elementary arithmetic, and thus within ZF.) Since this axiom is by Gödel's results independent of ZF, the resulting theory will be a stronger theory than ZF itself. Nevertheless, it will be consistent if ZF is consistent; and, of course, it will be incomplete. We may therefore add to this theory an axiom of the above sort to the effect that it is consistent; by the above reasoning, the result will be a yet stronger consistent theory. This process can be repeated an infinite number of times, resulting in a theory that is still consistent (and incomplete) if ZF is itself consistent.

A second, more specifically set-theoretical, possibility is to add to ZF an axiom to the effect that there exists a very large cardinal number. Such axioms are called strong axioms of infinity. As one type of such axioms we have the so-called axioms of inaccessible cardinals (Tarski, 1938, 1939). It can be shown that for any set of cardinals X there is a smallest cardinal which is greater than each of the cardinals in X. This cardinal is denoted by the expression 'sup X'. Now a cardinal m is inaccessible if and only if:

(1) $\aleph_0 < m$;
(2) for every set of cardinals X, if there are fewer than m cardinals in X, and n < m for all cardinals n in X, then sup X < m; and
(3) if n < m and p < m, then $n^p < m$.

Tarski's axiom of inaccessible cardinals postulates the existence of a cardinal which is inaccessible in this sense. It is known that one can consistently suppose that there are no such cardinals; that is, that their existence cannot be proved within ZF. It is not known, however, whether the supposition that such cardinals exist is itself consistent relative to the remaining axioms of set theory. If it is, then the result of adding the above axiom to ZF is a consistent theory which is stronger than ZF, in which one is able to prove the existence of this new cardinal, together with all of the new sets that result from adding this inaccessible cardinal to the domain.

When we add an axiom of inaccessible cardinals to ZF, within the resulting theory we are able to establish the existence of sets which are sufficiently large as to serve as domains of ZF itself. Thus, by adding such an axiom we are able to show that ZF has a model, and thus that it is consistent. By going to a stronger theory than ZF itself, then, we are able to prove the consistency of ZF.


This proof, of course, is highly 'non-constructive'; and if anyone had real doubts about the consistency of ZF, this proof would hardly lay those doubts to rest.

After having added to ZF an axiom for inaccessible cardinals, one can take the resultant system and add a second such axiom to it, positing the existence of cardinals inaccessible within it. And this process can be kept up without end; indeed, one can assume that the cardinal number of inaccessible cardinals is itself inaccessible! None of these axioms of inaccessible cardinals, however, serves to establish or refute either the Continuum Hypothesis or the Axiom of Constructibility - supposing that the results of adding these axioms to ZF are still consistent systems. Of course, the question as to whether our resultant systems are still consistent becomes ever more pressing, as we get closer and closer to 'Cantor's paradise', or 'Cantor's absolute'; that is, to Cantor's intuitive set theory, which proved to be so comprehensive as to be contradictory.

Finally, there is the possibility of considering a second-order formulation of ZF. Here the underlying logic is a second-order logic. The singulary predicate variables are understood as ranging over the domain of all subsets of the domain of ZF; and, in general, for every n, the n-ary predicate variables are understood as ranging over the domain of all n-ary relations over the domain of ZF. As axioms, we take the axioms of ZF except for the axiom schema of separation and the axiom schema of replacement. We replace these axiom schemata by single axioms, in which bound predicate variables appear. In particular, the axiom schema of separation

$(x)(\exists y)(z)(z \in y \equiv z \in x \wedge A)$

is replaced by the single axiom

$(F)(x)(\exists y)(z)(z \in y \equiv z \in x \wedge Fz)$;

and the axiom schema of replacement

$(x_1)((x)(y)(z)((x \in x_1 \wedge A \wedge A') \supset y = z) \supset (\exists y_1)(y)(y \in y_1 \equiv (\exists x)(x \in x_1 \wedge A)))$


is replaced by the single axiom

$(F)(x_1)((x)(y)(z)((x \in x_1 \wedge Fxy \wedge Fxz) \supset y = z) \supset (\exists y_1)(y)(y \in y_1 \equiv (\exists x)(x \in x_1 \wedge Fxy)))$.

This second-order formulation of ZF, then, unlike ZF, has finitely many axioms. And if we add to it a further axiom to the effect that there are no inaccessible cardinals then the result is a set of second-order axioms which is categorical and thus complete in the semantical sense.

CHAPTER VIII

INCOMPLETENESS. UNDECIDABILITY

8.1. Introduction

We turn now, in this final chapter, to a consideration of two topics which are of the greatest importance in any inquiry into the general nature of formalized theories; viz., the topics of incompleteness and undecidability. A theory T is incomplete, recall, if and only if there is a sentence A in the language of T such that neither A nor $\sim A$ is a theorem of T; and T is undecidable if and only if there is no effective procedure for determining in each case whether a formula A stated in the language of T is a theorem of T. We have in earlier chapters mentioned a number of examples of incomplete theories and of undecidable theories - as well as of complete theories and decidable theories. We need now to consider these matters in a more general way.

The incompleteness and undecidability results of Gödel and Church, together with related results by Rosser and other logicians during the 1930's, showed that the principal objectives of the famous Hilbert program could not be fully accomplished. This program, which was started about 1917, was concerned with what was called metamathematics, or proof theory, or syntax. Its object was the study of mathematics as uninterpreted theory. Hilbert here set himself and his associates the task of first presenting formalizations of logic and the various branches of classical mathematics (including the Cantor set theory short of the paradoxes); and then establishing the consistency, completeness and decidability of these various formalizations. In the first part of this program the Hilbert school was able to draw extensively upon the


already existing work of Frege, Peano, Zermelo, and Russell and Whitehead. The second part of the program was the most distinctive and original part. Here Hilbert proposed to establish the consistency, completeness and decidability of his formalizations of classical mathematics by drawing upon only very restricted methods of proof, which he referred to as 'finitistic.' Finitistic methods of proof are meant to be of such an elementary character as to be absolutely compelling; that is, they are meant to establish their conclusions beyond all possible doubt. Finitistic proof procedures, of course, constitute only a small portion of the totality of proof procedures accepted within classical mathematics itself. Unfortunately, Hilbert never gave an exact account of what he meant by 'finitistic.' Nevertheless, it is reasonably clear that finitistic methods of proof, as Hilbert conceived of them, meet all of the requirements that the intuitionists impose upon proof procedures. In particular, within finitistic proof procedures one does not permit (a) the unrestricted use of the Law of Excluded Middle; nor (b) the use of impredicative definitions; nor (c) the use of the Axiom of Choice; nor (d) the assertion of statements to the effect that there exists something which satisfies a given condition, unless it is shown in an effective way how to construct an example of something satisfying that condition. Thus, within finitistic metamathematics Hilbert rejected indirect proofs of existence, which show only that the assumption that everything fails to satisfy a given condition leads to a contradiction, without showing how to find an example of something that does satisfy that condition. And Hilbert went even further than the intuitionists, in refusing to permit any reference to infinitely many entities, properties or operations. Part of his program was directed to a clarification of the nature of the infinite, and he did not wish to prejudice the success of that program by presupposing a concept which he himself was ready to grant needed clarification.

The Hilbert school worked at this program for many years, and obtained a considerable number of important partial results, which were presented in the classic two-volume Grundlagen der Mathematik, by D. Hilbert and P. Bernays (Vol. I, 1934; Vol. II, 1939). The possibility of anything like complete success, however, was


ruled out by the very important discoveries of Gödel in 1931, together with results of Church and Rosser in 1936. Metamathematics at about this time took a new turn, and was soon to go far beyond the limitations of Hilbert's finitistic metamathematics (though finitistic metamathematics, to be sure, still remains of considerable interest and importance). The greatest single contribution to this extension of metamathematics to something like its present state is due to A. Tarski, in his semantic definition of truth (Wahrheitsbegriff, 1936). Whereas Hilbert's procedures had all treated formalized theories as syntactical systems, Tarski's discoveries made it possible to treat such theories as semantical systems. That is, whereas Hilbert had regarded the formalized theories whose consistency he was attempting to prove as completely uninterpreted, Tarski showed how it was possible to assign interpretations to such theories in a perfectly rigorous way, and indeed to define a whole group of very important semantical concepts (e.g., truth, model, consequence) with respect to those theories. Metamathematics, then, since Tarski's work, takes two forms; viz., syntax and semantics. In this sense of the term 'metamathematics', much of our concern in this book is with metamathematics.

8.2. Recursive Functions and Relations. Representability

Before we take up the topics of incompleteness and undecidability, we shall need some new concepts as tools to work with. The first group of new concepts we shall use here are the concepts of (a) number-theoretic functions and relations; (b) recursive functions and relations; and (c) representable functions and relations. The functions and relations under (b) and (c) are certain of the functions and relations under (a); indeed, it will turn out that those under (b) are identical with those under (c).

An n-ary number-theoretic function is simply a function whose arguments are n-tuples of natural numbers, and whose values are natural numbers; and a number-theoretic relation is simply a relation whose arguments are natural numbers. Every n-ary number-theoretic function assigns to each n-tuple of natural numbers exactly one natural number as value. The only functions and relations of a mathematical nature that we shall be concerned with in this chapter will be number-theoretic functions and relations. As examples, we have the operations of addition and multiplication, which are two-place functions; the successor operation, which is a one-place function; and the two-place relation of less than.

Since the only numbers that we shall be concerned with in this chapter are natural numbers, we shall from this point on use the term 'number' in the sense of 'natural number.'

We now turn to recursive functions and relations, taking first the case of recursive functions. We define these inductively, by listing certain functions as initial functions, and then presenting a number of rules for generating functions from other functions. Any function which is either an initial function, or can be obtained from initial functions by finitely many applications of these rules, is called a recursive function. Thus, the definition of 'recursive function' recalls our earlier definitions of 'formula' and 'theorem.' In each of these cases, we define a certain class of entities as those that can be generated by starting with certain given entities and then applying certain procedures finitely many times. And just as different sets of axioms and rules of inference can lead to the same theorems, there are a number of different possible choices of initial functions and rules for generating functions, all leading to the class of recursive functions.

As initial functions we take the following functions:

(a) The zero function Z(x); that is, that function which, when applied to any number x, yields the value 0.

(b) The successor function N(x); this function, when applied to any number x, yields the value x + 1.

(c) The projection functions. For each n and each $i \le n$, there is a projection function $I_i^n(x_1, \ldots, x_n)$ which, when applied to any n-tuple of numbers $(x_1, \ldots, x_n)$, yields the value $x_i$; i.e., $I_i^n(x_1, \ldots, x_n) = x_i$.

As rules for generating new functions from given functions, we have the following rules:


(a) Substitution. Let h be an m-place function, and $g_1, \ldots, g_m$ be n-place functions. Then the n-place function f defined by

$f(x_1, \ldots, x_n) = h(g_1(x_1, \ldots, x_n), \ldots, g_m(x_1, \ldots, x_n))$

is said to be obtained from $h, g_1, \ldots, g_m$ by substitution (or composition).

(b) Recursion. Let g be an n-place function and h be an (n+2)-place function. Then there is a unique (n+1)-place function f which satisfies the equations

(1) $f(x_1, \ldots, x_n, 0) = g(x_1, \ldots, x_n)$
(2) $f(x_1, \ldots, x_n, y+1) = h(x_1, \ldots, x_n, y, f(x_1, \ldots, x_n, y))$

This function f is said to be obtained from g and h by recursion.

(c) μ-operation. Let g be an (n+1)-ary function such that for all numbers $x_1, \ldots, x_n$ there is at least one number y such that $g(x_1, \ldots, x_n, y) = 0$. Let the expression

$\mu y(g(x_1, \ldots, x_n, y) = 0)$

denote the least number y such that $g(x_1, \ldots, x_n, y) = 0$. Then the n-ary function f defined by the equation

$f(x_1, \ldots, x_n) = \mu y(g(x_1, \ldots, x_n, y) = 0)$

is said to be obtained from the (n+1)-ary function g by means of the μ-operator.

A number-theoretic function, then, is a (general) recursive function if and only if it can be obtained from initial functions by finitely many applications of these three rules. And it is a primitive recursive function if and only if it can be obtained from initial functions by means of finitely many applications of just the first two of these three rules; viz., by substitution and recursion. The primitive recursive functions, then, form a subclass of the class of recursive functions; indeed, a proper subclass, for there are recursive functions which are not primitive recursive.
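The initial functions and the three generating rules can be mirrored quite directly as higher-order functions. The following Python sketch is offered only as an illustration: the names `proj`, `substitute`, `recurse` and `mu` are hypothetical, Python integers stand in for natural numbers, and the recursion rule is unfolded as a loop from 0 up to y instead of by literal self-reference. Addition and multiplication are built from the familiar recursion equations, with the 3-place function h obtained by substitution from projections.

```python
def Z(x):
    """Zero function: Z(x) = 0."""
    return 0

def N(x):
    """Successor function: N(x) = x + 1."""
    return x + 1

def proj(n, i):
    """Projection function I_i^n: the i-th of n arguments (1-indexed)."""
    def I(*xs):
        assert len(xs) == n
        return xs[i - 1]
    return I

def substitute(h, *gs):
    """Rule (a): f(x1..xn) = h(g1(x1..xn), ..., gm(x1..xn))."""
    def f(*xs):
        return h(*(g(*xs) for g in gs))
    return f

def recurse(g, h):
    """Rule (b): f(xs, 0) = g(xs); f(xs, y+1) = h(xs, y, f(xs, y))."""
    def f(*args):
        *xs, y = args
        value = g(*xs)
        for k in range(y):  # apply equation (2) for k = 0, ..., y-1
            value = h(*xs, k, value)
        return value
    return f

def mu(g):
    """Rule (c): f(xs) = least y with g(xs, y) = 0.

    Does not terminate if no such y exists, which is why the rule
    requires that a zero exist for all arguments."""
    def f(*xs):
        y = 0
        while g(*xs, y) != 0:
            y += 1
        return y
    return f

# Addition: x + 0 = I_1^1(x);  x + (y+1) = N(x + y).
add = recurse(proj(1, 1), substitute(N, proj(3, 3)))

# Multiplication: x * 0 = Z(x);  x * (y+1) = (x * y) + x.
mul = recurse(Z, substitute(add, proj(3, 3), proj(3, 1)))

print(add(2, 3), mul(3, 4))  # prints 5 12
```

Since `add` and `mul` use only `substitute` and `recurse`, they are primitive recursive in the sense just defined; `mu` is the only rule whose evaluation can fail to terminate.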

Let us now illustrate how these rules are used. First, the function N(x + y) - i.e., the function (x + y) + 1 - is obtained from the successor function N(x) (which is h) and the addition function x + y (which is g1) by substitution. Second, the familiar equations


$x + 0 = x$ (i.e., $x + 0 = I_1^1(x)$)
$x + (y+1) = (x+y) + 1$ (i.e., $x + (y+1) = N(x+y)$)

illustrate the use of recursion (where n = 1) in which we obtain the function of addition (which is f) from the projection function $I_1^1(x)$ (which is g) and the successor function (which is h). Similarly, in the equations

$x \cdot 0 = 0$ (i.e., $x \cdot 0 = Z(x)$)
$x \cdot (y+1) = (x \cdot y) + x$

we use recursion so as to obtain the function of multiplication (which is f) from the zero function (which is g) and the addition function (which is h). To illustrate our third rule, let g be the binary function defined by the following equations:

$g(x, y) = 0$ if $x < y^2$,
$g(x, y) = 1$ otherwise.

It is clear that for every x there will be a y such that $g(x, y) = 0$. We can, then, use the μ-operator to obtain a singulary function f as follows:

$f(x) = \mu y(g(x, y) = 0)$.

Thus, f(0) = 1; f(1) = 2; f(2) = 2; f(3) = 2; f(4) = 3; etc.

As examples of primitive recursive functions, we have the familiar functions of successor, addition, multiplication, exponentiation, the minimum function min(x, y) (i.e., the minimum of x and y), the maximum function max(x, y) (i.e., the maximum of x and y), and for each n the constant function $Z_n(x)$, whose value is n, for each x. Further examples will be described in the next section of this chapter.
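The values of the function f obtained above by the μ-operator can be checked by carrying out the search literally. In this small Python sketch the function g is the one defined by cases in the text; the helper name `least_zero` is hypothetical, and the search is assumed to terminate because every x is less than some square.

```python
def g(x, y):
    """g(x, y) = 0 if x < y^2, and 1 otherwise."""
    return 0 if x < y * y else 1

def least_zero(x):
    """f(x) = the least y such that g(x, y) = 0, found by trying y = 0, 1, 2, ..."""
    y = 0
    while g(x, y) != 0:
        y += 1
    return y

print([least_zero(x) for x in range(5)])  # prints [1, 2, 2, 2, 3]
```

The printed list agrees with the values f(0) = 1, f(1) = 2, f(2) = 2, f(3) = 2, f(4) = 3 given in the text.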

Our interest in the class of recursive functions is with the relationship of this exactly defined class to the intuitively defined class of effectively computable, or effectively calculable, functions. A function is effectively computable (or calculable) if and only if there is an effective procedure for computing its value for any given arguments. Now it is very important to note at this point that recursive functions are all effectively computable. This is clear from the definition of recursive functions. For the initial functions are all obviously computable. When we apply the rule of substitution to the functions h, g1, ..., gm, if these functions are computable then the resulting function f will be computable. For in order to compute the value of f for any given arguments x1, ..., xn, all we need to do is to compute first the m values of g1, ..., gm for these arguments, and then compute the value of h for these m values taken as arguments. Similarly for the rule of recursion. Equation (1) permits us to compute the value of f(x1, ..., xn, 0), supposing that g is computable; and equation (2) permits us to compute the value of f(x1, ..., xn, y+1) once we know the value of f(x1, ..., xn, y), supposing that h is computable. And, finally, application of the μ-operator to a computable function g yields a function f which is computable. In order to compute the value of f(x1, ..., xn), we simply compute the values of g(x1, ..., xn, 0), g(x1, ..., xn, 1), etc., until we reach a zero value. We may conclude, therefore, that all recursive functions are effectively computable. It is by no means obvious, however, that the converse holds; that is, that all effectively computable functions are recursive. The thesis that the converse does hold is known as Church's thesis. Thus, if we accept Church's thesis, we must conclude that the class of computable functions is identical with the class of recursive functions. It is largely this result that gives recursive functions the importance that they have for logicians. For, as we shall see, the decision problem - one of the central problems in all of logic - can be formulated as a problem in computability. Once computable functions are identified with recursive functions, then, the decision problem takes on an exact character.

Recursive relations, now, can be defined in terms of recursive functions. If R is an n-ary relation, then its characteristic function CR is defined as follows:

CR(x1, ..., xn) = 0   if R(x1, ..., xn) holds
CR(x1, ..., xn) = 1   otherwise.

Thus, the function g(x, y) in our illustration of the use of the μ-operator is the characteristic function of the relation x < y².


Clearly every n-ary relation R has a unique characteristic function. This function assigns 0 to those ordered n-tuples of numbers which satisfy R, and 1 to all other ordered n-tuples. Thus, we define a relation as recursive if and only if its characteristic function is recursive; and as primitive recursive if and only if its characteristic function is primitive recursive. The familiar arithmetic relations from elementary school arithmetic are all primitive recursive; e.g., the identity relation, the less than relation, as well as the sets (i.e., singulary relations) of even numbers, odd numbers, primes, squares, multiples of some particular number k, etc.

Just as on the basis of primitive rules of inference we can establish certain derived rules of inference, from the above three rules (a)-(c) for obtaining new functions from given functions we can derive further rules for doing this, as well as for obtaining new relations and functions from given relations. These rules when applied to primitive recursive (or recursive) functions and relations lead to new functions and relations which are primitive recursive (or recursive). Thus, for example, we can form the negation of any relation (e.g., the inequality relation); the disjunction of any two relations (e.g., the less than or equal to relation); and so on for the remaining connectives from the sentential logic. As for the quantifiers, applying them to relations does not in general lead from primitive recursive (or recursive) relations to primitive recursive (or recursive) relations. We can, however, apply the bounded quantifiers. Thus, if R(x1, ..., xn, y) is an n+1-ary primitive recursive (or recursive) relation, then the n+1-ary relation (y)y<z R(x1, ..., xn, y) - that is, the relation such that for all y less than z, R(x1, ..., xn, y) holds - is primitive recursive (or recursive). Similarly for the relation (∃y)y<z R(x1, ..., xn, y). This is due, of course, to the fact that applying a bounded universal quantifier is equivalent to asserting a finite conjunction, and applying a bounded existential quantifier is equivalent to asserting a finite disjunction.
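The derived rules just described can be pictured as operations on characteristic functions. The following Python sketch is only an illustration under our own naming (negation, disjunction, bounded_forall, and bounded_exists are hypothetical helper names); it follows the convention of the text that the value 0 means that the relation holds.

```python
def char_less(x, y):
    """Characteristic function of the less than relation."""
    return 0 if x < y else 1


def char_eq(x, y):
    """Characteristic function of the identity relation."""
    return 0 if x == y else 1


def negation(c):
    """Characteristic function of the negation of the relation given by c."""
    return lambda *args: 1 - c(*args)


def disjunction(c1, c2):
    """The disjunction holds (value 0) if either disjunct has value 0."""
    return lambda *args: c1(*args) * c2(*args)


def bounded_forall(c):
    """Characteristic function of (y)y<z R(x1, ..., xn, y).

    The result takes the arguments x1, ..., xn, z. It is a finite
    conjunction over y = 0, ..., z - 1.
    """
    def result(*args):
        *xs, z = args
        for y in range(z):
            if c(*xs, y) != 0:
                return 1
        return 0
    return result


def bounded_exists(c):
    """Characteristic function of (Ey)y<z R(x1, ..., xn, y).

    It is a finite disjunction over y = 0, ..., z - 1.
    """
    def result(*args):
        *xs, z = args
        for y in range(z):
            if c(*xs, y) == 0:
                return 0
        return 1
    return result


char_neq = negation(char_eq)                 # the inequality relation
char_leq = disjunction(char_less, char_eq)   # less than or equal to

# "x has a square root below z": there is a y < z with y * y = x.
char_root_below = bounded_exists(lambda x, y: 0 if y * y == x else 1)
```

Each operation only inspects finitely many values of the given characteristic functions, which is why the bounded quantifiers, unlike the unbounded ones, stay within the class of computable relations.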

It is important to notice that the above characterization of the class of recursive functions and relations makes no reference to any formalized theory of arithmetic. This characterization is arithmetic in nature, rather than logical. In particular there is no reference in it to what can be proved within some theory. We introduce now a second characterization of a certain class of number-theoretic functions and relations. This second characterization is of a logical nature, in that it proceeds in terms of provability within some arithmetic theory; in particular, within the theory N from Chapter V. The class of functions and relations which will here be characterized is the so-called class of representable functions and relations. These functions and relations will be representable within N; i.e., within first-order Peano arithmetic.

First we introduce some new notation. A numeral is one of the symbols

0   S0   SS0   SSS0   ...

Numerals are certain names for natural numbers. Thus, for every natural number k, the expression which consists of a sequence of k 'S's followed by '0' is the numeral for k. We shall use the expressions

0̄   1̄   2̄   3̄   ...

to denote these numerals. In general, that is, if k is a natural number, the expression

k̄

denotes the numeral which denotes k. Further, we use the expression

A(a1, ..., an)

as a syntactical meta-variable to range over formulas which contain the variables a1, ..., an as free variables. For any formula A(a1, ..., an), and for any natural numbers k1, ..., kn, the expression

A(k̄1, ..., k̄n)

denotes the result of replacing the free variables a1, ..., an in A(a1, ..., an) by the numerals k̄1, ..., k̄n. For example, let A(a1, a2) be the formula 'x < y'. Then A(1̄, 2̄) is the formula

'S0 < SS0'.
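The passage from a number to its numeral, and from a formula to its numerical instance, is purely mechanical. A minimal Python sketch follows; it assumes, for simplicity, that variables are single letters and that every occurrence of a variable in the formula is free (the function names numeral and substitute are ours).

```python
def numeral(k):
    """Return the numeral for k: k occurrences of 'S' followed by '0'."""
    if k < 0:
        raise ValueError("numerals are defined for natural numbers only")
    return "S" * k + "0"


def substitute(formula, assignment):
    """Replace each variable named in assignment by the numeral of its value.

    Simplifying assumption: variables are single characters and all of
    their occurrences in the formula are free.
    """
    return "".join(
        numeral(assignment[ch]) if ch in assignment else ch
        for ch in formula
    )


print(substitute("x<y", {"x": 1, "y": 2}))  # S0<SS0
```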


We say now that an n-ary number-theoretic relation R(x1, ..., xn) is representable in N if and only if there is in N a formula A(a1, ..., an) with n free variables such that for all natural numbers k1, ..., kn,

(1) If R(k1, ..., kn) holds, then ⊢N A(k̄1, ..., k̄n);
(2) If R(k1, ..., kn) does not hold, then ⊢N ~A(k̄1, ..., k̄n).

In this case we say that A(a1, ..., an) represents R.

Let us illustrate this definition. Let R be the relation of less than among the natural numbers. Then this relation is representable in N, because there is in N a formula with two free variables - viz., 'x < y' - such that, for all natural numbers k1 and k2, conditions (1) and (2) hold. In terms of the formula 'x < y', condition (1) requires that if k1 is less than k2, then the formula

k̄1 < k̄2

be provable in N. Condition (2) requires that if k1 is not less than k2, then the formula

~ k̄1 < k̄2

be provable in N. And in fact these requirements are met in N. Loosely speaking, what is being required is that whenever the number x is less than the number y, this be provable in N; and whenever x is not less than y, this also be provable in N. Since these two requirements are in fact met, we say that the less than relation is representable in N.

An n-ary number-theoretic function f(x1, ..., xn) is said to be representable in N if and only if there is in N a formula A(a1, ..., an+1) with n+1 free variables such that for all natural numbers k1, ..., kn+1,

(1) If f(k1, ..., kn) = kn+1, then ⊢N A(k̄1, ..., k̄n, k̄n+1);
(2) ⊢N (∃xn+1) A(k̄1, ..., k̄n, xn+1) ∧ (x)(y)(A(k̄1, ..., k̄n, x) ∧ A(k̄1, ..., k̄n, y) ⊃ x = y).

In this case we say that A(a1, ..., an+1) represents f.

As an illustration, the operation of addition is representable in N because there is in N a formula - viz., 'x + y = z' - such that for all natural numbers k1, k2 and k3 conditions (1) and (2) hold. In terms of the formula 'x + y = z', what condition (1) requires is that if the sum of k1 and k2 is k3, then the formula

k̄1 + k̄2 = k̄3

be provable in N. Condition (2) requires that the formula

(∃xn+1)(k̄1 + k̄2 = xn+1) ∧ (x)(y)(k̄1 + k̄2 = x ∧ k̄1 + k̄2 = y ⊃ x = y)

be provable in N. And in fact these requirements are met in N.

How, now, is the (logically defined) class of relations representable in N related to the (arithmetically defined) class of recursive relations? In fact, these two classes turn out to be one and the same class, as do the class of functions representable in N and the class of recursive functions.¹ All recursive functions and relations are representable in N; and all functions and relations representable in N are recursive. This is a result of fundamental importance, and the first part of this result will be drawn upon later in this chapter.

8.3. Arithmetization

One more tool is needed; viz., arithmetization. In his 1931 paper, Gödel showed how the syntax of a formalized theory could be mapped into arithmetic; that is, how the language of syntax could be mapped into the language of arithmetic. The fact that one language can be mapped into another is familiar to us from Descartes' famous discovery that Euclidean geometry can be mapped into the theory of real numbers. The fundamental idea here is really that of a model; just as Descartes discovered that Euclidean geometry has a model in real number theory, Gödel discovered that syntax has a model in arithmetic. In each case there is an isomorphism between one language and part of another. In particular, under the isomorphism discovered by Gödel truths of syntax turn out to be equivalent to certain number-theoretic truths.

1 For a proof of this, see E. Mendelson 1964, pp. 131-134, 142. The classic text on recursive functions is S.C. Kleene 1952.

It is not necessary for our purposes to present here an actual arithmetization of the syntax of some particular theory. It will suffice if we merely describe the essential features of an arithmetization of a theory T. Let T be any theory, of either first or second order. What we do first is assign natural numbers to each of the symbols, expressions, and finite sequences of expressions of T, in a way such that each number that is assigned is assigned only once, and such that (a) for each particular symbol, expression, or finite sequence of expressions, we can effectively compute its number - called its Gödel number; and (b) given any number, we can effectively determine whether it is assigned to anything by this assignment, and if so, precisely to what it is assigned. Such an assignment, then, is simply a one-to-one function from the symbols, expressions, and finite sequences of expressions of T to a subset of the natural numbers, where this function meets the requirements (a) and (b). For any given theory T, there will be infinitely many different possible assignments meeting these requirements.
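Although no particular arithmetization is needed for what follows, a toy example may make requirements (a) and (b) concrete. The Python sketch below uses the familiar device of prime-power coding for a small, hypothetical alphabet of our own choosing (SYMBOLS and its symbol codes are assumptions, not the numbering of any particular author), and it covers expressions only, not symbols taken singly or finite sequences of expressions.

```python
# Hypothetical alphabet; each symbol receives the code 1, 2, 3, ...
SYMBOLS = ["0", "S", "+", ".", "=", "<", "(", ")", "~", "x", "y"]
CODE = {symbol: index + 1 for index, symbol in enumerate(SYMBOLS)}


def primes():
    """Yield the primes 2, 3, 5, ... by trial division."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def godel_number(expression):
    """Requirement (a): effectively compute the number of an expression.

    The i-th prime is raised to the code of the i-th symbol.
    """
    number = 1
    for p, symbol in zip(primes(), expression):
        number *= p ** CODE[symbol]
    return number


def decode(number):
    """Requirement (b): return the expression numbered, or None if none is."""
    if number < 2:
        return None
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        # A skipped prime or an exponent that is not a symbol code
        # means the number is not assigned to any expression.
        if exponent == 0 or exponent > len(SYMBOLS):
            return None
        symbols.append(SYMBOLS[exponent - 1])
    return "".join(symbols)


print(godel_number("S0"))  # 12, i.e. 2**2 * 3**1
print(decode(12))          # S0
```

By the uniqueness of prime factorization, distinct expressions receive distinct numbers, so the assignment is one-to-one.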

The second step in the arithmetization of T is to define number-theoretic equivalents to the various syntactic relations and functions pertaining to T. That is, in terms of some particular Gödel numbering, we define for each syntactic relation a number-theoretic relation which is its equivalent in terms of this Gödel numbering, in the sense that this particular syntactic relation holds between given syntactic entities (i.e., symbols, expressions, or finite sequences of expressions) if and only if this particular number-theoretic relation holds between the Gödel numbers of these syntactic entities. And similarly for the syntactic functions. Thus, for example, the syntactic properties (i.e., singulary relations) of being a variable, an individual constant, a formula, an axiom, a proof all receive their number-theoretic equivalents, which are the sets of all the Gödel numbers of variables, individual constants, formulas, axioms, and proofs (that is, derivations from axioms).


Assume now that T is a theory such that the sets (i.e., singulary relations) of Gödel numbers of individual constants of T, of predicate constants of T, of operation symbols of T, and of non-logical axioms of T are all (primitive) recursive sets. Then the number-theoretic equivalents of most of the syntactic relations and functions pertaining to T will be (primitive) recursive. That is, if these former sets are all recursive, then these latter relations and functions will be recursive; and if these former sets are all primitive recursive, then the latter relations and functions will be primitive recursive. In particular, this will be true of the number-theoretic equivalent of the relation of being a proof of. Let us use the expression 'PfT(x, y)' to stand for that number-theoretic relation which holds between the numbers x and y if and only if x is the Gödel number of a formula and y is the Gödel number of a proof in T of the formula with Gödel number x. Then (under our assumption) PfT(x, y) will be a (primitive) recursive relation. The expression '(∃y)PfT(x, y)' will then denote the number-theoretic property of being the Gödel number x of a theorem of T. We have here, however, applied an unbounded existential quantifier, and for that reason we cannot conclude that this particular number-theoretic property will be (primitive) recursive. For some theories T it will be (primitive) recursive, but for others it will not. In particular, it is known (as we shall see) that it is not a recursive property if T is the arithmetic theory N.
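The difference between the decidable relation PfT(x, y) and the property (∃y)PfT(x, y) can be pictured as the difference between a single check and an open-ended search. In the Python sketch below, toy_pf is a purely hypothetical stand-in for a proof relation (it is not the proof relation of any actual theory), and max_candidates is an artificial bound added only so that the example terminates.

```python
from itertools import count


def search_for_proof(pf, x, max_candidates=None):
    """Search y = 0, 1, 2, ... for a y such that pf(x, y) holds.

    With max_candidates=None the search is unbounded: it returns a
    witness if one exists and otherwise never halts. With a bound it
    returns None once the bound is exhausted, which settles nothing
    about larger candidates.
    """
    candidates = count() if max_candidates is None else range(max_candidates)
    for y in candidates:
        if pf(x, y):
            return y
    return None


def toy_pf(x, y):
    """Hypothetical stand-in for PfT: 'y is a proof of x' iff y * y == x."""
    return y * y == x


print(search_for_proof(toy_pf, 49))                       # 7
print(search_for_proof(toy_pf, 50, max_candidates=1000))  # None
```

In this toy case the property happens to be decidable by other means (the "theorems" are just the perfect squares), which corresponds to the remark that for some theories the property of being a theorem is recursive. The search procedure by itself, however, gives no way of ever answering in the negative.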

The fact that the number-theoretic equivalents of most syntactic relations and functions are (primitive) recursive for theories meeting the assumption of the preceding paragraph is of crucial importance for our subsequent consideration of incompleteness and undecidability. For, as we remarked in the preceding section of this chapter, recursive relations and functions are all representable in the theory N. Now since N is a theory satisfying our assumption - in particular, in the case of N each of the sets mentioned in this assumption is primitive recursive -, it follows that the syntax of N can be developed within N itself. That is, N can serve as its own syntactical metalanguage. It is this fact which is central in the demonstration that N is incomplete, to which we now turn.


The argument of Gödel's 1931 paper, which we are about to consider, uses the concept of representable relations, together with the concept of primitive recursive functions and relations. It does not use the concept of (general) recursive functions and relations.² Indeed, the major conclusion of this paper - viz., the incompleteness of arithmetic - can be established using only the concept of representable functions and relations (since the concept of recursive functions and relations is known to be coextensive with this concept). We shall state the argument, however, so as to use both the concept of representable functions and relations and the concept of (primitive) recursive functions and relations.

8.4. Gödel's First Incompleteness Theorem

In his 1931 paper Gödel demonstrated the incompleteness of a certain axiomatic theory of arithmetic; viz., the theory which results from adding Peano's axioms to the logic of type theory as developed in Principia Mathematica. Gödel's reasoning, however, can readily be transferred to elementary axiomatic theories of arithmetic, and in this section we consider the application of Gödel's reasoning to the arithmetic theory N.

We say that the theory N is ω-consistent if and only if for every formula A(a) of N, if ⊢N A(k̄) for every number k, then it is not the case that ⊢N (∃x) ~A(x). (Similarly for any theory T with the same symbols as N.) This means, under the intended interpretation of N, that if for some property we can prove, for each natural number k, that k has that property, we cannot prove that there is some number that does not have that property. Since we accept the intended interpretation of N as a model of N, we can conclude that N is ω-consistent. However, this argument that N is ω-consistent is clearly of a 'non-constructive' semantic character, and for that reason whenever we need the assumption that N is ω-consistent - an assumption which is of a syntactic character - in any subsequent argumentation we shall list that assumption explicitly.

2 For an introductory treatment of many of the remaining topics in this chapter which uses only the concept of representability, see R. Jeffrey 1967, Chapter 10. See also R. Lyndon 1966, pp. 80-90, which also uses only the concept of representability, and covers most of these remaining topics within the scope of 10 pages!

It is easy to show that if a theory T (having the same symbols as N) is ω-consistent, then it is consistent. The converse, however, does not hold in general; and we shall subsequently give an example of a theory which is consistent but not ω-consistent. Consistency, then, is a weaker requirement than is ω-consistency.

Consider now the number-theoretic relation W(k1, k2) which holds between the natural numbers k1 and k2 if and only if k1 is the Gödel number of a formula A(x) containing the free variable 'x', and k2 is the Gödel number of a proof (in N) of A(k̄1); that is, of the formula which results from A(x) by replacing all free occurrences of 'x' by the numeral k̄1. This relation can be shown to be primitive recursive. It is, therefore, representable in N. Thus, there is in N a formula B(a, b) with two free variables a and b such that, for all natural numbers k1 and k2,

(a) if W(k1, k2) holds, then ⊢N B(k̄1, k̄2);
(b) if W(k1, k2) does not hold, then ⊢N ~B(k̄1, k̄2).

Consider next the formula

(C) (y) ~B(x, y);

that is, the formula which is obtained by taking the formula B(a, b), where a is 'x' and b is 'y', and prefixing the expression '(y)~'. Call this open formula C. C is one of the formulas A(x) with free variable 'x'. It has a Gödel number. Let this Gödel number be m. We now 'diagonalize' on C, by replacing all free occurrences of 'x' in C by the numeral m̄, and obtain the closed formula which is crucial for Gödel's whole argument:

(G) (y) ~B(m̄, y).

Our characterization of the formula G proceeds in terms of the primitive recursive relation W. Just how is W related to the formula G? W(k1, k2) holds, recall, if and only if k1 is the Gödel number of a formula A(x) containing the free variable 'x', and k2 is the Gödel number of a proof in N of A(k̄1). Now the formula C is a formula containing the free variable 'x', and m is its Gödel number. Furthermore, G results from C by replacing all free occurrences of 'x' by the numeral m̄. Therefore, we have the result:

(I) W(m, k) holds if and only if k is the Gödel number of a proof in N of the formula G.

Clearly, then, if W(m, k) holds for some number k, G is provable in N; otherwise, not.
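The diagonal construction of G from C is a purely syntactic operation, and can be mimicked on strings. The Python sketch below is a toy only: the Gödel numbering (reading the characters of an expression as one large integer), the bracket notation standing in for a numeral, and the treatment of 'B' as an unanalyzed predicate symbol are all assumptions made for the sake of illustration.

```python
def godel_number(expression):
    """Toy Gödel numbering: read the UTF-8 bytes as one integer.

    One-to-one on expressions that do not begin with a null character.
    """
    return int.from_bytes(expression.encode("utf-8"), "big")


def numeral(k):
    """Abbreviation for the numeral of k (k occurrences of 'S', then '0').

    The full numeral would be far too long to write out, so we write [k].
    """
    return f"[{k}]"


def diagonalize(formula):
    """Replace 'x' in the formula by the numeral of the formula's own number.

    Simplifying assumption: every occurrence of 'x' is a free occurrence.
    """
    return formula.replace("x", numeral(godel_number(formula)))


C = "(y)~B(x,y)"       # the open formula C
m = godel_number(C)    # its Gödel number m
G = diagonalize(C)     # the closed formula G

print(G == f"(y)~B([{m}],y)")  # True
```

The formula G thus mentions the number m of C, and G is exactly what results from C when the numeral for m is put for 'x'; this is the fact recorded in result (I).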

We now state and prove Gödel's First Incompleteness Theorem:

(1) If N is consistent, then G is not provable in N.
(2) If N is ω-consistent, then ~G is not provable in N.

Since ω-consistency implies consistency, it follows from (1) and (2) that if N is ω-consistent, then neither G nor ~G is provable in N, and thus that N is incomplete.

The proof of (1) is as follows. Assume that N is consistent, and that G is provable in N. Then some number will be the Gödel number of a proof of G in N; let it be k. From the above result (I), W(m, k) then holds. Now since the formula B(a, b) represents W in N, it follows that the closed formula B(m̄, k̄) is provable in N. But since G - that is, (y) ~B(m̄, y) - is provable in N, the closed formula ~B(m̄, k̄) is also provable in N, by the logic of quantification. Thus, both B(m̄, k̄) and ~B(m̄, k̄) are provable in N. But this contradicts our assumption that N is consistent. Therefore, if N is consistent, G is not provable in N.

The proof of (2) is as follows. Assume that N is ω-consistent. As we earlier remarked, from the assumption that N is ω-consistent, it follows that N is consistent. Therefore, by (1), G is not provable in N. Thus, for every number k, k is not the Gödel number of a proof of G; hence, by (I), for every k, W(m, k) does not hold. Because B(a, b) represents W, ~B(m̄, k̄) is therefore provable in N, for every k. Now we draw again upon the assumption that N is ω-consistent, and conclude that (∃y) ~~B(m̄, y) is not provable in N. Therefore, (∃y) B(m̄, y) is not provable in N; and thus, by the logic of quantification, ~(y) ~B(m̄, y) - i.e., ~G - is not provable in N. This concludes the proof of Gödel's First Theorem.

Any sentence A of a theory T which is such that neither A nor ~A is a theorem of T is said to be undecidable in T. Gödel's First Theorem, then, shows that if N is ω-consistent, then N contains a sentence that is undecidable in N. And Gödel has proved this theorem in such a way as to provide us with an effective procedure for actually constructing such a sentence. As Gödel himself points out, his proof is constructive, and meets even intuitionistic standards of rigor.³ It can, indeed, be completely carried out within N itself - though the proof we have given (Gödel's proof) is of course carried out not within N, but within the English syntactic metalanguage of N.

As we have already remarked in Chapter V, Gödel's First Theorem was improved upon in 1936 by Rosser, who showed that the incompleteness of N could be proved constructively using only the assumption of consistency (rather than the stronger assumption of ω-consistency).

We have at no point so far assumed that the intended interpretation of N is in fact a model of N - otherwise the above proof of Gödel's First Theorem would not be a constructive proof. Let us now make this semantical assumption, and see what we can learn about G and ~G. From this assumption it follows that (a) all of the theorems of N are true (under the intended interpretation), and that (b) N is consistent, and also ω-consistent. Thus, by (b) and Gödel's First Theorem, it follows that neither G nor ~G is a theorem of N. Thus, (a) is of no help to us in deciding whether it is G or ~G that is true. As Gödel points out, however, when we look at what G says (under the intended interpretation of N), we see that we are able to determine its truth-value. G says that every natural number fails to be the Gödel number of a proof of a certain formula; viz., of the formula which results from replacing all free occurrences of 'x' in the formula whose Gödel number is m - viz., C - by the numeral m̄. But this 'diagonally defined' formula turns out to be G itself! Thus G is a self-referential sentence which says of itself that it is not provable in N. Thus, if it is provable, it is untrue, contrary to (a). Thus we can conclude that it is not provable, and therefore that it is true. The formula ~G, then, is, of course, false.⁴

3 Notice that Gödel's proof differs in this respect from the proof of the Completeness Theorem for the first-order predicate logic which appears in Chapter III. The proof which is given there of that theorem is non-constructive. See footnote 2, page 60; and the final paragraph of part I, page 63.

We have, then, discovered the truth value of G; viz., truth. But not by showing that G is provable in N, because it isn't (assuming that N is consistent). Rather, the argument we have given is carried out in the semantical metalanguage of N, in which we make the assumption that the intended interpretation is a model of N. But this assumption is clearly true! Thus, by drawing upon semantical reasoning about N, we are able to establish number-theoretic truths which cannot be established within N. Gödel's 1931 paper, thus, not only proves that N is incomplete; but introduces us to a new way of establishing number-theoretic truths.

Since G is true, we can add it to the axioms of N as a further axiom if we wish. Call the resulting theory N1. G is now decidable in N1. N1 is a stronger theory than N, since G is independent of the remaining axioms of N1. And if the intended interpretation is a model of N, then it is also a model of N1. But adding G to N does not give us a complete theory. For N1 is an effectively defined extension of N, and thus Gödel's methods can be applied to N1, showing that N1 is incomplete (assuming that N1 is ω-consistent). These methods show effectively how to construct a sentence G1 which is true but unprovable in N1. This in turn can be added to N1; and the whole process repeated, and indeed kept up an infinite number of times. In this fashion we can construct an infinite number of number-theoretic sentences which are all true, but all unprovable in N.

We call a theory T recursively axiomatizable if and only if there is a theory T' having the same theorems as T and such that the set of Gödel numbers of the non-logical axioms of T' is a recursive set.

4 Notice that, still assuming that the intended interpretation is a model of N, we can conclude from the fact that ~G is false that it also is not provable in N; and thus that N is incomplete. This gives us a semantic proof of the incompleteness of N, in which we do not need the concept of ω-consistency. This proof is, of course, a weaker proof than the above constructive proof, since it uses a non-constructive semantic assumption.


Now Gödel's argument (taken together with Rosser's improvement) applies not only to N, but to all consistent recursively axiomatizable theories T - including not only first-order theories, but second-order theories also - in which all recursive relations and functions are representable. All such theories will contain sentences which are undecidable within those theories. This includes all consistent and effectively defined extensions of N (if we accept Church's thesis that all effectively defined sets of natural numbers are recursive). It follows then that N is essentially incomplete; where a theory T is essentially incomplete if and only if T is incomplete and every consistent axiomatic extension of T is incomplete. Furthermore, it follows as an immediate corollary from Gödel's Incompleteness Theorem that Skolem's arithmetic is not decidable; that is, that there is no effective procedure for determining whether any arbitrary sentence of arithmetic is true under the intended interpretation. For if there were such a procedure we would have a consistent, complete, and effectively defined axiomatization of all the truths of arithmetic, simply by taking all true sentences of arithmetic as our axioms.⁵

The careful reader of the preceding paragraph should be wondering whether Gödel's incompleteness result applies to any arithmetic theories which are weaker than N. N has infinitely many axioms, and is indeed not finitely axiomatizable. Does Gödel's result apply perhaps to any finitely axiomatizable arithmetic theories? It has in fact been proved that it does apply to such a theory. Consider the following theory, which is known as Robinson's arithmetic (R.M. Robinson, 1950).⁶ This theory has as its symbols those of N, and the following seven non-logical axioms:

(1) (x)(y)(Sx = Sy ⊃ x = y)
(2) (x)(0 ≠ Sx)
(3) (x)(x ≠ 0 ⊃ (∃y)(x = Sy))
(4) (x)(x + 0 = x)
(5) (x)(y)(x + Sy = S(x + y))
(6) (x)(x · 0 = 0)
(7) (x)(y)(x · Sy = (x · y) + x)

5 The reader should note that we are not here saying that Gödel in 1931 proved the undecidability of the axiomatic theory N. This was first shown in Church 1936.

6 This theory is presented, and its properties are established, in A. Tarski, A. Mostowski, R.M. Robinson 1953, p. 51 ff.

These axioms differ from those of N just in that in place of the induction schema we have the above single axiom (3). Since axiom (3) is a theorem of N, this theory is an axiomatic subtheory of N. It is clearly recursively axiomatizable. Since it does not have the induction schema as an axiom schema, Robinson's arithmetic is a very weak theory. Examples of familiar arithmetic truths which cannot be proved within it include the commutative laws of addition and multiplication, together with the sentence '(x)(x ≠ Sx)'. Still, Robinson has shown that all recursive functions and relations are representable within this theory. Thus, since this theory is clearly consistent, it follows from Gödel's 1931 argument that this theory is essentially incomplete. We shall make further reference to this theory later in this chapter, in connection with the decision problem.

We now bring this section to a close by returning to the topic of ω-consistency and then introducing the concept of ω-completeness. We have seen that the sentence ~G is undecidable in N. Thus, the theory N' which results from adding ~G as an axiom to N is a consistent theory, supposing that N is consistent. It is, however, ω-inconsistent, as the reader should be able to show. This illustrates the fact that the requirement of ω-consistency is a stronger requirement than that of consistency. Since ~G is false in the intended interpretation of N, that interpretation will not be a model of N'. All models of N' are, then, non-standard models of arithmetic.

We say that N is ω-incomplete⁷ if and only if there is a formula A(x) such that ⊢N A(n̄), for every n, but not ⊢N (x)A(x). That is, if there is a formula A in one free variable such that, for each natural number n, we can prove in N that A holds for n, but cannot prove in N that A holds for all n. It is rather surprising that a theory should be ω-incomplete, but it clearly follows as a corollary from Gödel's results that N is ω-incomplete, supposing that it is consistent. For consider the formula G; viz.,

(y) ~B(m̄, y).

Since B(a, b) represents the relation W in N, the formula ~B(m̄, n̄) is provable in N for each n that makes it true. But we have seen that for every natural number n, the formula ~B(m̄, n̄) is true; that is, for every n, n is not the Gödel number of a proof of G. Thus, for every n, ⊢N ~B(m̄, n̄). Nevertheless, it is not the case that ⊢N (y) ~B(m̄, y); that is, it is not the case that ⊢N G.

7 See Tarski's 'Some Observations on the Concepts of ω-Consistency and ω-Completeness' (1933), which is paper IX in A. Tarski 1956. The concepts of ω-consistency and ω-completeness were first defined and studied by Tarski, in 1927.

The reader may feel at this point that we should add a new rule of inference to N; viz.,

If for each n, ⊢N A(n̄), then ⊢N (x)A(x).

This rule would clearly be a sound rule of inference. This rule has in fact been studied by logicians,⁸ and is known as the ω-rule. It differs from all the rules of inference we have considered in this book, however, in that it can be applied only to an infinite collection of hypotheses! For that reason, it cannot appear as a rule of inference in any theory in the familiar sense of the word 'theory'.

8.5. Gödel's Second Incompleteness Theorem

In the concluding section of his 1931 paper Godel drew attention to a 'remarkable result' which follows as a corollary from his First Incompleteness Theorem. This result - the so-called Godel Second Incompleteness Theorem - concerns the possibility of consistency proofs. The result is that the consistency of N is unprovable in N, assuming that N is consistent; and that, in general, the consistency of any theory T which is a consistent extension of N is unprovable in T, provided that the class of

8 See, for example, J.R. Shoenfield 1967, pp. 231 ff.


axioms of T is a recursive class (in the sense that the class of its Gödel numbers is a recursive class).

The informal proof of Gödel's Second Theorem is very simple. The first part of the First Incompleteness Theorem is:

(1) If N is consistent, then G is not provable in N.

Now Gödel remarks that (1) can be stated within N itself, and then proved in N as a theorem. 9 There will be a sentence in N which states that N is consistent, in the sense that there is a number which is the Gödel number of a formula of N which is not a provable formula of N. Let us call this sentence $\mathrm{Con}_N$. Thus, since G says of itself that it is not provable in N, the sentence

$\mathrm{Con}_N \supset G$

will be a sentence in N which says what (1) says. Furthermore, this sentence will be provable in N. Thus, if $\mathrm{Con}_N$ were provable in N, then by Modus Ponens G would be provable in N. Gödel's First Theorem, however, tells us that if N is consistent, G is not provable in N. We conclude that if N is consistent, $\mathrm{Con}_N$ is not provable in N. The generalization of this result to consistent recursive extensions of N is proved similarly.

The intuitive meaning of $\mathrm{Con}_N$ is that N is consistent. Assuming that N is consistent, then, we may conclude that $\mathrm{Con}_N$ is true. It is, then, another example of a sentence of N which is true but unprovable in N. Furthermore, it can be shown that if N is ω-consistent, then ${\sim}\mathrm{Con}_N$ is not provable in N either. Thus, if N is ω-consistent, $\mathrm{Con}_N$ is undecidable in N.

No consistency proof for N - in the sense of a proof that there is some formula of N which is not provable in N - then, can itself be mapped into N; for its conclusion would be the sentence $\mathrm{Con}_N$, which is unprovable in N. Can we conclude, however, that no sentence whatever which expresses the consistency of N can be proved in N, supposing that N is consistent? No. Gödel has indeed shown that a certain sentence which states that N is consistent is not provable in N. Nevertheless, there are, in fact, other sentences

9 This was first done by Bernays in D. Hilbert and P. Bernays 1939, pp. 285-328.


which express the consistency of N which are provable in N. Mostowski, in his discussion of Gödel's Second Theorem, makes the following observations. 10 Let Z(x, y) be the number-theoretic relation such that y is the Gödel number of a formula in N and x is the Gödel number of a proof of that formula in N. Gödel has shown that Z(x, y) is representable in N, and has shown how to construct a formula which represents Z(x, y) in N. Let this formula be A(x, y). Further, let k be the Gödel number of the formula '$0 \neq 0$'. Then the formula $(x)\,{\sim}A(x, k)$ - which intuitively means that no number is the Gödel number of a proof of '$0 \neq 0$' - is Gödel's formula $\mathrm{Con}_N$. Now Gödel has indeed shown that the formula $(x)\,{\sim}A(x, k)$ is not provable in N. However, there are formulas different from A(x, y), which also represent Z(x, y) in N, and for which Gödel's theorem fails. 11 Mostowski gives as a simple example the formula $A(x, y) \wedge {\sim}A(x, k)$. If N is consistent, then this formula clearly represents Z(x, y) in N. Thus the formula $A(x, k) \wedge {\sim}A(x, k)$ then represents the property Z(x, k); that is, the property of being the Gödel number of a proof of '$0 \neq 0$'. But then the formula

(F) $(x)\,{\sim}(A(x, k) \wedge {\sim}A(x, k))$

expresses the consistency of N. However, this formula is obviously provable in N! The correct interpretation of Gödel's Second Theorem, then, is that it shows that certain formulas expressing the consistency of N are not provable in N; others, however, are provable in N.
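One way to see why F is 'obviously provable': its matrix has the propositional shape ~(p & ~p), which is a tautology, so F follows from it by generalization. The following minimal Python sketch merely checks that truth table; the names `matrix` and `is_tautology` are illustrative assumptions and do not come from the text.

```python
from itertools import product

def matrix(p: bool) -> bool:
    """Truth value of ~(p & ~p), the propositional shape of the matrix of F."""
    return not (p and not p)

def is_tautology(formula, arity: int) -> bool:
    """Check a truth function under every assignment of truth values."""
    return all(formula(*values) for values in product([False, True], repeat=arity))

print(is_tautology(matrix, 1))
```

The check says nothing about the consistency of N, which is exactly the point made about F in the text.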

The provability in N of the above formula F, however, in itself obviously constitutes no consistency proof for N; that is, an argument in the metalanguage of N showing that there is at least one formula in N which is not provable in N. The principal objective of the Hilbert program was to find a finitistic, combinational proof of the consistency of arithmetic; indeed, of all of

10 A. Mostowski 1966, pp. 23-26. The most advanced work on the problems raised by Gödel's Second Theorem has been done by S. Feferman.

11 Does this open up the possibility of consistency proofs for N which can be mapped into N?


classical mathematics. 12 Because of Gödel's Second Theorem, however, most contemporary logicians regard it as very doubtful (at best) that such a proof is possible. 13 Still, due in part at least to the vagueness of the concept of finitistic, it cannot be said that it has been strictly demonstrated that such a proof is impossible. Indeed, Gödel's own comment on his Second Theorem was that it 'represents no contradiction of the formalistic standpoint of Hilbert. For this standpoint presupposes only the existence of a consistency proof effected by finite means, and there might conceivably be finite proofs which cannot be stated in P.' Here P is the particular system Gödel used in his 1931 paper, which results from adding Peano's axioms to the logic of Principia Mathematica; that is, the logic of type theory. Thus in 1931 Gödel was willing to suppose that there might be finitistic proofs that cannot be mapped into a system even stronger than N.

We must not conclude from Gödel's Second Theorem, of course, that there are no consistency proofs for N - aside from simply accepting the intended interpretation of N as a model of N. Various proofs have been given, the first by Gentzen in 1936. Gentzen's proof uses a modified form of transfinite induction. Most logicians take the view that Gentzen's proof, though quite satisfactory, does not fall within the confines of the Hilbert program, and thus is not an example of a finitistic proof. 14

Further, it is known that within any language which is sufficiently stronger than N one can define a (normal) truth predicate for the sentences of N and then show that all of the axioms and thus all of the theorems of N are true - and thus that N is consistent. This

12 Recall that the consistency of the elementary algebra of real numbers has been proved in a constructive fashion, by Tarski. As further examples, there are constructive consistency proofs for the elementary addition of natural numbers, and for the elementary multiplication of natural numbers.

13 My colleague Professor Gerald Holien, however, claims to have a consistency proof for first-order set theory, using the Herbrand-Skolem extension theorem, which is finitistic in a very strict sense of the word.

14 Church and Mostowski are notable exceptions to this widely accepted view. See Church 1965.


method, which we shall consider further in the following section, is due to Tarski. 15 It provides a method for proving the consistency of interpreted formalized theories in general. With respect to N, it can be used to give a consistency proof of N within a theory of second-order arithmetic; e.g., the theory N2 of Chapter V. N2, that is, is an adequate semantical metalanguage for N; though, by Gödel's arguments, not for N2.

8.6. Tarski's Theorem

As Gödel himself pointed out in his 1931 paper, there is an obvious analogy between the undecidable sentence G and the paradox of the liar. The sentence G, recall, says of a certain sentence that that sentence is not provable (in a certain system), and it turns out that that sentence is G itself. Thus, G in a roundabout way says of itself that it is not provable. In the liar paradox we encounter a sentence which says of itself that it is not true. This latter sentence obviously leads to contradiction. If the property of being a true sentence of N were the same as the property of being a provable sentence in N, from Gödel's sentence G we could establish a contradiction in N. Assuming that N is consistent, then, we conclude that the property of being a true sentence of N is not identical with the property of being a provable sentence in N. Indeed, the liar paradox in effect draws our attention to a certain incompleteness in N. We see from it that if N is consistent the notion of being true in N cannot be defined within N, although from Gödel's 1931 paper we know that the notion of being provable within N can be defined within N. Thus, N evidently is incomplete in two respects: (1) incomplete in the sense that not all true sentences of N are provable within N (supposing that N is consistent); and (2) incomplete in the sense

15 Tarski, 'The Concept of Truth in Formalized Languages,' pp. 236 ff; pp. 273-274. This very important article first appeared in Polish in 1933, and in German (Der Wahrheitsbegriff in den formalisierten Sprachen) in 1936. It was communicated to the Society of Sciences in Warsaw in 1931. The results it contains date for the most part from 1929. It appears in English translation as paper VIII in A. Tarski 1956.


that though a good many concepts pertaining to N can be defined within N, not all concepts pertaining to N can be defined within N. N is limited, then, both in its deductive power and in its expressive power.

The above argument is of an informal nature, but it can be made precise as follows. We say that an n-ary number-theoretic relation R is arithmetical - or arithmetically definable, or definable in N - if and only if there is some formula $A(a_1, \ldots, a_n)$ of N such that for all natural numbers $k_1, \ldots, k_n$, the n-tuple $\langle k_1, \ldots, k_n \rangle$ is an element of R if and only if $A(k_1, \ldots, k_n)$ is true under the intended interpretation of N. In the particular case where R is a set, R is arithmetical if and only if there is some formula A(a) of N such that for all natural numbers k, k is an element of R if and only if A(k) is true under the intended interpretation of N. (Note that we are here using 'true under the intended interpretation of N', and not 'provable in N'.) We are now able to state and prove a theorem which makes our above informal argument precise; viz.,

Tarski's Undefinability Theorem (1936): The set of all Gödel numbers of those sentences of N which are true under the intended interpretation of N is not an arithmetical set.

This theorem can be proved as follows, in a manner directly analogous to Gödel's semantical proof of the incompleteness of N. (Cf. footnote 4.) From Gödel's 1931 paper we know that the following function D is definable in N: if n is the Gödel number of a formula A(x) with free variable 'x', then D(n) is the Gödel number of $A(\bar{n})$; otherwise, D(n) = n. Thus, the result of applying the function D to the Gödel number of a formula with free occurrences of 'x' is the Gödel number of the formula which results from that formula by replacing those free occurrences of 'x' by occurrences of the numeral of the Gödel number of that formula. This function, of course, is used in the construction of Gödel's undecidable sentence G. Because application of it recalls Cantor's procedure in his famous proof of the non-denumerability of the real numbers, this function D is called the diagonal function. And the operation of replacing all free occurrences of 'x' in A(x) by occurrences of the numeral designating the Gödel number of A(x) is called the operation of diagonalizing on A(x).
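The diagonal function can be imitated in a toy setting. The sketch below rests on several assumptions that are not from the text: formulas are ASCII strings, the Gödel number of a string is its bytes read as a base-256 integer, the character 'x' occurs only as the free variable, and the numeral of n is its decimal string rather than a numeral of N. Under those assumptions `D` behaves as described above.

```python
from typing import Optional

def godel_number(formula: str) -> int:
    """Toy coding (an assumption): read the ASCII bytes as a base-256 integer."""
    return int.from_bytes(formula.encode("ascii"), "big")

def decode(n: int) -> Optional[str]:
    """Inverse of godel_number; None when n does not code an ASCII string."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        return None

def D(n: int) -> int:
    """Toy diagonal function on natural numbers n."""
    s = decode(n)
    if s is None or "x" not in s:
        return n  # n does not code a formula with free 'x'
    # Replace the free variable by the (decimal) numeral of n itself.
    return godel_number(s.replace("x", str(n)))

a = "0<x"
n = godel_number(a)
print(D(n) == godel_number("0<" + str(n)))
```

A number that codes a string without 'x', such as the code of "0=0", is returned unchanged, matching the 'otherwise' clause of the definition.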


Suppose now that the property of being the Gödel number of a true sentence of N is arithmetical. Then there is in N a formula, 'Tr(x)' say, which is true for x if and only if x is the Gödel number of a true sentence of N. Consider now the formula with free variable 'x':

(A) ${\sim}\mathrm{Tr}(D(x))$.

This formula has a Gödel number. Let it be m. Substituting $\bar{m}$ for all free occurrences of 'x' in A - i.e., diagonalizing on A - gives us the formula

(T) ${\sim}\mathrm{Tr}(D(\bar{m}))$.

This formula T says that the formula with Gödel number D(m) is not true. Now the number D(m) is the Gödel number of the formula which results from the formula with Gödel number m - viz., A - by replacing all free occurrences of 'x' by occurrences of $\bar{m}$. But this formula is T itself! Thus T is the formal rendition of the famous 'This sentence is not true' of the liar paradox. The sentence '${\sim}\mathrm{Tr}(D(\bar{m}))$' is true if and only if '${\sim}\mathrm{Tr}(D(\bar{m}))$' is not true. This contradictory result shows that our supposition that the property of being the Gödel number of a true sentence of N is an arithmetical property is false.
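The self-reference in T can be watched at work in a toy model. The sketch below assumes (none of this is from the text) that formulas are ASCII strings, that a Gödel number is the string's bytes read as an integer, and that numerals are decimal strings. Any Python function `tr` offered as a candidate truth predicate is then shown to misjudge T.

```python
from typing import Callable

def godel_number(formula: str) -> int:
    """Toy coding (an assumption): ASCII bytes read as a base-256 integer."""
    return int.from_bytes(formula.encode("ascii"), "big")

A = "~Tr(D(x))"
m = godel_number(A)
T = A.replace("x", str(m))   # diagonalizing on A
d_of_m = godel_number(T)     # by the definition of D, this number is D(m)

def truth_value_of_T(tr: Callable[[int], bool]) -> bool:
    """Evaluate T = ~Tr(D(m)) when 'Tr' is interpreted by the candidate tr."""
    return not tr(d_of_m)

def candidate_is_wrong_about_T(tr: Callable[[int], bool]) -> bool:
    """True when tr's verdict on T's own number differs from T's truth value."""
    return tr(godel_number(T)) != truth_value_of_T(tr)

for candidate in (lambda n: True, lambda n: False, lambda n: n % 2 == 0):
    print(candidate_is_wrong_about_T(candidate))
```

Every candidate fails on T, because the number it is asked about and the number T talks about coincide. This mirrors the contradiction in the proof; it does not replace it.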

Notice now how Tarski's theorem leads to Gödel's incompleteness result for N. We know that the property of being the Gödel number of a provable sentence of N is an arithmetical property. From the fact that the property of being the Gödel number of a true sentence of N is not an arithmetical property, then, it follows that the class of true sentences of N is distinct from the class of provable sentences of N. Supposing that all provable sentences of N are true, it follows that there are true sentences of N that are not provable in N.

Tarski's theorem can obviously be generalized to theories other than N, including second-order theories. Let T be any theory whose logical basis includes at least the first-order predicate logic with identity. Let us say that the set of true sentences of a theory T is definable in T if and only if T contains a formula with one free variable which is satisfied by a natural number n if and only if n is the Gödel number of a true sentence of T. Similarly for the diagonal function's being definable in T. Then a very general form of Tarski's theorem is as follows:

Tarski's Undefinability Theorem (general form): If T is a consistent theory, then the diagonal function and the set of true sentences of T are not both definable in T. 16

The proof of this general theorem parallels the proof of Tarski's theorem for N. To see its scope, let T be any consistent extension of the weak finitely axiomatizable theory of Robinson's arithmetic. Then the diagonal function will be definable in T. It follows that the set of true sentences of T will not be definable in T. Suppose further that T is also an axiomatizable theory. Then the set of (Gödel numbers of) theorems of T will be definable in T. Thus the theorems of T are only certain of the true sentences of T and it follows that T is incomplete - which is a generalized version of Gödel's (semantic) incompleteness theorem. We could, then, have established incompleteness via Tarski's theorem, without retracing Gödel's original proof. The advantage of presenting Gödel's original proof is that it is a constructive proof which presents us with a particular sentence which is undecidable in T, and that it leads directly to Gödel's Second Incompleteness Theorem.

We have stated Tarski's theorem in a semantical form. It can also be stated in a syntactical form. We say that a predicate (or formula) of a theory T with one free variable - 'Tr(x)', say - is a truth-predicate for T if and only if all sentences of the following form are provable in T:

$\mathrm{Tr}(\ldots) \equiv A$,

where A is any sentence of T, and in the position ... there occurs the numeral of the Gödel number of A. Then we have the following theorem:

Tarski's Undefinability Theorem (syntactical form): If T is consistent and the diagonal function is definable in T, then no

16 See A. Tarski, A. Mostowski, R.M. Robinson 1953, pp. 46 f. Tarski's Undefinability Theorem in its general form as here stated is indeed a special case of the theorem on p. 46.


predicate (or formula) in T is a truth-predicate for T.

This theorem can be proved by a diagonal proof paralleling the proof of Tarski's theorem in its semantical form.

We say that a theory T' possesses a normal truth-definition 17 for another theory T if and only if (a) there is in T' a predicate (or formula), say 'Tr(x)', such that all sentences of the following form are provable in T':

$\mathrm{Tr}(\ldots) \equiv A$,

where A is a sentence of T (or a translation of a sentence of T into T'), and the position ... is filled by the numeral designating the Gödel number of A (or by some name of A); and (b) the sentence in T' asserting that

For all sentences A of T, if A is a provable sentence of T, then Tr(...)

is a provable sentence in T'.

No consistent theory can contain a normal truth-definition for itself. If T' contains a normal truth-definition for T, then within T' we can prove the consistency of T, simply by being able to prove that all provable sentences of T are true. This gives us a general method for proving the consistency of formalized theories. To prove the consistency of T, it suffices to find a theory T' which contains a normal truth-definition for T. Admittedly, T' will have to be a 'stronger' theory than T; and thus this method of proving the consistency of a theory T will not dispel any doubts we might have about the consistency of T. But it has the following importance: If we can show that T' contains a normal truth-definition of T, we thereby show that T' is a stronger theory than T. Here, then, is a general method for showing that one theory is stronger than another. As an illustration of the use of this method it has been proved (Kemeny, 1948) that in Zermelo set theory one can construct a normal truth-definition for the simple theory of types with an axiom of infinity added, as well as an axiom of choice for each level (i.e., Principia Mathematica). Thus, we can

17 As we have remarked, the notion of a truth-definition, or a truth-predicate, is from Tarski. The notion of a normal truth-definition is from H. Wang 1952.


conclude that the former theory must be a stronger theory than the latter.

8.7. Decision Problem. Church's Thesis. Recursively Enumerable Sets

We turn now to the decision problem. 18 The decision problem for an arbitrary theory T, recall, is the problem as to whether there exists an effective procedure - i.e., an algorithm - for determining whether an arbitrary formula A of T is a theorem of T. Such a procedure, if there is one, is called a decision procedure, or a decision method, for T. Thus, the decision problem for T is the problem as to whether there is a decision method for T. Not, of course, simply whether there is such a method which is known to us; but whether there is such a method at all, known or unknown.

More generally, a class of entities is a decidable class if and only if there is an effective procedure for determining whether a particular entity is a member of that class; otherwise, an undecidable class. Thus, for example, for an axiomatic theory T, the class of axioms of T is a decidable class; the class of theorems of T, however, may or may not be a decidable class.

Once we arithmetize a particular theory T, the questions as to whether the class of its formulas, the class of its axioms, the class of its theorems, etc., are decidable classes can be replaced by the questions whether the class of Gödel numbers of its formulas, of Gödel numbers of its axioms, of Gödel numbers of its theorems, etc., are decidable classes. We shall understand these questions in this way from this point forward.

In our consideration of recursive functions and relations (section 8.2), we pointed out that all recursive functions are effectively computable, or effectively calculable. And since a recursive relation is simply a relation whose characteristic function is

18 For an introductory approach to the decision problem, with proofs of important results, see S.C. Kleene 1967, Chapter V; also, Mostowski 1966, Lecture XII.


recursive, all recursive relations are effectively calculable, in the sense that there is an effective method for determining whether any given recursive relation applies to any given arguments. Thus, recursive functions and relations are all effectively calculable. There are very good reasons for accepting the converse, also. The converse is the famous 1936 thesis of A. Church: 19

Church's thesis: All (effectively) calculable functions and relations are recursive.

This thesis relates two precisely defined concepts (viz., the concepts of a recursive function and of a recursive relation) to two intuitive concepts (viz., the concepts of a calculable function and of a calculable relation). It is to the effect that wherever the latter concepts apply, the former concepts also apply. Because it makes reference to intuitive concepts, it is impossible to give a formal proof or disproof of Church's thesis. It can be supported only inductively, and by arguments which are not formal demonstrations. Still, the evidence in support of Church's thesis is very impressive, and almost all logicians accept Church's thesis. The evidence is of various sorts. First, a great many calculable functions and relations have been shown to be recursive, and no one knows of any admittedly calculable function or relation which is not recursive. Second, many methods for obtaining calculable functions from calculable functions have been shown to lead from recursive functions to recursive functions; nor does anyone know of any methods which are counter-examples to this generalization. Third, a number of different exact characterizations of the class of effectively calculable functions and relations have been proposed, independently of one another. Thus, it has been proposed (Turing) that all effectively calculable functions and relations are calculable by a certain type of machine (a 'Turing machine'). Further, that all effectively calculable functions and relations are calculable within a certain system of equations (Herbrand-Gödel). Again, that effectively calculable functions and relations are all recursive, given various definitions of recursive other than the one that we have given. Now it has been shown that these exact characterizations all coincide in their extension; they all determine the class of recursive functions and relations. Most logicians would agree, then, with Shoenfield when he concludes of the class of recursive functions, 'This certainly suggests that this class of functions is a very natural class; and it is hard to see why this should be so unless it is just the class of calculable functions.' 20

19 A. Church 1936.

We shall, then, accept Church's thesis. Its converse is clearly true, as we remarked in section 8.2. Thus, we shall from this point forward identify the class of effectively calculable functions and relations with the class of recursive functions and relations.

The decision problem for an arbitrary theory T now takes on the following precise form:

Is the class of theorems of T a recursive class?

Once the decision problem is stated in this exact form, it becomes possible to prove conclusively that for certain theories T the class of theorems of T is not a decidable class. This negative result could hardly be established for any theories T given only the intuitive concept of a decidable class.

The class of axiomatic theories now becomes identical with the class of theories whose axioms form a recursive class. And an axiomatizable theory now is one which is equivalent to a theory whose axioms form a recursive class.

For all theories T considered in this book, the class of formulas of T is a recursive class. For all axiomatic theories T, the class of axioms of T is a recursive class. For certain theories T, the class of theorems of T is a recursive class; for others, not. Thus, the class of theorems of the theory R (Chapter VI) is recursive; however, the class of theorems of the theory N is not recursive, as we have observed earlier and shall subsequently show.

Using the concept of a recursive function, we now introduce a new concept, which serves to make precise the intuitive concept of being a class of natural numbers whose members can all be generated one after another in some mechanical fashion. This is the concept of a recursively enumerable class. A class of natural

20 J.R. Shoenfield 1967, p. 121.


numbers is said to be recursively enumerable if and only if either (a) it is the empty class, or (b) it is the range of a (singulary) recursive function. That is, the class X is recursively enumerable if and only if either X is empty, or there is some (singulary) recursive function f such that

$n \in X \equiv (\exists x)(f(x) = n)$.

The members of X will be precisely f(0), f(1), f(2), ..., f(m), .... The function f will enumerate the members of X, that is, and n will be the mth element in this enumeration, for some number m. Repetitions are permitted, and thus n may be some other element in this enumeration, also.
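As a rough illustration of enumeration by a function f, the following Python sketch lists f(0), f(1), f(2), ... and searches for a position at which a given n occurs. The helper names and the sample functions `halves` and `doubles` are assumptions chosen for the example; they are not from the text.

```python
from itertools import count, islice
from typing import Callable, Iterator

def enumerate_range(f: Callable[[int], int]) -> Iterator[int]:
    """Yield f(0), f(1), f(2), ...; repetitions are allowed."""
    for x in count():
        yield f(x)

def first_index(f: Callable[[int], int], n: int) -> int:
    """Least m with f(m) == n. Halts only if n is in the range of f."""
    for m in count():
        if f(m) == n:
            return m

def halves(x: int) -> int:
    return x // 2      # every natural number occurs twice

def doubles(x: int) -> int:
    return 2 * x       # range: the even natural numbers

print(list(islice(enumerate_range(halves), 6)))  # [0, 0, 1, 1, 2, 2]
print(first_index(doubles, 10))                  # 5
```

Note that `first_index(doubles, 7)` would never return. Searching an enumeration confirms membership but by itself never refutes it, which is why a recursively enumerable class need not be recursive.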

As simple examples of recursively enumerable classes of natural numbers, we have (a) the class of all natural numbers; (b) the class of all even natural numbers; and (c) the class of all perfect squares. For (a) is the range of the function f(x) = x; (b) is the range of the function f(x) = 2x; and (c) is the range of the function $f(x) = x^2$; and these functions are all clearly recursive.

An alternative definition of a recursively enumerable class is as follows: a class X is recursively enumerable if and only if there is some recursive relation R such that

$n \in X \equiv (\exists y)R(n, y)$.

This definition can easily be shown to be equivalent to the definition in terms of recursive functions.
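One direction of this equivalence can be sketched in Python. Given a decidable relation R and some known member a of X (X is assumed non-empty), an enumerating function f is obtained by decoding each number z as a pair (n, y) and returning n when R(n, y) holds, and a otherwise. The Cantor pairing function and the sample relation below are assumptions chosen for illustration, not the construction used in the text.

```python
from math import isqrt
from typing import Callable, Tuple

def unpair(z: int) -> Tuple[int, int]:
    """Inverse of the Cantor pairing function: z -> (n, y)."""
    w = (isqrt(8 * z + 1) - 1) // 2
    t = w * (w + 1) // 2
    y = z - t
    return w - y, y

def make_enumerator(R: Callable[[int, int], bool], a: int) -> Callable[[int], int]:
    """Build f whose range is {n : (exists y) R(n, y)}, given a member a."""
    def f(z: int) -> int:
        n, y = unpair(z)
        return n if R(n, y) else a
    return f

def is_square_with_root(n: int, y: int) -> bool:
    return y * y == n   # sample relation: y witnesses that n is a perfect square

f = make_enumerator(is_square_with_root, a=0)
print(sorted(set(f(z) for z in range(200))))  # [0, 1, 4, 9]
```

Because every pair (n, y) is reached by some z, every member of X eventually appears in the enumeration, and nothing outside X ever does.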

It can be shown that every recursive class is recursively enumerable. 21 The converse, however, is not in general true, as we shall

21 Further, that every recursively enumerable class is arithmetical. We shall subsequently mention a class which can be used to show that the converse, however, does not hold in general.

We have, then, the following hierarchy of increasingly more comprehensive types of classes of natural numbers: (a) primitive recursive; (b) recursive; (c) recursively enumerable; (d) arithmetically definable; (e) set theoretically definable. Each of these types is of denumerable cardinality, whereas the set of all sets of natural numbers is of nondenumerably infinite cardinality.


see. And it can be shown that a class of natural numbers X is recursive if and only if X and its complement - i.e., the class of all natural numbers not in X - are both recursively enumerable.

To illustrate further the concept of a recursively enumerable class, consider any axiomatic theory T (of either first or second order). The class of (Gödel numbers of) theorems of T will be a recursively enumerable class. For the relation $\mathrm{Pf}_T(x, y)$ - y is the Gödel number of a proof in T of the formula with Gödel number x - is recursive for all axiomatic T. Thus, letting '$\mathrm{Thm}_T$' denote the class of Gödel numbers of theorems of T,

$n \in \mathrm{Thm}_T \equiv (\exists y)\mathrm{Pf}_T(n, y)$.

Thus, (by our alternative definition of a recursively enumerable class) the class of theorems of T is recursively enumerable. And the same holds if we suppose simply that T is axiomatizable. Suppose now that T is an axiomatizable but undecidable theory. Consider the complement of the class $\mathrm{Thm}_T$. If this class were recursively enumerable, then the class $\mathrm{Thm}_T$ would be not only recursively enumerable but also recursive. But then T would be decidable, contrary to our hypothesis. Thus, as an example of a class of natural numbers which is not recursively enumerable, we have the complement of the class $\mathrm{Thm}_T$, for all theories T which are both axiomatizable and undecidable. More interestingly, if T is a theory which is not axiomatizable, then $\mathrm{Thm}_T$ will not be a recursively enumerable class. As an example, let T be Skolem's arithmetic: the class of Gödel numbers of all true sentences of elementary arithmetic is not recursively enumerable. Indeed, by Tarski's theorem, this set is not even arithmetically definable - though it is, to be sure, set theoretically definable.

As we have already remarked, the members of a recursively enumerable set can all be generated one by one in a mechanical fashion. Thus, from the fact that the set of theorems of an axiomatizable theory T is recursively enumerable, it follows that for each such theory T one could build a machine M1 which would generate the theorems of T one by one. For any formula A of T, if A is a theorem of T, then within some finite length of time that machine will show that A is a theorem of T. Suppose now that the set of formulas of T which are not theorems of T is also recursively enumerable. Then this set, too, could be generated by a machine M2. It would then follow that T is a decidable theory. For, given any formula A of T, within some finite length of time either M1 will show that A is a theorem of T, or M2 will show that A is not a theorem of T.
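The two-machine argument can be sketched as a short Python procedure that alternates between two enumerations until the item sought turns up in one of them. The generators `evens` and `odds` are stand-ins, assumed purely for illustration, for M1 (enumerating members) and M2 (enumerating non-members).

```python
from itertools import count
from typing import Callable, Iterator

def decide(n: int,
           enum_members: Callable[[], Iterator[int]],
           enum_nonmembers: Callable[[], Iterator[int]]) -> bool:
    """Run both enumerations in alternation until n appears in one of them.

    Terminates for every n provided the two enumerations together cover
    all natural numbers, i.e. they list a class and its complement.
    """
    members, nonmembers = enum_members(), enum_nonmembers()
    while True:
        if next(members) == n:
            return True
        if next(nonmembers) == n:
            return False

def evens() -> Iterator[int]:      # stand-in for M1
    return (2 * k for k in count())

def odds() -> Iterator[int]:       # stand-in for M2
    return (2 * k + 1 for k in count())

print(decide(10, evens, odds))  # True
print(decide(7, evens, odds))   # False
```

The alternation matters. Running M1 to completion before consulting M2 would never terminate on a non-member.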

8.8. Undecidability

A theory T is essentially undecidable, recall, if and only if it is consistent and all consistent extensions of T (including T itself) are undecidable. 22

We now prove the following general undecidability theorem, where T is any theory whose logical basis includes at least the first-order predicate logic with identity:

General Undecidability Theorem: If T is a consistent theory in which all recursive functions and relations are representable, then T is essentially undecidable.

The diagonal proof of this theorem is as follows: Since all recursive functions and relations are representable in T, the property of being a theorem of T is at least definable in T, as is the diagonal function. Let '$\mathrm{Thm}_T(x)$' and '$D(x)$', then, be predicates in T denoting the property of being a theorem of T and the diagonal function, respectively. Then the following formula will appear in T:

(A) ${\sim}\mathrm{Thm}_T(D(x))$.

Let its Gödel number be m, and let G be the following closed formula:

(G) ${\sim}\mathrm{Thm}_T(D(\bar{m}))$.

This formula - which is, of course, the famous Gödel formula G for the theory T - is clearly true if and only if it is not a theorem

22 The classic work in the whole area of undecidability is A. Tarski, A. Mostowski, R.M. Robinson 1953. See also A. Tarski 1949; S.C. Kleene 1952, pp. 432-439; Kleene 1967, pp. 273-282.


of T, since the Gödel number of G is the number D(m). Suppose now that the property $\mathrm{Thm}_T$ is a recursive property. Then it is representable in T. But then (by the definition of 'representable'), (a) if the sentence with Gödel number D(m) is a theorem, then $\vdash_T \mathrm{Thm}_T(D(\bar{m}))$; and (b) if that sentence is not a theorem, then $\vdash_T {\sim}\mathrm{Thm}_T(D(\bar{m}))$. Consider now the following argument:

(1) Suppose '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is true.
(2) Then '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is not a theorem of T.
(3) But then, by (b), $\vdash_T {\sim}\mathrm{Thm}_T(D(\bar{m}))$; that is, '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is a theorem of T.
(4) Suppose next that '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is not true.
(5) Then '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is a theorem of T.
(6) Thus, by (a), $\vdash_T \mathrm{Thm}_T(D(\bar{m}))$.
(7) Thus, since T is consistent, not $\vdash_T {\sim}\mathrm{Thm}_T(D(\bar{m}))$. That is, '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is not a theorem of T.

This argument clearly shows that '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is true if and only if it is a theorem of T. Since this contradicts the fact that '${\sim}\mathrm{Thm}_T(D(\bar{m}))$' is true if and only if it is not a theorem of T, the supposition that the property $\mathrm{Thm}_T$ is recursive must be false. The property of being a theorem of T, then, is not recursive; thus, T is undecidable. Since definability in T and representability in T imply definability and representability in all consistent extensions of T, all consistent extensions of T will be undecidable; that is, T is essentially undecidable. Thus the General Undecidability Theorem is proved.

If now we suppose further that T is an axiomatizable theory, then the class of Gödel numbers of theorems of T will be a recursively enumerable class which is not a recursive class.

From the General Undecidability Theorem there follows as a corollary the

Church-Rosser Theorem (1936): The theory N is essentially undecidable.

Furthermore, since the theory of Robinson's arithmetic is consistent and all recursive functions and relations are representable in it, it follows that this very weak finitely axiomatizable theory is essentially undecidable. We can use this fact now to establish the undecidability of the class of theorems of the underlying logic of Robinson's arithmetic; viz., the first-order predicate logic with identity (with the non-logical constants of Robinson's arithmetic). For let A be the conjunction of the finitely many non-logical axioms of Robinson's arithmetic, and let B be any formula of Robinson's arithmetic. Then if the first-order predicate logic with identity (with the non-logical constants of Robinson's arithmetic) were decidable, we would have an effective method for determining whether the formula $A \supset B$ were a theorem of this logic. But such a method would be a decision method for Robinson's arithmetic: if $A \supset B$ is a theorem of this underlying logic, then B is a theorem of Robinson's arithmetic; otherwise, not. Furthermore, since it is possible to axiomatize the first-order predicate logic with identity in such a way as to use only finitely many axioms for identity, we also have
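The shape of this reduction can be written down as a short Python sketch. Everything named here is hypothetical: `logic_decides` stands for the decision method for the underlying logic that the argument assumes (and then shows cannot exist), and formulas are plain strings with '->' standing in for the conditional.

```python
from typing import Callable

Formula = str

def make_arithmetic_decider(logic_decides: Callable[[Formula], bool],
                            axioms_conjunction: Formula) -> Callable[[Formula], bool]:
    """Turn a (hypothetical) decision method for the logic into one for the theory.

    B is a theorem of the finitely axiomatized theory exactly when the
    conditional 'A -> B' is a theorem of the underlying logic, where A is
    the conjunction of the non-logical axioms.
    """
    def theory_decides(b: Formula) -> bool:
        return logic_decides(f"({axioms_conjunction}) -> ({b})")
    return theory_decides

# A stub oracle, for demonstration only; it is not a real decision method.
def stub_oracle(formula: Formula) -> bool:
    return formula.endswith("-> (0=0)")

decide = make_arithmetic_decider(stub_oracle, "A1 & A2")
print(decide("0=0"))
```

Since Robinson's arithmetic is undecidable, no genuine `logic_decides` can be supplied; the sketch only makes the dependence explicit.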

Church's Theorem for the first-order predicate logic (1936): The class of theorems of the first-order predicate logic is undecidable.
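The shape of the reduction used above (a decision method for the underlying logic would yield one for any finitely axiomatized theory, by testing A ⊃ B) can be sketched in code. Since no decision method for first-order logic exists, the sketch substitutes propositional logic, where truth tables provide one; this substitution, the tuple encoding of formulas, and the names logic_decides and theory_decides are all assumptions made for illustration.

```python
from itertools import product

# Toy formula language (an assumption for illustration): a formula is a
# string (propositional variable), ("not", f), ("and", f, g) or ("imp", f, g).


def variables(formula):
    """Return the set of propositional variables occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(variables(sub) for sub in formula[1:]))


def evaluate(formula, assignment):
    """Truth value of a formula under an assignment of variables to booleans."""
    if isinstance(formula, str):
        return assignment[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], assignment)
    if op == "and":
        return evaluate(formula[1], assignment) and evaluate(formula[2], assignment)
    if op == "imp":
        return (not evaluate(formula[1], assignment)) or evaluate(formula[2], assignment)
    raise ValueError(f"unknown connective: {op!r}")


def logic_decides(formula):
    """Stand-in decision method for the underlying logic: a truth-table test."""
    names = sorted(variables(formula))
    return all(
        evaluate(formula, dict(zip(names, values)))
        for values in product([False, True], repeat=len(names))
    )


def conjunction(axioms):
    """Form the conjunction A of a non-empty finite list of axioms."""
    result = axioms[0]
    for axiom in axioms[1:]:
        result = ("and", result, axiom)
    return result


def theory_decides(axioms, b, logic_decider=logic_decides):
    """B is a theorem of the theory if and only if A imp B is a theorem of the logic."""
    return logic_decider(("imp", conjunction(axioms), b))


if __name__ == "__main__":
    toy_axioms = ["p", ("imp", "p", "q")]
    print(theory_decides(toy_axioms, "q"))  # q follows from the toy axioms
    print(theory_decides(toy_axioms, "r"))  # r does not
```

Read contrapositively, this is the argument of the text: because Robinson's arithmetic has no decision method, no function playing the role of logic_decides can exist for its underlying logic.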

Since the second-order theory of arithmetic N2 is consistent and all recursive functions and relations are representable in it, we can now use the General Undecidability Theorem to establish the undecidability of the theory N2. Because this theory has only finitely many axioms, it follows that the class of theorems of its underlying logic - viz., the second-order predicate logic (with the non-logical constants '0' and 'S') - is an undecidable class.

As a further example of an undecidable theory, we have the elementary theory of groups (Tarski, 1946). This theory, however, is not essentially undecidable, because the elementary theory of Abelian groups (page 82), which is an extension of it, is known to be decidable (W. Szmielew, 1950). Thus, all theories fall into three classes: (a) decidable theories; (b) essentially undecidable theories; and (c) theories which are undecidable but not essentially undecidable. In class (a) there is, for example, the elementary theory of complete ordered fields (elementary algebra), 23 as well as the elementary theory of dense ordering (page 77). Further examples are the elementary theory of addition of natural numbers, the elementary theory of multiplication of natural numbers, and the elementary theory of Abelian groups. In class (b) are all consistent extensions of Robinson's arithmetic; and in class (c), we have as examples elementary group theory, and the elementary theories of fields and ordered fields.

23 The decidability of elementary algebra is due to Tarski, who established this result by means of the so-called method of eliminating quantifiers (Skolem-Tarski). This method shows how to eliminate the quantifiers one by one in any formula of elementary algebra, resulting in an equivalent formula which is free of quantifiers, and whose validity can be readily ascertained. Tarski's solution is very complex, however, and is as yet of little practical importance. The decidability of the elementary theory of Abelian groups has also been shown by means of the same method.

As another very interesting example of theories falling under (b), consider the following finitely axiomatizable fragment of set theory (W. Szmielew and Tarski, 1950). Let S be the elementary theory with identity whose only non-logical constant is 'E'. The non-logical axioms of S are: (1) an axiom of extension, stating that any two sets with the same elements are identical; (2) an axiom stating the existence of a null set; and (3) an axiom stating that for any two sets X and Y there is a set Z whose elements are those elements which are elements of X or are identical with Y. It is known that this theory S is essentially undecidable. 24 Since the usual elementary axiomatic set theories - e.g., ZF - are extensions of S, it follows that these theories are all undecidable, supposing that they are consistent. (This result, of course, already follows from the General Undecidability Theorem, together with the fact that recursive functions and relations are representable within the usual axiomatic set theories.) Furthermore, it follows that the first-order predicate logic with identity which has a binary predicate constant as its sole non-logical constant is undecidable.

24 A. Tarski, A. Mostowski, R.M. Robinson 1953, p. 34.

Thus we now have two examples of consistent, finitely axiomatizable and essentially undecidable theories: the fragment of set theory S, and Robinson's arithmetic. Both of these theories are very weak, and thus very useful for showing that other theories are undecidable. A further example of a finitely axiomatizable and essentially undecidable theory is the von Neumann-Bernays-Gödel set theory. This theory is far too powerful, however, to be useful for establishing the undecidability of other theories.

An interesting question is whether there is an effective method for deciding whether an arbitrary theory is decidable. The answer to this question is in the negative. It is known that this so-called second-degree decision problem is unsolvable: there is no decision method for determining whether an arbitrary theory is decidable. 25

We close now with a number of general theorems concerning decidability, undecidability and essential undecidability.

(1) It is clear that if a theory T is decidable, then T is axiomatizable. For to axiomatize T, it would suffice to take as axioms the decidable (i.e., recursive) class of all theorems of T.

(2) For a complete theory T, the following two conditions are equivalent: (a) T is undecidable; (b) T is not axiomatizable. Now all theories whose axioms are semantically defined as all sentences of the theory which are true under a certain interpretation are clearly complete (as well as consistent). It follows that all such theories are either both axiomatizable and decidable, or neither axiomatizable nor decidable. Elementary algebra is an example of the first sort of theory; and Skolem's arithmetic is an example of the second.

(3) From (2) there follows: Let T be an axiomatizable theory. Then if T is complete, T is decidable; thus, if T is undecidable, T is incomplete. Here, then, is another way of establishing incompleteness. As illustration, let T be any consistent axiomatizable extension of Robinson's arithmetic. Then T will be undecidable, and thus incomplete. It is not true in general, however, that if T is an axiomatizable theory, then if T is decidable then T is complete. A counter-example is elementary Abelian group theory.

Suppose now that in some way we can show that a particular axiomatizable theory T is complete; e.g., by means of Vaught's test. By (3), we can immediately conclude that T is decidable. For example, we remarked in Chapter III that by using Vaught's test we can show that the elementary theory of dense ordering is complete. By (3), since this theory is axiomatizable, we conclude that it is decidable.

25 A. Tarski, A. Mostowski, R.M. Robinson 1953, pp. 34-35.

(4) Let T1 be a consistent extension of T2. If T2 is essentially undecidable, then T1 is also essentially undecidable.
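The standard reasoning behind theorem (3) can be sketched as a search procedure: since the theorems of an axiomatizable theory can be effectively listed, and a complete consistent theory proves exactly one of each sentence and its negation, the listing is simply searched until one of the two appears. The sketch below is an illustration under stated assumptions, not a construction from the text: the toy theory of parity, the tuple encoding of sentences, and the names decide, negate and parity_theorems are all hypothetical.

```python
from itertools import count


def negate(sentence):
    """Negation in the toy encoding; a double negation is cancelled."""
    if sentence[0] == "not":
        return sentence[1]
    return ("not", sentence)


def parity_theorems():
    """Effective listing of the theorems of a toy complete theory.

    Assumption for illustration: the only sentences are ("even", n) and
    ("not", ("even", n)) for natural numbers n, and exactly the true ones
    are theorems.
    """
    for n in count():
        yield ("even", n) if n % 2 == 0 else ("not", ("even", n))


def decide(sentence, negate, enumerate_theorems):
    """Return True if `sentence` is a theorem, False if its negation is.

    Termination relies on completeness: one of the two must eventually
    appear in the listing. For an incomplete theory the loop may run forever.
    """
    refutation = negate(sentence)
    for theorem in enumerate_theorems():
        if theorem == sentence:
            return True
        if theorem == refutation:
            return False


if __name__ == "__main__":
    print(decide(("even", 4), negate, parity_theorems))
    print(decide(("even", 7), negate, parity_theorems))
```

The same search applied to a consistent axiomatizable extension of Robinson's arithmetic cannot always terminate, which is the incompleteness conclusion drawn under (3).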

Following the Tarski 1949 abstract, we now define a number of concepts which permit us to state several further general theorems which have proved very useful for establishing undecidability results.

A theory T1 is a finite extension of a theory T2 if and only if T1 is an extension of T2 (that is, all theorems of T2 are theorems of T1), and only finitely many axioms of T1 are not theorems of T2. Theories T1 and T2 are compatible if they have the same non-logical constants and a consistent common extension. And T1 is weakly (or consistently) interpretable in T2 if and only if T1 and T2 have a common consistent extension T such that every non-logical constant k in T1 which is not in T2 is definable in T (in the sense of Chapter IV, page 105) in terms of non-logical constants of T2 and possibly individual constants of T. That is, for each such constant k, there is a theorem of T which is a possible definition of k in terms of these non-logical constants. We now have the following theorems. 26

(5) If T is undecidable, then every theory T1 with the same constants of which T is a finite extension is undecidable. We have already presupposed this theorem in pointing out, for example, that the undecidability of the fragment of set theory S implies the undecidability of its underlying logic.

(6) T is essentially undecidable if and only if it is consistent and no consistent and complete extension of T is decidable.

(7) If T is essentially undecidable, finitely axiomatizable, and compatible with T1, then T1 is undecidable (though not necessarily essentially undecidable).

(8) If T is essentially undecidable, finitely axiomatizable, and weakly interpretable in T1, then T1 is compatible with an essentially undecidable and finitely axiomatizable theory T2, and thus by (7) T1 is undecidable. Indeed, every subtheory of T1 which has the same constants as T1 is undecidable. By virtue of this important theorem, then, in showing that T1 is undecidable it suffices to find possible definitions for the relevant non-logical constants of T not necessarily in T1, but in some consistent common extension of T1 and T.

26 For proofs, see A. Tarski, A. Mostowski, R.M. Robinson 1953, Part I. This part was written by Tarski; its results date from 1938-1939.

By virtue of the General Undecidability Theorem, we are able to show that any consistent theory in which all recursive functions and relations are representable is undecidable. Theorem (8), on the other hand, permits us to establish undecidability in a far wider range of theories, once given the existence of theories which are essentially undecidable, finitely axiomatizable, and sufficiently weak so as to be easily weakly interpreted in other theories. Once given such theories T, theorem (8) permits us to establish the undecidability of theories T1 which are often far removed in their mathematical content from that of T, and often too weak to be shown to be undecidable directly by means of the General Undecidability Theorem. Thus we see the importance for the decision problem of such theories as Robinson's arithmetic and the fragment of set theory S. In particular, largely by drawing upon Robinson's arithmetic, the undecidability of the elementary theories of groups, rings, fields, ordered fields, abstract projective geometries, as well as many other elementary theories has been established. For further results and proofs, see Undecidable Theories.

BIBLIOGRAPHY

Ackermann, W., See Hilbert and Ackermann.

Bar-Hillel, Y., See Fraenkel and Bar-Hillel.

Bernays, P., See Hilbert and Bernays.

Beth, E., 1959. The Foundations of Mathematics. North-Holland Publishing Company, Amsterdam.

Carnap, R., 1956. Meaning and Necessity: A Study in Semantics and Modal Logic. The University of Chicago Press, Chicago.

Church, A., 1936. An Unsolvable Problem of Elementary Number Theory. American Journal of Mathematics, Vol. 58, pp. 345-363.
    1936a. A Note on the Entscheidungsproblem. Journal of Symbolic Logic, Vol. 1, pp. 40-41; 101-102.
    1956. Introduction to Mathematical Logic, I. Princeton University Press, Princeton.
    1965. Review of R.B. Braithwaite's Introduction to Kurt Gödel: On Formally Undecidable Propositions of Principia Mathematica and Related Systems (translated by B. Meltzer). Journal of Symbolic Logic, Vol. 30, pp. 358-359.

Cohen, P.J., 1963, 1964. The Independence of the Continuum Hypothesis, I, II. Proceedings of the National Academy of Sciences, Vol. 50, pp. 1143-1148; and Vol. 51, pp. 105-110.
    1966. Set Theory and the Continuum Hypothesis. W.A. Benjamin, Inc., New York.

Cresswell, M.J., See Hughes and Cresswell.

Fraenkel, A.A. and Bar-Hillel, Y., 1958. Foundations of Set Theory. North-Holland Publishing Company, Amsterdam.

Gödel, K., 1930. Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik, Vol. 37, pp. 349-360. English translation appearing in van Heijenoort 1967.
    1931. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, Vol. 38, pp. 173-198. English translation appearing in van Heijenoort 1967.
    1938. The Consistency of the Axiom of Choice and of the Generalized Continuum Hypothesis. Proceedings of the National Academy of Sciences of the U.S.A., Vol. 24, pp. 556-557.
    1940. The Consistency of the Axiom of Choice and of the Generalized Continuum Hypothesis with the Axioms of Set Theory. Princeton University Press, Princeton.
    1947. What Is Cantor's Continuum Problem? The American Mathematical Monthly, Vol. 54, pp. 515-525.

Halmos, P., 1960. Naive Set Theory. D. Van Nostrand Company, Inc., Princeton.

Henkin, L., 1949. The Completeness of the First-Order Functional Calculus. Journal of Symbolic Logic, Vol. 14, pp. 159-166. Reprinted in Hintikka 1969.
    1950. Completeness in the Theory of Types. Journal of Symbolic Logic, Vol. 15, pp. 81-91. Reprinted in Hintikka 1969.
    See also Montague and Henkin.

Hilbert, D. and Ackermann, W., 1950. Principles of Mathematical Logic. Chelsea Publishing Company, New York.

Hilbert, D. and Bernays, P., 1934, 1939. Grundlagen der Mathematik. Vol. I (1934). Vol. II (1939) Berlin.

Hintikka, J. (ed.), 1969. The Philosophy of Mathematics. Oxford University Press.

Hughes, G.E. and Cresswell, M.J., 1968. An Introduction to Modal Logic. Methuen and Co., Ltd., London.

Jeffrey, R.C., 1967. Formal Logic: Its Scope and Limits. McGraw-Hill Book Company, New York.

Kalish, D., See Montague and Kalish.

Kemeny, J.G., 1958. Undecidable Problems of Elementary Number Theory. Math. Annalen, Vol. 135, pp. 160-169.

Kleene, S.C., 1952. Introduction to Metamathematics. North-Holland Publishing Company, Amsterdam.
    1967. Mathematical Logic. John Wiley & Sons, Inc., New York.

Lyndon, R.C., 1966. Notes on Logic. D. Van Nostrand Company, Inc., Princeton.

Mates, B., 1965. Elementary Logic. Oxford University Press.

Mendelson, E., 1964. Introduction to Mathematical Logic. D. Van Nostrand Company, Inc., Princeton.

Montague, R., 1965. Set Theory and Higher-Order Logic. Appearing in J.N. Crossley and M.A.E. Dummett, eds., Formal Systems and Recursive Functions. North-Holland Publishing Company, Amsterdam.

Montague, R. and Henkin, L., 1956. On the Definition of Formal Deduction. Journal of Symbolic Logic, Vol. 21, pp. 129-136.

Montague, R. and Kalish, D., 1964. Logic: Techniques of Formal Reasoning. Harcourt, Brace & World, Inc., New York.

Mostowski, A., 1966. Thirty Years of Foundational Studies. Barnes & Noble, New York. See also Tarski, Mostowski, and Robinson.

Quine, W.V.O., 1953. Mr. Strawson on Logical Theory. Mind, Vol. 63, pp. 433-451.
    1960. Word and Object. The Massachusetts Institute of Technology.
    1966. The Ways of Paradox. Appearing in W.V.O. Quine, The Ways of Paradox and Other Essays. Random House.

Robbin, J.W., 1969. Mathematical Logic: A First Course. W.A. Benjamin, Inc., New York.

Robinson, R.M., See Tarski, Mostowski, and Robinson.

Rubin, H. and Rubin, J., 1963. Equivalents of the Axiom of Choice. North-Holland Publishing Company, Amsterdam.

Shoenfield, J.R., 1967. Mathematical Logic. Addison-Wesley Publishing Company.

Suppes, P., 1957. Introduction to Logic. D. Van Nostrand Company, Inc., Princeton.
    1960. Axiomatic Set Theory. D. Van Nostrand Company, Inc., Princeton.

Tarski, A., 1949. On Essential Undecidability (abstract). Journal of Symbolic Logic, Vol. 14, pp. 75-76.
    1951. A Decision Method for Elementary Algebra and Geometry, 2nd ed., Rev. University of California Press, Berkeley and Los Angeles.
    1956. Logic, Semantics, Metamathematics. Papers from 1923 to 1938. Translated by J.H. Woodger. Oxford University Press.
    1959. What is Elementary Geometry? Appearing in L. Henkin, P. Suppes, and A. Tarski, eds., The Axiomatic Method, With Special Reference to Geometry and Physics. North-Holland Publishing Company, Amsterdam. Reprinted in Hintikka 1969.
    1965. Introduction to Logic and to the Methodology of Deductive Sciences, 3rd ed., Rev. Oxford University Press.

Tarski, A., Mostowski, A. and Robinson, R.M., 1953, Undecidable Theories. North-Holland Publishing Company, Amsterdam.

van Heijenoort, J. (ed.), 1967. From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931. Harvard University Press, Cambridge.

Wang, H., 1952. Truth Definitions and Consistency Proofs. Transactions of American Mathematical Society, Vol. 73, pp. 243-275.

