On the existence of infinitesimals

Richard Kaye

School of Mathematics

University of Birmingham

26th May 2010

Abstract

So-called nonstandard mathematics uses infinite and infinitesimal numbers to develop mathematics, especially calculus, without the use of the notion of limit. These methods are rigorous and all results are provable in usual mathematics or ZFC, and these nonstandard numbers are typical of modern mathematics in that they abstract familiar operations and concepts (in this case, of limit in the ε–δ analysis) as mathematical objects in their own right. This makes them a good test case for the study of what a mathematical object is, and issues relating to mathematical truth and knowledge. This paper studies, from a mathematical and quasi-philosophical point of view, the potential existence of such nonstandard numbers and its implications for problems of existence, realism and platonism of mathematical objects, and indeed uses it to develop ideas of what mathematical objects really are. Structuralist and utilitarian views are emphasised and we conclude with suggestions for a theory of ‘pure structuralism’ that might unite mathematical and general thought.

Contents

1 Introduction

2 Mathematical existence

3 Existence of infinitesimals

4 The case for ‘pure’ structuralism

5 Questions for further research

This is the second version of this paper and is a complete rewrite of an earlier one. I continue to be interested in the material here and I am still developing it. This paper is issued as a preview of work in progress. All comments are most welcome! Part of the material here was used as the basis of a colloquium to the Philosophy Department of Warwick University, in the summer of 2010, and there was a long and useful discussion afterwards.

1 Introduction

The nature of mathematical objects, whether they are abstract or real, how one might reason about them, and how one might understand the meaning of statements concerning them, are substantial and interesting philosophical questions.¹ It is quite common in such discussion to direct attention towards the case of natural numbers, 0, 1, 2, . . . , as examples of mathematical objects, because of their comparative simplicity and their familiarity in the non-mathematical world, as well as their being objects of on-going research in pure mathematics. This very familiarity, together with the fact that they are so fundamental, makes natural numbers rather atypical examples of the kind of mathematical objects usually found in mathematical practice, and therefore not necessarily the most useful examples for our philosophical questions. This paper will comment on the important questions of the nature of mathematical objects by focusing on another example, that of infinitesimal numbers. It is hoped that by looking at numbers that have a seemingly contradictory or impossible nature but nevertheless (according to mainstream mathematics) incontrovertibly exist, we may learn something about the nature of mathematical objects. In any case, these are particularly interesting numbers in their own right, with many potential applications.

Infinitesimals are numbers that are ‘so small that there is no way to see them or to measure them’. More formally, in an ordered field, a positive number x is infinitesimal if 0 < x < 1/n holds for each ordinary positive natural number n, i.e. each n of the form 1 + 1 + · · · + 1. Newton and Leibniz both used infinitesimals in the development of their calculus, but were famously criticised by Berkeley. In the nineteenth century Cauchy, and also Riemann and Weierstrass and many others, replaced the notion of infinitesimal with that of limit. But in 1966, Abraham Robinson’s book Non-standard Analysis showed that, with the use of techniques from first-order logic, in particular the Completeness, Soundness and Compactness Theorems, the notion of infinitesimal could be put on a firm foundation and be useful enough to develop the calculus in the way Newton and Leibniz intended [9]. The name of Robinson’s theory is often abbreviated to NSA.²
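The formal definition can be written out as a display. The following is only a restatement, with the field name F chosen here for illustration; it also records the standard observation that the real numbers, being Archimedean, contain no infinitesimals.

```latex
% Definition restated: x in an ordered field F is a (positive) infinitesimal when
\[
  0 < x < \frac{1}{n}
  \qquad \text{for every } n = \underbrace{1 + 1 + \cdots + 1}_{n \text{ summands}}, \quad n \ge 1 .
\]
% In the reals this never happens: by the Archimedean property, for each real
% x > 0 there is a natural number n with n x > 1, that is, with 1/n < x.
\[
  \forall x \in \mathbb{R} \; \bigl( x > 0 \;\Longrightarrow\; \exists n \in \mathbb{N} \; ( n \ge 1 \,\wedge\, \tfrac{1}{n} < x ) \bigr).
\]
```

So any ordered field containing an infinitesimal must be non-Archimedean, and hence is not isomorphic to the usual real number system.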

Thus, according to Robinson at least, infinitesimals exist and can be used profitably in analysis. However, this does not completely deal with questions to do with such numbers, questions I will associate with their ‘existence’ for reasons that will hopefully become clear. For example, the fact that the existence of infinitesimals follows from other axioms of mathematics (or set theory) can be used as a way to focus on those axioms and provide a testing ground for questions about those axioms: whether we believe them, or in what sense we believe they model the (or a) valid mathematical universe.

This paper attempts to be a mathematician’s view on questions on the nature of mathematics, mathematical objects and their existence, and mathematical reasoning. There is a range of mathematical ideas here, which I attempt to tell ‘straight’ without over-simplification, concentrating on the mathematical view where there is a choice. I make comments on the underlying philosophy where I am able, without being particularly thorough. A more thorough and detailed examination of these ideas might constitute a new research project in its own right, or possibly more than one.

¹ The volume of essays edited by W.D. Hart [4] is an excellent introduction to these issues.

² Robinson called the new numbers in his system ‘non-standard’ to distinguish them from the ‘standard’ numbers or usual numbers of other kinds of mathematics. Thus ‘standard’ and ‘non-standard’ are technical terms with precise meanings. Unfortunately, many people reading ‘Non-standard Analysis’ see it incorrectly as meaning the activity of analysis done in a non-standard way, and this easily becomes a pejorative term for the subject. Most recent authors write ‘nonstandard’ without the hyphen to emphasise the technical meaning of the word, and I will follow this convention here.

The paper is organised as follows. After this introduction, the first main section, Section 2, contains descriptions of four main points of view of mathematical research, as a working mathematician would see them. These four viewpoints are not mutually exclusive, nor do I claim that the list is complete. I suggest ways in which the four views merge into each other, but these are not the only ways. (Indeed this property of mathematics, that it can be looked at in different ways and at different levels simultaneously, is one of its strengths.) In the following section, Section 3, I shall present the case for and against infinitesimal numbers in each of the four views. My presentation is again mostly mathematical, though I try to make speculative suggestions wherever possible. Section 4 contains my personal conclusions from this thought, which are that the structuralist view is essential, not just for mathematics, but for everyday thought and arguments too. However, the structuralist view as sometimes presented (e.g. Parsons’s essay [7] reprinted in Hart [4, Chapter XIII]) has problems which must be addressed. My suggestion is essentially that the structuralist account does not go far enough, and this section concludes with what might be described as a brief manifesto, or research proposal, for what I call ‘pure structuralism’. I conclude with a list of research questions arising out of the discussion that I think are worthy of further study. An appendix presents some technical information on nonstandard mathematics for the interested reader.

The four viewpoints are as follows. Firstly, what I call the view from a unifying theory is the idea that all mathematics can be done, or perhaps even is best done, as if from a single unifying theory. A set theory such as ZFC is typically used. I mention this view first because it seems to be the most common ground for the majority of mathematicians. It is also a useful simplifying view for mathematical practice: it may not be optimal for all work or represent the full views of any particular working mathematician, but it is ideal for a first presentation of new work. Secondly, there is what I call the pluralist view, where the main work of mathematics is in looking at a large variety of different number systems, with ‘number’ being taken in the widest possible sense, and I would include geometrical arguments within these terms. The underlying theory is weakened but the construction of these systems is still possible in ZFC. These number systems are regarded as the most important aspect of mathematics, and some of them are more fundamental than others because of their ease of construction and applications. Chief amongst these systems are those for natural numbers, integers, rationals, reals and complexes. The third of my viewpoints is the structuralist one, that a system of numbers or other mathematical objects takes abstract meaning through what it does rather than what it is: the axioms it satisfies rather than the way it is constructed. Here, the axioms take priority, but constructions are still required to show that such objects exist. The most important feature that this brings to bear for us is canonicity. Remarkably often it is possible to prove that two systems satisfying the same axioms are isomorphic, so that the number systems are ‘naturally forced upon us’, or canonical, and sometimes even more: sometimes the canonicity itself is canonical. This, it will be seen, has very deep consequences for the structuralist view of mathematical objects. Finally, we consider what I call the utilitarian view, that mathematical objects that are useful for other kinds of mathematics, and other applications such as science, are the ones that deserve most attention and the ones that may be said to have existence in their own right. Apart from the issue of practicality, an argument related to Hilbert’s programme supports this view. It will be seen that the issues of canonicity are essential here and also support the utilitarian existence of certain objects; in particular, canonicity is especially interesting with respect to applications for physical theories, even possibly quantum mechanics.

This lists my four viewpoints and summarises the content of Section 2. I should emphasise that this section is not complete, even from the mathematical perspective, in the sense that there are many other sensible viewpoints, including different ways of combining or prioritising the viewpoints I have given. One major omission is the more modern view of mathematics as done on a computer, with computer algebra systems or similar. This is particularly poorly represented and presents interesting philosophical, mathematical, and computational problems, related to constructivism but perhaps distinct from it too. But perhaps the biggest omission is the lack of discussion of the various ways the views fit together and what extra they offer when they are taken together. In particular I do not mention anything like a constructivist or intuitionist framework for combining these views, this being where I suspect intuitionistic mathematics might still have its greatest impact.

2 Mathematical existence

The main issue, according to Benacerraf [1] (reprinted as Hart [4, Chapter I]), is that the theory of truth of mathematical statements should be consistent with that of everyday truth, and so should the idea of knowledge. There are a number of suggestions, not all incompatible with the others, for answers to the questions, ‘What is a mathematical object?’ and ‘What does it mean to say that an object exists?’ In this section we highlight some of the main options and choices from the point of view of a typical working mathematician and actual mathematical practice. In each case I will hint at how mathematics is done, and what the idea of its underlying foundation is. (These different views are not exclusive of each other. Indeed one of the interesting things about mathematics is the way the various different foundational views can work together as different aspects of one’s work or at different levels.)

The unifying view of mathematics. By ‘the unifying view’ I mean the view of mathematics that it can all be done from one global theory, such as that of set theory, typically ZFC, and that whether or not one chooses to write proofs and other arguments formally (and most choose not to) it is clear from their presentation that they all can be written in this way. Therefore one’s work contributes at the very least to the body of knowledge of consequences of the global theory.

I think it is fair to say, however, that most mathematicians do not give their underlying theory much attention, preferring to ‘get on with the job of doing mathematics’. But if they were asked, they could give a list of principles they find admissible for deduction which would probably amount to first-order logic and axioms that would be available in a (possibly multi-sorted) version of ZFC set theory. Most mathematicians in this sense are rather conservative, and understand that this conservatism places them well within the realm of ZFC. This places their work on a reasonably sound footing, at least according to one of the standard paradigms, but their belief in the soundness of mathematics is typically much stronger. They have little difficulty mentally picturing the set theoretical universe described by the ZFC axioms, or (more realistically) the part they are working on, as existing in some sort of platonic way. They ‘get on’ with their mathematics, which is to say, they posit the idea and consequences of there being particular objects with particular properties, both informally (using images, diagrams, analogies, and so on) and semi-formally. Research mathematicians have generally trained themselves to be ‘pessimistic’ about their picture of the universe; that is to say the mental picture they have is generally inclusive of all possibilities and therefore necessarily somewhat incomplete. This ‘pessimism’ arises because the informal ‘brain-storming’ stage is important for successful work, but from experience they know it can be unreliable. Potentially unreliable arguments generated by informal means are always carefully checked, verified and communicated in a rather different semi-formal style when it is believed that some important conjecture that can be proved has been identified.

When working semi-formally, proofs are written down in a mixture of natural language and mathematical symbolism in such a way that they could in principle be rewritten or developed into formal proofs in first-order logic. Few mathematicians work in anything other than first-order logic. The logical principles used in such proofs are, however, always phrased in a way that is compatible (via an informal version of the soundness theorem) with the notion of truth (defined using something like Tarski semantics) relative to the universe they conceive of as (at least for the moment) existing platonically. These proofs can be rewritten as formal proofs in ZFC and are frequently interpreted as such, but at the time of conception the syntactical proof rules are rather considered as semantic rules concerning truth and possible situations which relate to the conception of the universe being ‘explored’.

If an example or algorithm or other object is explicitly exhibited rather than shown to exist by non-constructive means, a mathematician will usually say so, rather than leave the realm of classical logic. Proofs are deliberately written in a semi-formal way because mathematicians know that there may be a number of subtly different interpretations of what they write, and they emphasise the (semi-formal) arguments rather than the pure statements of their results to aid these different interpretations. A proof can be read as the reason why some statement is true, but also often as a method or process by which to carry out a calculation, and although mathematicians are generally unfamiliar with intuitionistic logic and other constructive logics, they do present proofs as methods when appropriate.

Of course, some mathematicians are more familiar with foundational matters and may explicitly state they are ‘working in ZFC’ or similar. A few may be working in areas where additional axioms (such as CH, GCH, AD or large cardinal assumptions) are useful and typically pick and choose from this list of additional axioms as suits them. In any of these cases, mathematics is usually done in the first instance within some standard logical framework, such as ZFC, and the actual work taking place is both informal and semi-formal, but both parts are conceived by the mathematician as taking place in some semantic manner relative to the conceived or imagined set theoretic universe.

The pluralist view of mathematics. Mathematics is primarily (but not exclusively) about numbers, where ‘number’ is often taken in the most general sense possible. Some number systems are of particular importance, and new number systems are usually built from more fundamental ones. Arguably the most fundamental number system of all is the system N of the natural numbers, 0, 1, 2, 3, . . . . These are frequently described as being the numbers corresponding to a finite sequence of strokes on the page, the number 5 corresponding to ||||| for example. The natural number system is extended to the integers, Z, the rationals, Q, the reals, R, and the complexes, C. Other number systems may be devised by related means, such as polynomial rings, groups, finite and other fields, and extensions and quotients of these. Other structures that are not strictly speaking of numbers but are treated as numbers by mathematicians, such as the collection Vω of hereditarily finite sets, the collection of (finite or infinite) graphs, groups, etc., may also be defined and used.

In this view, mathematics becomes an industry of combining these ‘numbers’ from these different systems to find interesting properties or facts about them, to devise new systems, and to use these systems to model phenomena in the natural world. Some logical theory is required for this endeavour of course, but mathematicians working like this typically regard each individual system of numbers as having some genuine existence. It may be that a working mathematician will have what amounts to different logical conventions for the different areas of work. The theory for the reals might be based on first-order logic, but that for making new systems out of old may be based on some category-theoretic framework. From the foundational point of view the mathematician feels on safer ground as the theory of each system has some sort of independent life, and if one falls, being found to be inconsistent or uninteresting (that is, if some interesting mathematical result is proved stating that there is no system with the particular properties in question) then the others still stand.

In this view, each system is constructed, and exists because it is constructed. It has a particular construction, and therefore a particular definition. A rational number p/q is the equivalence class of a pair of integers (p, q) with q ≠ 0. One of the early tasks of set theory (the theory ZFC, for example) was to verify that all the constructions of all these number systems could be carried out in that theory, so that ZFC could (but not necessarily should) be regarded as a unifying theory of all these number systems. In this sense set theory was more-or-less successful (though it is inconvenient that so-called ‘large categories’ are not sets) and the pluralist view can be and is partially subsumed by the global unifying-theory view.
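The construction of a rational number as an equivalence class of integer pairs can be sketched in a few lines. This is a minimal illustration, not the paper's own formalisation: it assumes the usual relation (p, q) ~ (r, s) exactly when p·s = r·q, and the function names `equivalent`, `canonical`, `add` and `mul` are hypothetical names chosen here.

```python
from math import gcd

def equivalent(a, b):
    """Assumed relation: (p, q) ~ (r, s) exactly when p*s == r*q."""
    (p, q), (r, s) = a, b
    if q == 0 or s == 0:
        raise ValueError("second component must be non-zero")
    return p * s == r * q

def canonical(pair):
    """Pick one representative per class: lowest terms, positive denominator."""
    p, q = pair
    if q == 0:
        raise ValueError("second component must be non-zero")
    g = gcd(p, q)          # positive because q != 0
    if q < 0:
        g = -g             # dividing by a negative g makes the denominator positive
    return (p // g, q // g)

def add(a, b):
    """Addition on representatives: p/q + r/s = (p*s + r*q)/(q*s)."""
    (p, q), (r, s) = a, b
    return canonical((p * s + r * q, q * s))

def mul(a, b):
    """Multiplication on representatives: (p/q)*(r/s) = (p*r)/(q*s)."""
    (p, q), (r, s) = a, b
    return canonical((p * r, q * s))
```

The point of the sketch is that the operations are well defined on classes: replacing (1, 2) by the equivalent pair (2, 4) does not change the result of `add` or `mul` once results are reduced to the canonical representative.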

The structuralist view of mathematics. The main problem with what I have called the pluralist view of mathematics is that two workers may have different constructions of number systems and these systems need to be compared. It is certainly the case that many number systems that look like the real numbers can be constructed, and the important thing about a particular real number is not how it is actually constructed (as a Dedekind cut, the equivalence class of a Cauchy sequence, a continued fraction, or whatever) but what it does. Thus the important property of π is that it is the ratio of the circumference of the circle to the diameter, and not that it happens to be a Dedekind cut of smaller rational numbers (in the view of one construction of R). To get around this problem one writes down axioms for each structure of numbers one devises and then proves these axioms to be true for the number system in question. The name ‘axiom’ is used not because it is to be assumed without proof (on the contrary, these axioms must be proved) but because the other features of the structure will be quietly forgotten and future work regarding the structure will be from the axioms we have listed alone and nothing else.

The word ‘structure’ has crept in here because a set such as N or Q without additional structure is just amorphous and depends only on its cardinality. So the additional algebraic structure (the order relation and addition and multiplication operations in N and Q) is important to distinguish these systems. The axioms describe the properties of the elements of the set (as primitive objects, i.e. without structure in themselves) and the properties they have relate to order, addition and multiplication. The number 2/3 in Q is described by the properties it has with respect to order, addition and multiplication compared to other rationals. In other words mathematical objects like N and Q, and also the numbers themselves, are abstract objects characterised by ‘what they do’ rather than ‘what they are’. For an elementary mathematical introduction to mathematics considered this way, read Gowers [3]. A more critical philosophical account is given by Parsons [7] (Hart [4, Chapter XIII]).

There are two main aspects of this new way of thinking about mathematics. The first is that we have the basis of understanding the idea of an abstract mathematical object: an object that has no structure in itself but is characterised by what it does in certain situations. Whatever the philosophical implications of this view are, it is at least in accordance with modern mathematical practice. The second is that an abstract mathematical object is in some sense independent of how it is constructed (the actual ‘internal structure’ of a real number as a Dedekind cut or whatever) but is in fact the same object as one constructed in an entirely different way but with the same properties. Two key examples illustrate this perfectly. The axioms for the natural numbers N given by Dedekind characterise the structural properties of the natural numbers precisely. Similarly the axioms for the real numbers as being a complete Archimedean ordered field (also essentially due to Dedekind) characterise the structural properties of the reals exactly. We have the following.

First Canonicity Theorem for the Natural Numbers. Let N and M satisfy the axioms for the natural numbers. Then N and M are isomorphic.

First Canonicity Theorem for the Real Numbers. Let R and S satisfy the axioms for the real numbers. Then R and S are isomorphic.

In undergraduate lectures I like to describe a conversation between humans and some intelligent extra-terrestrial species soon after first contact. Humans and ET might struggle to agree on what constitutes something fashionable, or elegant, or even beautiful, but the mathematicians of the two races would get together and discuss axioms for real numbers, and would presumably agree on the set of axioms each species takes (or if not, prove from one set of axioms the axioms of the other, and vice versa, showing that the two sets are equivalent) and therefore be able to conclude that both humans and ET share the same concept of real number, irrespective of any ideas each race might have of their implementation using Dedekind cuts, Cauchy sequences, or whatever.

Although these theorems are well-known and appear to support the view that the structuralist approach to objects works, at least in these two cases, the story is not complete. Although we now know that all systems of reals are in fact structurally similar, these results do not tell us how they are similar. But the theorems that there is essentially only one natural number system and one real number system are even stronger than this, in another subtle but important way. Given two systems N and M of natural numbers, or two systems R and S of real numbers, not only are they isomorphic, but it is possible to show that in each case there is only one possible mapping, f : N → M or g : R → S respectively, that demonstrates this isomorphism.

Second Canonicity Theorem for the Natural Numbers. Let N and M satisfy the axioms for the natural numbers. Then there is a unique isomorphism between N and M.

Second Canonicity Theorem for the Real Numbers. Let R and S satisfy the axioms for the real numbers. Then there is a unique isomorphism f : R → S.

Not only are the ideas of ‘natural number’ and ‘real number’ canonical, or forced upon us in a natural mathematical way, but the isomorphism that shows this canonicity is canonical too. This means that however one defines real numbers, not only is the structure of real numbers essentially unique, but the individual real numbers are characterised by their properties and are also essentially unique.
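The uniqueness half of the Second Canonicity Theorem for the natural numbers can be sketched as a short induction. The display below is an outline only, assuming that the axioms are Dedekind's with a zero element, a successor map and the (second-order) induction axiom; the symbols 0_N, s_N, 0_M, s_M are notation introduced here for illustration.

```latex
% Sketch: two structure-preserving maps between systems of natural numbers agree.
% Assumed notation: (N, 0_N, s_N) and (M, 0_M, s_M) satisfy Dedekind's axioms.
\begin{align*}
  &\text{Suppose } f, g : N \to M \text{ both preserve zero and successor:}\\
  &\qquad f(0_N) = 0_M = g(0_N), \qquad
     f(s_N(n)) = s_M(f(n)), \qquad g(s_N(n)) = s_M(g(n)).\\
  &\text{Let } A = \{\, n \in N : f(n) = g(n) \,\}. \text{ Then } 0_N \in A, \text{ and}\\
  &\qquad n \in A \;\Longrightarrow\;
     f(s_N(n)) = s_M(f(n)) = s_M(g(n)) = g(s_N(n))
     \;\Longrightarrow\; s_N(n) \in A.\\
  &\text{By the induction axiom in } N,\ A = N, \text{ and hence } f = g.
\end{align*}
```

For the reals the analogous argument runs through the rationals: an isomorphism of ordered fields must fix 1 and hence every rational, and in a complete Archimedean ordered field each element is determined by the rationals below it, so the map is determined everywhere.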

The Second Canonicity Theorem has important mathematical consequences. But it has important consequences for physical applications of the reals and measurement too, which brings us on to our fourth view of mathematics.

The utilitarian view of mathematics. This is the idea that mathematical ideas, objects, and theories exist because they are necessary or useful to explain or model scientific phenomena, including other areas of mathematics. In mathematics, one can temporarily posit the existence of all sorts of mathematical structures and objects and it is remarkable from a psychological point of view how these objects can take some sort of real existence in the imagination when one starts to work with them. In this sense one can choose to believe in almost anything, including the leprechaun with a pot of gold at the end of the rainbow. A reasonable restriction is that one’s beliefs should not force one into an inconsistent point of view, but it is not necessary to be reasonable.

I have heard mathematicians being compared to children at a sweet shop, being offered many glittering packages of sweets from which they may pick and choose the ones they want. The choice of such sweets, be they axioms or number systems or something else, is usually made for practical reasons (to solve the current problem at hand) or for reasons of elegance, which may or may not in the long term amount to the same thing. We have already seen an example, where a mathematician needing axioms for set theory that go further than the usual ZFC axioms tends to pick and choose the ones they need without too much concern about how these are justified. But we were all brought up in a very proper way and know that an excess of sweets can give one a tummy-ache. So one tries to get by with as little as reasonably possible, though starting with a large tub of sweets and being able to pick and choose a small number from such a large variety certainly adds to the excitement and stirs the mind as to the possibilities of some hitherto undreamt-of exotic combination.

There are two arguments supporting this view.

One is the application of Peirce’s principle of abduction used by Quine [8] (Hart [4, Chapter II]), that if a piece of mathematics X is required to understand an observed phenomenon Y then the observation of Y tends to support the argument that X is correct, or true, or sound. This argument is also employed even if X cannot be shown as necessary for an understanding of Y but is perhaps the most elegant or the most powerful or suggestive of other applications. This argument might be considered to have more force if Y is some aspect of ‘the real world’ and X is being used as part of a mathematical theory to model phenomena in the real world, but it seems reasonable to take this further and argue that some new kind of mathematics X has mathematical existence (whatever that may be) if it is the most elegant or powerful way of explaining some other piece of mathematics Y.

The second argument relates to Hilbert’s programme and says that provided that X can at least be argued to be consistent (or consistent with other mathematical ideas one is using) then it can be regarded as ‘ideal’ mathematics that has some validity of its own. Gödel’s Second Incompleteness Theorem shows that Hilbert’s programme as originally posed cannot succeed, but the main thrust of the programme still holds weight. This is that new axioms or new ideal elements may be accepted if shown consistent, and such ideal mathematics makes a useful contribution if it can be shown to have many reasonable consequences for ordinary ‘real’ mathematics. There are many levels of ‘realness’ here, from verifiable statements about the natural numbers (Hilbert’s original notion of ‘real’) to comprehensible statements about one of the other standard number systems discussed above.

In some sense the abduction argument and Hilbert’s programme argument are similar, in that they both try to measure the success of the theory in terms of ‘real’ consequences, be they in some familiar mathematical structure, or in their use as models for natural phenomena, and these useful consequences ‘trickle down’ in the sense that given a theory Y which has consequences for X, a theory Z that has consequences for Y is likely to also have consequences for X. Put a different way, if at the lower levels of this ‘hierarchy’ we can readily detect problems (such as inconsistency, and this is the point of Hilbert’s programme, that inconsistency is ‘real’) then by ‘trickle down’ any problems in higher mathematics will eventually show up. This is even more true of powerful and elegant higher mathematics, which, being one of the more glittering sweets available, is likely to be taken up more often by other mathematicians, who will surely in due course find out what its problems are, if there are any.

In addition to all this there are mathematical reasons for taking the utilitarian point of view. A consistent first-order system is, by the Completeness Theorem, satisfied by some mathematical structure. The Completeness Theorem is provable in a minimal theory of mathematics (ZF set theory with the Axiom of Choice is certainly sufficient, but rather less is actually required) and, as we shall see, arguments supporting consistency are not necessarily as difficult as they might seem in all such cases.

However, the main problem with consistency as a criterion for belief is that it is rather weak: given its consistency and some unifying ZFC-like framework, the existence of our number system follows, but we are looking to see if there is more than this. Thus belief (at least for the context of this paper) needs to have some reason or rationality associated with it. From the point of view of mathematics we expect belief in an object to have some usefulness: adding an axiom for the existence of a leprechaun does not in itself improve mathematical knowledge and if we tried it we would tend to reject the axiom and disbelieve in the leprechaun. But if the ‘leprechaun’ was simply a fanciful name for an abstract mathematical ‘point at infinity’ (mathematicians are indeed given to using fanciful names for abstract ideas such as this) and the axioms state this property of the ‘leprechaun’ correctly then its addition could quite likely simplify the description of the geometry of the system being considered and we would have rational reason to believe the new axioms and the existence of the ‘leprechaun’ so characterised. In other words, it’s not what one believes and what one calls it that’s important, but rather how it affects the way one thinks about everything else.

Consider for example the addition of the number i for √−1 to the real numbers, making the complex numbers. From the point of view of the physical universe, especially when one is thinking of the reals as measuring distance or time, the square root of minus one is a mysterious object, and historically it was rejected for a long time because this number does not seem to exist in this physical sense. However, the addition of this number to the reals turns out to be straightforward mathematically and not nearly as complicated as applying the Completeness Theorem of logic: essentially all that is required is to know that the polynomial X² + 1 is irreducible over the reals, something that is quite easy to establish. More precisely, the symbol i is taken simply as a formal symbol and a complex number is a formal expression x + iy where x, y are real numbers, and this expression can be considered as being simply a notation for a pair (x, y) of real numbers, with special rules for addition, multiplication and so on. That these rules make sense depends simply on the irreducibility of X² + 1. The number i itself is 0 + i1, and once one has added i to the reals one can see that all numbers of the form x + iy need to be added, so this construction has a pleasant kind of ‘inevitability’ about it.
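The ‘special rules’ for pairs can be made explicit in a short sketch. The function names `c_add` and `c_mul` and the constant `I` are illustrative choices made here, and the multiplication rule used is the standard one obtained by expanding (x + iy)(u + iv) and replacing i² by −1.

```python
def c_add(a, b):
    """Add two complex numbers given as pairs (x, y) standing for x + iy."""
    (x, y), (u, v) = a, b
    return (x + u, y + v)

def c_mul(a, b):
    """Multiply pairs using (x + iy)(u + iv) = (xu - yv) + i(xv + yu)."""
    (x, y), (u, v) = a, b
    return (x * u - y * v, x * v + y * u)

# The number i itself is the pair (0, 1), i.e. 0 + i1.
I = (0, 1)

# i squared is the pair (-1, 0), which stands for the real number -1.
assert c_mul(I, I) == (-1, 0)

# The pair arithmetic agrees with Python's built-in complex type.
assert complex(*c_mul((1, 2), (3, 4))) == (1 + 2j) * (3 + 4j)
```

Nothing beyond ordinary real arithmetic on the two components is used, which is the sense in which the construction from the reals is straightforward.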

Thus the complex number system is easy to construct from the reals, and this is already in its favour. Is it a useful system of numbers? Well yes, most definitely, as the addition of i simplifies a great many theorems and formulas. For example the ‘Fundamental Theorem of Algebra’ that ‘every polynomial has a root’ becomes true in general without having to qualify the hypothesis as ‘every polynomial of odd degree’ as one would have to for the reals. Complex numbers simplify the equations for the solutions of polynomial equations of third and fourth degree even when the solutions are purely real numbers (this was the first use they were put to and their original motivation) and the introduction of i unifies equations for real-valued trigonometric and hyperbolic functions into one single set of formulas.

It could be said of the complex numbers that these are merely a technical device for handling a pair of real numbers simultaneously. And of course this is how they are constructed or defined. However it is important to be clear that the applications of complex numbers show that they are rather more than this. In particular the important notion of differentiability of a complex-valued function is not at all the same as that of functions of two real variables, and may not have been discovered but for the view of complex numbers as single numbers rather than pairs of reals. In other words, the fact that complex numbers suggest new mathematics that would not have been otherwise obvious is a very strong factor for their usefulness.

The other aspect of the utilitarian view is the usefulness of the mathematics to scientific theories, especially theories of physics. Here it is important to stress that the question is whether any particular kind of number can be used to develop a useful theoretical model of some aspect of the universe, not whether numbers really exist in the physical world. For example, it is traditional to measure the dimensions of length and time using real numbers. (The switch from the use of rational numbers to real numbers for this was made by the ancient Greeks, who were genuinely concerned about measurements that seemed to have to be made with irrational numbers such as $\sqrt{2}$. After several centuries we do not seem to have any serious rival for this use of the reals, something I find surprising.)

If we ask how complex numbers help us with measurements and physical theories we see that although distance and time do not obviously have complex values, some quantities, notably current and voltage in AC circuits, are naturally modelled as complex numbers, with the magnitude of the number being the peak value and the argument of the number being the phase.3 So from the point of view of modelling physical phenomena, complex numbers play a part and should be accepted. Whether one goes so far as speculating whether other equations that occur in physical models also apply to complex numbers, for example that a particle with imaginary rest mass might exist and if so would travel faster than light speed, is perhaps more the realm of science fiction. However the fact that numbers such as i promote such speculations and that at least one or two of these speculations may turn out to be reasonable science rather than fiction is in itself also a reason to accept the utility of i.
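As a small illustration of the AC example, the following sketch builds a complex number from a peak value and a phase and reads them back. The function names `phasor` and `instantaneous`, and the convention that the physical signal is the real part of z·e^{iωt}, are assumptions made here for illustration; the paper itself only fixes magnitude as peak value and argument as phase.

```python
import cmath
import math


def phasor(peak, phase):
    """Complex number with magnitude `peak` and argument `phase` (radians)."""
    return cmath.rect(peak, phase)


def instantaneous(z, omega, t):
    """Assumed convention: the measured signal is Re(z * e^{i*omega*t})."""
    return (z * cmath.exp(1j * omega * t)).real


# A voltage with peak value 2 and a quarter-cycle phase.
v = phasor(2.0, math.pi / 2)
peak_value = abs(v)          # magnitude of the complex number
phase_value = cmath.phase(v)  # argument of the complex number
```

Note that `cmath.phase` returns an angle in (−π, π], a different but equally arbitrary normalisation from the interval [0, 2π).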

For the application of numbers to natural phenomena, the canonicity theorems are particularly important. The First Theorem is obviously essential, for if we were to measure a quantity by a number (a real number perhaps) it is important that the resulting numerical measurement comes from an identified structure so that two such measurements can be combined or compared. But the Second Canonicity Theorem is important too: it is this that guarantees that the result of a measurement is unique and reproducible. If there are two different isomorphisms between structures R and S then each of R, S has at least one nontrivial automorphism sending numbers to different numbers with the same properties. And if two numbers x, y ∈ R have the same properties they are both candidates for the same measurement of a physical quantity.

The Second Canonicity Theorem might fail for a structure S because the system S may not have enough structure to distinguish between its elements. An example of a number system satisfying the First but not the Second Canonicity Theorem is the system of complex numbers C as a field with +, ·, 0, 1 and (so that the real number line can be identified as a special subsystem) the absolute value operation |x|. The First Canonicity Theorem follows from that for R. But there is no way to distinguish i and −i, nor (more generally) x + iy and x − iy. In other words conjugation $x + iy \mapsto x - iy$ is an automorphism of the structure and Second Canonicity fails. We can resurrect Second Canonicity by adding to our structure an additional function, for example the argument map arg z (returning a value in the interval [0, 2π)), but adding such a function requires an arbitrary choice of which is the upper half-plane and which the lower, or whether angles are measured clockwise or anticlockwise.

3 The magnitude of x + iy is the real quantity $\sqrt{x^2 + y^2}$ and its argument is $\tan^{-1}(y/x)$ taken in the appropriate quadrant.
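The claim that conjugation preserves +, · and |x| can be spot-checked numerically. The sketch below is only a finite check on a grid of sample points with small integer coordinates (chosen so that floating-point arithmetic is exact), not a proof; the names `conj`, `respects_structure` and `SAMPLES` are illustrative.

```python
import itertools


def conj(z: complex) -> complex:
    """Conjugation x + iy -> x - iy."""
    return z.conjugate()


def respects_structure(samples):
    """Check on the samples that conj commutes with +, * and preserves |z|."""
    for z, w in itertools.product(samples, repeat=2):
        if conj(z + w) != conj(z) + conj(w):
            return False
        if conj(z * w) != conj(z) * conj(w):
            return False
    return all(abs(conj(z)) == abs(z) for z in samples)


# Grid of points with integer coordinates in [-3, 3].
SAMPLES = [complex(a, b) for a in range(-3, 4) for b in range(-3, 4)]
```

Conjugation swaps i and −i while fixing every real number, so nothing expressible with +, ·, 0, 1 and |x| alone can tell the two square roots of −1 apart.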

Failure of canonicity for C has some consequence for measurements using C. For example in an experiment or electronic design using C to model alternating current (AC) there are two choices for the measurement of the very first current or voltage, but once the convention for this first measurement is chosen the rest of the measurements must follow suit. This is of little consequence to the physical theory using C to model the actual physical system, but suggests that it is not in fact exactly true that we ‘see’ complex numbers as complex voltages or currents in an AC circuit. Put mathematically, the first complex number value or measurement is one of two values, x + iy or x − iy, the set of which is called the orbit of (either) value under the automorphism group in question. There is nothing to choose between these two values, although they turn out to be the same value if y = 0. But (provided $y \neq 0$) the second value u + iv will be uniquely determined. The orbit of a single point x + iy under the automorphism group has 2 elements (or 1 if y = 0) but the orbit of a pair of points (x + iy, u + iv) also has at most 2 points. No physical theory of measurement I can think of can distinguish between i and −i, so it is not quite true to say that the complex numbers exactly model AC circuits.

For the complex numbers, mathematicians usually choose to live with the failure of the Second Canonicity Theorem. This failure, together with the additional properties of C as an extension of R that it gives us, is coded into the conjugation operation.

The canonicity of the reals is not necessary for believing in their existence, but it is a very desirable property of the reals and strong evidence for such belief. In the framework already set up, it is an elegant property of the reals that is potentially highly useful. Although canonicity itself does not imply that real numbers can be used to measure physical quantities it does at least show that the number system is available for such measurements. And, as we know, it is common in physics to measure distance, time, mass, energy and so on as real numbers with appropriate units. This is not to say that real numbers must be used in this way or that there is no other more appropriate system to use, but rather that the real numbers form a particularly useful model of such quantities that is applied extensively in physical theories.

In contrast, consider the case of a family of number systems described by a set of axioms A which fails to have the basic canonicity property, i.e. for which we can’t prove that every two systems satisfying A are isomorphic. We might be able to convince ourselves of the existence of systems satisfying A by elementary or straightforward manipulations of systems whose existence we are already convinced about. For example if A is the set of axioms for abelian groups, we can present the reals with the addition operation, or the reals with zero removed and the multiplication operation, or the integers modulo 5 as concrete examples of systems satisfying A. But if our evidence for the existence of systems satisfying A only comes from complicated arguments in ZFC this option is not available to us. If we can prove in ZFC that there is, up to isomorphism, only one system satisfying A then we can posit the existence of such a system ‘in the real world’ and describe it accurately in terms of the theorems about it that are provable in ZFC. We have a lot of concrete information about this system that we can at least consider, and maybe later choose to believe in. (There is another example of the semi-semantical reasoning going on here.) Thus provable canonicity in a set of axioms like ZFC is at least a useful precursor to belief of existence. Equally, with the obvious necessary extra care being taken, provable canonicity in some other conceivable set of axioms other than ZFC is helpful evidence for the belief of prior existence of the number system, irrespective of what we take for our usual axioms for set theory or mathematics.

If systems satisfying A are not canonical then perhaps they are there (in models of ZFC) because of some dubious axiom or artifact from the way ZFC is conceived. This is particularly pertinent because one of the axioms of ZFC that has been the subject of much debate as to its correctness over the last hundred years, the Axiom of Choice (AC), is often recognisable in its consequences by their non-canonicity. For example, AC implies that there is a ‘well-order’ of the set of real numbers. It doesn’t really matter what a well-order is for this discussion, except that no such well-order can be defined by elementary means as discussed earlier and well-orders are (provably in ZFC) highly non-canonical. What’s more, from knowing in more detail the structure of any well-order of the reals, it would be possible to read off the solution to one of the biggest open problems in set theory: whether the continuum hypothesis (CH) should be regarded as true or not. (CH is known to be independent of the other axioms of ZFC, but no satisfactory evidence as to whether it is CH or its negation that describes the true mathematical universe is known.) Non-canonicity in itself is sufficient to make the issue warrant further investigation and my view is that the other evidence is quite compelling in the direction of not accepting at this time the prior belief in the existence of a well-order on the reals.

There are other more fundamental issues connected with canonicity that are not related to AC, but which are rather more difficult to isolate. Interestingly some of these other issues may have an impact on physical principles and quantum mechanics.

Consider the air in front of me. Most theories say it consists of particles: air molecules. Certainly there is plenty of good scientific evidence to say that there is matter in the air about us and it is in the form of very small particles, so this seems entirely reasonable to believe. But if I were asked to focus on one particular air molecule and describe it, in particular whether it exists, this becomes more problematic. The immediate question is which one? There is an amorphous mass of air molecules in front of me and I can’t pick a single one out. Does that matter? Is it a reasonable position to believe in the existence of the air in front of me and have some belief about the form of structure that air takes without any specific belief in any particular air molecule? If I am to believe in the existence of any single air molecule, shouldn’t I be able to say something specific about it other than it is simply an air molecule and it exists somewhere?

From the point of view of quantum physics, my refusal to believe in a single molecule might be quite a sophisticated position. The uncertainty principle says I shouldn’t be so sure of any single molecule because I cannot simultaneously specify its position and velocity. Furthermore, the Pauli exclusion principle says that all individual particles must have distinct states, i.e. there should be ways to distinguish them. Now I didn’t refuse to believe in the existence of individuals rather than the amorphous mass because I choose to bow to the Heisenberg–Pauli god, but rather because of some more general principle that needs to be pinned down and understood better. I admit to finding it difficult to articulate the exact principle here, but it is something along the lines of the following. Were I to believe in a single air molecule without being able to say anything at all specific about it, this would hardly be a useful belief but instead would be rather like a belief in an object that has no impact whatsoever on the rest of my thinking, like the arbitrary belief in the leprechaun. I can however reasonably believe in the amorphous mass of air, and also reasonably believe in the theory that says it is described best as a collection of individual molecules. Were I to be able to say something specific about some air molecules, the ones that are molecules of oxygen perhaps, then I would have a stronger belief in a particular part of the mass of air, the part that is oxygen, but I still would not be able to have any useful belief in any particular oxygen molecule.

One wonders whether this issue of canonicity or definability and existence of individuals may have some bearing on the underlying principles of quantum mechanics. Unfortunately I have to leave these speculations open here as the questions seem difficult at this stage, but it would seem worthwhile returning to them at another time.

3 Existence of infinitesimals

The previous section set out four main viewpoints a working mathematician might typically take in his or her work. None is thought through in detail according to the underlying philosophy; we will make further comments on these views later. In this section I would like to describe the case for existence of infinitesimals and nonstandard number systems from these different viewpoints. For background information on infinitesimals see Robinson [9], Kossak’s article [6], or the technical appendix to this paper. Additional material on first-order logic, as well as a brief introduction to nonstandard analysis, appears also in The Mathematics of Logic [5].

Infinitesimals in the unifying view. From the unifying point of view, the existence of a number system with specified properties follows, if at all, from the axioms of the unifying theory one has chosen to adopt. In the case of the theory ZFC, axioms are available to construct or define the set of natural numbers, N (or ω, as it is usually called in this context), and from this the usual constructions allow us to define Z, Q and R. These systems are regarded (within ZFC) as structures for first-order languages and ZFC can state and prove the main results of first-order logic including the Soundness Theorem, the Completeness Theorem, and Łoś’s ultraproduct theorem. Then by usual model-theoretic means, we can either analyse the structure R and using Soundness deduce that a first-order theory of hyper-reals with infinitesimals is consistent and hence by Completeness there is such a structure in the universe, or go directly from R to a hyper-real structure ∗R by means of Łoś’s theorem and a suitable ultrafilter, usually a non-principal ultrafilter on ω or N.

In this sense, number systems with infinitesimals clearly exist, and this is why Robinson’s approach is considered correct and rigorous. One concern that we may have is that the Axiom of Choice (AC) is used in an essential way as one of the ZFC axioms required for the Completeness Theorem, or for the construction of the ultrafilter to use when applying Łoś’s theorem. Looking at it a different way, these nonstandard number systems could be regarded as a test case for theories such as ZFC: ZFC clearly ‘predicts’ the existence of them but direct constructions do not yield such systems. Is there some more direct way that arguments for such number systems can be given? Does this prediction support or refute the traditional belief that ZFC is a good unifying theory?

We have concentrated on those unifying axiomatic systems in the ZF family. It is worth remarking that a number of alternative systems exist in which nonstandard numbers appear more naturally. Some of these have associated philosophical motivation (such as Vopěnka’s Alternative Set Theory [10]). There are a number of systems proposed by Kanovei and others. In any case, to adopt such a system requires that one understand the consequences, especially as it forces one’s mathematics outside the mainstream.

Infinitesimals in the pluralist view. In the pluralist view, we should take as little as possible from our metatheory and construct nonstandard number systems directly as an extension of R perhaps, analogously to the construction of C. The theory ZFC ‘predicts’ that this should be possible, and the Łoś construction appears to be the most straightforward approach. It is direct and explicit, once one is given a suitable ultrafilter. For this approach to work one seems to need an axiom for the metatheory saying that such ultrafilters exist, and this axiom needs to be justified. The two such possible justifications that come to my mind are: (A) an argument that ZFC or some fragment of it is justified, as is AC (or the Boolean Prime Ideal Theorem) and the argument for ultrafilters from these; or (B) an argument of the utilitarian sort that says that ultrafilters are necessary to explain and work with a great number of mathematical phenomena.

The existence of these ultrafilters is, it seems to me, not unreasonable, so that it seems that we can reasonably imagine our pluralist universe populated with nonstandard number systems amongst others. Without such ultrafilters, it is possible to make poor versions of nonstandard number systems. One can take for an infinitesimal h an element transcendental over R and order the field extension R(h) so that 0 < h < 1/n for all nonzero n ∈ N. This is an ordered field with an infinitesimal, but not as rich as the hyper-reals constructed from an ultrafilter, and not yet as useful for mathematical analysis either. I suspect that if this approach were to be continued, we might have a workable, but clumsy, theory of analysis very like the ε–δ approach with limits.
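To see how such an ordering can be made explicit, here is a minimal sketch restricted to polynomials in h. It assumes the usual rule that a polynomial in a positive infinitesimal h has the sign of its lowest-degree nonzero coefficient; rational coefficients (`Fraction`) stand in for reals, and the helper names `sign`, `sub`, `less` and `const` are hypothetical. Extending the order to quotients of polynomials is not shown.

```python
from fractions import Fraction
from itertools import zip_longest

# A polynomial c0 + c1*h + c2*h^2 + ... is a list [c0, c1, c2, ...].


def sign(p):
    """Sign of p when h is positive and smaller than every positive real.

    The lowest-degree nonzero term dominates all higher powers of h.
    """
    for c in p:
        if c != 0:
            return 1 if c > 0 else -1
    return 0


def sub(p, q):
    """Coefficientwise difference p - q, padding the shorter list with zeros."""
    return [a - b for a, b in zip_longest(p, q, fillvalue=Fraction(0))]


def less(p, q):
    """p < q exactly when q - p is positive."""
    return sign(sub(q, p)) > 0


def const(r):
    """The constant polynomial r."""
    return [Fraction(r)]


ZERO = const(0)
H = [Fraction(0), Fraction(1)]               # the element h
H_SQUARED = [Fraction(0), Fraction(0), Fraction(1)]
```

Under this ordering `less(ZERO, H)` holds and `less(H, const(Fraction(1, n)))` holds for every positive integer n one cares to try, which is the defining property 0 < h < 1/n.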

Infinitesimals in the structuralist view. The most useful and commonly discussed nonstandard number systems in practice are sufficiently saturated models (it is usual to take them ℵ₁-saturated) of an appropriate first-order theory, for example the theory of the reals with additional functions and relations. The systems constructed from a non-principal ultrafilter over ω are of this type, for instance. From the structuralist point of view one would like to work with these properties as axioms, rather than concern oneself about the properties of the ultrafilter one used to construct the system, if it was obtained that way. Indeed, a construction via the completeness theorem seems preferable in this sense since it is ‘purer’ and from it one cannot easily see the details of how one obtained the system, only the properties of the system so obtained.

The canonicity theorems for nonstandard number systems are, on the other hand, more problematic. The immediate reason for concern arises from the fact that all the usual constructions of nonstandard number systems with infinitesimals use the Axiom of Choice (or a slightly weaker form of it such as the Boolean Prime Ideal Theorem) in some essential way. These axioms have been around for some time, but are not to everyone’s taste, so are worth looking at in this context. Looking ahead, issues to do with canonicity also impact on the application of such numbers in physical theories, particularly in measurement, and our question on physical existence.

We concern ourselves here with nonstandard number systems that are elementary extensions (in the sense of first-order logic) of structures of the form

$$R = (\mathbb{R}, 0, 1, +, \cdot, <, \mathbb{Z}, \ldots, f, \ldots)_{f \in F}$$

for some suitable set of functions F. Following standard terminology from mathematical logic we shall also call such systems models, the theory of the model being understood to be the elementary diagram of the structure above, i.e. the set of all first-order statements that can be written down using additional constants naming real numbers and that are true in the above structure. Other nonstandard structures considered in NSA usually contain all this as a substructure or as an interpreted structure, so the non-canonicity phenomena we will be talking about for this structure apply to these others too. I have included the integers Z as a unary predicate in order to code infinite sets, in the way that is common in NSA. The set of natural numbers N can be defined from that of the integers if that is one’s main interest, and using this one can use results from the theory of models of arithmetic to help classify models.

The issue for canonicity is whether there is some identifiable model of this form that can be described simply by means of mathematical axioms, other than the so-called ‘standard’ one (i.e. the one above) which contains no infinitesimals. The answer in general is no, and there are a number of obstructions.

The first is well-known, but is not a particularly serious obstruction. By a pair of theorems of first-order logic known as the Upward and Downward Löwenheim–Skolem Theorems, models of the appropriate theory can be found of every suitably large cardinality. (Here ‘suitably large’ means at least as big as R itself and as the set of functions F used.) This is not so much of a problem because we can specify the cardinality we are interested in in a natural way, as the first cardinal bigger than this minimum, perhaps.

The other obstructions to canonicity are specific to the particular theory we are looking at, the fact that it codes sequences, computations in Z, and other rather complex mathematics. It is necessary for NSA to look at structures that code complex mathematics to enable us to solve difference equations in the nonstandard world as indicated above, or more generally to use NSA to reduce continuous problems concerning sets of reals or functions of real variables to discrete problems with solutions by combinatorial means. In the terminology of the classification of first-order theories given by model theory, the theory we are looking at is highly unstable with too many models at each cardinality to expect a classification of these models.

One possible candidate for a ‘canonical choice of model’ is a ‘minimal’ or ‘smallest’ one, but it turns out from this and some model theory that there is no minimal nonstandard model. (By a slight irony, minimal models do exist for a third method of construction outlined in the appendix, the one using Gödel’s Incompleteness Theorem, but these necessarily give structures satisfying false sentences, such as ¬Con(PA). In any case there is an issue as to which false sentences we are to choose.)

Perhaps instead one should look for large models, models which contain every possible feature that one might want, models that contain elements satisfying every possible property. This is a common idea in model theory, and models of this type are said to be saturated. Saturated models are very powerful, not only for model theory, but for nonstandard analysis, where saturation principles are often exactly what one needs to transfer a problem or definition from the real world to the nonstandard world. There are many notions of ‘saturation’ in model theory, but for highly unstable theories such as ours, all the notions of saturation have some difficulty too. Some weaker notions of saturation (such as recursive saturation, arithmetical saturation, resplendency) are available, and these allow all theories to have such models at all cardinalities, but unfortunately these notions of saturation do not characterise the models up to isomorphism, i.e. there are no canonical weakly saturated models. There is a notion of full saturation4 which does characterise models up to isomorphism, but unless the underlying set theoretical framework of ZFC that we are using is changed, saturated models of our theory need not exist at all.

The best general results showing existence of saturated models are of the following type [2].

Theorem. Assume ZFC together with either the generalised continuum hypothesis (GCH) or the assumption that there is a strongly inaccessible cardinal. Then there is a saturated elementary extension of R.

One might say that the set theoretic assumptions required to build saturated models are irrelevant in the structuralist view, but if one takes this standpoint one still has to argue for the existence of saturated structures. In any case one of the strengths of mathematical work is that the four views I have outlined are in some senses compatible, and we should not throw away the unifying view lightly.

If we are adding axioms to set theory, I would argue that adopting GCH is not something that one would want to do unless strong evidence is forthcoming on the continuum problem (CH), but an axiom for the existence of arbitrarily large strongly inaccessible cardinals is a much more reasonable addition to our set theoretic axioms for mathematics. Indeed much of modern set theory is concerned with adding ‘large cardinal axioms’ that cannot be proved from the usual axioms and seeing what the consequences of them are, especially for ordinary sets such as the set of reals. From the point of view of more advanced NSA, these large cardinal axioms are useful in another way too, since they allow us to have access to a number of models of set theory (something that is not available without large cardinal axioms) and for some applications it is helpful to start NSA by taking an elementary extension of a suitable model of set theory, rather than an elementary extension of our structure R.

Any two saturated models of our theory of the same cardinality will be isomorphic, and it is difficult to see how non-saturated models might be sufficiently canonical to be of interest, so the conclusion is that for canonicity we do require additional axioms in our mathematics to allow us the chance to work with saturated models. If we feel that there are good reasons for including nonstandard systems as part of the zoo of useful mathematical structures that we wish to accept and use, we would, under the unifying view, require additional principles for the existence of mathematical objects, but suitable additional principles are available as additional axioms in the ZFC style.

4 For experts, the technical definition I refer to is: a model M is λ-saturated if it realises all types over sets of parameters of cardinality strictly less than λ, and it is saturated if it is λ-saturated where λ is the cardinality of M.

This deals with the First Canonicity Theorem. The Second Canonicity Theorem adds a further complication. Given any two saturated models of the same infinite cardinality, by standard methods in model theory there will always be a huge number of isomorphisms between the two. A consequence of the Second Canonicity Theorem is that there is precisely one automorphism of a structure that satisfies it, so the Second Canonicity Theorem fails for saturated systems of nonstandard numbers. This does not cause immediate problems for the structuralist view, but it does have important consequences for the utilitarian view, to be discussed later.

Infinitesimals in the utilitarian view. In the utilitarian view, we would have evidence to support the existence of nonstandard number systems if we can show that such systems are useful and important enough. This might mean with relation to scientific theory, or to mathematics itself. We start here by looking at applications of infinitesimals to mathematics, and look at possible applications to other areas later.

It is quite easy to say that infinitesimals as used by Robinson are simply a technical device to simplify and code up the idea of ‘limit’ and this is in some sense correct. For example, there is no doubt that the nonstandard definition of derivative is simpler than the definition using limits, but arguably it does no more than use the same idea underlying that of ‘limit’ in a different way. Against this criticism we might offer the argument that Newton and Leibniz may not have come up with the differential calculus but for their thinking in terms of infinitesimal quantities. Of course this is difficult to judge so many years later. It is certainly true that many mathematicians today find it easier to think in terms of infinitesimal quantities, even if they later re-work their arguments in terms of limits. But also, many others prefer to think in terms of limits instead. Perhaps it has more to do with how one is (mathematically) nurtured, and at present teaching methods at universities certainly emphasise limits rather than any alternative, and indeed infinitesimals rarely enter the undergraduate curriculum at all.

Another key thing to look at is whether infinitesimals unify different areas of existing mathematics and simplify the presentation of them or the statement of their results. In fact there is one area in which infinitesimals do this beautifully: that of the parallel topics of differential equations and difference equations. A differential equation is an equation for an unknown function $y(x)$ of a real variable involving the derivative $y'(x)$ of this function, or higher-order derivatives of this, $y''(x)$, $y'''(x)$, etc. A difference equation is an equation for an unknown discrete function $y(n)$ of a natural number variable involving the difference function $\Delta y(n) = y(n+1) - y(n)$ and possibly higher-order differences $\Delta^2 y(n) = \Delta y(n+1) - \Delta y(n)$, $\Delta^3 y(n)$, etc. That these two types of equation can be classified and solved by similar techniques is rather well-known, and a typical method for solving a differential equation numerically (i.e. approximately) on a computer involves choosing a small step size $h$, approximating a continuous function $y(x)$ by the discrete function $y(n) = y(nh)$ and each derivative $y'(x)$ by $(y(n+1) - y(n))/h = \Delta y(n)/h$ and so on. It is usually the case that the resulting equation can be rearranged to take the form $y(n) = F(y(n-1))$ or possibly $y(n) = F(y(n-1), y(n-2), \ldots, y(n-k))$ so that on choosing appropriate starting values $y(0)$ (or $y(0), y(1), \ldots, y(k-1)$) one can generate all other values $y(n)$ on the computer. A particularly simple example is known as the Euler method in which the differential equation

$$y'(x) = F(x, y(x))$$

is replaced by the difference equation

$$y(n) = hF(hn, y(n-1)) + y(n-1).$$

Difference equations like this have the advantage that it is easy to see that some solution exists, though finding a closed expression for a solution is often more difficult, whereas the existence of solutions of differential equations is often more delicate. Many differential equations can be solved exactly by nonstandard methods using the same numerical method: one chooses an infinitesimal step size h and solves the difference equation in the nonstandard world. Once again, the existence of the solution is usually obvious, and this gives a rapid existence proof for solutions of some kinds of differential equations.
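The difference equation above can be iterated directly, which is why existence of a solution is evident. The following sketch follows the recurrence exactly as written in the text, evaluating F at x = hn; the function name `euler` and the test problem y′ = y, y(0) = 1 are choices made here for illustration. A computer can of course only use a small standard step h, not an infinitesimal one, so the output is an approximation.

```python
import math


def euler(F, y0, h, steps):
    """Iterate y(n) = h*F(h*n, y(n-1)) + y(n-1) starting from y(0) = y0.

    Returns the list [y(0), y(1), ..., y(steps)].
    """
    ys = [y0]
    for n in range(1, steps + 1):
        ys.append(h * F(h * n, ys[n - 1]) + ys[n - 1])
    return ys


# Illustrative problem: y' = y with y(0) = 1, whose exact solution is e^x.
approx = euler(lambda x, y: y, 1.0, 0.001, 1000)
error_at_1 = abs(approx[-1] - math.e)
```

Each value y(n) is determined by the previous one, so the whole sequence exists by induction on n; shrinking h brings `approx[-1]` closer to e.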

So infinitesimals and nonstandard methods unify numerical methods with the classical analysis of real valued functions. There are other examples too. Nonstandard methods allow one to give adequate nonstandard approximations of useful but classically-speaking fictitious functions such as the delta function. Nonstandard methods allow problems in calculus of variations to be solved by traditional means such as using Lagrange multipliers, maximising or minimising a function of nonstandard-infinitely many variables with constraints. Thus nonstandard methods at least highlight the connection between discrete problems such as difference equations and analytic problems such as differential equations via versions of numerical methods normally used to find approximate solutions. This is not quite the same thing as unifying these problems, that is, seeing them as all the same kind of problem. (That would appear to be a useful project for some other time.)

Do infinitesimals and nonstandard methods permit new kinds of mathematics to be done that could not easily have been achieved without them? Here I think the jury is still out. Certainly it was hoped (by Robinson and people following him) that some significant problems in analysis could be solved by nonstandard means, and in one case, Robinson himself solved an important outstanding problem in analysis by nonstandard means before Halmos identified the key ideas and presented an alternative classical argument. In fact it seems that most work in nonstandard analysis with impact on problems that can be stated purely in the classical language of analysis has been confined to finding elegant nonstandard solutions to existing problems for which classical methods were already known. It seems that for such problems, nonstandard methods and classical methods using limits are too close: it is a little too easy for experts to translate between one method and another. Where nonstandard methods are most useful, in my opinion, is in allowing the construction of interesting new analytical structures based on traditional discrete structures.

One possible stumbling block for the introduction of nonstandard analysis is that the procedure of arguing for classical results, about the reals R for example, using nonstandard means involves switching between two worlds: the ordinary real world and the hyper-real world. This is the moment where ‘infinitesimal quantities are neglected’ which was most problematic for Berkeley and others, but is given a precise justification by Robinson. In some accounts, this is done almost algorithmically, by adding or removing stars from symbols, taking ‘standard parts’ and ignoring infinitesimal quantities according to tightly defined rules. Alternatively one does it using the standard tool-kit of first order logic, which is elegant and comprehensive, but perhaps too much for beginners, especially students, to learn. This is also an issue with the subject and unfortunately a misunderstanding of some of these rules can lead to errors. Obviously there is still work to do in this direction too.

The tentative conclusion to this part of the discussion is that infinitesimals and nonstandard numbers in general do seem to form a useful system or systems by which to do mathematics, and (with some careful warnings about potential error that might occur if the methods are incorrectly applied) we might encourage more mathematicians to believe in their existence and usefulness.

Now we address similar questions about physical existence of nonstandard numbers.

The most obvious remark is that infinitesimals seem to be useless to measure traditional quantities such as time and space, as infinitesimal amounts of time and space would be too small to be measured in any conventional sense. Nor is there (to my knowledge) any physical theory in which infinitesimals are potential measurements for physical quantities. This remark is a bit glib, as it presupposes that the traditional real-valued measurements of space and time are ‘correct’. So let us speculate for a moment what such a theory with nonstandard values for measurements might look like.

Suppose some physical quantity (we will call it the mass of a particle, but nothing we say will be specific to this) is measured in a nonstandard system and the measurement may take infinitesimal values. Then the physical theory will only ‘see’ the orbit of this value obtained by the measurement, i.e. the set of automorphic images of it under the automorphism group of the number system. Rather than seeing a hyperfine continuum of possible particles with infinitesimal masses we would see a classification of ‘types’ of masses based on the orbits of the values. However, the measurement of this one value will affect the measurement of the mass of a second particle, since the orbit of a pair of points (u, v) is not necessarily the Cartesian product of the orbits of u and v individually.

All of this looks suggestive of what actually happens in physics, i.e. that certain symmetry groups underlie physical structure and the result of one measurement may affect another, but I am not enough of a physicist to see this idea through. It is not completely without precedent. Complex numbers pervade mathematical physics, but as we have discussed, they cannot be seen in isolation, at least not to the detail required to distinguish i from −i.
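The remark that the orbit of a pair need not be the product of the individual orbits can be checked on the complex-number precedent just mentioned. The sketch below is only an illustration under stated assumptions: it uses the two-element group consisting of the identity and complex conjugation as a stand-in for an automorphism group, and it stores Gaussian integers as pairs of Python integers. All names in it are hypothetical.

```python
from itertools import product


# Gaussian integers are stored as (real, imaginary) pairs of ints.
def identity(z):
    return z


def conj(z):
    return (z[0], -z[1])


GROUP = [identity, conj]  # stand-in symmetry group, an assumption of this sketch


def orbit(point):
    """Orbit of a single number under GROUP."""
    return {g(point) for g in GROUP}


def orbit_of_pair(u, v):
    """Orbit of the pair (u, v): the same group element acts on both entries."""
    return {(g(u), g(v)) for g in GROUP}


i = (0, 1)
pair_orbit = orbit_of_pair(i, i)
product_of_orbits = set(product(orbit(i), orbit(i)))
print(len(pair_orbit), len(product_of_orbits))  # prints "2 4"
```

The orbit of i alone is {i, −i}, so the product of orbits has four elements, but the orbit of the pair (i, i) contains only (i, i) and (−i, −i). Once the first entry has been 'measured', the second is no longer free to range over its whole orbit.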

One reason for proposing the quark model of hadrons was that hadrons show structure and a non-zero size. Some particles, notably quarks, electrons and electron-like objects, do not apparently have a size, i.e. they are point masses according to the best measurements possible today. But if we speculatively imagine physics at the scale of infinitesimals, they might then have structure, such as being made of smaller elementary particles at the infinitesimal scale. These particles and any (infinitesimal) distance between them would not show up as distances, but they would contribute to properties or quantum numbers describing the particle; they might be the key to understanding how a particle such as an electron might have a ‘hidden variable’ or to understanding the seemingly random processes in quantum mechanics. They might interact (possibly in infinitesimal time) with other particles, and the large number of different kinds of interactions might usefully be unified. From the ‘outside’, i.e. at non-infinitesimal scales, we would not actually see these infinitesimal distances, but instead we would see the properties of infinitesimals that are preserved by the automorphisms of the infinitesimal number system.

Of course most of the ideas in this section are complete speculation. The only message that I want to draw out from this discussion is not the detailed speculations as such but rather the fact that infinitesimal quantities would not look infinitesimal on the macroscopic scale. Instead they would manifest themselves as ‘quantum numbers’ or properties of the objects concerned, identified through their classification and through patterns that relate closely to the symmetries and orbits of the situation at the infinitesimal level, in particular to how the automorphism group of the nonstandard universe acts on such numbers and on the physical set-up at the infinitesimal level.

4 The case for ‘pure’ structuralism

The main issues in the philosophy of mathematics that I wish to address, concerning the existence and nature of mathematical objects, theories of truth and deduction about them, and theories of knowledge of mathematical truths, are discussed in Hart’s volume The Philosophy of Mathematics [4]. I will present some speculations and personal views that will need defending further elsewhere, but I will not do these points justice here, and certainly do not answer them fully. My aim is simply to show how discussion of other kinds of objects can stimulate this discussion.

Firstly, there is a major difficulty in combining theories of both truth and knowledge of mathematics and mathematical objects with theories of ordinary everyday objects. (See Benacerraf [1], reprinted as Hart [4, Chapter I].) Mathematical objects, if they are to have Tarskian semantics, must have something like platonic existence, but it is difficult to see how this allows us to interact with them to obtain knowledge of them.

Secondly, if mathematical objects are to be viewed from a structuralist standpoint (as I think they must, for it is impossible to conceive any other nature for them and this is in any case closest to the working point of view of most pure mathematicians) we have the issue that ‘structuralism’ is subject to circularity: structures explain objects but, being objects themselves, must be explained first. (See Parsons [7], reprinted as Hart [4, Chapter XIII], for a more detailed account.)

It seems to me that what I have called the utilitarian view is the only reasonable way to justify new axioms or new mathematics. (It corresponds to a modern and somewhat more pluralistic version of Hilbert’s programme with ‘levels’ of realness and different modes of application, and it is these applications and their success that give the axioms credibility.) Structuralism is the only reasonable way to manipulate these different kinds of mathematics. One can go further and postulate a unifying theory that brings various strands together. This may be a matter of taste, or a matter for some other overall view of mathematics (an overarching constructive or intuitionistic one perhaps) but I find the evidence given above that the bulk of mathematics is utilitarian and structuralist compelling.

Canonicity results are important for both the structuralist and the utilitarian approaches and, as I have argued, the issues of existence associated with canonicity are very much ‘true to life’ too, possibly even having the ability to explain some phenomena from quantum mechanics that from a naïve point of view seem unnatural.

The other major notion that arises from my discussion above of how mathematics is typically done is that of quasi-semantic reasoning, and reasoning about ‘imagined’ structures. This doesn’t quite fit the usual semantics versus syntax canon that we have come to expect from foundational results in mathematics about mathematics, but is natural, commonplace, and appears reliable enough, especially with the additional checks that mathematicians employ.5

To speculate first on the first issue, the similarity or otherwise of truth and knowledge in the mathematical and everyday realms, it seems to me that the theoretical dichotomy between pure Tarskian semantics and formal first-order theories and their syntax is stretching things rather, and that in everyday life, as well as in mathematics, some sort of quasi-semantic argument takes place rather more often than Hart’s introduction (op cit) would have us think. Take his ‘trite’ example, ‘All bachelors are unmarried.’ We see that this is true in a quasi-semantic way, not by some argument in a formal system, but by observing the semantic meaning of the definitions of ‘bachelor’ and ‘married’ and making a connection in an (imagined) structure of people, some of whom are bachelors, and some of whom are married. Hart’s more worldly example, ‘All bachelors are sexually frustrated,’ is determined false not by engaging in an opinion poll of bachelors on the street to see if we can find one that is not sexually frustrated or else exhaust the supply, but rather by recalling from past experience some bachelor, that we might have been envious of perhaps, that was not sexually frustrated. We don’t actually know if this individual has since got married (if so he would be no use as an example for us) but we do have a semantic conception of the world and its people as being large with all reasonable possibilities represented and as not changing particularly fast, and of our own experience as being rather more limited. Since it was not difficult to find a counterexample 20 years ago, things are unlikely to be different now. To answer a critic who says this argument is not proof enough, we reserve the right to carry out an opinion poll, or possibly scan the men’s magazines to see if such a poll has already been carried out. Other examples work similarly. Benacerraf’s example, ‘There are at least three perfect numbers greater than 17’ can be determined by an opinion poll (or rather a computer search) but other quasi-semantic arguments are more satisfactory and more revealing, and the reason why mathematicians prefer these other arguments is that they give more information, not because a computer search is out of the question.
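The computer search alluded to for Benacerraf’s example is easy to carry out. The following Python sketch is one possible brute-force version; the function names are hypothetical and the method (testing each integer in turn) is chosen for transparency rather than efficiency.

```python
def sum_proper_divisors(n):
    """Sum of the divisors of n that are smaller than n."""
    if n < 2:
        return 0
    total = 1  # 1 divides every n >= 2
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d  # add the complementary divisor once
        d += 1
    return total


def perfect_numbers_above(bound, how_many):
    """Search upwards from bound + 1 for the first how_many perfect numbers."""
    found = []
    n = bound + 1
    while len(found) < how_many:
        if sum_proper_divisors(n) == n:
            found.append(n)
        n += 1
    return found


witnesses = perfect_numbers_above(17, 3)
print(witnesses)  # prints [28, 496, 8128]
```

The search settles the truth of the statement by exhibiting three witnesses, but it says nothing about why these numbers are perfect, which is the sense in which other arguments give more information.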

There seems to be a spectrum of modes of informal argument for propositions of all types, and I suggest that these arguments are essentially semantic in nature. In mathematics there are arguments about propositions concerning numbers that could possibly be found by calculation or computer search, and these are the ‘real’ propositions of Hilbert, but in mathematics there are also indirect means for argument, and many levels of indirectness, corresponding to Hilbert’s ‘ideal mathematics’. But so too are there indirect means for arguing in real life. The fact that mathematical argument can be formalised as a syntactic system is an interesting and useful fact (useful for reliability, communication and verifying arguments) but it is no accident that the so-called ‘natural deduction’ rules are based on quasi-semantic steps.

5 Woven in with all of this are psychological effects, and we must ask to what extent do mathematicians work in the way they do because it is convenient and productive for them, and to what extent do they actually need to because of the nature of mathematics itself? This question clearly requires further study.

Nevertheless, if we are going to argue that mathematical arguments and knowledge, like other kinds of knowledge, are essentially informal but semantic, we will have the problem of providing some details of these semantics and showing that these arguments determine truths, in particular that mathematical truths are truths like any others.

Here the usual approach is to try to identify the objects to which Tarskian semantics apply, and this is fraught with difficulties. The same difficulties apply in everyday arguments: for example, which is the object corresponding to the ideal present-day sexually satisfied bachelor that we know exists by rather convincing means based on personal experience of the world? This too I will have to leave aside for further detailed discussion another time. But in some sense we all use these arguments and they do work, even if they correspond to neither the usual Tarski semantics nor any precisely delineated syntactical formal system.

Objects, I am arguing, are of necessity abstract objects, described by what they do, in both mathematics (where this requirement is quite clear) and ordinary experience (where the issues are muddied by the possibility of the apparent availability of a pollster’s approach to truth). In other words they are objects presented to us by some kind of structuralist view of abstract objects.

One take on structuralism is that objects are presented as being part of a structure (perhaps a structure for a first order language) and are identified in some way by what they do, i.e. what properties they have in that structure. But, according to Parsons, this has two major difficulties. The first is that many objects have the same properties and therefore must be identified in some way. But it is difficult to see how to do this: are we somehow taking a typical or particular or canonical example of each? If so it seems difficult to see how one could be chosen over the others. Or are we taking the equivalence class of all such objects? This has issues with the underlying theory of sets, of course, but I feel uncomfortable with this as the equivalence class of an object is not the same sort of thing as the object itself: the equivalence class construct has added extra unwanted structure onto the object. The second difficulty is that the first-order structure which gives the abstract objects their definition is apparently itself an abstract object so needs to be defined in the structuralistic way in terms of some other structure, and this creates circularity.

It seems to me that these problems may be resolvable, and indeed I have argued that they must be resolvable. The mistake (and I believe it is a mistake) is focusing on ‘objects’ too strongly. Structuralism is a way of looking at things, but it isn’t itself defined in terms of objects. Structuralism is a pair of spectacles, or a lens, or filter, which we look through at things. The spectacles or filter removes properties that we are not interested in and leaves us looking at things with certain limited properties relating to some specific operations. The abstract objects that this gives us are the ‘things’ identified by their properties. Thus an object exists if there are ‘things’ that correspond to it but the object is not one thing nor the set of all of them, but an abstraction of all of them by their property. Mathematical examples are easy to find, but the everyday example I used earlier is a good one to think about. In the collection of things in front of me there are (or so the physical theory says) air molecules, but the abstract object ‘air molecule’ is not a particular air molecule or the set of them all but an abstraction of those properties perceived through the filter.

Having chosen the structuralist view, in discussions about circularity where there is a choice one must put the structuralist view first. This means that the filter through which we see things is not an abstract object at all. It is a sort of description of how we are for the moment looking at things. Such descriptions are typically very simple. It might just be that we are looking at the things in front of us as a collection of molecules, and not as tables, chairs, etc. But these filters or descriptions will be difficult to express in words. The mistake made by many with respect to structuralism is, I believe, that they think of the filters or spectacles as arising from structures which are objects. But they are not objects, nor do they arise from objects: they are something else less easy to pin down, akin to ‘pure descriptions’. When we choose to be more precise about them what we are actually doing is modelling the filter with a theory of objects, just as we choose to model space with a mathematical theory in which distances are given by real numbers, or we choose to model the flight of a projectile as the movement of a point-mass particle in a uniform gravitational field.

Structuralism taken this way (where the structuralist view is taken as primary and is modelled by abstract objects rather than defined by abstract objects) I call ‘pure structuralism’. Clearly the scope of this paper has not allowed for any detailed look at it, and there is much work still to do. It seems to me that the filters or spectacles are well-modelled by the idea of a forgetful functor in category theory, and category theory should also provide a kind of semantics that is closer in spirit to pure structuralism than the Tarskian one.

5 Questions for further research

Examine mathematics from the point of view of mathematics done on a computer, with a computer algebra system for example.

To what extent does the computer science concept of abstract object correspond to the one suggested above? Do the considerations above suggest any improvements to the object-orientated paradigm for computer programming?

Does the fact that mathematics has several different compatible viewpoints (and different ways of making them compatible) add to the weight of its results or make cloudy water even more murky?

What precisely is informal ‘quasi-semantic’ argument and how does it contrast with more formal modes of argument? What is it good for and what is it not so good at?

There is also the interesting issue of the role of first-order logic and what I have called ‘quasi-semantic’ arguments. As I have said, it is easy and natural to argue in ZFC ‘quasi-semantically’ with reference to an imagined universe or part of that universe. Then the syntactic rules for first-order logic are justified quasi-semantically, by a reflection on this imagined universe and an argument similar to the soundness theorem. In principle, quasi-semantic arguments of this form are more powerful, as other rules could in principle be imagined and used that go beyond the usual first-order logical rules. In practice this rarely occurs, and one wonders why. Is this some psychological phenomenon restricting the mathematician’s imagination, perhaps related to the ‘pessimism’ mentioned earlier? Or is there some deeper philosophical reason that puts these imagined universes and quasi-semantic reasoning about them on a firmer foundation? Is it something to do with an informal Completeness Theorem for such quasi-semantical deductions? In any case, it always seems remarkable to me that the true Completeness Theorem makes excellent predictions about provability in first-order theories despite its non-constructive nature and the fact that AC is required to prove it.

References

[1] Paul Benacerraf. Mathematical truth. J. Philos., 70(19):661–679, 1973.

[2] C. C. Chang and H. J. Keisler. Model theory, volume 73 of Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Co., Amsterdam, third edition, 1990.

[3] Timothy Gowers. Mathematics, a very short introduction. Oxford University Press, Oxford, 2002.

[4] W. D. Hart, editor. The Philosophy of Mathematics. Oxford University Press, 1996.

[5] Richard Kaye. The mathematics of logic. Cambridge University Press, Cambridge, 2007. A guide to completeness theorems and their applications.

[6] Roman Kossak. What are infinitesimals and why they cannot be seen. Amer. Math. Monthly, 103(10):846–853, 1996.

[7] Charles Parsons. The structuralist view of mathematical objects. Synthese, 84(3):303–346, 1990.

[8] W. V. Quine. Two dogmas of empiricism. In The philosophy of language, pages 39–52. Oxford Univ. Press, New York, 1996.

[9] Abraham Robinson. Non-standard analysis. North-Holland Publishing Co., Amsterdam, 1966.

[10] Petr Vopěnka. Mathematics in the alternative set theory. BSB B. G. Teubner Verlagsgesellschaft, Leipzig, 1979. Teubner-Texte zur Mathematik. [Teubner Texts in Mathematics], With German, French and Russian summaries.

Appendix: commentary on the different views

[consider deleting some of this or merging with the text above]

The discussion throughout this paper is about existence of abstract objects such as numbers and number systems in mathematics, and of course the main area of interest is in the foundations of mathematics. Because of the nature of the main questions addressed in this paper I have tended to take the working mathematician’s point of view as understood, and have rather skated over the fundamentals behind this. It is time to make amends and address the foundation for this view, and put a bit more flesh on what a mathematician might mean by ‘existence’. It may seem strange to discuss this at the very end of the paper, but ideas here require some understanding of a few key examples, in particular that of real numbers, as discussed earlier.

The notion of real number is a good place to start, since it is straightforward enough for most working mathematicians to appreciate, and yet complicated enough for a number of differing views to have been put forward, especially in the first half of the twentieth century. But to most people, there is a clear concept of real number based initially perhaps on the idea of a decimal expansion with several examples, including terminating decimal expansions, such as that for 1/4, repeating ones, such as for 1/7, non-repeating ones such as for √2 and more complicated ones such as for π. After some point, with these examples in mind one can abstract the idea of a general decimal expansion, and develop the concept of real number from that. (Of course some other intuition, such as Dedekind cuts, can be used in place of decimal expansions.) One feels justified in this endeavour at the point when one writes down a set of very reasonable-looking axioms for the resulting system of numbers and proves the existence theorem for such a system based on decimal expansions and both canonicity theorems.
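The terminating and repeating examples above can be generated mechanically by long division. The Python sketch below is an illustration only (the function name and the three-part return value are choices made here); it handles rational numbers p/q, so it covers 1/4 and 1/7 but not √2 or π, whose expansions never repeat.

```python
def decimal_expansion(p, q):
    """Return (integer_part, non_repeating_digits, repeating_digits) of p/q.

    Assumes integers p >= 0 and q > 0. Uses long division and stops when a
    remainder repeats, which is where the digit cycle starts again.
    """
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}  # remainder -> index of the digit it produced
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        remainder *= 10
        digits.append(str(remainder // q))
        remainder %= q
    if remainder == 0:
        # The division came out exactly: a terminating expansion.
        return str(integer_part), "".join(digits), ""
    start = seen[remainder]
    return str(integer_part), "".join(digits[:start]), "".join(digits[start:])


quarter = decimal_expansion(1, 4)   # ("0", "25", "")
seventh = decimal_expansion(1, 7)   # ("0", "", "142857")
print(quarter, seventh)
```

Since only q different remainders are possible, the loop must stop after at most q steps, which is the usual argument that every rational number has a terminating or eventually repeating expansion.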

It seems to me that two very positive mental acts are being described in the last paragraph, and both give strong evidence towards the existence of a system of real numbers populated by familiar numbers such as 1/4, 1/7, √2 and π. The first is the moment of abstraction when after several calculations with particular examples one realises that every sequence of decimal digits corresponds to a real number, and (ignoring the technical problem with recurring sequences of 9s) that each distinct decimal corresponds to a distinct real number and that conversely the mental view of a number on a number line shows that each can be measured by a sequence of decimal digits. The second is the moment of axiomatisation and the theorems of existence and canonicity that show that there is essentially only one system of real numbers. The first is a private moment of insight: after playing with calculations and symbols on a page I suddenly have an inkling of this particular kind of number in all its generality. The second is a way of sharing this insight: now that I can describe my real numbers and prove that my real numbers are the same as yours I can discuss them with you to examine their structure in much more detail. It seems that for most people this is ample evidence that the system of real numbers exists in as concrete a way as is required. It is just as robust, or perhaps more robust, than some of the notions of the physical objects around us and their qualities, and we have excellent reasons for believing that these mathematical objects are viewed in the same way by all other mathematicians, arguably much better reasons than we might have for believing that some particular patch of grass is actually seen as the same colour by all individuals, irrespective of the label ‘green’ that they choose to put on that colour.

Against this evidence, some counter-arguments have been put forward. Most centre round difficulties in the idea of an arbitrary sequence of digits (or an arbitrary bounded set of rational numbers, or whatever is relevant on one’s favourite conception of the reals). When one examines it, it seems that this idea of ‘arbitrary set’ or ‘sequence’ is harder to pin down than one might expect, and it is essential for full understanding of the reals, especially for canonicity. One alternative that has been proposed is the formalists’, which says that general sequences of digits exist in an ideal sense and we can accept them in this limited way because this belief in them and their properties does not impact in any negative way on our conception of concrete real numbers such as √2 and π; indeed, more than that, this world of ideal sequences might provide new information about our familiar numbers. Another alternative is the intuitionists’, which says that only numbers that can actually be constructed (such as via a computer program that prints out their digits) can be accepted, there being no others, and that all decisions (such as whether one number is bigger than another) have to be made in a similar constructive manner. Cases can be made for both these points of view, especially in certain areas of research in which they are relevant, such as foundations of mathematics, theoretical computer science, philosophy of scientific method, etc. But most mathematicians reject them for practical reasons. It is as if mathematics has moved on a few steps beyond the formalists or intuitionists, in abstracting a number of important concepts as objects satisfying axioms; perhaps this process of abstraction is genuinely necessary for human or social or scientific reasons, in that if a concept is to be useful and be used as one of the building blocks of the next piece of theory then we have to be able, in some sense, to picture these objects mentally and believe in their existence.

For another example that clearly separates the formalist, intuitionist and classical mathematician, consider the natural numbers, the ordinary counting numbers. For an intuitionist, the natural numbers are given. They are the starting point for the rest of mathematics. For a formalist they are strokes on a page, together with rules that combine sequences of strokes, or compare two different such sequences. To a classical mathematician, the intuitionistic approach explains little, and in particular there is no place for Dedekind’s elegant axiomatic description of the natural numbers and the canonicity of this system, because it rests on more complex notions such as arbitrary sets of numbers, something the intuitionist rejects. Similarly, the formalist approach is cumbersome, and in it numbers, it seems, always have to be manipulated using these formal systems. There is no place for the intuition of number that suddenly arises in a child when he or she sees some sheep and exclaims for the first time ‘Three sheep!’ having identified and abstracted the concept of ‘three’ and therefore no longer has to perform any tedious matching against a collection of apples to determine that of the set of apples there is exactly one apple each for the sheep to eat.

Abstract objects in general presumably arise in the same sort of way, even outside mathematics. Perhaps by playing a mental game, maybe with symbols following rules, or using a formal system, or by argumentation using a known style of argument, or whatever, one sees through intuition a pattern. The description of the pattern and analysis of how it arises is then the first part of abstracting the pattern into an object or system of objects that have personal meaning to us. The next stage is to describe that pattern more fully and show that in some sense it is canonical, enough to communicate it with other people that have a concept of the same sort of pattern, and a shared concept arises for discussion and research, and in this discussion and research it is most convenient to talk about and believe that these objects have some sort of prior existence as abstract objects.

It’s important to remember that failure of canonicity does not necessarily indicate the failure of this programme. An object (be it a number system or whatever) may not be canonical, but that may be more of the fault of the purported definition. Even if the object is canonical, the failure of the Second Canonicity Theorem may be an issue. But on the other hand, this may be something one can live with, or may even be necessary for theories based on the concept.

Appendix: Infinitesimals

This section provides a slightly technical background on nonstandard number systems and infinitesimals.

Just as with the complex numbers, where the addition of i to the reals necessarily requires us to add other numbers of the form x + iy to preserve as many of the usual properties of arithmetic that we expect, adding infinitesimals to the reals requires us to add other numbers too. Thus if h is a positive infinitesimal, −h should be a negative infinitesimal and 1/h a positive infinite number; also π + h is a new number that is infinitesimally close to π, but slightly larger than it, and so on. We get a picture of a system of hyper-real numbers which is like that of an extended real number line where each real number is ‘fattened’ to a set of numbers all infinitesimally close to that real number. This set of numbers infinitesimally close to a real number x is usually called the monad of x and written µ(x) or st⁻¹(x). As well as containing infinitesimals, our hyper-real number line also contains infinite numbers larger than previously existing reals, and also the negatives of these infinite numbers. There are finite hyper-reals (i.e. ones that are in magnitude bounded by ordinary real numbers) and infinite hyper-reals (other ones which are greater than all normal real numbers in magnitude). One non-obvious fact that follows from the completeness of the real number system is that every finite hyper-real y lies in the monad µ(x) of some standard real number x, which is uniquely determined by y. This standard real is called the standard part of y and written as st(y).
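The picture of monads and standard parts can be mimicked in a very small toy model. The Python sketch below is not a construction of the hyper-reals: it only represents numbers of the form a + b·h for real a, b and a single formal positive infinitesimal h, supports addition, negation and comparison, and has no product h·h and no 1/h. The class name Toy and the helper names are hypothetical.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Toy:
    """A number a + b*h, where h is a formal positive infinitesimal."""
    a: float  # standard part
    b: float  # coefficient of h

    def __add__(self, other):
        return Toy(self.a + other.a, self.b + other.b)

    def __neg__(self):
        return Toy(-self.a, -self.b)

    def __lt__(self, other):
        # Lexicographic order: compare standard parts first, then h-coefficients.
        return (self.a, self.b) < (other.a, other.b)


def st(y):
    """Standard part: the real number whose monad contains y."""
    return y.a


def in_same_monad(x, y):
    """True when x and y differ by an infinitesimal (here: a multiple of h)."""
    return x.a == y.a


zero = Toy(0.0, 0.0)
h = Toy(0.0, 1.0)
pi = Toy(math.pi, 0.0)

print(pi < pi + h)               # pi + h is slightly larger than pi
print(st(pi + h) == math.pi)     # but has the same standard part
print(h < Toy(1e-300, 0.0))      # h is below every positive real in the model
```

In this toy model every element is finite, so st is defined everywhere; in a genuine hyper-real system the existence of st(y) for finite y is exactly the non-obvious fact that relies on the completeness of the reals.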

The previous paragraph gives an account of the intuitive structure of the hyper-reals, but rather fails to explain why such number systems exist. In fact, prototype systems with infinitesimals can be constructed by algebraic means similar to the construction of the complex numbers, in a way in which the usual arithmetic laws of addition and multiplication hold, but such systems are not particularly rich or useful. For analysis and much other mathematics we need plenty of other functions to be defined, and for example it is not clear how by algebraic means we might define the value of the function sin(1/x) at an infinitesimal h. Because h is infinitesimal and sin(1/x) varies wildly between −1 and 1 near x = 0 the choice of sin(1/h) seems to be arbitrary. And there are many other similar arbitrary choices to make. This is where tools from logic help, and one method of constructing hyper-reals is to apply the Completeness Theorem in logic as described above. Essentially what is happening is that one of the axioms of ZFC, the Axiom of Choice or AC,6 usually via its equivalent form, Zorn’s Lemma, is applied to decide all these arbitrary choices simultaneously, and in a consistent way relative to all the other properties of the real numbers. Thus the first true models of nonstandard analysis contain infinitesimals h, have many or perhaps all real-valued functions defined, and satisfy all first order statements that were already true in the reals.

6 In this paper, AC for the Axiom of Choice is not to be confused with AC for Alternating Current. Both acronyms are too firmly set in place to be changed, unfortunately.

An alternative and popular but related method of construction is to use an ultrafilter. Essentially an ultrafilter is a set theoretic object which encodes a collection of infinitely many choices, and which is shown to exist in ZFC by an application of Zorn’s Lemma. The hyper-reals are now constructed by taking a large Cartesian product R^N of the reals and using the ultrafilter to make decisions as to which elements of this product should be regarded as equal and how functions such as sin(1/x) should be defined on it. This type of construction goes back to the Polish mathematician Łoś and his work in the 1950s, though similar ideas were already being used by Skolem (in a slightly less powerful way) to construct nonstandard models of counting numbers in the 1920s and 1930s.7
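The ultrafilter construction can be sketched in symbols as follows. This is a minimal sketch under assumptions not spelled out in the text: U is taken to be a non-principal ultrafilter on N = {1, 2, 3, ...}, and the bracket notation [(a_n)] for the class of a sequence is introduced here for illustration.

```latex
\begin{align*}
(a_n) \sim (b_n) &\iff \{\, n \in \mathbb{N} : a_n = b_n \,\} \in \mathcal{U},\\
{}^{*}\mathbb{R} &= \mathbb{R}^{\mathbb{N}} / {\sim},\\
[(a_n)] < [(b_n)] &\iff \{\, n \in \mathbb{N} : a_n < b_n \,\} \in \mathcal{U},\\
{}^{*}\!f\bigl([(a_n)]\bigr) &= [(f(a_n))].
\end{align*}
% Example: h = [(1/n)] is a positive infinitesimal, because for every real
% \varepsilon > 0 the set \{ n : 0 < 1/n < \varepsilon \} is cofinite and so
% belongs to the non-principal ultrafilter \mathcal{U}.  Then
\[
  {}^{*}\!\sin(1/h) \;=\; [(\sin n)],
\]
% and which monad this value lies in is decided by \mathcal{U}.
```

In this picture the apparently arbitrary choice of sin(1/h) is not made by hand: it is one of the infinitely many decisions already encoded in the ultrafilter.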

A third method of construction uses Gödel’s Incompleteness Theorem. Taking T to be one’s favourite consistent recursively axiomatised theory of arithmetic, the theory of first-order Peano Arithmetic (PA) being a common choice, by Gödel’s Second Incompleteness Theorem the theory with the axioms of T together with a single extra statement ¬Con(T) saying that T is inconsistent is itself consistent, so by the Completeness Theorem of logic it has a model or system of numbers satisfying it. In this model, the statement that there is a proof of an inconsistency 0 = 1 from the axioms of T is true, but we knew in advance that there is no such proof in the real world. Therefore that proof of 0 = 1 is a nonstandard object and it turns out rather quickly that it must have infinite length. So our model has infinite numbers. Now we can replicate the standard construction of the integers, rational numbers and real numbers starting with this model rather than starting from the usual set of counting numbers N. The result is a nonstandard model resembling the reals and containing all standard reals.
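The reasoning in this third construction can be laid out step by step. The following is a sketch only; the symbol Proof_T(p, ⌜0 = 1⌝), read as ‘p codes a T-proof of 0 = 1’, is notation assumed here for illustration and does not appear in the text above.

```latex
\begin{align*}
&\text{(1)}\quad T \text{ consistent} \implies T \nvdash \mathrm{Con}(T)
   &&\text{(Second Incompleteness Theorem)}\\
&\text{(2)}\quad T \nvdash \mathrm{Con}(T) \implies T + \neg\mathrm{Con}(T) \text{ is consistent}\\
&\text{(3)}\quad \text{there is } M \models T + \neg\mathrm{Con}(T)
   &&\text{(Completeness Theorem)}\\
&\text{(4)}\quad M \models \mathrm{Proof}_T(p, \ulcorner 0 = 1 \urcorner) \text{ for some } p \in M\\
&\text{(5)}\quad p \neq n \text{ for every standard } n \in \mathbb{N}
   &&\text{(else } T \text{ would really prove } 0 = 1\text{)}\\
&\text{(6)}\quad p > n \text{ for every standard } n \in \mathbb{N}
   &&\text{(so } p \text{ is an infinite number of } M\text{)}
\end{align*}
% \ulcorner and \urcorner require the amssymb package.
```

Step (5) uses the fact that whether a given standard number codes a T-proof of 0 = 1 is a finite check, on which the model M and the real world must agree.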

For the purposes of this paper, constructions by using tools from first order logic and constructions by ultrafilters are perfectly reasonable constructions of nonstandard number systems. Nonstandard analysts tend to dismiss the third method based on the incompleteness theorems, because it is inconvenient to check that statements they know to be true in the reals (such as the continuity of the sin function perhaps) are indeed expressible and provable in the theory T. We shall also dismiss this third construction for a different but related reason. That is, starting with a theory T that we know is consistent we build a model in which the theory T is not consistent. In some sense the model is wrong about the consistency of T and therefore we must reject it as not being a ‘true model of infinitesimals’.

7 References required.

Newton and Leibniz’s accounts already contain some ideas of when an infinitesimal number can be ignored and when it must not. For example, the ratio of two infinitesimals x/y should be calculated, as it could turn out to be zero, infinite or some finite number, but in the sum of a real number and an infinitesimal a + h the infinitesimal can often be neglected. Berkeley reasonably criticised this because the rules for when an infinitesimal may be neglected were not explained. Robinson’s nonstandard analysis provides these rules, and simplifies many of the definitions of analysis that according to Cauchy and others need the concept of limit.

For example, the sum of a real number and an infinitesimal a + h is a number in the monad of a. The notion of continuous function is one that maps monads to monads. More precisely, a function f defined on the reals is continuous at a if ∗f(a + h) ∈ µ(f(a)) for all infinitesimals h, where ∗f is the nonstandard version of the function f. (The function f is defined on the real numbers only, so for technical reasons the corresponding function defined on the hyper-reals is a different function ∗f, though it extends f in as natural a way as possible.) In post-Cauchy mathematics, this is a theorem: one can prove quite easily the equivalence of Cauchy’s definition of continuity and the one just given.
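To see the monad criterion for continuity at work, here are two examples chosen for illustration: f(x) = x² at a real point a, and a step function g at 0. It is assumed, as is the case in the models described above, that ∗f and ∗g obey the same defining formulas as f and g on the hyper-reals.

```latex
% f(x) = x^2 is continuous at every real a: for any infinitesimal h,
\begin{align*}
{}^{*}\!f(a + h) - f(a) &= (a + h)^{2} - a^{2} = 2ah + h^{2},
\end{align*}
% which is infinitesimal, so {}^{*}\!f(a + h) \in \mu(f(a)).
%
% The step function g(x) = 0 for x \le 0, g(x) = 1 for x > 0 is not
% continuous at 0: for a positive infinitesimal h,
\begin{align*}
{}^{*}\!g(0 + h) &= 1 \notin \mu(0) = \mu(g(0)).
\end{align*}
```

In the first case the monad of a is mapped into the monad of f(a); in the second, part of the monad of 0 is sent to a point outside the monad of g(0).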

Differentiability and all other notions from analysis can be given a similar treatment. Newton and Leibniz calculated the derivative of a function f at a by computing (f(a + h) − f(a))/h for an infinitesimal h and then neglecting infinitesimals afterwards. (For a continuous function f we have just seen that f(a + h) − f(a) is infinitesimal, so this is the ratio of two infinitesimals which initially must be calculated in some different way.) In nonstandard analysis, f is differentiable at a (with derivative b) if the quantity (∗f(a + h) − ∗f(a))/h always lies in the same monad (the monad µ(b)) irrespective of which nonzero infinitesimal h is taken. This appears to be exactly what Newton and Leibniz intended, and agrees with Cauchy’s definition exactly.
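The definition can be checked on two examples chosen for illustration: f(x) = x², and the absolute value function at 0. As before, this is a sketch that assumes ∗f obeys the same formula as f on the hyper-reals.

```latex
% f(x) = x^2 at a real point a, with h any nonzero infinitesimal:
\begin{align*}
\frac{{}^{*}\!f(a + h) - {}^{*}\!f(a)}{h}
  &= \frac{(a + h)^{2} - a^{2}}{h}
   = \frac{2ah + h^{2}}{h}
   = 2a + h \in \mu(2a),
\end{align*}
% so f is differentiable at a with derivative b = \operatorname{st}(2a + h) = 2a.
%
% g(x) = |x| at 0:
\begin{align*}
\frac{{}^{*}\!g(h) - {}^{*}\!g(0)}{h} = \frac{|h|}{h} =
\begin{cases}
  \phantom{-}1 & \text{if } h > 0,\\
  -1 & \text{if } h < 0,
\end{cases}
\end{align*}
% and these values lie in different monads, so g is not differentiable at 0.
```

The first computation is the ratio of two infinitesimals evaluated exactly before anything is neglected; only at the last step is the leftover h discarded by passing to the monad of 2a.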
