SENIOR MATHEMATICS 2017 Edition

Post on 22-May-2020

Page 1: SPRING SEMESTER 2017 Edition · Einstein’s general theory of relativity. So modern math does indeed belong in a liberal education at least as a preparation for other disciplines,

SENIOR MATHEMATICS

JUNIOR MATHEMATICS

SPRING SEMESTER

2017 Edition

Table of Contents

INTRODUCTION
1 THE MACLAURIN SERIES
2 SERIES BY BINOMIAL DIVISION
3 OTHER SERIES CONVERGING ON 𝜋
4 THE TAYLOR SERIES
5 INTRODUCTION TO 𝑖
6 EULER’S IDENTITY
7 CASPAR WESSEL’S DIRECTIONAL ALGEBRA
8 EULER’S IDENTITY REVISITED
9 THE PRODUCTS OF DIAGONALS IN REGULAR POLYGONS
10 INTRODUCTION TO DEDEKIND
11 THE INFINITE WORLD OF GEORG CANTOR
12 A FOREWORD TO NON-EUCLIDEAN GEOMETRY
13 A EUCLIDEAN REVIEW
14 THE FORERUNNERS TO NON-EUCLIDEAN GEOMETRY
15 NIKOLAI LOBACHEVSKY: GEOMETRICAL RESEARCHES ON THE THEORY OF PARALLELS
16 FURTHER EXPLORATION OF HYPERBOLIC GEOMETRY
17 POINCARÉ’S DISK MODEL OF THE HYPERBOLIC PLANE: A EUCLIDEAN MIRROR OF A NON-EUCLIDEAN WORLD
18 PHILOSOPHICAL REFLECTIONS ON THE FIFTH POSTULATE AND NON-EUCLIDEAN GEOMETRY
APPENDIX 1: DOT PRODUCTS AND CROSS PRODUCTS
APPENDIX 2: LINE INTEGRALS
APPENDIX 3: DOUBLE INTEGRALS
APPENDIX 4: SURFACE INTEGRALS

Introduction

Does modern mathematics belong in a liberal education? And if so, why? Is it because math is a liberal art, a preparation for further study? Modern math is a preparation for many things, among them modern science. For example, some fairly sophisticated calculus is a prerequisite to the study of James Clerk Maxwell’s electromagnetic theory, and tensor calculus is a prerequisite to the study of Albert Einstein’s general theory of relativity. So modern math does indeed belong in a liberal education at least as a preparation for other disciplines, such as natural science, that unquestionably belong in liberal education.

That cannot be the whole story, however. For one thing, it pertains to those who are liberally educated to understand the intellectual climate in which they live, to know the philosophical customs of the day, to grasp the forces that shape the modern mind. Modern mathematics exerts an enormous influence on the modern mind, and can almost be said to define it. Just as ancient mathematics was in many ways the scientific ideal according to Aristotle, so modern mathematics is the scientific ideal for the modern mind—the more mathematical our science of nature is, for example, the more scientific it is considered to be.

Even that is not the whole story. Although it pertains to a liberally educated person to understand the intellectual forces of the present, in order to be able to engage them, assess them, and take them into account when addressing others, and for this reason alone a serious study of modern mathematics deserves some place in a liberal education, there is also the matter of truth. Modern mathematics has uncovered countless truths that were unknown, and even unknowable, to the ancients, given the limitations of their methods. Not only has modern mathematics dramatically increased the power of the human mind to discover new theorems, but it has developed entirely new ways of thinking about numbers and geometrical objects, and these ways of thinking not only drive scientific and technological discovery, but are also in many cases very beautiful. Modern mathematics is not

simply more mathematics, a continuation of what the ancients began. It is a positive development, displaying profound differences and some departures from ancient mathematics in its principles, methods, and arguably even its subject matter. If liberal education should include a look at the truths most accessible to us, especially when they are very beautiful, then for this reason, too, modern mathematics deserves a seat at the table.

This reason for including modern mathematics in a liberal education calls to mind another, almost opposite, reason. Modern mathematics (and modern science) can appear to many of us like Descartes’ philosophy, as a wrong turn, even a perverse exercise in attacking common sense, since it so often does challenge or ignore our intuitive grasp of things. Many things come along in modern math that are questionable, more philosophically provocative than anything we run into in ancient mathematics, such as pairs of straight lines that intersect each other twice, parallels that meet at infinity, transfinite numbers, and other things whose intelligibility and legitimacy are not so clear. Perhaps sometimes modern mathematics is to blame for this. Galileo tried to compose continua out of indivisibles, which is an impossibility, and so a reader who finds fault with Galileo on that score is not to blame. Nonetheless, this very idea of his also contained much truth as well—it was something like a first crude grasp of calculus, and of the importance and relevance of calculus to the science of motion. Simply to condemn the philosophical error in Galileo’s thought while overlooking the germ of a great discovery would be a grave injustice to him. The same can be said for modern mathematics even in its most outrageous moments. Modern mathematics might propose some ideas that involve error or confusion, but these are well worth considering in part because of the great truth they also include, and in part because learning about subtle errors concerning first principles and how to resolve them is the way to wisdom about first principles. Modern mathematics, as much as modern science, often strikes against common sense, against what seems to be (or even is) self-evident, and for this reason alone it deserves careful study as part of a liberal education, which is chiefly concerned with first principles. 
Besides, it might be that in many cases the fault lies not with modern mathematics, but with ourselves—we can be too ready to reject an idea as contrary to self-evident truth simply because it runs contrary to what appears self-evident to us. This will be one of the great background questions throughout this course: how do we tell the difference between what really is self-evident and what merely seems so to us?

Still another reason for the study of modern mathematics is that even the wise among the ancients did not regard mathematics simply as a preparatory discipline. Aristotle, for example, reckoned it among the parts of wisdom.1 Mathematics in general, modern or not, is not a mere preparation for wisdom (as logic and grammar are, which are not reckoned as parts of wisdom), but is itself a speculative science that is a part of wisdom in some way, even if the least part. It is not just about refining our tools for knowing and communicating, but is about the beautiful, about an order that we do not simply make but also discover—or so Plato and Aristotle would have said. For this general reason, mathematics, like natural science and theology, should occupy a place throughout a liberal education, not merely at the beginning, as logic and grammar do. And one’s grasp of

1 “And so there are three theoretical philosophies: mathematical, natural, theological.” Metaphysics 6.1 1026a17.

mathematical things, and of the nature of mathematics itself, will be more complete (or less incomplete) in a general way if it is not confined to ancient mathematics, but includes modern mathematics as well.

Probably there are other reasons why modern mathematics should occupy our minds in the course of a liberal education. Supposing the foregoing reasons are sufficient to justify its inclusion in this little book, there remains the question of selection. The mathematics of just the last two centuries alone is so enormous in extent that no professional mathematician can come close to knowing it all, or even a large portion of it. Even a purely mathematical training for many years must remain a highly selective study of mathematics. All the more will this be true of a single course for a semester or two among many other courses outside of mathematics. We will base our selection from modern mathematics for this course partly on where we began our study of modern mathematics, with Descartes, and on some of the philosophical concerns central to the other parts of our program.

With Descartes and calculus, we began to see a development in the idea of mathematical number. First was Euclid’s idea of discrete multitudes composed of indivisible units. Then came the measured straight lines of Descartes, and then the measure of any continuous quantity came to be called a “number.” What exactly is this new idea of number? Is it a broadening out of Euclid’s idea, inclusive of his and of other things besides? Or is it simply another thing entirely? Do we freely create such numbers, or do they exist independently of us, whereas we just find them and learn about them? And what about space, or the geometrical continuum? Does that have a nature of its own independently of us that we merely study, or is it something we create, imagine as we please, into which we get to build rules of our choosing by which it will behave? And what about physical space? What is that, exactly? Are we considering that in geometry in some way? Is it just one of many possible spaces, whereas in geometry we consider all possible ones? And if arithmetic and geometry are about things we make entirely, and in no way about things that themselves dictate to us what must be true about them, so that the starting points of mathematics are a matter of arbitrary choice, then is mathematics really about quantity, or is it more truly about logical relations and structures among ideas that we invent? Is math just logic?

These are some of the questions that underlie the selection of modern mathematics that we will cover. The purpose of the selection will be to give us some sense of the modern nature of modern mathematics, of how it differs and develops from ancient thought, and how it raises in new ways fundamental questions about the nature of number, continuity, infinity, space, and mathematics itself.

That is still rather general. What specifically will be the content and order of this course? As to number theory, a certain question comes up when we begin defining numbers by operations carried out with reference to an arbitrarily chosen unit in a continuum: what operations are allowed, exactly? Beginning with Descartes (or even Viète), we saw that operations that used to be forbidden or meaningless became permissible and meaningful. For example, in Euclid’s numbers, 5 − 3 had meaning, but 3 − 5 did not. Once the idea of positive and negative lengths (measured on opposite sides of a point of origin) entered the

picture, 3 − 5 took on a clear meaning, and negative numbers were born. Again,

for Euclid’s numbers, √25 was meaningful, but √2 was not. With Descartes, √2 came to mean the straight line that is the mean proportional between the straight

line called 1 and the straight line called 2. Hence √2 became a “number” in some

sense. For Euclid’s numbers, 6 ÷ 2 was possible, whereas 2 ÷ 3 was meaningless. But for Descartes, for whom those numbers would represent straight lines, the operation is possible and meaningful, and so $\frac{2}{3}$ became a number, a magnitude in some sense measured by 1, because it is made known through operations on magnitudes that are multiples of 1. But even for Descartes, who puts himself forward as extremely permissive,

certain operations remain forbidden or without meaning. For example, there is √−1 which Descartes called “imaginary,” as if to say “fictional and meaningless.” Why is it forbidden? Because the square of a positive number along an x-axis in a Cartesian coordinate system gives a positive number, and the square of a negative number also gives a positive number (as we first learned in grade school, then saw

explained to some extent by Viète). But √−1 would have to be a number whose square is negative, and there doesn’t seem to be any such thing, or not under Cartesian operations. This is the sort of thing that provokes the mind of a modern mathematician. Is there a new, more universal, more natural way of understanding

mathematical operations under which √−1 will take on meaning? Descartes himself began the custom of finding new and more general definitions for our mathematical operations, making the formerly impossible and meaningless into something possible and meaningful. Descartes’ own success with this should make us wonder whether strange limitations and restrictions on our operations are a sign that we have defined them too narrowly. The limitations are like signs that our operations have embraced only part of some natural mathematical reality, and don’t mirror the whole thing.

And what about 𝜋 and 𝑒? Are these “numbers”? 𝑒 is the limit of the sum of inverse factorials $\left(\frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!}\right)$ as 𝑛 goes to infinity. There is something

troubling about that. It is not a sum of such inverses, but a limit of such sums. Descartes excluded certain curves from geometry on the grounds that they could not be described by algebraic equations. Are there also quantities that cannot be described by a finite number of algebraic operations? Even if it is not possible to

express 𝑒 as the result of a finite number of algebraic operations on integers, at least it is expressible as the limit of a sum constituted entirely of algebraic operations on integers, and there is a simple rule or pattern among those operations. So far, we have not seen 𝜋 thus expressed. Can it be thus expressed? Is every magnitude expressible at least as the limit of some definite pattern of algebraic operations? To what extent is continuous quantity expressible, knowable, by means of algebraic operations on whole numbers? We will explore

the question of expressing 𝜋 by algebraic operations with the help of Maclaurin

and Taylor, then the question of √−1 with the help of Euler and Wessel, and then the broader questions of the nature of continuity and number, and of infinity and non-algebraic numbers, with the help of Dedekind and Cantor. There are many questions we could ask about number theory to which modern mathematics provides answers,2 but we will begin with these.

After this glimpse of modern developments in number theory, we will look into modern developments in geometry. Arguably the greatest, most influential, radical, and quintessentially modern development in geometry is non-Euclidean geometry.

2 For example, we will not look into group theory, or the theory of primes and relative primes, or the modern theory of equations.

Therefore the course will move to that next. We will begin with a brief review of Euclid, then move on to Proclus, who numbers among the ancient critics of Euclid’s Fifth Postulate, and from there to Lobachevsky, one of the pioneers of non-Euclidean geometry. This part of the course will culminate in Einstein’s relativity theory, which, although not pure mathematics, not only makes use of non-Euclidean geometry, but also in other ways reconceives the nature of space (and time), and moves us to ask about the relationship between mathematics and the natural world. Although time constraints will prevent us from seeing Einstein’s full-blown theory (the general theory of relativity) with the kind of mathematical detail in which we studied Newton’s Principia, we will be able to see some of the great relativistic shifts in the concepts of space and time, mass and energy, gravity and inertia.

In this way, the course will have a certain unity of its own, but will also round out other parts of the program. Aristotle and Newton had important things to say about the nature of space and time, but since their day new facts have come to light that only Einstein, so far, has been able to account for. Those facts themselves, and Einstein’s way of accounting for them, are marvelously counterintuitive, and provide ample opportunity for discussing the difference between what is truly self-evident, and what is not but only seems to be.

These considerations should make it plain that this is not just a course in mathematics, but a philosophy course as well.

1 The Maclaurin Series

Let’s begin our acquaintance with modern number theory by looking into the question whether 𝜋 can be expressed in terms of algebraic operations somehow,

say as a limit of some pattern in algebraic operations on whole numbers, as 𝑒 can be expressed. 𝜋 is really just one instance, albeit a fundamental one, of a whole family of numbers associated with trigonometric functions. Consider, for example, sin 𝑥. When we plug in a definite value for 𝑥, the result is a definite number. In very rare cases, the number is a nice rational thing, as for example $\sin 30^\circ = \frac{1}{2}$. In other cases, it is irrational, but at least expressible as a finite number of algebraic operations, as for example $\sin 45^\circ = \frac{\sqrt{2}}{2}$. But what happens if the angle of which we are taking a sine is an irrational angle, that is, an angle that is incommensurable with a full turn (360°) or with a right angle? For example, consider $\sin\left(\frac{1}{\sqrt{17}} \cdot 360^\circ\right)$. What

number is that equal to? Well, it turns out we cannot express it in terms of a finite number of algebraic operations on whole numbers. And 𝜋 itself is the same way, for which reason it is called a transcendental number, meaning it is not the root of an algebraic equation. This disturbing truth about 𝜋 was first proved by the German mathematician Ferdinand von Lindemann (1852-1939) in 1882. We will return to the idea of transcendental numbers later with Cantor, and then we will see a general reason why such numbers must exist, and in great abundance. But

for now let’s ask the more modest question: if 𝜋 and sin 𝑥 (for irrational 𝑥) cannot be expressed by a finite number of algebraic operations on whole numbers, can they at least be expressed as the limit of a definite pattern of such operations? The number 𝑒 is also transcendental, as it happens. It was the first number not specially constructed for the purpose to be proved transcendental (Joseph Liouville had already constructed artificial examples of transcendental numbers some decades earlier), and it was proved such by the French mathematician Charles Hermite (1822-1901) in 1873. And yet we saw that

$$e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} + \dots$$

which allows us to evaluate 𝑒 to any degree of precision. This expression is also a

beautiful pattern, and gives us a way of defining 𝑒 by algebraic operations on whole numbers. It would be nice if we could do the same for 𝜋.
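
As a rough illustration of how quickly this pattern closes in on 𝑒, here is a minimal Python sketch. The function name `e_partial_sum` is made up for this example and is not part of the course materials; it adds up the first few inverse factorials exactly, as fractions of whole numbers, and compares the result with the value of 𝑒 that Python's standard library provides.

```python
from fractions import Fraction
from math import e, factorial


def e_partial_sum(n_terms):
    """Add 1/0! + 1/1! + ... + 1/(n_terms - 1)! exactly, as a fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n_terms))


# Each partial sum is a ratio of whole numbers; only the limit is e itself.
for n in (5, 10, 15):
    approx = e_partial_sum(n)
    print(n, float(approx), abs(float(approx) - e))
```

Every partial sum is a rational number, so the sketch never produces 𝑒 itself, only numbers that get closer to it as more terms are taken.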

Again, even though $e^x$ is a transcendental function (Descartes would not have permitted it into geometry), we saw in the junior mathematics tutorial that

$$e^x = \frac{x^0}{0!} + \frac{x^1}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \dots$$

which allows us to evaluate it to any degree of precision for a given value of 𝑥. It would be nice if we could do the same for sin 𝑥. What, for example, is sin 1? Can

you say how the value of this is determined out to 27 decimal places? To deal with such questions, we need the Maclaurin Series, which is a way of expressing functions in the form of an algebraic polynomial that can be evaluated for any given value

of 𝑥. It is named after Scottish mathematician Colin Maclaurin (1698-1746), who was a professor at the University of Edinburgh and a disciple of Newton. Suppose we have some puzzling function, 𝑓(𝑥) (whether it be sin 𝑥 or any other function we wish to understand better), and we are trying to find a way to express it as an algebraic polynomial. To start, we simply assume that our function is equal to the limit of some infinite polynomial, and force the function to specify for us what all the terms in the polynomial must be:

$$f(x) = a_0 + a_1 x^1 + a_2 x^2 + a_3 x^3 + \dots$$

Our job is to find a way to identify the coefficients.

To find out what $a_0$ is, we just have to find a new equation that includes $a_0$ but gets rid of all the other unknowns, which is easy, since we just have to let 𝑥 = 0 to accomplish that:

$$f(0) = a_0$$

So if we know the value of our function at 0, we will know this first coefficient, $a_0$. How do we find the second coefficient, $a_1$? We need a new equation that will kill every term in the polynomial except that one.

The key insight is that the first derivative will kill the constant term, $a_0$, and then taking the value of this first derivative at zero will kill all the others:

$$f'(x) = 0 + a_1 + 2a_2 x^1 + 3a_3 x^2 + 4a_4 x^3 + 5a_5 x^4 + \dots$$

so

$$f'(0) = a_1$$

and that gets us our second coefficient, as long as we know the value of our original function’s first derivative at zero. Now rinse and repeat:

$$f''(x) = 0 + 2a_2 + 2 \cdot 3\, a_3 x^1 + 3 \cdot 4\, a_4 x^2 + 4 \cdot 5\, a_5 x^3 + \dots$$

so

$$f''(0) = 2a_2$$

so

$$\frac{f''(0)}{2} = a_2$$

Again,

$$f'''(x) = 1 \cdot 2 \cdot 3\, a_3 + 2 \cdot 3 \cdot 4\, a_4 x^1 + 3 \cdot 4 \cdot 5\, a_5 x^2 + \dots$$

so

$$f'''(0) = 3!\, a_3$$

and

$$\frac{f'''(0)}{3!} = a_3$$

Likewise,

$$\frac{f^{iv}(0)}{4!} = a_4$$

And so on. So now, replacing all our coefficients with our new expressions for them, we have

$$f(x) = \frac{f(0)x^0}{0!} + \frac{f'(0)x^1}{1!} + \frac{f''(0)x^2}{2!} + \frac{f'''(0)x^3}{3!} + \dots$$

This is the general Maclaurin series. So if we know all of a function’s derivatives, and if we know the value of the function and of all its derivatives at zero, we can write the Maclaurin series for that function. Is the series trustworthy? We simply assumed that every function can be rewritten as a polynomial function, at least an infinite one. One way to test it is to take any

polynomial function, such as $5x^3 - 3x^2 + 2x - 6$, and write out what the Maclaurin series says this should be equal to, and see whether it all works out. Another way to test it is to compare it to a function not expressible as a finite polynomial whose infinite polynomial expression we already know. (We will come back and prove our assumption is valid later, when we consider a more general form of the Maclaurin series called the Taylor series.)
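
The first test can be carried out mechanically. The following is only a sketch under assumptions of our own: a polynomial is stored as a list of coefficients `[c0, c1, c2, ...]`, and the helper names `derivative` and `maclaurin_coefficients` are hypothetical, chosen for this illustration. It differentiates the polynomial repeatedly, reads off each derivative's value at zero, divides by the matching factorial, and checks that the original coefficients come back.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...] (c_k multiplies x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def maclaurin_coefficients(coeffs):
    """Return [f(0)/0!, f'(0)/1!, f''(0)/2!, ...] for the given polynomial."""
    result = []
    current = list(coeffs)
    n = 0
    while current:
        # The value of a polynomial at x = 0 is just its constant term.
        result.append(Fraction(current[0], factorial(n)))
        current = derivative(current)
        n += 1
    return result


# 5x^3 - 3x^2 + 2x - 6, written from the constant term upward.
p = [-6, 2, -3, 5]
print(maclaurin_coefficients(p) == p)
```

For a finite polynomial the derivatives eventually vanish, so the series stops on its own and reproduces exactly the coefficients we started with.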

TEST DRIVE

So let’s take the Maclaurin series for a test drive, shall we? What is the Maclaurin series for $e^x$? This is a good one to ask about, since we already know what this function looks like when it is expanded as a polynomial. According to our new general formula,

$$e^x = \frac{f(0)}{0!} + \frac{f'(0)x^1}{1!} + \frac{f''(0)x^2}{2!} + \frac{f'''(0)x^3}{3!} + \dots$$

Now we know that for this function 𝑓(0) = 𝑓′(0) = 𝑓′′(0) = 𝑓′′′(0) etc., since the derivative of $e^x$ is just $e^x$. And the value of $e^x$ for 𝑥 = 0 is just 1. Therefore the Maclaurin series for $e^x$ is

$$e^x = \frac{x^0}{0!} + \frac{x^1}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \dots$$

which we already know to be correct from independent inquiry. So this is a strong indication that our Maclaurin series is a true and accurate way to re-express functions. Really, the only assumption we made in deriving the Maclaurin series is that any function can be expressed precisely as the limit of an infinity of algebraic operations, if not precisely by a finite number of algebraic operations. Does this seem to be a reasonable assumption? Is it a self-evident principle, a postulate? In any case, we shall see it is true when we later derive the Taylor Series.

APPLICATION TO TRIG FUNCTIONS

We already knew about $e^x$. We are more interested in something we have not yet expressed as a polynomial, such as sin 𝑥. Can we get anywhere with that one? We know all the derivatives of sin 𝑥. And we know all their values at 𝑥 = 0. So we can write a Maclaurin series for sin 𝑥, like this:

$$\sin x = \frac{(\sin 0)x^0}{0!} + \frac{(\cos 0)x^1}{1!} + \frac{(-\sin 0)x^2}{2!} + \frac{(-\cos 0)x^3}{3!} + \dots$$

or

$$\sin x = 0 + \frac{x^1}{1!} + 0 - \frac{x^3}{3!} + \dots$$

i.e.

$$\sin x = \frac{x^1}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$$

in other words, it is just the odd powers of 𝑥 over the odd factorials, with alternating signs. What about cosine? As in the case of sine, we know all the derivatives of cos 𝑥,

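and we know all their values at 𝑥 = 0. So we can write a Maclaurin series for cos 𝑥, like this:

$$\cos x = \frac{(\cos 0)x^0}{0!} + \frac{(-\sin 0)x^1}{1!} + \frac{(-\cos 0)x^2}{2!} + \frac{(\sin 0)x^3}{3!} + \dots$$

or

$$\cos x = \frac{x^0}{0!} + 0 - \frac{x^2}{2!} + 0 + \dots$$

i.e.

$$\cos x = \frac{x^0}{0!} - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots$$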

in other words, it is just the even powers of 𝑥 over the even factorials, with alternating signs. Now we can find the sine or cosine of any number to any desired precision, although it might be a computational pain.
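
To see the two series at work, here is a small Python sketch. The function names `sin_series` and `cos_series` and the default of 20 terms are choices made for this illustration only, and the angle is assumed to be measured in radians. Each function adds up the first terms of the corresponding series, and the result is compared with the values from Python's standard `math` module.

```python
import math


def sin_series(x, n_terms=20):
    """Partial sum x/1! - x^3/3! + x^5/5! - ... using n_terms terms (x in radians)."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))


def cos_series(x, n_terms=20):
    """Partial sum 1/0! - x^2/2! + x^4/4! - ... using n_terms terms (x in radians)."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(n_terms))


# Compare the partial sums with the library values at x = 1.
print(sin_series(1.0), math.sin(1.0))
print(cos_series(1.0), math.cos(1.0))
```

The factorials in the denominators grow so fast that, for an angle as small as 1, a handful of terms already agrees with the library value to many decimal places.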

But our original question was about whether it is possible to re-express 𝜋 in terms of algebraic operations. We need to make one more preliminary consideration before coming back to that.

2 Series by Binomial Division

We still have no series for 𝜋 as we do for 𝑒. Another technique will help us remedy this. The technique is called binomial division (that is, dividing by a binomial). Observe:

$$1 = 1 + r - r + r^2 - r^2 + r^3 - r^3 + r^4 - r^4 + \dots$$

so

$$1 = (1 + r + r^2 + r^3 + \dots) + (-r - r^2 - r^3 - \dots)$$

so

$$1 = (1 - r)(1 + r + r^2 + r^3 + \dots)$$

so

$$\frac{1}{1 - r} = 1 + r + r^2 + r^3 + \dots$$

That is a neat truth by itself. Furthermore, if we let 𝑟 = −𝑞, we get

$$\frac{1}{1 + q} = 1 - q + q^2 - q^3 + q^4 - q^5 + \dots$$

If we now notice the similarity of our binomial division to the expression $\frac{1}{1 + x^2}$, we will be on our way to striking gold. Why? Because of the relationship between $\frac{1}{1 + x^2}$ and the trigonometric function arctan 𝑥. Looking at our general formula for binomial division, let’s see what happens if we let

$$q = x^2$$

Plugging this back into our formula, we have

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + x^8 - \dots$$

Recall, now, from the junior mathematics manual, that $\frac{1}{1 + x^2}$ is the derivative of arctan 𝑥. So now we know that

$$\text{derivative of } \arctan x = 1 - x^2 + x^4 - x^6 + x^8 - \dots$$

If we take the integral (or antiderivative) of both sides, we will get another equation. But the antiderivative of the derivative of arctan 𝑥 is just arctan 𝑥. And the antiderivative of the right side is found simply by using the Power Rule in reverse, term by term. So now we have:

$$\arctan x = \frac{x^1}{1} - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \frac{x^9}{9} - \dots$$

Letting 𝑥 = 1, we have

$$\arctan 1 = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \dots$$

Now what is the arctan of 1, that is, what is the arc-length whose tangent is 1? It is the arc of 45°, also known as $\frac{\pi}{4}$.

So

$$\frac{\pi}{4} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \dots$$

so

$$\pi = \frac{4}{1} - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} + \frac{4}{9} - \dots$$

So 𝜋 is just the limit of four over the odd numbers with alternating signs.

This is amazing. Not only do we have the kind of series for 𝜋 that we were hoping for, but we seem to have here a definition of 𝜋 that is independent of circles.

Moreover, we see that 𝜋 is somehow related to the odd numbers.
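
The series can be tried out numerically. What follows is a minimal sketch, with the function name `pi_partial_sum` invented for the purpose; it adds up four over the odd numbers with alternating signs and reports how far each partial sum still is from the value of 𝜋 stored in Python's standard library.

```python
import math


def pi_partial_sum(n_terms):
    """Partial sum 4/1 - 4/3 + 4/5 - 4/7 + ... using n_terms terms."""
    return sum(4.0 * (-1) ** k / (2 * k + 1) for k in range(n_terms))


# Watch the gap to pi shrink as more terms are taken.
for n in (10, 300, 100_000):
    approx = pi_partial_sum(n)
    print(n, approx, abs(math.pi - approx))
```

The partial sums land alternately above and below 𝜋, and the gap closes only gradually as the number of terms grows.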

QUESTION

We said that

$$\frac{1}{1 - r} = 1 + r + r^2 + r^3 + \dots$$

Will this work for all values of 𝑟? What happens if $r = \frac{1}{2}$? If 𝑟 = 2? If 𝑟 = −1? Do the results make sense? If not, what should we say about them? Should we say that this method of division works only for certain values of 𝑟? Is there some reason why it should work for some, but not for others?

If we let 𝑟 be positive but less than 1, then take the series $(1 + r + r^2 + \dots)$ out some finite number of terms and add it up, we will always be left with some difference between this sum and $\frac{1}{1 - r}$. But that difference will shrink as we take more terms in the series, and it will shrink as close to zero as we please, the more terms we take. On the other hand, this does not happen if 𝑟 is equal to or greater than 1. If it is equal to 1, then we get zero in the denominator, and we have something undefined. If it is greater than 1, for example if it is equal to 2, or again if its absolute value is equal to or greater than 1, for example if 𝑟 is −1, the difference between $\frac{1}{1 - r}$ and the series does not shrink to as close to zero as we please as we take more terms in the series. In cases such as these, we do not have any reason to believe that the binomial division is being approached by the series.
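
The behavior described above can be checked with a few lines of Python. This is only a sketch, and the function name `geometric_partial_sum` is made up for the illustration. It rests on the finite identity (1 − r)(1 + r + … + r^(n−1)) = 1 − r^n, which means the gap between the first n terms and 1/(1 − r) has absolute value |r^n / (1 − r)|, so everything depends on whether r^n shrinks toward zero.

```python
def geometric_partial_sum(r, n_terms):
    """Add 1 + r + r^2 + ... + r^(n_terms - 1)."""
    return sum(r ** k for k in range(n_terms))


# For each r, measure the gap between 1/(1 - r) and the partial sums.
for r in (0.5, -0.5, 2, -1):
    target = 1 / (1 - r)
    gaps = [abs(target - geometric_partial_sum(r, n)) for n in (5, 10, 20)]
    print(r, target, gaps)
```

For 𝑟 = 1/2 the gaps fall toward zero, for 𝑟 = 2 they grow without bound, and for 𝑟 = −1 the partial sums flip between 1 and 0, so the gap stays fixed at 1/2 and never closes.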


3

Other Series Converging on 𝝅

Coming back, now, to $\pi$. Thanks to the series we just discovered for $\pi$, we can in principle evaluate $\pi$ to any degree of precision. But by means of the particular series we have found, we cannot really do so in practice, since it converges much more slowly than does our series for $e$. The series

$$\frac{\pi}{4} = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{2k - 1} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \dots$$

is known as the Gregory series (it was first found by Leibniz and Gregory³), and it converges so slowly that going as far as three hundred terms into it does not suffice to calculate $\pi$ correctly even to two decimal places, and going nearly 100,000 terms is required before the first four decimal places are correctly obtained.

Are there other series for $\pi$ besides this one? There are. Leonhard Euler found a very interesting series for $\pi$ as follows.

Let a polynomial function $p(x)$ be such that $p(0) = 1$, and such that it has $n$ roots $a, b, c, \dots, q$, that is,

$$p(a) = 0, \quad p(b) = 0, \quad p(c) = 0, \quad \dots, \quad p(q) = 0$$

3 James Gregory (1638-1675) was a Scottish mathematician who knew the basics of calculus before Newton or Leibniz had published their work on it, and knew Taylor’s series, which we will learn in the next section, forty years before Taylor published it.


Let the polynomial be defined as

$$p(x) = \left(1 - \frac{x}{a}\right)\left(1 - \frac{x}{b}\right)\left(1 - \frac{x}{c}\right) \cdots \left(1 - \frac{x}{q}\right)$$

It is clear that plugging in 0 for $x$ gives $p(0) = 1$, and also that plugging in any root will make the function value 0. But this is also a general formula for any polynomial meeting our conditions. For example, if $p(x)$ is a cubic polynomial for which

$$p(2) = p(3) = p(6) = 0$$

and $p(0) = 1$, then it must be of the form

$$p(x) = \left(1 - \frac{x}{2}\right)\left(1 - \frac{x}{3}\right)\left(1 - \frac{x}{6}\right)$$

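Let’s assume this is so for any polynomial function, even if it is infinite. Now consider the function

$$f(x) = \frac{\sin x}{x}$$

By the Maclaurin expansion of $\sin x$ we can write

$$f(x) = \frac{\left[x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \dots\right]}{x}$$

or

$$f(x) = \frac{x}{x}\left[1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \frac{x^8}{9!} - \dots\right]$$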

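Now this is zero whenever $\sin x$ is zero (apart from $x = 0$ itself), that is, it is zero at

$$x = \pm\pi, \ \pm 2\pi, \ \pm 3\pi, \ \dots$$

and so these are the roots of the equation. So, by our above method of factoring a polynomial, we have

$$1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \frac{x^8}{9!} - \dots = \left[1 - \frac{x}{\pi}\right]\left[1 - \frac{x}{-\pi}\right]\left[1 - \frac{x}{2\pi}\right]\left[1 - \frac{x}{-2\pi}\right]\left[1 - \frac{x}{3\pi}\right]\left[1 - \frac{x}{-3\pi}\right] \cdots$$

$$1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \frac{x^8}{9!} - \dots = \left[1 - \frac{x}{\pi}\right]\left[1 + \frac{x}{\pi}\right]\left[1 - \frac{x}{2\pi}\right]\left[1 + \frac{x}{2\pi}\right]\left[1 - \frac{x}{3\pi}\right]\left[1 + \frac{x}{3\pi}\right] \cdots$$

$$1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \frac{x^8}{9!} - \dots = \left[1 - \frac{x^2}{\pi^2}\right]\left[1 - \frac{x^2}{4\pi^2}\right]\left[1 - \frac{x^2}{9\pi^2}\right]\left[1 - \frac{x^2}{16\pi^2}\right]\left[1 - \frac{x^2}{25\pi^2}\right] \cdots$$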

Next we “multiply out” the right side, grouping terms involving the same power of $x$, and then imitate the signs on the left side (we are following the principle that the right side, multiplied out, must actually just be the same as the left side, and not just equal to it):

$$1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \frac{x^8}{9!} - \dots = 1 - \left[\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \cdots\right]x^2 + [\dots]x^4 + \dots$$

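We get the 1 on the right side just by multiplying all the ones in the brackets in the previous equation. And we get the first bracketed sum on the right side by multiplying the $x^2$ term of each bracket by the ones in all the other brackets, adding up the results, and then factoring out $x^2$.

If our earlier principle about the forms of polynomials is correct, then we are justified in equating coefficients here, and that means that the $x^2$ term on the left is equal to the $x^2$ term on the right, that is

$$\frac{x^2}{3!} = \left[\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \cdots\right]x^2$$

or

$$\frac{1}{6} = \left[\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \cdots\right]$$

or

$$\frac{1}{6} = \frac{1}{\pi^2}\left[\frac{1}{1} + \frac{1}{4} + \frac{1}{9} + \cdots\right]$$

or

$$\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \cdots$$

Amazing. Who knew that

$$\pi = \sqrt{6\left(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \cdots\right)} \ ?$$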
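Euler's result can be checked against the known value of $\pi$. The sketch below is an illustration only; the helper name `basel_pi` is an assumption made here, and the check relies on Python's built-in `math.pi` rather than on anything derived in the text.

```python
import math

def basel_pi(terms):
    """Estimate pi as sqrt(6 * (1/1^2 + 1/2^2 + ... + 1/terms^2))."""
    return math.sqrt(6 * sum(1 / k ** 2 for k in range(1, terms + 1)))

for n in (10, 1000, 100_000):
    estimate = basel_pi(n)
    print(n, estimate, math.pi - estimate)
```

Every partial sum falls a little short of $\pi$, since all the omitted terms are positive, and the shortfall shrinks as more inverse squares are included.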

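And again, where’s the circle? We now see that $\pi$ has a deep relationship to the inverses of the square numbers. This seems to be a strong motivation again for thinking of $\pi$ as a number.

ANOTHER SERIES FOR PI

We saw that

$$\frac{\sin x}{x} = \left[1 - \frac{x^2}{\pi^2}\right]\left[1 - \frac{x^2}{4\pi^2}\right]\left[1 - \frac{x^2}{9\pi^2}\right]\left[1 - \frac{x^2}{16\pi^2}\right]\left[1 - \frac{x^2}{25\pi^2}\right] \cdots$$

Letting $x = \frac{\pi}{2}$, we have

$$\frac{\sin\frac{\pi}{2}}{\frac{\pi}{2}} = \left[1 - \frac{\left(\frac{\pi}{2}\right)^2}{\pi^2}\right]\left[1 - \frac{\left(\frac{\pi}{2}\right)^2}{2^2\pi^2}\right]\left[1 - \frac{\left(\frac{\pi}{2}\right)^2}{3^2\pi^2}\right]\left[1 - \frac{\left(\frac{\pi}{2}\right)^2}{4^2\pi^2}\right] \cdots$$

$$\frac{1}{\frac{\pi}{2}} = \left[1 - \frac{1}{2^2}\right]\left[1 - \frac{1}{2^2 2^2}\right]\left[1 - \frac{1}{2^2 3^2}\right]\left[1 - \frac{1}{2^2 4^2}\right] \cdots$$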


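$$\frac{2}{\pi} = \left[1 - \frac{1}{2^2}\right]\left[1 - \frac{1}{2^2 2^2}\right]\left[1 - \frac{1}{2^2 3^2}\right]\left[1 - \frac{1}{2^2 4^2}\right] \cdots$$

$$\frac{2}{\pi} = \left[\frac{3}{4}\right]\left[\frac{15}{16}\right]\left[\frac{35}{36}\right]\left[\frac{63}{64}\right] \cdots$$

$$\frac{\pi}{2} = \left[\frac{4}{3}\right]\left[\frac{16}{15}\right]\left[\frac{36}{35}\right]\left[\frac{64}{63}\right] \cdots$$

$$\frac{\pi}{2} = \left[\frac{2 \times 2}{1 \times 3}\right]\left[\frac{4 \times 4}{3 \times 5}\right]\left[\frac{6 \times 6}{5 \times 7}\right]\left[\frac{8 \times 8}{7 \times 9}\right] \cdots$$

$$\frac{\pi}{2} = \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdot 8 \cdot 8 \cdots}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdot 7 \cdot 9 \cdots}$$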
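The product just derived can also be multiplied out numerically, one bracket at a time. This is a sketch under the assumption that the brackets are taken in the order shown, each of the form $\frac{2n \times 2n}{(2n-1)(2n+1)}$; `wallis_pi` is a hypothetical helper name.

```python
import math

def wallis_pi(factors):
    """Estimate pi as 2 * [2*2/(1*3)] * [4*4/(3*5)] * ... using `factors` brackets."""
    product = 1.0
    for n in range(1, factors + 1):
        product *= (2 * n) ** 2 / ((2 * n - 1) * (2 * n + 1))
    return 2 * product

for n in (1, 10, 1000):
    estimate = wallis_pi(n)
    print(n, estimate, math.pi - estimate)
```

As with the Gregory series, the estimates creep toward $\pi$ quite slowly.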

Wow! Yet another formula for $\pi$, this time an infinite product, again relating it to the squares of the integers. This constant has not only a significance for circles, then, but also for squares of numbers, for some reason.

RAMANUJAN

Srinivasa Ramanujan was an Indian mathematician who lived from 1887 to 1920. He had incredible native genius, and was largely self-taught. He eventually drew the attention of some Indian mathematicians who put him in touch with English mathematicians. Ramanujan, with the help of some friends, drafted letters to three leading mathematicians at Cambridge University. The first two (H. F. Baker and E. W. Hobson) returned his papers without any sort of comment, a polite “Take a hike, kid.” But the third was G. H. Hardy, who, looking at nine pages of highly original results coming from a nobody, at first thought Ramanujan must be a fraud. But he saw many new results that were almost impossible to believe, and later said that the new theorems “defeated me completely; I had never seen anything in the least like them before.” He then reasoned that Ramanujan’s theorems “must be true, because, if they were not, no one would have the imagination to invent them.” Hardy showed the theorems of Ramanujan to his colleague, J. E. Littlewood, and to another colleague, E. H. Neville, who said “not one [theorem] could have been set in the most advanced mathematical examination in the world.”

Ramanujan eventually came to England and met Hardy. There was something of a clash, or contrast, between them. Hardy was an atheist, and loved proof and dealt little with intuition. Ramanujan was deeply religious and highly intuitive (although he was also fully capable of offering proofs for his theorems). He gave credit for his ingenuity to his family goddess, Namagiri of Namakkal, and often said


“An equation for me has no meaning, unless it represents a thought of God.” Ramanujan was one of the youngest men ever elected a Fellow of the Royal Society. He was elected for his investigation into elliptic functions and the theory of numbers.

Legend has it that when Hardy arrived at Ramanujan’s place of residence one day in a taxi cab numbered 1729, Hardy remarked that the number appeared to be uninteresting. But on the spot Ramanujan replied: Not at all, it is a very interesting number. It is the smallest number that can be represented as the sum of two cubes in two ways:

$$1729 = 1^3 + 12^3 = 9^3 + 10^3$$

Hardy said that Ramanujan’s “ignorance was as remarkable as his knowledge” (a remark that Watson makes about Sherlock Holmes in Arthur Conan Doyle’s A Study in Scarlet), noting that while he knew an abundance of hitherto unknown and profound mathematical theorems, he was ignorant of Cauchy’s theorem, and had only the vaguest notion what a function of a complex variable might be, things already considered quite basic for mathematicians at the time.

One of the things Ramanujan is famous for is discovering new series that not only converge on $\pi$, but that do so much more rapidly than the series that were known before him. One of these is the following

$$\frac{1}{\pi} = \frac{\sqrt{8}}{99^2} \sum_{n=0}^{\infty} \frac{(4n)!}{(n!)^4} \times \frac{26390n + 1103}{396^{4n}}$$
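To get a feel for how quickly this series closes in on $\pi$, here is a short sketch. The function name `ramanujan_pi` is an assumption for illustration, and the use of ordinary floating-point numbers limits the comparison to about fifteen significant digits.

```python
import math

def ramanujan_pi(terms):
    """Estimate pi from the first `terms` terms of Ramanujan's series for 1/pi."""
    total = 0.0
    for n in range(terms):
        total += (math.factorial(4 * n) / math.factorial(n) ** 4
                  * (26390 * n + 1103) / 396 ** (4 * n))
    one_over_pi = math.sqrt(8) / 99 ** 2 * total
    return 1 / one_over_pi

for n in (1, 2, 3):
    estimate = ramanujan_pi(n)
    print(n, estimate, abs(estimate - math.pi))
```

Even the single term for $n = 0$ already agrees with $\pi$ to several decimal places, in sharp contrast with the hundreds of terms the Gregory series needs for two.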

In 1995, Simon Plouffe discovered another very rapidly converging series for $\pi$, known as the BBP formula,⁴

$$\pi = \sum_{n=0}^{\infty} \left[\frac{4}{8n + 1} - \frac{2}{8n + 4} - \frac{1}{8n + 5} - \frac{1}{8n + 6}\right]\left[\frac{1}{16}\right]^n$$
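The BBP formula can be summed term by term in the same way. This sketch simply adds the terms in floating point; it does not attempt the digit-extraction use of the formula, and `bbp_pi` is a hypothetical helper name.

```python
import math

def bbp_pi(terms):
    """Add up the first `terms` terms of the BBP series for pi."""
    total = 0.0
    for n in range(terms):
        bracket = (4 / (8 * n + 1) - 2 / (8 * n + 4)
                   - 1 / (8 * n + 5) - 1 / (8 * n + 6))
        total += bracket / 16 ** n
    return total

for n in (1, 4, 12):
    estimate = bbp_pi(n)
    print(n, estimate, abs(estimate - math.pi))
```

Because each term carries a further factor of $\frac{1}{16}$, every additional term contributes roughly one more correct hexadecimal digit.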

Other formulas for $\pi$ take the form of integrals, such as the following one which was used in an exam at the University of Sydney in 1960:

$$\pi = \frac{22}{7} - \int_0^1 \frac{x^4(1 - x)^4}{1 + x^2}\, dx$$
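The integral formula can be checked by approximating the integral numerically. The sketch below uses composite Simpson's rule, which is a technique chosen here for convenience and not one taken from the text; `integrand` and `simpson` are hypothetical helper names.

```python
import math

def integrand(x):
    return x ** 4 * (1 - x) ** 4 / (1 + x ** 2)

def simpson(f, a, b, intervals=1000):
    """Composite Simpson's rule; `intervals` must be even."""
    step = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

integral = simpson(integrand, 0.0, 1.0)
print(integral, 22 / 7 - math.pi)
```

Since the integrand is never negative on the interval from 0 to 1, the integral is positive, which shows in passing that $\frac{22}{7}$ is slightly larger than $\pi$.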

4 For Bailey-Borwein-Plouffe. The formula is actually a digit-extraction algorithm for 𝜋 in base 16.


An interesting infinite product formula due to Euler relates $\pi$ to the $n$th prime number, $p_n$, like this:

$$\pi = \frac{2}{\prod_{n=1}^{\infty} \left[1 + \frac{\sin\left(\frac{1}{2}\pi p_n\right)}{p_n}\right]}$$

The notation in the denominator is for a product (hence the Π, capital pi, as in “p” for “product”) multiplying an infinity of terms like the one in the brackets, so that 𝑛 ranges in value from 1 to infinity. So 𝜋 is not just about circles and spheres, but also about odds, evens, squares, and primes. This thing seems to have strong ties to the numbers.

An attractive expression for $\pi$ is this infinite product over an infinite sum:

$$\pi = \frac{\prod_{n=1}^{\infty} \left(1 + \frac{1}{4n^2 - 1}\right)}{\sum_{n=1}^{\infty} \left(\frac{1}{4n^2 - 1}\right)}$$

A very fast method for calculating digits of $\pi$ was published in 1989 by two brothers, both born in Kiev, David Volfovich Chudnovsky (b. 1947) and Gregory Volfovich Chudnovsky (b. 1952). The Chudnovsky brothers became American mathematicians, and they are renowned for their world-record mathematical calculations, their home-made supercomputers, and their close working relationship. (Gregory is regarded as one of the best living mathematicians, but he suffers from a condition called myasthenia gravis, and his brother David assists him.) Their algorithm was used in several world record calculations of the digits of $\pi$; for example, in 2013 it was used to calculate 12.1 trillion digits. The formula is:

$$\frac{1}{\pi} = 12 \sum_{k=0}^{\infty} \frac{(-1)^k (6k)!\, (545140134k + 13591409)}{(3k)!\, (k!)^3\, (640320^3)^{k + 1/2}}$$
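A few terms of the Chudnovsky formula already exceed what ordinary floating-point numbers can hold, so the sketch below uses Python's `decimal` module for extra digits. It is a plain term-by-term summation for illustration, not the optimized method used in record calculations; `chudnovsky_pi` and its `digits` parameter are names assumed here.

```python
from decimal import Decimal, getcontext
from math import factorial

def chudnovsky_pi(terms, digits=50):
    """Estimate pi from the first `terms` terms of the Chudnovsky series."""
    getcontext().prec = digits + 10  # note: this changes the global decimal precision
    total = Decimal(0)
    for k in range(terms):
        numerator = (-1) ** k * factorial(6 * k) * (545140134 * k + 13591409)
        denominator = factorial(3 * k) * factorial(k) ** 3 * 640320 ** (3 * k)
        total += Decimal(numerator) / Decimal(denominator)
    # (640320^3)^(k + 1/2) = 640320^(3k) * sqrt(640320^3); the square root is applied once here.
    root = (Decimal(640320) ** 3).sqrt()
    return root / (12 * total)

for n in (1, 2, 3):
    print(n, chudnovsky_pi(n))
```

Each additional term adds roughly fourteen more correct decimal digits, which is why so few terms go such a long way.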

And there are many, many, many more series for 𝜋.


4 The Taylor Series

The Maclaurin series is extremely useful for calculating the values of certain non-algebraic functions to any desired degree of precision, and it is also useful for turning non-algebraic functions into operable polynomial expressions, so that we can more easily prove things about them and see relationships among them. There are, however, certain limitations to the Maclaurin Series. The Maclaurin Series requires us to know the value of the function (whose series-expression we wish to find) at zero and all its derivatives at zero. But there can be functions whose value at zero is unknown, or worse, functions that simply have no value at zero, such as $y = \frac{1}{x}$.

To overcome this limitation, there is another series called the Taylor Series (due to the work of the English mathematician Brook Taylor in 1715). In order to derive the Taylor Series, we may begin by considering a function that the Maclaurin Series can’t handle, such as $y = \frac{1}{x}$. The Maclaurin Series would work just fine if the undefined point occurred anywhere else than at $x = 0$. So all we need to do is shift the y-axis a bit and redescribe our function from our new y-axis.


Produce a new y-axis that is shifted over to the right by some distance $h$ along the x-axis. Call this new origin point $0_s$, where the subscript indicates that this is not our original origin, but a new, shifted-over one. Taking any point along the x-axis, we can now restate its value in relation to this new y-axis. Let’s designate an x-value that is specified in relation to the shifted y-axis by the expression $x_s$.

Thus $0_s = h$ and $x_s = x - h$.

Knowing these relationships, we can now redescribe the original function in terms of our new, shifted y-axis. Let’s call this new function, described by the shifted y-axis, $f_s(x_s)$. Now we know that

$$f(x) = f_s(x_s), \qquad f'(x) = f_s'(x_s), \qquad f''(x) = f_s''(x_s)$$

and so on, since the y-values on the left sides are exactly the same as the corresponding ones on the right, just differently described. Now we just write the Maclaurin Series for $f_s(x_s)$,

$$f_s(x_s) = \frac{f_s(0_s)\,x_s^0}{0!} + \frac{f_s'(0_s)\,x_s^1}{1!} + \frac{f_s''(0_s)\,x_s^2}{2!} + \frac{f_s'''(0_s)\,x_s^3}{3!} + \dots$$

But we already said that $f_s(x_s) = f(x)$ and $x_s = x - h$ and $0_s = h$, so we can now substitute equals for equals, giving us

$$f(x) = \frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \frac{f''(h)(x-h)^2}{2!} + \frac{f'''(h)(x-h)^3}{3!} + \dots$$

And this is the general form of the Taylor Series. We can choose an arbitrary ℎ to use (with certain restrictions, as we shall soon see), but should choose one such

that we know the value of the function and of its derivatives at that point. If 0 works, it is usually easiest to use that, and in that case the Taylor Series collapses

into the Maclaurin Series. If 0 does not work, 1 is often a good choice.


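EXERCISE: FIND THE TAYLOR SERIES FOR $y = \frac{1}{x}$

Start by setting it equal to the general form of the Taylor Series:

$$\frac{1}{x} = \frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \frac{f''(h)(x-h)^2}{2!} + \frac{f'''(h)(x-h)^3}{3!} + \dots$$

Now take all the derivatives of $\frac{1}{x}$, that is, of $x^{-1}$,

$$f(x) = x^{-1} = \frac{1}{x}$$

$$f'(x) = (-1)x^{-2} = -(1!)x^{-2} = -\frac{1!}{x^2}$$

$$f''(x) = (-1)(-2)x^{-3} = (2!)x^{-3} = +\frac{2!}{x^3}$$

$$f'''(x) = (-1)(-2)(-3)x^{-4} = -(3!)x^{-4} = -\frac{3!}{x^4}$$

Therefore

$$f(h) = \frac{1}{h}, \qquad f'(h) = -\frac{1!}{h^2}, \qquad f''(h) = +\frac{2!}{h^3}, \qquad f'''(h) = -\frac{3!}{h^4}$$

So

$$\frac{1}{x} = \frac{\left(\frac{1}{h}\right)(x-h)^0}{0!} - \frac{\left(\frac{1!}{h^2}\right)(x-h)^1}{1!} + \frac{\left(\frac{2!}{h^3}\right)(x-h)^2}{2!} - \frac{\left(\frac{3!}{h^4}\right)(x-h)^3}{3!} + \dots$$

or

$$\frac{1}{x} = \frac{(x-h)^0}{h^1} - \frac{(x-h)^1}{h^2} + \frac{(x-h)^2}{h^3} - \frac{(x-h)^3}{h^4} + \dots$$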

and there it is.

QUESTION: We saw that some odd results come about if we recklessly use Newton’s Binomial Division to develop series expressions for certain familiar numbers and fractions. Here is another strange result, now, if we evaluate $\frac{1}{3}$ by means of the Taylor Series expansion of $\frac{1}{x}$, that is, if we set $x = 3$ and $h = 1$ (we get to choose $h$, right?):

$$\frac{1}{3} = (3 - 1)^0 - (3 - 1)^1 + (3 - 1)^2 - (3 - 1)^3 + \dots$$


or

$$\frac{1}{3} = 2^0 - 2^1 + 2^2 - 2^3 + 2^4 - 2^5 + \dots$$

The difference between $\frac{1}{3}$ and the sum on the right is increasing rather than decreasing the more terms we take. Should we say, then, that the expansion works for absolute values of $x$ that are less than 1, but not otherwise? Or shall we say that the expansion will work for any value of $x$, but certain values of $x$ require that we choose a value of $h$ such that the series will approach $\frac{1}{x}$? If we let $x = 3$, that places certain restrictions on what we can let $h$ be. If we let $h = 1$, that way lies madness, since it removes the powers of $h$ from the denominators in the series, allowing the series to be composed of ever larger whole numbers with alternating signs, whose partial sums swing more and more widely and do not approach any limit. But if we let $h = 2$, we get

$$\frac{1}{3} = \frac{(3-2)^0}{2^1} - \frac{(3-2)^1}{2^2} + \frac{(3-2)^2}{2^3} - \frac{(3-2)^3}{2^4} + \dots$$

or

$$\frac{1}{3} = \frac{1}{2^1} - \frac{1}{2^2} + \frac{1}{2^3} - \frac{1}{2^4} + \dots$$

and if we calculate this up to $\frac{1}{2^{10}}$, we get $\frac{1}{3} \approx 0.333007812$, which is pretty good.
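The contrast between the two choices of $h$ can be reproduced with exact fractions. This is a minimal sketch; `reciprocal_taylor` is a hypothetical helper name, and exact `Fraction` arithmetic is used so that no rounding clouds the comparison.

```python
from fractions import Fraction

def reciprocal_taylor(x, h, terms):
    """Partial sum of 1/h - (x-h)/h^2 + (x-h)^2/h^3 - ... with `terms` terms."""
    x, h = Fraction(x), Fraction(h)
    return sum((-1) ** k * (x - h) ** k / h ** (k + 1) for k in range(terms))

for h in (1, 2):
    sums = [float(reciprocal_taylor(3, h, n)) for n in (2, 4, 6, 10)]
    print(f"h = {h}: partial sums after 2, 4, 6, 10 terms: {sums}")
```

With $h = 1$ the partial sums run off in both directions; with $h = 2$ the first ten terms give exactly $\frac{341}{1024}$, which is the decimal value quoted above.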


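EXERCISE: FIND THE TAYLOR SERIES FOR $y = \log_e x$

(Question: Can we use the Maclaurin Series for this?)

Again, we start by setting this equal to the general form for the Taylor Series:

$$\log_e x = \frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \frac{f''(h)(x-h)^2}{2!} + \frac{f'''(h)(x-h)^3}{3!} + \dots$$

Now we take the derivatives of $\log_e x$:

$$f(x) = \log_e x \quad \text{so} \quad f(h) = \log_e h$$

$$f'(x) = x^{-1} = \frac{1}{x} \quad \text{so} \quad f'(h) = \frac{1}{h}$$

$$f''(x) = -x^{-2} = -\frac{1!}{x^2} \quad \text{so} \quad f''(h) = -\frac{1!}{h^2}$$

$$f'''(x) = 2x^{-3} = +\frac{2!}{x^3} \quad \text{so} \quad f'''(h) = +\frac{2!}{h^3}$$

$$f''''(x) = -6x^{-4} = -\frac{3!}{x^4} \quad \text{so} \quad f''''(h) = -\frac{3!}{h^4} \quad \text{etc.}$$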

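So now we have the Taylor expansion of $\log_e x$, or of $\ln x$,

$$\log_e x = \frac{(\log_e h)(x-h)^0}{0!} + \frac{\left(\frac{1}{h}\right)(x-h)^1}{1!} - \frac{\left(\frac{1!}{h^2}\right)(x-h)^2}{2!} + \frac{\left(\frac{2!}{h^3}\right)(x-h)^3}{3!} - \frac{\left(\frac{3!}{h^4}\right)(x-h)^4}{4!} + \cdots$$

We get to choose the constant $h$, so if we let it be 1, we get

$$\log_e x = 0 + \frac{1(x-1)^1}{1!} - \frac{1!(x-1)^2}{2!} + \frac{2!(x-1)^3}{3!} - \frac{3!(x-1)^4}{4!} + \cdots$$

or

$$\log_e x = \frac{0!(x-1)^1}{1!} - \frac{1!(x-1)^2}{2!} + \frac{2!(x-1)^3}{3!} - \frac{3!(x-1)^4}{4!} + \cdots$$

In each term we have $\frac{(n-1)!}{n!}$, which is always equal to just $\frac{1}{n}$.

Consequently

$$\log_e x = \frac{(x-1)^1}{1} - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \cdots$$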
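The series just obtained can be compared with the logarithm supplied by Python's `math` module. The sketch below is illustrative only; `log_taylor` is a hypothetical helper name, and the sample values of $x$ are chosen here to show one case close to 1 and one case far from it.

```python
import math

def log_taylor(x, terms):
    """Partial sum of (x-1) - (x-1)^2/2 + (x-1)^3/3 - ... with `terms` terms."""
    return sum((-1) ** (n + 1) * (x - 1) ** n / n for n in range(1, terms + 1))

for x in (1.5, 3.0):
    sums = [log_taylor(x, n) for n in (5, 10, 20)]
    print(f"x = {x}: ln x = {math.log(x)}, partial sums after 5, 10, 20 terms: {sums}")
```

For $x = 1.5$ the partial sums settle onto $\ln 1.5$; for $x = 3$, with $h$ still equal to 1, they grow wildly, much as happened with $\frac{1}{x}$ above.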


DERIVATION OF THE TAYLOR THEOREM

So far, we have derived the Taylor series from the Maclaurin series, which we got from making the assumption that any function can be expressed as a polynomial, at least an infinite one. Perhaps that is a reasonable assumption, and induction in all cases that we can check certainly verifies it. But can we prove that every function can be expressed in the form of the Taylor series?

We can. The first step is to note that by “every function” we mean “every continuously differentiable function,” that is, every function $f(x)$ for which the derivative $f'(x)$ exists and is itself a continuous function, or at least we will be considering those parts of a function that are continuous over a certain interval.

The next step is to justify a commonly used technique in calculus called integration by parts. The justification of this formula will stand as a lemma to the proof of the Taylor Theorem we aim to derive.

LEMMA: INTEGRATION BY PARTS

If $u(x)$ and $v(x)$ are two continuously differentiable functions, then

$$\int v \frac{du}{dx}\,dx = uv - \int u \frac{dv}{dx}\,dx$$

For

$$\frac{d}{dx}(uv) = v\frac{du}{dx} + u\frac{dv}{dx} \qquad \text{[product rule]}$$

Now we integrate both sides of that with respect to $x$:

$$\int \frac{d}{dx}(uv)\,dx = \int v\frac{du}{dx}\,dx + \int u\frac{dv}{dx}\,dx$$

(Note that we assumed on the right side that the sum of the integrals of the two functions was the same as the integral of the sum of the two functions; is that reasonable? Can we justify that with a diagram?)

Apply the Fundamental Theorem of Calculus on the left side (i.e., recall that the integral of the derivative is the original function):

$$uv = \int v\frac{du}{dx}\,dx + \int u\frac{dv}{dx}\,dx$$

so

$$\int v\frac{du}{dx}\,dx = uv - \int u\frac{dv}{dx}\,dx \qquad \text{Q.E.D.}$$

Sometimes this is written more simply, as

$$\int v \cdot u'\,dx = uv - \int u \cdot v'\,dx$$

or more explicitly, evaluating over an interval $[a, b]$,

$$\int_a^b v\frac{du}{dx}\,dx = [uv]_a^b - \int_a^b u\frac{dv}{dx}\,dx$$


THE TAYLOR THEOREM

This theorem states that for any function $f(x)$ that can be continuously differentiated $n$ times,

$$f(x) = \frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \frac{f''(h)(x-h)^2}{2!} + \cdots + \frac{f^{(n-1)}(h)(x-h)^{n-1}}{(n-1)!} + R_n$$

where $h$ is a constant of our choosing, and

$$R_n = \int_h^x \frac{(x-t)^{n-1}}{(n-1)!}\, f^{(n)}(t)\,dt$$

To begin proving this, we first introduce a new function, $f(t)$, which is the same as $f(x)$, except that its independent variable is $t$, rather than $x$, which will also give meaning to $f'(t)$, the derivative of $f(t)$. Then $x$ will be a constant so far as this new function is concerned (just as $x$ is a constant so far as $\Delta x$ is concerned when we evaluate such expressions as $\lim_{\Delta x \to 0} (x + \Delta x)$, even though $x$ is a variable in another function). So we may now consider $x$, and some other value, $h$, as the bounds of integration. Accordingly, the Fundamental Theorem of Calculus says

$$\int_h^x f'(t)\,dt = f(x) - f(h) \tag{1}$$

and therefore

$$f(x) = f(h) + \int_h^x f'(t)\,dt \tag{2}$$

or

$$f(x) = f(h) + \int_h^x u \cdot \frac{dt}{dt}\,dt$$

where $u = f'(t)$, and $\frac{dt}{dt}$ is just the derivative of $t$ with respect to itself, which is just 1.

Now, if we let $v = t + (\text{a constant})$, then $\frac{dv}{dt} = \frac{dt}{dt} = 1$, so

$$f(x) = f(h) + \int_h^x u \cdot \frac{dv}{dt}\,dt \tag{3}$$

And since $x$ is a constant so far as $t$ is concerned, therefore we can let the constant defining $v$ be $-x$, that is, we can let $v = t - x$. The motivation for this choice will become clear below.

Now the formula for integration by parts, after we adapt the terms in it to fit our current situation, says

$$\int_h^x u\frac{dv}{dt}\,dt = [vu]_h^x - \int_h^x v\frac{du}{dt}\,dt$$

Applying this to the integral in (3) gives

$$f(x) = f(h) + [vu]_h^x - \int_h^x v\frac{du}{dt}\,dt \tag{4}$$


$$[vu]_h^x = (\text{value of } vu \text{ at } t = x) - (\text{value of } vu \text{ at } t = h) \tag{5}$$

so

$$[vu]_h^x = (x - x) \cdot f'(x) - (h - x) \cdot f'(h)$$

Of course $(x - x) = 0$, so the first term vanishes (this was our motivation for letting $v = t - x$), leaving us with

$$[vu]_h^x = -(h - x)\, f'(h)$$

Substituting this last expression for $[vu]_h^x$ into (4) gives

$$f(x) = f(h) - f'(h)(h - x) - \int_h^x v\frac{du}{dt}\,dt \tag{6}$$

Supplying the meanings of $v$ and $u$ in that remaining integral, we have

$$f(x) = f(h) - f'(h)(h - x) - \int_h^x f''(t)(t - x)\,dt \tag{7}$$

Or, distributing the $-1$ into $(h - x)$ on the right side, and factoring out $-1$ from the integral,

$$f(x) = f(h) + f'(h)(x - h) + \int_h^x f''(t)(x - t)\,dt$$

Now we can just reapply the formula for integration by parts to that new remaining integral, this time letting $u = f''(t)$ and $\frac{dv}{dt} = (x - t)$ (so that $v = -\frac{1}{2}(x - t)^2$), which will give us

$$\int_h^x f''(t)(x - t)\,dt = f''(h) \cdot \frac{1}{2}(x - h)^2 + \int_h^x \frac{1}{2} f^{(3)}(t)(x - t)^2\,dt \tag{8}$$

so that

$$f(x) = f(h) + f'(h)(x - h) + f''(h) \cdot \frac{1}{2}(x - h)^2 + \int_h^x f^{(3)}(t)\,\frac{1}{2}(x - t)^2\,dt$$

Again applying the formula for integration by parts to this new remaining integral and this time letting $u = f^{(3)}(t)$, and $\frac{dv}{dt} = \frac{1}{2}(x - t)^2$ (so that $v = -\frac{1}{3!}(x - t)^3$), we get

$$\int_h^x f^{(3)}(t)\,\frac{1}{2}(x - t)^2\,dt = \frac{(x-h)^3}{3!} f^{(3)}(h) + \int_h^x \frac{(x-t)^3}{3!} f^{(4)}(t)\,dt \tag{9}$$

so that

$$f(x) = \frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \frac{f^{(2)}(h)(x-h)^2}{2!} + \frac{f^{(3)}(h)(x-h)^3}{3!} + \int_h^x \frac{(x-t)^3}{3!} f^{(4)}(t)\,dt$$

Continuing in this way, we see the pattern emerge:


$$f(x) = \frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \cdots + \frac{f^{(n-1)}(h)(x-h)^{n-1}}{(n-1)!} + \int_h^x \frac{(x-t)^{n-1}}{(n-1)!}\, f^{(n)}(t)\,dt$$

where $n$ is the number of times we integrated by parts (and also the number of terms, at any stage in this process, prior to the remaining integral). Q.E.D.

COROLLARY: From this theorem it immediately follows that whenever $R_n$, the remaining integral at the end of some such series, approaches zero as a limit as $n$ goes to infinity, then the series converges on function $f(x)$. That is,

$$f(x) = \lim_{n \to \infty} \left[\frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \dots + \frac{f^{(n)}(h)(x-h)^n}{n!}\right]$$

which is what we really wanted to know. But when will it be the case that $\lim_{n \to \infty} |R_n| = 0$?

To answer that, let’s reexamine the expression for the remainder $R_n$:

$$R_n = \int_h^x \frac{(x-t)^{n-1}}{(n-1)!}\, f^{(n)}(t)\,dt$$

As long as the derivatives stay bounded, that is, as long as there is some finite value that $|f^{(n)}(t)|$ never exceeds for any $n$ and for any $t$ between $h$ and $x$, then $R_n$ will tend to zero as $n$ goes to infinity. This is because the other part of the integrand, namely

$$\frac{(x-t)^{n-1}}{(n-1)!}$$

tends to zero as $n$ goes to infinity. To see this, consider that the integral is evaluated from $h$ to $x$, so $|x - t|$ is never greater than $|x - h|$, and really we are just asking about what happens to

$$\frac{(x-h)^{n-1}}{(n-1)!}$$

as $n$ goes to infinity. More simply, we are asking about what happens to

$$\frac{(x-h)^n}{n!}$$

as $n$ goes to infinity. And since $x$ and $h$ are just constants, so far as this integral is concerned, we are really just asking about what happens to

$$\frac{a^n}{n!}$$


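as $n$ goes to infinity. But this is clear, since

$$\frac{a^n}{n!} = \frac{a}{1} \cdot \frac{a}{2} \cdot \frac{a}{3} \cdot \frac{a}{4} \cdot \frac{a}{5} \cdot \frac{a}{6} \cdot \frac{a}{7} \cdot \frac{a}{8} \cdot \frac{a}{9} \cdot \frac{a}{10} \cdots$$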

Eventually, after a finite number of terms, we will come to a fraction in which the denominator is greater than 𝑎. Forever after that, the denominators will all be greater than 𝑎. So the fractions in which 𝑎 is greater than the denominator will

eventually be outnumbered by subsequent fractions in which 𝑎 is less than the denominator, and the total product up to that point will become equal to or less

than one. Forever after that, the denominator will get to exceed the constant 𝑎 by more than any assigned amount, and therefore the whole product (the whole

fraction) will come to be as little as you please, that is, it will approach zero as 𝑛 goes to infinity. Therefore, as long as 𝑓𝑛(𝑡) has a finite maximum value as 𝑛 goes to infinity, then

𝑅𝑛 will tend to zero as 𝑛 goes to infinity, and consequently

$$f(x) = \lim_{n \to \infty} \left[\frac{f(h)(x-h)^0}{0!} + \frac{f'(h)(x-h)^1}{1!} + \cdots + \frac{f^{(n)}(h)(x-h)^n}{n!}\right]$$
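The claim that $\frac{a^n}{n!}$ eventually shrinks toward zero, however large the constant $a$ may be, can be watched numerically. The sketch below builds the fraction as the running product $\frac{a}{1} \cdot \frac{a}{2} \cdots \frac{a}{n}$ described above; the helper name `power_over_factorial` and the sample value $a = 10$ are assumptions made for illustration.

```python
import math

def power_over_factorial(a, n):
    """Compute a^n / n! as the running product (a/1)(a/2)...(a/n)."""
    product = 1.0
    for k in range(1, n + 1):
        product *= a / k
    return product

for n in (5, 10, 20, 40, 80):
    print(n, power_over_factorial(10.0, n))
```

The product climbs for as long as the denominators are smaller than $a$, peaks once they catch up with it, and then collapses toward zero.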

QUESTION: We just argued that if $f^{(n)}(t)$ has a finite maximum value as $n$ goes to infinity, then $R_n$ will tend to zero as $n$ goes to infinity. Can we also say that if $f^{(n)}(t)$ does not have a finite maximum value as $n$ goes to infinity, then $R_n$ will not tend to zero as $n$ goes to infinity? Could it happen, for example, that $f^{(n)}(t)$ grows without bound as $n$ goes to infinity, but $\frac{(x-t)^{n-1}}{(n-1)!}$ tends to zero so much more rapidly than $f^{(n)}(t)$ grows that $R_n$ will still tend to zero as $n$ goes to infinity?


5

Introduction to 𝒊

Although we will not be leaving $\pi$ entirely behind, we will now begin considering another number of fundamental importance in modern number theory, namely $\sqrt{-1}$, the symbol for which is $i$ (as in “imaginary”). In fact, in the next section we will discover a tantalizing connection between these two numbers.

How did the number $i$ first gain the attention of mathematicians? It occurred in the attempt to solve various algebraic equations. Some of the roots of many equations turned out to be what Descartes called imaginary, that is, they turned out to be the square roots of negative quantities. That is deeply disturbing, if such things are purely meaningless fictions. Why should meaningful equations involve meaningless solutions? Consider, for example, the polynomial equation

$$x^3 - 3x^2 + 2x - 6 = 0$$

The solution of cubics is itself an important part of equation theory, but rather than get into all that, let’s note that this one is relatively easy to factor, since

$$x^3 - 3x^2 + 2x - 6 = (x^2 + 2)(x - 3)$$

which is easily verified by “foiling.” Looking at the second bracketed factor on the right, if $x = 3$, then the whole business equals zero, which is another way of saying that 3 is a root of this polynomial. Are there any other roots of this equation? Any other x-values that would make the whole thing turn to zero? Nothing but 3 can turn $(x - 3)$ to zero. But what about the other bracketed factor, $(x^2 + 2)$? Will any x-value turn that to zero? To find out, we just set it equal to zero and solve for $x$:


$$x^2 + 2 = 0$$

$$x^2 = -2$$

$$x = \pm\sqrt{-2}$$

So here we have two more “roots” of the original equation, and indeed if we write

$$(x - 3)(x - \sqrt{-2})(x + \sqrt{-2})$$

and multiply the whole business out, we will get back our original polynomial. The number $\sqrt{-2}$ is the same as $\sqrt{(2)(-1)}$, so if the ordinary algebra applies at all to this strange creature, then $\sqrt{-2} = \sqrt{2} \cdot \sqrt{-1}$. So all such “imaginary numbers” involve $\sqrt{-1}$ or $i$.

Can we simply ignore these troubling numbers as meaningless things? Every instinct of the modern mathematical mind is against that kind of response. For one thing, these numbers keep turning up among solutions to meaningful equations. For another, as you see in the example above, if we allow these weird imaginaries to be called roots of an equation, then the number of roots to every polynomial equation will be exactly the same as the number of its highest power or degree (three, in the example). That is too beautiful and simple a rule to ignore. It is as if the equations are trying to tell us something, that the roots on the x-axis are not the only roots of an algebraic equation, and it was somehow arbitrary and partial of us to assume so. Where, then, do these other roots exist?

Before coming back to that question, let’s read what Leonhard Euler has to say about imaginaries, since he will provide us with more thorough and general evidence that the number of roots to a polynomial equation will always be the same as the number of its highest power, but only so long as we count imaginaries as roots.
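The three roots found in the cubic example above can be substituted back into the polynomial mechanically. This sketch assumes that Python's built-in complex numbers (via the standard `cmath` module) behave according to the ordinary algebra applied to $\sqrt{-1}$, which is exactly the supposition the text is entertaining; the function name `p` is chosen here for illustration.

```python
import cmath

def p(x):
    """The example cubic x^3 - 3x^2 + 2x - 6."""
    return x ** 3 - 3 * x ** 2 + 2 * x - 6

# cmath.sqrt(-2) returns the imaginary number written in the text as the square root of -2.
roots = [3, cmath.sqrt(-2), -cmath.sqrt(-2)]
for root in roots:
    print(root, abs(p(root)))
```

Each printed magnitude is zero, or within rounding error of zero, so all three values, the real one and the two imaginary ones, satisfy the equation.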


RESEARCHES ON THE IMAGINARY ROOTS OF EQUATIONS⁵

Leonhard Euler

1. Every algebraic equation being delivered from fractions and radical signs always reduces to this general form:

$$x^n + Ax^{n-1} + Bx^{n-2} + Cx^{n-3} + Dx^{n-4} + \cdots + N$$

where the letters $A, B, C, D, \dots, N$ mark constant real quantities, either positive or negative, without excluding zero. The roots of such an equation are the values which, being put for $x$, produce an identical equation $0 = 0$. Now if $x + \alpha$ is a divisor or a factor of the given formula, the other factor being indicated by $X$, so that the equation has this form $(x + \alpha)X = 0$, it is clear that this happens if $x + \alpha = 0$, or $x = -\alpha$. From this we see that the roots of an equation are found by seeking the divisors or factors of this same equation; and all the roots of an equation are derived from all the simple divisors of the form $x + \alpha$.

2. Thus, to find all the roots of a given equation, we only have to seek all the simple factors of the quantity: $x^n + Ax^{n-1} + Bx^{n-2} + Cx^{n-3} + Dx^{n-4} + \cdots + N$; and if we lay down these factors:

$$(x + \alpha)(x + \beta)(x + \gamma)(x + \delta) \ \text{etc.}$$

it is immediately clear that the number of these factors must be equal to the exponent $n$; and therefore the number of all the roots, which will be:

$$x = -\alpha, \quad x = -\beta, \quad x = -\gamma, \quad x = -\delta \quad \text{etc.,}$$

5 This is a translation of a selection from Euler’s Recherches sur les Racines Imaginaires des Equations, 1751. The selection was translated in 2011 by Ronald J. Richard, and is meant for use by the students and faculty of Thomas Aquinas College, Santa Paula, California and St. John’s College, Annapolis, Maryland and Santa Fe, New Mexico. This edition introduces minor alterations to that translation.


will also equal this same exponent 𝑛, since a product such as (𝑥 + α)(𝑥 + β)(𝑥 + γ)(𝑥 + δ) etc. cannot become equal to zero unless one of its factors vanishes. Every equation, then, of any degree, will always have as many roots as the exponent of its highest power contains units.

3. Now it very often happens that not all of these roots are real quantities, and that some, or perhaps all, are imaginary quantities. We call a quantity imaginary which is neither greater than zero, nor less than zero, nor equal to zero; so this will be something impossible, as for example √−1, or in general 𝑎 + 𝑏√−1, since such a quantity is neither positive, nor negative, nor zero. For example, this equation

$$x^3 - 3x^2 + 6x - 4 = 0$$

has these three roots,

$$x = 1, \quad x = 1 + \sqrt{-3}, \quad x = 1 - \sqrt{-3},$$

and the last two are imaginary, and there is only one real root, 𝑥 = 1. From this we see that if we want to include under the name of roots only those which are real, their number would often be much smaller than the highest exponent in the equation. And therefore when we say that every equation has as many roots as the exponent of its degree indicates, this must mean all the roots, both real and imaginary.

4. We, therefore, understand that whatever the degree of the given equation,

$$x^n + Ax^{n-1} + Bx^{n-2} + Cx^{n-3} + Dx^{n-4} + \cdots + N = 0$$

it can always be represented by a form such as

$$(x + \alpha)(x + \beta)(x + \gamma)(x + \delta) \cdots (x + \nu) = 0$$

where the number of these simple factors would be = 𝑛. And since these factors being actually multiplied together must produce the given equation, it is evident that the quantities A, B, C, D, … N will be determined by the quantities α, β, γ, δ, … , ν in such a way that:

A = the sum of these quantities α, β, γ, δ, … , ν

B = the sum of all their products taken two at a time

C = the sum of all their products taken three at a time

D = the sum of all their products taken four at a time

. . .

N = the product of all taken together, α, β, γ, δ, … , ν.
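The rule just stated can be tried out numerically. What follows is only a minimal Python sketch, not part of Euler's text; the function names `coefficients_from_alphas` and `evaluate` are our own inventions. It builds A, B, …, N from the quantities α, β, γ for the cubic of paragraph 3 and checks that each stated root makes the equation vanish.

```python
from itertools import combinations
from math import prod

def coefficients_from_alphas(alphas):
    """Return [A, B, ..., N]: the k-th entry is the sum of all products
    of the alphas taken k at a time, for k = 1 .. n."""
    n = len(alphas)
    return [sum(prod(c) for c in combinations(alphas, k)) for k in range(1, n + 1)]

def evaluate(coeffs, x):
    """Evaluate x^n + A x^(n-1) + ... + N by Horner's rule."""
    value = 1
    for c in coeffs:
        value = value * x + c
    return value

# The roots of x^3 - 3x^2 + 6x - 4 = 0 given in the text.
# Each root is x = -alpha, so the alphas are the negatives of the roots.
roots = [1, 1 + 3 ** 0.5 * 1j, 1 - 3 ** 0.5 * 1j]
alphas = [-r for r in roots]
A, B, N = coefficients_from_alphas(alphas)
```

Up to rounding, A, B and N come out as −3, 6 and −4, the coefficients of the given cubic, even though two of the three quantities are imaginary.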


Now, since the number of these equalities = 𝑛, the values of the letters α, β, γ, δ, … , ν will, conversely, be determined.

5. Although it might seem that knowledge of the imaginary roots of an equation might not have any utility, seeing that they do not furnish solutions to any problems we might have, nevertheless, it is very important in all of analysis to render familiar the calculus of imaginary quantities. For, not only will we acquire a more perfect knowledge of the nature of equations, but the analysis of infinites will receive very considerable help ...

6. It is demonstrated in algebra that when an equation has imaginary roots, their number is always even, so that every equation either has no imaginary roots, or else has two, or four, or six, or eight, etc., and the number of all the imaginary roots of an equation can never be odd. But, moreover, we maintain that the imaginary roots so occur in pairs that both the sum and the product of the two become real.

Or, what amounts to the same thing, if 𝑥 + 𝑦√−1 is one of the imaginary factors of an equation, we maintain that there will always be found among the others a factor 𝑥 − 𝑦√−1, also imaginary, which, being multiplied by the former 𝑥 + 𝑦√−1, gives a real product. The product of 𝑥 + 𝑦√−1 by 𝑥 − 𝑦√−1 being = $x^2 + y^2$, and the sum being = 2𝑥, it is clear that both are real quantities.
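A short Python sketch can illustrate this pairing. It is our own illustration, with the hypothetical helper name `pair_sum_and_product`, and uses the pair 1 ± √−3 from paragraph 3 as sample values.

```python
def pair_sum_and_product(x, y):
    """Sum and product of the pair x + y*sqrt(-1) and x - y*sqrt(-1)."""
    first = complex(x, y)
    second = complex(x, -y)
    return first + second, first * second

# Sample pair: 1 + sqrt(-3) and 1 - sqrt(-3)
pair_sum, pair_product = pair_sum_and_product(1.0, 3 ** 0.5)
```

The sum has no imaginary part and equals 2𝑥 = 2, and the product likewise has no imaginary part and equals 𝑥² + 𝑦² = 4 up to rounding.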


6 Euler’s Identity

We have seen some reason for thinking that 𝑖 deserves to exist, to be admitted among legitimate numbers, since admitting its existence allows us to preserve the rule that every algebraic polynomial has precisely as many roots as the exponent of its highest power. Now we will see another reason we should give serious consideration to the legitimacy of 𝑖, and to what its nature might be: we will see that it has an astonishing relationship to other known numbers.

Without knowing exactly what 𝑖 really is, we at least have a kind of operational definition of it:

𝑖 = √−1

We can operate on this thing in ways analogous to other algebraic numbers. We will come to see better what this number means, or can mean, once we get to Wessel. For now, let’s just use it as consistently as we can and see some of the interesting places it takes us.


THE POWERS OF 𝒊

The powers of 𝑖, for example, are easy enough to see from its definition:

$$i^1 = i$$

$$i^2 = -1$$

$$i^3 = -i$$

$$i^4 = 1$$

$$i^5 = i$$

$$i^6 = -1$$

and so on.
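Since the powers repeat every four steps, any power of 𝑖 can be read off from the remainder of the exponent on division by 4. Here is a minimal Python sketch of that observation; the function name `power_of_i` is our own, and Python writes 𝑖 as `1j`.

```python
def power_of_i(n):
    """Return i**n for a non-negative integer n, using the four-step cycle."""
    cycle = [1, 1j, -1, -1j]  # i^0, i^1, i^2, i^3
    return cycle[n % 4]

# The first six powers, in the same order as the list above.
powers = [power_of_i(n) for n in range(1, 7)]
```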

THE MACLAURIN SERIES FOR 𝒊 𝐬𝐢𝐧 𝒙

We can even derive the Maclaurin Series for 𝑖 sin 𝑥, like this:

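$$i \sin x = i\left[\frac{x^1}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots\right]$$

so

$$i \sin x = \frac{ix^1}{1!} - \frac{ix^3}{3!} + \frac{ix^5}{5!} - \frac{ix^7}{7!} + \cdots$$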

That was easy!

THE MACLAURIN SERIES FOR $e^{ix}$

To construct the Maclaurin Series for $e^{ix}$, we need to find all the derivatives of $e^{ix}$.

If $y = e^{ix}$, then what is the derivative of this? We can use the Chain Rule to find out:

let $u = ix$

so $y = e^u$

Now

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = e^u \cdot i = ie^{ix}$$


So $y' = ie^{ix}$

There we have the first derivative. We can find the second derivative in the same way:

$y' = ie^{ix}$

let $u = ix$

so $y' = ie^u$

Now

$$\frac{dy'}{dx} = \frac{dy'}{du} \cdot \frac{du}{dx} = ie^u \cdot i = i^2 e^{ix}$$

So $y'' = i^2 e^{ix}$

There we have the second derivative. We can find the third derivative in the same way:

$y'' = i^2 e^{ix}$

let $u = ix$

so $y'' = i^2 e^u$

Now

$$\frac{dy''}{dx} = \frac{dy''}{du} \cdot \frac{du}{dx} = i^2 e^u \cdot i = i^3 e^{ix}$$

So $y''' = i^3 e^{ix}$

There we have the third derivative. And the pattern is clear. The 𝑛th derivative of $e^{ix}$ will be $i^n e^{ix}$. By our general Maclaurin formula, we know that

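$$e^{ix} = \frac{f(0)}{0!} + \frac{f'(0)x^1}{1!} + \frac{f''(0)x^2}{2!} + \frac{f'''(0)x^3}{3!} + \cdots$$

And from what we just did, we know the expressions for the derivatives, and so we have:

$$e^{ix} = \frac{e^{i \cdot 0}}{0!} + \frac{i^1 e^{i \cdot 0} x^1}{1!} + \frac{i^2 e^{i \cdot 0} x^2}{2!} + \frac{i^3 e^{i \cdot 0} x^3}{3!} + \cdots$$

or

$$e^{ix} = \frac{x^0}{0!} + \frac{ix^1}{1!} + \frac{i^2 x^2}{2!} + \frac{i^3 x^3}{3!} + \cdots$$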

Now we can replace the even powers of 𝑖 with 1 and −1, and the odd powers


of 𝑖 with 𝑖 and −𝑖, and we have

$$e^{ix} = \frac{x^0}{0!} + \frac{ix^1}{1!} - \frac{x^2}{2!} - \frac{ix^3}{3!} + \frac{x^4}{4!} + \frac{ix^5}{5!} - \frac{x^6}{6!} - \frac{ix^7}{7!} + \cdots$$

And there we have our Maclaurin series for $e^{ix}$.

EULER’S IDENTITY

In case you were wondering why we care about these particular formulas, you might begin to take an interest in this result if you recall that

$$\cos x = \frac{x^0}{0!} - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$

and

$$i \sin x = \frac{ix^1}{1!} - \frac{ix^3}{3!} + \frac{ix^5}{5!} - \frac{ix^7}{7!} + \cdots$$

from which two series it is clear that our new series for $e^{ix}$ is just the sum of these two! If we start with the first term from cos 𝑥, then add the first term from 𝑖 sin 𝑥, and keep going by alternating in this way, we have our series for $e^{ix}$. This means that

$$e^{ix} = \cos x + i \sin x$$

Wow! But wait, there’s more.
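If you would like to see this agreement in numbers, the following Python sketch adds up the first thirty terms of the series for 𝑒 raised to 𝑖𝑥 and compares the total with cos 𝑥 + 𝑖 sin 𝑥. It is only an illustration: the function name `maclaurin_exp_ix`, the number of terms, and the sample value 𝑥 = 1.2 are our own choices.

```python
import math
from math import factorial

def maclaurin_exp_ix(x, terms=30):
    """Partial sum of the series whose n-th term is (i^n * x^n) / n!."""
    return sum((1j ** n) * x ** n / factorial(n) for n in range(terms))

x = 1.2  # any real number of moderate size will do
series_value = maclaurin_exp_ix(x)
euler_value = math.cos(x) + 1j * math.sin(x)
difference = abs(series_value - euler_value)
```

The difference is at the level of floating-point rounding, which is what the equality of the two series leads us to expect.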

If we now let 𝑥 = 𝜋, we have

$$e^{i\pi} = \cos \pi + i \sin \pi$$

or

$$e^{i\pi} = -1 + i(0) = -1 + 0$$

so

$$e^{i\pi} + 1 = 0$$

an amazing formula correlating the five most fundamental constants in mathematics, which formula is known as Euler’s Identity. Its meaning is not exactly clear to us yet, though, since the meaning of 𝑖, and its meaning as an exponent, is still murky. Although Euler discovered this identity, he did not assign any clear meaning to 𝑖, as we saw in the selection from his researches in the last section. This intriguing relationship between 𝑖 and other more familiar numbers should motivate us to find a genuine and intelligible meaning for 𝑖. So let’s move on and see if we can get some help with that from a certain Caspar Wessel.
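Euler's Identity itself can be checked on a computer, with one caveat: a machine stores 𝜋 and the result only approximately, so we should expect a number extremely close to zero rather than zero exactly. A minimal Python sketch, using the standard `cmath` module (the variable name `residual` is ours):

```python
import cmath

# e^(i*pi) + 1, computed in floating-point arithmetic
residual = cmath.exp(1j * cmath.pi) + 1
size_of_residual = abs(residual)
```

The size of the residual is far smaller than any quantity we could measure, which is as close to 0 as floating-point arithmetic allows.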


7 Caspar Wessel’s Directional Algebra

It is high time for 𝑖 to acquire citizenship among numbers. For assistance in understanding this modern number, we will turn to Caspar Wessel (1745-1818), a Norwegian-Danish mathematician, land surveyor, and cartographer. He was the first person to give a sound geometrical interpretation of 𝑖 and other imaginary numbers, although this went unnoticed for almost a hundred years, since his original paper on the subject was written in Danish and published in a journal that got little circulation outside of Denmark. Independently of Wessel, Jean-Robert Argand (1768-1822), a Swiss-born French amateur mathematician, obtained the same results in 1806, and so did the great German mathematician Carl Friedrich Gauss (1777-1855) in 1831. It was Wessel’s work as a mapmaker that first inspired him to see whether rules such as those of ordinary algebra could be used to make calculations in terms that combined magnitude with direction. What follows is a translation of a selection from Wessel’s own original paper.


ON THE ANALYTICAL REPRESENTATION OF DIRECTIONS6

An Attempt Applied Chiefly to the Resolution of Plane and Spherical Polygons

Caspar Wessel, Surveyor

THE present attempt deals with the question of how direction should be described

analytically, or how straight lines should be expressed if, by means of a single equation in one unknown (and other given lines), an expression might be found representing both the unknown length and its direction.

So as to answer this question reasonably well, I lay a foundation with two propositions, which seem undeniable to me. The first is: that the change in direction produced by algebraic operations should be represented by their symbols. The second: direction is not a subject of algebra except insofar as it can be changed by algebraic operations. But if direction cannot be changed by these (at least according to the usual explanation), except to the opposite, i.e. from positive to negative, and conversely, then these two directions alone could be denoted in the familiar way, and the problem of contemplating the other directions would be unsolvable. Presumably, this is also why no one has thus occupied himself. Undoubtedly it has been held impermissible to change anything in the explanation of operations once agreed upon. And, on the other hand, there is no objection to this as long as the explanation is applied to quantities in general; but probably it should not be called impermissible in rare cases, where the nature of the quantities seems to invite a more precise determination of the operations and to allow a useful application of it, because in going from arithmetic to geometric analysis, or from operations with abstract numbers to those with straight lines, one meets quantities that might well permit the same relations, but also many more relations than the abstract numbers can have to each other.

6 The original work is Wessel’s Om Directionens analytiske Betegning, et Forsog, anvendt fornemmelig til plane og sphæriske Polygoners Opløsning, originally published in Archiv for Mathematik og Naturvidenskab, 1799. This selection from it was translated in 2011 by Ronald J. Richard, and is meant for use by the students and faculty of Thomas Aquinas College, Santa Paula, California and St. John’s College, Annapolis, Maryland and Santa Fe, New Mexico. This edition introduces minor alterations to that translation.


So, if one takes the operations in a digressive sense, and does not, as before, merely restrict them to be used with lines of the same or of opposite direction, but now extends their formerly restricted concept somewhat further, so that it becomes useful not only in the former cases, but also in infinitely many more cases; I say that, if one takes care with his freedom, and yet does not thereby violate the usual rules of operation, then he does not contradict the first teaching about numbers, but carries it out further, adapting it to the nature of the quantities and observing the rule of method that gradually makes a difficult learning understandable. Thus, it is not an unreasonable claim that operations used in geometry may be taken in a more digressive sense than the one given to them in the calculating art; one may also readily admit that in this way it may be possible to produce infinitely many changes in the directions of lines. But one thereby obtains precisely the result (as will subsequently be proved) not only that all impossible operations can be avoided, and that the paradoxical statement that the possible might sometimes be sought by impossible means can be illumined, but also that the direction of all lines in the same plane can be expressed just as analytically as their length, without the memory being burdened by new symbols or rules. Since it seems beyond doubt that the general validity of geometrical theorems is often easier to perceive when the direction is represented analytically and subjected to the algebraic rules of operation than when diagrams must be provided (and that only in some cases), therefore it also seems not only permissible, but even useful, to make use of operations that are extended to more lines than those of the same and of the opposite direction. Because of this I seek

1. First to determine the rules governing such operations;

2. Next, by means of a couple of examples, to display their application to lines in the same plane;

3. Thereafter, to determine the direction of lines in different planes by a new method of operation which is not algebraic;

4. Thereupon, with the assistance of this to discover solutions to plane and spherical polygons in general;

5. Lastly, to derive in the same manner the known formulas of spherical trigonometry.

This is the main content of the treatise. The occasion of it was that I sought a

method whereby impossible operations could be avoided, and when this had been found, I used the same method to become convinced of the generality of some known formulae. Mr. Tetens, Councillor of State, had the patience to read through these first investigations, and I owe to the encouragement, advice, and guidance of this renowned scholar that this composition now appears less imperfect and has been deemed worthy to be included in the writings of The Royal Academy of Sciences.


A Method of Propagating Other Straight Lines from Given Ones by Algebraic Operations, and Chiefly What Directions and Signs they Should Have

There are some homogeneous quantities which, when they are placed within the same subject, increase or decrease each other only by increment and decrement.

There are others that in the same circumstances may change each other in numerous other ways. Straight lines are of this latter kind.

Thus the distance of a point from a plane can change in countless ways, when the point describes a more or less inclined straight line outside the plane.

If this line is perpendicular, i.e., the point’s path makes a right angle with the axis of the plane, then the point remains in a parallel plane, and its path has no effect on its distance from the plane.

If the described line is indirect, i.e., it makes an oblique angle with the axis of the plane, then it contributes a smaller piece than its own length to the lengthening or shortening of the distance, and may increase or decrease the distance in infinitely many ways.

If the line is direct, i.e., in line with the distance, it assigns to or strips from the same its full length, and in the first case is positive, in the other, privative.

All the straight lines that can be described by a point are, therefore, with respect to their effect on the distance of the given point from a plane (i.e., a plane deployed outside the lines), either direct, indirect, or perpendicular, depending on whether they add or subtract the whole, or part, or none of their own length.

Since a quantity is called absolute insofar as it is not relative to another quantity but is directly posited, so can the distance in the previous definitions be called the absolute line, and the contribution of the relative to the extending or shortening of the absolute can be called the effect of the relative.

There are still more quantities than straight lines that could admit of the mentioned relations. So, it was not useless to explain such relations in general, and to incorporate their general concept in the explanation of the operations; but since, on advice from experts, both the content of this paper and the clarity of the presentation require that I do not trouble the reader with such abstract concepts, I present only the geometric explanations, and therefore say

§ 1 Two straight lines are added together, when one joins them together so that one begins where the other ends, and next one draws a straight line from the first to the last point of the joined lines, and takes this to be their sum.

If, for example, a point moves forward 3 feet and then backward 2 feet, then

the sum of these two paths is not the first 3 and the last 2 feet together, but the sum is 1 foot forward, because this path described by the same point has the same effect as the other two paths.

Similarly, when one side of a triangle extends from 𝑎 to 𝑏, and another from

𝑏 to 𝑐, then the third one from 𝑎 to 𝑐 is called the sum, and should be denoted by 𝑎𝑏 + 𝑏𝑐, so that 𝑎𝑐 and 𝑎𝑏 + 𝑏𝑐 have the same meaning, or 𝑎𝑐 = 𝑎𝑏 + 𝑏𝑐 = −𝑏𝑎 + 𝑏𝑐, if 𝑏𝑎 is the opposite of 𝑎𝑏. If the added lines are direct, then the definition is perfectly consistent with the usual one. If they are not direct, it is not


contrary to the analogy to call a straight line the sum of two other conjoined lines, insofar as it has the same effect as these do. The meaning I have given to the + sign is not very unusual; for example, in the expression

$$ab + \frac{ba}{2} = \frac{1}{2}\,ab$$

$\frac{ba}{2}$ is no part of the sum. Thus one can write 𝑎𝑏 + 𝑏𝑐 = 𝑎𝑐 without thinking of 𝑏𝑐 as any part of 𝑎𝑐; 𝑎𝑏 + 𝑏𝑐 is only the sign by which 𝑎𝑐 is represented.
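Wessel's rule of addition in § 1 can be imitated with a short Python sketch. This is our own modern illustration, not Wessel's notation: we assume that points of the plane are stored as complex numbers and that a directed line is the difference of its end point and its starting point; the helper name `line` and the sample triangle are hypothetical.

```python
def line(start, end):
    """Directed straight line from `start` to `end` (points as complex numbers)."""
    return end - start

# A sample triangle with corners a, b, c
a, b, c = complex(0, 0), complex(4, 0), complex(4, 3)
ab, bc, ac = line(a, b), line(b, c), line(a, c)
ba = line(b, a)  # the opposite of ab

# 3 feet forward followed by 2 feet backward
net_path = (3 + 0j) + (-2 + 0j)
```

Here `ab + bc` and `-ba + bc` both equal `ac`, the path 3 forward and 2 backward sums to 1 foot forward, and `ab + ba / 2` equals half of `ab`, as in the expression quoted above.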

§ 2 When more than two straight lines are to be added, the same rule is followed; namely, they are joined so that the last point of the first is joined to the first one of the second, the last point of the second to the first point of the third, etc., and then a straight line is drawn from the point where the first begins to where the last one ends, and this is called the sum of them all.

Which line is to be taken first, and which second, third, etc., is immaterial; for wherever, within three planes which make right angles with each other, a straight line is described by a point, this line has the same effect on the distance of the point from each of the planes; consequently, any one of the several added lines contributes just as much to determining the position of the last point of the sum, whether it be the first, the last, or whatever other order among the addends; thus the order in the addition of straight lines is indifferent, and the sum is always the same, because its first point is assumed to be given, and the last always receives the same position.

Hence, in this case also the sum is denoted by inserting the + sign between the connected lines. For example, when in a quadrilateral the first side is drawn from 𝑎 to 𝑏, the second from 𝑏 to 𝑐, the third from 𝑐 to 𝑑, but the fourth from 𝑎 to 𝑑, then one can set 𝑎𝑑 = 𝑎𝑏 + 𝑏𝑐 + 𝑐𝑑.
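The claim of § 2 that the order of the addends is indifferent can be tried on an example. In this minimal Python sketch (our own illustration, with lines again assumed to be stored as complex numbers and with arbitrarily chosen sample lines), three lines are joined end to end in every possible order.

```python
from itertools import permutations

# Three sample lines in the plane, written as complex numbers
lines = [complex(2, 1), complex(-1, 3), complex(0.5, -2)]

# Sum the same three lines in each of the six possible orders
sums = [sum(order, 0j) for order in permutations(lines)]
```

All six sums coincide: whichever line is taken first, the last point lands in the same place.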

§ 3

If the sum of several lengths, widths, and heights = 0, then the sums of the lengths, of the widths, and of the heights, are each = 0.


§ 4

The product of two straight lines must in every respect be formed from the one factor, just as the other factor is formed from the positive or absolute line set = 1, that is:

First, the factors must have a direction such that both of them can be included in the same plane as the positive unit.

Second, the length of the product must be to the one factor as the other is to the unit; and

Finally, if we give the positive unit, the factors, and the product a common first point, then the product, with respect to its direction, must lie in the plane of the unit and the factors, and deviate as many degrees from the one factor, and to the same side, as the other factor deviates from the unit, so that the directional angle of the product (or its deviation from the positive unit) is as great as the sum of the directional angles of the factors.

§ 5 Let +1 denote the positive, rectilinear unit, and +𝑖 a certain other unit,7 perpendicular to the positive one, and with the same initial point; then the directional angle of +1 = 0, of −1 = 180°, of +𝑖 = 90°, and of −𝑖 = −90° or 270°; and following the rule that the directional angle of the product is the sum of those of the factors, one gets

(+1)(+1) = +1

(+1)(−1) = −1

(−1)(−1) = +1

(+1)(+𝑖) = +𝑖

(+1)(−𝑖) = −𝑖

(−1)(+𝑖) = −𝑖

(−1)(−𝑖) = +𝑖

(+𝑖)(+𝑖) = −1

(+𝑖)(−𝑖) = +1

(−𝑖)(−𝑖) = −1

From which it is seen that 𝑖 = √−1, and the deviation of the product is determined so that not a single one of the usual rules of operation is violated.
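The table of § 5 can be reproduced using nothing but the rule that directional angles are added. The Python sketch below is our own illustration; the dictionary names `ANGLE`, `UNIT`, and `VALUE` and the function `multiply_units` are hypothetical. It also compares each entry with ordinary multiplication of complex numbers.

```python
# Directional angle, in degrees, of each of the four units
ANGLE = {"+1": 0, "-1": 180, "+i": 90, "-i": 270}
UNIT = {angle: name for name, angle in ANGLE.items()}

# The same units as Python complex numbers, for comparison only
VALUE = {"+1": 1, "-1": -1, "+i": 1j, "-i": -1j}

def multiply_units(p, q):
    """Product of two units: add their directional angles, modulo 360 degrees."""
    return UNIT[(ANGLE[p] + ANGLE[q]) % 360]

table = {(p, q): multiply_units(p, q) for p in ANGLE for q in ANGLE}
```

In particular `multiply_units("+i", "+i")` gives `"-1"`, which is the geometrical content of the statement that 𝑖 = √−1.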

7 Wessel employs the symbol ε, the Greek letter epsilon, but we will use the accepted symbol 𝑖 in its place.


§ 6

The cosine of a circular arc that begins at the last point of its radius +1, is the piece of the same or the opposite radius, starting from the center and ending at the perpendicular from the last point of the arc. The sine of the same arc is drawn perpendicular to the cosine from its last point to the last one of the arc.

It follows from §5 that the sine of a right angle = √−1. Let √−1 = 𝑖; let 𝑣 denote any angle, and sin 𝑣 denote a straight line of the same length as angle 𝑣’s sine, but positive when it ends in the first semicircumference, and negative when it ends in the second semicircumference: so it follows from §§ 4 and 5 that 𝑖 sin 𝑣 expresses angle 𝑣’s sine with respect to both direction and length.

§ 7 In conformity with §§ 1 and 6, the radius that starts at the center and deviates by angle 𝑣 from the absolute or positive unit equals cos 𝑣 + 𝑖 sin 𝑣. But, according to §4, the product of two factors, one of which deviates from the same unit by angle 𝑣 and the other by angle 𝑢, deviates from the same unit by angle 𝑣 + 𝑢. So, when the straight line cos 𝑣 + 𝑖 sin 𝑣 is multiplied by the straight line cos 𝑢 + 𝑖 sin 𝑢, the product becomes a straight line whose directional angle is 𝑣 + 𝑢. Consequently, following §§ 1 and 6, the product is denoted by cos(𝑣 + 𝑢) + 𝑖 sin(𝑣 + 𝑢).

§ 8 This product, (cos 𝑣 + 𝑖 sin 𝑣)(cos 𝑢 + 𝑖 sin 𝑢), or cos(𝑣 + 𝑢) + 𝑖 sin(𝑣 + 𝑢), can be expressed in yet another way, namely by adding in one sum the partial products that appear when each of the added lines whose sum constitutes the one factor is multiplied by each of those whose sum constitutes the second one. Thus we get

$$(\cos v + i \sin v)(\cos u + i \sin u) = \cos v \cos u - \sin v \sin u + i(\cos v \sin u + \sin v \cos u)$$

which follows from the familiar trigonometric formulae

$$\cos(v + u) = \cos v \cos u - \sin v \sin u$$

and

$$\sin(v + u) = \cos v \sin u + \cos u \sin v$$

These two formulae can be proved with exactness and without great verbosity for all cases, whether both of the angles 𝑣 and 𝑢, or only one, are positive, negative, greater than or less than a right angle. Consequently, the theorems derived from the same two formulae are universal.
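The two ways of expressing the product in §§ 7 and 8 can be compared numerically. In the following Python sketch (our own illustration; the helper name `unit_line` and the sample angles are hypothetical), the product formed by multiplying out the partial products is set beside the line whose directional angle is 𝑣 + 𝑢.

```python
import math

def unit_line(angle_deg):
    """The line cos v + i sin v of length 1 and directional angle v (in degrees)."""
    t = math.radians(angle_deg)
    return complex(math.cos(t), math.sin(t))

v, u = 35.0, 110.0
product = unit_line(v) * unit_line(u)   # partial products, as in section 8
expected = unit_line(v + u)             # angles added, as in section 7
gap = abs(product - expected)
```

The gap between the two results is at the level of rounding error, for these angles and for any others one cares to try.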


§ 9

cos 𝑣 + 𝑖 sin 𝑣, from §7, is the radius of a circle whose length = 1 and whose deviation from cos 0° is the angle 𝑣; consequently, 𝑟 cos 𝑣 + 𝑟𝑖 sin 𝑣 denotes a straight line whose length is 𝑟, and whose directional angle = 𝑣; for if the legs of a right triangle be made 𝑟 times larger, then the hypotenuse will also be 𝑟 times larger, and the angles remain unchanged. But, from §1, the sum of the legs equals the hypotenuse, i.e., 𝑟 cos 𝑣 + 𝑟𝑖 sin 𝑣 = 𝑟 (cos 𝑣 + 𝑖 sin 𝑣). So this is a general expression for any straight line lying in the same plane with cos 0° and 𝑖 sin 0°, deviating from cos 0° by 𝑣 degrees, and having length 𝑟.
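The general expression of § 9 is easy to try out. This Python sketch is our own illustration (the function name `line_from_polar` and the sample values 𝑟 = 2, 𝑣 = 60° are hypothetical): it builds 𝑟 cos 𝑣 + 𝑟𝑖 sin 𝑣 and then recovers the length and the directional angle from the result.

```python
import cmath
import math

def line_from_polar(r, angle_deg):
    """The line r cos v + r i sin v with length r and directional angle v (degrees)."""
    t = math.radians(angle_deg)
    return r * math.cos(t) + 1j * r * math.sin(t)

z = line_from_polar(2.0, 60.0)
length = abs(z)                              # should recover r
angle_deg = math.degrees(cmath.phase(z))     # should recover v
```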

§ 10 Let 𝑎, 𝑏, 𝑐, 𝑑 denote direct lines of any length, either positive or negative, and let the two indirect lines 𝑎 + 𝑖𝑏 and 𝑐 + 𝑖𝑑 lie in the same plane as the absolute unit: then their product can be found even when their deviation from the absolute unit is unknown; one need only multiply each of the added lines which constitute one sum with each of those which constitute the second one, then these products added up constitute the sought product in terms of both length and direction; so that (𝑎 + 𝑖𝑏)(𝑐 + 𝑖𝑑) = 𝑎𝑐 − 𝑏𝑑 + 𝑖(𝑎𝑑 + 𝑏𝑐).

PROOF: Let line (𝑎 + 𝑖𝑏) be of length 𝐴 and deviate from the

absolute unit by 𝑣 degrees; and let line (𝑐 + 𝑖𝑑) be of length 𝐶

and deviation = 𝑢; then, from §9,

(𝑎 + 𝑖𝑏) = 𝐴 cos𝑣 + 𝐴𝑖 sin 𝑣, and

(𝑐 + 𝑖𝑑) = 𝐶 cos 𝑢 + 𝐶𝑖 sin𝑢, so that, by §3,

𝑎 = 𝐴 cos𝑣

𝑏 = 𝐴 sin𝑣

𝑐 = 𝐶 cos 𝑢

𝑑 = 𝐶 sin𝑢

Now, from §4,

(𝑎 + 𝑖𝑏)(𝑐 + 𝑖𝑑) = 𝐴𝐶[cos(𝑣 + 𝑢) + 𝑖 sin(𝑣 + 𝑢)]

and, from §8,

𝐴𝐶[cos(𝑣 + 𝑢) + 𝑖 sin(𝑣 + 𝑢)]

= 𝐴𝐶[cos 𝑣 cos 𝑢 − sin 𝑣 sin 𝑢 + 𝑖 (cos 𝑣 sin 𝑢 + sin 𝑣 cos 𝑢)]


Consequently, when instead of 𝐴𝐶 cos 𝑢 cos 𝑣 we write 𝑎𝑐, and instead of 𝐴𝐶 sin 𝑣 sin 𝑢 we write 𝑏𝑑, etc., there appears what was to be proved.

From which follows that although the added lines of the sum are not all direct, there need be no exception to the known rule on which the theory of equations and of integral functions and their divisores simplices are grounded, namely, when the two sums are to be multiplied by each other, then each of the added quantities in one sum must be multiplied by each of the added ones in the second one. So we can rest assured that when an equation concerns straight lines, and its root has the form 𝑎 + 𝑖𝑏, then one has designated an indirect line. But, if one wants to multiply by each other two straight lines which do not both lie in the same plane with the absolute unit, then the rule mentioned will be overridden. This is the reason why I passed over multiplication of such lines. Another way of denoting their changed direction occurs in the following, §§ 24 - 35.
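The rule of § 10 for multiplying two indirect lines can be checked against a worked example. The Python sketch below is our own illustration; the function name `multiply_lines` and the sample numbers are hypothetical. It applies the formula 𝑎𝑐 − 𝑏𝑑 + 𝑖(𝑎𝑑 + 𝑏𝑐) and then confirms that the product's length is the product of the lengths, as § 4 requires.

```python
def multiply_lines(a, b, c, d):
    """Direct and perpendicular parts of (a + ib)(c + id), by the rule of section 10."""
    return a * c - b * d, a * d + b * c

a, b, c, d = 3.0, 4.0, 1.0, -2.0
direct_part, perpendicular_part = multiply_lines(a, b, c, d)

# The same product using Python's built-in complex numbers
z = complex(a, b) * complex(c, d)
length_check = abs(abs(z) - abs(complex(a, b)) * abs(complex(c, d)))
```

For these sample values the product is 11 − 2𝑖 by either route.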

§ 11

A quotient multiplied by the divisor must be equal to the dividend. Thus there is no need to prove that these lines must be in the same plane with the absolute unit; for it follows immediately from the definition in §4. Likewise, we easily perceive that the quotient must deviate from the absolute unit by the angle 𝑣 − 𝑢, if the dividend deviates from the same unit by angle 𝑣, and the divisor by angle 𝑢.

Setting, for example, $A(\cos v + i \sin v)$ to be divided by $B(\cos u + i \sin u)$, then the quotient is

$$\frac{A}{B}[\cos(v - u) + i \sin(v - u)],$$

because, from §7,

$$\frac{A}{B}[\cos(v - u) + i \sin(v - u)] \times B(\cos u + i \sin u) = A(\cos v + i \sin v).$$

This is because $\frac{A}{B}[\cos(v - u) + i \sin(v - u)]$ multiplied by the divisor $B(\cos u + i \sin u)$ is equal to the dividend $A(\cos v + i \sin v)$: therefore also $\frac{A}{B}[\cos(v - u) + i \sin(v - u)]$ is the quotient sought.

§ 12

If 𝑎, 𝑏, 𝑐 and 𝑑 are direct lines, and the indirect ones 𝑎 + 𝑖𝑏 and 𝑐 + 𝑖𝑑 are in the same plane with the absolute unit, then

$$\frac{1}{c + id} = \frac{c - id}{c^2 + d^2}$$

and the quotient

$$\frac{a + ib}{c + id} = (a + ib)\frac{1}{c + id} = (a + ib)\frac{c - id}{c^2 + d^2} = \frac{ac + bd + i(bc - ad)}{c^2 + d^2}.$$

For, from §9, we can write

$$a + ib = A(\cos v + i \sin v)$$

and

$$c + id = C(\cos u + i \sin u)$$


so that $c - id = C(\cos u - i \sin u)$ from §3;

and since $(c + id)(c - id) = c^2 + d^2$ by §10,

thus is

$$\frac{c - id}{c^2 + d^2} = \frac{1}{C}[\cos(-u) + i \sin(-u)] = \frac{1}{c + id}$$

by §11, and when this is multiplied by $a + ib = A(\cos v + i \sin v)$, we get

$$(a + ib)\frac{c - id}{c^2 + d^2} = \frac{A}{C}[\cos(v - u) + i \sin(v - u)] = \frac{a + ib}{c + id},$$

by §11.

Indirect quantities of this kind share this with the direct ones: that when the dividend is a sum of several quantities, then each of these divided by the divisor gives multiple quotients, whose sum constitutes the quotient sought.
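The quotient formula of § 12 can likewise be tested on numbers. In this Python sketch (our own illustration, with the hypothetical function name `divide_lines` and arbitrary sample values), the formula is applied directly and the result is compared with Python's built-in division of complex numbers.

```python
def divide_lines(a, b, c, d):
    """Direct and perpendicular parts of (a + ib)/(c + id), by the formula of section 12."""
    denominator = c * c + d * d
    return (a * c + b * d) / denominator, (b * c - a * d) / denominator

a, b, c, d = 3.0, 4.0, 1.0, -2.0
q_direct, q_perpendicular = divide_lines(a, b, c, d)

# Multiplying the quotient by the divisor should give back the dividend (section 11)
recovered = complex(q_direct, q_perpendicular) * complex(c, d)
builtin_quotient = complex(a, b) / complex(c, d)
```

For these values the quotient is −1 + 2𝑖, and multiplying it by the divisor 1 − 2𝑖 returns the dividend 3 + 4𝑖.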

§ 13

If 𝑚 is an integer, then $\cos\frac{v}{m} + i \sin\frac{v}{m}$ multiplied by itself 𝑚 times produces the power $\cos v + i \sin v$ (§7); so

$$(\cos v + i \sin v)^{\frac{1}{m}} = \cos\frac{v}{m} + i \sin\frac{v}{m}.$$

And it follows from §11 that

$$\cos\left(-\frac{v}{m}\right) + i \sin\left(-\frac{v}{m}\right) = \frac{1}{\cos\frac{v}{m} + i \sin\frac{v}{m}} = \frac{1}{(\cos v + i \sin v)^{\frac{1}{m}}} = (\cos v + i \sin v)^{-\frac{1}{m}}$$

So, whether 𝑚 is positive or negative, at all times

$$(\cos v + i \sin v)^{\frac{1}{m}} = \cos\frac{v}{m} + i \sin\frac{v}{m}$$

and so, when both 𝑚 and 𝑛 are integers,

$$(\cos v + i \sin v)^{\frac{n}{m}} = \cos\frac{n}{m}v + i \sin\frac{n}{m}v$$

From this, we can find the value of expressions such as $\sqrt[n]{b + c\sqrt{-1}}$ or $\sqrt[m]{a + \sqrt[n]{b + c\sqrt{-1}}}$. So, for example,8 $\sqrt[3]{4\sqrt{2 + 2\sqrt{3}\sqrt{-1}}}$ denotes a straight line whose length = 2, and whose angle with the absolute unit is measured as 10°.
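The example can be verified with a few lines of Python. This is only a sketch under a stated assumption: we read the example as the cube root of 4 times the square root of 2 + 2√3√−1, and we take the principal value of each root, which is what the standard `cmath` functions return. The variable names are our own.

```python
import cmath
import math

# Inner square root: sqrt(2 + 2*sqrt(3)*sqrt(-1)), principal value
inner = cmath.sqrt(2 + 2 * math.sqrt(3) * 1j)

# Cube root of 4 times the inner root, principal value
value = (4 * inner) ** (1 / 3)

length = abs(value)
angle_deg = math.degrees(cmath.phase(value))
```

Under this reading the length comes out as 2 and the angle as 10°, up to rounding: the quantity 2 + 2√3√−1 has length 4 and angle 60°, its square root has length 2 and angle 30°, and four times that has length 8 and angle 30°.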

8 In the original text of the translation, the example given was $\sqrt[3]{4\sqrt{3 + 4\sqrt{-1}}}$, but this expression does not seem to agree with Wessel’s assertions about its length and angle with the absolute unit. To fit with his assertions, we substituted the expression $\sqrt[3]{4\sqrt{2 + 2\sqrt{3}\sqrt{-1}}}$.


§ 14 When two angles have equal sines and equal cosines, then their difference is

either 0 or ±4 right angles, or a multiple of ±4 right angles, and conversely when the difference of two angles is either 0 or ±4 right angles (taken once or a number of times), then their sines as well as their cosines are equal.

§ 15

If 𝑚 is an integer, and 𝑤 = 360°,9 then $(\cos v + i \sin v)^{\frac{1}{m}}$ can have the following values:

$$\cos\frac{v}{m} + i \sin\frac{v}{m}$$

$$\cos\frac{w + v}{m} + i \sin\frac{w + v}{m}$$

$$\cos\frac{2w + v}{m} + i \sin\frac{2w + v}{m}$$

$$\cos\frac{3w + v}{m} + i \sin\frac{3w + v}{m}$$

. . .

$$\cos\frac{(m-1)w + v}{m} + i \sin\frac{(m-1)w + v}{m}$$

because the numbers by which 𝑤 is multiplied in the preceding sequence are in the arithmetic progression 1, 2, 3, 4, … , 𝑚 − 1. So the sum of any two will equal 𝑚 whenever one of them is as far from 1 as the other is from 𝑚 − 1. And if their number is odd, then twice the middle one will equal 𝑚. Therefore, when one adds

$$\frac{(m-n)w + v}{m} + \frac{(m-u)w + v}{m}$$

and the former, in the sequence, is as far from $\frac{w + v}{m}$ as $\frac{(m-u)w + v}{m}$ is from $\frac{(m-1)w + v}{m}$, then the sum

$$= \frac{2m - u - n}{m}w + \frac{2v}{m} = w + \frac{2v}{m}.$$

But, adding $\frac{(m-1)w}{m}$ is the same as subtracting $\frac{(m-1)(-w)}{m}$, and since the difference is 𝑤, then, from §14,

$$\frac{(m-1)(-w) + v}{m}$$

9 Wessel uses the symbol 𝜋 to signify 360°, but since that is not the usual meaning of the symbol today, 𝑤 (for “whole circle”) is used here instead.


has the same sine and cosine as (𝑚−1)𝑤 + 𝑣

𝑚 . Thus −𝑤 does not give any other

values than does +𝑤. And that none of these are equal results from the difference between two of the angles in the sequence being always less than 𝑤, and never = 0. Nor are there any more values by continuing the sequence, for then the

angles become 𝑤 + 𝑣

𝑚, 𝑤 +

𝑤 + 𝑣

𝑚, 𝑤 +

2𝑤 + 𝑣

𝑚, etc., and so, from §14, the values of

the cosine and sine are the same as before. Should the angles fall outside the

sequence, then the 𝑤 in the numerator was not multiplied by an integer, and the angles taken 𝑚 times could not produce an angle which, when subtracted from

𝑣, give 0, or ±𝑤, or a multiple of ±𝑤, so neither can the 𝑚th power of such an angle’s cosine and sine = cos 𝑣 + 𝑖 sin𝑣.
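For readers who like to check such claims numerically, here is a minimal sketch of §15 in Python. It is our own illustration, not anything in Wessel: the function name `wessel_roots` and the sample values 𝑣 = 30°, 𝑚 = 3 are assumptions chosen only for the example. It builds the 𝑚 angles (𝑘𝑤 + 𝑣)/𝑚 and confirms that each resulting line, raised to the 𝑚th power, returns cos 𝑣 + 𝑖 sin 𝑣.

```python
import math

def wessel_roots(v_deg: float, m: int) -> list[complex]:
    """Return cos((k*w + v)/m) + i*sin((k*w + v)/m) for k = 0..m-1, with w = 360 degrees."""
    w = 360.0  # Wessel's "whole circle"
    roots = []
    for k in range(m):
        angle = math.radians((k * w + v_deg) / m)
        roots.append(complex(math.cos(angle), math.sin(angle)))
    return roots

# Hypothetical sample values, chosen only for illustration.
v_deg, m = 30.0, 3
target = complex(math.cos(math.radians(v_deg)), math.sin(math.radians(v_deg)))
roots = wessel_roots(v_deg, m)

for r in roots:
    # Each root, multiplied by itself m times, should land back on the target line.
    print(r, abs(r**m - target) < 1e-12)
```

The same function can be called with any integer 𝑚 to see that exactly 𝑚 distinct lines appear, as the section asserts.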

§ 16

Without knowing the angle that the indirect line 1 + 𝑥 makes with the absolute one, we find, when the length of 𝑥 is less than 1, the power

(1 + 𝑥)^𝑚 = 1 + 𝑚𝑥/1 + (𝑚/1)·((𝑚−1)/2)·𝑥² + …

and if this series were arranged according to the powers of 𝑚, it keeps the same value and transforms to

(1 + 𝑥)^𝑚 = 1 + 𝑚𝑙/1 + 𝑚²𝑙²/(1∙2) + 𝑚³𝑙³/(1∙2∙3) + …

where 𝑙 = 𝑥 − 𝑥²/2 + 𝑥³/3 − 𝑥⁴/4 + …

and is a sum of a direct and a perpendicular line. Calling the direct one 𝑎 and the perpendicular one 𝑏√−1, where 𝑏 is the smallest measure of the angle that 1 + 𝑥 makes with 1, and setting

1 + 1/1 + 1/(1∙2) + 1/(1∙2∙3) + … = 𝑒,

then (1 + 𝑥)^𝑚, that is, 1 + 𝑚𝑙/1 + 𝑚²𝑙²/(1∙2) + 𝑚³𝑙³/(1∙2∙3) + …, is denoted by 𝑒^(𝑚𝑎 + 𝑚𝑏√−1), that is, (1 + 𝑥)^𝑚 has the length 𝑒^(𝑚𝑎), and a directional angle whose amount is 𝑚𝑏, supposing 𝑚 to be positive or negative. Thus the direction of lines in the same plane may be expressed in yet another way, namely with the help of natural logarithms. I will present complete proof of these statements at another time, if allowed. Now that I am done accounting for the way in which the sum, product, quotient, and power of straight lines are found, I will only give a few examples of the application of the method.
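Wessel leaves these statements unproved, but they are easy to test numerically. The following is a rough sketch under our own assumptions: the helper `log_series`, the sample line 𝑥 = 0.3 + 0.4𝑖 (length 0.5, so less than 1), and the exponent 𝑚 = 2.5 are all hypothetical choices. It sums the series for 𝑙, splits it into 𝑎 and 𝑏, and compares 𝑒^(𝑚𝑎) and 𝑚𝑏 with the length and angle of (1 + 𝑥)^𝑚.

```python
import cmath

def log_series(x: complex, terms: int = 200) -> complex:
    """Partial sum of l = x - x^2/2 + x^3/3 - ... (converges when |x| < 1)."""
    return sum((-1) ** (k + 1) * x**k / k for k in range(1, terms + 1))

# Hypothetical sample values, chosen only for illustration.
x = complex(0.3, 0.4)   # an "indirect line" of length 0.5
m = 2.5

l = log_series(x)
a, b = l.real, l.imag          # direct part a, perpendicular part b*sqrt(-1)

direct = (1 + x) ** m          # the power computed directly
via_exp = cmath.exp(m * l)     # e^(m*a + m*b*sqrt(-1))

print(abs(direct), cmath.exp(m * a).real)   # length of the power vs e^(m*a)
print(cmath.phase(direct), m * b)           # angle of the power vs m*b
print(b, cmath.phase(1 + x))                # b vs the angle of 1 + x with 1
```

In this example 𝑚𝑏 stays below a half circle, so the angle reported by `cmath.phase` can be compared with 𝑚𝑏 directly; for larger 𝑚 the two would agree only up to whole circles, in keeping with §14.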


AFTERWORD

WHAT WESSEL DID

So much for our selection from Wessel. Now, what has Wessel done? He has shown a way of taking direction together with magnitude so that the old rules of the ordinary algebra still work. In other words, he has found that the old algebraic rules we learned from Descartes are capable of handling more information about straight lines than just their length: they can also handle direction in addition to length.

Descartes broadened out the idea of multiplication from just integers to all comparable magnitudes. He observed that multiplication in numbers is just finding a fourth proportional from a unit and two factors. So why can’t we do the same for any two magnitudes, even if they are incommensurable with one another and with our chosen unit length? Finding a fourth proportional does not depend on commensurability among the first three magnitudes.

Wessel has followed Descartes’ lead and broadened out the idea of multiplication (and the other operations) even more: why limit our fourth proportional to magnitude? Why not also add in the idea of direction? To do this, we choose a unit not only in length, but also as having a certain standard direction. And just as we measure off positive lengths in a chosen direction from our origin, we also measure off positive directional changes, or angles away from the unit, in a chosen rotational direction, such as counterclockwise. Then for any two given magnitudes, each having a certain direction, we take the fourth proportional from them and the unit length, which fourth proportional is a fourth proportional both in magnitude and in direction. This new magnitude is then defined as the product of the two given magnitudes.

In particular, the square root of −1 now has a clear and definite meaning. From our origin in the plane, we choose a unit length with a special direction and call that +1. Going the same distance in the opposite direction gives us −1. And the straight line that is the mean proportional between them in magnitude, and also the mean proportional between them in direction, gives us the “square root of negative one,” or 𝑖. And this is just the straight line that is one unit long, drawn from the origin, and taken along the axis that is at right angles to our original unit line (and taken on the correct side so that it is counterclockwise from our original unit).

This is 𝑖, or at least the geometric interpretation of it. And we will find that in any equations in which 𝑖 occurs, for example as a root of some polynomial, we can make perfect geometric sense of it by reinterpreting the whole equation in terms of Wessel’s new direction-inclusive operations. Consider, for example, the function 𝑓(𝑥) = 𝑥² + 2 in a Cartesian coordinate plane. This is a parabola with its axis along the y-axis, hovering two units above the origin. If we look for the roots of this equation, we get

𝑥² + 2 = 0

𝑥² = −2

𝑥 = ± √−2

In the Cartesian plane, this makes sense in a limited way. √−2 has no operational meaning, and so the equation is telling us, as it were, “I have no real roots, no genuine x-values which make me equal zero.” That fits with the fact that our parabola nowhere intersects the x-axis. But this is more a negative meaning than a positive one: the equation is not saying “I have certain roots” but “I don’t have any roots, silly.”

Now reinterpret the original function in terms of Wessel’s direction-included operations. That means the input, 𝑥, is not necessarily a length along the unit axis, but any length from the origin in any direction you please, and 𝑥² is not just the square of 𝑥 in magnitude, but also in direction. The number 2 is still just double the unit in magnitude and has the same direction as the unit, but + 2 means Wessel’s directional addition. So, for example, if our input 𝑥 is a line √2 in length and at 45° to the axis along which the positive unit lies, then 𝑥² + 2 will be 2𝑖 + 2, which is a line that is 2√2 in length and at 45° to the axis.

Redefining the operations in the function in this way, if we now ask what inputs will make the function equal to zero, the answer is ± √−2, or ± √2 𝑖. To see this, consider the square contained by the sides 1 and 𝑖, with a corner at the origin. The diagonal of this square drawn from the origin is √2 in length, 45° from the axis, and may be written 1 + 𝑖. If we cut off from the 𝑖-axis a length equal to this, that will be √2 𝑖. If we now square this, we get a line with magnitude 2, but along the negative axis. If we now add this to 2 along the positive axis, we get zero.

Wessel’s operations are a very natural expansion of Descartes’ operations, which in fact already had direction involved in them, although only in a partial way. They included positive and negative direction along a pair of lines from the origin, ignoring the infinity of other directions from the origin.
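The two calculations just described can be replayed with Python’s built-in complex numbers, which multiply lengths and add angles just as Wessel’s lines do. This is only a sketch for checking the arithmetic; the function name `f` follows the text, while the variable names are our own.

```python
import cmath
import math

def f(x: complex) -> complex:
    """The function x^2 + 2, read with direction-inclusive operations."""
    return x * x + 2

# Input: a line sqrt(2) long at 45 degrees to the positive unit, i.e. 1 + i.
x = cmath.rect(math.sqrt(2), math.radians(45))
y = f(x)
print(abs(y), math.degrees(cmath.phase(y)))   # length and angle of 2i + 2

# Candidate roots: sqrt(2) cut off along the i-axis, in both directions.
root = complex(0, math.sqrt(2))
print(abs(f(root)), abs(f(-root)))            # both should be (nearly) zero
```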

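[Figures: the lines 2𝑖 and 2𝑖 + 2 drawn from the origin; the square on 1 and 𝑖 with its diagonal 1 + 𝑖, and the line √2 𝑖 on the 𝑖-axis.]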


QUESTION: Is it natural, or artificial, to combine magnitude and direction? Does it make sense to take them together, as though they somehow constituted something one? Can we do that with just any two things, like color and shape? Political affiliation and food preferences? It seems we cannot just combine any two things and really have something one resulting from them. But magnitude and direction do seem to come together to constitute, or cause, some one quantity somehow. We see this first of all in the natural world. If someone punches me in the face, the blow to me, or how great an effect the punch has on my face, is not just a matter of how hard my assailant threw his fist, but also a matter of how directly his fist lands on my face. A very powerfully thrown fist plus a glancing direction might not amount to much of a blow to me, while a lighter throw plus a very direct, head-on, perpendicular-to-my-nose throw of the fist could mean much more damage. We also encountered this same sort of thing in Galileo and Newton, who taught us how to combine two different velocities or forces to find the resultant velocity or force.

UPDATING WESSEL: VECTORS AND COMPLEX NUMBERS

As you might imagine, it became necessary to find standard terminology and notation to employ the concepts that we have just learned from Wessel. Let’s take a moment to familiarize ourselves with some of the current ways of dealing with magnitude and direction.

First of all, the term vector means a quantity that has not only a magnitude, but a certain direction associated with it. This is today’s name for what Wessel was talking about. He taught us how to add, subtract, multiply, and divide vectors, and to take powers and roots of them, and showed that the usual algebraic manipulations apply to them. We encountered vector quantities in Galileo and Newton. Velocity, for example, is a vector, since it is a speed in a certain direction.
Speed by itself, however, is not a vector, but a scalar quantity, that is, a quantity with no direction associated with it, a pure magnitude. “50 mph,” for example, is not a vector, but a scalar. So too are temperature and mass. But force is a vector, and so is acceleration.

A function can have a geometrical meaning, but also many other meanings as well that are not geometrical, because the quantities it correlates need not be lengths of straight lines in a coordinate system but can be other things such as weights, pressures, times, speeds, and the like. Something similar may be said of vectors. In the foregoing, we have considered vectors as straight lines with certain lengths and with a direction associated with them. But the magnitude of a vector need not be the length of a straight line. It can instead be, for example, a force, or a speed, or some other thing that is capable of having a direction associated with it.

Descartes himself used the term imaginary, and we have seen that the geometric interpretation of any imaginary number, 𝑖, 2𝑖, 3𝑖, 𝑎𝑖, is just a straight line whose magnitude is given by the coefficient, and whose direction is from the origin, along the axis at right angles to the axis along which the positive unit lies (what Wessel calls “the absolute unit”), that is, 90° counterclockwise from that axis. This new axis is the imaginary axis, and by way of contrast to it the original positive/negative axis is called the real axis, although each is just as “real” as the other in the ordinary sense of the term real. Vectors along these axes are real and imaginary numbers.

But what about all the other directions from the origin? Magnitudes with any other direction than one that lies along the real axis or imaginary axis are called complex numbers, because they are complexes or combinations of real and imaginary numbers. That is, if you draw any arrow with its tail at the origin and its tip somewhere out in the plane, not lying along either axis, then this vector is the vector sum of a real vector and an imaginary vector. From the tip of it, draw perpendiculars to the imaginary and real axes, and the vectors you cut off from these will be the components of your vector. In Wessel’s Section 12 above, Wessel calls lines along the real axis “direct lines,” and calls complex numbers “indirect lines”, such as 𝑎 + 𝑖𝑏 and 𝑐 + 𝑖𝑑. And he refers to positive 1 along the real axis as “the absolute unit.” The plane defined by a real and imaginary axis, which in turn define all the vectors in the plane, is called the complex plane.

What about new notation for vectors? This is necessary, since the same kinds of algebraic moves apply to both vectors and non-vectors, and yet they signify different things. We need to know when we are looking at vectors, and when we are not. To keep this quite clear, there are many kinds of notation, but the two most common ones are the following. First, a bold letter, such as 𝒖 or 𝒗, indicates a vector. Second, an ordinary letter designation with an arrow over it indicates a vector, such as 𝐴𝐵⃗, which means a straight line of length 𝐴𝐵 that has associated with it the direction from 𝐴 to 𝐵.

THREE OLD THINGS SAID IN THE NEW LANGUAGE OF VECTORS

Let’s see now how the language of vectors can be used to state some old truths in a new way. What, for example, is vector-speak for the relationship of the squares on the sides of a triangle (any triangle, now, not just a right one)? One way of talking about these, without vectors, is the law of cosines. But suppose we consider the sides of a triangle as three vectors, 𝒂, 𝒃, 𝒄. We can take 𝒂 so that its tip touches the tip of 𝒄, and its tail touches the tail of 𝒃, in which case

𝒂 = 𝒃 + 𝒄

so 𝒂² = (𝒃 + 𝒄)²

𝒂² = 𝒃² + 2𝒃𝒄 + 𝒄²
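To connect this with the law of cosines, one has to say what kind of product 𝒃𝒄 is. The sketch below assumes it is the dot product (one of the operations mentioned under “Other Vector Operations”), so that 𝒂² means the square of the length of 𝒂. That reading is our assumption, as are the helper `dot` and the sample vectors. Under it, the vector identity and the ordinary law of cosines give the same number.

```python
import math

def dot(p, q):
    """Dot product of two plane vectors given as (x, y) pairs (assumed reading of the product bc)."""
    return p[0] * q[0] + p[1] * q[1]

# Hypothetical sample sides; a is taken as b + c, as in the text.
b = (3.0, 1.0)
c = (-1.0, 2.0)
a = (b[0] + c[0], b[1] + c[1])

lhs = dot(a, a)                                   # a^2
rhs = dot(b, b) + 2 * dot(b, c) + dot(c, c)       # b^2 + 2bc + c^2

# Ordinary law of cosines, using the triangle's interior angle where b meets c.
len_b, len_c = math.sqrt(dot(b, b)), math.sqrt(dot(c, c))
angle_between = math.acos(dot(b, c) / (len_b * len_c))   # angle between the two directions
interior = math.pi - angle_between                        # interior angle at the shared vertex
cosine_law = len_b**2 + len_c**2 - 2 * len_b * len_c * math.cos(interior)

print(lhs, rhs, cosine_law)
```

The interior angle is the supplement of the angle between the directions of 𝒃 and 𝒄 because 𝒄 starts where 𝒃 ends; that is why the minus sign in the law of cosines turns into the plus sign in the vector form.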


Or consider vectors 𝒖 and 𝒗. What is the vector whose tip lies at the midpoint between the tips of these vectors? Since the diagonal of the parallelogram determined by vectors 𝒖 and 𝒗 is the vector 𝒖 + 𝒗, therefore half this diagonal is the vector ½(𝒖 + 𝒗), and the tip of this vector lies at the midpoint of the straight line joining the tips of 𝒖 and 𝒗.

Can we use vector algebra to show that the three medians of a triangle all intersect in one point? Let triangle OAB have C, D as the midpoints of OA, OB. Let P be the intersection of AD, BC. Extend OP to E. To prove the theorem, all we have to do is show that BE = EA. We will consider O as the origin of our complex plane.

Then 𝑂𝐵⃗ + 𝑂𝐴⃗ = 2𝑂𝐸⃗ (diagonal of parallelogram)

so 𝑂𝐵⃗ − 𝑂𝐸⃗ = 𝑂𝐸⃗ − 𝑂𝐴⃗

Now since 𝑂𝐴⃗ = 𝑂𝐸⃗ + 𝐸𝐴⃗

thus 𝐸𝐴⃗ = 𝑂𝐴⃗ − 𝑂𝐸⃗

so −𝐸𝐴⃗ = 𝑂𝐸⃗ − 𝑂𝐴⃗

And 𝐸𝐵⃗ = 𝑂𝐵⃗ − 𝑂𝐸⃗ (since 𝑂𝐵⃗ = 𝑂𝐸⃗ + 𝐸𝐵⃗)

so 𝐸𝐵⃗ = −𝐸𝐴⃗

so 𝐵𝐸⃗ = 𝐸𝐴⃗ Q.E.D.
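As a numerical companion to this argument, the sketch below treats the vertices as complex numbers, finds the intersection P of the two medians AD and BC directly, and then checks that the line OP passes through the midpoint of AB. The helper `intersect` and the sample triangle are hypothetical choices of ours; the check illustrates the theorem for one triangle and is not a substitute for the proof.

```python
def intersect(p1: complex, p2: complex, q1: complex, q2: complex) -> complex:
    """Intersection of line p1p2 with line q1q2 (points as complex numbers; lines assumed not parallel)."""
    d1, d2 = p2 - p1, q2 - q1
    cross = d1.real * d2.imag - d1.imag * d2.real
    w = q1 - p1
    t = (w.real * d2.imag - w.imag * d2.real) / cross
    return p1 + t * d1

# Hypothetical sample triangle with O at the origin.
O, A, B = 0j, complex(4, 0), complex(1, 3)
C, D = A / 2, B / 2          # midpoints of OA and OB
P = intersect(A, D, B, C)    # where medians AD and BC cross
E = (A + B) / 2              # tip of half the diagonal, (u + v)/2: the midpoint of AB

# O, P, E are collinear exactly when this cross product vanishes.
collinearity = P.real * E.imag - P.imag * E.real
print(P, (A + B) / 3, collinearity)
```

In this example P comes out as one third of 𝑂𝐴⃗ + 𝑂𝐵⃗, so it lies two thirds of the way along the median from O to E.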


OTHER VECTOR OPERATIONS

We have seen that operations analogous to Descartes’ algebraic operations apply to vectors. Vectors also come with new operations of their own that are particularly useful in pure mathematics, and also in physics, engineering, computer programming, and many other fields. The two most fundamental of these new operations are the “dot product” and the “cross product.” You can read about them in Appendix 1.


8 Euler’s Identity Revisited

Since Wessel’s vector algebra has given a definite meaning to 𝑖, we should be hopeful that it will also give a definite meaning to Euler’s mysterious equation, 𝑒^(𝑖𝜋) + 1 = 0.

(1) Consider any complex quantity Oc = 1 + 𝑏𝑖. Let’s imitate Wessel, and start taking powers of it (like his powers of 1 + 𝑥). To square it, we just draw a right triangle Ocd on Oc that is similar to the original triangle O1c. Thus

Od ∶ Oc = Oc ∶ 1

and angle dOc = angle cO1

thus Od = (1 + 𝑏𝑖)²

Repeating the process by drawing triangle Odf similar to Ocd and O1c, we get

Of = Od ∙ Oc = (1 + 𝑏𝑖)²(1 + 𝑏𝑖) = (1 + 𝑏𝑖)³

and so on for higher powers.

(2) Therefore (1 + 𝑏𝑖)ⁿ is just the last hypotenuse drawn at the end of producing 𝑛 such similar right triangles.

(3) Keeping the angle fO1 constant, if we let 𝑧 = 𝑏𝑛, so that 𝑏 = 𝑧/𝑛, then the number of our triangles increases while at the same time the length 𝑏, the little side of our start-triangle, decreases, and therefore the first hypotenuse Oc will come to differ as little as we please from 1, and consequently every subsequent hypotenuse or power of Oc, that is, of (1 + 𝑏𝑖), will differ less and less from the unit in absolute magnitude. In other words

lim(𝑛→∞) ‖(1 + 𝑏𝑖)ⁿ‖ = 1


(4) And what limit does the sum of the absolute magnitudes 1c + cd + df (the polygonal arc) approach as 𝑛 goes to infinity? Since Od approaches Oc, therefore triangle Odc approaches congruence with triangle Oc1, and so dc approaches c1, which means dc approaches 𝑏 as 𝑛 goes to infinity. Likewise fd approaches dc, and therefore also fd approaches 𝑏 as 𝑛 goes to infinity. Therefore the length of the polygonal arc approaches 𝑛 ∙ 𝑏 as 𝑛 goes to infinity.

(5) So, as 𝑛 goes to infinity, the magnitude of (1 + 𝑏𝑖)ⁿ approaches 1, and the position of it approaches being at the end of a circular arc (since Of, Od, Oc all approach 1) of radius 1 and having an arc-length equal to 𝑛 ∙ 𝑏.

(6) In short, lim(𝑛→∞) (1 + 𝑏𝑖)ⁿ is a straight line of magnitude 1, rotated counterclockwise from the unit by 𝑛 ∙ 𝑏 units of length along the circumference of the unit circle.

(7) But lim(𝑛→∞) (1 + 𝑏𝑖)ⁿ = lim(𝑛→∞) (1 + 𝑖𝑧/𝑛)ⁿ, since 𝑏 = 𝑧/𝑛,

and lim(𝑛→∞) (1 + 𝑥/𝑛)ⁿ = 𝑒^𝑥,

which we can see by expanding the left side using the binomial theorem and taking the limit as 𝑛 goes to infinity (we did something quite like it in the junior math, and would have to use a similar technique here). And so, more particularly,

lim(𝑛→∞) (1 + 𝑖𝑧/𝑛)ⁿ = 𝑒^(𝑖𝑧)

And so, letting 𝑧 = 𝜋, we have

lim(𝑛→∞) (1 + (𝜋/𝑛)𝑖)ⁿ = 𝑒^(𝑖𝜋)

(8) From Step (6) above, we know that the left side of this equation means a straight line of magnitude 1, rotated counterclockwise from the unit by 𝑛 ∙ 𝑏 units of length along the circumference of the unit circle, that is, by 𝑛 ∙ (𝜋/𝑛) units, that is, by 𝜋 units of length along the circumference. But that is just a unit semicircle. Therefore the left side of the equation above designates a straight line of magnitude 1 rotated counterclockwise from the unit by a full semicircle, therefore

−1 = 𝑒^(𝑖𝜋)

or 𝑒^(𝑖𝜋) + 1 = 0

We now have a definite geometrical meaning for Euler’s Identity, thanks to Wessel.
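The approach to the limit can be watched numerically. The sketch below is our own illustration, with the function name `polygonal_power` invented for the purpose. It computes (1 + 𝑖𝜋/𝑛)ⁿ for a few values of 𝑛 and prints its magnitude and its angle in radians; the magnitude should drift toward 1 and the angle toward 𝜋, which is the line −1.

```python
import cmath
import math

def polygonal_power(z: float, n: int) -> complex:
    """(1 + i*z/n)^n: the last hypotenuse after n similar right triangles with small side b = z/n."""
    return (1 + 1j * z / n) ** n

for n in (10, 1_000, 100_000):
    p = polygonal_power(math.pi, n)
    # magnitude should approach 1; angle (in radians) should approach pi
    print(n, abs(p), cmath.phase(p))
```

Replacing `math.pi` with any other 𝑧 shows the same behavior, with the angle approaching 𝑧 (taken modulo a whole circle).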


OBSERVATION

Since 𝑒^(𝑖𝜋) means to rotate a unit by 𝜋 units of circular arc along the unit circle, then

(𝑒⁰)(𝑒^(𝑖𝜋)) = 𝑒^(0 + 𝑖𝜋) means the same thing.

Here we see that 𝑒⁰ gives us the absolute length of what is rotated, that is, 1, and the 𝑖𝜋 gives us the amount of counterclockwise rotation through the unit circle, namely 𝜋. More generally,

𝑎^(𝑏 + 𝑐𝑖) = (𝑎^𝑏)(𝑎^(𝑐𝑖))

and, for a positive real base 𝑎, this means a magnitude of absolute value 𝑎^𝑏 rotated counterclockwise through 𝑐 ln 𝑎 units of circular arc along the unit circle, starting from the unit position. When 𝑎 = 𝑒, the rotation is just 𝑐 units.


9 The Products of Diagonals in Regular Polygons

THE POWER OF VECTORS

Do vectors empower us in any particular way? So far, we have seen that Wessel’s vector algebra enables us to assign meaning to Euler’s Identity, and to prove it rather quickly. But can his algebra get us anywhere new? When Descartes introduced his new algebra, it was partly to make our notation and calculation easier, but also in part to enable us to make new discoveries or construct new proofs that would lie beyond us without his methods. Vector algebra also increases our power of discovery and proof. What follows is just one example.

Suppose v is one vertex of a regular polygon of 𝑛 sides (a “regular n-gon”) inscribed in a circle, and the chords from this vertex to all the other vertices are v1, v2, v3, v4, etc. Is there anything interesting about these chords? Will their sum or their product, for example, turn out to be anything interesting or pretty? This is just the sort of question that leads to mathematical discovery. Let’s ask about the product of all these chords (the Cartesian product, that is). If we express all their lengths in terms of some unit length, say the radius of the circumscribed circle, what will the product of all those chords be? Will there be any kind of pattern, a rule that the chord-products in all the different n-gons must obey?


We can start to tackle this by looking at the first and simplest case, the equilateral triangle. If VAC is an equilateral triangle inscribed in a circle with center O and radius OC = 1, what will be the chord-product, VA·VC? If we bisect AC at E, then VOE will lie in a straight line, and triangle VCE will be a 30-60-90 triangle, and so will triangle OCE. Therefore

𝑉𝐶 ∶ 𝐶𝐸 = 2 ∶ 1

and 𝐶𝐸 ∶ 𝑂𝐶 = √3 ∶ 2

so 𝑉𝐶 ∶ 𝑂𝐶 = √3 ∶ 1 [by a “perturbed proportion”]

but 𝑂𝐶 = 1

so 𝑉𝐶 = √3.

Therefore the chord product we are looking for, namely VA ∙ VC, is just

√3 ∙ √3 = 3

That’s pretty interesting. What happens now if we try it with 𝑛 = 4, that is, with a square? Let the square be VABC. Then we are looking for the value of VA·VB·VC. Since OC = 1,

thus VC = √2 = VA

and VB = 2

so VA ∙ VB ∙ VC = √2 ∙ 2 ∙ √2 = 4

Wow! If 𝑛 = 4, then the product of the chords is also 4. Could this be just a coincidence? Or could it be a general rule for regular n-gons inscribed in a unit circle that the chord-product is always equal to 𝑛? With nothing but Descartes’ algebra, it is hard to see how we could ever answer this question definitively. We could just keep trying examples (and if we did, we would find them harder and harder to compute, although they would all point toward the same general rule). But is this really a universal rule? And is there a way to prove it?
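Trying further examples by hand gets laborious, but a few lines of Python can do it for us. The sketch below is only numerical evidence, not a proof; the function name `chord_product` is our own, and placing the starting vertex at the point 1 of the complex plane is a convenience we are assuming.

```python
import cmath
import math

def chord_product(n: int) -> float:
    """Product of the lengths of the n-1 chords from one vertex of a regular n-gon in a unit circle."""
    vertices = [cmath.exp(2j * math.pi * k / n) for k in range(n)]
    start = vertices[0]                      # the vertex V, placed at the point 1
    product = 1.0
    for other in vertices[1:]:
        product *= abs(other - start)        # chord length = distance between the two vertices
    return product

for n in range(3, 9):
    print(n, round(chord_product(n), 10))
```

Each printed product can be compared with 𝑛 itself to see whether the pattern suggested by the triangle and the square continues.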


In fact it does turn out to be true that

If any regular polygon of 𝒏 sides is inscribed in a unit circle, and the (𝒏 − 𝟏) chords are drawn from one vertex to all the others, then the Cartesian product of all these chords is 𝒏.

The easiest proof of this amazing truth comes out of drawing our unit circle around the origin (O) of a complex plane (a “vector plane”). Let the vertex of our n-gon from which we will be drawing chords lie at the end of the unit in the plane (i.e., the end-point of “positive one”). From the origin draw the radial vectors to each of the vertices of the polygon. Call these, in counter-clockwise order (in keeping with the convention of the complex plane), 𝑎₁, 𝑎₂, 𝑎₃, etc., where 𝑎₁ is the first of these radii after the unit itself. Since there are 𝑛 such radial vectors in our polygon, the unit itself can also be called 𝑎ₙ. Now, since all these radii are equal to the unit in magnitude, and 𝑛 times the counter-clockwise rotation of each radial vector returns us to the position of the unit (thanks to the regularity of the n-gon), therefore

(𝑎₁)ⁿ = 1

and (𝑎₂)ⁿ = 1

and (𝑎₃)ⁿ = 1 etc.

Hence these are all (in vector algebra) 𝑛th roots of unity. Also, 𝑎₁ is a root of all the other radial vectors, since

(𝑎₁)² = 𝑎₂

and (𝑎₁)³ = 𝑎₃

and (𝑎₁)⁴ = 𝑎₄ etc.

Also, the sum of all our radial vectors, since they are in a regular polygon and the “directions” all cancel out, is zero. For example

if 𝑛 = 3,

then 𝑎₁ + 𝑎₂ + 𝑎₃ = 0

But since 𝑎₁ itself is a root of each subsequent vector, it is also true that

(𝑎₁)¹ + (𝑎₁)² + (𝑎₁)³ = 0


and similarly

(𝑎₂)¹ + (𝑎₂)² + (𝑎₂)³ = 0

since (𝑎₂)¹ = 𝑎₂

and (𝑎₂)² = 𝑎₁

and (𝑎₂)³ = 𝑎₃ = 1

But since (in this example) 𝑛 = 3, the 3rd power of any of our radial vectors will always be unity. Hence the last term in such equations is just 1. So we have

(𝑎₁)¹ + (𝑎₁)² + 1 = 0

and (𝑎₂)¹ + (𝑎₂)² + 1 = 0

In other words, 𝑎₁ and 𝑎₂ are the two roots of the equation

𝑣¹ + 𝑣² + 1 = 0

And in general, for any regular n-gon, the (𝑛 − 1) radial vectors other than 1 itself will be the (𝑛 − 1) roots of the equation

𝑣ⁿ⁻¹ + 𝑣ⁿ⁻² + … + 𝑣¹ + 1 = 0

Now consider the equation

(1 − 𝑧)ⁿ⁻¹ + (1 − 𝑧)ⁿ⁻² + … + (1 − 𝑧)¹ + 1 = 0

This is basically the same equation, although we have changed the “argument” from our variable 𝑣 to (1 − 𝑧). Since the equation is basically the same, we can equate all the same roots as before with (1 − 𝑧), that is,

(1 − 𝑧) = 𝑎₁

and (1 − 𝑧) = 𝑎₂

and (1 − 𝑧) = 𝑎₃ etc.

and all of these (and only these) values of (1 − 𝑧) will make the equation true.

Then what are the values of 𝑧 itself? Looking at the first equation above, and solving for 𝑧, we get

𝑧 = 1 − 𝑎₁

and so too 𝑧 = 1 − 𝑎₂ from the second equation above,

and 𝑧 = 1 − 𝑎₃ etc.
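The facts used so far can be checked numerically for any particular 𝑛. The sketch below takes 𝑛 = 5 as a hypothetical example and uses names of our own (`a`, `geometric_sum`). It confirms that the powers of 𝑎₁ run through the other radial vectors, that the radial vectors sum to zero, that each radial vector other than 1 satisfies the equation in 𝑣, and that each 𝑧 = 1 − 𝑎ₖ satisfies the equation in (1 − 𝑧).

```python
import cmath
import math

n = 5  # hypothetical example; any n >= 2 can be tried

# a[k] is the radial vector a_k; a[n] is the unit itself (index 0 is unused).
a = [None] + [cmath.exp(2j * math.pi * k / n) for k in range(1, n + 1)]

def geometric_sum(v: complex) -> complex:
    """v^(n-1) + v^(n-2) + ... + v + 1."""
    return sum(v**k for k in range(n))

powers_match = all(abs(a[1] ** k - a[k]) < 1e-12 for k in range(1, n + 1))
vectors_cancel = abs(sum(a[1:])) < 1e-12
roots_in_v = all(abs(geometric_sum(a[k])) < 1e-12 for k in range(1, n))

# Substitute z = 1 - a_k into the equation written in terms of (1 - z).
z_values = [1 - a[k] for k in range(1, n)]
roots_in_z = all(abs(geometric_sum(1 - z)) < 1e-12 for z in z_values)

print(powers_match, vectors_cancel, roots_in_v, roots_in_z)
```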


In short, the roots of the equation

(1 − 𝑧)ⁿ⁻¹ + (1 − 𝑧)ⁿ⁻² + … + (1 − 𝑧)¹ + 1 = 0

that is, the values of 𝑧 that make it true, are

(1 − 𝑎₁), (1 − 𝑎₂), (1 − 𝑎₃), ... (1 − 𝑎ₙ₋₁)

So the product of all the roots of this equation is

(1 − 𝑎₁)(1 − 𝑎₂)(1 − 𝑎₃) … (1 − 𝑎ₙ₋₁)

We can easily evaluate that product by looking to the equation itself, and expanding its binomials.

Consider, for example, when 𝑛 = 3. Our equation is then

(1 − 𝑧)² + (1 − 𝑧)¹ + 1 = 0

In that case, expanding, we have

1 − 2𝑧 + 𝑧² + 1 − 𝑧 + 1 = 0

Reorganizing this in descending order of powers of 𝑧, we have

𝑧² − 2𝑧 − 𝑧 + 1 + 1 + 1 = 0

Notice that the leading coefficient (the coefficient of the highest power of 𝑧, namely the 𝑛 − 1 power, in this case 𝑧²), is 1, and our final constant is just 𝑛 (in this case 3). The final constant is always 𝑛, regardless of the value of 𝑛, and the leading coefficient is always 1 in magnitude, with a sign that depends on 𝑛, as follows.

Because the highest degree of 𝑧 must be the same as the highest power of (1 − 𝑧) prior to expansion, and this expansion means we multiply −𝑧 by itself (𝑛 − 1) times to get the highest power of 𝑧, therefore when (𝑛 − 1), the maximum degree in our equation, is even, the leading coefficient will be positive 1, and when (𝑛 − 1) is odd, the leading coefficient will be negative 1.

Again, since each raised binomial in the equation yields only a single constant term, namely 1, our final constant term will be equal to as many ones as we have raised binomials, plus the original 1 in our equation. But we always have (𝑛 − 1) binomials, no matter what 𝑛 is. Hence the constant term in the expanded form of our equation will always be (𝑛 − 1) + 1, that is, it will always be 𝑛.

But the root-product of our equation, as with any polynomial, is the final constant (namely 𝑛) divided by the leading coefficient for an even maximum degree of (𝑛 − 1), and it is just the negative of this for an odd maximum degree of (𝑛 − 1). (See the explanation after this proof.)


So, for 𝑛 odd, hence (𝑛 − 1) even, the leading coefficient is 1 and the root-product has to be

𝑛/1 = 𝑛

and for 𝑛 even, hence (𝑛 − 1) odd, the leading coefficient is −1 and the root-product is the negative of the quotient, so it has to be

−(𝑛/(−1)) = 𝑛

So the root-product of

(1 − 𝑧)ⁿ⁻¹ + (1 − 𝑧)ⁿ⁻² + … + (1 − 𝑧)¹ + 1 = 0

is always just 𝑛, for odd 𝑛 and for even 𝑛 alike. Hence

(1 − 𝑎₁)(1 − 𝑎₂)(1 − 𝑎₃) … (1 − 𝑎ₙ₋₁) = 𝑛

If we return now to the geometry of the regular polygon in the complex plane, we see that all the chords drawn from the starting vertex (that is, from the end-point of 1 or 𝑎ₙ) are

1 − 𝑎₁

1 − 𝑎₂

. . .

1 − 𝑎ₙ₋₁

And in vector-multiplication, one finds the fourth proportional (from the unit and the two terms to be multiplied) both in magnitude and in direction. So the direction will make no difference to the absolute magnitude of the final product, or to the Cartesian product. Therefore the Cartesian product of the (𝑛 − 1) chords in a regular polygon drawn from one vertex to all the others in a unit circle is 𝑛. Q.E.D.
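The bookkeeping about the constant term and the leading coefficient can be checked by expanding the binomials mechanically. The sketch below is an illustration with helper names of our own (`poly_mul`, `shifted_sum`). It stores each polynomial as a list of integer coefficients, lowest power first, and computes the root-product by the rule for the product of the roots of a polynomial (the constant divided by the leading coefficient, negated when the degree is odd).

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def shifted_sum(n):
    """Coefficients (lowest power first) of (1-z)^(n-1) + ... + (1-z) + 1."""
    total = [0] * n
    power = [1]                            # (1 - z)^0
    for _ in range(n):
        for i, c in enumerate(power):
            total[i] += c
        power = poly_mul(power, [1, -1])   # multiply by (1 - z)
    return total

for n in range(2, 8):
    coeffs = shifted_sum(n)
    constant, leading, degree = coeffs[0], coeffs[-1], n - 1
    # Root-product rule: (-1)^degree * constant / leading (leading is +1 or -1 here).
    root_product = (-1) ** degree * constant // leading
    print(n, coeffs, root_product)
```

For 𝑛 = 3 the coefficient list corresponds to 𝑧² − 3𝑧 + 3, the equation worked out by hand in the proof.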


PRODUCT OF ROOTS IN A POLYNOMIAL

To see that the product of all the roots of a polynomial is the final constant divided by the leading coefficient (or simply the final constant itself if the lead coefficient is 1), or the negative of this if the degree of the polynomial is odd, just consider some examples. First

(𝑥 − 𝑎)(𝑥 − 𝑏) = 𝑥² − 𝑏𝑥 − 𝑎𝑥 + 𝑎𝑏

Clearly the roots are 𝑎 and 𝑏. Also, the final constant has to be their product. In this case, the degree is even.

But if the degree is odd, then (for example)

(𝑥 − 𝑎)(𝑥 − 𝑏)(𝑥 − 𝑐) = 𝑥³ − (𝑎 + 𝑏 + 𝑐)𝑥² + (𝑎𝑏 + 𝑏𝑐 + 𝑎𝑐)𝑥 − 𝑎𝑏𝑐

Here, too, the roots are 𝑎, 𝑏, 𝑐 and the root-product is therefore 𝑎𝑏𝑐, and the final constant in the polynomial must be the negative of this, thanks to the odd degree.

DIRECTION OF THE CHORD-PRODUCT

We were mainly concerned with the absolute value of the chord-product in our regular n-gon. But what is its final direction, if we take a vector-product? Take each chord as a vector drawn from the starting vertex to another vertex, that is, as 𝑎ₖ − 1, the negative of the root 1 − 𝑎ₖ used in the proof.

If 𝑛 is odd, then the n-gon will have no chord lying along the x-axis, and so each chord will have its equal and opposite symmetrical correspondent, and consequently the unit will have to one of these the same angle that the symmetrical correspondent has to the unit. Consequently, every such pair produces a real and positive net direction, and the total direction is just real and positive.

If 𝑛 is even, then the n-gon will have a chord lying along the x-axis, and this will have no opposite to cancel out its negative direction, although all the other chords will produce a real and positive direction. Hence the net direction is real and negative. (The product of the roots 1 − 𝑎ₖ themselves, which point the opposite way, is real and positive for every 𝑛.)
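The direction of the vector-product can also be observed directly. The sketch below assumes each chord is taken as the vector from the starting vertex at 1 to the vertex 𝑎ₖ, that is, as 𝑎ₖ − 1. That orientation is our reading of “drawn from one vertex to all the others,” and the function name is hypothetical.

```python
import cmath
import math

def directed_chord_product(n: int) -> complex:
    """Vector product of the chords a_k - 1, drawn from the vertex at 1 to every other vertex."""
    product = 1 + 0j
    for k in range(1, n):
        product *= cmath.exp(2j * math.pi * k / n) - 1
    return product

for n in range(3, 9):
    p = directed_chord_product(n)
    # real part carries the sign; imaginary part should be (nearly) zero
    print(n, round(p.real, 9), round(p.imag, 9))
```

With this orientation the product comes out real in every case, positive for odd 𝑛 and negative for even 𝑛. Reversing every chord to 1 − 𝑎ₖ multiplies the result by (−1)ⁿ⁻¹ and so makes it positive throughout.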


SUM OF CHORDS

If you use the information in this proof to find the sum of the chords, you will be finding the vector-sum. Consider, for example, the equilateral triangle, in which case $n = 3$. If we want to find the vector-sum of the chords for this n-gon, this amounts to finding the sum of the roots of the equation

$$z^2 - 2z - z + 1 + 1 + 1 = 0$$

that is, of the equation

$$z^2 - 3z + 3 = 0$$

But how do we tell what the sum of the roots of such an equation must be? Consider again:

$$(x - a)(x - b) = x^2 - bx - ax + ab$$

so

$$(x - a)(x - b) = x^2 - (a + b)x + ab$$

So the negative of the coefficient of the second-highest power in the equation is the sum of the roots. This works also for higher powers:

$$(x - a)(x - b)(x - c) = x^3 - bx^2 - ax^2 + abx - cx^2 + bcx + acx - abc$$

$$(x - a)(x - b)(x - c) = x^3 - (a + b + c)x^2 + (ab + bc + ac)x - abc$$

In this example, the negative of the coefficient of the $x^2$ term gives us the sum of the roots of the equation. Coming back, now, to our case of $n = 3$, this means the sum of the roots of our equation (i.e., the sum of the chords in this figure, the equilateral triangle inscribed in the unit circle) is the negative of the coefficient of the $z$ term in

$$z^2 - 3z + 3 = 0$$

The sum of the roots is therefore just 3. And this is the vector-sum of the chords in our equilateral triangle (it is not their scalar sum).
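The distinction between the vector-sum and the scalar sum can be checked numerically. The following is only a sketch under a stated assumption: each chord is written as 1 − ωᵏ, where ω is a primitive n-th root of unity, since under that convention the two chords of the triangle are exactly the roots of z² − 3z + 3 = 0. The function names are illustrative and do not come from the text.

```python
import cmath

def chords(n):
    """Chords from the vertex at 1 to the other vertices of a regular n-gon
    inscribed in the unit circle, written as 1 - w**k (assumed convention)."""
    w = cmath.exp(2j * cmath.pi / n)   # primitive n-th root of unity
    return [1 - w**k for k in range(1, n)]

def vector_sum(n):
    """Sum of the chords as directed quantities (complex numbers)."""
    return sum(chords(n))

def scalar_sum(n):
    """Sum of the lengths of the chords."""
    return sum(abs(c) for c in chords(n))

# For the equilateral triangle the vector-sum is 3, while the scalar sum
# is twice the side length of the triangle.
print(vector_sum(3))
print(scalar_sum(3))
```

Under this convention the vector-sum for a general n-gon is n, because the n-th roots of unity other than 1 sum to −1.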


10 Introduction to Dedekind

Julius Wilhelm Richard Dedekind (1831-1916) never went by Julius or Wilhelm, but always by Richard. He was born in, and died in, Braunschweig, Germany, where he spent most of his life. He never married, but lived with his sister, Julia.

He was also, of course, a mathematician. His doctoral thesis advisor was none other than the great Carl Friedrich Gauss, one of the last mathematicians to know pretty much all of mathematics. Dedekind made significant contributions to abstract algebra (to ring theory, to be more specific), to the theory of algebraic numbers, and to the definition of real numbers. When he was first teaching calculus at the Polytechnic school in Zurich, he developed the idea of a “cut” (in German, Schnitt), which is today known as a Dedekind cut, and which is used to define real numbers. Now that we have talked about imaginary numbers with Wessel, it is time to learn about real numbers from Dedekind. (Charming, isn’t it, to move from imaginary ones to real ones?)

What are the “real numbers”? They are the answer to a question, and so we cannot appreciate them until we first grasp the question to which they are the answer. Wessel began with an abstract algebraic entity, 𝑖, and sought to give it a concrete geometrical meaning. Dedekind is going in the opposite direction, beginning with a concrete geometrical thing, namely the continuity of a straight line, and trying to find a way to define continuity in the abstract, apart from geometry and in “purely arithmetic” or logical terms. This project results in the real numbers. But what motivated the project in the first place?

Dedekind saw that those functions that were the primary concern of calculus were not just about curves in a Cartesian coordinate system, but could be interpreted as relating other kinds of geometrical quantities, such as areas and slopes, or volumes and surfaces, or even quantities that were not geometrical at all, such as speeds and times. Consequently, it was important to the universality of calculus not to ground its general assertions in things peculiar to geometry, much less in things peculiar to Cartesian coordinate systems in particular. Dedekind observed that the way calculus was taught to students, and even the way that it was explained and understood among mathematicians in his time, was very much tied to the geometrical interpretation of functions, and so he began to search for a way to liberate the foundations of calculus from the confines of geometry.

But there was also a deeper problem. Not only were general terms and concepts lacking, but the principles of calculus strongly depended on the idea of continuity, and yet this idea remained vague, and the reason it should convince us of the truth of the principles of calculus was unclear. For example, much of calculus rests on the principle that if any definite process adds to a magnitude again and again as many times as we please, but can never increase the total magnitude to a certain finite amount, then there must be some first magnitude that the process can never reach. In some cases, as in the case of the inverse powers of two, we can construct the magnitude that the series is approaching, and we can prove that the constructed magnitude (in this case 2) is the first thing that the series cannot reach, although it can come as near as we please to it. In other cases, however, we cannot construct the magnitude, as in the case of 𝑒 or 𝜋. So while we can prove that the series

$$\frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots$$

can never grow as large as 3, regardless of how many terms we take it to, we cannot actually construct some magnitude and then prove that this series is approaching it. Instead, we assume as a matter of principle that it must be approaching some definite magnitude, and then give it a name, 𝑒. Is this move legitimate? The answer generally given to that question, up to the time of Dedekind, was to make some vague appeal to the idea of continuity. Since a straight line is continuous, and since this series could not reach certain lengths along the straight line, but could surpass others, there had to be in the line some dividing point between the lengths it could surpass and the ones it could not reach, and this point marked the length that the series was approaching. The same argument would have to be made in the case of $2^{\sqrt{2}}$, which we do not know how to construct either, but which we can define as the limit of $2^x$ with rational $x$ as $x$ approaches $\sqrt{2}$.

Dedekind was unhappy with this. Calculus was supposed to be a science, and here we were, at the bottom of it all, assuming the existence of things based on some vague notion that “they have to be in there, somewhere, because of continuity.” Was there no clearer (and more arithmetic or abstract) way of stating what continuity consisted in, such that we could be quite certain that the principles of calculus were true? And was there some other way to define continuity among our numbers without appealing to particular kinds of continuity, such as the geometrical or temporal sort? Even more fundamentally, one might wonder what quantities a continuum must contain in order to be considered continuous. For example, do we have to admit that 𝑒 is included somewhere in a straight line that is five units long? How do we know it is in there? If it is because the line is continuous, what does “continuous” mean such that the inclusion of 𝑒 in the line follows once we call the line continuous?

This was the question to which the real numbers would be the answer.

More general questions about the nature of number hover in the background of the essay we are about to read, Continuity and Irrational Numbers. There are, for example, the questions: What is a number? Where do numbers exist? What produces numbers? Plato, it is said, believed numbers to be separated substances existing independently of the participations in them that we see in this world. “Two Itself” subsisted by itself in some numerical realm apart from any two apples or two oranges. And of course we do not invent or produce this self-subsisting “Two Itself,” since it existed from all eternity, and never came to be. In our science of number, thought Plato, we discover the numbers, we do not produce them. He had a similar view of other natures, such as “Horse Itself.” Since we know many truths about the equine nature that are unchanging and timeless and universal, therefore that nature must exist somewhere in an unchanging way, apart from all material and individual horses.

Plato’s greatest disciple, Aristotle, disagreed. According to him, “what a horse is” does exist outside our minds, but only in individual horses. The separation of “what a horse is” from individual horses happens only in our minds. Dedekind seems to disagree with Plato even more radically than Aristotle did. At least Aristotle would admit that we find “what a horse is” in particular horses, and that we make this nature universal in our minds only by considering it apart from what is peculiar to any individual horse. He said the same about the nature of number: we find it, we don’t create it, and it is a kind of quantitative being that exists outside the mind in numerable things. Dedekind, on the other hand, thought not only that we can consider what numbers are apart from any particular kind of numerable thing (something with which Euclid and Aristotle would agree), but also that we create numbers in some sense, and perhaps not just individual numbers, as when we count, but whole kinds of numbers that did not exist before, such as “irrational numbers.” Aristotle seemed to speak as though we construct numbers in our minds (“numbering numbers”) somehow, but in imitation of the numbers that we find in things (“numbered numbers”). He did not think of numbers as though they were essentially logical entities, invented by the human mind for its own purposes.

As you read through Dedekind’s essay, you might wonder about what he thinks numbers are, and in what way he thinks the human mind is the cause of numbers. Dedekind also wrote an essay titled “What Are Numbers and What Should They Be?” (Was sind und was sollen die Zahlen?). This is the sort of question that a modern mathematician might ask, but which would never occur to an ancient mathematician. The modern mind sees itself as being more a cause of the things it studies than the ancient mind did. In the other essay in our book (which we do not read in the program), called The Nature and Meaning of Numbers, Dedekind attempts to define the number 1, and hence all other real numbers, purely in terms proper to logic. This purely logical approach to numbers, making them out of the stuff of mind, seems to be what he means by a “purely arithmetic approach.”

Our concern will be primarily with what Dedekind has to say about the nature of continuity. As you read through Dedekind’s essay Continuity and Irrational Numbers, keep in mind some of the following questions. First of all, what does continuity consist in? Aristotle offered two definitions of “continuous” pertinent to mathematics (and a third one of significance only in natural science), one in the Categories, where he says a quantity is continuous if every division of it is into parts that share a common boundary, and another in the Physics, where he says a quantity is continuous if it is divisible forever. Are these ideas of continuity adequate? If a set of ordered points, for example, is infinitely divisible, so that between any two there is an infinity of others, is the set of points continuous? There are no point-free spaces among them, on that condition, and so it would certainly seem that they form a continuum somehow. In order to guarantee that a set of numbers is just as rich in numbers as a straight line is rich in points, is it enough if there is an infinity of numbers? Is it enough if there is also an infinity of numbers between any two of the numbers? What exactly is Dedekind’s idea of continuity, and how does it compare to Aristotle’s definitions? And is it right or wrong, foolish or ingenious, possible or impossible, to number a continuum, to endeavor to know all the divisions of something continuous through number and arithmetical operations? And, in the end, does Dedekind save the claim of calculus to be a science?


11 The Infinite World of Georg Cantor

INTRODUCTION

Georg Ferdinand Ludwig Philipp Cantor was born in St. Petersburg, Russia in 1845, the oldest of six children. When he was twelve years old, his family moved to Germany. Cantor’s mother, Maria Anna Böhm, was Russian and quite musical. Cantor’s grandfather, Franz Böhm (1788-1846), had been a well-known musician in a Russian imperial orchestra, and Cantor himself was a talented violinist. Cantor’s mother was also born a Roman Catholic, while his father, Georg Waldemar Cantor, was a convert from Judaism to Protestantism. Religion played an important role in the Cantor household, and some say Georg Cantor’s fascination with infinity was partly due to this.

Whatever the cause, Cantor took an interest in questions about infinites that no one before him had answered or even asked. His mathematical work was arguably the most original of the nineteenth century, possibly the most original work since the first mathematical discoveries of the ancients. One consequence of the novelty of his work is that it met with powerful opposition from many influential mathematicians of his day, which caused him great pain. He was a sensitive man, prone to nervous breakdowns, and was more than once admitted to a psychiatric hospital. When he was on holiday in Interlaken in 1872, he met Richard Dedekind, who befriended him, became one of the first mathematicians to appreciate his work, and became an important ally in his disputes with Leopold Kronecker, an eminent mathematician who was philosophically opposed to transfinite numbers.

Another consequence of the novelty of his work is that it starts more or less from its own principles, presupposing little more than a background in the basics of number theory. If you know enough algebra to know what real numbers are, you are ready to begin the study of Cantor, at least as his work will be presented here. No calculus required.


Whereas his friend, Richard Dedekind, had never married, Cantor married Vally Guttmann, and they had five children. Nonetheless, Cantor’s life was a sad one by all accounts. He suffered bouts of mental illness, and struggled to prove many strange things outside of mathematics that did not help him gain the acceptance among mathematicians he so much desired. He was intensely interested in proving that Francis Bacon actually wrote the works of Shakespeare, and claimed to have discovered information about the first British king that “will not fail to terrify the English government as soon as the matter is published.”

1884 was the year of his first mental breakdown. He recovered, but thereafter lived in fear of the return of this affliction. And return it did. His beloved and youngest son, Rudolf, died unexpectedly in 1899, and Cantor was back in the “neuropathic hospital” in Halle in 1902, and again in the years 1904, 1907, and 1911. Cantor retired in 1913, living in poverty and suffering from malnutrition during World War I. He died on January 6, 1918, in the hospital where he had spent the last year of his life on account of his condition.

But he was a man of great genius if also a troubled soul. And his work has come to be admired by mathematicians all over the world, and influenced some of the greatest mathematicians of the twentieth century, including Kurt Gödel.

Let us take a look at some of his ideas, and see what astonishing things Cantor had uncovered.

(1) Let us start with this question: Which are there more of, rational numbers or irrational ones?

There is an infinity of each, of course. But is it possible for one infinite to be greater than another? To outrank another somehow?

Between any two rationals, no matter how close together they are, there is an infinity of irrationals. For example, between 0 and 1 we have $\frac{\sqrt{2}}{2}$, $\frac{\sqrt{2}}{3}$, $\frac{\sqrt{2}}{4}$, and so on.

Between any two irrationals, no matter how close together they are, there is an infinity of rationals. For example, between $\sqrt{2}$ and $2\sqrt{2}$, we have $\frac{3}{2}$ and $\frac{5}{2}$, and clearly there is an infinity of rationals between these two numbers, since we can bisect the distance between them and produce a new rational between them (namely $\frac{4}{2}$), then bisect the distance between that and either of the original rationals and get new rationals between $\sqrt{2}$ and $2\sqrt{2}$, and so on.

(2) This consideration might incline us to think that the rationals and the irrationals are more or less equal in multitude, that their infinities are in some sense evenly distributed throughout the points of the real number line. Is this the case? Are the real numbers “half rational” and “half irrational,” so to speak?


(3) Enter Georg Cantor, the first modern mathematician to consider from scratch a way to compare the sizes of sets of objects, whether finite sets or infinite ones.

(4) One of the first realizations Cantor made will seem extremely simple, and indeed it is, and this marvelous simplicity characterizes much of his work. The realization is this: Seeing that two sets of things are “equinumerous” (equally many) does not always require counting. We can (sometimes) count the objects in two sets of objects in order to learn whether they are equinumerous, but we need not do so. There is another way.

To learn whether the number of fingers on one hand is equal to the number of fingers on another, for example, we may of course count the fingers in each case. If the number we count up to in one case is the same as in the other, say five, then the two sets of fingers are equinumerous. But we can also see that two sets are equinumerous without counting the objects in either set. Instead, we can tell that they are equinumerous if we can find a way to place all the objects in one set into a one-to-one correspondence with all the objects in the other set. This means that every object in set A, for example, is paired with a unique object in set B, and conversely every object in set B is paired with a unique object in set A.

If you look out at a theatre with hundreds of seats in it, you might not be able to tell how many seats there are without counting. But if you see that every single seat is occupied by one person, and every single person in the theatre occupies just one seat, so that “seats” and “persons” are in a one-to-one correspondence, you may be sure that the number of seats is the same as the number of persons—even though you have no idea how many seats there are, or how many persons.

Seeing a one-to-one correspondence is something more basic, more fundamental, than counting, it would seem. It is analogous to coincidence among magnitudes.

(5) In accord with this simple realization, Cantor proposed this definition:

Two sets M and N are equivalent ... if it is possible to put them, by some law, in such a relation to one another that to every element of each of them corresponds one and only one element of the other.

Notice that this idea of “equivalent sets” in no way assumes that we are talking about finite sets. The notion applies equally well to finite sets or infinite ones. Modern mathematicians speak of sets having the same cardinality if they meet Cantor’s condition above.
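For finite sets, the test for a one-to-one correspondence can be sketched in a few lines of code. This is only an illustration of the definition, not anything found in Cantor; the function name and the theatre data are made up for the example. Note that the check never asks how many elements either set has.

```python
def is_one_to_one_correspondence(pairs, set_a, set_b):
    """Return True if `pairs` matches every element of set_a with exactly one
    element of set_b, and every element of set_b with exactly one of set_a."""
    used_a, used_b = set(), set()
    for a, b in pairs:
        if a in used_a or b in used_b:
            return False            # some element was paired twice
        used_a.add(a)
        used_b.add(b)
    # every element of each set must have been used, and nothing else
    return used_a == set(set_a) and used_b == set(set_b)

# Hypothetical theatre: every seat holds one person, every person has one seat.
seats = {"A1", "A2", "A3"}
persons = {"Ann", "Ben", "Cal"}
seating = [("A1", "Ben"), ("A2", "Ann"), ("A3", "Cal")]
print(is_one_to_one_correspondence(seating, seats, persons))
```

If one person were left standing, or one seat left empty, the final comparison would fail and the function would return False.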


(6) With this new concept and vocabulary in place, let us now ask a few warm-up questions before getting back to our question about the rationals and the irrationals.

Question: Does the set of positive integers, P, have the same cardinality as the set of positive even numbers, E? It is plain that they do, since we can pair every positive integer with one and only one even, in such a way that each even will also be paired with one and only one positive integer. All we have to do is write out the positive integers in order, and then beneath them write their doubles:

P    1    2    3    4    5    6    ...
E    2    4    6    8    10   12   ...

Galileo did much the same thing in Two New Sciences, pairing the positive integers with their squares. So we are not doing anything truly ground-breaking yet.

(7) If 𝐏 is a set, let us use the notation $\overline{\mathbf{P}}$ to signify “the cardinality of 𝐏.” So now we can write

$$\overline{\mathbf{P}} = \overline{\mathbf{E}}$$

or, in English, “the set of positive integers and the set of positive even integers have the same cardinality.”

(8) Now a slightly harder question. Does the set of integers (both positive and negative, now), Z, have the same cardinality as the even numbers? Here we must be a little more clever, since if we begin writing out the integers from some place (such as zero) and then go from there in one direction (say, into the positives), then we will miss all the ones in the other direction (the negatives). So how do we write them in a way that guarantees we will miss none of them? We just flip back and forth:

Z    0    1    -1   2    -2   3    -3   4    -4   ...
E    2    4    6    8    10   12   14   16   18   ...

Clearly each list is exhaustive and non-repeating, and so every element included in the set of integers will be matched with a unique element in the set of even numbers and vice versa. So again, these sets have the same cardinality.
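The flip-back-and-forth listing can be written as an explicit rule. The sketch below is one way to do so; the function names are illustrative and are not taken from the text. It gives the integer standing in the k-th place of the list, the place at which a given integer stands, and hence the even number paired with it.

```python
def nth_integer(k):
    """Integer in the k-th place (k = 1, 2, 3, ...) of the list 0, 1, -1, 2, -2, ..."""
    return k // 2 if k % 2 == 0 else -(k // 2)

def place_of(z):
    """Place in the list at which the integer z stands (inverse of nth_integer)."""
    return 2 * z if z > 0 else 1 - 2 * z

def even_partner(z):
    """Even number paired with the integer z: the k-th integer gets the k-th even."""
    return 2 * place_of(z)

print([nth_integer(k) for k in range(1, 10)])
print([even_partner(nth_integer(k)) for k in range(1, 10)])
```

Because `place_of` undoes `nth_integer`, no integer is missed and none is listed twice, which is what the pairing requires.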


(9) Since we are finding so many infinite sets with the same cardinality, let us give a name to that cardinality. Any set of objects having the same cardinality as the set of positive integers will be said to have cardinality $\aleph_0$ (pronounced “aleph-naught”; the ℵ symbol is the first letter of the Hebrew alphabet). So we have defined our first transfinite cardinal number. Also, let us call any set of objects that can be placed in one-to-one correspondence with the positive integers denumerable (or countably infinite).

(10) Now another warm-up question, this time a little tougher still: Does the set of rational numbers, Q, have the cardinality $\aleph_0$? That is, can the set of rationals be placed in one-to-one correspondence with the set of positive integers? This is trickier. How can we list all the rationals? If we make a table of fractions formed only out of integers, we can easily arrange them so that in the first column will be all the positive fractions with a numerator of 1, in the second column all those with a numerator of 2, etc., and in the first row will be all the fractions with a denominator of 1, in the second row all those with a denominator of 2, etc. In this way, all the positive rationals will be included (in fact, there will be an infinity of repetition, since we will have, for example, not only $\frac{1}{2}$ but also $\frac{2}{4}$ and so on). In each box, we can also include the negative of each positive rational (not included in the accompanying table as drawn, in order to avoid clutter), and so we have a table of all the rationals.

                          Numerators
                   1     2     3     4     5     6     7
Denominators  1   1/1   2/1   3/1   4/1   5/1   6/1   7/1
              2   1/2   2/2   3/2   4/2   5/2   6/2   7/2
              3   1/3   2/3   3/3   4/3   5/3   6/3   7/3
              4   1/4   2/4   3/4   4/4   5/4   6/4   7/4
              5   1/5   2/5   3/5   4/5   5/5   6/5   7/5
              6   1/6   2/6   3/6   4/6   5/6   6/6   7/6
              7   1/7   2/7   3/7   4/7   5/7   6/7   7/7

Can we assign a unique positive integer to each rational? Yes. Start with the first box (Column 1, Row 1), and pair the positive rational in there with the first positive integer, 1, and the negative fraction in there with the second positive integer, 2. Now go over to the next box (Column 2, Row 1), and pair the positive fraction in there with 3, and the negative with 4. Now go diagonally down to the box in Column 1, Row 2, and pair the positive fraction in there with 5 and the negative with 6. Continuing in this diagonal fashion, snaking back and forth like a sidewinder, we will not miss any of the rationals, and each will have gotten paired with a unique positive integer not belonging to any other rational. (We actually will have paired every rational with more than one integer, since every rational is repeated, in this way of listing them, an infinity of times; but that just means there are plenty of integers to go around; in fact, there is an infinity of integers to every rational. If we like, we can skip every fraction that is not in lowest terms, assigning it no positive integer, and then we will get a perfect one-to-one correspondence. Zero has no box in this table; to include it, pair it with 1 and shift every other pairing up by one.)

QUESTION: Can all the rationals be placed in one-to-one correspondence with just the positive prime numbers?

(11) We have now seen that

$$\overline{\mathbf{P}} = \overline{\mathbf{E}} = \overline{\mathbf{Z}} = \overline{\mathbf{Q}} = \aleph_0$$

It is starting to look as though all infinite sets will just have this same cardinality, which is just what one would expect. Can all infinite sets be placed in one-to-one correspondence with the positive integers?

(12) To find out, let’s consider the set of real numbers. In particular, let’s consider the set of points that can be taken between 0 and 1 on the positive x-axis. Can these be placed in one-to-one correspondence with the positive integers? We are now asking about the infinity of something continuous, which is different from the previous cases. Let us call an interval of real numbers, such as (0,1), a continuum. (The notation (0,1) in this context will mean “the set of all real numbers $x$ such that $0 < x < 1$”.)

Our question, then, is whether (0,1) and P can be matched one-to-one.

(13) THEOREM: THE NON-DENUMERABILITY OF THE CONTINUUM

The interval of all real numbers between 0 and 1 is not denumerable.

Following Cantor, we will prove this amazing theorem by reduction to the absurd.
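Before the proof, it may help to make the contrast concrete: the rationals of item (10) really can be listed by a definite process. The following sketch walks the table of fractions diagonal by diagonal, snaking back and forth, and skips every fraction not in lowest terms. The function names are illustrative, and zero is left out here because it has no box in the table; it could simply be placed at the head of the list.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def boxes():
    """Walk the table diagonal by diagonal, snaking back and forth.
    Yields (column, row), i.e. (numerator, denominator)."""
    for total in count(2):              # numerator + denominator is constant on a diagonal
        columns = range(1, total)
        if total % 2 == 1:              # reverse direction on every other diagonal
            columns = reversed(columns)
        for column in columns:
            yield column, total - column

def rationals():
    """List each nonzero rational exactly once: positive first, then its negative."""
    for numerator, denominator in boxes():
        if gcd(numerator, denominator) == 1:    # skip repeats such as 2/4
            yield Fraction(numerator, denominator)
            yield Fraction(-numerator, denominator)

# The k-th rational produced is the one paired with the positive integer k.
for k, q in enumerate(islice(rationals(), 10), start=1):
    print(k, q)
```

The walk visits the boxes in the order (Column 1, Row 1), (Column 2, Row 1), (Column 1, Row 2), and so on, as described in item (10). No analogous listing exists for the reals in (0,1), which is the content of the theorem.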

First we note that any real number between 0 and 1 can be expressed as an infinite decimal:

$$\frac{1}{2} = .5000000\ldots$$

$$\frac{3}{11} = .27272727\ldots$$

$$\frac{\pi}{4} = .78539816\ldots$$

Next we adopt a convention in order to avoid certain ambiguities. Some real numbers can be represented more than one way in a decimal expansion. For example:


$$\frac{1}{2} = .5000000\ldots$$

$$\frac{1}{2} = .4999999\ldots$$

In cases like these, we will use the expression in zeroes rather than the one in nines.

The argument begins by noting that the process of adding 1 repeatedly is a definite process that misses no positive integer when carried far enough. Therefore, if there were a definite process by which every real number between 0 and 1 could be paired with a unique integer (and in one-to-one fashion, so that every integer was also, by this same process, paired with a unique real number between 0 and 1), then that would constitute a definite process of listing real numbers between 0 and 1 that misses none of them when carried far enough. In short, if the reals between 0 and 1 could be put in one-to-one correspondence with the positive integers by some definite process, there would be a definite process by which to list all the reals between 0 and 1. But there is no such process. For if possible, let there be such a process, and let the first real numbers between 0 and 1 that it generates be the following:

Positive Integers     Real Numbers in (0,1)

1                     $x_1 = .89000340777777348\ldots$
2                     $x_2 = .50000000000000000\ldots$
3                     $x_3 = .34343434343434343\ldots$
4                     $x_4 = .44569070020003999\ldots$
5                     $x_5 = .68319256902819232\ldots$
6                     $x_6 = .00000000078300002\ldots$
.                     .
.                     .
n                     $x_n = .a_1 a_2 a_3 a_4 a_5 a_6 a_7 \ldots a_n \ldots$
.                     .
.                     .

By supposition, the pairing process in question will produce any real number between 0 and 1 eventually, if it is carried far enough, and the list of reals on the right will be quite definite (the list is generated by our process, and each entry in it is just a matter of calculation). Consequently, we can define a number $b$ such that

$$b = .b_1 b_2 b_3 b_4 b_5 \ldots b_n \ldots$$

where the decimal digits $b_1$, $b_2$, $b_3$, etc., are found by taking the corresponding digits of $x_1$, $x_2$, $x_3$, etc. (the first digit of $x_1$, the second digit of $x_2$, the third digit of $x_3$, and so on), and rolling each digit forward by 1, except that when the result is 0 or 9, we instead roll backward 1 digit (through the cycle of digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3 etc.).

For example, using our list above, we see that the first digit of $x_1$ is 8. Adding 1 to this would produce 9, so we instead go backward one digit, giving us 7.

And we see that the second digit of $x_2$ is 0. Adding 1 to this gives us 1.

And we see that the third digit of $x_3$ is 3. Adding 1 to this gives us 4.

And we see that the fourth digit of $x_4$ is 6. Adding 1 to this gives us 7.

And we see that the fifth digit of $x_5$ is 9. Rolling that forward one digit would give us 0, so instead we go backward one digit, giving us 8.

And so on, giving us the definite number

$$b = .71478\ldots$$

This number is clearly a definite real number. And because we avoided 9s and 0s, it is impossible for the number $b$ to end up being .00000 ... = 0, nor can it be .99999 ... = 1. So $b$ is a real number between 0 and 1. And our process of generating the reals on our list cannot ever produce $b$, since by its construction $b$ differs from every number that our process can produce by at least one digit.
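The construction of 𝑏 is mechanical enough to write out as a short program. The sketch below applies the rolling rule to the six sample entries of the list above, treating each entry as a string of its decimal digits; the function names are illustrative only.

```python
def roll(digit):
    """Roll a digit forward by 1, unless that would give 0 or 9,
    in which case roll it backward by 1 instead."""
    forward = (digit + 1) % 10
    return digit - 1 if forward in (0, 9) else forward

def diagonal_digits(expansions):
    """Digits of b: the n-th digit comes from the n-th digit of the n-th entry."""
    return "".join(str(roll(int(x[n]))) for n, x in enumerate(expansions))

# Decimal digits of the sample entries x1 ... x6 from the list above.
listed = [
    "89000340777777348",
    "50000000000000000",
    "34343434343434343",
    "44569070020003999",
    "68319256902819232",
    "00000000078300002",
]

b = diagonal_digits(listed)
print("b = ." + b + "...")
```

Since `roll` never returns its input and never returns 0 or 9, the number so formed differs from the n-th entry in its n-th digit and has only one decimal expansion.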

So we can now argue: if it were possible to place all the reals between 0 and 1 in a one-to-one correspondence with the positive integers, then our list of reals produced by that process would be exhaustive. But it is not exhaustive, and therefore it is impossible to place all the reals between 0 and 1 in a one-to-one correspondence with the positive integers. Q.E.D.

Observation 1: There was an additional reason to avoid 9s besides the one already given (we wanted to avoid 9s and 0s to guarantee that the resulting number $b$ was between 0 and 1). If we allowed 9s in the formation of the number $b$, then, if our diagonal process required us to roll forward the digits 3, 8, 8, 8, 8, etc., we could end up with .4999999 ... , which is on our list, but expressed as .50000 ...

Observation 2: If we simply add $b$ to our list, that does not fix the problem. We can then form a new number that is not on the list, using the very same process.

Observation 3: There is nothing special about the interval between 0 and 1. Any interval of reals, no matter how small, will contain a set of points that cannot possibly be paired one-to-one with the positive integers. Any method of pairing we conceive will always omit an infinity of reals in the interval under consideration. Every integer will have its real, but not every real will have its integer.

Observation 4: It is possible to pair all the reals (looking at the whole of the x-axis now, both positive and negative, and infinitely long in both directions) with the points on a finite interval (such as that between 0 and 1) in a one-to-one way. Bisect the finite interval, and set up the two equal straight lines (that are its halves) in a “v” shape that stands at the origin, and then draw a line through the top points of this “v” parallel to the x-axis. Bisect this line at P, and from P draw any straight line through one side of the “v” and to the x-axis. The one-to-one correspondence should then be clear.

Observation 5: Since the cardinality of the interval of reals between 0 and 1 is non-denumerable, it has a different cardinality from $\aleph_0$. Let’s call it C, for “continuum.”

(14) THEOREM U: The union of any two denumerable sets is denumerable.

The union of two sets means the set that includes all the elements that belong to either of them (including any that might belong to both). For example, the union of “the set of even positive integers” with “the set of odd positive integers” yields “the set of positive integers.” In order to answer our opening question all the way back in item (1), namely “Are there more rational numbers or irrational ones?”, we need to see first that the union of any two denumerable sets must be denumerable also.

SUPPOSE: A and B are denumerable sets.
U is the union of sets A and B.

PROVE: U is a denumerable set.

Since A and B are denumerable sets, each can be placed in one-to-one correspondence with the positive integers: P: 1 2 3 4 ... P: 1 2 3 4 ... A: 𝑎1 𝑎2 𝑎3 𝑎4 ... B: 𝑏1 𝑏2 𝑏3 𝑏4 ...

Since we are guaranteed to have all of the elements of sets A and B in these denumerations, we merely have to jump back and forth between them, alternating, in order to get all the members of the two of them, that is, all the members of set U, their union, into a one-to-one correspondence with the positive integers (if some element belongs to both A and B, we simply skip it the second time it comes up):

P:  1      2      3      4      5      6      ...
U:  $a_1$  $b_1$  $a_2$  $b_2$  $a_3$  $b_3$  ...

Q.E.D.
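The back-and-forth listing in the proof of Theorem U can be tried out on a computer. The following is only a minimal sketch in Python; the function name `union_enumeration` is made up for this illustration, and the two enumerations are assumed to be given as endless generators. It also skips any element that has already been listed, which matters when the two sets overlap.

```python
from itertools import count, islice

def union_enumeration(enum_a, enum_b):
    """Alternate a1, b1, a2, b2, ... and skip anything already listed."""
    seen = set()  # grows without bound; acceptable for a short illustration
    for a, b in zip(enum_a, enum_b):
        for element in (a, b):
            if element not in seen:
                seen.add(element)
                yield element

# Example from the text: evens and odds together give the positive integers.
evens = (2 * n for n in count(1))
odds = (2 * n - 1 for n in count(1))
first_ten = list(islice(union_enumeration(evens, odds), 10))
print(first_ten)
```

Every element of either set turns up after finitely many steps, which is all that denumerability of the union requires.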


(15) It follows immediately that the irrationals are non-denumerable. We saw in item (10) above that the rationals are denumerable. So if the irrationals were also denumerable, then both the rationals and the irrationals would be denumerable, and consequently their union, too, would be denumerable. But the union of the rationals and the irrationals is the reals. And the reals are not denumerable, as we proved in (13). Therefore, the irrationals are non-denumerable.

This is the beginning of an answer to our opening question: Are there more rationals, or irrationals? We can now say this: All the rationals can be assigned a unique positive integer, but it is impossible to do this with the irrationals. No matter how we assign positive integers to irrationals, although we can succeed in making sure every integer gets used, not every irrational will get an integer. Whatever pairing method we adopt, there will always be an infinity of irrationals left deprived of any integer to be matched with.

(16) Now another question: Are the algebraic numbers denumerable? An algebraic number is any number (possibly complex) that is the root of an algebraic equation. More precisely,

An algebraic number is a (possibly complex) number that is the root of a finite, non-zero polynomial in one variable with rational coefficients (or equivalently—by clearing denominators—with integer coefficients).

A polynomial in this context is defined by having non-negative integers for exponents. So an algebraic number is any root of an equation of this form:

$a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x^1 + a_0 = 0$

where all the coefficients ($a_n$, $a_{n-1}$, etc.) are integers (positive, negative, or zero), and all the exponents ($n$, $n-1$, etc.) are non-negative integers. We specify that the equation must be “non-zero,” because otherwise we would allow 0𝑥 = 0 and even π would be a root of that.

Algebraic numbers include all the integers, all the rationals, and many kinds of

irrationals, too. √2 is algebraic, for example, and so is √2 + √57

, and so is


√37 + √2 + √57

51

− 6 391 2047⁄

All numbers like these expressible in terms of radicals (regardless of how many, or how horrible each radical is) are algebraic. Even when we include all such horrid numbers, we do not exhaust the algebraics, since Niels Abel¹⁰ proved that algebraic equations of the fifth degree and higher have roots (which are thus algebraic numbers) that cannot be expressed in radicals at all. In fact, “algebraic number” seems to be no more specific than “number,” and we have seen that the real numbers (and a fortiori “all numbers,” which include complex ones) are non-denumerable. So it is not easy to see how one could enumerate the algebraics, that is, place them into one-to-one correspondence with the positive integers. There seem to be too many.

(17) But we should consider the definition of an algebraic number. It is defined in relationship to a certain kind of equation. Can we use this fact to find a way to assign a unique positive integer to each and every algebraic number? One difficulty is that there is not just one equation for a given algebraic number. Any algebraic number will be the solution of an infinity of different algebraic equations. For example, the algebraic number 2 is the solution of both of the following algebraic equations:

$x - 2 = 0$

$x^2 - 5x + 6 = 0$

But of all the algebraic equations of which a given algebraic number is the solution, there will be a simplest and most primitive one; in fact, for a given algebraic number there is one, and only one, irreducible, single-variable polynomial with rational coefficients and with its leading coefficient (the coefficient of the highest power of 𝑥 to have a non-zero coefficient) equal to 1, and of the least degree among all polynomials having the given number as a root. This polynomial is called the algebraic number’s minimal polynomial. (Multiplying by the least common denominator for the coefficients then gives us a definite equation with integer coefficients.)

That can be proved, but instead of assuming that theorem, we may observe that of all the polynomial equations (with integer coefficients) of which a given algebraic is a root, there will have to be some that have the lowest degree. And of those that

10 A Norwegian mathematician who lived from 1802 to 1829, and who had perhaps even more difficulty gaining recognition during his short lifetime than Cantor did.


share this lowest degree, there will have to be some with the smallest lead coefficient (if the lead coefficient is negative, we can multiply the whole polynomial

by −1 to make it positive). And even if there can be many polynomials having the given number as a root, and having integer coefficients, and having the lead coefficient as small as possible, among these there will have to be some whose second coefficient has the least absolute value (and if it is possible for some of these equations to have that value as a positive, while others have it as a negative, then we will choose the ones in which it is positive, to keep our way of selecting equations simple and definite). Continuing this process, we must arrive at a single equation, with least integer coefficients, of which our given algebraic number is a root. Let us call this the algebraic number’s least integers equation. So a given algebraic number determines one and only one least integers equation. A particular algebraic equation of degree 𝑛, however, can have as many as 𝑛 roots (and always will, if we allow for complex roots). So a given algebraic number does not exactly have its least integers equation all to itself. But if we now place the 𝑛 roots in order of magnitude (and if any complex ones or positive and negative ones share the same magnitude, list those that are a smaller counterclockwise rotation

from +1 before those that are a further rotation away), then our given algebraic will occupy a unique position in that definite order. Sometimes the same root occurs more than once in the solution of an equation. For example, consider

$x^3 - 8x^2 + 21x - 18 = 0$

This can be factored into

$(x - 2)(x - 3)(x - 3) = 0$

So the three roots of the equation, in order of magnitude, are 2, 3, 3, and the number 3 occupies both the second and third positions, here. Perhaps this kind of thing cannot happen with a least integers equation, since we need only divide out the extra (𝑥 − 3) to get a simpler algebraic equation that still has 3 as a root. But even if this sort of thing can happen (you may wish to consider this question on your own), there will be only one first place occupied by 3. The first place it occupies in this order is the second position. If we call this the “root position” of a given algebraic number, namely the first position it occupies when listed in order of magnitude along with the other roots of its least integers equation, then we see that this is unique to a given algebraic number. We can now correlate any algebraic number with a unique “root position” of a specific algebraic equation. If we can only assign a unique positive integer to such a root position of such an equation, then we will be home free. There are many conceivable ways of going about this. But let’s try the following. The least integers equation is fully specified by its exponents and coefficients. And these are numbers. The exponents, taken in ascending order, are just some number of natural numbers beginning from 0. For example, if the least integers

equation is of the fifth degree, then its exponents are 0, 1, 2, 3, 4, 5. (The exponent 0 turns the variable to 1, of course, and when this is multiplied by its special


coefficient, that gives us just the coefficient itself, which is the constant term in the equation.) Now pair all possible exponents, taken in ascending order, with prime numbers taken in ascending order and beginning from 2:

Exponents of the least integers equation:  0  1  2  3  4   5   6   ...
Primes:                                    2  3  5  7  11  13  17  ...

So we have “tagged” or “labeled” each of the exponents with its own prime. Next we have to deal with the coefficients. All the coefficients in the least integers equation are integers, whether positive or negative. We can easily set out all the integers by starting with 0, then going to 1, then -1, then 2, then -2, and so on, back and forth, and we will not miss any that way. And each of these can be paired, in this order, with the primes taken in ascending order and beginning from 2:

Integer coefficients of the least integers equation:  0  1  -1  2  -2  3   -3  ...
Primes:                                               2  3  5   7  11  13  17  ...

Using this scheme, we can now construct a unique integer for any algebraic number as follows. Let the given algebraic number be 𝑟, and let its least integers equation be

$a + bx^1 + cx^2 + dx^3 + ex^4 + fx^5 = 0$

(So we will be looking at a fifth-degree example, here, but the procedure will work for a least integers equation of any degree.) Set out the primes corresponding to the exponents in ascending order:

2  3  5  7  11  13

Now form the product of all those primes, writing them still in order:

2 × 3 × 5 × 7 × 11 × 13

Next set out the coefficients of our equation in the order in which they occur in the equation:

𝑎  𝑏  𝑐  𝑑  𝑒  𝑓

Using the scheme above, write the prime numbers to which these integers are matched, keeping them in the same order as the coefficients to which they correspond:


𝐴  𝐵  𝐶  𝐷  𝐸  𝐹

Looking back at our product of primes, raise each prime in it to the prime number above that corresponds to it in order:

$2^A \times 3^B \times 5^C \times 7^D \times 11^E \times 13^F$

Now suppose the given algebraic number 𝑟 occupies the 𝑧th root position among the roots of its least integers equation. Take the next prime after the bases we have already set out, and raise this to 𝑧, so in this case $17^z$. Now tack this on as our final factor, and accordingly we form the number

$2^A \times 3^B \times 5^C \times 7^D \times 11^E \times 13^F \times 17^z$

Clearly we can do this for any algebraic number 𝑟. So it is possible to assign an integer to every algebraic number by a definite process. But will the number we have thus constructed be unique to 𝑟? Or could it also be assigned to some other algebraic number by this process?

It will be unique to 𝑟. To see this, start with the number and work in reverse. To what algebraic number must it correspond? The number itself will tell us, if we ask it nicely. If this number has been “tagged” to an algebraic number by our process, then the power of the last prime in its prime factorization, namely 𝑧, will mean that the algebraic number occupies the 𝑧th root position of the algebraic equation specified by the remaining parts of the prime factorization. Looking to these, the number is telling us to construct an equation with six powers starting from 0, that is to say a fifth degree equation, and it tells us that the coefficients of those powers of 𝑥 are the integers that correlate to the prime exponents 𝐴, 𝐵, 𝐶, etc. But these bits of information fully determine the algebraic equation. And the 𝑧th root position belongs to at most one root. I say “at most,” because the number 𝑧 could be too high, thus not specifying a position of a root in the equation at all. For example, the number

$2^3 \times 3^{11} \times 5^{37} \times 7^{13} \times 11^5 \times 13^{19} \times 17^{23}$

is not assigned to any algebraic number by our scheme, since the equation here specified would be a fifth degree equation with precisely 5 algebraic roots, so no root occupies the 23rd position.

We now see that the algebraics are perfectly denumerable.

EXERCISE: If we call a number thus assigned to an algebraic number its Cantor Number,¹¹ see whether you can find the Cantor number of the algebraic number √2.
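For readers who would like to experiment with the tagging scheme of item (17), here is a rough Python sketch. The function names (`encode`, `decode`, and the helpers) are invented for this illustration and are not standard terms. The sketch takes the least integers equation and the root position as given; it does not find them, and it does not check that the position is small enough to belong to a real root. To avoid giving away the exercise, it uses the algebraic number 3, whose least integers equation is −3 + 1𝑥 = 0 and whose root position is 1.

```python
from itertools import count

def primes():
    """Yield 2, 3, 5, 7, ... by trial division (fine for small examples)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

def nth_prime(i):
    """Prime at 0-based index i, so index 0 gives 2."""
    for k, p in enumerate(primes()):
        if k == i:
            return p

def prime_index(p):
    """0-based index of the prime p; fails if p is not prime."""
    for i, q in enumerate(primes()):
        if q == p:
            return i
        if q > p:
            raise ValueError(f"{p} is not prime")

def coeff_index(c):
    """Position of the integer c in the listing 0, 1, -1, 2, -2, ..."""
    return 2 * c - 1 if c > 0 else -2 * c

def index_coeff(i):
    """Inverse of coeff_index."""
    return (i + 1) // 2 if i % 2 else -(i // 2)

def encode(coeffs, z):
    """coeffs = [a, b, c, ...] for a + b*x + c*x^2 + ...; z = root position."""
    number = 1
    bases = primes()
    for c in coeffs:
        number *= next(bases) ** nth_prime(coeff_index(c))
    return number * next(bases) ** z

def decode(number):
    """Work in reverse: recover (coeffs, z) from the prime factorization."""
    if number < 2:
        raise ValueError("not a number produced by the scheme")
    exponents = []
    for p in primes():
        if number == 1:
            break
        e = 0
        while number % p == 0:
            number //= p
            e += 1
        exponents.append(e)
    if len(exponents) < 2:
        raise ValueError("need at least one coefficient and a root position")
    *tags, z = exponents
    return [index_coeff(prime_index(t)) for t in tags], z

tag_of_three = encode([-3, 1], 1)   # 2^17 * 3^3 * 5^1
print(tag_of_three, decode(tag_of_three))
```

Decoding the number $2^3 \times 3^{11} \times 5^{37} \times 7^{13} \times 11^5 \times 13^{19} \times 17^{23}$ from the text with this sketch gives the coefficients 1, −2, 6, 3, −1, 4 and the impossible root position 23.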

11 “Cantor number” is not a mathematical term—we are inventing it here for the purposes of this exercise.


(18) Now we should stop and notice what we’ve done. We have just now proved the existence of non-algebraic numbers! That the numbers 𝑒 and 𝜋 are non-algebraic is not easy to see, and few non-mathematicians ever go through (and understand) the proofs. We have not, and will not, prove that these particular numbers are non-algebraic. But we now see that non-algebraic numbers must exist, and that algebraic numbers are not all the real numbers, let alone all the numbers.

How have we proved the existence of non-algebraic numbers? By showing that algebraics are denumerable. If all reals were algebraic, then since algebraics are denumerable, so too would be the reals. But they are not, as we saw before in item (13). Therefore not all reals are algebraic. The general name for non-algebraic numbers is transcendental numbers.

Moreover, we should be able to see now that the transcendental numbers (though they were last to be discovered, and though we know only a few of them by their specific names) are the vast majority of the reals, with the algebraics constituting a tiny minority among them, so to speak. The algebraics are denumerable, and therefore a subset of them, the real algebraics, are also denumerable, whereas the reals are not denumerable. This implies that the real transcendentals are not denumerable, since if they were, then the union of them with the real algebraics, which is just the reals, would also be denumerable, which they are not. Therefore the remaining reals, the transcendentals, are not denumerable. The noted historian of mathematics Eric Temple Bell once wrote that

The algebraic numbers are spotted over the plane like stars against a black sky; the dense blackness is the firmament of the transcendentals.

(19) We have now seen that there can be infinite sets of the same cardinality, and also infinite sets with different cardinality. And we have just now been leaping to the conclusion that those which cannot be placed in a one-to-one correspondence with another are somehow “greater.” Can we develop a more exact idea of “less than” and “greater than” for transfinite numbers? A first attempt might go like this. Let there be any two sets, 𝐀 and 𝐁. If all elements

of 𝐀 can be placed in an exact one-to-one correspondence with some subset of 𝐁, then $\overline{\mathbf{A}} < \overline{\mathbf{B}}$.

This notion works well with finite sets. For example, according to it a set of 3 things is less than a set of 5 things. But it does not work at all with infinite sets. For example, the set of positive integers, 𝐏, can be placed in a one-to-one correspondence with a subset of the rationals, such as the harmonic series:

𝐏:            1    2    3    4    5    6    …  𝑛    …
subset of 𝐐:  1/1  1/2  1/3  1/4  1/5  1/6  …  1/𝑛  …


So can we conclude that $\overline{\mathbf{P}} < \overline{\mathbf{Q}}$? No, since we have already seen that there is a different method of matching that places all the members of 𝐐 in one-to-one correspondence with the members of 𝐏. Indeed, we can also place all the members of 𝐐 in one-to-one correspondence with a mere subset of 𝐏 (e.g., with the primes) if we want. This is a strange and important fact: if two infinite sets can be placed in one-to-one correspondence with each other, that does not preclude putting one of them in one-to-one correspondence with a subset of the other, too. That is practically where we began, by putting all the members of the positive integers into one-to-one correspondence with a mere subset of itself.

(20) So what to do? What can “greater than” and “less than” mean among infinite sets?

DEFINITION: If 𝐀 and 𝐁 are sets, then $\overline{\mathbf{A}} < \overline{\mathbf{B}}$ if there exists a one-to-one correspondence of all elements of 𝐀 to some elements of 𝐁, but there exists no one-to-one correspondence of all elements of 𝐁 to some or all elements of 𝐀.

This definition works for finite sets. For example, according to it, 3 < 5, since 3 fingers of my left hand can be placed in one-to-one correspondence with some of

the 5 fingers of my right, but the 5 fingers of my right hand cannot be placed in one-to-one correspondence with some or all of the 3 fingers of my left.

Similarly, we can now say that $\aleph_0 < \mathbf{C}$, that is, that the set of positive integers is less than the set of points from 0 to 1 on a number line. This is because we can place all the positive integers in a one-to-one correspondence with a subset of the reals between 0 and 1. For example:

𝐏:            1     2        3        4        5        6        …  𝑛        …
subset of 𝐂:  1/√2  1/(2√2)  1/(3√2)  1/(4√2)  1/(5√2)  1/(6√2)  …  1/(𝑛√2)  …

But we showed before, in item (13), that it is impossible to put all the reals between 0 and 1 into one-to-one correspondence with the positive integers.

With our new definition in mind, we can now see that if we have two sets, 𝐀 and 𝐁, and all the members of 𝐀 can be placed in one-to-one correspondence with a subset of 𝐁, that is not enough to be sure that $\overline{\mathbf{A}} < \overline{\mathbf{B}}$. That is one possibility, but it might also turn out that $\overline{\mathbf{A}} = \overline{\mathbf{B}}$, as we have seen. So what we can be sure of is that $\overline{\mathbf{A}} \le \overline{\mathbf{B}}$.


(21) THE SCHRÖDER-BERNSTEIN THEOREM

Here is a claim:

If $\overline{\mathbf{A}} \le \overline{\mathbf{B}}$

and $\overline{\mathbf{B}} \le \overline{\mathbf{A}}$

then $\overline{\mathbf{A}} = \overline{\mathbf{B}}$

This is obvious if we are talking about finite sets. But if we are talking about infinite sets, it is not obvious, and is in fact a theorem. We are saying that if set 𝐀 can have every one of its members matched to a unique member of a subset of 𝐁, and similarly if set 𝐁 can have every one of its members matched to a unique member of a subset of 𝐀, then it must also be true that 𝐀 and 𝐁 themselves can be placed in a one-to-one correspondence. The proof was difficult enough that Cantor himself never found it, but it was found during his lifetime by two mathematicians independently of one another: Ernst Schröder (1896) and Felix Bernstein (1898). We will take this one on faith, since it is one of the few transfinite truths that fit with our intuitions.

(22) With the help of the Schröder-Bernstein Theorem, we can at last answer our opening question more definitively. What is the cardinality of the irrationals? We will denote the set of all real numbers as ℝ, and the set of all irrationals as 𝐈. We can match every irrational to a real, because irrationals are just a subset of the reals. Obviously the members of any subset can be matched to members of the larger set to which it belongs, since each member of the subset can be paired with itself, and hence with a member of the larger set.

Consequently, $\overline{\mathbf{I}} \le \overline{\mathbb{R}}$. But we can also match every real number to an irrational, one-to-one, as follows. Let some given real number be

$x_1 = M.b_1 b_2 b_3 b_4 b_5 b_6 \dots b_n \dots$

where 𝑀 is its integer part. We pair this with

$y_1 = M.b_1\,0\,b_2\,11\,b_3\,000\,b_4\,1111\,b_5\,00000\,b_6\,111111 \dots$

All we did was insert first 1 zero, then 2 ones, then 3 zeroes, then 4 ones, then 5 zeroes, etc., between the decimal digits of $x_1$. This defines a new number, $y_1$, which is necessarily irrational since its decimal expansion neither stops nor repeats.


And the matching is one-to-one, since a given $x_1$ goes with one and only one $y_1$ by this process, and likewise a given $y_1$ will go with one and only one $x_1$, namely the one that we get by removing the zeroes and ones that the process inserts.
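The digit-padding used in item (22) to turn a real number into an irrational one can be imitated on a finite run of digits. This is only a sketch, assuming the decimal digits after the point are supplied as a Python string; the names `pad` and `unpad` are made up here. A true real number has infinitely many digits, so the code handles just a leading piece of the expansion.

```python
def pad(digits):
    """After the k-th digit insert k fillers: 0s when k is odd, 1s when k is even."""
    out = []
    for k, d in enumerate(digits, start=1):
        filler = "0" if k % 2 else "1"
        out.append(d + filler * k)
    return "".join(out)

def unpad(padded):
    """Remove the inserted blocks again, recovering the original digits."""
    digits = []
    i, k = 0, 1
    while i < len(padded):
        digits.append(padded[i])
        i += 1 + k  # skip the digit itself and its block of k fillers
        k += 1
    return "".join(digits)

print(pad("123456"))
print(unpad(pad("123456")))
```

Because the inserted blocks keep growing in length, the padded expansion can never settle into a repeating cycle, which is why the resulting number is irrational.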

Consequently, $\overline{\mathbb{R}} \le \overline{\mathbf{I}}$.

And so the Schröder-Bernstein Theorem allows us to conclude that $\overline{\mathbf{I}} = \overline{\mathbb{R}}$.

NOTE: We did not produce any one-to-one pairing of 𝐈 and ℝ. In each case we paired the entirety of one set with a subset of the other. When we paired every member of ℝ each to its own special member of 𝐈, we did not use up all the members of 𝐈, since 𝜋 (to name but one instance) is irrational, but it is not among any of the numbers 𝑦 that we constructed in 𝐈 in order to pair members of ℝ with those of 𝐈.

(23) Cantor wrote to his friend Richard Dedekind in 1874, asking him whether he thought the points inside a square (not on the perimeter) could be placed in one-to-one correspondence with the points on one side. Cantor was trying to find an infinite set that was greater than ℝ. We saw

before that the set of points between 0 and 1, or on a line of unit length, has the same cardinality

as ℝ. So if we can show that the number of points in an area such as that of a square is greater than the number of points on a line, we

will have found a set that is greater than ℝ. Cantor thought this was obviously true, but he could find no proof for it. Clearly

the area was more abundant in points than one side of the square was. Perhaps proof was superfluous, he suggested, and he could simply postulate this as a fact.

But then he discovered that this was in fact false. The points in the area of a square can be placed in one-to-one correspondence with those on one of its sides. This shocked him, and he wrote to Dedekind in 1877 “I see it but I do not believe it!” (Perhaps he should not have been too surprised, since the number of points on a finite straight line can be put in one-to-one correspondence with those on an infinite straight line, as we have already seen.)

Here is how he found the one-to-one correspondence between the points inside the area of a square and the points between 0 and 1 on a finite straight line.

Let our square have a side-length of 1, and let one of its corners be taken as an origin, so that one side of it will serve as an x-axis, and the other as a y-axis. Any point inside the area of the square will therefore have a unique pair of coordinates, (𝑥, 𝑦). Let S denote the set of all ordered pairs (𝑥, 𝑦) where


$0 < x < 1$

and $0 < y < 1$

As before, we insist on unique specific expansions. For example, .1999999... is not allowed, but we must instead write .2000000...

Now choose any point (𝑥, 𝑦) in S, and let the decimal expressions of its coordinates be

$x = .a_1 a_2 a_3 a_4 a_5 a_6 \dots a_n \dots$

$y = .b_1 b_2 b_3 b_4 b_5 b_6 \dots b_n \dots$

Now form a new number

$z = .a_1 b_1 a_2 b_2 a_3 b_3 a_4 b_4 a_5 b_5 \dots$

which is clearly a point on the interval between 0 and 1 (since 𝑥 and 𝑦 are both between 0 and 1, there is no way to combine their decimal expansions as we have and get either 0 or 1).

So clearly every pair (𝑥, 𝑦) can in this fashion be matched to a number 𝑧 that is between 0 and 1. Moreover, the resulting number will be unique for the (𝑥, 𝑦) pair. That is clear, since if we start with 𝑧 we can “unshuffle” its digits, reversing our process, and we will end up with one and only one (𝑥, 𝑦) pair.
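The shuffling and unshuffling of digits can be sketched in a few lines of Python. The names `shuffle` and `unshuffle` are chosen for this illustration only, and the coordinates are assumed to be given as equally long strings of decimal digits, so the code works on a finite leading piece of each expansion rather than on the full infinite one.

```python
def shuffle(x_digits, y_digits):
    """Interleave the digits a1 b1 a2 b2 ... of two coordinates."""
    return "".join(a + b for a, b in zip(x_digits, y_digits))

def unshuffle(z_digits):
    """Split z back into its odd-place and even-place digits."""
    return z_digits[0::2], z_digits[1::2]

z = shuffle("1415", "7182")   # x = .1415..., y = .7182...
print(z)
print(unshuffle(z))
```

Reading off the digits in the odd places gives 𝑥 back, and the digits in the even places give 𝑦, so no two different pairs can produce the same 𝑧.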

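So every point inside the square is matched with its own point between 0 and 1 on one side. (Strictly speaking, this matching does not use up every point of the side: for example, 𝑧 = .101010… unshuffles into 𝑥 = .111… and 𝑦 = .000… = 0, which is not inside the square. But the points of the side can obviously be matched with some of the points inside the square, for instance by pairing 𝑧 with (𝑧, ½), and so the Schröder-Bernstein Theorem assures us that there is a one-to-one correspondence between the points between 0 and 1 on one side of the square, and the points inside its area.)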

[Figure: the unit square S drawn on the x- and y-axes, with a point (x, y) inside it and the corresponding point z marked between 0 and 1.]

(24) We can also use this method to pair all points in infinite three-dimensional space with points on a unit line.

Let P be any point in infinite space, AB a finite straight line which we may say goes from 0 to 1. Let the absolute values of the coordinates of P be

$|x| = x_r \dots x_3 x_2 x_1 . a_1 a_2 a_3 a_4 \dots$

$|y| = y_s \dots y_3 y_2 y_1 . b_1 b_2 b_3 b_4 \dots$

$|z| = z_t \dots z_3 z_2 z_1 . c_1 c_2 c_3 c_4 \dots$

where before the decimal we have 𝑟 digits in the integer part of 𝑥, 𝑠 digits in the integer part of 𝑦, and 𝑡 digits in the integer part of 𝑧. Now each coordinate is in fact either positive or negative. We will assign the digit 0 to negative, and 1 to positive (let us say, for this example, that in the present case 𝑥 is negative but the other two coordinates are positive). Also, we will observe our usual conventions to avoid producing either 1 or 0 as an output number. We now define a new number

$q = .011\, x_1 y_1 z_1\, a_1 b_1 c_1\, x_2 y_2 z_2\, a_2 b_2 c_2\, x_3 y_3 z_3\, a_3 b_3 c_3 \dots$

The first three digits correspond to the signs of 𝑥, 𝑦, 𝑧. And the rest is in a clear pattern that will exhaust the digits of the three coordinates. (We are in effect reading down through the first digits immediately left of the decimal, then through the first digits after the decimal, then back down through the second digits left of the decimal, and so on, back and forth.)

What do we do, however, when we come to $x_r$, and after that there will be no more digits to fill in? We want our new number to indicate that we ran out of digits, so that the meanings of all the digits in 𝑞 will remain clear, and we can unravel it back into the original coordinates. One solution is simply to write 0 to fill in the digits of 𝑞 that would be occupied by pre-decimal digits of 𝑥, except that we ran out of them (just as $.5000\overline{0}$ is the same as .5, so 5 is the same as $\overline{0}005.$).

Now we have an unambiguous way to construct a number 𝑞 for any point P in infinite space. And the 𝑞 for any P is also unique to it (so we have a one-to-one correspondence), because we can unravel any 𝑞 back into one and only one set of coordinates, which gives us one and only one point P in infinite space.
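The construction of 𝑞 in item (24) can also be sketched for coordinates with finitely many digits. The code below is an illustration only: `pack`, `unpack`, and `split_decimal` are invented names, the coordinates are assumed to be written as decimal strings such as "-12.5", zero is treated as positive (the text does not say which sign digit it should get), and `rounds` says how many digits on each side of the decimal point to read.

```python
def split_decimal(text):
    """Return (sign digit, integer digits, fraction digits) of a decimal string."""
    sign = "0" if text.startswith("-") else "1"
    integer, _, fraction = text.lstrip("+-").partition(".")
    return sign, integer, fraction

def pack(x, y, z, rounds):
    """Build the leading digits of q: signs, then x1 y1 z1 a1 b1 c1, x2 y2 z2 ..."""
    parts = [split_decimal(c) for c in (x, y, z)]
    digits = [sign for sign, _, _ in parts]
    for k in range(rounds):
        for _, integer, _ in parts:
            # k-th digit left of the point, or 0 once we have run out
            digits.append(integer[-1 - k] if k < len(integer) else "0")
        for _, _, fraction in parts:
            digits.append(fraction[k] if k < len(fraction) else "0")
    return "." + "".join(digits)

def unpack(q):
    """Unravel q back into three decimal strings."""
    body = q[1:]
    signs, rest = body[:3], body[3:]
    coords = []
    for i in range(3):
        integer = rest[i::6][::-1].lstrip("0") or "0"
        fraction = rest[3 + i::6].rstrip("0") or "0"
        coords.append(("-" if signs[i] == "0" else "") + integer + "." + fraction)
    return tuple(coords)

q = pack("-12.5", "3.14", "7.0", 2)
print(q)
print(unpack(q))
```

The leading 011 records that 𝑥 is negative and the other two coordinates are positive, just as in the example in the text.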

[Figure: a point P in space with x-, y-, and z-axes through the origin 0, and the corresponding point q on the line AB running from 0 to 1.]

(25) DEFINITION: POWER SET

Given a set 𝐀, the power set of 𝐀 means the set of all subsets of 𝐀. Notation: P[𝐀] means “the power set of 𝐀”

Example: If 𝐀 = {𝑎, 𝑏, 𝑐}

then P[𝐀] = {{ }, {𝑎}, {𝑏}, {𝑐}, {𝑎, 𝑏}, {𝑎, 𝑐}, {𝑏, 𝑐}, {𝑎, 𝑏, 𝑐}}
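The power set of a small finite set can be listed mechanically. Here is a minimal Python sketch; the function name `power_set` is our own choice, and the subsets are collected by size using `combinations` from the standard `itertools` module.

```python
from itertools import combinations

def power_set(elements):
    """List every subset of the given elements, smallest subsets first."""
    items = list(elements)
    return [set(c) for size in range(len(items) + 1)
            for c in combinations(items, size)]

subsets = power_set(["a", "b", "c"])
print(len(subsets))
print(subsets)
```

A set of 3 elements has 8 subsets, and in general a finite set of 𝑛 elements has 2 to the 𝑛th power of them, since each element is either in or out of a given subset.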

Notice that the empty set { } and also the set 𝐀 itself, {𝑎, 𝑏, 𝑐}, are included as subsets of 𝐀, and hence as members of P[𝐀].

(26) CANTOR’S THEOREM: If 𝐀 is any set, then $\overline{\mathbf{A}} < \overline{P[\mathbf{A}]}$.

Cantor wanted to get his transfinite numbers to be as much like the integers as possible, to be as fully intelligible and complete as can be. So he wanted to show that there is an infinity of different transfinite numbers with different cardinalities. His proof of this is the theorem named after him. It goes like this:

Clearly we can match all members of 𝐀 to those in a subset of P[𝐀]. For example, if 𝐀 = {𝑎, 𝑏, 𝑐, 𝑑, 𝑒, … } then we can match

Member of 𝐀    Member of P[𝐀]
𝑎              {𝑎}
𝑏              {𝑏}
𝑐              {𝑐}
𝑑              {𝑑}
𝑒              {𝑒}
etc.

On the right, we have just a small subset of all the subsets of 𝐀, hence a small subset of P[𝐀].

Therefore $\overline{\mathbf{A}} \le \overline{P[\mathbf{A}]}$.


To finish the theorem, we need only to show that there is no one-to-one

correspondence between 𝐀 and P[𝐀]. We proceed by assuming the opposite and reasoning to a contradiction. Suppose, if possible, there were some process

by which the elements of 𝐀 and P[𝐀] could be placed in perfect one-to-one correspondence. Let the pairings that result from this process be, for example:

Elements of 𝐀    Elements of P[𝐀] (i.e., subsets of 𝐀)
𝑎                {𝑏, 𝑐}
𝑏                {𝑑}
𝑐                {𝑎, 𝑏, 𝑐, 𝑑}
𝑑                { }
𝑒                𝐀
𝑓                {𝑎, 𝑐, 𝑓, 𝑔 … }
𝑔                {ℎ, 𝑖, 𝑗 … }
. . .            . . .

Note that some elements of 𝐀 on the left are matched with subsets of 𝐀 that include them, and others are not. For example, 𝑐 is matched with the subset {𝑎, 𝑏, 𝑐, 𝑑}, which includes it, whereas 𝑎 is matched with the subset {𝑏, 𝑐} which does not.

We now define a set 𝐁 as the set of every element of 𝐀 that is not a member of the subset to which it is matched. So

𝐁 = {𝑎, 𝑏, 𝑑, 𝑔 … }

which is itself a subset of 𝐀, of course. And therefore 𝐁 is a member of P[𝐀], the power set of 𝐀, the set of all the subsets of 𝐀. Therefore 𝐁 must appear somewhere in the right column, because we have assumed that all the members of P[𝐀] are paired with those of 𝐀 by our pairing method. Therefore 𝐁 is paired with some unique element of 𝐀 on the left. Call this 𝑦:

. . .    . . .
𝑦        𝐁
. . .    . . .

We now ask: is 𝑦 an element of 𝐁? Either it is or it isn’t. Let us consider each possibility.


• If 𝑦 IS NOT an element of 𝐁, then 𝑦 is one of those elements of 𝐀 that is not a member of the subset to which it is matched (since it is matched with 𝐁, but we are now assuming that it is not an element of 𝐁). But that is precisely what makes an element of 𝐀 to be an element of 𝐁, by the definition of set 𝐁. Therefore, if 𝑦 is not an element of 𝐁, then it is an element of 𝐁. Which is absurd.

• If 𝑦 IS an element of 𝐁, then 𝑦 is one of those elements of 𝐀 that is not a member of the subset to which it is matched (since that is what it means to be a member of set 𝐁). But 𝑦 is matched to subset 𝐁. Therefore, 𝑦 is not an element of 𝐁. Therefore, if 𝑦 is an element of 𝐁, then it is not an element of 𝐁. Which is absurd.

Therefore an absurdity follows in any case if we assume that the elements of 𝐀 and of P[𝐀] can be placed in a one-to-one correspondence. So this cannot be done.

Therefore $\overline{\mathbf{A}} < \overline{P[\mathbf{A}]}$. Q.E.D.

(27) Thanks to Cantor’s Theorem, we now know that

$\aleph_0 < \overline{\mathbf{C}} < \overline{P[(0,1)]} < \overline{P[P[(0,1)]]} < \overline{P[P[P[(0,1)]]]} < \text{etc.}$
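The set 𝐁 from the proof of Cantor’s Theorem can be computed directly whenever 𝐀 is finite. The sketch below is an illustration, not part of the proof: `diagonal_set` and `all_subsets` are names invented here, and the sample pairing is the first four rows of the table from item (26), cut down to 𝐀 = {𝑎, 𝑏, 𝑐, 𝑑}. It then checks every possible way of assigning subsets to the elements of a 3-element set and confirms that the set 𝐁 is always left out.

```python
from itertools import combinations, product

def diagonal_set(pairing):
    """The elements that are NOT members of the subset they are matched with."""
    return {x for x, subset in pairing.items() if x not in subset}

def all_subsets(elements):
    """Every subset of the given elements, as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# First rows of the table in the text, restricted to A = {a, b, c, d}.
pairing = {
    "a": {"b", "c"},
    "b": {"d"},
    "c": {"a", "b", "c", "d"},
    "d": set(),
}
B = diagonal_set(pairing)
missed = B not in pairing.values()
print(B, missed)

# Try all 8 * 8 * 8 assignments of subsets to the elements of {a, b, c}.
elements = ["a", "b", "c"]
always_missed = all(
    diagonal_set(dict(zip(elements, choice))) not in choice
    for choice in product(all_subsets(elements), repeat=len(elements))
)
print(always_missed)
```

For finite sets this is no surprise, since 2 to the 𝑛th power is always greater than 𝑛. The point of the proof is that the very same set 𝐁 defeats any proposed pairing even when 𝐀 is infinite.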

(28) ANTINOMIES

The reasoning in Cantor’s Theorem is similar to that of certain antinomies (paradoxes, or ideas that seem to lead to contradictions no matter what you choose to say) connected with infinite sets. One of these goes as follows.

We define the “universal set,” U, as the set of all sets. It contains every possible set that ever was or could be: sets of ideas, of things, of numbers, subsets of these, and so on. By definition, then, there is no set that U does not include. Now we apply Cantor’s Theorem to it, and we conclude that $\overline{\mathbf{U}} < \overline{P[\mathbf{U}]}$. But this means that P[𝐔] contains elements that U does not. Hence there are some sets that are not included in U, even though it is the set of all sets.

Another such antinomy connected with infinite sets is known as Russell’s Paradox, named for the English mathematician and philosopher Bertrand Russell.


We define a “normal” set as one that does not contain itself as a member, for example “the set of all mathematicians,” since that set is clearly not itself a

mathematician. We define N as the set of all normal sets. We now ask: is N a normal set? If N is normal, then it contains itself as a member (since it is the set of all normal sets), and hence it is by definition not normal (since it contains itself as a member). If N is not normal, then it contains itself as a member; but every member of N is a normal set, and so N is normal. So if it is normal, then it is not normal; and if it is not normal, then it is normal. This antinomy shows that the idea of “set of all X,” so seemingly innocent and self-evidently permissible, can sometimes contain hidden contradictions.

A similar antinomy is the Richard Paradox, first propounded by the French mathematician Jules Richard in 1905. Cantor himself became aware of such antinomies in 1895.

What are we to make of them? Do they completely invalidate Cantor’s discoveries? Do they prove that all talk of the infinite is nonsense? Modern mathematicians for the most part did not wish to go so far as that. Cantor’s results had become very much prized. So they carefully decided which things were allowed to be “sets” and which things were not, so that the axioms of set theory would apply only to such collections as would not lead to any of these contradictions. Cantor’s informal approach to the discipline is now referred to as “naive set theory,” whereas the real deal is the formally axiomatized version due to such mathematicians as Ernst Zermelo and Abraham Fraenkel.

Is this a satisfactory solution, or is it too ad hoc? On the one hand, it seems true that Cantor has found some real truth, and it needs to be saved from these contradictions. On the other hand, it seems that something deeper is going on, and a mere declaration that certain things shall not be called “sets” seems to be only a way of doing an end-run around the deeper questions.

(29) Cantor’s “continuum hypothesis” was that there is no transfinite number falling strictly between ℵ0 and 𝐂 (i.e., no infinite that was greater than ℵ0 but less than 𝐂). He wanted his transfinite numbers to resemble integers, and to take on a life of their own. But he could never prove this hypothesis.

Kurt Gödel (1906-1978), using the axiomatized version of set theory, proved Cantor’s continuum hypothesis was logically consistent with the axioms of set theory—hence there was no way to disprove it.

But Paul Cohen (1934-2007) of Stanford University showed that there was also no way to prove it: it did not follow from the axioms and rules of inference of axiomatized set theory.

What does this mean? Does it mean that the continuum hypothesis is a matter of taste? That it is neither true nor false? The question is similar to the matter of Euclid’s Fifth Postulate, which we will come to very soon. (Today the question remains open: some set theorists argue that the “continuum hypothesis” should be considered false, i.e., that its contradictory should be laid down as an axiom, but no such axiom has won general acceptance.)


(30) Let’s consider a further problem or question about the infinite, one that probably motivated many of Cantor’s opponents.

For there to be a last in some series that is infinite is against the idea of the infinite. But for there not to be a last in some series that is finished is against the idea of what is finished. And so what is finished seems to be against the idea of the infinite. Nothing infinite can be finished, and nothing finished can be infinite (which fits with the “fin” in these words). Does this mean, then, that anything infinite is unfinished business? That what is infinite is never what is done, hence actual and present and complete, but only what can be done? Consider for example, the two following equations:

$$1 = 0.\overline{9}$$

$$1 = \lim_{n \to \infty} \left( \frac{9}{10^1} + \frac{9}{10^2} + \frac{9}{10^3} + \dots + \frac{9}{10^n} \right)$$
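The second equation can be explored with exact arithmetic. The following is a minimal sketch (the function name `partial_sum` is chosen for this example only): it adds up the first n fractions of the form 9/10^k and shows that what is left over before reaching 1 is always exactly 1/10^n, so that every finite sum falls short of 1 while the shortfall can be made as small as we please.

```python
from fractions import Fraction


def partial_sum(n):
    """Exact value of 9/10^1 + 9/10^2 + ... + 9/10^n."""
    return sum(Fraction(9, 10**k) for k in range(1, n + 1))


for n in (1, 2, 3, 10):
    s = partial_sum(n)
    # The remainder 1 - s is exactly 1/10^n: never zero, but as small as we like.
    print(n, s, 1 - s)
```

Nothing in this computation ever handles an infinity of nines at once; it only ever takes finitely many terms, which is the sense of the limit in the second equation.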

Are these just two ways of saying exactly the same thing? The first says (or at least, one way of reading it is) that the number 1 is equal to the number you get by putting an infinity of nines after the decimal, all in one fell swoop. The second says that the number 1 is the first thing you cannot reach by adding more and more nines after the decimal point, as many as you like. The second is the sort of thing we think about in calculus, and does not present any special problems. But what about the first? Suppose we had an infinity of nines after the decimal, all there—

suppose, in other words, that we had all of the fractions of the form $\frac{9}{10^n}$, in order. Is the result equal to 1? Imagine we had this infinite sum before us in the form of stacked bricks. Each brick has the same square face, but they are all of different thicknesses. The bottom brick, sitting on the desktop, is $\frac{9}{10^1}$ of a foot thick. The brick on top of that is $\frac{9}{10^2}$ of a foot thick. The third brick, sitting on top of the second one, is $\frac{9}{10^3}$, and so on. If we had an infinite stack of such bricks, would the height of the

whole stack in fact be 1 foot? One difficulty with this scenario is that we are presuming the stack is finished, in which case we are obliging ourselves to say that there is a top brick (or at least it would take some doing to explain how we are not thus obliging ourselves). How thick is this top brick, this last brick added? The options are that it has some thickness or none. If it has none, then since the brick beneath it has ten times its thickness, the brick beneath it also has no thickness—and the same goes for all the other bricks in the stack. If, on the other hand, it has some thickness, then no matter how small it is, since there is an infinity of other bricks beneath it that are all thicker than it is, the whole stack will have an infinite height, not a height of 1 foot. To say that there is no limit to how many parts of 1 can be taken by taking nine tenths of it, then nine hundredths, then nine thousandths, and so on, and that by taking sufficiently many such parts we can leave as small a remainder of 1 as we please, is not the same as saying that if we had all those parts we would simply have all of 1. It might be impossible to have all those parts actually distinct from one another—it might be that only a finite number of them can actually exist, while


the rest must exist only as possibilities in some divisible remainder.

Great mathematicians do not always see any importance to such distinctions. Galileo, for instance, thought little of the distinction between “actual” and “potential” parts of a line. Dedekind and Cantor, too, do not attend much to that kind of distinction. Perhaps it is not the sort of distinction that pertains to mathematics to make? On the other hand, not only philosophers have objected to the idea of a completed and simultaneously present infinity of things, but also Carl Friedrich Gauss, who once said “... I protest above all against the use of an infinite quantity as a completed one, which in mathematics is never allowed. The Infinite is only a manner of speaking ...”

If there is something fishy about the idea of an actually infinite multitude (one might also wonder about infinite magnitudes), does it follow that all of Cantor’s discoveries are worthless? Surely that would be an overstatement. Even on the supposition that infinite multitudes are always infinite in the sense of being unfinished business, and not in the sense of being a present multitude of things that cannot be numbered, Cantor’s results all still have meaning, truth, and beauty. Consider, for example, his theorem about the non-denumerability of the continuum. The proof in no way depends, for its rigor or its wonder, on the actual presence of any infinite multitude. It is enough that certain kinds of infinite things, such as the fractions, can be placed by a pairing process into one-to-one correspondence with the positive integers, and others cannot. The process we described does not depend on the actual presence of all the positive integers, or of all the rationals, for example; the process will eventually get to every positive integer, and every rational. No such process, however, will get to every real; any such process will in principle and forever miss certain reals that we can specify in advance.


12 A Foreword to Non-Euclidean Geometry

We have finished our brief tour of certain parts of elementary modern number theory. Now we will take a similar tour of certain parts of elementary modern geometry, in particular, non-Euclidean geometry.

Non-Euclidean geometry is not just geometrical theorems that Euclid did not discover. The geometrical work of Archimedes, Apollonius, Pappus, Ptolemy, even Descartes, is all Euclidean geometry. Non-Euclidean geometry is geometry that contradicts Euclid, or at least is opposed to the geometry of Euclid in such a way that it seems to contradict it. If Euclid’s geometry is true, then it could well sound as if we are about to embark on a journey of error, even willful and stupid error, since Euclid is so elementary. But this is not so. Even if non-Euclidean geometry were (or is) somehow in error, it would not be pure error, but an error that contains and uncovers important truth, and even beauty. Probably it will be best to postpone certain questions about the truth of non-Euclidean geometry until after we have seen something of it.

Here is an overview of what we will do in regard to non-Euclidean geometry.

First, we will review a little of Euclid himself, in order to have fresh in our minds those parts of his geometry that led to the advent of non-Euclidean geometry. This will be easy and pleasant going, and it might even make you feel a little nostalgic.

Next, we will move on to Proclus and a few other critics of Euclid, to learn how there was trouble brewing in Euclidean paradise even in ancient times, and how this trouble grew for centuries until it boiled over in modern times, giving birth to radically new ideas.


Then we will look at one particular non-Euclidean geometry, that of Lobachevsky, both in his own words, and in some of our own developments of his ideas. After this look at another geometry, we will look at a magnificent proof that Lobachevsky’s geometry is just as consistent as Euclid’s, and in so doing we will meet a number of delightful Euclidean theorems along the way. Finally, we will return to the philosophical questions that the experience of non-Euclidean geometry should provoke us to ask and wonder about.


13 A Euclidean Review

Review Euclid’s Elements, Book 1, from the very beginning up through Proposition 32, and prepare Propositions 16, 17, 27, 28, 29, 32 for demonstration. As preparation for non-Euclidean geometry, consider some of the following questions about particular portions of Book 1.

DEFINITION 4

Euclid defines a straight line as a line which lies evenly with the points on itself. What exactly does that mean, anyway? One interpretation is that he is invoking something like the carpenter’s definition of straight. If you look along a line, superimposing any two points on the line in your vision, and the line looks just like a point, then the line is straight. Carpenters use this technique to determine whether the edges of a board are straight. A straight line, in other words, is one that participates in the simplicity of a point: viewed a certain way, it looks just like a point, and none of its length is visible.

Does this definition really bring out the essence of straightness? What about “the shortest distance between two points”? Does that say what straightness is? What does it really mean for a line to be straight? What other things can be said only about straight lines, and not about any kind of curve?

One might ask analogous questions about Definition 7, the definition of a plane surface, that is, of a flat surface. What does it mean to be flat?


DEFINITION 23

Euclid defines parallel straight lines thus:

Parallel straight lines are straight lines which, being in the same

plane and being produced indefinitely in both directions, do not

meet one another in either direction.

Why does he specify that they are “in the same plane”? If that part of the definition

were removed, would it make a difference? Also, is there a presumption in this

definition that to a given straight line, and through a given point outside that line,

there can be at most one straight line in the same plane that does not cut the given

straight line? Does this definition bring out what parallel straight lines are? Does it

say what makes them parallel, what causes straight lines to be parallel? What

about the sameness of distance between a pair of straight lines? Is that a better

way to define parallels?

POSTULATE 5

The text of this postulate runs thus:

That, if a straight line falling on two straight lines make the interior

angles on the same side less than two right angles, the two straight

lines, if produced indefinitely, meet on that side on which are the

angles less than the two right angles.

Some questions about this postulate:

(1) This postulate is sometimes called Euclid’s “parallel postulate.” Is it about

parallels? Is it relevant to parallels?

(2) How does it differ from the other four postulates preceding it? Do any

differences strike your eye as you look at the printed page exhibiting these five

postulates? Are there any differences in the imaginability of the postulates?

(3) Is this postulate true? If so, how do you know it is true? What convinces you?

(4) And what exactly is it about? Is it about physical straight lines in the sensible

world? Or ideal straight lines somewhere?

(5) When is the first time Euclid uses Postulate 1? Postulate 2? Postulate 3?

Postulate 4? Common Notion 1? Common Notion 2? Common Notion 3?

Common Notion 4? Common Notion 5? Postulate 5?
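As an aid to imagining what Postulate 5 claims, here is a small numerical sketch. It is not a proof, and it openly assumes ordinary Euclidean trigonometry (the law of sines), which itself depends on the postulate. The names `meeting_distances`, `d`, `alpha_deg`, and `beta_deg` are invented for this example: a straight line of length `d` falls on two straight lines at F and G and makes interior angles `alpha_deg` and `beta_deg` on the same side; the function returns how far from F and from G the two lines would meet at a point K.

```python
import math


def meeting_distances(d, alpha_deg, beta_deg):
    """Distances FK and GK to the meeting point K, by the Euclidean law of sines.

    Assumes Euclidean geometry throughout; the interior angles at F and G must be
    positive and together less than two right angles (180 degrees).
    """
    if alpha_deg <= 0 or beta_deg <= 0 or alpha_deg + beta_deg >= 180:
        raise ValueError("interior angles must be positive and sum to less than 180 degrees")
    a = math.radians(alpha_deg)
    b = math.radians(beta_deg)
    apex = math.pi - a - b  # the angle at K in triangle FGK
    fk = d * math.sin(b) / math.sin(apex)
    gk = d * math.sin(a) / math.sin(apex)
    return fk, gk


# As the two interior angles approach two right angles, the meeting point recedes.
for total in (120, 170, 179, 179.9):
    fk, gk = meeting_distances(1.0, total / 2, total / 2)
    print(total, round(fk, 3))
```

The closer the angle sum comes to two right angles, the farther away the supposed meeting point lies, which may bear on question (2) about how imaginable the postulate is.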


PROPOSITION 16

Euclid here proves that the exterior angle of a triangle is greater than either interior and opposite angle. Could the same construction be used to prove here that it is equal to the sum of the interior and opposite angles, prior to and independently of any work with parallel lines? Can it at least be used to prove the existence of an infinity of dissimilar triangles that all have the same angle sum? Why bother with 16, if 32 will prove the same thing in a more informative way?

PROPOSITION 17

What is the logical relationship between this proposition and Postulate 5?

PROPOSITION 27

How is this demonstration related to Propositions 16 and 17? This Proposition closes with Q.E.D. What is the first Q.E.F. about parallels? Why does Euclid proceed in that order?

PROPOSITIONS 28-29

What is the relationship between these two propositions? What is their relationship to Postulate 5?

PROPOSITION 32

Euclid here uses Proposition 29 which in turn uses Postulate 5. Could Proposition 32 still be true even if Postulate 5 were false? Could Proposition 47 still be true even if Postulate 5 were false? Can you find a way to prove Postulate 5 from the other four postulates and the common notions? Is it more reasonable to postulate it, or to try to demonstrate it? Since Propositions 27 and 28 can be proved without Postulate 5, should we expect to be able to prove Proposition 29 also without Postulate 5? And if so, can we prove Postulate 5 from Proposition 29, and make it a theorem without having to postulate it?


14 The Forerunners to Non-Euclidean Geometry

Proclus Lycaeus (412-485 AD), known as Proclus Diadochus (that is, Proclus the Successor), was a Greek Neoplatonic philosopher, born in Constantinople and raised in Xanthus. In Alexandria he studied rhetoric, mathematics, and philosophy, and in 431 went to Athens to study at the Academy of Plato (founded 387 BC), or rather at the Neoplatonic version of it that existed at the time. Proclus succeeded Syrianus as the head of the Academy.

He lived the rest of his days in Athens as a vegetarian and a bachelor, writing commentaries on the dialogues of Plato, some of which commentaries still exist today. He wrote two major systematic works. One was called Elements of Theology, consisting of 211 propositions, each followed by a proof. The work started from the existence of the One (the divine unity), and ended with the descent of souls into the material world (and so in a general way its order resembles that of the Summa theologiae). In medieval times appeared a book called the Liber de causis (the Book of Causes), thought to be by Aristotle. Thomas Aquinas also thought this at first, but became the first person to realize that this work was not by Aristotle, but was instead a kind of summary version of the Elements of Theology by Proclus. The other major systematic work of Proclus was called Platonic Theology, systematizing material from the dialogues of Plato. The thought of Proclus was more or less in line with that of Plotinus, the founder of Neo-Platonism.

Proclus also wrote a commentary on the first book of Euclid’s Elements, which is the work that will draw our attention now. The book is an important source for the history of ancient mathematics, many details of which would be unknown to us


were it not for the commentary of Proclus. We are interested in what he had to say about Euclid’s fifth postulate, the original troublemaker among the principles of Euclid. Sir Thomas L. Heath, in his commentary on Euclid’s Elements, early in his remarks on Postulate 5, says

From the very beginning, as we know from Proclus, the [Fifth]

Postulate was attacked as such, and attempts were made to prove

it was a theorem or to get rid of it by adopting some other definition

of parallels; while in modern times the literature of the subject is

enormous. Riccardi ... has twenty quarto pages of titles of

monographs relating to Post. 5 between the dates of 1607 and

1887. Max Simon ... notes that he has seen three new attempts, as

late as 1891 (a century after Gauss laid the foundation of non-

Euclidean geometry), to prove the theory of parallels independently

of the Postulate.

Let’s now turn to Proclus, and see if we can find out what all the fuss is about.

COMMENTARY ON THE FIRST BOOK OF EUCLID’S ELEMENTS¹²

Proclus

POSTULATE 5. That, if a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which the angles are less than the two right angles.

This ought to be struck from the postulates altogether. For it is a theorem—one that invites many questions, which Ptolemy proposed to answer in one of his books—and requires for its demonstration a number of definitions and theorems. Moreover, Euclid himself proves the converse as a theorem.

But, perhaps, some might mistakenly think that this proposition deserves to be ranked among the postulates on the ground that the angles’ being less than two right angles makes us at once believe in the convergence and intersection of the straight lines. To them Geminus has given the proper answer when he says that the founders of this science have taught us not to pay attention to plausible imaginings in determining what propositions are to be accepted in geometry.

12 This is a translation of selections from Primum Euclidis Elementorum Librum Commentarii, a Latin translation of the commentary by Proclus Diadochus (5th century). The selections were translated in 2011 by Ronald J. Richard, and are meant for use by the students and faculty of Thomas Aquinas College, Santa Paula, California and St. John’s College, Annapolis, Maryland and Santa Fe, New Mexico. This edition introduces minor alterations to that translation.


Aristotle likewise says that to accept probable reasoning from a geometer is like demanding proofs from a rhetorician. And Simmias is made by Plato to say, “I am aware that those who make proofs out of probabilities are impostors.” So here, although the statement that straight lines converge when the right angles [they make with a third straight line] are diminished is true and necessary, yet the conclusion that because they converge more as they are extended farther they will meet at some time is plausible, but not necessary in the absence of an argument proving that this is true of straight lines. That there are lines that approach each other indefinitely but never meet seems implausible and even paradoxical, yet it is nevertheless true and has been shown for other species of lines. May not this, then, be possible for straight lines as for those other lines? Until we have demonstrated that they meet, what is said about other lines strips our imagination of its plausibility. And, although the arguments against the intersection of these lines may contain much that surprises us, should we not all the more refuse to admit into our tradition this unreasoned appeal to probability?
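Proclus does not say in this passage which “other species of lines” he has in mind. As an assumed illustration, take the hyperbola xy = 1 and the straight line it approaches (its asymptote, here the x-axis). The sketch below, with a function name chosen only for this example, computes the exact distance between the curve and the line at several points: the distance keeps shrinking but is never zero.

```python
from fractions import Fraction


def gap(x):
    """Vertical distance from the point (x, 1/x) on the hyperbola xy = 1 to the x-axis."""
    return Fraction(1, x)


for x in (1, 10, 1000, 10**6):
    # The gap shrinks without limit as x grows, yet it is positive for every x.
    print(x, gap(x))
```

This is the kind of case that, in the argument above, deprives mere convergence of its force as evidence that two lines will meet.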

These considerations make it clear that we should seek a proof of the theorem that lies before us and that it lacks the special character of a postulate. But how it is to be proved, and with what arguments the objections to this proposition may be met, we can only say when the author of the Elements is at the point of mentioning it and using it as obvious. At that time it will be necessary to show that its obvious character does not appear independently of demonstration but is turned into a matter of knowledge by proof.

...

PROPOSITION 29. A straight line falling on parallel straight lines makes the alternate angles equal to one another, the exterior angle equal to the interior and opposite angle on the same side, and the interior angles on the same side equal to two right angles.

This theorem is the converse of both of the preceding ones, for the conclusion of each of them is assumed here, and what is given in them is proposed for proof. We should note this difference among converses: a converse may be the converse either of a single theorem, as the sixth is of the fifth, or of more than one, as this is of those two which precede it. In this theorem, the author of the Elements uses for the first time the postulate, “If a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which the angles are less than the two right angles.” As I said in the part of my exposition that precedes the theorems, not everyone admits that this generally accepted proposition is indemonstrable. For how could it be so when its converse is among the theorems as something demonstrable? For the theorem that in every triangle any two interior angles are less than two right angles [Elements 1.17] is the converse of this postulate. Also, the fact that two straight lines when produced approach one another more and more is not, as I said before, a sure sign that they will meet, because other lines have been discovered which converge towards one another more and more but never meet.

Hence, others before us have placed it among the theorems and demanded a proof of this statement which was taken as a postulate by the author of the Elements. Ptolemy is thought to have proved it in his book entitled That Lines


Produced From Angles Less Than Two Right Angles Meet One Another. His proof employs many of the theorems established by the author of the Elements prior to this one. In order not to add to our labors, let us assume that these are all true and take it as a lemma that they have been proved by the previous arguments. One of the propositions previously proved is this, that lines produced from angles equal to two right angles never meet. [Here is Ptolemy’s proof:]

I say, therefore, that the converse

is also true, namely, that when

parallel straight lines are cut by a

straight line, the interior angles on

the same side are equal to two

right angles. For it is necessary

that the line cutting the parallel lines make the interior angles on the

same side either equal to two right angles or less than or greater

than two right angles. Let AB and CD be parallel lines, and let GF

fall upon them. I say that it does not make the interior angles on the

same side greater than two right angles. For if angles AFG and CGF

are greater than two right angles, the remaining angles, BFG and

DGF, are less than two right angles. But these same angles must

also be greater than two right angles; for AF and CG are no more

parallel than FB and GD, so that if the line falling on AF and CG

makes the interior angles greater than two right angles, so also

does the line falling on FB and GD make the interior angles greater

than two right angles. But these same angles are less than two right

angles (for the four angles AFG, CGF, BFG, and DGF are equal to

four right angles), which is impossible. Similarly we can prove that

the line falling on the parallels does not make the interior angles on

the same side less than two right angles. If, then, the line falling on

them makes them neither greater than nor less than two right

angles, the only conclusion left is that it makes the interior angles

on the same side equal to two right angles.

When this has been demonstrated, the proposition before us

can be proved. I say that, if a straight line falls upon two straight

lines and makes the interior angles on the same side less than two

right angles, the straight lines if produced will meet on that side in

which the angles are less than two right angles. For let us suppose

that they do not meet. Now, if they are non-cutting on the side on

which the angles are less than two right angles, much more will they

be non-cutting on the other side, where the angles are greater than

two right angles, so that the straight lines will be non-cutting on both

sides; and if so, they are parallel. But it has been proved that the

line which falls upon parallels will make the interior angles on the

same side equal to two right angles. The same angles are therefore

both equal to two right angles and less than two right angles, which

is impossible.

[Figure: the parallel straight lines AB and CD, cut by the straight line GF at F (on AB) and G (on CD).]


Having proved this, Ptolemy tries to add extra precision and reach the proposition before us by proving that, if a straight line falls upon two straight lines and makes the interior angles on the same side less than two right angles, not only are the straight lines not non-cutting, as he has proved, but also they will meet on that side on which the angles are less than two right angles, not on the side on which they are greater. [Here is Ptolemy’s proof:]

Let AB and CD be two straight lines, and let the line EFGH fall upon

them and make angles AFG and CGF less than two right angles.

Hence the other angles are greater than two right angles. Now it

has been demonstrated that the straight lines are not non-cutting.

But if they meet one another, it will be either in the direction of A

and C or in the direction of

B and D. Let us assume

that they meet in the

direction of B and D at

point K. Then, since

angles AFG and CGF are

less than two right angles,

and angles AFG and BFG

are equal to two right angles, if the common term, angle AFG, is

subtracted, angle CGF will be less than angle BFG. It follows that

the exterior angle of triangle KFG is less than the opposite interior

one, which is impossible [Elements 1.16]. Consequently they do not

meet on this side. But they do meet. Therefore they meet on the

other side, the one where the angles are less than two right angles.

This is Ptolemy’s proof. It is worth pausing to see whether there may be a fallacy

in the hypotheses that he has adopted. I mean those which assert that, when a

straight line cuts the non-cutting lines and makes four interior angles, the angles

in the same direction on both sides are either equal to two right angles or greater

than or less than two right angles. His division is not exhaustive. There is no reason

why one who calls the lines produced from angles less than two right angles ‘non-

cutting’ should not say that the angles lying in the same direction on one side are

greater than two right angles, and those in the same direction on the other side are

less than two right angles—that is, no single principle can be admitted to cover

them. Since his division is not exhaustive, the proposition under examination has

not been demonstrated. Furthermore, this also must be said against the proof, that

it does not show the impossibility to be one intrinsic to parallels. For it is not

because a straight line cutting parallels makes the angles in the same direction on

both sides greater or less than two right angles that the hypotheses are reduced

to an absurdity; it is because the four angles interior to the lines that are cut are

equal to four right angles that each of the hypotheses becomes impossible, since

even if one does not take the straight lines as parallel the same consequences

follow from assuming these same hypotheses. With these remarks we shall end

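[Figure: the straight lines AB and CD, cut by the straight line EFGH at F (on AB) and G (on CD), with K the supposed meeting point in the direction of B and D.]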


our comments on Ptolemy, for the weakness of his proof is evident from what has

been said.

Now let us examine those who say it is impossible that lines produced from

angles less than two right angles should meet. Taking two straight lines AB and

CD and the line AC falling upon them and making the interior angles less than two

right angles, they think they can demonstrate that AB and CD do not meet. Let AC

be bisected at E, and let a length AF equal

to AE be laid off on AB, and on CD a length

CG equal to EC. It is clear that AF and CG

will not meet at any point on FG; for if they

met, two sides of a triangle would be equal

to a third, AC, which is impossible

[Elements 1.20]. Again, let line FG be

drawn and bisected at H, and let equal

lengths be laid off on FB and GD. These

likewise will not meet, for the same reason. By doing this indefinitely, drawing lines

between the non-coincident points, bisecting the connecting lines, and laying off

on the straight lines lengths equal to their halves, they say they prove that lines AB

and CD will not meet anywhere.

Such are their arguments. To them we must reply that what they say is true but

does not prove as much as they think. It is true that it is not possible in this simple

way to fix the point at which intersection occurs. It is not true, however, that the

lines never meet at all. Let it be granted that AB and CD do not meet when angles

BAC and DCA are defined by points F and G. But there is no reason why they

should not come together at K and L, even

if FK and GL are equal to FH and HG. For if

AK and CL meet at K and L, the angles KFH

and LGH are no longer the same; that is,

some of FG has come to belong to AK and

CL; and thus in turn the lines FK and GL are

greater than the base by as much as they

take away from within the line FG.

This also should be said: in affirming without qualification that lines produced

from angles less than two right angles do not meet, they are overthrowing what

they do not intend. Let the diagram be the same as before. Now is it possible or

not to draw a straight line from A to G? If they say it is not possible, they are denying

not only the fifth postulate, but also the first, which claims the ability to draw a

straight line from any point to any point. If possible, let the line be drawn. Then

since angles FAC and GCA are less than two right angles, it is clear that even

more so are GAC and GCA less than two right angles. Therefore AG and CG meet

at G, and they are produced from angles less than two right angles. Consequently,

it is not possible to say without qualification that lines produced from angles less

than two right angles do not meet. On the contrary, it is clear that some lines

produced from angles less than two right angles do meet, though the argument

proving this of all such lines is still to be found. Since “less than two right angles”


is indeterminate, one could say that with such-and-such an amount of lessening

the straight lines remain non-cutting, whereas, with an amount less than this, they

meet.

To anyone who wants to see this argument constructed, let us say that he must

accept in advance an axiom Aristotle used in establishing the finiteness of the

cosmos: if from a single point two straight lines making an angle are produced

indefinitely, the interval between them when produced indefinitely will exceed any

finite magnitude. At least he proved that, if the lines extending from the center to

the circumference are infinite, the interval between them is infinite; for if it is finite,

it is possible to increase the interval between

them, so that the straight lines are not

infinite. Straight lines extended indefinitely,

then, will diverge from each other by a

distance greater than any given finite

magnitude. If this is laid down, I say that, if a

straight line cuts one of two parallel lines, it

cuts the other also. Let AB and CD be

parallel lines and EFG a line cutting AB. I say that it also cuts CD. For since there

are two straight lines through point F, when FB and FG are extended indefinitely,

they will have an interval between them greater than any magnitude and hence

greater than the distance between the parallel lines. So, when they are separated

from each other a greater distance than that between the parallel lines, FG will

have cut CD. Therefore, if a straight line cuts one of two parallels, it cuts the other

also.

Having proved this, we can demonstrate the proposition before us as a

consequence. Let AB and CD be two straight lines, and EF falling upon them and

making angles BEF and DFE less than two right angles. I say that the straight lines

will meet on that side on which are the angles

less than two right angles. For since angles

BEF and DFE are less than two right angles,

let angle HEB be equal to the excess of two

right angles over them, and let HE be

produced to K. Then since EF falls upon KH

and CD and makes the interior angles equal

to two right angles, namely HEF and DFE,

HK and CD are parallel straight lines. And AB

cuts KH, so it will cut CD, by the proposition just demonstrated. AB and CD

therefore will meet on that side on which the angles are less than two right angles,

so that the proposition has been demonstrated.

[Here ends the selection from Proclus]


QUESTIONS

(1) How many arguments against the self-evidence of Euclid’s fifth postulate can

be gathered out of the foregoing words of Proclus? Are they decisive?

Consider the argument that other lines, contrary to our expectation, can approach

nearer and nearer to each other forever without ever intersecting. Does this prove

that if straight lines in particular cannot behave that way, then we must prove it?

Consider the argument from the demonstrability of 1.17, the converse of the fifth

postulate. Is it a reliable principle of logic that when there is a middle term joining

B to A in the statement Every A is B, then there must also be a middle term joining

A to B in the statement Every B is A?

Are Proclus and others wrong to single out this postulate from the others? Or is

there something quite different about it?

(2) Proclus finds fault with Ptolemy’s attempt to prove the fifth postulate. Has he

correctly identified the defect?

(3) Proclus offers his own attempt at proof of the postulate. Does his proof

succeed? If not, why not?

OTHER FIGURES IN THE HISTORY OF THE FIFTH POSTULATE

Other mathematicians, logicians, and philosophers down through the ages

continued to offer various attempts at proving the fifth postulate. This continued for

about two thousand years. Mathematicians agree that all the attempts were

failures. The attempts failed in more than one way.

One way that an attempt to prove Euclid’s fifth postulate can fail is to assume the

postulate itself, in a tacit way, in the course of trying to prove it. This is called a

petitio principii, or begging the question, and it is the most common way in which

attempted proofs failed.

Another way to fail is simply to assume something false, such as the idea that if

you divide a line or angle enough times, you will eventually come to its indivisible

parts.

Here follows a brief survey of some of the more prominent figures in the history of

attempts to prove Euclid’s fifth postulate.


JOHN WALLIS (1616-1703)

John Wallis was an English mathematician. He was partly

responsible for developing calculus, and is credited with

introducing the symbol ∞ for infinity (in his Treatise on Conic

Sections, 1655). In 1632 he was sent to Emmanuel College,

Cambridge, earning a master’s degree in 1640. He entered the

priesthood, and also married. In the mid-1650s, he entered into

a debate with Thomas Hobbes, criticizing him for mathematical

errors in his De corpore (the debate, which included other

mathematicians, went on into the 1670s).

Although he entertained strange notions about negative numbers, he is

credited with the discovery of the idea of a number line. Among other things, he

discovered that

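$$\frac{\pi}{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7} \cdot \ldots$$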

which is known as the Wallis product.
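The convergence of this product can be checked numerically. The following is a minimal sketch, not anything Wallis himself wrote: the function name `wallis_partial_product` and the grouping of the factors into pairs (2k/(2k−1))·(2k/(2k+1)) are choices made here for illustration.

```python
import math


def wallis_partial_product(pairs: int) -> float:
    """Multiply the first `pairs` factor pairs (2k/(2k-1)) * (2k/(2k+1))."""
    product = 1.0
    for k in range(1, pairs + 1):
        product *= (2 * k) / (2 * k - 1) * (2 * k) / (2 * k + 1)
    return product


if __name__ == "__main__":
    # Print each partial product next to its remaining gap from pi/2.
    for pairs in (1, 10, 100, 1000, 10000):
        approx = wallis_partial_product(pairs)
        print(pairs, approx, math.pi / 2 - approx)
```

Each pair of factors is slightly greater than 1, so the partial products rise toward 𝜋/2 from below. The approach is slow: the gap shrinks only roughly in proportion to the reciprocal of the number of pairs used.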

He found that Euclid’s fifth postulate was logically equivalent to (produced all the

same results as) this statement:

On a given finite straight line it is always possible to construct a

triangle similar to a given triangle.

This is now known as the Wallis postulate. He saw that one of the fundamental

ideas of Euclid’s geometry is that shape is not somehow tied to a specific size, and

that once we accept that idea, we can deduce Euclid’s fifth postulate. Wallis

himself, however, did not think of this as a proof of Euclid’s postulate. In his time,

mathematicians were hopeful that they could prove the fifth postulate simply from

the other four, without having to introduce any new postulate in its place. Wallis

saw his postulate simply as a logical equivalent of Euclid’s own, one that is easier

to state and whose meaning is easier to grasp. On the other hand, it is also about

things that are more complex than those Euclid’s postulate concerns.


GIOVANNI GIROLAMO SACCHERI (1667-1733)

This man was an Italian Jesuit priest, a scholastic philosopher, and a

mathematician. He entered the Jesuit order in 1685, and was ordained a priest in

1694. He is known primarily for his last publication, now considered by some

people to be (perhaps contrary to his intention) the second work in non-Euclidean

geometry, the first one being the Discussion of Difficulties in Euclid, by the 11th

century polymath Omar Khayyam, which contains many precedents for the ideas

found in Saccheri’s work. Saccheri’s 1733 book, published shortly before his death,

bore the title

Euclides ab omni naevo vindicatus

(Euclid Freed from Every Flaw)

This book remained obscure until Eugenio Beltrami rediscovered it in the mid-19th

century. In it, Saccheri attempted to prove the fifth postulate by reductio ad

absurdum, assuming it false, and endeavoring to derive a contradiction. The result

was that he did a whole lot of positive reasoning from the denial of Euclid’s

postulate, which is why his work is regarded as effectively a work of non-Euclidean

geometry, despite his own ostensible intention to validate Euclid.

Saccheri approached the question through

certain quadrilaterals. First he constructed a

quadrilateral ABCD in which AC and BD were

equal to one another and made the same

angles with AB. He proved that the angles at C

and D were equal to one another (Proposition

1). Next he took the special case in which the

angles at A and B were right, and then proved

that

If C is right, then CD = AB

If C is obtuse, then CD < AB

If C is acute, then CD > AB

He called the first “the hypothesis of the right angle,” the second “the hypothesis

of the obtuse angle,” the third “the hypothesis of the acute angle.” He then proved

(Propositions 5, 6, 7) that if any one of these were true in any single case, then it

had to be true in all cases. He proved (Proposition 11) that the fifth postulate

followed necessarily from the hypothesis of the right angle, and so he could

demonstrate the fifth postulate if he could only eliminate the other two possibilities.


Saccheri found (Proposition 14) that the hypothesis of the obtuse angle leads

to the conclusion that a triangle can have an angle sum greater than two rights,

contrary to Euclid’s Proposition 17 of Book I of the Elements, which proposition is

independent of the fifth postulate. Consequently, wrote Saccheri, “The hypothesis

of the obtuse angle is absolutely false, since it destroys itself.” (Later geometers

would reconsider it, since they were willing to drop or adjust the other postulates

of Euclid, such as the second postulate, on which 1.17 depended; this led to the

development of the particular kinds of non-Euclidean geometry known as elliptic

geometry and spherical geometry.)

His attempt to do away with the other hypothesis, the hypothesis of the acute

angle, was more prolix and is regarded by mathematicians as a failure. In outline,

it proceeds as follows. First, in Proposition 34 of

his work, he introduces a certain curved line CKD

that must exist if the hypothesis of the acute angle

is true, a line that is the locus of endpoints of all

the equal perpendiculars to a straight line AB,

which curve we might call an “equidistance curve.”

In Proposition 37, he attempts to prove that this

curve must be equal in length to the straight line

AB from which it stands at an equal distance at every point along itself. This is

where he fails. His proof proceeds by a fallacious comparison of infinitesimal

lengths in the equidistance curve and the straight line—he effectively argues from

the point-for-point correspondence in the two to their equal length. This does not

follow; indeed, by this method, any line can be proved equal to any line. Proceeding

as though Proposition 37 concluded demonstratively, Saccheri goes on in

Proposition 38 of his work to conclude that the hypothesis of the acute angle

contradicts itself, since according to it, on the one hand, the equidistance curve

CKD must equal the straight base AB that defines it (by the fallacious Prop.37),

and on the other, it must be longer than it, since CKD is longer than the straight

line CD joining its ends, which straight line is greater than the base AB (which he

proves, and which does indeed follow from the hypothesis of the acute angle).

Saccheri concluded Proposition 38, saying “And so, it is established that the

hypothesis [of the acute angle] is absolutely false since it destroys itself.” Following

this, in Proposition 39, he concludes that since the fifth postulate is the only

alternative to the two unacceptable hypotheses, therefore the fifth postulate is

demonstrated.


JOHANN HEINRICH LAMBERT (1728-1777)

Johann Heinrich Lambert was a Swiss polymath who made

significant contributions to mathematics, optics, philosophy,

astronomy, and map projections. He presented the first proof that

𝜋 is irrational. He also had a role to play in the history of Euclid’s

fifth postulate. He saw that, on the hypothesis of the acute angle

(as Saccheri had called the assumption that the angle in one

upper corner of his special quadrilateral was acute), the size of a

triangle was proportional to its defect from two right angles. In

1766, he wrote, but never published, his Theorie der Parallellinien, or Theory of

Parallels, in which he tried to prove the fifth postulate. Unlike Saccheri, he never

felt he had really derived a contradiction from the hypothesis of the acute angle.

After he died, the book was published by G. Bernoulli and C. F. Hindenburg.
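In modern notation, Lambert's observation about the acute-angle hypothesis is usually written as below. This is a sketch in present-day symbols, not Lambert's own formulation: the angle names α, β, γ and the constant k are notation assumed here, with angles measured in radians so that π stands for two right angles.

```latex
\operatorname{Area}(\triangle ABC) \;=\; k^{2}\,\bigl(\pi - (\alpha + \beta + \gamma)\bigr),
\qquad k > 0 \text{ a constant of the geometry.}
```

One consequence is that, on this hypothesis, no triangle can have an area as great as k²π, however long its sides.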

JOHN PLAYFAIR (1748-1819)

John Playfair, educated at the University of St. Andrews,

was a Scottish scientist and mathematician, and a

professor of natural philosophy at the University of

Edinburgh. In 1795 he published an alternative, more

concise equivalent of Euclid’s fifth postulate, now known as

Playfair’s axiom (which others before him also used, such

as William Ludlam). The axiom states that

In a plane, given a straight line and a point not on

it, at most one line parallel to the given line can be

drawn through the point.

This is shorter than Euclid’s postulate, and its meaning is in some ways easier to

grasp. But it does not include in itself any reason why there should be “at most

one” parallel. It is hard to see how it could be evident that there can be “at most

one” if it is not also evident that there is one, and that its special character must

make it unique. But that there is one is something that can be demonstrated

independently of the fifth postulate, as in Elements 1.27, or 1.31. If it is evident that

“at most” one line is parallel, that is, does not cut the given line, then it must also

be evident that all others cut. But why is it evident that they cut the given line?

Surely because they are inclined to it. And that is just what Euclid’s fifth postulate

says.


In 1785, William Ludlam formulated this alternative to Euclid’s postulate:

Two straight lines, meeting at a point, are not both parallel to a third

line.

And Playfair used a version of this:

Two straight lines which intersect one another cannot be both

parallel to the same straight line.

But again, why not? Because then there would be more than one parallel through

a given point to a given straight line. But why is that impossible? Because all others

cut. And why must they cut? Because they must be inclined toward the straight

line. So although these alternatives are shorter to state, it might well be that they

do not get to the heart of the matter, to the cause of things, as Euclid’s own

postulate did.

ADRIEN-MARIE LEGENDRE (1752-1833)

Legendre was a French mathematician, born to a wealthy family and

educated at the Collège Mazarin in Paris, who lost his

personal fortune in 1793 during the French Revolution. He

married, and his wife helped him get back on his feet. He made

his reputation by winning the Berlin Academy Prize in 1782 by

solving the problem of calculating the trajectory of a cannon ball

suffering wind resistance. Because he won this prize, and

because of his work in celestial mechanics, he was elected to

the French Academy of Sciences in 1783. In 1787, he was

elected to the Royal Society of London. He served as permanent

mathematics examiner for the École Polytechnique from 1799 to

1815. His name is one of 72 names of French scientists,

mathematicians, and engineers inscribed on the Eiffel Tower. In

1824, he withheld his support of the government’s candidate for

the National Institute, and consequently lost his pension and

lived the remainder of his life in near poverty.

In 1830, he proved Fermat’s last theorem for 𝑛 = 5, that is, he proved that there

is no solution in non-zero integers for the equation $a^5 + b^5 = c^5$.

A charming detail about him is that he was, for about two centuries,

misrepresented in numerous books and articles by a side-view portrait (the lower

of the two accompanying images) thought to be of him. In 2005 the error was

discovered. The side-view portrait was actually of an obscure French politician,

Louis Legendre (1752-1797), not of the mathematician Adrien-Marie Legendre.

The sketch had been simply labeled “Legendre” and had appeared in a book along


with mathematicians of the same period, such as Lagrange, thus causing the

mistake. That left us without any image at all of Legendre, until someone

subsequently found an 1820 book, Album de 73 Portraits-charge aquarellés des

membres de l’Institut, a book of caricatures of 73 members of the Institut de France

in Paris, by the French artist Julien-Léopold Boilly, in which our mathematician,

Adrien-Marie, was depicted (upper of the two images above).

Legendre was fascinated with Euclid’s fifth postulate, and for over thirty years

endeavored to prove it. He refused to accept non-Euclidean geometry, first

proposed by Lobachevsky in 1829. His attempts to prove the postulate appeared in his

successful work Éléments de Géométrie (1794-1823), which saw numerous

editions and translations.

One such attempt that Legendre proposed was as follows. If the angle sum of

any triangle is equal to two right angles, then Euclid’s fifth postulate must be true.

(This itself requires some proof, but we will see this for ourselves later.) Knowing

this, Legendre needed only to prove, without using the fifth postulate, that the angle

sum of a triangle is equal to two right angles, and he would have succeeded in

proving the postulate. Here is a presentation of how he tried to prove that the angle

sum of a triangle is equal to two rights.

Let ABC be any triangle. I say

that its angle sum is equal to

two right angles. If its sides are

unequal, then let AB be the

greatest and BC the least (if it is

equilateral, we may begin our

construction with any side).

Bisect BC at D and join AD. Now take triangle ADB and construct the flipped version

of it, AKG, so that these triangles are congruent (and clearly AG will pass through

D).

Now extend AB to H so that KH = AK. Since GKH is supplementary to AKG,

and AKG = ADB, thus GKH = ADC. Also, GK = BD = DC. Consequently,

triangles GKH and CDA are congruent (side-angle-side).

So AGK is a flipped version of ABD

and GKH is a flipped version of CDA

Thus the angle sum of AGH must be the same as that of ABC.

Also, AG > GH [since AG = AB, GH = AC, and AB > AC]

so AHG > GAH

i.e. KHG > GAK

so CAD > DAB

Therefore DAB is less than half of CAB, the least angle in triangle ABC.

Hence GAH is less than half of CAB, the least angle in ABC.


Also, GHA, equal to CAD, is less than the least angle in ABC.

So we have found a process: give us any triangle ABC, and we can construct a

new triangle AGH with the exact same angle sum as ABC, in which one angle (GAH)

is less than half the least in ABC, and another (GHA) is less than the least in ABC.

If we now repeat this process on the new triangle AGH, we will get a new triangle

with the same angle sum again, hence the same as ABC, and it will have in it a

new angle less than half of the least in AGH, hence less than a quarter of the least

in ABC, and it will have another angle in it less than the least in AGH, hence less

than half the least in ABC.

So, continuing thus, we can make a triangle TFL whose angle sum is the same as

that in all the previous triangles, including ABC, and whose smallest angle, T, and

whose next smallest angle, L, are such that

$$T < \frac{1}{2^{n}}\,(\mathrm{CAB}) \quad\text{and}\quad L < \frac{1}{2^{n-1}}\,(\mathrm{CAB})$$

where 𝑛 is the number of times we have performed the process.

If we now imagine a changing triangle TFL that adjusts its shape with each new

step of the process, we know that it will always be true, at any stage, that

T + L + F = angle sum of triangle ABC

Consequently, we also know that

$$\lim_{n\to\infty} (T + L + F) = \text{angle sum of triangle ABC}$$

Therefore, since the limit of the sum is equal to the sum of the limits,

$$\lim_{n\to\infty} T + \lim_{n\to\infty} L + \lim_{n\to\infty} F = \text{angle sum of triangle ABC}$$

But each of the first two limits is equal to zero, because at each step in the process,

the two new smallest angles, the new T and L, will be less than half of those that

preceded them. And the third limit, that of F, is clearly just two right angles, since

as the other two angles turn to zero, the point F must be getting as close as we

please to being on TL, which is to say that the angle TFL must be getting as close

as we please to a straight line.
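The vanishing of the first two limits can be written out as a squeeze, using the bounds on T and L stated earlier. This is a sketch in modern notation rather than Legendre's own wording; CAB here stands for the measure of the least angle of the original triangle.

```latex
\begin{align*}
0 < T < \frac{\mathrm{CAB}}{2^{n}} \longrightarrow 0
  &\quad\Longrightarrow\quad \lim_{n\to\infty} T = 0, \\
0 < L < \frac{\mathrm{CAB}}{2^{n-1}} \longrightarrow 0
  &\quad\Longrightarrow\quad \lim_{n\to\infty} L = 0.
\end{align*}
```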

Therefore the angle sum of triangle ABC is exactly two right angles.


This argument is quite brilliant, but it is also a failure. The misstep occurs right at

the very end. Legendre assumes that

$$\lim_{n\to\infty} F = 180^\circ$$

because TFL is flattening out into a

straight line. He is assuming, here,

that the only way the angles T and L

can get as close as we please to

zero is if F gets as close as we please to straight. But what if the triangle breaks

down not because TFL is flattening out into a straight line, but instead because,

while TF and FL tend toward some angle (and not toward a straight line), the

third side, TL, is becoming parallel to both of them, and hence, in the limit, no

longer meets them, but is just one straight line parallel to two others at an angle to

each other? That would destroy Legendre’s argument. Does this alternative sound

absurd? Perhaps, but if we deny it we must say that two straight lines cutting at an

angle cannot both be parallel to a third straight line, which is an equivalent of

Euclid’s fifth postulate.
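That two angles of a triangle can shrink toward zero while the third stays well short of a straight angle can be illustrated numerically, provided we borrow formulas from the hyperbolic geometry that later chapters develop. The sketch below rests on assumptions not found in Legendre's text: a hyperbolic plane of curvature −1 and the standard hyperbolic law of cosines. It follows a family of isosceles triangles with a fixed apex angle F and lengthening legs, which is not Legendre's own sequence of triangles (his construction keeps the angle sum constant). The function name `isosceles_base_angle` is hypothetical.

```python
import math


def isosceles_base_angle(apex_angle: float, leg: float) -> float:
    """Base angle (radians) of a hyperbolic isosceles triangle, curvature -1.

    The two equal sides have length `leg` and enclose `apex_angle` (radians).
    """
    # Hyperbolic law of cosines: cosh c = cosh a cosh b - sinh a sinh b cos C,
    # here with a = b = leg and C = apex_angle, giving the base c.
    cosh_base = math.cosh(leg) ** 2 - math.sinh(leg) ** 2 * math.cos(apex_angle)
    base = math.acosh(cosh_base)
    # Solving the same law for a base angle simplifies to tanh(c/2) / tanh(leg).
    ratio = math.tanh(base / 2) / math.tanh(leg)
    return math.acos(min(1.0, ratio))  # clamp guards against rounding above 1


if __name__ == "__main__":
    apex = math.radians(60)
    for leg in (1.0, 2.0, 5.0, 10.0):
        base_angle = isosceles_base_angle(apex, leg)
        angle_sum = apex + 2 * base_angle
        print(leg, math.degrees(base_angle), math.degrees(angle_sum))
```

As the legs grow, the two base angles (playing the part of T and L) fall toward zero while the apex angle stays at 60°, so the angle sum tends to 60° rather than 180°. On such a hypothesis the step "T and L vanish, therefore F becomes straight" does not follow.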

CARL FRIEDRICH GAUSS (1777-1855)

This giant among mathematicians was the first to arrive at the

conclusion that no contradiction can be derived from denying

Euclid’s fifth postulate while asserting the previous four. He

himself coined the term “non-Euclidean geometry,” although he

used it to refer specifically to what is known today as hyperbolic

geometry, one particular kind of non-Euclidean geometry.

CHARLES LUTWIDGE DODGSON (1832-1898)

This gentleman is better known by his pen name, Lewis Carroll, the author of

Alice’s Adventures in Wonderland and Through the Looking-Glass. He was an

English writer, mathematician, Anglican deacon, photographer, and logician. He

was, it seems, inclined to Anglo-Catholicism, and was an admirer of John Henry

Newman. He matriculated at Oxford in 1850 as a member of Christ Church, where


he remained in one capacity or another until he died. He wrote about a dozen

mathematical books under his real name, among them Euclid and His Modern

Rivals (1879), in which he considers the pedagogical merit of other geometrical

textbooks of his day compared with Euclid. He favored Euclid in almost every way,

and included in this book some defense of the fifth postulate against the logical

objections levelled against its self-evidence.

JANOS BOLYAI (1802-1860)

Bolyai was a Hungarian mathematician, born in Kolozsvár, Transylvania (now Cluj-Napoca, Romania). He was one of the

founders of non-Euclidean geometry. So much did Euclid’s fifth postulate

preoccupy his mind that his father wrote to him:

For God’s sake, I beseech you, give it up. Fear it no less than

sensual passions, because it too may take all your time and deprive

you of your health, peace of mind, and happiness in life.

But he persisted, and concluded that the fifth postulate was independent of the

other four, and can be denied without contradiction, thus opening the way to the

development of other geometries. He once wrote, in a letter to his father, “I created

a new, different world out of nothing.”

Between 1820 and 1823 he produced a complete system of non-Euclidean

geometry that was published in 1832 as an Appendix to a mathematics textbook

written by his father. Gauss read this Appendix and wrote to a friend:

I regard this young geometer as a genius of the first order.


In 1848 he learned that Lobachevsky had published a similar piece of work in 1829,

although it contained only one possible non-Euclidean geometry (hyperbolic).

Lobachevsky and Bolyai did not know one another, and neither had known of the

other’s work when each was completing his own.

Bolyai published nothing beyond the little 24-page Appendix, but left behind

20,000 pages of mathematical manuscripts. No original portrait of him survives.


15 NIKOLAI LOBACHEVSKY:

Geometrical Researches on the Theory of Parallels

NIKOLAI IVANOVICH LOBACHEVSKY (1792-1856)

This Russian mathematician will be our first tour guide in

the non-Euclidean world. William Kingdon Clifford (1845-

1879, an English mathematician and philosopher) called

him “the Copernicus of geometry.”

Lobachevsky obtained a master’s degree in physics

and mathematics in 1811 at Kazan University, where he

later became a full professor, teaching mathematics,

astronomy, and physics. He married Varvara Alexeyevna

Moiseyeva and had 18 children, of whom only 7 survived

to adulthood. In 1846 he was dismissed from the

university because of his failing health. By the early

1850s, he was almost blind and unable to walk. He died

in poverty in 1856.

He did not think it right to try to prove the fifth postulate from the other four, but

sought to develop an alternative geometry in which the fifth postulate was not true.

He presented this idea in 1826 to the session of the department of mathematics

and physics, and his work in this area was printed in 1829-1830. What follows is

his 1840 work on the theory of parallels.


GEOMETRICAL RESEARCHES

ON THE

THEORY OF PARALLELS

BY

NIKOLAI LOBACHEVSKY

Imperial Russian Real Councillor of State and

Regular Professor of Mathematics in the University of Kasan

Berlin, 1840

Translated from the Original

BY

GEORGE BRUCE HALSTED

A.M., Ph.D., Ex-Fellow of Princeton and Johns-Hopkins University

1914 Edition


THEORY OF PARALLELS

IN geometry I find certain imperfections which I hold to be the reason why this science, apart from transition into analytics, can as yet make no advance from that state in which it has come to us from Euclid.

As belonging to these imperfections, I consider the obscurity in the fundamental concepts of the geometrical magnitudes and in the manner and method of representing the measuring of these magnitudes, and finally the momentous gap in the theory of parallels, to fill which all efforts of mathematicians have been so far in vain.

For this theory Legendre’s endeavors have done nothing, since he was forced to leave the only rigid way to turn into a side path and take refuge in auxiliary theorems which he illogically strove to exhibit as necessary axioms. My first essay on the foundations of geometry I published in the Kasan Messenger for the year 1829. In the hope of having satisfied all the requirements, I undertook hereupon a treatment of the whole of this science, and published my work in separate parts in the “Gelehrten Schriften der Universitæt Kasan” for the years 1836, 1837, 1838, under the title “New Elements of Geometry, with a complete Theory of Parallels.” The extent of this work perhaps hindered my countrymen from following such a subject, which since Legendre had lost its interest. Yet I am of the opinion that the Theory of Parallels should not lose its claim to the attention of geometers, and therefore I aim to give here the substance of my investigations, remarking beforehand that contrary to the opinion of Legendre, all other imperfections, for example the definition of a straight line, show themselves foreign here and without any real influence on the theory of parallels.

In order not to fatigue my reader with the multitude of those theorems whose proofs present no difficulties, I prefix here only those of which a knowledge is necessary for what follows.


[1] A straight line fits upon itself in all its positions. By this I mean that during the revolution of the surface containing it the straight line does not change its place if it goes through two unmoving points in the surface (i.e., if we turn the surface containing it about two points of the line, the line does not move).

[2] Two straight lines cannot intersect in two points.

[3] A straight line sufficiently produced both ways must go out beyond all bounds, and in such way cuts a bounded plane into two parts.

[4] Two straight lines perpendicular to a third never intersect, how far soever they be produced.

[5] A straight line always cuts another in going from one side of it over to the other side (i.e., one straight line must cut another if it has points on both sides of it).

[6] Vertical angles, where the sides of one are productions of the sides of the other, are equal. This holds of plane rectilineal angles among themselves, as also of plane surface angles (i.e., dihedral angles).

[7] Two straight lines cannot intersect if a third cuts them at the same angle.

[8] In a rectilineal triangle, equal sides lie opposite equal angles, and conversely.

[9] In a rectilineal triangle, a greater side lies opposite a greater angle. In a right-angled triangle the hypotenuse is greater than either of the other sides, and the two angles adjacent to it are acute.

[10] Rectilineal triangles are congruent if they have a side and two angles equal, or two sides and the included angle equal, or two sides and the angle opposite the greater equal, or three sides equal.

[11] A straight line which stands at right angles upon two other straight lines not in one plane with it is perpendicular to all straight lines drawn through the common intersection point in the plane of those two.

[12] The intersection of a sphere with a plane is a circle.

[13] A straight line at right angles to the intersection of two perpendicular planes, and in one, is perpendicular to the other.

[14] In a spherical triangle equal sides lie opposite equal angles, and conversely.

[15] Spherical triangles are congruent (or symmetrical) if they have two sides and the included angle equal, or a side and the adjacent angles equal.

From here follow the other theorems with their explanations and proofs.


[16] All straight lines which in a plane go out from a point can, with reference to a given straight line in the same plane, be divided into two classes—into cutting and not-cutting.

The boundary lines of the one and the other class of those lines will be called parallel to the given line.

From the point A (Fig.1) let fall upon the line BC the perpendicular AD, to which again draw the perpendicular AE.

In the right angle EAD either will all straight lines which go out from the point A meet the line DC, as for example AF, or some of them, like the perpendicular AE, will not meet the line DC. In the uncertainty whether the perpendicular AE is the only line which does not meet DC, we will assume it may be possible that there are still other lines, for example AG, which do not cut DC, how far soever they may be prolonged. In passing over from the cutting lines, as AF, to the not-cutting lines, as AG, we must come upon a line AH, parallel to DC, a boundary line, upon one side of which all lines AG are such as do not meet the line DC, while upon the other side every straight line AF cuts the line DC.

The angle HAD between the parallel HA and the perpendicular AD is called the parallel angle (angle of parallelism), which we will here designate by 𝛱(𝑝) for AD = 𝑝.

If 𝛱(𝑝) is a right angle, so will the prolongation AE’ of the perpendicular AE likewise be parallel to the prolongation DB of the line DC, in addition to which we remark that in regard to the four right angles, which are made at the point A by the perpendiculars AE and AD, and their prolongations AE’ and AD’, every straight line which goes out from the point A, either itself or at least its prolongation, lies in one of the two right angles which are turned toward BC, so that except the parallel EE’ all others, if they are sufficiently produced both ways, must intersect the line BC.

If 𝛱(𝑝) < ½𝜋, then upon the other side of AD, making the same angle DAK = 𝛱(𝑝) will lie also a line AK, parallel to the prolongation DB of the line DC, so that under this assumption we must also make a distinction of sides in parallelism.

All remaining lines or their prolongations within the two right angles turned toward BC pertain to those that intersect, if they lie within the angle HAK = 2𝛱(𝑝) between the parallels; they pertain on the other hand to the non-intersecting AG, if they lie upon the other sides of the parallels AH and AK, in the opening of the two angles EAH = ½𝜋 − 𝛱(𝑝), E’AK = ½𝜋 − 𝛱(𝑝), between the parallels and EE’ the perpendicular to AD. Upon the other side of the perpendicular EE’ will in like manner the prolongations AH’ and AK’ of the parallels AH and AK likewise be parallel to BC; the remaining lines pertain, if in the angle K’AH’, to the intersecting, but if in the angles K’AE, H’AE’, to the non-intersecting.

[FIG. 1]

In accordance with this, for the assumption 𝛱(𝑝) = ½𝜋, the lines can be only intersecting or parallel; but if we assume that 𝛱(𝑝) < ½𝜋, then we must allow two parallels, one on the one and one on the other side; in addition we must distinguish the remaining lines into non-intersecting and intersecting.

For both assumptions it serves as the mark of parallelism that the line becomes intersecting for the smallest deviation toward the side where lies the parallel, so that if AH is parallel to DC, every line AF cuts DC, how small soever the angle HAF may be.

[17] A straight line maintains the characteristic of parallelism at all its points.

Given AB (Fig.2) parallel to CD, to which latter AC is perpendicular. We will consider two points taken at random on the line AB and its production beyond the perpendicular.

Let the point E lie on that side of the perpendicular on which AB is looked upon as parallel to CD.

Let fall from the point E a perpendicular EK on CD and so draw EF that it falls within the angle BEK.

Connect the points A and F by a straight line, whose production then (by Theorem 16) must cut CD somewhere in G. Thus we get a triangle ACG, into which the line EF goes; now since this latter, from the construction, cannot cut AC, and cannot cut AG or EK a second time (Theorem 2), therefore it must meet CD somewhere at H (Theorem 3).

Now let E’ be a point on the production of AB and E’K’ perpendicular to the production of the line CD; draw the line E’F’ making so small an angle AE’F’ that it cuts AC somewhere in F’; making the same angle with AB, draw also from A the line AF, whose production will cut CD in G (Theorem 16).

Thus we get a triangle AGC, into which goes the production of the line E’F’; since now this line cannot cut AC a second time, and also cannot cut AG, since the angle BAG = BE’G’ (Theorem 7), therefore must it meet CD somewhere in G’.

Therefore from whatever points E and E’ the lines EF and E’F’ go out, and however little they may diverge from the line AB, yet will they always cut CD, to which AB is parallel.

[FIG. 2]

[18] Two lines are always mutually parallel.

Let AC be a perpendicular on CD, to which AB is parallel. If we draw from C the line CE making any acute angle ECD with CD, and let fall from A the perpendicular AF upon CE, we obtain a right-angled triangle ACF, in which AC, being the hypotenuse, is greater than the side AF (Theorem 9).

Make AG = AF, and slide the figure EFAB until AF coincides with AG, when AB and FE will take the position AK and GH, such that the angle BAK = FAC; consequently AK must cut the line DC somewhere in K (Theorem 16), thus forming a triangle AKC, on one side of which the perpendicular GH intersects the line AK in L (Theorem 3), and thus determines the distance AL of the intersection point of the lines AB and CE on the line AB from the point A.

Hence it follows that CE will always intersect AB, how small soever may be the angle ECD; consequently CD is parallel to AB (Theorem 16).

[19] In a rectilineal triangle the sum of the three angles cannot be greater than two rights.

Suppose in the triangle ABC (Fig.4) the sum of the three angles is equal to 𝜋 + 𝛼;¹³ then choose in case of the inequality of the sides the smallest, BC, halve it in D, draw from A through D the line AD and make the prolongation of it, DE, equal to AD, then join the point E to the point C by the straight line EC. In the congruent triangles ADB and CDE, the angle ABD = DCE, and BAD = DEC (Theorems 6 and 10); whence follows that also in the triangle ACE the sum of the three angles must be equal to 𝜋 + 𝛼; but also the smallest angle BAC (Theorem 9) of the triangle ABC, in passing over into the new triangle ACE, has been cut up into the two parts EAC and AEC. Continuing this process, continually halving the side opposite the smallest angle, we must finally attain to a triangle in which the sum of the three angles is 𝜋 + 𝛼, but wherein are two angles, each of which in absolute magnitude is less than ½𝛼; since now, however, the third angle cannot be greater than 𝜋, so must 𝛼 be either null or negative.
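The closing inference of this proof can be spelled out as a short inequality. The block below is an editorial sketch rather than part of Lobachevsky's text: the names θ₁, θ₂, θ₃ for the angles of the final triangle are assumptions introduced only for this illustration, and 𝛼 is supposed positive for the sake of contradiction.

```latex
% Editorial sketch (assumed notation): \theta_1, \theta_2, \theta_3 are the
% angles of the final triangle reached by the repeated halving, and \alpha > 0.
\begin{align*}
  \theta_1 + \theta_2 + \theta_3 &= \pi + \alpha,
  \qquad \theta_1 < \tfrac{1}{2}\alpha, \quad \theta_2 < \tfrac{1}{2}\alpha, \\
  \theta_3 &= \pi + \alpha - \theta_1 - \theta_2
            > \pi + \alpha - \tfrac{1}{2}\alpha - \tfrac{1}{2}\alpha = \pi .
\end{align*}
% No angle of a rectilineal triangle can exceed \pi, so \alpha > 0 is
% impossible; hence \alpha \le 0.
```

Read this way, the repeated halving serves only to push two of the angles below ½𝛼; the contradiction itself comes from the single remaining angle.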

13 Note that this letter 𝛼 is alpha, the first letter of the Greek alphabet. Do not confuse it with 𝑎, the first letter of the Roman alphabet. To keep these more distinct, we will use a (Times New Roman, unitalicized) rather than 𝑎 (Cambria Math, italicized) for the first letter of the Roman alphabet whenever Lobachevsky uses them as mathematical symbols.

[FIG. 3]

[FIG. 4]

[20] If in any rectilineal triangle the sum of the three angles is equal to two right angles, so is this also the case for every other triangle.

If in the rectilineal triangle GHK (Fig.5) the sum of the three angles = 𝜋, then must at least two of its angles, G and K, be acute. Let fall from the vertex of the third angle H upon the opposite side GK the perpendicular p. This will cut the triangle into two right-angled triangles, in each of which the sum of the three angles must also be 𝜋, since it cannot in either be greater than 𝜋, and in their combination not less than 𝜋.

So we obtain a right-angled triangle with the perpendicular sides p and q, and from this a quadrilateral whose opposite sides are equal and whose adjacent sides p and q are at right angles (Fig.6).

By repetition of this quadrilateral we can make another with sides np and q, and finally a quadrilateral ABCD with sides at right angles to each other, such that AB = np, AD = mq, DC = np, BC = mq, where m and n are any whole numbers. Such a quadrilateral is divided by the diagonal DB into two congruent right-angled triangles, BAD and BCD, in each of which the sum of the three angles = 𝜋.

The numbers n and m can be taken sufficiently great for the right-angled triangle ABC (Fig.7) whose perpendicular sides AB = np, BC = mq, to enclose within itself another given (right-angled) triangle BDE as soon as the right angles fit each other.

Drawing the line DC, we obtain right-angled triangles of which every successive two have a side in common.

The triangle ABC is formed by the union of the two triangles ACD and DCB, in neither of which can the sum of the angles be greater than 𝜋; consequently it must be equal to 𝜋, in order that the sum in the compound triangle may be equal to 𝜋.

In the same way the triangle BDC consists of the two triangles DEC and DBE; consequently must in DBE the sum of the three angles be equal to 𝜋, and in general this must be true for every triangle, since each can be cut into two right-angled triangles.

From this it follows that only two hypotheses are allowable: Either is the sum of the three angles in all rectilineal triangles equal to 𝜋, or this sum is in all less than 𝜋.

[FIG. 5]

[FIG. 6]

[FIG. 7]

[21] From a given point we can always draw a straight line that shall make with a given straight line an angle as small as we choose.

Let fall from the given point A (Fig.8) upon the given line BC the perpendicular AB; take upon BC at random the point D; draw the line AD; make DE = AD, and draw AE. In the right-angled triangle ABD let the angle ADB = 𝛼; then must in the isosceles triangle ADE the angle AED be either ½𝛼 or less (Theorems 8 and 20). Continuing thus we finally attain to such an angle, AEB, as is less than any given angle.

[22] If two perpendiculars to the same straight line are parallel to each other, then the sum of the three angles in a rectilineal triangle is equal to two right angles.

Let the lines AB and CD (Fig.9) be parallel to each other and perpendicular to AC. Draw from A the lines AE and AF to the points E and F, which are taken on the line CD at any distances FC > EC from the point C.

Suppose in the right-angled triangle ACE the sum of the three angles is equal to 𝜋 − 𝛼, in the triangle AEF equal to 𝜋 − 𝛽, then must it in triangle ACF equal 𝜋 − 𝛼 − 𝛽, where 𝛼 and 𝛽 cannot be negative.

Further, let the angle BAF = a, AFC = b, so is 𝛼 + 𝛽 = a − b; now by revolving the line AF away from the perpendicular AC we can make the angle a between AF and the parallel AB as small as we choose (Theorem 21); so also can we lessen the angle b; consequently the two angles 𝛼 and 𝛽 can have no other magnitude than 𝛼 = 0 and 𝛽 = 0.
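The relation 𝛼 + 𝛽 = a − b used in this proof can be recovered by reading off the three angles of the triangle ACF. The following is an editorial sketch under the construction stated above (AB and CD both perpendicular to AC, with E between C and F); it is not a passage from the original.

```latex
% Editorial sketch. Angle sum of ACF from its two parts ACE and AEF:
% the two angles at E together make a straight angle \pi.
\begin{align*}
  \Sigma(ACF) &= (\pi - \alpha) + (\pi - \beta) - \pi = \pi - \alpha - \beta . \\
\intertext{Reading the same sum directly from the angles at C, A, F:}
  \angle ACF &= \tfrac{1}{2}\pi, \qquad
  \angle CAF = \angle CAB - \angle BAF = \tfrac{1}{2}\pi - a, \qquad
  \angle AFC = b, \\
  \Sigma(ACF) &= \tfrac{1}{2}\pi + \bigl(\tfrac{1}{2}\pi - a\bigr) + b = \pi - a + b . \\
\intertext{Equating the two expressions:}
  \alpha + \beta &= a - b .
\end{align*}
```

Since 𝛼 and 𝛽 are not negative and a − b can be made as small as we please, both must vanish, which is the conclusion drawn in the text.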

It follows that in all rectilineal triangles the sum of the three angles is either 𝜋 and at the same time also the parallel angle 𝛱(𝑝) = ½𝜋 for every line 𝑝, or for all triangles this sum is < 𝜋 and at the same time also 𝛱(𝑝) < ½𝜋.

The first assumption serves as the foundation for ordinary geometry and plane trigonometry.

The second assumption can likewise be admitted without leading to any contradiction in the results, and founds a new geometric science, to which I have given the name Imaginary Geometry, and which I intend here to expound as far as the development of the equations between the sides and angles of the rectilineal and spherical triangle.

[FIG. 8]

[FIG. 9]

[23] For every given angle 𝜶 there is a line 𝒑 such that 𝜫(𝒑) = 𝜶.

Let AB and AC (Fig.10) be two straight lines which at the intersection point A make the acute angle 𝛼; take at random on AB a point B’; from this point drop B’A’ at right angles to AC; make A’A’’ = AA’; erect at A’’ the perpendicular A’’B’’; and so continue until a perpendicular CD is attained, which no longer intersects AB. This must of necessity happen, for if in the triangle AA’B’ the sum of all three angles is equal to 𝜋 − a, then in the triangle AB’A’’ it equals 𝜋 − 2a (Theorem 20), and so forth, until it finally becomes negative and thereby shows the impossibility of constructing the triangle.

The perpendicular CD may be the very one nearer than which to the point A all others cut AB; at least, in the passing over from those that cut to those not cutting, such a perpendicular FG must exist.

Draw now from the point F the line FH, which makes with FG the acute angle HFG, on that side where lies the point A. From any point H of the line FH let fall upon AC the perpendicular HK, whose prolongation consequently must cut AB somewhere in B, and so makes a triangle AKB, into which the prolongation of the line FH enters, and therefore must meet the hypotenuse AB somewhere in M. Since the angle GFH is arbitrary and can be taken as small as we wish, therefore FG is parallel to AB and AF = 𝑝 (Theorems 16 and 18).

One easily sees that with the lessening of 𝑝 the angle 𝛼 increases, while, for 𝑝 = 0, it approaches the value ½𝜋; with the growth of 𝑝 the angle 𝛼 decreases, while it continually approaches zero for 𝑝 = ∞.

Since we are wholly at liberty to choose what angle we will understand by the symbol 𝛱(𝑝) when the line 𝑝 is expressed by a negative number, so we will assume

𝛱(𝑝) + 𝛱(−𝑝) = 𝜋,

an equation which shall hold for all values of 𝑝, positive as well as negative, and for 𝑝 = 0.
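For orientation only: later in this treatise Lobachevsky arrives at a closed form for the parallel angle, tan ½𝛱(𝑝) = e^(−𝑝), once the unit of length is suitably chosen. Taking that formula as an assumption (it has not been established at this point in the text), one can check that it agrees with the limiting values just described and with the convention adopted for negative 𝑝. The block below is an editorial sketch of that check.

```latex
% Editorial sketch. ASSUMPTION: the closed form derived later in the treatise,
% with the unit of length chosen so that no extra constant appears.
\begin{align*}
  \Pi(p) &= 2\arctan\bigl(e^{-p}\bigr), \\
  \Pi(0) &= 2\arctan 1 = \tfrac{1}{2}\pi, \qquad
  \lim_{p \to \infty} \Pi(p) = 2\arctan 0 = 0, \\
  \Pi(p) + \Pi(-p) &= 2\bigl[\arctan\bigl(e^{-p}\bigr) + \arctan\bigl(e^{p}\bigr)\bigr]
                    = 2 \cdot \tfrac{1}{2}\pi = \pi ,
\end{align*}
% using \arctan x + \arctan(1/x) = \pi/2 for x > 0, with x = e^{-p}.
```

Under this assumed formula the convention 𝛱(𝑝) + 𝛱(−𝑝) = 𝜋 is therefore not an extra requirement but holds automatically for every real 𝑝.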

[FIG. 10]

[24] The farther parallel lines are prolonged on the side of their parallelism, the more they approach one another.

If to the line AB (Fig.11) two perpendiculars AC = BD are erected and their endpoints C and D joined by a straight line, then will the quadrilateral CABD have two right angles at A and B, but two acute angles at C and D (Theorem 22) which are equal to one another, as we can easily see by thinking the quadrilateral superimposed on itself so that the line BD falls upon AC and AC upon BD.

Halve AB and erect at the midpoint E the line EF perpendicular to AB. This line must also be perpendicular to CD, since the quadrilaterals CAEF and FDBE fit one another if we so place one on the other that the line EF remains in the same position. Hence the line CD cannot be parallel to AB, but the parallel to AB for the point C, namely CG, must incline toward AB (Theorem 16) and cut from the perpendicular BD a part BG < CA.

Since C is a random point in the line CG, it follows that CG itself nears AB the more, the farther it is prolonged.

[25] Two straight lines which are parallel to a third are also parallel to each other.

We will first assume that the three lines AB, CD, EF (Fig.12) lie in one plane.

[CASE 1: All in one plane, AB and CD given as parallel to EF]

If two of them in order, AB and CD, are parallel to the outmost one, EF, so are AB and CD parallel to each other. In order to prove this, let fall from any point A of the outer line AB upon the other outer line FE, the perpendicular AE, which will cut the middle line CD in some point C (Theorem 3), at an angle DCE < ½𝜋 on the side toward EF, the parallel to CD (Theorem 22).

A perpendicular AG let fall upon CD from the same point, A, must fall within the opening of the acute angle ACG (Theorem 9); every other line AH from A drawn within the angle BAC must cut EF, the parallel to AB, somewhere in H, how small soever the angle BAH may be; consequently will CD in the triangle AEH cut the line AH somewhere in K, since it is impossible that it should meet EF. If AH from the point A went out within the angle CAG, then must it cut the prolongation of CD between the points C and G in the triangle CAG. Hence follows that AB and CD are parallel (Theorems 16 and 18).

[FIG. 11]

[FIG. 12]

[CASE 2: All in one plane, AB and EF given as parallel to CD]

Were both the outer lines AB and EF assumed parallel to the middle line CD, so would every line AK from the point A, drawn within the angle BAE, cut the line CD somewhere in the point K, how small soever the angle BAK might be.

Upon the prolongation of AK take at random a point L and join it with C by the line CL, which must cut EF somewhere in M, thus making a triangle MCE.

The prolongation of the line AL within the triangle MCE can cut neither AC nor CM a second time; consequently it must meet EF somewhere in H; therefore AB and EF are mutually parallel.

[CASE 3: Parallels AB and CD lie in two planes, whose intersection is EF]

Now let the parallels AB and CD (Fig.13) lie in two planes whose intersection line is EF. From a random point E of this latter let fall a perpendicular EA upon one of the two parallels, e.g., upon AB. Then from A, the foot of the perpendicular EA, let fall a new perpendicular AC upon the other parallel CD, and join the endpoints E and C of the two perpendiculars by the line EC. The angle BAC must be acute (Theorem 22); consequently a perpendicular CG from C let fall upon AB meets it in the point G upon that side of CA on which the lines AB and CD are considered parallel.

Every line EH [in the plane FEAB], however little it diverges from EF, pertains, with the line EC, to a plane which must cut the plane of the two parallels AB and CD along some line CH. This latter line cuts AB somewhere, and in fact in the very point H which is common to all three planes, through which necessarily also the line EH goes; consequently EF is parallel to AB.

In the same way we may show the parallelism of EF and CD.

Therefore the hypothesis that a line EF is parallel to one of two other parallels, AB and CD, is the same as considering EF as the intersection of two planes in which two parallels, AB, CD, lie.

Consequently two lines are parallel to one another if they are parallel to a third line, though the three be not coplanar.

The last theorem can be thus expressed: Three planes intersect in lines which are all parallel to each other if the parallelism of two is presupposed.

[FIG. 13]

[26] Triangles standing opposite to one another on a sphere are equivalent in surface [area].

By opposite triangles we here understand such as are made on both sides of the center by the intersections of the sphere with planes; in such triangles, therefore, the sides and angles are in contrary order.

In the opposite triangles ABC and A’B’C’ (Fig.14, where one of them must be looked upon as represented turned about), we have the sides AB = A’B’, BC = B’C’, CA = C’A’, and the corresponding angles at the points A, B, C are likewise equal to those in the other triangle at the points A’, B’, C’.

Through the three points A, B, C, suppose a plane passed, and upon it from the center of the sphere a perpendicular dropped whose prolongations both ways cut both opposite triangles in the points D and D’ of the sphere. The distances of the first, D, from the points A, B, C, in arcs of great circles on the sphere, must be equal (Theorem 12) as well to each other as also to the distances D’A’, D’B’, D’C’, on the other triangle (Theorem 6); consequently the isosceles triangles about the points D and D’ in the two spherical triangles ABC and A’B’C’ are congruent.

In order to judge of the equivalence of any two surfaces in general, I take the following theorem as fundamental: Two surfaces are equivalent when they arise from the mating or separating of equal parts.

[27] A three-sided solid angle equals the half sum of the surface angles minus a right angle.

In the spherical triangle ABC (Fig.15), where each side is < 𝜋, designate the angles by A, B, C; prolong the side AB so that a whole circle ABA’B’A is produced; this divides the sphere into two equal parts.

In that half in which is the triangle ABC, prolong now the other two sides through their common intersection point C until they meet the circle in A’ and B’.

In this way the hemisphere is divided into four triangles, ABC, ACB’, B’CA’, A’CB, whose size may be designated by P, X, Y, Z. It is evident that here P + X = B, P + Z = A.

The size of the spherical triangle Y equals that of the opposite triangle ABC’, having a side AB in common with the triangle P, and whose third angle C’ lies at the endpoint of the diameter of the sphere which goes from C through the center D of the sphere (Theorem 26). Hence it follows that P + Y = C. And since P + X + Y + Z = 𝜋, therefore we have also

P = ½(A + B + C − 𝜋)
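The final formula follows by adding the three relations just obtained. The block below is an editorial sketch of that step, followed by a check on one familiar triangle; it uses the measure implicit in the text, in which the hemisphere counts as 𝜋, so that a lune has the same size as its angle. The octant example is an illustration chosen here, not one given by Lobachevsky.

```latex
% Editorial sketch: add P + X = B, P + Z = A, P + Y = C and use P + X + Y + Z = \pi.
\begin{align*}
  (P + X) + (P + Z) + (P + Y) &= A + B + C, \\
  2P + (P + X + Y + Z) &= A + B + C, \\
  2P + \pi &= A + B + C
  \quad\Longrightarrow\quad
  P = \tfrac{1}{2}(A + B + C - \pi).
\end{align*}
% Check (illustrative): the octant triangle has A = B = C = \pi/2, so
%   P = (1/2)(3\pi/2 - \pi) = \pi/4,
% one quarter of the hemisphere, which has size \pi in this measure.
```

Four such octants fill the hemisphere, which agrees with the value 𝜋/4 obtained from the formula.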

[FIG. 14]

[FIG. 15]

[IDEM ALITER]

We may attain to the same conclusion in another way, based solely upon the theorem about the equivalence of surfaces given above (Theorem 26).

In the spherical triangle ABC (Fig.16), halve the sides AB and BC, and through the midpoints D and E draw a great circle; upon this let fall from A, B, C the perpendiculars AF, BH, and CG. If the perpendicular from B falls at H between D and E, then will (of the triangles so made) BDH = AFD, and BHE = EGC (Theorems 6 and 15), whence follows that the surface of the triangle ABC equals that of the quadrilateral AFGC (Theorem 26).

If the point H coincides with the middle point E of the side BC (Fig.17), only two equal right-angled triangles, ADF and BDE, are made, by whose interchange the equivalence of the surfaces of the triangle ABC and the quadrilateral AFEC is established.

If, finally, the point H falls outside the triangle ABC (Fig.18), the perpendicular CG goes, in consequence, through the triangle, and so we go over from the triangle ABC to the quadrilateral AFGC by adding the triangle FAD = DBH, and then taking away the triangle CGE = EBH.

Supposing in the spherical quadrilateral AFGC a great circle passed through the points A and G, as also through F and C, then will their arcs between AG and FC equal one another (Theorem 15); consequently also the triangles FAC and ACG will be congruent (Theorem 15), and the angle FAC will equal the angle ACG.

Hence follows, that in all the preceding cases, the sum of all three angles of the spherical triangle equals the sum of the two equal angles in the quadrilateral which are not the right angles.

Therefore we can, for every spherical triangle, in which the sum of the three angles is S, find a quadrilateral with equivalent surface, in which are two right angles and two equal perpendicular sides, and where the two other angles are each ½S.

[FIG. 16]

[FIG. 17]

Let now ABCD (Fig.19) be the spherical quadrilateral, where the sides AB = DC are perpendicular to BC, and the angles A and D are each ½S.

Prolong the sides AD and BC until they cut one another in E, and further beyond E, make DE = EF, and let fall upon the prolongation of BC the perpendicular FG. Bisect the whole arc BG and join the midpoint H, by great-circle-arcs, with A and F.

The triangles EFG and DCE are congruent (Theorem 15), so FG = DC = AB.

The triangles ABH and HGF are likewise congruent, since they are right angled and have equal perpendicular sides; consequently AH and AF pertain to one circle, the arc AHF = 𝜋, ADEF likewise = 𝜋, and the angle

HAD = HFE = ½S − BAH = ½S − HFG = ½S − HFE − EFG = ½S − HAD − 𝜋 + ½S;

consequently HFE = ½(S − 𝜋); or what is the same, this equals the size of the lune AHFDA, which again is equal to the quadrilateral ABCD, as we easily see if we pass over from the one to the other by first adding the triangle EFG and then BAH and thereupon taking away the triangles equal to them, DCE and HFG.

Therefore ½(S − 𝜋) is the size of the quadrilateral ABCD and at the same time also that of the spherical triangle in which the sum of the three angles is equal to S.

[28] If three planes cut each other in parallel lines, then the sum of the three surface angles equals two right angles.

Let AA’, BB’, CC’ (Fig.20) be three parallels made by the intersection of planes (Theorem 25). Take upon them at random three points A, B, C, and suppose through these a plane passed, which consequently will cut the planes of the parallels along the straight lines AB, AC, and BC.

Further, pass through the line AC and any point D on BB’ another plane, whose intersection with the two planes of the parallels (AA’ and BB’, CC’ and BB’) produces the two lines AD and DC, and whose inclination to the third plane of the parallels AA’ and CC’ we will designate by 𝑤.

[FIG. 18]

[FIG. 19]

[FIG. 20]

The angles between the three planes in which the parallels lie will be designated X, Y, Z, respectively at the lines AA’, BB’, CC’. Finally, call the linear angles BDC = a, ADC = b, ADB = c.

About A as center suppose a sphere described, upon which the intersections of the straight lines AC, AD, AA’ with it determine a spherical triangle, with the sides p, q, and r. Call its size 𝛼. Opposite the side q lies the angle 𝑤, opposite r lies X, and consequently opposite p lies the angle 𝜋 + 2𝛼 − 𝑤 − X (Theorem 27).

In like manner CA, CD, CC’ cut a sphere about the center C, and determine a triangle of size 𝛽, with the sides p’, q’, r’, and the angles, 𝑤 opposite q’, Z opposite r’, and consequently 𝜋 + 2𝛽 − 𝑤 − Z opposite p’.

Finally is determined by the intersection of a sphere about D with the lines DA, DB, DC, a spherical triangle, whose sides are l, m, n, and the angles opposite them 𝑤 + Z − 2𝛽, 𝑤 + X − 2𝛼, and Y. Consequently its size

𝛿 = ½(X + Y + Z − 𝜋) − 𝛼 − 𝛽 + 𝑤.
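Both the third angle of the triangle about A and the size 𝛿 come from applying Theorem 27 (size = half the angle sum less 𝜋). The block below is an editorial sketch of the arithmetic; the symbol θ_p for the angle opposite the side p is an assumption introduced only for this illustration.

```latex
% Editorial sketch. \theta_p (assumed name) is the angle opposite side p
% in the spherical triangle about A, whose size is \alpha.
\begin{align*}
  \alpha &= \tfrac{1}{2}\bigl(w + X + \theta_p - \pi\bigr)
  \quad\Longrightarrow\quad
  \theta_p = \pi + 2\alpha - w - X . \\
\intertext{For the triangle about D, with angles $w + Z - 2\beta$, $w + X - 2\alpha$ and $Y$:}
  \delta &= \tfrac{1}{2}\bigl[(w + Z - 2\beta) + (w + X - 2\alpha) + Y - \pi\bigr] \\
         &= \tfrac{1}{2}(X + Y + Z - \pi) - \alpha - \beta + w .
\end{align*}
```

The two angles 𝑤 + X − 2𝛼 and 𝑤 + Z − 2𝛽 are the supplements of the angles opposite p and p’ in the triangles about A and C, which is how the three triangles are tied together.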

Decreasing 𝑤 lessens also the sizes of the triangles, 𝛼 and 𝛽, so that 𝛼 + 𝛽 − 𝑤 can be made smaller than any given number.

In the triangle 𝛿 can likewise the sides l and m be lessened even to vanishing (Theorem 21); consequently the triangle 𝛿 can be placed with one of its sides l or m upon a great circle of the sphere as often as you choose without thereby filling up the half of the sphere; hence 𝛿 vanishes together with 𝑤; whence follows that necessarily we must have X + Y + Z = 𝜋.

[29] In a rectilineal triangle, the perpendiculars erected at the midpoints of the sides either do not meet, or they all three cut each other in one point.

Having presupposed in the triangle ABC (Fig.21) that the two perpendiculars ED and DF, which are erected upon the sides AB and BC at their midpoints E and F, intersect in the point D, then draw within the angles of the triangle the lines DA, DB, DC.

In the congruent triangles ADE and BDE (Theorem 10), we have AD = BD. Thus follows also that BD = CD; the triangle ADC is hence isosceles; consequently the perpendicular dropped from the vertex D upon the base AC falls upon G, the midpoint of the base.

The proof remains unchanged also in the case when the intersection point D of the two perpendiculars ED and FD falls in the line AC itself, or falls without the triangle.

[FIG. 20: parallels AA’, BB’, CC’; point D; spherical triangles with sides p, q, r; p’, q’, r’; l, m, n]

[FIG. 21: triangle ABC with points D, E, F, G]

Therefore in case we presuppose that two of those perpendiculars do not intersect, then also the third cannot meet with them.

[30] The perpendiculars which are erected upon the sides of a rectilineal triangle at their midpoints, must all three be parallel to each other, so soon as the parallelism of two of them is presupposed.

In the triangle ABC (Fig.22) let the lines DE, FG, HK, be erected perpendicular upon the sides at their midpoints D, F, H.

[CASE 1]

We will in the first place assume that the two perpendiculars DE and FG are parallel, cutting the line AB in L and M, and that the perpendicular HK lies between them. Within the angle BLE draw from the point L, at random, a straight line LG, which must cut FG somewhere in G, how small soever the angle of deviation GLE may be (Theorem 16).

Since in the triangle LGM the perpendicular HK cannot meet with MG (Theorem 29), therefore it must cut LG somewhere in P, whence follows, that HK is parallel to DE (Theorem 16), and to MG (Theorems 18 and 25).

Put the side BC = 2a, AC = 2b, AB = 2c, and designate the angles opposite these sides by A, B, C. Then we have, in the case just considered,

A = 𝛱(b) − 𝛱(c)
B = 𝛱(a) − 𝛱(c)
C = 𝛱(a) + 𝛱(b)

as one may easily show with the help of the lines AA’, BB’, CC’, which are drawn from the points A, B, C, parallel to the perpendicular HK, and consequently to both the other perpendiculars DE and FG (Theorems 23 and 25).

[FIG. 22: triangle ABC; perpendiculars DE, FG, HK; points L, M, P; parallels AA’, BB’, CC’]

[CASE 2]

Let now the two perpendiculars HK and FG be parallel, then can the third DE not cut them (Theorem 29), hence is it either parallel to them, or it cuts AA’. The last assumption is not other than that the angle

C > 𝛱(a) + 𝛱(b)

If we lessen this angle so that it becomes equal to 𝛱(a) + 𝛱(b), while we in that way give the line CA the new position CQ (Fig.23), and designate the size of the third side BQ by 2c′, then must the angle CBQ at the point B, which is increased, in accordance with what is proved above, be

CBQ = 𝛱(a) − 𝛱(c′) > 𝛱(a) − 𝛱(c),

whence follows c′ > c (Theorem 23). In the triangle ACQ, however, the angles at A and Q are equal, hence in the triangle ABQ must the angle at Q be greater than that at the point A; consequently is AB > BQ (Theorem 9); that is, c > c′.

[31] We call a boundary line (or oricycle) that curve lying in a plane for which all perpendiculars erected at the midpoints of chords are parallel to each other.

In conformity with this definition, we can represent the generation of a boundary line if we draw from a given line AB (Fig.24) and from a given point A on it, chords such as AC = 2a, making different angles CAB = 𝛱(a). The end C of such a chord will lie on the boundary line, whose points we can thus gradually determine.

The perpendicular DE erected upon the chord AC at its midpoint D will be parallel to the line AB, which we will call the Axis of the boundary line. In like manner will also each perpendicular FG, erected at the midpoint of any chord AH, be parallel to AB; consequently must this peculiarity pertain also in general to every perpendicular KL which is erected at the midpoint K of any chord CH, between whatever points C and H of the boundary line this may be drawn (Theorem 30). Such perpendiculars must therefore likewise, without distinction from AB, be called Axes of the boundary line.

[FIG. 23: points A, B, C, Q]

[FIG. 24: boundary line with points A, B, C, E, F, G, H, L]

[32] A circle with continually increasing radius merges into a boundary line.

Given AB (Fig.25), a chord of a boundary line, draw from the endpoints A and B of the chord two axes AC and BF, which consequently will make with the chord two equal angles BAC = ABF = 𝛼 (Theorem 31).

Upon one of these axes AC, take anywhere the point E as center of the circle, and draw the arc AF from the initial point A (of the axis AC) to its intersection point F with the other axis BF.

The radius of the circle, FE, corresponding to the point F, will make on the one side with the chord AF an angle AFE = 𝛽, and on the other side with the axis BF, the angle EFD = 𝛾. It follows that the angle between the two chords BAF = 𝛼 − 𝛽 < 𝛽 + 𝛾 − 𝛼 (Theorem 22); whence follows 𝛼 − 𝛽 < ½𝛾.

Since, however, the angle 𝛾 approaches the limit zero, in consequence of moving the center E in the direction AC, when F remains unchanged (Theorem 21), as well as in consequence of an approach of F to B on the axis BF, when the center E remains in its position (Theorem 22), so it follows, that with such a lessening of the angle 𝛾, also the angle 𝛼 − 𝛽, or the mutual inclination of the two chords AB and AF, and hence also the distance of the point B on the boundary line from the point F on the circle, tends to vanish.

Consequently one may also call the boundary line a circle with infinitely great radius.

[33] Let AA’ = BB’ = 𝒙 (Fig.26) be two lines parallel toward the side from A to A’, which parallels serve as axes for the two boundary arcs (arcs on two boundary lines) 𝐀𝐁 = 𝒔, 𝐀′𝐁′ = 𝒔′. Then is

𝒔′ = 𝒔𝒆^(−𝒙)

where 𝒆 is independent of the arcs 𝒔, 𝒔′, and of the straight line 𝒙, the distance of the arc 𝒔′ from 𝒔.

[FIG. 25: chord AB of a boundary line; axes AC and BF; center E; points D, F; angles 𝛼, 𝛽, 𝛾]

[FIG. 26: axes AA’, BB’, CC’; arcs t, t’; distance x]

In order to prove this, assume that the ratio of the arc 𝑠 to 𝑠′ is equal to the ratio of the two whole numbers 𝑛 and 𝑚. Between the two axes AA’, BB’, draw yet a third axis CC’, which so cuts off from the arc AB a part AC = 𝑡 and from the arc A’B’ on the same side, a part A′C′ = 𝑡′. Assume the ratio of 𝑡 to 𝑠 equal to that of the whole numbers 𝑝 and 𝑞, so that

𝑠 = (𝑛/𝑚) 𝑠′
𝑡 = (𝑝/𝑞) 𝑠

Now divide 𝑠 (by axes) into 𝑛𝑞 equal parts; then will there be 𝑚𝑞 such parts on 𝑠′ and 𝑛𝑝 on 𝑡. However, there correspond to these equal parts on 𝑠 and 𝑡 likewise equal parts on 𝑠′ and 𝑡′; consequently we have

𝑡′/𝑡 = 𝑠′/𝑠

Hence also wherever the two arcs 𝑡 and 𝑡′ may be taken between the two axes AA’ and BB’, the ratio of 𝑡 to 𝑡′ remains always the same, as long as the distance 𝑥 between them remains the same. If therefore for 𝑥 = 1, we put 𝑠 = 𝑒𝑠′, then we must have for every 𝑥

𝑠′ = 𝑠𝑒^(−𝑥)

Since 𝑒 is an unknown number only subject to the condition 𝑒 > 1, and further the linear unit for 𝑥 may be taken at will, therefore we may, for the simplification of reckoning, so choose it that by 𝑒 is to be understood the base of Napierian logarithms.

We may here remark that 𝑠′ = 0 for 𝑥 = ∞, hence not only does the distance between the two parallels decrease (Theorem 24), but with the prolongation of the parallels toward the side of the parallelism this at last wholly vanishes. Parallel lines have therefore the character of asymptotes.
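The scaling law of Theorem 33 can be illustrated numerically. The following Python sketch is not part of Lobachevsky's text: the function name `boundary_arc` and the sample values are hypothetical, and it assumes the unit of length has already been chosen so that 𝑒 is the base of natural logarithms. It checks that shrinking an arc over a distance x1 and then over x2 agrees with shrinking it once over x1 + x2, that the ratio 𝑠′/𝑠 depends only on the distance, and that the arc becomes negligible far along the parallels.

```python
import math

def boundary_arc(s: float, x: float) -> float:
    """Length s' of the boundary arc cut off by the same two axes
    at distance x from an arc of length s (Theorem 33: s' = s * e^(-x))."""
    return s * math.exp(-x)

s = 2.0
x1, x2 = 0.7, 1.9

# Moving the arc by x1 and then by x2 equals moving it once by x1 + x2.
two_steps = boundary_arc(boundary_arc(s, x1), x2)
one_step = boundary_arc(s, x1 + x2)

# The ratio s'/s depends only on the distance x, not on the arc chosen.
ratio_small = boundary_arc(0.5, x1) / 0.5
ratio_large = boundary_arc(8.0, x1) / 8.0

# Far along the parallels the arc between them shrinks toward zero.
far_arc = boundary_arc(s, 50.0)

print(two_steps, one_step, ratio_small, ratio_large, far_arc)
```

The composition check mirrors the step in the proof where the ratio of corresponding arcs is shown to depend on the distance 𝑥 alone.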


[34] Boundary surface (or orisphere) we call that surface which arises from the revolution of the boundary line about one of its axes, which, together with all other axes of the boundary line, will be also an axis of the boundary surface. A chord is inclined at equal angles to such axes drawn through its endpoints, wheresoever these two endpoints may be taken on the boundary surface.

Let A, B, C (Fig.27) be three points on a boundary surface; AA’, the axis of revolution, BB’ and CC’ two other axes. Hence AB and AC are chords to which the axes are inclined at equal angles A’AB = B’BA, A’AC = C’CA (Theorem 31).

Two axes BB’, CC’, drawn through the endpoints of the third chord BC, are likewise parallel and lie in one plane (Theorem 25).

A perpendicular DD’ erected at the midpoint D of the chord AB and in the plane of the two parallels AA’, BB’, must be parallel to the three axes AA’, BB’, CC’ (Theorems 23 and 25); just such a perpendicular EE’ upon the chord AC in the plane of the parallels AA’, CC’ will be parallel to the three axes AA’, BB’, CC’, and the perpendicular DD’.

Let now the angle between (1) the plane in which the parallels AA’ and BB’ lie, and (2) the plane of the triangle ABC, be designated by 𝛱(α), where α may be positive, negative, or null. If α is positive, then erect FD = α within the triangle ABC, and in its plane, perpendicular upon the chord AB at its midpoint D.

Were α a negative number, then must FD = α be drawn outside the triangle on the other side of the chord AB; when α = 0, then point F coincides with D.

In all cases arise two congruent right-angled triangles AFD and DFB; consequently we have FA = FB. Erect now at F the line FF’ perpendicular to the plane of the triangle ABC.

Since the angle D′DF = 𝛱(α), and DF = α, so FF’ is parallel to DD’ and the line EE’, with which also it lies in one plane perpendicular to the plane of the triangle ABC.

Suppose now in the plane of the parallels EE’, FF’ upon EF the perpendicular EK erected, then will this also be at right angles to the plane of the triangle ABC (Theorem 13), and to the line AE lying in this plane (Theorem 11); and consequently must AE, which is perpendicular to EK and EE’, be also at the same time perpendicular to FE (Theorem 11). The triangles AEF and FEC are congruent, since they are right-angled and have the sides about the right angles equal. Hence is

AF = FC = FB

A perpendicular from the vertex F of the isosceles triangle BFC, let fall upon the base BC, goes through its midpoint G; a plane passed through this perpendicular FG and the line FF’ must be perpendicular to the plane of the triangle ABC, and cuts the plane of the parallels BB’, CC’, along the line GG’, which is likewise parallel to BB’ and CC’ (Theorem 25); since now CG is at right angles to FG, and hence at the same time also to GG’, so consequently is the angle C’CG = B’BG (Theorem 23).

[FIG. 27: points A, B, C on the boundary surface; axes AA’, BB’, CC’; points D, E, F, G, K; lines DD’, EE’, FF’, GG’]

Hence follows that for the boundary surface each of the axes may be considered as axis of revolution.

Principal-plane we will call each plane passed through an axis of the boundary surface.

Accordingly every principal-plane cuts the boundary surface at a boundary line, while for another position of the cutting plane this intersection is a circle.

Three principal planes which mutually cut each other make with each other angles whose sum is π (Theorem 28).

These angles we will consider as angles in the boundary triangle whose sides are arcs of the boundary lines which are made on the boundary surface by the intersections with the three principal planes. Consequently the same interdependence of the angles and sides that is proved in the ordinary geometry for the rectilineal triangle pertains to the boundary triangles [in the imaginary geometry].

[35] [Spherical trigonometry is not dependent upon whether in a rectilineal triangle the sum of the three angles is equal to two right angles or not.]14

In what follows, we will designate the size of a line by a letter with an accent added, e.g. 𝑥′, in order to indicate that this has a relation to that of another line, which is represented by the same letter without accent, 𝑥, which relation is given by the equation:

𝛱(𝑥) + 𝛱(𝑥′) = ½𝜋

Let now ABC (Fig.28) be a rectilineal right-angled triangle, where the hypotenuse AB = c, the other sides AC = b, BC = a, and the angles opposite them are:

BAC = 𝛱(𝛼)
ABC = 𝛱(𝛽)

14 This enunciation occurs at the end of this theorem, and is supplied here at the outset of the theorem for the sake of clarity, although Lobachevsky himself did not state it at the outset.

[FIG. 28: right-angled triangle ABC with sides a, b, c; parallels AA’, BB’, CC’; points B’’, C’’; arcs p, q]

At the point A erect the line AA’ at right angles to the plane of the triangle ABC, and from the points B and C draw BB’ and CC’ parallel to AA’.

The planes in which these three parallels lie make with each other the angles: 𝛱(𝛼) at AA’, a right angle at CC’ (Theorems 11 and 13), 𝛱(𝛼′) at BB’ (Theorem 28).

The intersections of the lines BA, BC, BB’ with a sphere described about the point B as center, determine a spherical triangle 𝑚𝑛𝑘, in which the sides are:

𝑚𝑛 = 𝛱(c)
𝑘𝑛 = 𝛱(𝛽)
𝑚𝑘 = 𝛱(a)

and the opposite angles are 𝛱(b), 𝛱(𝛼′), ½𝜋.

Therefore we must, with the existence of a rectilineal triangle whose sides are a, b, c and the opposite angles 𝛱(𝛼), 𝛱(𝛽), ½𝜋, also admit the existence of a spherical triangle (Fig.29) with the sides 𝛱(c), 𝛱(𝛽), 𝛱(a) and the opposite angles 𝛱(b), 𝛱(𝛼′), ½𝜋.

Of these two triangles, however, the existence of the spherical triangle also conversely necessitates anew that of a rectilineal one, which in consequence also can have the sides a, 𝛼′, 𝛽, and the opposite angles 𝛱(b′), 𝛱(c), ½𝜋.

Hence we may pass over from a, b, c, 𝛼, 𝛽 to b, a, c, 𝛽, 𝛼 and also to a, 𝛼′, 𝛽, b′, c.

Suppose through the point A (Fig.28) with AA’ as axis, a boundary surface passed, which cuts the two other axes BB’, CC’, in B’’ and C’’, and whose intersections with the planes of the parallels form a boundary triangle, whose sides are B′′C′′ = 𝑝, C′′A = 𝑞, B′′A = 𝑟, and the angles opposite them 𝛱(𝛼), 𝛱(𝛼′), ½𝜋, and where consequently (Theorem 34):

𝑝 = 𝑟 sin 𝛱(𝛼)
𝑞 = 𝑟 cos 𝛱(𝛼)

[FIG. 29: the rectilineal triangle with sides a, b, c and angles 𝛱(𝛼), 𝛱(𝛽), ½𝜋, and the spherical triangle with sides 𝛱(c), 𝛱(𝛽), 𝛱(a) and angles 𝛱(b), 𝛱(𝛼′), ½𝜋]

Now break the connection of the three principal planes along the line BB’, and turn them out from each other so that they, with all the lines lying in them, come to lie in one plane, where consequently the arcs 𝑝, 𝑞, 𝑟 will unite to a single arc of a boundary line, which goes through the point A and has AA’ for axis, in such a manner that (Fig.30) on the one side will lie: the arcs 𝑞 and 𝑝, the side b of the triangle (which is perpendicular to AA’ at A), the axis CC’ going from the end of b parallel to AA’ and through C’’ (the union point of 𝑝 and 𝑞), the side a perpendicular to CC’ at the point C, and, from the endpoint of a, the axis BB’ (parallel to AA’) which goes through the endpoint B’’ of the arc 𝑝.

On the other side of AA’ will lie: the side c perpendicular to AA’ at the point A, and the axis BB’ parallel to AA’, and going through the endpoint B’’ of the arc 𝑟 remote from the endpoint of b.

The size of the line CC’’ depends upon b, which dependence we will express by CC′′ = 𝑓(b).

In like manner we will have BB′′ = 𝑓(c).

If we describe, taking CC’ as axis, a new boundary line from the point C to its intersection D with the axis BB’ and designate the arc CD by 𝑡, then is BD = 𝑓(a).

BB′′ = BD + DB′′ = BD + CC′′

and consequently

𝑓(c) = 𝑓(a) + 𝑓(b)

Moreover, we perceive (by Theorem 33) that

𝑡 = 𝑝 𝑒^(𝑓(b)) = 𝑟 sin 𝛱(𝛼) 𝑒^(𝑓(b))

[FIG. 30: the three principal planes unfolded into one plane; axes AA’, BB’, CC’; points B’’, C’’, D; sides a, b, c; arcs p, q, r, t]

If the perpendicular to the plane of the triangle ABC (Fig.28) were erected at B instead of at the point A, then would the lines c and 𝑟 remain the same, the arcs 𝑞 and 𝑡 would change to 𝑡 and 𝑞, the straight lines a and b into b and a, and the angle 𝛱(𝛼) into 𝛱(𝛽); consequently we would have

𝑞 = 𝑟 sin 𝛱(𝛽) 𝑒^(𝑓(a))

whence follows by substituting the value of 𝑞,

cos 𝛱(𝛼) = sin 𝛱(𝛽) 𝑒^(𝑓(a))

and if we change 𝛼 and 𝛽 into b′ and c,

sin 𝛱(b) = sin 𝛱(c) 𝑒^(𝑓(a))

further, by multiplication with 𝑒^(𝑓(b)),

sin 𝛱(b) 𝑒^(𝑓(b)) = sin 𝛱(c) 𝑒^(𝑓(c))

Hence follows also

sin 𝛱(a) 𝑒^(𝑓(a)) = sin 𝛱(b) 𝑒^(𝑓(b))

Since now the straight lines a and b are independent of one another, and moreover, for b = 0, 𝑓(b) = 0, 𝛱(b) = ½𝜋, thus we have for every straight line a

𝑒^(−𝑓(a)) = sin 𝛱(a)

Therefore,

sin 𝛱(c) = sin 𝛱(a) sin 𝛱(b)
sin 𝛱(𝛽) = cos 𝛱(𝛼) sin 𝛱(a)

Hence we obtain besides by mutation of the letters

sin 𝛱(𝛼) = cos 𝛱(𝛽) sin 𝛱(b)
cos 𝛱(b) = cos 𝛱(c) cos 𝛱(𝛼)
cos 𝛱(a) = cos 𝛱(c) cos 𝛱(𝛽)

If we designate in the right-angled spherical triangle (Fig.29) the sides 𝛱(c), 𝛱(𝛽), 𝛱(a), with the opposite angles 𝛱(b), 𝛱(𝛼′), by the letters a, b, c, A, B, then the obtained equations take on the form of those which we know as proved in spherical trigonometry for the right-angled triangle, namely,

sin a = sin c sin A
sin b = sin c sin B
cos A = cos a sin B
cos B = cos b sin A
cos c = cos a cos b

from which equations we can pass over to those for all spherical triangles in general.

Hence spherical trigonometry is not dependent upon whether in a rectilineal triangle the sum of the three angles is equal to two right angles or not.
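The correspondence between the rectilineal right-angled triangle and the spherical triangle of Fig.29 can be checked numerically. The Python sketch below is an illustration only, not Lobachevsky's argument. It assumes the closed form tan ½𝛱(x) = 𝑒^(−x), which the text only derives later in Theorem 36, and it builds the test triangle from the standard hyperbolic relations cosh c = cosh a cosh b and tan A = tanh a / sinh b, which are imported from outside the text. The names `Pi`, `angle_A`, `angle_B` and the sample legs are hypothetical.

```python
import math

def Pi(x: float) -> float:
    """Angle of parallelism, assuming tan(Pi(x)/2) = e^(-x) (Theorem 36)."""
    return 2.0 * math.atan(math.exp(-x))

# A right-angled rectilineal triangle built from its two legs a and b
# (standard hyperbolic formulas, used here only to construct test data).
a, b = 0.8, 1.3
c = math.acosh(math.cosh(a) * math.cosh(b))        # hypotenuse
angle_A = math.atan2(math.tanh(a), math.sinh(b))   # angle opposite a, i.e. Pi(alpha)
angle_B = math.atan2(math.tanh(b), math.sinh(a))   # angle opposite b, i.e. Pi(beta)

# Spherical triangle of Fig.29: sides Pi(c), Pi(beta), Pi(a);
# opposite angles Pi(b), Pi(alpha') = pi/2 - Pi(alpha), and a right angle.
sa, sb, sc = Pi(c), angle_B, Pi(a)
SA, SB = Pi(b), math.pi / 2 - angle_A

# Left side minus right side of the five spherical right-triangle equations.
residuals = [
    math.sin(sa) - math.sin(sc) * math.sin(SA),
    math.sin(sb) - math.sin(sc) * math.sin(SB),
    math.cos(SA) - math.cos(sa) * math.sin(SB),
    math.cos(SB) - math.cos(sb) * math.sin(SA),
    math.cos(sc) - math.cos(sa) * math.cos(sb),
]
print(residuals)
```

All five residuals should vanish up to rounding, which is the numerical counterpart of the statement that the equations of spherical trigonometry hold unchanged in the imaginary geometry.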


[36] We will now consider anew the right-angled rectilineal triangle ABC (Fig.31), in which the sides are a, b, c, and the opposite angles 𝛱(𝛼), 𝛱(𝛽), ½𝜋.

Prolong the hypotenuse c through the point B, and make BD = 𝛽; at the point D erect upon BD the perpendicular DD’, which consequently will be parallel to BB’, the prolongation of the side a beyond the point B. Parallel to DD’ from the point A draw AA’, which is at the same time also parallel to CB’ (Theorem 25). Therefore is the angle

A′AD = 𝛱(c + 𝛽)

and

A′AC = 𝛱(b)

consequently

𝛱(b) = 𝛱(𝛼) + 𝛱(c + 𝛽)

If from B we lay off 𝛽 on the hypotenuse c, then at the endpoint D (Fig.32), within the triangle, erect upon AB the perpendicular DD’, and from the point A, parallel to DD’, draw AA’, so will BC with its prolongation CC’ be the third parallel; then is angle

CAA′ = 𝛱(b)

and

DAA′ = 𝛱(c − 𝛽)

consequently

𝛱(c − 𝛽) = 𝛱(𝛼) + 𝛱(b)

The last equation is then also still valid when c = 𝛽, or c < 𝛽.

[FIG. 31: right-angled triangle ABC with sides a, b, c; BD = 𝛽 on the prolongation of c; parallels AA’, BB’, DD’]

[FIG. 32: right-angled triangle ABC; BD = 𝛽 laid off on c; parallels AA’, CC’, DD’]

If c = 𝛽 (Fig.33), then the perpendicular AA’ erected upon AB at the point A is parallel to the side BC = a, with its prolongation, CC’. Consequently we have

𝛱(𝛼) + 𝛱(b) = ½𝜋

whilst also

𝛱(c − 𝛽) = ½𝜋 (Theorem 23)

If c < 𝛽, then the end of 𝛽 falls beyond the point A at D (Fig.34) upon the prolongation of the hypotenuse AB. Here the perpendicular DD’ erected upon AD, and the line AA’ parallel to it from A, will likewise be parallel to the side BC = a, with its prolongation CC’. Here we have the angle

DAA′ = 𝛱(𝛽 − c)

consequently

𝛱(𝛼) + 𝛱(b) = 𝜋 − 𝛱(𝛽 − c) = 𝛱(c − 𝛽) (Theorem 23).

The combination of the two equations found gives

2𝛱(b) = 𝛱(c − 𝛽) + 𝛱(c + 𝛽)
2𝛱(𝛼) = 𝛱(c − 𝛽) − 𝛱(c + 𝛽)

whence follows

cos 𝛱(b) / cos 𝛱(𝛼) = cos[½𝛱(c − 𝛽) + ½𝛱(c + 𝛽)] / cos[½𝛱(c − 𝛽) − ½𝛱(c + 𝛽)]

Substituting here (Theorem 35) the value

cos 𝛱(b) / cos 𝛱(𝛼) = cos 𝛱(c)

we have

[tan ½𝛱(c)]² = tan ½𝛱(c − 𝛽) tan ½𝛱(c + 𝛽)

[FIG. 33: right-angled triangle ABC with c = 𝛽; parallels AA’, CC’; angles 𝛱(𝛼), 𝛱(𝛽), 𝛱(b)]

[FIG. 34: right-angled triangle ABC with c < 𝛽; point D beyond A; parallels AA’, CC’, DD’]

Since here 𝛽 is an arbitrary number, as the angle 𝛱(𝛽) at the one side of c may be chosen at will between the limits 0 and ½𝜋, consequently 𝛽 between the limits 0 and ∞, so we may deduce, by taking consecutively 𝛽 = c, 2c, 3c, etc., that for every positive number 𝑛,

[tan ½𝛱(c)]^𝑛 = tan ½𝛱(𝑛c)

If we consider 𝑛 as the ratio of two lines x and c, and assume that

cot ½𝛱(c) = 𝑒^c

then we find for every line x in general, whether it be positive or negative,

tan ½𝛱(x) = 𝑒^(−x)

where 𝑒 may be any arbitrary number which is greater than unity (since 𝛱(x) = 0 for x = ∞). Since the unit by which the lines are measured is arbitrary, so we may also understand by 𝑒 the base of the Napierian Logarithms.

[37] Of the equations found above in Theorem 35 it is sufficient to know the two following,

sin 𝛱(c) = sin 𝛱(a) sin 𝛱(b)
sin 𝛱(𝛼) = cos 𝛱(𝛽) sin 𝛱(b)

applying the latter to both the sides a and b about the right angle, in order from the combination to deduce the remaining two of Theorem 35, without ambiguity of the algebraic sign, since here all angles are acute. In a similar manner we attain the two equations

(1) tan 𝛱(c) = sin 𝛱(𝛼) tan 𝛱(a)
(2) cos 𝛱(a) = cos 𝛱(c) cos 𝛱(𝛽)
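The closed form for 𝛱 reached at the end of Theorem 36, on which the equations of Theorem 37 depend, lends itself to a quick numerical check. The following Python sketch is illustrative only: the helper names `Pi` and `half_tan` and the sample values of c, 𝛽 and 𝑛 are hypothetical, and 𝑒 is taken to be the base of natural logarithms, as the text permits.

```python
import math

def Pi(x: float) -> float:
    """Angle of parallelism from Theorem 36: tan(Pi(x)/2) = e^(-x)."""
    return 2.0 * math.atan(math.exp(-x))

def half_tan(x: float) -> float:
    """tan(Pi(x)/2), the quantity that appears throughout Theorem 36."""
    return math.tan(0.5 * Pi(x))

c, beta, n = 0.9, 0.4, 3

# Product relation: [tan(Pi(c)/2)]^2 = tan(Pi(c - beta)/2) * tan(Pi(c + beta)/2)
lhs_product = half_tan(c) ** 2
rhs_product = half_tan(c - beta) * half_tan(c + beta)

# Power relation: [tan(Pi(c)/2)]^n = tan(Pi(n c)/2)
lhs_power = half_tan(c) ** n
rhs_power = half_tan(n * c)

# Convention of Theorem 23 for negative lines: Pi(-x) = pi - Pi(x)
negative_sum = Pi(-c) + Pi(c)

# Limits: Pi(0) is a right angle, and Pi(x) tends to 0 as x grows.
right_angle = Pi(0.0)
far_angle = Pi(40.0)

print(lhs_product, rhs_product, lhs_power, rhs_power, negative_sum)
```

Because `Pi` is defined for negative arguments as well, the product relation can also be tried with 𝛽 > c, the case treated with Fig.34.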


We will now consider a rectilineal triangle whose sides are a, b, c, (Fig.35) and the opposite angles A, B, C.

If A and B are acute angles, then the perpendicular p from the vertex of the angle C falls within the triangle and cuts the side c into two parts, x on the side of the angle A and c − x on the side of the angle B. Thus arise two right-angled triangles, for which we obtain, by application of equation (1),

tan 𝛱(a) = sin B tan 𝛱(p)
tan 𝛱(b) = sin A tan 𝛱(p)

which equations remain unchanged also when one of the angles, e.g. B, is a right angle (Fig.36) or an obtuse angle (Fig.37). Therefore we have universally for every triangle

(3) sin A tan 𝛱(a) = sin B tan 𝛱(b)

For a triangle with acute angles, A, B (Fig.35), we have also, by Equation (2),

cos 𝛱(x) = cos A cos 𝛱(b)
cos 𝛱(c − x) = cos B cos 𝛱(a)

which equations also relate to triangles in which one of the angles A or B is a right angle or an obtuse angle. For example, for B = ½𝜋 (Fig.36), we must take x = c; the first equation then goes over into that which we have found above as Equation (2). The other, however, is self-sufficing. For B > ½𝜋 (Fig.37) the first equation remains unchanged. Instead of the second, however, we must write correspondingly

cos 𝛱(x − c) = cos(𝜋 − B) cos 𝛱(a)

but (by Theorem 23) we have

cos 𝛱(x − c) = −cos 𝛱(c − x)

and also

cos(𝜋 − B) = −cos B

If A is a right or an obtuse angle, then must c − x and x be put for x and c − x, in order to carry back this case upon the preceding.

[FIG. 35: triangle ABC with acute angles A, B; perpendicular p from C meeting c at D; segments x and c − x]

[FIG. 36: triangle ABC with a right angle at B; sides a, b, c]

[FIG. 37: triangle ABC with an obtuse angle at B; perpendicular p from C meeting the prolongation of c at D; segment x − c]

In order to eliminate x from both equations, we notice that (Theorem 36)

cos 𝛱(c − x) = (1 − [tan ½𝛱(c − x)]²) / (1 + [tan ½𝛱(c − x)]²)

cos 𝛱(c − x) = (1 − 𝑒^(2x − 2c)) / (1 + 𝑒^(2x − 2c))

cos 𝛱(c − x) = (1 − [tan ½𝛱(c)]² [cot ½𝛱(x)]²) / (1 + [tan ½𝛱(c)]² [cot ½𝛱(x)]²)

cos 𝛱(c − x) = (cos 𝛱(c) − cos 𝛱(x)) / (1 − cos 𝛱(c) cos 𝛱(x))

If we substitute here the expression for cos 𝛱(x), cos 𝛱(c − x), we obtain

cos 𝛱(c) = (cos 𝛱(a) cos B + cos 𝛱(b) cos A) / (1 + cos 𝛱(a) cos 𝛱(b) cos A cos B)

whence follows

cos 𝛱(a) cos B = (cos 𝛱(c) − cos A cos 𝛱(b)) / (1 − cos A cos 𝛱(b) cos 𝛱(c))

and finally

[sin 𝛱(c)]² = [1 − cos B cos 𝛱(c) cos 𝛱(a)][1 − cos A cos 𝛱(b) cos 𝛱(c)]

In the same way we must also have

(4) [sin 𝛱(a)]² = [1 − cos C cos 𝛱(a) cos 𝛱(b)][1 − cos B cos 𝛱(c) cos 𝛱(a)]

[sin 𝛱(b)]² = [1 − cos A cos 𝛱(b) cos 𝛱(c)][1 − cos C cos 𝛱(a) cos 𝛱(b)]


From these equations we find

([sin 𝛱(b)]² [sin 𝛱(c)]²) / [sin 𝛱(a)]² = [1 − cos A cos 𝛱(b) cos 𝛱(c)]²

Hence follows without ambiguity of sign,

(5) cos A cos 𝛱(b) cos 𝛱(c) + (sin 𝛱(b) sin 𝛱(c)) / sin 𝛱(a) = 1

If we substitute here the value of sin 𝛱(c) corresponding to Equation (3),

sin 𝛱(c) = (sin A / sin C) tan 𝛱(a) cos 𝛱(c)

then we obtain

cos 𝛱(c) = (cos 𝛱(a) sin C) / (sin A sin 𝛱(b) + cos A sin C cos 𝛱(a) cos 𝛱(b))

but by substituting this expression for cos 𝛱(c) in Equation (4),

(6) cot A sin C sin 𝛱(b) + cos C = cos 𝛱(b) / cos 𝛱(a)

By elimination of sin 𝛱(b), with the help of Equation (3), comes

(cos 𝛱(a) / cos 𝛱(b)) cos C = 1 − (cos A / sin B) sin C sin 𝛱(a)

In the meantime, Equation (6) gives, by changing the letters,

cos 𝛱(a) / cos 𝛱(b) = cot B sin C sin 𝛱(a) + cos C

From the last two equations follows

(7) cos A + cos B cos C = (sin B sin C) / sin 𝛱(a)


All four equations for the interdependence of the sides a, b, c, and the opposite angles A, B, C, in the rectilineal triangle will therefore be [Equations (3), (5), (6), (7)]:

(8)
sin A tan 𝛱(a) = sin B tan 𝛱(b)
cos A cos 𝛱(b) cos 𝛱(c) + (sin 𝛱(b) sin 𝛱(c)) / sin 𝛱(a) = 1
cot A sin C sin 𝛱(b) + cos C = cos 𝛱(b) / cos 𝛱(a)
cos A + cos B cos C = (sin B sin C) / sin 𝛱(a)
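The four Equations (8) can be tested on a concrete triangle. The Python sketch below is an illustration under stated assumptions, not part of the treatise: it takes tan ½𝛱(x) = 𝑒^(−x) from Theorem 36, and it constructs the angles of a sample triangle from the standard hyperbolic law of cosines, which is an outside fact equivalent to the second of Equations (8). The other three equations are therefore the non-trivial checks. The function names and the side lengths are hypothetical.

```python
import math

def Pi(x: float) -> float:
    """Angle of parallelism, assuming tan(Pi(x)/2) = e^(-x) (Theorem 36)."""
    return 2.0 * math.atan(math.exp(-x))

def sin_Pi(x: float) -> float:
    return math.sin(Pi(x))

def cos_Pi(x: float) -> float:
    return math.cos(Pi(x))

def tan_Pi(x: float) -> float:
    return math.tan(Pi(x))

def angle_opposite(side: float, s1: float, s2: float) -> float:
    """Angle opposite `side`, from the standard hyperbolic law of cosines
    (used only to construct a test triangle)."""
    cos_angle = (math.cosh(s1) * math.cosh(s2) - math.cosh(side)) / (
        math.sinh(s1) * math.sinh(s2)
    )
    return math.acos(cos_angle)

# Illustrative side lengths; they satisfy the triangle inequality.
a, b, c = 1.1, 0.9, 1.6
A = angle_opposite(a, b, c)
B = angle_opposite(b, c, a)
C = angle_opposite(c, a, b)

# Left side minus right side of each of the four Equations (8).
residuals = [
    math.sin(A) * tan_Pi(a) - math.sin(B) * tan_Pi(b),
    math.cos(A) * cos_Pi(b) * cos_Pi(c) + sin_Pi(b) * sin_Pi(c) / sin_Pi(a) - 1.0,
    math.cos(A) / math.sin(A) * math.sin(C) * sin_Pi(b)
    + math.cos(C)
    - cos_Pi(b) / cos_Pi(a),
    math.cos(A) + math.cos(B) * math.cos(C)
    - math.sin(B) * math.sin(C) / sin_Pi(a),
]
angle_sum = A + B + C

print(residuals, angle_sum)
```

For sides of this size the angle sum falls visibly short of 𝜋, as the imaginary geometry requires.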

If the sides a, b, c, of the triangle are very small, we may content ourselves with the approximate determinations (Theorem 36)

cot 𝛱(a) = a
sin 𝛱(a) = 1 − ½a²
cos 𝛱(a) = a

and in like manner also for the other sides b and c. The Equations (8), for such triangles, pass over into the following:

b sin A = a sin B
a² = b² + c² − 2bc cos A
a sin(A + C) = b sin A
cos A + cos(B + C) = 0

Of these equations, the first two are assumed in the ordinary geometry; the last two lead, with the help of the first, to the conclusion

A + B + C = 𝜋

Therefore the imaginary geometry passes over into the ordinary, when we suppose that the sides of a rectilineal triangle are very small.
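The passage to ordinary geometry for small triangles can be observed numerically. In the Python sketch below (an illustration only, with hypothetical names and sample values), the closed form of Theorem 36 is assumed for 𝛱, and the angles are again computed from the standard hyperbolic law of cosines. A triangle with sides in the ratio 3 : 4 : 5 is shrunk step by step, and the amount by which its angle sum falls short of 𝜋 is recorded. The three small-side approximations are compared with the exact values as well.

```python
import math

def Pi(x: float) -> float:
    """Angle of parallelism, assuming tan(Pi(x)/2) = e^(-x) (Theorem 36)."""
    return 2.0 * math.atan(math.exp(-x))

def angle_sum(a: float, b: float, c: float) -> float:
    """Angle sum of the rectilineal triangle with sides a, b, c,
    computed from the standard hyperbolic law of cosines."""
    def angle(opposite: float, s1: float, s2: float) -> float:
        cos_angle = (math.cosh(s1) * math.cosh(s2) - math.cosh(opposite)) / (
            math.sinh(s1) * math.sinh(s2)
        )
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, cos_angle)))
    return angle(a, b, c) + angle(b, c, a) + angle(c, a, b)

# Shrink a 3 : 4 : 5 triangle and record pi minus the angle sum.
scales = [1.0, 0.1, 0.01]
defects = [math.pi - angle_sum(3 * k, 4 * k, 5 * k) for k in scales]

# Small-side approximations: cot Pi(a) ~ a, sin Pi(a) ~ 1 - a^2/2, cos Pi(a) ~ a.
small = 1e-3
approximation_errors = [
    abs(1.0 / math.tan(Pi(small)) - small),
    abs(math.sin(Pi(small)) - (1.0 - 0.5 * small**2)),
    abs(math.cos(Pi(small)) - small),
]

print(defects, approximation_errors)
```

The recorded differences decrease with the scale, in line with the statement that the sum of the angles approaches two right angles as the sides become very small.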


I have, in the scientific bulletins of the University of Kasan, published certain researches in regard to the measurement of curved lines, of plane figures, of the surfaces and the volumes of solids, as well as in relation to the application of imaginary geometry to analysis.

The Equations (8) attain for themselves already a sufficient foundation for considering the assumption of imaginary geometry as possible. Hence there is no means, other than astronomical observations, to use for judging of the exactitude which pertains to the calculations of the ordinary geometry.

This exactitude is very far-reaching, as I have shown in one of my investigations, so that, for example, in triangles whose sides are attainable for our measurement, the sum of the three angles is not indeed different from two right angles by the hundredth part of a second.

In addition, it is worthy of notice that the four Equations (8) of plane geometry pass over into the equations for spherical triangles, if we put a√−1, b√−1, c√−1, instead of the sides a, b, c; with this change, however, we must also put

sin 𝛱(a) = 1 / cos a
cos 𝛱(a) = √−1 tan a
tan 𝛱(a) = 1 / (sin a √−1)

and similarly also for the sides b and c. In this manner we pass over from Equations (8) to the following:

sin A sin b = sin B sin a
cos a = cos b cos c + sin b sin c cos A
cot A sin C + cos C cos b = sin b cot a
cos A = cos a sin B sin C − cos B cos C
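The substitution of imaginary sides can be tried out with complex arithmetic. The Python sketch below is an illustration under assumptions that go beyond the text: it uses the consequences sin 𝛱(x) = 1/cosh x, cos 𝛱(x) = tanh x and tan 𝛱(x) = 1/sinh x of the formula tan ½𝛱(x) = 𝑒^(−x) from Theorem 36, and it assumes these expressions may be continued to complex arguments. The function names and sample values are hypothetical.

```python
import cmath
import math

def sin_Pi(x: complex) -> complex:
    """sin Pi(x) = 1 / cosh x, continued to complex x (assumption)."""
    return 1 / cmath.cosh(x)

def cos_Pi(x: complex) -> complex:
    """cos Pi(x) = tanh x, continued to complex x (assumption)."""
    return cmath.tanh(x)

def tan_Pi(x: complex) -> complex:
    """tan Pi(x) = 1 / sinh x, continued to complex x (assumption)."""
    return 1 / cmath.sinh(x)

i = 1j
a = 0.6

# The three replacements stated in the text for the side a * sqrt(-1).
substitution_errors = [
    abs(sin_Pi(i * a) - 1 / math.cos(a)),
    abs(cos_Pi(i * a) - i * math.tan(a)),
    abs(tan_Pi(i * a) - 1 / (i * math.sin(a))),
]

# A spherical triangle given by two sides and the included angle (radians);
# the third side follows from the spherical law of cosines.
b, c, A = 0.7, 1.2, 0.9
a_sph = math.acos(
    math.cos(b) * math.cos(c) + math.sin(b) * math.sin(c) * math.cos(A)
)

# Second of Equations (8), evaluated at the imaginary sides: it should equal 1.
lhs = (
    math.cos(A) * cos_Pi(i * b) * cos_Pi(i * c)
    + sin_Pi(i * b) * sin_Pi(i * c) / sin_Pi(i * a_sph)
)
law_of_cosines_error = abs(lhs - 1)

print(substitution_errors, law_of_cosines_error)
```

The last check shows the second of Equations (8) turning into the spherical law of cosines under the substitution; the remaining three equations can be examined in the same way.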


16 Further Exploration of Hyperbolic Geometry

Lobachevsky has introduced us to non-Euclidean geometry, but for a number of reasons it may be desirable to go a little further into non-Euclidean geometry than his short treatise has taken us.

First, his way of approaching non-Euclidean geometry is mixed together with the Euclidean alternative. For the sake of a clearer understanding of the Lobachevskian geometry, it would be good to state its principles independently, without stating foreign principles and alternatives.

Second, his choice of theorems is not always the best for the specific purpose of conveying a strong sense of Lobachevskian (as opposed to Euclidean) space. In his Proposition 35, for example, he focuses on the trigonometry of spherical triangles, which is not different in his new space from what it is in Euclidean space.

Third, there are some very surprising and easily demonstrated properties of Lobachevskian space that Lobachevsky omits in his short treatise, and that do convey a good sense of how this space differs from that of Euclid.

For these reasons, we will next look into a number of theorems of Lobachevskian geometry starting from the beginning, but not quite from scratch. “From the beginning,” because we will lay down the very first principles and reason from them. “Not quite from scratch,” because when these first principles or their conclusions have been stated or proved already by Euclid or Lobachevsky, we will often simply refer to them, and not prove them all over again.

A word of vocabulary, now, before we begin. Lobachevskian space has another name worth mentioning: it is called hyperbolic space. In Euclid’s geometry, there is one and only one parallel to a given straight line through a given point. In Lobachevsky’s geometry, we have seen, there are two parallels to a given straight line through a given point. And there is another major branch of non-Euclidean geometry, first developed by the German mathematician Bernhard Riemann (1826-1866), in which there are no parallels to a given straight line through a given point. One parallel, two parallels, and no parallels. This is like a mean, an excess, and a defect. Consequently, Euclid’s geometry is called parabolic geometry, Lobachevsky’s is called hyperbolic geometry, and Riemann’s is called elliptic geometry. These names in part reflect the mean, excess, and defect of parallels, just as the ordinate-square is exactly equal to, or exceeds, or falls short of the rectangle contained by the abscissa and the upright side in the conic sections with the corresponding names. But there is more. A parallel in Lobachevsky’s geometry behaves toward the straight line to which it is parallel like a hyperbola to its asymptote. And there is a special model of Lobachevskian space that is built on a revolved hyperbola, or hyperboloid. And in Riemann’s geometry, straight lines come back on themselves, and cannot just go on into new territory forever as they can in both Euclid’s world and Lobachevsky’s world, making Riemannian straight lines more like ellipses than hyperbolas or parabolas. And there is a special model of Riemannian space that is built on a revolved ellipse, or ellipsoid.15

With these things in mind, we may now move on to our continued exploration of hyperbolic space.

15 Actually, Riemann developed two different geometries, in each of which there are no parallel lines. In “spherical geometry” any straight line loops back on itself, and any two straight lines cut each other exactly twice, so they behave like great circles on a sphere. In “elliptic geometry” any straight line loops back on itself, and any two straight lines cut each other exactly once, so they behave like the parts of great circles on a hemisphere, if we consider the endpoints of any such semicircular arc to be the same point. The term “Riemannian geometry” today refers to a much more general geometry that determines ways of measuring the curvature of a space, whether that space be elliptic, spherical, hyperbolic, parabolic, or some other type of space.

PRINCIPLES

From Euclid, we retain the following principles and theorems:

Definitions 1-22
Postulates 1-4
Common Notions 1-5
Propositions 1-26

But we replace his Postulate 5 with this Lobachevskian Postulate:

That, if a straight line falling on two straight lines make the interior angles on the same side less than two right angles, the two straight lines, if extended indefinitely toward that side, at least in some cases never meet.

And we replace his Definition 23 with this Definition 23:

If one straight line never meets another in the same plane with it however far the two are extended toward one side of a transversal, and if the one line makes with the transversal the least angle of all the straight lines not cutting the other line yet passing through the same point on the transversal, then the one straight line is said to be parallel to the other at that point and toward that side of the transversal.

For example, if SPL and TRN are two straight lines in one plane cut by a third straight line PR, and SPL and TRN never meet toward the right side of PR however far they are extended, and if on that side of PR the line PL makes with PR the least angle of all the straight lines through P that never cut RN on that side of PR, then PL is parallel to RN (at P and toward that side of PR).

With this definition available, we may state our Lobachevskian Postulate like this:

At least in some cases, two straight lines one of which is parallel to the other at some point, make interior angles (on one side of a transversal through that point) that are together less than two right angles.

And we let the symbol 𝜋 stand for 180°, or two right angles.

[Figure: straight lines SPL and TRN cut by the transversal PR; points Q and A are also marked.]

From Lobachevsky, we retain all of his propositions. Now let’s reason from these things, first to some results familiar from or strongly implied in Lobachevsky himself, and then to other things he did not even hint at in what we read.

PROPOSITIONS

PROPOSITION 1

If even one triangle has an angle sum of two rights, then the interior angles formed on one side of a transversal through two parallel straight lines must always add up to two rights.

Let it be that the angle sum of some triangle is π. Let PL be parallel to NB toward the right. I say that ∠DPL + ∠PDB = π.

For let the straight line PR be drawn at right angles to NB. Choose any point A on RB on the side of PR on which PL is parallel to NB, and join PA.

Since PL is the first line from P (rotating PA counterclockwise) that never cuts RB [since PL is parallel to RB], therefore we can shrink ∠4 as close to nothing as we please and still have a triangle, △RPA.

As we do this, RA gets larger than any assignable length, that is, RA goes to infinity. And as this happens, ∠3 also becomes as close to nothing as we please [Lobachevsky, Prop.21].

Now since ∠4 goes to nothing as RA goes to infinity, therefore

∠RPL = lim_{RA→∞} [∠1]

But since at least one triangle in existence has an angle sum of π [by hypothesis], therefore all do [Lobachevsky, Prop.20], and therefore the angle sum of △PRA is π. Thus ∠1 is just π minus the other two angles of △PRA, so that

∠RPL = lim_{RA→∞} [π − ∠2 − ∠3]

[Figure: PL parallel to NB, with PR perpendicular to NB and points D and A on NB; the angles are numbered 1 to 6.]

But the limit of a sum is the sum of the limits of the things summed, and so

∠RPL = lim_{RA→∞} [π] − lim_{RA→∞} [∠2] − lim_{RA→∞} [∠3]
so ∠RPL = π − ½π − 0
so ∠RPL = ½π

So ∠RPL must be right.

Now in △DPR, the angle sum must also be π, since by hypothesis the angle sum of some triangle is π, and hence the angle sum of every triangle is π as a consequence.

So ∠5 + ∠6 + ∠PRD = π
But ∠PRD = ½π [since PR is perpendicular to NB]
and ∠RPL = ½π [just shown]
so ∠5 + ∠6 + ∠RPL = π
or ∠5 + [∠6 + ∠RPL] = π
or ∠DPL + ∠PDB = π

So if any one triangle has an angle sum equal to π, then it must always happen that the interior angles that two parallel straight lines form on one side of a transversal will add up to two rights. Q.E.D.

PROPOSITION 2

Every triangle has an angle sum of less than two rights.

For if any single triangle has an angle sum of two rights, then the interior angles formed on one side of a transversal through two parallels will always add up to two rights [Prop.1]. But this is not so, since at least in some cases they add up to less than two rights [Lobachevskian Postulate]. Therefore no triangle has an angle sum equal to two rights. And no triangle has an angle sum greater than two rights [Lobachevsky, Prop.19]. Therefore every triangle has an angle sum less than two rights. Q.E.D.


DEFINITION 24

Let the deficiency of a triangle’s angle sum from two right angles be called its defect. So if △ABC has an angle sum of 𝜋 − 𝑑, then 𝑑 is its defect. For example, if the angle sum is 150°, then the defect is 30°.

COROLLARY 1 to Proposition 2

The defect of a triangle is the sum of the defects of the triangles that compose it.

If PRB is any triangle, and we choose any point A along any side RB, and join PA, then the larger triangle PRB has a larger defect than PRA has; specifically, its defect is the sum of the defects of triangles PRA and PAB.

For let the defect of △PRA be 𝑑, and let that of △PAB be 𝑒. Then the angle sum of △PRB must be

[angle sum of △PRA] + [angle sum of △PAB] − [angles at A]
or [𝜋 − 𝑑] + [𝜋 − 𝑒] − [𝜋]

So angle sum of △PRB = 𝜋 − 𝑑 − 𝑒, so that its defect is 𝑑 + 𝑒. Q.E.D.

COROLLARY 2 to Proposition 2

As we take smaller and smaller triangles, with areas as little as we please, their defects, too, become as little as we please.

To see this, consider any tiny right triangle DRH, and extend one leg of the right angle, say RH, out to B, and let DL be parallel to RB. Next take a point A out along RB, and move it toward B, giving us a changing triangle DRA. Also imagine point D moving along DR toward R. Thus

DR → 0 and RA → ∞

[Figure: triangle PRB with point A on side RB.]

[Figure: right triangle DRH, with RH extended through A to B, and DL parallel to RB.]

Let us consider the effects of these motions upon the angle sums of △DRA and △DRH.

First of all, since any acute angle RDL, no matter how close it is to a right angle, will have a distance of parallelism DR, but this distance will be less than the distance for any angle that is more acute and differs more from a right angle [Lob.Prop.23], therefore as DR shrinks to nothing, the first angle too large to occur as an angle of parallelism will be a right angle. Hence

lim_{DR→0} ∠RDL = ½𝜋

But ∠RDL is also the limit of ∠RDA as RA goes to infinity. And the limit of the angle sum of △DRA, as DR goes to nothing and RA goes to infinity, is

lim_{DR→0, RA→∞} [angle sum of △DRA] = lim_{DR→0, RA→∞} ∠RDA + lim_{DR→0, RA→∞} ∠DRA + lim_{DR→0, RA→∞} ∠RAD

or lim_{DR→0, RA→∞} [angle sum of △DRA] = lim_{DR→0, RA→∞} ∠RDL + lim_{DR→0, RA→∞} ∠DRA + lim_{DR→0, RA→∞} ∠RAD

so lim_{DR→0, RA→∞} [angle sum of △DRA] = ½𝜋 + ½𝜋 + lim_{DR→0, RA→∞} ∠RAD

And the limit of ∠RAD as RA goes to infinity, even if DR did not change (but all the more if it is shrinking) is zero [Lob.Prop.21]. So

lim_{DR→0, RA→∞} [angle sum of △DRA] = ½𝜋 + ½𝜋 + 0 = 𝜋

But △DRH is always less than △DRA, being part of it, and therefore its defect will be less than that of △DRA [Prop.2, Cor.1], and so its angle sum will always be closer to 𝜋 than is the angle sum of △DRA. Hence the angle sum of △DRH also approaches 𝜋 the smaller it gets. And since this is true of right triangles, this is true of triangles in general, since all triangles are just composed of two right triangles.

This means that the smaller the figures we make in this geometry, the more the geometry approximates Euclid’s, as near as we please. Q.E.D.
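Corollaries 1 and 2 can also be checked with numbers. The following Python sketch is only an illustration and is no part of the demonstrations: it assumes the hyperbolic law of cosines and the hyperbolic Pythagorean theorem for a plane of curvature −1 (formulas this chapter does not derive), it takes the angle at R to be right for convenience, and all the function names are made up for the example.

```python
import math


def angle(opposite, s1, s2):
    """Angle enclosed by sides s1 and s2, facing the side `opposite`.

    Assumed formula (hyperbolic law of cosines, curvature -1):
    cosh(opposite) = cosh(s1)cosh(s2) - sinh(s1)sinh(s2)cos(angle).
    """
    cos_value = (math.cosh(s1) * math.cosh(s2) - math.cosh(opposite)) / (
        math.sinh(s1) * math.sinh(s2)
    )
    return math.acos(max(-1.0, min(1.0, cos_value)))  # clamp rounding noise


def defect(a, b, c):
    """pi minus the angle sum of the triangle with side lengths a, b, c."""
    return math.pi - (angle(a, b, c) + angle(b, c, a) + angle(c, a, b))


def hypotenuse(p, q):
    """Hypotenuse of a right triangle with legs p, q: cosh(h) = cosh(p)cosh(q)."""
    return math.acosh(math.cosh(p) * math.cosh(q))


def corollary_1_defects(pr, ra, rb):
    """Defects of PRA, PAB and PRB, with PR perpendicular to RB and A on RB."""
    pa, pb = hypotenuse(pr, ra), hypotenuse(pr, rb)
    return defect(pr, ra, pa), defect(pa, rb - ra, pb), defect(pr, rb, pb)


def small_triangle_defect(leg):
    """Defect of the right triangle whose two legs both have length `leg`."""
    return defect(leg, leg, hypotenuse(leg, leg))


if __name__ == "__main__":
    d, e, whole = corollary_1_defects(1.0, 0.7, 1.8)
    print("d + e =", d + e, "  defect of PRB =", whole)
    for leg in (1.0, 0.1, 0.01):
        print("leg", leg, "-> defect", small_triangle_defect(leg))
```

Under these assumptions the first printed pair should agree up to rounding, as Corollary 1 requires, and the defects in the loop should shrink with the triangle, as Corollary 2 requires.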


PROPOSITION 3

The sum of the interior angles formed by two parallel straight lines and their transversal, on the side of parallelism, is always less than two rights.

For let PL be parallel to NB toward the right, and let PD be any transversal. I say that

∠DPL + ∠PDB < π

For drop PR at right angles to NB. Since PL is parallel to NB toward the right of PD, therefore PL is the first line forming an angle with PR that is too large to meet RB and form a triangle. Thus

∠RPL = lim_{RA→∞} [∠1]

Now since PRA is a triangle, therefore its angle sum falls short of two rights by some amount, 𝑑 [Prop.2], or

∠1 + ∠2 + ∠3 = π − 𝑑

So ∠1 = π − 𝑑 − ∠2 − ∠3 = π − 𝑑 − ½π − ∠3 = ½π − 𝑑 − ∠3

So as RA increases toward infinity it is always true that

∠1 = ½π − 𝑑 − ∠3

although 𝑑 is variable like ∠3 is.

So ∠RPL = lim_{RA→∞} [½π − 𝑑 − ∠3]

Thus ∠RPL = lim_{RA→∞} [½π] − lim_{RA→∞} [𝑑] − lim_{RA→∞} [∠3]

Now 𝑑 increases as RA → ∞, that is, as △PRA grows [Cor 1.Prop.2], so it cannot be approaching zero. Also, the angle sum of △PRA is always greater than ½π (since it contains the right angle ∠2), and so its defect can never get as great as ½π. Hence the limiting value of 𝑑 (let us call it 𝛿) lies somewhere between 0 and ½π. So, taking the limits of the expressions on the right side,

∠RPL = ½π − 𝛿 − 0

[Figure: PL parallel to NB, with transversal PD, perpendicular PR, and point A on RB; the angles are numbered 1 to 6.]

so ∠RPL = ½π − 𝛿
so ∠RPL + ∠PRB = ½π + ½π − 𝛿 = π − 𝛿

Now in △RPD, the sum of the angles is less than two rights [Prop.2],

so ∠5 + ∠6 + ½π < π
so ∠5 + ∠6 < ½π
so ∠5 + ∠6 + ∠RPL < ½π + ∠RPL
or ∠5 + (∠6 + ∠RPL) < ½π + ½π − 𝛿
or ∠PDB + ∠DPL < π − 𝛿
so ∠DPL + ∠PDB < π Q.E.D.
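The limits in this demonstration can be illustrated with numbers. The Python sketch below is an illustration only, resting on formulas the chapter does not derive: the classical expression Π(h) = 2·arctan(e^(−h)) for the angle of parallelism ∠RPL belonging to the distance PR = h, and the right-triangle relations of a hyperbolic plane of curvature −1. The function names are invented for the example.

```python
import math


def angle_of_parallelism(h):
    """Assumed classical formula: angle RPL = 2*atan(exp(-h)) when PR = h."""
    return 2.0 * math.atan(math.exp(-h))


def angle_1(h, x):
    """Angle RPA in right triangle PRA (PR = h, RA = x): tan = tanh(x)/sinh(h)."""
    return math.atan(math.tanh(x) / math.sinh(h))


def angle_3(h, x):
    """Angle PAR in the same triangle: tan = tanh(h)/sinh(x)."""
    return math.atan(math.tanh(h) / math.sinh(x))


def defect_pra(h, x):
    """Defect d of triangle PRA; angle 2 at R is the right angle pi/2."""
    return math.pi / 2 - angle_1(h, x) - angle_3(h, x)


def limiting_defect(h):
    """The quantity delta of the proof, from angle RPL = pi/2 - delta."""
    return math.pi / 2 - angle_of_parallelism(h)


if __name__ == "__main__":
    h = 1.0
    for x in (0.5, 2.0, 40.0):
        print("RA =", x, " angle 1 =", angle_1(h, x), " d =", defect_pra(h, x))
    print("angle RPL =", angle_of_parallelism(h), " delta =", limiting_defect(h))
```

On these assumptions ∠1 climbs toward ∠RPL and 𝑑 climbs toward 𝛿 as RA grows, while ∠RPL itself stays below a right angle, which is what the demonstration asserts.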


PROPOSITION 4

Rectilineal figures which are the same shape are also the same size (OR: There is no such thing as similar figures).

Triangles that are the same shape, and consequently have their corresponding angles equal, must be the same size.

For let △ABC and △ADE be the same shape. I say they are also the same size. For if not, let them be different sizes, with △ADE being the smaller. Let △ADE be placed inside △ABC, sharing with it the equal angle at A. Thus the sides of △ADE will lie along, but be smaller than, the sides of △ABC, so that E (for example) will lie between A and C. Join EB.

Since the defect of a triangle is the sum of the defects of the triangles that compose it [Cor 1.Prop.2], consequently △ABC has a larger defect than △ADE has. Hence their angle sums cannot be equal. Hence their corresponding angles cannot all be equal. Therefore the two triangles are not the same shape. And yet they do have the same shape, since this is given. Therefore triangles that are the same shape must not be different sizes, but must be the same size.

Consequently rectilineal figures that are the same shape must also be the same size, since rectilineal figures that are the same shape are always made of triangles that are the same shape, and consequently they are made of triangles that are the same size. Q.E.D.
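One way to see this proposition at work is to compute a triangle’s sides from its angles alone, something that cannot be done in Euclid’s geometry. The Python sketch below is a hedged illustration: it assumes the “dual” law of cosines of a hyperbolic plane of curvature −1, cosh a = (cos A + cos B cos C) / (sin B sin C), which is not derived in this chapter, and its function names are invented for the example.

```python
import math


def sides_from_angles(A, B, C):
    """Side lengths (a, b, c) opposite the angles A, B, C (radians).

    Assumed formula (dual law of cosines, curvature -1):
    cosh(a) = (cos A + cos B cos C) / (sin B sin C).
    """
    if min(A, B, C) <= 0 or A + B + C >= math.pi:
        raise ValueError("angles must be positive and sum to less than pi")

    def side(opposite, left, right):
        return math.acosh(
            (math.cos(opposite) + math.cos(left) * math.cos(right))
            / (math.sin(left) * math.sin(right))
        )

    return side(A, B, C), side(B, C, A), side(C, A, B)


def angle_from_sides(opposite, s1, s2):
    """Angle facing `opposite`, from the ordinary hyperbolic law of cosines."""
    cos_value = (math.cosh(s1) * math.cosh(s2) - math.cosh(opposite)) / (
        math.sinh(s1) * math.sinh(s2)
    )
    return math.acos(max(-1.0, min(1.0, cos_value)))


if __name__ == "__main__":
    a, b, c = sides_from_angles(0.9, 0.8, 0.7)
    print("sides:", a, b, c)
    print("angle recovered from the sides:", angle_from_sides(a, b, c))
```

Because the three angles fix the three sides, two triangles with the same angles have the same sides, which agrees with the proposition. The function refuses angle sums of π or more, in keeping with Proposition 2.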

[Figure: triangle ABC with triangle ADE placed inside it at the shared angle A, D on AB and E on AC.]

PROPOSITION 5

The angle in a semicircle is acute.

Let ABC be any angle in a semicircle. I say that ∠ABC is acute.

Join B to the center of the semicircle, O.

Thus OA = OB = OC
So ∠2 = ∠1 [Euc.1.5]
and ∠3 = ∠4 [Euc.1.5]
So ∠2 + ∠3 = ∠1 + ∠4
But ∠1 + ∠2 + ∠3 + ∠4 = π − 𝑑 [Prop.2]
So ∠2 + ∠3 = ½ (π − 𝑑)
so ∠2 + ∠3 = ½π − ½𝑑

that is, ∠ABC is less than a right angle, and hence it is acute. Q.E.D.

PROPOSITION 6

The angle sum of any quadrilateral is less than four right angles.

For let ABCD be a quadrilateral. I say that its angle sum is less than four right angles.

Join BD. Let the expression AΣ ( ) stand for the angle sum of the figure named in the parentheses, and let lowercase letters stand for defects.

Thus AΣ (ABD) = π − 𝑎 [Prop.2]
and AΣ (CDB) = π − 𝑏 [Prop.2]
So AΣ (ABD) + AΣ (CDB) = 2π − (𝑎 + 𝑏)
or AΣ (ABCD) = 2π − (𝑎 + 𝑏)

that is, the angle sum of ABCD is less than four right angles by the amount (𝑎 + 𝑏). Q.E.D.

[Figure: semicircle on diameter AC with center O and point B on the arc, OB joined; the angles are numbered 1 to 4.]

[Figure: quadrilateral ABCD.]

Corollary 1: There is no such thing as a rectangle or a square.

Corollary 2: The defect of a quadrilateral’s angle sum from four right angles will be equal to the sum of the defects of the two triangles into which a diagonal divides it.

PROPOSITION 7

Two cutting straight lines eventually diverge by more than any finite distance.

Let straight lines LB and BN cut at B, forming ∠LBN. And let any point A along BL be chosen, and let AC be dropped at right angles to BN. I say that as perpendiculars such as CA are taken along BN further from B, the portion of them that does not cut BL eventually becomes greater than any assigned amount.

For at some finite distance BD, the perpendicular DR to BN becomes parallel to BL [Lob.Prop.23]. Therefore at D, the perpendicular distance from BN to BL is infinite, and therefore the lines LB and BN diverge by more than any finite distance. Q.E.D.
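A numerical sketch may make this divergence more tangible. It is offered only as an illustration under assumptions the chapter does not prove: the right-triangle relations of a hyperbolic plane of curvature −1 and the classical formula Π(x) = 2·arctan(e^(−x)) for the angle of parallelism. Here `alpha` stands for ∠LBN (taken acute), `s` for the distance BA along BL, and the function names are invented for the example.

```python
import math


def perpendicular_length(s, alpha):
    """Length CA of the perpendicular dropped to BN from A on BL, with BA = s.

    Assumed relation in right triangle BCA: sinh(CA) = sinh(s) * sin(alpha).
    """
    return math.asinh(math.sinh(s) * math.sin(alpha))


def foot_distance(s, alpha):
    """Distance BC to the foot of that perpendicular: tanh(BC) = tanh(s) * cos(alpha)."""
    return math.atanh(math.tanh(s) * math.cos(alpha))


def critical_distance(alpha):
    """Distance BD at which the perpendicular to BN is parallel to BL.

    Solves Pi(BD) = alpha for the assumed formula Pi(x) = 2*atan(exp(-x)).
    """
    return math.log(1.0 / math.tan(alpha / 2.0))


if __name__ == "__main__":
    alpha = math.pi / 3
    print("BD =", critical_distance(alpha))
    for s in (1.0, 5.0, 10.0):
        print("BA =", s, " BC =", foot_distance(s, alpha),
              " CA =", perpendicular_length(s, alpha))
```

On these assumptions the feet C never get past the finite distance BD, while the perpendiculars CA grow without limit as A moves out along BL.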

[Figure: straight lines BL and BN cutting at B, with perpendiculars CA and DR erected on BN.]

PROPOSITION 8

Parallel lines are asymptotic.

Lobachevsky proved this, but his proof involved boundary lines. Let’s see the same conclusion without the use of such a curve.

[A]

First, I say that parallel lines, in the direction of parallelism, draw nearer and nearer to each other, that is, that any perpendicular from one to the other that is further along in the direction of parallelism than another such perpendicular is the lesser of the two.

For let CD and AB be parallel in the direction of CD (to the right). Drop perpendiculars to AB from two random points C and D, as CA and DB. Bisect AB at E. Draw EF at right angles to AB. Then BD < CA. For if not, either BD = CA or else BD > CA.

• First, if possible, let BD = CA. Then FEBD folds over onto FEAC, and ∠1 = ∠2, and so they are right angles, and so FD and EB are parallels making the sum of the interior angles on the parallel side of a transversal equal to two right angles, which is impossible [Prop.3]. Therefore BD is not equal to CA.

• Next, if possible, let BD > CA. Then FEBD folds over to FEAC such that B is on A, but D lies above C, and so ∠2 > ∠1. But since ∠1 + ∠2 makes two rights [Euc.1.13], therefore ∠2 is greater than a right angle. Therefore FD and EB are parallels making the sum of the interior angles on the parallel side of a transversal greater than two right angles, which is impossible [Prop.3]. Therefore BD is not greater than CA.

• Therefore BD is less than CA. So CD and AB draw nearer to each other the further we go in the direction of their parallelism.

[Figure, two panels: parallels CD and AB with perpendiculars CA, FE, DB, the angles numbered 1 to 4; and the same figure with FEBD folded over onto FEAC.]

[B]

Next, I say that CD and AB in that direction draw nearer to each other than any given distance, no matter how small.

For let K be any distance as small as you like. I say that AB draws nearer to CD than K.

Choose any point Q on AB (extended either way, if you like), and draw QE at right angles to CD. If QE is less than K, then we are done. If QE is instead greater than or equal to K, then from it cut off EF < K. Let FG be the parallel to CD through F in the direction of FG (to the left). Extend GF to some point X to the right of QE.

Since ∠FEC is right, and FG is parallel to EC, therefore ∠EFG is acute [Prop.3]. Therefore ∠QFX is acute [Euc.1.15]. So if we make ∠FEL = ∠QFX, then ∠FEL is acute and falls inside ∠QED (which is right).

Since EL makes a lesser angle with QE than ED does, and ED is parallel to QA, therefore EL must cut QA at some point, say T. Since GFX enters triangle QET and cannot exit it through QE (through which it enters), nor through ET (since GFX and ET make ∠EFG = ∠FET, and thus they can never meet [Euc.1.16]), therefore GFX exits triangle QET through QT, say at M.

Drop MP at right angles to CD. Since MFG is parallel to CD to the left, and MB is parallel to CD to the right, therefore

∠PMG = ∠PMB [Lobachevsky, Prop.16, or mere symmetry]

Folding over figure MPEF we will get figure MPRS on the opposite side of MP, giving us

RS = EF < K

And RS is at right angles to CD. Therefore parallels CD and AB draw as near to each other as RS, which is less than the given distance K. Q.E.D.
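Both parts of this proposition can be watched in numbers. The sketch below is an illustration only, and it leans on a formula the chapter does not derive: in a hyperbolic plane of curvature −1, if a point starts on one parallel at perpendicular distance h from the other and travels s units in the direction of parallelism, its perpendicular distance d satisfies sinh d = sinh h · e^(−s). The function names are invented for the example.

```python
import math


def distance_between_parallels(h, s):
    """Perpendicular distance to the line after moving s units along its parallel.

    Start where the distance is h; positive s is the direction of parallelism,
    negative s the opposite direction. Assumed: sinh(d) = sinh(h) * exp(-s).
    """
    return math.asinh(math.sinh(h) * math.exp(-s))


def travel_needed(h, k):
    """Arc length s after which the distance has shrunk from h to exactly k."""
    return math.log(math.sinh(h) / math.sinh(k))


if __name__ == "__main__":
    h, k = 1.0, 1e-6
    for s in (0.0, 1.0, 2.0, 5.0):
        print("s =", s, " distance =", distance_between_parallels(h, s))
    s_k = travel_needed(h, k)
    print("to come within K =", k, "travel more than", s_k)
```

On this assumption the distance decreases steadily in the direction of parallelism, as in part [A], and falls below any given K once we have gone far enough, as in part [B]; in the opposite direction it grows without limit.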

[Figure: parallels AB and CD with the perpendiculars QE, MP, SR; the line GFX through F meeting QT at M; and the line EL meeting QA at T.]

PROPOSITION 9

Asymptotic straight lines are parallel in the direction in which they are asymptotic.

Let AB, CD be two straight lines asymptotic toward B, D. I say that AB is parallel to CD in that same direction.

From any point P on AB drop PR at right angles to CD. Now PB never cuts CD to the right, since the two lines are asymptotic. So if PB is not the parallel through P to RD toward the right, the parallel must be lower down, as PL.

Since BP and PL form an angle, therefore they diverge by more than any given distance [Prop.7]. Much more so do PB and RD diverge by more than any given distance, since RD is on the other side of PL from PB, and PB diverges from PL as much as we please. Therefore PB is not asymptotic to RD. Which is absurd, since we are given that these lines are asymptotic.

Therefore it is impossible for any line such as PL, dividing the angle BPR, to be the parallel to RD to the right. Therefore PB is the first straight line through P never to cut RD to the right. Therefore PB is parallel to RD to the right, that is, AB is parallel to CD to the right. Q.E.D.

[Figure: asymptotic straight lines APB and CRD, with PR perpendicular to CD and PL drawn inside angle BPR.]

PROPOSITION 10

The locus of points equidistant from a straight line (on one side of it) is a curve.

Let AB be a line that is at every point the same perpendicular distance from straight line CD. I say that AB is a curve.

For if not, let it be a straight line, and drop AC and BD at right angles to CD. Bisect CD at E and draw EF at right angles to CD. Bisect ED at G and draw GK at right angles to CD.

Thus EFBD folds over to EFAC and coincides with it, showing that ∠1 = ∠2, so each of these is right. And GKBD folds over to GKFE and coincides with it, showing that ∠3 = ∠4, so each of these is right. Therefore quadrilateral EFKG has four right angles in it, which is impossible [Prop.6]. Therefore AB is not a straight line, and therefore it is a curve. Q.E.D.

DEFINITION 25

Let such a curve as AB be called an equidistance curve, and let CD, the straight line from which it is equidistant, be called its axis.

[Figure: line AFKB above straight line CEGD, with perpendiculars AC, FE, KG, BD; the angles are numbered 1 to 4.]

PROPOSITION 11

If two straight lines are both at right angles to a third straight line, then any other perpendicular to either line will form an acute angle in the resulting quadrilateral.

Let FB and EC both be at right angles to EF, and let BC be any other perpendicular let fall from FB to EC. I say that ∠FBC is acute.

∠FBC cannot be right or obtuse, lest EFBC have an angle sum equal to or greater than four right angles, which is impossible [Prop.6]. Therefore ∠FBC is acute. Q.E.D.

DEFINITION 26

Let such a quadrilateral as FBCE, with three right angles and one acute, be called a Lambert quadrilateral.

DEFINITION 27

If we fold FBCE over FE to FADE, then AFB and DEC are straight lines, giving us a new quadrilateral ABCD with two right angles sharing a common leg CD, and two equal acute angles at A and B, and a line of symmetry EF perpendicularly bisecting both AB and CD. Let such a figure be called a Saccheri quadrilateral, and let the side AB, joining the acute angles, be called its summit.

[Figure: quadrilateral ABCD with F on AB and E on DC, and EF joined.]

PROPOSITION 12

In a Lambert quadrilateral, either leg of the acute angle is greater than the opposite side.

Let ABCD be a Lambert, with ∠A acute. Then AD > BC and AB > DC.

For bisect DC at G, draw GE at right angles to DC. If we fold EGDA over EG, then D will land on C, and A cannot land on B, since then ∠EAD = ∠EBC, an acute equal to a right, which is absurd. Nor can A land between B and C, since then ∠EAD = ∠E(A)C, so that an acute is equal to an obtuse. Therefore A will land above B, on CB extended. Therefore AD > BC. Likewise AB > DC. Q.E.D.

PROPOSITION 13

A quadrilateral with two right angles joined by a side and two equal acute angles joined by a side is a Saccheri quadrilateral.

Let ABCD be a quadrilateral in which the angles at A and B are equal acutes, and the angles at C and D are rights. Bisect DC at E, draw EF at right angles to DC. I say that

AD = BC
AF = FB
and ∠AFE = ∠BFE,

making ABCD a Saccheri.

For if we fold FBCE over FE, then C will land on D.

[Figure: Lambert quadrilateral ABCD with GE perpendicular to DC, showing the possible positions (A) of A after folding.]

[Figure: quadrilateral ABCD with EF perpendicular to DC, showing (C) landing on D and the possible positions (B) of B after folding.]

And since ∠ECB = ∠EDA, therefore B must land along DA somewhere, and in fact it must land on A. For if it fell on the extension of DA, or between D and A, then in either case we will form a triangle whose exterior angle is equal to an interior and opposite, which is impossible [Euc.1.16]. Therefore B lands on A, and therefore

AD = BC
AF = FB
∠AFE = ∠BFE Q.E.D.

COROLLARY

If the angles at D and C are not right but are nonetheless equal, ABCD would still be symmetrical and EF would still be the perpendicular bisector of both AB and CD.

PROPOSITION 14

If a quadrilateral has two right angles joined by a side, and if the opposite legs of the right angles are equal, then the quadrilateral is a Saccheri.

Let quadrilateral ABCD have right angles at A and B, and let AD = BC. I say that ∠ADC = ∠BCD and they are acute, and that there is a line of symmetry for the figure, that is, that AB and CD share a perpendicular bisector, which things make ABCD a Saccheri.

For let AC, BD be joined.

Now AD = BC [given]
and ∠BAD = ∠ABC [given]
and AB is common
so △BAD ≅ △ABC [SAS]
so BD = CA
but AD = BC [given]
and DC is common
so △ADC ≅ △BCD [SSS]
so ∠ADC = ∠BCD

[Figure: quadrilateral ABCD with right angles at A and B, and the diagonals AC and BD joined.]

and they must be acute, lest ABCD have an angle sum equal to or greater than four rights, which is impossible [Prop.6]. Thus ABCD has a line of symmetry [Prop.13]. Therefore ABCD is a Saccheri. Q.E.D.

PROPOSITION 15

Of two perpendiculars drawn from one line to another with which it shares a coperpendicular line, the one further from the coperpendicular is greater.

Let AK and BL share AB as a coperpendicular. Let DC, FE be any other perpendiculars drawn from AK to BL, with EF further from AB than DC is. Then EF > DC.

Bisect CE at M, draw MN at right angles to CE.

• If possible, let EF = DC. Then we could fold NMEF over to NMCD, so that ∠DNM = ∠FNM, making ∠DNM right, and so quadrilateral ANMB would have four right angles, which is impossible [Prop.6]. So EF is not equal to DC.

• If possible, let EF < DC. Then NMEF would fold to NMC(F), where (F) lies between C and D, making ∠N(F)C = ∠NFE, which is acute because it is in Lambert ABEF. Thus ∠N(F)C is acute, so ∠D(F)N, its supplement, is obtuse.

Thus ∠D(F)N is obtuse in △D(F)N,
so ∠(F)DN is acute, to keep its angle sum less than two rights,
so ∠ADC is obtuse, since it is adjacent to ∠(F)DN,
but ∠ADC is acute, because it is in Lambert ABCD.

So ∠ADC is both obtuse and acute, which is absurd. Therefore EF is not less than DC, nor was it equal. Therefore EF is greater than DC. Q.E.D.
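The growth of these perpendiculars can be illustrated with numbers. The sketch below is a hedged illustration, not part of the demonstration: it assumes two relations for a Lambert quadrilateral in a hyperbolic plane of curvature −1, neither of which is derived in this chapter. Writing `ab` for the coperpendicular AB and `t` for the distance AD along AK, they are sinh DC = sinh AB · cosh AD and tanh AD = tanh BC · cosh AB. The function names are invented for the example.

```python
import math


def perpendicular_to_bl(ab, t):
    """Length DC of the perpendicular dropped to BL from D on AK, with AD = t.

    Assumed Lambert relation: sinh(DC) = sinh(AB) * cosh(AD).
    """
    return math.asinh(math.sinh(ab) * math.cosh(t))


def foot_from_b(ab, t):
    """Distance BC along BL to the foot of that perpendicular.

    Assumed Lambert relation: tanh(AD) = tanh(BC) * cosh(AB).
    """
    return math.atanh(math.tanh(t) / math.cosh(ab))


if __name__ == "__main__":
    ab = 0.5
    for t in (0.0, 0.5, 1.0, 2.0):
        print("AD =", t, " BC =", foot_from_b(ab, t),
              " DC =", perpendicular_to_bl(ab, t))
```

On these assumptions DC increases as D moves away from the coperpendicular, as this proposition states, and in each Lambert quadrilateral ABCD we find DC > AB and AD > BC, in agreement with Proposition 12.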

[Figure: lines AK and BL with coperpendicular AB, perpendiculars DC, NM, FE, and the folded position (F) on DC.]

PROPOSITION 16

Saccheris with equal summit angles and equal summits are equal.

For if possible let two Saccheris have equal summits and equal summit angles but unequal areas. Then placing the summit of one on the other, we will have two Saccheris under the same summit AB, but one of them overshooting the other, as ABEG overshoots ABCD, and the excess figure DCEG will be a quadrilateral with four right angles, which is impossible [Prop.6]. Therefore Saccheris with equal summit angles and equal summits are equal (in the sense of congruent). Q.E.D.

PROPOSITION 17

Saccheris with equal summit angles and equal bases are equal.

For if possible let two Saccheris have equal bases and equal summit angles but unequal areas. Then placing the base of one on the other, we will have two Saccheris on the same base, but one of them will overshoot the other, as ABEG overshoots ABCD, and ∠ADC = ∠AGE.

Bisect AB at K and draw KLM at right angles to AB. This is the line of symmetry for both Saccheris, and so it makes right angles with CD and EG as well.

Now ∠LDA + ∠LDG = two rights
so ∠MGD + ∠LDG = two rights [summit ∠G = summit ∠D]

so the angle sum of quadrilateral GMLD is equal to four right angles, which is impossible [Prop.6]. Therefore Saccheris with equal summit angles and bases are equal (in the sense of congruent). Q.E.D.

[Figure: Saccheris ABCD and ABEG under the common summit AB.]

[Figure: Saccheris ABCD and ABEG on the common base AB, with the line of symmetry KLM.]

DEFINITION 28

If a straight line passes through the point at which two parallels to the same straight line intersect, but does not pass into the angle they form which is bisected by the perpendicular from that point to the other straight line, then the line is said to be ultraparallel to that other straight line.

For example, if BD is parallel to AC in the direction of BD (to the right), and BA is dropped perpendicular to AC, and ∠ABT = ∠ABD, and TB is extended through to E, then any straight line BF inside ∠EBD is ultraparallel to AC.

Obviously BF is not parallel to AC, because BD is the parallel toward the right, and BD is more inclined to AC than BF is toward the right, and BT is the parallel toward the left, and BT is more inclined to CA than FB is toward the left. Yet neither does BF cut AC, since BD and BT do not cut it (these being the parallels), and hence if BF did cut AC toward either side, it would first have to cut BD or BT again, after cutting them at B, whereas that is impossible (since two straight lines cannot enclose a space). Therefore, since BF is not a parallel to AC, and yet it does not cut AC, BF is called ultraparallel, as in a non-cutting line that is “beyond the parallel.”

PROPOSITION 18

Ultraparallel straight lines have one and only one coperpendicular.

Let UT be ultraparallel to RS. I say that there is one and only one straight line perpendicular to both UT and RS (their “coperpendicular”).

Choose any point A at random on UT. Drop AB at right angles to RS. If AB is at right angles to UT also, then AB is a coperpendicular.

[Figure: ultraparallels UT and RS, with AB and DC perpendicular to RS, the parallel PAL through A, and the line of symmetry EM.]

[Figure: straight line AC with perpendicular BA, the parallels BD and BT through B, TB extended to E, and lines BF inside angle EBD.]

If not, then AB makes with UT an acute angle on one side; so let ∠BAT be acute.

Now since ∠BAT is acute, therefore AT inclines toward RS at point A, and therefore its distances from RS decrease for some time as we move to the right of A. But since UT is ultraparallel to RS, some line PAL making a still more acute angle with AB will be the parallel to RS.

Now PAL never cuts RS, but it cuts UT at A, and therefore UT diverges by as great a distance as we please from AL as we follow UAT to the right of A [Prop.7]. All the more does AT diverge by as great a distance as we please from RS, which is always further from UT than AL is. Therefore, although UAT at first approaches RS to the right of A, and is for a while at a distance less than AB from RS, it again diverges from it by as much as we please, and is eventually at a distance greater than AB from RS. By continuity, it must therefore at some point after A be once again at the distance AB from RS, somewhere to the right of A, say at D.

So if we drop DC at right angles to RS, then DC = AB. Thus ABCD is a Saccheri [Prop.14]. Therefore if we bisect BC at E and draw EM at right angles to BC, EM will be its line of symmetry, and will be perpendicular to both RS and UT. Thus RS and UT have this coperpendicular EM.

And they can have no other, lest the ultraparallels and the two distinct coperpendiculars form a quadrilateral with four right angles in it, which is impossible [Prop.6]. Therefore any pair of ultraparallels share one and only one coperpendicular. Q.E.D.


PROPOSITION 19

If from the three vertices of a triangle are let fall three straight lines perpendicular to the straight line bisecting two sides of the triangle, then the three perpendiculars are equal.

Let KML be any triangle, and XY the straight line passing through the midpoints of sides MK and ML (namely points O and P). Drop KR, LS, MT perpendicular to XY. I say that

KR = LS = MT

For ∠ROK = ∠TOM [vertical angles]
and ∠ORK = ∠MTO [right angles]
and OK = OM [O is midpoint of KM]
so △ROK ≅ △TOM [AAS, Euc.1.26]
so KR = MT

And △SPL ≅ △TPM [AAS, Euc.1.26]
so MT = SL Q.E.D.

DEFINITION 29

Let a line such as XY be called a midline of a triangle.

COROLLARY

Conversely, if two equidistance curves share a common axis (that is, if they are equidistant from the same straight line), and one passes through one vertex of a triangle while the other passes through the other two, then the common axis between them is a midline of the triangle.


PROPOSITION 20

A Saccheri quadrilateral whose base lies on the midline of a triangle and whose summit is the base of that triangle is equal to that triangle.

Using the same construction as in the previous demonstration, label the figures in it as follows:

△ORK = 1    OSLK = 2    △SPL = 3    △OPM = 4    △MPT = 5

Then 1 = 5 + 4 [Prop.19]
but 5 = 3 [Prop.19]
so 1 = 3 + 4 [substitution]
so 1 + 2 = 2 + 3 + 4 [adding 2 to both sides]
i.e. RSLK = △MLK.

But RSLK is a Saccheri, since the angles at R and S are right and since RK = SL [Prop.19, Prop.14]. And its summit KL is the base of △MLK. And its base RS lies along the midline of △MLK. Hence the Saccheri with its base on the midline of △MLK and whose summit is KL is equal to △MLK. Q.E.D.

PROPOSITION 21

A triangle and the equal Saccheri whose summit is the base of the triangle have equal defects.

Let ADH be any triangle, XY the midline through C, E the midpoints of sides AD, DH, and AB, HG, DF the perpendiculars from vertices A, H, D to XY. Thus BAHG is a Saccheri equal in area to △ADH [Prop.20]. Let 𝑑 be the defect of the angle sum of △ADH from π, and 𝛿 be the defect of the angle sum of Saccheri BAHG from 2π,


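thus 𝑑 = π − AΣ(ADH) and 𝛿 = 2π − AΣ(BAHG), where AΣ denotes the angle sum of a figure.

I say that 𝑑 = 𝛿.

For brevity, call

△ABC = 1    △CDF = 2    ACEH = 3    △DEF = 4    △EGH = 5

Now 1 ≅ 2 [ASA] and 4 ≅ 5 [ASA]
so AΣ(2) + AΣ(4) = AΣ(1) + AΣ(5)
so AΣ(2) + AΣ(4) + AΣ(3) − 3π = AΣ(1) + AΣ(5) + AΣ(3) − 3π

But figures 2, 3, 4 together make up △ADH, and the angles that meet at each of the points C, F, E inside its boundary add up to π, so the left side is AΣ(ADH). And figures 1, 3, 5 together make up BAHG, where only the angles meeting at C and at E (2π in all) are lost, so the right side is AΣ(BAHG) − π.

Thus AΣ(ADH) = AΣ(BAHG) − π
so π − 𝑑 = 2π − 𝛿 − π
so π − 𝑑 = π − 𝛿
so 𝑑 = 𝛿. Q.E.D.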


PROPOSITION 22

Triangles on the same base and in the same equidistance curves are equal.

Let ABC and DBC share the base BC, and let them also share midline EF, passing through K, R, L, M (the midpoints of AB, DB, AC, DC), and let BG, CH be drawn at right angles to EF.

Thus GBCH is a Saccheri with base GH lying along the midline of both triangles, and BC, the base of both triangles, is the summit of this Saccheri. And A, B, C, D are all the same perpendicular distance from midline EF. So the equidistance curve that is the distance BG from EF passes through C, and the equidistance curve that is that same distance from EF on the other side of it passes through A and D. Such triangles as ABC, DBC are said to be in the same equidistance curves. And they are equal. For each is equal to Saccheri GBCH [Prop.20]. Q.E.D.


PROPOSITION 23

Triangles with equal bases and equal defects are equal.

Let ABC and DEF have base AC equal to base DF, and let their defects be equal.

I say that △ABC = △DEF.

For let the Saccheris equal to each triangle be constructed, namely those formed by dropping perpendiculars from the endpoints of their bases to their respective midlines [see Prop.20], thus forming Saccheri G for ABC and Saccheri H for DEF.

Now, since △ABC is equal to Saccheri G (whose summit AC is the base of △ABC), therefore
Defect of ABC from π = Defect of G from 2π [Prop.21]
again Defect of DEF from π = Defect of H from 2π [Prop.21]
But Defect of ABC from π = Defect of DEF from π [given]
so Defect of G from 2π = Defect of H from 2π

Therefore each summit angle of G is equal to each summit angle of H, since each has two right angles and two equal acute angles.

But AC = DF [given]
so G = H [Prop.16]
But each triangle is equal to its Saccheri,
so △ABC = △DEF. Q.E.D.


PROPOSITION 24

Triangles with equal defects are equal.

Let ABC and DEF have equal defects.

I say that they are equal.

If every side in one is equal to a corresponding side in the other, then they are not only equal but congruent, and their defects will be equal. But if not, then one side in one triangle will be less than one of the sides in the other.

So let AB < DE.

Draw the midline HK for ABC and the corresponding equidistance curves through A, B, C. If we move a point N from B to the right along the equidistance curve through B, then AN grows continuously and becomes as large as we please. For if we drop NR at right angles to HK, we see that it is always true that △HAG is congruent to △NRG, and so it is also always true that AG = GN, and thus it is always true that AN = 2AG. But AG grows continuously and as large as we please as N moves to the right (AG may be the radius of a circle of center A, and we may take radii as large as we like). And therefore so does AN.

From which it is clear that at some point AN = DE. Let the AN which is equal to DE have been drawn.

Thus ABC and ANC share equidistance curves and also share base AC,
so △ABC = △ANC [Prop.22]
And each has the same defect as Saccheri HACK, to which each is equal [Prop.21].
Thus Defect of ABC = Defect of ANC.

Now Defect of ABC = Defect of DEF [given]
So Defect of ANC = Defect of DEF.
But these two triangles also have the base AN equal to the base DE [construction].
So △ANC = △DEF [Prop.23]
Thus △ABC = △DEF. Q.E.D.


PROPOSITION 25

Equal triangles have equal defects.

For let equal triangles ABC and DEF have defects 𝑑 and 𝛿 respectively.

I say that 𝑑 = 𝛿.

For if possible, let 𝑑 and 𝛿 be unequal, say 𝑑 < 𝛿.

Then take point Q on DF, and move it from F to D along FD, so that the triangle DEQ goes in a continuous manner from being equal to DEF down to being nothing. As its area approaches zero, so too does its defect [Cor 2.Prop.2]. Therefore this triangle goes continuously from having a defect equal to that of DEF, hence greater than that of ABC, to having a defect less than that of ABC, and consequently it must at some point between have a defect equal to that of ABC. Let this be at the point Q that is drawn in the figure.

Thus defect of DEQ = defect of ABC
so △DEQ = △ABC in area [Prop.24]
so △DEQ = △DEF [since △ABC = △DEF]
so that the part is equal to the whole, which is absurd.

Therefore 𝑑 and 𝛿 cannot be unequal.

So 𝑑 = 𝛿. Q.E.D.


PROPOSITION 26

The areas of any two triangles are proportional to their defects.

Let A and B be any two triangles, d(A) and d(B) their defects.

I say that A : B = d(A) : d(B).

For let there be taken a triangle 𝑡 (of fixed area but flexible shape) much smaller than either of the given triangles, and that measures one of them (say A) exactly some number of times 𝑛. And let us refer to the defect of triangle 𝑡 as 𝛿.

If 𝑡 measures B also, say 𝑚 times, then
A = 𝑛 ∙ 𝑡 and B = 𝑚 ∙ 𝑡
so A ∶ B = 𝑛 ∶ 𝑚

But also the defect in A is the sum of all the defects of the 𝑛 constituent triangles in it each equal to 𝑡 [Cor 1.Prop.2], so that the defect in A must be 𝑛 ∙ 𝛿, and similarly the defect in B must be 𝑚 ∙ 𝛿.

Thus d(A) = 𝑛 ∙ 𝛿 and d(B) = 𝑚 ∙ 𝛿
so d(A) ∶ d(B) = 𝑛 ∶ 𝑚
Thus A : B = d(A) : d(B)

If instead triangle 𝑡 does not measure B also, but leaves a leftover less than itself, we may repeat the process by taking smaller and smaller triangles that measure A exactly and leave as little of B leftover as we please, so that

lim_{𝑡→0} (𝑛 ∙ 𝑡) / (𝑚 ∙ 𝑡) = A / B

where 𝑛 and 𝑚 are the changing numbers of triangles with area 𝑡 that fit into A and B (that is, into A exactly, and into B as nearly as we please), and likewise

lim_{𝑡→0} (𝑛 ∙ 𝛿) / (𝑚 ∙ 𝛿) = d(A) / d(B)

but the ratio whose limit we are taking in each case is always 𝑛/𝑚, no matter what 𝑡 and its corresponding 𝛿 are, so A/B and d(A)/d(B) are each the limit of the same ratio, namely 𝑛/𝑚, and therefore they are equal.

So A : B = d(A) : d(B). Q.E.D.
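The step that carries this proof is that defects add up when a triangle is cut into smaller triangles. The sketch below checks that numerically for one triangle cut in two by a line from a vertex. It is only an illustration under assumptions not made in the text: it works in a hyperbolic plane of curvature −1 and uses the hyperbolic law of cosines, and the function names (`angle`, `defect`, `split_defects`) are invented here.

```python
import math

def angle(opposite, s1, s2):
    """Angle between sides s1 and s2 of a hyperbolic triangle (curvature -1
    assumed), from the hyperbolic law of cosines."""
    cos_val = (math.cosh(s1) * math.cosh(s2) - math.cosh(opposite)) / (
        math.sinh(s1) * math.sinh(s2))
    # Clamp to guard against rounding just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_val)))

def defect(a, b, c):
    """pi minus the angle sum of the triangle with side lengths a, b, c."""
    return math.pi - (angle(a, b, c) + angle(b, c, a) + angle(c, a, b))

def split_defects(a, b, c, x):
    """Cut triangle ABC (a = BC, b = CA, c = AB) by the line from A to the
    point D on BC with BD = x; return the defects of ABD and ADC."""
    angle_at_B = angle(b, c, a)
    # Length of AD from the law of cosines in triangle ABD.
    d = math.acosh(math.cosh(c) * math.cosh(x)
                   - math.sinh(c) * math.sinh(x) * math.cos(angle_at_B))
    return defect(x, d, c), defect(a - x, b, d)

whole = defect(1.0, 1.2, 1.5)
part1, part2 = split_defects(1.0, 1.2, 1.5, 0.4)
print(whole, part1 + part2)  # the two numbers should agree
```

Repeating the cut gives the 𝑛 ∙ 𝛿 of the proof when all the pieces have the same defect 𝛿.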


PROPOSITION 27

To construct a quadrilateral that is greater than any possible triangle.

On a straight line QR set up CT at right angles to it. A straight line making with CT an acute angle that is less than 45° will, if drawn from a point far enough up CT, be parallel to CQ [Lob.Prop.23]. Let BP be such a parallel.

Since ∠CBP is less than 45°, then if ∠CBL is made equal to 45°, BL will be ultraparallel to CQ.

But since ∠CBL is acute and BL ultraparallel to CQ, therefore BL at first inclines toward CQ, then away again to as great a distance as we please (since BL cuts parallel BP and must diverge from it as far as we like [Prop.7]), and therefore its distance from CQ will first decrease to less than BC and then increase back up to it again, say at point A, where it will therefore happen that when we drop AD at right angles to QR, AD = BC.

Thus if we call L the midpoint of AB and drop LM at right angles to DC, ABCD will be a Saccheri with line of symmetry LM, and each summit angle will be 45°. And if we fold ABCD over CD to KGCD, then ABGK will be a quadrilateral composed of two identical Saccheris, and each of its four angles will be 45°.

So ABGK is double ABCD. Can some triangle also be double ABCD?

The triangle equal to ABCD (with base AB and its midline along CD) will have a defect equal to that of ABCD [Props.20,21]. But the angle sum of ABCD is 90° + 90° + 45° + 45°, and so its defect from four right angles is 90°. So this particular triangle equal to ABCD has a defect from two right angles of 90°.

And since equal triangles have equal defects [Prop.25], therefore any other triangle equal to ABCD would also have a defect of 90°. And since the areas of triangles are as their defects [Prop.26], therefore any triangle that is double a triangle that is equal to ABCD would have to have a defect that is double that of ABCD, hence a defect of 180°. So any triangle that is double ABCD in area would have an angle sum falling short of 180° by 180°, and so it would have no angles at all, which is absurd.

Therefore there is no triangle that is double ABCD. But ABGK is double ABCD.

Therefore no triangle is as big as ABGK. So ABGK is bigger than any possible triangle. Q.E.F.


PROPOSITION 28

For any right triangle, whose sides are a, b, c, and whose angles opposite these are 𝜆, 𝜇, ½𝜋 respectively, if we call l, m the distances of parallelism for 𝜆, 𝜇, and if we call 𝛼, 𝛽 the angles of parallelism for a, b, then the following equations hold:

[1] 𝛽 = 𝜆 + Π(c + m)
[2] 𝜆 + 𝛽 = Π(c − m)
[3] Π(b + l) + Π(m − a) = ½𝜋
[4] 𝛼 = 𝜇 + Π(c + l)
[5] 𝜇 + 𝛼 = Π(c − l)
[6] Π(m + a) + Π(l − b) = ½𝜋

For, extend a in direction CB (that is, in the direction from C toward B) out to a point Z infinitely far away (if you will allow this convenience of speech), and let line AZ be parallel to CB in that direction, and extend AB to D till BD is the distance of parallelism for 𝜇, that is, until BD = m. Thus the perpendicular to ABD at D is also parallel to CB, hence also to AZ [Lob.Prop.25].

So ∠CAZ = angle of parallelism for b = 𝛽
and ∠CAZ = ∠CAB + ∠DAZ = 𝜆 + angle of parallelism for AD
so 𝛽 = 𝜆 + Π(AD) = 𝜆 + Π(c + m)
which is equation [1].


Similarly,

Equation [2] becomes apparent if we extend a in direction BC, draw a parallel to a through A in the same direction, and cut off from BA a part BD equal to the distance of parallelism for 𝜇, that is, a part equal to m.

Equation [3] becomes apparent if we extend c in direction BA, draw a parallel to c through C in the same direction, extend CA to E so that AE = l = distance of parallelism for 𝜆, and extend BC to D so that BD = m = distance of parallelism for 𝜇.

Equation [4] becomes apparent if we extend b in direction CA, draw a parallel to b through B in the same direction, and extend BA to E so that AE = l = distance of parallelism for 𝜆.

Equation [5] becomes apparent if we extend b in direction AC, draw a parallel to b through B in the same direction, and cut off from AB a part equal to l = distance of parallelism for 𝜆.

Equation [6] becomes apparent if we extend c in direction AB, draw a parallel to c through C in the same direction, extend AC to E so that AE = l = distance of parallelism for 𝜆, and extend CB to D so that BD = m = distance of parallelism for 𝜇. Q.E.D.
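The six equations can be spot-checked numerically. The sketch below is not part of the demonstration and rests on assumptions the text does not make at this point: it takes the unit of length for which Π(x) = 2 arctan(e^(−x)) (Lobachevsky's closed form, curvature −1), lets Π accept negative distances through that same formula, and uses the standard right-triangle relations cosh c = cosh a · cosh b and tan 𝜆 = tanh a / sinh b. All function names are invented for the illustration.

```python
import math

def Pi(x):
    """Angle of parallelism for distance x (assumed closed form, curvature -1)."""
    return 2.0 * math.atan(math.exp(-x))

def dist_of_parallelism(theta):
    """Inverse of Pi: the distance whose angle of parallelism is theta."""
    return -math.log(math.tan(theta / 2.0))

def right_triangle(a, b):
    """Hypotenuse c and the angles lam (opposite a), mu (opposite b)."""
    c = math.acosh(math.cosh(a) * math.cosh(b))
    lam = math.atan(math.tanh(a) / math.sinh(b))
    mu = math.atan(math.tanh(b) / math.sinh(a))
    return c, lam, mu

def prop28_residuals(a, b):
    """Left side minus right side of equations [1]..[6]; all should be ~0."""
    c, lam, mu = right_triangle(a, b)
    l, m = dist_of_parallelism(lam), dist_of_parallelism(mu)
    alpha, beta = Pi(a), Pi(b)
    half_pi = math.pi / 2
    return [
        beta - (lam + Pi(c + m)),          # [1]
        lam + beta - Pi(c - m),            # [2]
        Pi(b + l) + Pi(m - a) - half_pi,   # [3]
        alpha - (mu + Pi(c + l)),          # [4]
        mu + alpha - Pi(c - l),            # [5]
        Pi(m + a) + Pi(l - b) - half_pi,   # [6]
    ]

print(prop28_residuals(1.0, 1.0))
```

Each printed residual should be zero up to rounding error, for any positive legs a and b.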


PROPOSITION 29

For any quadrilateral having three right angles, whose sides, beginning from the one acute corner, 𝜷, are called, in order, c, m’, a, l (where m’ indicates the distance of parallelism for an angle whose complement has distance of parallelism m, and so on with other letters), if we call 𝜸, 𝝀 the angles of parallelism for c, l, and if we call b the distance of parallelism for 𝜷, then the following equations hold:

[1] 𝜷 = 𝝀 + Π(𝐜 + 𝐦)
[2] 𝝀 + 𝜷 = Π(𝐜 − 𝐦)
[3] Π(𝐛 + 𝐥) + Π(𝐦 − 𝐚) = ½𝜋
[4] 𝜷 = 𝜸 + Π(𝐥 + 𝐚′)
[5] 𝜸 + 𝜷 = Π(𝐥 − 𝐚′)
[6] Π(𝐜 + 𝐛) + Π(𝐚′ − 𝐦′) = ½𝜋

in which equations we use bold letters to indicate that these values might not be the same as those in a right triangle, as in the foregoing proposition.

For, extend DA to Z, a point infinitely distant. Draw CZ, BZ parallel to DAZ. Extend CB to G until BG is the distance of parallelism for ∠GBZ. Thus GZ, BZ, CZ, DZ are all parallel.

Now ∠DCZ = Π(CD) = Π(𝐥) = 𝝀
and ∠GCZ = Π(GC) = Π(GB + 𝐜)
while GB = distance of parallelism for ∠GBZ
= distance of parallelism for complement of ∠ABZ
= distance of parallelism for complement of the angle whose distance of parallelism is 𝐦′
= 𝐦 by our notation.

Hence GB = 𝐦, and thus ∠GCZ = Π(GC) = Π(𝐦 + 𝐜)
Now ∠GCD = ∠GCZ + ∠DCZ
thus 𝜷 = 𝝀 + Π(𝐜 + 𝐦)
which is equation [1].


Similarly,

Equation [2] becomes apparent if we extend AD to Z, draw CZ and BZ parallel to ADZ, and cut off from CB length BG = m = distance of parallelism for 𝝁, complement to 𝝁′.

Equation [3] becomes apparent if we extend BC to Z, draw AZ and DZ parallel to BCZ, extend AD to G, till AG = distance of parallelism for ∠DAZ (so that the perpendicular to AG at G is parallel to AZ, and AG = the distance of parallelism for the complement of 𝝁′, and thus AG = m), and extend DC to N till CN = distance of parallelism for 𝜷, that is, till CN = b.

Equation [4] becomes apparent if we extend BA to Z, draw CZ and DZ parallel to BAZ, and extend CD to N till DN = distance of parallelism for ∠NDZ, complement of ∠ADZ, i.e., so that DN = a’.

Equation [5] becomes apparent if we extend AB to Z, draw DZ and CZ parallel to ABZ, and cut off from CD length DN = distance of parallelism for ∠CDZ, complement of ∠ADZ, which is the angle of parallelism for AD = a, and thus DN = a’.

Equation [6] becomes apparent if we extend DC to Z, draw AZ and BZ parallel to DCZ, extend BC to N till CN = b, and extend AB to G till AG = a’. Q.E.D.


PROPOSITION 30

Using the same notation as in the last two propositions, let there be any right triangle of sides a, b, c, opposite angles 𝜆, 𝜇, ½𝜋 respectively, and a quadrilateral with three right angles, in which one side of the acute angle is equal to c in the right triangle, and the next side is equal to m’ (the distance of parallelism for the complement of 𝜇 in the right triangle), and call the remaining sides, in order, 𝐚 and 𝐥, and call the acute angle 𝜷. Then it must follow that

𝜷 = 𝛽    𝝀 = 𝜆    𝐚 = a    𝐥 = l

That is, the acute angle in the quadrilateral is equal to the angle of parallelism for side b in the triangle, and the angle of parallelism for side 𝐥 of the quadrilateral is equal to angle 𝜆 in the triangle, and side 𝐚 in the quadrilateral is equal to side a in the triangle, and side 𝐥 in the quadrilateral is equal to the distance of parallelism for angle 𝜆 in the triangle.

For, in the triangle we have, by Equations [1] and [2] of Proposition 28,

𝛽 = 𝜆 + Π(c + m)
𝜆 + 𝛽 = Π(c − m)

And, in the quadrilateral we have, by Equations [1] and [2] of Proposition 29,

𝜷 = 𝝀 + Π(c + m)
𝝀 + 𝜷 = Π(c − m)

where we do not write c and m in bold, since we are given that c in the quadrilateral is equal to the side c in the triangle, and also that m’ in the quadrilateral is equal to the distance of parallelism for the complement of 𝜇 in the right triangle, so that the distance of parallelism of 𝜇 in the right triangle, and also the distance of parallelism complementary to m’ in the quadrilateral, is the same m.

But since we have here two pairs of linear equations identical in form, and differing only in two terms (namely in 𝛽 and 𝜷, and again in 𝜆 and 𝝀), each pair determines its two unknowns from c and m alone (adding the two equations gives the one, subtracting them gives the other), and so it follows that

𝜷 = 𝛽
𝝀 = 𝜆


And since these angles are equal, therefore the distances of parallelism for them are also equal, that is

𝐥 = l and 𝐛 = b

Now from Equation [3] in Proposition 29, we have

Π(𝐛 + 𝐥) + Π(𝐦 − 𝐚) = ½𝜋

which we now know is the same as

Π(b + l) + Π(m − 𝐚) = ½𝜋

and from Equation [3] in Proposition 28, we have

Π(b + l) + Π(m − a) = ½𝜋

from which two equations it follows that

𝐚 = a

And from this it follows that whenever there exists a quadrilateral having three right angles, and sides c, m’, a, l, taken in order beginning from the one acute angle 𝛽, then there also exists a right triangle with sides a, b, c, opposite angles 𝜆, 𝜇, ½𝜋 respectively, in which 𝜇 is the complement to the angle for which m’ is the distance of parallelism. Q.E.D.
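The correspondence between the right triangle and the quadrilateral can also be checked by computation. The sketch below is an illustration only, under assumptions not established in the text: curvature −1, the closed form Π(x) = 2 arctan(e^(−x)), and the standard trigonometric relations for a quadrilateral with three right angles (for the two sides p, q meeting at the right angle opposite the acute angle: tanh of the side opposite q equals cosh p · tanh q, and the cosine of the acute angle equals sinh p · sinh q). The function names are hypothetical.

```python
import math

def Pi(x):
    """Angle of parallelism (assumed closed form, curvature -1)."""
    return 2.0 * math.atan(math.exp(-x))

def dist_of_parallelism(theta):
    """Distance whose angle of parallelism is theta."""
    return -math.log(math.tan(theta / 2.0))

def quadrilateral_from_triangle(a, b):
    """Build the quadrilateral of Proposition 30 from the triangle's c and m',
    and return its remaining sides and acute angle alongside the triangle's
    a, l and Pi(b) for comparison."""
    # Right triangle with legs a, b.
    c = math.acosh(math.cosh(a) * math.cosh(b))
    lam = math.atan(math.tanh(a) / math.sinh(b))
    mu = math.atan(math.tanh(b) / math.sinh(a))
    l = dist_of_parallelism(lam)
    m_prime = dist_of_parallelism(math.pi / 2 - mu)
    # Quadrilateral: sides c, m' given; solve for the other two sides.
    quad_a = math.atanh(math.tanh(c) / math.cosh(m_prime))
    quad_l = math.atanh(math.cosh(quad_a) * math.tanh(m_prime))
    quad_beta = math.acos(math.sinh(m_prime) * math.sinh(quad_a))
    return (quad_a, quad_l, quad_beta), (a, l, Pi(b))

quad, tri = quadrilateral_from_triangle(1.0, 1.0)
print(quad)
print(tri)  # the two triples should agree
```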


PROPOSITION 31

To construct a straight line parallel to a given straight line through a given point not on it.

Let AB be the given straight line, P the given point not on it. Thus it is required to draw through P a straight line parallel to AB.

Drop PR at right angles to AB. Draw PU at right angles to PR. From RB cut off the part RF of any length less than PR. Draw FQ at right angles to PU. Thus ∠RFQ must be acute [Prop.11]. And therefore, in the quadrilateral PRFQ, side RF must be greater than PQ [Prop.12].

Cut off PK equal to RF (thus greater than PQ), and draw a circle with center P and radius PK, which will therefore cut QF at some point (since PK > PQ, but PK < PR), say L. Join PL. Call

PQ = a    QF = l    FR = c = PL    RP = m’    ∠QFR = 𝛽

Now, since we have constructed this quadrilateral PQFR, corresponding to it there exists also a right triangle with sides a, b, c, opposite angles 𝜆, 𝜇, ½𝜋 respectively, in which 𝜇 is the complement to the angle for which m’ is the distance of parallelism [Prop.30]. But since ∠PQL is right, and in △PQL a is a leg and c is the hypotenuse, therefore △PQL is that triangle, and therefore ∠QPL = 𝜇. Now ∠RPL is the complement of this, and therefore ∠RPL = 𝜇′. But PR = m’, the distance of parallelism for 𝜇′, and PR is at right angles to RB. Consequently, PL is parallel to RB. Q.E.F.
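As a numerical cross-check of the construction, the sketch below computes the angle RPL that the construction produces and compares it with the angle of parallelism for PR. It assumes curvature −1, the closed form Π(x) = 2 arctan(e^(−x)), and two standard relations that the text does not state: in the three-right-angled quadrilateral PRFQ, tanh RF = cosh PR · tanh PQ, and in a right triangle the cosine of an acute angle is tanh(adjacent leg) / tanh(hypotenuse). The function names are invented for the example.

```python
import math

def Pi(x):
    """Angle of parallelism (assumed closed form, curvature -1)."""
    return 2.0 * math.atan(math.exp(-x))

def constructed_angle(PR, RF):
    """Angle RPL obtained by following the construction, with RF < PR."""
    # Side PQ of quadrilateral PRFQ (right angles at P, R, Q).
    PQ = math.atanh(math.tanh(RF) / math.cosh(PR))
    # Right triangle PQL: leg PQ, hypotenuse PL = RF.
    angle_QPL = math.acos(math.tanh(PQ) / math.tanh(RF))
    # RPL is the complement of QPL, since angle RPQ is right.
    return math.pi / 2 - angle_QPL

for RF in (0.2, 0.8, 1.2):
    print(RF, constructed_angle(1.3, RF), Pi(1.3))
```

The constructed angle should come out the same for every choice of RF, which agrees with the freedom the construction leaves in choosing RF.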


PROPOSITION 32

It is possible to assign an absolute unit of length.

In his geometry, Descartes chooses an arbitrary finite straight line as a unit length. And he invites you to do the same. Question: is your unit the same as his? Are they equal, or unequal? Answer: the question has no meaning. Descartes’ unit has metrical relationships to other lines in the same world, that is, to other lines in his mind. His unit length is to another line that he is imagining as 1 is to 2, and it is to the diagonal of the square on itself as 1 is to √2, and the same things can be said of a unit in your mind in relation to other lines you have in mind. But his unit has no metrical relationships to yours, or to any lines in your mind. Each is a measure only in relation to other things in the same world. Each unit is something relative to the unit-user, one might say, and there is no way to define a common unit for everyone; what is the same for everyone is just the metrical relationships.

All of that is about length. Angles are another story entirely. My right angle is the same as yours and the same as Descartes’: it is exactly “one quarter of a full circle,” or one fourth of “all the way around.” Here we have a natural unit of angular measure, one that can be specified independently of its users, and defined in a way that specifies a determinate amount.

In hyperbolic geometry, we can define a unit of length in a way similar to the way that we can define a unit of angular measure. For example, “the side of an equilateral triangle having a defect of one right angle.” This specifies a definite length, and defines it in a way that is independent of any unit-chooser. In Euclid’s geometry, the same shape can have different sizes, so “the side of an equilateral triangle” does not specify any definite length. In Lobachevsky’s geometry, a given shape can have only one size, and so its shape gives us a way of defining the length of one of its sides.
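To see that such a description really does pin down a number, the sketch below computes the side of the equilateral triangle whose defect is one right angle, so that each of its angles is 30°. It is a sketch under an assumption the text does not make: lengths are expressed in the unit for which the curvature is −1, and the side is obtained from the law of cosines for angles, cos A = −cos B cos C + sin B sin C cosh a, which for three equal angles reduces to cosh s = cos A / (1 − cos A). The function name is hypothetical.

```python
import math

def equilateral_side(angle):
    """Side of the equilateral hyperbolic triangle whose three angles all
    equal `angle` (radians), in the unit where the curvature is -1."""
    if not 0 < angle < math.pi / 3:
        raise ValueError("each angle must lie strictly between 0 and 60 degrees")
    return math.acosh(math.cos(angle) / (1 - math.cos(angle)))

# Defect of one right angle: angle sum is 90 degrees, so each angle is 30.
side = equilateral_side(math.radians(30))
print(side)
```

In this unit the side comes out a little over 2.5. Choosing a different unit rescales that number, but the length itself is fixed by the description.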


PROPOSITION 33

To investigate how definite is the character of hyperbolic space as determined by the postulates previously laid down.

How much restriction do the postulates we have been using place upon the character of hyperbolic space? Every Euclidean space is the same as every other Euclidean space, but is every hyperbolic space the same as every other? Is it possible, for example, that in my universe the distance of parallelism for 60° is 12 meters, whereas in your universe the distance of parallelism for 60° is 12 yards?

Nothing in the postulates seems to prevent this, and there seems no way of getting a definite length (such as “12 meters”) just out of “60°” and our postulates.

Suppose, now, that in my universe the angle of parallelism for 6 meters is 𝑥°, and in your universe the angle of parallelism for 6 yards is 𝑦°. Must 𝑥 = 𝑦? Or do the postulates of Lobachevskian geometry permit even more freedom than that, so that different geometries can have different rates of change in angles of parallelism as distances change? Let’s find out.

In my universe, I let AC be any straight line, AG an oricycle of which AC is an axis, P any point on the oricycle, PL a straight line orthogonal to the arc AG (hence PL is another axis, and thus PL is parallel to AC), B an arbitrary point on AC, and BR another oricycle, thus determining the mixed quadrilateral PABR (contained by two equal straight lines and two oricyclic arcs).

In your universe, let ac be any straight line, ag an oricycle of which ac is an axis, d any point on the oricycle, dk a straight line orthogonal to the arc ag (hence dk is another axis, and thus dk is parallel to ac). Now move an oricycle from position ag toward the right until the arc cut off from it by dk, namely arc be, is such that


AP ∶ BR = ad ∶ be

which we know is possible thanks to Lobachevsky’s Proposition 33. Now, arc AP has to AB some ratio in my universe, so in yours cut off ap from ag so that

AP ∶ AB = ap ∶ ab

and draw prl, the axis through point p, thus giving us the quadrilateral pabr.

Now in PABR, all four angles are 90°, and PR = AB.
And in pabr, all four angles are 90°, and pr = ab.
And PA ∶ AB ∶ BR = pa ∶ ab ∶ br
and therefore PABR is similar to pabr.

[Although similar figures (same shape, different size) are impossible in one Lobachevskian space, perhaps nothing prevents them from existing in different worlds, since, as long as they do not inhabit the same space, they cannot coexist in such a way that we can use them to prove that the angle-sum of a triangle is two rights. At any rate, even if the figures are congruent, not just similar, that will be fine for the present purpose.]

I now drop PM and RN at right angles to AC, and you likewise drop pm and rn at right angles to ac. From the similarity of PABR and pabr follows the similarity of PMNR and pmnr, and therefore

∠MPR = ∠mpr and ∠NRL = ∠nrl and PM ∶ RN = pm ∶ rn

But ∠MPR is the angle of parallelism for PM
and ∠mpr is the angle of parallelism for pm
and ∠NRL is the angle of parallelism for RN
and ∠nrl is the angle of parallelism for rn

and from this it is evident that the rate of change in the angle of parallelism in one world is either the same as, or similar to, that in another. That is,

1. If in one space PM and RN have angles of parallelism α and β respectively, and in another space pm has α as its angle of parallelism, and in the second space we take rn such that PM : RN = pm : rn, then the angle of parallelism for rn will be β.

2. Conversely, if the angles of parallelism for PM and RN in one space are α and β respectively, and in another space the angles of parallelism for pm and rn are again α and β respectively, then PM : RN = pm : rn.

Returning to our original figure, then, it is indeed the case that 𝑥 = 𝑦, since we are given the proportionality of the straight lines in the two spaces, and we are also given that the antecedents have the same angle of parallelism.


Consequently there is one and only one possible value for the angle of parallelism for a given distance once a single pairing of an angle of parallelism with its distance is given in a particular space. That is, once 12 meters is paired with 60° in my universe, then the angle of parallelism for 6 meters is determined, and I have no choice about what its value will be. And this was entirely the consequence of Lobachevsky’s postulates. Therefore the form of hyperbolic space is determined by the postulates alone, even if the size or scale of the universe is not.

Again, any such space must be uniform, that is, the rate at which an angle of parallelism changes with change in the distance must be the same at all points in the space, since we may use the argument above for two points in the same space and achieve the same result (although in that case PABR and pabr must be congruent, not merely similar).
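The same conclusion can be illustrated with numbers. The sketch below leans on a fact that this proposition does not derive: Lobachevsky's closed form tan(Π(x)/2) = e^(−x/k), where k is a length that fixes the scale of the space. Eliminating k with one known pairing gives tan(Π(x)/2) = tan(Π(x₀)/2)^(x/x₀), so only the ratio x : x₀ matters. The function name and the metres-to-yards factor are assumptions made for the example.

```python
import math

def angle_of_parallelism_deg(distance, ref_distance, ref_angle_deg):
    """Angle of parallelism (degrees) for `distance`, given one known pairing
    (ref_distance, ref_angle_deg) measured in the same unit of length."""
    t_ref = math.tan(math.radians(ref_angle_deg) / 2)
    return math.degrees(2 * math.atan(t_ref ** (distance / ref_distance)))

# "My universe": 12 meters is paired with 60 degrees; what goes with 6 meters?
x = angle_of_parallelism_deg(6.0, 12.0, 60.0)

# "Your universe": 12 yards is paired with 60 degrees; what goes with 6 yards?
YARD = 0.9144  # meters per yard, used only to express the lengths
y = angle_of_parallelism_deg(6.0 * YARD, 12.0 * YARD, 60.0)

print(x, y)  # the two angles should coincide
```

Under this assumed formula the angle for half the reference distance is the same in both universes, which is the statement 𝑥 = 𝑦.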


PROPOSITION 34

There exists an isosceles triangle with a given peak angle a and having each base angle equal to a given angle b, on condition that a + 2b < 𝜋.

For, let ∠DAQ = ½a be set out, and let AC be the distance of parallelism for ½a, that is, Π(AC) = ½a, so that CL, drawn at right angles to AQ, is parallel to AD.

Choose R at random on AD, and drop RB at right angles to AC. Join CR. Call the angle ARB by the name 𝛽.

Take AR greater and greater, and repeat the construction, allowing 𝛽 to change as necessary.

Now, as AR grows toward infinity, ∠ARC goes to zero [Lob.Prop.21]. But ∠ARB is always less than ∠ARC, and therefore

lim_{AR→∞} 𝛽 = 0

But as AR → ∞, B → C, and so

lim_{B→C} 𝛽 = 0

Therefore 𝛽 will become smaller than any given angle as B goes to C.

Next take AR less and less. As AR → 0, the area of triangle ARB goes to zero, and therefore also the defect of triangle ARB goes to zero. Call this defect d. Therefore

lim_{AR→0} d = 0

But d = 𝜋 − (½𝜋 + ½a + 𝛽) = ½𝜋 − ½a − 𝛽

So lim_{AR→0} (½𝜋 − ½a − 𝛽) = 0

or 0 = lim_{AR→0} ½𝜋 − lim_{AR→0} ½a − lim_{AR→0} 𝛽

0 = ½𝜋 − ½a − lim_{AR→0} 𝛽

lim_{AR→0} 𝛽 = ½𝜋 − ½a


or lim_{B→A} 𝛽 = ½𝜋 − ½a

And therefore 𝛽 will become as close to ½𝜋 − ½a as we please as B goes to A.

Now, we are given that b is some finite amount, and that

a + 2b < 𝜋
so 2b < 𝜋 − a
so b < ½𝜋 − ½a

And it is less by a fixed amount, since b is a given constant. So we can find a value of 𝛽 that is closer to ½𝜋 − ½a than b is, and for this value 𝛽 > b.

Also, since b is a fixed and finite amount, we can find another value of 𝛽, as B goes to C, that is closer to zero than b is, and so for this value we will have 𝛽 < b.

So, as R goes from A to infinity, 𝛽 passes first through values greater than b, and then through values less than b. Therefore, since this process is continuous, at some point the changing value of 𝛽 must equal the fixed value of b.

Let this happen at E. Drop EF at right angles to AQ. Then ∠AEF = b.

Now extend EF to G so that EF = FG. Thus △AFG is congruent to △AFE. Therefore ∠EAG = a, and ∠AGF = ∠AEF = b. Therefore AEG is an isosceles triangle with peak angle a and each base angle equal to b. Q.E.D.
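The continuity argument can be imitated by a bisection search on the length AR. The sketch below is an illustration only and assumes things the demonstration does not use: curvature −1 and the standard right-triangle relation cosh(hypotenuse) = cot A · cot B, which here gives tan 𝛽 = 1 / (cosh(AR) · tan(½a)). The function names are made up for the example.

```python
import math

def beta(AR, a):
    """Angle ARB of the right triangle ARB with angle a/2 at A and
    hypotenuse AR (curvature -1 assumed)."""
    return math.atan(1.0 / (math.cosh(AR) * math.tan(a / 2)))

def find_AE(a, b, tol=1e-12):
    """Length AE at which the changing angle beta equals b, found by bisection."""
    if not (a > 0 and b > 0 and a + 2 * b < math.pi):
        raise ValueError("need a > 0, b > 0 and a + 2b < pi")
    lo, hi = 0.0, 1.0
    while beta(hi, a) > b:      # move outward until beta has dropped below b
        hi *= 2
    while hi - lo > tol:        # beta decreases steadily as AR grows
        mid = (lo + hi) / 2
        if beta(mid, a) > b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Peak angle and base angles all equal to one seventh of 360 degrees.
a = b = 2 * math.pi / 7
AE = find_AE(a, b)
print(AE, beta(AE, a), b)
```

With a = b the triangle found is equilateral, and AE is then the common side length.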

PROPOSITION 35

That it is possible to tessellate the hyperbolic plane with a required number of equilateral triangles about a given point, provided the number be greater than 6.

To tessellate a plane (or any surface) means to cover it with figures so that there are no gaps or overlaps, as with tiles on a floor. We can tile the hyperbolic plane with just equilateral triangles, as long as we use 7 or more around each vertex. The number must be greater than 6, because if we used only 6 around a point, or fewer, and somehow still managed to leave no gaps, then

6 × (angle of equilateral triangle) ≥ 360°
so angle of equilateral triangle ≥ 60°
so angle sum of equilateral triangle ≥ 180°

which is impossible in hyperbolic space.

To tile the plane with 7 equilateral triangles around every vertex, we first set out an isosceles triangle with peak angle b, and each base angle equal to b also, where

$$b = \tfrac{1}{7}(360^\circ)$$

which is doable, since $3b = \tfrac{3}{7}(360^\circ) < 180^\circ$ [Prop.34].

Now place 7 such equilateral triangles around any point P in the plane, and they will not overlap or leave any gaps. And we can repeat this around any vertex of any such triangle, thus tiling the plane. We could also similarly tile the plane with equilateral triangles in which $b = \tfrac{1}{8}(360^\circ)$, or $b = \tfrac{1}{9}(360^\circ)$, etc.

Q.E.D.
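The arithmetic behind this proposition can be checked with a short Python sketch. It is not part of the original text, and the function names `triangle_angle` and `can_tile_hyperbolic` are our own. For n triangles meeting at a vertex it computes the angle 360°/n of each triangle and tests whether the angle sum 3 × 360°/n is less than 180°, which is the condition Prop. 34 requires.

```python
from fractions import Fraction

def triangle_angle(n):
    """Angle in degrees of each equilateral triangle when n of them meet at a vertex."""
    return Fraction(360, n)

def can_tile_hyperbolic(n):
    """True when the angle sum 3 * (360/n) is strictly less than 180 degrees."""
    return 3 * triangle_angle(n) < 180

# Print n, the angle b, the angle sum 3b, and whether the hyperbolic condition holds.
for n in range(3, 10):
    b = triangle_angle(n)
    print(n, float(b), float(3 * b), can_tile_hyperbolic(n))
```

Exact fractions are used so that the boundary case n = 6, where the angle sum is exactly 180° (the Euclidean case), is not blurred by rounding.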

COROLLARY: In Euclidean space, the only regular figures that tessellate the plane are the 3-sided, 4-sided, and 6-sided regular rectilineal figures, that is, the equilateral triangle, the square, and the regular hexagon. But in hyperbolic space, we can tessellate the plane with regular polygons of any given number of sides. To do so with a 5-sided regular polygon, for example, we need only consider the pentagon as composed of 5 congruent isosceles triangles (sharing a common vertex at the center of the pentagon). One limitation on these triangles is that the angle sum of each must be less than 180°. To get the pentagons to tessellate the plane, we just choose an angle sum for the isosceles triangles that is less than 180°, but also such that each peak angle is one fifth of 360°, that is 72°. Since double each base angle of these isosceles triangles will form one angle of our pentagon, we require the double of the base angle to measure 360° evenly. For example, we know that there is an isosceles triangle with peak angle a = 72°, and each base angle $b = \tfrac{1}{8}(360^\circ) = 45^\circ$ [Prop.34], since those add up to less than 180° (72° + 45° + 45° = 162°). Using these isosceles triangles to form our pentagons, each pentagonal angle will be 90°. And four of those will fill up 360° without gap or overlap. So four such pentagons will fit around any one point, and we can use them to tessellate the plane.
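The reasoning of this corollary can be sketched in Python for a regular polygon of p sides with q polygons meeting at each vertex. This is an illustration only; the names `isosceles_angles` and `tiles_hyperbolic_plane` and the parameters p and q are our own. Each polygon is cut into p isosceles triangles with peak angle 360°/p, and since the polygon's angle 360°/q is double the base angle, each base angle is 180°/q. The test is simply whether the angle sum of such a triangle falls below 180°, as Prop. 34 requires.

```python
from fractions import Fraction

def isosceles_angles(p, q):
    """Peak and base angle (degrees) of the p triangles making up a regular p-gon,
    when q such polygons are to meet at every vertex."""
    peak = Fraction(360, p)   # the p peak angles fill 360 degrees at the center
    base = Fraction(180, q)   # two base angles make one polygon angle of 360/q
    return peak, base

def tiles_hyperbolic_plane(p, q):
    """True when peak + 2 * base < 180 degrees, so the triangle exists by Prop. 34."""
    peak, base = isosceles_angles(p, q)
    return peak + 2 * base < 180

print(isosceles_angles(5, 4), tiles_hyperbolic_plane(5, 4))   # the pentagon example
print(tiles_hyperbolic_plane(4, 4))                            # squares, four at a vertex
```

The second call returns False because four squares at a vertex give an angle sum of exactly 180°, which is the Euclidean case rather than the hyperbolic one.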

17 Poincaré’s Disk Model of the Hyperbolic Plane: A Euclidean Mirror of a Non-Euclidean World

INTRODUCTION

We have now gotten some experience of Lobachevsky’s world, of the particular non-Euclidean universe known as hyperbolic space. In that world we have discovered certain truths of Euclid that remain untouched, and also other things impossible in Euclidean space. It is time to begin comparing the two worlds. Is one of them truer than the other? Is Lobachevsky’s world somehow false, and Euclid’s true? And what might that mean?

We will begin investigating that question in a particular form in this section. We will ask: is hyperbolic geometry absurd, containing hidden contradictions that we have not yet discovered? Or is it merely strange, unfamiliar territory, difficult for us to imagine, but entirely self-consistent? And is self-consistency the same thing as mathematical truth?

Another way of asking our present question is this: do the first four postulates of Euclid imply his fifth as a logical consequence? Can it be proved from them? Or can the first four postulates be assumed together with the negation of the fifth postulate without ever running into a contradiction? The fact that geometers for many centuries tried to prove the fifth postulate from the other four, and always failed, suggests but does not prove that the first four cannot prove the fifth. The fact that Lobachevsky and other geometers assumed the first four postulates together with the negation of the fifth, and never ran into contradictions, suggests but does not prove that the denial of the fifth postulate is logically consistent with the other four. In a similar way, the failure of geometers to find a way to trisect an arbitrary rectilineal angle with straight edge and compass alone suggests, but does not prove, that it is impossible to trisect a given angle with such instruments alone (that has been proved an impossibility by modern mathematics, by the way).

How shall we go about answering our question? We are asking whether there is a hidden contradiction in hyperbolic geometry, one we have not yet found. If there is one, we might find it and then we would know the answer to our question in a definite way. But what if there isn’t one? Then we will simply continue failing to find one, and we will never be sure whether the reason is that there isn’t one, or that there is but we simply haven’t found it yet, and so it seems we could never know the answer to our question in a definite way. How frustrating that would be! Never fear. Modern mathematicians have discovered ways of answering our question.

We will prove that the first four postulates of Euclid do not imply his fifth postulate, and therefore the first four postulates and the negation of the fifth (which negation is the special postulate of hyperbolic geometry) are logically consistent—or at least they are as logically consistent as Euclid’s postulates are. Our technique will be to build a model of hyperbolic plane geometry within Euclid’s geometry, a perfectly Euclidean construction whose logical behavior is exactly analogous to that of all the entities of hyperbolic plane geometry. The particular model we will build is called the Poincaré Disk Model, named after Jules Henri Poincaré (1854-1912), a French mathematician, theoretical physicist, mining engineer, and philosopher of science. Among many other things to his credit, he is numbered among the founders of topology and the forerunners of modern chaos theory. He made a large number of significant contributions to modern mathematics and physics, and his name will come up again when we come to Einstein’s theory of relativity.

Although in many ways a very modern mind, Poincaré was scandalized by Cantor’s theory of transfinite numbers, and referred to it as a “disease” of which mathematics would eventually be cured. He said “There is no actual infinite; the Cantorians have forgotten this, and that is why they have fallen into contradiction.” He was also opposed to the mathematical philosophy of Bertrand Russell and Gottlob Frege, who both believed that mathematics is a branch of logic. Although it may seem surprising after we study his disk model, Poincaré disliked logic, and thought it only gave structure to ideas after discovery, whereas real innovation must come more intuitively. For him, intuition was the essence of mathematics, and the science was not merely a matter of logical deduction from assumed principles, but a visualization of them—it was about logical structures and relations inhering in concrete forms, not about logical structures considered abstractly.

The Poincaré Disk Model, also called the Conformal Disk Model, was originally proposed by Eugenio Beltrami (1835-1900, an Italian mathematician who also proposed what later became known as the Klein Model and the Poincaré Half-Plane Model) to show that hyperbolic geometry was “equiconsistent” with Euclidean geometry, meaning that hyperbolic geometry did not contradict itself any more than Euclid’s geometry did. The model is named after Poincaré nonetheless, since his own discovery of it fourteen years later became better known than Beltrami’s original work.

In order to build the model and make our argument from it, we require a number of preliminary Euclidean theorems. Since these are beautiful in their own right, we will pursue these mostly for their own sake first, without explaining at each step how they will be useful for our main purpose, and then we will put together a demonstration from our model afterward.

DEFINITION 1

When the radius of a circle is a mean proportional between the distances from its center to each of two points (one inside the circle and the other outside) that are collinear with the center, then either point is called the inverse or inversion of the other.

For example, if a circle of center O and radius R has a point P inside it, and if OP is extended to I so that $R^2 = IO \cdot OP$, then point I is called the inversion of P, and conversely P is the inversion of I. Points P and I can also be called inverses of each other.

Remark 1: If R = 1, then $OI = \frac{1}{OP}$, and so the lines OI and OP are also “inverses” of one another in the fractional sense.

Remark 2: Two points are called inverses of each other not absolutely, but only with respect to some circle (although an infinity of different circles can serve the purpose) called a circle of inversion, the center of which is called the center of inversion.

Remark 3: Given a line or figure and a circle of inversion, if the inverse point for every point on the line or figure lies on another line or figure, this other line or figure can also be called the inverse or inversion of the original one. Inversion thus gives us a way of mapping all points outside a circle to points inside it (apart from the center itself), placing them in one-to-one correspondence. Points, lines, or figures that are inverses of each other are thus said to map to each other. Sometimes lines or figures that are inversions of each other are called images of each other. Although an infinity of different circles can be used to invert one point into another point, the case might be different for whole lines or figures. Perhaps, for example, a given pair of figures might invert to one another only with respect to one special circle, or maybe not at all. We will have to investigate how inversion affects certain things, such as straight lines, angles, and circles.

Remark 4: Every straight line drawn from the center of inversion through two lines (or figures) that are inversions of each other cuts them in points that are inverses of each other.

THEOREM 1

If from a point inside a circle a perpendicular is drawn to the diameter through it, and straight lines join the ends of this chord to the inverse of the point, these lines will be tangent to the circle, and conversely.

Let P be a point inside a circle of center O, diameter AOPB, and draw chord TPN at right angles to AB, and join I, the inverse of P, to T and N. Then TI, NI will be tangent to the circle.

For, since P and I are inverses of one another,

$PO \cdot OI = OT^2$ [Def.1]
$PO \cdot OI = OP^2 + PT^2$ [Euc.1.47]
$PO \cdot OI - OP^2 = PT^2$
$OP\,(OI - OP) = PT^2$
$OP \cdot PI = PT^2$

Thus PT is a mean proportional between OP and PI, and since the angles at P are right, triangles OPT and TPI are similar [Euc.6.6]. Therefore angle OTI is right, and thus TI is tangent to the circle at T [Euc.3.16].

Moreover, by reversing the argument, if IT is tangent, and IOA is a diameter, and TP is drawn at right angles to IOA, then $PO \cdot OI = OT^2$, and so points P and I will be inverses of one another. Q.E.D.
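Definition 1 and Theorem 1 can be tried out numerically. The following Python sketch is our own illustration, not part of the original text: the function `invert`, the coordinates, and the choice of radius are all assumptions made for the example. It finds the inverse I of a point P by scaling OP so that OP ∙ OI = R², and then checks that the radius OT is at right angles to TI, which is what tangency at T amounts to.

```python
import math

def invert(point, center=(0.0, 0.0), radius=1.0):
    """Inverse of `point` with respect to the circle (center, radius): OP * OI = R^2."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    d2 = dx * dx + dy * dy                 # OP squared
    if d2 == 0:
        raise ValueError("the center of inversion has no inverse")
    k = radius * radius / d2               # OI = k * OP, so OP * OI = R^2
    return (center[0] + k * dx, center[1] + k * dy)

R = 2.0
P = (0.5, 0.0)                             # a point inside the circle, on the diameter
I = invert(P, radius=R)                    # its inverse, outside the circle

# T is one end of the chord through P at right angles to the diameter.
T = (P[0], math.sqrt(R * R - P[0] ** 2))

# Dot product of OT with TI; zero means the radius is perpendicular to TI.
tangency = T[0] * (I[0] - T[0]) + T[1] * (I[1] - T[1])
print("I =", I)
print("OT . TI =", tangency)
```

The dot product comes out as zero up to rounding error, in agreement with the theorem.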

IDEM ALITER.

If IT, IN are tangents to the circle of center O, and chord TN cuts diameter IBOA at P, then

$AI : IB = AP : PB$ [Apollonius, Conics 1.36]
so $(AO + OI) : (OI - OB) = (AO + OP) : (OB - OP)$
or $(OI + OB) : (OI - OB) = (OB + OP) : (OB - OP)$
so $OI : OB = OB : OP$
so $OB^2 = IO \cdot OP$
so points I and P are inverses [Def.1].

THEOREM 2

If from a point outside a circle a pair of symmetrical secants is drawn through the circle, and the “X” is drawn joining the ends of the intercepted chords, then the intersection point in this “X” and the original point outside the circle are inverses of each other, and conversely.

Let F be a point outside a circle of center E, diameter FCEA, FGB any secant, and FHD the secant symmetrical to it about the diameter. Now form the “X” by joining BH, GD. By symmetry, the point K at which these intersect must lie on FCEA, the line of symmetry. Then K will be the inverse of F.

For Ptolemy shows that in such a figure

$AF : FC = AK : KC$ [Almagest 12.1]
so $(EF + EA) : (EF - EC) = (AE + EK) : (EC - EK)$
or $(EF + EC) : (EF - EC) = (EC + EK) : (EC - EK)$
so $EF : EC = EC : EK$
so $EC^2 = FE \cdot EK$

and so F, K are inverses of each other with respect to circle E [Def.1].

By reversing the steps in the argument, we see also that if F, K are inverses of one another, then $AF : FC = AK : KC$, and consequently the “X” formed between any pair of symmetrical secants from F will pass through K. Q.E.D.
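Theorem 2 lends itself to a numerical check. The Python sketch below is our own illustration; the helper names `invert`, `secant_points`, and `intersect`, the coordinates, and the secant's direction are assumptions chosen for the example. It places the center E at the origin, draws a secant FGB and its mirror image FHD, intersects BH with GD, and compares the crossing point K with the inverse of F.

```python
import math

def invert(point, center=(0.0, 0.0), radius=1.0):
    """Inverse of `point` with respect to the circle (center, radius)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    k = radius * radius / (dx * dx + dy * dy)
    return (center[0] + k * dx, center[1] + k * dy)

def secant_points(F, angle, radius):
    """Where the line from F in direction `angle` cuts the circle centered at the
    origin; the point nearer to F is returned first."""
    ux, uy = math.cos(angle), math.sin(angle)
    fu = F[0] * ux + F[1] * uy
    disc = fu * fu - (F[0] ** 2 + F[1] ** 2 - radius ** 2)
    if disc <= 0:
        raise ValueError("the line does not cut the circle in two points")
    root = math.sqrt(disc)
    t_near, t_far = -fu - root, -fu + root
    return ((F[0] + t_near * ux, F[1] + t_near * uy),
            (F[0] + t_far * ux, F[1] + t_far * uy))

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

r = 1.0
F = (3.0, 0.0)                                   # a point outside the circle
G, B = secant_points(F, math.radians(170), r)    # secant FGB
H, D = (G[0], -G[1]), (B[0], -B[1])              # symmetrical secant FHD
K = intersect(B, H, G, D)                        # the crossing of the "X"
print("K =", K, " inverse of F =", invert(F, radius=r))
```

Changing the angle of the secant (so long as it still cuts the circle) should leave K where it is, as the last sentence of the proof asserts.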

Corollary to Theorem 2: Ptolemy also proves that in such a figure $DF : FH = BK : KH$, and so this is also a necessary and sufficient condition of points F and K being inverses.

THEOREM 3

If from any point outside a circle any secant is drawn, and to the resulting chord is drawn a line perpendicular to the diameter from the inverse of the outside point, this line will cut the chord internally in the same ratio in which the secant is cut externally, and conversely.

Let F be a point outside a circle of center E, FHD any secant, FCEA the diameter from F, K the inverse of F, KP drawn at right angles to the diameter. Then $DP : PH = DF : FH$.

For let FGB, the secant symmetrical to FHD on the other side of diameter FCA, be drawn. Thus K lies at the intersection of DG, BH [Thm.2], and points G, H are symmetrical, and so are points B, D, so that KG = KH and DK = BK, and GH will be at right angles to diameter AC, and so GH will be parallel to PK.

Now, since points F, K are inverses of each other,

$DF : FH = BK : KH$ [Cor.Thm.2]
or $DF : FH = DK : KG$
so $DF : FH = DP : PH$ [since $DP : PH = DK : KG$]

And if we start with this proportion, we can reverse the argument and show that, after drawing a perpendicular PK from P to the diameter through F, the points F, K will be inverses of each other. Q.E.D.
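The proportion DP : PH = DF : FH of Theorem 3 can also be checked with numbers. In this Python sketch, which is our own illustration with assumed coordinates and an assumed helper `secant_points`, the center E is the origin, the diameter lies along the x-axis, and P is found as the point where the perpendicular through K meets the secant.

```python
import math

def secant_points(F, angle, radius):
    """Where the line from F in direction `angle` cuts the circle centered at the
    origin; the point nearer to F is returned first."""
    ux, uy = math.cos(angle), math.sin(angle)
    fu = F[0] * ux + F[1] * uy
    disc = fu * fu - (F[0] ** 2 + F[1] ** 2 - radius ** 2)
    if disc <= 0:
        raise ValueError("the line does not cut the circle in two points")
    root = math.sqrt(disc)
    t_near, t_far = -fu - root, -fu + root
    return ((F[0] + t_near * ux, F[1] + t_near * uy),
            (F[0] + t_far * ux, F[1] + t_far * uy))

r = 1.0
F = (3.0, 0.0)                        # point outside the circle, on the x-axis
K = (r * r / F[0], 0.0)               # inverse of F, since EF * EK = r^2 [Def.1]
angle = math.radians(170)
H, D = secant_points(F, angle, r)     # secant FHD, with H nearer to F

# P lies on the secant and on the perpendicular to the diameter through K (x = K[0]).
t = (K[0] - F[0]) / math.cos(angle)
P = (K[0], F[1] + t * math.sin(angle))

internal = math.dist(D, P) / math.dist(P, H)   # DP : PH
external = math.dist(D, F) / math.dist(F, H)   # DF : FH
print(internal, external)
```

The two printed ratios agree up to rounding error.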

THEOREM 4

Points on a circle of inversion invert to themselves.

Let a point P be taken on a circle of center O, diameter AOP, radius R. By definition, the inverse of P with respect to circle O is a point Q on diameter AOP such that

$R^2 = QO \cdot OP$
or $OP^2 = QO \cdot OP$
so $QO = OP$

and Q is just P itself. Q.E.D.

THEOREM 5

A straight line through the center of inversion inverts to itself.

Let QOR be a straight line through center O of a circle with respect to which we shall find the inversion of QR. Choose any point on QR, whether a point A inside the circle, or a point b outside it. From point A draw Aα at right angles to QOR, and draw the tangent from α, cutting QOR at point a. Therefore point a is the inverse of point A [Thm.1]. From point b draw bβ tangent to the circle, and drop βB at right angles to QOR. Thus B is the inverse of b [Thm.1]. In this way it is evident that all points on QR invert to other points also on QR. Hence QR inverts to itself. Q.E.D.

DEFINITION 2

Two circles are said to cut orthogonally, or to be orthogonal to one another, when the two tangents at either point of their intersection are at right angles to one another.

For example, if circles with centers O and T cut at H and R, and ∠OHT is right (and so ∠ORT is also right), then OH is tangent to circle T at H, and TH is tangent to circle O at H, and so, since these tangents are at right angles, the circles are orthogonal to each other.

Note that each circle can say to the other: “Your radii are my tangents.”

THEOREM 6

Any circle through two points that are inverses of one another with respect to some circle cuts that circle of inversion orthogonally; and, conversely, if two circles are orthogonal to one another, then every straight line through the center of one cuts the other in points that are inverses with respect to itself.

First let B and D be inverses with respect to a circle with center O, and let another circle with center Q pass through B and D, cutting circle O at P. Then circles O and Q are orthogonal to one another.

For, since B and D are inverses with respect to circle O, thus

$OP^2 = BO \cdot OD$ [Def.1]

But P, B, D all lie on circle Q, and OBD is a straight line.
So OP is tangent to circle Q. [Euc.3.37]
So OP is at right angles to PQ. [Euc.3.18]

Thus the two circles O and Q have their radii to P at right angles, and therefore they are orthogonal to one another. [Def.2]

Next, let O and Q be centers of circles that cut orthogonally at P. Let B be any other point on circle Q, and join OB, extending it to D on circle Q. Then B, D must be inverses of one another with respect to circle O.

For, since circles O and Q are orthogonal to each other, therefore ∠OPQ is right, and so OP is tangent to circle Q at P, and so

$OP^2 = BO \cdot OD$ [Euc.3.36]

But OP is the radius of circle O, and B, D lie on the same straight line through O, and therefore B, D are inverses of each other with respect to circle O [Def.1]. Q.E.D.

Corollary to Theorem 6: Any circle orthogonal to a circle of inversion inverts to itself. For let circle Q be orthogonal to circle O. Then, taking O as the circle of inversion, circle Q must invert to itself. For if we draw through O any straight line cutting circle Q, say at B and D, these points are inverses of each other [by Theorem 6], and so any point on circle Q must invert to another point on it, and so this corollary holds.

THEOREM 7

The inversion of a straight line not passing through the center of an inversion circle is a circle that passes through the center of inversion; and, conversely, the inversion of a circle passing through the center of inversion is a straight line not through it.

First let AB be a straight line not through O, the center of a circle with respect to which we are to find AB’s inverse.

Drop ON at right angles to AB, and let ON cut circle O at R, so that OR is a radius of circle O. Now take point n on OR (extended, if necessary) such that

$NO : OR = OR : On$

Thus $OR^2 = NO \cdot On$ and N, n are inverses with respect to circle O [Def.1].

Next choose Q at random on AB, join QO, and take point q on OQ (extended, if necessary) such that

$QO : OR = OR : Oq$

Thus Q, q are inverses also. Moreover, thanks to this proportion,

$QO \cdot Oq = OR^2 = NO \cdot On$
so $NO : OQ = qO : On$
so triangle NOQ is similar to triangle qOn
so ∠nqO = ∠QNO = a right angle.

Therefore the locus of points q, which locus is the inversion of AB with respect to circle O, is a circle of diameter nO, proving this first part of the theorem.

Next, conversely, let nqO be a circle through point O, the center of another circle with respect to which we are to find nqO’s inverse, let nO be the diameter of nqO that passes through O, and let q be any random point on circle nqO. Let On intersect circle O at R, and on OR (extended, if need be) take N such that

$nO : OR = OR : ON$

thus $OR^2 = NO \cdot On$ and so N, n are inverses with respect to circle O [Def.1].

Through N draw ANB at right angles to NO, and let Oq intersect AB at Q. Since ∠nqO is right [it is in a semicircle], and ∠QNO is right [AQNB was drawn at right angles to NO], and the angle at O is common to both NOQ and qOn, therefore these triangles are similar, and so

$NO : OQ = qO : On$
so $NO \cdot On = QO \cdot Oq$
so $QO \cdot Oq = OR^2$

and so Q, q are inverses with respect to circle O [Def.1]. Therefore the points on AB are inverses of points on circle nqO with respect to circle O, and so the converse is proved. Q.E.D.

Note: the point O itself corresponds to the two points on AB that are infinitely distant from O (so to speak).
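The first part of Theorem 7 can be seen at work in a few lines of Python. The sketch is our own illustration: the function `invert`, the unit circle of inversion, and the line x = 2 are assumptions made for the example. It inverts several points Q of the line and measures how far each inverse q falls from the circle on diameter On.

```python
import math

def invert(point, center=(0.0, 0.0), radius=1.0):
    """Inverse of `point` with respect to the circle (center, radius)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    k = radius * radius / (dx * dx + dy * dy)
    return (center[0] + k * dx, center[1] + k * dy)

R = 1.0
N = (2.0, 0.0)                       # foot of the perpendicular from O to the line x = 2
n = invert(N, radius=R)              # inverse of N
circle_center = (n[0] / 2, n[1] / 2) # midpoint of On, since On is to be the diameter
circle_radius = math.hypot(*n) / 2

deviations = []
for y in (-3.0, -1.0, 0.0, 0.5, 4.0):
    q = invert((N[0], y), radius=R)  # inverse of a point Q on the line
    deviations.append(abs(math.dist(q, circle_center) - circle_radius))

max_dev = max(deviations)
print("largest distance from the circle on diameter On:", max_dev)
```

As y is taken larger and larger in either direction, the inverse q creeps toward O, which is the sense of the note that O corresponds to the infinitely distant points of AB.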

THEOREM 8

The inversion of any circle not through the center of inversion is another circle not through the center of inversion.

Let the circumference of the circle on diameter AB not pass through O, the center of the circle with respect to which we are to find the inversion of circle AB, and let AB be the diameter in line with point O, and let OAB (extended, if necessary) intersect circle O at R. Thus a, b, the inverses of A, B with respect to circle O, will lie on OAB. Now choose any point C on circle AB, and join CO, and on CO take c, the inverse of C.

Now, since a, b, c are respectively the inverses of A, B, C, and OR is the radius of the circle of inversion, therefore

$AO \cdot Oa = BO \cdot Ob = CO \cdot Oc = OR^2$ [Def.1]
so $BO : OC = cO : Ob$

and since this proportion is among sides about the one angle at O, therefore triangle BOC is similar to triangle cOb.

Thus ∠OCB = ∠Obc
So too ∠OCA = ∠Oac [since $AO : OC = cO : Oa$]
so ∠OCB − ∠OCA = ∠Obc − ∠Oac
so ∠ACB = ∠acb [Euc.1.32]

But ∠ACB is right, being in a semicircle.
So ∠acb is also right.

Hence the locus of points c, which locus is the inversion of circle AB, is itself a circle on diameter ab. And it is plain that the circumference of circle ab does not pass through O, since the inversion of O would be a point infinitely distant from O, whereas no point on circle AB can be infinitely distant from O. Q.E.D.

Corollary to Theorem 8: We saw that triangle BOC is similar to triangle cOb. Generally, a triangle whose vertices are two points and the center of inversion is similar to the triangle whose vertices are the two inverses of those points and the center of inversion.
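Both Theorem 8 and its corollary can be tested on a particular circle. The Python sketch below is our own illustration; the helper names `invert` and `angle_at`, the unit circle of inversion, and the circle on diameter AB from (2, 0) to (4, 0) are assumptions chosen for the example. For several points C on circle AB it checks that the inverse c lies on the circle whose diameter is ab, and that ∠OCB = ∠Obc, as the similar triangles require.

```python
import math

def invert(point, center=(0.0, 0.0), radius=1.0):
    """Inverse of `point` with respect to the circle (center, radius)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    k = radius * radius / (dx * dx + dy * dy)
    return (center[0] + k * dx, center[1] + k * dy)

def angle_at(vertex, p, q):
    """Angle p-vertex-q in radians, between 0 and pi."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.atan2(abs(cross), dot)

O = (0.0, 0.0)
A, B = (2.0, 0.0), (4.0, 0.0)              # diameter of circle AB, in line with O
mid = ((A[0] + B[0]) / 2, 0.0)             # center of circle AB
rad = (B[0] - A[0]) / 2
a, b = invert(A), invert(B)                # ends of the diameter of the image circle
img_center = ((a[0] + b[0]) / 2, 0.0)
img_rad = abs(a[0] - b[0]) / 2

worst_circle = 0.0
worst_angle = 0.0
for theta in (0.7, 2.0, 4.0, 5.5):
    C = (mid[0] + rad * math.cos(theta), mid[1] + rad * math.sin(theta))
    c = invert(C)
    worst_circle = max(worst_circle, abs(math.dist(c, img_center) - img_rad))
    worst_angle = max(worst_angle, abs(angle_at(C, O, B) - angle_at(b, O, c)))

print(worst_circle, worst_angle)
```

Note that the center of the image circle is the midpoint of ab; it is not, in general, the inverse of the center of circle AB.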

Question: Is triangle abc similar to triangle ABC? Well, if abc were similar to ABC, then

∠CAB = ∠cab
But ∠CAB = ∠ACO + ∠AOC [Euc.1.32]
so ∠CAB = ∠caO + ∠aOc
so ∠CAB = ∠cab + ∠aOc
so ∠CAB > ∠cab

Hence the triangles are not similar.

Question: Is abc the inversion of ABC? No, because the inversion of a straight side that does not pass through the center of inversion, such as AC or BC, is not another straight line, but a circle through the center of inversion [Thm.7].

Observation: We are now in a position to understand better what inversion does to a figure or series of figures. In the accompanying diagram, the word “Hello” has been written inside the circle of inversion, and its inversion has been constructed (using Theorems 7 & 8) outside the circle.

DEFINITION 3

If four points lie on the semicircumference of a circle, then the ratio of (1) the chord from one extreme to the nearer of the middle points to (2) the chord from that middle point to the further extreme, times the ratio of (3) the chord from that further extreme to the middle point nearest to it to (4) the chord from that middle point back to the first extreme, is called the cross ratio of the chords.

For instance, if points a, b, d, c all lie on a semicircumference, and a and c are the extreme (outermost) points, then the cross ratio of the chords is

$$\frac{ab}{bc} \cdot \frac{cd}{da}$$

Note: Cross ratio is a more general idea than cross ratio of chords in a circle. For example, the points a, b, d, c could all lie on a straight line, and the expression above would still be the cross ratio of the four points. Note also that the cross ratio is the same regardless of which extreme we begin from.

THEOREM 9

The cross ratio of the chords between four points lying on a semicircle is preserved under inversion.

Let point O be the center of a circle with respect to which inverses will be taken. Let points A, B, D, C lie on one semicircumference. And let points a, b, d, c be the inverses of points A, B, D, C respectively. Then the cross ratio of points a, b, d, c is equal to that of points A, B, D, C. That is

$$\frac{ab}{bc} \cdot \frac{cd}{da} = \frac{AB}{BC} \cdot \frac{CD}{DA}$$

For triangle ABO is similar to triangle baO [Cor.Thm.8]

so $\frac{AB}{BO} = \frac{ba}{aO}$ so $AB = \frac{BO \cdot ab}{aO}$

And triangle BCO is similar to triangle cbO

so $\frac{CB}{BO} = \frac{bc}{cO}$ so $BC = \frac{BO \cdot bc}{cO}$

So $\frac{AB}{BC} = \frac{ab}{aO} \cdot \frac{cO}{bc} = \frac{ab}{bc} \cdot \frac{cO}{aO}$

Likewise $\frac{CD}{DA} = \frac{cd}{da} \cdot \frac{aO}{cO}$

So $\frac{AB}{BC} \cdot \frac{CD}{DA} = \frac{ab}{bc} \cdot \frac{cO}{aO} \cdot \frac{cd}{da} \cdot \frac{aO}{cO} = \frac{ab}{bc} \cdot \frac{cd}{da}$

Q.E.D.

Note: If circle ABDC happens to pass through O, so that its inversion is not another circle but instead a straight line [Thm.7], the argument is unaffected.

FYI, the cross ratio of four points in a straight line is also preserved in the projection of those points, from any point, to any other straight line.
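Theorem 9 can be confirmed numerically with a brief Python sketch. It is our own illustration: the names `invert` and `cross_ratio`, the circle carrying the four points, and the radius of inversion are assumptions made for the example. Four points A, B, D, C are taken in order along a semicircle, their inverses are computed, and the two cross ratios are compared.

```python
import math

def invert(point, center=(0.0, 0.0), radius=1.0):
    """Inverse of `point` with respect to the circle (center, radius)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    k = radius * radius / (dx * dx + dy * dy)
    return (center[0] + k * dx, center[1] + k * dy)

def cross_ratio(a, b, d, c):
    """(ab / bc) * (cd / da), with a and c the extreme points [Def.3]."""
    return (math.dist(a, b) / math.dist(b, c)) * (math.dist(c, d) / math.dist(d, a))

# Four points in order along a semicircle of a circle that does not pass through O.
center, r = (3.0, 1.0), 1.5
A, B, D, C = [(center[0] + r * math.cos(math.radians(t)),
               center[1] + r * math.sin(math.radians(t)))
              for t in (10, 60, 120, 170)]

R = 2.0                                   # radius of the circle of inversion about O
a, b, d, c = [invert(X, radius=R) for X in (A, B, D, C)]

before = cross_ratio(A, B, D, C)
after = cross_ratio(a, b, d, c)
print(before, after)
```

The proof above uses only the similar triangles of the corollary to Theorem 8, so the same check should succeed for other choices of the circle and of R.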

THEOREM 10

Angles are preserved under inversion.

We will consider this theorem in three parts:

Part 1: An angle between two straight lines is equal to the angle between their inversions.
Part 2: An angle between two circular arcs is equal to the angle between their inversions.
Part 3: An angle between any two lines is equal to the angle between their inversions.

Part 1

Let straight lines α and β intersect at angle θ, and let O be the center of the circle of inversion. Drop OA and OB at right angles to α and β respectively. If either α or β passes through O, then such a straight line is its own inversion [Thm.5], and things get simpler. Barring that, then the inversion of α is a circle through O and with its diameter along OA, and the inversion of β is a circle through O and with its diameter along OB [Thm.7]. Let these be the circles with diameters aO, bO.

Through O draw OC parallel to α, OD parallel to β. Thus ∠COD = θ. But since OC is parallel to α, therefore OC is perpendicular to AO, or aO. Thus OC is tangent to circle aO at O.

Likewise OD is tangent to circle bO at O.

Therefore ∠COD, or θ, is the angle between circles aO and bO, the inversions of α and β. Therefore the angle between the inversions of α and β is equal to the angle between α and β themselves. Q.E.D.

Part 2

Let two circles F, G (G is undepicted, to keep the diagram uncluttered) cut at T, N, and let O be the center of a circle we shall use for inversion. Then the angle between f, g, the inversions of F, G, will be the same as the angle between F, G themselves.

For: Let it be that neither F nor G passes through O (to take the more complex case), and thus f, g are circles not passing through O [Thm.8].

Let TE be the tangent to F at T. Drop OA at right angles to TE. Along OA, take a, the inverse of A. Thus the circle on Oa as diameter is the inverse of TE [Thm.7].

Since T lies on TE, therefore t, the inverse of T, lies on circle aO, the inverse of TE. And t lies also on f, the inversion of F, since T lies on F. And no point on f other than t lies on circle aO, since no point on F other than T lies on TE. Therefore f is tangent to circle aO at t. Therefore if c is the center of aO, then tc passes through the center of f.

Likewise if we drew the circle through O which is the inversion of the tangent to G at T, and the center of this circle were k, then tk would pass through the center of g.

So the angle between the circles f and g is formed by these two lines tc and tk, but also the angle between the circles that are the inversions of the tangents to F and G at T (that is, the circles with centers c and k) is formed by them, and so the angle between f and g is equal to the angle between the inversions of the tangents. But the angle between the inversions of the tangents is equal to the angle between the tangents themselves [by Part 1 above], and therefore the angle between f and g is equal to the angle between the tangents to F and G at T, that is, the angle between f and g is equal to the angle between F and G, proving the theorem in this case.
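Part 2 can be illustrated numerically. The Python sketch below is our own and rests on several assumptions: the helper names `invert`, `circumcircle`, `invert_circle`, and `cos_between` are invented for the example, and so are the two circles and the circle of inversion. Each image circle is recovered as the circle through three inverted points (which Theorem 8 justifies), and the angle between two cutting circles is measured by the angle between their radii at a common point, found from the triangle formed by that point and the two centers. Since the angle between two tangents is fixed only up to its supplement, the absolute values of the cosines are compared.

```python
import math

def invert(point, center=(0.0, 0.0), radius=1.0):
    """Inverse of `point` with respect to the circle (center, radius)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    k = radius * radius / (dx * dx + dy * dy)
    return (center[0] + k * dx, center[1] + k * dy)

def circumcircle(p, q, s):
    """Center and radius of the circle through three non-collinear points."""
    (ax, ay), (bx, by), (cx, cy) = p, q, s
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.dist((ux, uy), p)

def invert_circle(center, r, O=(0.0, 0.0), R=1.0):
    """Image of a circle not passing through O, found from three inverted points."""
    pts = [invert((center[0] + r * math.cos(t), center[1] + r * math.sin(t)), O, R)
           for t in (0.0, 2.0, 4.0)]
    return circumcircle(*pts)

def cos_between(c1, r1, c2, r2):
    """Cosine of the angle between the radii of two cutting circles at a common point."""
    d2 = (c1[0] - c2[0]) ** 2 + (c1[1] - c2[1]) ** 2
    return (r1 * r1 + r2 * r2 - d2) / (2 * r1 * r2)

F_center, F_r = (3.0, 0.0), 1.5           # circle F
G_center, G_r = (4.5, 1.0), 1.2           # circle G, cutting F in two points
R = 2.0                                   # circle of inversion about the origin O

f_center, f_r = invert_circle(F_center, F_r, R=R)
g_center, g_r = invert_circle(G_center, G_r, R=R)

cos_before = cos_between(F_center, F_r, G_center, G_r)
cos_after = cos_between(f_center, f_r, g_center, g_r)
print(abs(cos_before), abs(cos_after))
```

The two printed values agree up to rounding error, so the angle between f and g matches the angle between F and G for this pair of circles.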

Part 3

Now let’s prove the theorem universally. Let any line G cut any line F at point B, and let O be the center of some inversion circle, and let lines f and g be the

inversions of F and G.

Let the inverse of point B be point b. Thus B, b, O are collinear.

Also, since B lies on both F and G, thus must b lie on both f and g, so that f and g cut at some angle. Now it remains to be shown that the angle between f and g equals that between F and G. Along F, some distance from B,

choose any point D, and join DO, thus cutting f in d, the inverse

of D. Along G, some distance from B, choose any point E, and join EO, thus cutting

g in e, the inverse of E.

(1) Now $\angle DBO = \angle bdO$ [since $\triangle DBO$ is similar to $\triangle bdO$, Cor.Thm.8]

(2) so $\lim_{D \to B} \angle DBO = \lim_{D \to B} \angle bdO$

(3) But $\lim_{D \to B} \angle DBO$ = angle between BO and the tangent to F at B

(4) And $\angle bdO = \angle Bbd - \angle dOb$ [Euc.1.32]

(5) so $\lim_{D \to B} \angle bdO = \lim_{D \to B} \angle Bbd - \lim_{D \to B} \angle dOb$

Now the first limit on the right in the above equation is the angle between Bb (or BO) and the tangent to f at b, while the second limit is just zero.

(6) So $\lim_{D \to B} \angle bdO$ = angle between BO and the tangent to f at b

Putting together steps (2), (3), and (6), we have

(7) angle between BO and the tangent to F at B = angle between BO and the tangent to f at b

Similarly, since $\angle EBO = \angle beO$ (since $\triangle EBO$ is similar to $\triangle beO$ by Cor.Thm.8), we can show that


(8) angle between BO and the tangent to G at B = angle between BO and the tangent to g at b

And so, adding these two angles to the ones in step (7), we get

(9) angle between (the tangent to F at B, and the tangent to G at B) = angle between (the tangent to f at b, and the tangent to g at b)

proving the theorem. Q.E.D.

DEFINITION 4

So far, we have just been looking at some theorems of Euclidean geometry, most of them discovered in modern times (the 19th century). Using them, we will now begin to build a window looking out from Euclid’s world into Lobachevsky’s.

Let a particular circle in a Euclidean plane be set out. Let the area inside this circle, not including the circumference, be called the disk space, or more simply, the disk. And let the circumference of this circle be called the disk boundary.


DEFINITION 5 Consider any circular arc that (1) belongs to a circle orthogonal to the disk boundary, and (2) lies entirely within the disk, so that even its endpoints lie inside the boundary and not on it—let such circular arcs be called orthogonal lines. And let the points where such an orthogonal circle cuts the disk boundary be called the ideal points of the orthogonal line.

In the accompanying figure, arc AB is an orthogonal line, and P and Q are its ideal points.

Let an orthogonal arc lying in the disk and terminated by ideal points be called an ideal line. And so an ideal line differs from an orthogonal line only in that it reaches the disk boundary.

DEFINITION 6

From the endpoints of any orthogonal line may be drawn a pair of chords to its ideal points. Let the natural logarithm of the cross ratio of these chords be called the logarithmic length of the orthogonal line. For short, we may call this measure of length loglength or logdistance.

In the accompanying figure, AB is the orthogonal line, P and Q are its ideal points, and the cross ratio of the four chords is

$$\frac{PA}{AQ} \cdot \frac{QB}{BP}$$

so the logarithmic length of line AB is

$$\ln\left[\frac{PA}{AQ} \cdot \frac{QB}{BP}\right]$$

This way of measuring length or distance is an example of a Cayley-Klein metric, named for Felix Klein (1849-1925, a German mathematician known for his work in group theory and its connection to geometry, complex analysis, and non-Euclidean geometry) and Arthur Cayley (1821-1895, a British mathematician who was the first to define a group in a modern way).
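As a numerical illustration of Definition 6, the following Python sketch computes the loglength of the orthogonal line through two points of the disk. It is not part of the original text: taking the disk to be the unit circle centered at the origin, the coordinate representation, and the names `ideal_points` and `loglength` are all assumptions made for illustration. The absolute value is taken so that the result does not depend on which ideal point is labeled P.

```python
import math

def dist(X, Y):
    """Ordinary Euclidean distance (chord length) between two points."""
    return math.hypot(X[0] - Y[0], X[1] - Y[1])

def ideal_points(A, B):
    """Ideal points of the orthogonal line through distinct points A, B
    of the unit disk (assumed: disk = unit circle centered at the origin)."""
    (ax, ay), (bx, by) = A, B
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        # A, B, O are collinear: the orthogonal line is part of a diameter.
        dx, dy = bx - ax, by - ay
        n = math.hypot(dx, dy)
        return (-dx / n, -dy / n), (dx / n, dy / n)
    # The center c of a circle orthogonal to the unit circle and passing
    # through X satisfies c . X = (|X|^2 + 1) / 2; solve for c by Cramer's rule.
    ka = (ax * ax + ay * ay + 1) / 2
    kb = (bx * bx + by * by + 1) / 2
    cx = (ka * by - kb * ay) / det
    cy = (ax * kb - bx * ka) / det
    c2 = cx * cx + cy * cy
    s = math.sqrt(c2 - 1)  # radius of the orthogonal circle
    # The two circles meet where T . c = 1 and |T| = 1.
    P = ((cx - s * cy) / c2, (cy + s * cx) / c2)
    Q = ((cx + s * cy) / c2, (cy - s * cx) / c2)
    return P, Q

def loglength(A, B):
    """|ln[(PA/AQ) * (QB/BP)]| for the orthogonal line AB."""
    P, Q = ideal_points(A, B)
    ratio = (dist(P, A) / dist(A, Q)) * (dist(Q, B) / dist(B, P))
    return abs(math.log(ratio))
```

For instance, with A at the center and B halfway along a radius, the chords are PA = 1, AQ = 1, QB = 1/2, BP = 3/2, so the loglength is ln 3.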


This way of defining length makes the loglength of a point equal to zero, since

$$\ln\left[\frac{PA}{AQ} \cdot \frac{QA}{AP}\right] = \ln[1] = 0$$

Also, since cross ratio is preserved under inversion [Thm.9], it follows that logarithmic length is also preserved under inversion.
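The preservation of cross ratio under inversion can be checked numerically. The sketch below is an illustration only: it assumes the usual definition of inversion (a point X goes to the point X' on ray OX with OX ∙ OX' = k², where k is the radius of the inversion circle), and the function names and sample coordinates are hypothetical.

```python
import math

def dist(X, Y):
    """Ordinary Euclidean distance between two points."""
    return math.hypot(X[0] - Y[0], X[1] - Y[1])

def invert(X, O, k):
    """Inverse of X with respect to the circle of center O, radius k
    (assumed definition: X' lies on ray OX with OX * OX' = k^2)."""
    dx, dy = X[0] - O[0], X[1] - O[1]
    d2 = dx * dx + dy * dy
    return (O[0] + k * k * dx / d2, O[1] + k * k * dy / d2)

def cross_ratio(P, A, B, Q):
    """The cross ratio (PA/AQ) * (QB/BP) of chord lengths."""
    return (dist(P, A) / dist(A, Q)) * (dist(Q, B) / dist(B, P))

# Four sample points and an arbitrary inversion circle.
P, A, B, Q = (-1.0, 0.2), (-0.3, 0.6), (0.4, 0.5), (1.1, -0.1)
O, k = (0.2, -1.5), 1.7
before = cross_ratio(P, A, B, Q)
after = cross_ratio(*(invert(X, O, k) for X in (P, A, B, Q)))
```

The two values `before` and `after` agree up to rounding, which is why a length defined as the logarithm of this ratio is unchanged by inversion.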

In the special case where A and B are collinear with O, the center of the disk, so that AB is a straight line (like an arc of a circle with infinite radius), the straight line ABO is extended to the ideal points P, Q, which are the ends of a diameter, and it is still true by definition that

$$\text{logarithmic length of } AB = \ln\left[\frac{PA}{AQ} \cdot \frac{QB}{BP}\right]$$

THEOREM 11

It is possible to draw exactly one orthogonal line between two given points in the disk.

Let A, B be any two given points in the disk of center O. Thus it is required to draw an orthogonal line between them, and to show that no other can be drawn between them.

Join O to either point, say A, and extend OA each way till it meets the disk boundary in points C, D. Extend CD to point a so that a is the inverse of A with respect to circle O, the disk (by drawing AT perpendicular to CD, and Ta tangent to circle O).

Through points A, a, B describe a circle [Euc.4.5]. Since this circle passes through A, a, inverse points with respect to the disk, therefore this circle is orthogonal to the disk [Thm.6], and since arc AB is part of this orthogonal circle and lies wholly inside the disk, arc AB is an orthogonal line [Def.5].

Moreover, no other orthogonal line can be drawn between A and B, since it would have to be part of another circle that was orthogonal to the disk [Def.5], and therefore such a circle, since it would pass through A, would also pass through a, the inverse of A [Thm.6], so that two distinct circles would pass through A, a, B, which is impossible [Euc.3.10]. Q.E.D.

Corollary to Theorem 11: Two orthogonal lines cannot cut each other more than once.
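The construction in Theorem 11 can be followed step by step in coordinates. This is a minimal sketch under stated assumptions: the disk is taken to be the unit circle centered at the origin, A is not the center O, and B does not lie on line OA (in those cases the orthogonal line is straight). The names `circumcircle` and `orthogonal_circle` are hypothetical.

```python
import math

def circumcircle(A, B, C):
    """Center and radius of the circle through three non-collinear points
    (the circle Euc.4.5 describes about a triangle)."""
    (ax, ay), (bx, by), (cx, cy) = A, B, C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def orthogonal_circle(A, B):
    """Circle through A, B and a, where a is the inverse of A in the unit circle."""
    a2 = A[0] ** 2 + A[1] ** 2
    a = (A[0] / a2, A[1] / a2)  # inverse of A: O, A, a collinear, OA * Oa = 1
    return circumcircle(A, a, B)

center, r = orthogonal_circle((0.3, 0.2), (-0.1, 0.5))
# Two circles cut at right angles exactly when the square of the distance
# between their centers equals the sum of the squares of their radii.
gap = center[0] ** 2 + center[1] ** 2 - (r * r + 1.0)
```

Here `gap` comes out as zero up to rounding, so the constructed circle is orthogonal to the disk boundary, as the proof asserts.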


THEOREM 12

From a given point in the disk, to draw an orthogonal line at right angles to a given orthogonal line.

Let AB be the given orthogonal line in the disk with center O (and let C be the center of the circle of which AB is a part), and let R be the given point. Thus it is required to draw an orthogonal line through R that is at right angles to arc AB.

First, let R not lie on AB.

Join RO, and extend it either way to meet the disk boundary in D, E. Join RC, and extend it either way to meet the circumference of circle C in L, N. Extend DRE to K so that K is the inverse of R with respect to the disk. Cut LN at T so that T is the inverse of R with respect to circle C. Draw a circle through points K, R, T [Euc.4.5], cutting AB at V.

Since this circle passes through K, R, which are inverses with respect to the disk, therefore this circle is orthogonal to the disk [Thm.6], and therefore RV is an orthogonal line [Def.5]. Since circle KRT passes through R, T, which are inverses with respect to circle C, thus circle KRT is also at right angles to circle C [Thm.6], and thus to AB. Therefore RV is an orthogonal line through the given point R and drawn at right angles to the given orthogonal line AB. Q.E.F.


Next, let R lie on AB itself.

Then join QP, the ideal points of AB, and extend that line till it intersects the tangent to circle C from point R, say at G.

Thus $GR^2 = QG \cdot GP$ [Euc.3.36]

So P, Q are inverses with respect to the circle with center G, radius GR (or “circle G”) [Def.1].

And circles C and O each pass through P, Q, and therefore circles C and O are each orthogonal to circle G [Thm.6].

Take any point W in the disk and on circle G besides point R, and WR is an orthogonal line, drawn through the given point R, and at right angles to the given orthogonal line AB. Q.E.F.

Corollary to Theorem 12: The center of any circle orthogonal to two circles cutting at two points must lie along the straight line joining those two points.

For let two circles O and C cut at P and Q, and let circle G be orthogonal to both circles. Then center G must lie on PQ extended. For, G must lie outside both circles, since the circle with center G is orthogonal to both. Now if we draw GR tangent to circle C, then draw GP through to point X where GP again meets circle C, we have

$GR^2 = XG \cdot GP$ [Euc.3.36]

Again, we can draw the tangent from G to circle O, and this will be another radius of circle G, hence equal to GR, and so if we join GP and extend this to Z where GP again meets circle O, we have

$GR^2 = ZG \cdot GP$ [Euc.3.36]

Consequently GX = GZ, and yet X and Z both lie on GP extended, and therefore X and Z are the same point, hence a point of intersection of circles O and C. But they intersect only at Q, other than P. Therefore GP extended passes through Q, and so G, the center of circle G, lies on PQ.


THEOREM 13

To extend an orthogonal line to as great a logarithmic distance as you please.

Let AB be a logarithmic line in our disk, $n$ a given positive integer, and let it be required to construct a logarithmic line starting from A, lying in the same orientation as AB, but with a logarithmic length that is $n$ times that of AB.

Let P, Q be the ideal points of AB. First form the quantity

$$\left(\frac{PA}{AQ}\right)^{n-1}\left(\frac{QB}{BP}\right)^{n}$$

and cut arc PQ at C so that

$$\left(\frac{PA}{AQ}\right)^{n-1}\left(\frac{QB}{BP}\right)^{n} = \frac{QC}{CP}$$

which we learned how to do from Pappus at the beginning of the junior math tutorial, that is, we learned how to inflect chords QC, CP in a given segment of a circle (in this case segment PABQ on base PQ) and in a given ratio, which is in this case the ratio of the quantity on the left side of the equation to 1.

Now $\text{loglength of } AB = \ln\left[\frac{PA}{AQ} \cdot \frac{QB}{BP}\right]$ [Def.6]

so $n\,(\text{loglength of } AB) = n \ln\left[\frac{PA}{AQ} \cdot \frac{QB}{BP}\right]$

so $n\,(\text{loglength of } AB) = \ln\left[\left(\frac{PA}{AQ}\right)^{n}\left(\frac{QB}{BP}\right)^{n}\right]$ [in general, $n \ln x = \ln x^n$]

so $n\,(\text{loglength of } AB) = \ln\left[\left(\frac{PA}{AQ}\right)\left(\frac{PA}{AQ}\right)^{n-1}\left(\frac{QB}{BP}\right)^{n}\right]$ [algebra]

And so, substituting our construction in the place of $\left(\frac{PA}{AQ}\right)^{n-1}\left(\frac{QB}{BP}\right)^{n}$, we get

$$n\,(\text{loglength of } AB) = \ln\left[\left(\frac{PA}{AQ}\right)\left(\frac{QC}{CP}\right)\right]$$

or $n\,(\text{loglength of } AB) = \text{loglength of } AC$ [Def.6] Q.E.F.
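A small worked example may make the construction concrete. The sketch below treats only the special case in which the orthogonal line is part of a diameter, with P at −1 and Q at +1 on a number line and A, B, C at signed positions a, b, c; this coordinate setup and the function names are assumptions for illustration, and the algebraic solution for c stands in for the Pappus construction used in the text.

```python
import math

def loglength_on_diameter(a, b):
    """ln[(PA/AQ) * (QB/BP)] with P = -1, Q = +1 and A, B at positions a, b."""
    return math.log(((1 + a) / (1 - a)) * ((1 - b) / (1 + b)))

def extend_on_diameter(a, b, n):
    """Position c of the point C for which QC : CP equals the quantity
    (PA/AQ)^(n-1) * (QB/BP)^n formed in Theorem 13."""
    PA, AQ = 1 + a, 1 - a
    QB, BP = 1 - b, 1 + b
    k = (PA / AQ) ** (n - 1) * (QB / BP) ** n
    # QC / CP = (1 - c) / (1 + c) = k, solved for c.
    return (1 - k) / (1 + k)

c = extend_on_diameter(0.0, 0.5, 3)
```

With A at the center, B halfway to the boundary, and n = 3, the required ratio QC : CP is 1 : 27, which places C at 13/14 of the radius. The loglength of AC is then ln(1/27), three times the loglength ln(1/3) of AB, while C still lies inside the disk.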


THEOREM 14

The centers of circles orthogonal to a given circle and passing through a given point inside it are collinear.

Let C be the center of the given circle, P the given point inside it. Extend CP to Q so that P, Q are inverses of each other with respect to circle C. Bisect PQ at R. Set up the straight line L perpendicular to CPQ at R.

The centers of all circles orthogonal to C and passing through P must lie on L. For:

Any circle orthogonal to C and passing through P will be such that CP extended will meet this circle in a point that is the inverse of P with respect to C [Thm.6]. But Q is the inverse of P with respect to C [construction]. So any circle orthogonal to C and passing through P will also pass through Q. Thus PQ will be a chord of any circle orthogonal to C and passing through P. But line L, being perpendicular to chord PQ at its midpoint, passes through the center of any circle in which PQ is a chord [Euc.3.1]. Therefore every circle orthogonal to C and passing through P has its center on L, that is, the centers of all such circles are collinear. Q.E.D.
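The line L of Theorem 14 can be checked numerically. In this sketch (an illustration only) the given circle C is assumed to be the unit circle centered at the origin, and the names `center_line` and `is_orthogonal_through` are hypothetical. It tests several centers on L, and one center off L for contrast.

```python
def center_line(P):
    """Point R (midpoint of P and its inverse Q) and a direction vector
    for the line L perpendicular to CPQ at R. Circle C is the unit circle."""
    p2 = P[0] ** 2 + P[1] ** 2
    Q = (P[0] / p2, P[1] / p2)            # inverse of P in the unit circle
    R = ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)
    direction = (-P[1], P[0])             # perpendicular to CP
    return R, direction

def is_orthogonal_through(P, center, tol=1e-9):
    """True if the circle with this center passing through P cuts the unit
    circle at right angles (center distance^2 = radius^2 + 1)."""
    r2 = (center[0] - P[0]) ** 2 + (center[1] - P[1]) ** 2
    return abs(center[0] ** 2 + center[1] ** 2 - (r2 + 1.0)) < tol

P = (0.4, 0.3)
R, d = center_line(P)
on_line = [(R[0] + t * d[0], R[1] + t * d[1]) for t in (-2.0, -0.5, 0.0, 1.0, 3.0)]
off_line = (R[0] + 0.1 * P[0], R[1] + 0.1 * P[1])
results = [is_orthogonal_through(P, c) for c in on_line]
```

Every center taken on L yields a circle through P orthogonal to circle C, while the center moved off L along CP does not.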


Corollary to Theorem 14: Conversely, if a first circle is orthogonal to a second one, and from the center of the first a chord is drawn through the second, and the perpendicular is drawn at the midpoint of this chord, then all circles with centers on this perpendicular and passing through that endpoint of the chord that lies inside the first circle will be orthogonal to that circle.

Let the first circle have center C, and from C draw CPQ, giving chord PQ in the second circle that is orthogonal to circle C. Bisect PQ at R, and draw RL at right angles to PQ. Then if we choose any point on RL, and draw the circle with that center and passing through P, such a circle will be orthogonal to circle C. For by symmetry such a circle will also pass through Q, and so will pass through P, Q, which are inverses with respect to circle C [Thm.6], and so it will be orthogonal to circle C [Thm.6].


THEOREM 15

About a given point in the disk, and with a given logarithmic length, to describe a circle so that every logarithmic line between the given point and the circle’s circumference has the given logarithmic length.

Let A be the given point in our disk of center O, AB a logarithmic line of the given logarithmic length. Thus it is required to construct a circle in the disk such that all logarithmic lines cut off between point A and the circumference of the circle have the same logarithmic length that AB has.

Let P, Q be the ideal points of AB. Complete the circle of which PABQ is an arc. Let OA extended cut this circle again at A’. Bisect AA’ at M and draw ML at right angles to AA’. At B draw the tangent to circle PABQ, cutting OF at point K. Draw the circle with center K, radius KB (“circle K”).

Then this will be the required circle. That is to say, if we choose T at random along its circumference, and draw the orthogonal line through A, T (with ideal points C, D), then

loglength of AT = loglength of AB

For:

Join TB, and extend it till it cuts ML, say at Z. (If TB is parallel to ML, then AT and AB are symmetrical about OKA, and therefore will obviously have the same loglength.) Join ZA. Using this as radius, Z as center, describe circle Z, cutting the disk boundary at points G and H.

Since PABQA’ is orthogonal to disk O [given], and A’, A, O are collinear points [construction], therefore A and A’ are inverses with respect to circle O [Thm.6].

Since circle Z passes through A, A’, inverses with respect to circle O, therefore circle Z is orthogonal to the disk [Thm.6].

Since BK is tangent to circle PABQ [construction], and KB is a radius of circle K, therefore circle K is orthogonal to circle PABQ [Def.2].


Now the center of PABQ must lie on line ML, since M is a midpoint of chord AA’, and ML is at right angles to AA’ [Euc.3.1].

And since circle PABQ is orthogonal to circle K [proved above], and ML is perpendicular to the midpoint of chord AA’ in circle PABQ, therefore any circle passing through A and with its center on ML will also be orthogonal to circle K [Cor.Thm.14]. Therefore circle Z, or GAH, is orthogonal to circle K.

Now, using circle GAH as a circle of inversion,

point A inverts to itself [Thm.4],

point T inverts to point B, since circles K and Z are orthogonal, and points Z, B, T are collinear [Thm.6].

Since circle Z is orthogonal to the disk [shown above], therefore the disk inverts to itself [Cor.Thm.6]. And since circle CATD is orthogonal to the disk, and angles are preserved in inversion [Thm.10], therefore circle CATD inverts to another circle [Thm.8] orthogonal to the disk, and passing through A, B, the inverses of A, T. But circle PABQ is a circle orthogonal to the disk and passing through A, B, and it is the only one [Thm.11], and therefore circle CATD inverts to circle PABQ, and in particular arc AT to arc AB. But loglength is preserved under inversion [Def.6, Thm.9].

So loglength of AT = loglength of AB Q.E.D.


Question: What about AR? It is straight. Is its loglength equal to that of AB? It is.

Let J be the center of circle PABQ, which must lie on ML. As J goes out to L, and the radius JA grows infinitely large, point P gets as near as we please to F, point B gets as near as we please to R, point Q gets as near as we please to V, and so

$$\lim_{AJ \to \infty} PA = FA$$

$$\lim_{AJ \to \infty} AQ = AV$$

$$\lim_{AJ \to \infty} QB = VR$$

$$\lim_{AJ \to \infty} BP = RF$$

Therefore

$$\lim_{AJ \to \infty} \ln\left[\frac{PA}{AQ} \cdot \frac{QB}{BP}\right] = \ln\left[\frac{FA}{AV} \cdot \frac{VR}{RF}\right]$$

But the expression whose limit we are taking on the left is in fact a constant, since it is the loglength of AB as AJ goes to infinity, and the loglength of AB does not change as AJ increases (as we showed in the proof above). So the left side is nothing else than the loglength of the particular AB we were given. But the right side is by definition the loglength of AR.

Hence loglength of AB = loglength of AR


THEOREM 16

Every figure contained by three sides, each of which is either an orthogonal line or an ideal line, has an angle sum of less than two right angles.

Let ABC be such a triangle, its three sides being arcs of circles D, E, F, each orthogonal to the disk with center O, and points A, B, C lying either in the disk or on the disk boundary. Then the angle sum of ABC will be less than two right angles, which is proved as follows.

Refer to the point where circles D and E meet outside the disk as V. Using V as center, draw a circle (“circle V”) that cuts both circles D and E, say at G, K (on D) and at H, L (on E).

Now use circle V as a circle of inversion.

Since circle D passes through V, therefore circle D inverts to the straight line through G and K [Thm.7]. Likewise circle E inverts to the straight line through H and L.

Join VA, cutting HL at a, which must therefore be the inverse of A. Join VB, cutting GK at b, which must therefore be the inverse of B. Also, since A lies both on circle D and on circle E, therefore a, its inverse, lies on both their inverses, and therefore it is the intersection of HL, GK. Join VC, cutting HL at c, which must therefore be the inverse of C.

Now ab, ac are the inversions of arcs AB, AC, since circles D and E invert to straight lines GK and HL, on which lie ab and ac, and A, B, C invert to a, b, c.

But BC is on circle F, which does not pass through point V, the center of inversion, and therefore circle F inverts to another circle not through V [Thm.8], namely the circle through b, c (and the two points T, N where circle F intersects the circle of inversion, circle V, if it does intersect that circle). Hence the mixed figure abc is the inversion of triangle ABC.

Now arc BC is convex toward point A, and so if we joined chord BC and drew the tangents at B and C to arc BC, the resulting angles would lie on the side of BC that point A is on. And angles are preserved under inversion [Thm.10]. Therefore, if we join bc, and draw tangents to arc cb at c and b, these will lie on the side of bc that a is on. Therefore the angles of the mixed figure at b and c are less than those of the rectilineal triangle abc, and hence they are less than two rights. But the angle sum of mixed figure abc is equal to that of ABC [Thm.10], and therefore the angle sum of figure ABC is less than two rights. Q.E.D.
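Theorem 16 can be illustrated numerically by measuring the angles between tangents at the vertices of a triangle of orthogonal lines. The sketch assumes the disk is the unit circle centered at the origin and that no side of the triangle is collinear with O (so that every side is a genuine circular arc); the function names and the sample triangle are hypothetical.

```python
import math

def orth_center(A, B):
    """Center of the circle through A and B orthogonal to the unit circle.
    Assumes A, B, O are not collinear."""
    det = A[0] * B[1] - A[1] * B[0]
    ka = (A[0] ** 2 + A[1] ** 2 + 1) / 2
    kb = (B[0] ** 2 + B[1] ** 2 + 1) / 2
    return ((ka * B[1] - kb * A[1]) / det, (A[0] * kb - B[0] * ka) / det)

def tangent_at(X, Y):
    """Unit tangent at X to the orthogonal line XY, pointing toward Y."""
    c = orth_center(X, Y)
    tx, ty = -(X[1] - c[1]), X[0] - c[0]      # perpendicular to radius cX
    if tx * (Y[0] - X[0]) + ty * (Y[1] - X[1]) < 0:
        tx, ty = -tx, -ty                     # pick the direction toward Y
    n = math.hypot(tx, ty)
    return tx / n, ty / n

def angle_at(X, Y, Z):
    """Curvilinear angle at X between orthogonal lines XY and XZ."""
    u, v = tangent_at(X, Y), tangent_at(X, Z)
    dot = max(-1.0, min(1.0, u[0] * v[0] + u[1] * v[1]))
    return math.acos(dot)

def angle_sum(A, B, C):
    return angle_at(A, B, C) + angle_at(B, C, A) + angle_at(C, A, B)

total = angle_sum((0.5, 0.1), (-0.3, 0.4), (0.1, -0.5))
```

For the sample triangle, `total` falls short of π radians (two right angles), in agreement with the theorem.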


THEOREM 17

If, from any point on one orthogonal line that is orthogonal to another, the ideal lines are drawn to the ideal points of the other, these lines make equal angles with the perpendicular.

In the disk with center O, let DM be an orthogonal line orthogonal to another orthogonal line PMQ with ideal points P, Q, and let the ideal lines through D be drawn to P and Q as DP, DQ. Then will the curvilinear angles PDM and QDM be equal.

For, let A be the center of the circle of which DP is an arc, let B be the center of the circle of which DQ is an arc, let E be the center of the circle of which PMQ is an arc, and let QP extended intersect AB at point C.

Then C is the center of the circle of which DM is an arc. For:

• The center of the circle containing arc DM must lie on AB. For circles A and B are both orthogonal to disk O [given], and both pass through D, and so all circles orthogonal to the disk and passing through D must have their centers on AB [Thm.14]. And the circle containing DM is a circle orthogonal to the disk and passing through D [given]. Hence the center of the circle containing arc DM lies on AB.

• The center of the circle containing arc DM must lie on PQ. For the circle containing DM is orthogonal to circles O and E [given], and these cut each other at P, Q, and the center of any circle orthogonal to two cutting circles must lie on the straight line joining the points where they cut [Cor.Thm.12]. Hence the center of the circle containing arc DM lies on PQ.


• Since the center of the circle containing arc DM must lie on AB, but also on PQ, therefore it lies at their intersection, which is point C.

So C is the center of the circle containing arc DM.

Now, since circles A and E both pass through P and are both orthogonal to the disk at that point, therefore they are tangent to each other there, and therefore A, P, E are collinear points. Similarly, B, E, Q are collinear points.

Consequently, ACB, BEQ, QPC define a Menelaos figure. In such a figure, Ptolemy shows that

$$\frac{AC}{CB} = \frac{AP}{EP} \cdot \frac{QE}{QB}$$ [Almagest 1.13]

so $$\frac{AC}{CB} = \frac{AP}{QB} \cdot \frac{QE}{EP}$$

But QE = EP [radii of circle E]

so $$\frac{AC}{CB} = \frac{AP}{QB}$$

and AP = AD [radii of circle A]

and QB = BD [radii of circle B]

so $$\frac{AC}{CB} = \frac{AD}{BD}$$

and therefore CD bisects angle ADB [Euc.6.3]. But AD, BD, being radii of circles A and B, are at right angles to the tangents to these circles at point D [Euc.3.16]. Therefore CD bisects also the angle between these tangents.

Hence the straight line through D at right angles to CD bisects the adjacent angle between those tangents. But the straight line through D at right angles to CD is the tangent to circle C at point D. Therefore the tangent to circle C at D, that is, the tangent to arc DM at D, bisects the angle between the tangents to circles A and B at D, that is, the tangents to arcs DP and DQ at D.

Thus the angle between the tangents at D to arcs PD and DM is equal to the angle between the tangents at D to arcs QD and DM. That is, curvilinear angles PDM and QDM are equal. Q.E.D.

Note: Sometimes CD bisects angle ADB externally. That is, CD does not pass into the angle ADB, but angle CDA is equal to the angle that CD would make with DB if we extended CD. This does not change the theorem.


THEOREM 18

If two ideal lines share an ideal point, and if from a point on one of them an orthogonal line is drawn at right angles to the other, then:

(1) The two ideal lines do not meet at any point in the disk,

(2) Every other orthogonal line through the point and making a lesser angle with the perpendicular will cut the ideal line to which the perpendicular was drawn,

(3) The angles on the side of the perpendicular on which the ideal lines meet at an ideal point are together less than two rights.

Let TQ and PQ be two ideal lines sharing ideal point Q on the disk boundary.

(1) Then they cannot meet at any point in the disk.

For each is orthogonal to the disk boundary [given and Def.5], so they cut it at right angles at Q, and thus they are tangent to each other at Q, and Q is the only point common to both. But Q lies on the disk boundary, not within the disk, and therefore TQ and PQ do not meet at any point in the disk.

From any point on one of these ideal lines, say D on TQ, draw orthogonal line DM at right angles to the other (i.e., to PQ) [Thm.12]. Let DL be any other orthogonal line through D drawn inside angle MDQ.

(2) Then DL must cut MQ if extended far enough.

For it must exit the triangle DMQ in order to exit the disk, but it cannot exit triangle DMQ through DM or DQ at any point within the disk, since it already cuts each once at D [Cor.Thm.11]. Nor can it exit through Q, since then DQ and DL are both orthogonal to the disk, and so they would both be orthogonal to it at Q, and therefore the circles to which they belong would be tangent to each other there, and would not cut at D as they do. Therefore DL must exit triangle DMQ through some point on MQ between M and Q. So any orthogonal line making an angle MDL that is less than angle MDQ must cut PQ.

(3) Also, angle MDQ plus angle DMQ is less than two rights, since the angle sum of triangle DMQ is less than two rights [Thm.16].

Corollary to Theorem 18: If two orthogonal lines (as DR, MB), on one side of a third orthogonal line cutting them (as DM), make angles (as MDR, DMB) that are together less than two right angles, sometimes they do not meet at any point in the disk, no matter how great a logarithmic length to which we extend them.

For DR, MB never meet except at Q, which is not in the disk, and they can be extended to as great a loglength as we please prior to reaching Q [Thm.13].


THEOREM 19

Given two unequal circles, to find the circle with respect to which they are inverses of one another.

First, let one circle lie wholly inside the other, and not internally tangent to it.

Let the common diameter of the two circles be AbaB, with AB the diameter of the larger circle, and ba that of the smaller. Draw the circles on diameters Aa, bB. Since these diameters overlap, these circles must cut, say at P and Q. Join PQ, cutting AB at V.

Now invert circle AB with respect to the circle of center V, radius VP (“circle V”). Since circle AB does not pass through V, it must invert to another circle not through V, and whose center lies along AVB [Thm.8]. And thanks to the semicircles on Aa, bB, we know that

$$PV^2 = VA \cdot aV = Vb \cdot BV$$

And therefore A, a are inverses with respect to circle V, and B, b are also [Def.1], and therefore the points a, b lie on the circle that is the inverse of circle AB with respect to circle V. But the only circle that has its center on AVB and passes through points a, b is the given circle on diameter ab. Therefore we have found the circle that inverts circles AB and ab to one another.

Next, let the given circles either cut one another or lie wholly outside each other (even if externally tangent).

Let the centers of AB and ba be called C and c respectively. On circle AB choose point R at random, and draw cr parallel to CR. Join Rr and extend it to where it cuts AB at V.

Draw Vt tangent to circle ab. Draw CT at right angles to Vt.


Then triangle CTV is similar to triangle ctV

so CT : ct = CV : cV

but CV : cV = CR : cr

so CT : ct = CR : cr

but ct = cr [radii of circle c]

so CT = CR

and thus point T lies on circle AB, and angle CTV was made right, therefore the straight line VtT is tangent to both circles, not just to circle ab.

Now cut VT at L so that

$$VL^2 = VT \cdot tV$$

and draw circle V, that is, the circle of center V, radius VL.

Since circle AB does not pass through V, therefore its inversion with respect to circle V is another circle not through V but whose center lies on ABV [Thm.8].

Also, since $VL^2 = VT \cdot tV$, therefore T, t are inverses of each other with respect to circle V [Def.1], and therefore t lies on the inverse of circle AB with respect to circle V. But VT inverts to itself since it passes through V [Thm.5], and since it is tangent to circle AB it is also tangent to the circle that is the inverse of circle AB [Thm.10]. Therefore the inverse of circle AB with respect to circle V is a circle with its center on ABV, and tangent to TV at t. But the only circle like that is the given circle ab. Therefore, with respect to the constructed circle V, the given circle ab is the inverse of the given circle AB. Q.E.F.

QUESTIONS:

(1) Are the points R and r inverses with respect to circle V?
(2) What if the given circles are equal?
(3) What if the given circles are externally tangent?
(4) What if the given circles are internally tangent?
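The second case of Theorem 19 (circles lying outside one another) can be tried out numerically. The sketch below is illustrative only: the function names are hypothetical, inversion is assumed to follow the usual rule VX ∙ VX' = k², and V is computed in coordinates as the point that divides the line of centers externally in the ratio of the radii, which is where the line Rr through the ends of parallel radii meets that line.

```python
import math

def dist(X, Y):
    return math.hypot(X[0] - Y[0], X[1] - Y[1])

def invert(X, V, k):
    """Inverse of X in the circle of center V, radius k (VX * VX' = k^2)."""
    dx, dy = X[0] - V[0], X[1] - V[1]
    d2 = dx * dx + dy * dy
    return (V[0] + k * k * dx / d2, V[1] + k * k * dy / d2)

def inversion_circle_external(C, R, c, r):
    """Center V and radius VL for circles (C, R) and (c, r), R != r,
    lying outside one another. VL^2 = VT * Vt, with VT, Vt the tangents."""
    V = ((R * c[0] - r * C[0]) / (R - r), (R * c[1] - r * C[1]) / (R - r))
    VT = math.sqrt(dist(V, C) ** 2 - R * R)   # tangent from V to circle AB
    Vt = math.sqrt(dist(V, c) ** 2 - r * r)   # tangent from V to circle ab
    return V, math.sqrt(VT * Vt)

C, R, c, r = (0.0, 0.0), 2.0, (5.0, 0.0), 1.0
V, k = inversion_circle_external(C, R, c, r)
# Invert a sample of points of circle AB and measure their distance from c.
images = [invert((C[0] + R * math.cos(t), C[1] + R * math.sin(t)), V, k)
          for t in (0.0, 0.7, 1.9, 3.1, 4.4, 5.8)]
errors = [abs(dist(X, c) - r) for X in images]
```

For these sample circles V is the point (10, 0) and VL² = 48; every inverted point lies at distance 1 from c, that is, on circle ab.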


THEOREM 20

If two triangles, each contained by orthogonal lines, have two sides in one equal in logarithmic length to two sides in the other and also the included angles equal, then the base will also be equal in loglength to the base, and the remaining corresponding angles will be equal.

Isn’t it beautiful and remarkable that this analogue to “side-angle-side” for rectilineal triangles works for curvilinear triangles contained by circular arcs orthogonal to the disk? As with the preservation of angles in inversion, we will prove this theorem in three steps. First we will prove it in an elementary case, then in a more sophisticated case, and then universally.

CASE 1

Let’s begin with an elementary case. Consider any ideal line Pp, midpoint M. On it, take any equal arcs MA, Ma, AB, ab, and through them draw symmetrical ideal lines, forming symmetrical triangles ABC and abc. Clearly the corresponding angles of these figures are equal by the symmetry. And the corresponding arcs and chords are equal in ordinary Euclidean length. Consequently the corresponding arcs are equal in loglength, too. For example,

$$\text{loglength of } AB = \ln\left[\frac{PB}{Bp} \cdot \frac{pA}{AP}\right]$$

and $$\text{loglength of } ab = \ln\left[\frac{Pa}{ap} \cdot \frac{pb}{bP}\right]$$

But PB = pb and Bp = bP

and pA = Pa and AP = ap

so loglength of AB = loglength of ab

Therefore figures ABC and abc have all their corresponding angles equal, and all their corresponding sides of equal loglength. And this must be true for any symmetrically drawn triangles contained by orthogonal lines.


CASE 2

On ideal line PQ let an arc AB be taken, and let any triangle ABC, formed by orthogonal lines, be constructed on it. And let another arc ab be taken on PQ, of the same loglength as AB, and on it let triangle abc be constructed of orthogonal lines, and such that

loglength of ac = loglength of AC

and ∠cab = ∠CAB

Also, let abc be situated so that b and B lie between a and A. In other words, we are now looking at the case when the two triangles having the “side-angle-side” relation have an ideal line in common passing through one pair of corresponding sides. Clearly these triangles are not congruent in the ordinary sense. But will it still be true that the remaining sides that correspond, namely BC and bc, have the same loglength, and will the remaining corresponding angles be equal?

Join Aa, Cc, and let them intersect at V. With center V and radius R such that R² = AV ∙ Va, describe a circle (“circle V”).

Since circle PabBAQ passes through A, a, which are inverses of one another with respect to circle V [construction, Def.1], therefore circle PabBAQ is orthogonal to circle V [Thm.6], and thus circle PabBAQ inverts to itself [Cor.Thm.6].

And since points a, A are inverses of each other, and arcs aP and AQ have equal loglengths (each is infinite), and since loglength is preserved under inversion [Def.6, Thm.9], therefore points P, Q are inverses of each other with respect to circle V (thus points Q, P, V are collinear).

Now the disk boundary passes through points P, Q, inverses with respect to circle V, and therefore the disk boundary is orthogonal to circle V [Thm.6], and therefore the disk boundary inverts to itself with respect to circle V [Cor.Thm.6].

[Figure: triangles ABC and abc on the common ideal line PQ, with V the center of inversion.]


Again, since A, a are inverses of one another, and since circle PabBAQ inverts to itself, and since loglength is preserved under inversion [Def.6, Thm.9], therefore the arc to which ab inverts lies on PQ, begins at A, and has the same loglength as ab. But that defines arc AB. Hence points B, b are inverses of each other with respect to circle V (so B, b, V are collinear), and arcs AB, ab are inversions of one another.

And since angles are preserved under inversion [Thm.10], therefore the circular arc to which arc ac inverts [Thm.8] passes through A and makes the same angle with AB that ac makes with ab, and it is orthogonal to the disk (since ac is orthogonal to the disk, hence the inversion of ac is orthogonal to the inversion of the disk [Thm.10], that is, to the disk itself, since the disk inverts to itself, as shown above).

Now circle AC is the only circle that passes through A, that is orthogonal to the disk, and that makes ∠BAC = ∠bac. Therefore circle AC is the circle to which circle ac inverts with respect to circle V.

And since loglength of ac = loglength of AC [given], therefore points C, c are inverses of each other [Def.6, Thm.9], and thus arcs AC and ac are inversions of one another.

Since B, b are inverses, and so are C, c, and since bc is orthogonal to the disk [given], therefore the circle that is the inverse of circle bc [Thm.8] passes through B, C and is orthogonal to the disk [since the disk inverts to itself, as shown above, and since angles are preserved under inversion]. But there is only one orthogonal arc through B, C [Thm.11]. Therefore the given arc BC is the inversion of arc bc with respect to circle V.

In sum, arcs ab, ac, bc are respectively the inversions of arcs AB, AC, BC with respect to circle V. And since loglengths and angles are preserved under inversion, therefore

loglength of bc = loglength of BC

and ∠abc = ∠ABC

and ∠acb = ∠ACB

which proves the theorem for the present case. That is, we may now say that

When two triangles have the “side-angle-side” relation, have an ideal line in common passing through one pair of corresponding sides, and face opposite ways, then the remaining sides also have the same loglength, and the remaining corresponding angles are equal.
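The fact this case leans on most heavily, that loglength is preserved under inversion [Def.6, Thm.9], can be illustrated numerically. The sketch below is only an illustration under assumed coordinates: a unit disk, one particular ideal line, and an arbitrarily chosen circle of inversion orthogonal to the disk boundary (not the circle V constructed from two triangles in the proof). The helper names are hypothetical.

```python
import math

def chord(u, v):
    """Ordinary Euclidean distance between two points (x, y)."""
    return math.hypot(u[0] - v[0], u[1] - v[1])

def loglength(P, p, A, B):
    """ln[(PB/Bp) * (pA/AP)] for arc AB on the ideal line with ideal points P, p."""
    return math.log((chord(P, B) / chord(B, p)) * (chord(p, A) / chord(A, P)))

def invert(X, V, R2):
    """Inverse of point X in the circle of center V and squared radius R2."""
    dx, dy = X[0] - V[0], X[1] - V[1]
    k = R2 / (dx * dx + dy * dy)
    return (V[0] + k * dx, V[1] + k * dy)

# Assumed ideal line: circle of center (0, c), radius r, orthogonal to the unit disk.
c = 2.0
r = math.sqrt(c * c - 1.0)

def on_line(t):
    """Point of the ideal line at angle t from its midpoint."""
    return (r * math.sin(t), c - r * math.cos(t))

t_ideal = math.asin(1.0 / c)
P, p = on_line(-t_ideal), on_line(t_ideal)
A, B = on_line(-0.2), on_line(-0.4)

# Assumed circle of inversion, orthogonal to the disk boundary: |V|^2 = 1 + R^2.
V = (3.0, 0.0)
R2 = V[0] ** 2 + V[1] ** 2 - 1.0

P2, p2, A2, B2 = (invert(X, V, R2) for X in (P, p, A, B))

before = loglength(P, p, A, B)
after = loglength(P2, p2, A2, B2)
print(before, after)                   # equal up to rounding
print(math.hypot(P2[0], P2[1]))        # the image of an ideal point is still on the boundary
```

Because the circle of inversion is orthogonal to the disk boundary, the boundary inverts to itself, so the images of P and p are again ideal points, as the proof requires.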


UNIVERSAL PROOF

Now let’s prove the theorem universally. Let ABC (depicted in the figure) and αβγ (left undepicted) be any two triangles in the disk that are formed by orthogonal lines such that AB, AC, BC lie respectively on ideal lines PQ, RS, TY, and

loglength of AB = loglength of αβ

loglength of AC = loglength of αγ

and ∠BAC = ∠βαγ

Then will

loglength of BC = loglength of βγ

and ∠ABC = ∠αβγ

and ∠BCA = ∠βγα

For, take any ideal line pq that is part of a circle unequal to the circles of which AB and αβ are parts, and not internally tangent to either of them.

Construct the circle, center V, that is the inversion circle for the circles of which PQ, pq are parts [Thm.19]. Join VA, VB, cutting pq at a, b. Thus A, a are inverses, and so are B, b, and thus arc ab is the inversion of arc AB, and therefore loglength of ab = loglength of AB [Def.6, Thm.9].

Let rs, ty be the inversions of RS, TY with respect to circle V. Since RS, TY pass through A, B and intersect at C, thus rs, ty pass through a, b and intersect at some point that is the inverse of C, say c. And since RS, TY are circles not through V, therefore rs, ty are also circles not through V [Thm.8].

And since circles PQ, pq are inversions of each other, and points A, a are inverses of each other, and since

loglength of AP = loglength of ap [both are infinite]

and loglength of AQ = loglength of aq [both are infinite]

and since inversion must preserve loglength [Def.6, Thm.9], therefore P, p are inverses, and so are Q, q (so P, p, V are collinear, even if the diagram does not make them look like it, and so are Q, q, V).

[Figure: triangle ABC on ideal lines PQ, RS, TY, and its inversion abc on ideal lines pq, rs, ty, with V the center of inversion.]


Since the disk boundary passes through inverse points (such as P, p), therefore it is orthogonal to circle V [Thm.6], and therefore the disk boundary inverts to itself [Cor.Thm.6]. But angles are preserved in inversion [Thm.10], and RS, TY are orthogonal to the disk boundary, therefore rs, ty, their inversions, are orthogonal to the inversion of the disk boundary, that is, to the disk boundary itself, and so they are both orthogonal lines.

Consequently figure abc is another figure contained by orthogonal lines, and it is the inversion of figure ABC. Therefore its angles are equal to the corresponding angles of ABC [Thm.10], and its sides have loglengths equal to those of the corresponding sides of ABC [Def.6, Thm.9].

Next, using the same ideal line pq, construct a circle with center V’ which is the circle of inversion for the circles of which αβ and pq are parts [Thm.19]. Going through similar steps as before, we will construct a triangle a’b’c’ whose angles are all equal to the corresponding ones in triangle αβγ, and whose sides all have loglengths equal to those of the corresponding sides in triangle αβγ, and of which one side, a’b’, lies on ideal line pq.

But these two triangles, abc and a’b’c’, fall under the special case we proved earlier, for

loglength of ab = loglength of a’b’

since, in loglength, ab = AB [construction], AB = αβ [given], αβ = a’b’ [construction], and again

loglength of ac = loglength of a’c’

since, in loglength, ac = AC [construction], AC = αγ [given], αγ = a’c’ [construction], and again

∠bac = ∠b’a’c’

since ∠bac = ∠BAC [construction], ∠BAC = ∠βαγ [given], ∠βαγ = ∠b’a’c’ [construction], and ab, a’b’ lie on the same ideal line pq [construction].

Therefore triangles abc and a’b’c’ have all their corresponding angles equal, and all corresponding sides equal in loglength. Therefore also the triangles of which these triangles are inversions, namely ABC and αβγ, have all of their corresponding angles equal to each other, and all of their corresponding sides equal to each other in loglength. Q.E.D.


Note: If αβγ happens to invert to a triangle a’b’c’ that is facing the wrong way in order to apply our second case to abc and a’b’c’, then we need only recall the elementary case with which we began, which guarantees that the mirror image of a’b’c’, to which the second case does apply in relation to abc, has the same angles and has sides of the same loglength as triangle a’b’c’ itself. This little hiccup is analogous to the one that occurs in Euclid’s fourth proposition of Book 1 of his Elements: if the triangle ABC happens to be the mirror image of DEF, then we cannot simply slide ABC over and superimpose it on DEF; we have to flip it over first.



THE IMPOSSIBILITY OF PROVING EUCLID’S FIFTH POSTULATE

All the hard work is now done. We have only to make a shrewd use of all these theorems of Euclidean geometry in order to prove the impossibility of proving Euclid’s Fifth Postulate from his other principles.

The first step in this proof is to construct a kind of “dictionary” pairing the terms in these theorems to the basic terms in Euclidean geometry. Here is our dictionary:

BASIC TERMS OF EUCLIDEAN PLANE GEOMETRY                          | TERMS OF THE EUCLIDEAN GEOMETRY OF ORTHOGONAL LINES
1. point                                                         | 1. point
2. straight line                                                 | 2. orthogonal line
3. the plane                                                     | 3. the disk
4. length                                                        | 4. loglength
5. circle with C as center of straight radii with equal lengths  | 5. circle with C as center of orthogonal radii with equal loglengths
6. rectilineal angle                                             | 6. angle of tangents to two orthogonal lines at their point of intersection
7. things that coincide                                          | 7. things that invert to one another

The next step is to use this “dictionary” to translate the principles of Euclid other than his Fifth Postulate. We merely write out Euclid’s principles, then replace all the basic terms in them with the corresponding terms in the geometry of orthogonal lines. Here is what we get:


EUCLID’S PRINCIPLES (left column)

(1) It is possible to draw exactly one straight line between any two points in a plane.
(2) It is possible to extend a straight line to as great a length as you please.
(3) It is possible to describe a circle with any given point as center (of equal lengths) and with any given length as radius.
(4) All right angles are equal.
(5) A straight line drawn continuously from one side of another straight line to the other side of it must cut that other straight line.
(6) Two trilateral figures contained by straight lines and having two sides in one equal in length to two sides in the other, and the included angle equal to the included angle, have their remaining sides equal in length and their remaining angles equal.
(7) Things equal to the same thing are also equal to one another.
(8) If equals be added to equals, the wholes are equal.
(9) If equals be subtracted from equals, the remainders are equal.
(10) Lengths or angles that coincide with one another are equal to one another.
(11) The whole is greater than the part.

DISK GEOMETRY TRANSLATIONS (right column)

(1) It is possible to draw exactly one orthogonal line between any two points in the disk. [Thm.11]
(2) It is possible to extend an orthogonal line to as great a loglength as you please. [Thm.13]
(3) It is possible to describe a circle with any given point as center (of equal loglengths) and with any given loglength as radius. [Thm.15]
(4) All right angles are equal.
(5) An orthogonal line drawn continuously from one side of another orthogonal line to the other side of it must cut that other orthogonal line.
(6) Two trilateral figures contained by orthogonal lines and having two sides in one equal in loglength to two sides in the other, and the included angle equal to the included angle, have their remaining sides equal in loglength and their remaining angles equal. [Thm.20]
(7) Things equal to the same thing are also equal to one another.
(8) If equals be added to equals, the wholes are equal.
(9) If equals be subtracted from equals, the remainders are equal.
(10) Loglengths or angles that invert to one another are equal to one another.
(11) The whole is greater than the part.
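The mechanical character of this translation can be shown with a small sketch. It is only a toy: the mapping `DICTIONARY` holds a few of the dictionary's entries, the entries that carry an article ("a straight line", "a plane") are additions of this sketch so that the English grammar comes out right, and the function name `translate` is hypothetical.

```python
import re

# Toy version of the "dictionary". Entries with articles are additions of this
# sketch (not in the original dictionary) so that "a" becomes "an" where needed.
DICTIONARY = {
    "a straight line": "an orthogonal line",
    "straight line": "orthogonal line",
    "a plane": "the disk",
    "the plane": "the disk",
    "length": "loglength",
}

# One pattern, longest terms first, so each term is replaced exactly once.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(term) for term in sorted(DICTIONARY, key=len, reverse=True))
    + ")"
)

def translate(statement):
    """Replace every dictionary term in the statement in a single pass."""
    return _PATTERN.sub(lambda match: DICTIONARY[match.group(0)], statement)

principle_1 = "It is possible to draw exactly one straight line between any two points in a plane."
principle_2 = "It is possible to extend a straight line to as great a length as you please."

print(translate(principle_1))
print(translate(principle_2))
```

Doing the substitution in one pass matters: replacing "length" by "loglength" and then scanning again would find "length" inside "loglength" and translate it twice.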

Notice that all the “translations” of Euclid’s first principles of plane geometry (other than his Fifth Postulate, the only one we left off the list in the left column) are in fact true statements of the geometry of orthogonal lines, whether theorems or self-evident statements.

Now suppose, if possible, that there exists an argument made exclusively from Euclid’s principles in the left column that demonstrates his Fifth Postulate. Since that argument would consist entirely of statements drawn from the left column (and their logical consequences), we could then construct a logically identical argument made out of the corresponding statements in the right column. So if the argument made of statements in the left column were logically sound, then the one in the right column would also have to be logically sound. This means that if the Fifth Postulate follows from the argument made of statements in the left column, then the “translation” of the Fifth Postulate would have to follow from the corresponding argument made of statements in the right column. And since the statements in the right column are all true, this means that the “translation” of the Fifth Postulate, which follows logically from them (on the present hypothesis), would also have to be true.

So what is the “translation” of the Fifth Postulate, in the terms of the geometry of orthogonal lines? Euclid’s Fifth Postulate states that

If two straight lines that are cut by a third make internal angles on one side of it that are together less than two right angles, then the two straight lines must meet in the plane.

It also specifies on what side the lines meet, but we do not need that for our argument. Replacing all the terms in the formula above that occur in our “dictionary,” we get this translation:

If two orthogonal lines that are cut by a third make internal angles on one side of it that are together less than two right angles, then the two orthogonal lines must meet in the disk.

Now we have shown that if Euclid’s other principles can constitute an argument that proves his Fifth Postulate, then this statement about orthogonal lines must be true. But it is not true, since we proved the statement contradicting it in the Corollary to Theorem 18. If Euclid’s Fifth Postulate could be proved from the other principles in Euclid, then the statement above would be true, whereas it is false. Therefore it is impossible to prove Euclid’s Fifth Postulate from the other principles of Euclid. Q.E.D.
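To make the falsity of the translated statement concrete, here is a numerical sketch of one counterexample. Everything about the setup is an assumption of this sketch rather than the construction used in the Corollary to Theorem 18: the disk is the unit disk, the cutting line is the vertical diameter, one orthogonal line is the horizontal diameter (meeting the cutting line at right angles at the center), and the other orthogonal line passes through the point (0, h) making an interior angle theta with the cutting line on the right-hand side. The function name is hypothetical.

```python
import math

def meets_horizontal_diameter(h, theta_deg):
    """True if the orthogonal line through (0, h), making interior angle theta
    with the vertical diameter, meets the horizontal diameter inside the unit disk."""
    theta = math.radians(theta_deg)
    # The line lies on a circle orthogonal to the unit circle. Its center sits at
    # distance s from (0, h) along the normal (cos theta, sin theta), and the
    # orthogonality condition |center|^2 = 1 + s^2 fixes s, which is also the radius.
    s = (1.0 - h * h) / (2.0 * h * math.sin(theta))
    center_y = h + s * math.sin(theta)
    # The circle reaches the line y = 0 only if its radius exceeds the height of
    # its center; when it does, exactly one crossing falls inside the disk.
    return s > center_y

h, theta = 0.5, 60.0
angle_sum = 90.0 + theta                      # interior angles on the right-hand side
print(angle_sum)                              # less than 180 degrees
print(meets_horizontal_diameter(h, theta))    # yet the two orthogonal lines do not meet
print(meets_horizontal_diameter(h, 30.0))     # a smaller angle does produce a meeting
```

With h = 0.5 and theta = 60 degrees the interior angles sum to 150 degrees, less than two right angles, and still the two orthogonal lines have no common point in the disk.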


THE CONSISTENCY OF LOBACHEVSKY’S GEOMETRY

We can similarly prove that Lobachevsky’s geometry does not contradict itself, by means of the following dictionary:

LOBACHEVSKIAN TERMS                                              | EUCLIDEAN TRANSLATIONS
1. point                                                         | 1. point
2. straight line                                                 | 2. orthogonal line
3. the plane                                                     | 3. the disk
4. length                                                        | 4. loglength
5. circle with C as center of straight radii with equal lengths  | 5. circle with C as center of orthogonal radii with equal loglengths
6. rectilineal angle                                             | 6. angle of tangents to two orthogonal lines at their point of intersection
7. things that coincide                                          | 7. things that invert to one another

Translating the principles of Lobachevsky’s hyperbolic geometry into corresponding Euclidean statements, we get:


LOBACHEVSKIAN PRINCIPLES (left column)

(1) It is possible to draw exactly one straight line between any two points.
(2) It is possible to extend a straight line to as great a length as you please.
(3) It is possible to describe a circle with any given point as center (of equal lengths) and with any given length as radius.
(4) All right angles are equal.
(5) If a straight line meets another at right angles, from any point on the perpendicular there is exactly one first straight line not meeting the other toward one side, and these three straight lines together make angles less than two right angles on that side.
(6) A straight line drawn continuously from one side of another straight line to the other side of it must cut that other straight line.
(7) Two trilateral figures contained by straight lines and having two sides in one equal in length to two sides in the other, and the included angle equal to the included angle, have their remaining sides equal in length and their remaining angles equal.
(8) Things equal to the same thing are also equal to one another.
(9) If equals be added to equals, the wholes are equal.
(10) If equals be subtracted from equals, the remainders are equal.
(11) Lengths or angles that coincide with one another are equal to one another.
(12) The whole is greater than the part.

EUCLIDEAN TRANSLATIONS (right column)

(1) It is possible to draw exactly one orthogonal line between any two points in the disk. [Thm.11]
(2) It is possible to extend an orthogonal line to as great a loglength as you please. [Thm.13]
(3) It is possible to describe a circle with any given point as center (of equal loglengths) and with any given loglength as radius. [Thm.15]
(4) All right angles are equal.
(5) If an orthogonal line meets another at right angles, from any point on the perpendicular there is exactly one first orthogonal line not meeting the other toward one side, and these three orthogonal lines together make angles less than two right angles on that side. [Thm.18]
(6) An orthogonal line drawn continuously from one side of another orthogonal line to the other side of it must cut that other orthogonal line.
(7) Two trilateral figures contained by orthogonal lines and having two sides in one equal in loglength to two sides in the other, and the included angle equal to the included angle, have their remaining sides equal in loglength and their remaining angles equal. [Thm.20]
(8) Things equal to the same thing are also equal to one another.
(9) If equals be added to equals, the wholes are equal.
(10) If equals be subtracted from equals, the remainders are equal.
(11) Loglengths or angles that invert to one another are equal to one another.
(12) The whole is greater than the part.


In the column of Lobachevskian principles, the only one that differs from Euclid is number (5). All the others are just Euclid’s own principles.

Assume now, if possible, that the other principles in the left column can be used to construct an argument that contradicts and hence disproves (5) in the left column. We may now use our dictionary to construct a perfectly analogous argument from the corresponding terms and principles in the Euclidean column, and from the principles other than (5) in the right column prove that Euclidean theorem (5) is false. But Euclidean theorem (5) is true, as we proved before [Thm.18], as are all the other Euclidean statements corresponding to Lobachevsky’s principles. So if it were possible to disprove Lobachevsky’s (5) from his other principles, then it would be possible to get Euclid to contradict himself as well. But Euclid (we presume) does not contradict himself. Therefore neither does Lobachevsky. Therefore it is impossible to disprove Lobachevsky’s (5) from his other principles. Put another way, it is impossible to prove Euclid’s Fifth Postulate from his other principles. Q.E.D.

FURTHER OBSERVATIONS

Our disk serves as a finite Euclidean model of the whole infinite hyperbolic plane. For anything that Lobachevsky can draw in his infinite plane, we can draw a precise analogue inside our little disk. We might wonder what in our disk geometry corresponds to the two curves we encountered in hyperbolic geometry, namely Lobachevsky’s boundary lines (or “oricycles”) and equidistance curves. To find out, we need one theorem for each curve.


THEOREM: A boundary line (or oricycle) maps to a circle internally tangent to the disk boundary.

Draw any radius of the disk, OH. Pick any point P between O and H, then any point Q between P and H. Draw circles on PH, PQ as diameters. Keeping P fixed, let Q go to H. As this happens, PQ is always the mapping of a circle in hyperbolic space [Thm.15]. But its diameter is growing toward the infinite diameter PH (which is infinite only in logarithmic length, of course). Lobachevsky proves that as one of his circles grows toward infinity, the limit is the boundary line. Since circle PH is the limit of circle PQ in the disk, it must be the Euclidean image of the boundary line.

Moreover, the property of the boundary line carries over. Choose any point R on circle PH. Join R to C, the Euclidean center of circle PH. Draw the circle tangent to radius CH at H and to radius CR at R. Since it is tangent to CH, that is, to OH, a radius of the disk, therefore HR is an orthogonal line, corresponding to a hyperbolic straight line. Draw PT and RT tangent to circle PH at P and R, meeting at T. Thus the angle between arc PR and diameter PH is ∠TPH, and the angle between arc PR and orthogonal line RH is the curvilinear angle PRH. And ∠TPH (or ∠TPC) is equal to the angle between arc PR and arc RH (or ∠TRC), which is equal to a right angle. This corresponds to the property of the boundary line, that at any point such as R, we may draw a straight line at right angles to it (or, in the disk, an orthogonal line orthogonal to it), and all such lines are parallel (or, in the disk, they all meet at the ideal point H).

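[Figures: (1) radius OH of the disk with points P, Q and the circles on PQ and PH as diameters; (2) circle PH with center C, point R on it, and tangents PT, RT meeting at T.]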


THEOREM: An equidistance curve maps to any circular arc in the disk that is neither orthogonal nor tangent to the disk boundary.

To see this theorem, let PAQ be any ideal line, M the midpoint of PQ, PBQ any circular arc not coinciding with PAQ but sharing ideal points P, Q (thus PBQ is not orthogonal to the disk boundary, and therefore will not correspond to a straight line in hyperbolic geometry), and let A, B lie on OM. We shall now see that orthogonal lines at right angles to PAQ will be cut off by PBQ so that all their loglengths are the same.

Choose R at random on PAQ.

Draw tangents to PAQ at R and A, meeting at E.

Extend AR to V on the extension of PQ.

Extend ER to C on the extension of PQ.

Thus △VCR is similar to △AER, and since AE = ER [being tangents from one point to a circle], therefore VC = CR.

Draw the circle of center C, radius CR (“circle C”), cutting PBQ at L.

Since R was taken at random, if we can show that RL is at right angles to PAQ, and that the loglengths of RL and AB are equal, the theorem will be proved.

Since CR is in line with RE, the tangent to circle PRAQ, therefore CR is at right angles to the radius at R in circle PRAQ, and therefore circle C is orthogonal to circle PRAQ [Def.2], and thus RL is at right angles to PAQ.

Now it remains to be shown that loglength of RL = loglength of AB.

[Figure: ideal line PRAQ and arc PLBQ in the disk of center O, with M the midpoint of PQ and the auxiliary points V, C, E, G, L.]

Observe that C lies on PQ, the line joining the points of intersection of circle PRAQ and the disk boundary, which are circles orthogonal to each other, and CR is tangent to circle PRAQ; consequently, circle C is not only orthogonal to PRAQ, but also to the disk boundary [Thm.12 and Cor.Thm.12]. And since CR is tangent to PRAQ,

thus CR² = QC ∙ CP [Euc.3.36]

so CL² = QC ∙ CP [CL = CR]

so CL is tangent to PLBQ [Euc.3.37].

Extend CL to G where it meets the tangent at B.

Since BG is at right angles to BM, therefore BG is parallel to VCM. Also, CLG is straight,

so ∠VCL = ∠LGB

and VC = CL [radii of circle C]

and LG = GB [being tangents from one point to a circle]

so △VCL is similar to △LGB

and therefore VLB is also a straight line.

Conceive of a circle with center V, radius VZ (“Circle Z”) such that

VZ² = AV ∙ VR

Let this circle be taken as circle of inversion. Thus points A, R are inverses. Also, since circle C passes through V, its inverse is a straight line [Thm.7], which (1) must pass through A since A, R are inverses, and which (2) must be perpendicular to VC since circle C is at right angles to VC, and angles are preserved in inversion, and VC inverts to itself. But the straight line through A and at right angles to VC is AB. Therefore AB lies along the inverse of circle C.

And since VLB is straight, and passes through V, the center of inversion, and through point L on circle C, and since the inverse of circle C lies along AB, therefore B is the inverse of point L.

So the straight line through A and B is the inverse of circle C, and points A, R are inverses, and points B, L are inverses. Therefore the line segment AB is the inverse of arc RL. And since inversion preserves loglength, therefore loglength of RL = loglength of AB. And since PRAQ is the analogue of a hyperbolic straight line, arc PLBQ is the analogue of an equidistance curve. Q.E.D.

Corollary: Conversely, if PLBQ is the locus of endpoints of orthogonal lines perpendicular to PRAQ that are all equal in loglength, then PLBQ is a circular arc.


The correspondence between our Euclidean disk and the hyperbolic plane is complete. How marvelous that we can map everything in Lobachevsky’s world inside a single circle within Euclid’s. This correspondence is extremely useful in both geometries. Any theorem we prove in Lobachevsky automatically translates into a corresponding theorem about the disk, and vice versa, giving us two theorems for the price of one. You might try taking a random theorem in hyperbolic geometry and translating it into the language of disk geometry, and see what you get.

And if we suspect something might be true in the disk, but find it hard to think the matter through, we can see whether the corresponding question about Lobachevskian straight lines is easier to investigate—or the other way around.

For example, if we define a “regular disk figure” as one formed by orthogonal lines of equal loglength and with all its angles equal, we might wonder whether it is possible to tessellate the disk with such figures, and if so, with which ones. But we have already seen that it is possible to tessellate the hyperbolic plane with the analogues of such figures in hyperbolic geometry, namely regular polygons, so long as they meet the proper angle-sum requirements. Therefore we automatically know that it is possible to tessellate the Poincaré disk with, for example, five-sided regular disk figures having angles of 90°; four such pentagons around every point will do it.
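The angle-sum requirement for the pentagon example can be checked with a short sketch. It assumes the standard criterion for regular tessellations, stated here rather than derived: regular figures of p sides, meeting q at each corner, tile the hyperbolic plane exactly when (p − 2)(q − 2) > 4, which is the same as saying that the corner angle 360°/q is smaller than the angle of a Euclidean regular p-gon. The function names are hypothetical.

```python
def corner_angle_degrees(q):
    """Angle each figure must have if q of them meet around every corner."""
    return 360.0 / q

def tiles_hyperbolic_plane(p, q):
    """Assumed standard criterion: regular p-gons, q at each corner, tile the
    hyperbolic plane exactly when (p - 2) * (q - 2) > 4."""
    return (p - 2) * (q - 2) > 4

print(corner_angle_degrees(4))        # the pentagons of the example need 90-degree angles
print(tiles_hyperbolic_plane(5, 4))   # five-sided figures, four at each corner
print(tiles_hyperbolic_plane(6, 3))   # the ordinary honeycomb is Euclidean, not hyperbolic
```

The same function can be used to test any other pair of numbers a reader cares to try.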

Canadian mathematician H. S. M. Coxeter (1907-2003) developed the rules for constructing such tessellations, using some of the foregoing theorems about orthogonal lines. The famous twentieth-century Dutch graphic artist M. C. Escher (1898-1972) once saw, and was inspired by, one of Coxeter’s disk tessellations, a hexagonal one, shown below right. Escher subsequently made many beautiful tessellations of his own, using recognizable figures structured around the geometry of the Poincaré disk, such as the angels and demons in his 1960 woodcut “Circle Limit IV”, below left.


Escher also tiled the disk by using more than one figure in a repeating pattern. For example, in his 1959 woodcut “Circle Limit III”, he tiled the disk with quadrilaterals and triangles. In this particular work, Escher appears to have used non-orthogonal arcs, or the analogue of equidistance curves.

Would it be possible to tile the disk with regular quadrilaterals formed out of orthogonal lines? Would it be possible to do so with four such figures at each corner? Would it be possible to tile the disk as Escher has done here, with quadrilaterals and triangles, but using only orthogonal arcs, and in such a way that every quadrilateral is regular and has the same angles as every other quadrilateral, and likewise for the triangles?

In Escher’s figures, he centers the pattern in the disk, which may be desirable for aesthetic reasons. But is it necessary to do so? Would it be possible, for example, to produce Coxeter’s regular hexagonal tessellation so that it was off center in the disk?

The disk model of Poincaré is not the only model of hyperbolic space. Poincaré also developed the Half-Plane Model, in which Lobachevsky’s straight lines translate into semicircles (minus their endpoints) all lying in a plane with their diameters on one common straight line, which line is like the boundary of a disk that has grown infinitely large. Felix Klein also developed a model in a finite disk, and in his the straight lines of Lobachevsky map to straight lines in his disk! To keep them finite and thus fit them in the disk, of course, he must use a Cayley-Klein metric to measure distance. Although Klein’s mapping preserves straightness, it changes angles, whereas Poincaré’s method sacrifices straightness while preserving angles, for which reason it is an example of what is called conformal mapping.

There is also a way to map the hyperbolic plane onto a Euclidean hemisphere, and still another way to map it onto a Euclidean hyperboloid (the surface obtained by rotating a hyperbola about its axis).


And here is something astounding about these five distinct Euclidean models: they coexist in a marvelous geometrical relationship to each other. The diagram below is a cross-section of the five models, depicting how a random point p in the Poincaré disk can be projected stereographically (with lines all drawn from one point) from a point directly beneath its center (at a distance equal to the disk radius) to a point he in the hemisphere model. And if we set up the half-plane perpendicular to the plane of the Poincaré disk and standing on a straight line tangent to the disk, the point he in the hemisphere model then projects stereographically to a corresponding point ha in the half-plane, from the point in the Poincaré disk furthest from the half-plane. If we now set out a plane tangent to the top point of the hemisphere, the point he in the hemisphere model also projects orthographically (by parallel lines, in this case all perpendicular) to a point k in this plane, which gives us the corresponding point in the Klein disk. And if we set up the hyperboloid model above the hemisphere, with its principal vertex on the apex of the hemisphere and its axis perpendicular to the Klein disk, its center at O, the center of the Poincaré disk, then the point k in the Klein disk projects stereographically (from the center of the Poincaré disk) to a corresponding point hy in the hyperboloid model, and point he in the hemisphere model also projects stereographically (from the original point of projection below the Poincaré disk) to the same point hy.

Note: The dotted line in the diagram is parallel to the asymptotic cone surrounding the hyperboloid (whose center is at O). Thus any orthogonal arc in the Poincaré disk will project to an infinite line on the hyperboloid.

Finally, there are also models of Euclidean geometry in hyperbolic geometry. For example, the triangles on the surface of an orisphere, or limit-sphere, have angle-sums of two right angles, and so that surface in hyperbolic geometry behaves analogously to a Euclidean plane.

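[Figure: cross-section of the five models, showing the center O and the corresponding points p, he, ha, k, and hy. Legend: P = Poincaré Disk; He = Hemisphere; Ha = Poincaré Half-Plane; K = Klein Disk; Hy = Hyperboloid.]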
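The chain of projections described above can be sketched numerically. The following is only an illustration under assumed coordinates that the text does not fix: the Poincaré disk is taken as the unit disk in the plane z = 0 with center O at the origin, the lower point of projection is (0, 0, -1), the half-plane stands on the tangent line x = 1, and the hyperboloid is the sheet z² - x² - y² = 1. All function names are hypothetical.

```python
import math

def poincare_to_hemisphere(p):
    """Stereographic projection from (0, 0, -1) onto the upper unit hemisphere."""
    u, v = p
    r2 = u * u + v * v
    return (2 * u / (1 + r2), 2 * v / (1 + r2), (1 - r2) / (1 + r2))

def hemisphere_to_klein(he):
    """Orthographic (vertical) projection onto the tangent plane z = 1."""
    x, y, _ = he
    return (x, y)

def hemisphere_to_half_plane(he):
    """Stereographic projection from (-1, 0, 0) onto the vertical plane x = 1.
    Returns (position along the tangent line, height above the disk plane)."""
    x, y, z = he
    t = 2 / (1 + x)
    return (t * y, t * z)

def klein_to_hyperboloid(k):
    """Central projection from O through (kx, ky, 1) onto z^2 - x^2 - y^2 = 1."""
    kx, ky = k
    s = 1 / math.sqrt(1 - kx * kx - ky * ky)
    return (s * kx, s * ky, s)

def hemisphere_to_hyperboloid(he):
    """Projection from (0, 0, -1) through he onto the same hyperboloid sheet."""
    x, y, z = he
    return (x / z, y / z, 1 / z)

p = (0.5, 0.0)                      # a sample point in the Poincaré disk
he = poincare_to_hemisphere(p)
k = hemisphere_to_klein(he)
ha = hemisphere_to_half_plane(he)
hy_from_k = klein_to_hyperboloid(k)
hy_from_he = hemisphere_to_hyperboloid(he)
print(he, k, ha, hy_from_k, hy_from_he)
```

In this sketch the two routes to the hyperboloid (through the Klein disk and directly from the hemisphere) land on the same point, which is the coincidence the passage points out.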

18 Philosophical Reflections on the Fifth Postulate and Non-Euclidean Geometry

What exactly does Poincaré’s disk model of hyperbolic space prove? At the very least, it proves once and for all that it is impossible simply to deduce Euclid’s fifth postulate from his other principles. We must either postulate the fifth postulate (or some other statement equivalent to it), or live without it. Euclid was right to introduce a special postulate at the foundation of parallel theory.

Does the disk model prove that hyperbolic geometry does not contradict itself? Not quite. What the model proves is that if Lobachevsky’s geometry is self-contradictory, then so is Euclid’s. But do we know that Euclid’s geometry is not self-contradictory? That his principles are consistent with each other? That they contain no hidden, implicit absurdity or conflict?

Before all the trouble about the fifth postulate began and non-Euclidean geometry came into existence, the universal consensus was that we know Euclidean geometry cannot contradict itself. Why not? Because all Euclid’s principles are true, and true things do not conflict with each other. But once modern mathematicians had called the truth of Euclid’s geometry into question, this reason for thinking it must be consistent became unavailable to them.

Consequently, a new search was on for proof that Euclid’s geometry is perfectly consistent and can never run into contradictions. Leading the effort was the German mathematician David Hilbert (1862-1943), who sought an absolute, rather than a relative, proof of the consistency of Euclidean geometry. He had found a way to model Euclidean geometry in algebra, so that if Euclid contradicted himself, then algebraic operations also must contain a hidden contradiction, but that was still just a relative proof. David Hilbert wanted more. He wanted an absolute guarantee of consistency, not a mere relative “if this is inconsistent, so is that.”

What would an absolute proof look like? One method that occurred to Hilbert is called finitistic reasoning. Poincaré showed that Lobachevsky is just as consistent as Euclid by showing how to translate all of Lobachevsky’s principles into principles of Euclid with the same logical form. If we can similarly translate some set of axioms into a definite set of statements about a finite object and a finite number of its states or permutations that we can verify with a finite number of observations, then we know that they are consistent absolutely. This amounts to modeling a system of axioms in another set of statements that we can verify.

The reasoning goes like this. What is inconsistent (self-contradictory) in the abstract will also be inconsistent in the concrete. So if we embody the logical form of some set of abstract axioms in a definite object, and in it we find that the states that correspond to the axioms are consistent, then the abstract axioms themselves are consistent. If, for example, some set of abstract axioms (in meaningless symbols) had the same formal structure as a certain finite number of states of a certain Rubik’s Cube, and those states were not contradictory to one another in the Rubik’s Cube, we would have an absolute proof of the consistency of those abstract axioms.
Or, to take a more detailed example, consider this list of postulates:

1. Any two members of K are contained in just one member of L.
2. No member of K is contained in more than two members of L.
3. The members of K are not all contained in a single member of L.
4. Any two members of L contain just one member of K.
5. No member of L contains more than two members of K.

We can prove a bunch of things from these postulates using customary rules of inference. But simply by doing that we can never be sure we will not eventually run into a contradiction. So how can we tell whether these postulates are consistent?

Hilbert’s method is to embody all of them in some finite object and see, by a finite number of observations of it, that they do not contradict one another. This is a way of turning to truth, again, to ensure consistency, but not by assuming the universal truth of any postulate in itself, but only by looking at a finite (vs. universal) way of embodying each abstract postulate. For example, in the present case, we can build a finite model of the five postulates above as follows:

Let K be the set of points that are the vertices of a particular triangle.
Let L be the set of lines that are the sides of that same triangle.

(So K is like Korner, and L is like Line.) Now each of the five postulates translates to a true statement about this one particular triangle:

1. Any two vertices of the triangle lie on just one side of it.
2. No vertex of the triangle lies on more than two sides of it.
3. The three vertices do not all lie on any one side.
4. Any two sides considered together have just one vertex in common.
5. No side has more than two vertices on it.
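The finite check on the triangle can also be carried out mechanically. The sketch below is only an illustration: it assumes that each member of L is represented as the set of members of K it contains, so that "contained in" becomes set membership, and the names `check_postulates`, `K`, and `L` are chosen here for convenience.

```python
from itertools import combinations

def check_postulates(K, L):
    """Return five booleans, one per postulate, for a finite K and L.
    Each member of L is modeled as a frozenset of members of K."""
    K = set(K)
    # 1. Any two members of K are contained in just one member of L.
    p1 = all(sum(1 for line in L if {x, y} <= line) == 1
             for x, y in combinations(K, 2))
    # 2. No member of K is contained in more than two members of L.
    p2 = all(sum(1 for line in L if x in line) <= 2 for x in K)
    # 3. The members of K are not all contained in a single member of L.
    p3 = not any(K <= line for line in L)
    # 4. Any two members of L contain just one member of K (in common).
    p4 = all(len(l1 & l2 & K) == 1 for l1, l2 in combinations(L, 2))
    # 5. No member of L contains more than two members of K.
    p5 = all(len(line & K) <= 2 for line in L)
    return [p1, p2, p3, p4, p5]

# The triangle model: three vertices, three sides.
K = {"A", "B", "C"}
L = [frozenset("AB"), frozenset("BC"), frozenset("CA")]
print(check_postulates(K, L))

# A contrasting arrangement: one "line" through all three points.
print(check_postulates(K, [frozenset("ABC")]))
```

The first call reports that all five postulates hold in the triangle; the second arrangement fails postulates 3 and 5, which shows that the check is not vacuous.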

We can verify these five statements by checking, observationally, with sense or imagination. After a finite number of observations, we see that each is true, and hence, since all are true, they cannot contradict one another. And since the concrete postulates do not contradict, neither do the abstract ones, and consequently neither do any other concrete forms of those abstract postulates. Voilà. An absolute proof of consistency.

Sadly (or happily?), neither Euclid’s geometry nor his arithmetic can be mirrored in a finite model such as a triangle. On any straight line there lies an infinity of points in a certain order, and after every number follows another number, in order, and these ideas are central in Euclidean mathematics. There is no way to map these infinities of things to a finite number of objects.

So Hilbert sought a way around this. Rather than map the objects of mathematics to other objects, why not map the statements of mathematics to other objects, namely to strings of symbols that exhibit a one-to-one correspondence with the words and statements and proofs of geometry? The statement “Every whole is greater than its part,” for example, is about an infinity of different possible instances, but the statement itself contains a finite set of symbols in a certain order. The first step in this direction was to completely “formalize” Euclid’s geometry, that is, to produce a set of utterly meaningless symbols that perfectly embodied the logical structure of Euclidean geometry, and a set of rules by which one could combine, separate, and manipulate those symbols that would make them behave just like the words, statements, and proofs of Euclidean geometry. Hilbert hoped that the possible patterns and structural relations among the meaningless symbols (the “formalized system”) would be finite, and so an exhaustive inventory of these would show in an absolute manner that Euclid’s geometry was self-consistent.

[Figure: a triangle whose three vertices are each labeled K and whose three sides are each labeled L.]

Does all of this seem like logic to you? It did to many mathematicians of the early twentieth century, among them Bertrand Russell, who famously wrote that “mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.”16 Together with Alfred North Whitehead, Russell pursued the mission outlined by Hilbert to achieve an absolute proof of the consistency of geometry (whether Euclidean or non-Euclidean, since other geometries are equiconsistent with Euclid’s, as we have seen).

The first step was to “formalize” geometry (and number theory), that is, to express all of its foundational principles in abstract form, so that the content was left out, and only the logical structure of its statements was retained in meaningless symbols. Such an expression of the principles of mathematics would not be tied to any particular geometry. It would also carry the advantage of making the logical relationships among the terms perfectly clear, forcing us to reason from them solely by rules of inference rather than by unconsciously employing other ideas that seemed obvious to us in some special embodiment of this or that logical structure (the way, for example, Euclid assumes the two circles in his first theorem intersect, although he never formally lays down a postulate for that), thus allowing an unformalized principle to sneak in the door.

The first step, then, was to formalize all of the principles of geometry and number theory. Russell’s idea was first to mirror all mathematical definitions in (or reduce them all to) logical definitions in terms of classes and the like. The idea of a “class” (a set of certain things, such as “the class of mathematicians” or “the class of elephants”) belongs to logic. Again, if we say that two classes are “similar” if there is a one-to-one correspondence between their members, and if we say that any class containing only a single member (such as “the class of first presidents of the United States”) shall be called a “unit class,” these are also logical ideas. Russell believed that all mathematical ideas could be expressed in terms of such ideas of logic.

For example, the number 1 can be defined as “the class of all classes similar to a unit class.” From there, one can readily define the rational numbers. After that, one can define √2 as a certain class of rational numbers (as we learned from the Dedekindian “cut”). So Russell began his attempt to accomplish Hilbert’s mission by translating all number-theoretical notions into purely logical ideas. (Russell and Gottlob Frege both believed, accordingly, that mathematics is nothing but a chapter of logic.)

The attempt to reduce the statements of mathematics to a handful of logical axioms and rules of inference was begun in 1889 by Giuseppe Peano, an Italian mathematician. Russell and Whitehead aimed to finish the job in their massive work, titled Principia Mathematica, which was published in three volumes in 1910, 1912, and 1913, and republished in a second edition in 1927. The work seemed

16 See his 1917 essay, Mathematics and the Metaphysicians.

to be a success until, in 1931, along came a young Austrian mathematician named Kurt Gödel (1906-1978), who proved the project was impossible! His famous paper demonstrated that it is impossible to give a metamathematical proof (a proof by mapping the statements of Euclid, for example, to a set of abstract symbols that can be manipulated by certain rules) of the consistency of a system comprehensive enough to contain all of number theory (that is, any system such as the one in Principia Mathematica that formalizes the operation of multiplication as well as that of addition, and hence also formalizes the properties of integers defined by multiplication, such as “square” and “prime”) unless the proof uses rules of inference different from the transformation rules used within the system. Such proofs, with different inferential rules from the system in Principia Mathematica itself, might be possible and useful (in fact, some have been constructed), but they leave us wondering whether the new rules of inference are consistent! And that does not achieve Hilbert’s goals.

Gödel also showed (more famously and importantly) that the logical system in Principia Mathematica (or any similar system capable of formalizing the definitions and axioms and rules of inference for elementary number theory) is essentially incomplete, if it is consistent. Given any consistent formalization of number theory, there are true number-theoretic statements expressible in the system that cannot be derived in it.

Gödel’s proof is a major result of modern mathematics, and it is also beautiful in much the same way that Poincaré’s disk model is. If you are interested in pursuing it further, you might enjoy reading the short popular book Gödel’s Proof, by Ernest Nagel and James R. Newman, a marvelous introduction which should be accessible to you after your experience with non-Euclidean geometry.
Gödel’s discovery and the failure of Hilbert’s project seem to be a sign that modern mathematicians of the early twentieth century had been thinking of mathematics the wrong way. Perhaps mathematics is not merely a branch of logic, but really is about something—quantity, for example, or quantitative order or structure—that has a definite content, a nature, and the science of mathematics is about that, and it is not possible to state all of its principles. The emergence of non-Euclidean geometries was a major part of the impetus behind the attempt to reduce mathematics to logic, and the perceived need to find a way to prove the consistency of Euclid’s geometry and of mathematics in general. Only after non-Euclidean geometries had come into existence did some mathematicians begin to think that geometry is not “true,” that is, that it is not about any really existing space, nor even about the nature of a really existing space abstractly considered, but only about a mental construct, whose whole truth was merely a logical one, consisting in no more than self-consistency. What would it mean to say that Euclid’s geometry is truer than Lobachevsky’s if neither one ever contradicts itself? One might suppose that some particular geometry is embodied in the physical world around us, and that geometry would

be the “true” one, but that would seem to be “true” in a sense that is unimportant to geometry, and more relevant to natural science. Perhaps exactly straight lines and flat surfaces are impossible in the world of nature, but that makes no difference to the abstract consideration of geometry. Similarly, even if some number were so large that it was impossible to instantiate it in physical objects, such a number would still be a legitimate number, and would be a genuine object of number theory.

Then what ought we to say about Euclid’s geometry? The criticisms of the fifth postulate and the rise of non-Euclidean geometry present us with a number of challenging questions:

(1) Is Euclid’s geometry true? And what would that mean, if there are other geometries that contradict it, or that are incompatible with it? Are there many geometries, or only one? If only one, say Euclid’s, what are we to say about the non-Euclidean geometries? Are they all false? If so, how do we see this? If on the other hand there are many geometries, is it right to say that they contradict each other? How can there be many sciences that contradict each other? And if they do not, but are somehow compatible, are all geometries created equal, so to speak, or is there a natural order among them, and if so, what kind of order?

(2) Is Euclid’s fifth postulate true and self-evident?

(3) Is straight said univocally or equivocally of Euclid’s straight lines, Lobachevsky’s, and Riemann’s? If univocally, are they generically the same, or specifically? If equivocally, are they so called purely equivocally, or in some other way? Could non-Euclidean geometries really be about curves in Euclid, just calling them straight equivocally?

To settle these questions correctly and definitively would be no simple matter. Still, we can say something in reply to each of them. The third question is perhaps the most accessible. Straight seems to be said equivocally in the geometries of Euclid, Lobachevsky, and Riemann.
Being uniform is of the essence of being straight, and uniform means “the same form in every part, regardless of its length,” which is true only of Euclid’s straight lines. A tiny Euclidean straight line is the same shape (so to speak) as a huge Euclidean straight line, and any straight line is a mere blow-up, or scaled-up version, of any one of its parts. That is uniformity for you, and it belongs to no line except a Euclidean straight line. Hence an equilateral triangle built on any part of a Euclidean straight line will always have the same shape, regardless of the length of the part taken. The parts of a Lobachevskian straight line, on the other hand, when they are of different lengths, are formally distinct, being the bases of equilateral triangles with different shapes. Straight in Lobachevsky and Riemann means only a geodesic, that is, a line that locally minimizes distance all along itself. This is true of Euclid’s straight lines also, but the notion is too general to specify any particular kind of line if we abstract from

any particular kind of space. “Uniform,” on the other hand, specifies a definite kind of line, a definite form. The word straight does not seem to be said purely equivocally of Euclid’s straight lines and Lobachevsky’s. Rather, it seems to be a case of “falling away” from the most complete meaning, as art is said first of things like carpentry, and afterward of things like logic, by dropping some of the elements of the first meaning of art. The second question is also difficult, but is more accessible after thinking about the third question as we have just done. It is reasonable to say that Euclid’s fifth postulate is true and self-evident, since it is about Euclid’s straight lines (uniform lines), and about Euclid’s planes (uniform planes). And his postulate is both true and self-evident about such things. The interpretation of Euclid’s fifth postulate as a statement about Lobachevsky’s straight lines is not true—but if it is correct to say that Lobachevsky is not talking about the same kinds of lines and planes as Euclid is, that he is using such words equivocally, then Lobachevsky is not really contradicting Euclid’s postulate. Instead, he is talking about a space in which lines such as those that Euclid calls straight do not exist. If lines that were straight in Euclid’s sense existed in Lobachevskian space, then they would behave just as they do in Euclid’s world. As it is, such lines never occur in Lobachevskian space, and so nothing in Lobachevsky contradicts what Euclid has to say about them. Just as there is one science of both rectilineal and spherical triangles, although these have opposed properties, and yet this science does not contradict itself (since it says opposed things about distinct things, not about the same things), so too we might say that there is one science of both parabolic (Euclidean) and hyperbolic (Lobachevskian) and elliptical (Riemannian) spaces, and other spaces. 
We can call the studies of these different spaces “different geometries,” somewhat as we can speak of the geometry of the rectilineal triangles and of spherical triangles as “different geometries.” These differ as different parts of one geometry, not as different sciences that contradict. This does not place all these “geometries,” these diverse parts of geometry, on equal footing. There is a natural order among them. Euclidean space is the first in the order of learning because, among other reasons, it agrees with our imagination. It is also first in the order of perfection, since Euclid’s straight lines are straighter than anybody else’s, and his planes are flatter than anybody else’s. This is because they are uniform: little pieces of straight line are formally the same as (have the same formal properties as) big ones, and little pieces of plane are formally the same as big ones, and that is unique to Euclid. That is why we speak of his space, and no one else’s, as “flat,” while we speak of others as “warped” or “curved.”

Again, Euclid’s geometry is first in the order of nature, somewhat as 1 is by nature before 2 and 3 and so on, or the way the origin is before all numbers on a number line. This is because Euclid’s space has no curvature, and so its characteristics are completely defined by his postulates, which are merely about forms and shapes and not about any degree of curvature. Any other geometry must, over and above such postulates, also specify a degree of curvature. Consider Lobachevsky’s. Make an equilateral triangle whose angles are each 30°. Now extend two of its sides so that they are double in length, and complete the

new isosceles triangle. What are the two new angles in it? Nothing in Lobachevsky’s postulates will tell us. We must add another postulate stating, arbitrarily, the degree of curvature in our space. Since the original equilateral has an angle-sum of 90°, we can, if we like, say that this new triangle has an angle-sum of 89°, or, if we prefer, 27° (and there is much more curvature in that case!). (Question: how would these diverse choices be reflected in the disk model?)

Now another question: could non-Euclidean geometries be really about curves in Euclid, just calling them “straight” equivocally? Some have proposed that idea. If everything we say about Lobachevsky’s “straight lines” can be mirrored in what we say about orthogonal arcs in the Poincaré disk, then why not say we are really just talking about the orthogonal arcs (or whatever else in Euclid might mirror Lobachevsky’s talk) and using confusing words? That is tempting. But the disk model also shows that we don’t contradict ourselves when we say that the lines that are the shortest distance between two points do not behave as Euclid’s straight lines do. This seems to show that the concept of straightness represented in the idea of a geodesic does not, of itself, determine us to Euclid’s space or geometry. This in turn shows that geodesics are thinkable in the non-Euclidean way, although they are not (for us) imaginable in that way. And so, rather than say that the non-Euclidean geometries are about curves, it seems truer to the intention of the non-Euclidean geometers, and to their actual results, to say that they have discovered imperfect kinds of space, in which lines that are straight in Euclid’s sense cannot exist, but lines that are straight in a less complete sense do exist.

The final part of the senior mathematics tutorial will take us into the realm of natural science, to the study of Einstein’s theory of relativity. We will see that this theory is a geometrical one, and that non-Euclidean geometry plays a significant role in it.
We will also see that the theory is counterintuitive in many other ways as well, and we will have fresh opportunities, over and above the questions surrounding the fifth postulate, to ask ourselves what exactly are the genuinely self-evident truths available to us in mathematics, in the science of nature, and in the life of the mind in general.

Appendix 1

Dot Products and Cross Products

This appendix and the following ones build on certain ideas in the junior and senior mathematics tutorials in order to provide supplementary information useful for the senior science tutorial. This first appendix will develop certain ideas concerning vectors that go beyond what we learned from Wessel. Vectors are not only subject to operations analogous to Cartesian operations, as Wessel showed, but also to certain other operations peculiar to them. Of these, the two most fundamental are the “dot product” and the “cross product,” which we will define and discuss here. They are extremely useful in pure mathematics, natural science (especially physics), engineering, and computer programming; our interest in them is for pure mathematics and also for physics (e.g., electromagnetic theory).

Before defining these operations and learning some of their properties, it will be useful to explain a few more conventions concerning vectors. One of these is the use of vector coordinates to specify a vector. Suppose in a 3-D coordinate system we have a vector, 𝒂. The tip of this vector is unique to it; no other vector (with its tail at the origin) shares this point as its tip. From this tip of 𝒂, if we drop a perpendicular to the x-y plane, then from the foot of this perpendicular drop lines at right angles to the x-axis and y-axis, we can complete a box, of which the vector 𝒂 is a diagonal. If we consider the three edges of this box that are along the axes as vectors, calling these 𝒂𝒙 , 𝒂𝒚 , 𝒂𝒛, then it is clear, by vector addition, that

𝒂 = 𝒂𝒙 + 𝒂𝒚 + 𝒂𝒛

[Figure: the vector 𝒂 drawn from the origin 0 as the diagonal of a box whose edges along the x-, y-, and z-axes are marked 𝑎1, 𝑎2, 𝑎3.]

But the tip of 𝒂 is enough to specify 𝒂, and its tip is a point specified by its three coordinates, 𝑎1 on the x-axis, 𝑎2 on the y-axis, 𝑎3 on the z-axis. So another way to specify vector 𝒂 is to write

𝒂 = ⟨𝑎1 , 𝑎2 , 𝑎3⟩

where the angled brackets indicate that the contents between them are the coordinates of a vector, that is, 𝑎1 is a length along the x-axis, 𝑎2 a length along the y-axis, 𝑎3 a length along the z-axis, and 𝒂 is the vector from the origin to the point (𝑎1 , 𝑎2 , 𝑎3).

A further convention is the use of standard basis vectors, 𝐢, 𝐣, 𝐤. These mean unit-long vectors along the x-axis, y-axis, and z-axis respectively. By using these, we can indicate a vector coordinate, such as 5𝐢, which means “five units long, from the origin and out along the x-axis.” This notation makes it very clear what the magnitude and direction of the vector coordinate are. So we can specify a vector 𝒂 like this:

𝒂 = 𝑎1𝐢 + 𝑎2𝐣 + 𝑎3𝐤

which expresses 𝒂 as a vector sum of three component vectors, each one of a certain magnitude (specified by 𝑎1 , 𝑎2 , 𝑎3) and pointing from the origin along one of the coordinate system’s axes (the axis being specified by 𝐢, 𝐣, 𝐤).

Another convention is to write ‖𝒂‖ for the absolute value of a vector, that is, the pure magnitude of 𝒂, without including its direction.
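These conventions can be illustrated with a short sketch. It assumes vectors are stored as plain tuples of coordinates, and it computes the magnitude as the square root of the sum of the squared coordinates (the Pythagorean theorem applied to the edges of the box), a formula the text has not stated at this point. The helper names `scale`, `add`, `norm`, and `from_components` are hypothetical.

```python
import math

# Standard basis vectors along the x-, y-, and z-axes.
I = (1.0, 0.0, 0.0)
J = (0.0, 1.0, 0.0)
K = (0.0, 0.0, 1.0)

def scale(c, v):
    """Multiply each coordinate of v by the scalar c."""
    return tuple(c * x for x in v)

def add(*vectors):
    """Add any number of vectors coordinate by coordinate."""
    return tuple(sum(coords) for coords in zip(*vectors))

def norm(v):
    """Magnitude of v: square root of the sum of its squared coordinates."""
    return math.sqrt(sum(x * x for x in v))

def from_components(a1, a2, a3):
    """Build a = a1*i + a2*j + a3*k as a sum of three component vectors."""
    return add(scale(a1, I), scale(a2, J), scale(a3, K))

a = from_components(2.0, 3.0, 6.0)
print(a)        # (2.0, 3.0, 6.0)
print(norm(a))  # 7.0
```

Building the vector from the basis vectors returns exactly the coordinate triple, which is why the bracket notation and the 𝐢, 𝐣, 𝐤 notation can be used interchangeably.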

NOTE: 𝐢 as we have just used it is not quite the same as 𝑖, or the square root of negative one. They are similar, because each is a unit long, and has a direction. But normally 𝑖 is thought of as perpendicular to the x-axis (or rather, to the real axis), whereas 𝐢 is taken along the x-axis.

THE DOT PRODUCT

For two-dimensional vectors, where

𝒂 = ⟨𝑎1 , 𝑎2⟩ and 𝒃 = ⟨𝑏1 , 𝑏2⟩

the dot product is

𝒂 ∙ 𝒃 = 𝑎1𝑏1 + 𝑎2𝑏2

Notice that on the right side we have only Cartesian operations on scalars, on pure magnitudes without any direction associated with them. Consequently, the dot product itself is not a vector, but a scalar. For this reason, it is also sometimes called a scalar product. It is just the sum of the products of the two vectors’ corresponding coordinates. Although a dot symbol is used for this kind of “multiplication,” it is not ordinary Cartesian multiplication. The way to tell this apart from ordinary Cartesian multiplication is by the fact that the two terms being multiplied are vectors:

𝑎 ∙ 𝑏 = Cartesian multiplication
𝒂 ∙ 𝒃 = Dot product

For 3-D vectors, where

𝒂 = ⟨𝑎1 , 𝑎2 , 𝑎3⟩ and 𝒃 = ⟨𝑏1 , 𝑏2, 𝑏3⟩

the dot product is

𝒂 ∙ 𝒃 = 𝑎1𝑏1 + 𝑎2𝑏2 + 𝑎3𝑏3

and similarly for vectors of more than three dimensions.

THEOREM 1

The dot product of any two vectors is equal to the Cartesian product of their absolute values times the cosine of the angle between them. That is, if 𝒂 and 𝒃 are vectors, where

𝒂 = ⟨𝑎1 , 𝑎2⟩ and 𝒃 = ⟨𝑏1 , 𝑏2⟩

and 𝜃 is the angle between them, then

𝒂 ∙ 𝒃 = ‖𝒂‖‖𝒃‖ cos 𝜃

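[Figure: vectors 𝒂 and 𝒃 drawn from a common point with the angle 𝜃 between them, and the third side 𝒄 = 𝒂 − 𝒃 joining their tips.]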
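Before following the argument for Theorem 1, its claim can be spot-checked numerically (this is an illustration, not a proof). The sketch below assumes 2-D vectors stored as tuples, and it measures the angle between them independently of the dot product, as the difference of their direction angles obtained from `math.atan2`. The function names are hypothetical.

```python
import math

def dot(a, b):
    """Sum of the products of corresponding coordinates."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Magnitude of a, by the Pythagorean theorem."""
    return math.sqrt(sum(x * x for x in a))

def direction_angle(v):
    """Angle of a 2-D vector measured from the positive x-axis, in radians."""
    return math.atan2(v[1], v[0])

def both_sides(a, b):
    """Return (a . b, |a||b| cos(theta)) so the two can be compared."""
    theta = direction_angle(a) - direction_angle(b)
    return dot(a, b), norm(a) * norm(b) * math.cos(theta)

for a, b in [((3.0, 4.0), (4.0, 3.0)),    # an oblique pair
             ((1.0, 0.0), (0.0, 2.0)),    # perpendicular: both sides near zero
             ((1.0, 2.0), (2.0, 4.0))]:   # parallel: both sides equal |a||b|
    lhs, rhs = both_sides(a, b)
    print(a, b, lhs, rhs)
```

Up to floating-point rounding, the two numbers printed on each line agree, including the perpendicular and parallel cases.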

To see why Theorem 1 holds, first ignore the directions of 𝒂 and 𝒃 and think of them simply as two lines 𝑎 and 𝑏 with an angle 𝜃 between them. Thus they determine a triangle ABC, with angles A, B, C, and opposite sides 𝑎, 𝑏, 𝑐, so that 𝜃 is the angle C, opposite side 𝑐. The law of cosines then gives us side 𝑐 in terms of sides 𝑎 and 𝑏 and the angle 𝜃 between them, as follows:

$$c^2 = a^2 + b^2 - 2ab\cos\theta$$

Now let’s consider 𝒂 and 𝒃 as vectors again, and let side 𝑐 also be a vector 𝒄, going from the tip of 𝒃 to the tip of 𝒂. Thus, by vector addition,

$$\mathbf{b} + \mathbf{c} = \mathbf{a}, \quad\text{so}\quad \mathbf{c} = \mathbf{a} - \mathbf{b}$$

And the lengths of the sides of the triangle are just the absolute values of these vectors, so we can rewrite the law of cosines thus:

$$\|\mathbf{a} - \mathbf{b}\|^2 = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\|\mathbf{a}\|\|\mathbf{b}\|\cos\theta$$

Looking at our vectors in their 2-D coordinate system, and dropping perpendiculars from their two tips to the axes, we get their coordinates 𝑎1, 𝑎2 and 𝑏1, 𝑏2. Thus we see that ‖𝒂 − 𝒃‖ is the hypotenuse of a right triangle in which

$$\|\mathbf{a} - \mathbf{b}\|^2 = (a_1 - b_1)^2 + (a_2 - b_2)^2$$

Also

$$\|\mathbf{a}\|^2 = a_1^2 + a_2^2, \qquad \|\mathbf{b}\|^2 = b_1^2 + b_2^2$$

Substituting these expressions into our rewritten law of cosines, we have

$$(a_1 - b_1)^2 + (a_2 - b_2)^2 = a_1^2 + a_2^2 + b_1^2 + b_2^2 - 2\|\mathbf{a}\|\|\mathbf{b}\|\cos\theta$$

Expanding the left side gives

$$a_1^2 - 2a_1b_1 + b_1^2 + a_2^2 - 2a_2b_2 + b_2^2$$

and cancelling the squared terms that appear on both sides, we have

$$-2a_1b_1 - 2a_2b_2 = -2\|\mathbf{a}\|\|\mathbf{b}\|\cos\theta$$

or

$$a_1b_1 + a_2b_2 = \|\mathbf{a}\|\|\mathbf{b}\|\cos\theta$$

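[Figure: vectors 𝒂 and 𝒃 drawn from the origin 0 in the x-y plane, with their coordinates 𝑎1, 𝑎2 and 𝑏1, 𝑏2 marked on the axes and the side 𝒂 − 𝒃 joining their tips; alongside, the triangle ABC with sides 𝑎, 𝑏, 𝑐 and the angle 𝜃.]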

But the left side is by definition the dot product of vectors 𝒂 and 𝒃. So

𝒂 ∙ 𝒃 = ‖𝒂‖‖𝒃‖ cos 𝜃

Q.E.D.

OBSERVATION: So another way to think of a dot product is simply as the product of the Euclidean magnitude of the vectors times the cosine of the angle between them. If the angle between them is zero, nothing, because they are lined up with each other, then cos 𝜃 = 1, and the dot product is just the Cartesian product of the absolute magnitudes. If 𝜃 = 90°, then cos 𝜃 = 0, and the dot product is also zero. If 𝜃 is some other angle besides a right angle, between zero and 180°, then the dot product will be something less than the Cartesian product of the absolute magnitudes of the two vectors. And so the dot product can be thought of as a measure of parallelism in the two vectors. The more parallel they are, the more fully will their dot product share in the full magnitude of the Cartesian product of their absolute magnitudes, and the less, the less.

THE CROSS PRODUCT

Unlike the dot product, the cross product is not defined for 2-D vectors, but only for 3-D vectors. For a pair of 3-D vectors 𝒂 = ⟨𝑎₁, 𝑎₂, 𝑎₃⟩ and 𝒃 = ⟨𝑏₁, 𝑏₂, 𝑏₃⟩ the cross product is

𝒂 × 𝒃 = ⟨𝑎₂𝑏₃ − 𝑎₃𝑏₂, 𝑎₃𝑏₁ − 𝑎₁𝑏₃, 𝑎₁𝑏₂ − 𝑎₂𝑏₁⟩

In other words, the cross product of the two vectors 𝒂 and 𝒃 is the vector whose three coordinates are

𝑎₂𝑏₃ − 𝑎₃𝑏₂ (on the x-axis)

𝑎₃𝑏₁ − 𝑎₁𝑏₃ (on the y-axis)

𝑎₁𝑏₂ − 𝑎₂𝑏₁ (on the z-axis)
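As a small illustration, the coordinate formula just given can be written out directly in Python. This is only a sketch: the function name `cross` and the sample vectors are assumptions made for the example and do not come from the text.

```python
def cross(a, b):
    """Cross product of two 3-D vectors given as (x, y, z) tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,   # coordinate on the x-axis
            a3 * b1 - a1 * b3,   # coordinate on the y-axis
            a1 * b2 - a2 * b1)   # coordinate on the z-axis


# Sample vectors chosen only for illustration.
a = (1, 2, 3)
b = (4, 5, 6)
print(cross(a, b))  # (-3, 6, -3)
```

Each of the three coordinates leaves out the subscript of its own axis: the x-coordinate uses only subscripts 2 and 3, the y-coordinate only 3 and 1, and the z-coordinate only 1 and 2.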

Note that the × notation is the usual sign for ordinary multiplication, but the way to


tell that this is not the operation here signified is by the fact that it is an operation on vectors:

𝑎 × 𝑏 = Cartesian multiplication

𝒂 × 𝒃 = Cross product

Unlike the dot product, the cross product is a vector, for which reason it is also sometimes called a vector product.

THEOREM 2

A cross product of two vectors is perpendicular to each of those vectors, and therefore also to the plane containing them.

Let 𝒂 = ⟨𝑎₁, 𝑎₂, 𝑎₃⟩ and 𝒃 = ⟨𝑏₁, 𝑏₂, 𝑏₃⟩ be two nonzero vectors in 3-D space. I say that their cross product is perpendicular to both of them.

To see why, let us see what must be true of any vector 𝒗 = ⟨𝑣₁, 𝑣₂, 𝑣₃⟩ that is perpendicular to them both. Since such a vector will be at an angle of 90° to both, therefore cos 𝜃 = 0 in each case, and therefore the dot product of this vector with either of our given vectors will be zero [Thm.1]. That is

𝒂 ∙ 𝒗 = 0 and 𝒃 ∙ 𝒗 = 0

Supplying the definition of the dot product, this means that

(1) 𝑎₁𝑣₁ + 𝑎₂𝑣₂ + 𝑎₃𝑣₃ = 0

(2) 𝑏₁𝑣₁ + 𝑏₂𝑣₂ + 𝑏₃𝑣₃ = 0

From Equation (1), supposing that 𝑎₁ is not zero, we have

𝑣₁ = (−𝑎₂𝑣₂ − 𝑎₃𝑣₃) / 𝑎₁

Substituting this into Equation (2), we have


𝑏₁[(−𝑎₂𝑣₂ − 𝑎₃𝑣₃) / 𝑎₁] + 𝑏₂𝑣₂ + 𝑏₃𝑣₃ = 0

−𝑎₂𝑏₁𝑣₂/𝑎₁ − 𝑎₃𝑏₁𝑣₃/𝑎₁ + 𝑏₂𝑣₂ + 𝑏₃𝑣₃ = 0

[𝑏₂ − 𝑎₂𝑏₁/𝑎₁]𝑣₂ = 𝑎₃𝑏₁𝑣₃/𝑎₁ − 𝑏₃𝑣₃

𝑣₂ = [𝑎₃𝑏₁/𝑎₁ − 𝑏₃]𝑣₃ / [𝑏₂ − 𝑎₂𝑏₁/𝑎₁]

𝑣₂ = [(𝑎₃𝑏₁ − 𝑎₁𝑏₃)/𝑎₁]𝑣₃ / [(𝑎₁𝑏₂ − 𝑎₂𝑏₁)/𝑎₁]

𝑣₂ = [𝑎₃𝑏₁ − 𝑎₁𝑏₃]𝑣₃ / [𝑎₁𝑏₂ − 𝑎₂𝑏₁]

Plugging this expression for 𝑣₂ back into either (1) or (2), and solving for 𝑣₁, gives

𝑣₁ = [𝑎₂𝑏₃ − 𝑎₃𝑏₂]𝑣₃ / [𝑎₁𝑏₂ − 𝑎₂𝑏₁]

Therefore, so long as a vector 𝒗 = ⟨𝑣₁, 𝑣₂, 𝑣₃⟩ is such that 𝑣₁, 𝑣₂, and 𝑣₃ satisfy the last two equations above, we will be able to work backwards and say that equations (1) and (2) will be true, and consequently 𝒗 will have to be perpendicular to both the given vectors. But this leaves us quite free to choose 𝑣₃ at will, and then the last two equations above will tell us what 𝑣₁ and 𝑣₂ will be, and such a vector must always be perpendicular to the given vectors, regardless of our choice of 𝑣₃. Looking at the denominators in the last two equations, and noticing they are the same, we choose

𝑣₃ = 𝑎₁𝑏₂ − 𝑎₂𝑏₁

giving us

𝑣₂ = 𝑎₃𝑏₁ − 𝑎₁𝑏₃

and

𝑣₁ = 𝑎₂𝑏₃ − 𝑎₃𝑏₂

And so the vector defined by these coordinates must be perpendicular to the given vectors. That is,

𝒗 = ⟨𝑎₂𝑏₃ − 𝑎₃𝑏₂, 𝑎₃𝑏₁ − 𝑎₁𝑏₃, 𝑎₁𝑏₂ − 𝑎₂𝑏₁⟩


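is perpendicular to both the given vectors. (Indeed, putting these coordinates directly into equations (1) and (2) makes every term cancel, so the result holds even in cases where one of the divisions made along the way would not be allowed.) But this vector is by definition the cross product of the given vectors. Therefore the cross product of two given vectors is perpendicular to them both.

Q.E.D.

THEOREM 3

For any two vectors, the square of the absolute value of their cross product is equal to the product of the squares of their absolute values minus the square of their dot product.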

That is, if 𝒂 and 𝒃 are two vectors, where

𝒂 = ⟨𝑎₁, 𝑎₂, 𝑎₃⟩ and 𝒃 = ⟨𝑏₁, 𝑏₂, 𝑏₃⟩

then

‖𝒂 × 𝒃‖² = ‖𝒂‖²‖𝒃‖² − (𝒂 ∙ 𝒃)²

This is more like a lemma for the upcoming theorems than a theorem in its own right. To prove it, drop a perpendicular from T, the tip of 𝒂, down to P in the x-y plane, and draw from P the perpendiculars to the x-axis and y-axis, and from T draw a line to the z-axis parallel to PO (where O is the origin). Thus we get the coordinates of 𝒂, namely 𝑎₁, 𝑎₂, 𝑎₃.

[Figure: vector 𝒂 drawn from the origin in x-y-z space, with tip T, the foot P of its perpendicular in the x-y plane, and its coordinates 𝑎₁, 𝑎₂, 𝑎₃ marked on the axes.]

By the Pythagorean Theorem, we see that

‖𝒂‖² = 𝑎₁² + 𝑎₂² + 𝑎₃²

and in just the same way

‖𝒃‖² = 𝑏₁² + 𝑏₂² + 𝑏₃²

and we know that by the definition of a dot product

(𝒂 ∙ 𝒃)² = (𝑎₁𝑏₁ + 𝑎₂𝑏₂ + 𝑎₃𝑏₃)²

Therefore


‖𝒂‖²‖𝒃‖² − (𝒂 ∙ 𝒃)² = (𝑎₁² + 𝑎₂² + 𝑎₃²)(𝑏₁² + 𝑏₂² + 𝑏₃²) − (𝑎₁𝑏₁ + 𝑎₂𝑏₂ + 𝑎₃𝑏₃)²

Now on the right side, all the operations are just Cartesian, so we can multiply all that junk out. The first pair of brackets, multiplied out, gives us

𝑎₁²𝑏₁² + 𝑎₁²𝑏₂² + 𝑎₁²𝑏₃² +

𝑎₂²𝑏₁² + 𝑎₂²𝑏₂² + 𝑎₂²𝑏₃² +

𝑎₃²𝑏₁² + 𝑎₃²𝑏₂² + 𝑎₃²𝑏₃²

and the squared expression on the right, multiplied out, gives us

𝑎₁²𝑏₁² + 𝑎₁𝑏₁𝑎₂𝑏₂ + 𝑎₁𝑏₁𝑎₃𝑏₃ +

𝑎₂𝑏₂𝑎₁𝑏₁ + 𝑎₂²𝑏₂² + 𝑎₂𝑏₂𝑎₃𝑏₃ +

𝑎₃𝑏₃𝑎₁𝑏₁ + 𝑎₃𝑏₃𝑎₂𝑏₂ + 𝑎₃²𝑏₃²

Now we must subtract all this from the previous stuff we multiplied out. But in that previous stuff, 𝑎₁²𝑏₁² and 𝑎₂²𝑏₂² and 𝑎₃²𝑏₃² occur, and in what we just multiplied out above these also occur, and are to be subtracted, so all these terms vanish, leaving

𝑎₁²𝑏₂² − 2𝑎₁𝑎₂𝑏₁𝑏₂ + 𝑎₂²𝑏₁² +

𝑎₁²𝑏₃² − 2𝑎₁𝑎₃𝑏₁𝑏₃ + 𝑎₃²𝑏₁² +

𝑎₂²𝑏₃² − 2𝑎₂𝑎₃𝑏₂𝑏₃ + 𝑎₃²𝑏₂²

and each of the three lines in the expression above turns out to be a nice algebraic square, so it is equal to

[𝑎₁𝑏₂ − 𝑎₂𝑏₁]² + [𝑎₁𝑏₃ − 𝑎₃𝑏₁]² + [𝑎₂𝑏₃ − 𝑎₃𝑏₂]²

So now we know that

‖𝒂‖²‖𝒃‖² − (𝒂 ∙ 𝒃)² = [𝑎₁𝑏₂ − 𝑎₂𝑏₁]² + [𝑎₁𝑏₃ − 𝑎₃𝑏₁]² + [𝑎₂𝑏₃ − 𝑎₃𝑏₂]²

Now the Pythagorean Theorem says that the right side is just the square of the line drawn from the origin to the point (𝑎₂𝑏₃ − 𝑎₃𝑏₂, 𝑎₃𝑏₁ − 𝑎₁𝑏₃, 𝑎₁𝑏₂ − 𝑎₂𝑏₁), since reversing the order of subtraction in the middle bracket does not change its square.


And that length is just the absolute length of a vector with those same coordinates. And a vector with those coordinates is by definition the cross product of the vectors 𝒂 and 𝒃. Therefore

‖𝒂 × 𝒃‖² = ‖𝒂‖²‖𝒃‖² − (𝒂 ∙ 𝒃)²

Q.E.D.

THEOREM 4

The absolute value of the cross product of any two vectors is equal to the Cartesian product of their absolute values times the sine of the angle between them.

That is,

‖𝒂 × 𝒃‖ = ‖𝒂‖‖𝒃‖ sin 𝜃

where 𝜃 is the angle between the vectors 𝒂 and 𝒃.

For: ‖𝒂 × 𝒃‖² = ‖𝒂‖²‖𝒃‖² − (𝒂 ∙ 𝒃)² [Thm.3]

so ‖𝒂 × 𝒃‖² = ‖𝒂‖²‖𝒃‖² − (‖𝒂‖‖𝒃‖ cos 𝜃)² [Thm.1]

or ‖𝒂 × 𝒃‖² = ‖𝒂‖²‖𝒃‖² − ‖𝒂‖²‖𝒃‖² cos² 𝜃

so ‖𝒂 × 𝒃‖² = ‖𝒂‖²‖𝒃‖² (1 − cos² 𝜃)

so ‖𝒂 × 𝒃‖² = ‖𝒂‖²‖𝒃‖² sin² 𝜃

since, by the Pythagorean Theorem, sin² 𝜃 + cos² 𝜃 = 1. So, taking the square root of both sides (and noting that sin 𝜃 is never negative for 𝜃 between 0° and 180°), we have

‖𝒂 × 𝒃‖ = ‖𝒂‖‖𝒃‖ sin 𝜃

Q.E.D.
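Theorems 2, 3, and 4 can be spot-checked numerically. The following Python sketch does this for one pair of vectors; the helper names `dot`, `cross`, and `norm` and the sample vectors are assumptions introduced for the example, and the angle 𝜃 is recovered from Theorem 1. A check on one pair of vectors is of course no substitute for the proofs.

```python
import math


def dot(a, b):
    """Dot product: sum of products of matching coordinates."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product by the coordinate definition."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def norm(a):
    """Absolute value (length) of a vector."""
    return math.sqrt(dot(a, a))


# Sample vectors chosen only for illustration.
a = (1.0, 2.0, 3.0)
b = (-2.0, 0.5, 4.0)
c = cross(a, b)

# Theorem 2: the cross product is perpendicular to both vectors.
perpendicular = (math.isclose(dot(a, c), 0.0, abs_tol=1e-12)
                 and math.isclose(dot(b, c), 0.0, abs_tol=1e-12))

# Theorem 3: |a x b|^2 = |a|^2 |b|^2 - (a . b)^2
lhs3 = norm(c) ** 2
rhs3 = norm(a) ** 2 * norm(b) ** 2 - dot(a, b) ** 2

# Theorem 4: |a x b| = |a| |b| sin(theta), with theta taken from Theorem 1.
theta = math.acos(dot(a, b) / (norm(a) * norm(b)))
lhs4 = norm(c)
rhs4 = norm(a) * norm(b) * math.sin(theta)

print(perpendicular, math.isclose(lhs3, rhs3), math.isclose(lhs4, rhs4))
```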

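[Figure: the cross product drawn from the origin in x-y-z space, with its coordinates 𝑎₂𝑏₃ − 𝑎₃𝑏₂, 𝑎₃𝑏₁ − 𝑎₁𝑏₃, and 𝑎₁𝑏₂ − 𝑎₂𝑏₁ marked on the axes.]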


OBSERVATION: We saw that 𝒂 ∙ 𝒃 is like a measure of parallelism between vectors 𝒂 and 𝒃. Now we see that 𝒂 × 𝒃 is like a measure of perpendicularity. The absolute value of 𝒂 × 𝒃 will be nothing if 𝜃 = 0, since sin 0 = 0. But when 𝜃 = 90°, then sin 𝜃 = 1, and then the absolute value of 𝒂 × 𝒃 will be just the whole Cartesian product of the absolute values of the vectors 𝒂 and 𝒃.

OBSERVATION (THE RIGHT HAND RULE): We cannot write

𝒂 × 𝒃 = ‖𝒂‖‖𝒃‖ sin 𝜃

since the left side is a vector and the right side is a scalar. Therefore we must write

‖𝒂 × 𝒃‖ = ‖𝒂‖‖𝒃‖ sin 𝜃

But since the right side gives the length (or absolute magnitude) of 𝒂 × 𝒃, the only thing the right side is missing in order to express vector 𝒂 × 𝒃 completely is direction. We have seen that 𝒂 × 𝒃 is perpendicular to both 𝒂 and 𝒃 and their plane, but that still leaves two directions it can point in. Which direction does the cross product point in?

If we are using a right-handed coordinate system, then the direction of the cross product is determined by the right hand rule, which works as follows. If you point your right index finger in the direction of 𝒂, and your right middle finger in the direction of 𝒃 (and they do not have to be perpendicular to each other, but just at some angle 𝜃), and then stick your right thumb straight up from the plane of these two fingers, then your thumb is pointing in the direction of the cross product of 𝒂 and 𝒃.

Note that the right hand rule implies that the order of forming a cross product makes a difference:

𝒂 × 𝒃 ≠ 𝒃 × 𝒂

But

𝒂 × 𝒃 = −(𝒃 × 𝒂)

for which reason the cross product is said to be anticommutative.

So if we now let 𝒏 designate a unit vector in the direction perpendicular to 𝒂 and 𝒃 as dictated by the right hand rule, then we may write

𝒂 × 𝒃 = ‖𝒂‖‖𝒃‖ sin 𝜃 𝒏
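The effect of swapping the order can be seen on the simplest possible case. In the sketch below, the names `cross`, `i`, and `j` are assumptions made for the example: `i` and `j` are taken to be unit vectors along the x-axis and y-axis of a right-handed coordinate system, so the right hand rule says that `i × j` should point along the positive z-axis.

```python
def cross(a, b):
    """Cross product by the coordinate definition."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


i = (1, 0, 0)  # unit vector along the x-axis
j = (0, 1, 0)  # unit vector along the y-axis

print(cross(i, j))  # (0, 0, 1)
print(cross(j, i))  # (0, 0, -1)
```

Here 𝜃 = 90° and both lengths are 1, so the length of the result is 1 in either order; only the direction is reversed.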


THEOREM 5

The area of a parallelogram spanned by two vectors is equal to the absolute value of their cross product.

Let there be a parallelogram, two of whose sides are the vectors 𝒂 and 𝒃 that meet at angle 𝜃. Let ℎ be its height, and call its area 𝐴𝑃. I say that

𝐴𝑃 = ‖𝒂 × 𝒃‖

For, from Euclid we know that the area of a parallelogram is equal to that of a rectangle with the same base and height. So

𝐴𝑃 = ‖𝒃‖ℎ

But

ℎ / ‖𝒂‖ = sin 𝜃 / 1

so ℎ = ‖𝒂‖ sin 𝜃

thus 𝐴𝑃 = ‖𝒂‖‖𝒃‖ sin 𝜃

Now ‖𝒂 × 𝒃‖ = ‖𝒂‖‖𝒃‖ sin 𝜃 [Thm.4]

so 𝐴𝑃 = ‖𝒂 × 𝒃‖

Q.E.D.
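As a quick check of Theorem 5, consider a parallelogram whose area is easy to find by base times height. In this Python sketch the function names and the sample vectors are assumptions chosen for the example: one side lies along the x-axis with length 3, and the other side rises to a height of 2 above it, so the area should be 6.

```python
import math


def cross(a, b):
    """Cross product by the coordinate definition."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def parallelogram_area(a, b):
    """Area of the parallelogram spanned by a and b: the length of a x b."""
    c = cross(a, b)
    return math.sqrt(sum(x * x for x in c))


base = (3, 0, 0)   # lies along the x-axis, length 3
side = (1, 2, 0)   # its tip is at height 2 above the base
print(parallelogram_area(base, side))  # 6.0
```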

[Figure: a parallelogram with sides 𝒂 and 𝒃 meeting at angle 𝜃, and height ℎ.]


Appendix 2

Line Integrals

Sometimes a quantity is distributed evenly over a length. Any quantity distributed evenly throughout a cylinder, for example, including its own volume, will be directly proportional to the length of its axis, and this will be true even if the cylinder is a “curvy” one, so long as the cylinder curves smoothly and the cross-sections orthogonal to the curved axis are equal circles.

A smooth wire of uniform bore is just such a thing. Suppose that mass is distributed evenly over a wire that is 35 cm long, so that it has a uniform linear density (symbolized as 𝛿) of 0.5 grams per centimeter. What is the mass of the wire?

Easy. It is (35)(0.5) = 17.5 grams.

But what if instead the mass of the wire is distributed unevenly over the length in accord with some kind of function? For example, if the density of the wire at any point (𝑥, 𝑦, 𝑧) along its axis were

𝛿(𝑥, 𝑦, 𝑧) = 𝑥² + 𝑦² + 𝑧 + 1

then how would we find the mass of the wire?

Just as it is easy to find the area under a straight line parallel to the x-axis (just multiply its length times its uniform height above the x-axis) but trickier to find the area under a curve, because its height continuously varies over the interval, and so we must integrate by finding the limit of approximating sums of little rectangles, so too in the case of our wire. The basic strategy is to divide the wire into a series of short pieces, ∆𝑠₁, ∆𝑠₂, ∆𝑠₃, etc., over each of which the corresponding linear densities are nearly constant, and which can be approximated by choosing any linear density 𝛿₁, 𝛿₂, 𝛿₃, etc., occurring in those intervals. Then we just multiply the length of each piece by the linear density we have chosen along it, add up all the results, and we get a decent approximation of the mass of the wire. In other words, when we divide the wire into 𝑛 pieces, we can say that


Mass of the wire ≈ 𝛿₁∆𝑠₁ + 𝛿₂∆𝑠₂ + 𝛿₃∆𝑠₃ + … + 𝛿ₙ∆𝑠ₙ

And this is a Riemann sum, although it is not a series of rectangles. We can also write it in summation notation:

Mass of the wire ≈ ∑_{𝑖=1}^{𝑛} 𝛿ᵢ∆𝑠ᵢ

Now this approximation can be made to differ from the exact mass of the wire by as little as we please, if we take 𝑛 sufficiently large, and so we can write

Mass of the wire = lim_{𝑛→∞} ∑_{𝑖=1}^{𝑛} 𝛿ᵢ∆𝑠ᵢ

And since the limit of such a sum is an integral, we may write

Mass of the wire = lim_{𝑛→∞} ∑_{𝑖=1}^{𝑛} 𝛿ᵢ∆𝑠ᵢ = ∫_𝑊 𝛿 𝑑𝑠

which quantity is called the line integral of the function 𝛿 with respect to the length of the wire 𝑊.

Let’s go into a little more detail, now, concerning the method of dividing up the wire into segments. We want to do this in a way that will still correlate the dividing points to our coordinate system. Suppose, for example, that the wire in question is in the form of a helical curve 𝐶, and every point on it is the tip of a vector 𝒓 such that

𝒓 = 𝒓(𝑡) = ⟨2 cos 𝑡, 2 sin 𝑡, 𝑡⟩, 𝜋 ≤ 𝑡 ≤ 2𝜋

that is, the coordinates 𝑥, 𝑦, 𝑧 of vector 𝒓 are the results of the functions in the angled brackets performed upon 𝑡, which is an independent variable between 𝜋 and 2𝜋 (or equal to one of them).

Now we can divide the interval between 𝜋 and 2𝜋 (and more generally, the interval between 𝑎 and 𝑏) into 𝑛 equal pieces:

[Figure: the interval from 𝑎 to 𝑏 divided at the points 𝑡₀, 𝑡₁, 𝑡₂, 𝑡₃, …, 𝑡ₖ₋₁, 𝑡ₖ, …, 𝑡ₙ.]

If we call each segment ∆𝑡, then we can designate the corresponding inputs into the function 𝒓(𝑡) by their endpoints like this:


𝑡₀ = 𝜋 + 0 ∆𝑡

𝑡₁ = 𝜋 + 1 ∆𝑡

𝑡₂ = 𝜋 + 2 ∆𝑡

. . .

𝑡ₙ = 𝜋 + 𝑛 ∆𝑡

Plugging these values into the function 𝒓(𝑡) will yield certain points (tips of vectors) along the curve 𝐶:

𝒓(𝑡₀) = 𝒓₀

𝒓(𝑡₁) = 𝒓₁

𝒓(𝑡₂) = 𝒓₂

. . .

𝒓(𝑡ₙ) = 𝒓ₙ

In this way, a subdivision of [𝑎, 𝑏] into 𝑛 equal pieces leads to a subdivision of the curve 𝐶 into 𝑛 pieces, with lengths ∆𝑠₁, ∆𝑠₂, ∆𝑠₃, etc., which subdivision we can use to form a sum of products that approximates the mass in the wire as nearly as we please. So, in general, we may define a line integral as follows.

Let 𝐶 be a smooth curve (whether in a plane or in 3-D space) defined by a vector function 𝒓(𝑡) over some interval [𝑎, 𝑏]. Let 𝑓 be a function defined on 𝐶 (we looked at a function for density in our example). Let 𝐶 be divided into 𝑛 pieces with endpoints 𝒓₀, 𝒓₁, 𝒓₂, …, 𝒓ₙ where 𝒓₀ = 𝒓(𝑡₀) and in general 𝒓ₖ = 𝒓(𝑡ₖ) and the points 𝑡₀, 𝑡₁, 𝑡₂, …, 𝑡ₙ divide the interval [𝑎, 𝑏] into 𝑛 equal parts. Choose any point 𝒄ₖ on each piece of the curve 𝐶. Then we can define the Riemann sum

𝑅ₙ = ∑_{𝑘=1}^{𝑛} 𝑓(𝒄ₖ)∆𝑠ₖ

If there is a number 𝐼 to which such a Riemann sum can be made as nearly equal as we please by taking 𝑛 sufficiently large, then 𝑓 is said to be integrable with respect to arc length on 𝐶, and

𝐼 = lim_{𝑛→∞} ∑_{𝑘=1}^{𝑛} 𝑓(𝒄ₖ)∆𝑠ₖ = ∫_𝐶 𝑓 𝑑𝑠

and the number 𝐼 is called the line integral with respect to arc length of 𝑓 along 𝐶.
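The Riemann sum for the helical wire can be carried out numerically. The Python sketch below is only an illustration under stated assumptions: it uses the example density 𝛿 = 𝑥² + 𝑦² + 𝑧 + 1 and the example curve 𝒓(𝑡) = ⟨2 cos 𝑡, 2 sin 𝑡, 𝑡⟩ on 𝜋 ≤ 𝑡 ≤ 2𝜋, it approximates each ∆𝑠ₖ by the straight-line distance between 𝒓ₖ₋₁ and 𝒓ₖ, and it takes 𝒄ₖ at the middle value of 𝑡 on each piece. The function names are hypothetical. The comparison value √5(5𝜋 + 1.5𝜋²) is not in the text; it is worked out by hand from the fact that on this curve 𝑑𝑠 = √5 𝑑𝑡 and 𝛿 = 5 + 𝑡.

```python
import math


def r(t):
    """Point on the helical wire for parameter t."""
    return (2 * math.cos(t), 2 * math.sin(t), t)


def density(p):
    """Linear density at the point p = (x, y, z)."""
    x, y, z = p
    return x * x + y * y + z + 1


def wire_mass(n, a=math.pi, b=2 * math.pi):
    """Riemann sum of density times piece length over n pieces."""
    dt = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        t_prev = a + (k - 1) * dt
        t_k = a + k * dt
        ds = math.dist(r(t_prev), r(t_k))   # chord length standing in for the arc length
        c_k = r((t_prev + t_k) / 2)         # one chosen point on the k-th piece
        total += density(c_k) * ds
    return total


# Hand-computed value of the line integral for comparison (an assumption of this sketch).
exact = math.sqrt(5) * (5 * math.pi + 1.5 * math.pi ** 2)

for n in (10, 100, 1000):
    print(n, wire_mass(n), exact)
```

As 𝑛 grows, the printed sums settle toward the comparison value, which is what the definition of the line integral leads us to expect.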

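[Figure: the curve 𝐶 in x-y-z space, divided at the points 𝒓₀, 𝒓₁, 𝒓₂, 𝒓₃, …, 𝒓ₖ₋₁, 𝒓ₖ, …, 𝒓ₙ into pieces of length ∆𝑠₁, ∆𝑠₂, ∆𝑠₃, …, ∆𝑠ₖ, with chosen points 𝒄₁, 𝒄₂, 𝒄₃, …, 𝒄ₖ on the pieces.]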


Appendix 3

Double Integrals

There is a beautiful analogy between area under a curve as an interpretation of an integral of a function of one independent variable, and volume under a surface as an interpretation of an integral of a function of two independent variables. This idea will lead us to another kind of integral known as a double integral.

Recall the definition of an integral of a function 𝑓 of one independent variable, interpreted as the area under a curve over some interval along the x-axis. First we divide the interval into 𝑛 equal parts, and on each of these ∆𝑥 segments build a rectangle with its height being the value of the function at some point along the segment. For example, on the ∆𝑥 segment between 𝑥₀ and 𝑥₁, we choose a point 𝑥₁*, and the height of our rectangle on that segment will be 𝑓(𝑥₁*). In this way, we can approximate the area under the curve with a sum of such rectangles, which sum is a Riemann sum:

area under curve ≈ 𝑓(𝑥₁*)∆𝑥 + 𝑓(𝑥₂*)∆𝑥 + … + 𝑓(𝑥ᵢ*)∆𝑥 + … + 𝑓(𝑥ₙ*)∆𝑥

or

area under curve ≈ ∑_{𝑖=1}^{𝑛} 𝑓(𝑥ᵢ*)∆𝑥

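[Figure: a curve above the x-axis interval from 𝑎 to 𝑏, which is divided at 𝑥₀, 𝑥₁, 𝑥₂, 𝑥₃, …, 𝑥ₙ₋₁, 𝑥ₙ, with chosen points 𝑥₁*, 𝑥₂*, 𝑥₃*, …, 𝑥ₙ* in the segments.]

Since this sum, by taking 𝑛 sufficiently large, can be made to differ from the exact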


area under the curve by as little as we please, therefore the limit of this sum as 𝑛 approaches infinity (or becomes larger without bound) is equal to the area under the curve, and this limit is called the integral of the function over the interval:

area under curve = lim_{𝑛→∞} ∑_{𝑖=1}^{𝑛} 𝑓(𝑥ᵢ*)∆𝑥 = ∫_𝑎^𝑏 𝑓(𝑥) 𝑑𝑥
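The freedom to choose any point in each segment can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the function name `riemann_sum`, the parameter `pick` (0 chooses the left end of each segment, 1 the right end, 0.5 the middle), and the sample function 𝑓(𝑥) = 𝑥² on [0, 3], whose area under the curve is 9.

```python
def riemann_sum(f, a, b, n, pick=0.5):
    """Sum of f(x_i*) * dx over n equal segments of [a, b].

    pick (between 0 and 1) says where in each segment x_i* is chosen.
    """
    dx = (b - a) / n
    return sum(f(a + (i + pick) * dx) * dx for i in range(n))


def f(x):
    return x * x


for n in (10, 100, 1000):
    left = riemann_sum(f, 0.0, 3.0, n, pick=0.0)
    right = riemann_sum(f, 0.0, 3.0, n, pick=1.0)
    print(n, left, right)
```

For this increasing function the left-end sums fall short of 9 and the right-end sums exceed it, but both close in on 9 as 𝑛 grows.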

Now we want to define an integral of a function of two independent variables, first of all interpreted geometrically, in a manner analogous to what we just did. Since a function in two independent variables describes a surface rather than a line, therefore we should expect the integral to give us a measure of the volume between the surface and the x-y plane, rather than an area between a line and the x-axis. Consequently, instead of evaluating the integral over a length, that is, over an x-axis interval [𝑎, 𝑏], we will evaluate it over an area, a certain region in the x-y plane, and for simplicity we will use a rectangle (although this is not strictly necessary), namely [𝑎, 𝑏] × [𝑐, 𝑑].

So we aim to compute the volume between our surface 𝑆 and the “shadow” under it, the rectangular region 𝑅, given a two-variable function describing 𝑆, namely 𝑓(𝑥, 𝑦). The first step is to find a way to express an approximation of this volume. As with the area under the curve, where we divide the interval [𝑎, 𝑏] into 𝑛 equal pieces each one of which is of length ∆𝑥 = (𝑏 − 𝑎)/𝑛, so we divide region 𝑅 into 𝑛 × 𝑚 rectangles each one of which is of area ∆𝐴 = [(𝑏 − 𝑎)/𝑛][(𝑑 − 𝑐)/𝑚], by dividing [𝑎, 𝑏] into 𝑛 equal parts, and [𝑐, 𝑑] into 𝑚 equal parts. And just as we chose any point 𝑥ᵢ* anywhere along ∆𝑥ᵢ, and built a rectangle on that ∆𝑥 as base and with height 𝑓(𝑥ᵢ*), so now we choose any point (𝑥ᵢ*, 𝑦ⱼ*) inside each of the 𝑛 × 𝑚 equal rectangles into which we divided 𝑅, and build a box (a rectangular parallelepipedal solid) with height 𝑓(𝑥ᵢ*, 𝑦ⱼ*), and base ∆𝐴.

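[Figure: the surface 𝑆 in x-y-z space above the rectangular region 𝑅 = [𝑎, 𝑏] × [𝑐, 𝑑], together with a view of 𝑅 in the x-y plane divided at 𝑥₀, 𝑥₁, …, 𝑥ᵢ, …, 𝑥ₙ₋₁, 𝑥ₙ and at 𝑦₀, …, 𝑦ⱼ, …, 𝑦ₘ.]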


Evidently, the volume under surface 𝑆 is approximately equal to the sum of all these boxes. Taking this sum means to take first the sum of all boxes 𝑓(𝑥ᵢ*, 𝑦ⱼ*)∆𝐴 in which 𝑖 = 1, and in which 𝑗 ranges from 𝑗 = 1 to 𝑗 = 𝑚, then to do that all over again where 𝑖 = 2, and again where 𝑖 = 3, etc., all the way to 𝑖 = 𝑛. So we are taking a double sum, or a sum of sums, because it is a sum of the rows of boxes (where each row is lined up parallel to the y-axis, since 𝑖 stays fixed along it while 𝑗 varies, and each such row is itself a sum of boxes). We may write this as follows:

volume under 𝑆 = 𝑉𝑆 ≈ ∑_{𝑖=1}^{𝑛} ∑_{𝑗=1}^{𝑚} 𝑓(𝑥ᵢ*, 𝑦ⱼ*)∆𝐴

But by taking 𝑛 and 𝑚 sufficiently large, this sum of boxes may be made to differ by as little as we please from the exact volume under 𝑆, so we may write

𝑉𝑆 = lim_{𝑛,𝑚→∞} ∑_{𝑖=1}^{𝑛} ∑_{𝑗=1}^{𝑚} 𝑓(𝑥ᵢ*, 𝑦ⱼ*)∆𝐴

And this is an example of a double integral; more specifically, it is an integral of a function of two independent variables over a rectangle, which we designate like this:

∬_𝑅 𝑓(𝑥, 𝑦) 𝑑𝐴 = lim_{𝑛,𝑚→∞} ∑_{𝑖=1}^{𝑛} ∑_{𝑗=1}^{𝑚} 𝑓(𝑥ᵢ*, 𝑦ⱼ*)∆𝐴

We use two integral signs to signify that we are dealing with integration over a 2-D region, and to remind us that it is the limit of a double sum. We use the expression 𝑑𝐴 as the differential rather than 𝑑𝑥 for similar reasons: because this limit is not taken as just one interval, ∆𝑥, shrinks to zero, but as the product of two intervals, ∆𝑥∆𝑦, or ∆𝐴, shrinks to zero. Instead of bounds of integration, such as 𝑎 and 𝑏, we write 𝑅 under the integral signs, denoting the region over which we are taking the integral.
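The double sum translates naturally into two nested loops. The Python sketch below is an illustration only: the function name `double_riemann_sum`, the choice of the center of each small rectangle as the point (𝑥ᵢ*, 𝑦ⱼ*), and the sample function 𝑓(𝑥, 𝑦) = 𝑥² + 𝑦 over [0, 1] × [0, 2] are all assumptions made for the example. For that sample function the volume works out by hand to 8/3.

```python
def double_riemann_sum(f, a, b, c, d, n, m):
    """Sum of box volumes f(x_i*, y_j*) * dA over an n-by-m grid on [a, b] x [c, d]."""
    dx = (b - a) / n
    dy = (d - c) / m
    dA = dx * dy
    total = 0.0
    for i in range(1, n + 1):            # outer sum over i
        x_star = a + (i - 0.5) * dx      # center of the i-th x-segment
        row = 0.0
        for j in range(1, m + 1):        # inner sum over j: one row of boxes
            y_star = c + (j - 0.5) * dy  # center of the j-th y-segment
            row += f(x_star, y_star) * dA
        total += row
    return total


def f(x, y):
    return x * x + y


print(double_riemann_sum(f, 0.0, 1.0, 0.0, 2.0, 200, 200))  # close to 8/3
```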

[Figure: a sketch in x-y-z coordinates.]


Appendix 4

Surface Integrals

In Appendix 2, we learned about line integrals, which enabled us to find the total quantity of something (such as mass) that is unevenly (but functionally, that is, in accord with some function) distributed over a length. Now we will learn about surface integrals, which enable us to find the total quantity of something that is unevenly (but “functionally”) distributed over a surface.

A surface integral is a generalized form of multiple integrals (e.g. of double integrals), integrating some quantity distributed over a wavy surface as opposed to a flat one. Our double integral example in Appendix 3 integrated box-volumes over a flat surface, that is, over a region of the x-y plane, and the result was a volume (under a surface, over a flat region). What if we wanted to integrate not box-volumes, but another quantity, such as mass (or charge, or whatever), that varied with location in a wavy surface? Then we would get not a volume, but a total mass (or charge, or whatever) of the surface, and the integral we are seeking is called a surface integral.

Let 𝑆 be a thin surface described by a function

𝑧 = 𝑔(𝑥, 𝑦)

and let it stand directly above a rectangular region 𝑅, just as in Appendix 3. Also, suppose that the mass-density 𝛿, that is, the mass per unit area at any given point in the surface, is determined in accord with a function

𝛿 = 𝑓(𝑥, 𝑦, 𝑔(𝑥, 𝑦))

where we do not write 𝑓(𝑥, 𝑦, 𝑧) because we are interested in the mass in the surface 𝑆, and the z-values for the surface are not independent of 𝑥 and 𝑦, but are determined by them.


How do we find the mass of the whole surface 𝑆?

First we chop up 𝑅, the rectangular region under 𝑆, into 𝑛 × 𝑛 parts and orthogonally project this grid onto 𝑆 itself, thus chopping up 𝑆 also into 𝑛 × 𝑛 parts. Then in each resulting element of 𝑆, we choose one of the mass densities that occur in it, and consider this to be approximately a constant in that element, if we take 𝑛 sufficiently large.

We may designate each element of 𝑆 as 𝑆ᵢⱼ (where the numbers 𝑖 and 𝑗 give the coordinates of the corner of the corresponding rectangular element in 𝑅 that lies closest to the origin of our coordinate system), and designate the mass density we choose within it as 𝛿ᵢⱼ. Then if we multiply each area element 𝑆ᵢⱼ by the mass density we chose within that element, we will get an approximation of the mass in that element of 𝑆. Adding up all such products yields a decent approximation of the total mass in 𝑆, for 𝑛 sufficiently large. So we may write

𝑀𝑆 ≈ ∑_{𝑖=1}^{𝑛} ∑_{𝑗=1}^{𝑛} 𝛿ᵢⱼ𝑆ᵢⱼ

This is of course just a Riemann sum, more particularly a double sum. Since this approximation of the mass in 𝑆 can be made to differ from the mass in 𝑆 by less than any given difference, simply by taking 𝑛 sufficiently large, therefore the limit of this sum, as 𝑛 increases without bound, is equal to the mass of 𝑆, and we write this as a surface integral:

𝑀𝑆 = lim_{𝑛→∞} ∑_{𝑖=1}^{𝑛} ∑_{𝑗=1}^{𝑛} 𝛿ᵢⱼ𝑆ᵢⱼ = ∬_𝑆 𝑓(𝑥, 𝑦, 𝑔(𝑥, 𝑦)) 𝑑𝜎

where the 𝑑𝜎 (the second letter is the Greek letter sigma) indicates that the surface integral was taken by diminishing the area-elements of region 𝑆 toward zero.
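The double sum for the mass of a surface can also be tried numerically. The Python sketch below rests on several assumptions that are not in the text: each element 𝑆ᵢⱼ is approximated by two flat triangles whose corners lie on the surface, the area of each triangle is taken as half the length of a cross product (half the parallelogram area of Theorem 5), and the density 𝛿ᵢⱼ is sampled above the center of the corresponding rectangle in 𝑅. The function names, the sample surface 𝑧 = 𝑥 + 𝑦 over [0, 1] × [0, 1], and the sample density 𝛿 = 1 + 𝑧 are likewise hypothetical. For this flat, tilted sample surface the mass works out by hand to 2√3.

```python
import math


def cross(a, b):
    """Cross product by the coordinate definition."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def triangle_area(p, q, r):
    """Half the area of the parallelogram spanned by q - p and r - p."""
    u = tuple(x - y for x, y in zip(q, p))
    v = tuple(x - y for x, y in zip(r, p))
    return 0.5 * math.sqrt(sum(x * x for x in cross(u, v)))


def surface_mass(f, g, a, b, c, d, n):
    """Double sum of density times element area over an n-by-n grid."""
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = a + i * dx, c + j * dy
            x1, y1 = x0 + dx, y0 + dy
            # Corners of the rectangle in R, lifted onto the surface z = g(x, y).
            p00 = (x0, y0, g(x0, y0))
            p10 = (x1, y0, g(x1, y0))
            p11 = (x1, y1, g(x1, y1))
            p01 = (x0, y1, g(x0, y1))
            area = triangle_area(p00, p10, p11) + triangle_area(p00, p11, p01)
            xm, ym = x0 + dx / 2, y0 + dy / 2
            delta = f(xm, ym, g(xm, ym))   # density chosen within this element
            total += delta * area
    return total


def g(x, y):
    return x + y          # sample surface (a tilted plane)


def f(x, y, z):
    return 1 + z          # sample mass density


print(surface_mass(f, g, 0.0, 1.0, 0.0, 1.0, 50), 2 * math.sqrt(3))
```

The same function can be given a genuinely wavy 𝑔, in which case the two-triangle areas are only approximations that improve as 𝑛 grows.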