
CONSTRUCTING THE COMPLEX NUMBERS

Al Cuoco

EDC

Introduction

The development of the complex number system C in precalculus or algebra 2 texts usually follows one of two paths:

(1) The existence of an “imaginary unit” $i = \sqrt{-1}$ is postulated in order to solve certain quadratic equations.

(2) The complex numbers are “defined” by giving $R^2$ an algebraic structure in which addition is defined as ordinary vector addition and multiplication is defined by the formula:

(a, b) · (c, d) = (ac − bd, bc + ad)

The properties of complex numbers are then listed. Then i is defined as the special pair (0, 1), and (1, 0) is identified with 1, so that (a, b) “equals” a + bi.

Both of these methods seem arbitrary to students. But this doesn’t alter the effectiveness of either development of the complex numbers, because within a paragraph of the definitions, students are practicing the real point of the unit: they work pages of problems in which they have to “simplify” things like

$$\frac{\sqrt{2} + 3i}{3 - \sqrt{2}\,i}$$

and students quickly forget that there was ever any confusion about what these new objects are; the meaning of complex numbers resides in the use of the properties that allows one to transform expressions like the one above.

There is in this tradition, as there is in most traditions in school mathematics, a germ of something very important. In (Katz 1996), Katz argues that mathematicians dealt with the algebraic behavior of complex numbers long before C was encapsulated into an algebraic system. For example, given a cubic equation with real coefficients and roots, in applying Cardan’s algorithm, one often needs to perform algebraic manipulations on expressions that involve square roots of negative numbers. If you “make believe” that these creatures obey the usual rules of algebra in R, then they eventually drop out of the calculations and you are left with real numbers (and correct solutions).

This work was supported in part by the National Science Foundation, grant DUE 9450731. It is part of the Gateways to Advanced Mathematical Thinking project at Education Development Center. The opinions expressed here are not necessarily those of the NSF.

But this is a long way from “introducing” i into the number system or defining an algebra on ordered pairs. In the mid-1970s, I began looking for an approach to complex numbers that would make sense to my high school algebra students, connect to other mathematics they knew, and fit into a broader mathematical landscape. What I finally adopted was an approach, described below, that goes back to Kronecker. It uses a technique that is central to algebra and number theory (“reducing modulo a prime”), and it parallels the construction of other mathematical structures, establishing connections that have been quite productive over the past century and a half. Others, most notably Sawyer (Sawyer, 1956), have used this approach in undergraduate algebra texts; my version was for high school students, so it was more informal, less rigorous, and based in the detailed analysis of algebraic calculations.

The basic idea is to exploit the similarities between Z, the ring of ordinary integers, and R[x], the ring of polynomials in one variable with coefficients in R. Students practice the technicalities of these similarities throughout their high school courses, often without any inkling that “algebra” on polynomials acts very much like the arithmetic on Z that they studied in elementary school. They factor polynomials into irreducibles, they find greatest common divisors for pairs of polynomials, and they divide one polynomial by another to get a quotient and remainder.

The properties of arithmetic in Z allow one to consider the notion of congruence (first introduced by Gauss in Disquisitiones Arithmeticae (Gauss, 1966)). Two integers are congruent modulo m if their difference is divisible by m. This relation of being congruent modulo m breaks the integers up into classes, and there is a natural algebraic structure on the set of these classes that turns it into a ring. If m is a prime integer, this ring actually contains a reciprocal for each of its non-zero elements, so it is a field. In practice, this process of “reducing modulo m” is done when one wants to “ignore” multiples of m. For example, when finding the units digit of a long calculation, one wants to forget about multiples of 10, so the calculation can be carried out modulo 10. In essence, reduction modulo an integer is a mathematical mechanism for identifying that integer with 0.

The preceding paragraph can be transported via a simple dictionary to a paragraph about polynomials: If F is a field, the properties of arithmetic in F[x] allow one to consider the notion of congruence for polynomials. Two polynomials are congruent modulo g if their difference is divisible by g. This relation of being congruent modulo g breaks F[x] up into classes, and there is a natural algebraic structure on the set of these classes that turns it into a ring. If g is irreducible over F, this ring actually contains a reciprocal for each of its non-zero elements, so it is a field. In this field, things are equal if, as polynomials, they differ by a multiple of g. In particular, the class containing g also contains 0, so we can write g(x) = 0. This means that the class of x is a root of the equation g(x) = 0, so we have constructed a field in which g(x) has a zero. As we’ll see, this process gives no analytic information about the roots of g (it says nothing about the size of complex roots, for example), but it completely determines the algebraic character of the roots, in the sense that it provides all the necessary information for calculating with them. By choosing F to be R and g to be $x^2 + 1$, this process yields the complex numbers.


Now, for years, I had been introducing my second year algebra students to Z/mZ, the ring of integers modulo m (for various m), and we used these systems as counterpoint to ordinary algebra over Q or R (doing this was not all that uncommon in American schools in the late sixties and early seventies). There were two reasons for including the Z/mZ in my algebra classes:

(1) Students are supplied with a spectrum of domains in which they can try out the usual theorems of algebra. The existence of zero divisors in Z/6Z, for example, explains the curious fact that some quadratic equations over Z/6Z have more than 2 roots, and this provides insight into the features of Q that cause the “factor to solve” method to work.

(2) Reducing modulo a troublesome element so that you can see how things behave when you “take it out of the picture” seems like a central and ubiquitous mathematical habit of mind that I want to introduce early and often to my students.

In 1975, I decided to build on these investigations in modular arithmetic to construct quotients of polynomial rings (Q[x] and R[x]). Students started out with sets of orchestrated calculations that focused on the development of the structure preserving nature of the reduction R[x] → R[x]/fR[x]. For example, students might work a problem set in which they were asked to find the remainders when given polynomials were divided by, say, $f(x) = x^2 + x + 1$. And they would reduce things like:

(1) $5x^3 - 3x^2 + 2x + 1$

(2) $x^4 - 3x^3 + 2x - 4$

(3) $5x^3 - 3x^2 + 2x + 1 + x^4 - 3x^3 + 2x - 4$

(4) $(5x^3 - 3x^2 + 2x + 1)(x^4 - 3x^3 + 2x - 4)$

(5) $(5x^3 - 3x^2 + 2x + 1)^2 + (x^2 + x)(x^4 - 3x^3 + 2x - 4)$

The point of these exercises was not to practice algebraic manipulation (although that was a side-effect), but to analyze how the reduction process worked. We eventually came up with some rules:

(1) The remainder of the sum is the sum of the remainders.

(2) The remainder of the product is the remainder of the product of the remainders.

(3) In a long calculation, you can “reduce as you go” rather than calculate and reduce at the end.

(4) Every calculation ends up in the form ax + b.

(5) $x^2$ can be replaced by $-x - 1$ in any calculation.
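Rules of this kind can also be checked experimentally in Mathematica. The following is a minimal sketch, not part of the original classroom materials: the helper name rem and the variable names p1 and p2 are assumptions made for illustration, and the modulus is the $x^2 + x + 1$ used in the problem set.

```mathematica
(* Sketch: test the remainder rules for the modulus f = x^2 + x + 1. *)
(* rem, p1, and p2 are illustrative names, not taken from the paper. *)
f = x^2 + x + 1;
rem[p_] := Collect[PolynomialRemainder[Expand[p], f, x], x]

p1 = 5 x^3 - 3 x^2 + 2 x + 1;
p2 = x^4 - 3 x^3 + 2 x - 4;

(* Rule 1: the remainder of the sum is the sum of the remainders. *)
Expand[rem[p1 + p2] - (rem[p1] + rem[p2])] === 0

(* Rule 2: the remainder of the product is the remainder of the
   product of the remainders. *)
Expand[rem[p1 p2] - rem[rem[p1] rem[p2]]] === 0

(* Rule 4: a reduced result has degree less than 2 in x. *)
Exponent[rem[p1^2 + (x^2 + x) p2], x] < 2

(* Rule 5: reducing x^2 itself shows what it can be replaced by. *)
rem[x^2]
```

Each of the three tests should return True, and the last line should return an expression equal to $-x - 1$.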

Then we tried different moduli f, concentrating on the very different structures that arise when f is composite versus when it is irreducible. Finally, we specialized to $R[x]/(x^2 + 1)R[x]$, developing all the rules for calculating with complex numbers by going back to remainder arithmetic with polynomials.

I was quite pleased with the results of this approach. The algebraic rules for working with complex numbers arose in a much more general context, they were forced on us by the nature of the reduction process, and students never worried about the “mysterious” nature of complex numbers. Because we were dealing with polynomials as formal expressions rather than as functions, the “x” in R[x] was an indeterminate rather than a variable. Changing x to i was therefore far less troublesome than it would have been had we taken a “functions” approach to algebra; it amounted to making a name change. Somewhat later, we went back and looked at C from a geometric perspective, but this “algebra first” development was at the same time accessible and historically correct. So, my first message is that the ideas of structural similarity can be introduced long before a formal course in abstract algebra:

Claim 1. The well-known similarity between the Euclidean domains of the integers and the polynomials over R (or Q) can be taught effectively to students by analyzing calculations rather than via the abstract machinery of ring and field theory.¹

That’s how the situation stayed for the next decade. As I gradually began to introduce technology into my classes, several students (usually as independent study projects) implemented computational models of Z/mZ in languages like Logo (details of these models are described in (Cuoco, 1990)), Scheme, and ISETL. But the lack of algebraic expressions (even polynomials) as first class objects in these environments kept me from implementing the other half of my program: the transport of the constructions from Z to Q[x] or R[x]. It is possible in these environments to mock up a polynomial calculator (see (Dubinsky-Leron, 1991) for an example), but these constructions lack the capability for performing generic calculations (multiplying and simplifying polynomials with literal coefficients, for example), an essential ingredient in my mind for showing how the rules for adding and multiplying in quotients of polynomial rings are imposed by the arithmetic of ordinary polynomials.

With the introduction of programmable symbolic manipulators, it became possible to do what I had wanted to do for many years: Starting with a model for Z and its quotients, one can transport the model, almost word for word, to polynomial rings, getting workable models for algebraic extensions and root fields for polynomial equations. I never did get to carry out this enterprise with high school students; the programs were (and still are) too expensive and hardware-intensive for most public schools.

Claim 2. A computer algebra system enables students to concentrate on analyzing patterns and similarities in calculations rather than on just performing them. That the same computational algorithms can be used in Z and R[x] allows instructors to emphasize the structural similarities between these two domains.

Claim 2 is still a conjecture, at least for high school students. This paper outlines how the approach I implemented with paper and pencil could be adapted to the use of a programmable symbol manipulator; the one used here is Mathematica, but the actual choice of software is irrelevant. During the summer of 1990, a group of high school mathematics teachers² worked through the same construction using Maple. Most of the construction is a translation into Mathematica of the original Logo models that I built with my students; the difference of course is that the Mathematica models work with polynomials as well as with integers.

¹ For the definitions of unfamiliar terms from abstract algebra, see any standard text (Herstein, 1964, for example).

² These teachers were participants at the Institute for Secondary Mathematics and Computer Science Education, a two summer program conducted at Kent State and funded by the NSF.


Why bother with computers at all? The paper and pencil version of this development did what it was supposed to do quite well, so what is the value added by introducing computational media? Without extensive field testing, it’s impossible to say for sure, but I have some conjectures:

(1) This use of technology adheres to the point of view that building a computational model for a mathematical structure helps one build the mental constructions needed to interiorize that structure (Papert and Harel, 1991).

(2) Once built, the models are executable, so that students build a laboratory in which they can conduct experiments, check out conjectures, and look for patterns (Cuoco and Goldenberg, 1997).

(3) The models are easily malleable, so that it’s a simple matter to change an implementation of Z/5Z to a model of Z/6Z or even to a model of $R[x]/(x^2 + x)R[x]$. Analysis of why this is so leads to an investigation of the structural similarities between Z and R[x].

(4) The fact that the environment is capable of doing generic calculations allows students to produce general rules for calculation with elements of an algebraic extension (of, say, Q). For example, the rule for multiplying complex numbers:

(a + bi)(c + di) = (ac − bd) + (ad + bc)i

can be derived by asking the system for the remainder when

(a + bx)(c + dx)

is divided by $x^2 + 1$.

(5) Being able to easily change the modulus while keeping the construction the same is a great way to begin a discussion about the “universality” of polynomial rings as calculation domains.
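The derivation mentioned in conjecture (4) can be sketched in a few lines of Mathematica. This is only an illustration using the built-in PolynomialRemainder directly; the variable names product and reduced are assumptions, and a, b, c, d are left as generic symbols.

```mathematica
(* Sketch: the multiplication rule for complex numbers as a remainder
   on division by x^2 + 1. product and reduced are illustrative names. *)
product = Expand[(a + b x) (c + d x)];
reduced = Collect[PolynomialRemainder[product, x^2 + 1, x], x]

(* Compare with (ac - bd) + (ad + bc) x; this should return True. *)
Expand[reduced - ((a c - b d) + (a d + b c) x)] === 0
```

Replacing $x^2 + 1$ by another modulus in the second line gives the corresponding rule for a different quotient ring, which is the sense in which the construction stays the same while the modulus changes.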

Think of what follows as an outline for a curriculum unit, a set of notes for teachers and instructors that sketches a development of mathematical ideas as it outlines some computational models for the ideas.

In the next section, I’ll sketch some of the relevant number theory, and I’ll show how to implement a Mathematica model for arithmetic with congruence classes in Z, using ideas from (Abelson and Sussman, 1985). In particular, the focus on constructors and selectors allows students to deal with the very demanding notions of quotient rings and cosets without having to deal with the set of cosets as an algebraic system. Then I’ll transport our implementation to R[x] by simply changing a few basic functions. This will lead to a model for C, and the “rules” for algebra in C will fall out of computer calculations with generic elements of C.

A notebook containing the Mathematica code for everything that follows is available at EDC’s Gateways to Advanced Mathematical Thinking website, www.edc.org/LTT/GAMT/ALG.html. The ideas in this paper will be much clearer and much more fun for readers who download the code, run the models, and see for themselves how algebraic structure emerges from careful analysis of calculations.


Some Arithmetic

The integers Z form an integral domain with units 1 and −1. This domain is also Euclidean: If a and b are integers with a > 0, there are integers q and r so that

b = aq + r and 0 ≤ r < a

In fact, q and r are functions of the pair (a, b), and (if b ≥ 0) the division algorithm taught in elementary school is one way to calculate the outputs of these functions. These functions are also built into most programming languages; let’s assume that our language has a function mod that calculates remainders and a function quot that calculates quotients. They behave like this:

In [1]:=mod [17,5]

Out [1]=2

In [2]:=quot [17,5]

Out [2]=3

Note that mod(b, a) = 0 if and only if a | b.

If a, b ∈ Z, we can ask for their greatest common divisors; that is, we can ask for the integers d that have the following properties:

(1) d | a and d | b, and

(2) if c is any common divisor of a and b, then c | d

A pair of integers has two greatest common divisors; one is the negative of the other (so that the two greatest common divisors for a pair of integers can be obtained from one another by multiplying by units). By convention, we’ll take the positive one, so that we’ll call 6 the greatest common divisor of 18 and 60. Let’s denote the greatest common divisor function by gcd, so that gcd(12, 16) = 4.

There is an algorithm for calculating the greatest common divisor of two integers that goes back to Euclid. It is based on the following result:

Euclid’s Theorem. If a and b are integers and a ≠ 0, then

gcd(a, b) = gcd (mod(b, a), a)

Using this theorem and the fact that gcd(0, b) = b, we have a method for calculating the gcd of any two integers. For example,

gcd(124, 1028) = gcd(36, 124) because mod(1028, 124) = 36

= gcd(16, 36) because mod(124, 36) = 16

= gcd(4, 16) because mod(36, 16) = 4

= gcd(0, 4) because mod(16, 4) = 0

= 4

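The calculation (called Euclid’s Algorithm) is often arranged as a chain of long divisions, in which each divisor becomes the next dividend and each remainder becomes the next divisor:

1028 = 8 · 124 + 36
124 = 3 · 36 + 16
36 = 2 · 16 + 4
16 = 4 · 4 + 0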

The function gcd can be modeled in Mathematica by these two lines:

gcd[0, b_] := b

gcd[a_, b_] := gcd[mod[b, a], a]
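To watch Euclid’s theorem at work, the two-line model can be run on the worked example and the successive pairs listed. The following is a self-contained sketch for integers; it assumes mod is the built-in Mod, and the name euclidPairs is introduced here purely for illustration.

```mathematica
(* Sketch: the gcd model from the text, made self-contained for integers. *)
ClearAll[mod, gcd, euclidPairs];
mod := Mod
gcd[0, b_] := b
gcd[a_, b_] := gcd[mod[b, a], a]

gcd[124, 1028]    (* 4, as in the worked example *)

(* The successive pairs (a, b) -> (mod(b, a), a), stopping when a = 0.
   euclidPairs is an illustrative name, not from the paper. *)
euclidPairs[a_, b_] :=
  NestWhileList[{mod[Last[#], First[#]], First[#]} &, {a, b}, First[#] != 0 &]

euclidPairs[124, 1028]
(* {{124, 1028}, {36, 124}, {16, 36}, {4, 16}, {0, 4}} *)
```

The list of pairs is the same chain of equal gcds that appears in the hand calculation.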

Suppose we take two integers, say 6 and 15, and we form the set of all integers that can be written as 6x + 15y where x and y are any integers. This set is called the ideal generated by 6 and 15, and it is denoted by (6, 15) or 6Z + 15Z. For example, 27 ∈ (6, 15) because

27 = 6 · (−18) + 15 · 9

Since 27 ∈ (6, 15), any multiple of 27 is also in (6, 15), because

27k = 6 · (−18k) + 15 · 9k

Similarly, the sum of any two elements of (6, 15) is also in (6, 15). Notice that 4 ∉ (6, 15) because any combination of 6 and 15 must be divisible by 3. It turns out that every multiple of 3 is in (6, 15) because 3 itself is in (6, 15):

3 = 6 · (−2) + 15 · 1

Denoting the set of multiples of 3 by 3Z, we can summarize by saying that (6, 15) = 3Z. In general, we have

Theorem. If a and b are integers and d = gcd(a, b), then

(a, b) = dZ
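A small experiment makes the example (6, 15) = 3Z plausible before any proof. The sketch below only samples coefficients between −10 and 10, so it illustrates the theorem rather than establishing it; the names combos, s, and t are assumptions made for this illustration.

```mathematica
(* Sketch: sample the combinations 6 s + 15 t for small coefficients. *)
combos = Union[Flatten[Table[6 s + 15 t, {s, -10, 10}, {t, -10, 10}]]];

(* Every sampled combination is a multiple of 3. *)
AllTrue[combos, Divisible[#, 3] &]   (* True *)

(* 3 itself occurs (3 = 6 (-2) + 15 (1)), but 4 does not. *)
MemberQ[combos, 3]                   (* True *)
MemberQ[combos, 4]                   (* False *)
```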

In more abstract treatments of arithmetic in Z, the greatest common divisor of two integers is defined as the ideal generated by the integer that we have called their greatest common divisor (that is, if d = gcd(a, b), the greatest common divisor of a and b is defined as dZ). In that case, the above theorem says that the ideal generated by two integers is their greatest common divisor.

The proof of this theorem amounts to showing (just as we did in the above example) that d ∈ (a, b). In other words, the theorem follows from the fact that there are integers x and y so that d = ax + by. In fact, there is a method for calculating the coefficients x and y that can be captured in Mathematica, and it amounts to working Euclid’s algorithm backward. Here are the details:

First, notice that there are many ways to write 3 as a combination of 6 and 15:

3 = 6 · (−2) + 15 · 1
  = 6 · 3 + 15 · (−1)
  = 6 · 8 + 15 · (−3)
  = ...

We’ll produce one set of coefficients. This amounts to making some choices in the case where one of our integers, say a, is 0. Different choices will produce different pairs of coefficients. So, imagine two functions fcoeff and scoeff so that

gcd(a, b) = a · fcoeff(a, b) + b · scoeff(a, b)

Since gcd(0, b) = b for any b, let’s agree that

fcoeff(0, b) = 0 and scoeff(0, b) = 1

In Mathematica:

fcoeff[0, b_] := 0

scoeff[0, b_] := 1

Now, for the general case, look again at how the gcd is calculated, here for 216 and 3162:

3162 = 14 · 216 + 138
216 = 1 · 138 + 78
138 = 1 · 78 + 60
78 = 1 · 60 + 18
60 = 3 · 18 + 6
18 = 3 · 6 + 0

So, gcd(216, 3162) = 6. Write out each of the results of the divisions, solving for the remainders:

138 = 3162 − 14 · 216
78 = 216 − 1 · 138
60 = 138 − 1 · 78
18 = 78 − 1 · 60
6 = 60 − 3 · 18

Now, start with the last equation, and inductively back-substitute, simplifying at each step:

6 = −3 · 18 + 60

= −3(78 − 1 · 60) + 60 = 4 · 60 − 3 · 78

= 4(138 − 1 · 78) − 3 · 78 = −7 · 78 + 4 · 138

= −7(216 − 1 · 138) + 4 · 138 = 11 · 138 − 7 · 216

= 11(3162 − 14 · 216) − 7 · 216 = −161 · 216 + 11 · 3162

We find that 6 = −161 · 216 + 11 · 3162. Notice that 6 is not only a combination of 216 and 3162, it is also a combination of 18 and 60, of 60 and 78, of 78 and 138, and of 138 and 216. Notice also that the pairs

(18, 60) (60, 78) (78, 138) (138, 216) (216, 3162)

are just the quotients and remainders in our calculation:

3162 = 14 · 216 + 138
216 = 1 · 138 + 78
138 = 1 · 78 + 60
78 = 1 · 60 + 18
60 = 3 · 18 + 6
18 = 3 · 6 + 0

Actually, the complete set of quotient-remainder pairs can be arranged like this:

(216, 3162) → (138, 216) → (78, 138) → (60, 78)

→ (18, 60) → (6, 18) → (0, 6)

Now, the equations

6 = −3 · 18 + 60

= −3(78 − 1 · 60) + 60 = 4 · 60 − 3 · 78

= 4(138 − 1 · 78) − 3 · 78 = −7 · 78 + 4 · 138

= −7(216 − 1 · 138) + 4 · 138 = 11 · 138 − 7 · 216

= 11(3162 − 14 · 216) − 7 · 216 = −161 · 216 + 11 · 3162

show that 6 is a combination of the numbers in every pair except the last two in the above sequence, and 6 is also a combination of the numbers in these pairs:

6 = 0 · 0 + 1 · 6
6 = 1 · 6 + 0 · 18


We could think of the complete calculation as

6 = 0 · 0 + 1 · 6
  = 1 · 6 + 0 · 18
  = −3 · 18 + 1 · 60
  = 4 · 60 − 3 · 78
  = −7 · 78 + 4 · 138
  = 11 · 138 − 7 · 216
  = −161 · 216 + 11 · 3162

Euclid’s theorem guarantees that every pair in our sequence has the same gcd because the sequence develops according to the rule:

· · · −→ (s, t) −→ (mod(t, s), s) −→ · · ·

In fact, if

b = qa + r and 0 ≤ r < a

then the calculation for finding gcd(a, b) starts out like this:

(a, b) −→ (r, a) −→ · · ·

Suppose that d = gcd(a, b), so that d = gcd(r, a). As in our calculation with 216 and 3162, we take the coefficients for r and a and “lift” them to coefficients for a and b. That is, suppose we have found integers x and y so that

d = rx + ay

Then, since r = b − aq, we have

d = (b − aq)x + ay = a(y − qx) + bx

In other words,

fcoeff(a, b) = scoeff(r, a) − q · fcoeff(r, a)

and

scoeff(a, b) = fcoeff(r, a)

So, we can complete our Mathematica models with these two lines:

fcoeff[a_, b_] := scoeff[mod[b, a], a] - quot[b, a]*fcoeff[mod[b, a], a]

scoeff[a_, b_] := fcoeff[mod[b, a], a]

For example,

In [1]:=fcoeff [216,3162]

Out [1]=-161

In [2]:=scoeff [216,3162]

Out [2]=11
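For readers who want to run the coefficient functions on their own, here is a self-contained sketch that gathers the definitions given so far and checks the defining identity gcd(a, b) = a · fcoeff(a, b) + b · scoeff(a, b). It assumes, as in the integer examples, that mod and quot stand for the built-in Mod and Quotient; the ClearAll line is added only so the block can be run in a fresh or reused session.

```mathematica
(* Sketch: the integer models collected in one place. *)
ClearAll[mod, quot, gcd, fcoeff, scoeff];
mod := Mod
quot := Quotient

gcd[0, b_] := b
gcd[a_, b_] := gcd[mod[b, a], a]

fcoeff[0, b_] := 0
scoeff[0, b_] := 1
fcoeff[a_, b_] := scoeff[mod[b, a], a] - quot[b, a]*fcoeff[mod[b, a], a]
scoeff[a_, b_] := fcoeff[mod[b, a], a]

{fcoeff[216, 3162], scoeff[216, 3162]}    (* {-161, 11} *)

(* The defining identity for this pair; this should return True. *)
216 fcoeff[216, 3162] + 3162 scoeff[216, 3162] == gcd[216, 3162]
```

Trying other pairs in the last line is a quick way to see that the lifting step produces valid coefficients at every stage of the recursion.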


Congruence classes. Suppose that m is a positive integer. The relation of being congruent modulo m, written a ≡ b (mod m), is an equivalence relation on Z. The set of classes modulo m (“the integers modulo m”) is denoted by Z/mZ. A congruence class modulo m is just a stream of integers in an arithmetic sequence. It’s common to describe a class by giving a representative in the class and the modulus of the class. For example, “the class modulo 7 that contains 13” describes

. . . , −22, −15, −8, −1, 6, 13, 20, . . .

Every class modulo m contains a distinguished representative: an integer between 0 and m − 1. So, a congruence class is determined by two pieces of information: its modulus and its distinguished representative.

The relation a ≡ b (mod m) is equivalent to mod(a, m) = mod(b, m). That is, a is congruent to b modulo m if and only if a and b leave the same remainder when divided by m. Since the set of remainders on division by m is finite (the only possible remainders when you divide by m are 0, 1, 2, 3, . . . , m − 1), there are only a finite number of classes modulo m. In other words, Z/mZ is a finite set.
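These statements about classes are easy to explore with the built-in Mod. The sketch below is an illustration only: it looks at a finite window of integers, so it suggests the general facts rather than proving them.

```mathematica
(* Sketch: the part of the class of 13 modulo 7 lying between -22 and 20. *)
Select[Range[-22, 20], Mod[# - 13, 7] == 0 &]
(* {-22, -15, -8, -1, 6, 13, 20} *)

(* The distinct remainders on division by 7: one for each class. *)
Union[Mod[Range[-50, 50], 7]]
(* {0, 1, 2, 3, 4, 5, 6} *)

(* "Difference divisible by 7" agrees with "same remainder" on a sample. *)
And @@ Flatten[
  Table[Divisible[a - b, 7] === (Mod[a, 7] == Mod[b, 7]),
   {a, -20, 20}, {b, -20, 20}]]
(* True *)
```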

One way to model Z/mZ in a computer language is to imagine a constructor class that takes two inputs a and m and returns an abstract representation for the class modulo m that contains a, and two selectors rep and modulus that take the output of class and return, respectively, the distinguished representative of the class and the modulus of the class.³ In Mathematica, a typical exchange with the computer might be:

In [1]:=modulus [ class [15,6] ]

Out [1]=6

In [2]:=rep [ class [15,6] ]

Out [2]=3

In [3]:=class [15,6] == class [21,6]

Out [3]=True

One Mathematica implementation of this representation makes use of its list processing. We can represent a class as a list of two elements; the first element is its distinguished representative, and its last element is its modulus. The selectors are then just the usual list selectors:

class[a_, m_] := {mod[a, m], m}

rep := First

modulus := Last

³ The constructor-selector theory is a data abstraction mechanism for representing objects via their behavior. See (Abelson and Sussman, 1985) for a complete description.


This constructor-selector mechanism allows students to deal with equivalence classes without the overhead of constructing the set of equivalence classes. Just as young children can work with integers without needing to think of the set of integers as an object, dealing directly with the classes can be thought of as a first step to thinking about Z/nZ in its entirety as an algebraic system.

Arithmetic with classes. We can view the sorting of integers into classes modulo m as a function

Z → Z/mZ

where a ↦ class(a, m). The arithmetic of Z is compatible with this mapping in the sense that if class(a, m) = class(b, m) and class(c, m) = class(d, m), then

class(a + c, m) = class(b + d, m) and

class(ac, m) = class(bd, m)

This follows from standard facts about divisibility and congruence.

We can use our mapping Z → Z/mZ to transport arithmetic on Z to an arithmetic on Z/mZ, by defining the sum or product of two classes to be the class containing the sum or product of their representatives, and the above discussion shows that this arithmetic on Z/mZ is well defined (thinking back to my high school students, this would be a good place for some orchestrated hand calculations). And, we can model this arithmetic in Mathematica:
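One way to see why the arithmetic is well defined (a short derivation offered as a sketch, not necessarily the argument used in class): suppose m divides both a − b and c − d. The two identities

```latex
\begin{align*}
(a + c) - (b + d) &= (a - b) + (c - d), \\
ac - bd &= a(c - d) + d(a - b)
\end{align*}
```

show that m also divides (a + c) − (b + d) and ac − bd, since each right-hand side is a combination of a − b and c − d. So the class of a sum or product does not depend on which representatives are used.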

add[c_, d_] := class[rep[c] + rep[d], modulus[c]]

mult[c_, d_] := class[rep[c]*rep[d], modulus[c]]

Here’s an example of how one can use this package as a Z/mZ calculator:

In [1]:=add [ class [5,7], class [8,7] ]

Out [1]={6, 7}

In [2]:=mult [class [4,10], class [3,10 ] ]

Out [2]={2, 10}


Out [3]={0, 10}

With these definitions and Mathematica models, students can verify the usual defining properties for a commutative ring:

(1) Addition of classes is commutative and associative, and class(0, m) is the additive identity.

(2) Every class has an additive inverse; the process of negating a class can be modeled in Mathematica:

neg[c_] := class[-rep[c], modulus[c]]

(3) Multiplication of classes is commutative and associative, and class(1, m) is the multiplicative identity.

(4) Multiplication distributes over addition.

The usual arguments show that the units in Z/mZ are precisely the classes whose representatives are relatively prime to m. Indeed, the Mathematica package can be used to calculate reciprocals for units: if c is a unit and a = rep(c), then gcd(a, m) = 1. If x = fcoeff(a, m), then

ax + m · scoeff(a, m) = 1

so that ax ≡ 1 (mod m). This means that class(x, m) is the multiplicative inverse for c. In Mathematica, the function recip returns the multiplicative inverse of a class:

recip[c_] := class[fcoeff[rep[c], modulus[c]], modulus[c]]

For example,

In [1]:=recip [ class [15,71] ]

Out [1]={19,71}


Out [2]={1,71}
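A reciprocal can be checked by multiplying it back against the original class. The following self-contained sketch collects the definitions needed for recip and performs that check; the particular check shown is an illustration chosen here, and the ClearAll line is added only so that the block can be run on its own.

```mathematica
(* Sketch: the Z/mZ package reduced to what recip needs. *)
ClearAll[mod, quot, fcoeff, scoeff, class, rep, modulus, mult, recip];
mod := Mod
quot := Quotient

fcoeff[0, b_] := 0
scoeff[0, b_] := 1
fcoeff[a_, b_] := scoeff[mod[b, a], a] - quot[b, a]*fcoeff[mod[b, a], a]
scoeff[a_, b_] := fcoeff[mod[b, a], a]

class[a_, m_] := {mod[a, m], m}
rep := First
modulus := Last
mult[c_, d_] := class[rep[c]*rep[d], modulus[c]]
recip[c_] := class[fcoeff[rep[c], modulus[c]], modulus[c]]

recip[class[15, 71]]                        (* {19, 71} *)

(* A class times its reciprocal should be the multiplicative identity. *)
mult[class[15, 71], recip[class[15, 71]]]   (* {1, 71} *)
```

The check works because 15 · 19 = 285 = 4 · 71 + 1.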

Standard arguments also show that the non-units are precisely the zero divisors, and that these are the classes whose representatives have a common factor with m.⁴
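For a small composite modulus, the units and zero divisors can simply be listed. The sketch below works directly with the representatives 0, ..., 5 of Z/6Z using built-in functions; the names m and reps and the sample quadratic $x^2 + x$ are assumptions chosen for illustration.

```mathematica
(* Sketch: units, zero divisors, and roots of a quadratic in Z/6Z. *)
m = 6;
reps = Range[0, m - 1];

(* Units: representatives relatively prime to m. *)
Select[reps, GCD[#, m] == 1 &]
(* {1, 5} *)

(* Zero divisors: non-zero a with a b ≡ 0 (mod m) for some non-zero b. *)
Select[Rest[reps], Function[a, AnyTrue[Rest[reps], Mod[a #, m] == 0 &]]]
(* {2, 3, 4} *)

(* Roots of x^2 + x in Z/6Z: more than two, because of the zero divisors. *)
Select[reps, Mod[#^2 + #, m] == 0 &]
(* {0, 2, 3, 5} *)
```

Changing m to a prime such as 7 leaves no zero divisors, and the same quadratic then has only the two roots 0 and 6.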

If p is a prime number, the representative of every non-zero class modulo p has no common factor with p, so every non-zero class modulo p is a unit. In other words, Z/pZ is a field. Because Z/pZ has reciprocals for all its non-zero elements, its structure is quite different from the rings Z/mZ where m is composite. For example, if a ∈ Z/pZ and a ≠ 0, the equation ax = b always has a solution (for any b ∈ Z/pZ), so most of the rules from elementary algebra about solving equations hold without modification in Z/pZ.

Transporting the Construction to R[x]

The construction of the ring Z/mZ from Z is perfectly general in the sense that it can be carried out in any ring R that has a structure similar enough to that of Z. If we look carefully at what is required in the construction of Z/mZ, we see that the crucial ingredient is an arithmetic that allows for functions mod and quot determined uniquely by the equation

b = a · quot(b, a) + mod(b, a)

⁴ In my high school classes, we used this analysis to explain why quadratic equations in Z/mZ can have more than two roots and to see the dependence on the non-existence of zero divisors of the “degree = number of roots” theorem.


In Z, the uniqueness in this equation came from our insistence that a > 0 and that 0 ≤ mod(b, a) < a. The Mathematica models for mod and quot are the built-in functions Mod and Quotient, so that in all the examples from the first section, we were using these definitions:

mod := Mod

quot := Quotient

The ring R[x] has exactly the similarities to Z that we need to carry out our construction. Let’s look briefly at the arithmetic of R[x], and then change the Mathematica definitions for mod and quot so that we can work in this new ring.

Arithmetic with Polynomials. R[x] is the ring of polynomials in one variable x with coefficients in R. A polynomial is just a formal sum $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, treated very much the way such expressions are treated in that part of school algebra that concentrates on algebraic manipulation. In particular, we’ll never think of x as an “unknown” that can be replaced by a number, and we’ll never identify a polynomial with a function (although we’ll use shorthand like f(x) or even f to stand for generic polynomials). The degree of a non-zero polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ (with $a_n \neq 0$) is n, so that the degree of a non-zero constant is 0 (by convention, we’ll take the degree of 0 to be −∞). Two polynomials are equal if they have the same degree and the same coefficients (that is, if they are identical except perhaps in the order in which their terms are written). Addition, subtraction, and multiplication are carried out via the usual rules of algebra, and these rules force a structure on R[x] that is quite similar to that of Z.

More precisely, R[x] is an integral domain (a ring without zero divisors). The units in R[x] are precisely the polynomials of degree 0 (that is, the non-zero real numbers). The fact that there are infinitely many units in R[x] and only 2 units in Z will cause minor complications with our Mathematica models, but these can be overcome with a few simple adjustments.

Divisibility. From now on, the word “polynomial” means an element of R[x] or Q[x]. Just as in Z, we can divide to get a quotient and remainder: If f and g are polynomials and f ≠ 0, we can divide g by f to find polynomials q and r so that

g = fq + r and deg(r) < deg(f)

As in Z, q and r are functions of the pair (f, g), and the division algorithm for polynomials taught in algebra 1 courses is one way to calculate the outputs of these functions. These functions are also built into most symbol manipulators. In Mathematica, if we re-define mod and quot like this:

quot[f_, g_] := Collect[PolynomialQuotient[f, g, x], x]

mod[f_, g_] := Collect[PolynomialRemainder[f, g, x], x]

the following Mathematica session shows how they behave:

In [1]:=r := mod [2 x^5 + 3 x^2 + 4 x + 5, 3 x^2 + 2 x + 1]

In [2]:=r

Out [2]=316/81 + (140 x)/81

In [3]:=q := quot [2 x^5 + 3 x^2 + 4 x + 5, 3 x^2 + 2 x + 1]

In [4]:=q

Out [4]=89/81 + (2 x)/27 - (4 x^2)/9 + (2 x^3)/3

So,

$$\mathrm{mod}(2x^5 + 3x^2 + 4x + 5,\ 3x^2 + 2x + 1) = \frac{140}{81}x + \frac{316}{81}$$

$$\mathrm{quot}(2x^5 + 3x^2 + 4x + 5,\ 3x^2 + 2x + 1) = \frac{2}{3}x^3 - \frac{4}{9}x^2 + \frac{2}{27}x + \frac{89}{81}$$

This can be checked either by carrying out the division by hand, or by checking that

$$2x^5 + 3x^2 + 4x + 5 = \left(\frac{2}{3}x^3 - \frac{4}{9}x^2 + \frac{2}{27}x + \frac{89}{81}\right)(3x^2 + 2x + 1) + \frac{140}{81}x + \frac{316}{81}$$
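The same check can be handed to the machine. This sketch calls the built-in PolynomialQuotient and PolynomialRemainder directly rather than the mod and quot wrappers, so that it runs on its own; the variable names g, f, q, and r simply mirror the division statement g = fq + r.

```mathematica
(* Sketch: verify g = f q + r and deg(r) < deg(f) for the example. *)
g = 2 x^5 + 3 x^2 + 4 x + 5;
f = 3 x^2 + 2 x + 1;
q = PolynomialQuotient[g, f, x];
r = PolynomialRemainder[g, f, x];

Expand[f q + r - g] === 0          (* True *)
Exponent[r, x] < Exponent[f, x]    (* True *)

(* The coefficient of x^2 in the quotient is negative. *)
Coefficient[q, x, 2]               (* -4/9 *)
```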

Since Z ⊂ R[x], Z inherits the arithmetic on R[x] defined by these new versions of mod and quot, but this arithmetic is quite different from ordinary arithmetic in Z. In particular, every non-zero integer is a unit, so every non-zero integer divides every other integer:

In [1]:=mod [5,3]

Out [1]=0

In [2]:=quot [5,3]

Out [2]=5/3

Greatest common divisor and Euclid’s Algorithm. A greatest common divisor for two polynomials is defined exactly the same as it is defined for integers:

If f, g ∈ R[x], a greatest common divisor for f and g is a polynomial d that has the following properties:

(1) d | f and d | g, and

(2) if c is any common divisor of f and g, then c | d


Although a pair of integers has two greatest common divisors (differing by a factor of−1), a pair of polynomials has a infinitely many greatest common divisors, because,if d is one greatest common divisor for f and g, any polynomial of the form ud whereu is a unit is another; since there are infinitely many units, there are infinitely manygreatest common divisors. Without moving to more abstract ring theory, we canchoose a distinguished greatest common divisor by observing that Euclid’s theoremis still true in R[x]:

Euclid’s Theorem for R[x]. If f and g are integers and f = 0, then a greatestcommon divisor of g and f is also a greatest common divisor for f and mod(g, f).

So, we can define the function gcd to be the output of Euclid’s algorithm. For example, gcd(2x^2 − x − 1, 6x^2 + x − 1) is calculated like this:

\begin{align*}
6x^2 + x - 1 &= 3\,(2x^2 - x - 1) + (4x + 2) \\
2x^2 - x - 1 &= \left(\tfrac{1}{2}x - \tfrac{1}{2}\right)(4x + 2) + 0
\end{align*}

That is, gcd(2x^2 − x − 1, 6x^2 + x − 1) = 4x + 2. This is not the result produced by the algebra 1 algorithm (that is, 2x + 1), which essentially involves factoring 2x^2 − x − 1 and 6x^2 + x − 1 into irreducibles and taking the product of the common irreducible factors, but it differs from this result by a unit factor, and hence it produces a valid greatest common divisor (and our algorithm is far more efficient, even for polynomials of small degree). It must be understood that all statements about gcd have to be interpreted up to a unit factor. So, when we say that gcd(f, g) = gcd(g, f), we mean that the two calculations produce answers that are the same except for a constant (real valued) multiple. Notice, for example, that two polynomials are relatively prime if their gcd is a unit. So, for example, x + 1 and x − 1 are relatively prime because their gcd is −2.
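Euclid's algorithm for polynomials, and the "up to a unit factor" convention, can be sketched in a few lines of Python. This is an illustrative sketch under the same assumptions as before (hypothetical helper names, coefficient lists with the constant term first, exact rationals); the `monic` helper is one possible way to pick a distinguished representative and is not something the paper prescribes.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zeros; coefficient lists run from the constant term upward."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(g, f):
    """Remainder of g on division by a non-zero polynomial f."""
    f = trim(Fraction(c) for c in f)
    r = trim(Fraction(c) for c in g)
    while len(r) >= len(f):
        shift, coeff = len(r) - len(f), r[-1] / f[-1]
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return r

def poly_gcd(g, f):
    """Euclid's algorithm: replace (g, f) by (f, mod(g, f)) until the remainder is zero."""
    g = trim(Fraction(c) for c in g)
    f = trim(Fraction(c) for c in f)
    while f:
        g, f = f, poly_mod(g, f)
    return g

def monic(p):
    """Scale by a unit so that the leading coefficient is 1."""
    return [c / p[-1] for c in p]

d = poly_gcd([-1, 1, 6], [-1, -1, 2])    # gcd(6x^2 + x - 1, 2x^2 - x - 1)
print([str(c) for c in d])               # ['2', '4'], that is, 4x + 2
print(monic(d) == monic([1, 2]))         # True: 4x + 2 and 2x + 1 differ by a unit
```

Comparing monic forms is one way to decide whether two answers agree up to a unit factor.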

Everything else works. Once we adjust mod and quot, the entire arithmetic package that we developed for Z works perfectly in R[x] (except for one small modification to recip that will be described shortly). Here is an example of a Mathematica session that uses unaltered versions of the functions defined earlier.

In[1]:= gcd[x^4 + x^2 + 1, x^3 + 1]

Out[1]= 1 - x + x^2

In[2]:= fcoeff[x^4 + x^2 + 1, x^3 + 1]

Out[2]= 1

In[3]:= scoeff[x^4 + x^2 + 1, x^3 + 1]

Out[3]= -x

In[4]:= Simplify[x^4 + x^2 + 1 + -x (x^3 + 1)]

Out[4]= 1 - x + x^2

In[5]:= f := 2 x^5 + 5 x^4 - 3 x + 1

In[6]:= g := 3 x^2 + 2 x + 1

In[7]:= h := Simplify[fcoeff[f, g]]

In[8]:= h

Out[8]= 47304/42025 + (243 x)/205

In[9]:= k := Simplify[scoeff[f, g]]

In[10]:= k

Out[10]= (28593 + 34911 x - 6156 x^2 - 92421 x^3 - 33210 x^4)/42025

In[11]:= Simplify[h f + k g]

Out[11]= 75897/42025
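The session above relies on fcoeff and scoeff, which were defined earlier in the paper. As a rough illustration of what such functions compute, here is a Python sketch of the extended Euclidean algorithm for polynomials. The name `ext_gcd` and the helpers are hypothetical, and the unit multiple it returns need not match the one Mathematica produced; only the identity s·a + t·b = d is guaranteed.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zeros; coefficient lists run from the constant term upward."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(g, f):
    """Quotient and remainder of g divided by a non-zero polynomial f."""
    f = trim(Fraction(c) for c in f)
    r = trim(Fraction(c) for c in g)
    q = [Fraction(0)] * max(len(r) - len(f) + 1, 0)
    while len(r) >= len(f):
        shift, coeff = len(r) - len(f), r[-1] / f[-1]
        q[shift] = coeff
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return q, r

def mul(p, q):
    """Product of two polynomials."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def sub(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim(a - b for a, b in zip(p, q))

def ext_gcd(a, b):
    """Return (d, s, t) with s*a + t*b = d, where d is a gcd of a and b."""
    r0, r1 = trim(a), trim(b)
    s0, s1 = [Fraction(1)], []
    t0, t1 = [], [Fraction(1)]
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, sub(s0, mul(q, s1))
        t0, t1 = t1, sub(t0, mul(q, t1))
    return r0, s0, t0

# x^4 + x^2 + 1 and x^3 + 1
d, s, t = ext_gcd([1, 0, 1, 0, 1], [1, 0, 0, 1])
print(d == [1, -1, 1], s == [1], t == [0, -1])   # True True True
```

For this pair the sketch reproduces the coefficients 1 and −x from the session, with gcd 1 − x + x^2.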

Just as in Z, given a polynomial f, we can form the ring of classes modulo f, denoted by R[x]/fR[x]. This ring is not finite, but every class has a distinguished representative whose degree is less than deg(f). So, if we reduce modulo a cubic, every class has a representative of degree 2 or less:

In[1]:= class[3 x^4 + 5 x^3 + 4 x + 1, 2 x^3 + 3 x]

Out[1]= {1 - (7 x)/2 - (9 x^2)/2, 3 x + 2 x^3}

Classes can be added and multiplied as before, and the Mathematica models work without change:

In[1]:= add[class[x^3 + 5 x^2 - 1, x^2 + 3 x + 2], class[x^2 + 1, x^2 + 3 x + 2]]

Out[1]= {-6 - 11 x, 2 + 3 x + x^2}

In[2]:= mult[class[x^3 + 5 x^2 - 1, x^2 + 3 x + 2], class[x^2 + 1, x^2 + 3 x + 2]]

Out[2]= {-43 - 49 x, 2 + 3 x + x^2}
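The class model can be mimicked in Python as a pair (representative, modulus). The sketch below is only an illustration of the idea; `make_class`, `add`, and `mult` are hypothetical stand-ins for the paper's class, add, and mult, whose Mathematica definitions appear earlier in the paper.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zeros; coefficient lists run from the constant term upward."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(g, f):
    """Remainder of g on division by a non-zero polynomial f."""
    f = trim(Fraction(c) for c in f)
    r = trim(Fraction(c) for c in g)
    while len(r) >= len(f):
        shift, coeff = len(r) - len(f), r[-1] / f[-1]
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return r

def poly_mul(p, q):
    """Product of two polynomials."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def make_class(g, f):
    """Pair (reduced representative, modulus), like the {rep, modulus} lists above."""
    return (poly_mod(g, f), trim(Fraction(c) for c in f))

def add(c1, c2):
    """Add two classes with the same modulus."""
    (r1, f), (r2, f2) = c1, c2
    assert f == f2, "classes must share a modulus"
    n = max(len(r1), len(r2))
    total = [(r1[i] if i < len(r1) else 0) + (r2[i] if i < len(r2) else 0)
             for i in range(n)]
    return make_class(total, f)

def mult(c1, c2):
    """Multiply two classes with the same modulus."""
    (r1, f), (r2, f2) = c1, c2
    assert f == f2, "classes must share a modulus"
    return make_class(poly_mul(r1, r2), f)

f = [2, 3, 1]                              # x^2 + 3x + 2
c1 = make_class([-1, 0, 5, 1], f)          # class of x^3 + 5x^2 - 1
c2 = make_class([1, 0, 1], f)              # class of x^2 + 1
print([str(c) for c in add(c1, c2)[0]])    # ['-6', '-11']
print([str(c) for c in mult(c1, c2)[0]])   # ['-43', '-49']
```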

Suppose that f is a fixed polynomial in R[x]. One way to develop a feeling for arithmetic in R[x]/fR[x] is to develop a mathematical shorthand that makes arithmetic manipulations easier, and then to perform some manipulations.

Shorthand. Suppose that f is a fixed polynomial in R[x] and that we are working in R[x]/fR[x]. If g ∈ R[x], we’ll use the same symbol g to stand for both the polynomial g and the class class(g, f), depending on the context.

Example. If P = R[x]/(x^2 + 3x + 2)R[x], 5x^3 + 4x^2 + x + 1 stands for

class(5x^3 + 4x^2 + x + 1, x^2 + 3x + 2)

that is, for class(23 + 24x, x^2 + 3x + 2). When we write

5x^3 + 4x^2 + x + 1 = 23 + 24x

we mean this equality in P rather than in R[x]. With this convention,

x^2 + 3x + 2 = 0

Using this fact and the facts about classes:

class(h, f) + class(g, f) = class(h + g, f)

class(h, f) · class(g, f) = class(h · g, f)

we can say that

x^2 = −3x − 2

Finally, we can manipulate in P by doing ordinary algebra in R[x], except we replace powers of x of degree 2 or higher by lower powers using the rule x^2 = −3x − 2. Here’s an example:

(3 + 4x)(5 − 2x)(1 + x) = (15 + 14x − 8x^2)(1 + x)

= (15 + 14x − 8(−3x − 2))(1 + x)

= (31 + 38x)(1 + x)

= 31 + 69x + 38x^2

= 31 + 69x + 38(−3x − 2)

= −45 − 45x

Mathematica can be used to help in these calculations in several ways. For example, we could have gone from the step (31 + 38x)(1 + x) right to the last step by typing

In[1]:= Expand[(31 + 38 x) (1 + x)] /. x^2 -> -3 x - 2

Out[1]= 31 + 38 (-2 - 3 x) + 69 x

In[2]:= Expand[Expand[(31 + 38 x) (1 + x)] /. x^2 -> -3 x - 2]

Out[2]= -45 - 45 x

Instead of simplifying our calculations every time x^2 shows up, we could perform the entire calculation in R[x] (ending up with arbitrarily high powers of x), reducing the resulting polynomial to one of the form a + bx by recursively lowering the degree using the substitution

x^2 → −3x − 2

So, for example,

x^3 = x · x^2

= x(−3x − 2)

= −3x^2 − 2x

= −3(−3x − 2) − 2x

= 6 + 7x

Mathematica can be used to help with the calculations:

In[1]:= class[x^3, x^2 + 3 x + 2]

Out[1]= {6 + 7 x, 2 + 3 x + x^2}

Using Mathematica, an inductive pattern emerges:

x^2 = −3x − 2

x^3 = 7x + 6

x^4 = −15x − 14

x^5 = 31x + 30

x^6 = −63x − 62

...
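The table of powers can be regenerated with a two-line recurrence: multiplying a + bx by x and applying x^2 = −3x − 2 gives −2b + (a − 3b)x. The Python sketch below (hypothetical function names) prints the table and also checks a closed form that the table suggests, x^n = (−1)^(n+1)((2^n − 1)x + (2^n − 2)). That closed form is an observation read off from the pattern, tested here only for small n rather than proved.

```python
def times_x(a, b):
    """Multiply a + b*x by x in P, using x^2 = -3x - 2."""
    # (a + b x) x = a x + b x^2 = a x + b(-3x - 2) = -2b + (a - 3b) x
    return (-2 * b, a - 3 * b)

def power_of_x(n):
    """Return (a, b) with x^n = a + b*x in P."""
    a, b = 1, 0
    for _ in range(n):
        a, b = times_x(a, b)
    return (a, b)

for n in range(2, 7):
    a, b = power_of_x(n)
    print(f"x^{n} = {b}x {'+' if a >= 0 else '-'} {abs(a)}")

# Conjectured closed form, checked for small exponents only.
for n in range(1, 21):
    sign = (-1) ** (n + 1)
    assert power_of_x(n) == (sign * (2 ** n - 2), sign * (2 ** n - 1))
```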

Mathematica could also have been used to determine general rules for arithmetic in P if we take advantage of its capability to do polynomial algebra with generic coefficients (see footnote 5). Because deg(x^2 + 3x + 2) = 2, every class in P has a representative of the form a + bx. Let’s model our mathematical shorthand in Mathematica:

short[f_] := class[f, x^2 + 3 x + 2]

How are classes added?

In[1]:= add[short[a + b x], short[c + d x]]

Out[1]= {a + c + (b + d) x, 2 + 3 x + x^2}

So, a + bx + c + dx = a + c + (b + d)x

How are classes multiplied?

In[2]:= mult[short[a + b x], short[c + d x]]

Out[2]= {a c - 2 b d + (b c + a d - 3 b d) x, 2 + 3 x + x^2}

So, (a + bx)(c + dx) = ac − 2bd + (bc + ad − 3bd)x

Notice that x^2 + 3x + 2 is composite in R[x], so there will be zero-divisors in P.

Shortly, we’ll see how to use Mathematica to determine the units in P.
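The multiplication rule can also be obtained by hand from the rule x^2 = −3x − 2, and the same rule exhibits a pair of zero-divisors explicitly. A short derivation, given here as a worked check rather than as part of the Mathematica model:

```latex
\begin{align*}
(a + bx)(c + dx) &= ac + (ad + bc)\,x + bd\,x^2 \\
                 &= ac + (ad + bc)\,x + bd\,(-3x - 2) \\
                 &= (ac - 2bd) + (bc + ad - 3bd)\,x \\[4pt]
(1 + x)(2 + x)   &= 2 + 3x + x^2 = 0 \quad \text{in } P
\end{align*}
```

Neither 1 + x nor 2 + x is the zero class, yet their product is, which is exactly what it means for P to have zero-divisors.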

Footnote 5. Before typing these expressions, make sure that a, b, c, and d have been cleared of any values.

The function recip needs a minor fix: We want the product of a class modulo f and its reciprocal to be class(1, f), but our definition of recip:

recip[c_] := class[fcoeff[rep[c], modulus[c]], modulus[c]]

only guarantees that the product of a class c and its reciprocal will be the class

class(gcd(rep(c), modulus(c)), modulus(c))

and, as we’ve seen, if c is a unit in R[x]/fR[x] (that is, if the representative of c is relatively prime to f), gcd(rep(c), modulus(c)) is a non-zero real number, but it’s not necessarily 1. To remedy the situation, we redefine recip like this:

recip[c_] := class[(1/gcd[rep[c], modulus[c]]) * fcoeff[rep[c], modulus[c]], modulus[c]]

With this modification, the entire model for Z/mZ goes through. The following Mathematica session gives some examples:

In[1]:= class[x^3 + 5 x^2 - 1, x^2 + 3 x + 2]

Out[1]= {-5 - 8 x, 2 + 3 x + x^2}

In[2]:= recip[class[x^3 + 5 x^2 - 1, x^2 + 3 x + 2]]

Out[2]= {19/33 + (8 x)/33, 2 + 3 x + x^2}

In[3]:= mult[class[x^3 + 5 x^2 - 1, x^2 + 3 x + 2], recip[class[x^3 + 5 x^2 - 1, x^2 + 3 x + 2]]]

Out[3]= {1, 2 + 3 x + x^2}
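The normalization by the gcd can be illustrated in Python. The sketch below is an assumption-laden stand-in for the paper's recip, not its definition: it runs the extended Euclidean algorithm, keeps only the coefficient that multiplies the representative, and divides by the constant gcd so that the product with the original class is exactly 1. All function names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zeros; coefficient lists run from the constant term upward."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(g, f):
    """Quotient and remainder of g divided by a non-zero polynomial f."""
    f = trim(Fraction(c) for c in f)
    r = trim(Fraction(c) for c in g)
    q = [Fraction(0)] * max(len(r) - len(f) + 1, 0)
    while len(r) >= len(f):
        shift, coeff = len(r) - len(f), r[-1] / f[-1]
        q[shift] = coeff
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return q, r

def mul(p, q):
    """Product of two polynomials."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def sub(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim(a - b for a, b in zip(p, q))

def recip(rep, f):
    """Representative of 1/rep modulo f; raises ValueError if rep is not a unit."""
    r0, r1 = trim(rep), trim(f)
    s0, s1 = [Fraction(1)], []
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, sub(s0, mul(q, s1))
    if len(r0) != 1:
        raise ValueError("representative is not relatively prime to the modulus")
    # Here s0 * rep is congruent to the constant r0[0]; divide by that unit.
    inverse = [Fraction(c) / r0[0] for c in s0]
    return poly_divmod(inverse, f)[1]

f = [2, 3, 1]                                   # x^2 + 3x + 2
inv = recip([-5, -8], f)
print([str(c) for c in inv])                    # ['19/33', '8/33']
print(poly_divmod(mul([-5, -8], inv), f)[1])    # [Fraction(1, 1)]
```

Calling `recip([1, 1], f)` raises `ValueError`, since 1 + x shares the factor x + 1 with the modulus.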

Example, cont’d. Return to the ring

P = R[x]/(x^2 + 3x + 2)R[x]

Since x^2 + 3x + 2 = (x + 1)(x + 2), not every class has a reciprocal; indeed, 1 + x and 2 + x cannot have reciprocals. Are there others? If we ask Mathematica for the reciprocal of a generic element, we get our answer:

In[1]:= Simplify[recip[short[a + b x]]]

Out[1]= (a - 3 b - b x)/((a - 2 b) (a - b))

This formula for the reciprocal of a class in P breaks down only if a = 2b or a = b; everything not of this form is a unit.


The Complex numbers

In this section, we’ll develop the complex numbers from scratch, outlining an approach that can be expanded to a development of complex numbers designed for students who are learning about the complex numbers for the first time.

Just as in Z, reducing modulo a polynomial f ∈ R[x] provides a way of ignoring f and its multiples. If f is an irreducible polynomial, R[x]/fR[x] is actually a field; that is, every non-zero class is a unit, so it has a reciprocal. Furthermore, if deg(f) > 0, no two elements of R belong to the same class modulo f, so that R can be considered as a subfield of R[x]/fR[x].

Specialize now to the case where f = x^2 + 1, an irreducible element of R[x] that is clearly different from 0. But in R[x]/(x^2 + 1)R[x], class(x^2 + 1, x^2 + 1) = class(0, x^2 + 1). So, we have the following facts:

(1) R[x]/(x^2 + 1)R[x] is a field: every non-zero element has a reciprocal.
(2) R can be considered a subfield of R[x]/(x^2 + 1)R[x].
(3) In R[x]/(x^2 + 1)R[x], the class containing x^2 + 1 is the zero class.

Our Mathematica model for R[x]/(x^2 + 1)R[x] can be used to investigate the properties of arithmetic in R[x]/(x^2 + 1)R[x]. Since deg(x^2 + 1) = 2, every class has a representative of the form a + bx where a, b ∈ R. Here is a sample session:

In[1]:= add[class[1 + 3 x, x^2 + 1], class[2 + 5 x, x^2 + 1]]

Out[1]= {3 + 8 x, 1 + x^2}

In[2]:= mult[class[1 + 3 x, x^2 + 1], class[2 + 5 x, x^2 + 1]]

Out[2]= {-13 + 11 x, 1 + x^2}

In[3]:= recip[class[1 + 3 x, x^2 + 1]]

Out[3]= {1/10 - (3 x)/10, 1 + x^2}

What are the rules for arithmetic in R[x]/(x^2 + 1)R[x]? Again, we’ll investigate this question by using Mathematica’s ability to do polynomial algebra with generic coefficients. Before we look at these rules, let’s develop a mathematical shorthand for arithmetic in R[x]/(x^2 + 1)R[x]:

Definition. The complex numbers, denoted by C, is the field obtained from R[x] by reducing modulo x^2 + 1:

C = R[x]/(x^2 + 1)R[x]

Instead of the notation class(a + bx, x^2 + 1), we’ll use the shorthand a + bx. Elements of C are called complex numbers.

So, in C, the symbol a + bx is just shorthand for class(a + bx, x^2 + 1); 3 + 2x stands for a class modulo x^2 + 1. Just as before, if f is any polynomial, we’ll use f itself as a shorthand for the class modulo x^2 + 1 that contains f. In other words, 5x^4 + 3x^2 + 1 will stand for the complex number class(5x^4 + 3x^2 + 1, x^2 + 1). In this shorthand, 3x^5 = 3x^5 + x^2 + 1 (because 3x^5 ≡ 3x^5 + x^2 + 1 (mod x^2 + 1)).

How are classes added?

In[4]:= add[class[a + b x, x^2 + 1], class[c + d x, x^2 + 1]]

Out[4]= {a + c + (b + d) x, 1 + x^2}

so, a + bx + c + dx = a + c + (b + d)x

How are classes multiplied?

In[5]:= mult[class[a + b x, x^2 + 1], class[c + d x, x^2 + 1]]

Out[5]= {a c - b d + (b c + a d) x, 1 + x^2}

so, (a + bx)(c + dx) = ac − bd + (ad + bc)x

What is the reciprocal of a non-zero class?

In[6]:= Simplify[recip[class[a + b x, x^2 + 1]]]

Out[6]= {(a - b x)/(a^2 + b^2), 1 + x^2}

so,

\[ \frac{1}{a + bx} = \frac{a - bx}{a^2 + b^2} \]

Notice that, since there are no pairs of real numbers (a, b) except (0, 0) with the property that a^2 + b^2 = 0, we recover the fact that every non-zero element of C has a reciprocal.

Finally, every class has a representative of the form a + bx. What about higher degree polynomials? For example, what’s the class that contains x^2?

In[7]:= class[x^2, x^2 + 1]

Out[7]= {-1, 1 + x^2}

so, x^2 = −1. In other words, x is a complex number whose square is −1:

In[8]:= mult[class[x, x^2 + 1], class[x, x^2 + 1]]

Out[8]= {-1, 1 + x^2}

Since distinct real numbers cannot be congruent modulo x^2 + 1, distinct real numbers stay distinct when they are looked at as complex numbers, so we can consider R as a subfield of C. Since two distinct linear polynomials cannot be congruent modulo x^2 + 1, two complex numbers a + bx and c + dx are equal if and only if a = c and b = d. In summary, we have:

Theorem. The complex numbers C enjoy the following properties:

(1) R is a subfield of C.
(2) a + bx = c + dx ⇔ a = c and b = d.
(3) a + bx + c + dx = a + c + (b + d)x
(4) (a + bx)(c + dx) = ac − bd + (ad + bc)x
(5) 1/(a + bx) = (a − bx)/(a^2 + b^2), provided a + bx ≠ 0
(6) x^2 = −1

One way to calculate with complex numbers is to use the rules for adding and multiplying given in the theorem. A more natural way is to use the technique we used to calculate in P:

How to do algebra in C. To manipulate an expression containing complex numbers, manipulate their representatives as if they were elements of R[x], and replace x^2 by −1 whenever an exponent of 2 or more appears.

So, for example,

\begin{align*}
(3x + 2)(5x - 1)(4x + \sqrt{2}) &= (15x^2 + 7x - 2)(4x + \sqrt{2}) \\
&= (-15 + 7x - 2)(4x + \sqrt{2}) \\
&= (-17 + 7x)(4x + \sqrt{2}) \\
&= -68x + 28x^2 - 17\sqrt{2} + 7\sqrt{2}\,x \\
&= -68x - 28 - 17\sqrt{2} + 7\sqrt{2}\,x \\
&= -28 - 17\sqrt{2} + (7\sqrt{2} - 68)\,x
\end{align*}
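The recipe "multiply in R[x], then replace x^2 by −1" is easy to test numerically against a language's built-in complex type. The Python sketch below uses hypothetical helper names and floating-point coefficients (because of the √2), so the comparison is made with a tolerance; it is a check of the worked example, not a replacement for the quotient-ring construction.

```python
import math

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def reduce_mod_x2_plus_1(p):
    """Reduce a polynomial to (a, b), meaning a + b*x, using x^2 = -1."""
    re, im = 0, 0
    for k, c in enumerate(p):
        # x^k cycles through 1, x, -1, -x as k increases.
        sign = -1 if (k // 2) % 2 else 1
        if k % 2 == 0:
            re += sign * c
        else:
            im += sign * c
    return (re, im)

root2 = math.sqrt(2)

# (3x + 2)(5x - 1)(4x + sqrt(2)), multiplied out in R[x] and then reduced
product = poly_mul(poly_mul([2, 3], [-1, 5]), [root2, 4])
re, im = reduce_mod_x2_plus_1(product)

# The same product using Python's built-in complex numbers
builtin = complex(2, 3) * complex(-1, 5) * complex(root2, 4)

print(math.isclose(re, builtin.real), math.isclose(im, builtin.imag))  # True True
```

The reduction also handles higher powers directly: `reduce_mod_x2_plus_1([1, 0, 3, 0, 5])` gives `(3, 0)`, so 5x^4 + 3x^2 + 1 is the complex number 3.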

To avoid confusion (and to use the same conventions as everyone else), we could use i instead of x, so that we could think of C as R[i]/(i^2 + 1)R[i]. Complex numbers would then look like 4 + 3i. But keeping the notation a + bx might remind students that C is just a quotient ring of R[x], and that manipulating complex numbers is just algebra in that quotient ring.

Other moduli. The construction we have given for C borrows the idea that, in Z/mZ, all multiples of m vanish. By forming R[x]/(x^2 + 1)R[x], we are really saying, “Work with ordinary polynomials, but replace x^2 + 1 by 0 in all calculations.” Put another way, we are saying, “Work with ordinary polynomials, but replace x^2 by −1 in all calculations.”

This is pretty much the same instruction that algebra 2 students get when they are told to simply do the calculations and replace i^2 by −1, but the difference here is that our construction puts this algebra in a general context that applies in many situations.

Indeed, if K is any field and f ∈ K[x] is any irreducible polynomial with coefficients in K, the field K[x]/fK[x] is a new field that contains K and that has an algebraic structure in which f = 0.

In the case K = R, it might seem that reducing R[x] by other irreducible polynomials (like x^2 + x + 1) yields other “extension” fields. Gauss proved that this isn’t so: the fundamental theorem of algebra (first proved in Gauss’ thesis) says that C not only contains a solution to x^2 + 1 = 0, it contains all the solutions to every polynomial equation f = 0 where f ∈ R[x] (in fact, C contains the roots of all polynomials in C[x]).

If, however, we look at Q, the field of rational numbers, as our base field, then the construction presented here gives a method for constructing extension fields of Q in which a given polynomial has a root. For example, if f = x^2 + x + 1, then Q[x]/fQ[x] is a field in which x^2 = −x − 1, so that

x^3 = x · x^2

= x · (−x − 1)

= −x^2 − x

= −(−x − 1) − x

= 1

So Q[x]/fQ[x] is an extension of Q that contains a cube root of 1. Our Mathematica construction for C can easily be modified to get the rules for algebra in this system. Indeed, if we simply change the constructor complex so that it returns a class modulo x^2 + x + 1 rather than a class modulo x^2 + 1, we find that, in Q[x]/fQ[x]:

a + bx + c + dx = a + c + (b + d)x

(a + bx)(c + dx) = ac − bd + (bc + ad − bd)x

\[ \frac{1}{a + bx} = \frac{a - b - bx}{a^2 - ab + b^2} \]
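These three rules are simple enough to check directly. The Python sketch below encodes a + bx as the pair (a, b) with exact rational arithmetic; the function names `mul_w` and `recip_w` are hypothetical, and the code merely transcribes the formulas above so that they can be tested against each other.

```python
from fractions import Fraction

def mul_w(u, v):
    """(a + bx)(c + dx) in Q[x]/(x^2 + x + 1), using x^2 = -x - 1."""
    a, b = u
    c, d = v
    return (a * c - b * d, b * c + a * d - b * d)

def recip_w(u):
    """Reciprocal of a + bx, using the formula (a - b - bx)/(a^2 - ab + b^2)."""
    a, b = u
    n = a * a - a * b + b * b
    if n == 0:
        raise ZeroDivisionError("the zero class has no reciprocal")
    return (Fraction(a - b) / n, Fraction(-b) / n)

x = (0, 1)
x2 = mul_w(x, x)
x3 = mul_w(x2, x)
print(x2, x3)                      # (-1, -1) (1, 0): x^2 = -x - 1 and x^3 = 1

u = (2, 3)
print(mul_w(u, recip_w(u)))        # (Fraction(1, 1), Fraction(0, 1))
```

The denominator a^2 − ab + b^2 vanishes for rational a and b only when a = b = 0, which matches the fact that every non-zero class in this field is a unit.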

With a little more work, we can even get Mathematica models for extensions of Z/pZ.

The general study of extension fields as we have constructed them forms a part of algebraic number theory, an area fairly remote from elementary mathematics. Does it need to stay that way? Perhaps programmable symbol manipulators will be useful in making this beautiful part of mathematics accessible to a wider audience.

References

1. Abelson, H. and Sussman, J., Structure and Interpretation of Computer Programs, MIT Press, Cambridge, MA, 1985.

2. Cuoco, A., Investigations in Algebra, MIT Press, Cambridge, MA, 1990.

3. Cuoco, A. and Goldenberg, P., A Role for Technology in Mathematics Education, The Boston University Journal of Education (to appear) (1997).

4. Dubinsky, E. and Leron, U., Learning Abstract Algebra with ISETL, Springer Verlag, New York, 1991.

5. Gauss, C.F., Disquisitiones Arithmeticae (trans. Arthur A. Clarke), Yale University Press, New Haven, 1966.

6. Herstein, I.N., Topics in Algebra, Blaisdell Publishing Co., Waltham, MA, 1964.

7. Katz, N., The Roots of Complex Numbers, Math Horizons ????? (1996).

8. Harel, I. and Papert, S., Constructionism, Ablex Publishing Corporation, Norwood, NJ.

9. Sawyer, W.W., A Concrete Approach to Abstract Algebra, Dover, New York, 1959.

55 Chapel Street, Newton, MA 02158, USA

E-mail address: [email protected]
