
Binary quadratic forms and quadratic number fields

401-3113-12L Prof. G. Wüstholz 16.02.2012

Contents

1 Introduction and Motivation (Febr. 20, 2012)

2 Sums of two squares (Febr. 24, 2012)

3 Number rings (Febr. 26, 2012)

4 Factorization (March 4, 2012)

5 Primes in Gaussian numbers rings (March 12, 2012)

6 Ideals in quadratic number rings (March 19, 2012)

7 Ideal factorization (March 26, 2012)

8 Prime ideals (March 26, 2012)

9 Fractional ideals (April 2, 2012)

10 The three squares theorem (April 16, 2012)

11 The four squares theorem (April 20, 2012)

12 Binary quadratic forms (April 30, 2012)

13 Class groups (May 7, 2012)

14 Representations of integers by binary quadratic forms (May 14, 2012)

List of Abbreviations

Index

In this course we connect different areas from algebra and number theory in order to obtain deeper knowledge; the common theme is “quadraticity”. The notes were typed by C. Fuchs; for comments and corrections send an email to [email protected].


1 Introduction and Motivation

(Febr. 20, 2012)

A binary quadratic form is a polynomial

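$$q(x, y) = ax^2 + bxy + cy^2$$

with coefficients $a, b, c$, which we assume to be in $\mathbb{Z}$. Writing $\mathbf{x} = (x, y)^T$, the form $q$ can be expressed as

$$q(x, y) = (x, y)\, Q \begin{pmatrix} x \\ y \end{pmatrix} = \mathbf{x}^T Q\, \mathbf{x}$$

with

$$Q = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}, \qquad a, b, c \in \mathbb{Z},$$

and discriminant $d_Q = d_q = b^2 - 4ac = -4 \det Q$. We say that $q$ is primitive if and only if $(a, b, c) = 1$.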

Binary quadratic forms come up naturally in the context of quadratic number fields. Let $d$ be a square-free integer and put $K = \mathbb{Q}(\sqrt{d})$. Then

$$q_K(x, y) = \begin{cases} x^2 - dy^2, & d \not\equiv 1 \pmod 4, \\ x^2 + xy + \dfrac{1-d}{4}\,y^2, & d \equiv 1 \pmod 4, \end{cases}$$

is a canonical binary quadratic form associated with the field $K$.

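There is a natural action of $SL_2(\mathbb{Z})$ on the space of binary quadratic forms which is deduced from the natural action on $\mathbb{Z} \times \mathbb{Z}$ given by $(x, y) \mapsto (x, y)\, g^T = (\alpha x + \beta y,\ \gamma x + \delta y)$ for

$$g = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL_2(\mathbb{Z}),$$

and this leads to an equivalence relation with

$$q \sim q^\sigma \iff \exists\, g = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL_2(\mathbb{Z}) : \ q^\sigma(x, y) = (gq)(x, y) = q(\alpha x + \beta y,\ \gamma x + \delta y).$$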

Since the discriminant of a binary quadratic form is invariant under the action of $SL_2(\mathbb{Z})$, it is clear that $q \sim q^\sigma \Rightarrow d_q = d_{q^\sigma}$. However, the important fact is that “⇐” is not true, and this leads to a rich mathematical theory.

As an example we may take the forms $q_1 = x^2 + 6y^2$ and $q_2 = 2x^2 + 3y^2$ with $d_1 = d_2 = -24$. Then $5 = q_2(1, 1)$, but $q_1(x, y) = x^2 + 6y^2$ never takes the value 5. This means that the two forms are not equivalent though they have the same discriminant. The example shows that the set of values of a quadratic form gives important information about the form and its class.
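The example can be checked by a short brute-force computation. The following is only a sketch: the helper names `disc` and `represents` are ours, and the search bound uses the inequality $4a\,q(x,y) \ge |d|\,y^2$, which holds for positive definite forms only.

```python
from math import isqrt

def disc(a, b, c):
    """Discriminant b^2 - 4ac of the form ax^2 + bxy + cy^2."""
    return b * b - 4 * a * c

def represents(a, b, c, n):
    """Brute-force test whether ax^2 + bxy + cy^2 = n has an integer solution.

    Sketch for positive definite forms (a > 0, discriminant < 0) only:
    there 4a*q(x, y) >= |d|*y^2 and 4c*q(x, y) >= |d|*x^2 bound the search.
    """
    d = disc(a, b, c)
    assert a > 0 and d < 0, "sketch handles positive definite forms only"
    ymax = isqrt(4 * a * n // -d)
    xmax = isqrt(4 * c * n // -d)
    return any(a * x * x + b * x * y + c * y * y == n
               for x in range(-xmax, xmax + 1)
               for y in range(-ymax, ymax + 1))

print(disc(1, 0, 6), disc(2, 0, 3))   # both forms have the same discriminant
print(represents(2, 0, 3, 5))         # q2 represents 5
print(represents(1, 0, 6, 5))         # q1 does not
```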

For centuries one of the main questions about binary quadratic forms has been which numbers are represented by a given form. We say that $q$ represents an integer $n$ if and only if there exist $x, y \in \mathbb{Z}$ with $n = q(x, y)$. It is obvious that if $q \sim q^\sigma$ then $n$ is represented by $q$ ⇔ $n$ is represented by $q^\sigma$. As a consequence we see that the set of values of a quadratic form depends only on its equivalence class under $SL_2(\mathbb{Z})$. It remains to give a characterization of the value set, and this is where the theory of genera came up. We shall discuss and explain this in the final lecture.


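There is a close relation between quadratic forms and fractional ideals. Let $\mathfrak{b} \subset K = \mathbb{Q}(\sqrt{d})$ be a fractional ideal. This means that there exist an ideal $\mathfrak{a} \subseteq O_K$ and some $c \in K^\times$ such that

$$\mathfrak{b} = \mathfrak{a} : c = \{x \in K;\ cx \in \mathfrak{a}\}.$$

Let $\mathfrak{b}_1, \mathfrak{b}_2$ be fractional ideals; then $\mathfrak{b}_1 \sim \mathfrak{b}_2 \iff \exists\, \alpha \in K^\times : \mathfrak{b}_2 = (\alpha)\mathfrak{b}_1$. We denote by $C_K$ the group of equivalence classes of fractional ideals in $K$.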

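Let $\mathfrak{a}$ be a fractional ideal and let $\mathrm{Gal}(K/\mathbb{Q}) = \{1, \sigma\}$ be the Galois group of $K$ over $\mathbb{Q}$. We associate with $\mathfrak{a} = \mathbb{Z}a_1 + \mathbb{Z}a_2$ the binary quadratic form

$$q_{\mathfrak{a}}(x, y) = N(\mathfrak{a})^{-1}(xa_1 + ya_2)(xa_1^\sigma + ya_2^\sigma)$$

with $1 \ne \sigma \in \mathrm{Gal}(K/\mathbb{Q})$. The rational number $N(\mathfrak{a})$ is defined as follows:
• $\mathfrak{a} \subset O_K \Rightarrow N(\mathfrak{a}) = |O_K/\mathfrak{a}|$
• $\mathfrak{a}, \mathfrak{b} \subset O_K \Rightarrow N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b})$
• extend to fractional ideals.

The form

$$q_{\mathfrak{a}}(x, y) = N(\mathfrak{a})^{-1}\left(a_1a_1^\sigma\, x^2 + (a_1a_2^\sigma + a_1^\sigma a_2)\, xy + a_2a_2^\sigma\, y^2\right)$$

is called the quadratic form associated with $\mathfrak{a}$.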

We denote by $Q(d)$ the set of equivalence classes of binary quadratic forms with discriminant $d$, and if $d$ is the discriminant of a quadratic number field $K$ then we also write $Q_K$. The question is whether there exists a bijection

$$\kappa : C_K \to Q_K.$$

We shall see that this is not always the case and that the group $C_K$ is too small in general. In other words, the equivalence relation is not quite the right one. One needs a narrower equivalence relation and gets a new group $C_K^+$, called the narrow class group, which in some cases coincides with the class group $C_K$ and in some cases is an extension

$$0 \to \mathbb{Z}/2\mathbb{Z} \to C_K^+ \to C_K \to 0$$

of the class group by the group $\mathbb{Z}/2\mathbb{Z}$. One shows that there is a bijection

$$\kappa : C_K^+ \to Q_K.$$

In particular the number of equivalence classes of quadratic forms with discriminant $d$ is the order of the group $C_K^+$.

The group $C_K^+$ gives much information about whether a given integer $n$ is represented by the norm form $q_K$. As an example we shall deal with the case that $|C_K^+| = 1$. We say that $n$ is (properly) represented by the norm form $q_K$ if there exist $x, y \in \mathbb{Z}$ with $(x, y) = 1$ and $n = q_K(x, y)$. We shall show that this is the case if and only if the congruence

$$x^2 \equiv d_K \pmod{4n}$$

has a solution and, in the case $n < 0$, $d_K > 0$ (a positive definite form cannot represent a negative integer). The congruence is solvable if and only if $\left(\frac{d_K}{4n}\right) = 1$; here $\left(\frac{m}{n}\right)$ stands for the Kronecker symbol.

Another related question is whether a given quadratic form represents a prime number $p$. Let $q(x, y) \in Q_K$ be such a form with discriminant $d_K$. Then it is possible to show that $q$ represents $p$ if and only if the class (corresponding to) $q$ under $\kappa$ is of the form $c^+(\mathfrak{p})$ for some prime ideal $\mathfrak{p} \subseteq O_K$ with $N(\mathfrak{p}) = p$. This leads immediately to the question which prime ideals have a prime norm. The answer is given by the theory of fractional ideals in $K$. One has to study the question how a prime $p \in \mathbb{Q}$ decomposes in $K$. It can be shown that the following possibilities come up:
(i) $(p)$ is prime
(ii) $(p) = \mathfrak{p} \cdot \mathfrak{p}^\sigma$ ($\mathrm{Gal}(K/\mathbb{Q}) = \langle\sigma\rangle$)
(iii) $(p) = \mathfrak{p}^2$

In the first case $p$ is called inert, in the second case $p$ is called split and in the third case $p$ is called ramified. The norms are

• $p^2 = N((p))$
• $p^2 = N(\mathfrak{p}) \cdot N(\mathfrak{p}^\sigma) = N(\mathfrak{p})^2$
• $p^2 = N(\mathfrak{p})^2$

and in the last two cases $p = N(\mathfrak{p})$. The question is then: when do the cases (ii) and (iii) come up? The answer is given in terms of the discriminant and the Legendre symbol.

This leads us to an old problem: which numbers are of the form $x^2 + ny^2$? The question goes back at least to Fermat, who asserted in 1640 that an odd prime $p$ is a sum of two squares,

$$p = x^2 + y^2, \quad x, y \in \mathbb{Z},$$

if and only if $p \equiv 1 \pmod 4$. He did not give a proof but announced it in a letter to Mersenne dated December 25, 1640. A first proof was given by Euler (letter to Goldbach on April 12, 1749), another proof by Lagrange in 1775 based on his work on quadratic forms, simplified by Gauss in the Disquisitiones Arithmeticae, and by Dedekind using the field $\mathbb{Q}[i]$. The general question when an integer $n$ can be represented as a sum of two squares

$$n = x^2 + y^2, \quad x, y \in \mathbb{Z},$$

can be answered by reducing it to primes.

Fermat extended the assertion (letter to Pascal, September 25, 1654) and announced that for odd primes

$$p = x^2 + 2y^2 \iff p \equiv 1, 3 \pmod 8$$

$$p = x^2 + 3y^2 \iff p \equiv 1 \pmod 3$$

and Euler added as a conjecture

$$p = x^2 + 5y^2 \iff p \equiv 1, 9 \pmod{20}$$

The statements were established by Lagrange.
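These congruence conditions are easy to test numerically. The sketch below is not part of the historical argument; the function names and the search limit are our own choices, Euler's conjecture is taken with modulus 20, and primes dividing $n$ (such as $3 = 0^2 + 3 \cdot 1^2$) are skipped because the criteria are meant for odd primes not dividing $n$.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % k for k in range(2, isqrt(n) + 1))

def is_x2_plus_ny2(p, n):
    """True if p = x^2 + n*y^2 has a solution in non-negative integers."""
    for y in range(isqrt(p // n) + 1):
        r = p - n * y * y
        if isqrt(r) ** 2 == r:
            return True
    return False

# n -> (modulus, admissible residues) for the form x^2 + n*y^2
CONDITIONS = {1: (4, {1}), 2: (8, {1, 3}), 3: (3, {1}), 5: (20, {1, 9})}

def counterexamples(limit=2000):
    """Odd primes p < limit, not dividing n, where criterion and search disagree."""
    bad = []
    for n, (m, residues) in CONDITIONS.items():
        for p in range(3, limit, 2):
            if is_prime(p) and n % p != 0:
                if is_x2_plus_ny2(p, n) != (p % m in residues):
                    bad.append((n, p))
    return bad

print(counterexamples())  # an empty list means no disagreement was found
```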

In 1798 Legendre stated that a positive integer $n$ can be expressed as $n = x^2 + y^2 + z^2 \iff n \ne 4^a(8m + 7)$. His proof was incomplete and Gauss gave a complete proof in the Disquisitiones.

The first “universal” result on representations of integers as a sum of squares is Lagrange's four-square theorem (Lagrange, 1770): Every positive integer is a sum of four squares. The theorem had been conjectured by Bachet.

The four-squares theorem leads us to a very interesting question: Which diagonal forms

$$q = ax^2 + by^2 + cz^2 + dw^2,$$

$a, b, c, d \in \mathbb{N}$ coprime, are universal in the sense that they represent each positive integer $n$? The answer was given by Ramanujan by writing down a complete list of 55 forms. Ramanujan's list had to be corrected later since the diagonal form $x^2 + 2y^2 + 5z^2 + 5w^2$ had to be eliminated.

More recently a very interesting and surprising series of papers was published. It concerns the question whether the four-squares theorem also holds generally for arbitrary quaternary positive definite forms. The answer is the Conway-Schneeberger Fifteen Theorem: If a positive definite quaternary quadratic form has an integer matrix representation and represents all positive integers up to 15 then it represents all positive integers.

Here the distinction concerning the matrix representation is essential. There is a difference between integer valued quadratic forms and quadratic forms with associated matrix having integers as coefficients as in the theorem. If we are in the case of integer valued quaternary forms then Manjul Bhargava and Jonathan P. Hanke proved that: If a positive definite quadratic form with integer coefficients represents the 29 integers 1, 2, 3, 5, 6, 7, 10, 13, 14, 15, 17, 19, 21, 22, 23, 26, 29, 30, 31, 34, 35, 37, 42, 58, 93, 110, 145, 209, 290 then it represents all positive integers.
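To see why the form $x^2 + 2y^2 + 5z^2 + 5w^2$ had to be removed from Ramanujan's list, one can enumerate the small values of a diagonal form. This is a finite sketch with names of our own choosing (`represented_values`, `missing`); a finite search can exhibit a missing value but of course cannot prove universality.

```python
from itertools import product
from math import isqrt

def represented_values(coeffs, limit):
    """Integers in [1, limit] of the form sum(c_i * x_i^2) with integers x_i."""
    ranges = [range(isqrt(limit // c) + 1) for c in coeffs]
    values = set()
    for xs in product(*ranges):
        v = sum(c * x * x for c, x in zip(coeffs, xs))
        if 1 <= v <= limit:
            values.add(v)
    return values

def missing(coeffs, limit):
    """Sorted list of integers in [1, limit] NOT represented by the form."""
    return sorted(set(range(1, limit + 1)) - represented_values(coeffs, limit))

print(missing((1, 1, 1, 1), 200))  # sum of four squares: nothing is missing
print(missing((1, 2, 5, 5), 200))  # the eliminated form misses the value 15
```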


2 Sums of two squares

(Febr. 24, 2012)

In this lecture we turn to the question under which condition a given non-negative integer can be written as a sum of 2 squares.

Theorem 2.1. Every prime p ≡ 1 (mod 4) is a sum of two squares.

Theorem 2.2. A positive integer n is a sum of two squares ⇔

$$n = 2^a \prod_{\substack{p \mid n \\ p \equiv 1\ (4)}} p^{b} \prod_{\substack{q \mid n \\ q \equiv 3\ (4)}} q^{c}, \qquad c \equiv 0\ (2).$$
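The criterion of Theorem 2.2 can be compared against a direct search for small $n$. The following sketch uses helper names of our own (`factorize`, `criterion`, `is_sum_of_two_squares`) and simple trial division, which is enough for an illustration.

```python
from math import isqrt

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def criterion(n):
    """Theorem 2.2: every prime q = 3 (mod 4) occurs to an even power."""
    return all(e % 2 == 0 for q, e in factorize(n).items() if q % 4 == 3)

def is_sum_of_two_squares(n):
    """Direct search for n = x^2 + y^2 with x, y >= 0."""
    return any(isqrt(n - x * x) ** 2 == n - x * x for x in range(isqrt(n) + 1))

# The two tests agree on an initial range of integers.
print(all(criterion(n) == is_sum_of_two_squares(n) for n in range(1, 3000)))
```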

The proof of Theorem 2.1 will be given later. At the end of this lecture we shall briefly describe how the proof works. There are many other and quite different proofs in the literature. Here we shall show how it implies Theorem 2.2. This is done by a sequence of lemmata.

Lemma 2.1. If m and n are sums of two squares, then mn is a sum of two squares.

Proof. This is a simple application of an identity about sums of squares going back to Euler:

$$(a_1x_1^2 + bx_1y_1 + a_2c\,y_1^2)(a_2x_2^2 + bx_2y_2 + a_1c\,y_2^2) = a_1a_2X^2 + bXY + cY^2$$

for

$$X = x_1x_2 - c\,y_1y_2, \qquad Y = a_1x_1y_2 + a_2y_1x_2 + b\,y_1y_2.$$

We take $a_1 = a_2 = c = 1$ and $b = 0$, choose $x_1, y_1 \in \mathbb{Z}$ such that $m = x_1^2 + y_1^2$ and $x_2, y_2 \in \mathbb{Z}$ such that $n = x_2^2 + y_2^2$, and get $nm = X^2 + Y^2$. //
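The identity used in the proof can be sanity-checked on integer inputs. This is a numerical sketch only (the function name `compose` is ours); it evaluates both sides of the identity and returns them together with $X$ and $Y$.

```python
def compose(a1, a2, b, c, x1, y1, x2, y2):
    """Evaluate both sides of the composition identity from the proof.

    Returns (lhs, rhs, X, Y) where
      lhs = (a1*x1^2 + b*x1*y1 + a2*c*y1^2) * (a2*x2^2 + b*x2*y2 + a1*c*y2^2)
      rhs = a1*a2*X^2 + b*X*Y + c*Y^2
    """
    X = x1 * x2 - c * y1 * y2
    Y = a1 * x1 * y2 + a2 * y1 * x2 + b * y1 * y2
    lhs = ((a1 * x1 * x1 + b * x1 * y1 + a2 * c * y1 * y1)
           * (a2 * x2 * x2 + b * x2 * y2 + a1 * c * y2 * y2))
    rhs = a1 * a2 * X * X + b * X * Y + c * Y * Y
    return lhs, rhs, X, Y

# Sum-of-two-squares case: a1 = a2 = c = 1, b = 0.
# 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2, so 65 = X^2 + Y^2.
lhs, rhs, X, Y = compose(1, 1, 0, 1, 1, 2, 2, 3)
print(lhs, rhs, X, Y)
```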

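The next lemma is a priori a statement about quadratic residues and follows from the first supplement to the quadratic reciprocity law. We give an elementary proof of it not making use of the Legendre symbol.

Lemma 2.2. Let $p$ be an odd prime. Then $x^2 \equiv -1 \pmod p$ has a solution ⇔ $p \equiv 1 \pmod 4$.

Proof. This is a simple application of Fermat's little theorem and of Wilson's theorem. “⇒”: $x^2 \equiv -1 \pmod p$ implies, by Fermat's little theorem, that

$$1 \equiv x^{p-1} = (x^2)^{(p-1)/2} \equiv (-1)^{(p-1)/2} = \begin{cases} 1, & p \equiv 1 \pmod 4, \\ -1, & p \equiv 3 \pmod 4, \end{cases}$$

and this shows that $p \equiv 1 \pmod 4$. “⇐”: By Wilson

$$-1 \equiv (p-1)! = \left(1 \cdot 2 \cdots \tfrac{p-1}{2}\right)\left(\tfrac{p+1}{2} \cdots (p-1)\right) = \left(\tfrac{p-1}{2}\right)!\,\left(p - \tfrac{p-1}{2}\right) \cdots (p-1)$$

$$\equiv \left(\tfrac{p-1}{2}\right)!\,(-1)^{(p-1)/2}\left(\tfrac{p-1}{2}\right)! = (-1)^{(p-1)/2}\left(\left(\tfrac{p-1}{2}\right)!\right)^2 \pmod p,$$

and for $p \equiv 1 \pmod 4$ this gives the solution $x = ((p-1)/2)!$. //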
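The explicit solution $x = ((p-1)/2)!$ from the proof can be computed directly. A minimal sketch, with a function name of our own; computing a factorial is of course not an efficient way to find square roots of $-1$ for large $p$.

```python
from math import factorial

def sqrt_minus_one(p):
    """Return x = ((p-1)/2)! mod p, a solution of x^2 = -1 (mod p).

    Assumes p is a prime with p = 1 (mod 4), as in Lemma 2.2.
    """
    assert p % 4 == 1, "Lemma 2.2 gives a solution only for p = 1 (mod 4)"
    return factorial((p - 1) // 2) % p

for p in (5, 13, 17, 29, 37, 41):
    x = sqrt_minus_one(p)
    print(p, x, (x * x + 1) % p == 0)

# For p = 3 (mod 4) no solution exists, e.g. p = 7:
print(any((x * x + 1) % 7 == 0 for x in range(7)))
```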

The next lemma is the main step in the proof of Theorem 2.2 on the basis of Theorem 2.1.

Lemma 2.3. Let $0 \ne n = x^2 + y^2$, let $q \equiv 3 \pmod 4$ be a prime and write $n = q^a m$ with $(q, m) = 1$. Then $a \equiv 0 \pmod 2$.

Proof. We first show that $q \mid n$ implies that $q \mid x$ and $q \mid y$. Otherwise, without loss of generality, $(q, x) = 1$. This means that $x$ is invertible mod $q$ and we get $(yx^{-1})^2 \equiv -1 \pmod q$. By Lemma 2.2 we conclude that $q \equiv 1 \pmod 4$, a contradiction. This proves our assertion.

Let now $q^k$ be the highest power of $q$ with $q^k \mid x$ and $q^k \mid y$. Then $q^{2k} \mid n$ and we claim that $q^{2k+1} \nmid n$. To see this write

$$X = x/q^k, \quad Y = y/q^k, \quad N = n/q^{2k}.$$

Then $N = X^2 + Y^2$ and $(X, q) = 1$ or $(Y, q) = 1$. If $q \mid N$ we get again $(YX^{-1})^2 \equiv -1 \pmod q$ (or the same with $X$ and $Y$ interchanged) and once again Lemma 2.2 gives $q \equiv 1 \pmod 4$, a contradiction. Hence $a = 2k$, which proves the lemma. //

We show that Theorem 2.1 implies Theorem 2.2.
“⇒”: This is Lemma 2.3.
“⇐”: By Lemma 2.1 it suffices to write each factor as a sum of two squares. For $p = 2$ we take $2 = 1^2 + 1^2$, for a prime $q \equiv 3 \pmod 4$, which occurs to an even power, we take $q^2 = q^2 + 0^2$, and for $p \equiv 1 \pmod 4$ we use Theorem 2.1.

As mentioned at the beginning we briefly give the main idea of the proof of Theorem 2.1. We work over the field $\mathbb{Q}[i]$, whose ring $\mathbb{Z}[i]$ is euclidean and therefore a unique factorization domain, and show that $p = N(\alpha)$ for some $\alpha = x + iy$. Then $p = (x + iy)(x - iy) = x^2 + y^2$.


3 Number rings

(Febr. 26, 2012)

We take a square-free integer d and consider the set

$$\mathbb{Q}(\sqrt{d}) = \{r + s\sqrt{d};\ r, s \in \mathbb{Q}\} = \mathbb{Q}[T]/(T^2 - d).$$

This is a priori only a vector space over $\mathbb{Q}$ of dimension 2 with basis $1, \sqrt{d}$, but it is an easy exercise (do it!) that this is indeed a field, a quadratic number field. For $d > 0$ we call the field real quadratic, for $d < 0$ imaginary quadratic.

a) Determination of the ring of integers Od

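A quadratic number field is an extension of degree two of $\mathbb{Q}$ and is therefore Galois over $\mathbb{Q}$. Its Galois group is $\mathrm{Gal}(\mathbb{Q}(\sqrt{d})/\mathbb{Q}) = \{1, \sigma\}$ with $\sigma \ne 1$ and $\sigma^2 = 1$. We introduce several operations on $\mathbb{Q}[\sqrt{d}]$ related to $\mathrm{Gal}(\mathbb{Q}(\sqrt{d})/\mathbb{Q})$:

• conjugation: $\sigma : \xi = r + s\sqrt{d} \mapsto \xi^\sigma = r - s\sqrt{d} = (r + s\sqrt{d})^\sigma$

• norm: $N = N_{\mathbb{Q}(\sqrt{d})/\mathbb{Q}} : \mathbb{Q}(\sqrt{d})^\times \to \mathbb{Q}^\times$, $\xi \mapsto N(\xi) := \xi \cdot \xi^\sigma$. The norm is a (multiplicative) group homomorphism.

• trace: $\mathrm{Tr} = \mathrm{Tr}_{\mathbb{Q}(\sqrt{d})/\mathbb{Q}} : \mathbb{Q}(\sqrt{d}) \to \mathbb{Q}$, $\xi \mapsto \mathrm{Tr}(\xi) := \xi + \xi^\sigma$. The trace is an (additive) group homomorphism.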

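One calculates $N(\xi) = r^2 - ds^2$, $\mathrm{Tr}(\xi) = 2r$, and one sees that the minimal polynomial for $\xi \notin \mathbb{Q}$ is $(T - \xi)(T - \xi^\sigma) = T^2 - \mathrm{Tr}(\xi)T + N(\xi)$. Using norm and trace we make the following

Definition. $\xi \in \mathbb{Q}[\sqrt{d}]$ is called integral ⇔ $N(\xi), \mathrm{Tr}(\xi) \in \mathbb{Z}$.

It is easy to determine the integral elements of $\mathbb{Q}[\sqrt{d}]$. It is clear that

$$N(\xi), \mathrm{Tr}(\xi) \in \mathbb{Z} \iff r^2 - ds^2,\ 2r \in \mathbb{Z}.$$

We write $r = a/c$, $s = b/c$, $(a, b, c) = 1$ and find that

$$2r,\ r^2 - ds^2 \in \mathbb{Z} \iff c \mid 2a \ \wedge\ c^2 \mid (a^2 - db^2).$$

Now, if $e = (a, c)$, then $e^2 \mid db^2$ and, $d$ being square-free, $e \mid b$ ⇒ $e \mid (a, b, c) = 1$ ⇒ $e = 1$. This gives $(a, c) = 1$, and together with $c \mid 2a$ we obtain $c \in \{1, 2\}$. We shall show that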

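$$O_d = \left\{\frac{u + v\sqrt{d}}{2};\ u, v \in \mathbb{Z},\ u^2 - dv^2 \equiv 0 \bmod 4\right\}.$$

To verify this we denote by $U$ the right hand side and first show that it contains $O_d$. Let $\xi$ be in $O_d$. Then

$$\xi = \begin{cases} a + b\sqrt{d}, & \text{if } c = 1, \\ (a + b\sqrt{d})/2, \quad a^2 - db^2 \equiv 0 \bmod 4, & \text{if } c = 2. \end{cases}$$

Writing $u = 2a/c$, $v = 2b/c$ we find that

$$\xi = (u + v\sqrt{d})/2, \qquad u^2 - dv^2 \equiv 0 \bmod 4,$$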

which leads to $\xi \in U$. For the inclusion $U \subseteq O_d$ we just take the norm $N(\xi) = (u^2 - dv^2)/4$ and the trace $\mathrm{Tr}(\xi) = u$, and the congruence condition shows that they are both integers.

To describe $O_d$ explicitly we have to solve the congruence $u^2 - dv^2 \equiv 0\ (4)$. The solution depends on the congruence class of $d$ modulo 4. In the case $d \equiv 2, 3\ (4)$ we get $u^2 - 2v^2 \equiv 0\ (4)$ if $d \equiv 2\ (4)$ and $u^2 + v^2 \equiv 0\ (4)$ if $d \equiv 3\ (4)$. If $d \equiv 1\ (4)$ we have to solve the congruence $u^2 - v^2 \equiv 0\ (4)$. In the case $d \equiv 2\ (4)$ we have $u^2 \equiv 0\ (2)$ and then $v^2 \equiv 0\ (2)$, and this leads to $u \equiv 0 \equiv v\ (2)$. If $d \equiv 3\ (4)$, since squares are congruent to either 0 or 1 modulo 4, we conclude that both $u^2$ and $v^2$ are $\equiv 0\ (4)$ and then again $u \equiv 0 \equiv v\ (2)$. In the case $d \equiv 1\ (4)$ we have

$$u^2 - v^2 \equiv 0\ (4) \iff u \equiv v\ (2),$$

that is $u = v + 2w$ for some $w \in \mathbb{Z}$. We conclude that $\xi = \tfrac{1}{2}(v + 2w + v\sqrt{d}) = w + v(1 + \sqrt{d})/2$ and this is in $O_d$. The final result is

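Theorem 3.1. The integers in $\mathbb{Q}[\sqrt{d}]$ are

(i) $O_d = \{a + b\sqrt{d};\ a, b \in \mathbb{Z}\}$, if $d \equiv 2, 3 \bmod 4$

(ii) $O_d = \left\{a + b\,\frac{1 + \sqrt{d}}{2};\ a, b \in \mathbb{Z}\right\}$, if $d \equiv 1 \bmod 4$.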
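The definition of integrality via norm and trace lends itself to a direct check with exact rational arithmetic. The sketch below (function names are ours) represents $\xi = r + s\sqrt{d}$ by the pair $(r, s)$ and confirms on a few values of $d$ that $(1 + \sqrt{d})/2$ is integral exactly when $d \equiv 1 \bmod 4$.

```python
from fractions import Fraction

def norm(r, s, d):
    """Norm r^2 - d*s^2 of xi = r + s*sqrt(d)."""
    return r * r - d * s * s

def trace(r, s):
    """Trace 2r of xi = r + s*sqrt(d)."""
    return 2 * r

def is_integral(r, s, d):
    """xi = r + s*sqrt(d) is integral iff norm and trace lie in Z."""
    r, s = Fraction(r), Fraction(s)
    return norm(r, s, d).denominator == 1 and trace(r, s).denominator == 1

half = Fraction(1, 2)
for d in (-7, -5, -3, -2, -1, 2, 3, 5, 13):
    # (1 + sqrt(d))/2 should be integral exactly for d = 1 (mod 4)
    print(d, is_integral(half, half, d), d % 4 == 1)
```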

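We shall write

$$\omega = \begin{cases} \dfrac{1 + \sqrt{d}}{2}, & d \equiv 1 \bmod 4, \\ \sqrt{d}, & d \equiv 2, 3 \bmod 4, \end{cases}$$

and then $O_d$ can be written uniformly as $O_d = \mathbb{Z} + \mathbb{Z}\omega$. We also may replace $\omega$ by $(d - \sqrt{d})/2$ with the same effect. The discriminant becomes

$$d_K = \det \begin{pmatrix} 1 & \omega \\ 1 & \omega^\sigma \end{pmatrix}^2 = \begin{cases} d, & d \equiv 1 \pmod 4, \\ 4d, & d \equiv 2, 3 \pmod 4. \end{cases}$$

We see then that either $d_K$ is odd, or $d_K$ is of the form $d_K = 8\delta$ (for $d \equiv 2 \bmod 4$) or $d_K = 4\delta$ (for $d \equiv 3 \bmod 4$) with $\delta$ odd.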

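b) Determination of units:

We denote by $U_d = O_d^\times$ the group of invertible elements in $O_d$. By definition

$$\varepsilon = \tfrac{1}{2}(a + b\sqrt{d}) \in U_d \iff \exists\, \varepsilon' \in O_d : \varepsilon\varepsilon' = 1.$$

On taking the norm we get $1 = N(\varepsilon\varepsilon') = N(\varepsilon)N(\varepsilon')$, which leads to $N(\varepsilon) = \pm 1$; conversely, if $N(\varepsilon) = \varepsilon\varepsilon^\sigma = \pm 1$ then $\pm\varepsilon^\sigma$ is an inverse of $\varepsilon$. The condition $N(\varepsilon) = \pm 1$ is equivalent to the famous Pell's equation

$$a^2 - db^2 = \pm 4 \qquad \text{(Pell's equation)}.$$

Let $P(d)$ be the set of solutions of the Pell equation.

Theorem 3.2. The map $(u, v) \mapsto \tfrac{1}{2}(u + v\sqrt{d})$ gives a bijection $P(d) = \{(u, v);\ u^2 - dv^2 = \pm 4\} \to U_d$.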
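For small $d$ the solutions of $u^2 - dv^2 = \pm 4$ can simply be enumerated. The following sketch (the function name `pell_solutions` and the search bound are our own) lists all solutions in a box; for $d < 0$ this already gives the full unit group, while for $d > 0$ it only finds the units with small coordinates.

```python
def pell_solutions(d, bound):
    """All (u, v) with |u|, |v| <= bound and u^2 - d*v^2 = +4 or -4."""
    return [(u, v)
            for u in range(-bound, bound + 1)
            for v in range(-bound, bound + 1)
            if u * u - d * v * v in (4, -4)]

print(pell_solutions(-1, 5))       # four solutions, corresponding to 1, -1, i, -i
print(len(pell_solutions(-3, 5)))  # six solutions, the sixth roots of unity
print(pell_solutions(-5, 5))       # only (u, v) = (+-2, 0), i.e. the units +-1
print((1, 1) in pell_solutions(5, 10))  # (1 + sqrt(5))/2 is a unit of norm -1
```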

A famous theorem, Dirichlet's unit theorem, says that $U_d$ is a finitely generated abelian group, and this means that $U_d \cong \mathbb{Z}^r \times U_{d,\mathrm{tor}}$ with $U_{d,\mathrm{tor}}$ finite and with $r$ the rank of $U_d$, which is 0 if $d < 0$ and 1 if $d > 0$. Unfortunately Dirichlet's unit theorem does not give a constructive way to find the units in $O_d$. In the case of a real quadratic number field (and this is the only interesting case for quadratic number fields) there is an alternative way through the theory of continued fractions (see [10]). However we can easily determine the torsion subgroup in both the real and the imaginary quadratic case. This gives all units in the imaginary quadratic case.


Theorem 3.3. The torsion group $U_{d,\mathrm{tor}}$ is $\mu_2$ except in the cases $d = -1$ and $d = -3$, where $U_{-1,\mathrm{tor}} = \mu_4$ and $U_{-3,\mathrm{tor}} = \mu_6$.

Proof. We know from algebra that the torsion subgroup of the multiplicative group of a field is a group of roots of unity. Clearly $U_{d,\mathrm{tor}} \supseteq \mu_2$ and we have to determine the remaining roots of unity, if any. In the case $d > 0$ the field is real and the only roots of unity are $\pm 1$. If $d \le -5$ then we must have $4 = u^2 - dv^2 \ge u^2 + 5v^2$, and this immediately leads to $v = 0$, $u^2 = 4$ and then to $v = 0$, $u = \pm 2$; the same conclusion holds for $d = -2$, since $u^2 + 2v^2 = 4$ forces $v = 0$. An application of Theorem 3.2 then gives $U_d = \{\pm 1\} = \mu_2$. In the case $d = -1$ one gets $u^2 + v^2 = 4$ and in the case $d = -3$ one has to solve $u^2 + 3v^2 = 4$. In the first case this gives $u = \pm 2, v = 0$ or $u = 0, v = \pm 2$, and then $U_{-1,\mathrm{tor}} = \{(u + v\sqrt{-1})/2\} = \{\pm 1, \pm i\} = \mu_4$. In the second case $d = -3$ one has $u = \pm 2, v = 0$ or $u = \pm 1, v = \pm 1$, implying that $U_{-3,\mathrm{tor}} = \{\pm 1\} \cup \{(u + v\sqrt{-3})/2;\ u, v = \pm 1\} = \mu_6$. //


4 Factorization

(March 4, 2012)

In $\mathbb{Z}$ there are two different possibilities to define a prime number $p$, namely to say that $p$ satisfies one of

a) $p$ prime number ⇔ ($p = ab$ ⇒ $a$ unit or $b$ unit),
b) $(p)$ prime ideal ⇔ ($ab \in (p)$ ⇒ $a \in (p)$ or $b \in (p)$).

For $\mathbb{Z}$ both definitions are equivalent, i.e. $p$ prime number ⇔ $(p)$ prime ideal. One immediately deduces that in $\mathbb{Z}$ we have unique factorization.

Now we replace $\mathbb{Z}$ by a quadratic number ring $O$ and ask whether we have unique factorization as well, and if not, why. Some light is shed on the problem by an example.

Example: We take $O_{-5}$, which is equal to $\mathbb{Z}[\sqrt{-5}]$ by Theorem 3.1 since $-5 \equiv 3 \pmod 4$. Clearly $2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$. The number 2 satisfies the first condition a): $2 = ab$ implies $4 = N(a)N(b)$. Since $N(x + y\sqrt{-5}) = x^2 + 5y^2$ takes the values $0, 1, 4, 5, 6, 9, \ldots$ but never the value 2, this gives $N(a) = 1$ or $N(a) = 4$. In the first case $a$ is a unit; in the second case we write $a = x + y\sqrt{-5}$ and then $N(a) = x^2 + 5y^2$. If $x^2 + 5y^2 = N(a) = 4$, then $y = 0$ and $x = \pm 2$, which shows that $b = \pm 1$ is a unit and that 2 is a prime in the sense a). However

$$6 = (1 + \sqrt{-5})(1 - \sqrt{-5}) \in (2).$$

We next show that $1 + \sqrt{-5} \notin (2)$ and $1 - \sqrt{-5} \notin (2)$. Since $1 + \sqrt{-5} \in (2) \iff 1 - \sqrt{-5} \in (2)$ by Galois conjugation, it suffices to consider $1 + \sqrt{-5}$. If $1 + \sqrt{-5} \in (2)$ then $1 + \sqrt{-5} = 2\alpha$ for some $\alpha \in O_{-5}$, and then $\alpha = \tfrac{1}{2} + \tfrac{1}{2}\sqrt{-5} \in O_{-5} = \mathbb{Z} + \mathbb{Z}\sqrt{-5}$, a contradiction. Therefore $(1 + \sqrt{-5})(1 - \sqrt{-5}) \in (2)$ but $1 + \sqrt{-5} \notin (2)$ and $1 - \sqrt{-5} \notin (2)$.

As a conclusion we see that in $O_{-5}$ the two definitions are not equivalent. Therefore they define different notions.
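The norm computations in this example can be reproduced mechanically. In the sketch below an element $x + y\sqrt{-5}$ is stored as the pair $(x, y)$; the helper names are ours. It confirms that no element has norm 2 or 3, so that the four factors in $2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$ cannot be decomposed further.

```python
from math import isqrt

def norm(z):
    """Norm x^2 + 5y^2 of z = x + y*sqrt(-5), stored as (x, y)."""
    x, y = z
    return x * x + 5 * y * y

def mul(z, w):
    """Product in Z[sqrt(-5)], using sqrt(-5)^2 = -5."""
    (x1, y1), (x2, y2) = z, w
    return (x1 * x2 - 5 * y1 * y2, x1 * y2 + x2 * y1)

def elements_of_norm(n):
    """All (x, y) with x^2 + 5y^2 = n."""
    xs, ys = isqrt(n), isqrt(n // 5)
    return [(x, y)
            for x in range(-xs, xs + 1)
            for y in range(-ys, ys + 1)
            if norm((x, y)) == n]

print(mul((1, 1), (1, -1)))   # (6, 0), i.e. (1 + sqrt(-5))(1 - sqrt(-5)) = 6
print(elements_of_norm(2))    # empty: no element of norm 2
print(elements_of_norm(3))    # empty: no element of norm 3
print(elements_of_norm(4))    # only +-2
```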

Definition 4.1. Let $O_d$ be a quadratic number ring. An element $0 \ne a \in O_d$ is called irreducible if it is not a unit and if $a = bc$ implies $b \in U_d$ or $c \in U_d$. It is called a prime if the ideal $(a)$ is a prime ideal, i.e. $bc \in (a)$ ⇒ $b \in (a)$ or $c \in (a)$.

As seen before, in $\mathbb{Z}$ an element is irreducible if and only if it is prime. In the case $O_{-5} = \mathbb{Z}[\sqrt{-5}]$ the element 2 is irreducible but not prime. Similarly 3 is irreducible (look as before at norms).

Now let $O_d$ be a quadratic number ring. Then prime elements are irreducible: suppose that $a$ is prime, and let $a = bc$. Then $b$ or $c \in (a)$. Without loss of generality $b \in (a)$ ⇒ $b = ua$ ⇒ $a = uca$ ⇒ $(1 - uc)a = 0$ ⇒ $uc = 1$ ⇒ $c \in U_d$.

We say that b divides a and write b|a if there exists c ∈ Od such that a = bc.

Lemma 4.1. If $\pi_1, \ldots, \pi_n$ are irreducible elements and $\pi$ is prime and divides $\pi_1 \cdots \pi_n$, then there exist an index $k$ and a unit $\epsilon$ such that $\pi_k = \epsilon\pi$.

Proof: Since $\pi$ divides the product we deduce that $\pi_1 \cdots \pi_n = \alpha\pi$, and this means that $\pi_1 \cdots \pi_n \in (\pi)$. Since $\pi$ is prime, there exists an index $k$ such that $\pi_k \in (\pi)$. In other words $\pi_k = \alpha\pi$ for some $\alpha \in O_d$. This shows that $\pi \mid \pi_k$. Since $\pi_k$ is irreducible and $\pi$ is not a unit, it follows that $\alpha$ is a unit and therefore $\pi_k = \epsilon\pi$ for some unit $\epsilon \in O_d$ as stated. □

In Od there are two ways to factorize:

• into irreducible elements
• into prime elements

The next theorem shows that a factorization into irreducible elements is always possible.

Theorem 4.1. Let $O$ be a quadratic number ring. Then every $0 \ne a \in O$ can be expressed as $a = \varepsilon\pi_1 \cdots \pi_r$ with $\varepsilon$ a unit and $\pi_i$ irreducible.

Proof: We prove the assertion by induction on $N = |N(a)|$. For $N = 1$ we have $N(a) = \pm 1$, hence $a$ is a unit and there is nothing more to show in this case. Let now $N > 1$ and assume that the claim holds for all $0 \ne b \in O$ with $|N(b)| < N$. If $a$ is irreducible, then we are done again (since $a = a$ is a factorization as claimed in the statement). Otherwise we can factor $a = bc$ with $b, c$ not units and not associated to $a$; hence $1 < |N(b)|, |N(c)| < N$. By the induction hypothesis we can factor $b$ and $c$ as wanted. But then the same is true for $a$ since $a = bc$. This proves the claim. □

Definition 4.2. We call $O_d$ a unique factorization domain if every $0 \ne a \in O_d$ can be written as $a = \varepsilon\pi_1 \cdots \pi_r$ with a unit $\varepsilon$ and with irreducible elements $\pi_i$, and if this factorization is unique in the sense that if $a = \varepsilon'\pi'_1 \cdots \pi'_s$ is a second factorization, then $r = s$ and, up to permutation, $\pi'_i = \varepsilon_i\pi_i$ with $\varepsilon_i$ a unit.

If we take prime elements instead of irreducible elements we get

Definition 4.3. $O_d$ is factorial if for all $0 \ne a \in O_d$ there exist a unit $\varepsilon \in U_d$ and primes $\pi_1, \ldots, \pi_r$ in $O_d$ such that $a = \varepsilon\pi_1 \cdots \pi_r$.

It is not difficult to show that if $O_d$ is factorial then the factorization is unique up to units. We see this by taking two factorizations

$$\varepsilon\pi_1 \cdots \pi_r = \varepsilon'\pi'_1 \cdots \pi'_s$$

as a product of prime elements and applying Lemma 4.1 recursively.

The last two definitions define a priori two different notions, but in fact they turn out to be the same, as can be deduced from the subsequent discussion. We begin with a technical lemma which is used in the proof of our next theorem.

Lemma 4.2. Let $p$ be a rational prime. Then $p$ is reducible in $O_d$ if and only if there is an $a \in O_d$ with $N(a) = \pm p$. In this case $a$ is a prime element in $O_d$.

Proof: “⇒”: If $p$ is reducible, then $p = ab$ with $1 < |N(a)|, |N(b)| < |N(p)|$. Taking norms we get $p^2 = N(p) = N(a)N(b)$ and therefore $N(a) = \pm p$, which is what we want.
“⇐”: Assume that $N(a) = \pm p$ for some $a \in O_d$. Then $\pm aa^\sigma = p$ with $1 < |N(a)| = |N(a^\sigma)| < |N(p)|$. This means that $p$ is reducible. We prove the additional statement: Observe that $pO_d \subset aO_d \subset O_d = \mathbb{Z} + \mathbb{Z}\omega$ and, by looking at norms, that these inclusions are strict. Thus $O_d/aO_d$ is a proper quotient of $O_d/pO_d \cong \mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/p\mathbb{Z}$ (as abelian groups). Hence $O_d/aO_d \cong \mathbb{Z}/p\mathbb{Z}$ is a field. It follows that $aO_d = (a)$ is a prime ideal (even a maximal ideal), hence $a$ is prime. □

Theorem 4.2. Let $O_d$ be a quadratic number ring. Then $O_d$ is factorial if and only if every rational prime $p$ remains prime in $O_d$ or is reducible.

Proof: The direction from left to right is easy: either the rational prime $p$ remains prime in $O_d$, and then there is nothing to prove, or, $O_d$ being factorial, $p$ can be decomposed as a product of at least two prime elements, and then $p$ can be written as $ab$ with $a$ and $b$ not units. This shows that $p$ becomes reducible in $O_d$. We just have to prove the converse. Let $a \in O_d$ be irreducible and let $p$ be a (rational) prime dividing $N(a) = aa^\sigma$. If $p$ remains prime as an element of $O_d$, then $p \mid aa^\sigma$ implies that $p \mid a$ or $p \mid a^\sigma$; in the second case it follows that $p = p^\sigma \mid a^{\sigma\sigma} = a$. Hence $p \mid a$ in both cases, which means (since $a$ is irreducible) that $a$ is associated to a prime and therefore is prime itself. If $p$ is not prime, then by assumption it is reducible. Hence by Lemma 4.2 there is a prime $b \in O_d$ with $N(b) = \pm p$. It follows that $b \mid p \mid aa^\sigma$. Thus either $b \mid a$ or $b^\sigma \mid a$. In either case, as before, $a$ is associated to a prime and hence prime itself. Since by Theorem 4.1 every non-zero element is a unit times a product of irreducible elements, $O_d$ is factorial. □

Theorem 4.3. Suppose that $O_d$ is a domain in which every element different from 0 and not a unit can be written up to a unit as a product of irreducible elements. Then the factorization is unique ⇔ irreducible elements are prime.

Proof: “⇐”: Assume that irreducible elements are prime and let

$$\varepsilon\pi_1 \cdots \pi_r = \varepsilon'\pi'_1 \cdots \pi'_s$$

be two factorizations. Then $\pi'_1 \cdots \pi'_s \in (\pi_1)$. Since $(\pi_1)$ is a prime ideal we deduce that, up to renumbering, $\pi'_1 \in (\pi_1)$. This implies that $\pi'_1 = \eta\pi_1$ for some $\eta \in O_d$, and since $\pi'_1$ is irreducible and $\pi_1$ is not a unit we conclude that $\eta$ is a unit. Induction then shows that $r = s$ and, up to units, that $\pi_i = \pi'_i$ as stated.
“⇒”: If the factorization is unique, $\pi$ is irreducible and $ab \in (\pi)$, then $ab = \alpha\pi$ for some $\alpha \in O_d$. Write $a$, $b$ and $\alpha$ as products of irreducible elements to see that by the unique factorization property $\pi \mid a$ or $\pi \mid b$, and this means that $a \in (\pi)$ or $b \in (\pi)$. □

Corollary 4.1. The ring $O_d$ is a unique factorisation domain if and only if it is factorial.

Proof: Since prime elements are irreducible, factorial rings are unique factorisation domains. Conversely, if the ring is a unique factorisation domain then by Theorem ?? irreducible elements are prime, and then uniqueness of the factorization into irreducible elements implies unique factorization into prime elements. □

Theorem 4.4. A principal ideal domain $O_d$ is a unique factorization domain.

Proof: We first show that in $O_d$ every element can be written up to a unit as a finite product of irreducible elements. We let $S$ be the set of principal ideals $\{(a),\ 0 \ne a \in O_d\}$ such that $a$ does not have this property. We show that the set is empty. Otherwise let $(a)$ be a maximal element. Such an element exists, for otherwise for each $(b) \in S$ there would exist a $(c) \in S$ such that $(b) \subsetneq (c)$, and one would be able to construct a strictly increasing infinite sequence of ideals $(a_1) \subsetneq (a_2) \subsetneq \cdots \subsetneq (a_n) \subsetneq \cdots$ with union an ideal in $O_d$. Since $O_d$ is a principal ideal domain we may express the union as $(e)$ for some $e \in O_d$. The element $e$ must be contained in $(a_n)$ for some $n$, which implies that $(a_{n+k}) \subseteq (e) \subseteq (a_n)$ for all $k \ge 0$. This means that $(a_{n+k}) = (a_n)$ for all $k$, a contradiction. Our $a$ cannot be irreducible and therefore there exist non-units $b$ and $c$ in $O_d$ with $a = bc$, and neither $(b)$ nor $(c)$ is in $S$. Therefore $b$ and $c$ can be written up to units as a finite product of irreducible elements. The same holds then for the product $a = bc$, which means that $(a) \notin S$, a contradiction, and $S = \emptyset$.

To complete the proof we show that irreducible elements are prime. Let $p$ be irreducible and suppose that $bc \in (p)$. If $b \notin (p)$ then $(b, p) = O_d$, since $(b, p) = (e)$ is principal with $e \mid p$ and $p$ is irreducible. This means that $1 = rb + sp$ with $r, s \in O_d$. Multiplication with $c$ gives $c = crb + csp$ and we conclude that $c \in (p)$. Now we apply Theorem 4.3. □

A very important question is now: For which $d$ is the number ring $O_d$ a principal ideal domain? An answer is in general unknown. Special cases are euclidean rings: here an analogue of the euclidean algorithm exists, and this shows that such rings are principal and by Theorem 4.4 they are unique factorization domains.

To give an example we mention the ring $\mathbb{Z}[i]$, which is euclidean and therefore also a unique factorization domain. We come back to this in Lecture 5.


5 Primes in Gaussian numbers rings

(March 12, 2012)

In the introduction we have asked the question which integers can be written as a sum of two squares. The answer was given in Lecture 2. We also asked the question which primes can be written as a sum of two squares. As an example we mention that

2 = 1 + 1, 5 = 1 + 4, 13 = 4 + 9, 17 = 1 + 16, 29 = 4 + 25, 37 = 1 + 36, 41 = 16 + 25

In all these examples except the first we deduce from $p = a^2 + b^2$ that $p \equiv 1\ (4)$. Indeed, squares are either $\equiv 0\ (4)$ or $\equiv 1\ (4)$. Since a prime $p \ne 2$ is $\equiv 1\ (2)$, if $p = a^2 + b^2$ is a sum of squares we may assume without loss of generality that $a^2 \equiv 0\ (2)$ and $b^2 \equiv 1\ (2)$, and this implies that $a^2 \equiv 0\ (4)$ and $b^2 \equiv 1\ (4)$ and finally $p \equiv 1\ (4)$. The converse is also true.

Theorem 5.1. Let $p \ne 2$ be a prime. Then $p = a^2 + b^2 \iff p \equiv 1\ (4)$.

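The proof uses the Gaussian number ring which we have to introduce first. We consider the field $\mathbb{Q}_{-1} = \mathbb{Q}[i] = \{r + si;\ r, s \in \mathbb{Q}\}$. In this case $d = -1 \equiv 3\ (4)$ and therefore $O_{-1} = \mathbb{Z} + \mathbb{Z}\omega$, which is $\mathbb{Z}[i]$ since we can take $\omega$ as $i$. Furthermore $U_{-1} = \mu_4$ and the norm is

$$N_{\mathbb{Q}_{-1}/\mathbb{Q}} : \mathbb{Q}_{-1} \to \mathbb{Q}, \qquad \alpha = a + ib \mapsto \alpha\alpha^\sigma = a^2 + b^2.$$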

Theorem 5.2. The ring of integers O−1 in Q−1 is euclidean.

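Proof: We define a euclidean algorithm, i.e. given $\alpha, \beta \in O_{-1}$ with $\beta \ne 0$ we show that there exist $\gamma, \rho \in O_{-1}$ such that $\alpha = \gamma\beta + \rho$ and $N(\rho) < N(\beta)$. Consider the lattice $\mathbb{Z}[i] = \mathbb{Z} + \mathbb{Z}i \subseteq \mathbb{C}$ and write

$$\frac{\alpha}{\beta} = \xi = [\xi] + \{\xi\}$$

with $[\xi] \in O_{-1}$ and $\{\xi\} = a + bi$ chosen so that $-1/2 \le a < 1/2$, $-1/2 \le b < 1/2$. Then

$$N(\{\xi\}) = a^2 + b^2 \le \frac{1}{2}.$$

Write $\gamma = [\xi]$ and $\rho = \alpha - \gamma\beta$. Since $\alpha, \beta, \gamma \in O_{-1}$ we see that $\rho = \{\xi\} \cdot \beta \in O_{-1}$ and $\alpha = \gamma\beta + \rho$. Furthermore $N(\rho) = N(\{\xi\}) \cdot N(\beta) < N(\beta)$, and this shows the existence of the desired division with remainder term. □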
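The division with remainder from this proof can be carried out in exact integer arithmetic. The sketch below stores $a + bi$ as the pair $(a, b)$; the function names are ours, and the rounding rule `round_div` is chosen so that the fractional part lies in $[-1/2, 1/2)$ as in the proof.

```python
def norm(z):
    """Norm a^2 + b^2 of the Gaussian integer z = a + bi, stored as (a, b)."""
    a, b = z
    return a * a + b * b

def round_div(n, d):
    """Nearest integer to n/d for d > 0, i.e. floor(n/d + 1/2)."""
    return (2 * n + d) // (2 * d)

def divmod_gauss(alpha, beta):
    """Return (gamma, rho) with alpha = gamma*beta + rho and N(rho) <= N(beta)/2."""
    (a, b), (c, d) = alpha, beta
    n = c * c + d * d                      # N(beta) > 0
    # alpha/beta = alpha * conj(beta) / N(beta)
    re, im = a * c + b * d, b * c - a * d
    g0, g1 = round_div(re, n), round_div(im, n)
    # rho = alpha - gamma*beta
    rho = (a - (g0 * c - g1 * d), b - (g0 * d + g1 * c))
    return (g0, g1), rho

gamma, rho = divmod_gauss((27, 23), (8, 1))
print(gamma, rho, norm(rho), norm((8, 1)))
```

The remainder is not unique in general; other rounding conventions give other valid pairs $(\gamma, \rho)$.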

We are now ready to prove Theorem 5.1.

Proof: We begin with showing that every prime $p \equiv 1\ (4)$ is in the image under the norm function of prime elements in $O_{-1}$. Our first step is to prove that if $p \equiv 1\ (4)$ then $p$ is not a prime element in $\mathbb{Z}[i]$. Since $p = 1 + 4n$ and since by Wilson's theorem we have $(p-1)! \equiv -1 \pmod p$ we get

$$-1 \equiv (p-1)! \equiv 1 \cdot 2 \cdots (2n)\,(p-1)(p-2) \cdots (p-2n) \equiv (2n)!\,(-1)^{2n}\,(2n)! \equiv ((2n)!)^2 \pmod p.$$

This shows that $x = (2n)!$ is a solution of $x^2 \equiv -1 \pmod p$ and therefore

$$(x + i)(x - i) = x^2 + 1 \in (p).$$

If $p$ were a prime in $\mathbb{Z}[i]$ then $x + i$ or $x - i$ would be in $(p)$, and this would mean that $x + i = \alpha p$ with some $\alpha \in \mathbb{Z}[i]$ or that $x - i = \alpha p$. But then

$$\alpha = \frac{x}{p} \pm \frac{i}{p} \notin \mathbb{Z}[i],$$

in contradiction to the definition of $\alpha$. Therefore $p$ is not a prime element in the ring $O_{-1}$ which, being euclidean, is a principal ideal domain. By Theorem 4.4 it is a unique factorization domain. Theorem 4.3 tells us that in this situation irreducible elements are prime, and therefore $p$, not being prime, is not irreducible. It follows that $p = \alpha\beta$ for some elements $\alpha, \beta \notin U_{-1}$. Taking norms gives $p^2 = N(\alpha)N(\beta)$ with $N(\alpha), N(\beta) > 1$, and this implies that $N(\alpha) = p = N(\beta)$. If we write $\alpha = a + ib$ with $a, b \in \mathbb{Z}$, then $p = N(\alpha) = a^2 + b^2$ as stated in Theorem 5.1. □
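The proof can be turned into a procedure for actually finding $a$ and $b$. The sketch below is one possible realization and not taken from the notes: since $p$ divides $(x + i)(x - i)$ but neither factor, a greatest common divisor of $p$ and $x + i$ in $\mathbb{Z}[i]$ is a non-unit proper divisor of $p$ and hence has norm $p$. The gcd is computed with the euclidean algorithm; all function names are ours, and $p$ is assumed to be a prime with $p \equiv 1\ (4)$.

```python
from math import factorial

def gauss_mod(alpha, beta):
    """Remainder of alpha modulo beta in Z[i]; elements are pairs (a, b) = a + bi."""
    (a, b), (c, d) = alpha, beta
    n = c * c + d * d
    # Round alpha * conj(beta) / N(beta) to a nearest Gaussian integer (g0, g1).
    g0 = (2 * (a * c + b * d) + n) // (2 * n)
    g1 = (2 * (b * c - a * d) + n) // (2 * n)
    return (a - (g0 * c - g1 * d), b - (g0 * d + g1 * c))

def gauss_gcd(alpha, beta):
    """Euclidean algorithm in Z[i]; terminates because the norm strictly decreases."""
    while beta != (0, 0):
        alpha, beta = beta, gauss_mod(alpha, beta)
    return alpha

def two_squares(p):
    """For a prime p = 1 (mod 4) return (a, b) with a^2 + b^2 = p."""
    assert p % 4 == 1, "assumes a prime p = 1 (mod 4)"
    x = factorial((p - 1) // 2) % p        # x^2 = -1 (mod p) by Wilson's theorem
    a, b = gauss_gcd((p, 0), (x, 1))       # gcd(p, x + i) has norm p
    return abs(a), abs(b)

for p in (5, 13, 17, 29, 37, 41):
    a, b = two_squares(p)
    print(p, a, b, a * a + b * b == p)
```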

We now determine the prime elements in $\mathbb{Z}[i]$, i.e. the elements $\pi$ such that $(\pi)$ is a prime ideal. Note that this amounts to the same as determining the irreducible elements, because in our situation the irreducible elements are exactly the prime elements.

Theorem 5.3. The prime elements $\pi$ of $\mathbb{Z}[i]$ are given (up to units) by
1) $\pi = 1 + i$
2) $\pi = a + bi$ with $a^2 + b^2 = p$, $p \equiv 1\ (4)$ and $a > |b| > 0$
3) $\pi = p$ with $p \equiv 3\ (4)$.

Proof: “⇐”: The elements $\pi$ listed in 1) and 2) are prime elements, since $\pi = \alpha\beta$ with $\alpha, \beta \in \mathbb{Z}[i]$ implies $p = N(\pi) = N(\alpha)N(\beta)$ (with $p = 2$ in case 1)), and this gives $N(\alpha) = 1$ or $N(\beta) = 1$ and therefore $\alpha \in U_{-1}$ or $\beta \in U_{-1}$. This shows that $\pi$ is irreducible and therefore prime. If $\pi$ is as in 3) and $\pi = \alpha\beta$ then $p^2 = N(p) = N(\pi) = N(\alpha)N(\beta)$. If $\alpha, \beta \notin U_{-1}$ then $N(\alpha) = N(\beta) = p$ and we may express $\alpha$ as $\alpha = a + ib$ with $p = N(\alpha) = a^2 + b^2$, and this would imply that $p \equiv 1\ (4)$, contrary to the assumption.
“⇒”: We show that any prime $\pi$ is (up to a unit) one of the prime elements in 1), 2) or 3). First write

$$N(\pi) = \pi \cdot \pi^\sigma = p_1^{e_1} \cdots p_r^{e_r}$$

with $p_1, \ldots, p_r$ distinct rational primes. Then $p_1^{e_1} \cdots p_r^{e_r} \in (\pi)$ and, $\pi$ being prime, we find that $p \in (\pi)$, i.e. $\pi \mid p$, for some $p \in \{p_1, \ldots, p_r\}$. Hence $N(\pi)$ divides $N(p) = p^2$, which gives $N(\pi) \in \{p, p^2\}$. If $N(\pi) = p$ we get $p = a^2 + b^2$ and therefore $\pi$ is of type 2), unless $p = 2$ and then it is of type 1). If $N(\pi) = p^2$ then $N(p/\pi) = p^2/p^2 = 1$, and from $p \in (\pi)$ we deduce that $p/\pi \in O_{-1}$. This implies that $p/\pi \in U_{-1}$, so $\epsilon = \pi/p$ is a unit as well, which shows that $\pi = \epsilon p$ and that $p$ is a prime in $O_{-1}$.
It remains to show that $p \equiv 3\ (4)$. If not, then $p = 2$, which is equal to $(1 + i)(1 - i)$, or $p \equiv 1\ (4)$ and then by Theorem 5.1 $p$ is of the form $p = a^2 + b^2 = (a + bi)(a - bi)$. But this would mean that $\pi$ would not be a prime element. □

We end this lecture with determining how a rational prime decomposes in Z[i].

Theorem 5.4. Rational primes p ∈ Z remain prime in Z[i] or decompose in Z[i] as p = ππσ

with πσ conjugate to π and π, πσ prime. Conversely every prime element π in Z[i] divides auniquely determined rational prime p.

Proof: Let $\pi \in \mathbb{Z}[i]$ be prime. Then

\[ \pi \mid N(\pi) = \pi\pi^\sigma \in \mathbb{Z} \]

and therefore there exists a rational prime $p$ such that $\pi \mid p$. If $q$ is another prime with this property then $\pi \mid (p, q) = 1$ and this would mean that $\pi$ is a unit. This is not the case. Therefore $p$ is uniquely determined.
Assume now that $p$ is not prime in $\mathbb{Z}[i]$. Then $p = \pi\pi^*$ for $\pi, \pi^* \in \mathbb{Z}[i]$, neither of them a unit. We deduce that $p^2 = N(\pi)N(\pi^*)$ and then $N(\pi) = p = N(\pi^*)$. This gives a decomposition $p = \pi\pi^\sigma$ with $\pi$ a prime as stated. □

Definition 5.1. A rational prime $p$ is called inert if $p$ remains prime in $\mathbb{Z}[i]$; it is called split if $p = \pi\pi^\sigma$ and $\pi^\sigma$ is not $\varepsilon\pi$ for some unit $\varepsilon$; and it is ramified iff it is up to a unit of the form $\pi^2$ for some prime element $\pi$ in $\mathbb{Z}[i]$.

The latter is the case iff $p = 2$, since $2 = (1 + i)(1 - i)$ with $1 \pm i$ prime and $1 - i = -i(1 + i)$, so that $2 = -i(1 + i)^2$ is up to a unit a square. We conclude that 2 is ramified. The discriminant $d_K$ of $\mathbb{Q}(i)$ is $-4$ and therefore 2 is the only prime which divides the discriminant. It can be shown that the prime number $p$ is ramified in $K = \mathbb{Q}(\sqrt{d})$ if and only if $p \mid d_K$, with $d_K = d$ in the case that $d \equiv 1 \bmod 4$ and with $d_K = 4d$ if $d \equiv 2, 3 \bmod 4$. We formulate this in

Theorem 5.5. In $\mathbb{Z}[i]$ the prime number $p$ is inert ⇔ $p \equiv 3\ (4)$, it is split ⇔ $p \equiv 1\ (4)$, and $p$ is ramified ⇔ $p = 2$.
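Theorem 5.5 can be compared with a direct computation. The sketch below (all function names are hypothetical) classifies a prime by its residue modulo 4 and checks, by brute force, that exactly the non-inert primes are norms $a^2 + b^2$ of Gaussian integers.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))

def behaviour(p):
    """Splitting type of the rational prime p in Z[i] as stated in Theorem 5.5."""
    if p == 2:
        return "ramified"
    return "split" if p % 4 == 1 else "inert"

def is_norm(p):
    """True if p = a^2 + b^2 for some integers a, b, i.e. p = N(a + bi)."""
    return any(isqrt(p - a * a) ** 2 == p - a * a for a in range(isqrt(p) + 1))

for p in filter(is_prime, range(2, 60)):
    # an inert prime stays prime and is not a norm; split and ramified primes are norms
    assert is_norm(p) == (behaviour(p) != "inert")
    print(p, behaviour(p))
```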

There are many very interesting questions which come up in this context. By Theorem 5.5 ramified primes divide the discriminant; therefore their number is finite. One may ask, for example, how the inert and the split primes are distributed. (The answer is that each of the two sets has density 1/2 (see [8], p. 346, Cor. 4 and Cor. 5).)


6 Ideals in quadratic number rings

(March 19, 2012)

In the last lecture we studied in an example, the ring of Gaussian integers, the question how in a principal ideal domain the factorization theory from $\mathbb{Z}$ extends to the number ring. This is a very particular situation since in general this is not possible anymore. Instead one has to follow a new concept in which irreducible elements are replaced by ideals and factorization will be in terms of ideals. Quadratic number rings are the easiest case of a very general theory which ends in abstract commutative algebra or algebraic geometry. Since we are very much interested in number theoretical properties our approach is slightly extended towards particular number theoretical concepts.
One of the main objects concerning ideals will be the class group of the quadratic number field. As a first step towards this goal we introduce the set $M_d$ of ideals in $\mathcal{O}_d$. We begin by recalling, as a warm-up, the definition of an ideal. We take a quadratic number field $\mathbb{Q}(\sqrt{d})$ with ring of integers $\mathcal{O}_d$, which is a free $\mathbb{Z}$-module and which we may write as $\mathcal{O}_d = \mathbb{Z} + \mathbb{Z}\omega$.

Definition 6.1. A subset a ⊂ Od is an ideal if and only if the following properties are satisfied:

- a is an additive subgroup

- λa ⊆ a for all λ ∈ Od.

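The set of ideals is an additive as well as a multiplicative monoid with neutral element. To see this we have to define the addition and the multiplication of two ideals $\mathfrak{a}$ and $\mathfrak{b}$ in $M_d$.

Definition 6.2 (Sum and Product of Ideals). The sum and product of two ideals $\mathfrak{a}, \mathfrak{b} \subseteq \mathcal{O}_d$ are defined as

\[ \mathfrak{a} + \mathfrak{b} := \{a + b;\ a \in \mathfrak{a},\ b \in \mathfrak{b}\} \]

and

\[ \mathfrak{a} \cdot \mathfrak{b} := \Big\{ \sum_{i=1}^{r} a_i b_i;\ a_i \in \mathfrak{a},\ b_i \in \mathfrak{b};\ r \in \mathbb{N} \Big\}. \]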

One sees easily that the product of two ideals is an ideal.

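Let $0 \neq \mathfrak{a} \subseteq \mathcal{O}_d$ be an ideal. Then $[\mathcal{O}_d : \mathfrak{a}] < \infty$. To see this we take $0 \neq a \in \mathfrak{a} \cap \mathbb{Z}$ and then $(a) := a\mathcal{O}_d \subseteq \mathfrak{a} \subseteq \mathcal{O}_d$ implies that $[\mathcal{O}_d : \mathfrak{a}] \leq [\mathcal{O}_d : (a)]$. Since $(a) = \mathbb{Z}a + \mathbb{Z}a\omega$ and since $(\mathbb{Z} + \mathbb{Z}\omega)/(\mathbb{Z}a + \mathbb{Z}a\omega) \simeq (\mathbb{Z}/a\mathbb{Z}) \times (\mathbb{Z}/a\mathbb{Z})$ we have $[\mathcal{O}_d : a\mathcal{O}_d] = a^2 < \infty$.

For an ideal $\mathfrak{a} = \mathbb{Z}\alpha + \mathbb{Z}\beta \subseteq \mathcal{O}_d$ we introduce the norm of $\mathfrak{a}$ as $N(\mathfrak{a}) := [\mathcal{O}_d : \mathfrak{a}]$ and the discriminant as

\[ D(\mathfrak{a}) := \det \begin{pmatrix} \alpha & \beta \\ \alpha^\sigma & \beta^\sigma \end{pmatrix}^2. \]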

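A transformation $\alpha = a + b\omega$, $\beta = c + d\omega$ of the basis leads to

\[ \begin{pmatrix} \alpha & \alpha^\sigma \\ \beta & \beta^\sigma \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & 1 \\ \omega & \omega^\sigma \end{pmatrix} \]

and gives $D(\mathfrak{a}) = (\det T)^2 D(\mathcal{O}_d)$ and $|\det T| = N(\mathfrak{a})$ for

\[ T = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]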

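We deduce that $D(\mathfrak{a}) = N(\mathfrak{a})^2 D(\mathcal{O}_d)$. In the special case of a principal ideal $\mathfrak{a} = (\xi) = \xi\mathcal{O}_d$ we get

\[ D((\xi)) = \det \begin{pmatrix} \xi & \xi\omega \\ \xi^\sigma & \xi^\sigma\omega^\sigma \end{pmatrix}^2 = (\xi\xi^\sigma)^2 \det \begin{pmatrix} 1 & \omega \\ 1 & \omega^\sigma \end{pmatrix}^2 = N(\xi)^2 D(\mathcal{O}_d) \]

and this implies that $N((\xi))^2 = N(\xi)^2$. We conclude that the ideal norm of the ideal generated by $\xi$ and the norm of $\xi$ are related by $N((\xi)) = |N(\xi)|$. The norm for ideals is multiplicative, i.e. $N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b})$.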
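The relation $N((\xi)) = |N(\xi)|$ can be tested on examples. In the sketch below an element $x + y\omega$ is stored as the pair `(x, y)`; the index $[\mathcal{O}_d : \xi\mathcal{O}_d]$ is computed as the absolute value of the determinant of the matrix with rows $\xi$ and $\xi\omega$, and compared with $\xi\xi^\sigma$. All function names are hypothetical, and $d$ is assumed to be square-free.

```python
def omega_relation(d):
    """Return (k, l) with omega^2 = k + l*omega for the basis 1, omega of O_d."""
    return ((d - 1) // 4, 1) if d % 4 == 1 else (d, 0)

def mul(u, v, d):
    """Multiply u = x1 + y1*omega and v = x2 + y2*omega, given as pairs."""
    k, l = omega_relation(d)
    (x1, y1), (x2, y2) = u, v
    return (x1 * x2 + k * y1 * y2, x1 * y2 + y1 * x2 + l * y1 * y2)

def conj(u, d):
    """Conjugate of x + y*omega, using omega^sigma = Tr(omega) - omega."""
    _, l = omega_relation(d)
    x, y = u
    return (x + l * y, -y)

def element_norm(u, d):
    n, rest = mul(u, conj(u, d), d)
    assert rest == 0  # the norm is a rational integer
    return n

def principal_ideal_index(u, d):
    """[O_d : (xi)] as |det| of the matrix with rows xi*1 and xi*omega."""
    r1, r2 = u, mul(u, (0, 1), d)
    return abs(r1[0] * r2[1] - r1[1] * r2[0])

for d, xi in ((-5, (1, 1)), (2, (1, 1)), (5, (1, 2)), (-3, (2, 3))):
    print(d, xi, element_norm(xi, d), principal_ideal_index(xi, d))
```

For real fields the element norm can be negative, which is why the absolute value appears in $N((\xi)) = |N(\xi)|$.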

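We deduce from this property the very useful proposition which allows us to define the inverse of an ideal.

Proposition 6.1. If $\mathfrak{a} \subseteq \mathcal{O}_d$ is an ideal then

\[ \mathfrak{a} \cdot \mathfrak{a}^\sigma = (N(\mathfrak{a})) = N(\mathfrak{a})\mathcal{O}_d. \]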

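Proof: We write $\mathfrak{a} = \mathbb{Z}\alpha + \mathbb{Z}\beta$ and get $\mathfrak{a}^\sigma = \mathbb{Z}\alpha^\sigma + \mathbb{Z}\beta^\sigma$. This gives

\[ \mathfrak{a}\mathfrak{a}^\sigma = \mathbb{Z}N(\alpha) + \mathbb{Z}N(\beta) + \mathbb{Z}\alpha\beta^\sigma + \mathbb{Z}\alpha^\sigma\beta. \]

Let $m = (N(\alpha), N(\beta), \mathrm{Tr}(\alpha\beta^\sigma)) > 0$ be the greatest common divisor of the integers $N(\alpha)$, $N(\beta)$, $\mathrm{Tr}(\alpha\beta^\sigma)$ and introduce $\gamma = \alpha\beta^\sigma/m$. Then

\[ \mathrm{Tr}(\gamma) = \gamma + \gamma^\sigma = \frac{\mathrm{Tr}(\alpha\beta^\sigma)}{m} \in \mathbb{Z}, \qquad N(\gamma) = \gamma\gamma^\sigma = \frac{N(\alpha)}{m}\,\frac{N(\beta)}{m} \in \mathbb{Z} \]

and this implies that $\gamma$ is integral. By definition there exist $r, s, t \in \mathbb{Z}$ such that

\[ m = rN(\alpha) + sN(\beta) + t\,\mathrm{Tr}(\alpha\beta^\sigma) \]

and we conclude that

\[ 1 = r\frac{N(\alpha)}{m} + s\frac{N(\beta)}{m} + t\frac{\mathrm{Tr}(\alpha\beta^\sigma)}{m}. \]

This leads to $1 \in \mathbb{Z}\frac{N(\alpha)}{m} + \mathbb{Z}\frac{N(\beta)}{m} + \mathbb{Z}\frac{\mathrm{Tr}(\alpha\beta^\sigma)}{m}$. Since $\gamma$ and $\gamma^\sigma$ are integral and $\mathrm{Tr}(\alpha\beta^\sigma)/m = \gamma + \gamma^\sigma$, the ideal

\[ \mathfrak{a}\mathfrak{a}^\sigma = m\left( \mathbb{Z}\frac{N(\alpha)}{m} + \mathbb{Z}\frac{N(\beta)}{m} + \mathbb{Z}\gamma + \mathbb{Z}\gamma^\sigma \right) \]

is contained in $m\mathcal{O}_d$ and contains $m$, whence $\mathfrak{a}\mathfrak{a}^\sigma = m\mathcal{O}_d$.

Taking norms gives $m^2 = N(m) = N(m\mathcal{O}_d) = N(\mathfrak{a}\mathfrak{a}^\sigma) = N(\mathfrak{a})N(\mathfrak{a}^\sigma) = N(\mathfrak{a})^2$, which implies that $m = N(\mathfrak{a})$. □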

From the Proposition it easily follows that the norm is a homomorphism from the monoid of ideals into $\mathbb{N}$.

Corollary 6.1. The norm satisfies N(ab) = N(a)N(b).

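Proof: We apply the proposition to $\mathfrak{a}\mathfrak{b}$ and get

\[ (N(\mathfrak{a}\mathfrak{b})) = \mathfrak{a}\mathfrak{b}(\mathfrak{a}\mathfrak{b})^\sigma = (\mathfrak{a}\mathfrak{a}^\sigma)(\mathfrak{b}\mathfrak{b}^\sigma) = (N(\mathfrak{a})\mathcal{O}_d)(N(\mathfrak{b})\mathcal{O}_d) = (N(\mathfrak{a})N(\mathfrak{b}))\mathcal{O}_d \]

and the claim follows. □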

In the next proposition we shall determine the structure of an ideal in a quadratic number ring. We do it in a slightly more general situation by looking at arbitrary $\mathbb{Z}$-modules, not only at ideals which are $\mathcal{O}_d$-modules.

Proposition 6.2. Let $\mathfrak{a} \subseteq \mathcal{O}_d$ be a $\mathbb{Z}$-module different from $(0)$. Then the following properties are satisfied:
(i) there exist integers $a, m, n \in \mathbb{N}$ such that

\[ \mathfrak{a} = (n, a + m\omega) = n\mathbb{Z} + (a + m\omega)\mathbb{Z} \]

and $0 \leq a < n$ if the rank of $\mathfrak{a}$ is two,
(ii) if $\mathfrak{a}$ is an ideal $\neq 0$ then $m \mid n$, $m \mid a$ and

\[ \mathfrak{a} = (m)(n/m, a/m + \omega), \]

(iii) the integer $n$ divides $mN(a/m + \omega)$.

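Proof: The set $H = \{s \in \mathbb{Z};\ \exists r \in \mathbb{Z} : r + s\omega \in \mathfrak{a}\}$ is easily seen to be a subgroup of $\mathbb{Z}$ and therefore there exists an integer $m \geq 0$ such that $H = m\mathbb{Z}$. By the definition of $m$ there exists an integer $a \in \mathbb{Z}$ such that $a + m\omega \in \mathfrak{a}$. The intersection $\mathfrak{a} \cap \mathbb{Z}$ is a subgroup of $\mathbb{Z}$ and can be written as $\mathfrak{a} \cap \mathbb{Z} = n\mathbb{Z}$ for some integer $n \geq 0$. To prove (i) it is sufficient to show that $\mathfrak{a} \subseteq (n, a + m\omega)$. If $r + s\omega \in \mathfrak{a}$ then by the definition of $H$ we have $s \in H = m\mathbb{Z}$ so that $s = mu$ for some $u \in \mathbb{Z}$. This means that

\[ r - ua = r + s\omega - (s\omega + ua) = r + s\omega - u(a + m\omega) \in \mathfrak{a} \]

and since also $r - ua \in \mathbb{Z}$, that is in $\mathfrak{a} \cap \mathbb{Z} = n\mathbb{Z}$, we conclude that

\[ r + s\omega = r - ua + u(a + m\omega) \in n\mathbb{Z} + (a + m\omega)\mathbb{Z}, \]

which proves the first part of (i). If the rank of $\mathfrak{a}$ is two then $n \neq 0$ and division of $a$ by $n$ with remainder gives the second part.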

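To prove (ii) we assume in addition that $\mathfrak{a}$ is an ideal; it is then a module of rank two and by (i) contains $n$. Then $\omega\mathfrak{a} \subseteq \mathfrak{a}$ and therefore $n\omega \in \omega\mathfrak{a} \subseteq \mathfrak{a}$ which, again by the definition of $H$, implies that $n \in H = m\mathbb{Z}$. This means that $m \mid n$ and this is the first part of the assertion in (ii). For the second part we show that $m \mid a$, which will follow if we prove that $a \in H$ since then $a \in m\mathbb{Z}$. We express $\omega^2$ as $k + l\omega$ and then

\[ mk + (a + ml)\omega = a\omega + m(k + l\omega) = a\omega + m\omega^2 = \omega(a + m\omega) \in \mathfrak{a} \]

and we deduce from the definition of $H$ that $a + ml \in H$. This implies that $a \in H$ as stated. It follows that

\[ \mathfrak{a} = (n, a + m\omega) = (m)(n/m, a/m + \omega) \]

and the second assertion of (ii) is established.
For (iii) we put $\alpha = a + m\omega$, which is in $\mathfrak{a}$, and then, by (ii), we have $\alpha^\sigma/m = (a/m) + \omega^\sigma \in \mathcal{O}_d$. This shows that $N(\alpha/m) \in \mathbb{Z}$. Together with

\[ mN(\alpha/m) = \alpha(\alpha^\sigma/m) \in \alpha\mathcal{O}_d \subseteq \mathfrak{a} \]

we conclude that $mN(\alpha/m) \in \mathfrak{a} \cap \mathbb{Z} = n\mathbb{Z}$ and this gives $n \mid mN(a/m + \omega)$. □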

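From (i) one sees that if $0 \neq \mathfrak{a} \subseteq \mathcal{O}_d$ is an ideal it can be generated by at most two elements.

Let $\mathfrak{a} \neq 0$ be an ideal. By Proposition 6.2 we have $\mathfrak{a} = (n, m\alpha)$ with $\alpha = m^{-1}(a + m\omega) = b + \omega$, where $b = a/m$.

Proposition 6.3. The norm of the ideal $\mathfrak{a} = (n, a + m\omega)$ is $N(\mathfrak{a}) = mn$.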

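Proof: We first show that $mn \mid N = N(\mathfrak{a})$. We have $\mathfrak{a}^\sigma = (n, m\alpha^\sigma)$ and therefore

\[ \mathfrak{a}\mathfrak{a}^\sigma = (n^2, nm\alpha, nm\alpha^\sigma, m^2N(\alpha)) = (nm)\left( \frac{n^2}{nm}, \alpha, \alpha^\sigma, \frac{m}{n}N(\alpha) \right) = (nm)\mathfrak{b} \]

for some ideal $\mathfrak{b}$. Proposition 6.2 shows that $\mathfrak{b}$ is an ideal in $\mathcal{O}_d$ and therefore $(N) = (N(\mathfrak{a})) = \mathfrak{a}\mathfrak{a}^\sigma \subseteq (nm)$, which gives the first assertion. Secondly we show that $N \mid nm$ and this then gives the desired statement. Since $m\alpha \in \mathfrak{a}$ and $n \in \mathfrak{a}^\sigma$ we have $nm\alpha \in \mathfrak{a}\mathfrak{a}^\sigma = (N)$ and therefore

\[ \frac{nm}{N}(b + \omega) = \frac{nm}{N}\alpha \in \mathcal{O}_d. \]

This gives $\frac{nm}{N}(b + \omega) = u + v\omega$, $u, v \in \mathbb{Z}$, and $v = nm/N \in \mathbb{Z}$. It follows that $N \mid nm$. □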
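Propositions 6.2 and 6.3 can be made concrete as follows. The sketch below (hypothetical function names, $d$ assumed square-free) takes generators of an ideal, given as pairs `(x, y)` for $x + y\omega$, forms the $\mathbb{Z}$-span of the generators and their $\omega$-multiples, and reduces it by the Euclidean algorithm to the basis $n$, $a + m\omega$ with $0 \leq a < n$. The product $mn$ is then the norm.

```python
from math import gcd

def omega_relation(d):
    """Return (k, l) with omega^2 = k + l*omega."""
    return ((d - 1) // 4, 1) if d % 4 == 1 else (d, 0)

def times_omega(u, d):
    k, l = omega_relation(d)
    x, y = u
    return (k * y, x + l * y)

def standard_basis(generators, d):
    """Return (n, a, m) such that the ideal equals Z*n + Z*(a + m*omega), 0 <= a < n."""
    vectors = []
    for g in generators:
        vectors += [g, times_omega(g, d)]
    top, n = (0, 0), 0
    for v in vectors:
        while v[1] != 0:                 # Euclid on the omega-coordinates
            q = top[1] // v[1]
            top, v = v, (top[0] - q * v[0], top[1] - q * v[1])
        n = gcd(n, v[0])                 # v is now a rational integer lying in the ideal
    a, m = top
    if m < 0:
        a, m = -a, -m
    return n, a % n, m

# (2, 1 + sqrt(-5)) in O_{-5}, (3) in O_{-5}, and (1 + i) in Z[i]
for gens, d in (([(2, 0), (1, 1)], -5), ([(3, 0)], -5), ([(1, 1)], -1)):
    n, a, m = standard_basis(gens, d)
    print(gens, "->", (n, a, m), "norm", m * n)
```

In each example one can also read off $m \mid n$ and $m \mid a$, as asserted in Proposition 6.2 (ii).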

Corollary 6.2. The norm of a prime ideal $\mathfrak{p}$ is $p$ or $p^2$ for some rational prime $p$, and a prime ideal $\mathfrak{p}$ contains a rational prime $p$.

Proof: Let $\mathfrak{p}$ be a prime ideal. Then by Proposition 6.2 it takes the form $\mathfrak{p} = (m)(u, v + \omega)$ for some integers $u, v \geq 0$. By Proposition 6.3 its norm is $N(\mathfrak{p}) = m \cdot um = um^2$. Since $\mathfrak{p}$ is prime either $(u, v + \omega) = (1)$ or $(m) = (1)$, and this implies that $\mathfrak{p} = (m)$ or $\mathfrak{p} = (u, v + \omega)$ respectively.
If $\mathfrak{p} = (m)$ and $m = ab$ for integers $a, b$ then, $\mathfrak{p}$ being prime, we may assume that $a \in \mathfrak{p} = (m)$. This gives $m \mid a$ and together with $a \mid m$ leads to $m = \pm a$. Therefore $m$ is a rational prime $p$ contained in $\mathfrak{p}$, and $\mathfrak{p} = (p)$.
If $\mathfrak{p} = (u, v + \omega)$ then Proposition 6.3 shows that $N(\mathfrak{p}) = u$, and if $u = ab$ then as above we get $a \in \mathfrak{p}$, so that $a = ku + l(v + \omega)$ for integers $k$ and $l$. Since $a \in \mathbb{Z}$ we conclude that $l = 0$ and that $u \mid a$. This together with $a \mid u$ shows that $u = \pm a$ and therefore that $u$ is some rational prime $p$ and that $\mathfrak{p} = (p, v + \omega)$. This proves the second part of the corollary. The first part then follows from Proposition 6.3. □

We are now able to give the general shape of a prime ideal in $\mathcal{O}_d$.

Proposition 6.4. Let $\mathfrak{p}$ be a prime ideal. Then there is a rational prime $p$ such that $\mathfrak{p} = (p)$ or $\mathfrak{p} = (p, \epsilon + \omega)$ with $\epsilon = 0$ or $\epsilon = \pm 1$.

Proof: From Corollary 6.2 together with Proposition 6.2 and Proposition 6.3 we deduce that $\mathfrak{p} = (p, v + \omega)$ with $0 \leq v < p$. If $v = 0$ we get $\epsilon = 0$; otherwise $(v, p) = 1$ and then $1 = kp + lv$ and so $\mathfrak{p} = (p, 1 + l\omega)$ with $l \in \mathbb{Z}$. But $l \mid 1$ by Proposition 6.2 and this gives the assertion. □

7 Ideal factorization

(March 26, 2012)

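We recall that an ideal $0 \neq \mathfrak{p} \subset \mathcal{O}_d$ is a prime ideal if and only if $ab \in \mathfrak{p}$ implies $a \in \mathfrak{p}$ or $b \in \mathfrak{p}$. It is easy to see that equivalently an ideal $\mathfrak{p}$ is prime if and only if $\mathcal{O}_d/\mathfrak{p}$ is entire.

Lemma 7.1. An ideal $\mathfrak{p}$ is prime if and only if $\mathfrak{p} \supseteq \mathfrak{a}\mathfrak{b}$ for some ideals $\mathfrak{a}$ and $\mathfrak{b}$ implies that $\mathfrak{p} \supseteq \mathfrak{a}$ or $\mathfrak{p} \supseteq \mathfrak{b}$.

Proof: "⇒": If $\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{p}$ and $\mathfrak{a} \not\subseteq \mathfrak{p}$ and $\mathfrak{b} \not\subseteq \mathfrak{p}$ we choose $\alpha \in \mathfrak{a}$ and $\beta \in \mathfrak{b}$ but not in $\mathfrak{p}$. Then $\alpha\beta \in \mathfrak{p}$ but neither $\alpha$ nor $\beta$ is in $\mathfrak{p}$. This means that $\mathfrak{p}$ is not prime.
"⇐": Suppose that $\alpha\beta \in \mathfrak{p}$. Then $(\alpha)(\beta) \subseteq \mathfrak{p}$ and, by the assumption on $\mathfrak{p}$, this implies that $(\alpha) \subseteq \mathfrak{p}$ or $(\beta) \subseteq \mathfrak{p}$, i.e. $\alpha \in \mathfrak{p}$ or $\beta \in \mathfrak{p}$. □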

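We come back to $\mathcal{O}_{-5} = \mathbb{Z} + \mathbb{Z}\sqrt{-5}$. There we have seen that

\[ 6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}) \]

and that 2 is irreducible but not prime, and 3 has the same property. If however we allow prime ideals as factors then factorization has a chance. Write

\[ (2) = \mathfrak{p}_1\mathfrak{p}_2, \quad (3) = \mathfrak{p}_3\mathfrak{p}_4, \quad (1 + \sqrt{-5}) = \mathfrak{p}_1\mathfrak{p}_3, \quad (1 - \sqrt{-5}) = \mathfrak{p}_2\mathfrak{p}_4. \]

Then

\[ (6) = (2)(3) = \mathfrak{p}_1\mathfrak{p}_2\mathfrak{p}_3\mathfrak{p}_4 \]

and everything would work. There cannot be more than 2 factors in $(2)$ or $(3)$ by norm considerations. This implies

\[ N(\mathfrak{p}_1) = N(\mathfrak{p}_2) = 2, \quad N(\mathfrak{p}_3) = N(\mathfrak{p}_4) = 3. \]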

The prime ideals in that decomposition can be explicitly determined using the results of the last section. From there it follows that $\mathfrak{p}_1 = (2, 1 + \sqrt{-5})$, $\mathfrak{p}_2 = (2, 1 - \sqrt{-5})$, $\mathfrak{p}_3 = (3, 1 + \sqrt{-5})$, $\mathfrak{p}_4 = (3, 1 - \sqrt{-5})$.

We show this for the ideal $\mathfrak{p} = \mathfrak{p}_1$. By Proposition 6.2 we have $\mathfrak{p} = (n, a + m\omega)$ and from Proposition 6.3 we deduce that $2 = N(\mathfrak{p}) = nm$. This gives $n = 2$ and $m = 1$, for $n = 1$ can be excluded. By Proposition 6.2 the integer $a$ can take only the values 0 or 1. The value 0 is not possible, for otherwise $\sqrt{-5}$ would be in $\mathfrak{p}$ and then $2 = N(\mathfrak{p})$ would divide $N(\sqrt{-5}) = 5$. This shows that the ideal $\mathfrak{p}$ has the form given above.
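The four factorizations can be confirmed by a short computation. The sketch below is not the argument of the text: it uses the norms $N(\mathfrak{p}_i)$ stated above together with multiplicativity of the norm, and checks that all products of generators are multiples of the claimed principal generator $g$. Containment $\mathfrak{p}\mathfrak{q} \subseteq (g)$ together with equal norms then gives equality. The function names are hypothetical.

```python
from itertools import product

D = -5  # elements of Z[sqrt(-5)] are pairs (x, y) meaning x + y*sqrt(-5)

def mul(u, v):
    return (u[0] * v[0] + D * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def norm(u):
    return u[0] ** 2 - D * u[1] ** 2

def divides(g, u):
    """True if u is a multiple of g in Z[sqrt(-5)]."""
    t = mul(u, (g[0], -g[1]))            # u * conjugate(g)
    return t[0] % norm(g) == 0 and t[1] % norm(g) == 0

def product_is_principal(P, Q, g, norm_P, norm_Q):
    """Check PQ = (g): every product of generators lies in (g) and the norms agree."""
    contained = all(divides(g, mul(a, b)) for a, b in product(P, Q))
    return contained and norm_P * norm_Q == norm(g)

P1, P2 = [(2, 0), (1, 1)], [(2, 0), (1, -1)]
P3, P4 = [(3, 0), (1, 1)], [(3, 0), (1, -1)]

print(product_is_principal(P1, P2, (2, 0), 2, 2))    # (2) = p1 p2
print(product_is_principal(P3, P4, (3, 0), 3, 3))    # (3) = p3 p4
print(product_is_principal(P1, P3, (1, 1), 2, 3))    # (1 + sqrt(-5)) = p1 p3
print(product_is_principal(P2, P4, (1, -1), 2, 3))   # (1 - sqrt(-5)) = p2 p4
```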

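Definition 7.1. An ideal $\mathfrak{m}$ is called maximal if $\mathfrak{m} \subseteq \mathfrak{a} \subseteq \mathcal{O}_d$ for an ideal $\mathfrak{a}$ implies that $\mathfrak{a} = \mathfrak{m}$ or $\mathfrak{a} = \mathcal{O}_d$.

Remark 7.1. It is easy to see that an ideal $\mathfrak{p}$ is maximal if and only if $\mathcal{O}_d/\mathfrak{p}$ is a field.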

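The goal of this lecture is to show that in $\mathcal{O}_d$ we have unique factorization of an ideal $\mathfrak{a}$ as a product of prime ideals

\[ \mathfrak{a} = \mathfrak{p}_1 \cdots \mathfrak{p}_r, \quad \mathfrak{p}_i \text{ prime}. \]

From the remark we deduce that every maximal ideal is prime. To see this directly we take $\mathfrak{m} \subset \mathcal{O}_d$ maximal and choose $r \in \mathcal{O}_d$ such that $r \notin \mathfrak{m}$. Then $(\mathfrak{m}, r) \supsetneq \mathfrak{m}$ is an ideal different from $\mathfrak{m}$ and therefore $(\mathfrak{m}, r) = \mathcal{O}_d \ni 1$. We write $1 = m + rs$ with $s \in \mathcal{O}_d$, $m \in \mathfrak{m}$ and deduce that $1 \equiv rs \pmod{\mathfrak{m}}$. This shows that $r$ is invertible modulo $\mathfrak{m}$ and therefore that $\mathcal{O}_d/\mathfrak{m}$ is a field. We conclude that $\mathfrak{m}$ is prime. The converse is also true and we get

Theorem 7.1. The ideal $0 \neq \mathfrak{p} \subset \mathcal{O}_d$ is prime if and only if $\mathfrak{p}$ is maximal.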

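Proof: Since every maximal ideal is prime, it remains to show that a prime ideal $\mathfrak{p}$ is maximal. Let $p\mathbb{Z} = \mathfrak{p} \cap \mathbb{Z}$ and $(p) = p\mathcal{O}_d$. Then $(p) \subseteq \mathfrak{p} \subset \mathcal{O}_d$ and we show that $\mathcal{O}_d/\mathfrak{p}$ is a field. To show this we take $\bar{\alpha} \in \mathcal{O}_d/\mathfrak{p}$ and choose some $\alpha \in \bar{\alpha}$. It satisfies a quadratic relation $\alpha^2 + a\alpha + b = 0$ with $a, b \in \mathbb{Z}$. Taking residues gives

\[ \bar{\alpha}^2 + \bar{a}\bar{\alpha} + \bar{b} = 0 \tag{1} \]

where $\bar{a}, \bar{b} \in \mathbb{Z}/p\mathbb{Z}$. If $\bar{b} \neq 0$ then it is invertible and we find some $\bar{c} \in \mathbb{Z}/p\mathbb{Z}$ such that $\bar{b}\bar{c} = 1$. The relation (1) multiplied by $\bar{c}$ leads to

\[ 1 = (-\bar{c}\,\bar{\alpha} - \bar{a}\,\bar{c})\bar{\alpha} \]

and this shows that $\bar{\alpha}$ has an inverse. If $\bar{b} = 0$ then from (1) we deduce that either $\bar{\alpha} = 0$ or $\bar{\alpha} = -\bar{a}$. In the latter case $0 \neq \bar{\alpha} \in \mathbb{Z}/p\mathbb{Z}$ and since $\mathbb{Z}/p\mathbb{Z}$ is a field it follows that $\bar{\alpha}$ has an inverse as well. This shows that $\mathcal{O}_d/\mathfrak{p}$ is a field and therefore that $\mathfrak{p}$ is maximal. □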

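We next introduce irreducible ideals. Let $\mathfrak{a}, \mathfrak{b} \subseteq \mathcal{O}_d$ be ideals. Then we say that $\mathfrak{a} \mid \mathfrak{b}$ if there exists an ideal $\mathfrak{c} \subseteq \mathcal{O}_d$ such that $\mathfrak{b} = \mathfrak{a} \cdot \mathfrak{c}$. Then $\mathfrak{b} = \mathfrak{a}\mathfrak{c} \subseteq \mathfrak{a}$. Also the converse is true. If $\mathfrak{b} \subseteq \mathfrak{a}$ then multiplication by $\mathfrak{a}^\sigma$ and applying Proposition 6.1 shows that

\[ \mathfrak{a}^\sigma\mathfrak{b} \subseteq \mathfrak{a}^\sigma\mathfrak{a} = (N(\mathfrak{a})) = (N) \]

and $\mathfrak{c} := N^{-1}\mathfrak{a}^\sigma\mathfrak{b} \subseteq \mathcal{O}_d$. We get $\mathfrak{a}\mathfrak{c} = N^{-1}\mathfrak{a}\mathfrak{a}^\sigma\mathfrak{b} = \mathfrak{b}$ and finally that $\mathfrak{a} \mid \mathfrak{b}$. From Lemma 7.1 it follows at once that $\mathfrak{p} \neq 0$ is prime if and only if $\mathfrak{p} \mid \mathfrak{b}\mathfrak{c}$ implies $\mathfrak{p} \mid \mathfrak{b}$ or $\mathfrak{p} \mid \mathfrak{c}$.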

The next definition takes up the discussion at the beginning of section 7.

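Definition 7.2. An ideal $\mathfrak{a} \neq \mathcal{O}_d$ is called irreducible if $\mathfrak{a} = \mathfrak{b}\mathfrak{c}$ implies $\mathfrak{b} = \mathcal{O}_d$ or $\mathfrak{c} = \mathcal{O}_d$.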

Proposition 7.1. An ideal in Od is irreducible if and only if it is maximal.

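Proof: To see this let $\mathfrak{a}$ be irreducible and suppose that $\mathfrak{b} \supseteq \mathfrak{a}$ is an ideal. Then $\mathfrak{b} \mid \mathfrak{a}$ and irreducibility implies $\mathfrak{b} = \mathcal{O}_d$ or $\mathfrak{b} = \mathfrak{a}$. For the converse let $\mathfrak{m}$ be maximal and assume that $\mathfrak{m} = \mathfrak{b}\mathfrak{c}$. We have to show that $\mathfrak{b} = \mathcal{O}_d$ or $\mathfrak{c} = \mathcal{O}_d$. We have $\mathfrak{m} = \mathfrak{b}\mathfrak{c} \subseteq \mathfrak{b}$ and if $\mathfrak{b} \neq \mathcal{O}_d$ we find that $\mathfrak{m} = \mathfrak{b}$ by maximality of $\mathfrak{m}$, and this gives $\mathfrak{m} = \mathfrak{m}\mathfrak{c}$. We want to conclude that $\mathfrak{c} = \mathcal{O}_d$. This follows from the subsequent Lemma. □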

Lemma 7.2. Let a, b and c be ideals in Od all different from (0). Then ab = ac implies b = c.

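Proof: This holds for principal ideals $\mathfrak{a} = (\alpha)$ because $\alpha\mathfrak{b} = \mathfrak{a}\mathfrak{b} = \mathfrak{a}\mathfrak{c} = \alpha\mathfrak{c}$. Let $\beta$ be in $\mathfrak{b}$. Then $\alpha\beta \in \alpha\mathfrak{c}$ and there exists $\gamma \in \mathfrak{c}$ such that $\alpha\beta = \alpha\gamma$. This shows that $\beta = \gamma \in \mathfrak{c}$. We conclude that $\mathfrak{b} \subseteq \mathfrak{c}$ and by symmetry also that $\mathfrak{c} \subseteq \mathfrak{b}$.
If $\mathfrak{a}$ is arbitrary we multiply by $\mathfrak{a}^\sigma$ and get

\[ (N(\mathfrak{a}))\mathfrak{b} = \mathfrak{a}^\sigma\mathfrak{a}\mathfrak{b} = \mathfrak{a}^\sigma\mathfrak{a}\mathfrak{c} = (N(\mathfrak{a}))\mathfrak{c}. \]

Since $(N(\mathfrak{a}))$ is principal we conclude that $\mathfrak{b} = \mathfrak{c}$. □

We deduce that the set of ideals of $\mathcal{O}_d$ is a monoid. Furthermore we have shown that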

a irreducible ⇔ a maximal ⇔ a prime.

We are now ready to prove the following:

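Theorem 7.2. Let $(0) \neq \mathfrak{a} \subset \mathcal{O}_d$ be an ideal. Then we can write

\[ \mathfrak{a} = \mathfrak{p}_1 \cdots \mathfrak{p}_s \]

with prime ideals $\mathfrak{p}_i$ which are unique up to permutation of the factors.

Proof: Let $S$ be the set of ideals which do not have this property. If $S \neq \emptyset$ then there exists a maximal element $\mathfrak{c}$ in $S$ which by the definition of $S$ has to be reducible. Therefore we can write $\mathfrak{c} = \mathfrak{a}\mathfrak{b}$ with $\mathfrak{a} \supsetneq \mathfrak{c}$, $\mathfrak{b} \supsetneq \mathfrak{c}$ and $\mathfrak{a}, \mathfrak{b} \notin S$. But then $\mathfrak{c}$ is the product of the prime factors of $\mathfrak{a}$ and $\mathfrak{b}$ and cannot be in $S$. It follows that $S = \emptyset$. Uniqueness follows from Lemma 7.2. □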

8 Prime ideals

(March 26, 2012)

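Let $\mathfrak{p} \subset \mathcal{O}$ be a prime ideal. Then there exists a rational prime $p$ such that $\mathfrak{p} \mid (p) = p\mathcal{O}$. In fact we have $\mathfrak{p} \mid (N(\mathfrak{p})) = \mathfrak{p}\mathfrak{p}^\sigma$ (since $\mathfrak{p} \supseteq \mathfrak{p}\mathfrak{p}^\sigma = (N(\mathfrak{p}))$). We now factorize $N(\mathfrak{p}) = p_1 \cdots p_s$ in $\mathbb{Z}$ and deduce that $\mathfrak{p} \mid (p_1) \cdots (p_s)$. It follows that there exists an $i$ such that $\mathfrak{p} \mid (p_i)$, and this proves our claim. As a consequence we have $N(\mathfrak{p}) \mid N((p)) = p^2$ and therefore $N(\mathfrak{p}) \in \{p, p^2\}$.

Lemma 8.1. Prime ideals in $\mathcal{O}_d$ take the form $(p)$ or $(p, v + \omega)$ with $p$ a rational prime and $0 \leq v \leq p - 1$.

Proof: By Proposition 6.2 we may write $\mathfrak{p}$ as

\[ \mathfrak{p} = (n, a + m\omega) = (m)\left( \frac{n}{m}, \frac{a}{m} + \omega \right) \]

and $n \mid mN(a/m + \omega)$. We want to determine $m$, $n$ and $a$. First note that since $\mathfrak{p}$ is prime we must have either $m = 1$ and then $\mathfrak{p} = (u, v + \omega)$, or $(n/m, a/m + \omega) = (1)$ and $(m) = \mathfrak{p}$. The latter gives $m = p$ and $\mathfrak{p} = (p)$. If $\mathfrak{p} = (u, v + \omega)$, as in the first case, we deduce that $u = N(\mathfrak{p}) \in \{p, p^2\}$. If $u = p^2$ then $p^2 = u \in \mathfrak{p}$ and since $\mathfrak{p}$ is prime we find that $p \in \mathfrak{p}$ and therefore $\mathfrak{p} = (p, v + \omega)$. □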

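Theorem 8.1. Let $d_K$ be the discriminant of $K = \mathbb{Q}(\sqrt{d})$ and $p \neq 2$ a prime. Then

\[ (p) = \begin{cases} \mathfrak{p} & \text{if } \left(\frac{d}{p}\right) = -1, \\ \mathfrak{p}\mathfrak{p}^\sigma & \text{if } \mathfrak{p} \neq \mathfrak{p}^\sigma \text{ and } \left(\frac{d}{p}\right) = 1, \\ \mathfrak{p}^2 & \text{if } p \mid d_K. \end{cases} \tag{2} \]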

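Proof: Let first $p \mid d_K$. Since $p \neq 2$ this implies $p \mid d$. Now

\[ (p, \sqrt{d})^2 = (p^2, p\sqrt{d}, d) = (p)\left(p, \sqrt{d}, \frac{d}{p}\right) = (p) \]

since $d$ is square-free and therefore $1 = kp + l\,d/p \in (p, \sqrt{d}, d/p)$ for suitable integers $k, l$, whence $(p, \sqrt{d}, d/p) = \mathcal{O}_d$.

Assume now that $\left(\frac{d}{p}\right) = 1$. Then either $\left(\frac{d_K}{p}\right) = \left(\frac{d}{p}\right) = 1$ or $\left(\frac{d_K}{p}\right) = \left(\frac{2}{p}\right)^2\left(\frac{d}{p}\right) = 1$, and there exists an integer $x$ with $x^2 \equiv d \bmod p$. We put $\mathfrak{p} = (p, x + \sqrt{d})$. Then

\[ \mathfrak{p}\mathfrak{p}^\sigma = (p^2, p(x + \sqrt{d}), p(x - \sqrt{d}), x^2 - d) = (p)(p, x + \sqrt{d}, x - \sqrt{d}, (x^2 - d)/p) \]
\[ = (p)(p, x + \sqrt{d}, 2\sqrt{d}, (x^2 - d)/p) = (p)\mathfrak{b}. \]

Now $2\sqrt{d} \in \mathfrak{b}$ implies $(2\sqrt{d})^2 = 4d \in \mathfrak{b}$. Therefore $1 = (p, 4d) \in \mathfrak{b}$ and this gives $\mathfrak{p}\mathfrak{p}^\sigma = (p)$.

Assume next that $(d/p) = -1$. Lemma 8.1 shows that $\mathfrak{p} = (p)$ or $\mathfrak{p} = (p, v + \omega)$. In the latter case we have $N(v + \omega) = (v + \omega)(v + \omega^\sigma) \in \mathfrak{p}\mathfrak{p}^\sigma = (p)$. This implies that $p \mid N(v + \omega) = v^2 + \mathrm{Tr}(\omega)v + N(\omega)$. Now either $\omega = \sqrt{d}$, and then $\mathrm{Tr}(\omega) = 0$ and $N(\omega) = -d$, or $\omega = (1 + \sqrt{d})/2$, and then $\mathrm{Tr}(\omega) = 1$ and $N(\omega) = (1 - d)/4$. In the first case we get $v^2 \equiv d \pmod p$, which is equivalent to $(d/p) = 1$. In the second case $v^2 + v - (d - 1)/4 \equiv 0 \pmod p$ and this implies $(2v + 1)^2 = 4v^2 + 4v + 1 \equiv d \pmod p$ and again $(d/p) = 1$. In both cases we get a contradiction. This shows that $\mathfrak{p} = (p)$. □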
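Theorem 8.1 translates into a simple procedure for odd primes. The sketch below (hypothetical function names) evaluates the Legendre symbol by Euler's criterion, which is an assumption of the sketch and not a tool used in the text, and in the split and ramified cases also returns an integer $x$ with $p \mid x^2 - d$, so that $\mathfrak{p} = (p, x + \sqrt{d})$ as in the proof.

```python
def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p, via Euler's criterion."""
    r = pow(d % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def decomposition(d, p):
    """Type of (p) in Q(sqrt(d)) for an odd prime p, and x with p | x^2 - d if it exists."""
    s = legendre(d, p)
    if s == -1:
        return "inert", None
    x = next(x for x in range(p) if (x * x - d) % p == 0)
    return ("ramified" if s == 0 else "split"), x

for p in (3, 5, 7, 11, 13):
    print(p, decomposition(-5, p))
```

For $d = -5$ and $p = 3$ this returns $x = 1$, in agreement with $\mathfrak{p}_3 = (3, 1 + \sqrt{-5})$ from Section 7.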

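We make now a short digression into Galois theory. First note that in the case $K = \mathbb{Q}(\sqrt{d})$ we have $G = \mathrm{Gal}(K/\mathbb{Q}) = \langle\sigma\rangle$, $\sigma \neq \mathrm{id}$, $\sigma^2 = \mathrm{id}$ with $\sigma\alpha = \alpha^\sigma$. We define now the decomposition group and the inertia group. Let $\mathfrak{p}$ be a prime ideal. Then

\[ D_\mathfrak{p} = \{\tau \in G;\ \mathfrak{p}^\tau = \mathfrak{p}\} \]

is the decomposition group. Let $k(\mathfrak{p}) = \mathcal{O}/\mathfrak{p}$ be the residue field of $\mathfrak{p}$. It is an extension of $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$ of degree 1 or 2 and we get a homomorphism $\pi : D_\mathfrak{p} \to \mathrm{Gal}(k(\mathfrak{p})/\mathbb{F}_p)$.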

Lemma 8.2. π is surjective.

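Proof. If $k(\mathfrak{p}) = \mathbb{F}_p$ there is nothing to show. Otherwise we have $k(\mathfrak{p}) = \mathbb{F}_p(\bar{\theta})$ where $\bar{\theta} = \theta + \mathfrak{p}$. Let $\bar{\tau} \in \mathrm{Gal}(k(\mathfrak{p})/\mathbb{F}_p)$ and let $F(T) \in \mathbb{Z}[T]$ be the minimal polynomial of $\theta$ over $\mathbb{Z}$. Let also $G(T) \in \mathbb{Z}[T]$ be a polynomial such that $\bar{G}(T)$ is the minimal polynomial of $\bar{\theta}$. We may choose $G$ of degree $\leq 2$. Then $\bar{G} \mid \bar{F}$ and if $F(T) = (T - \theta)(T - \theta^\sigma)$ with $\theta, \theta^\sigma \in \mathcal{O}$ we have $\bar{F}(T) = (T - \bar{\theta})(T - \bar{\theta}^\sigma)$. Then $\bar{\tau}(\bar{\theta}) = \bar{\xi}$ for some $\xi \in \{\theta, \theta^\sigma\}$ and we define $\tau(\theta) = \xi$. Then $\bar{\tau} = \pi(\tau)$. //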

From the lemma we get an exact sequence

1 → Ip → Dp → Gal(k(p)/Fp) → 1

with Ip = ker π the inertia group.

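Theorem 8.2. The following properties of a prime ideal $\mathfrak{p}$ are equivalent:
(i) $\mathfrak{p}$ is unramified,
(ii) $p \nmid d_K$,
(iii) $I_\mathfrak{p} = \langle 1 \rangle$.

Proof. The equivalence of (i) and (ii) has already been shown. We show now that (i) and (iii) are equivalent. If $\mathfrak{p}$ is unramified then $(p) = \mathfrak{p}\mathfrak{p}^\sigma$ with $\mathfrak{p} \neq \mathfrak{p}^\sigma$ or $(p) = \mathfrak{p}$. Consider first the case $(p) = \mathfrak{p}\mathfrak{p}^\sigma$: Then $D_\mathfrak{p} \neq G$ and this implies that $D_\mathfrak{p} = \langle 1 \rangle$ and therefore $I_\mathfrak{p} \subseteq D_\mathfrak{p} = \langle 1 \rangle$. Now consider the case $(p) = \mathfrak{p}$: Then $D_\mathfrak{p} = G$ and $[k(\mathfrak{p}) : \mathbb{F}_p] = 2$. Therefore $\mathrm{Gal}(k(\mathfrak{p})/\mathbb{F}_p) \neq \langle 1 \rangle$ and this implies $I_\mathfrak{p} \subsetneq D_\mathfrak{p} = G$. But then $I_\mathfrak{p} = \langle 1 \rangle$. //

We illustrate this in the following table:

                    $\mathfrak{p} = (p)$    $\mathfrak{p}\mathfrak{p}^\sigma = (p)$    $\mathfrak{p}^2 = (p)$
$D_\mathfrak{p}$    $G$                     $\langle 1 \rangle$                        $G$
$I_\mathfrak{p}$    $\langle 1 \rangle$     $\langle 1 \rangle$                        $G$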

9 Fractional ideals

(April 2, 2012)

In this lecture we shall study the set of ideals in $\mathcal{O}_d$. We have seen that it is a multiplicative unitary monoid. The main aspect will be to make out of it a commutative group and to introduce the associated class group. We shall prove that the class group is finite and introduce the two aspects of its order, the class number.

Definition 9.1. A fractional ideal of K is a finitely generated Od-submodule of K.

Examples.
1) Let $r \in K$, then $(r) := r\mathcal{O}_d$ is a fractional ideal.
2) Every ideal $\mathfrak{a} \subset \mathcal{O}_d$ is a fractional ideal.

We give now a very simple characterization of fractional ideals.

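Lemma 9.1. An $\mathcal{O}_d$-submodule $0 \neq \mathfrak{a}$ of $K$ is a fractional ideal if and only if there exists an element $c \in \mathcal{O}_d$, different from 0 and with $c\,\mathfrak{a} \subseteq \mathcal{O}_d$.

Proof: A fractional ideal $\mathfrak{a}$ is finitely generated and can therefore be written as $\mathfrak{a} = (r_1, \ldots, r_s)$ with $r_i \in K$. There exists an element $0 \neq c \in \mathcal{O}_d$ such that $cr_i \in \mathcal{O}_d$ for all $i$ and this implies that $c\mathfrak{a} \subseteq \mathcal{O}_d$. Conversely let $\mathfrak{a} \subset K$ be an $\mathcal{O}_d$-submodule such that there exists an element $0 \neq c \in \mathcal{O}_d$ such that $c\mathfrak{a} \subseteq \mathcal{O}_d$. Then $c\mathfrak{a}$ is a submodule of $\mathcal{O}_d$ which by Proposition 6.2 can be written as $n\mathcal{O}_d + (a + m\omega)\mathcal{O}_d$ and this is a finitely generated $\mathcal{O}_d$-submodule of $K$. □

We have seen that the set of ideals $\mathfrak{a} \subseteq \mathcal{O}_d$ forms a multiplicative monoid. We define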

JK = {a; fractional ideal}.

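Then the monoid of integral ideals $\mathfrak{a} \subseteq \mathcal{O}_d$ becomes a submonoid of $J_K$. The latter is a group if we define for $\mathfrak{a} \neq 0$, $\mathfrak{a} \in J_K$

\[ \mathfrak{a}^{-1} = \frac{1}{N(\mathfrak{a})}\mathfrak{a}^\sigma \]

since $\mathfrak{a}\mathfrak{a}^\sigma = (N(\mathfrak{a}))$ by Proposition 6.1. We let $P_K$ be the set of principal fractional ideals, i.e. $P_K = \{(r);\ r = a/b,\ a, b \in \mathcal{O}_d\}$.

Definition. The quotient group $C_K = \mathrm{Cl}(K) := J_K/P_K$ is called the ideal class group of $K$. The number $h_K = \#\mathrm{Cl}(K)$ is called the class number of $K$.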

We get an exact sequence (UK = units of K)

1 → UK → K× → JK → Cl(K) → 1

where $r \in K^\times \mapsto (r) \in J_K$ and $\mathfrak{a} \in J_K \mapsto [\mathfrak{a}]$ = class of $\mathfrak{a}$.

We also introduce the narrow class group. In the case of quadratic fields the narrow class group differs from the class group only when $K$ is a real field or, equivalently, if $d_K > 0$. To define it we consider the homomorphism

\[ \mathrm{sign} : K^\times \to \mu_2 = \{\pm 1\}, \quad r \mapsto \frac{r}{|r|} \]

and the embeddings $\sigma_1, \sigma_2 : K \to \mathbb{R}$. This gives

\[ s : K^\times \to \mu_2 \times \mu_2. \]

The kernel of $s$ is called the subgroup of totally positive elements. Let $P_K^+ \subset P_K$ be the subgroup of principal ideals generated by totally positive elements. Then $C_K^+ = \mathrm{Cl}(K)^+ = J_K/P_K^+$ is the narrow class group. Let $\varepsilon$ be a fundamental unit of $K$. Then

\[ h_K^+ = \#\mathrm{Cl}(K)^+ = \begin{cases} 2h_K, & N(\varepsilon) = 1, \\ h_K, & \text{otherwise.} \end{cases} \]

This gives the relation between the narrow class group and the class group. The proof is not difficult but we omit it (see Fröhlich-Taylor, pp. 180-181).

So far we do not know whether the class number is finite. Of course the above relation holds also in the case when the class number is infinite. That this is not the case is the statement of the following

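Theorem 9.1. Let $d$ be square-free and put

\[ \mu_K = \begin{cases} \sqrt{\tfrac{4}{5}d}, & d > 0, \\ \sqrt{-\tfrac{4}{3}d}, & d < 0. \end{cases} \]

Then each class contains an ideal $\mathfrak{a} \subseteq \mathcal{O}_d$, $\mathfrak{a} \neq 0$ with $N(\mathfrak{a}) \leq \mu_K$.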

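Proof: Let $\mathfrak{a} \in [\mathfrak{a}]$ be an integral ideal in its class. Then by Proposition 6.2 (ii) it takes the form $(m)(u, \beta)$ with $\beta = v + \omega$. We may replace $\mathfrak{a}$ by $(u, \beta)$ since both are in $[\mathfrak{a}]$ and assume that $\mathfrak{a} = (u, v + \omega)$ with $0 < u = N(\mathfrak{a})$. Taking into account that $\omega = \sqrt{d}$ or $\omega = (1 + \sqrt{d})/2$ and writing in the latter case $\beta' = v' + \sqrt{d}/2$ where $v' = v + 1/2$, we may write $\mathfrak{a} = (u, \beta)$ and $\mathfrak{a} = (u, \beta')$ respectively. We shall now show that there exists an integral ideal $\mathfrak{a}' \in [\mathfrak{a}]$ such that $N(\mathfrak{a}') \leq \mu_K$. Division by $u$ with remainder allows us to assume that

\[ |v|, |v'| \leq u/2 \quad \text{if } d < 0 \tag{3} \]

and

\[ u/2 \leq |v|, |v'| \leq u \quad \text{if } d > 0. \tag{4} \]

Furthermore we assume that

\[ u^2 > \mu_K^2. \tag{5} \]

We first deal with the case that $d < 0$, where $-d = 3\mu_K^2/4 < 3u^2/4$, and then (3) gives

\[ N(\beta) = v^2 - d \leq u^2/4 - d < u^2. \]

Our assumption (5) implies that $u^2 > -4d/3 > -d/3$ and again from (3) we deduce that

\[ N(\beta') = (v')^2 - d/4 \leq u^2/4 - d/4 < u^2 \]

and we conclude that $N(\beta) < u^2$ and $N(\beta') < u^2$.

If $d > 0$ we get from (5) together with (4) that

\[ -u^2 = \frac{u^2 - 5u^2}{4} < v^2 - 5\mu_K^2/4 = v^2 - d < u^2 \]

if $\omega = \sqrt{d}$ and

\[ -u^2 = \frac{u^2 - 5u^2}{4} \leq (v')^2 - \frac{d}{4} < u^2 \]

if $\omega = (1 + \sqrt{d})/2$. This leads to

\[ |N(\beta)|, |N(\beta')| < u^2. \]

Define $\mathfrak{a}' = u^{-1}\beta^\sigma\mathfrak{a}$ or $\mathfrak{a}' = u^{-1}(\beta')^\sigma\mathfrak{a}$. Then $\mathfrak{a}' \sim \mathfrak{a}$ and

\[ N(\mathfrak{a}') < N(\mathfrak{a}). \]

Since we may assume that $\mathfrak{a}$ has smallest norm in its class we get a contradiction provided that $\mathfrak{a}' \subseteq \mathcal{O}_d$. This means that our assumption (5) is wrong and, as a consequence, that $u = N(\mathfrak{a})$ is bounded by $\mu_K$.
It remains to show that $\mathfrak{a}' \subseteq \mathcal{O}_d$. If $\mathfrak{a}' = u^{-1}\beta^\sigma\mathfrak{a}$ then $\mathfrak{a}' \subseteq \mathcal{O}_d \Leftrightarrow \beta^\sigma\mathfrak{a} \subseteq u\mathcal{O}_d = \mathfrak{a}\mathfrak{a}^\sigma \Leftrightarrow \beta^\sigma(N(\mathfrak{a})) \subseteq \mathfrak{a}^\sigma(N(\mathfrak{a})) \Leftrightarrow \beta^\sigma \in \mathfrak{a}^\sigma \Leftrightarrow \beta \in \mathfrak{a}$, and this is the case. In the same way we deal with the case $\mathfrak{a}' = u^{-1}(\beta')^\sigma\mathfrak{a}$. □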
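The bound $\mu_K$ can be used in practice. If every rational prime $p \leq \mu_K$ is inert, then every prime ideal of norm at most $\mu_K$ is of the form $(p)$, so by unique factorization every ideal of norm at most $\mu_K$ is principal and the theorem gives class number 1. The sketch below tests this sufficient (not necessary) criterion. It assumes the reading $\mu_K^2 = 4d/5$ for $d > 0$ and $\mu_K^2 = -4d/3$ for $d < 0$, it is only meant for $d \equiv 1 \pmod 4$ because the rule used for $p = 2$ ($d \equiv 5 \pmod 8$ inert) applies only there, and it uses Euler's criterion for odd $p$; the function names are hypothetical.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))

def norm_bound(d):
    """floor(mu_K) for square-free d, with mu_K^2 = 4d/5 (d > 0) or -4d/3 (d < 0)."""
    num, den = (4 * d, 5) if d > 0 else (-4 * d, 3)
    return isqrt(num // den)

def is_inert(d, p):
    if p == 2:
        return d % 8 == 5                     # rule for d = 1 mod 4 only
    return pow(d % p, (p - 1) // 2, p) == p - 1   # Legendre symbol (d/p) = -1

def small_primes_all_inert(d):
    return all(is_inert(d, p) for p in range(2, norm_bound(d) + 1) if is_prime(p))

for d in (-163, -67, -43, -19):
    print(d, norm_bound(d), small_primes_all_inert(d))
```

The criterion holds for $d = -163, -67, -43$. It fails for $d = -19$, where 5 splits; deciding that case requires looking at the ideals of norm 5 themselves.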

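Complements:

1. Decomposition of 2 in $K = \mathbb{Q}(\sqrt{d})$:
  • $(2) = (2, \sqrt{d})^2 \Leftrightarrow d \equiv 2 \pmod 4$
  • $(2) = (2, 1 + \sqrt{d})^2 \Leftrightarrow d \equiv 3 \pmod 4$
  • $d \equiv 1 \pmod 4 \Rightarrow$
    – $d \equiv 1 \pmod 8 \Leftrightarrow (2) = \mathfrak{p}\mathfrak{p}^\sigma$, $\mathfrak{p} = (2, (1 + \sqrt{d})/2)$, $\mathfrak{p} \neq \mathfrak{p}^\sigma$
    – $d \equiv 5 \pmod 8 \Leftrightarrow (2) = \mathfrak{p}$.

2. Calculation of $P_K/P_K^+$: We define $\langle [\sqrt{d}]^+ \rangle$, the subgroup of the narrow class group $\mathrm{Cl}^+(K)$ generated by $[\sqrt{d}]^+$. Then

\[ 1 \to \langle [\sqrt{d}]^+ \rangle \to \mathrm{Cl}^+(K) \to \mathrm{Cl}(K) \to 1 \]

is exact; call the injection $\iota$ and the surjection $\pi$. Let $[\mathfrak{a}]^+ \in \ker\pi$. Then $\mathfrak{a} = \alpha\mathcal{O}_K$, $\alpha \in K$.
  • $N(\alpha) > 0 \Rightarrow \alpha \gg 0$ or $\alpha \ll 0$. (Because: $\alpha \gg 0$ or $\alpha \ll 0 \Rightarrow N(\alpha) = \alpha\alpha^\sigma > 0$. Conversely if $N(\alpha) > 0$ we have $\alpha\alpha^\sigma > 0$ and this means that $\alpha > 0, \alpha^\sigma > 0$ or $\alpha < 0, \alpha^\sigma < 0$.) We choose $\alpha \gg 0$ and find $[\mathfrak{a}]^+ = 1$.
  • $N(\alpha) < 0 \Rightarrow \alpha > 0, \alpha^\sigma < 0$. Then $\alpha/\sqrt{d} > 0$ and $\alpha^\sigma/(\sqrt{d})^\sigma = \alpha^\sigma/(-\sqrt{d}) > 0$, and this gives $\mathfrak{a} = \alpha\mathcal{O}_K = (\alpha/\sqrt{d})(\sqrt{d})$ and then $[\mathfrak{a}]^+ = [\sqrt{d}]^+ \in \mathrm{im}\,\iota$.

3. Totally positive elements in $\mathbb{Q}(\sqrt{d})$, $d > 0$: $\alpha = k + l\sqrt{d}$, $\alpha^\sigma = k - l\sqrt{d}$; $\alpha > 0$ and $\alpha^\sigma > 0 \Leftrightarrow k > -l\sqrt{d}$ and $k > l\sqrt{d}$.

4. $\mathrm{Cl}(K)$ is a finite group of order $h = h_K \Rightarrow \forall [\mathfrak{a}] \in \mathrm{Cl}(K)$: $[\mathfrak{a}]^h = [1] \Rightarrow \mathfrak{a}^h = (r)$, $r \in K$.

5. $K = \mathbb{Q}(\sqrt{d})$, $(p) = \mathfrak{p}\mathfrak{p}^\sigma$, $\mathfrak{p} \neq \mathfrak{p}^\sigma \Rightarrow \exists x, y \in \mathbb{N} : 4p^h = x^2 - dy^2$. In fact: $\mathfrak{p}^h = ((x + y\sqrt{d})/2)$ with $x \equiv y \pmod 2$. This gives $p^h = N(\mathfrak{p}^h) = N((x + y\sqrt{d})/2) = (x^2 - dy^2)/4$ and then $4p^h = x^2 - dy^2$. In particular, if $h = 1$: $4p = x^2 - dy^2$.

6. Class number 1 for $d < 0$: $d = -1, -2, -3, -7, -11, -19, -43, -67, -163$.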

10 The three squares theorem

(April 16, 2012)

One of the entries in Gauss' diary is the mysterious "Eureka! num = ∆ + ∆ + ∆", i.e. every positive integer $n$ can be written as $n = x(x + 1)/2 + y(y + 1)/2 + z(z + 1)/2$ with $x, y, z \in \mathbb{Z}$. This is the same as $8n + 3 = (2x + 1)^2 + (2y + 1)^2 + (2z + 1)^2$. This theorem is much deeper than Lagrange's theorem that every positive integer $n$ can be expressed as a sum of 4 squares of integers.

We next prove (in order to give a new proof of Gauss' assertion):

Lemma 10.1 (Cassels' Lemma). If the positive integer $M$ can be expressed as a sum of 3 squares of rational numbers then it can also be expressed as a sum of 3 squares of integers.

Proof. From now on all letters denote integers. Suppose we have $x_1^2 + x_2^2 + x_3^2 = Ma^2$ (i.e. $M$ is a sum of 3 squares of rational numbers). We write this as $Q(x) = Ma^2$. Set $x_i = ay_i + z_i$, $|z_i| \leq a/2$ for $i = 1, 2, 3$. Briefly $x = ay + z$. Then we have $a^2Q(y) + 2aQ(y, z) + Q(z) = Ma^2$ where $Q(y, z) = y_1z_1 + y_2z_2 + y_3z_3$. Clearly $Q(z) \equiv 0 \pmod a$. So write $Q(z) = -ab$ where $|b| \leq 3a/4 < a$ (since $|z_i| \leq a/2$). It follows that $2Q(y, z) - b = a(M - Q(y))$. Next we shall choose an integer $t$ such that $Q(by + tz) = Mb^2$, or $b^2Q(y) + 2btQ(y, z) + t^2Q(z) = Mb^2$. It is easy to see that we can take $t = M - Q(y)$. (We reject $t = b/a$, since this is numerically less than 1, and $b = 0$ since this implies $z = 0$ and so $a \mid x$; the latter would make $M$ a sum of 3 integral squares.) Thus we get $Q(by + tz) = Mb^2$ with $|b| < a$. Continuing, we finally get $M$ = a sum of 3 squares of integers. //
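The descent in this proof is effective. The sketch below follows it step by step with the sign convention $Q(z) = -ab$ and $t = M - Q(y)$, replacing $(x, a)$ by $(by + tz, |b|)$ until the denominator is 1. The function name is hypothetical, and the rational starting points in the examples were chosen by hand for illustration.

```python
def three_integer_squares(x, a, M):
    """Given integers x = (x1, x2, x3) and a > 0 with x1^2 + x2^2 + x3^2 = M * a^2,
    return integers (u1, u2, u3) with u1^2 + u2^2 + u3^2 = M."""
    x = list(x)
    assert sum(xi * xi for xi in x) == M * a * a
    while a > 1:
        y = [(2 * xi + a) // (2 * a) for xi in x]      # nearest integer to xi / a
        z = [xi - a * yi for xi, yi in zip(x, y)]      # so that |zi| <= a / 2
        qz = sum(zi * zi for zi in z)
        if qz == 0:                                    # a divides x, hence Q(y) = M
            return y
        b = -qz // a                                   # Q(z) = -a * b, so b < 0 and |b| < a
        t = M - sum(yi * yi for yi in y)
        x = [b * yi + t * zi for yi, zi in zip(y, z)]  # Q(b*y + t*z) = M * b^2
        a = -b
    return x

# 3 = (5/3)^2 + (1/3)^2 + (1/3)^2,  6 = (7/3)^2 + (2/3)^2 + (1/3)^2,  2 = (7/5)^2 + (1/5)^2
for x, a, M in (((5, 1, 1), 3, 3), ((7, 2, 1), 3, 6), ((7, 1, 0), 5, 2)):
    print(M, three_integer_squares(x, a, M))
```

Since $|b| \leq 3a/4$, the denominator shrinks in every step and the loop terminates.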

We next use an argument suggested by A. Selberg to deduce from Cassels' Lemma that for $M \equiv 3\ (8)$ we have $M = u^2 + v^2 + w^2$, $u, v, w \in \mathbb{Z}$. By the lemma it is enough to show the existence of $x, y, z, f \in \mathbb{Z}$ such that $x^2 + y^2 + z^2 = Mf^2$, $f$ odd, or $x^2 + y^2 = Mf^2 - z^2$. Necessarily $x, y, z$ have to be odd. Set $f = p + q$, $z = p - q$. Then

\[ x^2 + y^2 = 2\left[ \left(\frac{M - 1}{2}\right)(p^2 + q^2) + (M + 1)pq \right]. \]

It now follows that $p$ and $q$ have opposite parity.

Consider the primitive (i.e. the coefficients have no common factor) binary quadratic form

\[ \left(\frac{M - 1}{2}\right)(p^2 + q^2) + (M + 1)pq. \]

It represents (by a known but deep theorem of analytic number theory) infinitely many primes. Since $M \equiv 3\ (8)$, if it represents a prime $\rho$ then $p$ and $q$ have opposite parity. But then this $\rho \equiv 1\ (4)$. Hence $\rho = \alpha^2 + \beta^2$, $\alpha, \beta \in \mathbb{Z}$. It follows that $x^2 + y^2 = 2\rho = (\alpha + \beta)^2 + (\alpha - \beta)^2$. But then the original assertion is satisfied with $x = \alpha + \beta$, $y = \alpha - \beta$. We have now proved Gauss' assertion that $M \equiv 3\ (8) \Rightarrow M$ = sum of 3 squares. //

There is still another proof of the three squares theorem suggested by A. Weil in his course "300 years of number theory" at the IAS Princeton. Apply the so-called "Hasse Principle" to the form $(x^2 + y^2 + z^2) - Mw^2$ where $M > 0$, $M \equiv 3\ (8)$. Since this form represents 0 in every $p$-adic field ($p$ a prime) it represents 0 over $\mathbb{Q}$. Apply Cassels' Lemma. //

11 The four squares theorem

(April 20, 2012)

In this lecture we give Euler's proof of the famous "Four squares theorem" of Lagrange. The basic tool is the identity for products of sums of four squares which he discovered in 1770.

Lemma 11.1. Let $a, b, c, d$ and $p, q, r, s$ be two sets of four independent variables and put $m = a^2 + b^2 + c^2 + d^2$ and $n = p^2 + q^2 + r^2 + s^2$. Define $A = ap + bq + cr + ds$, $B = aq - bp - cs + dr$, $C = ar + bs - cp - dq$, $D = as - br + cq - dp$. Then $mn = A^2 + B^2 + C^2 + D^2$.

Proof. Tedious checking. //
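The tedious checking can be delegated to a machine. The sketch below (hypothetical function name) evaluates $A, B, C, D$ exactly as defined in Lemma 11.1 and tests the identity on random integer inputs; this is a numerical check, not a proof.

```python
import random

def euler_four_square(a, b, c, d, p, q, r, s):
    """Return (A, B, C, D) as defined in Lemma 11.1."""
    A = a * p + b * q + c * r + d * s
    B = a * q - b * p - c * s + d * r
    C = a * r + b * s - c * p - d * q
    D = a * s - b * r + c * q - d * p
    return A, B, C, D

rng = random.Random(0)
for _ in range(1000):
    v = [rng.randint(-50, 50) for _ in range(8)]
    m = sum(t * t for t in v[:4])
    n = sum(t * t for t in v[4:])
    assert m * n == sum(t * t for t in euler_four_square(*v))
print("identity confirmed on 1000 random inputs")
```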

The next lemma is a simple application of the Legendre symbol.

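Lemma 11.2. Let $p$ be any prime, $\lambda, \nu, \mu$ integers prime to $p$. Then the congruence $\lambda x^2 + \mu y^2 + \nu z^2 \equiv 0 \pmod p$ has a non-trivial solution $(x, y, z)$ in integers.

Proof. For $p = 2$ we may take $(x, y, z) = (1, 1, 0)$, so let $p$ be odd. Consider the group $(\mathbb{Z}/p\mathbb{Z})^\times$ which has order $p - 1$. There is a homomorphism $\lambda_p : (\mathbb{Z}/p\mathbb{Z})^\times \to \mu_2 = \{\pm 1\}$ given by $\lambda_p(a) = (a/p)$, the Legendre symbol, which gives an exact sequence:

\[ 1 \to \ker\lambda_p \to (\mathbb{Z}/p\mathbb{Z})^\times \to \mu_2 \to 1. \]

Therefore the function $\sigma : \mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$, $\bar{x} = x \pmod p \mapsto \bar{x}^2$, takes $(p - 1)/2$ distinct non-zero values, the quadratic residues which are in $\ker\lambda_p$, together with 0, that is $(p + 1)/2 = (p - 1)/2 + 1$ values in all. The maps $t \mapsto \lambda t$ and $t \mapsto -\mu t - \nu$ are bijective affine transformations from $\mathbb{F}_p$ to $\mathbb{F}_p$, so the functions $\bar{x} \mapsto \lambda\bar{x}^2$ and $\bar{y} \mapsto -\mu\bar{y}^2 - \nu$ also take $(p + 1)/2$ distinct values each. Since $(p + 1)/2 + (p + 1)/2 > p$, the two sets of values intersect. This means that there exist $\bar{x}, \bar{y} \in \mathbb{Z}/p\mathbb{Z}$ such that $\lambda\bar{x}^2 = -\mu\bar{y}^2 - \nu$. Take $x \in \bar{x}$, $y \in \bar{y}$ and $z = 1$ to get $\lambda x^2 + \mu y^2 + \nu z^2 \equiv 0 \pmod p$. //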
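The counting argument can be turned into a search. The sketch below (hypothetical function name) tabulates the values $\lambda x^2 \bmod p$ and looks for a $y$ whose value $-\mu y^2 - \nu$ occurs in the table, with $z = 1$ as in the proof. It assumes that $\lambda$, $\mu$, $\nu$ are prime to $p$.

```python
def solve_mod_p(lam, mu, nu, p):
    """Return (x, y, 1) with lam*x^2 + mu*y^2 + nu ≡ 0 (mod p), for lam, mu, nu prime to p."""
    left = {lam * x * x % p: x for x in range(p)}    # values of lam * x^2 modulo p
    for y in range(p):
        target = (-mu * y * y - nu) % p
        if target in left:                           # the two value sets intersect
            return left[target], y, 1
    return None  # not reached when lam, mu, nu are prime to p

for lam, mu, nu, p in ((1, 1, 1, 7), (2, 3, 5, 11), (1, 2, 3, 13)):
    x, y, z = solve_mod_p(lam, mu, nu, p)
    print((lam, mu, nu, p), (x, y, z), (lam * x * x + mu * y * y + nu * z * z) % p)
```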

Lemma 11.3. Let $a_1, a_2, a_3, a_4$ be coprime integers and $p$ a prime such that $a_1^2 + a_2^2 + a_3^2 + a_4^2 \equiv 0 \pmod p$. Then $p$ is a sum of four squares.

Proof. Let $p$ be an odd divisor of $a_1^2 + a_2^2 + a_3^2 + a_4^2$. Then we may assume that $a_i \geq 0$ and $a_i < p/2$ for $1 \leq i \leq 4$. We write

\[ N = \sum a_i^2 = pm. \]

If $m = 2$ then two of the $a_i$ must be odd, say $a_1$ and $a_2$. Then

\[ \frac{a_1^2 + a_2^2}{2} = \left(\frac{a_1 - a_2}{2}\right)^2 + \left(\frac{a_1 + a_2}{2}\right)^2, \qquad \frac{a_3^2 + a_4^2}{2} = \left(\frac{a_3 - a_4}{2}\right)^2 + \left(\frac{a_3 + a_4}{2}\right)^2 \]

and this gives

\[ p = \left(\frac{a_1 - a_2}{2}\right)^2 + \left(\frac{a_1 + a_2}{2}\right)^2 + \left(\frac{a_3 - a_4}{2}\right)^2 + \left(\frac{a_3 + a_4}{2}\right)^2. \]

Therefore $p$ is a sum of four squares.
If $m > 2$ then we write $a_i = b_i + mc_i$ with $|b_i| \leq m/2$. Then $\sum b_i^2 \equiv 0 \pmod m$ and so $\sum b_i^2 = mn$. Since the $a_i$ are coprime not all of the $b_i$ can be 0 or $\pm m/2$, which shows that

\[ 0 < \sum b_i^2 = mn < m^2 \]

and $0 < n < m$. By Lemma 11.1 we get

\[ \Big(\sum a_i^2\Big)\Big(\sum b_i^2\Big) = (pm)(mn) = m^2pn = \sum A_i^2 \]

with

\[ A_1 = \sum a_ib_i, \]
\[ A_2 = a_1b_2 - a_2b_1 - a_3b_4 + a_4b_3 = m(c_1b_2 - c_2b_1 - c_3b_4 + c_4b_3) = mB_2, \]
\[ A_3 = m(c_1b_3 + c_2b_4 - c_3b_1 - c_4b_2) = mB_3, \]
\[ A_4 = m(c_1b_4 - c_2b_3 + c_3b_2 - c_4b_1) = mB_4. \]

Since $A_1^2 + A_2^2 + A_3^2 + A_4^2 \equiv 0 \pmod{m^2}$ we find that $A_1^2 \equiv 0 \pmod{m^2}$, and this gives $A_1 \equiv 0 \pmod m$ and we write $A_1 = mB_1$. We deduce that $pn = B_1^2 + B_2^2 + B_3^2 + B_4^2$. Let $d = (B_1, B_2, B_3, B_4)$ be the greatest common divisor, so that $B_i = da_i'$. Then $d^2 \mid pn$. Since $0 < n < m < p$ we conclude that $(d, p) = 1$ and so $d^2 \mid n$, and we write $n = d^2m'$ and $pm' = \sum a_i'^2$, $m' < m$. After a finite number of steps we find that $p$ is a sum of four squares. //

12 Binary quadratic forms

(April 30, 2012)

In this lecture we begin to study quadratic forms in two variables $x$ and $y$ with integral coefficients, called binary quadratic forms,

\[ f(x, y) = ax^2 + bxy + cy^2, \]

$a, b, c \in \mathbb{Z}$. We call $f$ primitive if $(a, b, c) = 1$.

Its discriminant is

\[ d(f) = -4\det A, \qquad M(f) = A = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \]

and then

\[ 4af(x, y) = (2ax + by)^2 - d(f)y^2. \]

We shall now discuss some basic problems which lead to the study of binary quadratic forms. Then we shall study definiteness and reducible forms (reducible over $\mathbb{Z}$) and the notions of equivalence (equivalence and proper equivalence). We end by showing the finiteness of the class number.

1) Basic problems f = (a, b, c) ↔ M(f):

a) Representation of integers: Given $n \in \mathbb{Z}$, find $(x, y) \in \mathbb{Z} \times \mathbb{Z}$ with $f(x, y) = n$. If a representation exists, it is called proper if $(x, y) = 1$ and improper otherwise.

Example: $f(x, y) = x^2 - y^2$, $n = 113$. We have $x^2 - y^2 = 113 \Leftrightarrow (x + y)(x - y) = 113$, and if $(x, y)$ is a solution then so is $(\pm x, \pm y)$. It therefore suffices to solve with $x \ge 0$, $y \ge 0$. Neither $x = 0$ nor $y = 0$ gives a solution, since $113$ is not a square and $-y^2 \le 0$. Assume therefore $x > y > 0$. Since $113$ is a prime, either $x + y = 113$, $x - y = 1$ or $x + y = 1$, $x - y = 113$ (the latter is not possible since it would force $x = 0$ or $y = 0$). Thus $2x = 114 \Rightarrow x = 57 \Rightarrow y = 56$, and the solutions are $(x, y) = (\pm 57, \pm 56)$.

b) Number of representations.

c) Finding the minimum: $\lambda_1(f) = \inf_{0 \ne (x, y) \in \mathbb{Z}^2} |f(x, y)|^{1/2}$.

Example: $f = (269, 164, 25)$, $f(x, y) = 269x^2 + 164xy + 25y^2 = (10x + 3y)^2 + (13x + 4y)^2$. Clearly $|f(x, y)| \ge 1$ for $(x, y) \ne (0, 0)$. We show that $\lambda_1(f) = 1$: solving $10x + 3y = 0$, $13x + 4y = 1$ gives $x = -3$, $y = 10$, and thus $f(-3, 10) = 1$.

d) Representation problem for $d(f) < 0$: $ax^2 + bxy + cy^2 = n$. Write $\Delta = d(f)$; then $\Delta < 0 \Rightarrow ac > 0$. Now $4an = (2ax + by)^2 + |\Delta| y^2 \ge 0$ and $4cn = (2cy + bx)^2 + |\Delta| x^2 \ge 0$. Thus $4an, 4cn \ge 0$. If $n = 0$ then $x = y = 0$; otherwise $x^2 \le 4cn/|\Delta|$, $y^2 \le 4an/|\Delta|$. It follows that there are only finitely many solutions.
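The bounds $x^2 \le 4cn/|\Delta|$ and $y^2 \le 4an/|\Delta|$ in d) turn the representation problem for $d(f) < 0$ into a finite search. The following sketch carries out that search by brute force; the function name `representations` is hypothetical, and no attempt is made at efficiency.

```python
from math import isqrt


def representations(form, n: int):
    """All (x, y) in Z^2 with f(x, y) = n, for a form with d(f) < 0."""
    a, b, c = form
    delta = b * b - 4 * a * c
    if delta >= 0:
        raise ValueError("the finite search needs a negative discriminant")
    if n == 0:
        return [(0, 0)]              # a definite form vanishes only at the origin
    if a * n < 0:
        return []                    # 4an >= 0 is necessary
    # Bounds from d): x^2 <= 4cn/|delta| and y^2 <= 4an/|delta|.
    x_max = isqrt(4 * c * n // -delta)
    y_max = isqrt(4 * a * n // -delta)
    return [
        (x, y)
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == n
    ]
```

Applied to the form $(269, 164, 25)$ and $n = 1$, the search runs over $|x| \le 5$, $|y| \le 16$ and finds, among others, the point $(-3, 10)$ from the example in c).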

2) Positive definite, negative definite, etc.:

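Definition.
• $f > 0 \Leftrightarrow \forall (x, y) \ne (0, 0)$: $f(x, y) > 0$
• $f < 0 \Leftrightarrow \forall (x, y) \ne (0, 0)$: $f(x, y) < 0$
• $f \ge 0 \Leftrightarrow \forall (x, y) \ne (0, 0)$: $f(x, y) \ge 0$
• $f \le 0 \Leftrightarrow \forall (x, y) \ne (0, 0)$: $f(x, y) \le 0$
• $f$ indefinite $\Leftrightarrow \exists (x, y), (x', y')$: $f(x, y) > 0$ and $f(x', y') < 0$.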

Proposition 12.1. i) $f > 0 \Leftrightarrow d(f) < 0$ and $a(f) > 0$.
ii) $f < 0 \Leftrightarrow d(f) < 0$ and $a(f) < 0$.
iii) $f$ indefinite $\Leftrightarrow d(f) > 0$.

Proof. We have $4a(f) f = (2a(f)x + b(f)y)^2 - d(f)y^2$, and i) and ii) follow. Also iii) follows, since for $d(f) > 0$ there exist $(x, y)$ with $(2a(f)x + b(f)y)^2 > d(f)y^2$ and $(x, y)$ with $(2a(f)x + b(f)y)^2 < d(f)y^2$. //
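Proposition 12.1 reduces the question of definiteness to two sign checks. A small sketch of this test follows; the function name `classify` is hypothetical, and the case $d(f) = 0$, which the proposition does not treat, is reported separately as an assumption of the sketch.

```python
def classify(form) -> str:
    """Classify f = (a, b, c) by the signs of d(f) and a, as in Proposition 12.1."""
    a, b, c = form
    d = b * b - 4 * a * c
    if d > 0:
        return "indefinite"
    if d < 0:
        # d < 0 forces ac > 0, so a is nonzero here.
        return "positive definite" if a > 0 else "negative definite"
    return "d(f) = 0 (not covered by the proposition)"
```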

3) Reducible forms:

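Proposition 12.2. The following statements are equivalent:
i) $d(f) = n^2$ for some $n \in \mathbb{Z}$,
ii) $f = L_1 \cdot L_2$ with $L_1 = k_1 x + l_1 y$, $L_2 = k_2 x + l_2 y$ and $k_1, k_2, l_1, l_2 \in \mathbb{Z}$.

Proof. If $a = 0$ then $d(f) = b^2$ and $f(x, y) = y(bx + cy)$. If $a \ne 0$, write $d = d(f)$; then

\[ 4af = (2ax + by)^2 - dy^2 = (2ax + by - \sqrt{d}\,y)(2ax + by + \sqrt{d}\,y) = L_1 L_2. \]

Then $L_1, L_2$ are integral $\Leftrightarrow \sqrt{d} \in \mathbb{Z} \Leftrightarrow d = n^2$, and in this case $4a \operatorname{cont}(f) = \operatorname{cont}(L_1)\operatorname{cont}(L_2)$. This implies $4a = a_1 \cdot a_2$ with $a_1 \mid \operatorname{cont}(L_1)$ and $a_2 \mid \operatorname{cont}(L_2)$. We write $M_1 = L_1/a_1$, $M_2 = L_2/a_2$ and then $f = M_1 M_2$. //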
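Proposition 12.2 can be made explicit. The sketch below does not follow the content argument of the proof literally; instead it takes the rational roots $t_1, t_2$ of $at^2 + bt + c$, writes $f = a(x - t_1 y)(x - t_2 y)$ and clears denominators, which yields integral factors by Gauss's lemma. The names `integral_factors` and `expand` are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def integral_factors(form):
    """Return ((k1, l1), (k2, l2)) with f = (k1 x + l1 y)(k2 x + l2 y), or None."""
    a, b, c = form
    d = b * b - 4 * a * c
    if d < 0:
        return None
    n = isqrt(d)
    if n * n != d:
        return None                      # d(f) is not a square: no integral factors
    if a == 0:
        return (0, 1), (b, c)            # f = y (b x + c y)
    # Roots of a t^2 + b t + c; Fraction keeps them in lowest terms.
    t1 = Fraction(-b + n, 2 * a)
    t2 = Fraction(-b - n, 2 * a)
    q1, p1 = t1.denominator, t1.numerator
    q2, p2 = t2.denominator, t2.numerator
    # f = (a / (q1 q2)) (q1 x - p1 y)(q2 x - p2 y); the scalar is an integer.
    m, remainder = divmod(a, q1 * q2)
    assert remainder == 0
    return (m * q1, -m * p1), (q2, -p2)


def expand(L1, L2):
    """Coefficient triple of the product (k1 x + l1 y)(k2 x + l2 y)."""
    (k1, l1), (k2, l2) = L1, L2
    return (k1 * k2, k1 * l2 + k2 * l1, l1 * l2)
```

For $f = x^2 - y^2$ with $d(f) = 4$ this returns the factors $x - y$ and $x + y$.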

4) Equivalence of forms:

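For

\[ B = \begin{pmatrix} u & u'' \\ v & v'' \end{pmatrix} \in GL_2(\mathbb{Z}) \]

define $B(x, y) = (ux + u''y, vx + v''y)$; then

\[ f(B(x, y)) = f(u, v)x^2 + \bigl(2(auu'' + cvv'') + b(uv'' + u''v)\bigr)xy + f(u'', v'')y^2. \]

Define $f^B := \det B \cdot (f \circ B)$, so that $M(f^B) = \det B \cdot B^T M(f) B$. We get an action $GL_2(\mathbb{Z}) \times Q \to Q$, $(B, f) \mapsto f^B$, on the set $Q$ of binary quadratic forms (check that this is an action).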

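Definition. a) $f \sim g \Leftrightarrow g = f^B$ for some $B \in GL_2(\mathbb{Z})$.
b) $f$ and $g$ are properly equivalent $\Leftrightarrow g = f^B$ for some $B \in SL_2(\mathbb{Z})$.

Example: $f = -13x^2 - 36xy - 25y^2$, $g = x^2 + y^2$. Take

\[ B = \begin{pmatrix} -4 & 3 \\ 3 & -2 \end{pmatrix} \]

and get $f \circ B = -x^2 - y^2$, so that $f^B = \det B \cdot (f \circ B) = x^2 + y^2 = g$. Here $B \in GL_2(\mathbb{Z})$ but $B \notin SL_2(\mathbb{Z})$, since $\det B = 8 - 9 = -1$.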
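The example can be verified mechanically from the coefficient formula for $f(B(x, y))$. The sketch below implements $f \mapsto \det B \cdot (f \circ B)$ on coefficient triples; the function name `act` and the nested-tuple encoding of $B$ are assumptions made for the illustration.

```python
def act(form, B):
    """Return det(B) * (f o B) as a coefficient triple, for B in GL_2(Z)."""
    a, b, c = form
    (u, u2), (v, v2) = B             # B(x, y) = (u x + u2 y, v x + v2 y)
    det = u * v2 - u2 * v
    if det not in (1, -1):
        raise ValueError("B must have determinant +1 or -1")

    def f(x, y):
        return a * x * x + b * x * y + c * y * y

    new_a = f(u, v)
    new_b = 2 * (a * u * u2 + c * v * v2) + b * (u * v2 + u2 * v)
    new_c = f(u2, v2)
    return (det * new_a, det * new_b, det * new_c)
```

With `B = ((-4, 3), (3, -2))` the form $(-13, -36, -25)$ is sent to $(1, 0, 1)$, in agreement with the example; the discriminant $-4$ is unchanged.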

5) Find equivalence classes of binary quadratic forms:

We define the class number for $d$ as

\[ h(d) = \begin{cases} \#\,\text{equiv. classes of primitive quadratic forms of discriminant } d, & \text{if } d > 0 \text{ (indefinite case)}, \\ \#\,\text{equiv. classes of positive definite primitive quadratic forms of discriminant } d, & \text{if } d < 0 \text{ (definite case)}. \end{cases} \]

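Lemma 12.1. $h(d) \ge 1$ for all $d$ with $d \equiv 0, 1 \pmod{4}$.

Proof. Put

\[ f(x, y) = \begin{cases} x^2 - \dfrac{d}{4}y^2, & d \equiv 0 \pmod{4}, \\[1ex] x^2 + xy + \dfrac{1 - d}{4}y^2, & d \equiv 1 \pmod{4}. \end{cases} \]

Then $d(f) = d$ in both cases. //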


Theorem 12.1. For $0 \ne d \in \mathbb{Z}$, $d$ not a square, we have $h(d) < \infty$.

Proof. We show that every form $f = ax^2 + bxy + cy^2$ of discriminant $d$ is equivalent to a form $a''x^2 + b''xy + c''y^2$ with

\[ |b''| \le |a''| \le |c''|. \]

The statement follows since there are only finitely many triples $a'', b'', c''$ which satisfy this condition subject to $b''^2 - 4a''c'' = d$. Indeed $|d| = |b''^2 - 4a''c''| \ge 4|a''c''| - b''^2 \ge 4a''^2 - a''^2 = 3a''^2$. Hence $|a''| \le \sqrt{|d|/3}$, $|b''| \le |a''|$ and $c'' = (b''^2 - d)/(4a'')$. This leaves only finitely many possibilities for $a'', b'', c''$.

Define $a''$ to be a nonzero integer of smallest absolute value such that there exist $u, v \in \mathbb{Z}$ with $a'' = f(u, v)$. We have $(u, v) = 1$, for otherwise $a''/r^2$ with $r = (u, v)$ would be represented by $f$. Choose $u'', v'' \in \mathbb{Z}$ such that $uv'' - u''v = 1$ and write

\[ A = \begin{pmatrix} u & u'' \\ v & v'' \end{pmatrix}. \]

We write

\[ \begin{pmatrix} x'' \\ y'' \end{pmatrix} = A \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ux + u''y \\ vx + v''y \end{pmatrix} \]

and $f(x'', y'') = f(ux + u''y, vx + v''y) = g(x, y) = a''x^2 + b'xy + c'y^2$ with $a'' = f(u, v)$, $b' = 2auu'' + b(uv'' + u''v) + 2cvv''$, $c' = au''^2 + bu''v'' + cv''^2 = f(u'', v'')$. We now choose $n \in \mathbb{Z}$ such that $b'' := b' - 2a''n$ satisfies $|b''| \le |a''|$. Since

\[ a''(x - ny)^2 + b'(x - ny)y + c'y^2 = a''x^2 + (b' - 2a''n)xy + (a''n^2 - b'n + c')y^2, \]

we see that $g = a''x^2 + b'xy + c'y^2 \sim a''x^2 + b''xy + c''y^2$ with $c'' := a''n^2 - b'n + c'$ and $|b''| \le |a''|$. Finally $c'' = g(-n, 1)$ is a value of $f$, and it is nonzero because $d$ is not a square, so $|c''| \ge |a''|$ by minimality of $a''$. //
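The finiteness argument in the proof is effective: for a given $d$ one can list all triples with $|b''| \le |a''| \le |c''|$ and $b''^2 - 4a''c'' = d$. The sketch below does exactly this and nothing more. It is an assumption of the sketch that no further reduction is attempted, so the list may contain non-primitive forms, negative definite forms and several triples from the same equivalence class; its length is only an upper bound for $h(d)$. The function name `candidate_triples` is hypothetical.

```python
from math import isqrt


def candidate_triples(d: int):
    """All (a, b, c) with |b| <= |a| <= |c| and b^2 - 4ac = d, for d a nonzero non-square."""
    if d == 0 or (d > 0 and isqrt(d) ** 2 == d):
        raise ValueError("d must be nonzero and not a square")
    triples = []
    a_max = isqrt(abs(d) // 3)           # from 3 a^2 <= |d|
    for a in range(-a_max, a_max + 1):
        if a == 0:
            continue
        for b in range(-abs(a), abs(a) + 1):
            numerator = b * b - d        # must equal 4 a c
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if abs(c) >= abs(a):
                triples.append((a, b, c))
    return triples
```

For $d = -4$ the only candidates are $(-1, 0, -1)$ and $(1, 0, 1)$, that is $\pm(x^2 + y^2)$.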


13 Class groups

(May 7, 2012)

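We recall some notations and definitions. Let $K \supseteq \mathbb{Q}$ be quadratic, $K = \mathbb{Q}(\sqrt{d})$ with $d$ square-free, and

\[ d_K = \begin{cases} 4d, & d \equiv 2, 3 \pmod{4}, \\ d, & d \equiv 1 \pmod{4}. \end{cases} \]

We define $\sqrt{d_K}$ as the positive square root if $d_K > 0$ and as the root with positive imaginary part if $d_K < 0$. We have

\[ \mathcal{O}_K = \mathbb{Z} + \mathbb{Z}\omega, \qquad \omega = \begin{cases} \sqrt{d}, & d \equiv 2, 3 \pmod{4}, \\[0.5ex] \dfrac{1 + \sqrt{d}}{2}, & d \equiv 1 \pmod{4}, \end{cases} \]

and

\[ d_K = \begin{vmatrix} 1 & \omega \\ 1 & \omega^\sigma \end{vmatrix}^2. \]

Note: We may also take $\omega^* = (d_K + \sqrt{d_K})/2$, and then $\mathcal{O}_K = \mathbb{Z}[\omega^*]$.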

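Let $\mathfrak{a} = (a_1, a_2)$ be a fractional ideal. If $\mathfrak{a}, \mathfrak{b} \subset \mathcal{O}_K$ then $N(\mathfrak{a}) := |\mathcal{O}_K/\mathfrak{a}|$ and $N(\mathfrak{a}\mathfrak{b}^{-1}) = N(\mathfrak{a})N(\mathfrak{b})^{-1}$. One has

\[ \begin{vmatrix} a_1 & a_2 \\ a_1^\sigma & a_2^\sigma \end{vmatrix}^2 = d_K N(\mathfrak{a})^2. \]

This gives

\[ \begin{vmatrix} a_1 & a_2 \\ a_1^\sigma & a_2^\sigma \end{vmatrix} = \pm\sqrt{d_K}\, N(\mathfrak{a}), \]

and we call the basis $a_1, a_2$ normalized iff the sign is positive:

\[ \begin{vmatrix} a_1 & a_2 \\ a_1^\sigma & a_2^\sigma \end{vmatrix} = \sqrt{d_K}\, N(\mathfrak{a}). \]

Put $a_1^* = a_2$ and $a_2^* = a_1$. Then either the basis $a_1, a_2$ or the basis $a_1^*, a_2^*$ is normalized.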

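We attach to any fractional ideal $\mathfrak{a}$ a binary quadratic form as follows: Let $a : \{1, 2\} \to \mathfrak{a}$ be a normalized basis in the sense that $a_1 = a(1)$, $a_2 = a(2)$ is a normalized basis as defined. Then

\[ q_a(x, y) := N(\mathfrak{a})^{-1} N_{K/\mathbb{Q}}(a_1 x + a_2 y). \]

Since $\alpha = a_1 x + a_2 y \in \mathfrak{a}$ for $x, y \in \mathbb{Z}$, we find that $N(\mathfrak{a}) \mid N_{K/\mathbb{Q}}(\alpha)$, so that $q_a(x, y) \in \mathbb{Z}$.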

We need two lemmata:

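Lemma 13.1. A positive integer $n$ is represented by $q_a$ $\Leftrightarrow$ $n$ is the norm of an integral ideal $\mathfrak{b}$ in the class of $\mathfrak{a}$.

Proof. Suppose that $n = N(\mathfrak{b})$ with $\mathfrak{b}$ integral and $\mathfrak{b} = (\gamma)\mathfrak{a}$ for $\gamma$ totally positive. Then $\mathfrak{b}^\sigma = (\gamma^\sigma)\mathfrak{a}^\sigma = (\gamma^\sigma N(\mathfrak{a}))\mathfrak{a}^{-1} = (\alpha)\mathfrak{a}^{-1}$ with $\alpha = \gamma^\sigma N(\mathfrak{a})$ totally positive; here we used that for any fractional ideal $0 \ne \mathfrak{c} \subset K$ we have $\mathfrak{c}^\sigma = (N(\mathfrak{c}))\mathfrak{c}^{-1}$. We get $n = N(\mathfrak{b}) = N(\mathfrak{b}^\sigma) = N((\alpha))N(\mathfrak{a})^{-1} = N_{K/\mathbb{Q}}(\alpha)N(\mathfrak{a})^{-1}$, where $\alpha$ is totally positive and therefore $N_{K/\mathbb{Q}}(\alpha) > 0$. Since $\mathfrak{b}^\sigma$ is integral we have $\alpha \in \mathfrak{a}$, so $\alpha = a_1 x + a_2 y$ with $x, y \in \mathbb{Z}$. This leads to $n = N(\mathfrak{a})^{-1}N_{K/\mathbb{Q}}(a_1 x + a_2 y) = q_a(x, y)$.

Conversely we assume that $n = q_a(x, y)$. Then there exists $\alpha = a_1 x + a_2 y \in \mathfrak{a}$ such that $n = N_{K/\mathbb{Q}}(\alpha)N(\mathfrak{a})^{-1}$. By assumption $n > 0$, which leads to $N_{K/\mathbb{Q}}(\alpha) > 0$ and to $N_{K/\mathbb{Q}}(\alpha) = N((\alpha))$. Therefore $n = N((\alpha)\mathfrak{a}^{-1})$ with $\alpha$ either totally positive or totally negative. Since $(\alpha)\mathfrak{a}^{-1}$ is integral iff $\alpha\mathfrak{a}^{-1} \subseteq \mathcal{O}_K$, which is the same as $\alpha \in \mathfrak{a}$, we can put $\mathfrak{b} = (\alpha)\mathfrak{a}^{-1} \subseteq \mathcal{O}_K$. Now $\mathfrak{b}^\sigma = N(\mathfrak{b})\mathfrak{b}^{-1} = N(\mathfrak{b})(\alpha^{-1})\mathfrak{a}$ and therefore $n = N(\mathfrak{b}) = N(\mathfrak{b}^\sigma)$ with $c^+(\mathfrak{b}^\sigma) = c^+(\mathfrak{a})$. This proves the claim. //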

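Calculation of the discriminant of $q_a$:

\[ q_a = N(\mathfrak{a})^{-1}\bigl(N_{K/\mathbb{Q}}(a_1)x^2 + \mathrm{Tr}_{K/\mathbb{Q}}(a_1 a_2^\sigma)xy + N_{K/\mathbb{Q}}(a_2)y^2\bigr), \]

\[ d_{q_a} = N(\mathfrak{a})^{-2}\bigl(\mathrm{Tr}_{K/\mathbb{Q}}(a_1 a_2^\sigma)^2 - 4N_{K/\mathbb{Q}}(a_1)N_{K/\mathbb{Q}}(a_2)\bigr) = N(\mathfrak{a})^{-2}(a_1 a_2^\sigma - a_1^\sigma a_2)^2 = N(\mathfrak{a})^{-2}\begin{vmatrix} a_1 & a_2 \\ a_1^\sigma & a_2^\sigma \end{vmatrix}^2 = d_K. \]

Note: The coefficient of $x^2$ is $N(\mathfrak{a})^{-1}N_{K/\mathbb{Q}}(a_1)$. If $d_K < 0$ we have for $0 \ne \xi = x + \sqrt{d}\,y$ that $N(\xi) = \xi\xi^\sigma = x^2 - dy^2 > 0$, and so $N_{K/\mathbb{Q}}(a_1) > 0$.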
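The formula for the coefficients of $q_a$ can be tried on a concrete ideal. In the sketch below an element $r + s\sqrt{d_K}$ of $K$ is stored as a pair $(r, s)$ of rationals; this encoding and the helper names `conj`, `mul`, `norm`, `is_normalized` and `form_of_basis` are assumptions made for the illustration, and the ideal norm is supplied by the caller rather than computed.

```python
from fractions import Fraction

# An element r + s*sqrt(D) of K is stored as the pair (r, s), with D = d_K.


def conj(z):
    """Conjugate r - s*sqrt(D)."""
    return (z[0], -z[1])


def mul(z, w, D):
    """Product of two elements of K."""
    return (z[0] * w[0] + D * z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def norm(z, D):
    """Field norm r^2 - D s^2."""
    return z[0] * z[0] - D * z[1] * z[1]


def is_normalized(a1, a2, ideal_norm, D):
    """Check a1*a2^sigma - a1^sigma*a2 == ideal_norm * sqrt(D)."""
    p = mul(a1, conj(a2), D)
    q = mul(conj(a1), a2, D)
    return (p[0] - q[0], p[1] - q[1]) == (0, ideal_norm)


def form_of_basis(a1, a2, ideal_norm, D):
    """Coefficients of N(a)^-1 * N_{K/Q}(a1 x + a2 y)."""
    N = Fraction(ideal_norm)
    trace = 2 * mul(a1, conj(a2), D)[0]      # Tr(a1 * a2^sigma)
    return (norm(a1, D) / N, trace / N, norm(a2, D) / N)
```

As a usage example take $K = \mathbb{Q}(\sqrt{-5})$, $d_K = -20$, and the ideal $(2, 1 + \sqrt{-5})$ of norm $2$. With $a_1 = 1 + \sqrt{-5} = 1 + \tfrac{1}{2}\sqrt{-20}$ and $a_2 = 2$ the basis is normalized, and `form_of_basis` returns $(3, 2, 2)$, that is $3x^2 + 2xy + 2y^2$ of discriminant $-20$.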

Change of normalized basis: if $b : \{1, 2\} \to \mathfrak{a}$ is a different normalized basis, then $b = a \cdot {}^t g$ for

\[ g = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL_2(\mathbb{Z}) \]

(easy calculation). Let $Q_K$ be the set of equivalence classes of binary quadratic forms with discriminant $d_K$. We constructed a map $\kappa : C_K^+ \to Q_K$ by mapping the class of $\mathfrak{a}$ to the class of $q_a$, where $a$ is a normalized basis for $\mathfrak{a}$.

Theorem 13.1. $\kappa : C_K^+ \to Q_K$ is a bijection.

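Proof. Surjective: Given $q = (a, b, c)$ we define the $\mathbb{Z}$-module

\[ \mathfrak{a} = a\mathbb{Z} + \frac{b - \sqrt{d_K}}{2}\mathbb{Z} = \Bigl(a, \frac{b - \sqrt{d_K}}{2}\Bigr) \]

and show that $\mathfrak{a}$ is a fractional ideal, i.e. $\mathcal{O}_K\mathfrak{a} \subseteq \mathfrak{a}$. It suffices to verify that $\omega\mathfrak{a} \subseteq \mathfrak{a}$ for $\omega = (d_K + \sqrt{d_K})/2$, since $1, \omega$ is a basis for $\mathcal{O}_K$. Since $d_K = b^2 - 4ac \equiv b^2 \equiv b \equiv -b \pmod{2}$ we get

\[ \frac{d_K + \sqrt{d_K}}{2} \equiv \frac{-b + \sqrt{d_K}}{2} \pmod{1}, \]

that is, the two sides differ by the integer $(d_K + b)/2$, and this shows that $a\omega \in \mathfrak{a}$. Further

\[ \omega\,\frac{b - \sqrt{d_K}}{2} = \frac{d_K}{2}\cdot\frac{b - \sqrt{d_K}}{2} + \frac{b\sqrt{d_K} - d_K}{4} \equiv \frac{d_K}{2}\cdot\frac{b - \sqrt{d_K}}{2} + \frac{b}{2}\cdot\frac{\sqrt{d_K} - b}{2} \equiv \frac{d_K - b}{2}\cdot\frac{b - \sqrt{d_K}}{2} \pmod{a}, \]

again since $d_K \equiv b^2 \pmod{4a}$. This proves that $\omega(b - \sqrt{d_K})/2 \in \mathfrak{a}$, and together that $\mathcal{O}_K\mathfrak{a} \subseteq \mathfrak{a}$.

The norm of the ideal $\mathfrak{a}$ is $|a|$. To see this we calculate

\[ N(\mathfrak{a}) = \mathfrak{a}\mathfrak{a}^\sigma = \Bigl(a^2,\ a\frac{b - \sqrt{d_K}}{2},\ a\frac{b + \sqrt{d_K}}{2},\ \frac{b^2 - d_K}{4}\Bigr) = \Bigl(a^2,\ ab,\ a\frac{b + \sqrt{d_K}}{2},\ ac\Bigr) \subseteq (a), \]

and this ideal also contains $(a)(a, b, c) = (a)$ since $q$ is primitive. Therefore $N(\mathfrak{a}) = N((a)) = |a|$. The basis $a, (b - \sqrt{d_K})/2$ is not necessarily normalized:

\[ \begin{vmatrix} a & (b - \sqrt{d_K})/2 \\ a & (b + \sqrt{d_K})/2 \end{vmatrix} = a\sqrt{d_K} = \frac{a}{|a|}N(\mathfrak{a})\sqrt{d_K}, \]

so it is normalized only if $a > 0$, which is implied by $d_K < 0$. If $a < 0$ then $d_K > 0$ and so $\lambda = \sqrt{d_K}$ is real, which implies that $\lambda^\sigma = -\lambda$ and then $N_{K/\mathbb{Q}}(\lambda) = -d_K$. Then the ideal $\lambda\mathfrak{a}$ has a normalized basis:

\[ \begin{vmatrix} \lambda a & \lambda\frac{b - \sqrt{d_K}}{2} \\ \lambda^\sigma a & \lambda^\sigma\frac{b + \sqrt{d_K}}{2} \end{vmatrix} = N_{K/\mathbb{Q}}(\lambda)\frac{a}{|a|}N(\mathfrak{a})\sqrt{d_K} = -N((\lambda))\frac{a}{|a|}N(\mathfrak{a})\sqrt{d_K} = N(\lambda\mathfrak{a})\sqrt{d_K}. \]

Now

\[ q_{\lambda a} = N(\lambda\mathfrak{a})^{-1}N_{K/\mathbb{Q}}\Bigl(\lambda a x + \lambda\frac{b - \sqrt{d_K}}{2}y\Bigr) = \frac{N_{K/\mathbb{Q}}(\lambda)}{N(\lambda\mathfrak{a})}\Bigl(a^2x^2 + abxy + \frac{b^2 - d_K}{4}y^2\Bigr) = -\frac{N_{K/\mathbb{Q}}(\lambda)}{N_{K/\mathbb{Q}}(\lambda)}\cdot\frac{a}{|a|}\,q = q. \]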

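We conclude that $q$ is either $\kappa(c^+(\mathfrak{a}))$ or $\kappa(c^+(\lambda\mathfrak{a}))$.

Injective: Suppose that there are fractional ideals $\mathfrak{a}, \mathfrak{b}$ with normalized bases $a$ and $b$ such that $q_a \sim q_b$. We may then replace $b$ by $bg$ with $g \in SL_2(\mathbb{Z})$ to get $q_a = q_b$. We have to show that there exists $\lambda \in K^\times$ totally positive such that $\mathfrak{a} = \lambda\mathfrak{b}$. The roots of $q_a$ are $-a_1/a_2, -a_1^\sigma/a_2^\sigma$, and the roots of $q_b$ are $-b_1/b_2, -b_1^\sigma/b_2^\sigma$. This implies that $a_1/a_2 = b_1/b_2$ or $a_1/a_2 = b_1^\sigma/b_2^\sigma$, and we write $a_1 = \lambda b_1$, $a_2 = \lambda b_2$ in the first case and $a_1 = \lambda b_1^\sigma$, $a_2 = \lambda b_2^\sigma$ in the second. In the second case we have

\[ \begin{vmatrix} a_1 & a_2 \\ a_1^\sigma & a_2^\sigma \end{vmatrix} = N_{K/\mathbb{Q}}(\lambda)\begin{vmatrix} b_1^\sigma & b_2^\sigma \\ b_1 & b_2 \end{vmatrix} = -N_{K/\mathbb{Q}}(\lambda)\begin{vmatrix} b_1 & b_2 \\ b_1^\sigma & b_2^\sigma \end{vmatrix}, \]

and since $a$ and $b$ are normalized we find that $N_{K/\mathbb{Q}}(\lambda) < 0$. Since $q_a = q_b$ we find that

\[ N(\mathfrak{b})^{-1}(b_1x + b_2y)(b_1^\sigma x + b_2^\sigma y) = N(\mathfrak{a})^{-1}(a_1x + a_2y)(a_1^\sigma x + a_2^\sigma y) = N(\mathfrak{a})^{-1}N_{K/\mathbb{Q}}(\lambda)(b_1x + b_2y)(b_1^\sigma x + b_2^\sigma y), \]

which implies that $N_{K/\mathbb{Q}}(\lambda) > 0$, a contradiction. Therefore we are in the first case, and then

\[ \sqrt{d_K}\,N(\mathfrak{a}) = \begin{vmatrix} a_1 & a_2 \\ a_1^\sigma & a_2^\sigma \end{vmatrix} = N_{K/\mathbb{Q}}(\lambda)\begin{vmatrix} b_1 & b_2 \\ b_1^\sigma & b_2^\sigma \end{vmatrix} = N_{K/\mathbb{Q}}(\lambda)\sqrt{d_K}\,N(\mathfrak{b}) \]

(remember that $a, b$ are normalized). This gives $N_{K/\mathbb{Q}}(\lambda) > 0$. We conclude that $\mathfrak{a} = \lambda\mathfrak{b}$ with some $\lambda$ such that $N_{K/\mathbb{Q}}(\lambda) > 0$, which means that $c^+(\mathfrak{a}) = c^+(\mathfrak{b})$. Indeed, define

\[ \lambda^* = \begin{cases} \lambda, & \text{if } \lambda > 0, \\ -\lambda, & \text{if } \lambda < 0. \end{cases} \]

Then $\lambda^*$ is totally positive and

\[ \mathfrak{a} = \lambda\mathfrak{b} = \begin{cases} \lambda\mathfrak{b} = \lambda^*\mathfrak{b}, & \lambda > 0, \\ (-\lambda)(-\mathfrak{b}) = (-\lambda)\mathfrak{b} = \lambda^*\mathfrak{b}, & \lambda < 0, \end{cases} \]

and this gives $\mathfrak{a} \sim \mathfrak{b}$ in the narrow sense. //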
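The surjectivity step of the proof can be replayed numerically: starting from $q = (a, b, c)$, take the basis $a, (b - \sqrt{d_K})/2$ (multiplied by $\lambda = \sqrt{d_K}$ if $a < 0$), form the norm form and divide by the ideal norm. The sketch below assumes, as in the proof, that $q$ is primitive of discriminant $d_K$, so that the ideal norm is $|a|$; the norms and the trace are entered by hand from the formulas above, and the function name `form_from_constructed_ideal` is hypothetical.

```python
from fractions import Fraction


def form_from_constructed_ideal(a: int, b: int, c: int):
    """Recompute q from the ideal (a, (b - sqrt(D))/2) built in the proof of Theorem 13.1."""
    D = b * b - 4 * a * c
    n1 = Fraction(a * a)             # N(a1) for a1 = a
    tr = Fraction(a * b)             # Tr(a1 * a2^sigma) for a2 = (b - sqrt(D))/2
    n2 = Fraction(b * b - D, 4)      # N(a2) = a c
    ideal_norm = Fraction(abs(a))    # assumption: q primitive, so N(ideal) = |a|
    if a < 0:
        if D <= 0:
            raise ValueError("a < 0 is only treated for D > 0")
        # Multiply the basis by lambda = sqrt(D); N_{K/Q}(lambda) = -D.
        n1, tr, n2 = -D * n1, -D * tr, -D * n2
        ideal_norm *= D              # N(lambda * ideal) = D * |a|
    return (n1 / ideal_norm, tr / ideal_norm, n2 / ideal_norm)
```

In both sign cases the function returns the triple $(a, b, c)$ it started from, which is the statement $q_a = q$ respectively $q_{\lambda a} = q$.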

Note: A nicer discussion can be found in [3, V, 5.1-5.2]. Look at Prop. 5.1.4, Prop. 5.2.1, Definition 5.2.7, Theorem 5.2.8 and 5.2.9.


14 Representations of integers by binary quadratic forms

(May 14, 2012)

In this final lecture we shall determine which numbers $n$ can be represented by some fixed primitive binary quadratic form $q(x, y)$. It turns out that the answer depends only on the proper equivalence class $[q]$ of the form. Furthermore it can be shown that the answer depends only on the residue of $n$ modulo the discriminant $d$ of the form. We therefore introduce in the group $U_d = (\mathbb{Z}/d\mathbb{Z})^\times$ of units of the residue class ring $\mathbb{Z}/d\mathbb{Z}$ modulo $d$ a subgroup $H_d$ and define for each $[q]$ as above an element $\omega_d([q])$ in $U_d/H_d$ as follows:

Let $n$ be an integer coprime to $d$ that is represented by some form $q \in [q]$ and let $\bar{n}$ be its image in $U_d/H_d$. We put $\omega_d([q]) = \bar{n}$. This defines a map

\[ \omega_d : Q_d \longrightarrow U_d/H_d, \]

and one shows that $\omega_d$ is a homomorphism. For this one has to verify that the map is well-defined. This amounts to showing that if $m$ is another integer represented by $q$ then $\bar{n} = \bar{m}$, and that if $q' \sim q$ then the set of integers represented by $q$ coincides with the set represented by $q'$.

The definition of the subgroup $H_d$ needs characters of the group $\mathbb{Z}/d\mathbb{Z}$.

Definition 14.1. A character of a finite abelian group $G$ is a homomorphism $\chi : G \longrightarrow \mathbb{C}^\times$.

The characters of the group $G$ form a group $X(G)$ which is isomorphic to the group $G$. If $G = G' \times G''$ then $X(G) = X(G') \times X(G'')$, and this shows by virtue of the Chinese Remainder Theorem that if $d = 2^e \prod p$, with $e$ equal to $0$ or $3$ depending on whether the discriminant is odd or even and the product taken over all odd prime divisors of $d$, then

\[ X(U_d) = X(U_{2^e}) \prod X(U_p). \]

We denote for $p$ odd by $\chi_p$ the character given by the Jacobi symbol $\bigl(\frac{n}{p}\bigr)$. We also have to introduce the character $\chi_{2^e}$ belonging to the group $X(U_{2^e})$. The order of $U_8$ is $4$ and therefore there are $4$ characters in the group. They are given by

\[ 1, \quad (-1)^{\frac{n - 1}{2}}, \quad (-1)^{\frac{n^2 - 1}{8}}, \quad (-1)^{\frac{n - 1}{2} + \frac{n^2 - 1}{8}}, \]

and we take their product as $\chi_8$. It can be expressed as

\[ \chi_{2^e} = (-1)^{e\frac{n^2 - 1}{8} + \frac{\delta - 1}{2}\cdot\frac{n - 1}{2}} \]

if $d = 2^e\delta$ with $\delta$ odd.
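The characters of $U_8$ listed above are easy to tabulate. The sketch below evaluates $(-1)^{(n-1)/2}$ and $(-1)^{(n^2-1)/8}$ on odd $n$ and implements the displayed formula for $\chi_{2^e}$; the function names `chi_4`, `chi_8` and `chi_2e` are hypothetical, and reducing $n$ modulo $8$ first is a convenience of the sketch, justified because all three expressions depend only on $n$ modulo $8$.

```python
def chi_4(n: int) -> int:
    """(-1)^((n-1)/2) for odd n."""
    n %= 8
    return (-1) ** ((n - 1) // 2)


def chi_8(n: int) -> int:
    """(-1)^((n^2-1)/8) for odd n."""
    n %= 8
    return (-1) ** ((n * n - 1) // 8)


def chi_2e(n: int, e: int, delta: int) -> int:
    """(-1)^(e (n^2-1)/8 + ((delta-1)/2)((n-1)/2)) for odd n and odd delta."""
    n %= 8
    exponent = e * ((n * n - 1) // 8) + ((delta - 1) // 2) * ((n - 1) // 2)
    return (-1) ** exponent


# Values of (chi_4, chi_8) on the units 1, 3, 5, 7 modulo 8.
table = {n: (chi_4(n), chi_8(n)) for n in (1, 3, 5, 7)}
```

The table shows that the two characters are distinct and nontrivial, so together with their product and the trivial character they exhaust the four characters of $U_8$.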


References

[1] J. Buchmann and U. Vollmer, Binary quadratic forms. Springer-Verlag, Berlin, 2007.

[2] D.A. Buell, Binary quadratic forms : classical theory and modern computations. Springer-Verlag, New York, 1989.

[3] H. Cohen, A course in computational algebraic number theory, Second Corrected Printing, 1995.

[4] A. Fröhlich and M.J. Taylor, Algebraic Number Theory. Cambridge Studies in Advanced Mathematics 27, Cambridge University Press, Cambridge, 1991.

[5] L.-K. Hua, Introduction to number theory. Springer-Verlag, Berlin, 1982.

[6] H. Koch, Zahlentheorie. Vieweg Studium: Aufbaukurs Mathematik 72, Vieweg Verlag, Braunschweig, 1997.

[7] F. Lemmermeyer, Quadratische Zahlkörper. Skript, 1999.

[8] W. Narkiewicz, Elementary and Analytic Theory of Algebraic Numbers, Springer Monographs in Mathematics.

[9] J. Neukirch, Algebraic number theory. Grundlehren der mathematischen Wissenschaften 322, Springer Verlag, 1999.

[10] G. Wüstholz, Algebra. Vieweg Verlag, 2012.


Index

O-module, 16
Z-module, 11

Cassel’s Lemma, 18
class number, 16
Conway-Schneeberger Fifteen Theorem, 4

decomposition group, 15
discriminant, 2

equivalence of forms, 2
euclidean algorithm, 9
euclidean ring, 8
Euler’s identity, 5

factorial ring, 8
factorization, 7

Gaussian number rings, 9

ideal
  fractional, 2
  irreducible, 13
  maximal, 13
  prime, 7
  principal fractional, 16
  product of, 11
ideal class group, 16
inertia group, 15
integral, 6
irreducible, 8

narrow class group, 16
normalized basis, 24
number rings, 6

Pell’s equation, 7
prime, 8
  inert, 3
  ramified, 3
  split, 3
principal ideal domain (PID), 8
proper equivalence of quadratic forms, 21

quadratic
  binary quadratic form, 1
  imaginary, 6
  indefinite form, 21
  negative definite form, 21
  positive definite form, 21
  primitive binary forms, 19
  quadratic number field, 2
  real, 6
  reducible form, 21
quadratic reciprocity law, 5

residue field, 15

Theorem
  Dirichlet’s unit, 7
  Four squares, 4, 19
  Three squares, 18
  Two squares, 5
  Wilson’s, 5
totally positive element, 17

unique factorization domain (UFD), 8
