ma3265 introduction to number theory - dept of maths, nuschanhh/notes/ma3265.pdf · 1 divisibility...
TRANSCRIPT
![Page 1: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/1.jpg)
MA3265 Introduction to
Number Theory
Notes by Chan Heng Huat
![Page 2: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/2.jpg)
![Page 3: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/3.jpg)
Contents
References page 1
1 Divisibility 2
1.1 Introduction 2
1.2 Division algorithm and congruences 5
1.3 Greatest Common Divisors 8
1.4 Least common multiple 12
1.5 Euclid’s Lemma 13
1.6 Fundamental Theorem of Arithmetic 14
1.7 Euclid’s Lemma and the linear Diophantine equation 15
1.8 Appendix : The cyclic group (Z,+) 17
2 More on Congruences 19
2.1 Fermat’s Theorem and Euler’s Theorem 19
2.2 An extension of Fermat’s Little Theorem: The Euler Theorem 21
2.3 Finite abelian groups and Euler’s Theorem 23
2.4 The Chinese Remainder Theorem 25
2.5 Multiplicative functions and even perfect numbers 27
2.6 The Euler ϕ-function is multiplicative 28
2.7 An Application to Cryptography 30
2.8 Appendix : Group Action and Fermat’s Little Theorem 32
3 Solutions of congruences 35
3.1 The Chinese Remainder Theorem revisited 35
3.2 Prime power moduli 36
3.3 Prime moduli 39
3.4 Wolstenholme’s congruence 43
4 Primitive roots 44
4.1 Order of an element in a group 44
4.2 Integers m for which (Z/mZ)∗ is cyclic 45
4.3 Integers m for which (Z/mZ)∗ is not cyclic 49
5 Quadratic Reciprocity Law 50
![Page 4: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/4.jpg)
iv Contents
5.1 Primitive roots and solutions of congruences 50
5.2 The Legendre Symbol 51
5.3 Gauss Lemma 52
5.4 Proofs of Gauss’ Quadratic Reciprocity Law 53
5.5 The Jacobi Symbol 56
5.6 The Transfer 59
5.7 Gauss Lemma via the transfer 62
5.8 The Legendre Symbol and primes of the form x2 + y2 62
5.9 Appendix : Primes of the form x2 + y2, a second approach 66
6 Binary Quadratic forms 68
6.1 Fermat-Euler Theorem 68
6.2 Representations of integers by binary quadratic forms 70
6.3 Equivalence of binary quadratic forms 74
6.4 Reduced forms 75
7 Form Class groups 80
7.1 The set C(d) 80
7.2 C(d) is a finite abelian group 82
7.3 The operation • is well defined 83
8 Continued fractions and Pell’s equations 89
8.1 Pell’s equations 89
8.2 Continued fractions 90
8.3 Infinite continued fraction 90
8.4 A simple Lemma and primes of the form x2 + y2 92
8.5 Solutions to Pell’s equations 94
8.6 The Pell equation x2 − dy2 = 1 is solvable with xy 6= 0 97
9 Euler’s identities and Jacobi Triple Product Identity 105
9.1 Jacobi’s Triple Product Identity 105
9.2 The terminating q-binomial Theorem 106
9.3 Two identities of Euler 108
9.4 Andrews’ proof of Jacobi’s Triple Product Identity 109
9.5 Number of representations of n as sums of two squares 111
9.6 The partition function p(n) 113
9.7 Proof of p(5n+ 4) ≡ 0 (mod 5) 116
10 Lattices and Minkowski’s Theorem 118
10.1 Lattices in Rn 118
10.2 Translations of fundamental parallelopiped 119
10.3 Spheres and bounded sets in Rn 120
10.4 Minkowski’s Theorem 120
10.5 Primes of the form x2 + y2, the geometric approach 122
![Page 5: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/5.jpg)
Contents v
11 Elliptic curves 124
11.1 Rational Points on curves 125
11.2 The Geometry of cubic curves 127
11.3 Explicit form of the group law 130
11.4 The Pollard’s p− 1 method 130
11.5 Appendix 133
![Page 6: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/6.jpg)
![Page 7: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/7.jpg)
References
The main reference is “An introduction to the theory of numbers” by I. Niven,
H.S. Zuckerman and H.L. Montgomery. Most of the tutorial problems in this
course can be found in the book.
Other references are “A course in arithmetic” by J.P. Serre, “Primes of the
form x2 + ny2” by D.A. Cox, “Introduction to Analytic Number Theory” by
T.M. Apostol, “Introduction to Number Theory” by D. Flath and “Elementary
Number Theory” by D. Burton.
![Page 8: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/8.jpg)
1 Divisibility
1.1 Introduction
Number Theory is a branch of mathematics that study problems leading to a
better understanding of the additive and multiplicative properties of integers
Z = {0,±1,±2,±3, · · · }.
We list several examples of such problems.
example 1.1
An integer s is called a perfect square if it is of the form s = m2 for some
integer m. One may ask when is an integer a sum of two perfect squares?
example 1.2
What are the most general solutions to the equation x2 + y2 = z2 where x, y, z ∈Z?
Remark 1.1 The triplet (x, y, z) satisfying x2+ y2 = z2 is called a Pythagorean
triplet.
example 1.3
Let n ≥ 3. Are there integers x, y, z with xyz 6= 0 such that xn + yn = zn?
Remark 1.2 This is the Fermat Last Theorem. It was proved by A. Wiles and
R. Taylor in 1995.
example 1.4
How many squares do we need to represent all positive integers?
definition 1.3 Let a and b be two integers such that ab 6= 0. We say that a
is a divisor of b if b = aq for some integer q. We use the notation a|b to mean a
divides b. We also say that a is a factor of b and that b is a multiple of a.
definition 1.4 A prime number is a positive integer with exactly two positive
divisors. An integer n > 1 which is not a prime is called a composite number.
![Page 9: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/9.jpg)
1.1 Introduction 3
Prime numbers are the “atoms” of integers when we study their multiplicative
structure. There are many interesting problems (most of which are unsolved)
that are related to prime numbers. We list a few here.
example 1.5
Are there infinitely many primes of the form n2 + 1?
example 1.6
Are there infinitely many primes p such that p+ 2 is also a prime?
Remark 1.5 A breakthrough in this problem was discovered in 2013 by an
unknown mathematician Y.T. Zhang. He showed that there exists an even integer
a less than 70 million such that there are infinitely many primes p with the
property that p+a is also a prime. The number 70 million has since been reduced
to 270, thanks to the effort of J. Maynard and Polymath headed by T. Tao.
example 1.7
(Goldbach) Is an even positive integer greater than 2 always a sum of two primes?
example 1.8
(Goldbach) Is every odd positive integer greater than 6 a sum of three primes?
Remark 1.6 This problem was settled completely in 2013 by H. Helfgott. Before
Helfgott’s work, I.M. Vinogradov was able to show the truth of the statement for
sufficiently large odd positive integer. It can be shown that “sufficiently large”
odd integers are integers greater than 3315
).
example 1.9
For every positive integer n > 1, is there always a prime between n and 2n?
Remark 1.7 This is the Bertrand’s Postulate. Note that Bertrand’s Postulate
implies that there are infinitely many primes since there are infinitely many
positive integers. This fact was first proved by P. Chebyshev. Other proofs were
given independently by S. Ramanujan and P. Erdos.
definition 1.8 We say that two integers a and b are relatively prime if there
does not exist an integer c > 1 such that c|a and c|b.
example 1.10
Suppose a and b are two relatively prime positive integers. Are there infinitely
many primes of the form an+ b?
Remark 1.9 This is the Dirichlet Theorem of primes in arithmetic progression.
definition 1.10 The symbol π(x) denote the number of primes less than x.
![Page 10: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/10.jpg)
4 Divisibility
example 1.11
Is it true that
limx→∞
π(x)
(x/ lnx)= 1?
Remark 1.11 This is the Prime Number Theorem. It was first observed by C.F.
Gauss around 1791 and A.M. Legendre around 1798. It was proved independently
around 1896 by J. Hadamard and Charles de la Vallee Poussin using complex
analysis. Elementary proofs of this result without using complex analysis were
later given by A. Selberg and P. Erdos. Both proofs required the Selberg identity.
Since the appearance of the proofs by Selberg and Erdos, other elementary proofs
were discovered. Some of these proofs were independent of Selberg’s identity and
one of the most elegant proofs was given by A. Hildebrand using the theory of
large sieve.
definition 1.12 A positive integer n is called a perfect number if it is the
sum of all its divisors d 6= n.
There are many even perfect numbers. The first two examples are 6 and 28.
Remark 1.13 An even integer is a perfect number if and only if it is of the form
2k−1(2k − 1) where 2k − 1 is a prime. (Try proving this statement.) A prime of
the form 2k − 1 is called a Mersenne prime.
One can ask the following question :
example 1.12
Does odd perfect number exist?
definition 1.14 An integer n is called a power if n = mk for some integer
k > 1.
example 1.13
It is clear that 23, 32 are consecutive powers. Are there any other consecutive
powers?
Remark 1.15 This is the Catalan conjecture and it was proved in 2002 by P.
Mihailescu..
There are many other problems besides those stated here. Some of these prob-
lems are solved while others remained open.
In this course, we will study some basic results in various areas of number
theory. Hopefully by the end of this course, students will be able to appreciate the
relations among different areas of mathematics through solving number theoretic
problems and that they will be able to appreciate the beauty of this topic.
![Page 11: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/11.jpg)
1.2 Division algorithm and congruences 5
1.2 Division algorithm and congruences
We have seen in Definition 1.3 that if a is a divisor of b, we write a|b. When a is
not a divisor of b, we use the notation a - b.
theorem 1.16 Let a, b, c, x,m, y be integers.
(a) a|b implies that a|bc for any integer c;
(b) a|b and b|c imply a|c;(c) a|b and a|c imply a|(bx+ cy) for any integers x and y;
(d) a|b and b|a imply a = ±b;(e) a|b, a > 0, b > 0, imply a ≤ b;
(f) Let m be a nonzero integer. Then a | b implies that am | bm; if am | bm,then a | b.
Proof
We prove (c). Suppose a | b and a | c , then b = ak and c = al for some integers
k and l. Now,
bx+ cy = a(kx+ ly).
Hence a|(bx+ cy).
theorem 1.17 (The Division Algorithm) Given any integers a and b with
a > 0, there exist unique integers q and r such that b = qa + r, 0 ≤ r < a. If
a - b, then r satisfies the stronger inequalities 0 < r < a.
To prove the above result, we will first need the least integer axiom, which
says that if S is the nonempty subset of non-negative integers, then there is an
element r ∈ S such that r ≤ s for all s ∈ S. This principle cannot be proved
but it is certainly plausible. Suppose 0 is not in S, then 0 is not the smallest
element in S. If 1 is not in S, then 1 is not the smallest element in S. Since S is
nonempty, then there must be an integer r that happens to be the first integer
in S. This integer r is the smallest integer in S.
Remark 1.18 One can show that the least integer axiom is equivalent to the
mathematical induction.
We are now ready to prove the Division Algorithm.
Proof
Let
S = {y : y = b− ax, x is an integer, y ≥ 0}.
Note that since a ≥ 1 and
b− a · (−|b|) = |b|(b
|b| + a
)≥ 0,
![Page 12: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/12.jpg)
6 Divisibility
we see that S is non-empty. By the least integer axiom, it has a smallest member,
say r = b− aq. Then b = aq + r and r ≥ 0. Now we show that r < a.
Suppose r ≥ a. Then 0 ≤ r−a = b−a(q+1) and hence r−a ∈ S. Since r−a < r,
this contradicts the minimality of r. Hence, r < a. The pair q, r is unique, for
if there is another pair q′, r′ such that b = aq′ + r′, then a(q′ − q) = r − r′. If
r = r′, then q = q′ and we complete the proof of the Theorem. Suppose r 6= r′.
Then |r − r′| > 0 and a| |r − r′|. By Theorem 1.16(e), we find that
|r − r′| ≥ a. (1.1)
But 0 ≤ r′ < a implies that −a < −r′ ≤ 0. Together with 0 ≤ r < a, we
conclude that
−a < r − r′ < a,
or
|r − r′| < a.
This contradicts (1.1).
Remark 1.19 If we let a = 2 in Theorem 1.17, then our conclusion is that every
integer is of the form 2k or 2k+1. If we let a = 4, then we find that every integer
is of the form 4k, 4k + 1, 4k + 2 or 4k + 3.
We now introduce congruences.
definition 1.20 Let n > 0 be an integer. We say that “a is congruent to b
modulo n” and write
a ≡ b (mod n)
if
n|(a− b).
With the above notation, we can replace the sentence “Every integer n is of
the form 2k or 2k + 1.” by “Every integer n is such that n ≡ 0 or 1 (mod 2).”.
We now list a few basic properties of congruences.
theorem 1.21 Let a, b, c, d, n be integers with n > 0. Then
(a) For all integers k, k ≡ k (mod n).
(b) If a ≡ b (mod n) then b ≡ a (mod n).
(c) If a ≡ b (mod n) and b ≡ c (mod n) then a ≡ c (mod n).
(d) If a ≡ b (mod n) and c ≡ d (mod n) then a + c ≡ b + d (mod n) and
ac ≡ bd (mod n).
We will only supply the proof of Theorem 1.21 (d).
![Page 13: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/13.jpg)
1.2 Division algorithm and congruences 7
Proof of Theorem 1.21 (d)
Since a ≡ b (mod n) and c ≡ d (mod n), we deduce that n|(a−b) and n|(c−d).Hence, a − b = nk and c − d = nj. Therefore, a + c = b + d + n(k + j), which
implies that n|(a + c) − (b + d), i.e., a + c ≡ b + d (mod n). Now, a = nk + b
and c = nj + d also implies that ac = bd + n(kd + bj + nkj). Hence ac ≡ bd
(mod n).
The use of congruences allow us to have shorter presentations of solutions to
the next few problems.
example 1.14
Show that the product of two consecutive integers is always even.
Solutions. Let n be the integer. Then by Theorem 1.17, n ≡ 0 or 1 (mod 2).
Hence n(n+1) ≡ 0 · 1 or 1 · (1+1) (mod 2). In both cases, we have n(n+1) ≡ 0
(mod 2) and hence, n(n+ 1) is even.
example 1.15
Show that 3 divides a(a2 + 2) for any integer a ∈ Z.
Solutions. Let a be any integer. Then a ≡ 0, 1 or 2 (mod 3). This implies that
a(a2 + 1) ≡ 0 · 1, 1 · 3 or 2 · 3 (mod 3).
In all cases, a(a2 + 2) ≡ 0 (mod 3) and hence 3|a(a2 + 2).
example 1.16
Show that if n is of the form 4k + 3 then n is not a perfect square.
Solutions. Let m be any integer. Then m ≡ 0, 1, 2 or 3 (mod 4). This implies
that m2 ≡ 0 or 1 (mod 4) and therefore a square can only be of the form 4k or
4k + 1.
example 1.17
Prove that no integer in the sequence 11, 111, 1111, 11111, · · · is a perfect square.
Solutions. First, note that 11 ≡ 3 (mod 4) is not a square. The rest of the
integers are of the form
10k + 10k−1 + · · ·+ 100 + 10 + 1
where k ≥ 2. But these integers can be written as
100(10k−2 + · · ·+ 1) + 11 ≡ 3 (mod 4)
and so it is of the form 4N + 3, and hence cannot be a square.
example 1.18
Show that if p is an odd prime of the form 4k + 3 then p is not a sum of two
squares.
Solutions. We have seen that a square is congruent to 0 and 1 modulo 4. Hence
![Page 14: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/14.jpg)
8 Divisibility
a sum of two squares must be congruent to 0,1 or 2 modulo 4. This means that
any integer of the form 4k + 3 cannot be a sum of two squares. In particular, a
prime of the form 4k+3 is not a sum of two squares. This completes our solution
of the problem.
Now, since an odd prime is either of the form 4k+1 or 4k+3, it is natural to
determine if primes of the form 4k + 1 were a sum of two squares. It turns out
that this is always true. We will see proofs of this result in subsequent chapters.
example 1.19
Show that 41|(220 − 1).
Solutions. 25 = 32 ≡ −9 (mod 41). Therefore, 220 = (25)4 ≡ (−9)4 (mod 41).
Now, (−9)4 ≡ 812 ≡ 1 (mod 41). Hence 220 − 1 ≡ 0 (mod 41).
example 1.20
Find the remainder of 1! + 2! + · · ·+ 100! when this number is divided by 12.
Solutions. Note that 4! ≡ 0 (mod 12) and n! ≡ 0 (mod 12) when n > 4.
Therefore 1! + 2! + · · ·+100! ≡ 1! + 2!+ 3! ≡ 1+ 2+ 6 ≡ 9 (mod 12). Therefore
the remainder is 9.
1.3 Greatest Common Divisors
definition 1.22 A common divisor of integers a and b is an integer c with
c|a and c|b.
definition 1.23 A greatest common divisor of integers a and b is a number
d with the following properties :
(a) The integer d is nonnegative.
(b) The integer d is a common divisor of a and b.
(c) If e is any common divisor of a and b, then e|d.Note that if d and d′ are both greatest common divisors of a and b, then d is
a common divisor of a and b and d′ is a greatest common divisor, we note that
d′|d using Definition 1.23 (c). Similarly, since d′ is a common divisor and d is a
greatest common divisor, d|d′. By Theorem 1.16 (d), |d| = |d′| and by Definition
1.23 (a), we deduce that d = d′. This shows that the greatest common divisor of
a and b is unique.
definition 1.24 The greatest common divisor of a and b is denoted by (a, b).
Remark 1.25 When a and b are zeros, then (0, 0) = 0. If a = 0 and b is
non-zero, then (0, b) = b.
We will next show that the greatest common divisor of two integers exists. By
Remark 1.25, it suffices to consider the case when both a and b are nonzero.
![Page 15: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/15.jpg)
1.3 Greatest Common Divisors 9
theorem 1.26 Let a and b be nonzero integers. Then the smallest positive
integer in the set
P := {sa+ tb|s, t ∈ Z and sa+ tb > 0}
is (a, b).
Proof
If a is positive then a ∈ P since
a = 1 · a+ 0 · b.
Similarly, if b is positive, then b ∈ P . Suppose a and b are both negative. Then
0 · a+(−1) · b ∈ P . Hence that P is nonempty. By the least integer axiom, there
is a smallest positive integer, say d, in P . We must show that d = (a, b).
We observe that d ≥ 1 since we have assumed that both a and b are nonzero.
Next, since d ∈ P ,
d = xa+ yb (1.2)
for some integers x, y ∈ Z. We first show that d is a common divisor of a and b.
We claim that d|a. Suppose not. By Theorem 1.17, we may suppose
a = dq + r, 0 < r < d.
Then
r = a− dq = a− (xaq + ybq) = a(1− xq) + byq.
Therefore, r ∈ P and it is smaller than d. But d is the smallest element in P .
Hence, we conclude that d | a. In a similar way, we find that d | b. This shows
that d is a common divisor of a and b.
Finally, if c|a and c|b then a = cu and b = cv. This implies, by (1.2), that
xa+ yb = c(ux+ vy) = d
and hence, c|d. This shows that d satisfies Definition 1.23 and we conclude that
d = (a, b).
Remark 1.27 Note that the above theorem says that if d = (a, b) then there
exists integers x and y such that
d = ax+ by.
Remark 1.28 The condition in Definition 1.23 (c) can be replaced by
(c)’ If e|a and e|b then e ≤ d.
This condition explains the word “greatest” used in the definition of (a, b).
We recall that two integers a and b are relatively prime if their only common
divisor is 1. Now, (a, b) is a common divisor and this means that (a, b) = 1
since there is only one common divisor. In other words, we find that a and b are
relatively prime if and only if (a, b) = 1.
![Page 16: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/16.jpg)
10 Divisibility
theorem 1.29 Let a and b be integers. Then (a, b) = 1 (or a and b are
relatively prime) if and only if 1 = ax+ by for some integers x and y.
Proof
Note that if (a, b) = 1, then 1 = ax+ by for some integers x and y.
Conversely, if 1 = ax+by, then (a, b)|a and (a, b)|b, and therefore (a, b)|1. Thisimplies that (a, b) = 1.
example 1.21
Show that if (m,n) = 1 then m|c and n|c implies that (mn)|c.
Solutions. Since m|c and n|c, we have c = mh and c = nk for some integers h
and k. Now, the fact that (m,n) = 1 implies that
1 = mu+ nv
for some integers u and v. Hence,
c = mcu+ ncv = m(nk)u+ n(mh)v = mn(ku+ hv).
This implies that (mn)|c.
We now list down some basic properties of the greatest common divisor of two
integers.
theorem 1.30 Let a, b and c be integers. Then
(a) (a, b) = (b, a) (commutative law),
(b) (a, (b, c)) = ((a, b), c) (associative law),
(c) (ac, bc) = |c|(a, b), and
(d) (a, 1) = (1, a) = 1. If a is non-zero, then (a, 0) = (0, a) = |a|.
Proof
We will prove only (c) and leave the proofs of the other statements as exercises.
Let d = (ac, bc) and d′ = |c|(a, b). By Theorem 1.26, d = acx+ bcy for some
integers x and y. Hence,
d =c
|c| (a · |c| · x+ b · |c| · y) .
Now, d′ = |c|(a, b) and since (a, b)|a and (a, b)|b, we find that d′ is a common
divisor of a · |c| and b · |c| and therefore, by (1.3), d′|d.Next, since d′/|c| = (a, b), by Theorem 1.26,
d′
|c| = au+ bv
for some integers u and v. This implies that
d′ = a · |c| · u+ b · |c| · v =|c|c(acu+ bcv) .
But d is a common divisor of ac and bc and hence d|d′. Since d′|d and d|d′, we
![Page 17: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/17.jpg)
1.3 Greatest Common Divisors 11
conclude by Theorem 1.16 (d) that |d| = |d′|. Since both d and d′ are positive,
we deduce that d = d′.
In the next Theorem, we prove The Euclidean Algorithm. This allows us to
compute the greatest common divisor of two numbers.
theorem 1.31 (The Euclidean Algorithm) Given positive integers a and b,
where a - b. Let r0 = b, r1 = a, and apply the division algorithm repeatedly to
obtain a set of remainders r2, r3, ..., rn, rn+1 defined successively by the relations
r0 = r1q1 + r2, 0 < r2 < r1
r1 = r2q2 + r3, 0 < r3 < r2
...
rn−2 = rn−1qn−1 + rn, 0 < rn < rn−1
rn−1 = rnqn + rn+1, rn+1 = 0.
Then rn, the last nonzero remainder in this process is (a, b), the greatest common
divisor of a, and b.
We need a simple lemma.
lemma 1.32 Let a, b, q, and r be integers such that b = aq + r, then (a, b) =
(a, r).
Proof
Let d = (a, b) and d′ = (a, r). Note that since d|a and d|b, we find that d|(b−aq)by Theorem 1.16 (c). Hence, d|r and d is a common divisor of b and r. By
Definition 1.23 (c), d|d′ since d′ = (a, r). Similarly, d′|b and d′|r implies that
d′|(aq + r) by Theorem 1.16 (c) and consequently, d′|b. By Definition 1.23 (c),
d′|d since d = (a, b). Therefore, by Theorem 1.16 (d), d = d′.
We now complete the proof of Theorem 1.31.
Proof of Theorem 1.31
There is a stage at which rn+1 = 0 because the ri are decreasing and nonnegative.
Next, applying Lemma 1.32, we find that
(a, b) = (r0, r1) = (r1, r2) = · · · = (rn, rn+1) = (rn, 0) = rn.
This completes the proof of Theorem 1.31.
example 1.22
Find (196884, 576).
![Page 18: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/18.jpg)
12 Divisibility
Solutions.
196884 = 341 · 576 + 468
576 = 1 · 468 + 108
468 = 4 · 108 + 36
108 = 3 · 36 + 0.
Therefore (196884, 576) = 36.
example 1.23
Find integers x and y satisfying
43x+ 64y = 1.
Solutions.
64 = 43 · 1 + 21
43 = 21 · 2 + 1.
Working backwards, we have
1 = 43− 21 · 2 = 43− 2 · (64− 43) = 43 · 3 + 64 · (−2).
1.4 Least common multiple
definition 1.33 A common multiple of two integers a and b is a positive
integer m with the property that a|m and b|m.
definition 1.34 The least common multiple of two integers a and b is an
integer m such that
(a) m > 0,
(b) a|m and b|m, and
(c) If c is a common multiple of a and b then m|c.
Note that [a, b] = [|a|, |b|] and so we will consider [a, b] when a and b are
positive. We first establish the existence of the least common multiple of a and
b. Consider the set
S := {c ∈ Z+ | a|c and b|c}.
By least integer axiom, there exists a common multiple m of a and b such that
m ≤ c if c is any common multiples of a and b. We need to show that m|c.Suppose m - c for some common multiple c of a and b. Write
c = mq + r, 0 < r < m.
![Page 19: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/19.jpg)
1.5 Euclid’s Lemma 13
Since a|c and a|m, we find that a|r. Similarly, b|r and hence r ∈ S. But r < m,
contradicting the minimality of m. Hence, m|c.We now give two properties of the least common multiple of a and b. Note
that part (a) of the following result is an analogue of (ka, kb) = |k|(a, b).
theorem 1.35 Let a, b,m be positive integers. Then
(a) [ma,mb] = m[a, b], and
(b) [a, b] · (a, b) = ab.
Proof
Let H = [ma,mb] and h = [a, b]. Then a|h and b|h. Hence, ma|mh and mb|mh.This implies that H |mh.Next, ma|H and mb|H . Hence, a|(H/m) and b|(H/m) and so h|(H/m). Hence
mh|H. Therefore H = mh.
We may assume that a, b are both positive since (a,−b) = (a, b) and [a,−b] =[a, b]. Our first step is to show that if (h, k) = 1 then [h, k] = hk.
Suppose (h, k) = 1 and that h|c and k|c. Then by Example 1.21, we conclude
that hk|c. This implies that hk = [h, k] since all common multiples of h and k
are less than or equal to hk.
Now, let a and b be two integers and d = (a, b). Then [a, b] = d[a/d, b/d].
Since (a/d, b/d) = 1, we conclude that [a/d, b/d] = ab/d2. Therefore [a, b] =
d(ab/d2) = ab/d, or (a, b)[a, b] = ab.
1.5 Euclid’s Lemma
We know that if c 6= 0 then ca = cb implies that a = b. This is the law of
cancelation for equality. The law is not true in general if we replace “=” by ≡.
For example, 15 ≡ 3 (mod 12) but 5 6≡ 1 (mod 12). This shows that the law of
cancellation does not hold in general for congruences.
The next result (Euclid’s Lemma) shows that the law of cancelation holds if
we impose a condition on the integer c.
theorem 1.36 If ca ≡ cb (mod n) and (c, n) = 1, then a ≡ b (mod n).
Proof
Recall from Theorem 1.26 that if (c, n) = 1 then there exist integers x and y
such that cx+ ny = 1. Multiplying a and b yields
acx+ any = a
and
bcx+ bny = b,
respectively. This implies that
(ac− bc)x+ n(ay − by) = a− b.
![Page 20: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/20.jpg)
14 Divisibility
Since n|(ac− bc) and n|n, we conclude that n|(a− b).
Theorem 1.36 can be used to prove the following result of Euclid.
corollary 1.37 Let p be a prime. If p|(ab), then p|a or p|b.Proof
For any integer n, (n, p) = 1 or p since p has only two divisors. Suppose p - a.
Then (p, a)=1. By Theorem 1.36, the relation
ab ≡ 0 (mod p)
then implies that
b ≡ 0 (mod p).
By induction, we have the following:
corollary 1.38 Let p be a prime. If p|(a1a2 · · · am) then p|ak for some k.
1.6 Fundamental Theorem of Arithmetic
theorem 1.39 (Fundamental Theorem of Arithmetic) Every positive integer
n > 1 can be expressed as a product of primes; this representation is unique
apart from the order in which the factors occur.
Proof
We first show that n can be expressed as a prime or a product of primes. We
use induction on n. The statement is clearly true for n = 2 since 2 is a prime.
Suppose m is a prime or a product of primes for 2 ≤ m ≤ n− 1. If n is a prime
then we are done. Suppose n is composite then n = ab, where 1 < a, b < n. By
induction each of the a and b is either a prime or a product of primes. Hence,
n = ab is a product of primes. By mathematical induction, every positive integer
n > 1 is a prime or a product of primes.
To prove uniqueness, we use induction on n again. If n = 2 then the repre-
sentation of n as a product of primes is clearly unique. Assume, then that it is
true for all integers greater than 1 and less than n. We shall prove that it is also
true for n. If n is prime, then there is nothing to prove. Assume, then, that n is
composite and that n has two factorizations, say,
n = p1p2 · · · ps = q1q2 · · · qt. (1.3)
Since p1 divides the product q1q2 · · · qt, it must divide at least one factor. Relabel
q1, q2, ..., qt so that p1|q1. Then p1 = q1 since both p1 and q1 are primes. In (1.3),
we may cancel p1 on both sides to obtain
n/p1 = p2 · · · ps = q2 · · · qt.
![Page 21: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/21.jpg)
1.7 Euclid’s Lemma and the linear Diophantine equation 15
Now the induction hypothesis implies that the two factorizations of n/p1 must
be the same, apart from the order of the factors. Therefore, s = t and the
factorizations in (1.3) are also identical, apart from order. This completes the
proof.
example 1.24
Let
a =∏
p
pαp and b =∏
p
pβp .
Let
d =∏
p
pmin(α0,βp).
Show that d = (a, b). Show that if
` =∏
p
pmax(αp,βp),
then ` = [a, b].
1.7 Euclid’s Lemma and the linear Diophantine equation
In Example 1.23, we give a particular solution to
43x+ 64y = 1.
In general, an equation of the type
ax+ by = m
is called a linear Diophantine equation. We now discuss how one can obtain
general solutions of linear Diophantine equation.
Suppose (a, b) = d. If d - m then the equation has no solution. Suppose d|m.
Let x0 and y0 be a particular solution of ax+ by = m. Let X and Y be general
solution of the equation. Then
a(X − x0) = b(y0 − Y ).
Hence,
a
d(X − x0) =
b
d(y0 − Y ).
Since (a/d, b/d) = 1, we conclude from Theorem 1.36 that
y0 − Y =a
dk and X − x0 =
b
dk.
The general solution is therefore
X =b
dk + x0 and Y = −a
dk + y0.
![Page 22: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/22.jpg)
16 Divisibility
example 1.25
When Mr Smith cashed a check at his bank, the teller mistook the number of
cents for the number of dollars and vice versa. Unaware of this, Mr Smith spent
68 cents and then noticed to his surprise that he had twice the amount of the
original check. Determine the smallest value for which the check could have been
written.
Solutions. Let x be the number of dollars and y be the number of cents. From
the information given, we set up the equation
98y − 199x = 68.
Using Euclidean algorithm, we find that
1 = 33 · 199− 98 · 67,
This implies a particular solution for
98y − 199x = 68
is
y0 = −67 · 68 and x0 = 33 · 68.
So the general solution for
98y − 199x = 68
is
y = 199k − 67 · 68 and x = 98k + 33 · 68.
Now, 0 < y < 100 implies that 22 < k ≤ 23 and hence, k = 23. Thus, y = 21
and x = 10.
We now apply the solutions of linear Diophantine equation to solve linear
congruences. We will prove that
theorem 1.40 Given integers a, b such that (a, b) = d and d|N , the number
of solutions satisfying
ax ≡ N (mod b) (1.4)
is exactly d.
Proof
Suppose ax ≡ N (mod b). Then
ax = N − by. (1.5)
If x0 and y0 are integers satisfying (1.5), then the general solution of (1.5) is
X =b
dk + x0 and Y = −a
dk + y0.
![Page 23: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/23.jpg)
1.8 Appendix : The cyclic group (Z,+) 17
This implies that X is a solution of (1.4) for any integer k. We claim that
modulo b there are exactly d solutions. Let k = dq + r with 0 ≤ r < d, and
b
d(dq + r) ≡ b
dr (mod b).
So any solution with k > d can be identified with a solution r < d. Next, no
two solutions from the set of solutions
{ bdk + x0, 0 ≤ k < d}
are the same modulo b. Suppose
b
di+ x0 ≡ b
dj + x0 (mod b).
Thenb
d(i − j) ≡ 0 (mod b).
This implies that d|(i− j). Since i and j are both less than d, i = j. Hence there
are exactly d solutions to (1.4).
Remark 1.41 Theorem 1.40 will be needed for determining the number of so-
lutions of congruences of the type
xm ≡ c (mod n).
1.8 Appendix : The cyclic group (Z,+)
Let G be a nonempty set and ∗ be function from G×G to G satisfying
(a) There exists an e ∈ G such that for all g ∈ G, e ∗ g = g ∗ e = g.
(b) For all g ∈ G, there exists a g′ ∈ G such that g ∗ g′ = g′ ∗ g = e.
(c) For all g, h, k ∈ G, g ∗ (h ∗ k) = (g ∗ h) ∗ k.
The set G together with the binary operation ∗ is called a group.
If H is a nonempty subset of G and (H, ∗) is a group, we say that H is a
subgroup of G and the notation is H ≤ G. One can use a simple criterion to
check if H were a subgroup of G. This is given by H ≤ G if and only if for all
h, k ∈ H , hk−1 ∈ H .
One of the simplest example of a group is (Z,+). The identity is 0 and the
inverse of n is −n. This is a cyclic group (that is, a group that is generated by one
element, namely 1). Using Division Algorithm, we can show that any subgroup
of a cyclic group is cyclic. Now, if we define
T = {au+ bv|u, v ∈ Z},
then we can easily check that T is a subgroup of Z since au+ bv− (au′ + bv′) =
a(u − u′) + b(v − v′) ∈ T. Since (Z,+) is cyclic, we conclude that T is cyclic
![Page 24: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/24.jpg)
18 Divisibility
and hence T = dZ, that is, T is generated by one element d. We may choose d
to be positive. Now, note that both a and b are in T . Hence d|a and d|b. Sinced ∈ T , d = au + bv for some integers u, v. Suppose c|a and c|b. Then from the
representation of d, we conclude that c|d. Hence d is the greatest common divisor
of a and b.
![Page 25: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/25.jpg)
2 More on Congruences
2.1 Fermat’s Theorem and Euler’s Theorem
definition 2.1 Let m be a positive integer. A set
S = {x0, x1, · · · , xm−1|xi ∈ Z}
is called a complete residue system if xi 6≡ xj (mod m) whenever i 6= j.
Let C = {0, 1, · · · ,m− 1}. Then C is a complete residue system modulo m. If
we define
S = {0 ≤ r ≤ m− 1|r ≡ s (mod m) for some s ∈ S},
then we find that S = C. To see this, define a function ψ(s) = s′ where s′ is
between 0 and m − 1 and s ≡ s′ (mod m). The condition on complete residue
system for s ∈ S shows that ψ is an injection and hence, a bijection since
|S| = |C|.The complete residue system modulo m is not unique. For example, when
n = 5, the set {0, 1, 2, 3, 4} is a set of complete residues modulo 5. The set
{5, 11, 22, 33, 14} is also a set of complete residues modulo 5.
Note that the set {1, 11, 12, 13, 14} is not complete since 1 ≡ 11 (mod 5).
We now use the concept of residues modulo p to prove a famous Theorem of
Fermat.
theorem 2.2 (Fermat’s Little Theorem) If p is a prime and p - a then
ap−1 ≡ 1 (mod p),.
Remark 2.3 Fermat’s Little Theorem is sometimes written as: For a fixed prime
p,
ap ≡ a (mod p)
for all integer a. Note that when written in this form, the condition p - a can be
removed since ap ≡ a ≡ 0 (mod p) when p|a.
Proof
![Page 26: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/26.jpg)
20 More on Congruences
Let
S = {0, 1, 2, · · · , p− 1}.Suppose p - a. Let T = {a · 0, a · 1, · · · , a · (p − 1)}. Note that no two elements
in T are congruent to each other modulo p. For if ka ≡ ja (mod p), then k ≡ j
(mod p), since (a, p) = 1 and cancelation law applies. Hence, T is a complete
residue system and
T = S.
where
T = {0 ≤ r ≤ m− 1|r ≡ s (mod m) for some s ∈ S}.
Hence, the product of all elements in T must be congruent to the product of
all elements in S. In other words, if we multiply all the elements in T − {0},the product must be congruent to the product of all the elements in S − {0}.Therefore,
ap−1(p− 1)! ≡ (p− 1)! (mod p).
Since p - (p − 1)!, (p, (p − 1)!) = 1 and cancelation law holds again. Therefore,
ap−1 ≡ 1 (mod p).
Fermat’s Little Theorem can be a labor-saving device in certain calculations.
example 2.1
Prove that 18351910 + 19862061 ≡ 0 (mod 7).
Solutions. We have 1835 ≡ 1 (mod 7) and 1986 ≡ 5 (mod 7). Therefore
18351910 + 19862061 ≡ 1 + 52061 (mod 7).
Now, 2061 ≡ 3 (mod 6) and therefore, 2061 = 6q+3, for some integer q. Hence,
52061 ≡ 56q+3 ≡ (56)q · 53 ≡ 53 (mod 7)
since 56 ≡ 1 (mod 7), by Theorem 1.1. The congruence thus reduces to
1 + 52061 ≡ 1 + 53 ≡ 1 + 6 ≡ 0 (mod 7).
Fermat’s Little Theorem may also be used as a tool for primality testing. For
example, for a given n if we can find an a such that (a, n) = 1 and an−1 6≡ 1
(mod n) then we conclude immediately that a is not relatively prime to n. This
implies that n is not a prime. The following example illustrate this point.
example 2.2
Show that 91 is not a prime.
Solutions. We have
290 ≡ 210·9 ≡ 10249 ≡ 239 ≡ 643 ≡ 64 (mod 91).
Since 290 6≡ 1 (mod 91), 91 is composite.
![Page 27: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/27.jpg)
2.2 An extension of Fermat’s Little Theorem: The Euler Theorem 21
Of course this test is very inefficient for several reasons. Firstly, one might
need to select a number a several times before arriving at a conclusion that n
is composite. Secondly, computing powers of a number is very time consuming.
Thirdly, it may be inconclusive even if an−1 ≡ 1 (mod n) for all a < n as the
following example shows.
example 2.3
Show that a561 ≡ a (mod 561) for a < 561.
Solutions. If a is prime to 561, then a2 ≡ 1 (mod 3), a10 ≡ 1 (mod 11) and
a16 ≡ 1 (mod 17). Now,
a560 ≡ (a2)280 ≡ 1 (mod 3)
a560 ≡ (a10)56 ≡ 1 (mod 11)
a560 ≡ (a16)35 ≡ 1 (mod 17)
Therefore a560 ≡ 1 (mod 561), or a561 ≡ a (mod 561). If a is not relatively to
561, the above still holds. For if N = 3 · 11 · 17 and (a, 561) = 561, then
a561 ≡ a ≡ 0 (mod 561).
If N = 3, 11, 17, 3 ·11, 3 ·17, or 11 ·17 and (a, 561) = N , then since N is square
free, (a, 561/N) = 1 and hence, using Fermat’s Little Theorem, we have
a561 ≡ a (mod 561/N).
Since N |a, we also have
a561 ≡ a ≡ 0 (mod N).
Hence,
a561 ≡ a (mod 561).
A number which satisfies the property that an ≡ a (mod n) for all a is called
a Carmichael number. It has been a conjecture for nearly 90 years that there are
infinitely many Carmichael numbers. This result has only recently been proved
by W.R. Alford, A. Granville, C. Pomerance (see Ann. of Math. 140 (1994),
703-722).
2.2 An extension of Fermat’s Little Theorem: The Euler Theorem
Fermat’s Little Theorem is false when n is composite. For example, 55 ≡ 5
(mod 6), i.e., 56 6≡ 5 (mod 6). In this Section we will learn an extension of
Fermat’s Little Theorem. But first let us define a very important function.
![Page 28: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/28.jpg)
22 More on Congruences
definition 2.4 The function ϕ(n) is defined to be the number of elements
1 ≤ x < n which are relatively prime to n. The function ϕ(n) is usually called
Euler’s ϕ function.
example 2.4
There are altogether 2 numbers less than 6 which are relatively prime to 6. These
are 1 and 5. Therefore, ϕ(6) = 2. For n = 12, ϕ(12) = 4 and these four numbers
are 1, 5, 7, 11.
example 2.5
Note that when p is prime, all the numbers 1 ≤ x < p are relatively prime to p
and so,
ϕ(p) = p− 1. (2.1)
For example, ϕ(7) = 6 and the numbers that are relatively prime to 7 are
1, 2, · · · , 6.
We are now ready to state Euler’s Theorem:
theorem 2.5 (Euler’s Theorem) Suppose (a, n) = 1. Then
aϕ(n) ≡ 1 (mod n).
Note that when n is prime, by (2.1), ϕ(p) = p− 1. So Theorem 2.5 reduces to
Fermat’s Little Theorem, since
aϕ(p) ≡ ap−1 ≡ 1 (mod p).
Proof
The idea of the proof is the same as that of Fermat’s Little Theorem. Consider
the set
S := {a1, a2, · · · , aϕ(n)},
where S contains all the positive integers x < n which are relatively prime to n.
(Note by definition, there are ϕ(n) of them). Suppose a is relatively prime to n.
We define
T = {a · a1, a · a2, · · · , a · aϕ(n).}.
Once again, no two elements in T are congruent modulo n. For if
a · ai ≡ a · aj (mod n),
then
ai ≡ aj (mod n)
since (a, n) = 1. Hence each element in T is congruent modulo n to exactly one
![Page 29: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/29.jpg)
2.3 Finite abelian groups and Euler’s Theorem 23
element in S. Now, taking the product of all the elements of T , we conclude
that
aϕ(n)
ϕ(n)∏
k=1
ak ≡ϕ(n)∏
k=1
ak (mod n).
Once again since (ai, n) = 1 we have
ϕ(n)∏
k=1
ak, n
= 1
and applying cancelation law, we obtain
aϕ(n) ≡ 1 (mod n).
example 2.6
Find the remainder of 44444444 when it is divided by 9.
Solutions. Observe that 4444 ≡ −2 (mod 9) and ϕ(9) = 6 (the positive integers
< 9 and relatively prime to 9 are 1, 2, 4, 5, 7, 8). Since a6 ≡ 1 (mod 9) when
(a, 9) = 1 by Theorem 2.5. It is natural to write the exponent in form 6q + r.
Now, 4444 = 6q + 4 and so,
44444444 ≡ (−2)6q+4 ≡ ((−2)6)q · (−2)4 ≡ (−2)4 ≡ 7 (mod 9).
Hence, the remainder is 7.
2.3 Finite abelian groups and Euler’s Theorem
The division algorithm states that if n is a fixed positive integer then any integer
N can be written in the form qn+ r where 0 ≤ r < n. In other words, the set Z
can be partitioned into n distinct sets, denoted by [r]n, 0 ≤ r < n, where
[r]n := {k ∈ Z|k ≡ r (mod n)}.
Let
Z/nZ = {[r]n|0 ≤ r < n}
and define the binary operation
[a]n+[b]n = [a+ b]n.
Then the set Z/nZ forms a group under the operation +. This group is “similar”
to the additive group (Z,+). Of course, the set Z together with the binary
operation · is not a group. Can we construct a group for Z/nZ with operation
the analogue of ·? The answer is yes but with a “twist”.
![Page 30: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/30.jpg)
24 More on Congruences
We let
(Z/nZ)∗= {[r]n|(r, n) = 1, 0 ≤ r < n}.
Define
[a]n·[b]n = [a · b]n.
Then the set (Z/nZ)∗together with · is a multiplicative group.
The Lagrange Theorem states that for any finite group G, the order of any
subgroup H of G divides the order of the group G. To see this, we first construct
G/H := {gH |g ∈ G}
where the coset gH is defined by
gH = {gh|h ∈ H}.
Note first that if G = H , then there is one element in G/H . If G 6= H and
gH, g′H ∈ G/H , we note that
gH ∩ g′H = φ.
For otherwise, if x ∈ gH∩g′H , then x = gh = g′h′, or g−1g′ ∈ H and gH ⊂ g′H .
Similarly, g′H ⊂ gH and we have gH = g′H. Since gH and g′H are disjoint for
distinct cosets, we conclude that there are finitely many distinct sets gH in G/H .
Furthermore, each g ∈ G must be in one of these cosets. This implies that
G = ∪gH.
Using the map
ϕ : H → gH,
we find that
|H | = |gH |.
Therefore,
|G| =k∑
j=1
|gH | = k|H |.
Hence, |H | | |G| and this is Lagrange’s Theorem for subgroups of finite groups.
Now, one can see that for any element g ∈ G, the set
H = {gm|m ∈ Z}
forms a finite group of order d, where d is the least positive integer such that
gd = 1. Since H is a subgroup of G with |H | = d, by Lagrange’s Theorem,
d | |G|, or |G| = d`. Now, gd = 1, where 1 is the identity of the group G. Hence
g|G| = (gd)` = 1. We have thus shown that
g|G| = 1
![Page 31: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/31.jpg)
2.4 The Chinese Remainder Theorem 25
for all g ∈ G. Applying this to our group (Z/nZ)∗, we find that for any integer
a such that (a, n) = 1,
aϕ(n) ≡ 1 (mod n),
since ϕ(n) is the number of elements of (Z/nZ)∗.
Remark 2.6 Let G = (Z/nZ)∗. The elementary proof of Euler’s Theorem
follows from the fact that the multiplication of a fixed element g of G induces a
permutation of the elements in G. In other words, if we define a map
ξ : G→ G
by ξ(g1) = g · g1, then ξ is a bijection of G.
Remark 2.7 Euler’s Theorem can be used to solve Linear congruences such as
ax ≡ c (mod m)
when (a,m) = 1. One simply multiplies aϕ(m)−1 to both sides of the congruence
to obtain
aϕ(m)x ≡ aϕ(m)−1c (mod m).
2.4 The Chinese Remainder Theorem and solving linearcongruence equations
theorem 2.8 Let m and n be such that (m,n) = 1. Let
A(t) = Z/tZ.
Then the map
ξ : A(mn) → A(m)×A(n)
defined by
ξ([k]mn) = ([k]m, [k]n)
is a bijection.
Proof
First, note that ξ is a well defined map. For, if [j]mn = [k]mn then ξ([j]mn) =
([j]m, [j]n). But j ≡ k (mod mn) implies that j ≡ k (mod m) and j ≡ k
(mod n). In other words, ξ([j]mn) = ([k]m, [k]n) and the map is well defined.
Note that the number of elements in A(mn) and A(m) × A(n) are equal. It
suffices to show that ξ is an injection. Suppose
([`]m, [`]n) = ([k]m, [k]n).
Then m|(` − k) and n|(` − k). This implies, by Example 1.21 that mn|(` − k)
since (m,n) = 1. Hence, [`]mn = [k]mn and the map ξ is injective and hence the
map ξ is a bijection.
![Page 32: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/32.jpg)
26 More on Congruences
Theorem 2.8 is known as the Chinese Remainder Theorem. It says that given
integers m and n such that (m,n) = 1, the simultaneous congruence equations
x ≡ h (mod m) and x ≡ k (mod n) (2.2)
can be solved. In other words, there exist an integer x satisfying (2.2).
The Chinese Remainder Theorem can be extended to more than two positive
integersm and n. More precisely, we can show that ifm1,m2, · · · , ms are pairwise
relatively prime, then the system of equations
x ≡ a1 (mod m1)
x ≡ a2 (mod m2)
...
x ≡ as (mod ms)
is solvable.
The usual way to prove the Chinese Remainder Theorem is by constructing
an x that is the solution of
x ≡ a (mod m) and x ≡ b (mod n).
To do so, one sets
x = nta+msb,
where ms ≡ 1 (mod n) and nt ≡ 1 (mod m). Note that with this choice of x,
x ≡ nta+msb ≡ a (mod m)
and
x ≡ b (mod n).
The only difficulty now is to determine the value s and t that satisfy
ms ≡ 1 (mod n) and nt ≡ 1 (mod m).
This amounts to solving linear congruence of the form
aX ≡ 1 (mod k). (2.3)
Note that this linear congruence is solvable only if (a, k) = 1. In this case, there
exists u and v such that
1 = au+ kv.
Hence X = u is a solution to the linear congruence (2.3).
example 2.7
Using the second remark above, find the least positive integer x such that x ≡ 5
(mod 7), x ≡ 7 (mod 11) and x ≡ 3 (mod 13).
![Page 33: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/33.jpg)
2.5 Multiplicative functions and even perfect numbers 27
2.5 Multiplicative functions and even perfect numbers
definition 2.9 An arithmetical function f is a function from N to C.
definition 2.10 An arithmetical function f is said to be multiplicative if
f(1) = 1 and when (m,n) = 1,
f(mn) = f(m)f(n).
definition 2.11 An arithmetical function f is said to be completely multi-
plicative if f(1) = 1 and for all positive integers m and n,
f(mn) = f(m)f(n).
example 2.8
The functions N(n) = n and u(n) = 1 are completely multiplicative.
definition 2.12 The Mobius function µ(n) defined by µ(1) = 1 and for
n =∏m
k=1 pαk
k ,
µ(n) =
{(−1)m if αi = 1, 1 ≤ i ≤ m
0 otherwise..
Note that µ(n) is a multiplicative function.
definition 2.13 Let σ(1) = 1 and σ(n) be the sum of all divisors of n. We
write
σ(n) =∑
d|n
d.
example 2.9
Show that σ(n) is multiplicative and deduce that an even integer N is perfect if
and only if N = 2k−1(2k − 1) where 2k − 1 is prime.
Solutions. Suppose k, ` are relatively prime. Then each divisor d of k` can be
expressed in the form d1 and d2 where d1|k and d2|`. We thus write
σ(k`) =∑
d|k`
d =∑
d1d2|k`
d1d2
=∑
d1|k
d1∑
d2|`
d2 = σ(k)σ(`).
Next, we observe N is a perfect number if and only if
σ(N) = 2N.
Now, if N = 2k−1(2k − 1) with 2k − 1 a prime, then the divisors of N are
2j, 0 ≤ j ≤ k − 1 and 2j(2k − 1), 1 ≤ j ≤ k − 1. The sum of divisors is
2k − 1 + (2k − 1)(2k − 1) = 2k(2k − 1) = 2N.
![Page 34: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/34.jpg)
28 More on Congruences
Hence, N is perfect.
Conversely, if N is even and perfect. Write N = 2k−1m, k ≥ 2 and m odd.
Since σ(n) is multiplicative and (2k−1,m) = 1, we conclude that
σ(N) = σ(2k−1)σ(m) =(1 + 2 + · · ·+ 2k−1
)σ(m) = (2k − 1)σ(m). (2.4)
But N is perfect and this implies that
σ(N) = 2N = 2km. (2.5)
From (2.4) and (2.5), we deduce that
(2k − 1)σ(m) = 2km.
Since (2k − 1, 2k) = 1, by Euclid’s Lemma, we deduce that
(2k − 1)|m. (2.6)
By (2.6), we may write
m = (2k − 1)s, (2.7)
with s ≥ 1. With this expression for m, we find using (2.4) and (2.5) that
σ(m) = 2ks. (2.8)
If s > 1 then 1, s and (2k − 1)s are all divisors of m. Hence,
σ(m) ≥ 1 + s+ (2k − 1)s > s+ (2k − 1)s = 2ks.
This contradicts (2.8). Therefore s = 1, m = 2k − 1 and σ(m) = 2k. But this
means that 1 and 2k − 1 are the only divisors of m and hence m = 2k − 1 must
be a prime.
Remark 2.14 The method we used to prove that σ(n) is multiplicative can be
modified to show that ∑
d|n
f(d)g(nd
)
is multiplicative if f and g are multiplicative. The expression∑
d|n
f(d)g(nd
),
denoted by (f ∗ g)(n) is called the Dirichlet product of f and g.
2.6 The Euler ϕ-function is multiplicative
We show next that ϕ(n) is also a multiplicative function.
theorem 2.15 For positive integers m and n such that (m,n) = 1,
ϕ(mn) = ϕ(m)ϕ(n).
![Page 35: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/35.jpg)
2.6 The Euler ϕ-function is multiplicative 29
We will prove a fact.
Let
M(t) := (Z/tZ)∗.
Note that |M(t)| = ϕ(t). Consider the map
ψ :M(mn) →M(m)×M(n)
defined by
ψ([k]mn) = ([k]m, [k]n).
Note that ψ = ξ∣∣M(mn)
. From the proof of Theorem 2.8, we see that given
([u]m, [v]n) ∈M(m)×M(n), there exists [h]mn ∈ A(mn) such that
ξ([h]mn) = ([u]m, [v]n).
We need to show that [h]mn ∈M(mn). This follows from the fact that if [h]m ∈M(m) and [h]n ∈ M(n), then [h]mn ∈ M(mn). In other words, if (h,m) = 1
and (h, n) = 1 then (h,mn) = 1. This proves that the map is surjective. Now
the map ψ is injective because ξ is injective. We therefore conclude that ψ is a
bijection and
ϕ(mn) = ϕ(m)ϕ(n).
corollary 2.16 Let n =∏
p pαp . Then
ϕ(n) =∏
p
ϕ(pαp).
We end this section with another proof that ϕ(n) is multiplicative. We begin
with the following theorem:
theorem 2.17 Let µ(n) be the Mobius function. Then
∑
d|n
µ(d) =
{1 if n = 1,
0 otherwise.(2.9)
Proof
First, we note that the left hand side (2.9) is µ ∗ u where u(n) = 1 for all
positive integers n. From Remark 2.14, we know that µ ∗ u is multiplicative.
Hence µ ∗ u(1) = 1. To determine µ ∗ u(n) for n > 1, it suffices to determine the
values of µ ∗ u at prime powers. Now,∑
d|pα
µ(d) = 1− 1 = 0,
if α ≥ 1. Hence µ ∗ u(n) = 0 for n > 1. Note that the right hand side of (2.9)
can be written as [1/n] where, [x] is the integer part of x.
![Page 36: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/36.jpg)
30 More on Congruences
We now give another proof that ϕ(n) is multiplicative. Note that
ϕ(n) =n∑
k=1(k,n)=1
1 =n∑
k=1
[1
(k, n)
]
=n∑
k=1
∑
`|(k,n)
µ(`) =∑
`|n
µ(`)n∑
k=1`|k
1
=∑
`|n
µ(`)n
`.
Hence ϕ(n) is the Dirichlet product of the multiplicative functions µ(n) and
N(n) and by Remark 2.14, ϕ(n) is multiplicative.
2.7 An Application to Cryptography
The purpose of studying Cryptography is to protect information transmitted
through public communication networks. In the language of cryptography, the
information to be concealed is called plaintext. The message in secret form is
called ciphertext. The purpose of converting plaintext to ciphertext is said to be
encrypting while the reverse is called decrypting.
In 1977, R. Rivest, A. Shamir and L. Adleman proposed a public key cryptosys-
tem, which uses only elementary ideas from Number Theory. The enciphering
system is now called RSA. Its security depends on the assumption that in the
current state of computer technology, the factorization of composite numbers
with large prime factors is prohibitively time-consuming.
Each user of RSA system chooses a pair of distinct primes p, q large enough
so that the factorization of n = pq (enciphering modulus) is beyond the current
computational capabilities. For example one may select p and q with 200 digits
and n would have 400 digits. Having selected n, the user then chooses a random
positive integer k such that (k, ϕ(n)) = 1. The pair (n, k) is placed in a public
file, as user’s personal encryption key. This will allow anyone else to encrypt and
send a message to that individual. Note that while n is openly revealed the listed
public key does not mention the factors p and q of n.
The encryption begins with the conversion of the message into digital alpha-
bets. For example,
A = 01, B = 02, · · ·Z = 26, Space = 00.
Under this scheme, the message ”The brown fox is quick” is transformed into
M = 20080500021815231400061524000919001721090311.
We assume that M < n. If M > n we can break M into blocks of digits
![Page 37: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/37.jpg)
2.7 An Application to Cryptography 31
M1,M2, · · · ,Mr, Mi < n. To encrypt, raised M to kth power (mod n), i.e.,
Mk ≡ r (mod n)
and send r.
To decrypt, first recall that k is chosen such that (k, ϕ(n)) = 1. Therefore
there exist j such that
kj ≡ 1 (mod ϕ(n)).
This implies that
rj ≡M jk ≡Mϕ(n)·m+1 (mod n).
If M is coprime to n, then
Mϕ(n) ≡ 1 (mod n).
Therefore
rj ≡M (mod n),
and we retrieve the message. Note that the number j, called the recovery ex-
ponent, can only be calculated by someone who knows ϕ(n) = (p − 1)(q − 1).
Therefore, j is secure from an illegitimate third party whose knowledge is limited
to the public key (n, k).
We have assumed above that M is relatively prime to n. Suppose M is not
relatively prime to n = pq. Then (M,pq) = p, q, or pq. If the is pq then since
M < pq, M = 0. Suppose without loss of generality that (M,pq) = p. Then
kj ≡ 1 (mod ϕ(n))
implies that
kj ≡ 1 (mod q − 1),
rj ≡Mkj ≡M (mod q)
and
0 ≡Mkj ≡M (mod p).
Hence,
rj ≡M (mod pq)
and we retrieve the message.
example 2.10
Let p = 29, q = 53, pq = 1537. Note that
ϕ(n) = 1456.
![Page 38: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/38.jpg)
32 More on Congruences
Take k = 47. Then j = 31 since 31 ·47 ≡ 1 (mod 1456). To encrypt the message
”No Way”, we have
M = 141500230125.
We divide M into blocks of 3:
141 500 230 125.
Raised to the power of 47 modulo 1537, we have
0658 1408 1250 1252.
This is the ciphertext. Note that to retrieve we raise to the power of 31:
65831 ≡ 141 (mod 1537).
2.8 Appendix : Group Action and Fermat’s Little Theorem
definition 2.18 If X is a set and G is a group, then G acts on X if there
exists a function α : G×X → X , called an action such that
(i) for g, h ∈ G, then αg ◦ αh = αgh.
(ii) α1 = 1SXthe identity function.
From now on, we will write the action of g on x as g · x instead of αg(x). The
above conditions are then replaced by gh · x = g · (h · x) and 1G · x = x.
definition 2.19 If G acts on a set X , then the orbit of x (x ∈ X), denoted
by O(x), is the subset of X given by
O(x) = {g · x}.
The stabilizer of x, denoted by Gx, is the subgroup of G given by
Gx = {g ∈ G|g · x = x}.
We recall the following Theorems:
theorem 2.20 If G acts on X, then X is a disjoint union of the orbits. If X
is finite, then |X | =∑i |Oxi|, where one xi is chosen from each orbits.
Proof
This follows from the fact that if Ox ∩ Oy 6= φ, then Ox = Oy. Let zOx ∩ Oy.
Then z = g ·x = g′ ·y. This implies that x = g−1g′ ·y and so Ox ⊂ Oy. Similarly,
Oy ⊂ Ox and hence, Ox = Oy.
![Page 39: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/39.jpg)
2.8 Appendix : Group Action and Fermat’s Little Theorem 33
theorem 2.21 If G acts on a set X and x ∈ X, then
|Ox| = |G : Gx|.
Proof
This follows by defining the map
ξ : G/Gx → Ox
with
ξ(gGx) = g · x.
One must check that this map is well defined and bijective. We leave this as
an exercise.
lemma 2.22 (i) Let a group G act on a set X . If x ∈ X and σ ∈ G then
Gσx = σGxσ−1.
(ii) If a finite group G acts on a finite set X and if x and y lie in the same
orbit, then |Gy| = |Gx|.
Proof
We again leave the above as exercises.
theorem 2.23 Let G act on a finite set X. If N is the number of orbits, then
N =1
|G|∑
g∈G
|Fix(g)|,
where Fix(g) is the set of elements x ∈ X fixed by σ.
Proof
To prove the above, we consider∑
g∈G,x∈Xg·x=x
1 =∑
g∈G
∑
x∈Xg·x=x
1
=∑
g∈G
|Fix(g)|.
On the other hand,∑
g∈G,x∈Xg·x=x
1 =∑
x∈X
∑
g∈Gg·x=x
1
=∑
x∈X
|Gx|
=
m∑
j=1
∑
x∈O(j)
|Gx|,
![Page 40: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/40.jpg)
34 More on Congruences
where O(j), 1 ≤ j ≤ m are the distinct orbits. Note that O(j) = Oxjfor some
xj ∈ X . Since |Gx| = |Gy| for any x, y ∈ O(j), we conclude that∑
x∈O(j)
|Gx| = |G|/|O(j)| · |O(j)| = |G|.
Therefore,
m =1
|G| =∑
g∈G
|Fix(g)|.
We are now ready to give another proof of Fermat’s Little Theorem. Consider
a p-gon where p is a prime. Suppose we want to color the vertices of the p-gon
using n colors. How many ways can we do this?
Let X contains the p-tuples (a1, a2, · · · , ap) where ai takes one of the n colors.
Let G be the cyclic group generated by σ := (1, 2, · · · , p). Let G acts on X by
σj · (a1, a2, · · · , ap) = (aσj(1), aσj2, · · · , aσjp).
Note that under this action, the orbit containing (a1, a2, · · · , ap) are
{(ai+1, ai+2, · · · , ap, a1, · · · , ai), 0 ≤ i ≤ p− 1}.
This means that if a colored polygon can be obtained from another colored
polygon via rotation, we only count the polygon once. As such, the number of
orbits is the number of polygons with distinct colorings. Hence the number of
orbits is1
p
(|Fix((1)(2) · · · (p))| + |Fix(σ)|+ · · ·+ |Fix(σp−1)|
).
Each σi, i 6= 0, is a p-cycle and so the total number of x fixed by σi is n. The
number of x fixed by (1)(2) · · · (p) is np. Since the number of orbits is an integer,
we find that
np ≡ n (mod p),
which is Fermat’s Little Theorem.
![Page 41: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/41.jpg)
3 Solutions of congruences
3.1 The Chinese Remainder Theorem revisited
Let f(x) be a polynomial with coefficients in Z. A congruence equation is of the
form
f(x) ≡ 0 (mod m),
where m is a positive integer. In this Chapter, we will study such equations.
theorem 3.1 Let f(x) be a fixed polynomial with integral coefficients, and for
any positive integerm letN(m) denote the number of solutions of the congruence
f(x) ≡ 0 (mod m).
If m = m1m2 where (m1,m2) = 1, then N(m) = N(m1)N(m2). If m =∏
p pαp
is the canonical factorization of m, then
N(m) =∏
p
N(pαp).
Proof
Let m be a positive integer. Suppose (m1,m2) = 1 and that there are N(m1)
solutions, say {a1, · · · , aN(m1)}, to
f(x) ≡ 0 (mod m1)
and N(m2) solutions, say {b1, · · · , bN(m2)}, to the equation
f(x) ≡ 0 (mod m2).
To each pair of solutions (ai, bj) there exists an integer cij modulo m such that
f(cij) ≡ 0 (mod m1)
and
f(cij) ≡ 0 (mod m2).
The existence of cij is guaranteed by the Chinese Remainder Theorem. This
implies that
f(cij) ≡ 0 (mod m)
![Page 42: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/42.jpg)
36 Solutions of congruences
and therefore
N(m1)N(m2) ≤ N(m).
Next, if
f(x) ≡ 0 (mod m)
and m = m1m2, (m1,m2) = 1, then by Chinese Remainder Theorem, there are
N(m) pairs (a mod m1, b (mod m2)) such that
f(a) ≡ 0 (mod m1)
and
f(b) ≡ 0 (mod m2).
This implies that
N(m) ≤ N(m1)N(m2).
Hence,
N(m1m2) = N(m1)N(m2).
example 3.1
Let f(x) = x2 + x+ 3. Find all roots of the congruence f(x) ≡ 0 (mod 15).
Solutions. The solutions for f(x) ≡ 0 (mod 15) are 3, 6, 8, 11. The solutions for
f(x) ≡ 0 (mod 3) are 0 and 2 and for f(x) ≡ 0 (mod 5) are 1 and 3. Note that
N(15) = N(3)N(5).
3.2 Prime power moduli
Theorem 3.1 shows that in order to solve
f(x) ≡ 0 (mod m),
it suffices to solve
f(x) ≡ 0 (mod pα) (3.1)
for pα‖m where p is a prime.
We now show that in order to solve (3.1), if suffices to find the solutions of
f(x) ≡ 0 (mod p).
By Taylor’s series expansion,
f(a+ tpj) = f(a) + tpjf ′(a) + t2p2jf ′′(a)/2 + · · ·+ tnpnjf (n)(a)/n!, (3.2)
![Page 43: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/43.jpg)
3.2 Prime power moduli 37
where n is the degree of f(x). Observe that if
f(x) =
n∑
r=0
crxr,
then
f (k)(x)
k!=
n∑
r=k
cr
(r
k
)xr−k.
Therefore,
f (k)(a)
k!
is an integer for 0 ≤ k ≤ n. We now suppose that
f(a) ≡ 0 (mod pj).
Since
tkpkjf (k)(a)
k!≡ 0 (mod pj+1)
for k > 1 (2j > j for j ≥ 1), we conclude from (3.2) that
f(a+ tpj) ≡ f(a) + tpjf ′(a) (mod pj+1). (3.3)
We now return to (3.3). We see that we need to choose
tf ′(a) ≡ −f(a)pj
(mod p). (3.4)
We now split our investigation into cases:
Case 1. p - f ′(a).
In this case, there is a unique solution t for (3.4) and hence we obtain
theorem 3.2 (Hensel’s Lemma) Suppose that f(x) is a polynomial with in-
tegral coefficient. If f(a) ≡ 0 (mod pj) and f ′(a) 6≡ 0 (mod p), then there is a
unique t (mod p) such that f(a+ tpj) ≡ 0 (mod pj+1).
Case 2.
If p|f ′(a) but p -f(a)
pjthen (3.4) is not solvable.
Case 3.
If p|f ′(a) and p|f(a)pj
then there are p solutions for t in (3.4).
![Page 44: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/44.jpg)
38 Solutions of congruences
Remark 3.3 Note that if u is solution of
f(x) ≡ 0 (mod pj+1), (3.5)
then by the fact that we can always write u = a + tpj for some a and t where
0 ≤ a < pj and 0 ≤ t ≤ p, we conclude that all solutions of (3.5) are of the form
a+ tpj, with
f(a) ≡ 0 (mod pj), (3.6)
and so, we have exhausted all possible solutions to (3.5) by considering solutions
of (3.6).
example 3.2
Solve x3 − 2x2 + 3x+ 9 ≡ 0 (mod 33).
Solutions. The solutions to the congruence
f(x) ≡ 0 (mod 3)
are x = 0 and 2.
Now,
f ′(x) = 3x2 − 4x+ 3.
Since f ′(2) 6≡ 0 (mod 3), we see that there is a unique solution arising for x = 2.
We need to solve
7t ≡ −15/3 (mod 3)
and it turns out that t = 1. The solution for the congruence f(x) ≡ 0 (mod p)
arising from x = 2 is therefore 2 + 3 = 5.
Now
f ′(0) = 3 and f(0) = 9.
Hence there are three solutions for
f(x) ≡ 0 (mod 9)
corresponding to the solution x = 0 from
f(x) ≡ 0 (mod 3).
These are x = 0, 3 and 6.
We now have four solutions for
f(x) ≡ 0 (mod 32).
We check that f(0) = 9 and f ′(0) = 3 and so
f(x) ≡ 0 (mod 27)
has no solution arising from x = 0.
Next, f(3) = 27 and f ′(3) = 18 and so, there are three solutions arising from
x = 3. These are x = 3, 12 and 21.
![Page 45: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/45.jpg)
3.3 Prime moduli 39
For x = 6 we have f(6) = 171 and f ′(6) = 81. But 3 - (171/9) and hence,
there are no solutions arising from x = 6.
For x = 5 we find that f(5) = 99 and f ′(5) = 58 and we need to solve the
congruence
58t ≡ 11 (mod 3).
The solution is t = 1 and so 14 is the solution for the congruence.
In conclusion the solutions to the congruence
f(x) ≡ 0 (mod 27)
are 3,12,14 and 21.
3.3 Prime moduli
In the previous section, we have seen that we can reduce the problem of finding
solutions for f(x) ≡ 0 (mod pα) to finding solutions for f(x) ≡ 0 (mod p).
We will first prove a result for polynomials over a field F, that is an analogue
for the Division Algorithm for integers.
theorem 3.4 Given the polynomials f(x), g(x) ∈ F[x], where deg g(x) > 0,
there exist polynomials q(x), r(x) ∈ F[x] such that
f(x) = g(x)q(x) + r(x),
with either r(x) = 0 or 0 ≤ deg r(x) < deg g(x).
Proof
We proceed by induction on deg f(x). First fix the polynomial g(x). Suppose
f(x) = 0 or deg f(x) < deg g(x), then f(x) = 0 · g(x)+ f(x). So we may assume
that deg f(x) ≥ deg g(x).
Suppose the statement is true for any polynomials with degrees less than or
equal to n− 1. Let f(x) be a polynomial of degree n. Write
f(x) = a0 + a1x+ · · ·+ anxn,
and
g(x) = b0 + b1x+ · · ·+ bmxm,
with an 6= 0 and bm 6= 0 and n ≥ m. Since F is a field, b−1m exists and the
polynomial
P (x) := f(x)− b−1m anx
n−mg(x) (∗)
has degree n− 1. Hence, there exist q(x) and r(x) such that
P (x) = q(x)g(x) + r(x),
![Page 46: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/46.jpg)
40 Solutions of congruences
with deg r(x) = 0 or deg r(x) < deg g(x). But by (*),
f(x) = (q(x) + b−1m anx
n−m)g(x) + r(x),
hence the result.
We now let F be the finite field Z/pZ. Let g(x) be the polynomial xp − x and
suppose the degree of f(x) is n ≥ p. By Theorem 3.4, we conclude that
f(x) = (xp − x)q(x) + r(x),
with q(x), r(x) ∈ (Z/pZ) [x] and 0 ≤ deg r(x) < p.
Suppose u is such that
f(u) ≡ 0 (mod p).
By Fermat’s Little Theorem, we find that up−u ≡ 0 (mod p). Therefore, r(u) ≡0 (mod p). Conversely, if r(u) ≡ 0 (mod p), then f(u) ≡ 0 (mod p). This shows
that to study f(x) ≡ 0 (mod p), it suffices to consider those polynomial f(x)
with degree of f(x) is less than p.
theorem 3.5 (Lagrange) Let f(x) be a polynomial of degree n < p and
an 6≡ 0 (mod p). Then the congruence
f(x) ≡ 0 (mod p) (3.7)
has at most n solutions.
Proof
We prove this by induction on the degree of f(x). If n = 0, a0 6≡ 0 (mod p)
implies that there are no solution to (3.7). If the degree of f(x) is 1, then we see
that we are solving
a1x+ a2 ≡ 0 (mod p).
This is solvable since a1 6≡ 0 (mod p). Therefore the number of solution is 1.
Suppose the result holds for all polynomials of degree less than n. Let f(x) be a
polynomial of degree n. If f(x) ≡ 0 (mod p) has no solution, then we are done.
Next, suppose u be a root of the nth degree polynomial f . Then by the division
algorithm for Q[x],
f(x) = (x− u)g(x) + r0
for some polynomial g(x) of degree n− 1. Hence,
r0 ≡ 0 (mod p)
since
f(u) ≡ 0 (mod p) and (u− u)g(u) ≡ 0 (mod p).
Therefore,
f(x) ≡ (x− u)g(x) (mod p).
![Page 47: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/47.jpg)
3.3 Prime moduli 41
If b is a root of f(x) ≡ 0 (mod p), then either p|(b − u) or p|g(b), by Euclid’s
lemma. If p|g(b), then g(b) ≡ 0 (mod p) and by induction there are at most n−1
possible values for b. Therefore, we conclude that f(x) can have at most n roots.
The above result is false if p is not a prime. For example the congruence
equation
x2 ≡ 1 (mod 8)
has four solutions x = 1, 3, 5, 7.
corollary 3.6 If d|(p− 1) then the congruence xd ≡ 1 (mod p) has exactly
d solutions.
Proof
By previous theorem, the congruence cannot have more than d solutions. For
the converse, let p− 1 = de. Note that
xp−1 − 1 = (xd − 1)(xd(e−1) + xd(e−2) + · · ·+ xd + 1).
By the previous theorem, the polynomial xd(e−1)+xd(e−2)+ · · ·+xd+1 cannot
have more than de − d or p − 1 − d solutions. Hence, if xd − 1 has less than d
solutions then xp−1 − 1 must have less than p − 1 − d + d = p − 1 solutions.
But there are exactly p− 1 solutions to the equation xp−1 − 1 by Fermat’s little
theorem. This contradiction shows that xd−1 must have exactly d solutions.
example 3.3
The congruence x10 ≡ 1 (mod 811) has ten solutions and these are
1, 212, 241, 311, 339, 472, 500, 570, 599 and 810.
Remark 3.7 In general, if d - (p−1), the number of solutions for the congruence
xd ≡ 1 (mod p)
is (d, p− 1). To see this, we need the fact that the multiplicative group of Z/pZ
is cyclic, say, generated by g. This fact will be proved in the next Chapter. Now,
if u is a solution to
ud ≡ 1 (mod p),
then u = gy for some integer y since g generates (Z/pZ)∗. Therefore,
gyd ≡ 1 (mod p).
But since the order of g is p− 1, we conclude that p− 1 divides yd. This implies
that
yd ≡ 0 (mod p− 1)
and the number of solutions to this congruence is (d, p− 1) by Theorem 1.40.
![Page 48: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/48.jpg)
42 Solutions of congruences
theorem 3.8 (Wilson) If p is prime then
(p− 1)! ≡ −1 (mod p).
Proof
If p = 2, then 1 ≡ −1 (mod 2). Let p be an odd prime. By Fermat’s little
theorem, we know that the polynomial
xp−1 − 1− (x− 1)(x− 2) · · · (x − (p− 1))
has at least p− 1 roots. Since the degree of the above polynomial is p− 2 and
such polynomial can have at most p− 2 roots, we conclude that the polynomial
must be identically 0 modulo p. Setting x = 0 we conclude that
(p− 1)! ≡ −1 (mod p)
since p is odd.
Remark 3.9 Wilson’s Theorem is false when the prime p is replaced by a com-
posite number m. Suppose m is composite. Let
m =∏
k
pαk
k .
If m is not a prime power, then
m− 1 = pα11 m′ − 1 > pα1
1
and so
(m− 1)! ≡ 0 (mod pα11 ).
If m is a prime power, then let m = pα. If α > 2, then m − 1 > pα−1 and p
and pα−1 are distinct divisors of (m− 1)! and hence
(m− 1)! ≡ 0 (mod pα).
It remains to show the result for α = 2, or when m = p2. In this case m− 1 =
p2−1 > p. The only point we have to worry is that m−1 < 2p. For if m−1 ≥ 2p
then both p and 2p are distinct divisors of (m− 1)! and hence,
(m− 1)! ≡ 0 (mod p2).
But if p2 − 1 < 2p we have p(p − 2) < 1 and this is only possible if p = 2. We
have thus prove that
(m− 1)! ≡ 0 (mod m)
if m > 4 and composite. For p = 2 and α = 2 we have 3! ≡ 2 6≡ −1 (mod 4).
Hence, we find that
(m− 1)! ≡ −1 (mod m)
if and only if m is a prime.
![Page 49: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/49.jpg)
3.4 Wolstenholme’s congruence 43
3.4 Wolstenholme’s congruence
Let
F (x) = (x− 1)(x− 2) · · · (x− (p− 1)).
Write
F (x) = xp−1 − σ1xp−2 + σ2x
p−3 − · · ·+ σp−3x2 − σp−2x+ σp−1,
where σj denotes the sum of all products of j distinct roots of F (x). In the
proof of Theorem 3.8, we have seen that the polynomial
f(x) = xp−1 − 1− (x− 1)(x− 2) · · · (x− (p− 1)) = xp−1 − 1− F (x)
is the zero polynomial in (Z/pZ)[x]. This means that in (Z/pZ)[x],
xp−1 − 1 = F (x) = (x − 1)(x− 2) · · · (x− (p− 1))
= xp−1 − σ1xp−2 + σ2x
p−3 − · · ·+ σp−3x2 − σp−2x+ σp−1.
Comparing the coefficients of xj on both sides of the above polynomials in
(Z/pZ)[x], we conclude that
σj ≡ 0 (mod p), 1 ≤ j ≤ p− 2 (3.8)
We now prove a result stronger that (3.8) when j = p− 2.
theorem 3.10 (Wolstenholme’s congruence) For prime p ≥ 5,
σp−2 ≡ 0 (mod p2).
Proof
Note that
F (p) = (p− 1)! = pp−1 − σ1pp−2 + · · · − σp−2p+ (p− 1)!.
Now,
pp−2 − σ1pp−3 + · · ·+ σp−3p− σp−2 = 0.
By (3.8),
σp−3 ≡ 0 (mod p)
and since p ≥ 5, we find that
σp−2 ≡ 0 (mod p2).
![Page 50: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/50.jpg)
4 Primitive roots
4.1 Order of an element in a group
Recall from the theory of finite groups that if G is a group and g ∈ G then the
smallest positive integer for which gh = 1 is called the order of g in G.
Let G = (Z/mZ)∗. Let [a]m ∈ G. We say that h is the order of [a]m modulo
m if h is the least positive integer such that
ah ≡ 1 (mod m).
theorem 4.1 If a has order h modulo m and ak ≡ 1 (mod m) for some
positive integer k, then h|k.
Proof
Write k = hq + r, r < h. If r 6= 0 then gr ≡ 1 (mod p) and r < h. This
contradicts the minimality of h and hence h|k.
corollary 4.2 If (a,m) = 1, then the order of a modulo m divides ϕ(m).
Proof
From Euler’s Theorem,
aϕ(m) ≡ 1 (mod m).
By the previous theorem, h|ϕ(m).
lemma 4.3 If a has order h modulo m, then ak has order h/(h, k) modulo m.
Proof
Let d = (h, k). Let r be the order of ak. Then ark ≡ 1 (mod m) and hence h|rk.Hence,
h
d
∣∣∣∣(r · k
d
).
Since (h/d, k/d) = 1, we conclude that (h/d)|r.Next,
a(h/d)k = (ak)(h/d) ≡ 1 (mod m).
![Page 51: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/51.jpg)
4.2 Integers m for which (Z/mZ)∗is cyclic 45
Hence, r|(h/d). This implies that
r = h/d = h/(h, k).
lemma 4.4 If a has order h modulo m, b has order k modulo m, and if (h, k) =
1, then ab has order hk modulo m.
Proof
Let the order of ab be r. Note that (ab)hk ≡ 1 (mod m). Hence r|hk. Next,(ab)r ≡ 1 (mod m). We have
ar ≡ b−r (mod m).
Raising both sides to the power of h, we conclude that
brh ≡ 1 (mod m).
Since k is the order of b, we conclude that k|rh. But (k, h) = 1 and hence k|r.In a similar way, we deduce that h|r. Finally, since (h, k) = 1, hk|r.
4.2 Integers m for which (Z/mZ)∗ is cyclic
definition 4.5 If g has order ϕ(m) in (Z/mZ)∗ then g is called a primitive
root modulo m.
Remark 4.6 When a primitive exists for (Z/mZ)∗, we see that (Z/mZ)∗ is a
cyclic group of order ϕ(m).
We first turn to the number of primitive roots in (Z/mZ)∗, assuming that
M(m) := (Z/mZ)∗ is cyclic.
Let g be a primitive root ofM(m). Then every element inM(m) is of the form
gk. By Lemma 4.3, the order of gk is ϕ(m)/(ϕ(m), k) and so gk is a primitive
root if and only if (ϕ(m), k) = 1. We have thus shown that
theorem 4.7 If M(m) is cyclic, then there are precisely ϕ(ϕ(m)) primitive
roots modulo m.
We will next show that M(m) is cyclic (or in another words, primitive roots
modulo m exists) if and only if m = 1, 2, 4, pα and 2pα, where p is an odd prime
and α ≥ 1.
We will first establish the fact that if m = 1, 2, 4, pα and 2pα, where p is an
odd prime, then M(m) is cyclic.
We are therefore left with the cases m = 1, 2, 4, pα and 2pα.
There is nothing to prove for m = 1 and 2. For m = 4 we pick g = 3 and this
is a primitive root modulo 4.
We now prove that if m = p where p is an odd prime then M(m) is cyclic.
![Page 52: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/52.jpg)
46 Primitive roots
definition 4.8 We will use ∑
d|n
f(d)
to denote the sum of f(d) over all divisors d of n.
Note that we have already used the above notation for the definition of σ(n).
lemma 4.9 If n is an integer ≥ 1, then
n =∑
d|n
ϕ(d).
Proof
We partitioned the set {k|1 ≤ k ≤ n} into disjoint sets {(n, `) = d|1 ≤ ` ≤n}.
We now return to the fact that n = p− 1. Suppose y is an element of order d.
Then by Corollary 3.6, there are exactly d elements satisfying
xd ≡ 1 (mod p)
and these elements must be
{1, y, · · · , yd−1}.
In particular, by Lemma 4.3, there will be exactly ϕ(d) elements of order d in
M(p) if there were an element of order d in M(p). In other words, the number
of elements of order d is either 0 or ϕ(d).
Now, each element in M(p) has an order and we let f(d) be the number of
elements having order exactly d. By Lagrange’s Theorem, an order of an element
divides order of M(p) and hence we have
M(p) =⋃
d|(p−1)
{u ∈M(p)|u has order d.}
and this implies that
p− 1 =∑
d|(p−1)
f(d) ≤∑
d|(p−1)
ϕ(d).
By the above lemma, we conclude that f(d) = ϕ(d) and in particular, ϕ(p−1) 6=0 and hence M(p) is cyclic. We have proved that
theorem 4.10 The group M(p) is cyclic.
We now let α > 1 and show that M(m) is cyclic when m = pα, with p an odd
prime.
lemma 4.11 Let p be an odd prime. There exists a primitive root g modulo p
such that
gp−1 6≡ 1 (mod p2).
![Page 53: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/53.jpg)
4.2 Integers m for which (Z/mZ)∗is cyclic 47
Proof
Let h be a primitive root modulo p. If hp−1 6≡ 1 (mod p2) then there is nothing
to prove. Suppose
hp−1 ≡ 1 (mod p2). (4.1)
Then g = h+ p is another primitive root modulo p and by (4.1),
(h+ p)p−1 ≡ hp−1 + (p− 1)php−2 ≡ 1 + (p− 1)php−2 (mod p2).
Note that if
(p− 1)php−2 ≡ 0 (mod p2),
then p|h and h cannot be a primitive root modulo p. Hence,
gp−1 ≡ hp−1 + (p− 1)php−2 6≡ 1 (mod p2).
lemma 4.12 Let g be a primitive root modulo p such that
gp−1 6≡ 1 (mod p2).
Then for every α ≥ 2, we have
gϕ(pα−1) 6≡ 1 (mod pα).
Proof
For α = 2, the conclusion follows from our choice of g, which is constructed
from the previous lemma. We now establish our result by induction. Suppose
gϕ(pα−2) 6≡ 1 (mod pα−1). (4.2)
We want to show that
gϕ(pα−1) 6≡ 1 (mod pα).
By Euler’s Theorem,
gϕ(pα−2) ≡ 1 (mod pα−2)
and hence,
gϕ(pα−2) = 1 + spα−2,
for some integer s. By (4.2), we conclude that p - s.
Since p - s, we find that
gpϕ(pα−2) = gϕ(pα−1
) ≡ 1 + tpα−1 6≡ 1 (mod pα).
This completes the proof of the theorem.
theorem 4.13 The group M(pα) is cyclic.
![Page 54: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/54.jpg)
48 Primitive roots
Proof
From Lemma 4.12, we know that there exists a primitive g modulo p such that
gϕ(pα−1) 6≡ 1 (mod pα).
We will show that the order of this element is ϕ(pα). Let ` be the order of
gϕ(pα−1). By Euler’s Theorem,(gϕ(pα−1)
)p≡ gϕ(pα) ≡ 1 (mod pα)
since
ϕ(pα) = pϕ(pα−1).
Hence `|p and ` = p since ` 6= 1. Let k be the order of g. Then by Lemma 4.3,
the order of gϕ(pα−1) is
k
(ϕ(pα−1), k)= ` = p.
This implies that
k = p(ϕ(pα−1), k
)= (ϕ(pα), kp)
or
1 =
(pα−1(p− 1)
k, p
).
In other words,
k = pα−1d
for some d|(p− 1). Now,
gpα−1d ≡ gk ≡ 1 (mod pα)
implies that
gpα−1d ≡ 1 (mod p)
and this shows that
gd ≡ 1 (mod p)
since gp ≡ g (mod p). Since g is a primitive root modulo p, (p− 1)|d. Therefore,k = pα−1d = pα−1(p− 1) = ϕ(pα)
and this shows that g is a primitive root modulo pα.
Since
M(2pα) 'M(pα),
we conclude that M(2pα) is also cyclic.
We have thus shown that
theorem 4.14 Let p be an odd prime. The group M(m) is cyclic if
m = 1, 2, 4, pα and 2pα.
![Page 55: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/55.jpg)
4.3 Integers m for which (Z/mZ)∗is not cyclic 49
4.3 Integers m for which (Z/mZ)∗ is not cyclic
We will now prove that M(m) is not cyclic for m 6= 1, 2, 4, pα and 2pα.
We first prove that M(2β) is not cyclic for β ≥ 3. This follows from the
following lemma:
lemma 4.15 If β ≥ 3 then M(2β) is not cyclic.
Proof
The elements in M(2β) are of the form [x]2β , with x being an odd integer. If we
can show that all elements of M(2β) has order strictly less than 2β−1, then we
can conclude that M(2β) is not cyclic. Hence it suffices to show that for all odd
integers x,
x2β−2 ≡ 1 (mod 2β). (4.3)
When β = 3 then x2 ≡ 1 (mod 8) for all odd integers x. This can be verified
directly. We now prove by induction that M(2β) is not cyclic for all positive
integer β ≥ 3. Suppose (4.3) is true for positive integers less than β. Then this
means that
x2β−3 ≡ 1 (mod 2β−1)
for all odd integers x. Therefore,
x2β−3
= 1 + 2β−1t.
Squaring both sides of the above, we find that
x2β−2
= 1 + 2βt′.
Hence,
x2β−2 ≡ 1 (mod 2β),
and this completes our proof that M(2β) is not cyclic.
We next suppose that 2αqβ divides m, where q is an odd prime with α > 1
and β > 0. Then by application of Chinese Remainder Theorem, we know that
M(m) 'M(2α)⊕M(qβ) · · · .
This would imply thatM(m) contains a Klein four group and hence it has more
than one subgroup of order 2. Since a cyclic group contains a unique subgroup
of order d when d is a divisor of the order of the group, we conclude that M(m)
is not cyclic.
Next suppose p and q are odd primes such that pαqβ divides m, where α > 0
and β > 0. Then since ϕ(p) and ϕ(q) are divisible by 2, M(m) again contains a
Klein four group.
This completes our proof that M(m) is not cyclic if m 6= 1, 2, 4, pα and 2pα.
![Page 56: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/56.jpg)
5 Quadratic Reciprocity Law
5.1 Primitive roots and solutions of congruences
In Corollary 3.6, we see that if d|(p− 1), where p is a prime, then
xd ≡ 1 (mod p)
is solvable and has exactly d solutions. In this section, we study an extension of
this result.
theorem 5.1 Suppose m = 1, 2, 4, pα or 2pα, where p is an odd prime. Let
(a,m) = 1 and let d = (ϕ(m), n). The congruence
xn ≡ a (mod m) (5.1)
is solvable if and only if
aϕ(m)/d ≡ 1 (mod m).
In the case when the congruence is solvable, there are exactly (n, ϕ(m)) solutions.
Proof
We may assume that n ≤ ϕ(m) by Euler’s Theorem. If
xn ≡ a (mod m)
has a solution, then let gk be its solution where g is a primitive root modulo m.
Hence
aϕ(m)/d ≡ gkn(ϕ(m)/d) ≡(gϕ(m)
)kn/d≡ 1 (mod m).
Conversely, suppose
aϕ(m)/d ≡ 1 (mod m).
Let a = g`. Note that from our assumption and the fact that g is a primitive
root, we must have
ϕ(m)|(`ϕ(m)/d).
Hence d|`.Now consider the linear congruence
nu ≡ ` (mod ϕ(m)).
![Page 57: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/57.jpg)
5.2 The Legendre Symbol 51
Note that since d = (n, ϕ(m)) divides `, the above congruence is solvable by
Theorem 1.40. Now, let s = gu. Then
sn = gnu = gnu+ϕ(m)v ≡ g` ≡ a (mod m)
and hence
xn ≡ a (mod m)
is solvable.
By specifying n = 2 and m = p we see that
theorem 5.2 (Euler’s Criterion) The congruence equation
x2 ≡ a (mod p)
is solvable if and only if
a(p−1)/2 ≡ 1 (mod p).
5.2 The Legendre Symbol
Let a be any integer relatively prime to p, where p is an odd prime. The Legendre
symbol is defined by
(a
p
)=
{1 if x2 ≡ a (mod p) is solvable,
−1 otherwise..
From the Euler Criterion, we see immediately that
a(p−1)/2 ≡(a
p
)(mod p).
This is because ap−1 ≡ 1 (mod p) and so a(p−1)/2 ≡ ±1 (mod p).
When p|n we simply define (n
p
)= 0.
From Euler’s Criterion, we obtain immediately that the Legendre symbol(
np
)
is a completely multiplicative function of n, i.e.,(mn
p
)=
(m
p
)(n
p
).
In the subsequent sections, we will learn an algorithm in computing the Leg-
endre symbol, thereby allowing us to determine the solvability of the congruence
x2 ≡ a (mod p).
![Page 58: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/58.jpg)
52 Quadratic Reciprocity Law
definition 5.3 For (a, p) = 1, we say that a is a quadratic residue modulo p if
x2 ≡ a (mod p) is solvable. Otherwise, we say that a is a quadratic non-residue
modulo p.
5.3 Gauss Lemma
In this section, we give an elementary proof of the Gauss Lemma.
theorem 5.4 Let p be a prime and let a be an integer such that (a, p) = 1.
Consider
S := {a, 2a, · · · , p− 1
2a}
and let
T := {s (mod p)|s ∈ S},
with elements in T between 0 and p − 1. Suppose there are m elements in T
which are greater than p/2, then(a
p
)= (−1)m.
Proof
Let r1, r2, · · · , rm be elements in T exceeding p/2. Let s1, s2, · · · , sk be the
remaining elements in T . Note that
m+ k =p− 1
2.
Now, rj > p/2 implies that 0 < p− rj < p/2.
Note that since (Z/pZ)∗is a group, multiplication by a induces a bijection on
this group and hence, no two si’s coincide and no two p− rj coincide. It remains
to show that no p− rj coincides with si. Let
rj ≡ ρa (mod p) and si ≡ σa (mod p),
with ρ < p/2, σ < p/2 and ρ 6= σ. If
−rj ≡ si (mod p),
then
a(ρ+ σ) ≡ 0 (mod p).
This implies that p|(ρ+ σ). Since |ρ+ σ| < p, this is impossible.
Now, since m+ k = (p− 1)/2, p− ri, 1 ≤ i ≤ n and sj , 1 ≤ j ≤ k, are distinct
and less than p/2 and that they are all less than p/2, we conclude that
{p− r1, p− r2, · · · , p− rm, s1, s2, · · · , sk} = {1, 2, · · · , p− 1
2}.
![Page 59: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/59.jpg)
5.4 Proofs of Gauss’ Quadratic Reciprocity Law 53
Multiplying the elements on both sides, we find that
(−r1)(−r2) · · · (−rm) · s1 · · · sk ≡(p− 1
2
)! (mod p).
But
r1 · r2 · · · rm · s1 · · · sk ≡ (−1)ma(p−1)/2
(p− 1
2
)! (mod p)
as the ri’s and sj ’s are congruent to elements in S, we conclude that
a(p−1)/2 ≡ (−1)m (mod p),
and hence, (a
p
)= (−1)m.
5.4 Proofs of Gauss’ Quadratic Reciprocity Law
Let a = −1 then m = (p−1)/2 since as 6∈ S for all s ∈ S. From Euler’s Criterion
or Gauss Lemma (Theorem 5.4), we deduce that(−1
p
)= (−1)(p−1)/2.
This is the first part of Gauss’ quadratic reciprocity law.
Let a = 2 and p ≡ 1 (mod 4). Set p = 4` + 1. We have S = {1, 2, · · · , 2`}.Therefore 2S = {2, 4, · · · , 2`, 2`+ 2, · · · , 4`}. Hence m = `. In other words
(2
p
)=
{1 if p = 8`+ 1
−1 if p = 8`+ 5.
Similarly, we have(2
p
)=
{1 if p = 8`+ 7
−1 if p = 8`+ 3.
In short, we may write (2
p
)= (−1)(p
2−1)/8.
We now prove the final part of Gauss’ quadratic reciprocity law. First, we need
a lemma.
lemma 5.5 For odd positive integer m, we have
sinmx
sinx= (−4)(m−1)/2
∏
1≤j≤(m−1)/2
(sin2 x− sin2
2πj
m
).
![Page 60: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/60.jpg)
54 Quadratic Reciprocity Law
Proof
It is known that
cosmx+ i sinmx = (cos x+ i sinx)m =m∑
k=0
Ak (cosx)m−k (sinx)k ik.
Comparing the imaginary parts of both sides, we conclude that
sinmx =
(m−1)/2∑
`=0
A2`+1 sin2`+1 x cosm−(2`+1) x(−1)`.
Hencesinmx
sinx
is a polynomial in sin2 x of degree (m− 1)/2. We note that the function
sinmx
sinx
vanishes when x = 2πj/m for 1 ≤ j ≤ (m− 1)/2. Hence
sinmx
sinx= C
∏
1≤j≤(m−1)/2
(sin2 x− sin2
2πj
m
).
The constant C can be found by letting x tends to 0. The left hand side is m
and the right hand side becomes
C(−1)(m−1)/2∏
1≤j≤(m−1)/2
sin22πj
m.
Using the fact that
sin(π − x) = sinx,
we find that
∏
1≤j≤(m−1)/2
sin22πj
m=
m−1∏
k=1
sinkπ
m.
Butm−1∏
k=1
sinkπ
m=
1
2m−1im−1
m−1∏
k=1
e−ikπ/m(ωk − 1
),
where ω = e2πi/m. Now,
m−1∏
k=1
e−ikπ/m = i−(m−1),
and using the equation
xm − 1 =
m∏
k=1
(x − ωk),
![Page 61: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/61.jpg)
5.4 Proofs of Gauss’ Quadratic Reciprocity Law 55
we deduce that
m(−1)m−1 =m−1∏
k=1
(ωk − 1).
Hence,
m−1∏
k=1
sinkπ
m=
m
2m−1.
This implies that C = (−4)(m−1)/2 and the proof is complete.
The main part of Gauss’ quadratic reciprocity law states that for odd distinct
primes p, q(p
q
)(q
p
)= (−1)(p−1)(q−1)/4.
Let S = {1, 2, · · · , (p− 1)/2}. Write
[qs]p = [es(q)sq ]p,
where sq ∈ S and
es(q) =
{1 if qs = sq ∈ S
−1 if qs 6∈ S..
Note that es(q) = −1 precisely when qs exceeds p/2. Hence, if m is the number
of elements in qS = {1 ≤ qs (mod p) ≤ p|s ∈ S} that exceeds p/2, then(q
p
)= (−1)m =
∏
s∈S
es(q).
Now, define
F ([m]p) = sin2πm
p.
Note that F is well defined on (Z/pZ)∗ since F ([m + kp]p) = F ([m]p) because
sin(2π(m+ kp)/p) = sin(2πm/p). Therefore, the relation
[qs]p = [es(q)sq]p
yields
sin2π
pqs = F ([qs]p) = F ([es(q)sq]p) = es(q) sin
2π
psq.
Since s→ sq is a bijection of S, we conclude that
(q
p
)=∏
s∈S
es(q) =∏
s∈S
sin2π
pqs
sin2π
psq
=∏
s∈S
sin2π
pqs
sin2π
ps.
![Page 62: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/62.jpg)
56 Quadratic Reciprocity Law
Applying Lemma 5.5, we deduce that(q
p
)=∏
s∈S
(−4)(q−1)/2∏
t∈T
(sin2
2πs
p− sin2
2πt
q
),
where T = {1, 2, · · · , (q − 1)/2}. Hence(q
p
)= (−4)(q−1)(p−1)/4
∏
s∈S
∏
t∈T
(sin2
2πs
p− sin2
2πt
q
).
Interchanging the role of p and q, we deduce that(p
q
)= (−4)(q−1)(p−1)/4
∏
t∈T
∏
s∈S
(sin2
2πt
q− sin2
2πs
p
).
Therefore, (p
q
)and
(q
p
)
agrees up to sign and the sign is given by (−1)(p−1)(q−1)/4 by looking at the
products
∏
s∈S
∏
t∈T
(sin2
2πs
p− sin2
2πt
q
)and
∏
t∈T
∏
s∈S
(sin2
2πt
q− sin2
2πs
p
).
We have thus completed the proof of the quadratic reciprocity law and we
summarize the result as follow:
theorem 5.6 Let p and q be distinct primes. Then we have(−1
p
)= (−1)(p−1)/2
(2
p
)= (−1)(p
2−1)/8
(p
q
)(q
p
)= (−1)(p−1)(q−1)/4.
example 5.1
Show that x2 ≡ 10 (mod 89) is solvable.
Solutions. We compute(10
89
)=
(2
89
)(5
89
)= (−1)(89
2−1)/8
(5
89
)=
(89
5
)= 1.
5.5 The Jacobi Symbol
In this section, we discuss an extension of the Legendre symbol.
![Page 63: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/63.jpg)
5.5 The Jacobi Symbol 57
definition 5.7 Let Q be a positive odd integer so that
Q = q1q2 · · · qswhere qi are not necessarily distinct. The Jacobi symbol is defined by
(P
Q
) s∏
j=1
(P
qj
)
where the expressions on the right hand side involving qj are the Legendre sym-
bols.
We observe that if Q is prime then the Jacobi symbol is simply the Legendre
symbol. We also note that if (P,Q) > 1, then(P
qj
)= 0
for some j and hence (P
Q
)= 0.
Now if P is a quadratic residue modulo Q, then
x2 ≡ P (mod qj)
is solvable and hence (P
Q
)= 1.
The converse, however, is not true. Although(
2
15
)= 1,
the congruence
x2 ≡ 2 (mod 15)
is not solvable.
From the definition of the Jacobi symbol and the properties of the Legendre
symbol, it is immediate that(P
Q
)(P
Q′
)=
(P
QQ′
)
(P
Q
)(P ′
Q
)=
(PP ′
Q
)
(P 2
Q
)= 1 and
(P
Q2
)= 1
Finally, we observe that if P ≡ P ′ (mod Q), then(P
Q
)=
(P ′
Q
).
![Page 64: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/64.jpg)
58 Quadratic Reciprocity Law
The main result we want to show in this section is
theorem 5.8 If Q is odd positive integer, then
(−1
Q
)= (−1)
Q−12
(2
Q
)= (−1)
Q2−1)8
(P
Q
)(Q
P
)= (−1)
P−12 ·Q−1
2 .
Proof
To prove the first equality, we observe that
ab− 1
2−(a− 1
2+b− 1
2
)=
(a− 1)(b− 1)
2.
The numerator of the right hand side is divisible by 4 if both a and b are odd.
Hence,
ab− 1
2≡(a− 1
2+b− 1
2
)(mod 2).
Hence if Q = q1q2 · · · qs, thens∑
j=1
(qj − 1
2
)≡(Q− 1
2
)(mod 2).
This implies that
(−1
Q
)=
s∏
j=1
(−1
qj
)= (−1)
∑sj=1(qj−1)/2 = (−1)
Q−12 .
The proof of the second equality is similar except that we used the relation
a2b2 − 1
8−(a2 − 1
8+b2 − 1
8
)=
(a2 − 1)(b2 − 1)
8
and observe that the numerator of the right hand side is divisible by 64.
The proof of the last equality follows in exactly the same way as the proof of
the first equality.
With the Jacobi symbol, we can now calculate Legendre symbol without having
to factorize integers (except for factoring −1 and 2).
In the next two sections, we will give another proof of the Gauss Lemma using
results from group theory.
![Page 65: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/65.jpg)
5.6 The Transfer 59
5.6 The Transfer
Let G be a finite group and H be a subgroup of G. Let X = G/H and fix a set
of representatives for X , namely,
R = {x1, · · · , xm}where m = |X |. Write the elements in X as [xj ] instead of xjH .
The group G acts on X via
g · [xi] = [gxi].
Now, because we insist that the representative for the cosets in X to be from R
we must write [gxi] as [xj ] for some 1 ≤ j ≤ m. Since [gxi] = [xj ] for some j we
could view G as acting on {1, 2, · · · ,m} and we write
g ◦ i = j.
In other words, we have
[gxi] = [xg◦i].
From the above, we conclude that
gxi = xg◦ihg,[xi]
for some hg,[xi] ∈ H. We define the transfer of g to be
Ver(g) =∏
[x]∈X
hg,[x] (mod [H,H ])
where where [H,H ] is the commutator subgroup of H , namely, the group gen-
erated by {h1h2h−11 h−1
2 |h1, h2 ∈ H}.It appears that Ver(g) depends on our choice of set of representatives R. We
will show that this is not the case.
theorem 5.9 The map Ver : G → H/[H,H ] is independent of the choice of
representatives of x ∈ X and it is a homomorphism.
Proof
Let R∗ = {x∗1, x∗2, · · · , x∗m} be another fixed set of coset representatives and
suppose we have
[x∗i ] = [xi].
This means that
x∗i = xih[xi]
for some h[xi] ∈ H. Now,
g[x∗i ] = [gx∗i ] = [gxi] = [xg◦i] = [x∗g◦i].
If we write
gx∗i = x∗g◦ih∗g,[x∗
i]
![Page 66: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/66.jpg)
60 Quadratic Reciprocity Law
for some h∗g,[x∗
i] ∈ H , then the corresponding transfer is defined by
Ver∗(g) =∏
[x]∈X
h∗g,[x] (mod [H,H ]).
Note that
gx∗i = gxih[xi] = xg◦ihg,[xi]h[xi] = x∗g◦ih−1[xg◦i]
hg,[xi]h[xi].
This implies that
h∗g,[x∗
i] = h−1
[xg◦i]hg,[xi]h[xi].
Therefore,
Ver∗(g) =∏
[xi]∈X
h−1[xg◦i]
hg,[xi]h[xi] (mod [H,H ]) =∏
[x]∈X
hg,[xi] (mod [H,H ]) = Ver(g)
since
ab = abb−1a−1ba = ba (mod [H,H ])
and∏
[xi]∈X
h−1[xg◦i]
h[xi] = 1 (mod [H,H ]).
Therefore the transfer is independent of the coset representatives.
Next,
st[xi] = [stxi] = [xst◦i].
Hence,
stxi = xst◦ihst,[xi].
Now,
s(txi) = s(xt◦i)ht,[xi] = xst◦ihs,[xt◦i]ht,[xi].
Therefore,
Ver(st) =∏
[xi]∈X
hst,[xi] (mod [H,H ]) =∏
[xi]∈X
hs,[xt◦i]ht,[xi] (mod [H,H ])
=∏
[xi]∈X
hs,[xt◦i]
∏
[xi]∈X
ht,[xi] (mod [H,H ]) = Ver(s)Ver(t).
Hence Ver is a homomorphism.
Our next task to compute Ver(s) for a single element s ∈ G. We consider the
cyclic subgroup C generated by s and let C acts on X , namely,
sj[x] = [sjx].
![Page 67: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/67.jpg)
5.6 The Transfer 61
Note that X is a disjoint union of orbits Oα under the action of C. If |Oα| = fαand [xα] ∈ Oα then the elements in the orbit are
{[sjxα]|j = 0, 1, · · · , fα − 1}.
We can now choose our coset representatives R to contain
{sjxα|j = 0, 1, · · · , fα − 1}
for each orbit Oα. Note that if 0 ≤ j ≤ fα − 2,
s[sjxα] = [sj+1xα]
and since sj+1xα ∈ R, we have
s(sjxα) = sj+1xαhs,[sjxα]
with hs,[sjxα] = 1. When j = fα − 1, we find that
s(sfα−1xα) = sfαxα = xαhs,[sfα−1xα]. (5.2)
Therefore,
Ver(s) =∏
[x]∈X
hs,[x] =∏
α
fα−1∏
j=0
hs,[sjxα] =∏
α
hs,[sfα−1xα] =∏
α
x−1α sfαxα,
where we have used (5.2).
Hence we have
theorem 5.10 Let s ∈ G and let Oα be the orbits of X under the action of
the cyclic subgroup generated by s. Suppose [xα] is an element in Oα, then
Ver(s) =∏
α
hα =∏
α
(xα)−1sfαxα (mod [H,H ]).
We have already seen that the map Ver is a homomorphism fromG toH/[H,H ].
If G is abelian then [H,H ] is trivial and we have a homomorphism from G to H
given by
Ver(s) =∏
α
sfα = s∑
αfα = sm,
since∑
α
fα = m
where m = |X |.
![Page 68: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/68.jpg)
62 Quadratic Reciprocity Law
5.7 Gauss Lemma via the transfer
Let G = (Z/pZ)∗and H = {[±1]p}. Let the coset representatives of H in G be
S := {1, 2, · · · , (p− 1)/2}. Now, for a ∈ G,
Ver([a]p) = [a](p−1)/2p .
We now compute Ver from its definition. Note that for [s]p ∈ S and [a]p ∈ G,
[as]p = [saes(a)]p
where
es(a) =
{1 if [as]p = [sa]p for some sa ∈ S,
−1 otherwise.
Hence,
Ver([a]p) = [−1]mp
where m is the number of elements s such that [as]p 6∈ S. Comparing with the
previous computation of Ver(s), we conclude that[(
a
p
)]
p
≡ [a](p−1)/2p = Ver([a]p) = [−1]mp ,
where m is the number of elements s such that [as]p 6= [s′]p for all s′ ∈ S.
In other words, m is the number of elements in S such that [as]p = [−sa]p =
[p − sa]p, or the number of elements in S such that the least positive residues
of as (mod p) exceeds p/2. This proves Gauss Lemma and shows that Gauss
Lemma corresponds to computing Ver([a]p) in two different ways.
5.8 The Legendre Symbol and primes of the form x2 + y2
In this section, we give an explicit form of x and y that satisfy
x2 + y2 = p
where p is a prime, p ≡ 1 (mod 4).1 This would certainly imply that
theorem 5.11 Let p be an odd prime such that p ≡ 1 (mod 4). Then p is a
sum of two squares.
We first need two lemmas.
lemma 5.12 Let p be an odd prime. We have
∑
m(mod p)
(m
p
)= 0.
1 Adapted from “Number Theory” by G.E. Andrews
![Page 69: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/69.jpg)
5.8 The Legendre Symbol and primes of the form x2 + y2 63
Proof
Note that using primitive roots modulo p, we see that if g is a primitive root
modulo p, then the even powers of g are quadratic residues and the odd powers
of g are quadratic non-residues. Therefore there are (p− 1)/2 quadratic residues
and (p− 1)/2 quadratic non-residues. Therefore in the sum
∑
m(mod p)
(m
p
),
there are (p− 1)/2 terms which take the value 1 and (p− 1)/2 terms which take
the value −1. In other words,
∑
m(mod p)
(m
p
)= 0.
lemma 5.13 Let p be an odd prime. We have
∑
m(mod p)
((m− a)(m− b)
p
)=
{p− 1 if a ≡ b (mod p)
−1 if a 6≡ b (mod p)..
Proof
Note first by replacing m by m+ a, we find that
∑
m(mod p)
((m− a)(m− b)
p
)=
∑
m(mod p)
((m)(m− (b − a))
p
)
=∑
m(mod p)m 6≡0(mod p)
((m)(m− (b − a))
p
)
since(
mp
)= 0 when p|m. Let m′ be such that m′m ≡ 1 (mod p). Then
∑
m(mod p)m 6≡0(mod p)
(m(m− (b− a))
p
)=
∑
m(mod p)m 6≡0(mod p)
(m′2
p
)(m(m− (b− a))
p
)
=∑
m(mod p)m 6≡0(mod p)
(m′m
p
)(m′m−m′(b− a)
p
)
=∑
m(mod p)m′ 6≡0(mod p)
(1−m′(b− a)
p
)
=∑
m′(mod p)
(1−m′(b− a)
p
)− 1
= −1
![Page 70: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/70.jpg)
64 Quadratic Reciprocity Law
by Lemma 5.12. Note that number -1 in the second last line is added so that we
could sum over complete residues m′ (mod p).
Let
S(m) =
p∑
n=1
(n(n2 −m)
p
).
Note that S(0) = 0 by Lemma 5.12 and S(m+Np) = S(m) for any integer N .
Let k 6≡ 0 (mod p). Then
S(m) =∑
n(mod p)
(k4
p
)(n(n2 −m)
p
)
=∑
n(mod p)
(k2n
p
)(k2n2 − k2m
p
)
=
(k
p
) ∑
n(mod p)
(kn
p
)((kn)2 − k2m
p
)
=
(k
p
)S(k2m).
Hence, we have
S2(k2m) = S2(m). (5.3)
If u is a quadratic residue modulo p, then u ≡ k2 (mod p) for some integer k
and
S2(u) = S2(k2 · 1) = S2(1) (5.4)
by setting m = 1 in (5.3).
If v is a non-residue, then v = `2j` where ` is a fixed primitive root modulo
p. This is because all quadratic non-residues are odd powers of primitive roots
modulo p. Then by (5.3),
S2(v) = S2(`(`j)2) = S2(`). (5.5)
We note that we can replace ` by any quadratic non-residue modulo p.
There are (p − 1)/2 quadratic residues and (p − 1)/2 quadratic non-residues
and hence by (5.4) and (5.5), we deduce that
∑
m(mod p)
S2(m) = S2(0) +p− 1
2
(S2(1) + S2(`)
)=p− 1
2
(S2(1) + S2(`)
),
since S2(0) = 0.
![Page 71: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/71.jpg)
5.8 The Legendre Symbol and primes of the form x2 + y2 65
Now,
∑
m(mod p)
S2(m) =∑
m(mod p)
∑
s(mod p)t(mod p)
(s(s2 −m)
p
)(t(t2 −m)
p
)
=∑
m,s,t
(st
p
)((m− s2)(m− t2)
p
)
=∑
s,t
(st
p
)∑
m
((m− s2)(m− t2)
p
).
By Lemma 5.13, we deduce that
p− 1
2
(S2(1) + S2(`)
)=
∑
s,ts2≡t2(mod p)
(st
p
)(p− 1) +
∑
s,ts2 6≡t2(mod p)
(st
p
)(−1)
=∑
s,ts2≡t2(mod p)
(st
p
)(p− 1)−
∑
s,t
(st
p
)−
∑
s,ts2≡t2(mod p)
(st
p
)
= p∑
s,ts2≡t2(mod p)
(st
p
),
where we have used Lemma 5.12 in the last equality. Next,
∑
s,ts2≡t2(mod p)
(st
p
)=
∑
s,ts≡t(mod p)
(st
p
)+
∑
s,ts≡−t(mod p)
(st
p
)
=∑
t(mod p)
(t2
p
)+
∑
t(mod p)
(−t2p
)= 2(p− 1)
Hence we conclude that
(S(1)
2
)2
+
(S(`)
2
)2
= p.
Now, if 2|S(m) then we would have found integers x and y such that
x2 + y2 = p.
![Page 72: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/72.jpg)
66 Quadratic Reciprocity Law
To show that 2|S(m) for all integers m, we observe that
S(m) =
(p−1)/2∑
n=1
(n(n2 −m)
p
)+
p−1∑
n=(p+1)/2
(n(n2 −m)
p
)
=
(p−1)/2∑
n=1
(n(n2 −m)
p
)+
(p−1)/2∑
n=1
((p− n)((p− n)2 −m)
p
)
= 2
(p−1)/2∑
n=1
(n(n2 −m)
p
),
where the last equality follows for the fact that p ≡ 1 (mod 4).
Remark 5.14 Recently, H.H. Chan, L. Long and Y.F. Yang showed the follow-
ing observation:
Let p ≡ 1 (mod 6). Suppose a is any integer such that x3 ≡ a (mod p) is not
solvable. Then
3p = x2 + xy + y2,
with
x =
p∑
α=1
(α3 + 1
p
)and y =
(a
p
) p∑
α=1
(α3 + a
p
).
5.9 Appendix : Primes of the form x2 + y2, a second approach
In this section, we give another proof of Theorem 5.11. This proof is due to D.
Zagier and is motivated by the work of R. Heath-Brown, who is in turn motivated
by Liouville.
Consider the set
S = {(x, y, z) ∈ Z+|x2 + 4yz = p}.
Note that S is non-empty. This is because if p = 4k+1, then (1, 1, k) ∈ S. Define
the map on S by
α : (x, y, z) →
(x+ 2z, z, y− x− z) if x < y − z,
(2y − x, y, x− y + z) if y − z < x < 2y,
(x− 2y, x− y + z, y) if x > y.
.
We can check that this is a map on S by checking that each expression in the
image satisfies x2 + 4yz = p. For example,
(x + 2z)2 + 4z(y − x− z) = p
if x2 + 4yz = p.
![Page 73: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/73.jpg)
5.9 Appendix : Primes of the form x2 + y2, a second approach 67
We next check that α is an involution. This again can be checked by considering
the three cases as listed in the definition of α. For example, if x < y − z, then
α(x, y, z) = (x+ 2z, z, y− x− z) = (x′, y′, z′)
and x+ 2z > 2z or x′ > 2y′. Hence
α(x′, y′, z′) = (x′ − 2y′, x′ − y′ + z′, y′) = (x, y, z).
The rest of the cases can be verified in a similar way.
Our next step is to show that the only fixed point of α is (1, 1, k) where
p = 1 + 4k. If we use the definition of α, we find that the only possible case
where a fixed point exists for α is when y − z < x < 2y since x, y, z are all
positive integers. This implies that the fixed point must be of the form (u, u, w).
But this would mean that u2 +4uw = p and that u|p, or u = 1 or p. Now, u 6= p
for otherwise, the left hand side would be greater than the right hand side. Hence
u = 1 and w = k and the only fixed point of α is (1, 1, k). For any involution β,
we find that the number of elements in S is the sum of the number of elements
in
{s, β(s)|β(s) 6= s}
and
{t|β(t) = t}.
Since there is only one fixed point for α, we conclude that the number of elements
in S is odd.
We next consider a simple involution
σ(x, y, z) = (x, z, y).
Since the number of elements in S is odd, σ must have a fixed point, say
(x0, y0, z0). But this is a fixed point of σ means that y0 = z0 and thus, we
conclude that
x20 + 4y20 = p
and this completes the proof of Theorem 5.11.
![Page 74: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/74.jpg)
6 Binary Quadratic forms
6.1 Fermat-Euler Theorem
A binary quadratic form is an expression of the form
f(x, y) = ax2 + bxy + cy2
where a, b, c ∈ Z.
Representation of an integer by a binary quadratic form has attracted attention
of many mathematicians in the past. For example, Fermat observed that
theorem 6.1 An odd prime p can be represented by a sum of two squares if
and only if p ≡ 1 (mod 4).
We have seen two proofs of Theorem 6.1 in the last Chapter. We now give
Euler’s proof of this result using Fermat’s method of descent.
Proof
If p ≡ 3 (mod 4) then it cannot be a sum of two squares since a sum of two
squares is either congruent to 1,2, or 0 modulo 4. Suppose p ≡ 1 (mod 4) then(−1
p
)= 1
implies that
x2 ≡ −1 (mod p)
is solvable. Let u be an integer with 1
|u| ≤ p− 1
2(6.1)
such that
u2 ≡ −1 (mod p).
This implies that
u2 + 1 = kp (6.2)
1 We use the remainders from −(p − 1)/2 to (p − 1)/2 instead of 0 to p− 1.
![Page 75: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/75.jpg)
6.1 Fermat-Euler Theorem 69
for some positive integer k. Note that by (6.1),
u2 + 1 < p2,
and hence 0 < k < p. Now since p - (u · 1), we conclude that if
S := {m ∈ Z+|x2 + y2 = mp, p - xy,m < p.},
then by (6.2) and our discussion so far, we conclude that k ∈ S and hence S is
a non-empty subset of positive integers.
By the least integer axiom, S contains a minimal positive m0 satisfying the
conditions. If m0 = 1 then we are done.
Suppose m0 > 1. We first note that m0 cannot divide both x and y. If this
were the case, then m0 would divide p. Since m0 6= 1 and m0 < p, this is not
possible. Therefore, if we write
x = x1 +m0r and y = y1 +m0s
with |x1| < m0/2 and |y1| < m0/2, x1 and y1 cannot be both zero. This implies
that
x21 + y21 > 0.
Now,
0 < x21 + y21 ≤ m20/2 < m2
0 (6.3)
by our choice of the range of |x1| and |y1|. Hence,
x21 + y21 = m0(p− 2rx− 2sy +m0r2 +m0s
2) = m0m1
with m1 < m0 by (6.3).
Next, since
(x− iy)(x1 + iy1) = (xx1 + yy1 + i(xy1 − x1y),
we find that
m20m1p = (x2 + y2)(x21 + y21) = (xx1 + yy1)
2 + (xy1 − x1y)2.
Now, observe that
X = xx1 + yy1 = x(x −m0r) + y(y −m0s) = m0w,
for some integer w. Similarly,
Y = xy1 − x1y = m0v.
Therefore,
m20m1p = X2 + Y 2 = m2
0(w2 + v2).
This implies that
w2 + v2 = m1p. (6.4)
Now, if p|(wv), then p|w or p|v. Without loss of generality, we may assume p|u.
![Page 76: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/76.jpg)
70 Binary Quadratic forms
Now, (6.4) implies that p|v2, or p|v. Again from (6.4), we find that p|m1, which
is impossible because 0 < m1 < p. Hence, p - (wv). The above discussion implies
that m1 ∈ S. But this contradicts the minimality of m0 in S. This implies that
m0 = 1 in the first place and we are done.
From Theorem 6.1, we see that primes of the form x2 + y2 can be determined
by the Legendre condition
(−1
p
)= 1. It turns out primes of the form x2 + 2y2
and x2 + 3y2 are determined completely by the conditions
(−2
p
)= 1 and
(−3
p
)= 1, respectively. A natural question to ask is therefore, can all primes
of the form x2 + ny2 determined by a single condition
(−np
)= 1? The answer
is no.
For example, one can show that
p = x2 + 5y2
if and only if(−1
p
)= 1 and
(−5
p
)= 1.
So two conditions are needed. In fact if we only have one condition(−5
p
)= 1,
then we can only conclude either
p = x2 + 5y2
or
p = 2x2 + 2xy + 3y2.
The binary quadratic forms x2+5y2 and 2x2+2xy+3y2 both have discriminant
b2− 4ac = −20. Furthermore these binary quadratic forms are “genuinely differ-
ent”. In the following sections, we will explain the meaning of binary quadratic
forms being “genuinely different.”
6.2 Representations of integers by binary quadratic forms
definition 6.2 Given a binary quadratic form ax2 + bxy + cy2 the discrim-
inant (denoted usually by d or ∆) of the quadratic form is defined to be the
number b2 − 4ac.
![Page 77: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/77.jpg)
6.2 Representations of integers by binary quadratic forms 71
theorem 6.3 Let d be an integer. There exists at least one binary quadratic
form with integral coefficients and discriminant d, if and only if d ≡ 0 or 1
(mod 4).
Proof
Let f(x, y) = ax2+bxy+cy2 be a binary quadratic form of discriminant d. Since
b2 ≡ 0 or 1 (mod 4), d = b2 − 4ac ≡ 0 or 1 (mod 4). Suppose d ≡ 0 (mod 4).
Then the form
x2 − d
4y2
has discriminant d. If d ≡ 1 (mod 4), then
x2 + xy −(d− 1
4
)y2
has discriminant d.
definition 6.4 A form f(x, y) is called indefinite if it takes on both positive
and negative values. The form is called called positive semidefinite (or negative
semidefinite) if f(x, y) ≥ 0 ( or f(x, y) ≤ 0) for all integers x, y. A semidefinite
form is called definite if in addition the only integers x, y for which f(x, y) = 0
are x = 0, y = 0.
The form f(x, y) = x2 − 2y2 is indefinite. The form f(x, y) = (x − y)2 is
semidefinite and the form f(x, y) = x2 + y2 is positive definite form.
example 6.1
Let d be the discriminant of the form
f(x, y) = ax2 + bxy + cy2.
If d < 0 and a > 0, show that f(x, y) is positive definite.
Solutions. We see that
f(x, y) = a
(x+
by
2a
)2
+4ac− b2
4a2y2 = a
(x+
by
2a
)2
− d
4ay2.
This immediately shows that f(x, y) is positive definite.
definition 6.5 We say that a quadratic form f(x, y) represents an integer
n if there exists integers k, ` such that
f(k, `) = n.
We say that the representation is proper if (k, `) = 1; otherwise, it is improper.
In the first section, we have seen that if p ≡ 1 (mod 4) then p is properly
represented by x2 + y2. Our aim in this chapter is to continue our investigation
of integers properly represented by a particular quadratic forms.
![Page 78: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/78.jpg)
72 Binary Quadratic forms
theorem 6.6 Let n > 0 and d be given integers with n 6= 0. There exists a
binary quadratic form of discriminant d that represents n properly if and only if
the congruence
x2 ≡ d (mod 4n)
has a solution.
Proof
Suppose b is a solution of
x2 ≡ d (mod 4n).
Then b2 = d+ 4nc, for some integer c. The form
f(x, y) = nx2 + bxy + cy2
has discriminant d and properly represents n since f(1, 0) = n.
Next, suppose f(k, `) = n and (k, `) = 1. We claim that we can find a quadratic
form g(x, y) that takes the form
nx2 +Bxy + Cy2.
In order to prove this, we rewrite a general binary quadratic form in the form
(x y
)( a b/2
b/2 c
)(x
y
).
We note the discriminant may be defined as −4det(M), where
M =
(a b/2
b/2 c
).
So in general if F (x, y) = f(αx + γy, βx + δy), then the discriminant of F is
given by
−4 · det(a b/2
b/2 c
)· det
(α β
γ δ
)2
.
Suppose f(x, y) = ax2 + bxy + cy2. Since (k, `) = 1, there exist integers u, v
such that ku− `v = 1. We let
g(x, y) = f(kx+ vy, `x+ uy).
Then
g(x, y) = nx2 +Bxy + Cy2.
Since
det
(k `
v u
)= 1,
by the above discussion, we find that the discriminant of g(x, y) is given by
B2 − 4nC = d.
![Page 79: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/79.jpg)
6.2 Representations of integers by binary quadratic forms 73
This immediately implies that
x2 ≡ d (mod 4n)
has a solution.
corollary 6.7 Suppose d ≡ 0 or 1 (mod 4). If p is an odd prime, then there
is a binary quadratic form of discriminant d that represents p if and only if
(d
p
)= 1.
Proof
By Theorem 6.6, we conclude that
x2 ≡ d (mod 4p)
is solvable if p can be represented by a binary quadratic form of discriminant d.
This implies that
x2 ≡ d (mod p)
is solvable, or(d
p
)= 1.
Conversely, if
(d
p
)= 1 then
x2 ≡ d (mod p)
is solvable. Note that d ≡ 0 or 1 (mod 4), then
x2 ≡ d ≡ 0 (mod 4)
or
x2 ≡ d ≡ 1 (mod 4)
respectively. Hence,
x2 ≡ d (mod 4)
is solvable. By Chinese Remainder Theorem, we conclude that
x2 ≡ d (mod 4p)
is solvable and by Theorem 6.6, p is represented by a binary quadratic form of
discriminant d.
![Page 80: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/80.jpg)
74 Binary Quadratic forms
6.3 Equivalence of binary quadratic forms
We have seen in the proof of Theorem 6.6 that matrices play a role in the study
of quadratic forms. In fact, we observe that
(α β
γ δ
)is 1, then the discriminant
of g(x, y) that is obtained from f(x, y) under the transformation
(x y
)→(x y
)(α β
γ δ
)
is equal to the discriminant of f(x, y). This observation also shows that an
integer n can be properly represented by f(x, y) if and only if it can be properly
represented by g(x, y). In other words, the forms f(x, y) and g(x, y) are “similar”.
This leads us to the next definition.
definition 6.8 We say that two quadratic forms f(x, y) = ax2 + bxy + cy2
and g(x, y) = Ax2 +Bxy+Cy2 are related and write f ∼ g if there are integers
α, β, γ, δ ∈ Z with αδ − βγ = 1, such that
g(x, y) = f(αx+ γy, βx+ δy).
theorem 6.9 The relation ∼ on the set of binary quadratic forms is an equiv-
alence relation.
Proof
Since f(x, y) = f(x, y), f ∼ f . Suppose f ∼ g, then we can write
g(x, y) = Ax2 +Bxy + Cy2 =(x y
)( A B/2
B/2 C
)(x
y
)
=(x y
)(α γ
β δ
)(a b/2
b/2 c
)(α β
γ δ
)(x
y
).
Comparing the representations above (check that the (1, 2)-entry and (2, 1)-entry
of the product of three matrices in the last expression are the same), we conclude
that (A B/2
B/2 C
)=
(α γ
β δ
)(a b/2
b/2 c
)(α β
γ δ
).
With the above representation, we see that
(a b/2
b/2 c
)=
(α γ
β δ
)−1(A B/2
B/2 C
)((α γ
β δ
)−1)T
and hence g ∼ f.
Lastly, to check transitivity, we suppose that
h(x, y) = A′x2 +B′xy + C′y2.
![Page 81: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/81.jpg)
6.4 Reduced forms 75
Suppose f ∼ g via
(α β
γ δ
)and g ∼ h via
(α′ β′
γ′ δ′
). Then
(A′ B′/2
B′/2 C′
)=
(α′ γ′
β′ δ′
)(α γ
β δ
)(a b/2
b/2 c
)(α β
γ δ
)(α′ β′
γ′ δ′
).
This computation shows that
f ∼ h.
Remark 6.10 If f ∼ g, then
g(x, y) = f(αx+ γy, βx+ δy).
Note that since αδ − βγ = 1, the discriminant of f and g are the same and
so the relation ∼ is an equivalence relation on the set of binary quadratic forms
with the same discriminant.
From now on, we will concentrate on the set of positive definite binary quadratic
forms (the discriminant is negative and that its absolute value is not a perfect
square). We will also introduce the notion of reduced forms and show that in
each equivalence class under ∼, there exists one and only one reduced form which
serves as a representative of that class. We will then show that given a fixed dis-
criminant, there are finitely many equivalence classes and this number is equal
to the number of reduced forms.
6.4 Reduced forms
definition 6.11 Let f(x, y) = ax2+bxy+cy2 be a positive definite binary
quadratic form whose discriminant d < 0 with ac 6= 0. We call f reduced if
−|a| < b ≤ |a| < |c|
or if
0 ≤ b ≤ |a| = |c|.
We will first show the following:
theorem 6.12 Let d be a given negative integer.Each equivalence class of
positive definite binary quadratic forms of discriminant d contains at least one
reduced form.
Proof
Suppose
f(x, y) = ax2 + bxy + cy2.
![Page 82: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/82.jpg)
76 Binary Quadratic forms
Since ac 6= 0, a 6= 0 and c 6= 0. Let
F := {A ∈ Z+|Ax2 +Bxy + Cy2 ∼ f(x, y)}.
Note that F is non-empty and because a 6= 0 and f ∼ f . By the least integer
axiom, there exists a smallest positive integer u in the set F .
Let
g(x, y) = ux2 + vxy + wy2
such that g(x, y) ∼ f(x, y).
Suppose the second coefficient of g, namely, v does not lie in (−|u|, |u|]. Thenby Division Algorithm, there exists a unique integer m such that
−|u| < 2um+ v ≤ |u|.
The form
g(x+my, y) = ux2 + (2um+ v)xy + w′y2
will then have the middle coefficient lying in (−|u|, |u|]. Note that if w′ < u,
then by applying
S =
(0 −1
1 0
),
we arrive at an equivalent form
w′x2 − (2um+ v)xy + uy2,
which contradicts the minimality of u. Hence, w′ ≥ u.
If w′ = u and 2um+ v = 0 then u = 1 since ux2 + uy2 has to be primitive. If
2um+ v > 0, then
ux2 + (2um+ v)xy + uy2
is reduced. If 2um+ v < 0, then since
ux2 + (2um+ v)xy + uy2 ∼ ux2 − (2um+ v)xy + uy2,
and ux2 − (2um + v)xy + uy2 is reduced, we observe that f is equivalent to a
reduced form.
lemma 6.13 Let f(x, y) = ax2+ bxy+ cy2 be a reduced positive definite form.
If for some pair of integers x and y we have (x, y) = 1 and f(x, y) ≤ c, then
f(x, y) = a or c and the point (x, y) is one of the six points
±(1, 0),±(0, 1),±(1,−1).
Moreover, the number of proper representations of a by f is
2 if a < c,
4 if 0 ≤ b < a = c, and
6 if a = b = c.
![Page 83: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/83.jpg)
6.4 Reduced forms 77
Proof
Suppose that |y| ≥ 2. Then
4af(x, y) = (2ax+ by)2 − dy2 ≥ −dy2
≥ −4d = 16ac− 4b2 > 8ac− 4b2
≥ 4a2 − 4b2 + 4ac ≥ 4ac.
Thus, f(x, y) > c if |y| ≥ 2. This implies that if f(x, y) ≤ c then |y| ≤ 1.
Now, suppose that |y| = 1 and |x| ≥ 2. Then
|2ax+ by| ≥ |2ax| − |by| ≥ 4a− |b| ≥ 3a.
Therefore,
4af(x, y) = (2ax+ by)2 − dy2 ≥ 9a2 − dy2
= 9a2 − d = 9a2 + 4ac− b2 > a2 − b2 + 4ac ≥ 4ac.
Thus, f(x,±1) > c.
Next, if y = 0 then since (x, y) = 1, we conclude that x = ±1.
Collecting what we obtain so far, we conclude that if f(x, y) ≤ c, then |y| ≤ 1
and |x| ≤ 1. There are altogether eight points ((0, 0) is excluded since the greatest
common divisor of 0 and 0 is not 1) (x, y) satisfying the above inequalities. These
are
±(1, 0)± (0, 1),±(1,−1) and ± (1, 1).
Next, since b > −a, we find that
f(±1,±1) = a+ b+ c > c.
Hence, (1, 1) and (−1,−1) are excluded as well.
We have thus arrived at the six possible solutions. The last assertion follows
on observing that
f(1, 0) = a, f(0, 1) = c and f(1,−1) = a− b + c.
We observe that from the above computations, a and c are the smallest positive
integer and second smallest positive integer represented by f respectively.
theorem 6.14 Let f(x, y) = ax2 + bxy+ cy2 and g(x, y) = Ax2 +Bxy+Cy2
be reduced positive definite quadratic forms. If f ∼ g then f = g.
Proof
From the previous lemma, we find that a is the smallest integer that is properly
represented by f(x, y). Since f is equivalent to g, we may write
f(x, y) = g(αx+ γy, βx+ δy).
![Page 84: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/84.jpg)
78 Binary Quadratic forms
Now, two solutions to f(x, y) = a are (±1, 0). Thus,
g(±α,±β) = a.
Since
αδ − βγ = 1,
we conclude that (α, β) = 1 and the representation of a by g(x, y) is proper.
Therefore a ≥ A since A is the smallest positive integer represented by g. In-
terchanging the roles of f and g and using similar arguments, we conclude that
A ≥ a. Hence, a = A.
Suppose a < c. Then f(x, y) = a has only two solutions, namely, (±1, 0).
If C = A = a, then by previous lemma, g(x, y) = a would have more than 2
solutions with (x, y) = 1. Hence, we conclude that C > a.
Observe that c is the second smallest integer properly represented by f(x, y)
and the solutions f(x, y) = c are (0,±1). This implies that g(x, y) properly repre-
sents c. Hence c ≥ C since C is the second smallest integer properly represented
by g(x, y). Interchanging the roles of f and g, we conclude that C ≥ c and hence,
C = c.
To show that b = B suppose g(x, y) = f(α′x+ γ′y, β′x+ δ′y). Note that
g(x, y) = ax2 +Bxy + cy2
= (aα′2 + bα′β′ + cβ′2)x2 +B′xy + (aγ′2 + bγ′δ′ + cδ′2)y2.
Comparing the coefficients of x2, we find that
f(α′, β′) = a.
The only two solutions to the above are (±1, 0). Therefore, α′ = ±1 and β′ = 0.
Now, by considering solutions to f(x, y) = c we conclude that (γ′, δ′) = (0,±1).
We then conclude that
g(x, y) = f(−x,−y) or f(x, y)
and b = B.
Suppose a = c. Then f(x, y) = a has at least 4 proper representations by f .
Therefore the same is true for g. Hence, C = a = c. Now, B2 − 4AC = b2 − 4ac
and the fact that b ≥ 0, B ≥ 0 yields b = B and our proof is complete.
theorem 6.15 Let f be a reduced positive definite binary quadratic form
whose discriminant d is not a perfect square. Then 0 < a ≤√−d/3. The number
of reduced forms of a given nonsquare discriminant d is finite.
Proof
We see that
−d = 4ac− b2 ≥ 4a2 − a2 = 3a2.
![Page 85: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/85.jpg)
6.4 Reduced forms 79
Hence,
a ≤√−d/3.
Now −a < b ≤ a and b is determined by the equation
b2 − 4ac = d.
This implies that there are finitely many reduced forms.
There is a formula to compute the number of reduced binary quadratic form of
discriminant d < 0. For example if d ≡ 1 (mod 4) and |d| > 3, then the number
is−1
|d|∑
(n,d)=11<n<|d|
n
(n
|d|
).
For example, if |d| = 11, we have
1
11{1 + 4 + 9 + 5 + 3− 2− 6− 7− 8− 10} = 1.
![Page 86: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/86.jpg)
7 Form Class groups
In the last Chapter, we introduce the concept of reduced binary (positive defi-
nite) quadratic forms. In this chapter, we will show that given discriminant d, a
group can be constructed from the set of reduced binary quadratic forms with
discriminant d.
7.1 The set C(d)
Let d < 0 be such that d ≡ 1 (mod 4) (respectively 0 (mod 4)) and |d| (respec-tively |d|/4) is squarefree. This will guarantee that the positive definite binary
quadratic forms
f(x, y) = ax2 + bxy + cy2
with discriminant d are primitive (i.e., the (a, b, c) = 1). In this section, we will
use
[a, b, c]
to represent the quadratic form
f(x, y) = ax2 + bxy + cy2.
For
A =
(α γ
β δ
)∈ SL2(Z) and f(x, y) = ax2 + bxy + cy2,
we define
A(f(x, y)) =(x y
)A
(a b/2
b/2 c
)At
(x
y
).
Note that (1 0
m 1
)([a, b, c]) = [a, 2am+ b, am2 + bm+ c]
and (0 −1
1 0
)([a, b, c]) = [c,−b, a].
Let C(d) be the set of equivalence classes of positive definite binary quadratic
![Page 87: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/87.jpg)
7.1 The set C(d) 81
forms. We have seen that each class contains a unique reduced form and hence,
|C(d)| is finite. We wish to obtain a binary operation • on C(d) and obtain a
group structure on C(d). Let C1, C2 ∈ C(d). Suppose C1 and C2 contains the
forms [a,B,Ca′] and [a′, B, Ca] respectively. We define
[a,B,Ca′] ◦ [a′, B, Ca] := [aa′, B, C]
and that 1
C1 • C2 := [[aa′, B, C]].
Our operation ◦ is motivated by the following identity of Gauss 2:
(a1x21 + bx1y1 + a2cy
21)(a2x
22 + bx2y2 + a1cy
22) = a1a2x
2 + bxy + cy2
where
x = x1x2 − cy1y2, y = a1x1y2 + a2x2y1 + by1y2.
This identity of Gauss-Dirichlet can be proved by observing that
f(x, y) = ax2 + bxy + cy2 =1
4aN(2ax+ by +
√dy)
with d = b2 − 4ac and N(u+ v√d) = u2 − dv2.
The main difficulty in showing that (C(d), •) is a group is to show that • is
well-defined. But first, we will show that given C1 and C2, we can always find
pairs of forms in C1 and C2 of the form [a,B,Ca′] and [a′, B, Ca] for some integers
B and C. To do this, we need the following theorem:
theorem 7.1 Suppose f is a primitive positive definite binary quadratic form
and k is a nonzero integer, then there exists an integer n properly represented
by f with the property that (n, k) = 1.
Proof
Let k =∏m
j=1 pαj
j .Note that for each prime pj |k, one of the numbers f(1, 0), f(1, 1)
and f(0, 1) is relatively prime to pj . For if this were not the case, then there will
be a number d that divides f(1, 0) = a, f(1, 1) = a+ b+ c and f(0, 1) = c. This
number d would divide b as well, which implies that f is not primitive.
Let (xj , yj) denote the pair of integers for which f(xj , yj) is relatively prime to
pj or any power of pj . Now, by Chinese Remainder Theorem, there are integers
u, v satisfying the congruences
x ≡ xj (mod pαj
j ) and y ≡ yj (mod pαj
j ), 1 ≤ j ≤ m.
Note that f(u, v) is the number relatively prime to k.
1 The choice of representatives [a,B, Ca′] and [a′, B, Ca] is so that C can be computed asC = (B2
− d)/(4aa′).2 This approach is due to Dirichlet.
![Page 88: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/88.jpg)
82 Form Class groups
Let C1 = [f ] and C2 = [g]. Suppose f = [a, b, c]. By Theorem 7.1, there exists a′
represented by g such that (2a, a′) = 1. We replace g by [a′, b′, c′]. Using Chinese
Remainder Theorem, there exists a unique B modulo 2aa′ such that
B ≡ b (mod 2a) and B ≡ b′ (mod a′).
Note that
B = b+ 2aK and B = b′ + a′K ′ (7.1)
implies that
B2 ≡ d (mod 4a) and B2 ≡ d (mod a′).
Hence, 4aa′ divides B2−d. Let C = (B2−d)/(4aa′) and we find that [aa′, B, C]
is a quadratic form of discriminant d. Note that
[a, b, c] ∼ [a, b+ 2aK, c1] = [a,B,Ca′].
Now, b2 − 4ac = (b′)2 − 4a′c′ implies that b and b′ have the same parity. Next,
(7.1) implies that
B = b′ + a′K ′ ≡ b+ 2aK (mod 2).
This shows that K ′ is even and we have
[a′, b′, c′] ∼ [a′, B, Ca].
We have thus shown that C1 and C2 contains forms of the form [a,B,Ca′] and
[a′, B, Ca] and • is defined as
C1 • C2 = [[aa′, B, C]].
The above observation that b and b′ having the same parity allows us to replace
(7.1) by
B = b+ 2aK and B = b′ + 2a′L (7.2)
whenever a′ is odd.
7.2 C(d) is a finite abelian group
We will show later that • is well defined. But first, assuming that • is well defined,we want to show that (C(d), •) is a finite abelian group.
To simplify our treatment, we assume d ≡ 0 (mod 4). The case for d ≡ 1
(mod 4) is similar. Let C0 = [[1, 0,−d/4]]. We will show that C0 is an identity for
(C(d), •). Take C1 = [[a, b, c]]. Now,
[1, 0,−d/4] ∼ [1, b, c′]
since b is even. Therefore,
C0 • C1 = [[a, b, c]] = C1
![Page 89: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/89.jpg)
7.3 The operation • is well defined 83
and C0 is the identity.
Next, let C1 = [[a, b, c]]. We claim that C2 = [[a,−b, c]] is the inverse of C1.
[a,−b, c] ∼ [c, b, a]
and
[a, b, c] ◦ [c, b, a] = [ac, b, 1].
Now,
[ac, b, 1] ∼ [1,−b, ac] ∼ [1, 0,−d/4].
Hence,
C1 • C2 = C0.
To show associativity, we let C1, C2, C3 be three elements in C(d). Using The-
orem 7.1, we can find a, a′, a′′ such that a, a′, a′′, 2 are pairwise relatively prime.
We construct B such that
B ≡ b (mod 2a), B ≡ b′ (mod a′) and B ≡ b′′ (mod a′′).
Note that
[[a,B, c1]] • [[a′, B, c2]] = [[aa′, B, C]]
and
[[aa′, B, C]] • [[a′′, B, c3]] = [[aa′a′′, B, C1]].
Similarly,
C1 • (C2 • C3) = [[aa′a′′, B, C2]].
Note that C1 = C2 and we are done.
7.3 The operation • is well defined
The most difficult part in showing (C(d), •) is a group is to show that • is well
defined. In other words, if we have
[u, V,Wu′] ∼ [a,B,Ca′]
and
[u′, V,Wu] ∼ [a′, B, Ca],
we want to show that
[uu′, V,W ] ∼ [aa′, B, C].
We will complete this task in a few steps. Let
f1 = [a,B,Ca′], f2 = [u, V,Wu′]
![Page 90: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/90.jpg)
84 Form Class groups
and
g1 = [a′, B, Ca], g2 = [u′, V,Wu]
be such that
f1 ∼ f2 and g1 ∼ g2.
Step 1
Suppose f1 = f2, (a, u′) = 1 and V = B. Since
f1 = f2,
we deduce that
Wu′ = Ca′.
Our aim is to show that if
g1 ∼ g2,
then
f1 ◦ g1 ∼ f1 ◦ g2.
This is equivalent to showing that
[aa′, B, C] ∼ [au′, B,W ].
Now, since g1 ∼ g2, there is a matrix(α β
γ δ
)∈ SL2(Z)
such that(α β
γ δ
)(a′ B/2
B/2 Ca
)=
(u′ B/2
B/2 Wa
)(δ −γ−β α
).
The identity
αB/2 + βCa = −u′γ + αB/2
shows that
βCa = −u′γ.
By our choice of u, u′, we know that (u, u′) = 1 and hence,
γ/a ∈ Z.
Now, observe that(α βa
γ/a δ
)(aa′ B/2
B/2 C
)=
(au′ B/2
B/2 W
)(δ −γ/a
−βa α
).
Hence,
[aa′, B, C] ∼ [au′, B,W ].
![Page 91: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/91.jpg)
7.3 The operation • is well defined 85
Step 2
Suppose (a, u′) = 1 and B = V . We have seen from Step 1 that f1◦g1 ∼ f1◦g2.Applying Step 1 again, g2 ◦ f1 ∼ g2 ◦ f2. Hence,
f1 ◦ g1 ∼ f1 ◦ g2 = g2 ◦ f1 ∼ g2 ◦ f2 = f2 ◦ g2.
Step 3
We now drop the assumption that B = V . Assume that (aa′, uu′) = 1. If aa′
and uu′ are both odd, then there is a solution to
x ≡ B (mod 2aa′) and x ≡ V (mod uu′).
If either aa′ or uu′ were even, then we may assume without loss of generality
that aa′ is even. By Chinese Remainder Theorem, there is a solution to
x ≡ B (mod 2aa′) and x ≡ V (mod uu′).
Let this solution be B∗ and we observe that
B∗ = B + 2aa′n1 = V + uu′m.
By (7.2), we conclude that m is even and write m = 2n2. In other words, we
have
B∗ = B + 2aa′n1 = V + 2uu′n2.
Let
F1 =
(1 0
a′n1 1
)([a,B,Ca′]) = [a,B∗, C1]
and
G1 =
(1 0
an1 1
)([a′, B, Ca]) = [a′, B∗, C′
1].
Note that
f1 ∼ F1 and g1 ∼ G1
and
f1 ◦ g1 = [aa′, B, C]
and
F1 ◦G1 = [aa′, B∗, C∗1 ].
Now,(
1 0
n1 1
)(f1 ◦ g1) =
(1 0
n1 1
)([aa′, B, C])
= [aa′, B∗, C∗1 ]
implies that
f1 ◦ g1 ∼ F1 ◦G1.
![Page 92: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/92.jpg)
86 Form Class groups
Using similar argument with f1, g1, n1 replaced by f2, g2, n2, we deduce that
f2 ◦ g2 = F2 ◦G2
where
F2 =
(1 0
u′n2 1
)([u, V,Wu′]) = [u,B∗, C2]
and
G2 =
(1 0
un2 1
)([u′, V,Wu]) = [u′, B∗, C′
2].
Since the coefficient of the xy-term in F1, F2, G1, G2 is B∗, by Step 2, we
deduce that
F1 ◦G1 ∼ F2 ◦G2.
But
f1 ◦ g1 ∼ F1 ◦G1
and
f2 ◦ g2 ∼ F2 ◦G2.
Combining the three relations, we conclude that
f1 ◦ g1 ∼ f2 ◦ g2.
Step 4
Let f1 = [a,B,Ca′], f2 = [u, V,Wu′] and g1 = [a′, B, Ca], g2 = [u′, V,Wu].
Choose s such that (s, aa′uu′) = 1 with s representable by f1. Moreover, we can
find a form F such that F ∼ f1 and F ∼ f2 with the property that
f = [s,B′, C′].
Choose s′ such that (s′, aa′uu′s) = 1 with s′ representable by g1. Note that there
is a form G such that G ∼ g1 and G ∼ g2 with
g = [s′, B′′, C′′].
We remark here that the conditions on the gcd’s are to guarantee that (s, s′) = 1,
(ss′, aa′) = (ss′, uu′) = 1. Since (s, s′) = 1, we can find F,G with F ∼ f and
G ∼ g and
F = [s,B∗, C∗1 ] and G = [s′, B∗, C∗
2 ],
with
B∗ ≡ B′ (mod 2s) and B∗ ≡ B′′ (mod s′).
Since (ss′, aa′) = 1, by Step 3, we conclude that
F ◦G ∼ f1 ◦ g1.
![Page 93: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/93.jpg)
7.3 The operation • is well defined 87
Similarly, since (ss′, uu′) = 1, by Step 3, we find that
F ◦G ∼ f2 ◦ g2.Hence, we conclude that
f1 ◦ g1 ∼ F ◦G ∼ f2 ◦ g2.This shows that • is a well defined operation and this completes the proof that
(C(d), •) is a finite abelian group.
Remark 7.2 One can show that (C(d), •) is an abelian group by identifying
it with ideal class group associated with imaginary quadratic fields. Let d < 0
be a discriminant of a quadratic field and let d be the product of (−1) and the
distinct primes dividing d. For example, if d = −4, then we choose d = −1.
Let K = Q(√d). An element in K is an algebraic integer α if its minimal
polynomial is of the form
x2 + ax+ b, a, b ∈ Z.
It is a fact that
Theorem 7.3 The set of algebraic integers, denoted by OK , in Q(√d) is
OK =
Z[√d] if d 6≡ 1 (mod 4)
Z
[1 +
√d
2
]if d ≡ 1 (mod 4).
An ideal I of a ring R is an additive subgroup of R satisfying the condition
that for all r ∈ R and i ∈ I,
ir = ri ∈ I.
It turns out that any ideal I in OK is of the form
[a, b+ c√d] := aZ+ (b + c
√d)Z.
One can define an equivalence relation on the set of non-zero ideals: We say
that I is equivalent to J if I = αJ for some non-zero α ∈ K∗. If PK denotes the
set of principal ideals in OK then the set of equivalence classes under ∼ is given
by
CK = IK/PK ,
where IK is the set of ideals in OK .
We now define a map and its inverse from C(d) to C(K). Given a class in C(d)
of the form [f(x, y)] with f(x, y) = ax2 + bxy + cy2, we let τ be the solution of
f(τ, 1) = 0 with Im τ > 0. We define ψ by
ψ([f ]) = [a[1, τ ]].
![Page 94: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/94.jpg)
88 Form Class groups
We define the inverse ϕ : C(K) → C(d) via
ϕ([[α, β]]) =N(αx − βy)
N ([α, β]),
where N (I) = |OK/I| and N(u) is the product of u and its conjugate. These
are inverse of each other and ψ is a bijection. To show that C(d) is a group under
the Dirichlet composition •, we observe that under ψ,
ψ([f ] • [g]) = [I] · [J ] = [IJ ],
where ψ([f ]) = [I] and ψ([g]) = [J ]. More precisely, given f = ax2 + bxy + cy2
and g = a′x2 + b′xy + c′y2, we have
F = aa′x2 +Bxy +B2 −D
4aa′y2
if F = f ◦ g. Corresponding to these three forms, we have respectively three
ideals
[a, (−b+√d)/2], [a′, (−b′ +
√d)/2] and [aa′, (−B +
√d)/2].
Let ∆ = (−B +√d)/2. Then we can rewrite the above three ideals as
[a,∆], [a′,∆] and [aa′,∆].
Note that we would have
ψ([f ] • [g]) = [I] · [J ].
It is known that (CK , ·) is an abelian group and so (C(d), •) is an abelian
group.
![Page 95: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/95.jpg)
8 Continued fractions and Pell’sequations
8.1 Pell’s equations
We often encounter situations in which we need to find solutions of an equation
with integral or rational values of the variables. We refer to such equations Dio-
phantine equations. In Example 1.23, we found integral solutions to the linear
diophantine equation
43x+ 64y = 1.
In the previous chapter, we determine those primes represented by x2 + y2.
In this chapter, we will study another type of equation, known as Pell’s equa-
tion.
definition 8.1 Let d > 0. A diophantine equation x2 − dy2 = 1 is known as
Pell’s equation.
Solving Pell’s equation is related to finding generating element for the group
of units in real quadratic fields.
An algebraic integer a satisfying the property that a−1 is also an algebraic
integer is known as a unit. It is known that if d > 0 then there are infinitely
many units in Q(√d). For example, (
√2 + 1)m,m ∈ Z are units in Q(
√2).
It is also known that these are of the form wεm where w is a finite root of
unity and ε is a unit. In order find this special unit, one needs to solve the Pell
equation. For example, if a+ b√d were a unit, then
1
a+ b√d=a− b
√d
a2 − db2
is also an algebraic integer. In many cases, an algebraic integer is of the form
u+ v√d and so in order for a+ b
√d, a2 − db2 = 1. This explains the connection
between Pell’s equation and units in real quadratic fields.
Before we show that the Pell equation is always solvable, we need to study
continued fractions.
![Page 96: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/96.jpg)
90 Continued fractions and Pell’s equations
8.2 Continued fractions
Given a rational fraction u0/u1 such that (u0, u1) = 1 and u1 > 0, we may write
using the Euclidean algorithm,
u0 = u1a0 + u2, 0 < u2 < u1,
u1 = u2a1 + u3, 0 < u3 < u2,
......
uj−1 = ujaj−1 + uj+1, 0 < uj+1 < uj,
uj = uj+1aj .
Let
ξi =uiui+1
.
By using the above equations, we find that
ξ0 = a0 +1
ξ1= · · · = a0 +
1
a1 +1
.. . +1
aj−1 +1
aj
.
This is a continued fraction expansion of ξ0 or u0/u1. We use the notation
< a0, a1, · · · , aj >
to represent the above continued fraction. Note that
< x0, x1, · · · , xj >= x0 +1
< x1, · · · , xj >=
⟨x0, x1, · · · , xj−2, xj−1 +
1
xj
⟩.
8.3 Infinite continued fraction
Let a0, a1, a2, · · · be an infinite sequence of integers, all positive except perhaps
a0. We define two sequences of integers {hn} and {kn} inductively as follows :
h−2 = 0, h−1 = 1, hi = aihi−1 + hi−2 for i ≥ 0
k−2 = 1, k−1 = 0, ki = aiki−1 + ki−2 for i ≥ 0.
theorem 8.2 Let x be a positive real number and a0 = x. For positive integer
n ≥ 1,
< a0, a1, · · · , an−1, x >=xhn−1 + hn−2
xkn−1 + kn−2.
![Page 97: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/97.jpg)
8.3 Infinite continued fraction 91
Proof The case n = 1 is easily verified. We now prove the theorem by induction.
The induction step follows from the following computations:
< a0, a1, · · · , an, x > =
⟨a0, a1, · · · , an−1, an +
1
x
⟩
=(an + 1/x)hn−1 + hn−2
(an + 1/x)kn−1 + kn−2
=x(anhn−1 + hn−2) + hn−1
x(ankn−1 + kn−2) + kn−1
=xhn + hn−1
xkn + kn−1.
If we let x = an in Theorem 8.2, then we immediately obtain the following:
theorem 8.3 Let n ≥ 1 be an integer. If rn =< a0, a1, · · · , an >, then
rn =hnkn.
definition 8.4 The expression
ri =hiki
is called the n-th convergent of the continued fraction < a0, a1, · · · , an > .
theorem 8.5 For i ≥ −1,
hiki−1 − hi−1ki = (−1)i−1 (8.1)
For i ≥ 2,
ri − ri−1 =(−1)i−1
kiki−1(8.2)
hiki−2 − hi−2ki = (−1)iai (8.3)
and
ri − ri−2 =(−1)iaikiki−2
. (8.4)
Proof Note that h−1k−2 − h−2k−1 = 1. Continuing the proof by induction, we
suppose that
hi−1ki−2 − hi−2ki−1 = (−1)i−2.
Then
hiki−1 − hi−1ki = (aihi−1 + hi−2)ki−1 − hi−1(aiki−1 + ki−2)
= −(hi−1ki−2 − hi−2ki−1) = (−1)i−1,
![Page 98: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/98.jpg)
92 Continued fractions and Pell’s equations
and this completes the proof of (8.1).
Identity (8.2) follows by dividing (8.1) by kiki−1.
Identity (8.3) can be derived directly as follow:
hiki−2 − hi−2ki = (aihi−1 + hi−2)ki−2 − hi−2(aiki−1 + ki−2)
= ai(hi−1ki−2 − hi−2ki−1) = (−1)iai.
Dividing the above identity by kiki−2 we obtain (8.4).
From (8.4), we find that that
r2k−1 > r2k+1
and
r2k+2 > r2k.
From (8.2), we find that for a fixed positive k, the following inequalities hold:
r2 < r4 < · · · < r2k < r2k−1 < r2k−3 < · · · < r3.
This shows that the sequence {r2k}∞k=1 is bounded above by r3 and the se-
quence {r2k−1}∞k=1 is bounded below by r2. These facts led us to the following
theorem:
theorem 8.6 The sequence {r2n} is monotonic increasing, bounded above by
r1, and the sequence {r2n−1} is monotonic decreasing and bounded below by r0.
The limit limn→∞ rn exists.
The existence of the limits of {r2n} and {r2n−1} follows from the Monotone
sequence theorem. The fact that these limits are equal follow from (8.2).
definition 8.7 An infinite sequence a0, a1, · · · of integers, all positive exceptperhaps for a0, determines an infinite simple continued fraction< a0, a1, a2, · · · >.The value of < a0, a1, a2, · · · > is defined to be
limn→∞
< a0, a1, a2, · · · , an > .
Note that this limit is the same as limn→∞ rn.
8.4 A simple Lemma and primes of the form x2 + y2
We now establish a simple inequality and apply it to the study of primes of the
form x2 + y2.
lemma 8.8 Let
ξ =< a0, a1, · · · , an, · · · > .
![Page 99: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/99.jpg)
8.4 A simple Lemma and primes of the form x2 + y2 93
Then
|ξ − rn| < |rn − rn+1| .Proof
If n = 2k, then
r2k+1 − r2k > 0.
Now {r2k}∞k=1 is increasing, and {r2k+1}∞k=1 is decreasing, we conclude that
0 < ξ − r2k < r2k+1 − r2k
and
ξ − r2k > 0 > r2k − r2k+1.
Hence,
|ξ − r2k| < |r2k − r2k+1| .
The argument is similar for odd integer 2k + 1. This completes the proof of the
Lemma.
Now, let p ≡ 1 (mod 4) be an odd prime. Then there exists a positive u such
that
u2 ≡ −1 (mod p).
Letu
p=< a0, a1, · · · , a` > .
Let j be such that
kj <√p < kj+1. (8.5)
By Lemma 8.8, we conclude that∣∣∣∣u
p− rj
∣∣∣∣ < |rj+1 − rj | <1
kjkj+1<
1
kj√p.
This implies that
|ukj − hjp| <√p. (8.6)
Now, let x = kj and y = ukj − hjp. By (8.5) and (8.6), we find that
0 < x2 + y2 < |kj |2 + |ukj − hjp|2 < 2p.
Furthermore,
x2 + y2 ≡ k2j + u2k2j ≡ 0 (mod p)
since
u2 ≡ −1 (mod p).
Hence,
p = x2 + y2.
![Page 100: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/100.jpg)
94 Continued fractions and Pell’s equations
This gives another proof of the fact that if p ≡ 1 (mod 4) is an odd prime, it is
a sum of two squares. This proof is due to C. Hermite.
8.5 Solutions to Pell’s equations
Let ξ be an irrational number and let < a0, a1, · · · > be the continued fraction
representation of ξ. We have
theorem 8.9 If a/b is a rational number with positive denominator such that
∣∣∣ξ − a
b
∣∣∣ <∣∣∣∣ξ −
hnkn
∣∣∣∣
for some n ≥ 1, then b > kn. In fact, if
|ξb − a| < |ξkn − hn|
for some n ≥ 0, then b ≥ kn+1.
Note that this result says that the convergents are the best rational approxi-
mations to ξ if we want to minimize the denominator of the fraction.
Proof We will first show that the first assertion follows from the second. Assume
that b ≤ kn < kn+1. Using the fact that b/kn ≤ 1, we conclude that
|ξb− a| < b
kn≤ |ξkn − hn|
and b ≤ kn ≤ kn+1. This is a contradiction to the second assertion.
To prove the second assertion, suppose
|ξb − a| < |ξkn − hn|
for some n ≥ 0 and b < kn+1. Consider the equations in x and y:
xkn + ykn+1 = b, xhn + yhn+1 = a.
Since the determinant of the coefficients is ±1, we conclude that x, y are integers.
Note that if x = 0, then
y =b
kn+1> 0,
or y ≥ 1. This implies that b ≥ kn+1, a contradiction.
If y = 0, then a = xhn and b = xkn. Hence,
|ξb − a| = |ξxkn − xhn| = |x||ξkn − hn| ≥ |knξ − hn|
since |x| ≥ 1. This is a contradiction.
Next, we claim that xy < 0. If y < 0 then
xkn = b− ykn+1
![Page 101: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/101.jpg)
8.5 Solutions to Pell’s equations 95
shows that x > 0. If y > 0, then b < kn+1 implies that
b < ykn+1.
So xkn < 0 and therefore, x < 0.
Since the even convergents increase to ξ and the odd convergents decrease to
ξ, we conclude that
ξkn − hn and ξkn+1 − hn+1
have opposite signs. Hence
x(ξkn − hn) and y(ξkn+1 − hn+1)
have the same sign. From the equations defining x and y, we get
ξb− a = x(ξkn − hn) + y(ξkn+1 − hn+1).
Now, if A and B are of the same sign then
|A+B| = |A|+ |B|.
Hence,
|ξb− a| = |x(ξkn − hn) + y(ξkn+1 − hn+1)| ≥ |x(ξkn − hn)| ≥ |ξkn − hn|,
which is a contradiction.
theorem 8.10 Let ξ denote any irrational number. If there is a rational num-
ber a/b with b > 1 and (a, b) = 1 such that
∣∣∣ξ − a
b
∣∣∣ < 1
2b2,
then a/b equals one of the convergents of the simple continued fraction of ξ.
Proof Suppose a/b is not a convergent. Let n be such that kn ≤ b < kn+1. For
this n, we find by the previous theorem that
|ξkn − hn| ≤ |ξb− a| < 1
2b.
This implies that∣∣∣∣ξ −
hnkn
∣∣∣∣ <1
2bkn(8.7)
and∣∣∣ξ − a
b
∣∣∣ < 1
2b2. (8.8)
Since
a
b6= hnkn,
![Page 102: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/102.jpg)
96 Continued fractions and Pell’s equations
the expression bhn − akn is a non-zero integer and we deduce that using (8.7)
and (8.8) that
1
bkn≤ |bhn − akn|
bkn=
∣∣∣∣hnkn
− a
b
∣∣∣∣ ≤∣∣∣∣ξ −
hnkn
∣∣∣∣+∣∣∣ξ − a
b
∣∣∣ < 1
2bkn+
1
2b2.
This implies that b < kn, which is a contradiction.
Suppose d is not a square and d > 0. We will show in this section that if
x2 − dy2 = 1
is solvable, then the solutions can be obtained from the continued fraction ex-
pansion of√d. 1
Suppose
p2 − dq2 = 1.
Then
p2 = 1 + dq2 > dq2.
Hence,
p > q and p+√dq ≥ 2q
since√d > 1.
Next,
p−√dq =
1
p+√dq
≤ 1
2q.
Therefore, we find that
p
q−√d ≤ 1
2q2.
By Theorem 8.10, we conclude that p/q must be a convergent of the continued
fraction of√d.
Remark 8.11 By studying periodic continued fraction, one can show that there
exists non-zero positive integers x and y such that x2 − dy2 = 1. Moreover, if x1and y1 are the smallest non-zero positive integers satisfying
x2 − dy2 = 1,
then all solutions with x, y positive are of the form
(x1 +√dy1)
n, n ∈ Z+.
1 It can be shown that the Pell equations can be solved if d is positive and squarefree.
![Page 103: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/103.jpg)
8.6 The Pell equation x2 − dy2 = 1 is solvable with xy 6= 0 97
8.6 The Pell equation x2 − dy2 = 1 is solvable with xy 6= 0
In the last section, we see that all solutions to the Pell equation
x2 − dy2 = 1
can be obtained from the continued fraction expansion of√d if the equation is
solvable with xy 6= 0. In this section, we will prove that the equation is indeed
solvable.
definition 8.12 A continued fraction is purely periodic if there exists a pos-
itive integer m such that
am+k = ak,
for all integers k ≥ 0.
In other words, a purely periodic continued fraction appears in the form
< a0, a1, · · · , am−1, a0, a1, · · · , am−1, · · · >,
where ai are all positive integers. We denote this as
< a0, a1, · · · , am−1 > .
theorem 8.13 Let ξ be a real number. The continued fraction expansion of ξ
is purely periodic if and only if ξ is a real quadratic irrational number satisfying
ξ > 1 and −1 < ξ′ < 0, where
ξ′ = a− b√N if ξ = a+ b
√N .
Proof We will first prove that if ξ > 1 and −1 < ξ′ < 0, then ξ is purely
periodic.
We will divide our proof into several steps. The continued fraction expansion
of ξ can be constructed inductively via
ai = [ξi]
and
ξi+1 =1
ξi − ai. (8.9)
where we set ξ0 = ξ.
Step 1. We first show that there exist positive integers k, j such that ξk = ξj .
We first note that if
α =a+
√b
c,
then multiplying α by |c|/|c|, we deduce that
α =±ac+
√bc2
±c2 .
![Page 104: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/104.jpg)
98 Continued fractions and Pell’s equations
In other words, if we start with
ξ =a+ b
√N
c,
we can write ξ in the form
ξ = ξ0 =m0 +
√M
q0,
where
q0|(M −m20).
Now, write
ξ0 = a0 +1
ξ1. (8.10)
Suppose we write
ξ1 =m1 +
√M
q1.
Note that q1 may not be an integer. From (8.10), we conclude that
m1 +√M
q1=
q0
m0 +√M − a0q0
.
This implies that
m1m0 +m0
√M +m1
√M +M − a0q0m1 − a0q0
√M = q0q1.
Comparing coefficients on both sides, we conclude that
m1 = a0q0 −m0
and
q0q1 =M −m21.
Note that m1 is an integer. Since q0|(M −m20) and
q1 =M − (a0q0 −m0)
2
q0
=M −m2
0
q0− (a20q0 − 2m0a0),
q1 is also an integer. Note that this implies that q1|(M −m21).
By induction, we conclude that we can write ξi as
ξi =mi +
√M
qi,
with integers qi and mi. Furthermore, we find that
mi+1 = aiqi −mi
![Page 105: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/105.jpg)
8.6 The Pell equation x2 − dy2 = 1 is solvable with xy 6= 0 99
and
qi+1 =M −m2
i+1
qi. (8.11)
The computations are exactly the same as the case i = 0.
Step 2.
We claim that qi > 0. First, we prove that ξ′i is negative.
Let σ be the automorphism of Q(√M) such that
σ(√M) = −
√M.
Applying σ to (8.9), we deduce that
1
ξ′i+1
= ξ′i − ai. (8.12)
Now, a0 ≥ 1 and ξ′0 = ξ′ < 0. This implies that
ξ′0 − a0 < −1,
or1
ξ′1< −1.
This implies that
−1 < ξ′1 < 0.
Now, each ai ≥ 1 and so, by induction and (8.12), we conclude that each
−1 < ξ′i < 0. (8.13)
The argument used for showing that −1 < ξ′i−1 < 0 implies that −1 < ξ′i < 0 is
exactly the same for the case i = 1.
Next, the fact that ξ′i is negative implies that ξi − ξ′i > 0 and
ξi − ξ′i = 2
√M
qi> 0.
This means that
0 < qi < qiqi+1 =M −m2i+1 ≤M,
by (8.11). Furthermore
0 < m2i+1 < m2
i+1 + qiqi+1 =M.
Since M is fixed, we can only have finitely many possible pairs of values for
(mi, qi). Therefore, there exist k and j with k 6= j such that mj = mk and
qj = qk. This implies that
ξj = ξk
for some j 6= k.
![Page 106: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/106.jpg)
100 Continued fractions and Pell’s equations
Step 3. We now show that the continued fraction expansion of ξ is purely
periodic. From (8.12), we find that for i ≥ 0,
ξ′i = ai +1
ξ′i+1
,
where
0 < − 1
ξ′i+1
< 1.
The last inequality follows from (8.13). From (8.12),
ai − ξ′i = − 1
ξ′i+1
=
[− 1
ξ′i+1
]+
{− 1
ξ′i+1
}
and noting from (8.13) that
0 < −ξ′i < 1,
we conclude that
ai =
[− 1
ξ′i+1
]. (8.14)
Now, ξj = ξk implies that
ξ′k = ξ′j
and
aj−1 =
[− 1
ξ′j
]=
[− 1
ξ′k
]= ak−1.
Hence
ξj−1 = aj−1 +1
ξj= ak−1 +
1
ξk= ξk−1.
Hence, ξj = ξk implies that ξj−1 = ξk−1. If k > j, then a j-fold iteration allows
us to conclude that
ξ =< a0, a1, · · · , ak−j−1 >
and hence the result.
To prove the converse, we assume that the continued fraction expansion of ξ
is purely periodic, say,
ξ =< a0, a1, · · · , an−1 > .
Hence,
ξ =ξhn−1 + hn−2
ξkn−1 + kn−2.
We remark here that since ξ has an infinite continued fraction expansion, ξ
cannot be rational. To see this, assume rn and rn+1 are the n-th and (n+ 1)-th
![Page 107: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/107.jpg)
8.6 The Pell equation x2 − dy2 = 1 is solvable with xy 6= 0 101
convergents of the continued fraction expansion of ξ, then by Lemma 8.8, we
find that
|ξ − rn| < |rn − rn+1| = (knkn+1)−1.
If ξ were rational, say u/v then
|ukn − vhn| <v
kn+1.
Since kn is increasing, we conclude that |ukn−vhn| is eventually less than 1 and
greater than 0 and hence ξ cannot be rational.
Now ξ satisfies a quadratic equation and it must therefore be a real quadratic
number.
Let
f(x) = x2kn−1 + x(kn−2 − hn−1)− hn−2.
Then f(ξ) = 0 and ξ > 1 since a0 is positive by definition of purely periodic
continued fraction expansion. Clearly ξ′ is another root of f(x). It suffices to
show that −1 < ξ′ < 0. By intermediate value theorem, we need only to show
that f(0) and f(−1) have opposite signs. Now f(0) = −hn−2 < 0.
f(−1) = kn−1 − kn−2 + hn−1 − hn−2.
But this is positive since both {hn} and {kn} are increasing.
corollary 8.14 The number√d+ [
√d] is purely periodic.
Proof The results follow immediately from Theorem 8.13 by checking that√d+ [
√d] > 1
and
−1 < [√d]−
√d < 0.
Suppose√d+ [
√d] =< a0, a1, · · · , ar−1 > .
Note that a0 = 2[√d] and thus,
√d =< [
√d], a1, a2, · · · , 2[
√d] > .
Now, write√d+ [
√d] =< a0, a1, · · · , ξn >,
and√d =< [
√d], a1, · · · , ηn > .
![Page 108: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/108.jpg)
102 Continued fractions and Pell’s equations
Now, ξ0 is already of the form
ξ0 =m0 +
√d
q0
with m0 = [√d] and q0 = 1 and
ξ1 =1√
d− [√d]. (8.15)
We have seen from Theorem 8.13 with M = N = d that the subsequent ξn can
be written as
ξn =mn +
√d
qn.
Next, we note that
η0 =√d =
s0 +√d
t0,
with s0 = 0 and t0 = 1. But
η0 = [√d] +
1
η1.
By (8.15), we find that
η1 =1√
d− [√d]
= ξ1.
By induction, we conclude that
ξn = ηn
for all n ≥ 1. We thus conclude that
theorem 8.15 If√d+ [
√d] =< a0, a1, · · · , ξn >,
and √d =< [
√d], a1, · · · , ηn >,
then
ξn = ηn
for all integers n ≥ 1.
theorem 8.16 If d is a positive integer that is not a perfect square, then
h2n − dk2n = (−1)n−1qn+1
for all integers n ≥ −1, where
ξn+1 =mn+1 +
√d
qn+1.
![Page 109: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/109.jpg)
8.6 The Pell equation x2 − dy2 = 1 is solvable with xy 6= 0 103
Proof We first note that
√d =
ξn+1hn + hn−1
ξn+1kn + kn−1=
(mn+1 +√d)hn + qn+1hn−1
(mn+1 +√d)kn + qn+1kn−1
,
where we have used Theorem 8.15 to express ηn in terms of ξn, which we write
as
ξn =mn +
√d
qn.
Simplifying, we get√d(mn+1kn + qn+1kn−1) + dkn =
√dhn + (mn+1hn + qn+1hn−1).
Comparing coefficients, we find that
hn = mn+1kn + qn+1kn−1,
and
dkn = mn+1hn + qn+1hn−1.
Therefore,
h2n − dk2n = (−1)n−1qn+1,
since
kn−1hn − hn−1kn = (−1)n−1,
by (8.1).
Now, if√d+ [
√d] =< a0, a1, a2, · · · , ar−1 >,
with a0 = 2[√d] then
ξnr = ξr .
But for m > 1, ξm = ηm and hence
ηnr = ηr
for n ≥ 1.
Note also that qr = 1. Hence qnr = 1 for all integers n ≥ 1. Therefore, we
conclude that
h22nr−1 − dk22nr−1 = 1.
This shows that the Pell equation is always solvable in x and y with xy 6= 0.
example 8.1 The continued fraction expansion of√3 + [
√3] is
< 2, 1 > .
![Page 110: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/110.jpg)
104 Continued fractions and Pell’s equations
Here r = 2. Hence we only need to look at the first convergent of the con-
tinued fraction expansion of√3. The continued fraction expansion of
√3 is
< 1, 1, 2, 1, 2, · · · > . The first convergent is therefore given by
1 + 1/1 =2
1.
Note that
22 − 3 · 12 = 1.
Hence we obtain a solution for the Pell equation
x2 − 3y2 = 1.
![Page 111: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/111.jpg)
9 Euler’s identities and Jacobi TripleProduct Identity
9.1 Jacobi’s Triple Product Identity
Let q and z be complex numbers such that |q| < 1. One of the most elegant
identities involving q and z is perhaps Jacobi’s Triple Product Identity given by
∞∑
j=−∞
qj2
zj =
∞∏
j=1
(1 + zq2j−1)(1 + z−1q2j−1)(1− q2j). (9.1)
There are several proofs of (9.1). In this chapter, we will present an elementary
proof due to G.E. Andrews. Before we proceed with the proof of (9.1), we give
a few interesting results that follow from this identity.
example 9.1 Let z = 1 in (9.1). We immediately obtain a product represen-
tation of the generating function for squares, namely,
∞∑
j=−∞
qj2
=
∞∏
j=1
(1 + q2j−1)2(1− q2j).
We remark here that the squares j2 = 1, 4 and 9 for j = 1, 2 and 3 can be viewed
as the number of dots in the following diagram:
j = 1 j = 2 j = 3
Figure 9.1
example 9.2 Let z = q in (9.1), followed by replacing q by q1/2. We find that
∞∑
j=−∞
qj(j+1)/2 = 2
∞∏
j=1
(1 + qj)2(1− qj).
The integers j(j + 1)/2 = 1, 3, 6 for j = 1, 2 and 3 can be viewed as the number
of dots in the following diagram:
![Page 112: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/112.jpg)
106 Euler’s identities and Jacobi Triple Product Identity
j = 1 j = 2 j = 3
Figure 9.2
example 9.3 Let q be replaced by q3/2, followed by letting z = −q1/2. We
immediately deduce from (9.1) that
∞∑
j=−∞
(−1)jqj(3j+1)/2 =
∞∏
j=1
(1− q3j−1)(1 − q3j−2)(1 − q3j) =
∞∏
j=1
(1− qj), (9.2)
where the last equality follows from the fact that a positive integer must be of the
form 3j − r, with r = 0, 1 or 2. Note that by replacing j by −j, we may write
the left hand of the above series as
∞∑
j=−∞
(−1)jqj(3j−1)/2.
The integers of the form ω(j) = j(3j−1)/2, with j ≥ 1, are known as pentagonal
numbers. For j = 1, 2 and 3, w(j) = 1, 5 and 12. These correspond respectively
to the number of dots in the following diagrams:
j = 1 j = 2 j = 3
Figure 9.3
9.2 The terminating q-binomial Theorem
Let n ≥ 1 be an integer. The binomial theorem states that
(1 + x)n =
n∑
k=0
(n
k
)xk (9.3)
where (n
k
)=
n!
k!(n− k)!. (9.4)
There is a q-analogue of (9.3) and to describe this analogue, we first define the
q-analogue of (9.4).
![Page 113: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/113.jpg)
9.2 The terminating q-binomial Theorem 107
definition 9.1 Let (q)0 = 1 and let
(q)n =n∏
j=1
(1 − qj)
and [n
k
]
n
=(q)n
(q)k(q)n−k. (9.5)
We also let
(q)∞ =
∞∏
k=1
(1− qk).
The expression in (9.5) is a q-analogue of (9.4) since
limq→1−
[n
k
]
q
= limq→1−
(1− q)(1− q2) · · · (1 − qn)
(1− q)n·(1− q)k
(q)k·(1− q)n−k
(q)n−k
=
(n
k
).
Observe further that the expressions for (9.4) and (9.5) motivate us to view the
q-analogue of n! as (q)n, even though (q)n does not approach n! as q approaches
1−.
The q-binomial coefficient has several interesting properties. For example, one
can verify that[n
k
]
q
= qk[n− 1
k
]
q
+
[n− 1
k − 1
]
q
(9.6)
and[n
k
]
q
=
[n− 1
k
]
q
+ qn−k
[n− 1
k − 1
]
q
. (9.7)
When q approaches 1−, both (9.6) and (9.7) reduce to the well known relation(n
k
)=
(n− 1
k
)+
(n− 1
k − 1
).
By (9.6), (9.7) and mathematical induction, we deduce that
[n
k
]
q
is a polynomial
in q. This is analogous to the fact that
(n
k
)is an integer.
We are now ready to establish the following analogue of (9.3):
theorem 9.2 Let n be a positive integers, x and q be independent variables.
Then
(1 + x)(1 + xq) · · · (1 + xqn−1) =
n∑
k=0
[n
k
]
q
qk(k−1)/2xk. (9.8)
![Page 114: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/114.jpg)
108 Euler’s identities and Jacobi Triple Product Identity
Proof We follow the approach of G. Polya and G.L. Alexanderson Denote
f(x) = (1 + x)(1 + xq) · · · (1 + xqn−1).
Note that
(1 + x)f(xq) = (1 + x)(1 + xq) · · · (1 + xqn−1)(1 + xqn)
= f(x)(1 + xqn).
If we write
f(x) =
n∑
k=0
Qkxk,
where Qk is a function of q, then
(1 + x)
(n∑
k=0
qkQkxk
)= (1 + xqn)
(n∑
k=0
Qkxk
).
Comparing the coefficient of xk on both sides, we deduce that
qkQk + qk−1Qk−1 = Qk + qnQk−1,
which implies that
(1− qk)Qk = qk−1(1− qn−k+1)Qk−1.
Hence, we find that
Qk = qk−1 (1 − qn−k+1)
(1− qk)Qk−1. (9.9)
Observing that Q0 = 1 and iterating (9.9), we deduce that
Qk = q(k−1)+(k−2)+···2+1(1− qn−k+1)(1− qn−k+2) · · · (1− qn)
(1 − qk)(1− qk−1) · · · (1− q2)(1 − q)Q0
= qk(k−1)/2
[n
k
]
q
.
This completes the proof of (9.8).
9.3 Two identities of Euler
In this section, we derive two identities due to L. Euler. These identities may be
viewed as analogues of the identity
ex =
∞∑
k=0
xk
k!.
![Page 115: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/115.jpg)
9.4 Andrews’ proof of Jacobi’s Triple Product Identity 109
We first rewrite (9.8) as
(1 + x)(1 + xq) · · · (1 + xqn−1) =
n∑
k=0
(q)n
(q)k(q)n−kqk(k−1)/2xk.
We may let n→ ∞ and deduce that 1
∞∏
k=1
(1 + xqk−1) =
∞∑
k=0
qk(k−1)/2
(q)kxk. (9.10)
This identity is due to Euler. Note that if we take (q)n as the analogue of n! and
allow the numerator qk(k−1)/2 to approach 1, we could view the right hand side
of (9.10) as an analogue of the power series expansion of ex.
Next, set x = −1 in (9.8) and multiply both sides by (−1)n/(q)n to deduce
that
0 =
n∑
k=0
(−1)n−1
(q)n−k· q
k(k−1)/2
(q)k, n ≥ 1. (9.11)
Note that the right hand side is of the formn∑
k=0
akbn−k, which is the coefficient
of the xk in the expansion of A(x)B(x) where
A(x) =
∞∑
m=0
(−1)mxm
(q)mand B(x) =
∞∑
m=0
qm(m−1)/2
(q)mxm.
By (9.11), we have
A(x)B(x) = 1. (9.12)
Using (9.10), we find that
∞∑
m=0
(−1)mxm
(q)m=
∞∏
k=1
1
(1 + xqk−1). (9.13)
Identity (9.13) is also due to Euler. By taking (q)n to be the analogue of n!, we
find that the left hand side of (9.13) is the analogue of the power series expansion
of e−x. This shows that (9.12) is the analogue of
ex · e−x = 1.
9.4 Andrews’ proof of Jacobi’s Triple Product Identity
In this section, we will present Andrews’ proof of Jacobi’s Triple Product Identity.
We first replace q by q2 in (9.10), we find that
∞∏
k=1
(1 + xq2k−2) =
∞∑
k=0
qk(k−1)
(q2)kxk. (9.14)
1 This process can be justified and it is known as Tannery’s Theorem.
![Page 116: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/116.jpg)
110 Euler’s identities and Jacobi Triple Product Identity
Replacing x by xq in (9.14), we find that
∞∏
k=1
(1 + xq2k−1) =
∞∑
k=0
qk2
(q2)kxk. (9.15)
Observing that
1
(q2)k=
∞∏
m=1
1− q2m+2k
1− q2m,
we rewrite (9.15) as
∞∏
k=1
(1 + xq2k−1) =
∞∑
k=0
∞∏
m=1
(1− q2m+2k)
(1 − q2m)qk
2
xk
=
∞∏
m=1
1
(1 − q2m)
∞∑
k=0
∞∏
m=1
(1 − q2m+2k)qk2
xk
=
∞∏
m=1
1
(1 − q2m)
∞∑
k=−∞
∞∏
m=1
(1 − q2m+2k)qk2
xk,
where in the last equality, we used the fact that for k < 0,
∞∏
m=1
(1 − q2m+2k) = 0.
This implies that
∞∏
k=1
(1 + xq2k−1)(1− q2k) =∞∑
k=−∞
∞∏
m=1
(1− q2m+2k)qk2
xk. (9.16)
Next, using (9.14) with x = −q2k+2, we find that
∞∏
m=1
(1− q2m+2k) =
∞∑
`=0
(−1)`q`
2+`+2k`
(q2)`.
![Page 117: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/117.jpg)
9.5 Number of representations of n as sums of two squares 111
Substituting this expression into the right hand side of (9.16), we deduce that
∞∑
`=0
(−1)`q`
2+`+2k`
(q2)`−
∞∑
k=−∞
qk2
xk∞∑
`=0
(−1)`q`
2+`+2k`
(q2)`
=
∞∑
`=0
∞∑
k=−∞
(−q)`q(k+`)2
(q2)`xk
=∞∑
`=0
(∞∑
k=−∞
q(k+`)2xk+`
)(−q/x)`(q2)`
=
(∞∑
k=−∞
qk2
xk
)(∞∑
`=0
(−q/x)`(q2)`
)
=
(∞∑
k=−∞
qk2
xk
)∞∏
m=1
1
(1 + x−1q2m−1), (9.17)
where we have used
∞∑
`=0
(−q/x)`(q2)`
=
∞∏
m=1
1
(1 + x−1q2m−1)(9.18)
in the last equality. Identity (9.18) follows from (9.13) with q replaced by q2,
followed by replacing x by −q/x. Combining (9.16) and (9.17), we conclude the
proof of Jacobi’s Triple Product Identity.
Remark 9.3 Andrew’s proof was motivated by Bellman’s remark that there
was no elementary proof of the Jacobi Triple Product identity. Andrews’ proof
is perhaps one of the first elementary proofs of the identity. Around the same
time, P.K. Menon independently discovered a proof similar to that of Andrews.
9.5 Number of representations of n as sums of two squares
We already know that p ≡ 1 (mod 4) if and only if p is a sum of two squares.
One way to see this explicitly is to use the formula
r2(n) = 4
∑
d|nd≡1 (mod 4)
1−∑
d|nd≡3 (mod 4)
1
, (9.19)
where r2(n) is the number of ways of writing n as a sum of two squares. Now if
p ≡ 1 (mod 4) then r2(p) = 8. If p ≡ 3 (mod 4) then r2(p) = 0.
In this section, we present M.D. Hirschhorn’s proof of (9.19). Replace z by
![Page 118: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/118.jpg)
112 Euler’s identities and Jacobi Triple Product Identity
−x2q and q2 by q in (9.1) to deduce that
(x− x−1)∞∏
k=1
(1− xqk)(1− x−2qk)(1− qk) =∞∑
k=−∞
(−1)kx2k+1qk(k+1)/2. (9.20)
We now set k = 2n and k = 2n+1 in the sum on the right hand side and deduce
that
(x − x−1)
∞∏
k=1
(1− x2qk)(1− x−2qk)(1− qk)
=
∞∑
n=−∞
x4n+1q2n2+n −
∞∑
n=−∞
x4n−1q2n2−n
= x
∞∏
n=1
(1 + x4q4n−1)(1 + x−4q4n−3)(1 − q4n)
− x−1∞∏
n=1
(1 + x4q4n−3)(1 + x−4q4n−1)(1 − q4n),
where the last equality is obtained using (9.1).
Differentiating 2 the last equality with respect to x and setting x = 1, we
deduce that
∞∏
n=1
(1− qn)3 =
∞∏
n=1
(1 + q4n−3)(1 + q4n−1)(1 − q4n) (9.21)
×(1− 4
∞∑
n=1
(q4n−3
1 + q4n−3− q4n−1
1 + q4n−1
)).
This implies that (exercise)
∞∏
n=1
((1− q2n
) (1− q2n−1
)2)2= 1− 4
∞∑
n=1
(q4n−3
1 + q4n−3− q4n−1
1 + q4n−1
). (9.22)
Note that by replacing q by −q, the left hand side of (9.22) can be identified
using (9.1) as(
∞∑
k=−∞
qk2
)2
which is∞∑
n=0
r2(n)qn.
Comparing the coefficients of the last series with the right hand side of (9.22)
(after replacing q by −q), we arrive at (9.19).
2 To differentiate an infinite product, one uses the idea of logarithmic differentiation.
![Page 119: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/119.jpg)
9.6 The partition function p(n) 113
9.6 The partition function p(n)
In this section, we will illustrate the use of Jacobi Triple Product Identity in the
study partition function p(n).
definition 9.4 A partition of a positive integer n is a representation of n as
a sum of non-decreasing positive integers. The partition function p(n) is defined
as the number of partitions of n.
example 9.4 The integer 4 has the following partitions:
4 = 1 + 3 = 2 + 2 = 1 + 1 + 2 = 1 + 1 + 1 + 1.
The number of partitions of 4 is 5 and we write p(4) = 5.
Let
F (q) =∞∑
n=0
p(n)qn,
where p(0) := 1. One of the most important identity associated with p(n) is the
following:
theorem 9.5 For |q| < 1, we have
∞∏
k=1
1
(1− qk)=
∞∑
n=0
p(n)qn. (9.23)
Proof First let q be real and that 0 < q < 1. Consider the product
Fm(q) =
m∏
k=1
1
(1− qk).
If we expand the individual term in the product, we find that
m∏
k=1
1
(1− qk)= (1 + q1 + q1+1 + q1+1+1 + · · · )(1 + q2 + q2+2 + q2+2+2 + · · · ) · · ·
· · · (1 + qm−1 + q(m−1)+(m−1) + · · · )(1 + qm + qm+m + · · · ).
We now expand the products on the right hand side and denote the power series
as∞∑
n=0
pm(n)qn,
where pm(0) = 1. For a fixed n ≥ 1, we note that pm(n) counts the number
of representations of n as a partition with parts (i.e. the integers used in the
sum) not exceeding m. For example, when m = 2 then p2(3) = 2 since the
representations of 3 with parts not exceeding 2 are 2 + 1 and 1 + 1 + 1 + 1.
Now, observe that
pm(n) ≤ p(n)
![Page 120: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/120.jpg)
114 Euler’s identities and Jacobi Triple Product Identity
and if n ≤ m,
pm(n) = p(n).
When m approaches ∞, pm(n) approaches p(n). Therefore,
1 +
m∑
n=1
p(n)qn ≤ Fm(q) ≤ F (q),
where
F (q) =
∞∏
k=1
1
(1− qk).
Hence,
1 +
∞∑
n=1
p(n)qn
converges for 0 < q < 1 uniformly in m. Now,
∞∑
n=0
pm(n)qn ≤∞∑
n=0
p(n)qn ≤ F (q)
since p(n) ≥ pm(n). Hence,
Fm(q) ≤∞∑
n=0
p(n)qn ≤ F (q)
and by letting m→ ∞, we complete the proof of the theorem.
Letting m approaches ∞, we conclude that
∞∏
k=1
1
(1− qk)=
∞∑
n=0
p(n)qn.
We then extend the above identity by analytic continuation to the unit disk
|q| < 1.
Now, observe from Theorem 9.5 that
1 =
(∞∑
n=0
p(n)qn
)·(
∞∏
k=1
(1− qk)
).
By Example 9.3, we conclude that(
∞∑
n=0
p(n)qn
)·(1 +
∞∑
n=1
(−1)n(qω(n) + qω(−n))
)= 1
where ω(n) = n(3n− 1)/2. Hence we deduce that for N ≥ 1,∑
n+ω(±m)=N
p(n)(−1)m = 0.
![Page 121: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/121.jpg)
9.6 The partition function p(n) 115
In other words,
p(N) =∑
n+ω(±m)=Nn6=N
(−1)m+1p(n) =N−1∑
m=1
(−1)m+1p(N − ω(±m)).
Major Macmahon tabulated p(n) up to n = 200 using the above recurrence
formula for p(n). From this table, S. Ramanujan observed the following amazing
congruences
p(5n+ 4) ≡ 0 (mod 5) (9.24)
p(7n+ 5) ≡ 0 (mod 7) (9.25)
and
p(11n+ 6) ≡ 0 (mod 11). (9.26)
In the next section, we will give a proof of (9.24).
We end this section by giving another formula for computing p(n). This for-
mula relates p(n) and σ(n) defined by
σ(n) =∑
`|n
`.
By logarithmic differentiating (9.23), we find that
∞∑
n=0
np(n)qn =
(∞∑
n=1
nqn
1− qn
)·(
∞∑
n=0
p(n)qn
),
where we have used the fact that∞∑
n=1
nqn
1− qn=
∞∑
n=1
∞∑
m=1
nqmn =
∞∑
s=1
σ(s)qs.
This immediately gives the recurrence formula for p(n), namely,
np(n) =∑
`+m=n
σ(`)p(m).
Next, instead of using (9.23), we start with
G(q) =
∞∏
n=1
(1− qn),
then
qG′(q)
G(q)=
∞∑
n=1
σ(n)qn. (9.27)
Using (9.3), we may write
G′(q) = 1 +
∞∑
`=1
(−1)`ω(±`)qω(±`).
![Page 122: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/122.jpg)
116 Euler’s identities and Jacobi Triple Product Identity
Furthermore, by using (9.23), we conclude from (9.27) that
σ(n) = −∞∑
`=0
(−1)`(`(3`+ 1)
2
)p
(n− `(3`+ 1)
2
)(9.28)
−∞∑
`=0
(−1)`+1
((`+ 1)(3`+ 2)
2
)p
(n− (` + 1)(3`+ 2)
2
).
The above formula allows us to compute σ(n) even if we do not know the prime
factorization of n. In particular, if σ(n) = n+ 1 then n is a prime.
example 9.5 When n = 19, we note that p(17) = 297, p(12) = 77, p(4) =
5, p(18) = 385, p(14) = 135 and p(7) = 15. By (9.28), we conclude that
σ(19) = 2 · 297− 7 · 77 + 15 · 5 + 385− 5 · 135 + 12 · 15 = 20.
Note that σ(19) = 20 and hence we conclude that 19 is a prime.
We have shown above that values of p(n) are useful in testing if an integer N
were a prime. It is therefore very useful if one has an explicit formula for p(n).
Such a formula which arises from a divergent series was first discovered by G.H.
Hardy and S. Ramanujan around 1918 and independently by J.V. Uspensky in
1920. In 1937, H. Rademacher discovered an exact formula of p(n) in terms of a
convergent series. A. Selberg had also independently discovered the same formula
around the same time. For more details, the readers are encouraged to read A.
Selberg’s aritcle.
9.7 Proof of p(5n+ 4) ≡ 0 (mod 5)
In this section, we give Ramanujan’s proof of (9.24). We will need another iden-
tity of Jacobi.
First, recall (9.20), namely,
(x − x−1)
∞∏
k=1
(1− xqk)(1− x−2qk)(1− qk) =
∞∑
k=−∞
(−1)kx2k+1qk(k+1)/2.
We write the right hand side of the above identity as
∞∑
k=0
(−1)k(x2k+1 − x−(2k+1))qk(k+1)/2.
Dividing both sides of the identity by (x−x−1) and letting x tends to 1, we find
that
∞∏
n=1
(1− qn)3 =
∞∑
n=0
(−1)n(2n+ 1)qn(n+1)/2. (9.29)
![Page 123: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/123.jpg)
9.7 Proof of p(5n+ 4) ≡ 0 (mod 5) 117
Ramanujan’s original proof for the congruence for p(5n+4) depends (9.2) and
(9.29). Write
q
∞∏
n=1
(1− qn)4 =
∞∑
r=−∞
∞∑
s=0
(−1)r+s(2s+ 1)q1+r(3r+1)/2+s(s+1)/2.
When 1 + r(3r + 1)/2 + s(s+ 1)/2 is a multiple of 5, we find that
2(r + 1)2 + (2s+ 1)2 ≡ 0 (mod 5).
But this can only happen when 2s+1 is divisible by 5. Therefore, the coefficient
a5n of the series expansion
q
∞∏
n=1
(1− qn)4 =
∞∑
k=0
akqk
is divisible by 5.
Hence, the coefficient of q5N of
q
∞∏
k=1
(1− q5k)
(1− qk)
is also 0 (mod 5) and therefore,
p(5n+ 4) ≡ 0 (mod 5).
![Page 124: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/124.jpg)
10 Lattices and Minkowski’s Theorem
In this Chapter, we present a proof characterizing primes of the form x2 + y2
using Minkowski’s Theorem.
10.1 Lattices in Rn
definition 10.1 Let V be an n−dimensionalR vector space, with V equipped
with an inner product. Suppose u1,u2, · · · ,un is an orthonormal basis. Volumes
of regions in V are computed from the fundamental computations of the volume
of a rectangular parallelopiped.
If
C = {r1u1 + · · ·+ rnun|0 ≤ ri ≤ ai, 1 ≤ i ≤ n}
then the volume of C is
vol(C) = a1a2 · · · an.
If
C′ = {r1v1 + · · ·+ rnvn|0 ≤ ri ≤ ai, 1 ≤ i ≤ n}
and for each i,
vi =∑
j
αijuj ,
then
vol(C′) = a1a2 · · · an|det(αij)|.
definition 10.2 An abelian subgroup L of a real vector space V is called a
lattice if L = Zv1 + · · ·+ Zvr for some linearly independent vectors v1, · · · ,vr.
We say L is a full lattice if r is the dimension of V over R.
If L is the full lattice as in the definition we refer to the basis {v1, · · · ,vn} of
V as a basis of L. Of course a basis of L differs from the other by a change of
matrix with integer coefficients and determinant equal to ±1.
![Page 125: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/125.jpg)
10.2 Translations of fundamental parallelopiped 119
Let L be a full lattice with basis {v1, · · · ,vn} . The set
T = {r1v1 + · · ·+ rnvn|0 ≤ ri < 1, 1 ≤ i ≤ n}
is called a fundamental parallelopiped for L. T depends on the choice of vi’s but
the volume is independent of the choice of basis. The volume of the fundamental
parallelopiped is called the volume of L.
10.2 Translations of fundamental parallelopiped
The following Lemma shows that if V is a vector space of dimension n over R
and T is a fundamental parallelopiped, then the union of the sets in
{λ+ T |λ ∈ L}
is disjoint and “covers” V .
lemma 10.3 Let L be a full lattice in V and T a fundamental parallelopiped
of L. Thenλ+ T, λ ∈ L
are pairwise disjoint and cover all of V .
Proof
Suppose λ1 + T and λ2 + T have a common point. Let
λ1 + t1 = λ2 + t2.
Since
t1 =∑
i
rivi and t2 =∑
i
r′ivi,
with 0 ≤ ri < 1 and 0 ≤ r′i < 1, with 1 ≤ i ≤ n, we conclude that
|ri − r′i| < 1.
Now,
λ1 − λ2 = t2 − t1
implies that t2 − t1 is a lattice point and hence ri − r′i are integers. But since
|ri − r′i| < 1, ri − r′i is an integer only when ri − r′i = 0. Hence,
λ1 = λ2.
For any v =∑
i sivi ∈ V , with si ∈ R, we may write
si = ni + ri
![Page 126: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/126.jpg)
120 Lattices and Minkowski’s Theorem
with ni ∈ Z and 0 ≤ ri < 1. Then
v =∑
i
nivi +∑
i
rivi
expresses v as a sum of an element of L and an element of T . Hence the translates
cover all of V .
10.3 Spheres and bounded sets in Rn
A sphere of radius m in V is a set
U(m) = {r1u1 + · · ·+ rnun|r21 + · · ·+ r2n ≤ m2},
where {u1, · · · ,un} is some orthonormal basis of V . If {v1, · · · ,vn} is another
basis for V , then we define
U∗(m) = {r1v1 + · · ·+ rnvn|r21 + · · ·+ r2n ≤ m2}.
Note that for a given positive real number m, there exists a positive real number
m′ such that U(m) ⊂ U∗(m′).
definition 10.4 A subset W in V is bounded if it is contained in some
sphere.
theorem 10.5 Let L be a lattice. Then every sphere contains only a finite
number of points of L
Proof
Suppose L is a lattice with basis {v1, · · · ,vr}. If r < n extend this to a basis
{v1, · · · ,vn} of V and replace L by∑n
i=1 Zvi. Any sphere is contained in a
sphere
U∗(m) = {riv1 + · · ·+ rnvn|r21 + · · ·+ r2n ≤ m2}.
So it suffices to show that L ∩ U(m) is finite for each m.
If∑
i
nivi ∈ L ∩ U∗(m),
then each ni ∈ Z and |ni| ≤ m. Hence there are finitely many possible ni’s.
10.4 Minkowski’s Theorem
We now prove one of the most important result in the subject known as “Geom-
etry of Numbers”.
![Page 127: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/127.jpg)
10.4 Minkowski’s Theorem 121
definition 10.6 A set X is convex if x,y ∈ X implies thatx+ y
2∈ X . A
set X is centrally symmetric if x ∈ X implies −x ∈ X.
lemma 10.7 A centrally symmetric convex set is one which contains (x−y)/2
whenever it contains x and y.
Proof Now, x ∈ X,y ∈ X implies that x ∈ X,−y ∈ X . This implies that
(x− y)/2 ∈ X by convexity.
theorem 10.8 Let L be a full lattice in the vector space V of dimension n
over R and let X be a bounded, centrally symmetric, convex subset of V . If
vol(X) > 2nvol(L) then X contains a nonzero point of L.
Remark 10.9 This says that if X is large enough then X contains a lattice
point.
Proof Let T denote a fundamental parallelopiped of L. The key to the proof
will be to show that the volume condition imposes a restriction on the translates.
Assertion: If Y is a bounded subset of V such that λ + Y , λ ∈ L are disjoint,
then vol(T ) ≥ vol(Y ).
Proof of Assertion. Let V = Zv1 + · · · + Zvn. Since Y is bounded, there exists
a positive integer m such that
Y ⊂ U∗(m),
where
U∗(m) = {r1v1 + · · ·+ rnvn|r21 + · · ·+ r2n ≤ m2}.
Now, if (λ + T ) ∩ U∗(m) is non-empty and λ + t ∈ (λ + T ) ∩ U∗(m) write
λ+ t =∑
i(ai + ri)vi, with ai ∈ Z and 0 ≤ ri < 1. Since λ+ t ∈ U∗(m), we find
that |ai + ri| ≤ m. Now,
|ai| − |ri| ≤ |ai + ri| ≤ m
and this implies that
|ai| ≤ m+ 1
since 0 ≤ ri < 1. In other words, we see that if (λ+T )∩U∗(m) is non-empty, then
λ ∈ U∗(√n(m+1)). There are finitely many lattice points in U∗(
√n(m+1)) by
Theorem 10.5. Hence, there are finitely many (λ+T ) such that (λ+T )∩U∗(m)
is non-empty. Therefore, there are finitely many (λ + T ) such that (λ + T ) ∩ Yis non-empty.
By Lemma 10.3 the nonempty intersections (λ + T ) ∩ Y are pairwise disjoint
and cover Y . Thus
vol(Y ) =∑
λ∈L
vol((λ+ T ) ∩ Y ).
![Page 128: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/128.jpg)
122 Lattices and Minkowski’s Theorem
Note that this sum is finite. Now,
(λ + T ) ∩ Y = (T ∩ (Y − λ)) + λ.
For if x ∈ (λ + T ) ∩ Y,x = λ + t, t ∈ (T ∩ (Y − λ)) + λ. Conversely, x ∈(T ∩ (Y − λ)) + λ, x = t+ λ, t = y − λ implying that
x = y = t+ λ ∈ Y ∩ (T + λ).
Now volume is not changed by translations and so,
vol((λ+ T ) ∩ Y ) = vol((T ∩ (Y − λ)) + λ) = vol(T ∩ (Y − λ))
and
vol(Y ) =∑
λ∈L
vol(T ∩ (Y − λ)).
The sets T∩(Y −λ) are disjoint ( by assumption that translates of Y are disjoint)
but they might not cover T . Thus
vol(T ) ≥∑
λ∈L
vol(T ∩ (Y − λ)) = vol(Y ).
This proves the assertion.
Now we turn to the set X in the Theorem. Let
1
2X =
{1
2x|x ∈ X
}.
We have
vol(1
2X) = 2−nvol(X) > vol(T ) = vol(L).
In view of the assertion just proved, the translates λ + 12X cannot be disjoint.
So there exist λ1 6= λ2 in L such that
1
2x+ λ1 =
1
2y + λ2, x, y ∈ X.
Thenx− y
2∈ X
by assumption on X . Therefore λ2 − λ1 is a nonzero element of L in X .
10.5 Primes of the form x2 + y2, the geometric approach
After all these preparations, we are now ready to prove that if p ≡ 1 (mod 4)
then p is a sum of two squares.
By Wilson’s Theorem, we know that there exists an element u such that u2 ≡−1 (mod p). Let
L = {(a, b)|b ≡ ua (mod p)}.
![Page 129: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/129.jpg)
10.5 Primes of the form x2 + y2, the geometric approach 123
The volume of L is p. Pick r2 = 3p/2 and we have a circle S with volume
3πp/2 > 4p.
This circle must contain a nonzero lattice point (a, b) ∈ L. Now,
a2 + b2 ≤ r2 = 3p/2 < 2p
and
a2 + b2 ≡ a2 + u2a2 ≡ 0 (mod p).
Hence,
a2 + b2 = p.
Of course, Minkowski’s Theorem has more applications than just classifying
primes of the form x2 + y2. It is used to show that the ideal class group of a
finite extension of Q (not just imaginary quadratic) is finite.
![Page 130: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/130.jpg)
11 Elliptic curves
In Number Theory, we are often interested in knowing integral or rational solu-
tions of certain equations. We call these equations Diophantine equations. There
are various methods of solving Diophantine equations since these equations come
in all “shapes and sizes”. One of the most famous Diophantine equation is per-
haps the Fermat’s equation
xn + yn = zn
where n ≥ 3. It is proved by A. Wiles that there are no integral solutions to the
above equation when n ≥ 3.
We now describe another less well known Diophantine equation. A positive
rational number r is called a congruent number if it is the area of some right
triangle with rational sides. So for example the number 157 is a congruent number
because it is the area of a right triangle with two sides forming the right angle
given by
6803298487826435051217540
411340519227716149383203and
411340519227716149383203
21666555693714761309610.
The problem of finding congruent numbers is reduced to the study of the Dio-
phantine equation
y2 = x3 − n2x.
It turns out that if there is a rational point (x, y) on the curve
y2 = x3 − n2x
and x satisfies the conditions that
(i) it is the square of a rational number,
(ii) its denominator is even,
(iii) its numerator has no common factor with n,
then n is a congruent number.
Both the congruent number and the Fermat’s last theorem are related to
finding rational solutions of equation of the form
y2 = f(x)
where f(x) has degree 3. Points satisfying y2 = f(x) contribute to a special curve
known as the elliptic curve.
![Page 131: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/131.jpg)
11.1 Rational Points on curves 125
In this chapter, we will study some properties of rational points on ellip-
tic curves. Although elliptic curves are extremely useful in solving Diophantine
equations, we will not discuss this aspect here. Instead, we will describe Lentra’s
algorithm of factoring integers using elliptic curves.
11.1 Rational Points on curves
Let f(x, y) be a polynomial of two variables. The set of points (x, y) in the plane
for which
f(x, y) = 0
constitutes an algebraic curve and we denote it as Cf (R).
A point (x, y) is called a rational point if x and y are rational. The set of
rational points satisfying f(x, y) = 0 will be denoted by Cf (Q). This is the set
we will study in this chapter.
example 11.1 Let f(x, y) = x2 + y2 − 3. Then to find elements in Cf (Q) we
may assume that x = X/Z and y = Y/Z 1 Then we have
X2 + Y 2 − 3Z2 = 0.
We may also assume that (X,Y, Z) = 1. Now modulo 3, we have
X2 + Y 2 ≡ 0 (mod 3).
But if X and Y are integers then the possible outcome for
X2 + Y 2
is 2. Hence, Cf (Q) is empty.
example 11.2 Find all rational points on the curve x2 + 5y2 = 1.
Solution. Observe that (1, 0) is a rational point of this curve. If (x1, y1) is a
second rational point on this curve, then the slope joining these two points is a
rational number, for
m =y1
x1 − 1.
Conversely if m is a rational number, the line through (1, 0) has the equation
y = m(x− 1).
To find the other intersection of this line with the ellipse, we set y = m(x − 1)
in the formula for the ellipse. This gives
(x− 1)((5m2 + 1)x− (5m2 − 1)) = 0.
1 If x = u/v and y = s/t, then take Z = vt,X = ut and Y = sv.
![Page 132: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/132.jpg)
126 Elliptic curves
Hence the other rational point has x-coordinate
x1 =5m2 − 1
5m2 + 1.
The corresponding y-coordinate is
y1 = − 2m
5m2 + 1.
Hence, we see that
m =y1
x1 − 1,
x1 =5m2 − 1
5m2 + 1
y1 = − 2m
5m2 + 1
determine a one-to-one correspondence between rational numbersm and rational
points (x1, y1) on the ellipse
x2 + 5y2 = 1,
apart from the point (1, 0) that we started with. If we express m = r/s then
x1 =5r2 − s2
5r2 + s2, y1 = − 2rs
5r2 + s2.
As a consequence, if (X,Y, Z) are solutions to
X2 + 5Y 2 = Z2
then the point (X/Z, Y/Z) lies on the ellipse and hence, there exist r and s such
that
(5r2 − s2,−2rs, 5r2 + s2)
is proportional to (X,Y, Z).
We do not necessarily obtain all primitive triples2 (X,Y, Z) that satisfy
X2 + 5Y 2 = Z2
in this way. For example (2, 3, 7) is a solution that cannot be expressed as (5r2−s2,−2rs, 5r2 + s2), for some integers r and s. (Do you know why is this solution
not captured by the above method?)
Note that in both examples, we find rational solutions of f(x, y) = 0 by finding
integer solution for
Znf(X/Z, Y/Z) = 0,
where n is the maximum of the sum of the degrees of x and y. The expression
Znf(X/Z, Y/Z) = 0
2 That is, the gcd(X, Y,Z) = 1.
![Page 133: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/133.jpg)
11.2 The Geometry of cubic curves 127
is the homogeneous form of
f(x, y) = 0.
For example the homogeneous form of x3 + y3 = 1 is X3 + Y 3 = Z3.
11.2 The Geometry of cubic curves
Let
f(x, y) = ax3 + bx2y + cxy2 + dy3 + ex2 + fxy + gy2 + hx+ iy + j = 0
be the equation of a general cubic, namely, one such that f(x, y) has degree 3.
In our discussion we will assume that our curve is non-singular.
Suppose we can find two rational points P and Q on the curve, then we can
generally find a third one. What we need to do is to draw a line joining the two
rational points on the curve. This line is of the form y = mx+ c, where m and
c are rational numbers. Substituting y = mx + c into f(x, y) = 0 we obtain an
cubic equation
p(x) = 0.
Since two of the roots of p(x) = 0 are rational and that p(x) ∈ Q[x], we conclude
that the third point of intersection must also be a rational point. We will denote
this point as P ∗ Q.We note that if we only have one rational point, we can still construct new
rational points on the curve by drawing a line tangent to the curve at the point.
If we continue with this operation, we may get many new rational points on the
cubic curve.
Although ∗ is a binary operation on Cf (Q), it is not the binary operation we
use to turn Cf (Q) into a group. To obtain the right group operation, we first fix
a rational point O of Cf (C) (assuming this point exists). We define the group
law by
P⊕ Q := O ∗ (P ∗ Q).
We will show next that Cf(Q) under ⊕ is a group. But before proceeding, we
record one of the most important theorems in the theory of elliptic curves,
namely, the Mordell Theorem. Mordell’s theorem states that if Cf (C) is non-
singular rational curve (i.e, a, b, c, d, e, f, g, h, i, j ∈ Q), then there is a finite set
of rational points
{P1, P2, · · · , Pk}
such that for any Q ∈ Cf(Q),
Q = n1P1 ⊕ n2P2 ⊕ · · · ⊕ nkPk.
We now show that (Cf (Q),⊕) is an abelian group.
![Page 134: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/134.jpg)
128 Elliptic curves
First, we observe that
P ⊕Q = Q⊕ P
for P,Q ∈ Cf (Q).
Next, we observe that O acts as an identity. Note that
O⊕ P = P.
The reason is clear. The line joining P and O meets the curve at a third point
Q. Now the line joining O and Q must then meet the curve again at P.
We now come to the “inverse” of a point Q under this group law. First, draw
a tangent line at O and denote the “third” point on the curve by S. Given Q, we
claim that T = Q ∗ S is the inverse of Q.
Note that T ⊕ Q = O ∗ (T ∗ Q). This is clearly O since S must be T ∗ Q and
O ∗ S is again O. This shows that T is the inverse of Q and we denote T by Q.
To prove that the set of rational points forms a group with the group law, it
remains to show that the group law is associative. Let P, Q and R be three points
on the curve. We want to show that
(P⊕ Q)⊕ R = P⊕ (Q⊕ R).
Rewriting the above expression, we find that
((P⊕ Q) ∗ R) ∗ O = ((Q⊕ R) ∗ P) ∗ O.
Hence, it suffice to show that
(P⊕ Q) ∗ R = (Q⊕ R) ∗ P.
We now identify nine points : O,P,Q,R,P ∗ Q,P ⊕ Q,Q ∗ R,Q ⊕ R and the
intersection of the two lines, one joining P⊕ Q and R and the other joining Q⊕ R
and P. Our aim is to show that the ninth point lies on the curve.
To achieve this, we need a result from Geometry.
theorem 11.1 Let C,C1 and C2 be three cubic curves. Suppose C goes through
eight of the nine intersection points of C1 and C2, then C goes through the ninth
intersection point.
Here C1 and C2 are cubic curves and so they have nine intersection points by
Bezout’s theorem (I have to be vague here. We are also assuming that these curves
are defined over C).
Assuming the above theorem, we let C1 be the three lines 3
[P,Q,P ∗ Q], [O,Q ∗ R,Q⊕ R] and [R,P⊕ Q,R ∗ (P⊕ Q)].
Let C2 be the three lines
[P,Q⊕ R,P ∗ (Q⊕ R)], [R,Q,R ∗ Q] and [O,P ∗ Q,P⊕ Q].
3 I use [A,B, C] to denote the line through the points A,B and C.
![Page 135: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/135.jpg)
11.2 The Geometry of cubic curves 129
Note that these two cubics intersect at the nine points we have identified. We denote
the last point by T and this is the intersection of
[P,Q⊕ R,P ∗ (Q⊕ R)] and [R,P⊕ Q,R ∗ (P⊕ Q)].
Since our curve passes through eight of these nine points, it must pass through T.
This implies that
T = P ∗ (Q⊕ R) = R ∗ (P⊕ Q).
Hence the result.
Remark 11.2 It appears that our operation ⊕ is dependent on the choice of
O. It is not difficult to show that if we begin with another point O′ and define
a corresponding group law +′, we will obtain a group isomorphic to the group
arising from O.
The Mordell Theorem can now be stated as saying that the group (Cf (Q),⊕) is
a finitely generated abelian group.
We end this section by saying a few words about Theorem 11.1.
To define a cubic, we need ten coefficients. But we can multiply by a constant
and still obtain the same curve and hence, we really need only nine coefficients.
The set of cubics that goes through one given point is therefore eight dimensional.
Continuing this way, we see that the set of cubics that goes through eight given
points of intersection of two curves C1 and C2 is one dimensional.
Let F1(x, y) = 0 and F2(x, y) = 0 be the equations of the two curves C1 and C2.
The curve Ct given by the equation F1 + tF2 = 0 passes through the eight points
of intersections of C1 and C2. The collection of curves {Ct|t ∈ C} is clearly one
dimensional and it must therefore be the set of cubics that passes the eight given
points. In other words, C must coincide with Ct for some t. Since F1 and F2 vanishes
at the ninth point, the curve Ct, which is also the curve C, must pass through the
ninth point.
Theorem 11.1 can also be explained as follow:
We will denote C1 · C2 to mean the intersections of C1 and C2 and we write
C1 · C2 =k∑
j=1
Pj,
where Pj are the points of intersection. We use C1C2 to denote the curve defined by
f(x, y)g(x, y) = 0 if C1 is defined by f(x, y) = 0 and C2 is defined by g(x, y) = 0.
By Bezout’s Theorem,
C1 · C2 =
9∑
j=1
Pj.
Let
C · C1 =
8∑
j=1
Pj + Q
![Page 136: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/136.jpg)
130 Elliptic curves
and let L be a line passing through P9. Then
L · C1 = P9 + R+ S
and
LC · C1 =
9∑
j=1
Pj + R+ S+ Q = C2 · C1 + R + S+ Q.
Now, R, S and Q are points on C1. Hence there must be a line L′ such that L′ ·C1 =
R+S+Q and this line has to be L since both lines contain R and S. Hence Q = P9.
11.3 Explicit form of the group law
It is a fact that any cubic with a rational point can be transformed into a certain
special form called the Weierstrass normal form given by
y2 = x3 + ax2 + bx+ c.
The homogeneous form of this equation is given by
Y 2Z = X3 + aX2Z + bXZ2 + cZ3.
What is the intersection of this curve with Z = 0? It is the line X = 0 and the
cubic meets this line at infinity in a point three times! This point can be shown
to be non-singular and it is counted as a rational point. We will take our O in our
discussion of the group law to be this point at infinity. With this choice of identity,
we will compute P⊕ Q.
Let P = (x1, x2) and Q = (x2, y2), with P 6= Q. Then P⊕ Q = (x3, y3). One can
show that
x3 =
(y2 − y1x2 − x1
)2
− a− x1 − x2, y3 = −y1 −y2 − y1x2 − x1
(x3 − x1).
If P = Q and y1 6= 0 then
x3 =
(3x21 + 2ax1 + b
2y1
)2
− a− 2x1, y3 = −y1 −(3x21 + 2ax1 + b
2y1
)(x3 − x1).
11.4 The Pollard’s p− 1 method
Let n be composite and suppose n happens to have a prime factor p such that p− 1
is a product of small primes. From Fermat’s Theorem, we know that
ap−1 ≡ 1 (mod p).
This means that p will divide (ap−1 − 1, n).
Of course, we do not know what p is and so we choose an integer
k = 2α1 · 3α2 · · · pαrr ,
![Page 137: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/137.jpg)
11.4 The Pollard’s p− 1 method 131
where pj is the j-th prime and αj are small positive integers. We then compute
(ak− 1, n). Note that if we are lucky that n has a prime factor p such that (p− 1)|kthen p|(ak − 1). If (ak − 1, n) 6= n, then we would obtain a non-trivial factor for
n since (ak − 1, n) > 1. If the gcd is n we choose a new a but if the gcd is 1, we
choose a larger k.
The Lenstra’s Elliptic curve algorithm is motivated by Pollard’s p− 1 method.
Step 1. Choose random integers b, x1, y1 between 1 and n. Let c = y21 − x31 − bx1(mod n) and let C be the curve defined by
y2 = x3 + bx+ c (mod n).
Let P = (x1, y1).
Step 2. Check that (4b3 +27c2, n) = 1. If not, pick a new b. If it is strictly between
1 and n, then we have found a new factor of n.
Step 3. Choose k as in Pollard’s method.
Step 4. Compute
kP =
(akd2k,bkd3k
).
Step 5. Compute D = (dk, n). If 1 < D < n then we have found a factor for n. If
D = 1 then either change k or change a curve. If D = n, then decrease k.
Let us look at why the algorithm works. Suppose that we are lucky and we have
chosen a curve C given by f(x, y) = 0 and a number k so that for some prime
p dividing n, we have |Cf(Fp)| divides k. Then every element in Cf (Fp) has order
dividing k. In particular taking P ∈ Cf (Q) and reduce modulo p, we know that
kP = O.
In other words, kP (mod p) must be the point at infinity and hence p must divide
dk.
![Page 138: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/138.jpg)
132 Elliptic curves
example 11.3 Let N = 2433943 and k = 24 · 32 · 5 · 7 · 11 · 13 · 17. Choose the
curve with equation
y2 = x3 + 337x− 337.
Then we have the following table of points 2kP (mod N).
2P (28898,2389338)
22P (1920153,1597958)
23P (1828557,767520)
24P (2100822,566900)
25P (801535,1529630)
26P (2271592,1481641)
27P (1601924,1866581)
28P (1168060,2433281)
29P (59132,2344314)
210P (429492,1979097)
211P (962483,1377053)
212P (2297424,1684259)
213P (127907,578944)
214P (2219139,2354694)
215P (832844,2040952)
216P (300449,839254)
217P (1446442,2156245)
218P (2250971,2425040)
219P (1935828,2047880)
220P (2316360,333373)
221P (1168755,1049467)
222P (333607, 905175)
223P (662422,2225670)We have
k = 24 + 26 + 210 + 212 + 213 + 214 + 215 + 217 + 219 + 220 + 221 + 223.
We now record all the computations modulo N :
(24 + 26)P = (1089595, 2238715)
(210 + 212)P = (1524824, 77394)
(213 + 214)P = (1678371, 202532)
(215 + 217)P = (1782098, 1792826)
(219 + 220)P = (345528, 66256)
(221 + 223)P = (1704240, 1187779)
Let
α = {(24 + 26 + 210 + 212)}P = (1099133, 402310),
![Page 139: MA3265 Introduction to Number Theory - Dept of Maths, NUSchanhh/notes/MA3265.pdf · 1 Divisibility 1.1 Introduction ... mathematical induction. We are now ready to prove the Division](https://reader031.vdocuments.net/reader031/viewer/2022022013/5b3199767f8b9aed688b5839/html5/thumbnails/139.jpg)
11.5 Appendix 133
and
β = {(213 + 214 + 215 + 217 + 219 + 220 + 221 + 223)}P = (1041049, 1824988).
Now α+ β is not defined. A quick check shows that
gcd(1099133− 1041049, N) = 1117.
Hence we have
2433943 = 1117 · 2179.
11.5 Appendix
Elliptic curves can be “parametrized” by the Weierstrass elliptic function defined by
℘(z;L) :=1
z2+
∑
ω∈L\{0}
(1
(z − ω)2− 1
ω2
),
where
L = Z⊕ Zτ,
with τ ∈ C such that Imτ > 0. The function ℘(z) (we drop the L when the lattice
is fixed) is doubly periodic with periods 1 and τ . Also the points (℘(z), ℘′(z)) are
on the curve
C : y2 = 4x3 − g2x− g3,
where
g2 = 60∑
ω∈L\{0}
1
ω4and g3 = 140
∑
ω∈L\{0}
1
ω6.
Furthermore,
℘(z + w) = −℘(z)− ℘(w) +1
4
(℘′(z)− ℘′(w)
℘(z)− ℘(w)
)2
,
and
℘′(z + w) = −℘(z)− ℘′(w) − ℘′(z)
℘(w)− ℘(z)(℘(z + w)− ℘(z)) .
This shows that if P = (℘(z), ℘′(z)) and Q = (℘(w), ℘′(w)) then
P+Q = (℘(z+ w), ℘′(z+ w))
and hence associativity law for+ becomes very clear since it translates to associativity
for addition in C.